Articles

  1. Streaming media on demand with Media Source Extensions

    Introducing MSE

    Media Source Extensions (MSE) is a new addition to the Web APIs available in all major browsers.  This API allows for things like adaptive bitrate streaming of video directly in our browser, free of plugins. Where previously we may have used proprietary solutions like RTSP (Real Time Streaming Protocol) and Flash, we can now use simpler protocols like HTTP to fetch content, and MSE to smoothly stitch together video segments of varied quality.

    All browsers that support HTMLMediaElements, such as audio and video tags, already make byte-range requests for subsequent segments of media assets.  One problem is that it’s up to each browser’s implementation of a media engine to decide when and how much to fetch.  It’s also tough to stitch together or deliver smooth playback of segments of different quality without pauses, gaps, flashes, clicks, or pops.  MSE gives us finer-grained control at the application level for fetching and playing back content.

    In order to begin streaming, we need to figure out how to transcode our assets into a meaningful byte format for browsers’ media engines, determine what abstractions MSE provides, and figure out how to instruct the browser to play them back.

    Having multiple resolutions of content allows us to switch between them while maintaining a constant viewport size.  This is known as upscaling, and it’s a common technique for real-time rendering in video games to meet a required frame time.  By switching to a lower quality video resolution, we can meet bandwidth limitations at the cost of fidelity.  The loss of fidelity causes such artifacts as aliasing, in which curves appear jagged and blocky.  This technique can often be seen by Netflix subscribers during peak viewing hours.

    Rather than having an advanced protocol like RTSP handle bandwidth estimates, we can use a simpler network protocol like HTTP and move the advanced logic up one level into the application logic.

    Transcoding

    My recommended tools, ffmpeg and Bento4, are both free and open-source software (FOSS). ffmpeg is our Swiss army knife of transcoding, and Bento4 is a collection of great tools for working with mp4.  While I’m partial to non-licensed codecs like webm/vp8-9/opus, current browser support for those containers and codecs is rather poor, so in this post we’ll just be working with mp4/h.264/aac.  Both of the tools I’m working with are command line utilities; if you have nice GUI tools in your asset pipeline you’d like to recommend to our readers, let us know in the comments below.

    We’ll start with a master of some file, and end up transcoding it into multiple files each of smaller resolutions, then segmenting the smaller-res whole files into a bunch of tiny files.  Once we have a bunch of small files (imagine splitting your video into a bunch of 10-second segments), the client can use more advanced heuristics for fetching the preferred next segment.

    Our smaller-res copies of the master asset

     Proper fragmentation

    When working with mp4 and MSE, it helps to know that the mp4 files should be structured so that metadata is fragmented and interleaved with pieces of the actual audio/video streams, instead of clustered together in one place.  This is specified in the ISO BMFF Byte Stream Format spec, section 3:

    “An ISO BMFF initialization segment is defined in this specification as a single File Type Box (ftyp) followed by a single Movie Header Box (moov).”

    This is really important: simply transcoding to an mp4 container in ffmpeg does not produce the expected structure, and playback in a browser with MSE will fail.  To check whether your mp4 is properly fragmented, you can run Bento4’s mp4dump on it.

    If you see something like:

      $ ./mp4dump ~/Movies/devtools.mp4 | head
      [ftyp] size=8+24
        ...
      [free] size=8+0
      [mdat] size=8+85038690
      [moov] size=8+599967
        ...

    Then your mp4 won’t be playable since the [ftyp] “atom” is not followed immediately by a [moov] “atom.”  A properly fragmented mp4 looks something like this —

      $ ./mp4fragment ~/Movies/devtools.mp4 devtools_fragmented.mp4
      $ ./mp4dump devtools_fragmented.mp4 | head
      [ftyp] size=8+28
        ...
      [moov] size=8+1109
        ...
      [moof] size=8+600
        ...
      [mdat] size=8+138679
      [moof] size=8+536
        ...
      [mdat] size=8+24490
        ...
      ...

    — where mp4fragment is another Bento4 utility.  The properly fragmented mp4 has the [ftyp] followed immediately by a [moov], then subsequent [moof]/[mdat] pairs.

    It’s possible to skip the need for mp4fragment by using the -movflags frag_keyframe+empty_moov flags when transcoding to an mp4 container with ffmpeg, then checking with mp4dump:

      $ ffmpeg -i bunny.y4m -movflags frag_keyframe+empty_moov bunny.mp4
    Creating multiple resolutions

    If we want to switch resolutions, we can then run our fragmented mp4 through Bento4’s mp4-dash-encode.py script to get multiple resolutions of our video.  This script will fire up ffmpeg and other Bento4 tools, so make sure they are both available in your $PATH environment variable.

    $ python2.7 mp4-dash-encode.py -b 5 bunny.mp4
    $ ls
    video_00500.mp4 video_00875.mp4 video_01250.mp4 video_01625.mp4 video_02000.mp4
    Segmenting

    We now have 5 different copies of our video with various bit rates and resolutions. To be able to switch between them easily during playback, based on our effective bandwidth, which changes constantly over time, we need to segment the copies and produce a manifest file to facilitate playback on the client.  We’ll create a Media Presentation Description (MPD) manifest file, which contains info about the segments, such as the threshold effective bandwidth for fetching each one.

    Bento4’s mp4-dash.py script can take multiple input files, perform the segmentation, and emit a MPD manifest that most DASH clients/libraries understand.

    $ python2.7 mp4-dash.py --exec-dir=. video_0*
    ...
    $ tree -L 1 output
    output
    ├── audio
    │   └── und
    ├── stream.mpd
    └── video
        ├── 1
        ├── 2
        ├── 3
        ├── 4
        └── 5
    
    8 directories, 1 file

    We should now have a folder with segmented audio and segmented video of various resolutions.

    MSE & Playback

    With an HTMLMediaElement such as an audio or video tag, we simply assign a URL to the element’s src attribute and the browser handles fetching and playback.  With MSE, we fetch the content ourselves with XMLHttpRequests (XHRs), treat the response as an ArrayBuffer (raw bytes), and assign the media element’s src attribute to a URL that points to a MediaSource object.  We may then append SourceBuffer objects to the MediaSource.

    Sketched out in JavaScript, the MSE workflow looks something like this (the chunk-tracking variables and the video element reference are placeholders for your own fetching logic):

    let mediaSource = new MediaSource();

    mediaSource.addEventListener('sourceopen', () => {
      // The MIME type/codec string must match your transcoded content.
      let sourceBuffer = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.42E01E, mp4a.40.2"');

      sourceBuffer.addEventListener('updateend', () => {
        if (numChunks === totalChunks) {
          mediaSource.endOfStream();
        } else {
          sourceBuffer.appendBuffer(nextChunk); // an ArrayBuffer fetched via XHR
        }
      });

      sourceBuffer.appendBuffer(arrayBufferOfContent);
    });

    video.src = URL.createObjectURL(mediaSource);
    

    Here’s a trick to get the size of a file: make an XHR with the HTTP HEAD method.  A response to a HEAD request will have the content-length header specifying the body size of the response, but unlike a GET, it does not actually have a body.  You can use this to preview the size of a file without actually requesting the file contents.  We can naively subdivide the video and fetch the next segment of video when we’re 80% of the way through playback of the current segment.  Here’s a demo of this in action and a look at the code.
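
    A minimal sketch of that trick (the URL is just an illustration):

    // Ask for a resource's size without downloading its body
    function getFileSize(url, callback) {
      var xhr = new XMLHttpRequest();
      xhr.open('HEAD', url);
      xhr.onload = function () {
        callback(Number(xhr.getResponseHeader('Content-Length')));
      };
      xhr.send();
    }

    getFileSize('/videos/bunny_segment_1.mp4', function (bytes) {
      console.log('segment size: ' + bytes + ' bytes');
    });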

    Note: You’ll need the latest Firefox Developer Edition browser to view the demo and test the code. More information below in the Compatibility section. The MSE primer from WebPlatform.org docs is another great resource to consult.

    My demo is a little naive and has a few issues:

    • It doesn’t show how to properly handle seeking during playback.
    • It assumes bandwidth is constant (always fetching the next segment at 80% playback of the previous segment), which it isn’t.
    • It starts off by loading only one segment (it might be better to fetch the first few, then wait to fetch the rest).
    • It doesn’t switch between segments of varying resolution, instead only fetching segments of one quality.
    • It doesn’t remove segments (part of the MSE API), although this can be helpful on memory constrained devices. Unfortunately, this requires you to re-fetch content when seeking backwards.

    These issues can all be solved with smarter logic on the client side with Dynamic Adaptive Streaming over HTTP (DASH).

    Compatibility

    Cross-browser codec support is a messy story right now; we can use MediaSource.isTypeSupported to detect codec support.  You pass isTypeSupported a string of the MIME type of the container you’re looking to play.  mp4 has the best compatibility currently. Apparently, for browsers that use the Blink rendering engine, MediaSource.isTypeSupported requires the full codec string to be specified.  To find this string, you can use Bento4’s mp4info utility:

    ./mp4info bunny.mp4 | grep Codec
        Codecs String: avc1.42E01E

    Then in our JavaScript:

    if (MediaSource.isTypeSupported('video/mp4; codecs="avc1.42E01E, mp4a.40.2"')) {
      // we can play this
    }

    — where mp4a.40.2 is the codec string for low complexity AAC, the typical audio codec used in an mp4 container.

    Some browsers also currently whitelist certain domains for testing MSE, or over-aggressively cache CORS content, which makes testing frustratingly difficult.  Consult your browser’s documentation for how to disable the whitelist or CORS caching when testing.

    DASH

    Using the MPD file we created earlier, we can grab a high quality DASH client implemented in JavaScript such as Shaka Player or dash.js.  Both clients implement numerous features, but could use more testing, as there are some subtle differences between media engines of various browsers.  Advanced clients like Shaka Player use an exponential moving average of three to ten samples to estimate the bandwidth, or even let you specify your own bandwidth estimator.

    If we serve our output directory created earlier with Cross Origin Resource Sharing (CORS) enabled, and point either DASH client to http://localhost:<port>/output/stream.mpd, we should be able to see our content playing.  Enabling video cycling in Shaka, or clicking the +/- buttons in dash.js should allow us to watch the content quality changing.  For more drastic/noticeable changes in quality, try encoding fewer bitrates than the five we demonstrated.

    Shaka Player in Firefox Developer Edition

    dash.js in Firefox Developer Edition

    In conclusion

    In this post, we looked at how to prep video assets for on-demand streaming by pre-processing and transcoding.  We also took a peek at the MSE API, and how to use more advanced DASH clients.  In an upcoming post, we’ll explore live content streaming using the MSE API, so keep an eye out.  I recommend you use Firefox Developer Edition to test out MSE; lots of hard work is going into our implementation.

    Here are some additional resources for exploring MSE:

  2. Trainspotting: Firefox 39

    Trainspotting is a series of articles highlighting features in the latest version of Firefox. A new version of Firefox is shipped every six weeks – we at Mozilla call this pattern “release trains.”

    A new version of Firefox is here, and with it come some great improvements and additions to the Web platform and developer tools. This post will call out a few highlights.

    For a full list of changes and additions, take a look at the Firefox 39 release notes.

    DevTools Love

    The Firefox Developer Tools are constantly getting better. We’re listening to developers on UserVoice, and using their feedback to make tools that are more powerful and easier to use. One requested feature was the ability to re-order elements in the Inspector:

    Editing and tweaking CSS Animations is easier than ever – Firefox 39 lets developers pause, restart, slow down, and preview new timings without having to switch applications.

    Menu of animation easing presets in the Inspector

    CSS Scroll Snap Points

    CSS Scroll Snap Points in action

    CSS Scroll Snap Points let web developers instruct the browser to smoothly snap element scrolling to specific points along an axis, creating smoother interfaces that are easier to interact with, in fewer lines of code.

    Improvements to Firefox on Mac OS X

    Firefox gets some Mac-specific improvements and updates in version 39:

    • Project Silk enabled – Improves scrolling and animation performance by more closely timing painting with hardware vsync. Read more about Project Silk.
    • Unicode 8.0 skin tone emoji – Fixed a bug in the rendering of skin tone modifiers for emoji.
    • Dashed line performance – Rendering of dotted and dashed lines is vastly improved. Check out the fixed bug for more information.

    Service Workers Progress

    Firefox’s implementation of the Service Workers API continues – fetch is enabled for workers and is now generally available to web content, and the Cache and CacheStorage APIs are now available behind a flag.
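
    For example, plain page scripts can now use fetch directly; here’s a minimal sketch (the URL is just an illustration):

    // fetch() returns a Promise that resolves with the Response
    fetch('/api/notes.json')
      .then(function (response) { return response.json(); })
      .then(function (notes) { console.log(notes); })
      .catch(function (error) { console.error('Request failed:', error); });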

    There are lots more changes and improvements in Firefox 39 – check out the Developer Release Notes for developer-oriented changes or the full list of bugs fixed in this release. Enjoy!

  3. Performance Testing Firefox OS With Raptor

    When we talk about performance for the Web, a number of familiar questions may come to mind:

    • Why does this page take so long to load?
    • How can I optimize my JavaScript to be faster?
    • If I make some changes to this code, will that make this app slower?

    I’ve been working on making these types of questions easier to answer for Gaia, the UI layer for Firefox OS, a completely web-centric mobile device OS. Writing performant web pages for the desktop has its own idiosyncrasies, and writing native applications using web technologies takes the challenge up an order of magnitude. I want to introduce the challenges I’ve faced in making performance an easier topic to tackle in Firefox OS, as well as document my solutions and expose holes in the Web’s APIs that need to be filled.

    From now on, I’ll refer to web pages, documents, and the like as applications, and while web “documents” typically don’t need the performance attention I’m going to give here, the same techniques could still apply.

    Fixing the lifecycle of applications

    A common question I get asked in regard to Firefox OS applications:

    How long did the app take to load?

    Tough question, as we can’t be sure we are speaking the same language. Based on UX and my own research at Mozilla, I’ve tried to adopt this definition for determining the time it takes an application to load:

    The amount of time it takes to load an application is measured from the moment a user initiates a request for the application to the moment the application appears ready for user interaction.

    On mobile devices, this is generally from the time the user taps on an icon to launch an app, until the app appears visually loaded; when it looks like a user can start interacting with the application. Some of this time is delegated to the OS to get the application to launch, which is outside the control of the application in question, but the bulk of the loading time should be within the app.

    So window load right?

    With SPAs (single-page applications), Ajax, script loaders, deferred execution, and friends, window load doesn’t hold much meaning anymore. If we could merely measure the time it takes to hit load, our work would be easy. Unfortunately, there is no way to infer the moment an application is visually loaded in a predictable way for everyone. Instead we rely on the apps to imply these moments for us.

    For Firefox OS, I helped develop a series of conventional moments that are relevant to almost every application for implying its loading lifecycle (also documented as a performance guideline on MDN):

    navigation loaded (navigationLoaded)

    The application designates that its core chrome or navigation interface exists in the DOM and has been marked as ready to be displayed, e.g. when the element is not display: none or any other functionality that would affect the visibility of the interface element.

    navigation interactive (navigationInteractive)

    The application designates that the core chrome or navigation interface has its events bound and is ready for user interaction.

    visually loaded (visuallyLoaded)

    The application designates that it is visually loaded, i.e., the “above-the-fold” content exists in the DOM and has been marked as ready to be displayed, again not display: none or other hiding functionality.

    content interactive (contentInteractive)

    The application designates that it has bound the events for the minimum set of functionality to allow the user to interact with “above-the-fold” content made available at visuallyLoaded.

    fully loaded (fullyLoaded)

    The application has been completely loaded, i.e., any relevant “below-the-fold” content and functionality have been injected into the DOM, and marked visible. The app is ready for user interaction. Any required startup background processing is complete and should exist in a stable state barring further user interaction.

    The important moment is visually loaded. This correlates directly with what the user perceives as “being ready.” As an added bonus, using the visuallyLoaded metric pairs nicely with camera-based performance verifications.

    Denoting moments

    With a clearly-defined application launch lifecycle, we can denote these moments with the User Timing API, available in Firefox OS starting with v2.2:

    window.performance.mark( string markName )
    

    Specifically during a startup:

    performance.mark('navigationLoaded');
    performance.mark('navigationInteractive');
    ...
    performance.mark('visuallyLoaded');
    ...
    performance.mark('contentInteractive');
    performance.mark('fullyLoaded');
    

    You can even use the measure() method to create a measurement between start and another mark, or even 2 other marks:

    // Denote point of user interaction
    performance.mark('tapOnButton');
    
    loadUI();
    
    // Capture the time from tapOnButton until now as a measure named sectionLoaded
    performance.measure('sectionLoaded', 'tapOnButton');
    

    Fetching these performance metrics is pretty straightforward with getEntries, getEntriesByName, or getEntriesByType, which fetch a collection of the entries. The purpose of this article isn’t to cover the usage of User Timing though, so I’ll move on.
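
    For reference, reading an entry back out of the timeline is short enough to show here (a minimal sketch):

    // Fetch all marks, or a specific mark by name
    var marks = performance.getEntriesByType('mark');
    var visuallyLoaded = performance.getEntriesByName('visuallyLoaded')[0];
    console.log(visuallyLoaded.startTime); // high-resolution time since the page's origin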

    Armed with the moment an application is visually loaded, we know how long it took the application to load because we can just compare it to—oh, wait, no. We don’t know the moment of user intent to launch.

    While desktop sites may be able to easily procure the moment at which a request was initiated, doing this on Firefox OS isn’t as simple. In order to launch an application, a user will typically tap an icon on the Homescreen. The Homescreen lives in a process separate from the app being launched, and we can’t communicate performance marks between them.

    Solving problems with Raptor

    Without the APIs or interaction mechanisms available in the platform to overcome this and other difficulties, we’ve built tools to help. This is how the Raptor performance testing tool originated. With it, we can gather metrics from Gaia applications and answer the performance questions we have.

    Raptor was built with a few goals in mind:

    • Performance test Firefox OS without affecting performance. We shouldn’t need polyfills, test code, or hackery to get realistic performance metrics.
    • Utilize web APIs as much as possible, filling in gaps through other means as necessary.
    • Stay flexible enough to cater to the many different architectural styles of applications.
    • Be extensible for performance testing scenarios outside the norm.
    Problem: Determining moment of user intent to launch

    Given two independent applications — Homescreen and any other installed application — how can we create a performance marker in one and compare it in another? Even if we could send our performance mark from one app to another, they are incomparable. According to High-Resolution Time, the values produced would be monotonically increasing numbers from the moment of the page’s origin, which is different in each page context. These values represent the amount of time passed from one moment to another, and not to an absolute moment.

    The first breakdown in existing performance APIs is that there’s no way to associate a performance mark in one app with any other app. Raptor takes a simplistic approach: log parsing.

    Yes, you read that correctly. Every time Gecko receives a performance mark, it logs a message (i.e., to adb logcat) and Raptor streams and parses the log looking for these log markers. A typical log entry looks something like this (we will decipher it later):

    I/PerformanceTiming( 6118): Performance Entry: clock.gaiamobile.org|mark|visuallyLoaded|1074.739956|0.000000|1434771805380
    

    The important thing to notice in this log entry is its origin: clock.gaiamobile.org, or the Clock app; here the Clock app created its visually loaded marker. In the case of the Homescreen, we want to create a marker that is intended for a different context altogether. This is going to need some additional metadata to go along with the marker, but unfortunately the User Timing API does not yet have that ability. In Gaia, we have adopted an @ convention to override the context of a marker. Let’s use it to mark the moment of app launch as determined by the user’s first tap on the icon:

    performance.mark('appLaunch@' + appOrigin)
    

    Launching the Clock from the Homescreen and dispatching this marker, we get the following log entry:

    I/PerformanceTiming( 5582): Performance Entry: verticalhome.gaiamobile.org|mark|appLaunch@clock.gaiamobile.org|80081.169720|0.000000|1434771804212
    

    With Raptor we change the context of the marker if we see this @ convention.
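
    A rough illustration of that remapping (this is a sketch of the idea, not Raptor’s actual implementation):

    // Illustrative only: re-map a mark's context using the '@' convention
    function resolveContext(entryName, baseContext) {
      var at = entryName.indexOf('@');
      if (at === -1) {
        return { name: entryName, context: baseContext };
      }
      return {
        name: entryName.slice(0, at),      // e.g. 'appLaunch'
        context: entryName.slice(at + 1)   // e.g. 'clock.gaiamobile.org'
      };
    }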

    Problem: Incomparable numbers

    The second breakdown in existing performance APIs deals with the incomparability of performance marks across processes. Using performance.mark() in two separate apps will not produce meaningful numbers that can be compared to determine a length of time, because their values do not share a common absolute time reference point. Fortunately there is an absolute time reference that all JS can access: the Unix epoch.

    Date.now(), at any given moment, returns the number of milliseconds that have elapsed since January 1st, 1970. Raptor had to make an important trade-off: abandon the precision of high-resolution time for the comparability of the Unix epoch. Looking at the previous log entry, let’s break down its output. Notice the correlation of certain pieces to their User Timing counterparts:

    • Log level and tag: I/PerformanceTiming
    • Process ID: 5582
    • Base context: verticalhome.gaiamobile.org
    • Entry type: mark, but could be measure
    • Entry name: appLaunch@clock.gaiamobile.org, the @ convention overriding the mark’s context
    • Start time: 80081.169720
    • Duration: 0.000000, this is a mark, not a measure
    • Epoch: 1434771804212

    For every performance mark and measure, Gecko also captures the epoch of the mark, and we can use this to compare times from across processes.
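
    Using the two log entries shown earlier, the Clock app’s cold-launch time falls out of simple epoch arithmetic:

    // Epochs taken from the two log entries above
    var appLaunchEpoch = 1434771804212;      // Homescreen: appLaunch@clock.gaiamobile.org
    var visuallyLoadedEpoch = 1434771805380; // Clock: visuallyLoaded
    var coldLaunchMs = visuallyLoadedEpoch - appLaunchEpoch; // 1168 ms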

    Pros and Cons

    Everything is a game of tradeoffs, and performance testing with Raptor is no exception:

    • We trade high-resolution times for millisecond resolution in order to compare numbers across processes.
    • We trade JavaScript APIs for log parsing so we can access data without injecting custom logic into every application, which would affect app performance.
    • We currently trade a high-level interaction API, Marionette, for low-level interactions using Orangutan behind the scenes. While this provides us with transparent events for the platform, it also makes writing rich tests difficult. There are plans to improve this in the future by adding Marionette integration.

    Why log parsing

    You may be a person that believes log parsing is evil, and to a certain extent I would agree with you. While I do wish for every solution to be solvable using a performance API, unfortunately this doesn’t exist yet. This is yet another reason why projects like Firefox OS are important for pushing the Web forward: we find use cases which are not yet fully implemented for the Web, poke holes to discover what’s missing, and ultimately improve APIs for everyone by pushing to fill these gaps with standards. Log parsing is Raptor’s stop-gap until the Web catches up.

    Raptor workflow

    Raptor is a Node.js module built into the Gaia project that enables the project to do performance tests against a device or emulator. Once you have the project dependencies installed, running performance tests from the Gaia directory is straightforward:

    1. Install the Raptor profile on the device; this configures various settings to assist with performance testing. Note: this is a different profile that will reset Gaia, so keep that in mind if you have particular settings stored.
      make raptor
    2. Choose a test to run. Currently, tests are stored in tests/raptor in the Gaia tree, so some manual discovery is needed. There are plans to improve the command-line API soon.
    3. Run the test. For example, you can performance test the cold launch of the Clock app using the following command, specifying the number of runs to launch it:
      APP=clock RUNS=5 node tests/raptor/launch_test
    4. Observe the console output. At the end of the test, you will be given a table of test results with some statistics about the performance runs completed. Example:
    [Cold Launch: Clock Results] Results for clock.gaiamobile.org
    
    Metric                            Mean     Median   Min      Max      StdDev  p95
    --------------------------------  -------  -------  -------  -------  ------  -------
    coldlaunch.navigationLoaded       214.100  212.000  176.000  269.000  19.693  247.000
    coldlaunch.navigationInteractive  245.433  242.000  216.000  310.000  19.944  274.000
    coldlaunch.visuallyLoaded         798.433  810.500  674.000  967.000  71.869  922.000
    coldlaunch.contentInteractive     798.733  810.500  675.000  967.000  71.730  922.000
    coldlaunch.fullyLoaded            802.133  813.500  682.000  969.000  72.036  928.000
    coldlaunch.rss                    10.850   10.800   10.600   11.300   0.180   11.200
    coldlaunch.uss                    0.000    0.000    0.000    0.000    0.000   n/a
    coldlaunch.pss                    6.190    6.200    5.900    6.400    0.114   6.300
    

    Visualizing Performance

    Access to raw performance data is helpful for a quick look at how long something takes, or to determine if a change you made causes a number to increase, but it’s not very helpful for monitoring changes over time. Raptor has two methods for visualizing performance data over time, in order to improve performance.

    Official metrics

    At raptor.mozilla.org, we have dashboards for persisting the values of performance metrics over time. In our automation infrastructure, we execute performance tests against devices for every new build generated by mozilla-central or b2g-inbound (Note: The source of builds could change in the future.) Right now this is limited to Flame devices running at 319MB of memory, but there are plans to expand to different memory configurations and additional device types in the very near future. When automation receives a new build, we run our battery of performance tests against the devices, capturing numbers such as application launch time and memory at fullyLoaded, reboot duration, and power current. These numbers are stored and visualized many times per day, varying based on the commits for the day.

    Looking at these graphs, you can drill down into specific apps, focus or expand your time query, and do advanced query manipulation to gain insight into performance. Watching trends over time, you can even pick out regressions that have sneaked into Firefox OS.

    Local visualization

    The very same visualization tool and backend used by raptor.mozilla.org is also available as a Docker image. After running the local Raptor tests, data will report to your own visualization dashboard based on those local metrics. There are some additional prerequisites for local visualization, so be sure to read the Raptor docs on MDN to get started.

    Performance regressions

    Building pretty graphs that display metrics is all well and fine, but finding trends in data or signal within noise can be difficult. Graphs help us understand data and make it accessible for others to easily communicate around the topic, but using graphs for finding regressions in performance is reactive; we should be proactive about keeping things fast.

    Regression hunting on CI

    Rob Wood has been doing incredible work in our pre-commit continuous integration efforts surrounding the detection of performance regressions in prospective commits. With every pull request to the Gaia repository, our automation runs the Raptor performance tests against the target branch with and without the patch applied. After a certain number of iterations for statistical accuracy, we have the ability to reject patches from landing in Gaia if a regression is too severe. For scalability purposes we use emulators to run these tests, so there are inherent drawbacks such as greater variability in the metrics reported. This variability limits the precision with which we can detect regressions.

    Regression hunting in automation

    Luckily we have post-commit automation in place to run performance tests against real devices; this is where the dashboards receive their data from. Based on the excellent Python tool from Will Lachance, we query our historical data daily, attempting to discover any smaller regressions that could have crept into Firefox OS in the previous seven days. Any performance anomalies found are promptly reported to Bugzilla, and relevant bug component watchers are notified.

    Recap and next steps

    Raptor, combined with User Timing, has given us the know-how to ask questions about the performance of Gaia and receive accurate answers. In the future, we plan on improving the API of the tool and adding higher-level interactions. Raptor should also be able to work more seamlessly with third-party applications, something that is not easily done right now.

    Raptor has been an exciting tool to build, while at the same time helping us drive the Web forward in the realm of performance. We plan on using it to keep Firefox OS fast, and to stay proactive about protecting Gaia performance.

  4. ES6 In Depth: Collections

    ES6 In Depth is a series on new features being added to the JavaScript programming language in the 6th Edition of the ECMAScript standard, ES6 for short.

    Earlier this week, the ES6 specification, officially titled ECMA-262, 6th Edition, ECMAScript 2015 Language Specification, cleared the final hurdle and was approved as an Ecma standard. Congratulations to TC39 and everyone who contributed. ES6 is in the books!

    Even better news: it will not be six more years before the next update. The standard committee now aims to produce a new edition roughly every 12 months. Proposals for the 7th Edition are already in development.

    It is appropriate, then, to celebrate this occasion by talking about something I’ve been eager to see in JS for a long time—and which I think still has some room for future improvement!

    Hard cases for coevolution

    JS isn’t quite like other programming languages, and sometimes this influences the evolution of the language in surprising ways.

    ES6 modules are a good example. Other languages have module systems. Racket has a great one. Python too. When the standard committee decided to add modules to ES6, why didn’t they just copy an existing system?

    JS is different, because it runs in web browsers. I/O can take a long time. Therefore JS needs a module system that can support loading code asynchronously. It can’t afford to serially search for modules in multiple directories, either. Copying existing systems was no good. The ES6 module system would need to do some new things.

    How this influenced the final design is an interesting story. But we’re not here to talk about modules.

    This post is about what the ES6 standard calls “keyed collections”: Set, Map, WeakSet, and WeakMap. These features are, in most respects, just like the hash tables in other languages. But the standard committee made some interesting tradeoffs along the way, because JS is different.

    Why collections?

    Anyone familiar with JS knows that there’s already something like a hash table built into the language: objects.

    A plain Object, after all, is pretty much nothing but an open-ended collection of key-value pairs. You can get, set, and delete properties, iterate over them—all the things a hash table can do. So why add a new feature at all?

    Well, many programs do use plain objects to store key-value pairs, and for programs where this works well, there is no particular reason to switch to Map or Set. Still, there are some well-known issues with using objects this way:

    • Objects being used as lookup tables can’t also have methods, without some risk of collision.

    • Therefore programs must either use Object.create(null) (rather than plain {}) or exercise care to avoid misinterpreting builtin methods (like Object.prototype.toString) as data.

    • Property keys are always strings (or, in ES6, symbols). Objects can’t be keys.

    • There’s no efficient way to ask how many properties an object has.

    ES6 adds a new concern: plain objects are not iterable, so they will not cooperate with the for-of loop, the ... operator, and so on.

    Again, there are plenty of programs where none of that really matters, and a plain object will continue to be the right choice. Map and Set are for the other cases.

    Because they are designed to avoid collisions between user data and builtin methods, the ES6 collections do not expose their data as properties. This means that expressions like obj.key or obj[key] cannot be used to access hash table data. You’ll have to write map.get(key). Also, hash table entries, unlike properties, are not inherited via the prototype chain.

    The upside is that, unlike plain Objects, Map and Set do have methods, and more methods can be added, either in the standard or in your own subclasses, without conflict.

    Set

    A Set is a collection of values. It’s mutable, so your program can add and remove values as it goes. So far, this is just like an array. But there are as many differences between sets and arrays as there are similarities.

    First, unlike an array, a set never contains the same value twice. If you try to add a value to a set that’s already in there, nothing happens.

    > var desserts = new Set("🍪🍦🍧🍩");
    > desserts.size
        4
    > desserts.add("🍪");
        Set [ "🍪", "🍦", "🍧", "🍩" ]
    > desserts.size
        4
    

    This example uses strings, but a Set can contain any type of JS value. Just as with strings, adding the same object or number more than once has no added effect.

    Second, a Set keeps its data organized to make one particular operation fast: membership testing.

    > // Check whether "zythum" is a word.
    > arrayOfWords.indexOf("zythum") !== -1  // slow
        true
    > setOfWords.has("zythum")               // fast
        true
    

    What you don’t get with a Set is indexing:

    > arrayOfWords[15000]
        "anapanapa"
    > setOfWords[15000]   // sets don't support indexing
        undefined
    

    Here are all the operations on sets:

    • new Set creates a new, empty set.

    • new Set(iterable) makes a new set and fills it with data from any iterable value.

    • set.size gets the number of values in the set.

    • set.has(value) returns true if the set contains the given value.

    • set.add(value) adds a value to the set. If the value was already in the set, nothing happens.

    • set.delete(value) removes a value from the set. If the value wasn’t in the set, nothing happens. Both .add() and .delete() return the set object itself, so you can chain them.

    • set[Symbol.iterator]() returns a new iterator over the values in the set. You won’t normally call this directly, but this method is what makes sets iterable. It means you can write for (v of set) {...} and so on.

    • set.forEach(f) is easiest to explain with code. It’s like shorthand for:

      for (let value of set)
          f(value, value, set);
      

      This method is analogous to the .forEach() method on arrays.

    • set.clear() removes all values from the set.

    • set.keys(), set.values(), and set.entries() return various iterators. These are provided for compatibility with Map, so we’ll talk about them below.

    Of all these features, the constructor new Set(iterable) stands out as a powerhouse, because it operates at the level of whole data structures. You can use it to convert an array to a set, eliminating duplicate values with a single line of code. Or, pass it a generator: it will run the generator to completion and collect the yielded values into a set. This constructor is also how you copy an existing Set.
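
    For example, deduplicating an array is a one-liner:

    // Round-tripping an array through a Set removes duplicate values
    var unique = [...new Set([1, 2, 2, 3, 3, 3])];  // [1, 2, 3]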

    I promised last week to complain about the new collections in ES6. I’ll start here. As nice as Set is, there are some missing methods that would make nice additions to a future standard:

    • Functional helpers that are already present on arrays, like .map(), .filter(), .some(), and .every().

    • Non-mutating set1.union(set2) and set1.intersection(set2).

    • Methods that can operate on many values at once: set.addAll(iterable), set.removeAll(iterable), and set.hasAll(iterable).

    The good news is that all of these can be implemented efficiently using the methods provided by ES6.
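
    For instance, non-mutating union and intersection can be written in a few lines on top of the ES6 primitives (one possible sketch):

    // Union: every value in a, plus every value in b
    function union(a, b) {
      var result = new Set(a);
      for (let value of b) {
        result.add(value);
      }
      return result;
    }

    // Intersection: only the values present in both sets
    function intersection(a, b) {
      return new Set([...a].filter(value => b.has(value)));
    }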

    Map

    A Map is a collection of key-value pairs. Here’s what Map can do (a short usage sketch follows the list):

    • new Map returns a new, empty map.

    • new Map(pairs) creates a new map and fills it with data from an existing collection of [key, value] pairs. pairs can be an existing Map object, an array of two-element arrays, a generator that yields two-element arrays, etc.

    • map.size gets the number of entries in the map.

    • map.has(key) tests whether a key is present (like key in obj).

    • map.get(key) gets the value associated with a key, or undefined if there is no such entry (like obj[key]).

    • map.set(key, value) adds an entry to the map associating key with value, overwriting any existing entry with the same key (like obj[key] = value).

    • map.delete(key) deletes an entry (like delete obj[key]).

    • map.clear() removes all entries from the map.

    • map[Symbol.iterator]() returns an iterator over the entries in the map. The iterator represents each entry as a new [key, value] array.

    • map.forEach(f) works like this:

      for (let [key, value] of map)
        f(value, key, map);
      

      The odd argument order is, again, by analogy to Array.prototype.forEach().

    • map.keys() returns an iterator over all the keys in the map.

    • map.values() returns an iterator over all the values in the map.

    • map.entries() returns an iterator over all the entries in the map, just like map[Symbol.iterator](). In fact, it’s just another name for the same method.
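
    Putting a few of these together, here’s a quick sketch of a Map built from pairs and iterated with destructuring:

    var permissions = new Map([
      ['alice', 'admin'],
      ['bob', 'viewer']
    ]);

    permissions.get('alice');           // "admin"

    for (let [user, role] of permissions) {
      console.log(`${user}: ${role}`);  // entries come back in insertion order
    }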

    What is there to complain about? Here are some features not present in ES6 that I think would be useful:

    • A facility for default values, like Python’s collections.defaultdict.

    • A helper function, Map.fromObject(obj), to make it easy to write maps using object-literal syntax.

    Again, these features are easy to add.
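
    For instance, the fromObject helper described above might look like this (a sketch; the name comes from the wish list, not the standard):

    // Build a Map from an object literal's own enumerable string keys
    function mapFromObject(obj) {
      return new Map(Object.keys(obj).map(key => [key, obj[key]]));
    }

    var config = mapFromObject({width: 640, height: 480});
    config.get('width');  // 640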

    OK. Remember how I started this article with a bit about how unique concerns about running in the browser affect the design of JS language features? This is where we start to talk about that. I’ve got three examples. Here are the first two.

    JS is different, part 1: Hash tables without hash codes?

    There’s one useful feature that the ES6 collection classes do not support at all, as far as I can tell.

    Suppose we have a Set of URL objects.

    var urls = new Set;
    urls.add(new URL(location.href));  // two URL objects.
    urls.add(new URL(location.href));  // are they the same?
    alert(urls.size);  // 2
    

    These two URLs really ought to be considered equal. They have all the same fields. But in JavaScript, these two objects are distinct, and there is no way to overload the language’s notion of equality.

    Other languages support this. In Java, Python, and Ruby, individual classes can overload equality. In many Scheme implementations, individual hash tables can be created that use different equality relations. C++ supports both.

    However, all of these mechanisms require users to implement custom hashing functions and all expose the system’s default hashing function. The committee chose not to expose hash codes in JS—at least, not yet—due to open questions about interoperability and security, concerns that are not as pressing in other languages.

    JS is different, part 2: Surprise! Predictability!

    You would think that deterministic behavior from a computer could hardly be surprising. But people are often surprised when I tell them that Map and Set iteration visits entries in the order they were inserted into the collection. It’s deterministic.

    We’re used to certain aspects of hash tables being arbitrary. We’ve learned to accept it. But there are good reasons to try to avoid arbitrariness. As I wrote in 2012:

    • There is evidence that some programmers find arbitrary iteration order surprising or confusing at first. [1][2][3][4][5][6]
    • Property enumeration order is unspecified in ECMAScript, yet all the major implementations have been forced to converge on insertion order, for compatibility with the Web as it is. There is, therefore, some concern that if TC39 does not specify a deterministic iteration order, “the web will just go and specify it for us”.[7]
    • Hash table iteration order can expose some bits of object hash codes. This imposes some astonishing security concerns on the hashing function implementer. For example, an object’s address must not be recoverable from the exposed bits of its hash code. (Revealing object addresses to untrusted ECMAScript code, while not exploitable by itself, would be a bad security bug on the Web.)

    When all this was being discussed back in February 2012, I argued in favor of arbitrary iteration order. Then I set out to show by experiment that keeping track of insertion order would make a hash table too slow. I wrote a handful of C++ microbenchmarks. The results surprised me.

    And that’s how we ended up with hash tables that track insertion order in JS!

    Strong reasons to use weak collections

    Last week, we discussed an example involving a JS animation library. We wanted to store a boolean flag for every DOM object, like this:

    if (element.isMoving) {
      smoothAnimations(element);
    }
    element.isMoving = true;
    

    Unfortunately, setting an expando property on a DOM object like this is a bad idea, for reasons discussed in the original post.

    That post showed how to solve this problem using symbols. But couldn’t we do the same thing using a Set? It might look like this:

    if (movingSet.has(element)) {
      smoothAnimations(element);
    }
    movingSet.add(element);
    

    There is only one drawback: Map and Set objects keep a strong reference to every key and value they contain. This means that if a DOM element is removed from the document and dropped, garbage collection can’t recover that memory until that element is removed from movingSet as well. Libraries typically have mixed success, at best, in imposing complex clean-up-after-yourself requirements on their users. So this could lead to memory leaks.

    ES6 offers a surprising fix for this. Make movingSet a WeakSet rather than a Set. Memory leak solved!

    This means it is possible to solve this particular problem using either a weak collection or symbols. Which is better? A full discussion of the tradeoffs would, unfortunately, make this post a little too long. If you can use a single symbol across the whole lifetime of the web page, that’s probably fine. If you end up wanting many short-lived symbols, that’s a danger sign: consider using WeakMaps instead to avoid leaking memory.

    WeakMap and WeakSet

    WeakMap and WeakSet are specified to behave exactly like Map and Set, but with a few restrictions:

    • WeakMap supports only new, .has(), .get(), .set(), and .delete().

    • WeakSet supports only new, .has(), .add(), and .delete().

    • The values stored in a WeakSet and the keys stored in a WeakMap must be objects.

    Note that neither type of weak collection is iterable. You can’t get entries out of a weak collection except by asking for them specifically, passing in the key you’re interested in.

    These carefully crafted restrictions enable the garbage collector to collect dead objects out of live weak collections. The effect is similar to what you could get with weak references or weak-keyed dictionaries, but ES6 weak collections get the memory management benefits without exposing the fact that GC happened to scripts.
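
    For example, a WeakMap works well as a side table for per-element data (a minimal sketch; `element` is the DOM node from the animation example above):

    // Associate private data with DOM nodes without preventing their collection
    let metadata = new WeakMap();
    metadata.set(element, { isMoving: true });
    metadata.get(element);  // { isMoving: true }
    // Once `element` is unreachable, the GC can reclaim both it and its entry.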

    JS is different, part 3: Hiding GC nondeterminism

    Behind the scenes, the weak collections are implemented as ephemeron tables.

    In short, a WeakSet does not keep a strong reference to the objects it contains. When an object in a WeakSet is collected, it is simply removed from the WeakSet. WeakMap is similar. It does not keep a strong reference to any of its keys. If a key is alive, the associated value is alive.

    Why accept these restrictions? Why not just add weak references to JS?

    Again, the standard committee has been very reluctant to expose nondeterministic behavior to scripts. Poor cross-browser compatibility is the bane of Web development. Weak references expose implementation details of the underlying garbage collector—the very definition of platform-specific arbitrary behavior. Of course applications shouldn’t depend on platform-specific details, but weak references also make it very hard to know just how much you’re depending on the GC behavior in the browser you’re currently testing. They’re hard to reason about.

    By contrast, the ES6 weak collections have a more limited feature set, but that feature set is rock solid. The fact that a key or value has been collected is never observable, so applications can’t end up depending on it, even by accident.

    This is one case where a Web-specific concern has led to a surprising design decision that makes JS a better language.

    When can I use collections in my code?

    All four collection classes are currently shipping in Firefox, Chrome, Microsoft Edge, and Safari. To support older browsers, use a polyfill, like es6-collections.

    WeakMap was first implemented in Firefox by Andreas Gal, who went on to a stint as Mozilla’s CTO. Tom Schuster implemented WeakSet. I implemented Map and Set. Thanks to Tooru Fujisawa for contributing several patches in this area.

    Next week, ES6 In Depth starts a two-week summer break. This series has covered a lot of ground, but some of ES6’s most powerful features are yet to come. So please join us when we return with new content on July 9.

  5. ES6 In Depth: Using ES6 today with Babel and Broccoli

    ES6 In Depth is a series on new features being added to the JavaScript programming language in the 6th Edition of the ECMAScript standard, ES6 for short.

    ES6 is here, and people are already talking about ES7, what the future holds, and what shiny features a new standard can offer. As web developers, we wonder how we can make use of it all. More than once, in previous ES6 In Depth posts, we’ve encouraged you to start coding in ES6, with a little help from some interesting tools. We’ve teased you with the possibility:

    If you’d like to use this new syntax on the Web, you can use Babel or Google’s Traceur to translate your ES6 code to web-friendly ES5.

    Today we’re going to show you step-by-step how it is done. The above-mentioned tools are called transpilers. A transpiler is also known as a source-to-source compiler—a compiler that translates between programming languages operating at comparable levels of abstraction. Transpilers let us write code using ES6 while also guaranteeing that we’ll be able to execute the code in every browser.

    Transpilation our salvation

    A transpiler is very easy to use. You can describe what it does in only two steps:

    1. We write code with ES6 syntax.

    let q = 99;
    let myVariable = `${q} bottles of beer on the wall, ${q} bottles of beer.`;
    

    2. We use the code above as input for the transpiler, which will process it and produce the following output:

    "use strict";
    
    var q = 99;
    var myVariable = "" + q + " bottles of beer on the wall, " + q + " bottles of beer."
    

    This is the good old JavaScript we know. It can be used in any browser.

    The internals of how a transpiler goes from input to output are highly complex and fall out of scope for this article. Just as we can drive a car without knowing all the internal engine mechanics, today we’ll leave the transpiler as a black box that is able to process our code.

    Babel in action

    There are a couple of different ways to use Babel in a project. There is a command line tool, which you can use with commands of the form:

    babel script.js --out-file script-compiled.js
    

    A browser-ready version is also available. You can include Babel as a regular JS library and then you can place your ES6 code in script tags with the type "text/babel".

    <script src="node_modules/babel-core/browser.js"></script>
    <script type="text/babel">
    // Your ES6 code
    </script>
    

    These methods do not scale when your code base starts to grow and you start splitting everything into multiple files and folders. At that moment, you’ll need a build tool and a way to integrate Babel with a build pipeline.

    In the following sections, we’ll integrate Babel into a build tool, Broccoli.js, and we’ll write and execute our first lines of ES6 through a couple of examples. In case you run into trouble, you can review the complete source code here: broccoli-babel-examples. Inside the repository you’ll find three sample projects:

    1. es6-fruits
    2. es6-website
    3. es6-modules

    Each one builds on the previous example. We start with the bare minimum and progress to a general solution, which can be used as the starting point of an ambitious project. In this post, we’ll cover the first two examples in detail. After we are done, you’ll be able to read and understand the code in the third example on your own.

    If you are thinking —I’ll just wait for browsers to support the new features— you’ll be left behind. Full compliance, if it ever happens, will take a long time. Transpilers are here to stay; new ECMAScript standards are planned to be released yearly. So, we’ll continue to see new standards released more often than uniform browser platforms. Hop in now and take advantage of the new features.

    Our first Broccoli & Babel project

    Broccoli is a tool designed to build projects as quickly as possible. You can uglify and minify files, among many other things, through the use of Broccoli plugins. It saves us the burden of handling files, directories, and executing commands each time we introduce changes to a project. Think of it as:

    Comparable to the Rails asset pipeline in scope, though it runs on Node and is backend-agnostic.

    Project setup

    Node

    As you might have guessed, you’ll have to install Node 0.11 or later.

    If you are on a unix system, avoid installing from the package manager (apt, yum), so that you don’t need root privileges during installation. It’s best to manually install the binaries, provided at the previous link, with your current user. You can read why using root is not recommended in Do not sudo npm, where you’ll also find other installation alternatives.

    Broccoli

    We’ll set up our Broccoli project first with:

    mkdir es6-fruits
    cd es6-fruits
    npm init
    # Create an empty file called Brocfile.js
    touch Brocfile.js
    

    Now we install broccoli and broccoli-cli:

    # the broccoli library
    npm install --save-dev broccoli
    # command line tool
    npm install -g broccoli-cli
    

    Write some ES6

    We’ll create a src folder and inside we’ll put a fruits.js file.

    mkdir src
    vim src/fruits.js
    

    In our new file we’ll write a small script using ES6 syntax.

    let fruits = [
      {id: 100, name: 'strawberry'},
      {id: 101, name: 'grapefruit'},
      {id: 102, name: 'plum'}
    ];
    
    for (let fruit of fruits) {
      let message = `ID: ${fruit.id} Name: ${fruit.name}`;
    
      console.log(message);
    }
    
    console.log(`List total: ${fruits.length}`);
    

    The code sample above makes use of three ES6 features:

    1. let for local scope declarations (to be discussed in an upcoming blog post)
    2. for-of loops
    3. template strings

    Save the file and try to execute it.

    node src/fruits.js
    

    It won’t work yet, but we are about to make it executable by Node and any browser.

    let fruits = [
        ^^^^^^
    SyntaxError: Unexpected identifier
    

    Transpilation time

    Now we’ll use Broccoli to load our code and push it through Babel. We’ll edit the file Brocfile.js and add this code to it:

    // import the babel plugin
    var babel = require('broccoli-babel-transpiler');
    
    // grab the source and transpile it in 1 step
    var fruits = babel('src'); // src/*.js
    
    module.exports = fruits;
    

    Notice that we require broccoli-babel-transpiler, a Broccoli plugin that wraps around the Babel library, so we must install it with:

    npm install --save-dev broccoli-babel-transpiler
    

    Now we can build our project and execute our script with:

    broccoli build dist # compile
    node dist/fruits.js # execute ES5
    

    The output should look like this:

    ID: 100 Name: strawberry
    ID: 101 Name: grapefruit
    ID: 102 Name: plum
    List total: 3
    

    That was easy! You can open dist/fruits.js to see what the transpiled code looks like. A nice feature of the Babel transpiler is that it produces readable code.

    Writing ES6 code for a website

    For our second example we’ll take it up a notch. First, exit the es6-fruits folder and create a new directory es6-website using the steps listed under Project setup above.

    In the src folder we’ll create three files:

    src/index.html

    <!DOCTYPE html>
    <html>
      <head>
        <title>ES6 Today</title>
        <style>
        body {
          border: 2px solid #9a9a9a;
          border-radius: 10px;
          padding: 6px;
          font-family: monospace;
          text-align: center;
        }
        .color {
          padding: 1rem;
          color: #fff;
        }
        </style>
      </head>
      <body>
        <h1>ES6 Today</h1>
        <div id="info"></div>
        <hr>
        <div id="content"></div>
    
        <script src="//code.jquery.com/jquery-2.1.4.min.js"></script>
        <script src="js/my-app.js"></script>
      </body>
    </html>
    

    src/print-info.js

    function printInfo() {
      $('#info')
      .append('<p>minimal website example with ' +
              'Broccoli and Babel</p>');
    }
    
    $(printInfo);
    

    src/print-colors.js

    // ES6 Generator
    function* hexRange(start, stop, step) {
      for (var i = start; i < stop; i += step) {
        yield i;
      }
    }
    
    function printColors() {
      var content$ = $('#content');
    
      // contrived example
      for ( var hex of hexRange(900, 999, 10) ) {
        var newDiv = $('<div>')
          .attr('class', 'color')
          .css({ 'background-color': `#${hex}` })
          .append(`hex code: #${hex}`);
        content$.append(newDiv);
      }
    }
    
    $(printColors);
    

    You might have noticed this bit: function* hexRange — yes, that’s an ES6 generator. This feature is not currently supported in all browsers. To be able to use it, we’ll need a polyfill. Babel provides this and we’ll put it to use very soon.

    The next step is to merge all the JS files and use them within a website. The hardest part is writing our Brocfile. This time we install 4 plugins:

    npm install --save-dev broccoli-babel-transpiler
    npm install --save-dev broccoli-funnel
    npm install --save-dev broccoli-concat
    npm install --save-dev broccoli-merge-trees
    

    Let’s put them to use:

    // Babel transpiler
    var babel = require('broccoli-babel-transpiler');
    // filter trees (subsets of files)
    var funnel = require('broccoli-funnel');
    // concatenate trees
    var concat = require('broccoli-concat');
    // merge trees
    var mergeTrees = require('broccoli-merge-trees');
    
    // Transpile the source files
    var appJs = babel('src');
    
    // Grab the polyfill file provided by the Babel library
    var babelPath = require.resolve('broccoli-babel-transpiler');
    babelPath = babelPath.replace(/\/index.js$/, '');
    babelPath += '/node_modules/babel-core';
    var browserPolyfill = funnel(babelPath, {
      files: ['browser-polyfill.js']
    });
    
    // Add the Babel polyfill to the tree of transpiled files
    appJs = mergeTrees([browserPolyfill, appJs]);
    
    // Concatenate all the JS files into a single file
    appJs = concat(appJs, {
      // we specify a concatenation order
      inputFiles: ['browser-polyfill.js', '**/*.js'],
      outputFile: '/js/my-app.js'
    });
    
    // Grab the index file
    var index = funnel('src', {files: ['index.html']});
    
    // Grab all our trees and
    // export them as a single and final tree
    module.exports = mergeTrees([index, appJs]);
    

    Time to build and execute our code.

    broccoli build dist
    

    This time you should see the following structure in the dist folder:

    $> tree dist/
    dist/
    ├── index.html
    └── js
        └── my-app.js
    

    That is a static website you can serve with any server to verify that the code is working. For instance:

    cd dist/
    python -m SimpleHTTPServer
    # visit http://localhost:8000/
    

    You should see this:

    simple ES6 website

    More fun with Babel and Broccoli

    The second example above gives an idea of how much we can accomplish with Babel. It might be enough to keep you going for a while. If you want to do more with ES6, Babel, and Broccoli, you should check out this repository: broccoli-babel-boilerplate. It is also a Broccoli+Babel setup that takes it up at least two notches. This boilerplate handles modules, imports, and unit testing.

    You can try an example of that configuration in action here: es6-modules. All the magic is in the Brocfile and it’s very similar to what we have done already.


    As you can see, Babel and Broccoli really do make it quite practical to use ES6 features in web sites right now. Thanks to Gastón I. Silva for contributing this week’s post!

    Next week, ES6 In Depth starts a two-week summer break. This series has covered a lot of ground, but some of ES6’s most powerful features are yet to come. So please join us when we return with new content on July 9.

    Jason Orendorff

    ES6 In Depth Editor

  6. ES6 In Depth: Symbols

    ES6 In Depth is a series on new features being added to the JavaScript programming language in the 6th Edition of the ECMAScript standard, ES6 for short.

    Note: There is now a Vietnamese translation of this post, created by Julia Duong of the Coupofy team.

    What are ES6 symbols?

    Symbols are not logos.

    They’re not little pictures you can use in your code.

    let 😻 = 😺 × 😍;  // SyntaxError
    

    They’re not a literary device that stands for something else.

    They’re definitely not the same thing as cymbals.

    (It is not a good idea to use cymbals in programming. They have a tendency to crash.)

    So, what are symbols?

    The seventh type

    Since JavaScript was first standardized in 1997, there have been six types. Until ES6, every value in a JS program fell into one of these categories.

    • Undefined
    • Null
    • Boolean
    • Number
    • String
    • Object

    Each type is a set of values. The first five sets are all finite. There are, of course, only two Boolean values, true and false, and they aren’t making new ones. There are rather more Number and String values. The standard says there are 18,437,736,874,454,810,627 different Numbers (including NaN, the Number whose name is short for “Not a Number”). That’s nothing compared to the number of different possible Strings, which I think is (2^144,115,188,075,855,872 − 1) ÷ 65,535 …though I may have miscounted.

    The set of Object values, however, is open-ended. Each object is a unique, precious snowflake. Every time you open a Web page, a rush of new objects is created.

    ES6 symbols are values, but they’re not strings. They’re not objects. They’re something new: a seventh type of value.

    Let’s talk about a scenario where they might come in handy.

    One simple little boolean

    Sometimes it would be awfully convenient to stash some extra data on a JavaScript object that really belongs to someone else.

    For example, suppose you’re writing a JS library that uses CSS transitions to make DOM elements zip around on the screen. You’ve noticed that trying to apply multiple CSS transitions to a single div at the same time doesn’t work. It causes ugly, discontinuous “jumps”. You think you can fix this, but first you need a way to find out if a given element is already moving.

    How can you solve this?

    One way is to use CSS APIs to ask the browser if the element is moving. But that sounds like overkill. Your library should already know the element is moving; it’s the code that set it moving in the first place!

    What you really want is a way to keep track of which elements are moving. You could keep an array of all moving elements. Each time your library is called upon to animate an element, you can search the array to see if that element is already there.
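    A minimal sketch of that array-based approach (movingElements and the function names are made up for illustration):

    var movingElements = [];

    function isAnimating(element) {
      // linear search over every moving element
      return movingElements.indexOf(element) !== -1;
    }

    function startAnimating(element) {
      if (!isAnimating(element)) {
        movingElements.push(element);
      }
    }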

    Hmm. A linear search will be slow if the array is big.

    What you really want to do is just set a flag on the element:

    if (element.isMoving) {
      smoothAnimations(element);
    }
    element.isMoving = true;
    

    There are some potential problems with this too. They all relate to the fact that your code isn’t the only code using the DOM.

    1. Other code using for-in or Object.keys() may stumble over the property you created.

    2. Some other clever library author may have thought of this technique first, and your library would interact badly with that existing library.

    3. Some other clever library author may think of it in the future, and your library would interact badly with that future library.

    4. The standard committee may decide to add an .isMoving() method to all elements. Then you’re really hosed!

    Of course you can address the last three problems by choosing a string so tedious or so silly that nobody else would ever name anything that:

    if (element.__$jorendorff_animation_library$PLEASE_DO_NOT_USE_THIS_PROPERTY$isMoving__) {
      smoothAnimations(element);
    }
    element.__$jorendorff_animation_library$PLEASE_DO_NOT_USE_THIS_PROPERTY$isMoving__ = true;
    

    This seems not quite worth the eye strain.

    You could generate a practically unique name for the property using cryptography:

    // get 1024 Unicode characters of gibberish
    var isMoving = SecureRandom.generateName();
    
    ...
    
    if (element[isMoving]) {
      smoothAnimations(element);
    }
    element[isMoving] = true;
    

    The object[name] syntax lets you use literally any string as a property name. So this will work: collisions are virtually impossible, and your code looks OK.

    But this is going to lead to a bad debugging experience. Every time you console.log() an element with that property on it, you’ll be looking at a huge string of garbage. And what if you need more than one property like this? How do you keep them straight? They’ll have different names every time you reload.

    Why is this so hard? We just want one little boolean!

    Symbols are the answer

    Symbols are values that programs can create and use as property keys without risking name collisions.

    var mySymbol = Symbol();
    

    Calling Symbol() creates a new symbol, a value that’s not equal to any other value.
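    You can check this uniqueness directly in a console:

    Symbol() === Symbol();            // false
    Symbol("cat") === Symbol("cat");  // false: same description, still different symbols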

    Just like a string or number, you can use a symbol as a property key. Because it’s not equal to any string, this symbol-keyed property is guaranteed not to collide with any other property.

    obj[mySymbol] = "ok!";  // guaranteed not to collide
    console.log(obj[mySymbol]);  // ok!
    

    Here is how you could use a symbol in the situation discussed above:

    // create a unique symbol
    var isMoving = Symbol("isMoving");
    
    ...
    
    if (element[isMoving]) {
      smoothAnimations(element);
    }
    element[isMoving] = true;
    

    A few notes about this code:

    • The string "isMoving" in Symbol("isMoving") is called a description. It’s helpful for debugging. It’s shown when you write the symbol to console.log(), when you convert it to a string using .toString(), and possibly in error messages. That’s all.

    • element[isMoving] is called a symbol-keyed property. It’s simply a property whose name is a symbol rather than a string. Apart from that, it is in every way a normal property.

    • Like array elements, symbol-keyed properties can’t be accessed using dot syntax, as in obj.name. They must be accessed using square brackets.

    • It’s trivial to access a symbol-keyed property if you’ve already got the symbol. The above example shows how to get and set element[isMoving], and we could also ask if (isMoving in element) or even delete element[isMoving] if we needed to.

    • On the other hand, all of that is only possible as long as isMoving is in scope. This makes symbols a mechanism for weak encapsulation: a module that creates a few symbols for itself can use them on whatever objects it wants to, without fear of colliding with properties created by other code.

    Because symbol keys were designed to avoid collisions, JavaScript’s most common object-inspection features simply ignore symbol keys. A for-in loop, for instance, only loops over an object’s string keys. Symbol keys are skipped. Object.keys(obj) and Object.getOwnPropertyNames(obj) do the same. But symbols are not exactly private: it is possible to use the new API Object.getOwnPropertySymbols(obj) to list the symbol keys of an object. Another new API, Reflect.ownKeys(obj), returns both string and symbol keys. (We’ll discuss the Reflect API in full in an upcoming post.)
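    Here’s a quick illustration of that behavior (the object and property names are arbitrary):

    var obj = {};
    var hidden = Symbol("hidden");
    obj[hidden] = 1;
    obj.visible = 2;

    Object.keys(obj);                   // ["visible"]
    Object.getOwnPropertyNames(obj);    // ["visible"]
    Object.getOwnPropertySymbols(obj);  // [Symbol(hidden)]
    Reflect.ownKeys(obj);               // ["visible", Symbol(hidden)]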

    Libraries and frameworks will likely find many uses for symbols, and as we’ll see later, the language itself is using them for a wide range of purposes.

    But what are symbols, exactly?

    > typeof Symbol()
    "symbol"
    

    Symbols aren’t exactly like anything else.

    They’re immutable once created. You can’t set properties on them (and if you try that in strict mode, you’ll get a TypeError). They can be property names. These are all string-like qualities.

    On the other hand, each symbol is unique, distinct from all others (even others that have the same description) and you can easily create new ones. These are object-like qualities.

    ES6 symbols are similar to the more traditional symbols in languages like Lisp and Ruby, but not so closely integrated into the language. In Lisp, all identifiers are symbols. In JS, identifiers and most property keys are still considered strings. Symbols are just an extra option.

    One quick caveat about symbols: unlike almost anything else in the language, they can’t be automatically converted to strings. Trying to concatenate a symbol with strings will result in a TypeError.

    > var sym = Symbol("<3");
    > "your symbol is " + sym
    // TypeError: can't convert symbol to string
    > `your symbol is ${sym}`
    // TypeError: can't convert symbol to string
    

    You can avoid this by explicitly converting the symbol to a string, writing String(sym) or sym.toString().
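    For example:

    var sym = Symbol("<3");
    String(sym);                         // "Symbol(<3)"
    "your symbol is " + sym.toString();  // "your symbol is Symbol(<3)"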

    Three sets of symbols

    There are three ways to obtain a symbol.

    • Call Symbol(). As we already discussed, this returns a new unique symbol each time it’s called.

    • Call Symbol.for(string). This accesses a set of existing symbols called the symbol registry. Unlike the unique symbols defined by Symbol(), symbols in the symbol registry are shared. If you call Symbol.for("cat") thirty times, it will return the same symbol each time. The registry is useful when multiple web pages, or multiple modules within the same web page, need to share a symbol.

    • Use symbols like Symbol.iterator, defined by the standard. A few symbols are defined by the standard itself. Each one has its own special purpose.
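    A short sketch of how the first two differ (the "cat" key is arbitrary):

    Symbol("cat") === Symbol("cat");          // false: each call creates a fresh symbol
    Symbol.for("cat") === Symbol.for("cat");  // true: both come from the registry
    Symbol.keyFor(Symbol.for("cat"));         // "cat"
    Symbol.keyFor(Symbol("cat"));             // undefined: not a registry symbol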

    If you still aren’t sure if symbols will be all that useful, this last category is interesting, because it shows how symbols have already proven useful in practice.

    How the ES6 spec is using well-known symbols

    We’ve already seen one way that ES6 uses a symbol to avoid conflicts with existing code. A few weeks ago, in the post on iterators, we saw that the loop for (var item of myArray) starts by calling myArray[Symbol.iterator](). I mentioned that this method could have been called myArray.iterator(), but a symbol is better for backward compatibility.

    Now that we know what symbols are all about, it’s easy to understand why this was done and what it means.

    Here are a few of the other places where ES6 uses well-known symbols. (These features are not implemented in Firefox yet.)

    • Making instanceof extensible. In ES6, the expression object instanceof constructor is specified as a method of the constructor: constructor[Symbol.hasInstance](object). This means it is extensible.

    • Eliminating conflicts between new features and old code. This is seriously obscure, but we found that certain ES6 Array methods broke existing web sites just by being there. Other Web standards had similar problems: simply adding new methods in the browser would break existing sites. However, the breakage was mainly caused by something called dynamic scoping, so ES6 introduces a special symbol, Symbol.unscopables, that Web standards can use to prevent certain methods from getting involved in dynamic scoping.

    • Supporting new kinds of string-matching. In ES5, str.match(myObject) tried to convert myObject to a RegExp. In ES6, it first checks to see if myObject has a method myObject[Symbol.match](str). Now libraries can provide custom string-parsing classes that work in all the places where RegExp objects work.
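    As a rough sketch of the first item above, in an engine that implements Symbol.hasInstance, a class can redirect instanceof like this (EvenNumber is a made-up example):

    class EvenNumber {
      // instanceof consults this method instead of the prototype chain
      static [Symbol.hasInstance](value) {
        return Number.isInteger(value) && value % 2 === 0;
      }
    }

    4 instanceof EvenNumber;  // true
    5 instanceof EvenNumber;  // false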

    Each of these uses is quite narrow. It’s hard to see any of these features by themselves having a major impact in my day-to-day code. The long view is more interesting. Well-known symbols are JavaScript’s improved version of the __doubleUnderscores in PHP and Python. The standard will use them in the future to add new hooks into the language with no risk to your existing code.

    When can I use ES6 symbols?

    Symbols are implemented in Firefox 36 and Chrome 38. I implemented them for Firefox myself, so if your symbols ever act like cymbals, you’ll know who to talk to.

    To support browsers that do not yet have native support for ES6 symbols, you can use a polyfill, such as core-js. Since symbols are not exactly like anything previously in the language, the polyfill isn’t perfect. Read the caveats.

    Next week, we’ll have two new posts. First, we’ll cover some long-awaited features that are finally coming to JavaScript in ES6—and complain about them. We’ll start with two features that date back almost to the dawn of programming. We’ll continue with two features that are very similar, but powered by ephemerons. So please join us next week as we look at ES6 collections in depth.

    And, stick around for a bonus post by Gastón Silva on a topic that isn’t an ES6 feature at all, but might provide the nudge you need to start using ES6 in your own projects. See you then!

  7. Build an HTML5 game—and distribute it

    Last year, Mozilla and Humble Bundle brought great indie titles like FTL: Faster Than Light, Voxatron, and others to the Web through the Humble Mozilla Bundle promotion.  This year we plan to go even bigger with developments in JavaScript such as support for SIMD and SharedArrayBuffer.  Gaming on the Web without plugins is great; the user doesn’t have to install anything they don’t want, and if they love the game, they can share a link on their social media platform du jour.  Imagine the kind of viral multiplayer networking possibilities!

    Lately, I’ve been focusing on real-time rendering with WebGL and while it is quite powerful, I wanted to take a step back and look at the development of game logic and explore various distribution channels.  I’ve read a few books on game development, but I most recently finished Build an HTML5 Game by Karl Bunyan and thought I’d share my thoughts on it.  Later, we’ll take a look at some alternative ways other than links to share and distribute HTML5 games.

    Book review: Build an HTML5 Game

    Build an HTML5 Game (BHG) is meant for developers who have programmed before; have written HTML, CSS, and JavaScript; know how to host their code from a local server; and are looking to make a 2D casual game. The layout presents a good logical progression of ideas, starting small, and building from work done in previous chapters. The author makes the point that advanced 3D visuals are not discussed in this book (as in WebGL). Also, the design of familiar genres of game play mechanics are avoided. Instead, the focus is on learning how to use HTML5 and CSS3 to replace Flash for the purpose of casual game development. The game created throughout the chapters is a bubble shooter (like the Bust A Move franchise). Source code, a demo of the final game itself, and solutions to further practice examples can be found online at: buildanhtml5game.com.

    bhg_cover

    I would recommend this book to any junior programmer who has written some code before, but maybe has trouble breaking up code into logical modules with a clean separation of concerns.  For example, early on in my programming career, I suffered from “monolithic file” syndrome.  It was not clear to me when it made sense to split up programs across multiple files and even when to use multiple classes as is typical in object-oriented paradigms.  I would also recommend this book to anyone who has yet to implement their own game.

    The author does a great job breaking up the workings of an actual playable game into the Model-View-Controller (MVC) pattern.  Also, it’s full of source code, with clearly recognizable diffs of what was added or removed from previous examples.  If you’re like me and like to follow along writing the code from technical books while reading them, the author of this book has done a fantastic job making it easy to do.

    The book is a great reference. It exposes a developer new to web technologies to the numerous APIs, does a good job explaining when such technologies are useful, and accurately weighs pros and cons of different approaches.  There’s a small amount of trigonometry and collision detection covered; two important ideas that are used frequently in game development.

    Another key concept that’s important to web development in general is graceful degradation.  The author shows how Modernizr is used for detecting feature support, and you even implement multiple renderers: a canvas one for more modern browsers, with a fallback renderer that animates elements in the DOM.  The idea of multiple renderers is an important concept in game development; it helps with the separation of concerns (separating rendering logic from updating the game state in particular); exposes you to more than one way of doing things; and helps you target multiple platforms (or in this case, older browsers). Bunyan’s example in this particular case is well designed.

    BHG

    Many web APIs are covered either in the game itself or mentioned for the reader to pursue.  Some of the APIs, patterns, and libraries include: Modernizr, jQuery, CSS3 transitions/animations/transforms, Canvas 2D, audio tags, Sprite Atlases, DOM manipulation, localStorage, requestAnimationFrame, AJAX, WebSockets, Web Workers, WebGL, requestFullScreen, Touch Events, meta viewport tags, developer tools, security, obfuscation, “don’t trust the client” game model, and (the unfortunate world of) vendor prefixes.

    The one part of the book I thought could be improved was the author’s extensive use of absolute CSS positioning.  This practice makes the resulting game very difficult to port to mobile screen resolutions. Lots of the layout code and the collision detection algorithm assume exact widths in pixels as opposed to using percentages or newer layout modes and measuring effective widths at run time.

    Options for distributing your game

    Now let’s say we’ve created a game, following the content of Build an HTML5 Game, and we want to distribute it to users as an app.  Personally, I experience some form of cognitive dissonance here; native apps are frequently distributed through content silos, but if that’s the storefront where money is to be made then developers are absolutely right to use app stores as a primary distribution channel.

    Also, I get frequent questions from developers who have previously developed for Android or iOS looking to target Firefox OS. They ask, “Where does my binary go?” — which is a bit of a head-scratcher to someone who’s familiar with standing up their own web server or using a hosting provider.  For instance, one of the better-known online storefronts for games, Steam, does not even mention HTML5 game submissions!

    A choice of runtimes

    I’d like to take a look at two possible ways of “packaging” up HTML5 games (or even applications) for distribution: Mozilla’s Web Runtime and Electron.

    Mozilla’s Web Runtime allows a developer with no knowledge of platform/OS specific APIs to develop an application 100% in HTML5.  No need to learn anything platform specific about how windows are created, how events are handled, or how rendering occurs.  There’s no IDE you’re forced to use, and no build step.  Unlike Cordova, you’re not writing into a framework, it’s just the Web.  The only addition you need is an App Manifest, which is currently working its way through the W3C standards process.

    An example manifest from my IRC app:

    
    {
      "name": "Firesea IRC",
      "version": "1.0.13",
      "developer": {
        "name": "Mozilla Partner Engineering",
        "url": "https://github.com/nickdesaulniers/fxos-irc/graphs/contributors"
      },
      "description": "An IRC client",
      "launch_path": "/index.html",
      "permissions": {
        "tcp-socket": {
          "description": "tcp"
        },
        "desktop-notification": {
          "description": "privMSGs and mentions"
        }
      },
      "icons": {
        "128": "/images/128.png"
      },
      "type": "privileged"
    }
    

    Applications developed with Mozilla’s Web Runtime can be distributed as links to the manifest to be installed with a snippet of JavaScript called “hosted apps,” or as links to assets archived in a zip file called “packaged apps.”  Mozilla will even host the applications for you in https://marketplace.firefox.com, though you are free to host your apps yourself.  Google is also implementing the W3C manifest spec, though there are currently a few subtleties between implementations, such as having a launcher rather than a desktop icon.

    Here’s the snippet of JavaScript used to install a hosted app:

    
    var request = window.navigator.mozApps.install(manifestUrl);
    request.onsuccess = function () {
      console.log('Installed!');
    };
    request.onerror = function () {
      console.error(this.error.name);
    };
    

    A newer io.js (Node.js fork)-based project is Electron, formerly known as Atom Shell and used to build projects like the Atom code editor from GitHub and Visual Studio Code from Microsoft.  Electron allows for more flexibility in application development; the application is split into two processes that can post messages back and forth.  One is the browser or content process, which uses the Blink rendering engine (from Chromium/Chrome), and the other is the main process, which is io.js.  All of your favorite Node.js modules can thus be used with Electron.  Electron is based off of NW.js (formerly node-webkit, yo dawg, heard you like forks) with a few subtleties of its own.

    Electron

    Once installed, Mozilla’s Web Runtime will link against code from an installed version of Firefox and load the corresponding assets.  There’s a potential tradeoff here.  Electron currently ships an entire rendering engine for each and every app; all of Blink.  This is potentially ~40MB, even if your actual assets are significantly smaller.  Web Runtime apps will link against Firefox if it’s installed, otherwise will prompt the user to install Firefox to have the appropriate runtime.  This cuts down significantly on the size of the content to be distributed at the cost of expecting the runtime to already be installed, which may or may not be the case.  Web Runtime apps can only be installed in Firefox or Chromium/Blink, which isn’t ideal, but it’s the best we can do until browser vendors agree on and implement the standard.  It would be nice to allow the user to pick which browser/rendering engine/environment to run the app in as well.

    While I’m a big fan of the Node.js ecosystem, I’m also a big fan of the strong guarantees of security provided by the browser.  For instance, I personally don’t trust most applications distributed as executables.  Call me paranoid, but I’d really prefer if applications didn’t have access to my filesystem, and only had permission to make network requests to the host I navigated to.  By communicating with Node.js, you bypass the strong guarantees provided by browser vendors.  For example, browser vendors have created the Content Security Policy (CSP) as a means of shutting down a few Cross Site Scripting (XSS) attack vectors.  If an app is built with Electron and accesses your file system, hopefully the developer has done a good job sanitizing their inputs!

    On the other side of the coin, we can do some really neat stuff with Electron.  For example, some of the newer Browser APIs developed in Gecko and available in Firefox and Firefox OS are not yet implemented in other rendering engines.  Using Electron and its message-passing interface, it’s actually possible to polyfill these APIs and use them directly, though security is still an issue.  Thus it’s possible to more nimbly implement APIs that other browser vendors haven’t agreed upon yet.  Being able to gracefully fall back to the host API (rather than the polyfill) in the event of an update is important; let’s talk about updates next.
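    To make that message-passing interface concrete, here is a minimal sketch using Electron’s ipcMain/ipcRenderer modules; the channel names and handlers are invented for this example, and the module layout has shifted a bit since the Atom Shell days:

    // main process (io.js side): full Node.js access
    var electron = require('electron');
    var fs = require('fs');

    electron.ipcMain.on('read-settings', function (event, fileName) {
      var data = fs.readFileSync(fileName, 'utf8');
      event.sender.send('read-settings-reply', data);
    });

    // renderer process (Blink side): page code sends requests and listens for replies
    var ipcRenderer = require('electron').ipcRenderer;

    ipcRenderer.on('read-settings-reply', function (event, data) {
      console.log('settings:', data);
    });
    ipcRenderer.send('read-settings', 'settings.json');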

    Managing updates

    Updating software is a critical part of security.  While browsers can offer stronger security guarantees, they’re not infallible.  “Zero days” exist for all major browsers, and if you had the resources you could find or even buy knowledge of one.  When it comes to updating applications, I think Mozilla’s Web Runtime has a stronger story: app assets are fetched every time, but defer to the usual asset caching strategy while the rendering engine is linked in.  Because Firefox defaults to auto updating, most users should have an up-to-date rendering engine (though there are instances where this might not be the case).  The runtime should check for updates for packaged apps daily, and updates work well.  For Electron, I’m not sure that the update policy for apps is built in.  The high value of exploits for widely installed software like rendering engines worries me a bit here.

    Apps for Mozilla’s Runtime currently work anywhere that Firefox for desktop or mobile does: Windows, OS X, Linux, Android, or Firefox OS.  Electron supports desktop platforms like Windows, OS X, or Linux.  Sadly, neither option currently supports iOS devices.  I do like that Electron allows you to generate actual standalone apps, and it looks like tools to generate the expected .msi or .dmg files are in the works.

    Microsoft’s manifold.js might be able to bridge the gap to all these different platforms and more.  Though I ran into a few road bumps while trying it out, I would be willing to give it another look.  For me, one potentially problematic issue is requiring developers to generate builds for specific platforms.  Google’s Native Client (NaCl) had this issue: developers would not generate builds of their applications for ABIs (application binary interfaces) they had no hardware to test on.  If we want web apps to truly run everywhere, having separate build steps for each platform is not going to cut it; and this to me is where Mozilla’s Web Runtime really shines. Go see for yourself.

    In conclusion

    If I missed anything or made any mistakes in regards to any of the technologies in this article, please let me know via comments to this post.  I’m more than happy to correct any errata. More than anything, I do not want to spread fear, uncertainty, or doubt (FUD) about any of these technologies.

    I’m super excited for the potential each of these approaches hold, and I enjoy exploring some of the subtleties I’ve observed between them.  In the end, I’m rooting for the Web, and I’m overjoyed to see lots of competition and ideation in this space, each approach with its own list of pros and cons.  What pros and cons do you see, and how do you think we can improve?  Share your (constructive) thoughts and opinions in the comments below.

  8. The state of Web Components

    Web Components have been on developers’ radars for quite some time now. They were first introduced by Alex Russell at Fronteers Conference 2011. The concept shook the community up and became the topic of many future talks and discussions.

    In 2013 a Web Components-based framework called Polymer was released by Google to kick the tires of these new APIs, get community feedback and add some sugar and opinion.

    By now, 4 years on, Web Components should be everywhere, but in reality Chrome is the only browser with ‘some version’ of Web Components. Even with polyfills it’s clear Web Components won’t be fully embraced by the community until the majority of browsers are on-board.

    Why has this taken so long?

    To cut a long story short, vendors couldn’t agree.

    Web Components were a Google effort and little negotiation was made with other browsers before shipping. Like most negotiations in life, parties that don’t feel involved lack enthusiasm and tend not to agree.

    Web Components were an ambitious proposal. Initial APIs were high-level and complex to implement (albeit for good reasons), which only added to contention and disagreement between vendors.

    Google pushed forward; they sought feedback and gained community buy-in, but in hindsight, until other vendors shipped, usability remained blocked.

    Polyfills meant that, in theory, Web Components could work in browsers that hadn’t yet implemented them, but these polyfills have never been accepted as ‘suitable for production’.

    Aside from all this, Microsoft haven’t been in a position to add many new DOM APIs due to the Edge work (nearing completion), and Apple have been focusing on alternative features for Safari.

    Custom Elements

    Of all the Web Components technologies, Custom Elements have been the least contentious. There is general agreement on the value of being able to define how a piece of UI looks and behaves and being able to distribute that piece cross-browser and cross-framework.

    ‘Upgrade’

    The term ‘upgrade’ refers to when an element transforms from a plain old HTMLElement into a shiny custom element with its defined life-cycle and prototype. Today, when elements are upgraded, their createdCallback is called.

    var proto = Object.create(HTMLElement.prototype);
    proto.createdCallback = function() { ... };
    document.registerElement('x-foo', { prototype: proto });

    There are five proposals so far from multiple vendors; two stand out as holding the most promise.

    ‘Dmitry’

    An evolved version of the createdCallback pattern that works well with ES6 classes. The createdCallback concept lives on, but sub-classing is more conventional.

    class MyEl extends HTMLElement {
      createdCallback() { ... }
    }
    
    document.registerElement("my-el", MyEl);

    Like in today’s implementation, the custom element begins life as HTMLUnknownElement then some time later the prototype is swapped (or ‘swizzled’) with the registered prototype and the createdCallback is called.

    The downside of this approach is that it’s different from how the platform itself behaves. Elements are ‘unknown’ at first, then transform into their final form at some point in the future, which can lead to developer confusion.

    Synchronous constructor

    The constructor registered by the developer is invoked by the parser at the point the custom element is created and inserted into the tree.

    class MyEl extends HTMLElement {
      constructor() { ... }
    }
    
    document.registerElement("my-el", MyEl);

    Although this seems sensible, it means that any custom elements in the initial downloaded document will fail to upgrade if the scripts that contain their registerElement definition are loaded asynchronously. This is not helpful heading into a world of asynchronous ES6 modules.

    Additionally synchronous constructors come with platform issues related to .cloneNode().

    A direction is expected to be decided by vendors at a face-to-face meeting in July 2015.

    is=""

    The is attribute gives developers the ability to layer the behaviour of a custom element on top of a standard built-in element.

    <input type="text" is="my-text-input">

    Arguments for

    1. Allows extending the built-in features of an element that aren’t exposed as primitives (e.g. accessibility characteristics, <form> controls, <template>).
    2. They give means to ‘progressively enhance’ an element, so that it remains functional without JavaScript.

    Arguments against

    1. Syntax is confusing.
    2. It side-steps the underlying problem that we’re missing many key accessibility primitives in the platform.
    3. It side-steps the underlying problem that we don’t have a way to properly extend built-in elements.
    4. Use-cases are limited; as soon as developers introduce Shadow DOM, they lose all built-in accessibility features.

    Consensus

    It is generally agreed that is is a ‘wart’ on the Custom Elements spec. Google has already implemented is and sees it as a stop-gap until lower-level primitives are exposed. Right now Mozilla and Apple would rather ship a Custom Elements V1 sooner and address this problem properly in a V2 without polluting the platform with ‘warts’.

    HTML as Custom Elements is a project by Domenic Denicola that attempts to rebuild built-in HTML elements with custom elements in an attempt to uncover DOM primitives the platform is missing.

    Shadow DOM

    Shadow DOM yielded the most contention by far between vendors. So much so that features had to be split into a ‘V1’ and ‘V2’ agenda to help reach agreement quicker.

    Distribution

    Distribution is the phase whereby children of a shadow host get visually ‘projected’ into slots inside the host’s Shadow DOM. This is the feature that enables your component to make use of content the user nests inside it.

    Current API

    The current API is fully declarative. Within the Shadow DOM you can use special <content> elements to define where you want the host’s children to be visually inserted.

    <content select="header"></content>

    Both Apple and Microsoft pushed back on this approach due to concerns around complexity and performance.

    A new Imperative API

    Even at the face-to-face meeting, agreement couldn’t be made on a declarative API, so all vendors agreed to pursue an imperative solution.

    All four vendors (Microsoft, Google, Apple and Mozilla) were tasked with specifying this new API before a July 2015 deadline. So far there have been three suggestions. The simplest of the three looks something like:

    var shadow = host.createShadowRoot({
      distribute: function(nodes) {
        var slot = shadow.querySelector('content');
        for (var i = 0; i < nodes.length; i++) {
          slot.add(nodes[i]);
        }
      }
    });
    
    shadow.innerHTML = '<content></content>';
    
    // Call initially ...
    shadow.distribute();
    
    // then hook up to MutationObserver

    The main obstacle is timing. If the children of the host node change and we redistribute when the MutationObserver callback fires, asking for a layout property will return an incorrect result.

    myHost.appendChild(someElement);
    someElement.offsetTop; //=> old value
    
    // distribute on mutation observer callback (async)
    
    someElement.offsetTop; //=> new value

    Calling offsetTop will perform a synchronous layout before distribution!

    This might not seem like the end of the world, but scripts and browser internals often depend on the value of offsetTop being correct to perform many different operations, such as scrolling elements into view.

    If these problems can’t be solved we may see a retreat back to discussions over a declarative API. This will either be in the form of the current <content select> style, or the newly proposed ‘named slots’ API (from Apple).

    A new Declarative API – ‘Named Slots’

    The ‘named slots’ proposal is a simpler variation of the current ‘content select’ API, whereby the component user must explicitly label their content with the slot they wish it to be distributed to.

    Shadow Root of <x-page>:

    <slot name="header"></slot>
    <slot></slot>
    <slot name="footer"></slot>
    <div>some shadow content</div>
    

    Usage of <x-page>:

    <x-page>
      <header slot="header">header</header>
      <footer slot="footer">footer</footer>
      <h1>my page title</h1>
      <p>my page content</p>
    </x-page>

    Composed/rendered tree (what the user sees):

    <x-page>
      <header slot="header">header</header>
      <h1>my page title</h1>
      <p>my page content</p>
      <footer slot="footer">footer</footer>
      <div>some shadow content</div>
    </x-page>

    The browser has looked at the direct children of the shadow host (myXPage.children) and seen if any of them have a slot attribute that matches the name of a <slot> element in the host’s shadowRoot.

    When a match is found, the node is visually ‘distributed’ in place of the corresponding <slot> element. Any children left undistributed at the end of this matching process are distributed to a default (unnamed) <slot> element (if one exists).

    For:
    1. Distribution is more explicit, easier to understand, less ‘magic’.
    2. Distribution is simpler for the engine to compute.
    Against:
    1. Doesn’t explain how built-in elements, like <select>, work.
    2. Decorating content with slot attributes is more work for the user.
    3. Less expressive.

    ‘closed’ vs. ‘open’

    When a shadowRoot is ‘closed’, it cannot be accessed via myHost.shadowRoot. This gives a component author some assurance that users won’t poke into implementation details, similar to how you can use closures to keep things private.

    Apple felt strongly that this was an important feature that they would block on. They argued that implementation details should never be exposed to the outside world and that ‘closed’ mode would be a required feature when ‘isolated’ custom elements became a thing.

    Google on the other hand felt that ‘closed’ shadow roots would prevent some accessibility and component tooling use-cases. They argued that it’s impossible to accidentally stumble into a shadowRoot and that if people want to they likely have a good reason. JS/DOM is open, let’s keep it that way.

    At the April meeting it became clear that to move forward, ‘mode’ needed to be a feature, but vendors were struggling to reach agreement on whether this should default to ‘open’ or ‘closed’. As a result, all agreed that for V1 ‘mode’ would be a required parameter, and thus wouldn’t need a specified default.

    element.createShadowRoot({ mode: 'open' });
    element.createShadowRoot({ mode: 'closed' });

    Shadow piercing combinators

    A ‘piercing combinator’ is a special CSS ‘combinator’ that can target elements inside a shadow root from the outside world. An example is /deep/, later renamed to >>>:

    .foo >>> div { color: red }

    When Web Components were first specified it was thought that these were required, but after looking at how they were being used it became clear they only brought problems, making it too easy to break the style boundaries that make Web Components so appealing.

    Performance

    Style calculation can be incredibly fast inside a tightly scoped Shadow DOM if the engine doesn’t have to take into consideration any outside selectors or state. The very presence of piercing combinators forbids these kinds of optimisations.

    Alternatives

    Dropping shadow piercing combinators doesn’t mean that users will never be able to customize the appearance of a component from the outside.

    CSS custom-properties (variables)

    In Firefox OS we’re using CSS Custom Properties to expose specific style properties that can be defined (or overridden) from the outside.

    External (user):

    x-foo { --x-foo-border-radius: 10px; }
    

    Internal (author):

    .internal-part { border-radius: var(--x-foo-border-radius, 0); }

    Custom pseudo-elements

    We have also seen interest expressed from several vendors in reintroducing the ability to define custom pseudo selectors that would expose given internal parts to be styled (similar to how we style parts of <input type="range"> today).

    x-foo::my-internal-part { ... }

    This will likely be considered for a Shadow DOM V2 specification.

    Mixins – @extend

    There is a proposed specification to bring SASS’s @extend behaviour to CSS. This would be a useful tool for component authors, allowing users to provide a ‘bag’ of properties to apply to a specific internal part.

    External (user):

    .x-foo-part {
      background-color: red;
      border-radius: 4px;
    }

    Internal (author):

    .internal-part {
      @extend .x-foo-part;
    }

    Multiple shadow roots

    Why would I want more than one shadow root on the same element? I hear you ask. The answer is: inheritance.

    Let’s imagine I’m writing an <x-dialog> component. Within this component I write all the markup, styling, and interactions to give me an opening and closing dialog window.

    <x-dialog>
      <h1>My title</h1>
      <p>Some details</p>
      <button>Cancel</button>
      <button>OK</button>
    </x-dialog>

    The shadow root pulls any user provided content into div.inner via the <content> insertion point.

    <div class="outer">
      <div class="inner">
      <content></content>
      </div>
    </div>

    I also want to create <x-dialog-alert> that looks and behaves just like <x-dialog> but with a more restricted API, a bit like alert('foo').

    <x-dialog-alert>foo</x-dialog-alert>

    var proto = Object.create(XDialog.prototype);
    
    proto.createdCallback = function() {
      XDialog.prototype.createdCallback.call(this);
      this.createShadowRoot();
      this.shadowRoot.innerHTML = templateString;
    };
    
    document.registerElement('x-dialog-alert', { prototype: proto });
    

    The new component will have its own shadow root, but it’s designed to work on top of the parent class’s shadow root. The <shadow> represents the ‘older’ shadow root and allows us to project content inside it.

    <shadow>
      <h1>Alert</h1>
      <content></content>
      <button>OK</button>
    </shadow>

    Once you get your head round multiple shadow roots, they become a powerful concept. The downside is they bring a lot of complexity and introduce a lot of edge cases.

    Inheritance without multiple shadows

    Inheritance is still possible without multiple shadow roots, but it involves manually mutating the super class’s shadow root.

    
    var proto = Object.create(XDialog.prototype);
    
    proto.createdCallback = function() {
      XDialog.prototype.createdCallback.call(this);
      var inner = this.shadowRoot.querySelector('.inner');
    
      var h1 = document.createElement('h1');
      h1.textContent = 'Alert';
      inner.insertBefore(h1, inner.children[0]);
    
      var button = document.createElement('button');
      button.textContent = 'OK';
      inner.appendChild(button);
    
      ...
    };
    
    document.registerElement('x-dialog-alert', { prototype: proto });
    

    The downsides of this approach are:

    1. Not as elegant.
    2. Your sub-component is dependent on the implementation details of the super-component.
    3. This wouldn’t be possible if the super component’s shadow root was ‘closed’, as this.shadowRoot would be null.

    HTML Imports

    HTML Imports provide a way to import all assets defined in one .html document, into the scope of another.

    <link rel="import" href="/path/to/imports/stuff.html">

    As previously stated, Mozilla is not currently intending to implement HTML Imports. This is in part because we’d like to see how ES6 modules pan out before shipping another way of importing external assets, and partly because we don’t feel they enable much that isn’t already possible.

    We’ve been working with Web Components in Firefox OS for over a year and have found that using existing module syntax (AMD or CommonJS) to resolve a dependency tree, register elements, and load them with a normal <script> tag is enough to get stuff done.

    HTML Imports do lend themselves well to a simpler/more declarative workflow, such as the older <element> and Polymer’s current registration syntax.

    With this simplicity has come criticism from the community that Imports don’t offer enough control to be taken seriously as a dependency management solution.

    Before the decision was made a few months ago, Mozilla had a working implementation behind a flag, but struggled through an incomplete specification.

    What will happen to them?

    Apple’s Isolated Custom Elements proposal makes use of an HTML Imports style approach to provide custom elements with their own document scope; perhaps there’s a future there.

    At Mozilla we want to explore how importing custom element definitions can align with upcoming ES6 module APIs. We’d be prepared to implement if/when they appear to enable developers to do stuff they can’t already do.

    To conclude

    Web Components are a prime example of how difficult it is to get large features into the browser today. Every API added lives indefinitely and remains as an obstacle to the next.

    Comparable to picking apart a huge knotted ball of string, adding a bit more, then tangling it back up again. This knot, our platform, grows ever larger and more complex.

    Web Components have been in planning for over three years, but we’re optimistic the end is near. All major vendors are on board, enthusiastic, and investing significant time to help resolve the remaining issues.

    Let’s get ready to componentize the web!


  9. ES6 In Depth: Arrow functions

    ES6 In Depth is a series on new features being added to the JavaScript programming language in the 6th Edition of the ECMAScript standard, ES6 for short.

    Arrows have been part of JavaScript from the very beginning. The first JavaScript tutorials advised wrapping inline scripts in HTML comments. This would prevent browsers that didn’t support JS from erroneously displaying your JS code as text. You would write something like this:

    <script language="javascript">
    <!--
      document.bgColor = "brown";  // red
    // -->
    </script>
    

    Old browsers would see two unsupported tags and a comment; only new browsers would see JS code.

    To support this odd hack, the JavaScript engine in your browser treats the characters <!-- as the start of a one-line comment. No joke. This has really been part of the language all along, and it works to this day, not just at the top of an inline <script> but everywhere in JS code. It even works in Node.

    As it happens, this style of comment is standardized for the first time in ES6. But this isn’t the arrow we’re here to talk about.

    The arrow sequence --> also denotes a one-line comment. Weirdly, while in HTML characters before the --> are part of the comment, in JS the rest of the line after the --> is a comment.

    It gets stranger. This arrow indicates a comment only when it appears at the start of a line. That’s because in other contexts, --> is an operator in JS, the “goes to” operator!

    function countdown(n) {
      while (n --> 0)  // "n goes to zero"
        alert(n);
      blastoff();
    }
    

    This code really works. The loop runs until n gets to 0. This too is not a new feature in ES6, but a combination of familiar features, with a little misdirection thrown in. Can you figure out what’s going on here? As usual, the answer to the puzzle can be found on Stack Overflow.

    Of course there is also the less-than-or-equal-to operator, <=. Perhaps you can find more arrows in your JS code, Hidden Pictures style, but let’s stop here and observe that an arrow is missing.

    <!-- single-line comment
    --> “goes to” operator
    <= less than or equal to
    => ???

    What happened to =>? Today, we find out.

    First, let’s talk a bit about functions.

    Function expressions are everywhere

    A fun feature of JavaScript is that any time you need a function, you can just type that function right in the middle of running code.

    For example, suppose you are trying to tell the browser what to do when the user clicks on a particular button. You start typing:

    $("#confetti-btn").click(
    

    jQuery’s .click() method takes one argument: a function. No problem. You can just type in a function right here:

    $("#confetti-btn").click(function (event) {
      playTrumpet();
      fireConfettiCannon();
    });
    

    Writing code like this comes quite naturally to us now. So it’s strange to recall that before JavaScript popularized this kind of programming, many languages did not have this feature. Of course Lisp had function expressions, also called lambda functions, in 1958. But C++, Python, C#, and Java all existed for years without them.

    Not anymore. All four have lambdas now. Newer languages universally have lambdas built in. We have JavaScript to thank for this—and early JavaScript programmers who fearlessly built libraries that depended heavily on lambdas, leading to widespread adoption of the feature.

    It is just slightly sad, then, that of all the languages I’ve mentioned, JavaScript’s syntax for lambdas has turned out to be the wordiest.

    // A very simple function in six languages.
    function (a) { return a > 0; } // JS
    [](int a) { return a > 0; }  // C++
    (lambda (a) (> a 0))  ;; Lisp
    lambda a: a > 0  # Python
    a => a > 0  // C#
    a -> a > 0  // Java
    

    A new arrow in your quiver

    ES6 introduces a new syntax for writing functions.

    // ES5
    var selected = allJobs.filter(function (job) {
      return job.isSelected();
    });
    
    // ES6
    var selected = allJobs.filter(job => job.isSelected());
    

    When you just need a simple function with one argument, the new arrow function syntax is simply Identifier => Expression. You get to skip typing function and return, as well as some parentheses, braces, and a semicolon.

    (I am personally very grateful for this feature. Not having to type function is important to me, because I inevitably type functoin instead and have to go back and correct it.)

    To write a function with multiple arguments (or no arguments, or rest parameters or defaults, or a destructuring argument) you’ll need to add parentheses around the argument list.

    // ES5
    var total = values.reduce(function (a, b) {
      return a + b;
    }, 0);
    
    // ES6
    var total = values.reduce((a, b) => a + b, 0);
    

    I think it looks pretty nice.

    Arrow functions work just as beautifully with functional tools provided by libraries, like Underscore.js and Immutable. In fact, the examples in Immutable’s documentation are all written in ES6, so many of them already use arrow functions.

    What about not-so-functional settings? Arrow functions can contain a block of statements instead of just an expression. Recall our earlier example:

    // ES5
    $("#confetti-btn").click(function (event) {
      playTrumpet();
      fireConfettiCannon();
    });
    

    Here’s how it will look in ES6:

    // ES6
    $("#confetti-btn").click(event => {
      playTrumpet();
      fireConfettiCannon();
    });
    

    A minor improvement. The effect on code using Promises can be more dramatic, as the }).then(function (result) { lines can pile up.

    Note that an arrow function with a block body does not automatically return a value. Use a return statement for that.
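    A tiny illustration:

    var doubleIt = x => x * 2;              // expression body: implicitly returns x * 2
    var broken   = x => { x * 2; };         // block body: returns undefined
    var fixed    = x => { return x * 2; };  // block body with an explicit return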

    There is one caveat when using arrow functions to create plain objects. Always wrap the object in parentheses:

    // create a new empty object for each puppy to play with
    var chewToys = puppies.map(puppy => {});   // BUG!
    var chewToys = puppies.map(puppy => ({})); // ok
    

    Unfortunately, an empty object {} and an empty block {} look exactly the same. The rule in ES6 is that { immediately following an arrow is always treated as the start of a block, never the start of an object. The code puppy => {} is therefore silently interpreted as an arrow function that does nothing and returns undefined.

    Even more confusing, an object literal like {key: value} looks exactly like a block containing a labeled statement—at least, that’s how it looks to your JavaScript engine. Fortunately { is the only ambiguous character, so wrapping object literals in parentheses is the only trick you need to remember.

    What’s this?

    There is one subtle difference in behavior between ordinary function functions and arrow functions. Arrow functions do not have their own this value. The value of this inside an arrow function is always inherited from the enclosing scope.

    Before we try and figure out what that means in practice, let’s back up a bit.

    How does this work in JavaScript? Where does its value come from? There’s no short answer. If it seems simple in your head, it’s because you’ve been dealing with it for a long time!

    One reason this question comes up so often is that function functions receive a this value automatically, whether they want one or not. Have you ever written this hack?

    {
      ...
      addAll: function addAll(pieces) {
        var self = this;
        _.each(pieces, function (piece) {
          self.add(piece);
        });
      },
      ...
    }
    

    Here, what you’d like to write in the inner function is just this.add(piece). Unfortunately, the inner function doesn’t inherit the outer function’s this value. Inside the inner function, this will be window or undefined. The temporary variable self serves to smuggle the outer value of this into the inner function. (Another way is to use .bind(this) on the inner function. Neither way is particularly pretty.)
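    For completeness, the .bind(this) variant of the same hack looks like this:

    {
      ...
      addAll: function addAll(pieces) {
        _.each(pieces, function (piece) {
          this.add(piece);
        }.bind(this));  // bind fixes this to the outer value
      },
      ...
    }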

    In ES6, this hacks mostly go away if you follow these rules:

    • Use non-arrow functions for methods that will be called using the object.method() syntax. Those are the functions that will receive a meaningful this value from their caller.
    • Use arrow functions for everything else.

    // ES6
    {
      ...
      addAll: function addAll(pieces) {
        _.each(pieces, piece => this.add(piece));
      },
      ...
    }
    

    In the ES6 version, note that the addAll method receives this from its caller. The inner function is an arrow function, so it inherits this from the enclosing scope.

    As a bonus, ES6 also provides a shorter way to write methods in object literals! So the code above can be simplified further:

    // ES6 with method syntax
    {
      ...
      addAll(pieces) {
        _.each(pieces, piece => this.add(piece));
      },
      ...
    }
    

    Between methods and arrows, I might never type functoin again. It’s a nice thought.

    There’s one more minor difference between arrow and non-arrow functions: arrow functions don’t get their own arguments object, either. Of course, in ES6, you’d probably rather use a rest parameter or default value anyway.
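
    For example, a variadic helper written both ways (the names are invented for the example):

    // ES5: lean on the arguments object
    function logAll() {
      for (var i = 0; i < arguments.length; i++) {
        console.log(arguments[i]);
      }
    }
    
    // ES6: a rest parameter works fine inside an arrow
    var logAll = (...args) => args.forEach(arg => console.log(arg));
    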

    Using arrows to pierce the dark heart of computer science

    We’ve talked about the many practical uses of arrow functions. There’s one more possible use case I’d like to talk about: ES6 arrow functions as a learning tool, to uncover something deep about the nature of computation. Whether that is practical or not, you’ll have to decide for yourself.

    In 1936, Alonzo Church and Alan Turing independently developed powerful mathematical models of computation. Turing called his model a-machines, but everyone instantly started calling them Turing machines. Church wrote instead about functions. His model was called the λ-calculus. (λ is the lowercase Greek letter lambda.) This work was the reason Lisp used the word LAMBDA to denote functions, which is why we call function expressions “lambdas” today.

    But what is the λ-calculus? What is “model of computation” supposed to mean?

    It’s hard to explain in just a few words, but here is my attempt: the λ-calculus is one of the first programming languages. It was not designed to be a programming language—after all, stored-program computers wouldn’t come along for another decade or two—but rather a ruthlessly simple, stripped-down, purely mathematical idea of a language that could express any kind of computation you wished to do. Church wanted this model in order to prove things about computation in general.

    And he found that he only needed one thing in his system: functions.

    Think how extraordinary this claim is. Without objects, without arrays, without numbers, without if statements, while loops, semicolons, assignment, logical operators, or an event loop, it is possible to rebuild every kind of computation JavaScript can do, from scratch, using only functions.

    Here is an example of the sort of “program” a mathematician could write, using Church’s λ notation:

    fix = λf.(λx.f(λv.x(x)(v)))(λx.f(λv.x(x)(v)))
    

    The equivalent JavaScript function looks like this:

    var fix = f => (x => f(v => x(x)(v)))
                   (x => f(v => x(x)(v)));
    

    That is, JavaScript contains an implementation of the λ-calculus that actually runs. The λ-calculus is in JavaScript.

    The stories of what Alonzo Church and later researchers did with the λ-calculus, and how it has quietly insinuated itself into almost every major programming language, are beyond the scope of this blog post. But if you’re interested in the foundations of computer science, or you’d just like to see how a language with nothing but functions can do things like loops and recursion, you could do worse than to spend some rainy afternoon looking into Church numerals and fixed-point combinators, and playing with them in your Firefox console or Scratchpad. With ES6 arrows on top of its other strengths, JavaScript can reasonably claim to be the best language for exploring the λ-calculus.
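
    To give just a taste, here is a quick sketch of Church numerals you can paste into the console (the helper names are my own):

    // a Church numeral n means "apply f, n times"
    var ZERO = f => x => x;
    var ONE  = f => x => f(x);
    var SUCC = n => f => x => f(n(f)(x));
    var ADD  = m => n => f => x => m(f)(n(f)(x));
    
    // convert back to an ordinary number to check our work
    var toNumber = n => n(x => x + 1)(0);
    console.log(toNumber(ADD(SUCC(ONE))(SUCC(SUCC(ZERO)))));  // 4
    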

    When can I use arrows?

    ES6 arrow functions were implemented in Firefox by me, back in 2013. Jan de Mooij made them fast. Thanks to Tooru Fujisawa and ziyunfei for patches.

    Arrow functions are also implemented in the Microsoft Edge preview release. They’re also available in Babel, Traceur, and TypeScript, in case you’re interested in using them on the Web right now.

    Our next topic is one of the stranger features in ES6. We’ll get to see typeof x return a totally new value. We’ll ask: When is a name not a string? We’ll puzzle over the meaning of equality. It’ll be weird. So please join us next week as we look at ES6 symbols in depth.

  10. Firefox multistream and renegotiation for Jitsi Videobridge

    Author’s note: Firefox landed support for multistream and renegotiation in Firefox 38. This article talks about how the team at Jitsi Videobridge, a WebRTC service, collaborated with the Firefox WebRTC team to get Jitsi’s multi-party video conferencing working well in Firefox. In the process, several issues were identified and fixed on both sides of the system. Firefox 40 (our newly released Developer Edition) and later versions include all those fixes. This post, written by Jitsi engineer George Politis, assumes some basic knowledge of WebRTC and how it works.

    Firefox is the first browser to implement the spec-compliant “Unified Plan” for multistream support, which Chrome will be moving to, but hasn’t implemented yet. Thus, services that currently work on Chrome will need some modifications to work on Firefox. I encourage all service providers who have or are thinking of adding multistream support to give Firefox 40 or later a try and let us know how it works for you. Thanks.

    Maire Reavy
    Engineering Manager, WebRTC

    Introduction

    Many of you WebRTC developers out there have probably already come across the name Jitsi Videobridge. Multi-party video conferencing is arguably one of the most popular use cases for WebRTC and once you start looking for servers that allow you to implement it, Jitsi’s name is among the first you stumble upon.

    For a while now, a number of JavaScript applications have been using WebRTC and Jitsi Videobridge to deliver a rich conferencing experience to their users. The bridge provides a lightweight way (think routing vs. mixing) of conducting high quality video conferences, so it has received its fair share of attention.

    The problem was that, until recently, applications using Jitsi Videobridge only worked on a limited set of browsers: Chromium, Chrome, and Opera.

    This limitation is now gone!

    After a few months of hard work by Mozilla and Jitsi developers, both Firefox and Jitsi have added the missing pieces and can now work together.

    While this wasn’t the most difficult project on Earth, it wasn’t quite a walk in the park either. In this post we’ll tell you more about the nitty-gritty details of our collaborative adventure.

    Some basics

    Jitsi Videobridge is an open source (LGPL) lightweight video conferencing server. WebRTC JavaScript applications such as Jitsi Meet use Jitsi Videobridge to provide high quality, scalable video conferences. Jitsi Videobridge receives video from every participant and then relays some or all of it to everyone else. The IETF term for Jitsi Videobridge is a Selective Forwarding Unit (SFU). Sometimes such servers are also referred to as video routers or MCUs. The same technology is used by most modern video conferencing systems like Google Hangouts, Skype, Vidyo, and many others.

    From a WebRTC perspective, every browser establishes exactly one PeerConnection with the videobridge. The browser sends and receives all audio and video data to and from the bridge over that one PeerConnection.

    Jitsi Videobridge-based 3-way call

    In a Jitsi Videobridge-based conference, all signaling goes through a separate server-side application called the Focus. It is responsible for managing media sessions between each of the participants and the videobridge. Communication between the Focus and a given participant is done through Jingle, and between the Focus and the Jitsi Videobridge through COLIBRI.

    Unified Plan, Plan B and the answer to life, the universe and everything

    When discussing interoperability between Firefox and Chrome for multi-party video conferences, it is impossible not to talk a little bit (or a lot!) about the Unified Plan and Plan B. These were two competing IETF drafts for the negotiation and exchange of multiple media sources (i.e., MediaStreamTracks or MSTs) between WebRTC endpoints. Unified Plan has been incorporated into the JSEP draft and Bundle negotiation draft, which are on their way to becoming IETF standards. Plan B expired in 2013 and nobody should care about it anymore … at least in theory.

    In reality, Plan B lives on in Chrome and its derivatives, like Chromium and Opera. There’s actually an issue in the Chromium bug tracker to add support for Unified Plan in Chromium, but that’ll take some time. Firefox, on the other hand, has recently implemented Unified Plan.

    Developers who implement many-to-many WebRTC-based video conferencing solutions and want to support both Firefox and Chrome have to deal with this situation and implement some kind of interoperability layer between Chrome and Firefox. Jitsi Meet is no exception, of course; in the beginning it was a no-brainer to assume Plan B, because that’s what Chrome implements and Firefox didn’t have multistream support. As a result, most of Jitsi’s abstractions were built around this assumption.

    The most substantial difference between Unified Plan and Plan B is how they represent media stream tracks. Unified Plan extends the standard way of encoding this information in SDP which is to have each RTP flow (i.e., SSRC) appear on its own m-line. So, each media stream track is represented by its own unique m-line. This is a strict one-to-one mapping; a single media stream track cannot be spread across several m-lines, nor may a single m-line represent multiple media stream tracks.

    Plan B takes a different approach and creates a hierarchy within the SDP: an m= line defines an “envelope” specifying codec and transport parameters, and a=ssrc lines describe individual media sources within that envelope. So, typically, a Plan B SDP has three channels: one for audio, one for video, and one for data.
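
    As a rough, heavily trimmed sketch (real SDP carries many more attributes), two video tracks might be described like this in Plan B, with a single video m-line acting as the envelope:

    m=video 9 UDP/TLS/RTP/SAVPF 100
    a=ssrc:1111 msid:stream-1 track-1
    a=ssrc:2222 msid:stream-2 track-2
    

    whereas in Unified Plan each track gets its own m-line:

    m=video 9 UDP/TLS/RTP/SAVPF 100
    a=msid:stream-1 track-1
    m=video 9 UDP/TLS/RTP/SAVPF 100
    a=msid:stream-2 track-2
    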

    Implementation

    On the Jitsi side, it was obvious from the beginning that all the magic should happen in the client. The Focus communicates with the clients using Jingle, which is in turn transformed into SDP and then handed over to the browser. There’s no SDP going around on the wire. Furthermore, there’s no signaling communication between the endpoints and the Jitsi Videobridge; it’s the Focus that mediates this procedure using COLIBRI. So the question for the Jitsi team was: “What’s the easiest way to go from Jingle to Unified Plan for Firefox, given that we have code that assumes Plan B in all imaginable places?”

    In its first few attempts, the Jitsi team tried to provide general abstractions wherever there was Plan B-specific code. This could have worked, but during the same period Jitsi Meet was undergoing some massive refactoring, and the incoming Unified Plan patches were constantly breaking. On top of that, with multistream support in Firefox in its very early stages, Firefox was breaking more often than it worked. Result: zero progress. One could even argue that the progress was negative, because of the wasted time.

    It was time to change course. The Jitsi team decided to try a more general solution to the problem and deal with it at a lower level. The idea was to build a PeerConnection adapter that would feed the right SDP to the browser, i.e. Unified Plan to Firefox and Plan B to Chrome, and that would give a Plan B SDP to the application. Enter sdp-interop.

    An SDP interoperability layer

    sdp-interop is a reusable npm module that offers two simple methods:

    • toUnifiedPlan(sdp) that takes an SDP string and transforms it into a Unified Plan SDP.
    • toPlanB(sdp) that, not surprisingly, takes an SDP string and transforms it into a Plan B SDP.
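
    Plain usage is roughly this (a sketch; appPlanBSdp and firefoxUnifiedSdp are placeholders for whatever SDP you happen to hold):

    var Interop = require('sdp-interop').Interop;
    var interop = new Interop();
    
    var unifiedSdp = interop.toUnifiedPlan(appPlanBSdp);  // Plan B in, Unified Plan out
    var planBSdp = interop.toPlanB(firefoxUnifiedSdp);    // Unified Plan in, Plan B out
    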

    The PeerConnection adapter wraps the setLocalDescription() and setRemoteDescription() methods, as well as the success callbacks of the createAnswer() and createOffer() methods. If the browser is Chrome, the adapter does nothing. If, on the other hand, the browser is Firefox, the PeerConnection adapter does the following:

    • Calls the toUnifiedPlan() method of the sdp-interop module prior to calling the setLocalDescription() or the setRemoteDescription() methods, thus converting the Plan B SDP from the application to a Unified Plan SDP that Firefox can understand.
    • Calls the toPlanB() method prior to calling the createAnswer() or the createOffer() success callback, thus converting the Unified Plan SDP from Firefox to a Plan B SDP that the application can understand.

    Here’s a sample PeerConnection adapter built on top of adapter.js:

    var Interop = require('sdp-interop').Interop;
    
    function PeerConnectionAdapter(ice_config, constraints) {
        this.peerconnection = new RTCPeerConnection(ice_config, constraints);
        this.interop = new Interop();
    }
    
    PeerConnectionAdapter.prototype.setLocalDescription
      = function (description, successCallback, failureCallback) {
        // if we're running on FF, transform to Unified Plan first.
        if (navigator.mozGetUserMedia)
            description = this.interop.toUnifiedPlan(description);
    
        this.peerconnection.setLocalDescription(description,
            function () { successCallback(); },
            function (err) { failureCallback(err); }
        );
    };
    
    PeerConnectionAdapter.prototype.setRemoteDescription
      = function (description, successCallback, failureCallback) {
        // if we're running on FF, transform to Unified Plan first.
        if (navigator.mozGetUserMedia)
            description = this.interop.toUnifiedPlan(description);
    
        this.peerconnection.setRemoteDescription(description,
            function () { successCallback(); },
            function (err) { failureCallback(err); }
        );
    };
    
    PeerConnectionAdapter.prototype.createAnswer
      = function (successCallback, failureCallback, constraints) {
        var self = this;
        this.peerconnection.createAnswer(
            function (answer) {
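                // if we're running on FF, transform the Unified Plan answer back to Plan B for the app.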
                if (navigator.mozGetUserMedia)
                    answer = self.interop.toPlanB(answer);
                successCallback(answer);
            },
            function(err) {
                failureCallback(err);
            },
            constraints
        );
    };
    
    PeerConnectionAdapter.prototype.createOffer
      = function (successCallback, failureCallback, constraints) {
        var self = this;
        this.peerconnection.createOffer(
            function (offer) {
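                // if we're running on FF, transform the Unified Plan offer back to Plan B for the app.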
                if (navigator.mozGetUserMedia)
                    offer = self.interop.toPlanB(offer);
                successCallback(offer);
            },
            function(err) {
                failureCallback(err);
            },
            constraints
        );
    };
    

    Beyond the basics

    Like most things in life, sdp-interop is not “perfect”: it makes certain assumptions and has some limitations. First and foremost, unfortunately, a Plan B offer/answer does not have enough information to rebuild an equivalent Unified Plan offer/answer. So, while it is easy (with some limitations) to go from Unified Plan to Plan B, the reverse is not possible without keeping some state.

    Suppose, for example, that a Firefox client gets an offer from the Focus to join a large call. In the native createAnswer() success callback, you get a Unified Plan answer that contains multiple m-lines. You convert it into a Plan B answer using the sdp-interop module and hand it over to the app to do its thing. At some point later on, the app calls the adapter’s setLocalDescription() method. The adapter then has to convert the Plan B answer back to a Unified Plan one to pass it to Firefox.

    That’s the tricky part, because you can’t naively put any SSRC in any m-line: each SSRC has to go back into the same m-line it occupied in the original answer from the createAnswer() success callback. The order of the m-lines matters too, so each m-line must keep the position it had in that original answer (matching the position of the corresponding m-line in the Unified Plan offer). It is also forbidden to remove an m-line; if one is no longer used, it has to be marked as inactive instead. Similar considerations apply when converting a Plan B offer to a Unified Plan one, for example when doing renegotiation.

    sdp-interop solves this issue by caching both the most recent Unified Plan offer and the most recent Unified Plan answer. When one goes from Plan B to Unified Plan, sdp-interop uses the cached Unified Plan offer/answer and adds the missing information from there. You can see here exactly how this is done.
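
    As a very rough sketch of the idea (this is not sdp-interop’s actual code), the cached answer can be reduced to an SSRC-to-m-line map that later conversions consult:

    // remember which m-line each SSRC lived in, so that a later
    // Plan B -> Unified Plan conversion can put every SSRC back
    // into the position Firefox originally assigned to it
    function cacheSsrcPositions(unifiedSdp) {
        var positions = {};
        unifiedSdp.split('\nm=').forEach(function (section, index) {
            (section.match(/a=ssrc:\d+/g) || []).forEach(function (line) {
                positions[line.substring('a=ssrc:'.length)] = index;
            });
        });
        return positions;
    }
    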

    Another limitation is that, in some cases, a Unified Plan SDP cannot be mapped to a Plan B SDP. If the Unified Plan SDP has two audio m-lines (for example) that have different media or transport attributes, these cannot be reconciled when trying to squish them together into a single Plan B m-section. This is why sdp-interop can only work if the transport attributes are the same (i.e., bundle and rtcp-mux are being used), and if all codec attributes are exactly the same for each m-line of a given media type. Fortunately, Chrome and Firefox do both of these things by default. (This is probably also part of the reason why implementing Unified Plan won’t be trivial for Chrome.)

    One last soft limitation is that the SDP interoperability layer has only been tested when Firefox answers a call, not when it offers one, because in the Jitsi architecture the endpoints are always invited by the Focus to join a call and never send offers themselves.

    Far, far beyond the basics

    Even with the SDP interoperability layer in place, a number of difficulties had to be overcome to bring Firefox support to Jitsi Videobridge and Mozilla has been a great help in solving all of them. In most cases, the problem was easy to fix, but required time and effort to identify. For reference (and for fun!) we’ll briefly describe a few of those problems here.

    One of our first unpleasant surprises was that, one day, the Jitsi prototype implementation suddenly stopped working. The DTLS negotiation started failing soon after Mozilla enabled DTLS 1.2 in Firefox and, as it turned out, there was a problem in the DTLS version negotiation between Firefox and our Bouncy Castle-based stack. The RFCs are a little ambiguous about record-layer versions, but we took the OpenSSL rules to be the de facto standard and patched our stack to behave according to them.

    Another minor issue was that Firefox was missing msids but Mozilla kindly took care of that.

    Next, the Jitsi team faced a very weird issue in which remote video playback on the Firefox side froze or never started: the decoder was stalling. The weird thing was that, in the test environment (LAN conditions), the problem appeared to be triggered only when goog-remb was signaled in the SDP. After some digging, it turned out that the problem had nothing to do with goog-remb. The real issue was that the Jitsi Videobridge was relaying RED to Firefox, but the latter doesn’t currently support ULPFEC/RED, so nothing made it through to the decoder. Signaling goog-remb probably tells Chrome to encapsulate VP8 into RED right from the beginning of the stream, even before any packet loss is detected. (Because of the overhead that redundant data adds, it’s usually a good idea to activate it only when network conditions require it.) The Jitsi Videobridge now decapsulates RED into plain VP8 when it streams to Firefox (or any other client that doesn’t support ULPFEC/RED).

    The Jitsi team has also discovered and fixed a few issues in the Jitsi code base, including a non-zero offset bug in our stack, probably inside the SRTP transformers, that was causing SRTP auth failures.

    Finally, and maybe most importantly, in a typical multistream-enabled conference, Firefox creates two (potentially three) sendrecv channels (for audio, for video, and potentially for data) and N recvonly channels, some for incoming audio and some for incoming video. Those recvonly channels send RTCP feedback with an internally generated SSRC. Here’s where the trouble began.

    Those internally generated SSRCs of the recvonly channels are known only to Firefox. They are known neither to the client app (as they’re not included in the SDP), nor to the Jitsi Videobridge, nor to the other endpoints, notably Chrome.

    When using bundle, Chrome will discard RTCP traffic coming from unannounced SSRCs, as it uses SSRCs to decide whether an RTCP packet should go to the sending audio channel or the sending video channel. If it can’t find where to dispatch an RTCP packet, it drops it. Firefox is not affected, as it handles this differently. The webrtc code that does the filtering is in bundlefilter.cc, which is not included in mozilla-central. Unfortunately, we (Jitsi) have the same filtering/demux logic implemented in our gateway.

    This is hugely important because, although PLIs, RRs, NACKs, and the like from the recvonly channels might reach Chrome, they are discarded, so the typical result is a stalled decoder on the Firefox side. Mozilla fixed this in Bug 1160280 by exposing the SSRCs of recvonly channels in the SDP.

    Conclusion

    It’s been quite an interesting journey, but we are almost there! Firefox Nightly (v41) and Firefox Developer Edition 40 have all the required pieces in place, and Jitsi-based many-to-many conferences work fine using multistream.

    One of the last things for Jitsi to tackle is simulcast support in Firefox. Jitsi’s simulcast implementation relies heavily on MediaStream constructors, but they’re not available in Firefox at the moment. The Jitsi team is working on an alternative approach that doesn’t require MediaStream constructors. Desktop sharing is another significant item that’s missing when Jitsi runs on Firefox, but it is also currently a work in progress.

    In other words, Firefox and Jitsi are about to become best buddies!