Mozilla

JavaScript Articles

Sort by:

View:

  1. Visually Representing Angular Applications

    This article concerns diagrammatically representing Angular applications. It is a first step, not a fully figured out dissertation about how to visual specify or document Angular apps. And maybe the result of this is that I, with some embarrassment, find out that someone else already has a complete solution.

    My interest in this springs from two ongoing projects:

    1. My day job working on the next generation version of Desk.com‘s support center agent application and
    2. My night job working on a book, Angular In Depth, for Manning Publications

    1: Large, complex Angular application

    The first involves working on a large, complex Angular application as part of a multi-person front-end team. One of the problems I, and I assume other team members encounter (hopefully I’m not the only one), is getting familiar enough with different parts of the application so my additions or changes don’t hose it or cause problems down the road.

    With Angular application it is sometimes challenging to trace what’s happening where. Directives give you the ability to encapsulate behavior and let you employ that behavior declaratively. That’s great. Until you have nested directives or multiple directives operating in tandem that someone else painstakingly wrote. That person probably had a clear vision of how everything related and worked together. But, when you come to it newly, it can be challenging to trace the pieces and keep them in your head as you begin to add features.

    Wouldn’t it be nice to have a visual representation of complex parts of an Angular application? Something that gives you the lay-of-the-land so you can see at a glance what depends on what.

    2: The book project

    The second item above — the book project — involves trying to write about how Angular works under-the-covers. I think most Angular developers have at one time or another viewed some part of Angular as magical. We’ve also all cursed the documentation, particularly those descriptions that use terms whose descriptions use terms whose descriptions are poorly defined based on an understanding of the first item in the chain.

    There’s nothing wrong with using Angular directives or services as demonstrated in online examples or in the documentation or in the starter applications. But it helps us as developers if we also understand what’s happening behind the scenes and why. Knowing how Angular services are created and managed might not be required to write an Angular application, but the ease of writing and the quality can be, I believe, improved by better understanding those kinds of details.

    Visual representations

    In the course of trying to better understand Angular behind-the-scenes and write about it, I’ve come to rely heavily on visual representations of the key concepts and processes. The visual representations I’ve done aren’t perfect by any means, but just working through how to represent a process in a diagram has a great clarifying effect.

    There’s nothing new about visually representing software concepts. UML, process diagrams, even Business Process Modeling Notation (BPMN) are ways to help visualize classes, concepts, relationships and functionality.

    And while those diagramming techniques are useful, it seems that at least in the Angular world, we’re missing a full-bodied visual language that is well suited to describe, document or specify Angular applications.

    We probably don’t need to reinvent the wheel here — obviously something totally new is not needed — but when I’m tackling a (for me) new area of a complex application, having available a customized visual vocabulary to represent it would help.

    Diagrammatically representing front-end JavaScript development

    I’m working with Angular daily so I’m thinking specifically about how to represent an Angular application but this may also be an issue within the larger JavaScript community: how to diagrammatically represent front-end JavaScript development in a way allows us to clearly visualize our models, controllers and views, and the interactions between the DOM and our JavaScript code including a event-driven, async callbacks. In other words, a visual domain specific language (DSL) for client-side JavaScript development.

    I don’t have a complete answer for that, but in self-defense I started working with some diagrams to roughly represent parts of an Angular application. Here’s sort of the sequence I went through to arrive at a first cut:

    1. The first thing I did was write out a detailed description of the problem and what I wanted out of an Angular visual DSL. I also defined some simple abbreviations to use to identify the different types of Angular “objects” (directives, controllers, etc.). Then I dove in began diagramming.
    2. I identified the area of code I needed to understand better, picked a file and threw it on the diagram. What I wanted to do was to diagram it in such a way that I could look at that one file and document it without simultaneously having to trace everything to which it connected.
    3. When the first item was on the diagram, I went to something on which it depended. For example, starting with a directive this leads to associated views or controllers. I diagrammed the second item and added the relationship.
    4. I kept adding items and relationships including nested directives and their views and controllers.
    5. I continued until the picture made sense and I could see the pieces involved in the task I had to complete.

    Since I was working on a specific ticket, I knew the problem I needed to solve so not all information had to be included in each visual element. The result is rough and way too verbose, but it did accomplish:

    • Showing me the key pieces and how they related, particularly the nested directives.
    • Including useful information on where methods or $scope properties lived.
    • Giving a guide to the directories where each item lives.

    It’s not pretty but here is the result:

    This represents a somewhat complicated part of the code and having the diagram helped in at least four ways:

    • By going through the exercise of creating it, I learned the pieces involved in an orderly way — and I didn’t have to try to retain the entire structure in my head as I went.
    • I got the high-level view I needed.
    • It was very helpful when developing, particularly since the work got interrupted and I had to come back to it a few days later.
    • When the work was done, I added it to our internal WIKI to ease future ramp-up in the area.

    I think the some next steps might be to define and expand the visual vocabulary by adding things such as:

    • Unique shapes or icons to identify directives, controllers, views, etc.
    • Standardize how to represent the different kinds of relationships such as ng-include or a view referenced by a directive.
    • Standardize how to represent async actions.
    • Add representations of the model.

    As I said in the beginning, this is rough and nowhere near complete, but it did confirm for me the potential value of having a diagramming convention customized for JavaScript development. And in particular, it validated the need for a robust visual DSL to explore, explain, specify and document Angular applications.

  2. interact.js for drag and drop, resizing and multi-touch gestures

    interact.js is a JavaScript module for Drag and drop, resizing and multi-touch gestures with inertia and snapping for modern browsers (and also IE8+).

    Background

    I started it as part of my GSoC 2012 project for Biographer‘s network visualization tool. The tool was a web app which rendered to an SVG canvas and used jQuery UI for drag and drop, selection and resizing. Because jQuery UI has little support for SVG, heavy workarounds had to be used. I needed to make the web app more usable on smartphones and tablets and the largest chunk of this work was to replace jQuery UI with interact.js which:

    • is lightweight,
    • works well with SVG,
    • handles multi-touch input,
    • leaves the task of rendering/styling elements to the application and
    • allows the application to supply object dimensions instead of parsing element styles or getting DOMRects.

    What interact.js tries to do is present input data consistently across different browsers and devices and provide convenient ways to pretend that the user did something that they didn’t really do (snapping, inertia, etc.).

    Certain sequences of user input can lead to InteractEvents being fired. If you add event listeners for an event type, that function is given an InteractEvent object which provides pointer coordinates and speed and, in gesture events, scale, distance, angle, etc. The only time interact.js modifies the DOM is to style the cursor; making an element move while a drag happens has to be done from your own event listeners. This way you’re in control of everything that happens.

    Slider demo

    Here’s an example of how you could make a slider with interact.js. You can view and edit the complete HTML, CSS and JS of all the demos in this post on CodePen.

    See the Pen interact.js simple slider by Taye A (@taye) on CodePen.

    JavaScript rundown

    interact('.slider')                   // target the matches of that selector
      .origin('self')                     // (0, 0) will be the element's top-left
      .restrict({drag: 'self'})           // keep the drag within the element
      .inertia(true)                      // start inertial movement if thrown
      .draggable({                        // make the element fire drag events
        max: Infinity                     // allow drags on multiple elements
      })
      .on('dragmove', function (event) {  // call this function on every move
        var sliderWidth = interact.getElementRect(event.target.parentNode).width,
            value = event.pageX / sliderWidth;
     
        event.target.style.paddingLeft = (value * 100) + '%';
        event.target.setAttribute('data-value', value.toFixed(2));
      });
     
    interact.maxInteractions(Infinity);   // Allow multiple interactions
    • interact('.slider') [docs] creates an Interactable object which targets elements that match the '.slider' CSS selector. An HTML or SVG element object could also have been used as the target but using a selector lets you use the same settings for multiple elements.
    • .origin('self') [docs] tells interact.js to modify the reported coordinates so that an event at the top-left corner of the target element would be (0,0).
    • .restrict({drag: 'self'}) [docs] keeps the coordinates within the area of the target element.
    • .inertia(true) [docs] lets the user “throw” the target so that it keeps moving after the pointer is released.
    • Calling .draggable({max: Infinity}) [docs] on the object:
      • allows drag listeners to be called when the user drags from an element that matches the target and
      • allows multiple target elements to be dragged simultaneously
    • .on('dragmove', function (event) {...}) [docs] adds a listener for the dragmove event. Whenever a dragmove event occurs, all listeners for that event type that were added to the target Interactable are called. The listener function here calculates a value from 0 to 1 depending on which point along the width of the slider the drag happened. This value is used to position the handle.
    • interact.maxInteractions(Infinity) [docs] is needed to enable multiple interactions on any target. The default value is 1 for backwards compatibility.

    A lot of differences in browser implementations are resolved by interact.js. MouseEvents, TouchEvents and PointerEvents would produce identical drag event objects so this slider works on iOS, Android, Firefox OS and Windows RT as well as on desktop browsers as far back as IE8.

    Rainbow pixel canvas demo

    interact.js is useful for more than moving elements around a page. Here I use it for drawing onto a canvas element.

    See the Pen interact.js pixel rainbow canvas by Taye A (@taye) on CodePen.

    JavaScript rundown

    var pixelSize = 16;
     
    interact('.rainbow-pixel-canvas')
      .snap({
        // snap to the corners of a grid
        mode: 'grid',
        // specify the grid dimensions
        grid: { x: pixelSize, y: pixelSize }
      })
      .origin('self')
      .draggable({
        max: Infinity,
        maxPerElement: Infinity
      })
      // draw colored squares on move
      .on('dragmove', function (event) {
        var context = event.target.getContext('2d'),
            // calculate the angle of the drag direction
            dragAngle = 180 * Math.atan2(event.dx, event.dy) / Math.PI;
     
        // set color based on drag angle and speed
        context.fillStyle = 'hsl(' + dragAngle + ', 86%, '
                            + (30 + Math.min(event.speed / 1000, 1) * 50) + '%)';
     
        // draw squares
        context.fillRect(event.pageX - pixelSize / 2, event.pageY - pixelSize / 2,
                         pixelSize, pixelSize);
      })
      // clear the canvas on doubletap
      .on('doubletap', function (event) {
        var context = event.target.getContext('2d');
     
        context.clearRect(0, 0, context.canvas.width, context.canvas.height);
      });
     
      function resizeCanvases () {
        [].forEach.call(document.querySelectorAll('.rainbow-pixel-canvas'), function (canvas) {
          canvas.width = document.body.clientWidth;
          canvas.height = window.innerHeight * 0.7;
        });
      }
     
      // interact.js can also add DOM event listeners
      interact(document).on('DOMContentLoaded', resizeCanvases);
      interact(window).on('resize', resizeCanvases);
     
    interact.maxInteractions(Infinity);

    Snapping is used to modify the pointer coordinates so that they are always aligned to a grid.

      .snap({
        // snap to the corners of a grid
        mode: 'grid',
        // specify the grid dimensions
        grid: { x: pixelSize, y: pixelSize }
      })

    Like in the previous demo, multiple drags are enabled but an extra option, maxPerElement, needs to be changed to allow multiple drags on the same element.

      .draggable({
        max: Infinity,
        maxPerElement: Infinity
      })

    The movement angle is calculated with Math.atan2(event.dx, event.dy) and that’s used to set the hue of the paint color. event.speed is used to adjust the lightness.

    interact.js has tap and double tap events which are equivalent to click and double click but without the delay on mobile devices. Also, unlike regular click events, a tap isn’t fired if the mouse is moved before being released. (I’m working on adding more events like these).

      // clear the canvas on doubletap
      .on('doubletap', function (event) {
        ...

    It can also listen for regular DOM events. In the above demo it’s used to listen for window resize and document DOMContentLoaded.

      interact(document).on('DOMContentLoaded', resizeCanvases);
      interact(window).on('resize', resizeCanvases);

    Similar to jQuery, It can also be used for delegated events. For example:

    interact('input', { context: document.body })
      .on('keypress', function (event) {
        console.log(event.key);
      });

    Supplying element dimensions

    To get element dimensions interact.js normally uses:

    • Element#getBoundingClientRect() for SVGElements and
    • Element#getClientRects()[0] for HTMLElements (because it includes the element’s borders)

    and adds page scroll. This is done when checking which action to perform on an element, checking for drops, calculating 'self' origin and in a few other places. If your application keeps the dimensions of elements that are being interacted with, then it makes sense to use the application’s data instead of getting the DOMRect. To allow this, Interactables have a rectChecker() [docs] method to change how elements’ dimensions are gotten. The method takes a function as an argument. When interact.js needs an element’s dimensions, the element is passed to that function and the return value is used.

    Graphic Editor Demo

    The “SVG editor” below has a Rectangle class to represent <rect class="edit-rectangle"/> elements in the DOM. Each rectangle object has dimensions, the element that the user sees and a draw method.

    See the Pen Interactable#rectChecker demo by Taye A (@taye) on CodePen.

    JavaScript rundown

    var svgCanvas = document.querySelector('svg'),
        svgNS = 'http://www.w3.org/2000/svg',
        rectangles = [];
     
    function Rectangle (x, y, w, h, svgCanvas) {
      this.x = x;
      this.y = y;
      this.w = w;
      this.h = h;
      this.stroke = 5;
      this.el = document.createElementNS(svgNS, 'rect');
     
      this.el.setAttribute('data-index', rectangles.length);
      this.el.setAttribute('class', 'edit-rectangle');
      rectangles.push(this);
     
      this.draw();
      svgCanvas.appendChild(this.el);
    }
     
    Rectangle.prototype.draw = function () {
      this.el.setAttribute('x', this.x + this.stroke / 2);
      this.el.setAttribute('y', this.y + this.stroke / 2);
      this.el.setAttribute('width' , this.w - this.stroke);
      this.el.setAttribute('height', this.h - this.stroke);
      this.el.setAttribute('stroke-width', this.stroke);
    }
     
    interact('.edit-rectangle')
      // change how interact gets the
      // dimensions of '.edit-rectangle' elements
      .rectChecker(function (element) {
        // find the Rectangle object that the element belongs to
        var rectangle = rectangles[element.getAttribute('data-index')];
     
        // return a suitable object for interact.js
        return {
          left  : rectangle.x,
          top   : rectangle.y,
          right : rectangle.x + rectangle.w,
          bottom: rectangle.y + rectangle.h
        };
      })

    Whenever interact.js needs to get the dimensions of one of the '.edit-rectangle' elements, it calls the rectChecker function that was specified. The function finds the Rectangle object using the element argument then creates and returns an appropriate object with left, right, top and bottom properties.

    This object is used for restricting when the restrict elementRect option is set. In the slider demo from earlier, restriction used only the pointer coordinates. Here, restriction will try to prevent the element from being dragged out of the specified area.

      .inertia({
        // don't jump to the resume location
        // https://github.com/taye/interact.js/issues/13
        zeroResumeDelta: true
      })
      .restrict({
        // restrict to a parent element that matches this CSS selector
        drag: 'svg',
        // only restrict before ending the drag
        endOnly: true,
        // consider the element's dimensions when restricting
        elementRect: { top: 0, left: 0, bottom: 1, right: 1 }
      })

    The rectangles are made draggable and resizable.

      .draggable({
        max: Infinity,
        onmove: function (event) {
          var rectangle = rectangles[event.target.getAttribute('data-index')];
     
          rectangle.x += event.dx;
          rectangle.y += event.dy;
          rectangle.draw();
        }
      })
      .resizable({
        max: Infinity,
        onmove: function (event) {
          var rectangle = rectangles[event.target.getAttribute('data-index')];
     
          rectangle.w = Math.max(rectangle.w + event.dx, 10);
          rectangle.h = Math.max(rectangle.h + event.dy, 10);
          rectangle.draw();
        }
      });
     
    interact.maxInteractions(Infinity);

    Development and contributions

    I hope this article gives a good overview of how to use interact.js and the types of applications that I think it would be useful for. If not, there are more demos on the project homepage and you can throw questions or issues at Twitter or Github. I’d really like to make a comprehensive set of examples and documentation but I’ve been too busy with fixes and improvements. (I’ve also been too lazy :-P).

    Since the 1.0.0 release, user comments and contributions have led to loads of bug fixes and many new features including:

    So please use it, share it, break it and help to make it better!

  3. jsDelivr and its open-source load balancing algorithm

    This is a guest post by Dmitriy Akulov of jsDelivr.

    Recently I wrote about jsDelivr and what makes it unique where I described in detail about the features that we offer and how our system works. Since then we improved a lot of stuff and released even more features. But the biggest one is was the open source of our load balancing algorithm.

    As you know from the previous blog post we are using Cedexis to do our load balancing. In short we collect millions of RUM (Real User Metrics) data points from all over the world. When a user visits a website partner of Cedexis or ours a JavaScript is executed in the background that does performance checks to our core CDNs, MaxCDN and CloudFlare, and sends this data back to Cedexis. We can then use it to do load balancing based on real time performance information from real life users and ISPs. This is important as it allows us to mitigate outages that CDNs can experience in very localized areas such as a single country or even a single ISP and not worldwide.

    Open-sourcing the load balancing code

    Now our load balancing code is open to everybody to review, test and even send their own Pull Requests with improvements and modifications.

    Until recently the code was actually written in PHP, but due to performance issues and other problems that arrised from that it was decided to switch to JavaScript. Now the DNS application is completely written in js and I will try to explain how exactly it works.

    This is an application that runs on a DNS level and integrates with Cedexis’ API. Every DNS request made to cdn.jsdelivr.net is processed by the following code and then based on all the variables it returns a CNAME that the client can use to get the requested asset.

    Declaring providers

    The first step is to declare our providers:

    providers: {
        'cloudflare': 'cdn.jsdelivr.net.cdn.cloudflare.net',
        'maxcdn': 'jsdelivr3.dak.netdna-cdn.com',
        ...
    },

    This array contains all the aliases of our providers and the hostnames that we can return if the provider is then chosen. We actually use a couple of custom servers to improve the performance in locations that the CDNs lack but we are currently in the process of removing all of them in favor of more enterprise CDNs that wish to sponsor us.

    Before I explain the next array I want to skip to line 40:

    defaultProviders: [ 'maxcdn', 'cloudflare' ],

    Because our CDN providers get so much more RUM tests than our custom servers their data and in turn the load balancing results are much more reliable and better. This is why by default only MaxCDN and CloudFlare are considered for any user request. And its actually the main reason we want to sunset our custom servers.

    Country mapping

    Now that you know that comes our next array:

    countryMapping: {
        'CN': [ 'exvm-sg', 'cloudflare' ],
        'HK': [ 'exvm-sg', 'cloudflare' ],
        'ID': [ 'exvm-sg', 'cloudflare' ],
        'IT': [ 'prome-it', 'maxcdn', 'cloudflare' ],
        'IN': [ 'exvm-sg', 'cloudflare' ],
        'KR': [ 'exvm-sg', 'cloudflare' ],
        'MY': [ 'exvm-sg', 'cloudflare' ],
        'SG': [ 'exvm-sg', 'cloudflare' ],
        'TH': [ 'exvm-sg', 'cloudflare' ],
        'JP': [ 'exvm-sg', 'cloudflare', 'maxcdn' ],
        'UA': [ 'leap-ua', 'maxcdn', 'cloudflare' ],
        'RU': [ 'leap-ua', 'maxcdn' ],
        'VN': [ 'exvm-sg', 'cloudflare' ],
        'PT': [ 'leap-pt', 'maxcdn', 'cloudflare' ],
        'MA': [ 'leap-pt', 'prome-it', 'maxcdn', 'cloudflare' ]
    },

    This array contains country mappings that override the “defaultProviders” parameter. This is where the custom servers currently come to use. For some countries we know 100% that our custom servers can be much faster than our CDN providers so we manually specify them. Since these locations are few we only need to create handful of rules.

    ASN mappings

    asnMapping: {
        '36114': [ 'maxcdn' ], // Las Vegas 2
        '36351': [ 'maxcdn' ], // San Jose + Washington
        '42473': [ 'prome-it' ], // Milan
        '32489': [ 'cloudflare' ], // Canada
        ...
    },

    ASN mappings contains overrides per ASN. Currently we are using them to improve the results of Pingdom tests. The reason for this is because we rely on RUM results to do load balancing we never get any performance tests for ASNs used by hosting providers such as companies where Pingdom rents their servers. So the code is forced to failover to country level performance data to chose the best provider for Pingdom and any other synthetic test and server. This data is not always reliable because not all ISPs have the same performance with a CDN provider as the fastest CDN provider country-wide. So we tweak some ASNs to work better with jsDelivr.

    More settings

    • lastResortProvider sets the CDN provider we want to use in case the application fails to chose one itself. This should be very rare.
    • defaultTtl: 20 is the TTL for our DNS record. We made some tests and decided that this was the most optimal value. In worst case scenario in case of downtime the maximum downtime jsDelivr can have is 20 seconds. Plus our DNS and our CDN are fast enough to compensate for the extra DNS latency every 20 seconds without having any impact on performance.
    • availabilityThresholds is a value in percentage and sets the uptime below which a provider should be considered down. This is based on RUM data. Again because of some small issues with synthetic tests we had to lower the Pingdom threshold. The Pingdom value does not impact anyone else.
    • sonarThreshold Sonar is a secondary uptime monitor we use to ensure the uptime of our providers. It runs every 60 seconds and checks all of our providers including their SSL certificates. If something is wrong our application will pick up the change in uptime and if it drops below this threshold it will be considered as down.
    • And finally minValidRtt is there to filter out all invalid RUM tests.

    The initialization process

    Next our app starts the initialization process. Wrong config and uptime not meeting our criteria is checked and all providers not matching our criteria are then removed from the potential candidates for this request.

    Next we create a reasons array for debugging purposes and apply our override settings. Here we use Cedexis API to get the latest live data for sonar uptime, rum update and HTTP performance.

    sonar = request.getData('sonar');
    candidates = filterObject(request.getProbe('avail'), filterCandidates);
    //console.log('candidates: ' + JSON.stringify(candidates));
    candidates = joinObjects(candidates, request.getProbe('http_rtt'), 'http_rtt');
    //console.log('candidates (with rtt): ' + JSON.stringify(candidates));
    candidateAliases = Object.keys(candidates);

    In case of uptime we also filter our bad providers that dont meet our criteria of uptime by calling the filterCandidates function.

    function filterCandidates(candidate, alias) {
        return (-1 < subpopulation.indexOf(alias))
        && (candidate.avail !== undefined)
        && (candidate.avail >= availabilityThreshold)
        && (sonar[alias] !== undefined)
        && (parseFloat(sonar[alias]) >= settings.sonarThreshold);
    }

    The actual decision making is performed by a rather small code:

    if (1 === candidateAliases.length) {
        decisionAlias = candidateAliases[0];
        decisionReasons.push(reasons.singleAvailableCandidate);
        decisionTtl = decisionTtl || settings.defaultTtl;
    } else if (0 === candidateAliases.length) {
        decisionAlias = settings.lastResortProvider;
        decisionReasons.push(reasons.noneAvailableOrNoRtt);
        decisionTtl = decisionTtl || settings.defaultTtl;
    } else {
        candidates = filterObject(candidates, filterInvalidRtt);
        //console.log('candidates (rtt filtered): ' + JSON.stringify(candidates));
        candidateAliases = Object.keys(candidates);
        if (!candidateAliases.length) {
        decisionAlias = settings.lastResortProvider;
        decisionReasons.push(reasons.missingRttForAvailableCandidates);
        decisionTtl = decisionTtl || settings.defaultTtl;
    } else {
        decisionAlias = getLowest(candidates, 'http_rtt');
        decisionReasons.push(reasons.rtt);
        decisionTtl = decisionTtl || settings.defaultTtl;
    }
    }
        response.respond(decisionAlias, settings.providers[decisionAlias]);
        response.setReasonCode(decisionReasons.join(''));
        response.setTTL(decisionTtl);
    };

    In case we only have 1 provider left after our checks we simply select that provider and output the CNAME, if we have 0 providers left then the lastResortProvider is used. Otherwise if everything is ok and we have more than 1 provider left we do more checks.

    Once we have left with providers that are currently online and don’t have any issues with their performance data we sort them based on RUM HTTP performance and push the CNAME out for the user’s browser to use.

    And thats it. Most of the other stuff like fallback to country level data is automatically done in backend and we only get the actual data we can use in our application.

    Conclusion

    I hope you found it interesting and learned more about what you should be considering when doing load balancing especially based on RUM data.

    Check out jsDelivr and feel free to use it in your projects. If you are interested to help we are also looking for node.js developers and designers to help us out.

    We are also looking for companies sponsors to help us grow even faster.

  4. An easier way of using polyfills

    Polyfills are a fantastic way to enable the use of modern code even while supporting legacy browsers, but currently using polyfills is too hard, so at the FT we’ve built a new service to make it easier. We’d like to invite you to use it, and help us improve it.

    Image credit: https://www.flickr.com/photos/hamur0w0/6984884135

    More pictures, they said. So here’s a unicorn, which is basically a horse with a polyfill.

    The challenge

    Here are some of the issues we are trying to solve:

    • Developers do not necessarily know which features need to be polyfilled. You load your site in some old version of IE beloved by a frustratingly large number of your users, see that the site doesn’t work, and have to debug it to figure out which feature is causing the problem. Sometimes the culprit is obvious, but often not, especially when legacy browsers also lack good developer tools.
    • There are often multiple polyfills available for each feature. It can be hard to know which one most faithfully emulates the missing feature.
    • Some polyfills come as a big bundle with lots of other polyfills that you don’t need, to provide comprehensive coverage of a large feature set, such as ES6. It should not be necessary to ship all of this code to the browser to fix something very simple.
    • Newer browsers don’t need the polyfill, but typically the polyfill is served to all browsers. This reduces performance in modern browsers in order to improve compatibility with legacy ones. We don’t want to make that compromise. We’d rather serve polyfills only to browsers that lack a native implementation of the feature.

    Our solution: polyfills as a service

    To solve these problems, we created the polyfill service. It’s a similar idea to going to an optometrist, having your eyes tested, and getting a pair of glasses perfectly designed to correct your particular vision problem. We are doing the same for browsers. Here’s how it works:

    1. Developers insert a script tag into their page, which loads the polyfill service endpoint.
    2. The service analyses the browser’s user-agent header and a list of requested features (or uses a default list of everything polyfillable) and builds a list of polyfills that are required for this browser
    3. The polyfills are ordered using a graph sort to place them in the right dependency order.
    4. The bundle is minified and served through a CDN (for which we’re very grateful to Fastly for their support)

    Do we really need this solution? Well, consider this: Modernizr is a big grab bag of feature detects, and all sensible use cases benefit from a custom build, but a large proportion of Modernizr users just use the default build, often from cdnjs.com or as part of html5boilerplate. Why include Modernizr if you aren’t using its feature detects? Maybe you misunderstand the purpose of the library and just think that Modernizr “fixes stuff”? I have to admit, I did, when I first heard the name, and I was mildly disappointed to find that rather than doing any actual modernising, Modernizr actually just defines modernness.

    The polyfill service, on the other hand, does fix stuff. There’s really nothing wrong with not wanting to spend time gaining intimate knowledge of all the foibles of legacy browsers. Let someone figure it out once, and then we can all benefit from it without needing or wanting to understand the details.

    How to use it

    The simplest use case is:

    <script src="//cdn.polyfill.io/v1/polyfill.min.js" async defer></script>

    This includes our default polyfill set. The default set is a manually curated list of features that we think are most essential to modern web development, and where the polyfills are reasonably small and highly accurate. If you want to specify which features you want to polyfill though, go right ahead:

    <!-- Just the Array.from polyfill -->
    <script src="//cdn.polyfill.io/v1/polyfill.min.js?features=Array.from" async defer></script>
     
    <!-- The default set, plus the geolocation polyfill -->
    <script src="//cdn.polyfill.io/v1/polyfill.min.js?features=default,Navigator.prototype.geolocation" async defer></script>

    If it’s important that you have loaded the polyfills before parsing your own code, you can remove the async and defer attributes, or use a script loader (one that doesn’t require any polyfills!).

    Testing and documenting feature support

    This table shows the polyfill service’s effect for a number of key web technologies and a range of popular browsers:

    Polyfill service support grid

    The full list of features we support is shown on our feature matrix. To build this grid we use Sauce Labs’ test automation platform, which runs each polyfill through a barrage of tests in each browser, and documents the results.

    So, er, user-agent sniffing? Really?

    Yes. There are several reasons why UA analysis wins out over feature detection for us:

    • In some cases, we have multiple polyfills for the same feature, because some browsers offer a non-compliant implementation that just needs to be bashed into shape, while others lack any implementation at all. With UA detection you can choose to serve the right variant of the polyfill.
    • With UA detection, the first HTTP request can respond directly with polyfill code. If we used feature detection, the first request would serve feature-detect code, and then a second one would be needed to fetch specific polyfills.

    Almost all websites with significant scale do UA detection. This isn’t to say the stigma attached to it is necessarily bad. It’s easy to write bad UA detect rules, and hard to write good ones. And we’re not ruling out making a way of using the service via feature-detects (in fact there’s an issue in our tracker for it).

    A service for everyone

    The service part of the app is maintained by the FT, and we are working on expanding and improving the tools, documentation, testing and service features all the time. The source is freely available on GitHub so you can easily host it yourself, but we also host an instance of the service on cdn.polyfill.io which you can use for free, and our friends at Fastly are providing free CDN distribution and SSL.

    We’ve made a platform. We need the community’s help to populate it. We already serve some of the best polyfills from Jonathan Neal, Mathias Bynens and others, but we’d love to be more comprehensive. Bring your polyfills, improve our tests, and make this a resource that can help move the web forward!

  5. Porting to Emscripten

    Emscripten is an open-source compiler that compiles C/C++ source code into the highly optimizable asm.js subset of JavaScript. This enables running programs originally written for desktop environments in a web browser.

    Porting your game to Emscripten offers several benefits. Most importantly it enables reaching a far wider potential user base. Emscripten games work on any modern web browser. There is no need for installers or setups – the user just opens a web page. Local storage of game data in browser cache means the game only needs to be re-downloaded after updates. If you implement a cloud based user data storage system users can continue their gameplay seamlessly on any computer with a browser.

    More info is available in:

    While Emscripten support for portable C/C++ code is very good there are some things that need to be taken into consideration. We will take a look at those in this article.

    Part 1: Preparation

    Is porting my game to Emscripten even feasible? If it is, how easy will it be? First consider the following restrictions imposed by Emscripten:

    • No closed-source third-party libraries
    • No threads

    Then, already having some of the following:

    • Using SDL2 and OpenGL ES 2.0 for graphics
    • Using SDL2 or OpenAL for audio
    • Existing multiplatform support

    will make the porting task easier. We’ll next look into each of these points more closely.

    First things to check

    If you’re using any third-party libraries for which you don’t have the source code you’re pretty much out of luck. You’ll have to rewrite your code not to use them.

    Heavy use of threads is also going to be a problem since Emscripten doesn’t currently support them. There are web workers but they’re not the same thing as threads on other platforms since there’s no shared memory. So you’ll have to disable multithreading.

    SDL2

    Before even touching Emscripten there are things you can do in your normal development environment. First of all you should use SDL2. SDL is a library which takes care of platform-specific things like creating windows and handling input. An incomplete port of SDL 1.3 ships with Emscripten and there’s a port of full SDL2 in the works. It will be merged to upstream soon.

    Space combat in FTL.

    OpenGL ES 2.0

    Second thing is to use OpenGL ES 2.0. If your game is using the SDL2 render interface this has already been done for you. If you use Direct3D you’ll first have to create an OpenGL version of your game. This is why multiplatform support from the beginning is such a good idea.

    Once you have a desktop OpenGL version you then need to create an OpenGL ES version. ES is a subset of full OpenGL where some features are not available and there are some additional restrictions. At least the NVidia driver and probably also AMD support creating ES contexts on desktop. This has the advantage that you can use your existing environment and debugging tools.

    You should avoid the deprecated OpenGL fixed-function pipeline if possible. While Emscripten has some support for this it might not work very well.

    There are certain problems you can run into at this stage. First one is lack of extension support. Shaders might also need rewriting for Emscripten. If you are using NVidia add #version line to trigger stricter shader validation.

    GLSL ES requires precision qualifiers for floating-point and integer variables. NVidia accepts these on desktop but most other GL implementations not, so you might end up with two different sets of shaders.

    OpenGL entry point names are different between GL ES and desktop. GL ES does not require a loader such as GLEW but you still might have to check GL extensions manually if you are using any. Also note that OpenGL ES on desktop is more lenient than WebGL. For example WebGL is more strict about glTexImage parameters and glTexParameter sampling modes.

    Multiple render targets might not be supported on GL ES. If you are using a stencil buffer you must also have a depth buffer. You must use vertex buffer objects, not user-mode arrays. Also you cannot mix index and vertex buffers into the same buffer object.

    For audio you should use SDL2 or OpenAL. One potential issue is that the Emscripten OpenAL implementation might require more and larger sound buffers than desktop to avoid choppy sounds.

    Multiplatform support

    It’s good if your project has multiplatform support, especially for mobile platforms (Android, iOS). There are two reasons for this. First, WebGL is essentially OpenGL ES instead of desktop OpenGL so most of your OpenGL work is already done. Second, since mobile platforms use ARM architecture most of the processor-specific problems have already been fixed. Particularly important is memory alignment since Emscripten doesn’t support unaligned loads from memory.

    After you have your OpenGL sorted out (or even concurrently with it if you have multiple people) you should port your game to Linux and/or OS X. Again there are several reasons. First one is that Emscripten is based on LLVM and Clang. If your code was written and tested with MSVC it probably contains non standard constructs which MSVC will accept but other compilers won’t. Also different optimizer might expose bugs which will be much easier to debug on desktop than on a browser.

    FTL Emscripten version main menu. Notice the missing “Quit” button. The UI is similar to that of the iPad version.

    A good overview of porting a Windows game to Linux is provided in Ryan Gordon’s Steam Dev Days talk.

    If you are using Windows you could also compile with MinGW.

    Useful debugging tools

    UBSan

    The second reason for porting to Linux is to gain access to several useful tools. First among these is undefined behavior sanitizer (UBSan). It’s a Clang compiler feature which adds runtime checks to catch C/C++ undefined behavior in your code. Most useful of these is the unaligned load check. C/C++ standard specifies that when accessing a pointer it must be properly aligned. Unfortunately x86-based processors will perform unaligned loads so most existing code has not been checked for this. ARM-based processors will usually crash your program when this happens. This is why a mobile port is good. On Emscripten an unaligned load will not crash but instead silently give you incorrect results.

    UBSan is also available in GCC starting with 4.9 but unfortunately the unaligned load sanitizer is only included in the upcoming 5.0 release.

    AddressSanitizer

    Second useful tool in Clang (and GCC) is AddressSanitizer. This is a runtime checker which validates your memory accesses. Reading or writing outside allocated buffers can lead to crashes on any platform but the problem is somewhat worse on Emscripten. Native binaries have a large address space which contains lots of empty space. Invalid read, especially one that is only slightly off, might hit a valid address and so not crash immediately or at all. On Emscripten the address space is much “denser” so any invalid access is likely to hit something critical or even be outside the allocated address space entirely. This will trigger an unspectacular crash and might be very hard to debug.

    Valgrind

    The third tool is Valgrind. It is a runtime tool which runs uninstrumented binaries and checks them for various properties. For our purposes the most useful are memcheck and massif. Memcheck is a memory validator like AddressSanitizer but it catches a slightly different set of problems. It can also be used to pinpoint memory leaks. Massif is a memory profiler which can answer the question “why am I using so much memory?” This is useful since Emscripten is also a much more memory-constrained platform than desktop or even mobile and has no built in tools for memory profiling.

    Valgrind also has some other checkers like DRD and Helgrind which check for multithreading issues but since Emscripten doesn’t support threads we won’t discuss them here. They are very useful though so if you do multithreading on desktop you really should be using them.

    Valgrind is not available on Windows and probably will never be. That alone should be a reason to port your games to other platforms.

    Third-party libraries

    Most games use a number of third-party libraries. Hopefully you’ve already gotten rid of any closed-source ones. But even open-source ones are usually shipped as already-compiled libraries. Most of these are not readily available on Emscripten so you will have to compile them yourself. Also the Emscripten object format is based on LLVM bytecode which is not guaranteed to be stable. Any precompiled libraries might no longer work in future versions of Emscripten.

    While Emscripten has some support for dynamic linking it is not complete or well supported and should be avoided.

    The best way around these issues is to build your libraries as part of your standard build process and statically link them. While bundling up your libraries to archives and including those in link step works you might run into unexpected problems. Also changing your compiler options becomes easier if all sources are part of your build system.

    Once all that is done you should actually try to compile with Emscripten. If you’re using MS Visual Studio 2010 there’s an integration module which you can try. If you’re using cmake Emscripten ships with a wrapper (emcmake) which should automatically configure your build.

    If you’re using some other build system it’s up to you to set it up. Generally CC=emcc and CXX=em++ should do the trick. You might also have to remove platform-specific options like SSE and such.

    Part 2: Emscripten itself

    So now it links but when you load it up in your browser it just hangs and after a while the browser will tell you the script has hung and kill it.

    What went wrong?

    On desktop games have an event loop which will poll input, simulate state and draw the scene and run until terminated. On a browser there is instead a callback which does these things and is called by the browser. So to get your game to work you have to refactor your loop to a callback. In Emscripten this is set with the function emscripten_set_main_loop. Fortunately in most cases this is pretty simple. The easiest way is to refactor the body of your loop to a helper function and then in your desktop version call it in a loop and in the browser set it as your callback. Or if you’re using C++11 you can use a lambda and store that in std::function. Then you can add a small wrapper which calls that.

    Problems appear if you have multiple separate loops, for example loading screens. In that case you need to either refactor them into a single loop or call them one after another, setting a new one and canceling the previous one with emscripten_cancel_main_loop. Both of these are pretty complex and depend heavily on your code.

    So, now the game runs but you get a bunch of error messages that your assets can’t be found. The next step is to add your assets to the package. The simple way is to preload them. Adding the switch --preload-file <filename> to link flags will cause Emscripten to add the specified files to a .data file which will then be preloaded before main is called. These files can then be accessed with standard C/C++ IO calls. Emscripten will take care of the necessary magic.

    However this approach becomes problematic when you have lots of assets. The whole package needs to be loaded before the program starts which can lead to excessive loading times. To fix this you can stream in some assets like music or video.

    If you already have async loading in your desktop code you can reuse that. Emscripten has the function emscripten_async_wget_data for loading data asynchronously. One difference to keep in mind is that Emscripten async calls only know asset size after loading has completed whereas desktop generally knows if after the file has been opened. For optimal results you should refactor your code to something like “load this file, then here’s an operation to do after you have it”. C++11 lambdas can be useful here. In any case you really should have matching code on the desktop version because debugging is so much easier there.

    You should add a call at the end of your main loop which handles async loads. You should not load too much stuff asynchronously as it can be slow, especially if you’re loading multiple small files.

    So now it runs for a while but crashes with a message about exceeded memory limit. Since Emscripten emulates memory with JavaScript arrays the size of those arrays is crucial. By default they are pretty small and can’t grow. You can enable growing them by linking with -s ALLOW_MEMORY_GROWTH=1 but this is slow and might disable asm.js optimizations. It’s mostly useful in the debugging phase. For final release you should find out a memory limit that works and use -s TOTAL_MEMORY=<number>.

    As described above, Emscripten doesn’t have a memory profiler. Use Valgrind massif tool on Linux to find out where the memory is spent.

    If your game is still crashing you can try using JavaScript debugger and source maps but they don’t necessarily work very well. This is why sanitizers are important. printf or other logging is a good way to debug too. Also -s SAFE_HEAP=1 in link stage can find some memory bugs.

    Osmos test version on Emscripten test html page.

    Saves and preferences

    Saving stuff is not as simple as on desktop. The first thing you should do is find all the places where you’re saving or loading user-generated data. All of it should be in one place or go through one wrapper. If it doesn’t you should refactor it on desktop before continuing.

    The simplest thing is to set up a local storage. Emscripten already has the necessary code to do it and emulate standard C-like filesystem interface so you don’t have to change anything.

    You should add something like this to either the preRun in html or first thing in your main:

    FS.createFolder('/', 'user_data', true, true)
    FS.mount(IDBFS, {}, '/user_data');
    FS.syncfs(true, function(err) {
                  if(err) console.log('ERROR!', err);
                  console.log('finished syncing..');
                }

    Then after you’ve written a file you need to tell the browser to sync it. Add a new method which contains something like this:

    static void userdata_sync()
    {
        EM_ASM(
            FS.syncfs(function(error) {
                if (error) {
                    console.log("Error while syncing", error);
                }
                });
            );
    }

    and call it after closing the file.

    While this works it has the problem that the files are stored locally. For desktop games this is not a problem since users understand that saves are stored on their computer. For web-based games the users expect their saves to be there on all computers. For the Mozilla Bundle, Humble Bundle built a CLOUDFS library which works just like Emscripten’s IDBFS and has a pluggable backend. You need to build your own using emscripten GET and POST APIs.

    Osmos demo at the Humble Mozilla Bundle page.

    Making it fast

    So now your game runs but not very fast. How to make it faster?

    On Firefox the first thing to check is that asm.js is enabled. Open web console and look for message “Successfully compiled asm.js”. If it’s not there the error message should tell you what’s going wrong.

    The next thing to check is your optimization level. Emscripten requires proper -O option both when compiling and linking. It’s easy to forget -O from link stage since desktop doesn’t usually require it. Test the different optimization levels and read the Emscripten documentation about other build flags. In particular OUTLINING_LIMIT and PRECISE_F32 might affect code speed.

    You can also enable link-time optimization by adding --llvm-lto <n> option. But beware that this has known bugs which might cause incorrect code generation and will only be fixed when Emscripten is upgraded to a newer LLVM sometime in the future. You might also run into bugs in the normal optimizer since Emscripten is still somewhat work-in-progress. So test your code carefully and if you run into any bugs report them to Emscripten developers.

    One strange feature of Emscripten is that any preloaded resources will be parsed by the browser. We usually don’t want this since we’re not using the browser to display them. Disable this by adding the following code as --pre-js:

    var Module;
    if (!Module) Module = (typeof Module !== 'undefined' ? Module : null) || {};
    // Disable image and audio decoding
    Module.noImageDecoding = true;
    Module.noAudioDecoding = true;

    Next thing: don’t guess where the time is being spent, profile! Compile your code with --profiling option (both compile and link stage) so the compiler will emit named symbols. Then use the browser’s built-in JavaScript profiler to see which parts are slow. Beware that some versions of Firefox can’t profile asm.js code so you will either have to upgrade your browser or temporarily disable asm.js by manually removing use asm -statement from the generated JavaScript. You should also profile with both Firefox and Chrome since they have different performance characteristics and their profilers work slightly differently. In particular Firefox might not account for slow OpenGL functions.

    Things like glGetError and glCheckFramebuffer which are slow on desktop can be catastrophic in a browser. Also calling glBufferData or glBufferSubData too many times can be very slow. You should refactor your code to avoid them or do as much with one call as possible.

    Another thing to note is that scripting languages used by your game can be very slow. There’s really no easy way around this one. If your language provides profiling facilities you can use those to try to speed it up. The other option is to replace your scripts with native code which will get compiled to asm.js.

    If you’re doing physics simulation or something else that can take advantage of SSE optimizations you should be aware that currently asm.js doesn’t support it but it should be coming sometime soon.

    To save some space on the final build you should also go through your code and third party libraries and disable all features you don’t actually use. In particular libraries like SDL2 and freetype contain lots of stuff which most programs don’t use. Check the libraries’ documentation on how to disable unused features. Emscripten doesn’t currently have a way to find out which parts of code are the largest but if you have a Linux build (again, you should) you can use

    nm -S --size-sort game.bin

    to see this. Just be aware that what’s large on Emscripten and what’s large on native might not be the same thing. In general they should agree pretty well.

    Sweeping autumn leaves in Dustforce.

    In conclusion

    To sum up, porting an existing game to Emscripten consists of removing any closed-source third party libraries and threading, using SDL2 for window management and input, OpenGL ES for graphics, and OpenAL or SDL2 for audio. You should also first port your game to other platforms, such as OS X and mobile, but at least for Linux. This makes finding potential issues easier and gives access to several useful debugging tools. The Emscripten port itself minimally requires changes to main loop, asset file handling, and user data storage. Also you need to pay special attention to optimizing your code to run in a browser.

  6. Massive: The asm.js Benchmark

    asm.js is a subset of JavaScript that is very easy to optimize. Most often it is generated by a compiler, such as Emscripten, from C or C++ code. The result can run at very high speeds, close to that of the same code compiled natively. For that reason, Emscripten and asm.js are useful for things like 3D game engines, which are usually large and complex C++ codebases that need to be fast, and indeed top companies in the game industry have adopted this approach, for example Unity and Epic, and you can see it in action in the Humble Mozilla Bundle, which recently ran.

    As asm.js code becomes more common, it is important to be able to measure performance on it. There are of course plenty of existing benchmarks, including Octane which contains one asm.js test, and JetStream which contains several. However, even those do not contain very large code samples, and massive codebases are challenging in particular ways. For example, just loading a page with such a script can take significant time while the browser parses it, causing a pause that is annoying to the user.

    A recent benchmark from Unity measures the performance of their game engine, which (when ported to the web) is a large asm.js codebase. Given the high popularity of the Unity engine among developers, this is an excellent benchmark for game performance in browsers, as real-world as it can get, and also it tests large-scale asm.js. It does however focus on game performance as a whole, taking into account both WebGL and JavaScript execution speed. For games, that overall result is often what you care about, but it is also interesting to measure asm.js on its own.

    Benchmarking asm.js specifically

    Massive is a benchmark that measures asm.js performance specifically. It contains several large, real-world codebases: Poppler, SQLite, Lua and Box2D; see the FAQ on the massive site for more details on each of those.

    Massive reports an overall score, summarizing it’s individual measurements. This score can help browser vendors track their performance over time and point to areas where improvements are needed, and for developers it can provide a simple way to get an idea of how fast asm.js execution is on a particular device and browser.

    Importantly, Massive does not only test throughput. As already mentioned, large codebases can affect startup time, and they can also affect responsiveness and other important aspects of the user experience. Massive therefore tests, in addition to throughput, how long it takes the browser to load a large codebase, and how responsive it is while doing so. It also tests how consistent performance is. Once again, see the FAQ for more details on each of those.

    Massive has been developed openly on github from day one, and we’ve solicited and received feedback from many relevant parties. Over the last few months Massive development has been in beta while we received comments, and there are currently no substantial outstanding issues, so we are ready to announce the first stable version, Massive 1.0.

    Massive tests multiple aspects of performance, in new ways, so it is possible something is not being measured in an optimal manner, and of course bugs always exist in software. However, by developing Massive in the open and thereby giving everyone the chance to inspect it and report issues, and by having a lengthy beta period, we believe we have the best possible chance of a reliable result. Of course, if you do find something wrong, please file an issue! General feedback is of course always welcome as well.

    Massive performance over time

    Massive is brand-new, but it is still interesting to look at how it performs on older browsers (“retroactively”), because if it measures something useful, and if browsers are moving in the right direction, then we should see Massive improve over time, even on browser versions that were released long before Massive existed. The graph below shows Firefox performance from version 14 (released 2012-07-17, over 2 years ago) and version 32 (which became the stable version in September 2014):

    Higher numbers are better, so we can indeed see that Massive scores do follow the expected pattern of improvement, with Firefox’s Massive score rising to around 6x its starting point 2 years ago. Note that the Massive score is not “linear” in the sense that 6x the score means 6x the performance, as it is calculated using the geometric mean (like Octane), however, the individual scores it averages are mostly linear. A 6x improvement therefore does represent a very large and significant speedup.

    Looking more closely at the changes over time, we can see which features landed in each of those versions of Firefox where we can see a significant improvement:

    There are three big jumps in Firefox’s Massive score, each annotated:

    • Firefox 22 introduced OdinMonkey, an optimization module for asm.js code. By specifically optimizing asm.js content, it almost doubled Firefox’s Massive score. (At the time, of course, Massive didn’t exist; but we measured speedups on other benchmarks.)
    • Firefox 26 parses async scripts off of the main thread. This avoids the browser or page becoming nonresponsive while the script loads. For asm.js content, not only parsing but also compilation happens in the background, making the user experience even smoother. Also in Firefox 26 are general optimizations for float32 operations, which appear in one of the Massive tests.
    • Firefox 29 caches asm.js code: The second time you visit the same site, previously-compiled asm.js code will just be loaded from disk, avoiding any compilation pause at all. Another speedup in this version is that the previous float32 optimizations are fully optimized in asm.js code as well.

    Large codebases, and why we need a new benchmark

    Each of those features is expected to improve asm.js performance, so it makes sense to see large speedups there. So far, everything looks pretty much as we would expect. However, a fourth milestone is noted on that graph, and it doesn’t cause any speedup. That feature is IonMonkey, which landed in Firefox 18. IonMonkey was a new optimizing compiler for Firefox, and it provided very large speedups on most common browser benchmarks. Why, then, doesn’t it show any benefit in Massive?

    IonMonkey does help very significantly on small asm.js codebases. But in its original release in Firefox 18 (see more details in the P.S. below), IonMonkey did not do well on very large ones – as a complex optimizing compiler, compilation time is not necessarily linear, which means that large scripts can take very large amounts of time to compile. IonMonkey therefore included a script size limit – over a certain size, IonMonkey simply never kicks in. This explains why Massive does not improve on Firefox 18, when IonMonkey landed – Massive contains very large codebases, and IonMonkey at the time could not actually run on them.

    That shows exactly why a benchmark like Massive is necessary, as other benchmarks did show speedups upon IonMonkey’s launch. In other words, Massive is measuring something that other benchmarks do not. And that thing – large asm.js codebases – is becoming more and more important.

    (P.S. IonMonkey’s script size limit prevented large codebases from being optimized when IonMonkey originally launched, but that limit has been relaxed over time, and practically does not exist today. This is possible through compilation on a background thread, interruptible compilation, and just straightforward improvements to compilation speed, all of which make it feasible to compile larger and larger functions. Exciting general improvements to JavaScript engines are constantly happening across the board!)

  7. Introducing SIMD.js

    SIMD stands for Single Instruction Multiple Data, and is the name for performing operations on multiple data elements together. For example, a SIMD add instruction can add multiple values, in parallel. SIMD is a very popular technique for accelerating computations in graphics, audio, codecs, physics simulation, cryptography, and many other domains.

    In addition to delivering performance, SIMD also reduces power usage, as it uses fewer instructions to do the same amount of work.

    SIMD.js

    SIMD.js is a new API being developed by Intel, Google, and Mozilla for JavaScript which introduces several new types and functions for doing SIMD computations. For example, the Float32x4 type represents 4 float32 values packed up together. The API contains functions to operate on those values together, including all the basic arithmetic operations, and operations to rearrange, load, and store such values. The intent is for browsers to implement this API directly, and provide optimized implementations that make use of SIMD instructions in the underlying hardware.

    The focus is currently on supporting both x86 platforms with SSE and ARM platforms with NEON. We’re also interested in the possibility of supporting other platforms, potentially including MIPS, Power, and others.

    SIMD.js is originally derived from the Dart SIMD specification, and it is rapidly evolving to become a more general API, and to cover additional use cases such as those that require narrower integer types, including Int8x16 and Int16x8, and saturating operations.

    SIMD.js is a fairly low-level API, and it is expected that libraries will be written on top of it to expose higher-level functionality such as matrix operations, transcendental functions, and more.

    In addition to being usable in regular JS, there is also work is underway to add SIMD.js to asm.js too, so that it can be used from asm.js programs such those produced by Emscripten. In Emscripten, SIMD can be achieved through the built-in autovectorization, the generic SIMD extensions, or the new (and still growing) Emscripten-specific API. Emscripten will also be implementing subsets of popular headers such as <xmmintrin.h> with wrappers around the SIMD.js APIs, as additional ways to ease porting SIMD code in some situations.

    SIMD.js Today

    The SIMD.js API itself is in active development. The ecmascript_simd github repository is currently serving as a provision specification as well as providing a polyfill implementation to provide the functionality, though of course not the accelerated performance, of the SIMD API on existing browsers. It also includes some benchmarks which also serve as examples of basic SIMD.js usage.

    To see SIMD.js in action, check out the demo page accompanying the IDF2014 talk on SIMD.js.

    The API has been presented to TC-39, which has approved it for stage 1 (Proposal). Work is proceeding in preparation for subsequent stages, which will involve proposing something closer to a finalized API.

    SIMD.js implementation in Firefox Nightly is in active development. Internet Explorer has listed SIMD.js as “under consideration”. There is also a prototype implementation in a branch of Chromium.

    Short SIMD and Long SIMD

    One of the uses of SIMD is to accelerate processing of large arrays of data. If you have an array of N elements, and you want to do roughly the same thing to every element in the array, you can divide N by whatever SIMD size the platform makes available and run that many instances of your SIMD subroutine. Since N can can be very large, I call these kind of problems long SIMD problems.

    Another use of SIMD is to accelerate processing of clusters of data. RGB or RGBA pixels, XYZW coordinates, or 4×4 matrices are all examples of such clusters, and I call problems which are expressed in these kinds of types short SIMD problems.

    SIMD is a broad domain, and the boundary between short and long SIMD isn’t always clear, but at a high level, the two styles are quite different. Even the terminology used to describe them features a split: In the short SIMD world, the operation which copies a scalar value into every element of a vector value is called a “splat”, while in the long vector world the analogous operation is called a “broadcast”.

    SIMD.js is primarily a “short” style API, and is well suited for short SIMD problems. SIMD.js can also be used for long SIMD problems, and it will still deliver significant speedups over plain scalar code. However, its fixed-length types aren’t going to achieve maximum performance of some of today’s CPUs, so there is still room for another solution to be developed to take advantage of that available performance.

    Portability and Performance

    There is a natural tension in many parts of SIMD.js between the desire to have an API which runs consistently across all important platforms, and the desire to have the API run as fast as possible on each individual platform.

    Fortunately, there is a core set of operations which are very consistent across a wide variety of platforms. These operations include most of the basic arithmetic operations and form the core of SIMD.js. In this set, little to no overhead is incurred because many of the corresponding SIMD API instructions map directly to individual instructions.

    But, there also are many operations that perform well on one platform, and poorly on others. These can lead to surprising performance cliffs. The current approach of the SIMD.js API is to focus on the things that can be done well with as few performance cliffs as possible. It is also focused on providing portable behavior. In combination, the aim is to ensure that a program which runs well on one platform will likely run and run well on another.

    In future iterations of SIMD.js, we expect to expand the scope and include more capabilities as well as mechanisms for querying capabilities of the underlying platform. Similar to WebGL, this will allow programs to determine what capabilities are available to them so they can decide whether to fall back to more conservative code, or disable optional functionality.

    The overall vision

    SIMD.js will accelerate a wide range of demanding applications today, including games, video and audio manipulation, scientific simulations, and more, on the web. Applications will be able to use the SIMD.js API directly, libraries will be able to use SIMD.js to expose higher-level interfaces that applications can use, and Emscripten will compile C++ with popular SIMD idioms onto optimized SIMD.js code.

    Looking forward, SIMD.js will continue to grow, to provide broader functionality. We hope to eventually accompany SIMD.js with a long-SIMD-style API as well, in which the two APIs can cooperate in a manner very similar to the way that OpenCL combines explicit vector types with the implicit long-vector parallelism of the underlying programming model.

  8. Generational Garbage Collection in Firefox

    Generational garbage collection (GGC) has now been enabled in the SpiderMonkey JavaScript engine in Firefox 32. GGC is a performance optimization only, and should have no observable effects on script behavior.

    So what is it? What does it do?

    GGC is a way for the JavaScript engine to collect short-lived objects faster. Say you have code similar to:

    function add(point1, point2) {
        return [ point1[0] + point2[0], point1[1] + point2[1] ];
    }

    Without GGC, you will have high overhead for garbage collection (from here on, just “GC”). Each call to add() creates a new Array, and it is likely that the old arrays that you passed in are now garbage. Before too long, enough garbage will pile up that the GC will need to kick in. That means the entire JavaScript heap (the set of all objects ever created) needs to be scanned to find the stuff that is still needed (“live”) so that everything else can be thrown away and the space reused for new objects.

    If your script does not keep very many total objects live, this is totally fine. Sure, you’ll be creating tons of garbage and collecting it constantly, but the scan of the live objects will be fast (since not much is live). However, if your script does create a large number of objects and keep them alive, then the full GC scans will be slow, and the performance of your script will be largely determined by the rate at which it produces temporary objects — even when the older objects aren’t changing, and you’re just re-scanning them over and over again to discover what you already knew. (“Are you dead?” “No.” “Are you dead?” “No.” “Are you dead?”…)

    Generational collector, Nursery & Tenured

    With a generational collector, the penalty for temporary objects is much lower. Most objects will be allocated into a separate memory region called the Nursery. When the Nursery fills up, only the Nursery will be scanned for live objects. The majority of the short-lived temporary objects will be dead, so this scan will be fast. The survivors will be promoted to the Tenured region.

    The Tenured heap will also accumulate garbage, but usually at a far lower rate than the Nursery. It will take much longer to fill up. Eventually, we will still need to do a full GC, but under typical allocation patterns these should be much less common than Nursery GCs. To distinguish the two cases, we refer to Nursery collections as minor GCs and full heap scans as major GCs. Thus, with a generational collector, we split our GCs into two types: mostly fast minor GCs, and fewer slower major GCs.

    GGC Overhead

    While it might seem like we should have always been doing this, it turns out to require quite a bit of infrastructure that we previously did not have, and it also incurs some overhead during normal operation. Consider the question of how to figure out whether some Nursery object is live. It might be pointed to by a live Tenured object — for example, if you create an object and store it into a property of a live Tenured object.

    How do you know which Nursery objects are being kept alive by Tenured objects? One alternative would be to scan the entire Tenured heap to find pointers into the Nursery, but this would defeat the whole point of GGC. So we need a way of answering the question more cheaply.

    Note that these Tenured ⇒ Nursery edges in the heap graph won’t last very long, because the next minor GC will promote all survivors in the Nursery to the Tenured heap. So we only care about the Tenured objects that have been modified since the last minor (or major) GC. That won’t be a huge number of objects, so we make the code that writes into Tenured objects check whether it is writing any Nursery pointers, and if so, record the cross-generational edges in a store buffer.

    In technical terms, this is known as a write barrier. Then, at minor GC time, we walk through the store buffer and mark every target Nursery object as being live. (We actually use the source of the edge at the same time, since we relocate the Nursery object into the Tenured area while marking it live, and thus the Tenured pointer into the Nursery needs to be updated.)

    With a store buffer, the time for a minor GC is dependent on the number of newly-created edges from the Tenured area to the Nursery, not just the number of live objects in the Nursery. Also, keeping track of the store buffer records (or even just the checks to see whether a store buffer record needs to be created) does slow down normal heap access a little, so some code patterns may actually run slower with GGC.

    Allocation Performance

    On the flip side, GGC can speed up object allocation. The pre-GGC heap needs to be fully general. It must track in-use and free areas and avoid fragmentation. The GC needs to be able to iterate over everything in the heap to find live objects. Allocating an object in a general heap like this is surprisingly complex. (GGC’s Tenured heap has pretty much the same set of constraints, and in fact reuses the pre-GGC heap implementation.)

    The Nursery, on the other hand, just grows until it is full. You never need to delete anything, at least until you free up the whole Nursery during a minor GC, so there is no need to track free regions. Consequently, the Nursery is perfect for bump allocation: to allocate N bytes you just check whether there is space available, then increment the current end-of-heap pointer by N bytes and return the previous pointer.

    There are even tricks to optimize away the “space available” check in many cases. As a result, objects with a short lifespan never go through the slower Tenured heap allocation code at all.

    Timings

    I wrote a simple benchmark to demonstrate the various possible gains of GGC. The benchmark is sort of a “vector Fibonacci” calculation, where it computes a Fibonacci sequence for both the x and y components of a two dimensional vector. The script allocates a temporary object on every iteration. It first times the loop with the (Tenured) heap nearly empty, then it constructs a large object graph, intended to be placed into the Tenured portion of the heap, and times the loop again.

    On my laptop, the benchmark shows huge wins from GGC. The average time for an iteration through the loop drops from 15 nanoseconds (ns) to 6ns with an empty heap, demonstrating the faster Nursery allocation. It also shows the independence from the Tenured heap size: without GGC, populating the long-lived heap slows down the mean time from 15ns to 27ns. With GGC, the speed stays flat at 6ns per iteration; the Tenured heap simply doesn’t matter.

    Note that this benchmark is intended to highlight the improvements possible with GGC. The actual benefit depends heavily on the details of a given script. In some scripts, the time to initialize an object is significant and may exceed the time required to allocate the memory. A higher percentage of Nursery objects may get tenured. When running inside the browser, we force enough major GCs (eg, after a redraw) that the benefits of GGC are less noticeable.

    Also, the description above implies that we will pause long enough to collect the entire heap, which is not the case — our incremental garbage collector dramatically reduces pause times on many Web workloads already. (The incremental and generational collectors complement each other — each attacks a different part of the problem.)

    Continued…

  9. WebIDE, Storage inspector, jQuery events, iframe switcher + more – Firefox Developer Tools Episode 34

    A new set of Firefox Developer Tools features has just been uplifted to the Aurora channel. These features are available right now in Aurora, and will be in the Firefox 34 release in November. This release brings new tools (storage inspector, WebIDE), an updated profiler, and handy enhancements to the existing tools:

    WebIDE

    WebIDE, a new tool for in-browser app development, has been enabled by default in this release. WebIDE lets you create a new Firefox OS app (which is just a web app) from a template, or open up the code for an already created app. From there you can edit the app’s files. It’s one click to run the app in a simulator and one more to debug it with the developer tools. Open WebIDE from Firefox’s “Web Developer” menu. (docs)

    Storage inspector

    There’s a new panel that shows the data your page has stored in cookies, localStorage, sessionStorage, and IndexedDB, which was created mostly by Girish Shama. Enable the Storage panel by checking off Settings > “Default Developer Tools” > “Storage”. The panel is read-only right now, with editing ability planned for a future release. (docs) (development notes) (UserVoice request)

    storage inspector

    jQuery events

    The event listener popup in the Inspector now supports jQuery. This means the popup will display the function you attached with e.g. jQuery.on(), and not the jQuery wrapper function itself. See this post for more info and how to add support for your preferred framework. (development notes)

    JQuery events

    Iframe switcher

    Change the frame you’re debugging using the new frame selection menu. Selecting a frame will switch all of the tools to debug that iframe, including the Inspector, Console, and Debugger. Add the frame selection button by checking off Settings > “Available Toolbox Buttons” > “Select an iframe”. (docs) (development notes)(UserVoice request)

    iframe selection

    Updated profiler

    An updated JavaScript profiler appears in the new “Performance” tab (formerly the “Profiler” tab). New to the profiler are a frame rate timeline and categories for frames like “network” and “graphics”. (docs) (development notes)

    new profiler

    console.table()

    Add a call to console.table() anywhere in your JavaScript to log data to the console using a table-like display. Log any object, array, Map, or Set. Sort a column in the table by clicking on its header. (docs) (development notes)

    console.table

    Selector preview

    Hover over a CSS selector in the Inspector or Style Editor to highlight all the nodes that match that selector on the page. (development notes)

    selector previews

    Other mentions

    • Persistent split console – The split console (opened by pressing ESC) will now open with the tools if you had it open the last time the tools were closed. (development notes)
    • Web audio – AudioParam connections – the Web Audio Editor now displays connections from AudioNodes to AudioParams. (development notes)

    Special thanks to the 41 contributors that added all the features and fixes in this release.

    Comment here, shoot feedback to @FirefoxDevTools on Twitter, or propose changes on the Developer Tools feedback channel. If you’d like to help out, check out the guide to getting involved.

  10. Introducing Blast.js

    After releasing Velocity.js, a highly performant web animation engine, I wanted to leverage that power for typographic manipulation. The question soon arose, How could I animate one letter, one word, or one sentence at a time without bloating my HTML with wrapper elements?

    If I could figure this out, I could create beautiful typographic animation sequences (the kind you see in movie titles), and perform real-time textual analysis.

    After researching lower-level DOM methods and polishing my RegEx skills, I built Blast.js: a jQuery/Zepto plugin that breaks apart text to enable hassle-free text manipulation. Follow that link to see a few demos.

    Let’s jump right into a code example. If we were to Blast an element using the following syntax…

    $("div").blast({ delimiter: "word" });

    …and if our element initially looked like this…

    <div>
        Hello World
    </div>

    …our element would now look like this:

    <div class="blast-root">
        <span class="blast">Hello</span>
        <span class="blast">World</span>
    </div>

    The div’s text was broken into individual span elements using the specified word delimiter. We could have used the character, sentence, or element delimiters instead.

    For a breakdown of Blast’s API, refer to its documentation.

    This article serves to explore the technical aspects that went into making Blast versatile and accurate: We’ll learn about very powerful, yet little-known, DOM traversal techniques plus how to maximally leverage RegEx for linguistic accuracy.

    If you’re interested in the technical aspects of how rich motion design works or how to manipulate text, this article is for you.

    Versatility

    Most DOM elements are composed of descendant text nodes. Blast traverses the entirety of the HTML element that it is targeted on, descending recursively until it’s found every descendant text node.

    For example, if you Blast the following HTML:

    <div>Hello <span>World</span></div>

    The containing div is an element node. This element node is composed of two children: 1) a text node (“Hello “) and 2) a span element node. The span element node contains one child: a text node of its own (“World”).

    With each text node Blast finds, it executes the RegEx query associated with the chosen delimiter type (e.g. character, word, or sentence) in order to find submatches. For example, a text node of “World” blasted with the character delimiter will produce five submatches: “w”, “o”, “r”, “l”, and “d”. Blast wraps a new element node of a user-defined type (span is the default) around each of these submatches.

    By traversing the DOM in this way, Blast can be applied safely to the entirety of an element without concern for breaking any of its descendant HTML or its associated event handlers. (Event handlers are never bound to text nodes, but rather to containing element nodes.)

    In fact, let’s try just that — in real-time! Click here to see Blast used on a CodePen page with the word delimiter. Notice how the generated wrapper elements are filtered with alternating colors. Next, click around. You’ll see that the page continues to work perfectly; all buttons, event handlers, and interactions remain fully intact. Nothing has been compromised.

    This versatility is crucial when blasting user-generated content, which, by its nature, is not necessarily predictably structured. It can be dirtied with HTML.

    Reversal

    When Blast generates wrappers around each of text node’s submatches, it assigns each wrapper a “blast” class. This class is later referenced when Blast is reversed.

    Blast reversal is triggered by passing in false as Blast’s sole parameter. The reversal process works as follows: The DOM is traversed, but the elements it’s looking for are element nodes (not text nodes) that have been assigned the “blast” class. Each matched element node is then replaced with its inner HTML.

    For example, reversing Blast on the following HTML…

    <div id="helloWorld" class="blast-root">
        <span class="blast">Hello</span>
        <span class="blast">World</span>
    </div>

    … using the following syntax…

    $("#helloWorld").blast(false);

    …will result in Blast descending into #helloWorld, matching each element node individually, then substituting these element nodes with the text nodes that they contain — “Hello” and “World”, respectively.

    After this process, our DOM is back to exactly where it was before we Blasted it. This ability to cleanly reverse allows us to jump into arbitrarily structured HTML, Blast it apart, run a series of typographic animations, then reverse Blast upon completion so that our markup remains clean and structured as originally intended.

    Let’s do just that:

    See the Pen Blast.js – Command: Reverse by Julian Shapiro (@julianshapiro) on CodePen.

    Accuracy

    We’ve established that Blast preserves HTML by touching only the relevant nodes (text nodes). Now let’s explore how Blast is able to pull off this next trick:

    See the Pen Blast.js TypeKit Article – Accuracy by Julian Shapiro (@julianshapiro) on CodePen.

    Remember, when a descendant text node is found in an element node targeted by Blast, the chosen delimiter’s RegEx is executed against it. Let’s examine each delimiter, starting with the simplest: character.

    (Note that you can follow along with these examples by clicking the demo buttons under the Robustness Gallery section of Blast’s documentation. You can also visit RegEx101.com to test the following RegEx queries against your own bodies of text.)

    The RegEx for the character delimiter is simply /(S)/, which treats every non-space character as a submatch (a submatch is the part of the text node that gets wrapped by a newly-generated element). Simple enough.

    Next, the word delimiter uses this RegEx: /s*(S+)s*/. This matches any non-space character surrounded by either a space or nothing (nothing is the edge case where a word appears at the beginning or ending of a text node). Specifically, s* means “optionally match a space character”, and the S+ in the middle means “match as many non-space characters as possible.” Note that the word delimiter matches will include any punctuation that’s adjoined to the word, e.g. “Hey!” will be a full match. For the vast majority of use cases, this is more desirable than treating every adjoined punctuation as its own word.

    Now things start to get more complex. It’s trivial to match characters and space-delimited words, but it’s tricky to robustly match sentences — especially in a multilingual manner. Blast’s sentence delimiter delimits phrases either 1) ending in Latin alphabet punctuation (linebreaks are not considered punctuation) or 2) located at the end of a body of text. The sentence delimiter’s RegEx looks like this:

    (?=S)(([.]{2,})?[^!?]+?([.…!?]+|(?=s+$)|$)(s*[′’'”″“")»]+)*)

    Below is an expanded view (with spacing) for better legibility:

    (?=S) ( ([.]{2,})? [^!?]+? ([.…!?]+|(?=s+$)|$) (s*[′’'”″“")»]+)* )

    Let’s break that down into its components:

    • (?=S) The sentence must contain a non-space character.
    • ([.]{2,})? The sentence may begin with a group of periods, e.g. “… that was a bad idea, Tom!”
    • [^!?]+? Grab everything that isn’t an unequivocally-terminating punctuation character, but stop when the following condition is reached…
    • ([.…!?]+|(?=s+$)|$) …match the last occurrence of sentence-final punctuation or the end of the text (optionally with trailing spaces).
    • (s*[′’'”″“")»]+)* After the final punctuation is matched, also include any and all pairs of (optionally space-delimited) quotes and parentheses.

    That’s quite a bit to to digest, but if you refer to those RegEx components while revisiting the sentence matching behavior from the top of this section (re-embedded below for convenience), you’ll start to see how the larger pieces come together.

    See the Pen Blast.js TypeKit Article – Accuracy by Julian Shapiro (@julianshapiro) on CodePen.

    (Click on the HTML tab to modify the HTML and see how Blast behaves on different bodies of text.)

    We still haven’t explained why that embedded demo’s errant periods aren’t falsely triggering the end of a sentence match: The trick is to perform a pre-pass on each text node — prior to the primary RegEx execution — in which likely false positives are rendered inert by temporary encoding them into non-matching strings. Then, after the sentence RegEx is executed, the likely false positives are decoded back to their original characters.

    The false positive encoding process consists of replacing a punctuation character with its ASCII equivalent inside double curly brackets. For example, a likely false positive period (e.g. one found in the title “Mr. Johnson”) will be turned into “Mr{{46}} Johnson”. Then, when the sentence delimiter’s RegEx is executed, it skips over the {{46}} block since curly braces aren’t considered Latin alphabet punctuation.

    Here’s the logic behind this process:

    text
    /* Escape the following Latin abbreviations and English
       titles: e.g., i.e., Mr., Mrs., Ms., Dr., Sr., and Jr. */
    .replace(RegEx.abbreviations, function(match) {
        return match.replace(/./g, "{{46}}");
    })
    /* Escape inner-word (non-space-delimited) periods.
       For example, the period inside "Blast.js". */
    .replace(RegEx.innerWordPeriod, function(match) {
       return match.replace(/./g, "{{46}}");
    });

    So now you have an overview of Blast’s behavior, but you haven’t learned that much. Not to worry, the next two sections get super technical.

    Deep dive: Regex

    This section is optional. This is a technical deep dive into how Blast’s RegEx queries are designed.

    This is the RegEx code block that you can find at the top of Blast’s source code:

    var characterRanges = {
            latinLetters: "\u0041-\u005A\u0061-\u007A\u00C0-\u017F\u0100-\u01FF\u0180-\u027F",
        },
        Reg = {
            abbreviations: new RegExp("[^" + characterRanges.latinLetters + "](e\.g\.)|(i\.e\.)|(mr\.)|(mrs\.)|(ms\.)|(dr\.)|(prof\.)|(esq\.)|(sr\.)|(jr\.)[^" + characterRanges.latinLetters + "]", "ig"),
            innerWordPeriod: new RegExp("[" + characterRanges.latinLetters + "].[" + characterRanges.latinLetters + "]", "ig"),
        };

    The first step is to define the UTF8 character ranges within which the letters used by all the Latin alphabet languages are contained. If that string looks like total gibberish to you, fear not: Character representation systems associate an ID with each of their displayable characters. RegEx simply allows us to define a range of ID’s (place a “-” between your first character’s ID and the last character’s ID). We take advantange of this by collating a bunch of ID ranges together in order to skip past ranges that contain characters that aren’t used in everyday language (e.g. emoticons, arrow symbols, etc.).

    Once we know what all the acceptable characters are, we can use them to create RegEx queries:

    The abbreviations RegEx looks for case-insensitive whitelisted abbreviations (e.g. Mr., Dr. Jr.) that are not immediately preceded by one of the accepted characters. In other words, it wants to find where these abbreviations are preceded by either nothing, a space, or a non-letter character. For example, we don’t want to match “ms.” in “grams.”, but we want to match “ms.” in “→Ms. Piggy”. Likewise, the RegEx query ensures that the abbreviation is also not immediately followed by a letter. For example, we don’t want to match “e.g.” in a corporation’s name abbreviation such as “E.G.G.S.”. But, we do want to match “e.g.” in “… farm animals, e.g. cows, bigs, etc.”

    The inner-word period RegEx looks for any period that’s sandwiched immediately between a whitelisted Latin alphabet letters on either side. So, the period inside “Blast.js” successfully matches, but the period at the end of “This is is a short sentence.” successfully does not.

    Deep dive: DOM traversal

    This section is optional. This is a deep dive into how text node traversal works.

    Let’s take a look at the recursive DOM traversal code:

    if (node.nodeType === 1 && node.hasChildNodes()
        && !Reg.skippedElements.test(node.tagName)
        && !Reg.hasPluginClass.test(node.className)) {
        /* Note: We don't cache childNodes' length since it's a live nodeList (which changes dynamically with the use of splitText() above). */
        for (var i = 0; i < node.childNodes.length; i++) {
            Element.nodeBeginning = true;
     
            i += traverseDOM(node.childNodes[i], opts);
        }
    }

    Above, we check that the node…

    • Has a nodeType of 1 (which is the ID associated with an element node).
    • Has child nodes for us to crawl.
    • Is not one of the blacklisted element node tags (script, textarea, and select), which contain text nodes, but not that typical kind that users likely want to be blasted.
    • Isn’t already assigned the “blast” class, which Blast uses to keep track of which elements it’s currently being used on.

    If the above conditions aren’t true and if the nodeType is instead returning a value of 3, then we know we’ve hit an actual text node. In this case, we proceed with submatch and element wrapping logic. Refer to the inlined comments for a thorough walkthrough:

    /* Find what position in the text node that our
    delimiter's RegEx returns a match. */
    matchPosition = textNode.data.search(delimiterRegex);
     
    /* If there's a RegEx match in this text node, proceed
       with element wrapping. */
    if (matchPosition !== -1) {
        /* Return the match. */
        var match = node.data.match(delimiterRegex),
            /* Get the node's full text. */
            matchText = match[0],
            /* Get only the match's text. */
            subMatchText = match[1] || false;
     
        /* RegEx queries that can return empty strings (e.g ".*")
           produce an empty matchText which throws the entire
           traversal process into an infinite loop due to the
           position index not incrementing. Thus, we bump up
           the position index manually, resulting in a zero-width
           split at this location followed by the continuation
           of the traversal process. */
        if (matchText === "") {
            matchPosition++;
        /* If a RegEx submatch is produced that is not
           identical to the full string match, use the submatch's
           index position and text. This technique allows us to
           avoid writing multi-part RegEx queries for submatch finding. */
        } else if (subMatchText &amp;&amp; subMatchText !== matchText) {
            matchPosition += matchText.indexOf(subMatchText);
            matchText = subMatchText;
        }
     
        /* Split this text node into two separate nodes at the
           position of the match, returning the node that begins
           after the match position. */
        var middleBit = node.splitText(matchPosition);
     
        /* Split the newly-produced text node at the end of the
           match's text so that middleBit is a text node that
           consists solely of the matched text. The other
           newly-created text node, which begins at the end
           of the match's text, is what will be traversed in
           the subsequent loop (in order to find additional
           matches in the containing text node). */
        middleBit.splitText(matchText.length);
     
        /* Over-increment the loop counter so that we skip
           the extra node (middleBit) that we've just created
           (and already processed). */
        skipNodeBit = 1;
     
        /* Create the wrapped node. Note: wrapNode code
           is not shown, but it simply consists of creating
           a new element and assigning it an innerText value. */
        var wrappedNode = wrapNode(middleBit);
     
        /* Then replace the middleBit text node with its
           wrapped version. */
        middleBit.parentNode.replaceChild(wrappedNode, middleBit);
    }

    This process isn’t tremendously performant when used on a large bodies of text with a delimiter that produces a lot of small matches (namely, the character delimiter), but it’s phenomenally robust and reliable.

    Wrapping up

    Go forth and blast shit up ;-) If you create something cool, please post it on CodePen and share it in the comments below.

    Follow me on Twitter for tweets about UI manipulation.