Mozilla

JavaScript Articles

  1. Serving Backbone for Robots & Legacy Browsers

    I like the Single Page Application model and Backbone.js, because I get it. As a former Java developer, I am used to object oriented coding and events for messaging. Within our HTML5 consultancy, SC5, Backbone has become almost a synonym for single page applications, and it is easy to move between projects because everybody gets the same basic development model.

    We hate the fact that we need server-side workarounds for robots. Making applications crawlable is very reasonable business-wise, but ill-suited for the SPA model. Data-driven single page applications are typically served only an HTML page skeleton, and the actual construction of all the visual elements is done in the browser. Any other way would easily lead to duplicate code paths (one on the browser, one on the server). Some have even considered giving up the SPA model and moving the logic and representation back to the server.

    Still, we should not let the tail wag the dog. Why sacrifice the user experience of 99.9% of users for the sake of the remaining 0.1%? Instead, for such low traffic, a better-suited solution is a server-side workaround.

    Solving the Crawling Problem with an App Proxy

    The obvious solution to the problem is running the same application code at both ends. As in the digital television transition, where a set-top box filled the gap for legacy televisions by crunching the digital signal into analog form, a proxy would run the application on the server side and serve the resulting HTML back to the crawlers. Smart browsers would get all the interactive candy, whereas crawlers and legacy browsers would just get the pre-processed HTML document.

    Proxy pattern explained through a TV set metaphor

    Thanks to node.js, JavaScript developers have been able to use their favourite language on both ends for some time already, and proxy-like solutions have become a plausible option.

    Implementing DOM and Browser APIs on the Server

    Single page applications typically depend heavily on DOM manipulation. Typical server applications combine several view templates into a page through concatenation, whereas Backbone applications append views into the DOM as new elements. A developer would either need to emulate the DOM on the server side, or build an abstraction layer that permits using the DOM in the browser and template concatenation on the server. The DOM can be serialized into an HTML document or vice versa, but these techniques cannot easily be mixed at runtime.

    A typical Backbone application talks to the browser APIs through several different layers – either via the Backbone or jQuery APIs, or by accessing the APIs directly. Backbone itself has only minor dependencies on the layers below – jQuery is used for DOM manipulation and AJAX requests, and application state handling is done using pushState.

    Sample Backbone layers

    Node.js has ready-made modules for each level of abstraction: JSDOM offers a full DOM implementation on the server side, whereas Cheerio provides a jQuery API on top of a fake DOM with better performance. Some of the other server-side Backbone implementations, like Airbnb Rendr and Backbone.LayoutManager, set the abstraction at the level of the Backbone APIs only, and hide the actual DOM manipulation under a set of conventions. Actually, Backbone.LayoutManager does offer the jQuery API through Cheerio, but the main purpose of the library is to ease the juggling between Backbone layouts, and hence promote a higher level of abstraction.

    Introducing backbone-serverside

    Still, we went for our own solution. Our team is a pack of old dogs that do not learn new tricks easily. We believe there is no easy way to fully abstract out the DOM without changing what Backbone applications essentially are. We like our Backbone applications without extra layers, and jQuery has always served us well as a compatibility layer against browser differences in DOM manipulation. Like Backbone.LayoutManager, we chose Cheerio as our jQuery abstraction. We solved the Backbone browser API dependencies by overriding Backbone.history and Backbone.ajax with API-compatible replacements. Actually, in this first draft version, these implementations remain bare-minimum stubs.

    We are quite happy with the solution we have in the works. If you study the backbone-serverside example, it looks quite close to what a typical Backbone application might be. We do not enforce working on any particular level of abstraction; you can use either the Backbone APIs or the subset of APIs that jQuery offers. If you want to go deeper, nothing stops you from implementing a server-side version of a browser API. In such cases, the actual server-side implementation may be a stub. For example, who needs touch event handling on a server?

    The current solution assumes a node.js server, but it does not necessarily mean drastic changes to an existing server stack. The existing servers for the API and static assets can remain as-is, but there should be a proxy that forwards the requests of dumb clients to our server. The sample application serves the static files, the API and the proxy from the same server, but they could all be decoupled with small modifications.
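    Deciding which clients get forwarded to the rendering server can start from a simple user-agent check. The classifier below is a deliberately naive, illustrative sketch; a real deployment might use a project like express-device instead.

```javascript
// Naive user-agent classifier: returns true for clients that should get
// pre-rendered HTML. The patterns here are illustrative, not exhaustive.
function needsServerRendering(userAgent) {
    var dumbClients = [
        /googlebot/i,
        /bingbot/i,
        /msie [1-8]\./i // legacy IE, as an example cut-off
    ];
    return dumbClients.some(function (pattern) {
        return pattern.test(userAgent || '');
    });
}
```

    The proxy would call this for each incoming request and either pass the request through to the static SPA shell or forward it to the node.js rendering server.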

    backbone-serverside as a proxy

    Writing Apps That Work on backbone-serverside

    Currently the backbone-serverside core is a bare-minimum set of adapters to make Backbone run on node.js. Porting your application to run on the server may require further modifications.

    If the application does not already use a module loader, such as RequireJS or Browserify, you need to figure out how to load the same modules on the server. In our example below, we use RequireJS and need a bit of JavaScript to use Cheerio instead of vanilla jQuery on the server. Otherwise we are able to use the same stack we typically use (jQuery, Underscore/Lo-Dash, Backbone and Handlebars). When choosing modules, you may need to limit yourself to the ones that do not touch browser APIs directly, or be prepared to write a few stubs yourself.

    // Compose RequireJS configuration run-time by determining the execution
    // context first. We may pass different values to browser and server.
    var isBrowser = typeof(window) !== 'undefined';
     
    // Execute this for RequireJS (client or server-side, no matter which)
    requirejs.config({
     
        paths: {
            text: 'components/requirejs-text/text',
            underscore: 'components/lodash/dist/lodash.underscore',
            backbone: 'components/backbone/backbone',
            handlebars: 'components/handlebars/handlebars',
            jquery: isBrowser ? 'components/jquery/jquery' : 'emptyHack'
        },
     
        shim: {
            'jquery': {
                deps: ['module'],
                exports: 'jQuery',
                init: function (module) {
                    // Fetch the jQuery adapter parameters for server case
                    if (module && module.config) {
                        return module.config().jquery;
                    }
     
                    // Fallback to browser specific thingy
                    return this.jQuery.noConflict();
                }
            },
            'underscore': {
                exports: '_',
                init: function () {
                    return this._.noConflict();
                }
            },
            'backbone': {
                deps: ['underscore', 'jquery'],
                exports: 'Backbone',
                init: function (_, $) {
                    // Inject adapters when in server
                    if (!isBrowser) {
                        var adapters = require('../..');
                        // Add the adapters we're going to be using
                        _.extend(this.Backbone.history,
                            adapters.backbone.history);
                        this.Backbone.ajax = adapters.backbone.ajax;
                    this.Backbone.$ = $;
                    }
     
                    return this.Backbone.noConflict();
                }
            },
            'handlebars': {
                exports: 'Handlebars',
                init: function() {
                    return this.Handlebars;
                }
            }
        },
     
        config: {
            // The API endpoints can be passed via URLs
            'collections/items': {
                // TODO Use full path due to our XHR adapter limitations
                url: 'http://localhost:8080/api/items'
            }
        }
    });

    Once the configuration works, the application can be bootstrapped normally. In the example, we use the Node.js Express server stack and pass specific request paths to a Backbone Router implementation for handling. When done, we serialize the DOM into text and send it to the client. Some extra code is needed to deal with Backbone's asynchronous event model; we discuss that more thoroughly below.

    // URL Endpoint for the 'web pages'
    server.get(/\/(items\/\d+)?$/, function(req, res) {
        // Remove the preceding '/'
        var path = req.path.substr(1, req.path.length);
        console.log('Routing to "%s"', path);
     
        // Initialize a blank document and a handle to its content
        //app.router.initialize();
     
        // If we're already on the current path, just serve the 'cached' HTML
        if (path === Backbone.history.path) {
            console.log('Serving response from cache');
            return res.send($html.html());
        }
     
        // Listen to state change once - then send the response
        app.router.once('done', function(router, status) {
            // A simple safeguard in case we timed out or similar
            if (res.headersSent) {
                console.warn('Could not respond to the request in time.');
                return;
            }
     
            if (status === 'error') {
                res.send(500, 'Our framework blew it. Sorry.');
            }
            if (status === 'ready') {
                // Set the bootstrapped attribute to communicate we're done
                var $root = $html('#main');
                $root.attr('data-bootstrapped', true);
     
                // Send the changed DOM to the client
                console.log('Serving response');
                res.send($html.html());
            }
        });
     
        // Then do the trick that would cause the state change
        Backbone.history.navigate(path, { trigger: true });
    });

    Dealing with Application Events and States

    Backbone uses an asynchronous, event-driven model for communicating between models, views and other objects. For an object-oriented developer, the model is fine, but it causes a few headaches on node.js. After all, Backbone applications are data-driven; pulling data from a remote API endpoint may take seconds, and only once it eventually arrives do the models notify the views to repaint themselves. There is no easy way to know when all the application's DOM manipulation is finished, so we needed to invent our own mechanism.

    In our example we utilise a simple state machine to solve the problem. Since the simplified example does not have a separate application singleton class, we use the router object as the single point of control. The router listens for state changes in each view, and only notifies the Express server about readiness to render once all the views are ready. At the beginning of a request, the router resets the view states to pending and does not notify the server until it knows all the views are done. Correspondingly, the views do not claim to be done until they know they have been fed valid data from their corresponding model or collection. The state machine is simple and can be applied consistently throughout the different Backbone objects.

    Activity diagram of a Backbone app event flow

    Beyond the Experimental Hack

    The current version is still experimental work, but it proves Backbone applications can happily live on the server without breaking Backbone APIs or introducing too many new conventions. Currently at SC5 we have a few projects starting that could utilise this implementation, so we will continue the effort.

    We believe the web stack community benefits from this effort, thus we have published the work on GitHub. It is far from finished and we would appreciate all community contributions in the form of ideas and code. Share the love, criticism and all in between: @sc5io #backboneserverside.

    In particular, we plan to change the following, and hope to get contributions for them:

    • The current example will likely misbehave on concurrent requests. It shares a single DOM representation across all ongoing requests, so they can easily mess up each other's output.
    • The state machine implementation is just one idea on how to determine when to serialize the DOM back to the client. It likely can be drastically simplified for most use cases, and it is quite possible to find a better generic solution.
    • The server-side route handling is naive. To emphasize that only crawlers and legacy browsers might need server-side rendering, the sample could use a project like express-device to detect whether we are serving a legacy browser or a crawler.
    • The sample application is a very rudimentary master-details view application and will not likely cause any wow effect. It needs a little bit of love.

    We encourage you to fork the repository and start from modifying the example for your needs. Happy Hacking!

  2. Adding cursor swipe to the Firefox OS keyboard

    In this article we will take a look at how to approach adding features to a core component in the system such as the input keyboard. It turns out it is pretty easy!

    Before we start, take a look at this concept video from Daniel Hooper to get an idea of what we want to implement:

    Cool, huh? Making such a change for other mobile platforms would be pretty hard or just plain impossible, but in Firefox OS it is quite simple and it will take us less than 50 lines of code.

    The plan

    Conceptually, what we want to achieve is that when the user swipes her finger on the keyboard area, the cursor in the input field moves a distance and direction proportional to the swiping, left or right.

    Since a common scenario is that the user might be pressing a wrong key and would like to slide to a close-by key to correct it, we will only start moving the cursor when the swipe distance is longer than the width of a single key.

    Preparing your environment

    In order to start hacking Firefox OS itself, you will need a copy of Gaia (the collection of webapps that make up the frontend of Firefox OS) and B2G desktop (a build of the B2G app runtime used on devices where all apps should run as they would on a device).

    You can take a look at this previous article from Mozilla Hacks in which we guide you through setting up and hacking on Gaia. There is also a complete guide to setting up this environment at https://wiki.mozilla.org/Gaia/Hacking.

    Once you get Gaia to run in B2G, you are ready to hack!

    Ready to hack!

    Firefox OS is all HTML5, and internally it is composed of several ‘apps’. We can find the main system apps in the apps folder of the gaia repository that you cloned before, including the keyboard app that we will be modifying.
    In this post we will be editing only apps/keyboard/js/keyboard.js, which is where
    a big chunk of the keyboard logic lives.

    We start by initializing some extra variables at the top of the file that will help us keep track of the swiping later.

    var swipeStartMovePos = null; // Starting point of the swiping
    var swipeHappening = false; // Are we in the middle of swiping?
    var swipeLastMousex = -1; // Previous mouse position
    var swipeMouseTravel = 0; // Amount traveled by the finger so far
    var swipeStepWidth = 0; // Width of a single keyboard key

    Next we should find where the keyboard processes touch events. At
    the top of keyboard.js we see that the event handlers for touch events are
    declared:

    var eventHandlers = {
      'touchstart': onTouchStart,
      'mousedown': onMouseDown,
      'mouseup': onMouseUp,
      'mousemove': onMouseMove
    };

    Nice! Now we need to store the coordinates of the initial touch event. Both onTouchStart and onMouseDown end up calling the function startPress after they do their respective post-touch tasks, so we will take care of storing the coordinates there.

    startPress does some work for when a key is pressed, like highlighting the key or checking whether the user is pressing backspace. We will write our logic after that. A convenient thing is that one of the arguments in its signature is coords, which refers to the coordinates where the user started touching, in the context of the keyboard element. So storing the coordinates is as easy as that:

    function startPress(target, coords, touchId) {
      swipeStartMovePos = { x: coords.pageX, y: coords.pageY };
      ...

    This way we will always have the coordinates of the starting point of the last touch event available.

    The meat of our implementation will happen during the mousemove event, though. We see that the function onMouseMove is just a simple proxy function for the bigger movePress function, where the ‘mouse’ movements are processed. Here is where we will write our cursor-swiping logic.

    We will use the width of a keyboard key as our universal measure. Since the width of keyboard keys changes from device to device, we will first have to retrieve it by calling a method in IMERender, the object that controls how the keyboard is rendered on the screen:

    swipeStepWidth = swipeStepWidth || IMERender.getKeyWidth();

    Now we can check if swiping is happening, and whether the swiping is longer than swipeStepWidth. Conveniently enough, our movePress function also gets passed the coords object:

    if (swipeHappening || (swipeStartMovePos && Math.abs(swipeStartMovePos.x - coords.pageX) > swipeStepWidth)) {

    Most of our logic will go inside that ‘if’ block. Now that we know that swiping is happening, we have to determine what direction it is going, assigning 1 for right and -1 for left to our previously initialized variable swipeDirection. After that, we add the amount of distance traveled to the variable swipeMouseTravel, and set swipeLastMousex to the current touch coordinates:

    var swipeDirection = coords.pageX > swipeLastMousex ? 1 : -1;
     
    if (swipeLastMousex > -1) {
      swipeMouseTravel += Math.abs(coords.pageX - swipeLastMousex);
    }
    swipeLastMousex = coords.pageX;

    Ok, now we have to decide how the pixels travelled by the user’s finger will translate into cursor movement. Let’s make that half the width of a key. That means that for every swipeStepWidth / 2 pixels travelled, the cursor in the input field will move one character.

    The way we will move the cursor is a bit hacky. What we do is simulate the user pressing the ‘left arrow’ or ‘right arrow’ key, even though these keys don’t exist on the phone’s virtual keyboard. That allows us to move the cursor in the input field. It’s not ideal, but Mozilla is about to release a new Keyboard IME API that will give the programmer a proper API for manipulating cursor positions and selections. For now, we will just work around it:

    var stepDistance = swipeStepWidth / 2;
    if (swipeMouseTravel > stepDistance) {
      var times = Math.floor(swipeMouseTravel / stepDistance);
      swipeMouseTravel = 0;
      for (var i = 0; i < times; i++)
        navigator.mozKeyboard.sendKey(swipeDirection === -1 ? 37 : 39, undefined);
    }

    After that we just need to flag that swiping is happening and do some cleanup of timeouts and intervals initialized in other areas of the file, which because of our new swiping functionality would otherwise never get cancelled. We also call hideAlternatives to prevent the keyboard from presenting us with alternative characters while we are swiping.

    swipeHappening = true;
     
    clearTimeout(deleteTimeout);
    clearInterval(deleteInterval);
    clearTimeout(menuTimeout);
    hideAlternatives();
    return;

    The only thing left to do is to reset all the values we’ve set when the user lifts her finger off the screen. The event handler for that is onMouseUp, which calls the function endPress, at the beginning of which we will put our logic:

    // The user is releasing a key so the key has been pressed. The meat is here.
    function endPress(target, coords, touchId) {
        swipeStartMovePos = null;
        ...
        if (swipeHappening === true) {
            swipeHappening = false;
            swipeLastMousex = -1;
            return;
        }

    With this last bit, our implementation is complete. Here is a rough video I’ve made with the working implementation:

    You can see the complete implementation code changes on GitHub.

    Conclusion

    Contributing bug fixes or features to Firefox OS is as easy as getting Gaia and B2G and starting to hack on HTML5. If you are comfortable programming in JavaScript and familiar with making web pages, you can already contribute to the mobile operating system from Mozilla.

    Appendix: Finding an area to work on

    If you already know what bug you want to solve or what feature you want to implement in Firefox OS, first check whether it has already been filed in Bugzilla, the issue tracker that Mozilla uses to keep track of bugs. If it hasn’t, feel free to add it. Otherwise, if you are looking for new bugs to fix, a quick search will reveal many that are still unassigned. Feel free to pick them up!

  3. Capturing – Improving Performance of the Adaptive Web

    Responsive design is now widely regarded as the dominant approach to building new websites. With good reason, too: a responsive design workflow is the most efficient way to build tailored visual experiences for different device screen sizes and resolutions.

    Responsive design, however, is only the tip of the iceberg when it comes to creating a rich, engaging mobile experience.


    Image Source: For a Future-Friendly Web by Brad Frost

    The issue of performance with responsive websites

    Performance is one of the most important features of a website, but is also frequently overlooked. Performance is something that many developers struggle with – in order to create high-performing websites you need to spend a lot of time tuning your site’s backend. Even more time is required to understand how browsers work, so that you make rendering pages as fast as possible.

    When it comes to creating responsive websites, the performance challenges are even more difficult because you have a single set of markup that is meant to be consumed by all kinds of devices. One problem you hit is the responsive image problem – how do you ensure that big images intended for your Retina Macbook Pro are not downloaded on an old Android phone? How do you prevent desktop ads from rendering on small screen devices?

    It’s easy to overlook performance as a problem because we often conduct testing under perfect conditions – using a fast computer, fast internet, and close proximity to our servers. Just to give you an idea of how evident this problem is, we conducted an analysis into some top responsive e-commerce sites which revealed that the average responsive site home page consists of 87.2 resources and is made up of 1.9 MB of data.

    It is possible to solve the responsive performance problem by making the necessary adjustments to your website manually, but performance tuning by hand involves both complexity and repetition, and that makes it a great candidate for creating tools. With Capturing, we intend to make creating high-performing adaptive web experiences as easy as possible.

    Introducing Capturing

    Capturing is a client-side API we’ve developed to give developers complete control over the DOM before any resources have started loading. With responsive sites, it is a challenge to control what resources you want to load based on the conditions of the device: all current solutions require you to make significant changes to your existing site by either using server-side user-agent detection, or by forcing you to break semantic web standards (for example, changing the src attribute to data-src).

    Our approach to give you resource control is done by capturing the source markup before it has a chance to be parsed by the browser, and then reconstructing the document with resources disabled.

    The ability to control resources client-side gives you an unprecedented amount of control over the performance of your website.

    Capturing was a key feature of Mobify.js 1.1, our framework for creating mobile and tablet websites using client-side templating. We have since reworked Mobify.js in our 2.0 release to be a much more modular library that can be used in any existing website, with Capturing as the primary focus.

    A solution to the responsive image problem

    One way people have been tackling the responsive image problem is by modifying existing backend markup, changing the src of all their img elements to something like data-src, and accompanying that change with a <noscript> fallback. The reason this is done is discussed in this CSS-Tricks post:

    “a src that points to an image of a horse will start downloading as soon as that image gets parsed by the browser. There is no practical way to prevent this.”

    With Capturing, this is no longer true.

    Say, for example, you had an img element that you want to modify for devices with Retina screens, but you didn’t want the original image in the src attribute to load. Using Capturing, you could do something like this:

    if (window.devicePixelRatio && window.devicePixelRatio >= 2) {
        var bannerImg = capturedDoc.getElementById("banner");
        bannerImg.src = "retinaBanner.png";
    }
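    For completeness, here is how a snippet like the one above might be wired into a capture callback. This is a hedged sketch based on the Mobify.js 2.0 examples; treat Mobify.Capture.init, capture.capturedDoc and capture.renderCapturedDoc as assumptions rather than a definitive API reference.

```javascript
// Swap the banner image for a Retina version inside the captured document,
// then re-inject the markup so the browser parses it normally.
function swapRetinaBanner(capture) {
    var capturedDoc = capture.capturedDoc;
    if (window.devicePixelRatio && window.devicePixelRatio >= 2) {
        var bannerImg = capturedDoc.getElementById('banner');
        if (bannerImg) {
            bannerImg.src = 'retinaBanner.png';
        }
    }
    // Rendering the captured document downloads only the resources
    // that were left enabled after our modifications.
    capture.renderCapturedDoc();
}

// Only run when the Mobify tag has actually captured the document.
if (typeof window !== 'undefined' && window.Mobify && window.Mobify.capturing) {
    window.Mobify.Capture.init(swapRetinaBanner);
}
```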

    Because we have access to the DOM before any resources are loaded, we can swap the src of images on the fly before they are downloaded. The example above is very basic – a better way to highlight the power of Capturing is to demonstrate a perfect implementation of the picture polyfill.

    Picture Polyfill

    The Picture element is the official W3C HTML extension for dealing with adaptive images. There are polyfills that exist in order to use the Picture element in your site today, but none of them is able to do a perfect polyfill – the best polyfill implemented thus far requires a <noscript> tag surrounding an img element in order to support browsers without JavaScript. Using Capturing, you can avoid this madness completely.

    Open the example and be sure to fire up the network tab in web inspector to see which resources get downloaded:

    Here is the important chunk of code that is in the source of the example:

    <picture>
        <source src="/examples/assets/images/small.jpg">
        <source src="/examples/assets/images/medium.jpg" media="(min-width: 450px)">
        <source src="/examples/assets/images/large.jpg" media="(min-width: 800px)">
        <source src="/examples/assets/images/extralarge.jpg" media="(min-width: 1000px)">
        <img src="/examples/assets/images/small.jpg">
    </picture>

    Take note that there is an img element that uses a src attribute, but the browser only downloads the correct image. You can see the code for this example here (note that the polyfill is only available in the example, not the library itself – yet):

    Not all sites use modified src attributes and <noscript> tags to solve the responsive image problem. An alternative, if you don’t want to rely on modifying src or adding <noscript> tags for every image of your site, is to use server-side detection in order to swap out images, scripts, and other content. Unfortunately, this solution comes with a lot of challenges.

    It was easy to use server-side user-agent detection when the only device you needed to worry about was the iPhone, but with the number of new devices rolling out, keeping a dictionary of all devices containing information about their screen width, device pixel ratio, and more is a very painful task; not to mention that there are certain things you cannot detect with a server-side user agent – such as actual network bandwidth.

    What else can you do with Capturing?

    Solving the responsive image problem is a great use-case for Capturing, but there are also many more. Here’s a few more interesting examples:

    Media queries in markup to control resource loading

    In this example, we use media queries in attributes on images and scripts to determine which ones will load, just to give you an idea of what you can do with Capturing. This example can be found here:
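    The page itself isn't reproduced here, but the underlying idea can be sketched as a small selection function: each resource carries candidate sources with optional media queries, and only the matching one is enabled before the document is reconstructed. The attribute model and last-match-wins rule below are illustrative assumptions, not the example's actual code.

```javascript
// Pick the source whose media query matches; candidates without a media
// query act as the default, and later matches override earlier ones.
function pickSource(candidates, matches) {
    var chosen = null;
    candidates.forEach(function (candidate) {
        if (!candidate.media || matches(candidate.media)) {
            chosen = candidate.src;
        }
    });
    return chosen;
}
```

    In the browser, `matches` would be a thin wrapper around window.matchMedia, e.g. `function (q) { return window.matchMedia(q).matches; }`, applied to the captured elements before rendering.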

    Complete re-writing of a page using templating

    The primary function of Mobify.js 1.1 was client-side templating to completely rewrite the pages of your existing site when responsive design doesn’t offer enough flexibility, or when changing the backend is simply too painful and tedious. It is particularly helpful when you need a mobile presence, fast. This is no longer the primary function of Mobify.js, but it is still possible using Capturing.

    Check out this basic example:

    In this example, we’ve taken parts of the existing page and used them in completely new markup rendered to the browser.

    Fill your page with grumpy cats

    And of course, there is nothing more useful than replacing all the images in a page with grumpy cats! In a high-performing way, of course ;-).

    Once again, open up web inspector to see that the original images on the site did not download.

    Performance

    So what’s the catch? Is there a performance penalty to using Capturing? Yes, there is, but we feel the performance gains you can make by controlling your resources outweigh the minor penalty that Capturing brings. On first load, the library (and main executable if not concatenated together), must download and execute, and the load time here will vary depending on the round trip latency of the device (ranges from around ~60ms to ~300ms). However, the penalty of every subsequent request will be reduced by at least half due to the library being cached, and the just-in-time (JIT) compiler making the compilation much more efficient. You can run the test yourself!

    We also do our best to keep the size of the library to a minimum – at the time of publishing this blog post, the library is 4KB minified and gzipped.

    Why should you use Capturing?

    We created Capturing to give developers more control over performance on the front-end. The reason other solutions fail to solve this problem is that the responsibilities of the front-end and backend have become increasingly intertwined. The backend’s responsibility should be to generate semantic web markup, and it should be the front-end’s responsibility to take the markup from the backend and process it in such a way that it is best visually represented on the device, in a high-performing way. Responsive design solves the first issue (visually representing data), and Capturing helps solve the second (improving performance on websites by using front-end techniques such as determining screen size and bandwidth to control resource loading).

    If you want to continue to obey the laws of the semantic web, and if you want an easy way to control performance at the front-end, we highly recommend that you check out Mobify.js 2.0!

    How can I get started using Capturing?

    Head over to our quick start guide for instructions on how to get setup using Capturing.

    What’s next?

    We’ve begun with an official developer preview of Mobify.js 2.0, which includes just the Capturing portion, but we will be adding more and more useful features.

    The next feature on the list is automatic image resizing, allowing you to dynamically download images based on the size of the browser window without needing to modify your existing markup (aside from inserting a small JavaScript snippet)!

    We also plan to create other polyfills that are only possible with Capturing, such as one for the new HTML5 Template Tag.

    We look forward to your feedback, and we are excited to see what other developers will do with our new Mobify.js 2.0 library!

  4. Building User-Extensible Webapps with Local

    In an interview with Andrew Binstock in 2012, Alan Kay described the browser as “a joke.” If that surprises you, you’ll be glad to know that Mr. Binstock was surprised as well.

    Part of the problem Kay pointed out is well-known: feature-set. Browsers are doing today what word-processors and presentation tools have done for decades. But that didn’t seem to be the problem that bothered him most. The real problem? Browser-makers thought they were making an application, when they were really building an OS.

    The browser tab is a very small environment. Due to the same-origin policy, the application’s world is limited to what its host reveals. Unfortunately, remote hosts are often closed networks, and users don’t control them. This stops us from doing composition (no pipe in the browser) and configuration (no swapping out backends for your frontend). You can change tabs, but you can’t combine them.

    Built out of IRON

    Despite these problems, the Web is successful, and the reasons for that are specific. In a paper published in 2011, Microsoft, UT, and Penn researchers outlined the necessary qualities (PDF): Isolated, Rich, On-demand, and Networked. Those qualities are why, on the whole, you can click around the Web and do interesting things without worrying a virus will infect your computer. As they point out, if we want to improve the Web, we have to be careful not to soften it.

    That research team proposed a less-featured core browser which downloads its high-level capabilities with the page. Their approach could improve richness and security for the Web, but it requires a “radical refactor” first. With a need for something more immediate, I’ve developed Local, an in-browser program architecture which is compatible with HTML5 APIs.

    HTTP over Web Workers

    Local uses Web Workers to run its applications. They’re the only suitable choice available, as iframes and object-capabilities tools (like Google’s Caja or Crockford’s ADsafe) share the document’s thread. Workers, however, lack access to the document, making them difficult to use. Local’s solution to this is to treat the Workers like Web hosts and dispatch requests over the postMessage API. The Workers respond in turn with HTML, which the document renders.

    This leaves it to the document to make a lot of decisions: traffic permissions, HTML behaviors, which apps to load, and so on. Those decisions make up the page’s “environment,” and they collectively organize the apps into either a host-driven site, a pluggable web app, or a user-driven desktop environment.

    One of Local’s fundamental requirements is composition. The Internet’s strength, distributed interconnection, should be reflected in its software. REST is a unified interface to Local’s architecture, a philosophy borrowed from the Plan 9 file system. In HTML5 + Local, URIs can represent remote service endpoints, local service endpoints, and encoded chunks of data. The protocol for targeting JavaScript (httpl://) allows client regions to link to and target the Workers without event-binding.

    This keeps HTML declarative: there’s no application-specific setup. Additional interface primitives can be introduced by the Environment. Grimwire.com tries its own take on Web Intents, which produces a drag-and-drop-based UX. For programmatic composition, Local leans on the Link header, and provides the “navigator” prototype to follow those links in a hypermedia-friendly way.

    Security is also a fundamental requirement for Local. The Web Worker provides a secure sandbox for untrusted code (source (PDF), source). Content Security Policies allow environments to restrict inline scripts, styling, and embeds (including images). Local then provides a traffic dispatch wrapper for the environment to examine, scrub, route, or deny application requests. This makes it possible to set policies (such as “local requests only”) and to intercept Cookie, Auth, and other session headers. The flexibility of those policies varies for each environment.

    Example Environment: a Markdown Viewer

    To get an idea of how this works, let’s take a quick tour through a simple environment. These snippets are from blog.grimwire.com. The page HTML, JS, and markdown are served statically. A Worker application, “markdown.js”, proxies its requests to the hosted blog posts and converts their content to HTML. The environment then renders that HTML into the Content “client region,” which is an area segmented by Local into its own browsing context (like an iframe).

    index.js

    The first file we’ll look at is “index.js,” the script which sets up the environment:

    // The Traffic Mediator
    // examines and routes all traffic in the application
    // (in our simple blog, we'll permit all requests and log the errors)
    Environment.setDispatchWrapper(function(request, origin, dispatch) {
        var response = dispatch(request);
        // dispatch() responds with a promise which is
        //   fulfilled on 2xx/3xx and rejected on 4xx/5xx
        response.except(console.log.bind(console));
        return response;
    });
     
    // The Region Post-processor
    // called after a response is rendered
    // (gives the environment a chance to add plugins or styles to new content)
    Environment.setRegionPostProcessor(function(renderTargetEl) {
        Prism.highlightAll(); // add syntax highlighting with prismjs
                              // (http://prismjs.com/)
    });
     
    // Application Load
    // start a worker and configure it to load our "markdown.js" file
    Environment.addServer('markdown.util', new Environment.WorkerServer({
        scriptUrl:'/local/apps/util/markdown.js',
        // ^^ this tells WorkerServer what app to load
        baseUrl:'/posts'
        // ^^ this tells markdown.js where to find the markdown files
    }));
     
    // Client Regions
    // creates browsing regions within the page and populates them with content
    var contentRegion = Environment.addClientRegion('content');
    contentRegion.dispatchRequest('httpl://markdown.util/frontpage.md');

    The environment here is very minimal. It makes use of two hooks: the dispatch wrapper and the region post-processor. A more advanced environment might sub-type the ClientRegion and WorkerServer prototypes, but these two hooks should provide a lot of control on their own. The dispatch wrapper is primarily used for security and debugging, while the region post-processor is there to add UI behaviors or styles after new content enters the page.

    Once the hooks are defined, the environment loads the markdown proxy and dispatches a request from the content region to load ‘frontpage.md’. Workers load asynchronously, but the WorkerServer buffers requests made during load, so the content region doesn’t have to wait to dispatch its request.
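    The buffering behavior described above can be sketched generically (an illustrative pattern, not Local’s actual implementation; the names are invented):

    ```javascript
    // Generic sketch of the buffering pattern described above -- not Local's
    // actual code. Requests queue until the worker signals it has loaded,
    // then the queue is flushed in order.
    function BufferedServer(worker) {
      this.worker = worker;
      this.ready = false;
      this.queue = [];
    }
    BufferedServer.prototype.dispatch = function(request) {
      if (this.ready) this.worker.postMessage(request);
      else this.queue.push(request); // hold until 'loaded'
    };
    BufferedServer.prototype.onLoaded = function() {
      this.ready = true;
      while (this.queue.length) this.worker.postMessage(this.queue.shift());
    };
    ```

    With this shape, callers can dispatch immediately after construction and never need to know whether the Worker has finished loading.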

    When a link is clicked or a form is submitted within a ClientRegion, Local converts that event into a custom ‘request’ DOM event and fires it off of the region’s element. Another part of Local listens for the ‘request’ event and handles the dispatch and render process. We use dispatchRequest() to programmatically fire our own ‘request’ event at the start. After that, markdown files can link to “httpl://markdown.util/:post_name.md” and the region will work on its own.
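    That interception step might look roughly like this (a generic sketch, not Local’s source; the function and parameter names are invented for illustration):

    ```javascript
    // Generic sketch of the pattern described above -- not Local's source.
    // Intercept link clicks inside a client region and hand them to a
    // dispatcher instead of letting the browser navigate.
    function interceptLinks(regionEl, fireRequest) {
      regionEl.addEventListener('click', function(e) {
        var el = e.target;
        // walk up from the click target to the nearest anchor
        while (el && el.tagName !== 'A') el = el.parentNode;
        if (!el || !el.href) return;
        e.preventDefault(); // suppress normal navigation
        fireRequest({ method: 'get', url: el.href, headers: { accept: 'text/html' } });
      });
    }
    ```

    In Local itself, the dispatched object would travel as a custom ‘request’ DOM event, so other parts of the page can listen and handle the dispatch-and-render cycle.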

    markdown.js

    Let’s take a quick look at “markdown.js”:

    // Load Dependencies
    // (these calls are synchronous)
    importScripts('linkjs-ext/responder.js');
    importScripts('vendor/marked.js'); // https://github.com/chjj/marked
     
    // Configure Marked.js
    marked.setOptions({ gfm: true, tables: true });
     
    // Pipe Functions
    // used with `Link.Responder.pipe()` to convert the response markdown to html
    function headerRewrite(headers) {
        headers['content-type'] = 'text/html';
        return headers;
    }
    function bodyRewrite(md) { return (md) ? marked(md) : ''; }
     
    // WorkerServer Request Handler
    app.onHttpRequest(function(request, response) {
        // request the markdown file
        var mdRequest = Link.dispatch({
            method  : 'get',
            url     : app.config.baseUrl + request.path,
                                // ^^ the `baseUrl` given to us by index.js
            headers : { accept:'text/plain' }
        });
        // use helper libraries to pipe and convert the response back
        Link.responder(response).pipe(mdRequest, headerRewrite, bodyRewrite);
    });
     
    // Inform the environment that we're ready to handle requests
    app.postMessage('loaded');

    This script includes all of the necessary pieces for a Worker application. At minimum, the app must define an HTTP request handler and post the ‘loaded’ message back to the environment. (postMessage() is part of MyHouse, the low-level Worker manager which HTTPL is built on.)

    Before the application is loaded, Local nulls any APIs which might allow data leaks (such as XMLHttpRequest). When a Worker uses Link.dispatch, the message is transported to the document and given to the dispatch wrapper. This is how security policies are enforced. Local also populates the app.config object with the values given to the WorkerServer constructor, allowing the environment to pass configuration to the instance.

    With those two snippets, we’ve seen the basics of how Local works. If we wanted to create a more advanced site or desktop environment, we’d go on to create a layout manager for the client regions, UIs to load and control Workers, security policies to enforce permissions, and so on.

    You can find the complete source for the blog at github.com/pfraze/local-blog.

    User-Driven Software

    Local’s objective is to let users drive the development of the Web. In its ideal future, private data can be configured to save to private hosts, peer-to-peer traffic can go unlogged between in-browser servers with WebRTC, APIs can be mashed up on the fly, and users can choose their interfaces. Rather than fixed websites, I’d like to see hosts provide platforms built around different tasks (blogging, banking, shopping, developing, etc.) and competing on services for their users’ apps. Then services like Mint.com could stop asking for your banking credentials. Instead, they’d just host a JS file.

    You can get started with Local by reading its documentation and blog, and by trying out Grimwire, a general-purpose deployment in its early stages. The source can be found on GitHub under the MIT license.

  5. Finding Words by Synonym with Cinnamon.js

    There are only two hard things in Computer Science: cache invalidation and naming things.

    — Phil Karlton

    Naming things in web development is hard too, from evolving CSS classes to headers and links. From the perspective of information architecture, headers and links serve as visual waypoints, helping users build mental models of a site and navigate from page to page.

    But a second, underappreciated role that header and link names play is through the browser’s built-in Find function. I can only speak from personal experience — and maybe I’m the exception to the rule — but I often rely on Find to do existence checks on in-page content and quickly jump to it.

    Sometimes Find falls short though. For instance, consider a visitor that likes your site and decides to subscribe to your RSS feed. They search the page for “RSS” but nothing comes up. The problem is that you named your link “Feed” or “Subscribe”, or used the RSS symbol. They shrug their shoulders and move on — and you’ve lost a potential follower.

    I wrote Cinnamon.js to ease the pain of naming things, by having Find work with synonyms (demo).

    Try It Out

    To use Cinnamon.js, you can simply include the script on your page:

    <script src="cinnamon.js"></script>

    Then wrap your word with synonyms, separated by commas, like so:

    <span data-cinnamon="Blaze,Flame,Pyre">Fire</span>

    This is an example of a markup API, requiring only a bit of HTML to get going.

    The Basic Style

    In a nutshell, the script takes each synonym listed in the data-cinnamon attribute and creates a child element, appropriately styled.

    To style the synonyms, I stack them behind the original text with the following CSS. The synonym text is hidden while the original text gets highlighted.

    position: absolute;
    top: 0;
    left: 0;
    z-index: -1;
    display: inline-block;
    width: 100%;
    height: 100%;
    overflow: hidden;
    color: transparent;
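    A minimal sketch of the wrapping step (hypothetical, not Cinnamon.js’s actual source, though it follows the same idea of one hidden child per synonym):

    ```javascript
    // Hypothetical sketch, not Cinnamon.js's actual source.
    // Parse the comma-separated data-cinnamon value into trimmed synonyms.
    function parseSynonyms(attr) {
      return attr.split(',')
                 .map(function(s) { return s.trim(); })
                 .filter(function(s) { return s.length > 0; });
    }

    // Append one hidden, stacked child per synonym so Find can match it.
    function wrapSynonyms(el) {
      parseSynonyms(el.getAttribute('data-cinnamon')).forEach(function(word) {
        var child = document.createElement('span');
        child.textContent = word;
        child.className = 'cinnamon-synonym';      // gets the stacked CSS above
        child.setAttribute('aria-hidden', 'true'); // keep screen readers quiet
        el.appendChild(child);
      });
    }
    ```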

    Here’s how it looks in Firefox’s 3D view. The green blocks represent the synonyms.

    Firefox 3D View

    Cross-Browser Quirks

    For the purposes of the script, when a synonym is found, the text should stay invisible while its background gets highlighted. This gives the illusion that the original word is the one being highlighted.

    In testing this, I discovered some differences in how browsers handle Find. These are edge cases that you hopefully won’t ever have to deal with, but they loomed larger in making Cinnamon.js.

    Finding Invisible Text

    If text is set to display: none;, Find doesn’t see it at all — this much is true of all browsers. Same goes for visibility: hidden; (except for Opera, where Find matches the synonym but nothing is seen).

    When opacity is set to 0, most browsers match the text, but nothing is visibly highlighted (Opera is the odd man out again, highlighting the background of the matched text).

    When text is set to color: transparent;, most browsers including Firefox and Chrome will highlight the area while the text stays transparent — just what we want for our script.

    Safari

    However, Safari does things differently. When transparent text is found, Safari will display it as black text on yellow. If the text is buried under elements with a higher z-index, it brings it to the top.

    Another difference: most browsers match text in the middle of a string. Safari only does so when the string is CamelCase.

    Other Issues

    Hidden text, used deceptively, can be penalized in Google’s search results. Given the techniques used, Cinnamon.js carries some small measure of risk, especially if it’s misused.

    Another issue is the impact of Cinnamon.js on accessibility. Fortunately, there’s aria-hidden="true", which is used to tell screen readers to ignore synonyms.

    Keep On Searching

    I’ve used the browser’s Find function for years without giving it much thought. But in writing Cinnamon.js, I’ve learned quite a bit about the web and how a small piece of it might be extended. You just never know what’ll inspire your next hack.

  6. Simplifying audio in the browser

    The last few years have seen tremendous gains in the capabilities of browsers, as the latest HTML5 standards continue to get implemented. We can now render advanced graphics on the canvas, communicate in real-time with WebSockets, access the local filesystem, create offline apps and more. However, the one area that has lagged behind is audio.

    The HTML5 Audio element is great for a small set of uses (such as playing music), but doesn’t work so well when you need low-latency, precision playback.

    Over the last year, a new audio standard has been developed for the browser, which gives developers direct access to the audio data. The Web Audio API allows for high-precision, high-performance audio playback, as well as many advanced features that just aren’t possible with the HTML5 Audio element. However, support is still limited, and the API is considerably more complex than HTML5 Audio.

    Introducing howler.js

    The most obvious use-case for high-performance audio is games, but most developers have had to settle for HTML5 Audio with a Flash fallback to get browser compatibility. My company, GoldFire Studios, exclusively develops games for the open web, and we set out to find an audio library that offered the kind of audio support a game needs, without relying on antiquated technologies. Unfortunately, there were none to be found, so we wrote our own and open-sourced it: howler.js.

    Howler.js defaults to Web Audio API and uses HTML5 Audio as the fallback. The library greatly simplifies the API and handles all of the tricky bits automatically. This is a simple example to create an audio sprite (like a CSS sprite, but with an audio file) and play one of the sounds:

    var sound = new Howl({
      urls: ['sounds.mp3', 'sounds.ogg'],
      sprite: {
        blast: [0, 2000],
        laser: [3000, 700],
        winner: [5000, 9000]
      }
    });
     
    // shoot the laser!
    sound.play('laser');
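    Under the Web Audio API, a sprite entry maps onto the offset and duration arguments of bufferSource.start(). A rough sketch of that mapping (not howler.js’s internals; the helper names here are assumptions):

    ```javascript
    // Sketch (not howler.js's exact internals): sprite values are
    // [offsetMs, durationMs]; Web Audio's start() wants seconds.
    function spriteWindow(sprite, name) {
      return {
        offset: sprite[name][0] / 1000,   // seconds into the buffer
        duration: sprite[name][1] / 1000  // how long to play
      };
    }

    // Playing one sprite sound from a decoded AudioBuffer:
    function playSprite(ctx, buffer, sprite, name) {
      var src = ctx.createBufferSource();
      src.buffer = buffer;
      src.connect(ctx.destination);
      var w = spriteWindow(sprite, name);
      src.start(0, w.offset, w.duration); // start(when, offset, duration)
      return src;
    }
    ```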

    Using feature detection

    At the most basic level, this works through feature detection. The following snippet detects whether or not Web Audio API is available and creates the audio context if it is. Current support for Web Audio API includes Chrome 10+, Safari 6+, and iOS 6+. It is also in the pipeline for Firefox, Opera and most other mobile browsers.

    var ctx = null,
      usingWebAudio = true;
    if (typeof AudioContext !== 'undefined') {
      ctx = new AudioContext();
    } else if (typeof webkitAudioContext !== 'undefined') {
      ctx = new webkitAudioContext();
    } else {
      usingWebAudio = false;
    }

    Audio support for different codecs varies across browsers as well, so we detect which format is best to use from your provided array of sources with the canPlayType method:

    var audioTest = new Audio();
    var codecs = {
      mp3: !!audioTest.canPlayType('audio/mpeg;').replace(/^no$/,''),
      ogg: !!audioTest.canPlayType('audio/ogg; codecs="vorbis"').replace(/^no$/,''),
      wav: !!audioTest.canPlayType('audio/wav; codecs="1"').replace(/^no$/,''),
      m4a: !!(audioTest.canPlayType('audio/x-m4a;') || audioTest.canPlayType('audio/aac;')).replace(/^no$/,''),
      webm: !!audioTest.canPlayType('audio/webm; codecs="vorbis"').replace(/^no$/,'')
    };

    Making it easy

    These two key components allow howler.js to automatically select the best playback method and source file to load and play. From there, the library abstracts away the two different APIs, turning this (a simplified Web Audio API example, without the extra fallback support and features):

    // create gain node
    var gainNode, bufferSource;
    gainNode = ctx.createGain();
    gainNode.gain.value = volume;
    loadBuffer('sound.wav');
     
    // declared as a function declaration so it is hoisted above the call
    function loadBuffer(url) {
      // load the buffer from the URL
      var xhr = new XMLHttpRequest();
      xhr.open('GET', url, true);
      xhr.responseType = 'arraybuffer';
      xhr.onload = function() {
        // decode the buffer into an audio source
        ctx.decodeAudioData(xhr.response, function(buffer) {
          if (buffer) {
            bufferSource = ctx.createBufferSource();
            bufferSource.buffer = buffer;
            bufferSource.connect(gainNode);
            bufferSource.start(0);
          }
        });
      };
      xhr.send();
    }

    (Note: createGainNode and noteOn are older, deprecated names for createGain and start that you may still see in examples on the web.)

    Into this:

    var sound = new Howl({
      urls: ['sound.wav'],
      autoplay: true
    });

    It is important to note that neither the Web Audio API nor HTML5 Audio is the perfect solution for everything. As with anything, it is important to select the right tool for the job. For example, you wouldn’t want to load a large background music file with the Web Audio API, as you would have to wait for the entire file to download before playing. HTML5 Audio can start playing very quickly after the download begins, which is why howler.js also implements an override feature that allows you to mix and match the two APIs within your app.

    Audio in the browser is ready

    I often hear that audio in the browser is broken and won’t be useable for anything more than basic audio streaming for quite some time. This couldn’t be further from the truth. The tools are already in today’s modern browsers. High quality audio support is here today, and Web Audio API and HTML5 combine to offer truly plugin-free, cross-browser audio support. Browser audio is no longer a second-class citizen, so let’s all stop treating it like one and keep making apps for the open web.

  7. Story of a Knight: the making of

    The journey of a medieval knight through a fullscreen DOM: the ‘making of’ of the demo that won the November Dev Derby.

    Markup And Style

    Markup and style are organized in this way:

    • An external wrapper that contains everything
    • Three control boxes with fixed positions and high z-index
    • An internal wrapper that contains the Google Maps iframe, canvas path and 8 div elements for the story

    The external wrapper and control boxes

    The external wrapper contains:

    • The audio tag with ogg and mp3 sources, at top left;
    • The div that is populated with the fullscreen switcher if the browser supports it, at top right;
    • The navigation with numbers to move through the canvas path, at bottom right.

    <div id="external-wrapper">
      <audio controls="controls">
        <source src="assets/saltarello.ogg" type="audio/ogg" />
        <source src="assets/saltarello.mp3" type="audio/mp3" />
        Your browser does not support the audio element.
      </audio>
     
      <div id="fullscreen-control">
      </div>
     
      <ul class="navigation">
        <li class="active"><a href="#start">1</a></li>
        <li><a href="#description">2</a></li>
        ...
      </ul>

    The internal wrapper

    The internal wrapper contains:

    • The iframe with the big Google Map embedded, absolutely positioned with negative x and y;
    • A div of the same size and absolute position as the map, but with a higher z-index, which has a “background-size: cover” semi-transparent image of old paper to give a parchment effect;
    • The canvas path (once the JavaScript plugin is activated, the path is drawn here);
    • The 8 divs that tell the story with texts and images, absolutely positioned.

    <div class="wrapper">
      <iframe width="5000" height="4500" frameborder="0" scrolling="no" marginheight="0" marginwidth="0" src="http://maps.google.it/?ie=UTF8&amp;t=k&amp;ll=44.660839,14.811584&amp;spn=8.79092,13.730164&amp;z=9&amp;output=embed"></iframe>
      <div style="position: absolute; top: -2000px; left: -1100px; background: transparent url(assets/bg_paper.jpg) no-repeat top left; width: 5000px; height: 4500px;background-size: cover;opacity: 0.4;z-index: 999;"></div>
     
      <div class="demo">
        <img src="assets/knight.png" alt="">
        <h1><span>&#8225;</span> Story of a Knight <span>&#8225;</span></h1>
        <span class="arrow">&#8224;</span> Of Venetian lagoon AD 1213 <span class="arrow">&#8224;</span>
      </div>
     
      <div class="description">
        <span class="big">He learned the profession of arms<br/>in an Apennines' fortress.</span>
        <img src="assets/weapons.png" alt="" style="position: absolute; top: 180px; left: 20px;">
      </div>
      ...
    </div>

    JavaScript

    The Scrollpath plugin

    Available at https://github.com/JoelBesada/scrollpath

    First we need to embed the jQuery library in the last part of the page:

    <script src="http://code.jquery.com/jquery-latest.pack.js"></script>

    Then we can include the scrollpath.js plugin; demo.js, where we give the instructions to draw the canvas path and initialize it; and easing.js for smooth movement (also include scrollpath.css in the head of the document).

    <script src="script/min.jquery.scrollpath.js"></script>
    <script src="script/demo.js"></script>
    <script src="script/jquery.easing.js"></script>
    <link rel="stylesheet" type="text/css" href="style/scrollpath.css" />

    Let’s see the relevant parts of the demo.js file:

    1. At the beginning there are the instructions for drawing the path, using the methods “moveTo”, “lineTo”, “arc” and declaring x/y coordinates;
    2. Then there’s the initialization of the plugin on the internal wrapper;
    3. Finally there’s the navigation implementation with smooth scrolling.

    $(document).ready(init);
     
      function init() {
      /* ========== DRAWING THE PATH AND INITIATING THE PLUGIN ============= */
     
      var path = $.fn.scrollPath("getPath");
      // Move to 'start' element
      path.moveTo(400, 50, {name: "start"});
      // Line to 'description' element
      path.lineTo(400, 800, {name: "description"});
      // Arc down and line
      path.arc(200, 1200, 400, -Math.PI/2, Math.PI/2, true);
      ...
     
      // We're done with the path, let's initiate the plugin on our wrapper element
      $(".wrapper").scrollPath({drawPath: true, wrapAround: true});
     
      // Add scrollTo on click on the navigation anchors
      $(".navigation").find("a").each(function() {
        var target = this.getAttribute("href").replace("#", "");
        $(this).click(function(e) {
          e.preventDefault();
     
          // Include the jQuery easing plugin (http://gsgd.co.uk/sandbox/jquery/easing/)
          // for extra easing functions like the one below
          $.fn.scrollPath("scrollTo", target, 1000, "easeInOutSine");
        });
      });
     
      /* ===================================================================== */
    }

    The jQuery-FullScreen plugin

    Available at https://github.com/martinaglv/jQuery-FullScreen

    To cap it all off, the fullscreen mode. Include the jQuery-FullScreen plugin, then check with a script whether the browser supports the functionality: if it does, the script appends the switcher to the top right corner and initializes it on the external wrapper to push everything fullscreen.

    <script src="script/jquery.fullscreen.js"></script>
     
    <script>
      (function () {
        //Fullscreen for modern browser
        if($.support.fullscreen){
          var fullScreenButton = $('#fullscreen-control').append('<a id="goFullScreen">Watch full screen</a>');
          fullScreenButton.click(function(e){
            e.preventDefault();
            $('#external-wrapper').fullScreen();
          });
        }
      })();
    </script>

    Summary

    The hardest part was figuring out what size and zoom level to give the Google Maps iframe, and then where to position it in relation to the div with the canvas.
    The other thing that caused some problems was the loading time: I had initially placed a slow-motion video of a medieval battle along the path, but I removed it because it made the page load too slowly.

    As you have seen, everything is very simple; a good result depends only on the right mix of technology, storytelling, and aesthetics. I think the front-end is entering a golden age, a period rich in expressive opportunities: languages and browsers are evolving rapidly, so there’s a chance to experiment with mixing different techniques and obtain creative results.

  8. Koalas to the Max – a case study

    One day I was browsing reddit when I came across this peculiar link posted on it: http://www.cesmes.fi/pallo.swf

    The game was addictive and I loved it, but I found several design elements flawed. Why did it start with four circles and not one? Why was the color split so jarring? Why was it written in Flash? (What is this, 2010?) Most importantly, it was missing a golden opportunity: splitting into dots that form an image instead of just random colors.

    Creating the project

    This seemed like a fun project, and I reimplemented it (with my design tweaks) using D3 to render with SVG.

    The main idea was to have the dots split into the pixels of an image, with each bigger dot recursively taking the average color of the four dots contained inside it, and to allow the code to work on any web-based image.

    The code sat in my ‘Projects’ folder for some time; Valentine’s Day was around the corner and I thought it could be a cute gift. I bought the domain name, found a cute picture, and thus koalastothemax.com (KttM) was born.

    Implementation

    While the user-facing part of KttM has changed little since its inception, the implementation has been revisited several times to incorporate bug fixes, improve performance, and bring support to a wider range of devices.

    Notable excerpts are presented below and the full code can be found on GitHub.

    Load the image

    If the image is hosted on the koalastothemax.com (same-origin) domain, then loading it is as simple as calling new Image():

    var img = new Image();
    img.onload = function() {
     // Awesome rendering code omitted
    };
    img.src = the_image_source;

    One of the core design goals for KttM was to let people use their own images as the revealed image. Thus, when the image is on an arbitrary domain, it needs special consideration. Given the same-origin restrictions, there needs to be an image proxy that can channel the image from the arbitrary domain or send the image data as a JSONP call.

    Originally I used a library called $.getImageData, but I had to switch to a self-hosted solution after KttM went viral and pushed the $.getImageData App Engine account to its limits.

    Extract the pixel data

    Once the image loads, it needs to be resized to the dimensions of the finest layer of circles (128 x 128) and its pixel data can be extracted with the help of an offscreen HTML5 canvas element.

    koala.loadImage = function(imageData) {
     // Create an offscreen canvas sized for the resize and extraction
     var canvas = document.createElement('canvas');
     canvas.width = canvas.height = dim;
     var ctx = canvas.getContext('2d');
     // Draw the image into the corner, resizing it to dim x dim
     ctx.drawImage(imageData, 0, 0, dim, dim);
     // Extract the pixel data from the same area of the canvas
     // Note: This call will throw a security exception if imageData
     // was loaded from a different domain than the script.
     return ctx.getImageData(0, 0, dim, dim).data;
    };

    dim is the number of smallest circles that will appear on a side. 128 seemed to produce nice results but really any power of 2 could be used. Each circle on the finest level corresponds to one pixel of the resized image.

    Build the split tree

    Resizing the image returns the data needed to render the finest layer of the pixelization. Every successive layer is formed by grouping neighboring clusters of four dots together and averaging their color. The entire structure is stored as a (quaternary) tree so that when a circle splits it has easy access to the dots from which it was formed. During construction each subsequent layer of the tree is stored in an efficient 2D array.

    // Got the data now build the tree
    var finestLayer = array2d(dim, dim);
    var size = minSize;
     
    // Start off by populating the base (leaf) layer
    var xi, yi, t = 0, color;
    for (yi = 0; yi < dim; yi++) {
     for (xi = 0; xi < dim; xi++) {
       color = [colorData[t], colorData[t+1], colorData[t+2]];
       finestLayer(xi, yi, new Circle(vis, xi, yi, size, color));
       t += 4;
     }
    }

    We start by going through the color data extracted from the image and creating the finest circles.
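    The array2d helper isn’t shown in the excerpt; judging from how the layers are read and written above, it behaves like a closure over a flat array that acts as a getter with two arguments and a setter with three. A hypothetical sketch:

    ```javascript
    // Hypothetical sketch of the array2d helper used above: a closure that
    // reads with two arguments (x, y) and writes with three (x, y, value),
    // returning the stored value either way.
    function array2d(w, h) {
      var data = new Array(w * h);
      return function(x, y, value) {
        if (arguments.length === 3) data[y * w + x] = value;
        return data[y * w + x];
      };
    }
    ```

    Returning the stored value from the setter is what lets the tree-building loop assign a new Circle and wire it to its four children in a single expression.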

    // Build up successive nodes by grouping
    var layer, prevLayer = finestLayer;
    var c1, c2, c3, c4, currentLayer = 0;
    while (size < maxSize) {
     dim /= 2;
     size = size * 2;
     layer = array2d(dim, dim);
     for (yi = 0; yi < dim; yi++) {
       for (xi = 0; xi < dim; xi++) {
         c1 = prevLayer(2 * xi    , 2 * yi    );
         c2 = prevLayer(2 * xi + 1, 2 * yi    );
         c3 = prevLayer(2 * xi    , 2 * yi + 1);
         c4 = prevLayer(2 * xi + 1, 2 * yi + 1);
         color = avgColor(c1.color, c2.color, c3.color, c4.color);
         c1.parent = c2.parent = c3.parent = c4.parent = layer(xi, yi,
           new Circle(vis, xi, yi, size, color, [c1, c2, c3, c4], currentLayer, onSplit)
         );
       }
     }
     splitableByLayer.push(dim * dim);
     splitableTotal += dim * dim;
     currentLayer++;
     prevLayer = layer;
    }

    After the finest circles have been created, the subsequent circles are each built by merging four dots and doubling the radius of the resulting dot.
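    The avgColor helper is likewise not shown; a plausible sketch (assumed, not the article’s code) averages the four RGB triplets component-wise:

```javascript
// Assumed implementation of avgColor: component-wise mean of four
// [r, g, b] triplets, rounded back to integer channel values.
function avgColor(c1, c2, c3, c4) {
  return [0, 1, 2].map(function (i) {
    return Math.round((c1[i] + c2[i] + c3[i] + c4[i]) / 4);
  });
}
```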

    Render the circles

    Once the split tree is built, the initial circle is added to the page.

    // Create the initial circle
    Circle.addToVis(vis, [layer(0, 0)], true);

    This employs the Circle.addToVis function that is used whenever the circle is split. The second argument is the array of circles to be added to the page.

    Circle.addToVis = function(vis, circles, init) {
     var circle = vis.selectAll('.nope').data(circles)
       .enter().append('circle');
     
     if (init) {
       // Setup the initial state of the initial circle
       circle = circle
         .attr('cx',   function(d) { return d.x; })
         .attr('cy',   function(d) { return d.y; })
         .attr('r', 4)
         .attr('fill', '#ffffff')
           .transition()
           .duration(1000);
     } else {
       // Setup the initial state of the opened circles
       circle = circle
         .attr('cx',   function(d) { return d.parent.x; })
         .attr('cy',   function(d) { return d.parent.y; })
         .attr('r',    function(d) { return d.parent.size / 2; })
         .attr('fill', function(d) { return String(d.parent.rgb); })
         .attr('fill-opacity', 0.68)
           .transition()
           .duration(300);
     }
     
     // Transition to the respective final state
     circle
       .attr('cx',   function(d) { return d.x; })
       .attr('cy',   function(d) { return d.y; })
       .attr('r',    function(d) { return d.size / 2; })
       .attr('fill', function(d) { return String(d.rgb); })
       .attr('fill-opacity', 1)
       .each('end',  function(d) { d.node = this; });
    }

    Here the D3 magic happens. The circles in circles are added (.append('circle')) to the SVG container and animated to their position. The initial circle is given special treatment as it fades in from the center of the page while the others slide over from the position of their “parent” circle.

    In typical D3 fashion circle ends up being a selection of all the circles that were added. The .attr calls are applied to all of the elements in the selection. When a function is passed in it shows how to map the split tree node onto an SVG element.

    .attr('cx', function(d) { return d.parent.x; }) would set the X coordinate of the center of the circle to the X position of the parent.

    The attributes are set to their initial state then a transition is started with .transition() and then the attributes are set to their final state; D3 takes care of the animation.

    Detect mouse (and touch) over

    The circles need to split when the user moves the mouse (or finger) over them; to do this efficiently, the regular structure of the layout can be taken advantage of.

    The algorithm described below vastly outperforms attaching native “onmouseover” event handlers to each circle.

    // Handle mouse events
    var prevMousePosition = null;
    function onMouseMove() {
     var mousePosition = d3.mouse(vis.node());
     
     // Do nothing if the mouse point is not valid
     if (isNaN(mousePosition[0])) {
       prevMousePosition = null;
       return;
     }
     
     if (prevMousePosition) {
       findAndSplit(prevMousePosition, mousePosition);
     }
     prevMousePosition = mousePosition;
     d3.event.preventDefault();
    }
     
    // Initialize interaction
    d3.select(document.body)
     .on('mousemove.koala', onMouseMove);

    First, a body-wide mousemove event handler is registered. The event handler keeps track of the previous mouse position and calls the findAndSplit function, passing it the line segment traveled by the user’s mouse.

    function findAndSplit(startPoint, endPoint) {
     var breaks = breakInterval(startPoint, endPoint, 4);
     
     for (var i = 0; i < breaks.length - 1; i++) {
       var sp = breaks[i],
           ep = breaks[i+1];
     
       var circle = splitableCircleAt(ep);
       if (circle && circle.isSplitable() && circle.checkIntersection(sp, ep)) {
         circle.split();
       }
     }
    }

    The findAndSplit function splits a potentially large segment traveled by the mouse into a series of small segments (not bigger than 4px long). It then checks each small segment for a potential circle intersection.
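    The breakInterval helper is not shown in the article; a hypothetical sketch that subdivides the segment into evenly spaced break points no more than maxLength apart (including both endpoints, as the loop above expects) could be:

```javascript
// Hypothetical sketch of breakInterval: return points along the segment
// from start to end, spaced no more than maxLength pixels apart.
function breakInterval(start, end, maxLength) {
  var dx = end[0] - start[0],
      dy = end[1] - start[1],
      length = Math.sqrt(dx * dx + dy * dy),
      steps = Math.max(1, Math.ceil(length / maxLength)),
      points = [];
  for (var i = 0; i <= steps; i++) {
    points.push([start[0] + (dx * i) / steps, start[1] + (dy * i) / steps]);
  }
  return points;
}
```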

    function splitableCircleAt(pos) {
     var xi = Math.floor(pos[0] / minSize),
         yi = Math.floor(pos[1] / minSize),
         circle = finestLayer(xi, yi);
     if (!circle) return null;
     while (circle && !circle.isSplitable()) circle = circle.parent;
     return circle || null;
    }

    The splitableCircleAt function takes advantage of the regular structure of the layout to find the one circle that the segment ending in the given point might be intersecting. This is done by finding the leaf node of the closest fine circle and traversing up the split tree to find its visible parent.

    Finally the intersected circle is split (circle.split()).

    Circle.prototype.split = function() {
     if (!this.isSplitable()) return;
     d3.select(this.node).remove();
     delete this.node;
     Circle.addToVis(this.vis, this.children);
     this.onSplit(this);
    }

    Going viral

    Sometime after Valentine’s Day I met with Mike Bostock (the creator of D3) regarding D3 syntax and showed him KttM, which he thought was tweet-worthy – it was, after all, an early example of a pointless artsy visualization done with D3.

    Mike has a considerable Twitter following, and his tweet, which was retweeted by some members of the Google Chrome development team, started gaining momentum.

    Since the koala was out of the bag, I decided it might as well be posted on Reddit. I posted it on the programming subreddit with the title “A cute D3 / SVG powered image puzzle. [No IE]” and it got a respectable 23 points, which made me happy. Later that day it was reposted to the funny subreddit with the title “Press all the dots :D” and was upvoted to the front page.

    The traffic went exponential. Reddit produced a spike that quickly dropped off, but people picked the game up and spread it to Facebook, StumbleUpon, and other social media outlets.

    The traffic from these sources decays over time but every several months KttM gets rediscovered and traffic spikes.

    Such irregular traffic patterns underscore the need to write scalable code. Conveniently, KttM does most of the work within the user’s browser; the server only needs to serve the page assets and one (small) image per page load, which allows KttM to be hosted on a dirt-cheap shared hosting service.

    Measuring engagement

    After KttM became popular I was interested in exploring how people actually interacted with the application. Did they even realize that the initial single circle can split? Does anyone actually finish the whole image? Do people uncover the circles uniformly?

    At first the only tracking on KttM was the vanilla GA code that tracks pageviews. This quickly became underwhelming. I decided to add custom event tracking for when an entire layer was cleared and when a percentage of circles were split (in increments of 5%). The event value is set to the time in seconds since page load.
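    With classic Google Analytics, such a custom event call might look like the following sketch (the function, category, and variable names are illustrative, not KttM’s actual code):

```javascript
// Illustrative sketch only. Assumes the classic GA queue (_gaq) is present;
// in a browser this would be window._gaq, here we fall back to a plain array.
var pageLoadTime = Date.now();
var _gaq = (typeof window !== 'undefined' && window._gaq) ? window._gaq : [];

// Report that `percent`% of the circles have been split; the event value
// is the number of whole seconds since page load.
function trackClearEvent(percent) {
  var seconds = Math.round((Date.now() - pageLoadTime) / 1000);
  _gaq.push(['_trackEvent', 'koala', 'clear', percent + '%', seconds]);
}
```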

    As you can see, such event tracking offers both insights and room for improvement. The 0% clear event fires when the first circle is split, and the average time for that event to fire seems to be 308 seconds (about 5 minutes), which does not sound reasonable. In reality, some people open KttM and leave it open for days; if they then split a circle, the event value is huge and skews the average. I wish GA had a histogram view.

    Even basic engagement tracking sheds a great deal of light on how far people get through the game. These metrics proved very useful when the mouse-over algorithm was upgraded: after several days of running the new algorithm, I could see that people were finishing more of the puzzle before giving up.

    Lessons learned

    While making, maintaining, and running KttM I learned several lessons about using modern web standards to build web applications that run on a wide range of devices.

    Some native browser utilities give you 90% of what you need, but to get your app behaving exactly as you want, you need to reimplement them in JavaScript. For example, the SVG mouseover events could not cope well with the number of circles and it was much more efficient to implement them in JavaScript by taking advantage of the regular circle layout. Similarly, the native base64 functions (atob, btoa) are not universally supported and do not work with unicode. It is surprisingly easy to support the modern Internet Explorers (9 and 10) and for the older IEs Google Chrome Frame provides a great fallback.

    Despite the huge improvements in standards compliance, it is still necessary to test the code on a wide variety of browsers and devices, as there are still differences in how certain features are implemented. For example, on IE10 running on the Microsoft Surface, html { -ms-touch-action: none; } needed to be added to allow KttM to function correctly.

    Adding tracking and taking time to define and collect the key engagement metrics allows you to evaluate the impact of changes that get deployed to users in a quantitative manner. Having well defined metrics allows you to run controlled tests to figure out how to streamline your application.

    Finally, listen to your users! They pick up on things that you miss – even if they don’t know it. The congratulations message that appears on completion was added after I received complaints that it was not clear when a picture was fully uncovered.

    All projects are forever evolving and if you listen to your users and run controlled experiments then there is no limit to how much you can improve.

  9. NORAD Tracks Santa

    This year, Open Web standards like WebGL, Web Workers, Typed Arrays, Fullscreen, and more will have a prominent role in NORAD’s annual mission to track Santa Claus as he makes his journey around the world. That’s because Analytical Graphics, Inc. used Cesium as the basis for the 3D Track Santa application.

    Cesium is an open source library that uses JavaScript, WebGL, and other web technologies to render a detailed, dynamic, and interactive virtual globe in a web browser, without the need for a plugin. Terrain and imagery datasets measured in gigabytes or terabytes are streamed to the browser on demand, and overlaid with lines, polygons, placemarks, labels, models, and other features. These features are accurately positioned within the 3D world and can efficiently move and change over time. In short, Cesium brings to the Open Web the kind of responsive, geospatial experience that was uncommon even in bulky desktop applications just a few years ago.

    The NORAD Tracks Santa web application goes live on December 24. Cesium, however, is freely available today for commercial and non-commercial use under the Apache 2.0 license.

    In this article, I’ll present how Cesium uses cutting edge web APIs to bring an exciting in-browser experience to millions of people on December 24.

    The locations used in the screenshots of the NORAD Tracks Santa application are based on test data. We, of course, won’t know Santa’s route until NORAD starts tracking him on Christmas Eve. Also, the code samples in this article are for illustrative purposes and do not necessarily reflect the exact code used in Cesium. If you want to see the official code, check out our GitHub repo.

    WebGL

    Cesium could not exist without WebGL, the technology that brings hardware-accelerated 3D graphics to the web.

    It’s hard to overstate the potential of this technology to bring a whole new class of scientific and entertainment applications to the web; Cesium is just one realization of that potential. With WebGL, we can render scenes like the above, consisting of hundreds of thousands of triangles, at well over 60 frames per second.

    Yeah, you could say I’m excited.

    If you’re familiar with OpenGL, WebGL will seem very natural to you. To oversimplify a bit, WebGL enables applications to draw shaded triangles really fast. For example, from JavaScript, we execute code like this:

    gl.bindBuffer(gl.ARRAY_BUFFER, vertexBuffer);
    gl.bindBuffer(gl.ELEMENT_ARRAY_BUFFER, indexBuffer);
    gl.drawElements(gl.TRIANGLES, numberOfIndices, gl.UNSIGNED_SHORT, 0);

    vertexBuffer is a previously-configured data structure holding vertices, or corners of triangles. A simple vertex just specifies the position of the vertex as X, Y, Z coordinates in 3D space. A vertex can have additional attributes, however, such as colors and the vertex’s coordinates within a 2D image for texture mapping.

    The indexBuffer links the vertices together into triangles. It is a list of integers where each integer specifies the index of a vertex in the vertexBuffer. Each triplet of indices specifies one triangle. For example, if the first three indices in the list are [0, 2, 1], the first triangle is defined by linking up vertices 0, 2, and 1.
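    As a concrete illustration (not Cesium’s code), here is a quad built from four vertices and two index triplets using typed arrays:

```javascript
// Four corner vertices of a unit quad, each as (x, y, z) coordinates.
var vertices = new Float32Array([
  0, 0, 0,  // vertex 0: bottom-left
  1, 0, 0,  // vertex 1: bottom-right
  1, 1, 0,  // vertex 2: top-right
  0, 1, 0   // vertex 3: top-left
]);

// Two triplets of vertex indices; together the two triangles cover the quad.
var indices = new Uint16Array([
  0, 1, 2,  // first triangle
  0, 2, 3   // second triangle
]);
```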

    The drawElements call instructs WebGL to draw the triangles defined by the vertex and index buffers. The really cool thing is what happens next.

    For every vertex in vertexBuffer, WebGL executes a program, called a vertex shader, that is supplied by the JavaScript code. Then, WebGL figures out which pixels on the screen are “lit up” by each triangle – a process called rasterization. For each of these pixels, called fragments, another program, a fragment shader, is invoked. These programs are written in a C-like language called GLSL that executes on the system’s Graphics Processing Unit (GPU). Thanks to this low-level access and the impressive parallel computation capability of GPUs, these programs can do sophisticated computations very quickly, creating impressive visual effects. This feat is especially impressive when you consider that they are executed hundreds of thousands or millions of times per render frame.

    Cesium’s fragment shaders approximate atmospheric scattering, simulate ocean waves, model the reflection of the sun off the ocean surface, and more.

    WebGL is well supported in modern browsers on Windows, Linux and Mac OS X. Even Firefox for Android supports WebGL!

    While I’ve shown direct WebGL calls in the code above, Cesium is actually built on a renderer that raises the level of abstraction beyond WebGL itself. We never issue drawElements calls directly, but instead create command objects that represent the vertex buffers, index buffers, and other data with which to draw. This allows the renderer to automatically and elegantly solve esoteric rendering problems like the insufficient depth buffer precision for a world the size of Earth. If you’re interested, you can read more about Cesium’s data-driven renderer.

    For more information about some of the neat rendering effects used in the NORAD Tracks Santa application, take a look at our blog post on the subject.

    Typed Arrays and Cross-Origin Resource Sharing

    Virtual globes like Cesium provide a compelling, interactive 3D view of real-world situations by rendering a virtual Earth combined with georeferenced data such as roads, points of interest, weather, satellite orbits, or even the current location of Santa Claus. At the core of a virtual globe is the rendering of the Earth itself, with realistic terrain and satellite imagery.

    Terrain describes the shape of the surface: the mountain peaks, the hidden valleys, the wide open plains, and everything in between. Satellite or aerial imagery is then overlaid on this otherwise colorless surface and brings it to life.

    The global terrain data used in the NORAD Tracks Santa application is derived from the Shuttle Radar Topography Mission (SRTM), which has a 90-meter spacing between -60 and 60 degrees latitude, and the Global 30 Arc Second Elevation Data Set (GTOPO30), which has 1-kilometer spacing for the entire globe. The total size of the dataset is over 10 gigabytes.

    For imagery, we use Bing Maps, who is also a part of the NORAD Tracks Santa team. The total size of this dataset is even bigger – easily in the terabytes.

    With such enormous datasets, it is clearly impractical to transfer all of the terrain and imagery to the browser before rendering a scene. For that reason, both datasets are broken up into millions of individual files, called tiles. As Santa flies around the world, Cesium downloads new terrain and imagery tiles as they are needed.

    Terrain tiles describing the shape of the Earth’s surface are binary data encoded in a straightforward format. When Cesium determines that it needs a terrain tile, we download it using XMLHttpRequest and access the binary data using typed arrays:

    var tile = ...

    var xhr = new XMLHttpRequest();
    xhr.open('GET', terrainTileUrl, true);
    xhr.responseType = 'arraybuffer';

    xhr.onload = function(e) {
        if (xhr.status === 200) {
            var tileData = xhr.response;
            tile.heights = new Uint16Array(tileData, 0, heightmapWidth * heightmapHeight);
            var heightsBytes = tile.heights.byteLength;
            tile.childTileBits = new Uint8Array(tileData, heightsBytes, 1)[0];
            tile.waterMask = new Uint8Array(tileData, heightsBytes + 1, tileData.byteLength - heightsBytes - 1);
            tile.state = TileState.RECEIVED;
        } else {
            // ...
        }
    };

    xhr.send();

    Prior to the availability of typed arrays, this process would have been much more difficult. The usual course was to encode the data as text in JSON or XML format. Not only would such data be larger when sent over the wire(less), it would also be significantly slower to process it once it was received.

    While it is generally very straightforward to work with terrain data using typed arrays, two issues make it a bit trickier.

    The first is cross-origin restrictions. It is very common for terrain and imagery to be hosted on different servers than are used to host the web application itself, and this is certainly the case in NORAD Tracks Santa. XMLHttpRequest, however, does not usually allow requests to non-origin hosts. The common workaround of using script tags instead of XMLHttpRequest won’t work well here because we are downloading binary data – we can’t use typed arrays with JSONP.

    Fortunately, modern browsers offer a solution to this problem by honoring Cross-Origin Resource Sharing (CORS) headers, included in the response by the server, indicating that the response is safe for use across hosts. Enabling CORS is easy to do if you have control over the web server, and Bing Maps already includes the necessary headers on their tile files. Other terrain and imagery sources that we’d like to use in Cesium are not always so forward-thinking, however, so we’ve sometimes been forced to route cross-origin requests through a same-origin proxy.

    The other tricky aspect is that modern browsers only allow up to six simultaneous connections to a given host. If we simply created a new XMLHttpRequest for each tile requested by Cesium, the number of queued requests would grow large very quickly. By the time a tile was finally downloaded, the viewer’s position in the 3D world may have changed so that the tile is no longer even needed.

    Instead, we manually limit ourselves to six outstanding requests per host. If all six slots are taken, we won’t start a new request. Instead, we’ll wait until next render frame and try again. By then, the highest priority tile may be different than it was last frame, and we’ll be glad we didn’t queue up the request then. One nice feature of Bing Maps is that it serves the same tiles from multiple hostnames, which allows us to have more outstanding requests at once and to get the imagery into the application faster.
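    A minimal sketch of such per-host throttling (illustrative only, not Cesium’s actual request scheduler) might look like this:

```javascript
// Illustrative sketch: allow at most maxPerHost in-flight requests per host.
// Callers that are refused simply try again on the next render frame.
var activeRequests = {};
var maxPerHost = 6;

function tryRequest(hostname, startRequest) {
  var active = activeRequests[hostname] || 0;
  if (active >= maxPerHost) {
    return false; // no free slot; retry next frame with fresh priorities
  }
  activeRequests[hostname] = active + 1;
  // startRequest receives a callback to invoke when the request completes
  startRequest(function done() {
    activeRequests[hostname]--;
  });
  return true;
}
```

    Deferring refused requests to the next frame, rather than queuing them, is what lets the tile priorities stay fresh.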

    Web Workers

    The terrain data served to the browser is, primarily, just an array of terrain heights. In order to render it, we need to turn the terrain tile into a triangle mesh with a vertex and index buffer. This process involves converting longitude, latitude, and height to X, Y, and Z coordinates mapped to the surface of the WGS84 ellipsoid. Doing this once is pretty fast, but doing it for each height sample, of which each tile has thousands, starts to take some measurable time. If we did this conversion for several tiles in a single render frame, we’d definitely start to see some stuttering in the rendering.

    One solution is to throttle tile conversion, doing at most N per render frame. While this would help with the stuttering, it doesn’t avoid the fact that tile conversion competes with rendering for CPU time while other CPU cores sit idle.

    Fortunately, another great new web API comes to the rescue: Web Workers.

    We pass the terrain ArrayBuffer downloaded from the remote server via XMLHttpRequest to a Web Worker as a transferable object. When the worker receives the message, it builds a new typed array with the vertex data in a form ready to be passed straight to WebGL. Unfortunately, Web Workers are not yet allowed to invoke WebGL, so we can’t create vertex and index buffers in the Web Worker; instead, we post the typed array back to the main thread, again as a transferable object.

    The beauty of this approach is that terrain data conversion happens asynchronously with rendering, and that it can take advantage of the client system’s multiple cores, if available. This leads to a smoother, more interactive Santa tracking experience.

    Web Workers are simple and elegant, but that simplicity presents some challenges for an engine like Cesium, which is designed to be useful in various different types of applications.

    During development, we like to keep each class in a separate .js file, for ease of navigation and to avoid the need for a time-consuming combine step after every change. Each class is actually a separate module, and we use the Asynchronous Module Definition (AMD) API and RequireJS to manage dependencies between modules at runtime.

    For use in production environments, it is a big performance win to combine the hundreds of individual files that make up a Cesium application into a single file. This may be a single file for all of Cesium or a user-selected subset. It may also be beneficial to combine parts of Cesium into a larger file containing application-specific code, as we’ve done in the NORAD Tracks Santa application. Cesium supports all of these use-cases, but the interaction with Web Workers gets tricky.

    When an application creates a Web Worker, it provides to the Web Worker API the URL of the .js file to invoke. The problem is, in Cesium’s case, that URL varies depending on which of the above use-cases is currently in play. Worse, the worker code itself needs to work a little differently depending on how Cesium is being used. That’s a big problem, because workers can’t access any information in the main thread unless that information is explicitly posted to it.

    Our solution is the cesiumWorkerBootstrapper. Regardless of what the WebWorker will eventually do, it is always constructed with cesiumWorkerBootstrapper.js as its entry point. The URL of the bootstrapper is deduced by the main thread where possible, and can be overridden by user code when necessary. Then, we post a message to the worker with details about how to actually dispatch work.

    var worker = new Worker(getBootstrapperUrl());

    // bootstrap
    var bootstrapMessage = {
        loaderConfig : {},
        workerModule : 'Workers/' + processor._workerName
    };

    if (typeof require.toUrl !== 'undefined') {
        bootstrapMessage.loaderConfig.baseUrl = '..';
    } else {
        bootstrapMessage.loaderConfig.paths = {
            'Workers' : '.'
        };
    }

    worker.postMessage(bootstrapMessage);

    The worker bootstrapper contains a simple onmessage handler:

    self.onmessage = function(event) {
        var data = event.data;
        require(data.loaderConfig, [data.workerModule], function(workerModule) {
            // replace onmessage with the required-in workerModule
            self.onmessage = workerModule;
        });
    };

    When the bootstrapper receives the bootstrapMessage, it uses the RequireJS implementation of require, which is also included in cesiumWorkerBootstrapper.js, to load the worker module specified in the message. It then “becomes” the new worker by replacing its onmessage handler with the required-in one.

    In use-cases where Cesium itself is combined into a single .js file, we also combine each worker into its own .js file, complete with all of its dependencies. This ensures that each worker needs to load only two .js files: the bootstrapper plus the combined module.

    Mobile Devices

    One of the most exciting aspects of building an application like NORAD Tracks Santa on web technologies is the possibility of achieving portability across operating systems and devices with a single code base. All of the technologies used by Cesium are already well supported on Windows, Linux, and Mac OS X on desktops and laptops. Increasingly, however, these technologies are becoming available on mobile devices.

    The most stable implementation of WebGL on phones and tablets is currently found in Firefox for Android. We tried out Cesium on several devices, including a Nexus 4 phone and a Nexus 7 tablet, both running Android 4.2.1 and Firefox 17.0. With a few tweaks, we were able to get Cesium running, and the performance was surprisingly good.

    We did encounter a few problems, however, presumably a result of driver bugs. One problem was that normalizing vectors in fragment shaders sometimes simply does not work. For example, GLSL code like this:

    vec3 normalized = normalize(someVector);

    sometimes results in a “normalized” vector whose length is still greater than 1. Fortunately, this is easy to work around by adding a second call to normalize:

    vec3 normalized = normalize(normalize(someVector));

    We hope that as WebGL gains more widespread adoption on mobile, bugs like this will be detected by the WebGL conformance tests before devices and drivers are released.

    The Finished Application

    As long-time C++ developers, we were initially skeptical of building a virtual globe application on the Open Web. Would we be able to do all the things expected of such an application? Would the performance be good?

    I’m pleased to say that we’ve been converted. Modern web APIs like WebGL, Web Workers, and Typed Arrays, along with the continual and impressive gains in JavaScript performance, have made the web a convenient, high-performance platform for sophisticated 3D applications. We’re looking forward to continuing to use Cesium to push the limits of what is possible in a browser, and to take advantage of new APIs and capabilities as they become available.

    We’re also looking forward to using this technology to bring a fun, 3D Santa tracking experience to millions of kids worldwide this Christmas as part of the NORAD Tracks Santa team. Check it out on December 24 at www.noradsanta.org.

  10. Performance with JavaScript String Objects

    This article takes a look at how JavaScript engines perform when handling primitive string values versus String objects. It is a showcase of benchmarks related to the excellent article by Kiro Risk, The Wrapper Object. Before proceeding, I suggest visiting Kiro’s page first as an introduction to this topic.

    The ECMAScript 5.1 Language Specification (PDF link) states at paragraph 4.3.18 about the String object:

    String object: member of the Object type that is an instance of the standard built-in String constructor

    NOTE A String object is created by using the String constructor in a new expression, supplying a String value as an argument.
    The resulting object has an internal property whose value is the String value. A String object can be coerced to a String value
    by calling the String constructor as a function (15.5.1).

    and David Flanagan’s great book “JavaScript: The Definitive Guide” meticulously describes wrapper objects in section 3.6:

    Strings are not objects, though, so why do they have properties? Whenever you try to refer to a property of a string s, JavaScript converts the string value to an object as if by calling new String(s). […] Once the property has been resolved, the newly created object is discarded. (Implementations are not required to actually create and discard this transient object: they must behave as if they do, however.)

    It is important to note the parenthetical at the end of the quote above: implementations are not required to actually create and discard this transient object, only to behave as if they do. In other words, exactly how a new String object is created is implementation specific. An obvious question, then, is: since a primitive value string must be coerced to a String object when a property such as str.length is accessed, would it be faster to declare the variable as a String object in the first place? That is, could declaring a variable as a String object, i.e. var str = new String("hello"), rather than as a primitive value string, i.e. var str = "hello", potentially save the JS engine from having to create a new String object on the fly in order to access its properties?

    Those who deal with the implementation of ECMAScript standards to JS engines already know the answer, but it’s worth having a deeper look at the common suggestion “Do not create numbers or strings using the ‘new’ operator”.
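    That suggestion matters for correctness as well as speed; a primitive string and its wrapper object behave differently under typeof and strict equality:

```javascript
var strprimitive = "hello";
var strobject = new String("hello");

// typeof distinguishes the two
console.log(typeof strprimitive);  // "string"
console.log(typeof strobject);     // "object"

// strict equality never matches an object against a primitive
console.log(strprimitive === "hello");         // true
console.log(strobject === "hello");            // false
console.log(strobject.valueOf() === "hello");  // true
```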

    Our showcase and objective

    For our showcase, we will mainly use Firefox and Chrome; the results, though, would be similar in any other web browser, since we are focusing not on a speed comparison between two different browser engines, but on a speed comparison between two versions of the source code in each browser (one version using a primitive value string, the other a String object). In addition, we are interested in how the same cases compare in speed across subsequent versions of the same browser. The first sample of benchmarks was collected on a single machine, and then other machines with different OS/hardware specs were added in order to validate the speed numbers.

    The scenario

    For the benchmarks, the case is rather simple; we declare two string variables, one as a primitive value string and the other as an Object String, both of which have the same value:

      var strprimitive = "Hello";
      var strobject    = new String("Hello");

    and then we perform the same kind of tasks on them. (Notice that on the jsPerf pages, strprimitive = str1 and strobject = str2.)

    1. length property

      var i = strprimitive.length;
      var k = strobject.length;

    If we assume that, at runtime, the JavaScript engine treats the wrapper object created from the primitive value string strprimitive the same as the object string strobject in terms of performance, then we should expect the same latency when accessing each variable’s length property. Yet, as the following bar chart shows, accessing the length property is much faster on the primitive value string strprimitive than on the object string strobject.


    (Primitive value string vs Wrapper Object String – length, on jsPerf)

    Actually, on Chrome 24.0.1285 calling strprimitive.length is 2.5x faster than calling strobject.length, and on Firefox 17 it is about 2x faster (while performing more operations per second overall). Consequently, we can conclude that the corresponding browser JavaScript engines apply some “short paths” to access the length property when dealing with primitive string values, with special code blocks for each case.

    In the SpiderMonkey JS engine, for example, the pseudo-code that deals with the “get property” operation looks something like the following:

      // direct check for the "length" property
      if (typeof(value) == "string" && property == "length") {
        return StringLength(value);
      }
      // generalized code form for properties
      object = ToObject(value);
      return InternalGetProperty(object, property);

    Thus, when you request a property on a string primitive, and the property name is “length”, the engine immediately just returns its length, avoiding the full property lookup as well as the temporary wrapper object creation. Unless we add a property or method to String.prototype that requests |this|, like so:

      String.prototype.getThis = function () { return this; }
      console.log("hello".getThis());

    no wrapper object will be created when accessing the String.prototype methods, such as String.prototype.valueOf(). Each JS engine has embedded similar optimizations in order to produce faster results.

    2. charAt() method

      var i = strprimitive.charAt(0);
      var k = strobject["0"];


    (Primitive value string vs Wrapper Object String – charAt(), on jsPerf)

    This benchmark clearly verifies the previous statement: getting the value of the first string character in Firefox 20 is substantially faster on strprimitive than on strobject, about 70x faster. Similar results apply to other browsers as well, though at different speeds. Also, notice the differences between incremental Firefox versions; this is just another indicator of how small code variations can affect the JS engine’s speed for certain runtime calls.
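    Note that the two snippets above access the first character through different syntax, charAt(0) versus indexed property access, but both return the same value, as a quick check confirms:

```javascript
var strprimitive = "Hello";
var strobject = new String("Hello");

console.log(strprimitive.charAt(0));                  // "H"
console.log(strobject["0"]);                          // "H"
console.log(strprimitive[0] === strobject.charAt(0)); // true
```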

    3. indexOf() method

      var i = strprimitive.indexOf("e");
      var k = strobject.indexOf("e");


    (Primitive value string vs Wrapper Object String – IndexOf(), on jsPerf)

    Similarly in this case, we can see that the primitive value string strprimitive supports more operations per second than strobject. In addition, the JS engine differences between sequential browser versions produce a variety of measurements.

    4. match() method

    Since the results here are similar too, to save some space, you can click the source link to view the benchmark.

    (Primitive value string vs Wrapper Object String – match(), on jsPerf)

    5. replace() method

    (Primitive value string vs Wrapper Object String – replace(), on jsPerf)

    6. toUpperCase() method

    (Primitive value string vs Wrapper Object String – toUpperCase(), on jsPerf)

    7. valueOf() method

      var i = strprimitive.valueOf();
      var k = strobject.valueOf();

    At this point it starts to get more interesting. So, what happens when we try to call the most common method of a string, its valueOf()? It seems like most browsers have a mechanism to determine whether it’s a primitive value string or an Object String, and thus use a much faster way to get its value; surprisingly enough, Firefox versions up to v20 seem to favour the Object String method call of strobject, with a 7x increased speed.


    (Primitive value string vs Wrapper Object String – valueOf(), on jsPerf)

    It’s also worth mentioning that Chrome 22.0.1229 also seems to have favoured the Object String, while in version 23.0.1271 a new way to get the content of primitive value strings was implemented.

    A simpler way to run this benchmark in your browser’s console is described in the comment of the jsperf page.

    8. Adding two strings

      var i = strprimitive + " there";
      var k = strobject + " there";


    (Primitive string vs Wrapper Object String – get str value, on jsPerf)

    Let’s now try to concatenate each of the two strings with a string literal. As the chart shows, Firefox and Chrome present a 2.8x and 2x increased speed respectively in favour of strprimitive, as compared with concatenating the Object string strobject with another string value.
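    Regardless of which variable we start from, the + operator first converts the wrapper object back to a primitive (via the ToPrimitive operation, which calls its valueOf()), so the result of the concatenation is always a primitive string:

```javascript
var strprimitive = "Hello";
var strobject = new String("Hello");

var i = strprimitive + " there";
var k = strobject + " there"; // strobject is converted back to a primitive first

console.log(i);        // "Hello there"
console.log(k);        // "Hello there"
console.log(typeof k); // "string", concatenation always yields a primitive
```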

    9. Adding two strings with valueOf()

      var i = strprimitive.valueOf() + " there";
      var k = strobject.valueOf() + " there";


    (Primitive string vs Wrapper Object String – str valueOf, on jsPerf)

    Here we can see again that Firefox favours strobject.valueOf(), since for strprimitive.valueOf() it moves up the inheritance tree and consequently creates a new wrapper object for strprimitive. The effect this chain of events has on performance can also be seen in the next case.

    10. for-in wrapper object

      var i = "";
      for (var temp in strprimitive) { i += strprimitive[temp]; }
     
      var k = "";
      for (var temp in strobject) { k += strobject[temp]; }

    This benchmark incrementally reconstructs each string’s value, through a loop, into another variable. In a for-in loop, the expression to be evaluated is normally an object, but if the expression is a primitive value, then this value gets coerced to its equivalent wrapper object. Of course, this is not a recommended way to get the value of a string, but it is one of the many ways a wrapper object can be created, and thus it is worth mentioning.
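    The coercion is easy to observe, since for-in enumerates the same index properties on both values:

```javascript
var strprimitive = "abc";
var strobject = new String("abc");

var keys = [];
for (var temp in strprimitive) { keys.push(temp); } // primitive coerced to a wrapper
console.log(keys); // ["0", "1", "2"], the enumerable index properties

var k = "";
for (var temp in strobject) { k += strobject[temp]; }
console.log(k); // "abc"
```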


    (Primitive string vs Wrapper Object String – Properties, on jsPerf)

    As expected, Chrome seems to favour the primitive value string strprimitive, while Firefox and Safari seem to favour the object string strobject. In case this all seems fairly typical by now, let’s move on to the last benchmark.

    11. Adding two strings with an Object String

      var str3 = new String(" there");
     
      var i = strprimitive + str3;
      var k = strobject + str3;


    (Primitive string vs Wrapper Object String – 2 str values, on jsPerf)

    In the previous examples, we have seen that Firefox versions offer better performance if our initial string is an Object String like strobject, so it would seem normal to expect the same when adding strobject to another Object String, which is basically the same thing. It is worth noticing, though, that when adding a string to an Object String, it is actually considerably faster in Firefox if we use strprimitive instead of strobject. This proves once more how source code variations, such as a patch for a bug, can lead to different benchmark numbers.

    Conclusion

    Based on the benchmarks described above, we have seen how subtle differences in our string declarations can produce a range of different performance results. It is recommended that you continue to declare your string variables as you normally do, unless there is a very specific reason for you to create instances of the String Object. Also, note that a browser’s overall performance, particularly when dealing with the DOM, is not based solely on the page’s JS performance; there is a lot more to a browser than its JS engine.
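    Beyond raw speed, there is another practical reason to avoid new String(): wrapper objects behave differently from primitives in comparisons and boolean contexts, as this small sketch shows:

```javascript
console.log("hello" === "hello");                        // true
console.log(new String("hello") === "hello");            // false, different types
console.log(new String("hello") == "hello");             // true, coerced to a primitive
console.log(new String("hello") == new String("hello")); // false, compared by reference

console.log(Boolean(""));             // false, an empty primitive is falsy
console.log(Boolean(new String(""))); // true, any object is truthy
```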

    Feedback comments are much appreciated. Thanks :-)