Performance Articles

  1. jsDelivr – The advanced open source public CDN

    This is a guest post by Dmitriy Akulov about his project jsDelivr. – Editor's note.

    As a developer you are probably aware of Google Hosted Libraries. Google offers an easy and fast way to include 12 of the most popular JavaScript libraries in your websites.

    But what if you are a webmaster who wants to take advantage of a fast CDN for other, less popular projects too? Or what if you are a developer who wants to make your project easier for other users to access and use?

    This is where jsDelivr comes into play. jsDelivr is a free and open source CDN created to help developers and webmasters. There are no popularity restrictions and all kinds of files are allowed, including JavaScript libraries, jQuery plugins, CSS frameworks, fonts and more.

    Adding a library

    To add a new library or update an existing one, all the developer has to do is clone our GitHub repository and apply the modifications they see fit. Once a moderator reviews the pull request and merges it, the files become instantly available from the official website.

    If a moderator is online, approval should not take more than 20 minutes; otherwise it can take up to 10 hours until someone comes online. But once our auto-update utility launches, review times will drop.

    Reliability

    But what actually makes it so advanced? The idea behind jsDelivr was not to create yet another public CDN, but to offer a super fast and reliable infrastructure that developers and website owners can trust and use. Any website, big or small, can use it without worry. There are no bandwidth limits and our service is rock solid.

    Slow responses, timeouts and downtime are not tolerated, so we designed a unique system to overcome these problems and offer a product that even enterprise CDNs would be jealous of. Uptime and performance are top priorities: we monitor everything at all times, and we are always looking into new technologies and providers that may further improve our CDN.

    Infrastructure

    (Image: jsDelivr network map)

    Unlike the competition, jsDelivr uses a unique multi-CDN infrastructure to offer the best possible uptime and performance. Its main backbone is built on top of CDN networks provided by MaxCDN and CloudFlare.

    We also use custom servers in locations where these CDNs have little or no presence. At the moment this results in 42 POP locations worldwide. In the future we plan to add even more locations to offer top performance even in less well-served countries.

    Of course lots of locations means nothing if you can’t load balance across them correctly. For the load balancing system we use services provided by Cedexis. One of their main features is the real time performance data they gather on all major CDN providers. 1.3 billion RUM (Real User Metrics) performance tests per day are processed and available to all Cedexis users.

    Measuring performance

    To gather these RUM tests, they have deployed a special JavaScript snippet on thousands of websites. Every visitor to one of these websites executes the code and, while browsing, tests different CDN providers in the background. The testing does not impact the browsing experience in any way and is completely transparent to the user. You can actually see how it works by visiting our website and opening the developer tools' "Network" tab.

    The beauty of these tests is that they are not synthetic. They reflect the real performance real users will get if they download a file from one of those CDNs.

    The following information is then stored:

    • Performance metrics to each of our providers.
    • Availability metrics to each of our providers.
    • Browser’s User-Agent
    • First three octets of the user’s IP address

    Now that we have all this information we can use it in our smart load balancing algorithm.

    Every user gets a unique response based on their location and ISP. Each time a user requests a file from jsDelivr, our algorithm extracts the performance and availability data it has for the last few minutes and then figures out the optimal provider for that particular user at that particular time. All that in a few milliseconds.

    First it makes sure that all available providers are online. For this it uses the RUM availability data plus a synthetic test that checks each provider for uptime every minute. Then it sorts the providers by performance for the user's ISP and location.

    Once it has the fastest provider, it returns that hostname to the user. So, for example, two users in London on different ISPs could get two different responses, because their ISPs have different routing and performance to different CDN providers. This smart system guarantees maximum uptime and fast loading times for all users. If a provider goes down, jsDelivr won't experience any issues at all and will immediately start serving from a different provider.

    This algorithm also responds immediately to performance degradation. For example, if a CDN provider gets DDoSed in Europe and its response times increase, jsDelivr will pick up the change and simply stop using that provider in Europe, while still considering it for users in the USA and other locations not affected by the attack.
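    Conceptually, the per-request selection can be thought of as the following sketch. This is illustrative only, not jsDelivr's actual balancer code, and all of the names (pickProvider, rum, syntheticCheckPassed and so on) are made up for the example:

    // Illustrative sketch only – not jsDelivr's real implementation.
    // Chooses a CDN hostname for one request using recent RUM data.
    function pickProvider(providers, rum, location, isp) {
      // 1. Keep only providers that pass both the RUM availability data
      //    and the per-minute synthetic uptime check.
      var online = providers.filter(function (p) {
        return rum.isAvailable(p, location) && p.syntheticCheckPassed;
      });

      // 2. Sort the remaining providers by measured latency for this ISP and location.
      online.sort(function (a, b) {
        return rum.latency(a, isp, location) - rum.latency(b, isp, location);
      });

      // 3. Return the hostname of the fastest provider (with a fallback).
      return online.length ? online[0].hostname : "cdn.jsdelivr.net";
    }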

    Don't rely on a single CDN for uptime and speed. Everything can go down, but the chances of two CDNs and multiple servers going down at the same time are very slim. This is why jsDelivr is an optimal solution for every website out there, no matter how big it is.

    I should also point out that MaxCDN, CloudFlare, Cedexis and the rest of the companies sponsor jsDelivr for free. It's nice to see that there are companies out there willing to help open source projects and build a fast and free internet.

    Advanced Features

    jsDelivr also supports some interesting and very helpful features such as:

    Version Aliasing

    Instead of using a unique URL for each version, you can load a project with jsDelivr using version aliasing. Let's take the project Abaaso as an example. At this moment the latest version is 3.10.50, and you can load it by specifying the exact version in your URL as always. But since this project gets updated very often, you would soon end up serving an old version. To overcome this problem you can now simply use the following URL:

    //cdn.jsdelivr.net/abaaso/3.10/abaaso.min.js

    By using 3.10 you tell jsDelivr to load the latest version it has in the 3.10 branch, which in this case is 3.10.50. This is the optimal solution for most authors, because they can load the latest minor version without worrying about major changes that could break their website.

    It is of course possible to load the latest version in the v3 branch by using the following URL:

    //cdn.jsdelivr.net/abaaso/3/abaaso.min.js

    And if for any reason you need to always load the latest available version in any major branch you can use:

    //cdn.jsdelivr.net/abaaso/latest/abaaso.min.js

    By using "latest" you tell the server to load the absolute latest version it has. This is of course dangerous and, given enough time, will break your website. So use this feature with caution.

    Load multiple files with a single HTTP request

    jsDelivr is the first CDN to support this kind of functionality: you can load multiple files with a single HTTP request. It is similar to combining and minifying JS files on your own server, but cached by jsDelivr's huge and smart network.

    All you have to do is to build your own URL with the projects and files you want to combine and their versions if needed. For example, to load the latest version for projects abaaso, ace and alloyui you would use the following syntax:

    //cdn.jsdelivr.net/g/abaaso,ace,alloyui

    Keep in mind that loading the latest version is not recommended and, given enough time, will break your website. This is why you should specify exact versions or use version aliases:

    //cdn.jsdelivr.net/g/jquery@2.1,angularjs@1.2

    So jquery@2.1 will load 2.1.0 and angularjs@1.2 will load 1.2.14. Note that the above URL will load only the main file of each project and nothing else.

    If you want to load multiple files from a single project then you can do the following:

    //cdn.jsdelivr.net/g/jquery@2.1,angularjs@1.2.14(angular.min.js+angular-resource.min.js+angular-animate.min.js+angular-cookies.min.js+angular-route.min.js+angular-sanitize.min.js)

    If you want to load CSS, then select CSS files using the above format. If all files in the group URL have a .css extension, the server will automatically respond with a Content-Type: text/css HTTP header. In all other cases (for /g/ URLs) Content-Type: application/javascript is used.

    Next you simply include the URL in your website and you are done. Less DNS resolution, fewer TCP connections and fewer HTTP requests = a faster website.

    You can even use this feature to offer your users a builder that lets them generate a URL with the modules they need and then load them all from a fast CDN.

    A real API

    jsDelivr has a fully featured API that developers can use in their websites, to create custom modules, and for anything else you might think of: https://github.com/jsdelivr/api

    You can request exactly what you need using our API, without downloading a huge packages JSON. It also supports cdnjs and Google Hosted Libraries, so developers have everything they need to build their applications.
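    As a rough illustration of what that looks like from a web page, the snippet below queries the API for a single project's metadata. The endpoint path, query parameter and response fields shown here are assumptions made for the example; check the API repository above for the actual routes:

    // Sketch: fetch metadata for one project from the jsDelivr API.
    // The endpoint path and response shape are assumptions – see the API repo.
    var xhr = new XMLHttpRequest();
    xhr.open("GET", "https://api.jsdelivr.com/v1/jsdelivr/libraries?name=jquery", true);
    xhr.onload = function () {
      var results = JSON.parse(xhr.responseText);     // assumed: an array of matching projects
      console.log(results[0] && results[0].versions); // assumed: list of available versions
    };
    xhr.send();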

    Auto-Updates

    jsDelivr libgrabber is a utility that will run on our servers and can auto-update all hosted projects that are configured for it. The best part is that authors don't have to change anything in their repos; all changes are made on the jsDelivr side.

    All you need to do is create an update.json file, with some basic info, inside the project you want to keep auto-updated in the jsDelivr repo. This file also supports multiple sources for new versions, such as npm, Bower and GitHub repos directly. It is still under development but is planned to be released soon.

    Try it out, help out!

    jsDelivr is a very interesting project that I enjoy developing and making better. It also relies heavily on the help of the community. Consider using it in your websites and hosting your projects there.

    And if you are interested in helping out, we can always use some help; just join the conversation on GitHub.

    Feel free to leave your comments and ask me any questions you might have.

    Thank you

  2. Optimizing your JavaScript game for Firefox OS

    When developing on a quad core processor with 16 gigabytes of RAM you can easily forget to consider how it will perform on a mobile device. This article will detail some best practices and things to consider for moving a game to Firefox OS or any similar hardware target.

    Making the best of 256 MB RAM / 800 MHz CPU

    There are many areas of focus to keep in mind while developing a game. When your goal is to draw 60 times a second, garbage collection and inefficient drawing calls start to get in your way. Let’s start with the basics…

    Don’t over-optimize

    This might sound counter-intuitive in an article about game optimization but optimization is the last step; performed on complete, working code. While it’s never a bad idea to keep these tips and tricks in mind, you don’t know whether you’ll need them until you’ve finished the game and played it on a device.

    Optimize Drawing

    Drawing on an HTML5 2D canvas is the biggest bottleneck in most JavaScript games, as all other updates are usually just algebra without touching the DOM. Canvas operations are hardware accelerated, which can give you some extra room to breathe.

    Use whole-pixel rendering

    Sub-pixel rendering occurs when you render objects on a canvas at non-whole-number coordinates.

    ctx.drawImage(myImage, 0.3, 0.5)

    This causes the browser to do extra calculations to create an anti-aliasing effect. To avoid this, make sure to round all coordinates used in calls to drawImage using Math.floor or, as you'll read further in the article, bitwise operators.

    jsPerf – drawImage whole pixels.
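    For example, a minimal sketch of rounding the coordinates before drawing (ctx and myImage are assumed to already exist):

    // Round to whole pixels before drawing to avoid anti-aliasing work.
    var x = 10.3, y = 12.7;
    ctx.drawImage(myImage, Math.floor(x), Math.floor(y));

    // In a hot loop, a bitwise OR with 0 truncates positive values more cheaply:
    ctx.drawImage(myImage, x | 0, y | 0);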

    Cache drawing in an offscreen canvas

    If you find yourself doing complex drawing operations on each frame, consider creating an offscreen canvas, drawing to it once (or whenever it changes), and then drawing the offscreen canvas to the visible canvas on each frame.

    // Create an offscreen canvas the size of the entity
    myEntity.offscreenCanvas = document.createElement("canvas");
    myEntity.offscreenCanvas.width = myEntity.width;
    myEntity.offscreenCanvas.height = myEntity.height;
    myEntity.offscreenContext = myEntity.offscreenCanvas.getContext("2d");

    // Render the entity once into the cached context
    myEntity.render(myEntity.offscreenContext);
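    Then, on each frame, the cached result can be drawn with a single call (a sketch; ctx here is assumed to be the visible canvas' 2D context and x/y the entity's position):

    // Each frame: blit the pre-rendered entity instead of redoing the complex drawing.
    ctx.drawImage(myEntity.offscreenCanvas, myEntity.x, myEntity.y);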

    Use moz-opaque on the canvas tag (Firefox Only)

    If your game uses canvas and doesn’t need to be transparent, set the moz-opaque attribute on the canvas tag. This information can be used internally to optimize rendering.

    <canvas id="mycanvas" moz-opaque></canvas>

    Described more in Bug 430906 – Add moz-opaque attribute on canvas.

    Scaling canvas using CSS3 transform

    CSS3 transforms are faster because they use the GPU. The best case is to not scale the canvas at all, or to use a smaller canvas and scale up rather than a bigger canvas and scale down. For Firefox OS, target 480 x 320 px.

    var scaleX = window.innerWidth / canvas.width;   // how much wider the window is than the canvas
    var scaleY = window.innerHeight / canvas.height; // how much taller the window is than the canvas

    var scaleToFit = Math.min(scaleX, scaleY);   // fits entirely inside the window
    var scaleToCover = Math.max(scaleX, scaleY); // fills the window, possibly cropping

    stage.style.transformOrigin = "0 0"; // scale from top left
    stage.style.transform = "scale(" + scaleToFit + ")";

    See it working in this jsFiddle.

    Nearest-neighbour rendering for scaling pixel-art

    Leading on from the last point, if your game is themed with pixel-art, you should use one of the following techniques when scaling the canvas. The default resizing algorithm creates a blurry effect and ruins the beautiful pixels.

    canvas {
      image-rendering: crisp-edges;
      image-rendering: -moz-crisp-edges;
      image-rendering: -webkit-optimize-contrast;
      -ms-interpolation-mode: nearest-neighbor;
    }

    or

    var context = canvas.getContext('2d');
    context.webkitImageSmoothingEnabled = false;
    context.mozImageSmoothingEnabled = false;
    context.imageSmoothingEnabled = false;

    More documentation is available on MDN for image-rendering.

    CSS for large background images

    If, like most games, you have a static background image, use a plain DIV element with a CSS background property and position it under the canvas. This avoids drawing a large image to the canvas on every tick.

    Multiple canvases for layers

    Similar to the last point, you may find you have some elements that are frequently changing and moving around whereas other things (like UI) never change. An optimization in this situation is to create layers using multiple canvas elements.

    For example, you could create a UI layer that sits on top of everything and is only drawn during user input. You could create a game layer where the frequently updating entities exist, and a background layer for entities that rarely update.
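    A minimal sketch of this layering, assuming three absolutely positioned canvas elements with illustrative IDs, and where backgroundImage and drawEntities() are hypothetical names:

    // Sketch: three stacked canvases acting as layers (IDs are illustrative).
    var bgCtx   = document.getElementById("background-layer").getContext("2d");
    var gameCtx = document.getElementById("game-layer").getContext("2d");
    var uiCtx   = document.getElementById("ui-layer").getContext("2d");

    bgCtx.drawImage(backgroundImage, 0, 0); // drawn once, never cleared

    function render() {
      // Only the game layer is cleared and redrawn every frame.
      gameCtx.clearRect(0, 0, gameCtx.canvas.width, gameCtx.canvas.height);
      drawEntities(gameCtx); // hypothetical helper
      requestAnimationFrame(render);
    }
    render();

    // uiCtx is only redrawn from input handlers, not in the loop.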

    Don’t scale images in drawImage

    Cache the various sizes of your images on offscreen canvases at load time, as opposed to constantly scaling them in drawImage.

    jsPerf – Canvas drawImage Scaling Performance.
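    A sketch of pre-scaling at load time (the function name and usage are illustrative):

    // Scale an image once into an offscreen canvas; reuse the result every frame.
    function createScaledCopy(image, width, height) {
      var off = document.createElement("canvas");
      off.width = width;
      off.height = height;
      off.getContext("2d").drawImage(image, 0, 0, width, height);
      return off; // a canvas is a valid source for drawImage
    }

    // In the game loop, draw the cached copy without any scaling:
    // ctx.drawImage(scaledSprite, x, y);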

    Be careful with heavy physics libraries

    If possible, roll your own physics as libraries like Box2D don’t perform well on low-end Firefox OS devices.

    When asm.js support lands in Firefox OS, Emscripten-compiled libraries can take advantage of near-native performance. More reading in Box2d Revisited.

    Use WebGL instead of Context 2D

    Easier said than done, but giving all the heavy graphics lifting to the GPU will free up the CPU for the greater good. Even though WebGL is 3D, you can use it to draw 2D surfaces. There are some libraries out there that aim to abstract the drawing contexts.

    Minimize Garbage Collection

    JavaScript can spoil us when it comes to memory management. We generally don’t need to worry about memory leaks or conservatively allocating memory. But if we’ve allocated too much and garbage collection occurs in the middle of a frame, that can take up valuable time and result in a visible drop in FPS.

    Pool common objects and classes

    To minimise the amount of objects being cleaned during garbage collection, use a pre-initialised pool of objects and reuse them rather than creating new objects all the time.

    Code example of generic object pool:
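    (The embedded example from the original post is not reproduced here; the following is a minimal illustrative sketch of the same idea.)

    // Minimal object pool sketch – pre-allocate objects and reuse them.
    function Pool(createFn, size) {
      this.createFn = createFn;
      this.free = [];
      for (var i = 0; i < size; i++) {
        this.free.push(createFn());
      }
    }

    Pool.prototype.acquire = function () {
      // Hand out a pre-allocated object when possible instead of allocating a new one.
      return this.free.length ? this.free.pop() : this.createFn();
    };

    Pool.prototype.release = function (obj) {
      this.free.push(obj); // return the object to the pool for reuse
    };

    // Usage:
    // var bulletPool = new Pool(function () { return { x: 0, y: 0, active: false }; }, 100);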

    Avoid internal methods creating garbage

    There are various JavaScript methods that create a new object rather than modifying an existing one. These include Array.slice, Array.splice, and Function.bind.
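    For example, instead of calling Array.slice in a hot loop, you can read the elements you need directly and reuse a scratch array allocated once outside the loop (a sketch; enemies and scratch are illustrative names):

    // Creates a new array every call – garbage for the collector:
    var closest = enemies.slice(0, 3);

    // Reuses a pre-allocated scratch array instead:
    scratch.length = 0;
    for (var i = 0; i < 3 && i < enemies.length; i++) {
      scratch.push(enemies[i]);
    }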

    Read more about JavaScript garbage collection

    Avoid frequent calls to localStorage

    localStorage uses file I/O and blocks the main thread to retrieve and save data. Use an in-memory object to cache the values of localStorage, and defer writes to moments when the player is not mid-game.

    Code example of an abstract storage object:
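    (The embedded example from the original post is not reproduced here; a minimal sketch of such a wrapper might look like this.)

    // Sketch: in-memory cache in front of localStorage to limit file I/O.
    var gameStorage = {
      cache: {},
      get: function (key) {
        if (!(key in this.cache)) {
          this.cache[key] = localStorage.getItem(key); // touch the disk only once
        }
        return this.cache[key];
      },
      set: function (key, value) {
        this.cache[key] = value; // writes stay in memory until flushed
      },
      flush: function () {
        // Call this at a quiet moment, e.g. between levels or on pause.
        for (var key in this.cache) {
          localStorage.setItem(key, this.cache[key]);
        }
      }
    };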

    Async localStorage API with IndexedDB

    IndexedDB is a non-blocking API for storing data on the client, though it may be overkill for small and simple data. Gaia's library for making the localStorage API asynchronous using IndexedDB is available on GitHub: async_storage.js.

    Miscellaneous micro-optimization

    Sometimes, when you've exhausted all your options and it just won't go any faster, you can try some of the micro-optimizations below. Note that these only start to make a difference in heavy usage, when every millisecond counts. Look for them in your hot game loops.

    • Use x | 0 instead of Math.floor (see the sketch after this list)
    • Clear arrays with .length = 0 to avoid creating a new Array
    • Sacrifice some CPU time to avoid creating garbage
    • Use if .. else over switch (jsPerf – switch vs if-else)
    • Use Date.now() over (+ new Date) (jsPerf – Date.now vs new Date().getTime() vs +new Date), or performance.now() for a sub-millisecond solution
    • Use TypedArrays for floats or integers, e.g. vectors and matrices (gl-matrix – a JavaScript matrix and vector library for high-performance WebGL apps)
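    A quick sketch of a few of these in context (particles is an illustrative array):

    var x = 10.7;
    var xi = x | 0;            // 10 – truncates like Math.floor for positive values

    particles.length = 0;      // empty and reuse the array instead of particles = []

    var start = Date.now();    // or performance.now() for sub-millisecond timing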

    Conclusion

    Building for mobile devices and not-so-strong hardware is a good and creative exercise, and we hope you will want to make sure your games work well on all platforms!

  3. Performance with JavaScript String Objects

    This article aims to take a look at the performance of JavaScript engines towards primitive value Strings and Object Strings. It is a showcase of benchmarks related to the excellent article by Kiro Risk, The Wrapper Object. Before proceeding, I would suggest visiting Kiro’s page first as an introduction to this topic.

    The ECMAScript 5.1 Language Specification (PDF link) states at paragraph 4.3.18 about the String object:

    String object: member of the Object type that is an instance of the standard built-in String constructor.

    NOTE: A String object is created by using the String constructor in a new expression, supplying a String value as an argument. The resulting object has an internal property whose value is the String value. A String object can be coerced to a String value by calling the String constructor as a function (15.5.1).

    and David Flanagan's great book "JavaScript: The Definitive Guide" very meticulously describes wrapper objects in section 3.6:

    Strings are not objects, though, so why do they have properties? Whenever you try to refer to a property of a string s, JavaScript converts the string value to an object as if by calling new String(s). [...] Once the property has been resolved, the newly created object is discarded. (Implementations are not required to actually create and discard this transient object: they must behave as if they do, however.)

    It is important to note the final, parenthesised sentence of the quote above: implementations only have to behave as if the transient wrapper object were created and discarded. Basically, how a new String object is created is implementation specific. As such, an obvious question one could ask is: "since a primitive value String must be coerced to a String Object when trying to access a property, for example str.length, would it be faster if we had instead declared the variable as a String Object?" In other words, could declaring a variable as a String Object, i.e. var str = new String("hello"), rather than as a primitive value String, i.e. var str = "hello", potentially save the JS engine from having to create a new String Object on the fly in order to access its properties?

    Those who deal with the implementation of ECMAScript standards to JS engines already know the answer, but it’s worth having a deeper look at the common suggestion “Do not create numbers or strings using the ‘new’ operator”.

    Our showcase and objective

    For our showcase, we will use mainly Firefox and Chrome; the results, though, would be similar if we chose any other web browser, as we are focusing not on a speed comparison between two different browser engines, but at a speed comparison between two different versions of the source code on each browser (one version having a primitive value string, and the other a String Object). In addition, we are interested in how the same cases compare in speed to subsequent versions of the same browser. The first sample of benchmarks was collected on the same machine, and then other machines with a different OS/hardware specs were added in order to validate the speed numbers.

    The scenario

    For the benchmarks, the case is rather simple; we declare two string variables, one as a primitive value string and the other as an Object String, both of which have the same value:

      var strprimitive = "Hello";
      var strobject    = new String("Hello");

    and then we perform the same kind of tasks on them. (notice that in the jsperf pages strprimitive = str1, and strobject = str2)

    1. length property

      var i = strprimitive.length;
      var k = strobject.length;

    If we assume that during runtime the wrapper object created from the primitive value string strprimitive, is treated equally with the object string strobject by the JavaScript engine in terms of performance, then we should expect to see the same latency while trying to access each variable’s length property. Yet, as we can see in the following bar chart, accessing the length property is a lot faster on the primitive value string strprimitive, than in the object string strobject.


    (Primitive value string vs Wrapper Object String – length, on jsPerf)

    Actually, on Chrome 24.0.1285 calling strprimitive.length is 2.5x faster than calling strobject.length, and on Firefox 17 it is about 2x faster (but having more operations per second). Consequently, we realize that the corresponding browser JavaScript engines apply some “short paths” to access the length property when dealing with primitive string values, with special code blocks for each case.

    In the SpiderMonkey JS engine, for example, the pseudo-code that deals with the “get property” operation looks something like the following:

      // direct check for the "length" property
      if (typeof(value) == "string" && property == "length") {
        return StringLength(value);
      }
      // generalized code form for properties
      object = ToObject(value);
      return InternalGetProperty(object, property);

    Thus, when you request a property on a string primitive, and the property name is “length”, the engine immediately just returns its length, avoiding the full property lookup as well as the temporary wrapper object creation. Unless we add a property/method to the String.prototype requesting |this|, like so:

      String.prototype.getThis = function () { return this; }
      console.log("hello".getThis());

    then no wrapper object will be created when accessing String.prototype methods, such as String.prototype.valueOf(). Each JS engine has similar optimizations embedded in order to produce faster results.

    2. charAt() method

      var i = strprimitive.charAt(0);
      var k = strobject["0"];


    (Primitive value string vs Wrapper Object String – charAt(), on jsPerf)

    This benchmark clearly verifies the previous statement, as we can see that getting the value of the first string character in Firefox 20 is substantially faster with strprimitive than with strobject – about 70x faster. Similar results apply to other browsers as well, though at different speeds. Also, notice the differences between incremental Firefox versions; this is just another indicator of how small code variations can affect the JS engine's speed for certain runtime calls.

    3. indexOf() method

      var i = strprimitive.indexOf("e");
      var k = strobject.indexOf("e");


    (Primitive value string vs Wrapper Object String – IndexOf(), on jsPerf)

    Similarly in this case, we can see that the primitive value string strprimitive achieves more operations per second than strobject. In addition, the JS engine differences across sequential browser versions produce a variety of measurements.

    4. match() method

    Since there are similar results here too, to save some space, you can click the source link to view the benchmark.

    (Primitive value string vs Wrapper Object String – match(), on jsPerf)

    5. replace() method

    (Primitive value string vs Wrapper Object String – replace(), on jsPerf)

    6. toUpperCase() method

    (Primitive value string vs Wrapper Object String – toUpperCase(), on jsPerf)

    7. valueOf() method

      var i = strprimitive.valueOf();
      var k = strobject.valueOf();

    At this point it starts to get more interesting. So, what happens when we try to call the most common method of a string, its valueOf()? It seems like most browsers have a mechanism to determine whether it's a primitive value string or an Object String, and thus use a much faster way to get its value; surprisingly enough, Firefox versions up to v20 seem to favour the Object String method call on strobject, with a 7x speed increase.


    (Primitive value string vs Wrapper Object String – valueOf(), on jsPerf)

    It's also worth mentioning that Chrome 22.0.1229 also seems to have favoured the Object String, while in version 23.0.1271 a new way to get the content of primitive value strings was implemented.

    A simpler way to run this benchmark in your browser’s console is described in the comment of the jsperf page.

    8. Adding two strings

      var i = strprimitive + " there";
      var k = strobject + " there";


    (Primitive string vs Wrapper Object String – get str value, on jsPerf)

    Let's now try concatenating the strings with a primitive string value. As the chart shows, Firefox and Chrome present a 2.8x and 2x speed increase respectively in favour of strprimitive, compared with concatenating the Object string strobject with another string value.

    9. Adding two strings with valueOf()

      var i = strprimitive.valueOf() + " there";
      var k = strobject.valueOf() + " there";


    (Primitive string vs Wrapper Object String – str valueOf, on jsPerf)

    Here we can see again that Firefox favours strobject.valueOf(), since for strprimitive.valueOf() it moves up the inheritance tree and consequently creates a new wrapper object for strprimitive. The effect this chain of events has on performance can also be seen in the next case.

    10. for-in wrapper object

      var i = "";
      for (var temp in strprimitive) { i += strprimitive[temp]; }
     
      var k = "";
      for (var temp in strobject) { k += strobject[temp]; }

    This benchmark will incrementally construct the string’s value through a loop to another variable. In the for-in loop, the expression to be evaluated is normally an object, but if the expression is a primitive value, then this value gets coerced to its equivalent wrapper object. Of course, this is not a recommended method to get the value of a string, but it is one of the many ways a wrapper object can be created, and thus it is worth mentioning.


    (Primitive string vs Wrapper Object String – Properties, on jsPerf)

    As expected, Chrome seems to favour the primitive value string strprimitive, while Firefox and Safari seem to favour the object string strobject. In case this seems fairly typical by now, let's move on to the last benchmark.

    11. Adding two strings with an Object String

      var str3 = new String(" there");
     
      var i = strprimitive + str3;
      var k = strobject + str3;


    (Primitive string vs Wrapper Object String – 2 str values, on jsPerf)

    In the previous examples, we have seen that Firefox versions offer better performance if our initial string is an Object String like strobject, so it would seem normal to expect the same when adding strobject to another Object String, which is basically the same thing. It is worth noticing, though, that when adding a string to an Object String, it is actually considerably faster in Firefox if we use strprimitive instead of strobject. This proves once more how source code variations, such as a patch for a bug, lead to different benchmark numbers.

    Conclusion

    Based on the benchmarks described above, we have seen how subtle differences in our string declarations can produce a range of different performance results. It is recommended that you continue to declare your string variables as you normally do, unless there is a very specific reason for you to create instances of the String Object. Also, note that a browser's overall performance, particularly when dealing with the DOM, is not based only on the page's JS performance; there is a lot more to a browser than its JS engine.

    Feedback comments are much appreciated. Thanks :-)

  4. No Single Benchmark for the Web

    Google released a new JavaScript benchmark a few days ago called Octane. New benchmarks are always welcome, as they push browsers to new levels of performance in new areas. I was particularly pleased to see the inclusion of pdf.js, which unlike most benchmarks is real-world code, as well as the GB Emulator which is a very interesting type of performance-intensive code. However, every benchmark suite has limitations, and it is worth keeping that in mind, especially given the new benchmark’s title in the announcement and in the project page as “The JavaScript Benchmark Suite for the Modern Web” – which is a high goal to set for a single benchmark.

    Now, every benchmark must pick some code to run out of all the possible code out there, and picking representative code is very hard. So it is always understandable that benchmarks are never 100% representative of the code that exists and is important. However, even taking that into account, I have concerns with some of the code selected to appear in Octane: There are better versions of two of the five new benchmarks, and performance on those better versions is very different than the versions that do appear in Octane.

    Benchmarking black boxes

    One of the new benchmarks in Octane is "Mandreel", which is the Bullet physics engine compiled by Mandreel, a C++ to JS compiler. Bullet is definitely interesting code to include in a benchmark. However, the choice of Mandreel's port is problematic. One issue is that Mandreel is a closed-source compiler, a black box, making it hard to learn from it what kind of code is efficient and what should be optimized. We just have a generated code dump, and since Mandreel is a commercial product, it would cost money for anyone to reproduce those results with modifications to the original C++ being run or with a different codebase. We also do not have the source code compiled for this particular benchmark: Bullet itself is open source, but we don't know the specific version compiled here, nor do we have the benchmark driver code that uses Bullet, both of which would be necessary to reproduce these results using another compiler.

    An alternative could have been to use Bullet compiled by Emscripten, an open source compiler that similarly compiles C++ to JS (disclaimer: I am an Emscripten dev). Aside from being open, Emscripten also has a port of Bullet (a demo can be seen here) that can interact in a natural way with regular JS, making it usable in normal web games and not just compiled ones, unlike Mandreel’s port. This is another reason for preferring the Emscripten port of Bullet instead.

    Is Mandreel representative of the web?

    The motivation Google gives for including Mandreel in Octane is that Mandreel is “used in countless web-based games.” It seems that Mandreel is primarily used in the Chrome Web Store (CWS) and less outside in the normal web. The quoted description above is technically accurate: Mandreel games in the CWS are indeed “web-based” (written in JS+HTML+WebGL) even if they are not actually “on the web”, where by “on the web” I mean outside of the walled garden of the CWS and in the normal web that all browsers can access. And it makes perfect sense that Google cares about the performance of code that runs in the CWS, since Google runs and profits from that store. But it does call into question the title of the Octane benchmark as “The JavaScript Benchmark Suite for the Modern Web.”

    Performance of generated code is highly variable

    With that said, it is still fair to say that compiler-generated code is increasing in importance on the web, so some benchmark must be chosen to represent it. The question is how much the specific benchmark chosen represents compiled code in general. On the one hand the compiled output of Mandreel and Emscripten is quite similar: both use large typed arrays, the same Relooper algorithm, etc., so we could expect performance to be similar. That doesn’t seem to always be the case, though. When we compare Bullet compiled by Mandreel with Bullet compiled by Emscripten – I made a benchmark of that a while back, it’s available here – then on my MacBook pro, Chrome is 1.5x slower than Firefox on the Emscripten version (that is, Chrome takes 1.5 times as long to execute in this case), but 1.5x faster on the Mandreel version that Google chose to include in Octane (that is, Chrome receives a score 1.5 times larger in this case). (I tested with Chrome Dev, which is the latest version available on Linux, and Firefox Aurora which is the best parallel to it. If you run the tests yourself, note that in the Emscripten version smaller numbers are better while the opposite is true in the Octane version.)

    (An aside, not only does Chrome have trouble running the Emscripten version quickly, but that benchmark also exposes a bug in Chrome where the tab consistently crashes when the benchmark is reloaded – possibly a dupe of this open issue. A serious problem of that nature, that does not happen on the Mandreel-compiled version, could indicate that the two were optimized differently as a result of having received different amounts of focus by developers.)

    Another issue with the Mandreel benchmark is the name. Calling it Mandreel implies it represents all Mandreel-generated code, but there can be huge differences in performance depending on what C/C++ code is compiled, even with a single compiler. For example, Chrome can be 10-15x slower than Firefox on some Emscripten-compiled benchmarks (example 1, example 2) while on others it is quite speedy (example). So calling the benchmark “Mandreel-Bullet” would have been better, to indicate it is just one Mandreel-compiled codebase, which cannot represent all compiled code.

    Box2DWeb is not the best port of Box2D

    “Box2DWeb” is another new benchmark in Octane, in which a specific port of Box2D to JavaScript is run, namely Box2DWeb. However, as seen here (see also this), Box2DWeb is significantly slower than other ports of Box2D to the web, specifically Mandreel and Emscripten’s ports from the original C++ that Box2D is written in. Now, you can justify excluding the Mandreel version because it cannot be used as a library from normal JS (just as with Bullet before), but the Emscripten-compiled one does not have that limitation and can be found here. (Demos can be seen here and here.)

    Another reason for preferring the Emscripten version is that it uses Box2D 2.2, whereas Box2DWeb uses the older Box2D 2.1. Compiling the C++ code directly lets the Emscripten port stay up to date with the latest upstream features and improvements far more easily.

    It is possible that Google surveyed websites and found that the slower Box2DWeb was more popular, although I have no idea whether that was the case, but if so that would partially justify preferring the slower version. However, even if that were true, I would argue that it would be better to use the Emscripten version because as mentioned earlier it is faster and more up to date. Another factor to consider is that the version included in Octane will get attention and likely an increase in adoption, which makes it all the more important to select the one that is best for the web.

    I put up a benchmark of Emscripten-compiled Box2D here, and on my machine Chrome is 3x slower than Firefox on that benchmark, but 1.6x faster on the version Google chose to include in Octane. This is a similar situation to what we saw earlier with the Mandreel/Bullet benchmark and it raises the same questions about how representative a single benchmark can be.

    Summary

    As mentioned at the beginning, all benchmarks are imperfect. And the fact that the specific code samples in Octane are ones that Chrome runs well does not mean the code was chosen for that reason: The opposite causation is far more likely, that Google chose to focus on optimizing those and in time made Chrome fast on them. And that is how things properly work – you pick something to optimize for, and then optimize for it.

    However, in 2 of the 5 new benchmarks in Octane there are good reasons for preferring alternative, better versions of those two benchmarks as we saw before. Now, it is possible that when Google started to optimize for Octane, the better options were not yet available – I don’t know when Google started that effort – but the fact that better alternatives exist in the present makes substantial parts of Octane appear less relevant today. Of course, if performance on the better versions was not much different than the Octane versions then this would not matter, but as we saw there were in fact significant differences when comparing browsers on those versions: One browser could be significantly better on one version of the same benchmark but significantly slower on another.

    What all of this shows is that there cannot be a single benchmark for the modern web. There are simply too many kinds of code, and even when we focus on one of them, different benchmarks of that particular task can behave very differently.

    With that said, we shouldn’t be overly skeptical: Benchmarks are useful. We need benchmarks to drive us forward, and Octane is an interesting new benchmark that, even with the problems mentioned above, does contain good ideas and is worth focusing on. But we should always be aware of the limitations of any single benchmark, especially when a single benchmark claims to represent the entire modern web.

     

  5. Getting snappy – performance optimizations in Firefox 13

    Back in the fall of 2011, we took a targeted look at Firefox responsiveness issues. We identified a number of short term projects that together could achieve significant responsiveness improvements in day-to-day Firefox usage. Project Snappy kicked off at the end of the year with the goal of improving Firefox responsiveness.

    Although Snappy first contributed fixes to Firefox 11, Snappy’s most noticeable contributions to date are landing with Firefox 13. Currently in beta, this release includes a number of responsiveness related fixes, most notably tabs-on-demand, cycle collector improvements, and start-up optimization.

    Tabs-on-Demand

    Tabs-on-demand is a feature that reduces start-up time for Firefox windows with many tabs. In Firefox 12, all tabs are loaded on start-up; for windows with many tabs this may cause a delay before you can interact with Firefox, as each tab must load its content. In Firefox 13, only the active tab is loaded. Loading of background tabs is deferred until a tab is selected. This results in Firefox starting faster, as tabs-on-demand reduces processing requirements, network usage, and memory consumption.

    Cycle Collector

    As you interact with the browser and Web content, memory is allocated as needed. The Firefox cycle collector works to automatically free some of this memory when it is no longer needed. This action reduces Firefox's memory usage. In Firefox 13, the cycle collector is more efficient, spending less time examining memory that is still in use, which results in fewer pauses as you use Firefox.

    Start-up

    Firefox start-up time is visible to all users. Our investigation into start-up has identified a number of unoptimized routines in the code that executes before what we call “first paint”. “First paint” signifies when the Firefox user interface is first visible on your screen. In Firefox 13 we have optimized file calls, audio sessions, drag and drop, and overall IO, just to name a few. We are continuing to profile the Firefox start-up sequence to identify further optimizations that can be made in future releases.

    There are numerous other Snappy fixes in Firefox 13 including significant improvements to IO contention, font enumeration, and livemark overhead. All of these fixes contribute to a more responsive experience. We are already working on further responsiveness fixes for future Firefox releases. You can expect to see Snappy improvements in upcoming releases in areas such as memory usage, shutdown time, network cache and connections, menus, and graphics.

  6. There is no simple solution for local storage

    TL;DR: we have to stop advocating localStorage as a great opportunity for storing data as it performs badly. Sadly enough the alternatives are not nearly as supported or simple to implement.

    When it comes to web development you will always encounter things that sound too good to be true. Sometimes they are good, and all that stops us from using them is our habit of being suspicious about *everything* as developers. In a lot of cases, however, they really are not as good as they seem, but we only find out after using them for a while that we are actually "doing it wrong".

    One such case is local storage. There is a storage specification (falsely attributed to HTML5 in a lot of examples) with an incredibly simple API that was heralded as the cookie killer when it came out. All you have to do to store content on the user's machine is to access the window.localStorage object (or sessionStorage if you don't need the data to be stored longer than the current browser session):

    localStorage.setItem( 'outofsight', 'my data' );
    console.log( localStorage.getItem( 'outofsight' ) ); // -> 'my data'

    This local storage solution has a few very tempting features for web developers:

    • It is dead simple
    • It uses strings for storage instead of complex databases (and you can store more complex data using JSON encoding – see the sketch after this list)
    • It is well supported by browsers
    • It is endorsed by a lot of companies (and was heralded as amazing when iPhones came out)
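    For example, structured data can be round-tripped through a JSON string (a minimal sketch):

    var settings = { volume: 0.8, muted: false };
    localStorage.setItem( 'settings', JSON.stringify( settings ) );

    var restored = JSON.parse( localStorage.getItem( 'settings' ) );
    console.log( restored.volume ); // -> 0.8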

    A few known issues with it are that there is no clean way to detect when you reach the limit of local storage and there is no cross-browser way to ask for more space. There are also more obscure issues around sessions and HTTPS, but that is just the tip of the iceberg.

    The main issue: terrible performance

    localStorage also has a lot of drawbacks that aren't well documented and certainly not covered as much in "HTML5 tutorials". Performance-oriented developers, especially, are very much against its use.

    When we covered using localStorage to store images and files a few weeks ago, it kicked off a massive thread of comments and an even longer internal mailing list thread about the evils of localStorage. The main issues are:

    • localStorage is synchronous in nature, meaning when it loads it can block the main document from rendering
    • localStorage does file I/O meaning it writes to your hard drive, which can take long depending on what your system does (indexing, virus scanning…)
    • On a developer machine these issues can look deceptively minor, as the operating system caches these requests – for an end user on the web they could mean a few seconds of waiting, during which the web site stalls
    • In order to appear snappy, web browsers load the data into memory on the first request – which could mean a lot of memory use if lots of tabs do it
    • localStorage is persistent. If you don’t use a service or never visit a web site again, the data is still loaded when you start the browser

    This is covered in detail in a follow-up blog post by Taras Glek of the Mozilla performance team and also by Andrea Giammarchi of Nokia.

    In essence this means that a lot of articles saying you can use localStorage for better performance are just wrong.

    Alternatives

    Of course, browsers have always offered ways to store local data, some of which you've probably never heard of, as shown by evercookie (I think my favourite, when it comes to the "evil genius with no real-world use" factor, is the force-cached PNG image read out via canvas). In the internal discussions there was a massive thrust towards advocating IndexedDB for your solutions instead of localStorage. We then published an article on how to store images and files in IndexedDB and found a few issues – most actually related to ease of use and user interaction:

    • IndexedDB is a full-fledged DB that requires all the steps a SQL DB needs to read and write data – there is no simple key/value layer like localStorage available
    • IndexedDB asks the user for permission to store data which can spook them
    • The browser support is not at all the same as localStorage, right now IndexedDB is supported in IE10, Firefox and Chrome and there are differences in their implementations
    • Safari, Opera, iOS, Opera Mobile, Android Browser favour WebSQL instead (which is yet another standard that has been officially deprecated by the W3C)

    As always when there are differences in implementation someone will come up with an abstraction layer to work around that. Parashuram Narasimhan does a great job with that – even providing a jQuery plugin. It feels wrong though that we as implementers have to use these. It is the HTML5 video debate of WebM vs. H264 all over again.

    Now what?

    There is no doubt that real database solutions and their asynchronous nature are the better option in terms of performance. They are also more mature and don't have the "shortcut hack" feel of localStorage. On the other hand, they are hard to use in comparison, we already have a lot of solutions out there using localStorage, and asking the user for permission to store local files is unacceptable for some implementations in terms of UX.

    The answer is that there is no simple solution for storing data on the end users’ machines and we should stop advocating localStorage as a performance boost. What we have to find is a solution that makes everybody happy and doesn’t break the current implementations. This might prove hard to work around. Here are some ideas:

    • Build a polyfill library that overrides the localStorage API and stores the content in IndexedDB/WebSQL instead? This is dirty and doesn’t work around the issue of the user being asked for permission
    • Implement localStorage in an asynchronous fashion in browsers – actively disregarding the spec? (this could set a dangerous precedent though)
    • Change the localStorage spec to store asynchronously instead of synchronously? We could also extend it to have a proper getStorageSpace interface and allow for native JSON support
    • Define a new standard that allows browser vendors to map the new API to the existing supported API that matches the best for the use case?

    We need to fix this as it doesn’t make sense to store things locally and sacrifice performance at the same time. This is a great example of how new web standards give us much more power but also make us face issues we didn’t have to deal with before. With more access to the OS, we also have to tread more carefully.

  7. Faster Canvas Pixel Manipulation with Typed Arrays

    Edit: See the section about Endianness below.

    Typed Arrays can significantly increase the pixel manipulation performance of your HTML5 2D canvas Web apps. This is of particular importance to developers looking to use HTML5 for making browser-based games.

    This is a guest post by Andrew J. Baker. Andrew is a professional software engineer currently working for Ibuildings UK where his time is divided equally between front- and back-end enterprise Web development. He is a principal member of the browser-based games channel #bbg on Freenode, spoke at the first HTML5 games conference in September 2011, and is a scout for Mozilla’s WebFWD innovation accelerator.


    Eschewing the higher-level methods available for drawing images and primitives to a canvas, we’re going to get down and dirty, manipulating pixels using ImageData.

    Conventional 8-bit Pixel Manipulation

    The following example demonstrates pixel manipulation using image data to generate a greyscale moire pattern on the canvas.

    JSFiddle demo.

    Let’s break it down.

    First, we obtain a reference to the canvas element that has an id attribute of canvas from the DOM.

    var canvas = document.getElementById('canvas');

    The next two lines might appear to be a micro-optimisation, and in truth they are. But given the number of times the canvas width and height are accessed within the main loop, copying the values of canvas.width and canvas.height to the variables canvasWidth and canvasHeight respectively can have a noticeable effect on performance.

    var canvasWidth  = canvas.width;
    var canvasHeight = canvas.height;

    We now need to get a reference to the 2D context of the canvas.

    var ctx = canvas.getContext('2d');

    Armed with a reference to the 2D context of the canvas, we can now obtain a reference to the canvas’ image data. Note that here we get the image data for the entire canvas, though this isn’t always necessary.

    var imageData = ctx.getImageData(0, 0, canvasWidth, canvasHeight);

    Again, this is another seemingly innocuous micro-optimisation – getting a direct reference to the raw pixel data – that can also have a noticeable effect on performance.

    var data = imageData.data;

    Now comes the main body of code. There are two loops, one nested inside the other. The outer loop iterates over the y axis and the inner loop iterates over the x axis.

    for (var y = 0; y < canvasHeight; ++y) {
        for (var x = 0; x < canvasWidth; ++x) {

    We draw pixels to image data in a top-to-bottom, left-to-right sequence. Remember, the y axis is inverted, so the origin (0,0) refers to the top, left-hand corner of the canvas.

    The ImageData.data property referenced by the variable data is a one-dimensional array of integers, where each element is in the range 0..255. ImageData.data is arranged in a repeating sequence so that each element refers to an individual channel. That repeating sequence is as follows:

    data[0]  = red channel of first pixel on first row
    data[1]  = green channel of first pixel on first row
    data[2]  = blue channel of first pixel on first row
    data[3]  = alpha channel of first pixel on first row
     
    data[4]  = red channel of second pixel on first row
    data[5]  = green channel of second pixel on first row
    data[6]  = blue channel of second pixel on first row
    data[7]  = alpha channel of second pixel on first row
     
    data[8]  = red channel of third pixel on first row
    data[9]  = green channel of third pixel on first row
    data[10] = blue channel of third pixel on first row
    data[11] = alpha channel of third pixel on first row
     
     
    ...

    Before we can plot a pixel, we must translate the x and y coordinates into an index representing the offset of the first channel within the one-dimensional array.

            var index = (y * canvasWidth + x) * 4;

    We multiply the y coordinate by the width of the canvas, add the x coordinate, then multiply by four. We must multiply by four because there are four elements per pixel, one for each channel.

    Now we calculate the colour of the pixel.

    To generate the moire pattern, we multiply the x coordinate by the y coordinate then bitwise AND the result with hexadecimal 0xff (decimal 255) to ensure that the value is in the range 0..255.

            var value = x * y & 0xff;

    Greyscale colours have red, green and blue channels with identical values. So we assign the same value to each of the red, green and blue channels. The sequence of the one-dimensional array requires us to assign a value for the red channel at index, the green channel at index + 1, and the blue channel at index + 2.

            data[index]   = value;	// red
            data[++index] = value;	// green
            data[++index] = value;	// blue

    Here we can safely increment index, because we recalculate it at the start of each iteration of the inner loop.

    The last channel we need to take into account is the alpha channel at index + 3. To ensure that the plotted pixel is 100% opaque, we set the alpha channel to a value of 255 and terminate both loops.

            data[++index] = 255;	// alpha
        }
    }

    For the altered image data to appear in the canvas, we must put the image data at the origin (0,0).

    ctx.putImageData(imageData, 0, 0);

    Note that because data is a reference to imageData.data, we don’t need to explicitly reassign it.

    The ImageData Object

    At time of writing this article, the HTML5 specification is still in a state of flux.

    Earlier revisions of the HTML5 specification declared the ImageData object like this:

    interface ImageData {
        readonly attribute unsigned long width;
        readonly attribute unsigned long height;
        readonly attribute CanvasPixelArray data;
    }

    With the introduction of typed arrays, the type of the data attribute has altered from CanvasPixelArray to Uint8ClampedArray and now looks like this:

    interface ImageData {
        readonly attribute unsigned long width;
        readonly attribute unsigned long height;
        readonly attribute Uint8ClampedArray data;
    }

    At first glance, this doesn’t appear to offer us any great improvement, aside from using a type that is also used elsewhere within the HTML5 specification.

    But, we’re now going to show you how you can leverage the increased flexibility introduced by deprecating CanvasPixelArray in favour of Uint8ClampedArray.

    Previously, we were forced to write colour values to the image data one-dimensional array a single channel at a time.

    Taking advantage of typed arrays and the ArrayBuffer and ArrayBufferView objects, we can write colour values to the image data array an entire pixel at a time!

    Faster 32-bit Pixel Manipulation

    Here’s an example that replicates the functionality of the previous example, but uses unsigned 32-bit writes instead.

    NOTE: If your browser doesn’t use Uint8ClampedArray as the type of the data property of the ImageData object, this example won’t work!

    JSFiddle demo.

    The first deviation from the original example begins with the instantiation of an ArrayBuffer called buf.

    var buf = new ArrayBuffer(imageData.data.length);

    This ArrayBuffer will be used to temporarily hold the contents of the image data.

    Next we create two ArrayBuffer views. One that allows us to view buf as a one-dimensional array of unsigned 8-bit values and another that allows us to view buf as a one-dimensional array of unsigned 32-bit values.

    var buf8 = new Uint8ClampedArray(buf);
    var data = new Uint32Array(buf);

    Don’t be misled by the term ‘view’. Both buf8 and data can be read from and written to. More information about ArrayBufferView is available on MDN.

    The next alteration is to the body of the inner loop. We no longer need to calculate the index in a local variable so we jump straight into calculating the value used to populate the red, green, and blue channels as we did before.

    Once calculated, we can proceed to plot the pixel using only one assignment. The values of the red, green, and blue channels, along with the alpha channel, are packed into a single integer using bitwise left-shifts and bitwise ORs.

            data[y * canvasWidth + x] =
                (255   << 24) |	// alpha
                (value << 16) |	// blue
                (value <<  8) |	// green
                 value;		// red
        }
    }

    Because we’re dealing with unsigned 32-bit values now, there’s no need to multiply the offset by four.

    Having terminated both loops, we must now copy the contents of the ArrayBuffer buf into imageData.data. We do this by calling the Uint8ClampedArray.set() method on imageData.data, passing buf8, our 8-bit view of the buffer, as the parameter.

    imageData.data.set(buf8);

    Finally, we use putImageData() to copy the image data back to the canvas.
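
    As in the per-channel version, this is a single call:

    ctx.putImageData(imageData, 0, 0);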

    Testing Performance

    We’ve told you that using typed arrays for pixel manipulation is faster. We really should test it though, and that’s what this jsperf test does.

    At the time of writing, 32-bit pixel manipulation is indeed faster.

    Wrapping Up

    There won’t always be occasions where you need to resort to manipulating canvas at the pixel level, but when you do, be sure to check out typed arrays for a potential performance increase.

    EDIT: Endianness

    As has quite rightly been highlighted in the comments, the code originally presented does not correctly account for the endianness of the processor on which the JavaScript is being executed.

    The code below, however, rectifies this oversight by testing the endianness of the target processor and then executing a different version of the main loop dependent on whether the processor is big- or little-endian.
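
    In outline, the test can be carried out with the same typed-array machinery we’re already using. The sketch below is illustrative rather than a copy of the demo code, and assumes the same data, canvasWidth, x, y and value variables as the main loop:

    // Write a known 32-bit value and inspect the first byte to determine
    // the endianness of the underlying processor.
    var endianBuf = new ArrayBuffer(4);
    new Uint32Array(endianBuf)[0] = 0x0a0b0c0d;
    var isLittleEndian = (new Uint8Array(endianBuf)[0] === 0x0d);

    // Pack the channel bytes in the appropriate order for each pixel.
    if (isLittleEndian) {
        data[y * canvasWidth + x] =
            (255   << 24) |	// alpha
            (value << 16) |	// blue
            (value <<  8) |	// green
             value;		// red
    } else {
        data[y * canvasWidth + x] =
            (value << 24) |	// red
            (value << 16) |	// green
            (value <<  8) |	// blue
             255;		// alpha
    }

    In practice the endianness check runs once, outside the loops, and selects between two versions of the main loop rather than branching on every pixel.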

    JSFiddle demo.

    A corresponding jsperf test for this amended code has also been written and shows near-identical results to the original jsperf test. Therefore, our final conclusion remains the same.

    Many thanks to all commenters and testers.

  8. Firefox 7 is lean and fast

    Based on a blog post originally posted here by Nicholas Nethercote, Firefox Developer.

    tl;dr
    Firefox 7 now uses much less memory than previous versions: often 20% to 30% less, and sometimes as much as 50% less. This means that Firefox and the websites you use will be snappier, more responsive, and suffer fewer pauses. It also means that Firefox is less likely to crash or abort due to running out of memory.

    These benefits are most noticeable if you do any of the following:
    - keep Firefox open for a long time;
    - have many tabs open at once, particularly tabs with many images;
    - view web pages with large amounts of text;
    - use Firefox on Windows;
    - use Firefox at the same time as other programs that use lots of memory.

    Background

    Mozilla engineers started an effort called MemShrink, the aim of which is to improve Firefox’s speed and stability by reducing its memory usage. A great deal of progress has been made, and thanks to Firefox’s faster development cycle, each improvement will make its way into a final release in only 12–18 weeks. The newest update to Firefox is the first general release to benefit from MemShrink’s successes, and the benefits are significant.

    Quantifying the improvements
    Measuring memory usage is difficult: there are no standard benchmarks, there are several different metrics you can use, and memory usage varies enormously depending on what the browser is doing. Someone who usually has only a handful of tabs open will have an entirely different experience from someone who usually has hundreds of tabs open. (This latter case is not uncommon, by the way, even though the idea of anyone having that many tabs open triggers astonishment and disbelief in many people. E.g. see the comment threads here and here.)

    Endurance tests
    Dave Hunt and others have been using the MozMill add-on to perform “endurance tests”, where they open and close large numbers of websites and track memory usage in great detail. Dave recently performed an endurance test comparison of development versions of Firefox, repeatedly opening and closing pages from 100 widely used websites in 30 tabs.

    [The following numbers were run while the most current version of Firefox was in Beta and capture the average and peak “resident” memory usage for each browser version over five runs of the tests. “Resident” memory usage is the amount of physical RAM that is being used by Firefox, and is thus arguably the best measure of real machine resources being used.]

    [Charts: average and peak resident memory usage for each Firefox version across the five test runs.]

    The measurements varied significantly between runs. If we do a pair-wise comparison of runs, we see the following relative reductions in memory usage:

    Minimum resident: 1.1% to 23.5% (median 6.6%)
    Maximum resident: -3.5% to 17.9% (median 9.6%)
    Average resident: 4.4% to 27.3% (median 20.0%)

    The following two graphs show how memory usage varied over time during Run 1 for each version. Firefox 6’s graph is first, with the latest version second. (Note: Compare only the purple “resident” lines; the meaning of the green “explicit” line changed between the versions and so the two green lines cannot be sensibly compared.)
    Firefox 7 is clearly much better; its graph is lower and shows less variation.

    [Graphs: memory usage over time during Run 1; Firefox 6 first, Firefox 7 second.]


    MemBench

    Gregor Wagner has a memory stress test called MemBench. It opens 150 websites in succession, one per tab, with a 1.5 second gap between each site. The sites are mostly drawn from Alexa’s Top sites list. I ran this test on 64-bit builds of Firefox 6 and 7 on my Ubuntu Linux machine, which has 16GB of RAM. Each time, I let the stress test complete and then opened about:memory to get measurements for the peak resident usage. Then I hit the “Minimize memory usage” button in about:memory several times until the numbers stabilized again, and then re-measured the resident usage. (Hitting this button is not something normal users do, but it’s useful for testing purposes because it causes Firefox to immediately free up memory that would eventually be freed when garbage collection runs.)

    For Firefox 6, the peak resident usage was 2,028 MB and the final resident usage was 669 MB. For Firefox 7, the peak usage was 1,851 MB (an 8.7% reduction) and the final usage was 321 MB (a 52.0% reduction). This latter number clearly shows that fragmentation is a much smaller problem in Firefox 7.
    (On a related note, Gregor recently measured cutting-edge development versions of Firefox and Google Chrome on MemBench.)


    Conclusion

    Obviously, these tests are synthetic and do not match exactly how users actually use Firefox. (Improved benchmarking is one thing we’re working on as part of MemShrink, but we’ve got a long way to go.) Nonetheless, the basic operations (opening and closing web pages in tabs) are the same, and we expect the improvements in real usage will mirror improvements in the tests.

    This means that users should see Firefox 7 using less memory than earlier versions — often 20% to 30% less, and sometimes as much as 50% less — though the improvements will depend on the exact workload. Indeed, we have had lots of feedback from early users that the latest Firefox update feels faster, is more responsive, has fewer pauses, and is generally more pleasant to use than previous versions.

    Mozilla’s MemShrink efforts are continuing. The endurance test results above show that the Beta version of Firefox already has even better memory usage, and I expect we’ll continue to make further improvements as time goes on.