Mozilla

JavaScript speedups in Firefox 3.6

This post was written by David Mandelin who works on Mozilla’s JavaScript team.

Firefox 3.5 introduced TraceMonkey, our new JavaScript engine that traces loops and JIT compiles them to native (x86/ARM) code. Many JavaScript programs ran 3-4x faster in TraceMonkey compared to Firefox 3. (See our previous article for technical details.)

For JavaScript performance in Firefox 3.6, we focused on the areas that we thought needed further improvement the most:

  • Some JavaScript code was not trace-compiled in Firefox 3.5. Tracing was disabled by default for Firefox UI JavaScript and add-on JavaScript, so those programs did not benefit from tracing. Also, many advanced JavaScript features were not trace-compiled. For Firefox 3.6, we wanted to trace more programs and more JS features.
  • Animations coded with JavaScript were often choppy because of garbage collection pauses. We wanted to improve GC performance to make pauses shorter and animations smoother.

In this article, I’ll explain the most important JS performance improvements that come with Firefox 3.6. I’ll focus on listing what kinds of JS code get faster, including sample programs that show the improvements Fx3.6 makes over Fx3.5.

JIT for Browser UI JavaScript

Firefox runs JavaScript code in one of two contexts: content and chrome (no relation to Google Chrome). JavaScript that is part of web content runs in a content context. JavaScript that is part of the browser UI or browser add-ons runs in a chrome context and has extra privileges. For example, chrome JS can alter the main browser UI, but content JS is not allowed to.

The TraceMonkey JIT can be enabled or disabled separately for content and chrome JS using about:config. Because bugs affecting chrome JS are a greater risk for security and reliability, in Firefox 3.5 we chose to disable the JIT for chrome JS by default. After extensive testing, we’ve decided to enable the JIT for chrome JS by default in Fx3.6, something we did not have time to fully investigate for Fx3.5. Turning on the JIT for chrome should make the JS behind the Firefox UI and add-ons run faster. This difference is probably not very noticeable for general browser usage, because the UI was designed and coded to perform well with the older JS engines. The difference should be more noticeable for add-ons that do heavy JS computation.

Option                           Fx3.5 Default   Fx3.6 Default
javascript.options.jit.chrome    false           true
javascript.options.jit.content   true            true

about:config options for the JIT

Garbage Collector Performance

JavaScript is a garbage-collected language, so periodically the JavaScript engine must reclaim unused memory. Our garbage collector (GC) pauses all JavaScript programs while it works. This is fine as long as the pauses are “short”. But if the pauses are even a little too long, they can make animations jerky. Animations need to run at 30-60 frames per second to look smooth, which means it should take no longer than 17-33 ms to render one frame. Thus, GC pauses longer than 40 ms cause jerkiness, while pauses under 10 ms should be almost unnoticeable. In Firefox 3.5, pause times were noticeably long, and JavaScript animations are increasingly common on the web, so reducing pause times was a major goal for JavaScript in Firefox 3.6.

Demo: GC Pauses and Animation

Demo.
The spinning dial animation shown here illustrates pause times. Besides animating the dial, this demo creates one million 100-character strings per second, so it requires frequent GC. The frame delay meter gives the average time between frames in milliseconds. The estimated GC delay meter gives the average estimated GC delay, based on the assumption that if a frame has a delay of 1.7 times the average delay or more, then exactly one GC ran during that frame. (This procedure may not be valid for other browsers, so it is not valid for comparing different browsers. Note also that the GC time depends on other live JavaScript sessions, so for a direct comparison of two browsers, have the same tabs open in each.) On my machine, I get an estimated GC delay of about 80 ms in Fx3.5, but only 30 ms in Fx3.6.
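The estimation rule described above can be sketched in a few lines. This is a minimal sketch, not the demo's actual source: the function name `estimateGcDelay` is hypothetical, and treating the pause as "frame delay minus average delay" is an assumption, since the text only says which frames are counted as containing a GC.

```javascript
// Hypothetical helper: estimate the average GC pause from a list of
// frame-to-frame delays in milliseconds. Not the demo's real code.
function estimateGcDelay(frameDelays, factor = 1.7) {
  const empty = { averageFrameDelay: 0, estimatedGcDelay: 0, gcFrames: 0 };
  if (frameDelays.length === 0) return empty;

  const total = frameDelays.reduce((sum, d) => sum + d, 0);
  const averageFrameDelay = total / frameDelays.length;

  // Frames at or above 1.7x the average are assumed to contain exactly one GC.
  const threshold = factor * averageFrameDelay;
  const slowFrames = frameDelays.filter((d) => d >= threshold);
  if (slowFrames.length === 0) {
    return { averageFrameDelay, estimatedGcDelay: 0, gcFrames: 0 };
  }

  // Assumption: the GC pause is the extra time beyond an average frame.
  const pauseTotal = slowFrames.reduce(
    (sum, d) => sum + (d - averageFrameDelay),
    0
  );
  return {
    averageFrameDelay,
    estimatedGcDelay: pauseTotal / slowFrames.length,
    gcFrames: slowFrames.length,
  };
}
```

For the delays `[10, 10, 10, 10, 60]`, the average frame delay is 20 ms, only the 60 ms frame crosses the 34 ms threshold, and the estimated GC delay comes out at 40 ms.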

But it’s a lot easier to see the difference by opening the demo in Fx3.5, watching it a bit, and then trying it in Fx3.6.
In Fx3.5, I see frequent pauses and the animation looks noticeably jerky. In Fx3.6, it looks pretty smooth, and it’s hard for me even to tell exactly when the GC is running.

How Fx3.6 does it better. We’ve made many improvements to the garbage collector and memory allocator. I want to give a little more technical detail on the two big changes that really cut our pause times.

First, we noticed that a large fraction of the pause time was spent calling free to reclaim the unused memory. We can’t do much to make freeing memory faster, but we realized we could do it on a separate thread. In Fx3.6, the main JS thread simply adds unused memory chunks to a queue, and another thread frees them during idle time or on a separate processor. This means machines with 2 or more cores will benefit more from this change. But even with one core, freeing might be delayed to an idle time when it will not affect scripts.
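To make the queueing idea concrete, here is a rough single-threaded model of it. This is only a sketch: the real change lives in the engine's native code and uses a real background thread, while the class `DeferredFreeQueue` and its `release` callback are hypothetical stand-ins invented for this example.

```javascript
// Hypothetical model of deferred freeing; not Firefox's actual code.
class DeferredFreeQueue {
  constructor(release) {
    this.release = release; // stand-in for the expensive call to free
    this.pending = [];
  }

  // Called by the "main JS thread" during GC: cheap, nothing is freed here.
  enqueue(chunk) {
    this.pending.push(chunk);
  }

  // Called by the "background thread" or at idle time:
  // frees at most maxChunks chunks and reports how many were freed.
  drain(maxChunks = Infinity) {
    let freed = 0;
    while (this.pending.length > 0 && freed < maxChunks) {
      this.release(this.pending.shift());
      freed++;
    }
    return freed;
  }
}

// Usage: the GC pause only pays for enqueue(); drain() runs later.
const freedChunks = [];
const freeQueue = new DeferredFreeQueue((chunk) => freedChunks.push(chunk));
["chunkA", "chunkB", "chunkC"].forEach((chunk) => freeQueue.enqueue(chunk));
freeQueue.drain(2); // e.g. during an idle slice
```

The point of the design is that the cost of `release` moves out of the pause: the collector does constant-time work per chunk, and the actual freeing happens whenever spare time or a spare core is available.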

Second, we knew that in Fx3.5 running GC clears out all the native code compiled by the JIT as well as some other caches that speed up JS. The reason is that the tracing JIT and GC did not know about each other, so if the GC ran, it might reclaim objects being used by a compiled trace. The result was that immediately after a GC, JS ran a bit slower as the caches and compiled traces were built back up. This would be experienced as either an extended GC pause or a brief hiccup of slow animation right after the GC pause. In Fx3.6, we taught the GC and the JIT to work together, and now the GC does not clear caches or wipe out native code, so JS resumes running normally right after a GC.

Tracing More JavaScript Constructs

In my article on TraceMonkey for the Fx3.5 release, I noted that certain code constructs, such as the arguments object, were not traced and did not get performance improvements from the JIT. A major goal for JS in Fx3.6 was to trace more stuff, so more programs can run faster. We do trace more stuff now, in particular:

  • DOM Properties. DOM objects are special and harder for the trace compiler to work with. For Fx3.5, we implemented tracing of DOM methods, but not DOM properties. Now we trace DOM properties (and other “native” C++ getters and setters) as well. We still do not trace scripted getters and setters.
  • Closures. Fx3.5 traced only a few operations involving closures (by which I mean functions that refer to variables defined in lexically enclosing functions). Fx3.6 can trace more programs that use closures. The main operation that is still not traced yet is creating an anonymous function that modifies closure variables. But calling such a function and actually writing to the closure variables are traced.
  • arguments. We now trace most common uses of the arguments keyword. “Exotic” uses, such as setting elements of arguments, are not traced.
  • switch. We have improved performance when tracing switch statements that use densely packed numeric case labels. These are particularly important for emulators and VMs.
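For readers who want to see what two of these constructs look like in practice, here is a small sketch of a closure that writes to a closure variable and a switch over densely packed numeric case labels. The names `makeCounter` and `run`, and the four-opcode instruction set, are made up for this example. Note that the closure is created once, outside any hot loop, and only called inside it, which matches the closure operations described above as traced.

```javascript
// Closure: the inner function reads and writes `count`, a variable that
// belongs to the lexically enclosing function.
function makeCounter() {
  let count = 0;
  return function () {
    count += 1;
    return count;
  };
}

// Dense switch: a toy stack-machine interpreter with opcodes 0..3.
// The opcode set is invented for illustration.
function run(program) {
  const stack = [];
  for (let pc = 0; pc < program.length; pc++) {
    switch (program[pc]) {
      case 0: // PUSH: the next program slot is a literal value
        stack.push(program[++pc]);
        break;
      case 1: // ADD
        stack.push(stack.pop() + stack.pop());
        break;
      case 2: // MUL
        stack.push(stack.pop() * stack.pop());
        break;
      case 3: // NEG
        stack.push(-stack.pop());
        break;
      default:
        throw new Error("unknown opcode " + program[pc]);
    }
  }
  return stack.pop();
}

// Usage: create the closure once, then call it in a loop.
const next = makeCounter();
let last = 0;
for (let i = 0; i < 1000; i++) {
  last = next();
}
// (2 + 3) * 4 as a program: PUSH 2, PUSH 3, ADD, PUSH 4, MUL
const answer = run([0, 2, 0, 3, 1, 0, 4, 2]);
```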

These improvements are particularly important for jQuery and Dromaeo, which heavily use arguments, closures, and the DOM. I suspect many other complex JavaScript applications will also benefit. For example, we recently heard from the author that this R-tree library performs much better in Fx3.6.

Here is a pair of demos of new things we trace. The first sets a DOM property in a loop. The second calls a sum function implemented with arguments. I get a speedup of about 2x for both of them in Fx3.6 vs. Fx3.5.
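A sum function built on arguments might look like the sketch below. This is not the demo's actual source, just an illustration of a common, non-exotic use of arguments: reading its length and its elements without ever assigning to them.

```javascript
// Sums however many numbers are passed in, using the arguments object.
function sum() {
  let total = 0;
  for (let i = 0; i < arguments.length; i++) {
    total += arguments[i]; // read-only access, no writes to arguments
  }
  return total;
}

// Call it from a hot loop, the kind of code a tracing JIT targets.
let runningTotal = 0;
for (let i = 0; i < 100000; i++) {
  runningTotal = sum(i, 1, 2, 3);
}
```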

Demo: Fx3.6 Tracing DOM properties and arguments


DOM Property Set:

Sum using arguments:

String and RegExp Improvements

Fx3.6 includes several improvements to string and regular expression performance. For example, the regexp JIT compiler now supports a larger class of regular expressions, including the ever-popular \w+. We also made some of our basic operations faster, like indexOf, match, and search. Finally, we made concatenating sequences of several strings inside a function (a common operation in building up HTML or other kinds of textual output) much faster.

Technical aside on how we made string concatenation faster: The C++ function that concatenates two strings S1 and S2 does this: Allocate a buffer big enough to hold the result, then copy the characters of S1 and S2 into the buffer. To concatenate more than two strings, as in JS s + "foo" + t, Fx3.5 simply concatenates two at a time from left to right.

Using the Fx3.5 algorithm, to concatenate N strings each of length K, we need to do N-1 memory allocations, and all but one of them are for temporary strings. Worse, the first two input strings are copied N-1 times, the next one is copied N-2 times, and so on. The total number of characters copied is K(N-1)(N+2)/2, which is O(N^2).

Clearly, we can do a lot better. The minimum work we can do is to copy each input string exactly once to the output string, for a total of KN characters copied. Fx3.6 achieves this by detecting sequences of concatenation in JS programs and combining the entire sequence into one operation that uses the optimal algorithm.
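The copy counts above are easy to check with a short simulation. This sketch only counts characters copied; it does not reproduce the engine's C++ code, and the function names are made up for the example.

```javascript
// Characters copied when concatenating two at a time, left to right.
// Each step allocates a new buffer and copies both operands into it.
function pairwiseCopies(lengths) {
  let copied = 0;
  let accumulated = lengths.length > 0 ? lengths[0] : 0;
  for (let i = 1; i < lengths.length; i++) {
    accumulated += lengths[i]; // length of the new temporary string
    copied += accumulated;     // everything in it was just copied
  }
  return copied;
}

// Characters copied when every input is copied once into a single buffer.
function singlePassCopies(lengths) {
  return lengths.reduce((sum, n) => sum + n, 0);
}

// Closed form from the text for N strings of length K each.
function closedForm(n, k) {
  return (k * (n - 1) * (n + 2)) / 2;
}

// Usage: N = 10 strings of K = 5 characters each.
const lengths = new Array(10).fill(5);
const pairwise = pairwiseCopies(lengths);   // matches closedForm(10, 5)
const optimal = singlePassCopies(lengths);  // K * N
```

With N = 10 and K = 5, the pairwise approach copies 270 characters and the single-pass approach copies 50, and the gap widens quadratically as N grows.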

Here are a few string benchmarks you can try that are faster in Fx3.6:

Demo: Fx3.6 String Operations


/\w+/:

indexOf('foo'):

match('foo'):

Build HTML:

Final Thoughts and Next Steps

We also made a lot of little improvements that don't fit into the big categories above. Most importantly, Adobe, Mozilla, Intel, Sun, and other contributors continue to improve nanojit, the compiler back-end used by TraceMonkey. We have improved its use of memory, made trace recording and compiling faster, and also improved the speed of the generated native code. A better nanojit gives a boost to all JS that runs in the JIT.

There are two big items that didn't make the cut for Fx3.6, but will be in the next version of Firefox and are already available in nightly builds:

  • JITting recursion. Recursive code, like explicit looping code, is likely to be hot code, so it should be JITted. Nightly builds JIT directly recursive functions. Mutual recursion (g calls f calls g) is not traced yet.
  • AMD x64 nanojit backend. Nanojit now has a backend that generates AMD x64 code, which gives the possibility of better performance on that platform.
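To illustrate the distinction drawn in the recursion item, here is a sketch of both kinds of recursion. The functions are ordinary textbook examples chosen for this illustration, not code from the nightly builds or their tests.

```javascript
// Direct recursion: fib calls itself.
function fib(n) {
  return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

// Mutual recursion: isEven calls isOdd, which calls isEven again.
function isEven(n) {
  return n === 0 ? true : isOdd(n - 1);
}

function isOdd(n) {
  return n === 0 ? false : isEven(n - 1);
}
```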

And if you try a nightly build, you'll find that many of these demos are already even faster than in Fx3.6!

85 comments

Comments are now closed.

  1. onur wrote on January 25th, 2010 at 12:00:

    Firefox 3.6 does not support these ICC profiles version 4

  2. Dimitrs wrote on January 26th, 2010 at 06:50:

    May I, humbly, suggest an option to just JIT *everything* instead of burning cycles analyzing code paths? I think Chrome works this way.

  3. Pingback from Blue Sky On Mars » Blog Archive » SlowNews: Sikuli, Letters.app, Pintura, Firefox 3.6 on January 26th, 2010 at 07:01:

    […] 3.6 was released last week, representing a lot of work from a whole lot of people. JavaScript performance is faster still than the already fast 3.5 and there have been a number of other areas of the UI […]

  4. tsphand1 wrote on January 28th, 2010 at 01:49:

    I need more speed up

    mozilla can build x64 version…………
    when it done?

  5. Chris wrote on January 28th, 2010 at 14:54:

    Firefox 3.6 clean start
    Frame Delay: 16
    GC Delay: 74

    Chrome 4 clean start
    Frame Delay: 9
    GC Delay: 16

    Methinks Mozilla have a way to go with this one.

  6. Vijay Rayapati wrote on January 29th, 2010 at 04:10:

    Awesome, when we will have threaded JS execution in FF like similar thing in Google Gears.

    1. Sasha Chedygov wrote on January 30th, 2010 at 14:13:

      It already exists. Check out the Web Workers API.

  7. F wrote on January 30th, 2010 at 08:31:

    Frame Delay: 13
    GC Delay: 143

    Meh, i dont see the difference

  8. default wrote on January 31st, 2010 at 12:04:

    I think it’s very sad that the x86_64 nanojit is still not available in Firefox 3.6. I regularly test trunk builds and it’s been working well for me for *months* already.

    Keep in mind that on Unix, i.e. Linux/FreeBSD, a native, 64 bit Firefox is the default on x86_64 systems!

  9. Pascal wrote on February 2nd, 2010 at 08:46:

    uh, quite interesting: did just for fun some comparison between fx3.6 and chrome and even while i have about 100 tabs in fx open it is still faster in the string section (about 3 times!)

  10. bleh wrote on February 2nd, 2010 at 20:35:

    3.6 seems very fast and smooth compared to 3.5. I have chrome 4 beta also installed and it’s not as fast as I remember. They seem the same to me but Chrome does not have NoScript. Chrome with extensions has really slowed it down. Keep moving on FF, its making progress.

    1. Ziru wrote on March 1st, 2010 at 11:36:

      The good thing about Chrome is that it is not difficult to figure out which extension slows the browser down. Users can therefore easily disable the buggy extension. For FF, the most-often recommendation from the Mozilla forum is to switch to the safe-mode, disable all the extensions, and enable them one by one.

      1. raj wrote on March 15th, 2011 at 05:55:

        I have tried both chrome and firefox.You have more plugins choice for firefox.These plugins make firefox more usable and reliable.

  11. nick wrote on May 8th, 2010 at 13:34:

    Those string operations in the last box run significantly faster in firefox than in Chrome, and I have the latest build of both. However, the first two tests are faster in chrome.

    I get an estimated GC delay of 91 ms! Chrome seems to consume a lot more system resources to get the 10 ms score on the same test. My computer is quite old though, so maybe I can’t win.

  12. Gary wrote on May 27th, 2010 at 10:27:

    Well, I think I’ve just about had it with Firefox… unless someone has a suggestion to help me.

    In my experience, the GC hesitations in Firefox 3.x have become much more common. It’s mostly prominent with watching videos. Yes, even after a video has cached, it will experience momentary pauses every 15-20 seconds, so it is nothing to do with streaming. Clearing the cache doesn’t seem to help. A fresh reboot will lessen the effect, but after loading a few web pages and running a few videos, the hesitations become quite noticeable.

    I ran the GC pause animation in Firefox, and I get on average GC delays of 59ms, frame delays of 17ms. The second hand repeatedly pauses once or twice with EVERY rotation.

    I ran the GC pause animation in Google Chrome, and I get on average GC delays of 12ms and frame delays of 7ms. I get a clean sweep of the second hand, no pauses whatsoever.

    I spent hours searching around the Internet, trying to find information about these pauses that I’ve experienced, until finally finding this page. The results tell me that I’m better off with Google Chrome… unless there is some special update I can apply to fix the GC performance in Firefox?

    1. Gary wrote on May 27th, 2010 at 10:38:

      NOTE: My stated GC delay in Firefox is understated. Rerunning the animation shows a wild swing of results, much higher than I originally found. As of my last run, it started out in the mid 50’s, then continued to rise with each cycle of the second hand, bumping up to over 100ms, then leveling off in the mid 90’s. I’m not surprised, given the behavior I’ve experienced with it…

    2. Christopher Blizzard wrote on May 27th, 2010 at 10:42:

      Gary –

      Firefox does have a browser-wide GC, which does cause pauses across the entire browser and ends up running pretty often, especially if you have a lot of tabs open.

      For Firefox 4 we’re planning to have per-page GC implemented which will give us most of the benefit of the per-process model that Chrome enjoys.

      Note that these tests are built to trigger the GC in Firefox. Chrome might not even run the GC at all running these tests. (In fact, there’s a big fat warning in the post that says: “This procedure may not be valid for other browsers, so it is not valid for comparing different browsers.”) So keep that in mind as well.

      So agreed that GC pauses suck, we’re going to have something for Firefox 4 and when we move to a multi-process model we’ll be in even better shape.

  13. Gary wrote on June 2nd, 2010 at 07:00:

    Thanks for the rapid reply, Christopher (hadn’t expected it so fast, so I didn’t return right away). :-)

    My bad… I didn’t even realize that Chrome is based on the WebKit layout engine and application framework, so it doesn’t have anything in common with Mozilla.

    I did conduct my own test, of performing the same browsing activities in both Chrome and Firefox after a reboot to clear out any used up memory. After a while, I began to notice how video response in Chrome was better. Firefox would start to slip into this periodic hesitation/pause during playback.

    I did not see this problem in earlier version of Firefox. I can’t recall when it began to surface. In any case, was it a change in the operating model that started it? Or was it always there, just took a few other changes to help reveal it?

    Chrome has some appealing attributes to it, like the small real estate usage at the top that allows for a greater viewing area in each tab. And it simplifies some of the options control menus (albeit at the expense of some flexibility). But I like the extensive configuration control that Firefox provides. I’ll look forward to revisiting it in version 4. ;-)

  14. Sadiq nasiry wrote on July 1st, 2010 at 00:07:

    Well, I think it’s just new name but not a new difference because all
    the speed, plug ins and others are the same and same however the promotion is the tabs (upgrade from fx3.0) but for a dial-up connection no kind of difference!!!

  15. massschneider wrote on July 29th, 2010 at 08:15:

    bumping up to over 100ms, then leveling off in the mid 90′s. I’m not surprised

  16. Jeffz wrote on August 28th, 2010 at 14:20:

    Guys,
    FF latest – even with all add ons is still “jerky as h”.
    It’s JS animations are worse of all other browsers I test it on.
    Even IE7 is visibly better.

    FF should do something about complex JS animations.
    And fast.

    1. Jeffz wrote on August 28th, 2010 at 14:22:

      I meant “all add ons off”

  17. max jones wrote on October 29th, 2010 at 06:32:

    there is something about the way javascript is being handled that is different from earlier versions. trying to find out what is how i arrived here.

    right now firefox is nearly unusable. every few minutes i get a message about a “javascript that is unresponsive and do i want to continue.” whatever is going on brings my laptop to a halt. if i wait for 5 or 10 minutes it will free up enough cylcles to move on. mostly i just have to kill it and reboot. i have to wait until that “do you want to continue” message pops… often there is are a couple more javascripts that have become unresponsive that prevent me from stopping the first clicking stop in the dialog box…

    most of the problems are with gmail. at first i thought google might be pulling a microsoft to give chrome an advantage. but after watching it and figuring out which scripts were dying i concluded its more or less random. all are effected.

    running ubuntu 10.10.1. i’ve tried nightly builds but it doesn’t seem to be going away…

  18. psalmsninetyone wrote on November 1st, 2010 at 22:49:

    I’m using firefox 3.6.12 and I’m getting this on the first run:

    Frame Delay: 18
    GC Delay: 14

    Then I get this on the second run:

    Frame Delay: 16
    GC Delay: 27

    The GC Delay figures rise as tabs increase (naturally) but the Frame Delay was pretty constant

  19. jhon wrote on March 17th, 2011 at 13:47:

    i m using Firefox 3.6.15 with Greasefire ..its gr8

  20. BMorris wrote on March 23rd, 2011 at 03:58:

    JIT is one of the most important technology in browser history that was developed. Perhaps it will be nice for all browsers to adopt and implement JIT and comeup with a common standard for it and focus on optimizing it even more.

  21. Chris wrote on March 30th, 2011 at 17:22:

    Nice test app.

    My dial jerks every 2-3 turns and the GC delay is hovering around 80ms, this is with FF 3.6. I only have 12 addons, and I reckon the only big ones are adblock plus which I consider essential and tab mix plus which I also would struggle to do without. The rest are tiny ones like perspectives.

