Mozilla

a quick note on JavaScript engine components

There have been a bunch of posts responding to the JägerMonkey (JM) announcement we made the other day, some of which get things subtly wrong about the pieces of technology being used as part of Mozilla’s JM work. So here’s the super-quick overview of what we’re using, what the various parts do and where they came from:

1. SpiderMonkey. This is Mozilla’s core JavaScript interpreter. This engine takes raw JavaScript and turns it into an intermediate bytecode, which is then interpreted. SpiderMonkey was responsible for all JavaScript handling in Firefox 3 and earlier. We continue to make improvements to this engine, as it’s still the basis for a lot of the work we did in Firefox 3.5, 3.6 and later releases.
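
To make the bytecode step concrete, here is a toy sketch of a stack-based bytecode interpreter. The opcodes and encoding here are invented for illustration; SpiderMonkey’s real bytecode is far richer.

```javascript
// Toy stack-machine interpreter. Opcodes are made up, not SpiderMonkey's.
const PUSH = 0, ADD = 1, RET = 2;

// Hypothetical bytecode a front end might emit for the expression `2 + 3`.
const bytecode = [PUSH, 2, PUSH, 3, ADD, RET];

function interpret(code) {
  const stack = [];
  let pc = 0; // program counter
  while (pc < code.length) {
    switch (code[pc++]) {
      case PUSH: stack.push(code[pc++]); break;
      case ADD: {
        const b = stack.pop(), a = stack.pop();
        stack.push(a + b);
        break;
      }
      case RET: return stack.pop();
    }
  }
}

console.log(interpret(bytecode)); // 5
```

The dispatch loop is the key cost: every operation pays for the fetch-and-branch overhead, which is exactly what compiling to native code removes.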

2. Tracing. Tracing was added before Firefox 3.5 and was responsible for much of the big jump that we made in performance. (Although some of that was also because we improved the underlying SpiderMonkey engine.)

This is what we do to trace:

  1. Monitor interpreted JavaScript code during execution looking for code paths that are used more than once.
  2. When we find a piece of code that’s used more than once, optimize that code.
  3. Take that optimized representation and assemble it to machine code and execute it.
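
A toy model of those three steps might look like this. Every name and the hot threshold are made up for illustration; the real monitor watches loop headers in the bytecode, and caching a function here merely stands in for recording operations and assembling machine code.

```javascript
// Toy trace monitor: interpret a loop body while counting executions,
// then swap in a "compiled" fast path once it proves hot.
const HOT_THRESHOLD = 2; // illustrative, not TraceMonkey's real tuning

function makeTraced(loopBody) {
  let runs = 0;
  let compiledTrace = null;
  return function run(x) {
    if (compiledTrace) return compiledTrace(x); // step 3: run the "trace"
    runs += 1;                                  // step 1: monitor
    if (runs >= HOT_THRESHOLD) {
      compiledTrace = loopBody;                 // step 2: "optimize + assemble"
    }
    return loopBody(x);                         // still interpreting this pass
  };
}

const doubled = makeTraced(x => x * 2);
console.log(doubled(3));  // 6, interpreted
console.log(doubled(4));  // 8, interpreted; loop is now hot
console.log(doubled(21)); // 42, via the cached "trace"
```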

What we’ve found since Firefox 3.5 is that when we’re in full tracing mode, we’re really, really fast. We’re slow when we have to “fall back” to SpiderMonkey to interpret and record.

One difficult part of tracing is generating code that runs fast. This is done by a component called Nanojit, which was originally part of the Tamarin project. Mozilla isn’t using most of Tamarin for two reasons: first, we’re not shipping ECMAScript 4, and second, the interpreted part of Tamarin was much slower than SpiderMonkey. So for Firefox 3.5 we took the best part – Nanojit – and bolted it to the back of SpiderMonkey instead.

Nanojit does two things: it takes a high-level representation of JavaScript and does optimization. It also includes an assembler to take that optimized representation and generate native code for machine-level execution.
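
For a flavor of the kind of optimization such a backend performs, here is a toy constant-folding pass over an invented three-address IR. The IR shape is entirely hypothetical; the real Nanojit works on its own low-level representation (LIR) and does considerably more.

```javascript
// Toy constant folding: replace `add` instructions whose operands are
// both known constants with a single `const` instruction.
function foldConstants(ir) {
  return ir.map(ins =>
    ins.op === 'add' && typeof ins.a === 'number' && typeof ins.b === 'number'
      ? { op: 'const', dst: ins.dst, value: ins.a + ins.b }
      : ins
  );
}

const before = [
  { op: 'add', dst: 't0', a: 2, b: 3 },      // both operands constant: folded
  { op: 'add', dst: 't1', a: 't0', b: 'x' }, // operands not constant: kept
];
console.log(foldConstants(before)[0]); // { op: 'const', dst: 't0', value: 5 }
```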

Mozilla and Adobe continue to collaborate on Nanojit. Adobe uses Nanojit as part of their ActionScript VM.

3. Nitro Assembler. This is a piece of code we’re taking from Apple’s version of WebKit that generates native code for execution. The Nitro Assembler is very different from Nanojit. While Nanojit takes a high-level representation, does optimization, and then generates code, all the Nitro Assembler does is generate code. So it’s complex, low-level code, but it doesn’t do the same set of things that Nanojit does.

We’re using the Nitro assembler (along with a lot of other new code) to basically build what everyone else has – compiled JavaScript – and then we’re going to do what we did with Firefox 3.5 – bolt tracing onto the back of that. So we’ll hopefully have the best of all worlds: SpiderMonkey generating native code to execute like the other VMs with the ability to go onto trace for tight inner loops for even more performance.
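
Sketched as a toy decision procedure, that combined strategy reads roughly like this. The names and the threshold are illustrative, not JM internals.

```javascript
// Toy tier selector for the plan described above: every script gets the
// cheap baseline method JIT; loops that prove hot get the tracing JIT.
const HOT = 100; // illustrative threshold

function makeTierPicker() {
  const loopCounts = new Map();
  return function tierFor(loopId) {
    const n = (loopCounts.get(loopId) || 0) + 1;
    loopCounts.set(loopId, n);
    // Baseline-compiled native code by default; jump onto trace when hot.
    return n >= HOT ? 'trace-jit' : 'method-jit';
  };
}

const tierFor = makeTierPicker();
console.log(tierFor('inner-loop')); // 'method-jit' on early iterations
```

The design point is that there is no tier where code merely gets interpreted: the worst case is cheap native code, and the best case is an aggressively optimized trace.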

I hope this helps to explain what bits of technology we’re using and how they fit into the overall picture of Firefox’s JS performance.

33 comments

Comments are now closed.

  1. soumynano wrote on March 8th, 2010 at 15:08:

    So would it be correct to say that you are intending to replace SpiderMonkey’s interpreter with the native-compiler Nitro (and assorted support code)?

  2. Magne Andersson wrote on March 8th, 2010 at 15:12:

    I’ve got to ask: where will this put Firefox compared to Safari, Google Chrome and Opera 10.5 in JavaScript speed? Do you have any guesses? Is there a chance it might even get faster than at least one of them?

    My other question is, when do you think we could see a stable implementation? Are we talking Firefox.next or a version further into the future?

    Thanks!

  3. Boris wrote on March 8th, 2010 at 15:33:

    Magne, we’re already faster than all of those browsers in lots of different situations.

    So is your real question about a particular benchmark?

    1. Magne Andersson wrote on March 8th, 2010 at 16:02:

      Well, Opera (and previously Safari and Google Chrome) has recently been able to claim that it is THE fastest. I guess that is based on an average of the most popular benchmarks, so yes.

      I don’t use JavaScript much in my own development and am not that familiar with existing benchmarks, but if I were to name the ones I know of, and that seem to matter currently, it would be Sunspider, Dromaeo and the V8 benchmark suite. There’s also one more that I know of, but can’t currently remember the name, that has been brought up.

  4. Drazick wrote on March 8th, 2010 at 16:01:

    Could we conclude from that that the next Firefox will have JS performance that is at least as good as Safari’s?

    Thanks for the wonderful work.

  5. Anonymous wrote on March 8th, 2010 at 16:04:

    @Boris: Can you provide some examples? Christopher earlier also mentioned some superior tests, but without any proof :(
    P.S. captcha: “reproves matters” — funny :)

  6. Boris wrote on March 8th, 2010 at 16:13:

    Magne, those claims have largely been based on one particular benchmark: Sunspider (and somewhat V8 in the case of Chrome).

    The Dromaeo JS benchmark is structured in such a way that its score will closely track Sunspider (about 60% of the score), with the V8 suite having the next biggest chunk of input (about 30-35%). The subscores on particular Dromaeo sections are relevant, but the overall score gives you no really new information.

    So to answer your question here, the main goal of this work is to not have any pathologically slow cases as we do now. There is the expectation that this should also allow us to do much better on Sunspider (modulo the fact that it doesn’t run the same workload in our js engine as in other ones and the fact that it fundamentally favors up-front compilation over tracing due to how it does timing). What that means in practice, we’ll see.

    1. Magne Andersson wrote on March 8th, 2010 at 17:04:

      I see. But, can you give some “real-world” examples where we would be able to see improvements in speed? That is, in regular Internet applications that would benefit from the JägerMonkey project.

  7. Slobo wrote on March 8th, 2010 at 16:21:

    I suspect this will not have much effect on DOM manipulation speed. Right now WebKit spanks Gecko hard on a webapp I’m developing (enough that I’m considering working out the few Gecko specifics and recommending Chromium to our intranet users).

    Is there a good place to monitor the progress of javascript DOM manipulation performance development in firefox? Thanks

  8. Boris wrote on March 8th, 2010 at 17:56:

    I suspect that http://www.apejet.org/aaron/blog/2008/10/04/browser-benchmarking-using-go-game/ (which is a benchmark, but based on an actual go game AI) will benefit. So will various emulators that people seem to want to write in JavaScript. In general, what will most benefit is very branchy code.

  9. Paul wrote on March 8th, 2010 at 20:34:

    Sorry, still don’t get it completely. I get that Nanojit takes high-level stuff, optimises and generates native code, and that Nitro only generates native code.

    So, you have 2 compilers generating native code, right? Is that because Nanojit and Nitro generate optimal code in different areas of javascript?

    1. Richard Barrell wrote on July 15th, 2010 at 03:26:

      Nitro and Nanojit both generate native code. You can think of it like this: Nitro is a fast code generator that produces mediocre (slow-ish) native code. Nanojit is a slow code generator that produces good (faster) native code.

      As I understand it, the idea is to run the Nitro on *all* javascript because it is so cheap, because the native code that Nitro produces will run somewhat faster than interpreting bytecode would have been. The tracing engine then kicks in on the hot-spots in that code, and optimizes them much more aggressively. So you get reasonably-speedy, cheap-to-generate code for all javascript, and you invest the CPU time to get very-fast code for the few bits where most of the time is spent.

  10. jpvincent wrote on March 9th, 2010 at 02:07:

    it’s always interesting to know how stuff works behind the scenes

    however, as a web developer, it would be even more interesting to have tips (if any) on how to take advantage of this kind of evolution – by that I mean code examples of where we should see improvements

  11. Mark Lee Smith wrote on March 9th, 2010 at 02:23:

    Boris

    Forgive me if this sounds somewhat rude but isn’t it very convenient for you to be able to say that, and I’m paraphrasing here: “We already have the fastest Javascript engine”, “It’s just the benchmarks that everyone else uses that show we’re the slowest”, “Here is one place where we come out ahead.”

    Proof?

    I’m sorry but this simply doesn’t satisfy me.

  12. chris wrote on March 9th, 2010 at 07:53:

    so let me get this right:

    right now:
    spidermonkey interpret->tracing optimize->nanojit compile

    with jaegermonkey:

    tracing optimize->nitro compile

    so basically, spidermonkey interpreter is dead? since there will be no need for an interpreter at all?

    does this change make Mozilla’s JS engine do the same things as WebKit’s? I’m quite confused – if so, what’s the reason against just using WebKit’s JS engine as a whole?

  13. Juha wrote on March 9th, 2010 at 08:29:

    Is this now targeted only for x86 or is there already a code optimized for ARM?

  14. Tristan wrote on March 9th, 2010 at 11:14:

    Very informative and helpful, Chris. This is good work, and I know how hard it is to make complex stuff understandable :-)

  15. Boris wrote on March 9th, 2010 at 14:47:

    Slobo, you’re correct that this will have little effect on most DOM manipulation. That said, most “dom” issues are actually not; they’re layout issues. Did you file a bug on the performance issue with your webapp? If not, can you either file (and cc “:bz”) or let me know directly at bz at mit dot edu how the problem can be reproduced?

    Paul, Nanojit does a lot of optimizations that can’t be easily done in a simple codegen path using the Nitro assembler. So yes, there will be two different paths for generating native code that will be used in different circumstances. The JM codepath (using the Nitro assembler for codegen) for baseline code generation, and Nanojit for the hotspots.

    Mark Lee Smith, where did I say we have the fastest JS engine? We have a JS engine that is very fast in some cases, not so fast in others. We’re slower on benchmarks that other browser developers created and then optimized their engines for (sometimes for several years before releasing the benchmarks to the world). We’re slower on many real-world cases (which is why the JM work is going on). We’re a lot faster in many other real-world cases. For example, most pixel manipulation I’ve tried via canvas (image filters, etc) is much faster in Gecko, especially with vlad’s patches to make canvas imagedata not cart around 4x the memory it needs to. We’re faster on math-intensive benchmarks, generally, if the code is not too branchy. That’s a big if, obviously, which is why JM is being worked on.

    chris, the new setup has two separate compile steps. First compile using a compiler we’re writing and using the nitro assembler for codegen. Then for the hotspots recompile with tracing and nanojit. SpiderMonkey interpreter is not dead yet, but the goal is to work on killing it if possible, yes. Not sure what your next-to-last question means. To answer your last question, webkit’s js engine doesn’t have a number of capabilities that SpiderMonkey currently has (e.g. it must run on a single thread, it doesn’t support all the features SpiderMonkey does, etc).

    Juha, the Nitro assembler has an ARM backend, so things should already work as well on ARM as they do on x86.

  16. Boris wrote on March 9th, 2010 at 14:58:

    Mark lee smith, just in case I misunderstood and you’re just asking for examples where TM is faster than jsc or V8 or both, at least on my hardware, try these:

    http://dromaeo.com/?dromaeo-object-string
    http://dromaeo.com/?dromaeo-object-array
    http://web.mit.edu/bzbarsky/www/mandelbrot-clean.html
    http://www.galbraiths.org/benchmarks/pixastic.html (esp. in a build with vlad’s patch I mention above)

    I’ve also recently seen someone time things on google spreadsheets and discover that it’s much faster in Gecko than in Chrome or Safari, but there could be all sorts of noise (DOM, layout, different codepaths) in that…

  17. njn wrote on March 9th, 2010 at 19:45:

    Chris said:

    “Nanojit does two things: it takes a high-level representation of JavaScript and does optimization. It also includes an assembler to take that optimized representation and generate native code for machine-level execution.”

    Not quite… Nanojit works on a quite low-level intermediate representation, called LIR, which is kind of an abstract assembly language. It’s the tracing compiler’s job to convert JS bytecode into LIR. It’s Nanojit’s job to optimize the LIR and then generate native code.

  18. bob wrote on March 10th, 2010 at 05:05:

    It’s great to see such initiatives while keeping the TraceMonkey work.
    I hope you’ll achieve your goals quickly enough (as you know, people are driven by the “cool” factor)

  19. Junebug wrote on March 10th, 2010 at 08:31:

    Wow. Thank God browser dev doesn’t respect patents. Jobs has a point when he talks about IP. They’ve had some pretty smart ideas.

    C’mon Mozilla, where is your creativity? You’ve done some smart implementation work, and I respect that, but when was the last time when you stunned the world with something MASSIVE? In technical stuff? You are great at getting people work together, but Apple just blows my mind. Focused dev has its good points.

  20. Dan wrote on March 10th, 2010 at 18:44:

    Boris, I think the part of Chris’ question that you couldn’t quite understand is: will there now be one less JS engine in the world of web browsers? In terms of functionality, quirks and behaviour. That’s the way I interpreted the question. I did think that replacing part of the browser with a part from WebKit reduces the healthy competition between browsers, but I guess it’s just a small part of the whole JS system.

  21. Andrew wrote on March 10th, 2010 at 18:47:

    Boris,

    A question for you concerning TraceMonkey; does either nanojit or nitro do optimizations such as dead code removal and unswitching? Or is that your responsibility before passing it in to them?

  22. Oskar wrote on March 10th, 2010 at 19:22:

    Nice! Can’t wait to see improvements with JägerMonkey.
    Hope it gets out within the year (Firefox 4.0).
    Firefox is fast, and with Direct2D it’s the fastest browser I’ve used. (I’ve tried all the latest browsers.)
    Keep the good work on! :)

  23. Boris wrote on March 10th, 2010 at 20:14:

    Dan, there will be just as many JS engines as before. In the short term, there will be three parts to the Mozilla JS engine: interpreter, method jit, trace jit. Long-term there will be at least method jit and trace jit; whether the interpreter sticks around is unknown.

    And yes, the part being used from Webkit is a very small part of the entire JS engine.

  24. Erik Harrison wrote on March 10th, 2010 at 23:17:

    If any armchair hackers are still reading through the comments here, a coincidental glance at the commit logs and some Bugzilla tickets made it look like it was possible to build a methodjit/tracejit Firefox. It is, so I did.

    Here are some Sunspider numbers on my super dinky Dell Latitude, running Xubuntu:

    Firefox 3.6: 88.14 runs/s
    (http://dromaeo.com/?id=96279)

    Methodjit Firefox: 36.55 runs/s
    (http://dromaeo.com/?id=96271)

    Methodjit Firefox + Tracing: 126.10 runs/s
    (http://dromaeo.com/?id=96272)

    That’s a 50% improvement for those of you playing along at home, although I’m sure there are dozens of caveats I’m not aware of.

    If you want to get into measuring contests with other browsers:

    Chrome unstable: 288.11 runs/s
    http://dromaeo.com/?id=96273

    Yes, V8 is still doing really well, but if you look at the individual benchmarks you’ll see that for many tasks Firefox is faster. This methodjit stuff seems to be retaining that advantage while flattening the curve elsewhere. That’s awesome, and I look forward to even more goodness in the future. Thanks everyone!

  25. Brandon jones wrote on March 13th, 2010 at 15:17:

    Correct me if I’m wrong here, but isn’t the whole point of open source having the ability to use code from anywhere for whatever reason you want? Why reinvent the wheel just because the wheel was made by a competitor? It makes more sense to use what you can and tune it as needed for your purposes than to just start from scratch for everything.

  26. njn wrote on March 14th, 2010 at 21:32:

    “A question for you concerning TraceMonkey; does either nanojit or nitro do optimizations such as dead code removal and unswitching? Or is that your responsibility before passing it in to them?”

    Nanojit does dead code removal, CSE, some constant folding, and some dead store elimination. I don’t know what unswitching is.

  27. Boris wrote on March 15th, 2010 at 06:33:

    Andrew, nanojit does dead-code removal for sure. Not sure what other optimizations it does, but there are some.

    The Nitro assembler is just an assembler. It does no optimization at all, as I understand. Not sure how register allocation works in that setup; one of the folks who’ve been working with it would know better.

  28. Pingback from David Mandelin's blog » JägerMonkey & Nitro Components on March 15th, 2010 at 17:30:

    […] is complicated stuff and it’s hard to get right. So Chris Blizzard made a post correcting some of the misconceptions. I thought it might also be easier to see what we’re doing in a picture of the major system […]

  29. Stifu wrote on March 16th, 2010 at 07:04:

    @Brandon jones
    “Correct me if I’m wrong here, but isn’t the whole point of open source having the ability to use code from anywhere for whatever reason you want? (…)”

    Yeah, that’s why they borrowed some parts from Webkit for JägerMonkey. For the rest, they have ideas that haven’t been explored yet (doing trace JIT + method JIT, rather than just one of them), which is why they keep going their own way. What’s wrong with that?
    They’re neither reinventing the wheel nor ignoring open source code made available by competitors, they’re cherry picking good ideas from others in order to create something that has the potential to be better than anything currently available. Nothing to complain about, IMO.

Comments are closed for this article.