This is the fifth part in a series on WebAssembly and what makes it fast. If you haven’t read the others, we recommend starting from the beginning.
For example, the team working on React could replace their reconciler code (aka the virtual DOM) with a WebAssembly version. People who use React wouldn’t have to do anything… their apps would work exactly as before, except they’d get the benefits of WebAssembly.
The reason developers like those on the React team would make this swap is because WebAssembly is faster. But what makes it faster?
This diagram gives a rough picture of what the start-up performance of an application might look like today.
Each bar shows the time spent doing a particular task.
- Parsing—the time it takes to process the source code into something that the interpreter can run.
- Compiling + optimizing—the time that is spent in the baseline compiler and optimizing compiler. Some of the optimizing compiler’s work is not on the main thread, so it is not included here.
- Re-optimizing—the time the JIT spends readjusting when its assumptions have failed, both re-optimizing code and bailing out of optimized code back to the baseline code.
- Execution—the time it takes to run the code.
- Garbage collection—the time spent cleaning up memory.
One important thing to note: these tasks don’t happen in discrete chunks or in a particular sequence. Instead, they will be interleaved. A little bit of parsing will happen, then some execution, then some compiling, then some more parsing, then some more execution, etc.
Still, there’s room for improvement.
How does WebAssembly compare?
Here’s an approximation of how WebAssembly would compare for a typical web application.
There are slight variations between browsers in how they handle all of these phases. I’m using SpiderMonkey as my model here.
Fetching
This isn’t shown in the diagram, but one thing that takes up time is simply fetching the file from the server.
Because WebAssembly is more compact than the equivalent JavaScript source, it takes less time to transfer it between the server and the client. This is especially true over slow networks.
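To make the fetch phase concrete, here’s a minimal sketch using the standard WebAssembly JavaScript API. The file name module.wasm and the exported add function are made up for the example; instantiateStreaming is the real API, and it lets the browser start compiling while the bytes are still coming in over the network.

```ts
// A minimal sketch: fetching and instantiating a .wasm module in the browser.
// "module.wasm" and the exported "add" function are hypothetical.
async function loadModule(): Promise<void> {
  // instantiateStreaming can compile the module while the response is
  // still streaming in, so download and compilation overlap.
  const { instance } = await WebAssembly.instantiateStreaming(
    fetch("module.wasm")
  );

  // Call a hypothetical exported function.
  const add = instance.exports.add as (a: number, b: number) => number;
  console.log(add(2, 3));
}

loadModule();
```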
Parsing
Once the JavaScript source reaches the browser, it is parsed into an Abstract Syntax Tree (AST). Browsers often do this lazily, only parsing what they really need to at first and just creating stubs for functions that haven’t been called yet.
From there, the AST is converted to an intermediate representation (called bytecode) that is specific to that JS engine.
In contrast, WebAssembly doesn’t need to go through this transformation because it is already an intermediate representation. It just needs to be decoded and validated to make sure there aren’t any errors in it.
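Here’s a rough sketch of what that decode-and-validate step looks like from the JavaScript side, using the standard WebAssembly API (not tied to any particular engine). The file name module.wasm is hypothetical.

```ts
// A minimal sketch of decoding + validating a wasm binary from JavaScript.
// "module.wasm" is a hypothetical file name.
async function checkModule(): Promise<void> {
  const response = await fetch("module.wasm");
  const bytes = await response.arrayBuffer();

  // validate() decodes the binary and checks that it is well formed,
  // without executing any of it.
  if (WebAssembly.validate(bytes)) {
    // compile() turns the validated bytes into a Module that can be
    // instantiated later.
    const module = await WebAssembly.compile(bytes);
    console.log(WebAssembly.Module.exports(module));
  } else {
    console.error("Not a valid wasm binary");
  }
}

checkModule();
```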
Compiling + optimizing
Different browsers handle compiling WebAssembly differently. Some browsers do a baseline compilation of WebAssembly before starting to execute it, and others use a JIT.
Either way, the WebAssembly starts off much closer to machine code. For example, the types are part of the program. This is faster for a few reasons:
- The compiler doesn’t have to spend time running the code to observe what types are being used before it starts compiling optimized code.
- The compiler doesn’t have to compile different versions of the same code based on the different types it observes (there’s a sketch of this right after the list).
- More optimizations have already been done ahead of time, when the source was compiled with LLVM, so there’s less work left to compile and optimize it in the browser.
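Here’s a hedged sketch of what the second point looks like in practice. It’s written as TypeScript-flavored JavaScript, the function and the calls are made up, and the exact behavior varies from engine to engine.

```ts
// In JavaScript, the same function can be called with many different types,
// so a JIT may end up producing (and tracking) several specialized versions.
// This function and these calls are made up for illustration.
function sum(a: any, b: any) {
  return a + b;
}

sum(1, 2);       // integers -> one specialized version
sum(1.5, 2.5);   // floats   -> possibly another
sum("a", "b");   // strings  -> yet another, or a bailout

// A WebAssembly function declares one fixed signature up front,
// e.g. (i32, i32) -> i32, so there is exactly one version to compile.
```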
Re-optimizing
Sometimes the JIT has to throw out an optimized version of the code and retry it.
This happens when assumptions that the JIT makes based on running code turn out to be incorrect. For example, deoptimization happens when the variables coming into a loop are different than they were in previous iterations, or when a new function is inserted in the prototype chain.
There are two costs to deoptimization. First, it takes some time to bail out of the optimized code and go back to the baseline version. Second, if that function is still being called a lot, the JIT may decide to send it through the optimizing compiler again, so there’s the cost of compiling it a second time.
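As a rough illustration (engines differ, and the example is contrived), here’s the kind of pattern that can trigger that bailout: a hot loop whose incoming values change type partway through.

```ts
// A contrived sketch of a deoptimization trigger. Behavior varies by engine.
function total(values: any[]): number {
  let sum = 0;
  for (const v of values) {
    sum += v; // the JIT may optimize this assuming v is always a number
  }
  return sum;
}

const data: any[] = new Array(10_000).fill(1);
total(data);     // runs hot -> likely optimized with "v is a number" baked in
data[0] = "1";   // the assumption no longer holds
total(data);     // may bail out to baseline code, and perhaps re-optimize later
```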
In WebAssembly, things like types are explicit, so the JIT doesn’t need to make assumptions about types based on data it gathers during runtime. This means it doesn’t have to go through reoptimization cycles.
Executing
It is possible to write JavaScript that executes performantly, but to do it you need to know about the optimizations that the JIT makes. However, most developers don’t know about JIT internals, and even for those who do, it can be hard to hit the sweet spot. Many coding patterns that people use to make their code more readable (such as abstracting common tasks into functions that work across types) get in the way of the compiler when it’s trying to optimize the code.
Plus, the optimizations a JIT uses are different between browsers, so coding to the internals of one browser can make your code less performant in another.
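For example, here’s a hedged sketch of the kind of readable abstraction that can work against the optimizer. The objects and names are made up, and whether this actually slows things down depends on the engine.

```ts
// A helper that works across many object shapes. The property lookup inside
// it can become "megamorphic" in some engines and stay unoptimized.
// Everything here is made up for illustration.
function getName(thing: { name: string }): string {
  return thing.name;
}

const a = { name: "a" };
const b = { name: "b", id: 1 };
const c = { name: "c", id: 2, tags: ["x"] };
const d = { name: "d", created: new Date() };

getName(a);
getName(b);
getName(c);
getName(d);
// Each call passes a differently shaped object, so the engine can't settle
// on a single fast lookup path the way it can for one fixed shape.
```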
In addition, WebAssembly was designed as a compiler target. This means it was designed for compilers to generate, and not for human programmers to write.
Since human programmers don’t need to program it directly, WebAssembly can provide a set of instructions that are more ideal for machines. Depending on what kind of work your code is doing, these instructions run anywhere from 10% to 800% faster.
Garbage collection
In JavaScript, the developer doesn’t have to worry about clearing old variables out of memory when they aren’t needed anymore; the JS engine does that automatically, using a garbage collector.
This can be a problem if you want predictable performance, though. You don’t control when the garbage collector does its work, so it may come at an inconvenient time. Most browsers have gotten pretty good at scheduling it, but it’s still overhead that can get in the way of your code’s execution.
At least for now, WebAssembly does not support garbage collection at all. Memory is managed manually (as it is in languages like C and C++). While this can make programming more difficult for the developer, it does also make performance more consistent.
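To make that concrete, here’s a minimal sketch of WebAssembly’s linear memory as seen from JavaScript. WebAssembly.Memory is the standard API; the sizes and values are arbitrary, and the allocator mentioned in the comments is whatever the compiled language brings with it.

```ts
// A minimal sketch of WebAssembly's flat, manually managed memory,
// viewed from the JavaScript side.
const memory = new WebAssembly.Memory({ initial: 1 }); // 1 page = 64 KiB

// The compiled module (or the embedder) reads and writes raw bytes directly;
// nothing scans this buffer looking for garbage to collect.
const bytes = new Uint8Array(memory.buffer);
bytes[0] = 42;
console.log(bytes[0]); // 42

// Freeing memory is the program's job: typically the compiled code brings its
// own allocator (for example, malloc/free from C or C++) that hands out and
// reclaims ranges of this one buffer explicitly.
```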
Conclusion
WebAssembly is faster than JavaScript in many cases because:
- fetching WebAssembly takes less time because it is more compact than JavaScript, even when compressed.
- decoding WebAssembly takes less time than parsing JavaScript.
- compiling and optimizing takes less time because WebAssembly is closer to machine code than JavaScript and has already been optimized ahead of time.
- reoptimizing doesn’t need to happen because types and other information are built in, so the engine doesn’t have to speculate about them the way it does with JavaScript.
- executing often takes less time because there are fewer compiler tricks and gotchas that the developer needs to know to write consistently performant code, plus WebAssembly’s instructions are more ideal for machines.
- garbage collection is not required since the memory is managed manually.
There are some cases where WebAssembly doesn’t perform as well as expected, and there are also some changes on the horizon that will make it faster. I’ll cover those in the next article.