Compiling to WebAssembly: It’s Happening!

WebAssembly is a new binary format for compilation to the web. It is in the process of being designed and implemented as we speak, in collaboration among the major browser vendors. Things are moving quickly! In this post we’ll show some of our recent progress with a deep dive into the toolchain side of WebAssembly.

For WebAssembly to be usable, we need two major components: toolchains that compile code into WebAssembly, and browsers that can execute that output. Both of those components depend on progress in finishing the WebAssembly spec, but otherwise are largely separate engineering efforts. This separation is a good thing, as it will enable compilers to emit WebAssembly that runs in any browser, and browsers to run WebAssembly no matter which compiler generated it; in other words, it allows multiple toolchains and multiple browsers to work together, improving user choice. The separation also allows work on the two components to proceed in parallel right now.

A new project on the toolchain side of WebAssembly is Binaryen. Binaryen is a compiler infrastructure library for WebAssembly, written in C++. If you’re not working on a WebAssembly compiler yourself, you’ll probably never need to know anything about it, but if you use a WebAssembly compiler then it might use Binaryen for you under the hood; we’ll see examples of that later.

At Binaryen’s core is a modular set of classes that can parse and emit WebAssembly, and that represent it in an AST designed to make flexible transformation passes easy to write. Built on top of that are several useful tools:

  • The Binaryen shell, which can load a WebAssembly module, transform it, execute it in an interpreter, print it, etc. Loading and printing use WebAssembly’s current temporary s-expression format, which has the suffix .wast (work is underway on designing the WebAssembly binary format, as well as the final text format, but they aren’t ready yet).
  • asm2wasm, which compiles asm.js into WebAssembly.
  • wasm2asm, which compiles WebAssembly into asm.js. (This is a work in progress.)
  • s2wasm, which compiles .s files, in the format emitted by the new WebAssembly backend being developed in LLVM, to WebAssembly.
  • wasm.js, a port of Binaryen itself to JavaScript. This lets us run all the above components on a web page or any other JavaScript environment.
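To make the idea of "an AST plus transformation passes plus an s-expression printer" concrete, here is a minimal sketch in Python. It is not Binaryen's actual API (Binaryen is written in C++); the class names `Const`, `Binary`, and `GetLocal`, and the functions `fold_constants` and `to_sexpr`, are made up for illustration, and the printed form only loosely resembles the temporary .wast syntax.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Const:
    """A 32-bit integer constant."""
    value: int


@dataclass
class GetLocal:
    """A read of a local variable, identified by index."""
    index: int


@dataclass
class Binary:
    """A binary operation (for example i32.add) on two sub-expressions."""
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Const, GetLocal, Binary]

MASK32 = 0xFFFFFFFF  # keep folded results in the unsigned 32-bit range

# Operations this toy pass knows how to fold (hypothetical, deliberately tiny).
FOLDERS = {
    "i32.add": lambda a, b: (a + b) & MASK32,
    "i32.mul": lambda a, b: (a * b) & MASK32,
}


def fold_constants(node: Expr) -> Expr:
    """A toy transformation pass: fold operations whose operands are constants."""
    if isinstance(node, Binary):
        left = fold_constants(node.left)
        right = fold_constants(node.right)
        if isinstance(left, Const) and isinstance(right, Const) and node.op in FOLDERS:
            return Const(FOLDERS[node.op](left.value, right.value))
        return Binary(node.op, left, right)
    return node


def to_sexpr(node: Expr) -> str:
    """Print the tree in an s-expression style."""
    if isinstance(node, Const):
        return f"(i32.const {node.value})"
    if isinstance(node, GetLocal):
        return f"(get_local {node.index})"
    return f"({node.op} {to_sexpr(node.left)} {to_sexpr(node.right)})"
```

For example, folding `Binary("i32.mul", GetLocal(0), Binary("i32.add", Const(2), Const(3)))` and printing the result gives `(i32.mul (get_local 0) (i32.const 5))`: the pass rewrote the constant subtree and left the rest of the tree alone.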

For a general overview of Binaryen, you can see these slides from a talk I recently gave. Don’t skip slide #9 :)

It’s important to note that WebAssembly is still in the design phase, and the formats that Binaryen can read and write (.wast, .s) are not final. Binaryen has been continually updated to track those changes; the rate of churn is decreasing, but expect breakage.

Let’s discuss some of the specific areas where Binaryen can be helpful.

Compiling to WebAssembly using Emscripten

Emscripten can compile C and C++ to asm.js, and Binaryen’s asm2wasm tool can compile asm.js to WebAssembly, so together Emscripten+Binaryen provide a complete way to compile C and C++ to WebAssembly. You can run asm2wasm on asm.js code directly (it can be run on the command line), but it’s easiest to let Emscripten do it for you, using something like

emcc file.cpp -o file.js -s 'BINARYEN="path-to-binaryen"'

Emscripten will compile file.cpp, and emit a main JavaScript file and a separate file for the WebAssembly output, in .wast format. Under the hood, Emscripten compiles to asm.js, then runs asm2wasm on the asm.js file to produce the .wast file. For more details, see the Emscripten wiki page on WebAssembly.

But wait, what good is it to compile to WebAssembly when browsers don’t support it yet? Good question :) Yes, we don’t want to ship this code since browsers can’t run it. But it is still very useful for testing purposes: we want to know that Emscripten can compile properly to WebAssembly as soon as we can, since we don’t want to wait on browser support.

But how can we check that Emscripten is in fact compiling properly to WebAssembly, if we can’t run it? For that, we can use wasm.js, which Emscripten integrated into our output .js file when we ran that emcc command before. wasm.js contains portions of Binaryen compiled to JavaScript, including the Binaryen interpreter. If you run file.js (in node.js, or on a web page), the interpreter executes that WebAssembly. That lets us actually verify that the compiled WebAssembly code does the right thing. You can see an example of such a compiled program here, and there are some more builds for testing purposes in the build suite repo.
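To give a feel for what "executing WebAssembly in an interpreter" means, here is a deliberately tiny sketch in Python. It is not the Binaryen interpreter: it handles only a handful of 32-bit integer operations on expressions written as nested tuples, and the function name `evaluate` and the tuple encoding are assumptions made for this illustration.

```python
MASK32 = 0xFFFFFFFF  # results wrap around like unsigned 32-bit integers

# The only arithmetic this toy interpreter understands.
ARITHMETIC = {
    "i32.add": lambda a, b: a + b,
    "i32.sub": lambda a, b: a - b,
    "i32.mul": lambda a, b: a * b,
}


def evaluate(expr, local_values=()):
    """Walk a nested-tuple expression tree and compute its value."""
    op = expr[0]
    if op == "i32.const":
        return expr[1] & MASK32
    if op == "get_local":
        return local_values[expr[1]] & MASK32
    if op in ARITHMETIC:
        left = evaluate(expr[1], local_values)
        right = evaluate(expr[2], local_values)
        return ARITHMETIC[op](left, right) & MASK32
    raise ValueError(f"unsupported operation: {op}")
```

Calling `evaluate(("i32.add", ("i32.const", 40), ("i32.const", 2)))` returns 42. A tree-walking design like this is simple to get right, which is what matters for testing, and it is also why such an interpreter is slow compared to native support.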

Of course, we are not quite on as solid ground as we would like, given this weird testing environment: a C++ program compiled to WebAssembly, running in a WebAssembly interpreter itself compiled from C++ to JavaScript, and no other way to run the program yet. But we have a few reasons to be confident in the results:

  • This output passes the Emscripten test suite. That includes many real-world codebases (Python, zlib, SQLite, etc.) as well as lots of unit tests for corner cases in C and C++. Experience has shown that when that test suite is passed, it’s very likely that other code will work too.
  • The Binaryen interpreter passes the WebAssembly spec test suite, indicating that it is running WebAssembly properly. In other words, when browsers get native support, they should run it in the same way (except much faster! This code is running in a simple interpreter for testing purposes, so it’s very slow; but note that there is work in progress on fast ways to polyfill).
  • This output was generated using Emscripten, which is a stable compiler used in production, and a relatively small amount of code on top of that in Binaryen (just a few thousand lines). The less new code, the less risk of bugs.

Overall, this indicates that we are in good shape here, and can compile C and C++ to WebAssembly today using Emscripten + Binaryen, even if browsers can’t run it yet.

Note that aside from emitting WebAssembly, the builds that we emit in this mode use everything else from the Emscripten toolchain normally: Emscripten’s port of the musl libc and syscalls to access it, OpenGL/WebGL code, browser integration code, node.js integration code, and so forth. As a result, this supports everything Emscripten already does, and existing projects using Emscripten can switch to emitting WebAssembly with just the flip of a switch. This is a key part of letting existing C++ projects that compile to the web benefit from WebAssembly when it launches, with little or no effort on their part.

Using the new experimental LLVM WebAssembly backend with Emscripten

We just saw an important milestone for Emscripten, in that it can compile to WebAssembly and even test that we get valid output. But things don’t stop there: that was using Emscripten’s current asm.js compiler backend, together with asm2wasm. There is a new LLVM backend for WebAssembly in development directly in the upstream LLVM repository, and while it isn’t ready for general use yet, in the long term it will be very important. Binaryen has support for that too.

The LLVM backend, like most LLVM backends, emits assembly code, in this case in a specific .s format. That output is close to WebAssembly, but not identical – it looks more like the output of a C compiler (linear list of instructions, one instruction per line, etc.) rather than WebAssembly’s more structured AST. The .s file can be translated into WebAssembly in a fairly straightforward way, though, and Binaryen includes s2wasm, a tool that translates .s to WebAssembly. It can be run standalone on the command line, but also has Emscripten integration support: Emscripten now has a WASM_BACKEND option, which you can use like this:

emcc file.cpp -o file.js -s 'BINARYEN="path-to-binaryen"' -s WASM_BACKEND=1

(Note that you also need the BINARYEN option, as s2wasm is part of Binaryen.) When that option is provided, Emscripten uses the new WebAssembly backend instead of the existing asm.js one. After calling the backend and receiving .s from it, Emscripten calls s2wasm to convert that to WebAssembly. Some examples of programs you can build with the new backend are on the Emscripten wiki.
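As a rough illustration of the kind of translation involved in going from a flat, one-instruction-per-line listing to a nested tree, here is a toy sketch in Python. The flat format used here (operands listed before the operation that consumes them) is invented for this example and is not the actual .s syntax emitted by the LLVM backend, and `nest` is a hypothetical helper, not part of s2wasm.

```python
BINARY_OPS = {"i32.add", "i32.sub", "i32.mul"}  # consume two operands
LEAF_OPS = {"i32.const", "get_local"}           # carry one integer argument


def nest(listing: str):
    """Turn a flat instruction listing into one nested-tuple expression tree."""
    stack = []
    for line in listing.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        op = parts[0]
        if op in LEAF_OPS:
            stack.append((op, int(parts[1])))
        elif op in BINARY_OPS:
            if len(stack) < 2:
                raise ValueError(f"not enough operands for {op}")
            right = stack.pop()
            left = stack.pop()
            stack.append((op, left, right))
        else:
            raise ValueError(f"unknown instruction: {op}")
    if len(stack) != 1:
        raise ValueError("listing does not form a single expression")
    return stack[0]
```

Given the five lines `i32.const 2`, `i32.const 3`, `i32.add`, `get_local 0`, `i32.mul`, this returns `("i32.mul", ("i32.add", ("i32.const", 2), ("i32.const", 3)), ("get_local", 0))`. The real translation has far more to handle (control flow, functions, data segments), but the basic move from a linear listing to a structured tree is the same in spirit.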

There are, therefore, two ways to compile to WebAssembly using Binaryen: Emscripten + asm.js backend + asm2wasm, which works right now and should be fairly robust and reliable, and Emscripten + new WebAssembly backend + s2wasm, which is not yet fully functional, but as the WebAssembly backend matures it should become a powerful option, and hopefully will replace the asm.js backend in the future. The goal is to make that transition seamless: flipping between the two WebAssembly modes is just a matter of setting an option, as we saw.

The same is also true between asm.js and WebAssembly support in Emscripten, which is also just an option you can set, and the transition there should be seamless as well. In other words, there will be a straight and simple path from

  • using Emscripten to emit asm.js today, to
  • using it to emit WebAssembly using asm2wasm (possible today, but browsers can’t run it yet), to
  • using it to emit WebAssembly using the new LLVM backend (once the backend is ready).

Each step should provide substantial benefits, with no extra effort for developers.
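To show how small the difference between those three steps is from a build script's point of view, here is a sketch in Python of a helper that assembles the emcc argument list for each one. The helper `emcc_args` and its mode names (`"asmjs"`, `"asm2wasm"`, `"wasm-backend"`) are hypothetical; only the `BINARYEN` and `WASM_BACKEND` settings come from the commands shown earlier, and since everything here is in flux, the exact options may change.

```python
def emcc_args(source, output, mode, binaryen_path=None):
    """Build an emcc argument list for one of three (hypothetical) mode names."""
    args = ["emcc", source, "-o", output]
    if mode == "asmjs":
        # Plain asm.js output: no extra settings needed.
        return args
    if mode not in ("asm2wasm", "wasm-backend"):
        raise ValueError(f"unknown mode: {mode}")
    if binaryen_path is None:
        raise ValueError("both WebAssembly modes need a path to Binaryen")
    # Passed as a list (no shell involved), so only the inner quotes remain.
    args += ["-s", f'BINARYEN="{binaryen_path}"']
    if mode == "wasm-backend":
        # Switch from the asm.js backend + asm2wasm to the LLVM backend + s2wasm.
        args += ["-s", "WASM_BACKEND=1"]
    return args
```

A list like this could be handed to `subprocess.run` in a build script; moving a project from one step to the next then amounts to changing the `mode` argument.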

In closing, note that while this post focused on using Binaryen with Emscripten, at its core it is designed to be a general-purpose WebAssembly library in C++: If you want to write something toolchain-related with WebAssembly, you probably need code to read WebAssembly, print it out, an AST to operate on, etc., which Binaryen provides. It was very useful in writing asm2wasm, s2wasm, etc., and hopefully other projects will find it useful as well.

About Alon Zakai

Alon is on the research team at Mozilla, where he works primarily on Emscripten, a compiler from C and C++ to JavaScript. Alon founded the Emscripten project in 2010.



21 comments

  1. Juan Linietsky

    Great news about compiling! Can’t wait for testing it in actual browsers without having to compile them myself :P

    December 17th, 2015 at 11:35

  2. Etiene

    omg omg omg!!! I’ve been waiting for this!! Been so excited! Ever since I heard about webassembly I can’t wait to ship the Lua interpreter to a browser :D Great news, thank you!

    December 17th, 2015 at 13:19

  3. Robin

    Does anyone at all remember the old website assembler.org of circa 2000-2001? It offered several demos of low-level programming for Internet Explorer – referred to as “web assembly” – showcasing very fast graphics effects in the web browser, much faster than what both JavaScript and Flash of the time offered. The site offered several games and graphical demos, all very impressive and with perfect screen updates, no flickering, no jerkiness etc.

    When reading this article it feels as if Mozilla is as usual reinventing, not inventing.

    December 17th, 2015 at 13:27

  4. Tom Marsden

    This is really exciting, nice write-up!

    December 17th, 2015 at 16:18

  5. jiyinyiyong

    Looking forward to a mature ecosystem of WebAssembly and then compile my indentation-based language
    http://cirru.org/ to it :P

    December 17th, 2015 at 19:16

  6. BanMe

    the way back engine might be of assistance in proving this, if they catalogued the site that is.. I remember back in the day coding something like this that wrapped Win32 api into vbscript or javascript, might have to look this up now too.

    December 17th, 2015 at 20:56

  7. Sajid Qureshi

    The specification is hard to follow, due to wordings like “can be”, “shall be” etc., also a mix of future, present and past sentence syntax.

    For a specification, rigor has to be applied to avoid prolonging draft finalization.

    Gives an impression of too much being deferred, especially without a concrete outline of features relevant for an execution environment.

    Dividing the run time (address space) into a 32 and a 64 bit is not necessary. Too much speculation about data type sizes; sizes have to be defined and possible to query at run time for the code to adapt.

    Shared memory support can not be deferred without later major headaches and should be uniform across 32, 64 bit and cloud run times / address spaces – catering for split executions without resorting to network endpoints or pipes.

    Across thread communication with Message passing will severely degrade co-routines and multi-threading data processing performance.

    For IO purposes, consider memory mapping IO, Unix like IOCTL and sockets as part of the specification together with a set of libraries implementors have to customize and/or adapt.

    December 18th, 2015 at 03:08

  8. About Time

    It’s about time! I’ve been watching the whole “compile to JS” movement in horror for some time now.

    December 18th, 2015 at 06:25

  9. Steve Naidamast

    Oh, here we go again…

    We started using interpretive code in the 1980s. Then we began using binaries back in the 1990s. Then Java came about followed by .NET and we were back to interpretive code.

    Now Microsoft has announced .NET Native, which is a compiler and here we have Mozilla with WebAssemblies.

    The more things change the more they stay the same. We should have stayed with compiled binaries. They are faster and more secure. But no! Everyone had to use all the new cool tools…

    December 18th, 2015 at 08:43

    1. Alon Zakai

      @Steve

      To be clear, this isn’t a Mozilla-only project. As mentioned in the article, it’s all browser vendors: Mozilla, Google, Microsoft, and Apple.

      And yes, there might be some deja vu from history here. But browser vendors are appreciating that there are just benefits from a binary format that the web can’t get otherwise. So yes, binaries have been done before, but it makes sense to do them now for the web platform. Of course, while keeping the web cross-vendor, portable, open, and secure – we just want the benefits of binaries, not the downsides.

      December 18th, 2015 at 09:11

  10. Steve Naidamast

    @Alon Zakai

    It doesn’t matter who is doing this process. If you watch the trends as I often do, you will find that there are indications that the field is very slowly beginning to look to go to executable binaries again. This is not a big deal for either the .NET or Java platforms.

    However, you cannot make a binary cross-platform compatible, but for web applications it doesn’t matter since these components will only operate on the server-side.

    In any case, this is being done for profits. Eventually, you will be seeing advertisements for the advantages of using compiled executable so everyone will have to get ready and move to a “new” platform yet again.

    Being in the field as long as I have been you get to see that the trends always repeat themselves. Besides, the field has become a bloody mess since 2010…

    December 18th, 2015 at 12:19

  11. Jim

    The open source world is having its ActiveX moment

    December 19th, 2015 at 01:21

  12. hutzlibu

    @Steve Naidamast
    Well, even though I am not “as long in the field” as you are, I think it is more like spiral stairs …

    Lots of deja vu, but on a higher (or at least, different) level.

    The web has become a very important place, so naturally it is wise to make it as fast as possible.
    And it doesn’t really matter, when there are in theory better ways of doing things, when you want to get things done in the real life, now.

    So nobody claims, you should target wasm for everything, but for those who want to target the web – they will get an even more powerful tool, soon – to which I am looking forward, with a smile.

    So to Alon Zakai and the others involved: thank you for your great work!

    December 19th, 2015 at 01:33

  13. Francis Kim (@franciskim_co)

    Can’t wait!

    December 19th, 2015 at 19:34

  14. BaasBartels

    @Robin not sure if you’re trolling or being serious, however according to the info page on assembler.org, the website and the demos showcased were built using “dynamic DHTML” (sic) a.k.a. html + javascript + css + dom. They were also meant to be cross-platform between IE4+ and Netscape 4+.
    (https://web.archive.org/web/20010622081048/http://assembler.org/XLAT/info/index.html)

    December 20th, 2015 at 01:03

  15. Dusty

    I’m really interested in this tech, this could really advance the whole ecosystem, please make it happen! :-)

    December 21st, 2015 at 01:19

  16. Jim Lonero

    How well does this handle pointers, the STD library classes, and smart pointers?

    December 21st, 2015 at 09:12

    1. Alon Zakai

      All of those should work correctly, and are in the test suite that we pass.

      For STD library classes, smart pointers, etc., they work because we use the existing stable Emscripten standard libraries.

      December 21st, 2015 at 10:15

  17. JS

    I love it!

    December 28th, 2015 at 06:44

  18. Ravenmetrix

    We’re excited about this major update! I’ve already asked my technical team to check Emscripten and how WebAssembly can affect / benefit our current and future projects

    January 2nd, 2016 at 07:32

  19. AndyX

    What a mess! It’s as if web development is imploding. Businesses need SOME stability in order to operate. Potential developers need SOME idea of what to learn. How is anyone going to be able to plan for their personal or business life when everything is in such chaos?!

    January 6th, 2016 at 19:24
