WebGL Articles

  1. Unity games in WebGL: Owlchemy Labs’ conversion of Aaaaa! to asm.js

    You may have seen the big news today, but for those who’ve been living in an Internet-less cave, starting today through October 28 you can check out the brand spankin’ new Humble Mozilla Bundle. The crew here at Owlchemy Labs were given the unique opportunity to work closely with Unity, maker of the leading cross-platform game engine, and Humble to attempt to bring one of our games, Aaaaa! for the Awesome, a collaboration with Dejobaan Games, to the web via technologies like WebGL and asm.js.

    I’ll attempt to enumerate some of the technical challenges we hit along the way as well as provide some tips for developers who might follow our path in the future.

    Unity WebGL exporter

    Working with pre-release alpha versions of the Unity WebGL exporter (now in beta) was a surprisingly smooth experience overall! Jonas Echterhoff, Ralph Hauwert and the rest of the team at Unity did an amazing job getting the core engine running with asm.js and playing Unity content in the browser at incredible speeds; it was pretty staggering. When you look at the scope of the problem and the technical magic needed to go all the way from C# scripting down to the final 1-million-plus-line .js file, the technology is mind-boggling.

    Thankfully, Unity has taken care of the heavy lifting under the hood, allowing us as content creators and game developers to stop worrying about getting our games to compile for this new build target. So did we just hit the big WebGL export button and sit back while Unity cranked out the HTML and JS? Well, it’s a bit more involved than that, but it’s certainly better than some of the prior early-stage ports we’ve done.

    For example, our experience with bringing a game through the now defunct Unity to Stage3D/Flash exporter during the Flash in a Flash contest in late 2011 was more like taking a machete to a jungle of code, hacking away core bits, working around inexplicably missing core functionality (no generic lists?!) and making a mess of our codebase. WebGL was a breeze comparatively!

    The porting process

    Our porting process began in early June of this year when we gained alpha access to the WIP WebGL exporter to prove whether a complex game like Aaaaa! for the Awesome was going to be portable within a relatively short time frame with such an early framework. After two days of mucking about with the exporter, we knew it would be doable (and had content actually running in-browser!) but as with all tech endeavors like this, we were walking in blind as to the scope of the entire port that was ahead of us.

    Would we hit one or two bugs? Hundreds? Could it be completed in the short timespan we were given? Thankfully we made it out alive and dozens of bug reports and fixes later, we have a working game! Devs jumping into this process now (October 2014 and onward) fortunately get all of these fixes built in from the start and can benefit from a much smoother pipeline from Unity to WebGL. The exporter has improved by a huge amount since June!

    Initial issues

    We came across some silly issues that were caused either by our project’s upgrade from Unity 4 to Unity 5 or simply by the exporter being in such “early days”. Fun little things such as all mouse cursor coordinates being inexplicably inverted caused some baffled faces, but had of course been fixed at the time of writing. We also hit some physics-related bugs that turned out to have been caused by the Unity 4 to Unity 5 upgrade — this led to a hilarious bug where players wouldn’t smash through score plates and get points but instead slammed into score plates as if they were made of concrete, instantly crushing the skydiving player. A fun new feature!

    Additionally, we came across a very hard-to-track-down memory leak bug that only exhibited itself after playing the game for an extended session. With a hunch that the leak revolved around scene loading and unloading, we built a hands-off repro case that loaded and unloaded the same scene hundreds of times, causing the crash and helping the Unity team find and fix the leak! Huzzah!

    Bandwidth considerations

    The above examples are fun to talk about but have essentially been solved by this point. That leaves developers with two core development issues that they’ll need to keep in mind when bringing games to the Web: bandwidth considerations, and form factor / user experience changes.

    Aaaaa! is a great test case for a worst-case scenario when it comes to file size. We have a game with over 200 levels or zones, over 300 level assets that can be spawned at runtime in any level, 48 unique skyboxes (6 textures per sky!), and 38 full-length songs. Our standalone PC/Mac build weighs in at 388 MB uncompressed. Downloading almost 400 megabytes to get to the title screen of our game would be completely unacceptable!

    In our case, we were able to rely on Unity’s build process to efficiently strip and pack the build into a much smaller size, but we also took advantage of Unity’s AudioClip streaming solution to stream in our music at runtime on demand! The file size savings from streaming music were huge, and it’s highly recommended for all Unity games. To glean additional file size savings, Asset Bundles can be used for loading levels on demand, but they are best used in simple games or when building games from the ground up with the web in mind.

    In the end, our final *compressed* WebGL build size, which includes all of our loaded assets as well as the Unity engine itself, ended up weighing in at 68.8 MB, compared to a *compressed* standalone size of 192 MB, making it almost 3x smaller than our PC build!

    Form factor/user experience changes

    User experience considerations are the other important factor to keep in mind when developing games for the Web or porting existing games to be fun, playable Web experiences. Examples of respecting the form factor of the Web include avoiding “sacred” key presses, such as Escape. Escape is used as pause in many games, but many browsers eat the Escape key and reserve it for exiting full-screen mode or releasing mouse lock. Mouse lock and full-screen are both important for creating fully-fledged gaming experiences on the web, so you’ll want to find a way to re-bind keys to avoid these special key presses that are off-limits in the browser.
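
    To make that concrete, here’s a rough sketch of requesting full-screen and pointer lock from a user gesture (unprefixed APIs shown; 2014-era browsers needed moz/webkit prefixes, and the canvas id and togglePause function here are hypothetical):

    var canvas = document.getElementById("game");

    canvas.addEventListener("click", function () {
      // Both APIs must be called from a user gesture such as a click
      if (canvas.requestFullscreen) {
        canvas.requestFullscreen();
      }
      if (canvas.requestPointerLock) {
        canvas.requestPointerLock();
      }
    });

    // Escape is reserved for leaving full-screen / releasing the pointer,
    // so bind pause to another key instead
    document.addEventListener("keydown", function (e) {
      if (e.key === "p") {
        togglePause(); // hypothetical pause function in your game
      }
    });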

    Secondly, you’ll want to remember that you’re working within a sandboxed environment on the Web, so loading custom music from the user’s hard drive or saving large files locally can be problematic. It’s worth evaluating which features in your game should be modified to fit the Web experience vs. the desktop experience.

    Players also notice the little things that tip them off that a game is a rushed port. For example, if you have a quit button on the title screen of your PC game, you should definitely remove it in your web build, as quitting is not a paradigm used on the Web. At any point the user can simply navigate away from the page, so watch out for elements in your game that don’t fit the current web ecosystem.

    Lastly, you’ll want to think about ways to allow your data to persist across multiple browsers on different machines. Gamers don’t always sit at the same machine to play their games, which is why many services offer cloud save functionality. The same goes for the Web, and if you can build a system (like the one the wonderfully talented Edward Rudd created for the Humble Player), it will improve the overall web experience for the player.

    Bringing games to the Web!

    So with all of that being said, the Web seems like a very viable place to bring Unity content as the WebGL exporter solidifies. You can expect Owlchemy Labs to bring more of their games to the Web in the near future, so keep an eye out for those! ;) With our content running at almost the same speed as native desktop builds, we definitely have a revolution on our hands when it comes to portability of content, empowering game developers with another outlet for their creative work, which is always a good thing.

    Thanks to Dejobaan Games, the team at Humble Bundle, and of course the team at Unity for making all of this possible!

  2. Blend4Web: the Open Source Solution for Online 3D

    Half a year ago, Blend4Web was first released publicly. In this article I’ll show what Blend4Web is, how it has evolved and how it can be used for web development.

    What Is Blend4Web?

    In short, Blend4Web is an open source framework for creating 3D web applications. It uses Blender – the popular open source 3D modeling suite – as the primary authoring tool. 3D graphics are rendered by means of WebGL, which is also an open standard technology. The two main keywords here – Blender and Web(GL) – explain the purpose of this engine perfectly.

    The full source code of Blend4Web together with some usage examples is available under GPLv3 on GitHub (there is also a commercial licensing option).

    The 3D Web

    On June the 2nd Apple presented their new operating systems – OS X Yosemite and iOS 8 – both featuring WebGL support in their Safari browser. That marked the end of a 5-year cycle during which the WebGL technology had been evolving, starting with the first unstable browser builds (does anybody remember the Firefox 3.7 alpha?). Now, all the major browsers on all desktop and mobile systems support this open standard for rendering 3D graphics, everywhere, without any plugins.

    That was a long and difficult road, along which Blend4Web development followed WebGL development like a shadow. Broken rendering, tab crashes, security “warnings” from some big guys, unavailability in public browser builds, all sorts of fear, uncertainty and doubt. None of this mattered, because we had the opportunity to do 3D graphics (and sound) in browsers!

    Blender

    The first Blender 2.5x builds appeared in summer 2010. At the time we, the programming geeks, were pushed to learn the basics of 3D modeling by the beautiful Sintel from the open source movie of the same name. After choosing Blender, we could be as independent as possible, with a full open source pipeline organized on a Linux platform. Blender gave us the power to make our own 3D scenes, and later helped to attract talented artists from its wonderful community to join us.

    Blend4Web Evolution in Demos

    Our demo scenes matured together with the development of Blend4Web. The first one was quite a low-poly and almost non-interactive demo called The Island. It was created in 2011 and polished a bit before the public release. In this demo we introduced our Blender-based pipeline, in which all the assets are stored in separate files and are linked into the main file for level design and further exporting (for this reason some Blend4Web users call it “free Unity Pro”).

    In Fashion Show we developed cloth animation techniques. Some post-processing effects, dynamic reflection and particle systems were added later. After Blend4Web went public, we summarized these cloth-related tricks in one of our tutorials.

    The Farm is a huge scene (by browser standards): over 25 hectares of land, buildings, animated animals and foliage. We added some gamedev elements to it, including first-person walking, interacting with objects, and driving a vehicle. The demo features spatial audio (via Web Audio) and physics (via Bullet and asm.js). The Freedesktop folks tried it as a benchmark while testing the Mesa drivers (and got “massive crashes” :).

    We also tried some visualization and created Nature Morte. In this scene we used carefully crafted textures and materials, as well as post-processing effects to improve realism. However, the technology used for this demo was quite simple and old-school, as we had no support for visual shader editing yet.

    Things changed when Blender’s node materials became available to our artists. They created over 40 different materials for the Sports Car model: chromed metal, painted metal, glass, rubber, leather etc.

    In our latest release we went even further by adding support for user-controlled animation. Now interactivity can be implemented without any coding. To demonstrate the possibilities this opens up, we presented an interactive infographic of a light helicopter.

    Among the other possible applications of this simple yet effective tool (called NLA Script) we can list the following: interactive 3D web design, product promotions, learning materials, cartoons with the ability to choose between different story lines, point-and-click games and any other applications previously created with Flash.

    Using Blend4Web

    It is very easy to start using Blend4Web – just download and install the Blender addon as shown in this video tutorial.

    The most wonderful thing is that your Blender scene can be exported into a self-contained HTML file that can be emailed, uploaded to your own website or to a cloud – in short shared however you like. This freedom is a fundamental difference from numerous 3D web publishing services as we don’t lock our users to our technology by any means.

    For those who want to create highly interactive 3D web apps we offer the SDK. Some notable examples of what is possible with the Blend4Web API are demonstrated in our programming tutorials, ranging from web design to games.

    Programming 3D web apps with Blend4Web is not much harder than building average RIAs. Unlike some other WebGL frameworks in the wild we tried to offload all graphics, animation and audio tasks to respective professionals. The programmer just loads the scene…

    // Load the scene file exported from Blender; load_cb is called when it's ready
    var m_data = require("data");
    m_data.load("example.json", load_cb);

    …and then writes the logic which triggers the 3D scene changes that are “hard-coded” by the artists, e.g. plays the animation for the user-selected object:

    var m_scenes = require("scenes");
    var m_anim = require("animation");
     
    // Pick the object under the cursor (event comes from a mouse event handler)
    var myobj = m_scenes.pick_object(event.clientX, event.clientY);
    m_anim.apply_def(myobj);  // apply the object's default animation
    m_anim.play(myobj);
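
    For context, a minimal way to drive that snippet from a real mouse event might look like this (the canvas element id is illustrative and depends on how your page is set up):

    var canvas = document.getElementById("canvas3d");

    canvas.addEventListener("mousedown", function(event) {
        // Find the object under the cursor, if any, and play its
        // artist-defined default animation
        var obj = m_scenes.pick_object(event.clientX, event.clientY);
        if (obj) {
            m_anim.apply_def(obj);
            m_anim.play(obj);
        }
    });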

    As you can see the APIs are structured in a CommonJS way which we believe is important for creating compact and fast web apps.

    The Future

    There are many possible directions in which the Internet and IT may go, but there is no doubt that the strong and steady development of the 3D Web is already underway. We expect that more and more users will change their expectations about how web content should look and feel. We’re gonna help web developers meet these demands, with plans to improve usability and performance and to implement new and interesting graphics effects.

    We also follow the development of WebGL 2.0 (thanks, Mozilla, for your work) and expect to create even more nice things on top of it.

    Stay Tuned

    Read our blog, join us on Twitter, Google+, Facebook and Reddit, watch the demos and tutorials on our YouTube channel, fork Blend4Web at GitHub.

  3. Inside the Party Bus: Building a Web App with Multiple Live Video Streams + Interactive Graphics

    Gearcloud Labs is exploring the use of open technologies to build new kinds of shared video experiences. Party Bus is a demo app that mixes multiple live video streams together with interactive graphics and synchronized audio. We built it using a combination of node.js, WebSockets, WebRTC, WebGL, and Web Audio. This article shares a few things we learned along the way.

    User experience

    First, take a ride on the Party Bus app to see what it does. You need Firefox or Chrome plus a decent GPU, but if that’s not handy you can get an idea of the app by watching the example video on YouTube.

    Since the app uses WebRTC getUserMedia(), you have to give permission for the browser to use your camera. After it starts, the first thing you’ll notice is your own video stream mounted to a seat on the 3D bus (along with live streams from any other concurrent riders). In most scenes, you can manipulate the bus in 3D using the left mouse (change camera angle), scroll wheel (zoom in/out), and right mouse (change camera position). Also try the buttons in the bottom control bar to apply effects to your own video stream: from changing your color, to flipping yourself upside down, bouncing in your seat, etc.

    How party bus uses WebRTC

    Party Bus uses WebRTC to set up P2P video streams needed for the experience. WebRTC does a great job supporting native video in the browser, and punching out firewalls to enable peer connections (via STUN). But with WebRTC, you also need to provide your own signaler to coordinate which endpoints will participate in a given application session.

    The Party Bus app uses a prototype platform we built called Mixology to handle signaling and support the use of dynamic peer topologies. Note that many apps can simply use peer.js, but we are using Mixology to explore new and scalable approaches for combining large numbers of streams in a variety of different connection graphs.

    For example, if a rider joins a bus that already has other riders, the system takes care of building the necessary connection paths between the new rider and peers on the same bus, and then notifying all peers through a WebSocket that the new rider needs to be assigned a seat.

    Specifically, clients interact with the Mixology signaling server by instantiating a Mixology object

    var m = new Mixology(signalerURL);

    and then using it to register with the signaler

    m.register(['mix-in'], ['mix-out']);

    The two arguments give specific input and output stream types supported by the client. Typing inputs and outputs in this way allows Mixology to assemble arbitrary graphs of stream connections, which may vary depending on application requirements. In the case of Party Bus, we’re just using a fully connected mesh among all peers. That is, all clients register with the same input and output types.

    The signaler is implemented as a node.js application that maintains a table of registered peers and the connections among them. The signaler can thus take care of handling peer arrivals, departures, and other events — updating other peers as necessary via callback functions. All communication between peers and the signaler is implemented with WebSockets, via socket.io.
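
    Mixology itself isn’t public, but a stripped-down signaler along these lines would look something like this (the event names and message shapes here are illustrative, not the actual Mixology protocol):

    var io = require("socket.io")(8080);

    var peers = {}; // socket.id -> { inputs: [...], outputs: [...] }

    io.on("connection", function (socket) {
      socket.on("register", function (msg) {
        // Record the peer's declared stream types in the topology table
        peers[socket.id] = { inputs: msg.inputs, outputs: msg.outputs };

        // Notify everyone else so they can start WebRTC offers/answers
        socket.broadcast.emit("peerRegistered", { peerId: socket.id });
      });

      socket.on("disconnect", function () {
        delete peers[socket.id];
        socket.broadcast.emit("peerLeft", { peerId: socket.id });
      });
    });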

    For example, when a new peer is registered, the server updates the topology table, and uses a callback function to notify other peers that need to know about the new connection.

    m.onPeerRegistered = function(peer) { ... }

    In this function, peers designated to send streams initiate the WebRTC offer code. Peers designated to receive streams initiate the WebRTC answer code (as well as provide a callback function onAddStream() to be used when the new input stream is ready).

    In the case of Party Bus, it’s then up to the app to map the new video stream to the right seat in the 3D bus model, and from then on, apply the necessary 3D transforms using three.js. Similarly, if a rider leaves the bus, the system takes care of notifying other clients that a peer has exited, so they can take appropriate action to remove what would otherwise be a dead video stream in the display.

    Party Bus organizes the “riders” on a bus using an array of virtual screens:

    var vsArray = new Array(NUM_SEATS);

    After registering itself with Mixology, the app receives a callback whenever a new peer video stream becomes available for its bus instance:

    function onAddStream(stream, peerId) {
        var i = getNextOpenSeat();                // find a free seat on the bus
        vsArray[i] = new VScreen(stream, peerId); // wrap the stream in a virtual screen
    }

    The Party Bus app creates a virtual screen object for every video stream on the current bus. The incoming streams are associated with DOM video objects in the virtual screen constructor:

    function VScreen(stream, id) {
        var v = document.createElement('video');
        v.setAttribute('id', 'monitor:' + id);
        v.style.visibility = 'hidden';
        v.src = window.URL.createObjectURL(stream);  // binds stream to DOM video object
        v.autoplay = true;
        document.body.appendChild(v);
    }

    Movie or app?

    Party Bus uses three.js to draw a double-decker bus, along with virtual screens “riding” in the seats. The animation loop runs about two minutes and consists of about a dozen director “shots”. Throughout the demo, the individual video screens are live and can be manipulated by each rider. The overall sequence of shots is designed to change scene lighting and present other visual effects, such as the bus thrusters, which were created with Stemkoski’s particle engine.

    Party Bus is a web app, but the animation is programmed so the user can just let it run like a movie. The curious user may try to interact with it, and find that in most scenes it’s also possible to change the 3D view. However, in shots with a moving camera or bus, we found it necessary to block certain camera translations (movements in x, y, z position), or rotations (turning on x, y, z axis) — otherwise, the mouse will “fight” the program, resulting in a jerky presentation.

    But most of the fun in Party Bus is just hamming it up for the camera, applying visual effects to your own stream, and looking for other live riders on the bus.

    More info

    For more information on the Party Bus app, or to stay in the loop on development of the Mixology platform, please check out www.gearcloudlabs.com.

  4. Flambe Provides Support For Firefox OS

    Flambe is a performant cross-platform open source game engine based on the Haxe programming language. Games are compiled to HTML5 or Flash and can be optimized for desktop or mobile browsers. The HTML5 renderer uses WebGL but falls back to Canvas, and functions nicely even on low-end phones. Flash rendering uses Stage3D, and native Android and iOS apps are packaged using Adobe AIR.

    Flambe provides many other features, including:

    • simple asset loading
    • scene management
    • touch support
    • complete physics library
    • accelerometer access

    It has been used to create many of the Nickelodeon games available at nick.com/games and m.nick.com/games. To see other game examples, and some of the other well-known brands making use of the engine, have a look at the Flambe Showcase.

    In the last few weeks, the developers of the Flambe engine have been working to add support for Firefox OS. With the 4.0.0 release of Flambe, it is now possible to take Flambe games and package them into publication-ready Firefox OS applications, complete with manifest.

    Firefox Marketplace Games

    To get an idea of what is possible with the Flambe engine on the Firefox OS platform, take a look at two games that were submitted recently to the Firefox Marketplace. The first — The Firefly Game written by Mark Knol — features a firefly that must navigate through a flock of hungry birds. The game’s use of physics, sound and touch is very effective.

    The second game, entitled Shoot’em Down, tests the player’s ability to dodge fire while shooting down as many enemy aircraft as possible. The game was written by Bruno Garcia, who is the main developer of the Flambe engine. The source for this game is available as one of the engine’s demo apps.

    Building a Firefox OS App using Flambe

    Before you can begin writing games using the Flambe engine, you will need to install and setup a few pieces of software:

    1. Haxe. Auto installers are available for OS X, Windows and Linux on the download page.
    2. Node.js for building projects. Version 0.8 or greater is required.
    3. A Java runtime.

    Once those prerequisites are met, you can run the following command to install Flambe:

    # Linux and Mac may require sudo
    npm install -g flambe
    flambe update

    This will install Flambe and you can begin writing apps with the engine.

    Create a Project

    To create a new project, run the following command.

    flambe new <ProjectName>

    This will create a directory named whatever you supplied for ProjectName. In this directory you will have several files and other directories for configuring and coding your project. By default the new command creates a very simple project that illustrates loading and animating an image.

    A YAML file (flambe.yaml) within the project directory defines several characteristics of the project for build purposes. This file contains tags for the developer, name and version of the app, and other project metadata, such as a description. In addition it contains the main class name that serves as the entry point to your application. This tag needs to be set to a fully qualified Haxe class name, i.e., if you use a package name in your Haxe source file, you need to prepend the package name in this tag like this: packagename.Classname. (The default example uses urgame.Main.) You can also set the orientation for your app within the YAML file.

    Of specific note for Firefox OS developers, a section of the YAML file contains a partial manifest.webapp that can be altered. This data is merged into a complete manifest.webapp when the project is built.
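
    Put together, a flambe.yaml sketch might look something like the following (the authoritative template is the one flambe new generates, so treat these field names as illustrative):

    # A hypothetical sketch only; consult the flambe.yaml that
    # `flambe new` generates for the real field names
    name: UrGame
    version: 1.0.0
    main: urgame.Main   # fully qualified entry-point class

    # ...plus the section holding the partial manifest.webapp that is
    # merged into the final manifest when building for Firefox OS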

    The main project folder also contains a directory for assets (images, sounds, animations, and particle effects files). The icons folder contains the icons that will be used with your app. The src folder contains the Haxe source code for your application.

    Build the Project

    Flambe provides a build method to compile your code to the appropriate output. To build the app run:

    flambe build <output>

    Where output is html, flash, android, ios, or firefox. Optionally you can add the --debug option to the build command, producing output more suitable for debugging. For Firefox OS this will produce non-minified JavaScript files. The build process will add a build directory to your application. Inside the build directory, a firefox directory will be created containing your Firefox OS app.

    Debug the Project

    You can debug your application in the Firefox App Manager. See Using the App Manager for details on installing and debugging using the App Manager. Within the App Manager you can add the built app using the Add Packaged App button and selecting the ProjectName/build/firefox directory. Debugging for other platforms is described in the Flambe documentation.

    The --debug option can provide additional insight for debugging and performance tuning. In addition to being able to step through the generated JavaScript, Flambe creates a source map that allows you to look through the original Haxe files while debugging.

    To see the original Haxe files in the debugger, select the Debugger options icon in the far right corner of the debugger and choose Show Original Sources.

    Also, when using the --debug option you can use a shortcut key (Ctrl + O) to initiate a view of your app that illustrates overdraw — this measures the number of times a pixel is drawn in a frame. The brighter the pixel, the more times it is being drawn. By reducing the amount of overdraw, you should be able to improve the performance of your game.

    A Bit about Haxe and Flambe

    Haxe is an object-oriented, class-based programming language that can be compiled to many other languages. In Flambe, your source code needs to be written using Haxe-specific syntax. Developers familiar with Java, C++ or JavaScript will find learning the language relatively straightforward. The Haxe website contains a reference guide that nicely documents the language. For editing, there are many options available for working with Haxe. I am using Sublime with the Haxe plugin.

    Flambe offers some additional classes that need to be used when building your app. To get a better understanding of these classes, let’s walk through the simple app that is created when you run the flambe new command. The Main.hx file created in the source directory contains the Haxe source code for the Main Class. It looks like this:

    package urgame;
     
    import flambe.Entity;
    import flambe.System;
    import flambe.asset.AssetPack;
    import flambe.asset.Manifest;
    import flambe.display.FillSprite;
    import flambe.display.ImageSprite;
     
    class Main
    {
      private static function main ()
      {
        // Wind up all platform-specific stuff
        System.init();
     
        // Load up the compiled pack in the assets directory named "bootstrap"
        var manifest = Manifest.fromAssets("bootstrap");
        var loader = System.loadAssetPack(manifest);
        loader.get(onSuccess);
      }
     
      private static function onSuccess (pack :AssetPack)
      {
        // Add a solid color background
        var background = new FillSprite(0x202020, System.stage.width, System.stage.height);
        System.root.addChild(new Entity().add(background));
     
        // Add a plane that moves along the screen
        var plane = new ImageSprite(pack.getTexture("plane"));
        plane.x._ = 30;
        plane.y.animateTo(200, 6);
        System.root.addChild(new Entity().add(plane));
      }
    }

    Haxe Packages and Classes

    The package keyword provides a way for classes and other Haxe data types to be grouped and addressed by other pieces of code, organized by directory. The import keyword is used to include classes and other Haxe types within the file you are working with. For example, import flambe.asset.Manifest will import the Manifest class, while import flambe.asset.* will import all types defined in the asset package. If you try to use a class that you have not imported into your code and run the build command, you will receive an error message stating that the particular class could not be found. All of the Flambe packages are documented on the Flambe website.

    Flambe Subsystem Setup and Entry point

    The main function is similar to other languages’ and acts as the entry point into your app. Flambe applications must have exactly one main function per application. In the main function the System.init() function is called to set up all the subsystems that will be needed by your code and the Flambe engine.

    Flambe Asset Management

    Flambe uses a dynamic asset management system that allows images, sound files, etc. to be loaded very simply. In this particular instance the fromAssets function defined in the Manifest class examines the bootstrap folder located in the assets directory to create a manifest of all the available files. The loadAssetPack System function creates an instance of the AssetPack based on this manifest. One of the functions of AssetPack is get, which takes a function parameter to call when the asset pack is loaded into memory. In the default example, the only asset is an image named plane.png.

    Flambe Entities and Components

    Flambe uses an abstract concept of Entities and Components to describe and manipulate game objects. An Entity is essentially just a game object with no defining characteristics. Components are characteristics that are attached to entities. For example, an image component may be attached to an entity. Entities are also hierarchical and can be nested. For example, entity A can be created and an image could be attached to it. Entity B could then be created with a different image. Entity A could then be attached to the System root (the top-level Entity) and Entity B could be attached to Entity A or the System root. The entity nesting order is used for rendering order, which can be used to make sure smaller visible objects are not obscured by other game objects.

    Creating Entities and Components in the Sample App

    The onSuccess function in the default sample is called by the loader instance after the AssetPack is loaded. The function first creates an instance of a FillSprite Component, which is a rectangle defined by the size of the display viewport width and height. This rectangle is colored using the hex value defined in the first parameter. To actually have the FillSprite show up on the screen you first have to create an Entity and add the Component to it. The new Entity().add(background) method first creates the Entity and then adds the FillSprite Component. The entire viewport hierarchy starts at the System.root, so the addChild command adds this new Entity to the root. Note this is the first Entity added and it will be the first rendered. In this example this entity represents a dark background.

    Next the plane image is created. This is done by passing the loaded plane image to the ImageSprite Component constructor. Note that the AssetPack class’s getTexture method is being used to retrieve the loaded plane image. The AssetPack class contains methods for retrieving other types of Assets as well. For example, to retrieve and play a sound you would use pack.getSound("bounce").play();.

    Flambe Animated Data Types

    Flambe wraps many of the default Haxe data types in classes and introduces a few more. One of these is the AnimatedFloat class. This class essentially wraps a float and provides some utility functions that allow the float to be altered in a specific way. For example, one of the functions of the AnimatedFloat class is named animateTo, which takes parameters to specify the final float value and the time in which the animation will occur. Many components within the Flambe system use AnimatedFloats for property values. The plane that is loaded in the default application is an instance of the ImageSprite Component. Its x and y placement values are actually AnimatedFloats. AnimatedFloat values can be set directly but special syntax has to be used (value._).

    In the example, the x value for the ImageSprite is set to 30 using this syntax: plane.x._ = 30;. The y value for the ImageSprite is then animated to 200 over a 6 second period. The x and y values for an ImageSprite represent the upper left corner of the image when placed into the viewport. You can alter this using the centerAnchor function of the ImageSprite class. After this call, the x and y values will be in reference to the center of the image. While the default example does not do this, it could be done by calling plane.centerAnchor();. The final line of code just creates a new Entity, adds the plane Component to the Entity and then adds the new Entity to the root. Note that this is the second Entity added to the root and it will render after the background is rendered.

    Flambe Event Model

    Another area of Flambe that is important to understand is its event model. Flambe uses a signal system where the subsystems, Components and Entities have available signal properties that can be connected to in order to listen for a specific signal event. For example, resizing the screen fires a signal. This event can be hooked up using the following code.

    System.stage.resize.connect(function () {
      // do something on resize
    });

    This is a very nice feature when dealing with other components within apps. For example, to do something when a user either clicks on or touches an ImageSprite within your app you would use the following code:

    //ImageSprite Component has pointerDown signal property
    myBasketBallEntity.get(ImageSprite).pointerDown.connect(function (event) {
        bounceBall();
    });

    In this case the pointerDown signal is fired when a user either uses a mouse down or touch gesture.

    Demo Apps

    The Flambe repository also contains many demo apps that can be used to further learn the mechanics and APIs for the engine. These demos have been tested on Firefox OS and perform very well. Pictured below are several screenshots taken on a Geeksphone Keon running Firefox OS.

    Of particular note in the demos are the physics and particles demos. The physics demo uses the Nape Haxe library and allows for some very cool environments. The Nape website contains documentation for all the packages available. To use this library you need to run the following command:

    haxelib install nape

    The particle demo illustrates using particle descriptions defined in a PEX file within a Flambe-based game. PEX files can be defined using a particle editor, like Particle Designer.

    Wrapping Up

    If you are a current Flambe game developer with one or more existing games, why not use the new version of the engine to compile and package them for Firefox OS? If you are a Firefox OS developer looking for a great way to develop new games for the platform, Flambe offers an excellent means for developing engaging, performant games for Firefox OS, and many other platforms besides!

    And, if you are interested in contributing to Flambe, we’d love to hear from you as well.

  5. Lessons learnt building ViziCities

    Just over 2 weeks ago Peter Smart and Robin Hawkes released the first version of ViziCities to the world. It’s free to use and open-sourced under an MIT license.

    In this post I will talk to you about the lessons learnt during the development of ViziCities. From application architecture to fine-detailed WebGL rendering improvements, we learnt a lot in the past year and we hope that by sharing our experiences we can help others avoid the same mistakes.

    What is ViziCities?

    In a rather geeky nutshell, ViziCities is a WebGL application that allows you to visualise anywhere in the world in 3D. Its primary purpose is to look at urban areas, though it’ll work perfectly fine in other places if they have decent OpenStreetMap coverage.

    Demo

    The best way to explain what ViziCities does is to try it out yourself. You’ll need a WebGL-enabled browser and an awareness that you’re using pre-alpha quality software. Click and drag your way around, zoom in using the mouse wheel, and rotate the camera by clicking the mouse wheel or holding down shift while clicking and dragging.

    You can always take a look at this short video if you’re unable to try the demo right now.

    What’s the point of it?

    We started the project for multiple reasons. One of those reasons is that it’s an exciting technical and design challenge for us – both Peter and I thrive on pushing ourselves to the limits by exploring something unknown to us.

    Another reason is that we were inspired by the latest SimCity game and how it visualises data about the performance of your city – in fact, Maxis, the developers behind SimCity, reached out to us to tell us how much they like the project!

    There’s something exciting about creating a way to do that for real-world cities rather than fictional ones. The idea of visualising a city in 3D with up-to-date data about that city overlaid is an appealing one. Imagine if you could see census data about your area, education data, health data, crime data, property information, live transport (trains, buses, traffic, etc) – you’d be able to learn and understand so much more about the place you live.

    This is really just the beginning – the possibilities are endless.

    Why 3D?

    A common question we get is “Why 3D?” – the short answer, beyond “because it’s a visually interesting way of looking at a city”, is that 3D allows you to do things and analyse data in ways that you can’t in a 2D map. For example, by using 3D you can take height and depth into consideration, so you can better visualise the sheer volume of stuff that lies above and below you in a city – things like pipes and underground tunnels, or bridges, overpasses, tall buildings, the weather, and planes! On a 2D map, looking at all of this would be a confusing mess; in 3D you get to see it exactly how it would look in the real world – you can easily see how objects within a city relate to each other.

    Core technology

    At the most basic level ViziCities is built using Three.js, a WebGL library that abstracts all the complexity of 3D rendering in the browser. We use a whole range of other technologies too, like Web Workers, each of which serves a specific purpose or solves a specific problem that we’ve encountered along the way.

    Let’s take a look at some of those now.

    Lessons learnt

    Over the past year we’ve come from knowing practically nothing about 3D rendering and geographic data visualisation, to knowing at least enough about each of them to be dangerous. Along the way we’ve hit many roadblocks and have had to spend a significant amount of time working out what’s wrong and coming up with solutions to get around them.

    The process of problem solving is one I absolutely thrive on, but it’s not for everybody and I hope that the lessons I’m about to share will help you avoid these same problems, allowing you to save time and do more important things with your life.

    These lessons are in no particular order.

    Using a modular, decoupled application architecture pays off in the long run

    We originally started out with hacky, prototypal experiments that were dependency heavy and couldn’t easily be pulled apart and used in other experiments. Although it allowed us to learn how everything worked, it was a mess and caused a world of pain when it came to building out a proper application.

    In the end we re-wrote everything based on a simple approach using the Constructor Pattern and the prototype property. Using this allowed us to separate out logic into decoupled modules, making everything a bit more understandable whilst also allowing us to extend and replace functionality without breaking anything else (we use the Underscore _.extend method to extend objects).

    Here’s an example of our use of the Constructor Pattern.
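
    A simplified sketch of the shape of such a module (not the actual ViziCities source) might be:

    /* globals _, VIZI */
    (function() {
      "use strict";

      VIZI.Building = function(options) {
        // Per-instance state, with defaults overridden via _.extend
        this.options = _.extend({ height: 10, colour: 0xcccccc }, options);
      };

      // Shared behaviour lives on the prototype
      VIZI.Building.prototype.render = function() {
        // ...create and add the 3D mesh using this.options
      };
    }());

    var tower = new VIZI.Building({ height: 25 });
    tower.render();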

    To communicate amongst modules we use the Mediator Pattern. This allows us to keep things as decoupled as possible as we can publish events without having to know about who is subscribing to them.

    Here’s an example of our use of the Mediator Pattern:

    /* globals window, _, VIZI */
    (function() {
      "use strict";
     
      // Apply to other objects using _.extend(newObj, VIZI.Mediator);
      VIZI.Mediator = (function() {
        // Storage for topics that can be broadcast or listened to
        var topics = {};
     
        // Subscribe to a topic, supply a callback to be executed
        // when that topic is broadcast to
        var subscribe = function( topic, fn ){
          if ( !topics[topic] ){
            topics[topic] = [];
          }
     
          topics[topic].push( { context: this, callback: fn } );
     
          return this;
        };
     
        // Publish/broadcast an event to the rest of the application
        var publish = function( topic ){
          var args;
     
          if ( !topics[topic] ){
            return false;
          }
     
          args = Array.prototype.slice.call( arguments, 1 );
      for ( var i = 0, l = topics[topic].length; i < l; i++ ) {
     
            var subscription = topics[topic][i];
            subscription.callback.apply( subscription.context, args );
          }
          return this;
        };
     
        return {
          publish: publish,
          subscribe: subscribe
        };
      }());
    }());

    I’d argue that these two patterns are the most useful aspects of the new ViziCities application architecture – they have allowed us to iterate quickly without fear of breaking everything.

    Using promises instead of wrestling with callbacks

    Early on in the project I was talking to my friend Hannah Wolfe (Ghost’s CTO) about how annoying callbacks are, particularly when you want to load a bunch of stuff in order. It didn’t take Hannah long to point out how stupid I was being (thanks Hannah) and that I should be using promises instead of wrestling with callbacks. At the time I brushed them off as another hipster fad but in the end she was right (as always) and from that point onwards I used promises wherever possible to take back control of application flow.

    For ViziCities we ended up using the Q library, though there are plenty others to choose from (Hannah uses when.js for Ghost).

    The general usage is the same whichever library you choose – you set up promises and you deal with them at a later date. However, the beauty comes when you want to queue up a bunch of tasks and either handle them in order, or do something when they’re all complete. We use this in a variety of places, most notably when loading ViziCities for the first time (also allowing us to output a progress bar).
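
    A minimal sketch of that pattern with Q (the tile URLs and the loadTile helper are illustrative):

    function loadTile(url) {
      var deferred = Q.defer();

      var xhr = new XMLHttpRequest();
      xhr.open("GET", url);
      xhr.onload = function() { deferred.resolve(JSON.parse(xhr.responseText)); };
      xhr.onerror = function() { deferred.reject(new Error("Failed to load " + url)); };
      xhr.send();

      return deferred.promise;
    }

    // Queue up a batch of tiles and act once they have all loaded
    Q.all([
      loadTile("/tiles/16/32747/21791.json"),
      loadTile("/tiles/16/32748/21791.json")
    ]).then(function(tiles) {
      // tiles is an array of results, in the same order as the requests
    }).fail(function(err) {
      // One of the requests failed
    });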

    I won’t lie, promises take a little while to get your head around but once you do you’ll never look back. I promise (sorry, couldn’t resist).

    Using a consistent build process with basic tests

    I’ve never been one to care too much about process, code quality, testing, or even making sure things are Done Right™. I’m a tinkerer and I much prefer learning and seeing results than spending what feels like wasted time on building a solid process. It turns out my tinkering approach doesn’t work too well for a large Web application which requires consistency and robustness. Who knew?

    The first step for code consistency and quality was to enable strict mode and linting. This meant that the most glaring of errors and inconsistencies were flagged up early on. As an aside, due to our use of the Constructor Pattern we wrapped each module in an anonymous function so we could enable strict mode per module without necessarily enabling it globally.

    At this point it was still a faff to use a manual process for creating new builds (generating a single JavaScript file with all the modules and external dependencies) and to serve the examples. The break-through was adopting a proper build system using Grunt, thanks mostly to a chat I had with Jack Franklin about my woes at an event last year (he subsequently gave me his cold which took 8 weeks to get rid of, but it was worth it).

    Grunt allows us to run a simple command in the terminal to do things like automatically test, concatenate and minify files ready for release. We also use it to serve the local build and auto-refresh examples if they’re open in a browser. You can look at our Grunt setup to see how we set everything up.

    For automated testing we use Mocha, Chai, Sinon.js, Sinon-Chai and PhantomJS, each of which serves a slightly different purpose in the testing process:

    • Mocha is used for the overall testing framework
    • Chai is used as an assertion library that allows you to write readable tests
    • Sinon.js is used to fake application logic and track behaviour through the testing process
    • PhantomJS is used to run client-side tests in a headless browser from the terminal

    We’ve already put together some (admittedly basic) tests and we plan to improve and increase the test coverage before releasing 0.1.0.

    Travis CI is used to make sure we don’t break anything when pushing changes to GitHub. It automatically performs linting and runs our tests via Grunt when changes are pushed, including pull requests from other contributors (a life saver). Plus it lets you have a cool badge to put on your GitHub readme that shows everyone whether the current version is building successfully.

    Together, these solutions have made ViziCities much more reliable than it has ever been. They also mean that we can move rapidly by building automatically, and they allow us to not have to worry so much about accidentally breaking something. The peace of mind is priceless.

    Monitoring performance to measure improvements

    General performance in frames-per-second can be monitored using FPSMeter. It’s useful for debugging parts of the application that are locking up the browser or preventing the rendering loop from running at a fast pace.

    You can also use the Three.js renderer.info property to monitor what you’re rendering and how it changes over time.

    It’s worth keeping an eye on this to make sure objects are not being rendered when they move out of the current viewport. Early on in ViziCities we had a lot of issues with this not happening, and the only way to be sure was to monitor these values.
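
    For example, logging a few renderer.info counters from the render loop makes regressions easy to spot (the property paths below match the Three.js builds of that era; newer versions rename some of them):

    var frameCount = 0;

    function animate() {
      requestAnimationFrame(animate);
      renderer.render(scene, camera);

      // Log every 300 frames rather than spamming the console
      if (frameCount++ % 300 === 0) {
        console.log(
          "draw calls:", renderer.info.render.calls,
          "vertices:", renderer.info.render.vertices,
          "geometries in memory:", renderer.info.memory.geometries
        );
      }
    }

    animate();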

    Turning geographic coordinates into 2D pixel coordinates using D3.js

    One of the very first problems we encountered was how to turn geographic coordinates (latitude and longitude) into pixel-based coordinates. The math involved to achieve this isn’t simple and it gets even more complicated if you want to consider different geographic projections (trust me, it gets confusing fast).

    Fortunately, the D3.js library has already solved these problems for you, specifically within its geo module. Assuming you’ve included D3.js, you can convert coordinates like so:

    var geoCoords = [-0.01924, 51.50358]; // Central point as [lon, lat]
    var tileSize = 256; // Pixel size of a single map tile
    var zoom = 15; // Zoom level
     
    var projection = d3.geo.mercator()
      .center(geoCoords) // Geographic coordinates of map centre
      .translate([0, 0]) // Pixel coordinates of .center()
      .scale(tileSize << zoom); // Scaling value
     
    // Pixel location of Heathrow Airport in relation to the central point (geoCoords)
    var pixelValue = projection([-0.465567112, 51.4718071791]); // Returns [x, y]

    The scale() value is the hardest part of the process to understand. It basically changes the pixel value that’s returned based on how zoomed in you want to be (imagine zooming in on Google Maps). It took me a very long time to understand so I detailed how scale works in the ViziCities source code for others to learn from (and so I can remember!). Once you nail the scaling then you will be in full control of the geographic-to-pixel conversion process.

    Extruding 2D building outlines into 3D objects on-the-fly

    While 2D building outlines are easy to find, turning them into 3D objects turned out to be not quite as easy as we imagined. There’s currently no public dataset containing 3D buildings, which is a shame though it makes it more fun to do it yourself.

    What we ended up using was the THREE.ExtrudeGeometry object, passing in a reference to an array of pixel points (as a THREE.Shape object) representing a 2D building footprint.

    The following is a basic example that would extrude a 2D outline into a 3D object:

    var shape = new THREE.Shape();
    shape.moveTo(0, 0);
    shape.lineTo(10, 0);
    shape.lineTo(10, 10);
    shape.lineTo(0, 10);
    shape.lineTo(0, 0); // Remember to close the shape
     
    var height = 10;
    var extrudeSettings = { amount: height, bevelEnabled: false };
     
    var geom = new THREE.ExtrudeGeometry( shape, extrudeSettings );
    var mesh = new THREE.Mesh(geom);

    Interestingly, it actually turned out quicker to generate 3D objects on-the-fly than to pre-render them and load them in. This was mostly because it would take longer to download a pre-rendered 3D object than to download the 2D coordinates string and generate the 3D object at runtime.

    Using Web Workers to dramatically increase performance and prevent browser lockup

    One thing we did notice with the generation of 3D objects was that it locked up the browser, particularly when processing a large number of shapes at the same time (you know, like an entire city). To work around this we delved into the magical world of Web Workers.

    Web Workers allow you to run parts of your application in a completely separate processor thread from the browser renderer, meaning anything that happens in the Web Worker thread won’t slow down the browser renderer (i.e. it won’t lock up). It’s incredibly powerful, but it can also be incredibly complicated to get working as you want it to.

    We ended up using the Catiline.js Web Worker library to abstract some of the complexity and allow us to focus on using Web Workers to our advantage, rather than fighting against them. The result is a Web Worker processing script that’s passed 2D coordinate arrays and returns generated 3D objects.

    After getting this working we noticed that while the majority of browser lock-ups were eliminated, there were two new lock-ups introduced. Specifically, there was a lock-up when the 2D coordinates were passed into the Web Worker scripts, and another lock-up when the 3D objects were returned back to the main application.

    The solution to this problem came from the inspiring mind of Vladimir Agafonkin (of LeafletJS fame). He helped me understand that to avoid the latter of the lock-ups (passing the 3D objects back to the application) I needed to use transferable objects, namely ArrayBuffer objects. Doing this allows you to effectively transfer ownership of objects created within a Web Worker thread to the main application thread, rather than copying them. We implemented this to great effect, eliminating the second lock-up entirely.
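
    In code, the difference is just the optional second argument to postMessage: a transfer list. A schematic sketch (the real ViziCities worker passes generated geometry data):

    // worker.js: hand the vertex data back with a transfer list, so the
    // underlying ArrayBuffer is moved to the main thread instead of copied
    var vertices = new Float32Array(vertexData); // vertexData: generated geometry
    self.postMessage({ vertices: vertices }, [vertices.buffer]);

    // main.js: receive the transferred buffer
    worker.onmessage = function(e) {
      var vertices = e.data.vertices;
      // The worker's copy is now neutered; only this thread owns the buffer
    };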

    To eliminate the first lock-up (passing 2D coordinates into the Web Worker) we needed to take a different approach. The problem lies again with the copying of the data, though in this case you can’t use transferable objects. The solution instead lies in loading the data into the Web Worker script using the importScripts method. Unfortunately, I’ve not yet worked out a way to do this with dynamic data sourced from XHR requests. Still, this is definitely a solution that would work.

    Using simplify.js to reduce the complexity of 2D shapes before rendering

    Something we found early on was that complex 2D shapes caused a lot of strain when rendered as 3D objects en-masse. To get around this we use Vladimir Agafonkin’s simplify.js library to reduce the quality of 2D shapes before rendering.

    It’s a great little tool that allows you to keep the general shape while dramatically reducing the number of points used, thus reducing its complexity and render cost. By using this method we could render many more objects with little to no change in how the objects look.
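
    Usage is a one-liner: points are {x, y} objects, the second argument is the tolerance (larger means simpler), and the third switches on the slower, higher-quality pass:

    var footprint = [
      { x: 0, y: 0 }, { x: 0.4, y: 0.05 }, { x: 1, y: 0 },
      { x: 1.05, y: 0.5 }, { x: 1, y: 1 }, { x: 0, y: 1 }
    ];

    // Keeps the overall outline but drops near-redundant points,
    // leaving far fewer vertices to extrude and render
    var simplified = simplify(footprint, 0.1, true);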

    Getting accurate heights for buildings is really difficult

    One problem we never imagined encountering was getting accurate height information for buildings within a city. While the data does exist, it’s usually unfathomably expensive or requires you to be in education to get discounted access.

    The approach we went for uses accurate height data from OpenStreetMap (if available), falling back to a best-guess that uses the building type combined with 2D footprint area. In most cases this will give a far more accurate guess at the height than simply going for a random height (which is how we originally did it).

    Restricting camera movement to control performance

    The original dream with ViziCities was to visualise an entire city in one go, flying around, looking down on the city from the clouds like some kind of God. We fast learnt that this came at a price – a performance price and a data-size price – neither of which we could afford.

    When we realised this wasn’t going to be possible we looked at how to approach things from a different angle. How can you feel like you’re looking at an entire city without rendering an entire city? The solution was deceptively simple.

    By restricting camera movement to only view a small area at a time (limiting zoom and rotation) you’re able to have much more control over how many objects can possibly be in view at one time. For example, if you prevent someone from being able to tilt a camera to look at the horizon then you’ll never need to render every single object between the camera and the edge of your scene.
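
    As a sketch of what those limits look like with Three.js OrbitControls (ViziCities has its own controls, but the idea is the same):

    var controls = new THREE.OrbitControls(camera, renderer.domElement);

    controls.minDistance = 200;           // can't zoom right down to street level
    controls.maxDistance = 2000;          // ...or out into orbit
    controls.maxPolarAngle = Math.PI / 3; // can't tilt the camera down to the horizon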

    This simple approach means that you can go absolutely anywhere in the world within ViziCities, whether a thriving metropolis or a country-side retreat, and not have to change the way you handle performance. Every situation is predictable and therefore optimisable.

    Tile-based batching of objects to improve loading and rendering performance

    Another approach we took to improve performance was by splitting the entire world into a grid system, exactly like how Google and other map providers do things. This allows you to load data in small chunks that eventually build up to a complete image.

    In the case of ViziCities, we use the tiles to only request JSON data for the geographic area visible to you. This means that you can start outputting 3D objects as each tile loads rather than waiting for everything to load.
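
    The tile scheme follows the standard “slippy map” convention used by OpenStreetMap and others, so working out which tile a point falls in is just a little arithmetic:

    // Convert longitude/latitude plus zoom level to x/y tile indices
    function lonLatToTile(lon, lat, zoom) {
      var n = Math.pow(2, zoom);
      var latRad = lat * Math.PI / 180;

      var x = Math.floor((lon + 180) / 360 * n);
      var y = Math.floor(
        (1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2 * n
      );

      return { x: x, y: y };
    }

    lonLatToTile(-0.01924, 51.50358, 16); // the tile containing the demo's central point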

    A by-product of this approach is that you get to benefit from frustum culling, which is when objects not within your view are not rendered, dramatically increasing performance.

    Caching of loaded data to save on time and resources when viewing the same location

Coupled with the tile-based loading is a caching system that ensures you never request the same data twice; instead, the data is pulled from a local store. This saves bandwidth, and it also saves time, as each JSON tile can take a while to download.

    We currently use a dumb local approach that resets the cache on each refresh, but we plan to implement something like localForage to have the cache persist between browser sessions.
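As a sketch of what that persistent cache could look like with localForage's promise-based API (the key format and the loadTileFromOverpass helper are illustrative):

// A sketch of a persistent tile cache using localForage.
// The key format and loadTileFromOverpass helper are illustrative.
function getTile(x, y, zoom) {
  var key = 'tile-' + zoom + '-' + x + '-' + y;
  return localforage.getItem(key).then(function(cached) {
    if (cached) {
      return cached; // served from the local store, no XHR needed
    }
    return loadTileFromOverpass(x, y, zoom).then(function(json) {
      return localforage.setItem(key, json); // resolves with the value
    });
  });
}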

    Using the OpenStreetMap Overpass API rather than rolling your own PostGIS database

Late into the development of ViziCities we realised that it was unfeasible to continue using our own PostGIS database to store and manipulate geographic data. For one, it would require a huge server just to store the entirety of OpenStreetMap in a database; more importantly, it was a pain to set up and manage, and an external approach was required.

    The solution came in the shape of the Overpass API, an external JSON and XML endpoint to OpenStreetMap data. Overpass allows you to send a request for specific OpenStreetMap tags within a bounding box (in our case, a map tile):

    http://overpass-api.de/api/interpreter?data=[out:json];((way(51.50874,-0.02197,51.51558,-0.01099)[%22building%22]);(._;node(w);););out;

    And get back a lovely JSON response:

{
  "version": 0.6,
  "generator": "Overpass API",
  "osm3s": {
    "timestamp_osm_base": "2014-03-02T22:08:02Z",
    "copyright": "The data included in this document is from www.openstreetmap.org. The data is made available under ODbL."
  },
  "elements": [
    {
      "type": "node",
      "id": 262890340,
      "lat": 51.5118466,
      "lon": -0.0205134
    },
    {
      "type": "node",
      "id": 278157418,
      "lat": 51.5143963,
      "lon": -0.0144833
    },
    ...
    {
      "type": "way",
      "id": 50258319,
      "nodes": [
        638736123,
        638736125,
        638736127,
        638736129,
        638736123
      ],
      "tags": {
        "building": "yes",
        "leisure": "sports_centre",
        "name": "The Workhouse"
      }
    },
    {
      "type": "way",
      "id": 50258326,
      "nodes": [
        638736168,
        638736170,
        638736171,
        638736172,
        638736168
      ],
      "tags": {
        "building": "yes",
        "name": "Poplar Playcentre"
      }
    },
    ...
  ]
}

A by-product of this is that you get worldwide support out of the box and benefit from minutely (minute-by-minute) OpenStreetMap updates. Seriously, if you edit or add something to OpenStreetMap (please do) it will show up in ViziCities within minutes.

    Limiting the number of concurrent XHR requests

Something we learnt very recently was that spamming the Overpass API endpoint with a tonne of XHR requests at the same time wasn't particularly good for us or for Overpass. It generally caused delays, as Overpass rate-limited us, so data took a long time to make its way back to the browser. The great thing was that, by already using promises to manage the XHR requests, we were halfway to solving the problem.

    The final piece of the puzzle is to use throat.js to limit the number of concurrent XHR requests so we can take control and load resources without abusing external APIs. It’s beautifully simple and worked perfectly. No more loading delays!
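As a sketch of the pattern (the limit of 4 is an arbitrary example, and makeXHRRequest stands in for any promise-returning request helper):

// throat(n) returns a wrapper that lets at most n of the wrapped
// promise-returning functions run at any one time.
var limit = throat(4);

var requests = tileURLs.map(function(url) {
  return limit(function() {
    return makeXHRRequest(url); // illustrative promise-returning helper
  });
});

Promise.all(requests).then(function(tiles) {
  // All tiles loaded, with never more than 4 requests in flight.
});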

    Using ViziCities in your own project

I hope that these lessons and tips have helped in some way, and I hope they encourage you to try out ViziCities for yourself. Getting set up is easy and well documented: just head to the ViziCities GitHub repo and you'll find everything you need.

    Contributing to ViziCities

Part of the reason why we opened up ViziCities was to encourage other people to help build it and make it even more awesome than Peter and I could ever make it. Since launch, we've had over 1,000 people favourite the project on GitHub, as well as nearly 100 forks. More importantly, we've had 9 pull requests from members of the community whom we didn't previously know and whom we hadn't asked to help. It's such an amazing feeling to see people help out like that.

    If we were to pick a favourite contribution so far, it would be adding the ability to load anywhere in the world by putting coordinates in the URL. Such a cool feature and one that has made the project much more usable for everyone.

    We’d love to have more people contribute, whether dealing with issues or playing with the visual styling. Read more about how to contribute and give it a go!

    What’s next?

It's been a crazy year, and an even crazier fortnight since we launched the project. We never imagined it would excite people in the way it has; it's blown us away.

    The next steps are slowly going through the issues and getting ready for the 0.1.0 release, which will still be alpha quality but will be sort of stable. Aside from that we’ll continue experimenting with exciting new technologies like the Oculus Rift (yes, that’s me with one strapped to my face)…

    Visualising realtime air traffic in 3D…

    And much, much more. Watch this space.

  6. WebGL Deferred Shading

    WebGL brings hardware-accelerated 3D graphics to the web. Many features of WebGL 2 are available today as WebGL extensions. In this article, we describe how to use the WEBGL_draw_buffers extension to create a scene with a large number of dynamic lights using a technique called deferred shading, which is popular among top-tier games.

live demo | source code

    Today, most WebGL engines use forward shading, where lighting is computed in the same pass that geometry is transformed. This makes it difficult to support a large number of dynamic lights and different light types.

    Forward shading can use a pass per light. Rendering a scene looks like:

    foreach light {
      foreach visible mesh {
        if (light volume intersects mesh) {
          render using this material/light shader;
          accumulate in framebuffer using additive blending;
        }
      }
    }

This requires a different shader for each material/light-type combination, which adds up. From a performance perspective, each mesh needs to be rendered (vertex transform, rasterization, material part of the fragment shader, etc.) once per light instead of just once. In addition, fragments that ultimately fail the depth test are still shaded, but with early-z and z-cull hardware optimizations, front-to-back sorting, or a z-prepass, this is not as bad as the cost of adding lights.

    To optimize performance, light sources that have a limited effect are often used. Unlike real-world lights, we allow the light from a point source to travel only a limited distance. However, even if a light’s volume of effect intersects a mesh, it may only affect a small part of the mesh, but the entire mesh is still rendered.

In practice, forward shaders usually try to do as much work as they can in a single pass, leading to the need for a complex system of chaining lights together in a single shader. For example:

    foreach visible mesh {
      find lights affecting mesh;
      Render all lights and materials using a single shader;
    }

The biggest drawback is the number of shaders required, since a different shader is required for each material/light (not light type) combination. This makes shaders harder to author, increases compile times, usually requires runtime compiling, and increases the number of shaders that draw calls need to be sorted by. Although meshes are only rendered once, this approach has the same performance drawbacks as the multi-pass approach for fragments that fail the depth test.

    Deferred Shading

    Deferred shading takes a different approach than forward shading by dividing rendering into two passes: the g-buffer pass, which transforms geometry and writes positions, normals, and material properties to textures called the g-buffer, and the light accumulation pass, which performs lighting as a series of screen-space post-processing effects.

    // g-buffer pass
    foreach visible mesh {
      write material properties to g-buffer;
    }
     
    // light accumulation pass
    foreach light {
      compute light by reading g-buffer;
      accumulate in framebuffer;
    }

This decouples lighting from scene complexity (number of triangles) and only requires one shader per material and per light type. Since lighting takes place in screen-space, fragments failing the z-test are not shaded, essentially bringing the depth complexity down to one. There are also downsides, such as high memory-bandwidth usage and difficulty handling translucency and anti-aliasing.

    Until recently, WebGL had a roadblock for implementing deferred shading. In WebGL, a fragment shader could only write to a single texture/renderbuffer. With deferred shading, the g-buffer is usually composed of several textures, which meant that the scene needed to be rendered multiple times during the g-buffer pass.

    WEBGL_draw_buffers

    Now with the WEBGL_draw_buffers extension, a fragment shader can write to several textures. To use this extension in Firefox, browse to about:config and turn on webgl.enable-draft-extensions. Then, to make sure your system supports WEBGL_draw_buffers, browse to webglreport.com and verify it is in the list of extensions at the bottom of the page.

    To use the extension, first initialize it:

var ext = gl.getExtension('WEBGL_draw_buffers');
if (!ext) {
  // The extension is not supported; fall back to writing the
  // g-buffer with one pass per texture.
}

    We can now bind multiple textures, tx[] in the example below, to different framebuffer color attachments.

    var fb = gl.createFramebuffer();
    gl.bindFramebuffer(gl.FRAMEBUFFER, fb);
    gl.framebufferTexture2D(gl.FRAMEBUFFER, ext.COLOR_ATTACHMENT0_WEBGL, gl.TEXTURE_2D, tx[0], 0);
    gl.framebufferTexture2D(gl.FRAMEBUFFER, ext.COLOR_ATTACHMENT1_WEBGL, gl.TEXTURE_2D, tx[1], 0);
    gl.framebufferTexture2D(gl.FRAMEBUFFER, ext.COLOR_ATTACHMENT2_WEBGL, gl.TEXTURE_2D, tx[2], 0);
    gl.framebufferTexture2D(gl.FRAMEBUFFER, ext.COLOR_ATTACHMENT3_WEBGL, gl.TEXTURE_2D, tx[3], 0);

    For debugging, we can check to see if the attachments are compatible by calling gl.checkFramebufferStatus. This function is slow and should not be called often in release code.

    if (gl.checkFramebufferStatus(gl.FRAMEBUFFER) !== gl.FRAMEBUFFER_COMPLETE) {
      // Can't use framebuffer.
      // See http://www.khronos.org/opengles/sdk/docs/man/xhtml/glCheckFramebufferStatus.xml
    }

    Next, we map the color attachments to draw buffer slots that the fragment shader will write to using gl_FragData.

    ext.drawBuffersWEBGL([
      ext.COLOR_ATTACHMENT0_WEBGL, // gl_FragData[0]
      ext.COLOR_ATTACHMENT1_WEBGL, // gl_FragData[1]
      ext.COLOR_ATTACHMENT2_WEBGL, // gl_FragData[2]
      ext.COLOR_ATTACHMENT3_WEBGL  // gl_FragData[3]
    ]);

The maximum size of the array passed to drawBuffersWEBGL depends on the system and can be queried by calling gl.getParameter(ext.MAX_DRAW_BUFFERS_WEBGL) (note that the constant is defined on the extension object, not on gl). In GLSL, this is also available as gl_MaxDrawBuffers.

    In the deferred shading geometry pass, the fragment shader writes to multiple textures. A trivial pass-through fragment shader is:

    #extension GL_EXT_draw_buffers : require
    precision highp float;
    void main(void) {
      gl_FragData[0] = vec4(0.25);
      gl_FragData[1] = vec4(0.5);
      gl_FragData[2] = vec4(0.75);
      gl_FragData[3] = vec4(1.0);
    }

    Even though we initialized the extension in JavaScript with gl.getExtension, the GLSL code still needs to include #extension GL_EXT_draw_buffers : require to use the extension. With the extension, the output is now the gl_FragData array that maps to framebuffer color attachments, not gl_FragColor, which is traditionally the output.

    g-buffers

    In our deferred shading implementation the g-buffer is composed of four textures: eye-space position, eye-space normal, color, and depth. Position, normal, and color use the floating-point RGBA format via the OES_texture_float extension, and depth uses the unsigned-short DEPTH_COMPONENT format.
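As a rough sketch (not necessarily how the demo's code is organised), allocating one of these floating-point g-buffer textures in WebGL 1 looks like this:

// Requires the OES_texture_float extension.
var floatExt = gl.getExtension('OES_texture_float');

var positionTexture = gl.createTexture();
gl.bindTexture(gl.TEXTURE_2D, positionTexture);
// Allocate an RGBA float texture the size of the drawing buffer.
// Passing null leaves the contents undefined until the g-buffer
// pass writes to it.
gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.drawingBufferWidth,
              gl.drawingBufferHeight, 0, gl.RGBA, gl.FLOAT, null);
// NEAREST filtering avoids also needing OES_texture_float_linear.
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.NEAREST);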

    Position texture

    Normal texture

    Color texture

    Depth texture

    Light accumulation using g-buffers

This g-buffer layout is simple for our testing. Although four textures are common for a full deferred shading engine, an optimized implementation would try to use as little memory as possible by lowering precision, reconstructing position from depth, packing values together, using different distributions, and so on.

With WEBGL_draw_buffers, we can write every texture in the g-buffer in a single pass. Compared to using a pass per texture, this improves performance and reduces the amount of JavaScript code and GLSL shaders. As shown in the graph below, the benefit of WEBGL_draw_buffers grows with scene complexity: since a more complex scene requires more drawElements/drawArrays calls and more JavaScript overhead, and transforms more triangles, avoiding a pass per texture saves progressively more work.

All performance numbers were measured using an NVIDIA GT 620M, which is a low-end GPU with 96 cores, in Firefox 26.0 on Windows 8. In the above graph, 20 point lights were used. The light intensity decreases proportionally to the square of the distance between the current position and the light position. Each Stanford Dragon is 100,000 triangles and requires five draw calls, so, for example, when 25 dragons are rendered, 125 draw calls (and related state changes) are issued and a total of 2,500,000 triangles are transformed.


    WEBGL_draw_buffers test scene, shown here with 100 Stanford Dragons.

    Of course, when scene complexity is very low, like the case of one dragon, the cost of the g-buffer pass is low so the savings from WEBGL_draw_buffers are minimal, especially if there are many lights in the scene, which drives up the cost of the light accumulation pass as shown in the graph below.

    Deferred shading requires a lot of GPU memory bandwidth, which can hurt performance and increase power usage. After the g-buffer pass, a naive implementation of the light accumulation pass would render each light as a full-screen quad and read the entirety of each g-buffer. Since most light types, like point and spot lights, attenuate and have a limited volume of effect, the full-screen quad can be replaced with a world-space bounding volume or tight screen-space bounding rectangle. Our implementation renders a full-screen quad per light and uses the scissor test to limit the fragment shader to the light’s volume of effect.
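A sketch of that per-light scissoring, with computeScissorRect and drawFullScreenQuad as hypothetical helpers, might look like:

gl.enable(gl.SCISSOR_TEST);

lights.forEach(function(light) {
  // computeScissorRect is a hypothetical helper that projects the
  // light's bounding sphere into a screen-space rectangle.
  var r = computeScissorRect(light, camera);
  gl.scissor(r.x, r.y, r.width, r.height);
  // Fragments outside the scissor rectangle are discarded before
  // the (expensive) lighting fragment shader runs.
  drawFullScreenQuad(light);
});

gl.disable(gl.SCISSOR_TEST);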

    Tile-Based Deferred Shading

Tile-based deferred shading takes this a step further: it splits the screen into tiles, for example 16×16 pixels, and then determines which lights influence each tile. Light-tile information is then passed to the shader, and the g-buffer is only read once for all lights. Since this drastically reduces memory bandwidth, it improves performance. The following graph shows performance for the Sponza scene (66,450 triangles and 38 draw calls) at 1024×768 with 32×32 tiles.

Tile size affects performance. Smaller tiles require more JavaScript overhead to create the light-tile information, but less computation in the lighting shader. Larger tiles have the opposite tradeoff. Therefore, choosing a suitable tile size is important for performance. The figure below shows the relationship between tile size and performance with 100 lights.

    A visualization of the number of lights in each tile is shown below. Black tiles have no lights intersecting them and white tiles have the most lights.


    Shaded version of tile visualization.

    Conclusion

WEBGL_draw_buffers is a useful extension for improving the performance of deferred shading in WebGL. Check out the live demo and our code on GitHub.

    Acknowledgements

    We implemented this project for the course CIS 565: GPU Programming and Architecture, which is part of the computer graphics program at the University of Pennsylvania. We thank Liam Boone for his support and Eric Haines and Morgan McGuire for reviewing this article.


  7. WebGL & CreateJS for Firefox OS

    This is a guest post by the developers at gskinner. Mozilla has been working with the CreateJS.com team at gskinner to bring new features to their open-source libraries and make sure they work great on Firefox OS.

Here at gskinner, it’s always been our philosophy to contribute our solutions to the dev community — the last four years of which have been focused on web standards in HTML and JavaScript. Our CreateJS libraries provide approachable, modular, cross-browser-and-platform APIs for building rich interactive experiences on the open, modern web. We think they’re awesome.

    For example, the CreateJS CDN typically receives hundreds of millions of impressions per month, and Adobe has selected CreateJS as their official framework for creating HTML5 documents in Flash Professional CC.

    Firefox OS is a perfect fit for CreateJS content. It took us little effort to ensure that the latest libraries are supported and are valuable tools for app and game creation on the platform.

    We’re thrilled to welcome Mozilla as an official sponsor of CreateJS, along with some exciting announcements about the libraries!

    WebGL

    As WebGL becomes more widely supported in browsers, we’re proud to announce that after working in collaboration with Mozilla, a shiny new WebGL renderer for EaselJS is now in early beta! Following research, internal discussions, and optimizations, we’ve managed to pump out a renderer that draws a subset of 2D content anywhere from 6x to 50x faster than is currently possible on the Canvas 2D Context. It’s fully supported in both the browser and in-app contexts of Firefox OS.

    We thought about what we wanted to gain from a WebGL renderer, and narrowed it down to three key goals:

    1. Very fast performance for drawing sprites and bitmaps
    2. Consistency and integration with the existing EaselJS API
    3. The ability to fall back to Context2D rendering if WebGL is not available

    Here’s what we came up with:

    SpriteStage and SpriteContainer

    Two new classes, SpriteStage and SpriteContainer, enforce restrictions on the display list to enable aggressively optimized rendering of bitmap content. This includes images, spritesheet animations, and bitmap text. SpriteStage is built to automatically make additional draw calls per frame as needed, avoiding any fixed maximum on the number of elements that can be included in a single draw call.

    These new classes extend existing EaselJS classes (Stage and Container), so creating WebGL content is super simple if you’re familiar with EaselJS. Existing content using EaselJS can be WebGL-enabled with a few keystrokes.
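For example, a sketch of opting existing code into WebGL rendering, based on the preview API (SpriteStage restricts what the display list may contain), might look like this:

// Before: a standard Context2D stage.
// var stage = new createjs.Stage('gameCanvas');

// After: the WebGL-backed stage. SpriteStage extends Stage, so the
// rest of the code is unchanged.
var stage = new createjs.SpriteStage('gameCanvas');
stage.addChild(spriteSheetAnimation); // bitmap-based content only

createjs.Ticker.on('tick', function() {
  stage.update();
});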

    Layering Renderers

    This approach allows WebGL and Context2D content to be layered on screen, and mouse/touch interactions can pass seamlessly between the layers. For example, an incredibly fast game engine using WebGL rendering can be displayed under a UI layer that leverages the more robust capabilities of the Context2D renderer. You can even swap assets between a WebGL and Context2D layer.

    Finally, WebGL content is fully compatible with the existing Context2D renderer. On devices or browsers that don’t support WebGL, your content will automatically be rendered via canvas 2D.

    While it took some work to squeeze every last iota of performance out of the new renderer, we’re really happy with this new approach. It allows developers to build incredibly high performance content for a wide range of devices, and also leverage the extremely rich existing API and toolchain surrounding CreateJS. Below, you’ll find a few demos and links that show off its capabilities.

    Example: Bunnymark

    A very popular (though limited) benchmark for web graphics is Bunnymark. This benchmark simply measures the maximum number of bouncing bunny bitmap sprites (try saying that 5 times fast) a renderer can support at 60fps.

    Bunnymark

    The following table compares Bunnymark scores using the classic Context2D renderer and the new WebGL renderer. Higher numbers are better.

Environment                                                      | Context2D | WebGL  | Change
2012 MacBook Pro, Firefox 26                                     | 900       | 46,000 | 51x
2012 MacBook Pro, Chrome 31                                      | 2,300     | 60,000 | 26x
2012 Win 7 laptop, IE11 (x64, NVIDIA GeForce GT 630M, 1 GB VRAM) | 1,900     | 9,800  | 5x
Firefox OS 1.2.0.0-prerelease (early 1.2 device)                 | 45        | 270    | 6x
Nexus 5, Firefox 26                                              | 225       | 4,400  | 20x
Nexus 5, Chrome 31                                               | 230       | 4,800  | 21x

Since these numbers show the maximum sprites at 60fps, they can increase significantly if a lower framerate is acceptable. It's worth noting that the only Firefox OS device we have in house is an early Firefox OS 1.2 device (which has a relatively low-powered GPU), yet we're still seeing significant performance gains.

    Example: Sparkles Benchmark

This very simple demo was made to test the limits of how many particles could be put on screen while keeping the browser at 24fps.

    Sparkles

    Example: Planetary Gary

    We often use the Planetary Gary game demo as a test bed for new capabilities in the CreateJS libraries. In this case, we retrofitted the existing game to use the new SpriteStage and SpriteContainer classes for rendering the game experience in WebGL.

    Planetary Gary

This was surprisingly easy to do, requiring only three lines of changed or added code, and it demonstrates the ease of use and consistency of the new APIs. It's a particularly good example because it shows how the robust feature set of the Context2D renderer can be used for user interface elements (e.g. the start screen) in cooperation with the superior performance of the WebGL renderer (e.g. the game).

Even better, the game art is packaged as vector graphics, which are drawn to sprite sheets via the Context2D renderer at run time (using EaselJS's SpriteSheetBuilder), then passed to the WebGL renderer. This allows for completely scalable graphics with minimal file size (~85kb over the wire) and incredible performance!

    Roadmap

We've posted a public preview of the new WebGL renderer on GitHub to allow the community to take it for a test drive and provide feedback. Soon it will be included in the next major release.

Follow @createjs and @gskinner on Twitter to stay up to date with the latest news, and let us know what you think — thanks for reading!

  8. Live editing WebGL shaders with Firefox Developer Tools

If you've seen Epic Games' HTML5 port of 'Epic Citadel', you have no doubt been impressed by the amazing performance and level of detail. A lot of the code that creates the cool visual effects you see on screen is written as shaders linked together in programs: specialized programs that are evaluated directly on the GPU to provide high-performance, real-time visual effects.

Writing vertex and fragment shaders is an essential part of developing 3D on the web even if you are using a library; in fact, the Epic Citadel demo includes over 200 shader programs. This is because most rendering is customised and optimised to fit a game's needs. Shader development is currently awkward for a few reasons:

    • Seeing your changes requires a refresh
    • Some shaders are applied under very specific conditions

Here is a screencast that shows how to manipulate shader code using a relatively simple WebGL demo:

Starting in Firefox 27 we've introduced a new tool called the 'Shader Editor' that makes working with shader programs much simpler: the editor lists all shader programs running in the WebGL context, and you can live-edit shaders and see immediate results without interrupting any animations or state. Additionally, editing shaders should not impact WebGL performance.

    Enabling the Shader Editor

    The Shader Editor is not shown by default, because not all the web pages out there contain WebGL, but you can easily enable it:

    1. Open the Toolbox by pressing either F12 or Ctrl/Cmd + Shift + I.
    2. Click on the ‘gear’ icon near the top edge of the Toolbox to open the ‘Toolbox Options’.
    3. On the left-hand side under ‘Default Firefox Developer Tools’ make sure ‘Shader Editor’ is checked. You should immediately see a new ‘Shader Editor’ Tool tab.

    Using the Shader Editor

    To see the Shader Editor in action, just go to a WebGL demo such as this one and open the toolbox. When you click on the shader editor tab, you’ll see a reload button you will need to click in order to get the editor attached to the WebGL context. Once you’ve done this you’ll see the Shader Editor UI:

    The WebGL Shader Editor

• On the left you have a list of programs; each program has a corresponding vertex and fragment shader whose source is displayed and syntax highlighted in the editors on the right.
    • The shader type is displayed underneath each editor.
    • Hovering a program highlights the geometry drawn by its corresponding shaders in red – this is useful for finding the right program to work on.
• Clicking on the eyeball right next to each program hides the rendered geometry (useful in the likely case an author wants to focus on some geometry and not the rest, or to hide overlapping geometry).
    • The tool is responsive when docked to the side.

    Editing Shader Programs

The first thing you'll notice about shader program code is that it is not JavaScript. For more information on how shader programs work, I highly recommend you start with the WebGL demo on the Khronos wiki and/or Paul Lewis' excellent HTML5 Rocks post. There are also some great long-standing tutorials on the Learning WebGL blog. The Shader Editor gives you direct access to the programs so you can play around with how they work:

    • Editing code in any of the editors will compile the source and apply it as soon as the user stops typing;
    • If an error was made in the code, the rendering won’t be affected, but an error will be displayed in the editor, highlighting the faulty line of code; hovering the icon gutter will display a tooltip describing the error.

    Errors in shaders

    Learn more about the Shader Editor on the Mozilla Developer Network.

    Here is a second screencast showing how you could directly edit the shader programs in the Epic Citadel demo:

  9. Announcing the winners of the June 2013 Dev Derby!

This June, some of the most creative web developers out there pushed the limits of WebGL in our June Dev Derby contest. After sorting through the entries, our expert judges, James Padolsey and Maire Reavy, decided on three winners and three runners-up.

    Not a contestant? There are other reasons to be excited. Most importantly, all of these demos are completely open-source, making them wonderful lessons in the exciting things you can do with WebGL today.

    Dev Derby

    The Results

    Winners

    Runners-up

    Congratulations to these winners and to everyone who competed! The Web is a better, more expansive place because of their efforts.


  10. The concepts of WebGL

    This post is not going to be yet another WebGL tutorial: there already are enough great ones (we list some at the end).

    We are just going to introduce the concepts of WebGL, which are basically just the concepts of any general, low-level graphics API (such as OpenGL or Direct3D), to a target audience of Web developers.

    What is WebGL?

    WebGL is a Web API that allows low-level graphics programming. “Low-level” means that WebGL commands are expressed in terms that map relatively directly to how a GPU (graphics processing unit, i.e. hardware) actually works. That means that WebGL allows you to really tap into the feature set and power of graphics hardware. What native games do with OpenGL or Direct3D, you can probably do with WebGL too.

WebGL is so low-level that it’s not even a “3D” graphics API, properly speaking. Just like your graphics hardware doesn’t really care whether you are doing 2D or 3D graphics, neither does WebGL: 2D and 3D are just two possible usage patterns. When OpenGL 1.0 came out in 1992, it was specifically a 3D API, aiming to expose the features of the 3D graphics hardware of that era. But as graphics hardware evolved towards being more generic and programmable, so did OpenGL. Eventually, OpenGL became so generic that 2D and 3D would be just two possible use cases, while still offering great performance. That was OpenGL 2.0, and WebGL is closely modeled on its mobile counterpart, OpenGL ES 2.0.

That’s what we mean when we say that WebGL is a low-level graphics API rather than a 3D API specifically. That is the subject of this article; and that’s what makes WebGL so valuable to learn even if you don’t plan to use it directly. Learning WebGL means learning a little bit about how graphics hardware works. It can help you develop an intuition for what’s going to be fast or slow in any graphics API.

    The WebGL context and framebuffer

    Before we can properly explain anything about the WebGL API, we have to introduce some basic concepts. WebGL is a rendering context for the HTML Canvas element. You start by getting a WebGL context for your canvas:

var gl = null;
try {
  // Try the standard context name first, then the older prefixed one.
  gl = canvas.getContext("webgl") || canvas.getContext("experimental-webgl");
} catch (e) {}
if (!gl) {
  // WebGL is not available in this browser or on this device.
}

From there, you perform your rendering by calling the WebGL API functions on the gl object obtained here. WebGL is never single-buffered, meaning that the image that you are currently rendering is never the one that is currently displayed in the Canvas element. This ensures that half-rendered frames never show up in the browser’s window. The image being rendered is called the WebGL framebuffer or backbuffer. Talking of framebuffers is made more complicated by the fact that WebGL also allows additional off-screen framebuffers, but let’s ignore that in this article. The image currently being displayed is called the frontbuffer. Of course, the contents of the backbuffer will at some point be copied into the frontbuffer — otherwise WebGL drawing would have no user-visible effect!

    But that operation is taken care of automatically by the browser, and in fact, the WebGL programmer has no explicit access to the frontbuffer whatsoever. The key rule here is that the browser may copy the backbuffer into the frontbuffer at any time except during the execution of JavaScript. What this means is that you must perform the entire WebGL rendering of a frame within a single JavaScript callback. As long as you do that, correct rendering is ensured and the browser takes care of the very complex details of multi-buffered compositing for you. You should, in addition, let your WebGL-rendering callback be a requestAnimationFrame callback: if you do so, the browser will also take care of the complex details of animation scheduling for you.
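Putting those two rules together, a minimal render loop might look like this (drawScene stands in for your own rendering code):

function renderFrame(timestamp) {
  // All of this frame's WebGL rendering happens inside this single
  // callback, so the browser never composites a half-rendered frame.
  gl.clear(gl.COLOR_BUFFER_BIT | gl.DEPTH_BUFFER_BIT);
  drawScene(timestamp); // your own drawing code

  // Let the browser schedule the next frame.
  requestAnimationFrame(renderFrame);
}
requestAnimationFrame(renderFrame);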

    WebGL as a general, low-level graphics API

We haven’t yet described how WebGL is a low-level graphics API where 2D and 3D are just two possible usage patterns. In fact, the very idea that such a general graphics API may exist is non-trivial: it took the industry many years to arrive at such APIs.

WebGL allows you to draw points, line segments, or triangles. The last of these is of course what’s used most of the time, so we will focus entirely on triangles in the rest of this article.

    WebGL’s triangle rendering is very general: the application provides a callback, called the pixel shader or fragment shader, that will be called on each pixel of the triangle, and will determine the color in which it should be drawn.

So suppose that you’re coding an old-school 2D game. All you want is to draw rectangular bitmap images. As WebGL can only draw triangles (more on this below), you decompose your rectangle into two triangles as follows,

    A rectangle decomposed as two triangles.

    and your fragment shader, i.e. the program that determines the color of each pixel, is very simple: it will just read one pixel from the bitmap image, and use it as the color for the pixel currently being rendered.

Suppose now that you’re coding a 3D game. You have tessellated your 3D shapes into triangles. Why triangles? Triangles are the most popular 3D drawing primitive because any 3 points in 3D space are the vertices of a triangle. By contrast, you cannot just take any 4 points in 3D space to define a quadrilateral — they would typically fail to lie exactly in the same plane. That’s why WebGL doesn’t bother with any kind of polygon besides triangles.

    So your 3D game just needs to be able to render 3D triangles. In 3D, it is a little bit tricky to transform 3D coordinates into actual canvas coordinates — i.e. to determine where in the canvas a given 3D object should end up being drawn. There is no one-size-fits-all formula there: for example, you could want to render fancy underwater or glass refraction effects, that would inevitably require a custom computation for each vertex. So WebGL allows you to provide your own callback, called the vertex shader, that will be called for each vertex of each triangle you will render, and will determine the canvas coordinates at which it should be drawn.

One would naturally expect these canvas coordinates to be 2D coordinates, as the canvas is a 2D surface; but they are actually 3D coordinates, where the Z coordinate is used for depth testing purposes. Two points differing only by their Z coordinate correspond to the same pixel on screen, and the Z coordinates are used to determine which one hides the other. All three axes go from -1.0 to +1.0. It’s important to understand that this is the only coordinate system natively understood by WebGL: any other coordinate system is only understood by your own vertex shader, where you implement the transformation to canvas coordinates.

    The WebGL canvas coordinate system.

    Once the canvas coordinates of your 3D triangles are known (thanks to your vertex shader), your triangles will be painted, like in the above-discussed 2D example, by your fragment shader. In the case of a 3D game though, your fragment shader will typically be more intricate than in a 2D game, as the effective pixel colors in a 3D game are not as easily determined by static data. Various effects, such as lighting, may play a role in the effective color that a pixel will have on screen. In WebGL, you have to implement all these effects yourself. The good news is that you can: as said above, WebGL lets you specify your own callback, the fragment shader, that determines the effective color of each pixel.

    Thus we see how WebGL is a general enough API to encompass the needs of both 2D and 3D applications. By letting you specify arbitrary vertex shaders, it allows implementing arbitrary coordinate transformations, including the complex ones that 3D games need to perform. By accepting arbitrary fragment shaders, it allows implementing arbitrary pixel color computations, including subtle lighting effects as found in 3D games. But the WebGL API isn’t specific to 3D graphics and can be used to implement almost any kind of realtime 2D or 3D graphics — it scales all the way down to 1980s era monochrome bitmap or wireframe games, if that’s what you want. The only thing that’s out of reach of WebGL is the most intensive rendering techniques that require tapping into recently added features of high-end graphics hardware. Even so, the plan is to keep advancing the WebGL feature set as is deemed appropriate to keep the right balance of portability vs features.

    The WebGL rendering pipeline

    So far we’ve discussed some aspects of how WebGL works, but mostly incidentally. Fortunately, it doesn’t take much more to explain in a systematic way how WebGL rendering proceeds.

    The key metaphor here is that of a pipeline. It’s important to understand it because it’s a universal feature of all current graphics hardware, and understanding it will help you instinctively write code that is more hardware-friendly, and thus, runs faster.

    GPUs are massively parallel processors, consisting of a large number of computation units designed to work in parallel with each other, and in parallel with the CPU. That is true even in mobile devices. With that in mind, graphics APIs such as WebGL are designed to be inherently friendly to such parallel architectures. On typical work loads, and when correctly used, WebGL allows the GPU to execute graphics commands in parallel with any CPU-side work, i.e. the GPU and the CPU should not have to wait for each other; and WebGL allows the GPU to max out its parallel processing power. It is in order to allow running on the GPU that these shaders are written in a dedicated GPU-friendly language rather than in JavaScript. It is in order to allow the GPU to run many shaders simultaneously that shaders are just callbacks handling one vertex or one pixel each — so that the GPU is free to run shaders on whichever GPU execution unit and in whichever order it pleases.

    The following diagram summarizes the WebGL rendering pipeline:

    The WebGL rendering pipeline

    The application sets up its vertex shader and fragment shader, and gives WebGL any data that these shaders will need to read from: vertex data describing the triangles to be drawn, bitmap data (called “textures”) that will be used by the fragment shader. Once this is set up, the rendering starts by executing the vertex shader for each vertex, which determines the canvas coordinates of triangles; the resulting triangles are then rasterized, which means that the list of pixels to be painted is determined; the fragment shader is then executed for each pixel, determining its color; finally, some framebuffer operation determines how this computed color affects the final framebuffer’s pixel color at this location (this final stage is where effects such as depth testing and transparency are implemented).

    GPU-side memory vs main memory

    Some GPUs, especially on desktop machines, use their own memory that’s separate from main memory. Other GPUs share the same memory as the rest of the system. As a WebGL developer, you can’t know what kind of system you’re running on. But that doesn’t matter, because WebGL forces you to think in terms of dedicated GPU memory.

All that matters from a practical perspective is that:

    • WebGL rendering data must first be uploaded to special WebGL data structures. Uploading means copying data from general memory to WebGL-specific memory. These special WebGL data structures are called WebGL textures (bitmap images) and WebGL buffers (generic byte arrays).
    • Once that data is uploaded, rendering is really fast.
    • But uploading that data is generally slow.
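For example, the upload-once, render-many-times pattern for a WebGL buffer looks like this (vertices and vertexCount are assumed application data; vertex attribute setup is omitted for brevity):

// Upload vertex data once, up front (slow, done rarely)...
var buffer = gl.createBuffer();
gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
gl.bufferData(gl.ARRAY_BUFFER, new Float32Array(vertices), gl.STATIC_DRAW);

// ...then reuse it every frame (fast, done often).
gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
gl.drawArrays(gl.TRIANGLES, 0, vertexCount);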

    In other words, think of the GPU as a really fast machine, but one that’s really far away. As long as that machine can operate independently, it’s very efficient. But communicating with it from the outside takes very long. So you want to do most of the communication ahead of time, so that most of the rendering can happen independently and fast.

    Not all GPUs are actually so isolated from the rest of the system — but WebGL forces you to think in these terms so that your code will run efficiently no matter what particular GPU architecture a given client uses. WebGL data structures abstract the possibility of dedicated GPU memory.

    Some things that make graphics slow

    Finally, we can draw from what was said above a few general ideas about what can make graphics slow. This is by no means an exhaustive list, but it does cover some of the most usual causes of slowness. The idea is that such knowledge is useful to any programmer ever touching graphics code — regardless of whether they use WebGL. In this sense, learning some concepts around WebGL is useful for much more than just WebGL programming.

    Using the CPU for graphics is slow

    There is a reason why GPUs are found in all current client systems, and why they are so different from CPUs. To do fast graphics, you really need the parallel processing power of the GPU. Unfortunately, automatically using the GPU in a browser engine is a difficult task. Browser vendors do their best to use the GPU where appropriate, but it’s a hard problem. By using WebGL, you take ownership of this problem for your content.

    Having the GPU and the CPU wait for each other is slow

    The GPU is designed to be able to run in parallel with the CPU, independently. Inadvertently causing the GPU and CPU to wait for each other is a common cause of slowness. A typical example is reading back the contents of a WebGL framebuffer (the WebGL readPixels function). This may require the CPU to wait for the GPU to finish any queued rendering, and may then also require the GPU to wait for the CPU to have received the data. So as far as you can, think of the WebGL framebuffer as a write-only medium.

    Sending data to the GPU may be slow

    As mentioned above, GPU memory is abstracted by WebGL data structures such as textures. Such data is best uploaded once to WebGL and then used many times. Uploading new data too frequently is a typical cause of slowness: the uploading is slow by itself, and if you upload data right before rendering with it, the GPU has to wait for the data before it can proceed with rendering — so you’re effectively gating your rendering speed on slow memory transfers.

    Small rendering operations are slow

    GPUs are intended to be used to draw large batches of triangles at once. If you have 10,000 triangles to draw, doing it in one single operation (as WebGL allows) will be much faster than doing 10,000 separate draw operations of one triangle each. Think of a GPU as a very fast machine with a very long warm-up time. Better warm up once and do a large batch of work, than pay for the warm-up cost many times. Organizing your rendering into large batches does require some thinking, but it’s worth it.
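In WebGL terms, the difference looks roughly like this:

// Slow: 10,000 separate draw operations of one triangle each.
for (var i = 0; i < 10000; i++) {
  gl.drawArrays(gl.TRIANGLES, i * 3, 3);
}

// Fast: one draw operation covering all 30,000 vertices at once.
gl.drawArrays(gl.TRIANGLES, 0, 30000);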

    Where to learn WebGL

    We intentionally didn’t write a tutorial here because there already exist so many good ones:

    Allow me to also mention that talk I gave, as it has some particularly minimal examples.