The whole web at maximum FPS: How WebRender gets rid of jank

The Firefox Quantum release is getting close. It brings many performance improvements, including the super fast CSS engine that we brought over from Servo.

But there’s another big piece of Servo technology that’s not in Firefox Quantum quite yet, though it’s coming soon. That’s WebRender, which is being added to Firefox as part of the Quantum Render project.

Drawing of a jet engine labeled with the different Project Quantum projects

WebRender is known for being extremely fast. But WebRender isn’t really about making rendering faster. It’s about making it smoother.

With WebRender, we want apps to run at a silky smooth 60 frames per second (FPS) or better no matter how big the display is or how much of the page is changing from frame to frame. And it works. Pages that chug along at 15 FPS in Chrome or today’s Firefox run at 60 FPS with WebRender.

So how does WebRender do that? It fundamentally changes the way the rendering engine works to make it more like a 3D game engine.

Let’s take a look at what this means. But first…

What does a renderer do?

In the article on Stylo, I talked about how the browser goes from HTML and CSS to pixels on the screen, and how most browsers do this in five steps.

We can split these five steps into two halves. The first half basically builds up a plan. To make this plan, it combines the HTML and CSS with information like the viewport size to figure out exactly what each element should look like—its width, height, color, etc. The end result is something called a frame tree or a render tree.

The second half—painting and compositing—is what a renderer does. It takes that plan and turns it into pixels to display on the screen.

Diagram dividing the 5 stages of rendering into two groups, with a frame tree being passed from part 1 to part 2

But the browser doesn’t just have to do this once for a web page. It has to do it over and over again for the same web page. Any time something changes on this page—for example, a div is toggled open—the browser has to go through a lot of these steps.

Diagram showing the steps that get redone on a click: style, layout, paint, and composite

Even in cases where nothing’s really changing on the page—for example where you’re scrolling or where you are highlighting some text on the page—the browser still has to go through at least some of the second part again to draw new pixels on the screen.

Diagram showing the steps that get redone on scroll: composite

If you want things like scrolling or animation to look smooth, they need to be going at 60 frames per second.

You may have heard this phrase—frames per second (FPS)—before, without being sure what it meant. I think of this like a flip book. It’s like a book of drawings that are static, but you can use your thumb to flip through so that it looks like the pages are animated.

In order for the animation in this flip book to look smooth, you need to have 60 pages for every second in the animation.

Picture of a flipbook with a smooth animation next to it

The pages in this flip book are made out of graph paper. There are lots and lots of little squares, and each of the squares can only contain one color.

The job of the renderer is to fill in the boxes in this graph paper. Once all of the boxes in the graph paper are filled in, it is finished rendering the frame.

Now, of course there is not actual graph paper inside of your computer. Instead, there’s a section of memory in the computer called a frame buffer. Each memory address in the frame buffer is like a box in the graph paper… it corresponds to a pixel on the screen. The browser will fill in each slot with the numbers that represent the color in RGBA (red, green, blue, and alpha) values.

A stack of memory addresses with RGBA values that are correlated to squares in a grid (pixels)
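
To make the graph paper idea concrete, here is a rough sketch in Python. It only models the idea (a flat block of bytes with four bytes per pixel) and is not how any browser actually allocates its frame buffer. The names `make_frame_buffer`, `set_pixel`, and `get_pixel` are made up for illustration.

```python
BYTES_PER_PIXEL = 4  # one byte each for red, green, blue, and alpha


def make_frame_buffer(width, height):
    """Allocate a zeroed buffer: every pixel starts as transparent black."""
    return bytearray(width * height * BYTES_PER_PIXEL)


def set_pixel(buffer, width, x, y, rgba):
    """Write one RGBA color into the slot for pixel (x, y)."""
    offset = (y * width + x) * BYTES_PER_PIXEL
    buffer[offset:offset + BYTES_PER_PIXEL] = bytes(rgba)


def get_pixel(buffer, width, x, y):
    """Read back the RGBA color stored for pixel (x, y)."""
    offset = (y * width + x) * BYTES_PER_PIXEL
    return tuple(buffer[offset:offset + BYTES_PER_PIXEL])


# A tiny 4x3 "screen" with one opaque red pixel.
frame = make_frame_buffer(width=4, height=3)
set_pixel(frame, 4, x=2, y=1, rgba=(255, 0, 0, 255))
```

Rendering a frame, in this model, just means calling something like `set_pixel` for every box on the graph paper.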

When the display needs to refresh itself, it will look at this section of memory.

Most computer displays will refresh 60 times per second. This is why browsers try to render pages at 60 frames per second. That means the browser has 16.67 milliseconds to do all of the setup (CSS styling, layout, painting) and fill in all of the slots in the frame buffer with pixel colors. This window of time between two frames (16.67 ms) is called the frame budget.
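
The 16.67 ms figure is just one second divided by the refresh rate. Here is that arithmetic as a small sketch (the function name `frame_budget_ms` is my own, for illustration):

```python
def frame_budget_ms(refresh_rate_hz):
    """Time available to produce one frame, in milliseconds."""
    return 1000.0 / refresh_rate_hz


budget_60 = frame_budget_ms(60)  # about 16.67 ms
budget_90 = frame_budget_ms(90)  # about 11.11 ms, for a 90 FPS display
```

The faster the display refreshes, the smaller the budget gets.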

Sometimes you hear people talk about dropped frames. A dropped frame is when the system doesn’t finish its work within the frame budget. The display tries to get the new frame from the frame buffer before the browser is done filling it in. In this case, the display shows the old version of the frame again.

A dropped frame is kind of like if you tore a page out of that flip book. It would make the animation seem to stutter or jump because you’re missing the transition between the previous page and the next.
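
As a simplified sketch, you could spot dropped frames by comparing how long each frame took against the frame budget. The timings below are invented for illustration, and the count ignores the fact that one very slow frame can miss more than one refresh.

```python
def count_dropped_frames(frame_times_ms, budget_ms=1000.0 / 60):
    """Count frames whose work took longer than the frame budget."""
    return sum(1 for t in frame_times_ms if t > budget_ms)


# Hypothetical timings: mostly cheap frames, with two expensive ones.
timings = [4.0, 5.2, 3.9, 22.5, 4.1, 4.4, 35.0, 4.0]
dropped = count_dropped_frames(timings)  # the 22.5 ms and 35.0 ms frames
```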

Picture of a flipbook missing a page with a janky animation next to it

So we want to make sure that we get all of these pixels into the frame buffer before the display checks it again. Let’s look at how browsers have historically done this, and how that has changed over time. Then we can see how we can make this faster.

A brief history of painting and compositing

Note: Painting and compositing is where browser rendering engines are the most different from each other. Single-platform browsers (Edge and Safari) work a bit differently than multi-platform browsers (Firefox and Chrome) do.

Even in the earliest browsers, there were some optimizations to make pages render faster. For example, if you were scrolling content, the browser would keep the part that was still visible and move it. Then it would paint new pixels in the blank spot.

This process of figuring out what has changed and then only updating the changed elements or pixels is called invalidation.

As time went on, browsers started applying more invalidation techniques, like rectangle invalidation. With rectangle invalidation, you figure out the smallest rectangle around each part of the screen that changed. Then, you only redraw what’s inside those rectangles.

This really reduces the amount of work that you need to do when there’s not much changing on the page… for example, when you have a single blinking cursor.

Blinking cursor with small repaint rectangle around it
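
Here is a minimal sketch of the idea behind rectangle invalidation. It finds the smallest rectangle around the changed pixels by comparing two frames. That is a simplification I'm assuming for illustration: a real browser works out what changed from the elements on the page rather than by diffing pixels, and `dirty_rect` is a made-up name.

```python
def dirty_rect(old_frame, new_frame):
    """Return (left, top, right, bottom) of the smallest rectangle that
    contains every changed pixel, or None if nothing changed.

    Frames are lists of rows; right and bottom are exclusive.
    """
    changed = [
        (x, y)
        for y, (old_row, new_row) in enumerate(zip(old_frame, new_frame))
        for x, (old, new) in enumerate(zip(old_row, new_row))
        if old != new
    ]
    if not changed:
        return None
    xs = [x for x, _ in changed]
    ys = [y for _, y in changed]
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


# A 6x4 "screen" where a 1x2 cursor blinks on at column 3, rows 1 and 2.
before = [[0] * 6 for _ in range(4)]
after = [row[:] for row in before]
after[1][3] = 1
after[2][3] = 1

rect = dirty_rect(before, after)
```

Only the two pixels inside `rect` need to be redrawn, instead of all 24.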

But that doesn’t help much when large parts of the page are changing. So the browsers came up with new techniques to handle those cases.

Introducing layers and compositing

Using layers can help a lot when large parts of the page are changing… at least, in certain cases.

The layers in browsers are a lot like layers in Photoshop, or the onion skin layers that were used in hand-drawn animation. Basically, you paint different elements of the page on different layers. Then you place those layers on top of each other.

They have been a part of the browser for a long time, but they weren’t always used to speed things up. At first, they were just used to make sure pages rendered correctly. They corresponded to something called stacking contexts.

For example, if you had a translucent element, it would be in its own stacking context. That meant it got its own layer so you could blend its color with the color below it. These layers were thrown out as soon as the frame was done. On the next frame, all the layers would be repainted again.

Layers for opacity generated, then frame rendered, then thrown out

But often the things on these layers didn’t change from frame to frame. For example, think of a traditional animation. The background doesn’t change, even if the characters in the foreground do. It’s a lot more efficient to keep that background layer around and just reuse it.

So that’s what browsers did. They retained the layers. Then the browser could just repaint layers that had changed. And in some cases, layers weren’t even changing. They just needed to be rearranged—for example, if an animation was moving across the screen, or something was being scrolled.

Two layers moving relative to each other as a scroll box is scrolled

This process of arranging layers together is called compositing. The compositor starts with:

  • source bitmaps: the background (including a blank box where the scrollable content should be) and the scrollable content itself
  • a destination bitmap, which is what gets displayed on the screen

First, the compositor would copy the background to the destination bitmap.

Then it would figure out what part of the scrollable content should be showing. It would copy that part over to the destination bitmap.

Source bitmaps on the left, destination bitmap on the right
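
The two copies described above can be sketched like this. Bitmaps are modeled as lists of rows, and `blit` and `composite` are hypothetical helpers for illustration, not real browser APIs.

```python
def blit(dest, source, dest_x, dest_y, src_x=0, src_y=0, width=None, height=None):
    """Copy a width x height region of source into dest at (dest_x, dest_y)."""
    if width is None:
        width = len(source[0]) - src_x
    if height is None:
        height = len(source) - src_y
    for row in range(height):
        for col in range(width):
            dest[dest_y + row][dest_x + col] = source[src_y + row][src_x + col]


def composite(background, content, box, scroll_offset):
    """Build the destination bitmap for a page with one scroll box.

    box is (x, y, width, height) of the scroll box within the background.
    scroll_offset is how many rows of the content are scrolled out of view.
    """
    dest = [[None] * len(background[0]) for _ in background]
    blit(dest, background, 0, 0)  # first, copy the whole background
    x, y, w, h = box
    blit(dest, content, x, y, 0, scroll_offset, w, h)  # then the visible content
    return dest


background = [["B"] * 4 for _ in range(4)]
content = [[str(n)] * 2 for n in range(6)]  # 6 rows tall, 2 pixels wide
frame = composite(background, content, box=(1, 1, 2, 2), scroll_offset=3)
```

Scrolling only changes `scroll_offset`. Neither source bitmap has to be repainted.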

This reduced the amount of painting that the main thread had to do. But the main thread was still spending a lot of time on compositing. And there were lots of things competing for time on the main thread.

I’ve talked about this before, but the main thread is kind of like a full-stack developer. It’s in charge of the DOM, layout, and JavaScript. And it also was in charge of painting and compositing.

Main thread doing DOM, JS, and layout, plus paint and composite

Every millisecond the main thread spends doing paint and composite is time it can’t spend on JavaScript or layout.

CPU working on painting and thinking "I really should get to that JS soon"

But there was another part of the hardware that was lying around without much work to do. And this hardware was specifically built for graphics. That was the GPU, which games have been using since the late 90s to render frames quickly. And GPUs have been getting bigger and more powerful ever since then.

A drawing of a computer chip with 4 CPU cores and a GPU

GPU accelerated compositing

So browser developers started moving things over to the GPU.

There are two tasks that could potentially move over to the GPU:

  1. Painting the layers
  2. Compositing them together

It can be hard to move painting to the GPU. So for the most part, multi-platform browsers kept painting on the CPU.

But compositing was something that the GPU could do very quickly, and it was easy to move over to the GPU.

Main thread passing layers to GPU

Some browsers took this parallelism even further and added a compositor thread on the CPU. It became a manager for the compositing work that was happening on the GPU. This meant that if the main thread was doing something (like running JavaScript), the compositor thread could still handle things for the user, like scrolling content up when the user scrolled.

Compositor thread sitting between main thread and GPU, passing layers to GPU

So this moves all of the compositing work off of the main thread. It still leaves a lot of work on the main thread, though. Whenever we need to repaint a layer, the main thread needs to do it, and then transfer that layer over to the GPU.

Some browsers moved painting off to another thread (and we’re working on that in Firefox today). But it’s even faster to move this last little bit of work — painting — to the GPU.

GPU accelerated painting

So browsers started moving painting to the GPU, too.

Paint and composite handled by the GPU

Browsers are still in the process of making this shift. Some browsers paint on the GPU all of the time, while others only do it on certain platforms (like only on Windows, or only on mobile devices).

Painting on the GPU does a few things. It frees up the CPU to spend all of its time doing things like JavaScript and layout. Plus, GPUs are much faster at drawing pixels than CPUs are, so it speeds painting up. It also means less data needs to be copied from the CPU to the GPU.

But maintaining this division between paint and composite still has some costs, even when they are both on the GPU. This division also limits the kinds of optimizations that you can use to make the GPU do its work faster.

This is where WebRender comes in. It fundamentally changes the way we render, removing the distinction between paint and composite. This gives us a way to tailor the performance of our renderer to give you the best user experience on today’s web, and to best support the use cases that you will see on tomorrow’s web.

This means we don’t just want to make frames render faster… we want to make them render more consistently and without jank. And even when there are lots of pixels to draw, like on 4K displays or WebVR headsets, we still want the experience to be just as smooth.

When do current browsers get janky?

The optimizations above have helped pages render faster in certain cases. When not much is changing on a page—for example, when there’s just a single blinking cursor—the browser will do the least amount of work possible.

Blinking cursor with small repaint rectangle around it

Breaking up pages into layers has expanded the number of those best-case scenarios. If you can paint a few layers and then just move them around relative to each other, then the painting+compositing architecture works well.

Rotating clock hand as a layer on top of another layer

But there are also trade-offs to using layers. They take up a lot of memory and can actually make things slower. Browsers need to combine layers where it makes sense… but it’s hard to tell where it makes sense.

This means that if there are a lot of different things moving on the page, you can end up with too many layers. These layers fill up memory and take too long to transfer to the compositor.

Many layers on top of each other

Other times, you’ll end up with one layer when you should have multiple layers. That single layer will be continually repainted and transferred to the compositor, which then composites it without changing anything.

This means you’ve doubled the amount of drawing you have to do, touching each pixel twice without getting any benefit. It would have been faster to simply render the page directly, without the compositing step.

Paint and composite producing the same bitmap

And there are lots of cases where layers just don’t help much. For example, if you animate background color, the whole layer has to be repainted anyway. These layers only help with a small number of CSS properties.

Even if most of your frames are best-case scenarios—that is, they only take up a tiny bit of the frame budget—you can still get choppy motion. For perceptible jank, only a couple of frames need to fall into worst-case scenarios.

Frame timeline with a few frames that go over the frame budget, causing jank

These scenarios are called performance cliffs. Your app seems to be moving along fine until it hits one of these worst-case scenarios (like animating background color) and all of a sudden your app’s frame rate topples over the edge.

Person falling over the edge of a cliff labeled animating background color

But we can get rid of these performance cliffs.

How do we do this? We follow the lead of 3D game engines.

Using the GPU like a game engine

What if we stopped trying to guess what layers we need? What if we removed this boundary between painting and compositing and just went back to painting every pixel on every frame?

This may sound like a ridiculous idea, but it actually has some precedent. Modern-day video games repaint every pixel, and they maintain 60 frames per second more reliably than browsers do. And they do it in an unexpected way… instead of creating these invalidation rectangles and layers to minimize what they need to paint, they just repaint the whole screen.

Wouldn’t rendering a web page like that be way slower?

If we paint on the CPU, it would be. But GPUs are designed to make this work.

GPUs are built for extreme parallelism. I talked about parallelism in my last article about Stylo. With parallelism, the machine can do multiple things at the same time. The number of things it can do at once is limited by the number of cores that it has.

CPUs usually have between 2 and 8 cores. GPUs usually have at least a few hundred cores, and often more than 1,000 cores.

These cores work a little differently, though. They can’t act completely independently like CPU cores can. Instead, they usually work on something together, running the same instruction on different pieces of the data.

CPU cores working independently, GPU cores working together

This is exactly what you need when you’re filling in pixels. Each pixel can be filled in by a different core. Because it can work on hundreds of pixels at a time, the GPU is a lot faster at filling in pixels than the CPU… but only if you make sure all of those cores have work to do.

Because cores need to work on the same thing at the same time, GPUs have a pretty rigid set of steps that they go through, and their APIs are pretty constrained. Let’s take a look at how this works.

First, you need to tell the GPU what to draw. This means giving it shapes and telling it how to fill them in.

To do this, you break up your drawing into simple shapes (usually triangles). These shapes are in 3D space, so some shapes can be behind others. Then you take all of the corners of those triangles and put their x, y, and z coordinates into an array.

Then you issue a draw call—you tell the GPU to draw those shapes.

CPU passing triangle coordinates to GPU
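
As a small sketch of that setup step, here is how a rectangle (the most common shape on a web page) could be broken into two triangles and flattened into an array of coordinates. The function name and the vertex order are assumptions for illustration; real graphics APIs have their own conventions for this.

```python
def rect_to_triangles(x, y, width, height, z=0.0):
    """Split an axis-aligned rectangle into two triangles.

    Returns a flat list of x, y, z coordinates: 6 vertices, 18 numbers.
    """
    top_left = (x, y, z)
    top_right = (x + width, y, z)
    bottom_left = (x, y + height, z)
    bottom_right = (x + width, y + height, z)
    triangles = [
        top_left, top_right, bottom_left,      # first triangle
        top_right, bottom_right, bottom_left,  # second triangle
    ]
    return [coord for vertex in triangles for coord in vertex]


vertices = rect_to_triangles(10, 20, 100, 50, z=0.5)
```

An array like `vertices` is the kind of thing that gets handed to the GPU along with the draw call.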

From there, the GPU takes over. All of the cores will work on the same thing at the same time. They will:

  1. Figure out where all of the corners of the shapes are. This is called vertex shading.

     GPU cores drawing vertexes on a graph

  2. Figure out the lines that connect those corners. From this, you can figure out which pixels are covered by the shape. That’s called rasterization.

     GPU cores drawing lines between vertexes

  3. Now that we know what pixels are covered by a shape, go through each pixel in the shape and figure out what color it should be. This is called pixel shading.

     GPU cores filling in pixels
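
The three steps above can be sketched as a tiny software renderer. This is only a model: it runs one pixel at a time on the CPU, where a GPU would spread the pixels across its cores, and the vertex shader here does nothing more than move the corners by an offset. All of the function names are made up for illustration.

```python
def edge(a, b, p):
    """Signed area test: which side of the line a->b the point p is on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def vertex_shader(vertex, offset):
    """Step 1: work out where a corner ends up (here, just a translation)."""
    return (vertex[0] + offset[0], vertex[1] + offset[1])


def rasterize(triangle, width, height):
    """Step 2: list the pixels whose centers the triangle covers."""
    a, b, c = triangle
    covered = []
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)
            w0, w1, w2 = edge(a, b, p), edge(b, c, p), edge(c, a, p)
            # Inside if the center is on the same side of all three edges.
            all_positive = w0 >= 0 and w1 >= 0 and w2 >= 0
            all_negative = w0 <= 0 and w1 <= 0 and w2 <= 0
            if all_positive or all_negative:
                covered.append((x, y))
    return covered


def solid_color_shader(x, y):
    """Step 3: a trivial pixel shader that returns one color everywhere."""
    return (255, 0, 0, 255)


def draw(triangle, offset, width, height, pixel_shader):
    """Run all three steps and return the finished frame."""
    frame = [[(0, 0, 0, 0)] * width for _ in range(height)]
    moved = [vertex_shader(v, offset) for v in triangle]
    for x, y in rasterize(moved, width, height):
        frame[y][x] = pixel_shader(x, y)
    return frame


frame = draw([(0, 0), (4, 0), (0, 4)], offset=(1, 1), width=6, height=6,
             pixel_shader=solid_color_shader)
```

Swapping `solid_color_shader` for a different function changes how the shape is filled in without touching the first two steps.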

This last step can be done in different ways. To tell the GPU how to do it, you give the GPU a program called a pixel shader. Pixel shading is one of the few parts of the GPU that you can program.

Some pixel shaders are simple. For example, if your shape is a single color, then your shader program just needs to return that color for each pixel in the shape.

Other times, it’s more complex, like when you have a background image. You need to figure out which part of the image corresponds to each pixel. You can do this in the same way an artist scales an image up or down… put a grid on top of the image that corresponds to each pixel. Then, once you know which box corresponds to the pixel, take samples of the colors inside that box and figure out what the color should be. This is called texture mapping because it maps the image (called a texture) to the pixels.

Hi-res image being mapped to a much lower resolution space
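
Here is a minimal sketch of that grid idea for scaling an image down. It assumes a grayscale texture whose size is a whole multiple of the output size, and it averages every texel in each box. Real GPUs use their own sampling and filtering hardware, so treat this as an illustration of the concept only.

```python
def downsample(texture, out_width, out_height):
    """Map a grayscale texture onto a smaller grid of pixels.

    Each output pixel covers a box of texels; its value is their average.
    Assumes the texture size is a whole multiple of the output size.
    """
    box_h = len(texture) // out_height
    box_w = len(texture[0]) // out_width
    result = []
    for oy in range(out_height):
        row = []
        for ox in range(out_width):
            samples = [
                texture[oy * box_h + dy][ox * box_w + dx]
                for dy in range(box_h)
                for dx in range(box_w)
            ]
            row.append(sum(samples) / len(samples))
        result.append(row)
    return result


# A 4x4 texture mapped onto a 2x2 area of the screen.
texture = [
    [0, 0, 100, 100],
    [0, 0, 100, 100],
    [50, 150, 20, 40],
    [50, 150, 60, 80],
]
pixels = downsample(texture, 2, 2)
```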

The GPU will call your pixel shader program on each pixel. Different cores will work on different pixels at the same time, in parallel, but they all need to be using the same pixel shader program. When you tell the GPU to draw your shapes, you tell it which pixel shader to use.

For almost any web page, different parts of the page will need to use different pixel shaders.

Because the shader applies to all of the shapes in the draw call, you usually have to break up your draw calls into multiple groups. These are called batches. To keep all of the cores as busy as possible, you want to create a small number of batches which have lots of shapes in them.

CPU passing a box containing lots of coordinates and a pixel shader to the GPU

So that’s how the GPU splits up work across hundreds or thousands of cores. It’s only because of this extreme parallelism that we can think of rendering everything on each frame. Even with the extreme parallelism, though, it’s still a lot of work. You still need to be smart about how you do this. Here’s where WebRender comes in…

How WebRender works with the GPU

Let’s go back to look at the steps the browser goes through to render the page. Two things will change here.

Diagram showing the stages of the rendering pipeline with two changes. The frame tree is now a display list and paint and composite have been combined into Render.

  1. There’s no longer a distinction between paint and composite… they are both part of the same step. The GPU does them at the same time based on the graphics API commands that were passed to it.
  2. Layout now gives us a different data structure to render. Before, it was something called a frame tree (or render tree in Chrome). Now, it passes off a display list.

The display list is a set of high-level drawing instructions. It tells us what we need to draw without being specific to any graphics API.

Whenever there’s something new to draw, the main thread gives that display list to the RenderBackend, which is WebRender code that runs on the CPU.

The RenderBackend’s job is to take this list of high-level drawing instructions and convert it to the draw calls that the GPU needs, which are batched together to make them run faster.

Diagram of the 4 different threads, with a RenderBackend thread between the main thread and compositor thread. The RenderBackend thread translates the display list into batched draw calls

Then the RenderBackend will pass those batches off to the compositor thread, which passes them to the GPU.

The RenderBackend wants to make the draw calls it’s giving to the GPU as fast to run as possible. It uses a few different techniques for this.

Removing any unnecessary shapes from the list (Early culling)

The best way to save time is to not do the work at all.

First, the RenderBackend cuts down the list of display items. It figures out which display items will actually be on the screen. To do this, it looks at things like how far down the scroll is for each scroll box.

If any part of a shape is inside the box, then it is included. If none of the shape would have shown up on the page, though, it’s removed. This process is called early culling.

A browser window with some parts off screen. Next to that is a display list with the offscreen elements removed
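
A minimal sketch of early culling, assuming every display item is described by a simple rectangle in page coordinates. The item names and the `cull` helper are hypothetical, and WebRender's real display items carry much more information than this.

```python
def intersects(rect, viewport):
    """True if two (x, y, width, height) rectangles overlap at all."""
    rx, ry, rw, rh = rect
    vx, vy, vw, vh = viewport
    return rx < vx + vw and rx + rw > vx and ry < vy + vh and ry + rh > vy


def cull(display_items, viewport):
    """Keep only the display items that are at least partly on screen."""
    return [item for item in display_items if intersects(item["rect"], viewport)]


# Hypothetical display items; the viewport is scrolled 500px down the page.
items = [
    {"name": "header", "rect": (0, 0, 800, 100)},
    {"name": "article", "rect": (0, 100, 800, 900)},
    {"name": "sidebar", "rect": (700, 450, 200, 200)},
    {"name": "footer", "rect": (0, 2000, 800, 100)},
]
visible = cull(items, viewport=(0, 500, 800, 600))
```

The sidebar only pokes partway into the viewport, but it is kept because part of it is visible. The header and footer are dropped before any drawing work happens.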

Minimizing the number of intermediate textures (The render task tree)

Now we have a tree that only contains the shapes we’ll use. This tree is organized into those stacking contexts we talked about before.

Effects like CSS filters and stacking contexts make things a little complicated. For example, let’s say you have an element that has an opacity of 0.5 and it has children. You might think that each child is transparent… but it’s actually the whole group that’s transparent.

Three overlapping boxes that are translucent, so they show through each other, next to a translucent shape formed by the three boxes where the boxes don't show through each other

Because of this, you need to render the group out to a texture first, with each box at full opacity. Then, when you’re placing it in the parent, you can change the opacity of the whole texture.
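
To see why the order matters, here is the arithmetic for a single pixel where two black child boxes overlap on a white page. This is a worked sketch using the standard "source over" blend on one grayscale channel; the variable names are mine.

```python
def blend(source, dest, alpha):
    """'Source over' blend of one color channel (values from 0.0 to 1.0)."""
    return source * alpha + dest * (1.0 - alpha)


WHITE, BLACK = 1.0, 0.0

# Wrong: apply opacity 0.5 to each child box separately.
per_child = blend(BLACK, WHITE, 0.5)      # first box leaves 0.5 gray
per_child = blend(BLACK, per_child, 0.5)  # second box darkens it to 0.25

# Right: draw both boxes at full opacity into an intermediate texture
# (the overlapping pixel is simply black), then blend that texture once.
texture_pixel = BLACK
group = blend(texture_pixel, WHITE, 0.5)  # 0.5 gray, same as non-overlapping areas
```

With per-child opacity the overlap comes out darker than the rest of the group, which is exactly the show-through effect in the left-hand side of the picture.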

These stacking contexts can be nested… that parent might be part of another stacking context. Which means it has to be rendered out to another intermediate texture, and so on.

Creating the space for these textures is expensive. As much as possible, we want to group things into the same intermediate texture.

To help the GPU do this, we create a render task tree. With it, we know which textures need to be created before other textures. Any textures that don’t depend on others can be created in the first pass, which means they can be grouped together in the same intermediate texture.

So in the example above, if each of the three boxes also has a box shadow, we’d first do a pass to output one corner of a box shadow. (It’s slightly more complicated than this, but this is the gist.)

A 3-level tree with a root, then an opacity child, which has three box shadow children. Next to that is a render target with a box shadow corner

In the second pass, we can mirror this corner all around the box to place the box shadow on the boxes. Then we can render out the group at full opacity.

Same 3-level tree with a render target with the 3 box shape at full opacity

Next, all we need to do is change the opacity of this texture and place it where it needs to go in the final texture that will be output to the screen.

Same tree with the destination target showing the 3 box shape at decreased opacity

By building up this render task tree, we figure out the minimum number of offscreen render targets we can use. That’s good, because as I mentioned, creating the space for these render target textures is expensive.
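
One way to picture how passes fall out of the tree is to give every task a pass number based on what it depends on. This is a sketch under my own assumptions (the task names are hypothetical, including the extra blur task), not WebRender's actual scheduling code.

```python
def assign_passes(dependencies):
    """Map each render task to the pass it can run in.

    dependencies maps a task name to the tasks whose output it reads.
    A task with no dependencies runs in pass 0; otherwise it runs one
    pass after the latest of its dependencies.
    """
    passes = {}

    def pass_of(task):
        if task not in passes:
            deps = dependencies[task]
            passes[task] = 1 + max(map(pass_of, deps)) if deps else 0
        return passes[task]

    for task in dependencies:
        pass_of(task)
    return passes


# A box shadow corner feeds the opacity group, which feeds the final frame.
# The blur task is an invented second leaf to show two tasks sharing a pass.
tasks = {
    "frame": ["opacity group", "text shadow blur"],
    "opacity group": ["box shadow corner"],
    "box shadow corner": [],
    "text shadow blur": [],
}
passes = assign_passes(tasks)
```

Tasks that land in the same pass don't depend on each other, so they are candidates for sharing one intermediate texture.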

It also helps us batch things together.

Grouping draw calls together (Batching)

As we talked about before, we need to create a small number of batches which have lots of shapes in them.

Paying attention to how you create batches can really speed things up. You want to have as many shapes in the same batch as you can. This is for a couple of reasons.

First, whenever the CPU tells the GPU to do a draw call, the CPU has to do a lot of work. It has to do things like set up the GPU, upload the shader program, and test for different hardware bugs. This work adds up, and while the CPU is doing this work, the GPU might be idle.

Second, there’s a cost to changing state. Let’s say that you need to change the shader program between batches. On a typical GPU, you need to wait until all of the cores are done with the current shader. This is called draining the pipeline. Until the pipeline is drained, other cores will be sitting idle.

Multiple GPU cores standing around while one finishes with the previous pixel shader

Because of this, you want to batch as much as possible. For a typical desktop PC, you want to have 100 draw calls or fewer per frame, and you want each call to have thousands of vertices. That way, you’re making the best use of the parallelism.

We look at each pass from the render task tree and figure out what we can batch together.

At the moment, each of the different kinds of primitives requires a different shader. For example, there’s a border shader, and a text shader, and an image shader.

Boxes labeled with the type of batch they contain (e.g. Borders, Images, Rectangles)
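
As a rough sketch, batching by shader amounts to grouping primitives by the kind of shader they need. This assumes the primitives in a pass can be drawn in any order, which is not always true, and the data layout here is invented for illustration.

```python
def build_batches(primitives):
    """Group primitives that use the same shader into one batch each."""
    batches = {}
    for primitive in primitives:
        batches.setdefault(primitive["shader"], []).append(primitive["rect"])
    return batches


# Hypothetical primitives for one pass, as (x, y, width, height) rectangles.
primitives = [
    {"shader": "border", "rect": (0, 0, 200, 50)},
    {"shader": "text", "rect": (10, 10, 180, 30)},
    {"shader": "image", "rect": (0, 60, 200, 120)},
    {"shader": "text", "rect": (10, 190, 180, 30)},
    {"shader": "border", "rect": (0, 180, 200, 50)},
]
batches = build_batches(primitives)
```

Five primitives turn into three draw calls, and the shader only has to change twice.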

We believe we can combine a lot of these shaders, which will allow us to have even bigger batches, but this is already pretty well batched.

We’re almost ready to send it off to the GPU. But there’s a little bit more work we can eliminate.

Reducing pixel shading with opaque and alpha passes (Z-culling)

Most web pages have lots of shapes overlapping each other. For example, a text field sits on top of a div (with a background) which sits on top of the body (with another background).

When it’s figuring out the color for a pixel, the GPU could figure out the color of the pixel in each shape. But only the top layer is going to show. This is called overdraw and it wastes GPU time.

3 layers on top of each other with a single overlapping pixel called out across all three layers

So one thing you could do is render the top shape first. For the next shape, when you get to that same pixel, check whether or not there’s already a value for it. If there is, then don’t do the work.

3 layers where the overlapping pixel isn't filled in on the 2 bottom layers

There’s a little bit of a problem with this, though. Whenever a shape is translucent, you need to blend the colors of the two shapes. And in order for it to look right, that needs to happen back to front.

So what we do is split the work into two passes. First, we do the opaque pass. We go front to back and render all of the opaque shapes. We skip any pixels that are behind others.

Then, we do the translucent shapes. These are rendered back to front. If a translucent pixel falls on top of an opaque one, it gets blended into the opaque one. If it would fall behind an opaque shape, it doesn’t get calculated.

This process of splitting the work into opaque and alpha passes and then skipping pixel calculations that you don’t need is called Z-culling.
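
Here is a minimal sketch of the two passes on a single row of grayscale pixels. A depth value per pixel stands in for the GPU's depth buffer. The conventions (larger `z` is closer to the viewer, alpha of 1.0 means opaque) and the shape format are assumptions for illustration.

```python
def render_row(shapes, width, background=0.0):
    """Render one row of grayscale pixels using an opaque and an alpha pass.

    Each shape is a dict with: start, end (pixel span, end exclusive),
    z (larger means closer to the viewer), color, and alpha.
    Returns the pixel row and how many times a pixel was shaded.
    """
    color = [background] * width
    depth = [float("-inf")] * width  # z of the nearest opaque shape so far
    shaded = 0

    opaque = [s for s in shapes if s["alpha"] == 1.0]
    translucent = [s for s in shapes if s["alpha"] < 1.0]

    # Opaque pass: front to back, skipping pixels that are already covered.
    for shape in sorted(opaque, key=lambda s: s["z"], reverse=True):
        for x in range(shape["start"], shape["end"]):
            if depth[x] > shape["z"]:
                continue  # hidden behind something closer: no work
            color[x] = shape["color"]
            depth[x] = shape["z"]
            shaded += 1

    # Alpha pass: back to front, blending onto whatever is underneath.
    for shape in sorted(translucent, key=lambda s: s["z"]):
        for x in range(shape["start"], shape["end"]):
            if depth[x] > shape["z"]:
                continue  # behind an opaque shape: no work
            a = shape["alpha"]
            color[x] = shape["color"] * a + color[x] * (1.0 - a)
            shaded += 1

    return color, shaded


shapes = [
    {"start": 0, "end": 4, "z": 0, "color": 1.0, "alpha": 1.0},  # body
    {"start": 1, "end": 3, "z": 1, "color": 0.5, "alpha": 1.0},  # div
    {"start": 2, "end": 4, "z": 2, "color": 0.0, "alpha": 0.5},  # overlay
]
row, shaded = render_row(shapes, width=4)
```

Drawing every shape in full would shade 8 pixels here. With the two passes, only 6 get shaded, because the two body pixels under the div are skipped.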

While it may seem like a simple optimization, this has produced very big wins for us. On a typical web page, it vastly reduces the number of pixels that we need to touch, and we’re currently looking at ways to move more work to the opaque pass.

At this point, we’ve prepared the frame. We’ve done as much as we can to eliminate work.

… And we’re ready to draw!

We’re ready to set up the GPU and render our batches.

Diagram of the 4 threads with compositor thread passing off opaque pass and alpha pass to GPU

A caveat: not everything is on the GPU yet

The CPU still has to do some painting work. For example, we still render the characters (called glyphs) that are used in blocks of text on the CPU. It’s possible to do this on the GPU, but it’s hard to get a pixel-for-pixel match with the glyphs that the computer renders in other applications. So people can find it disorienting to see GPU-rendered fonts. We are experimenting with moving things like glyphs to the GPU with the Pathfinder project.

For now, these things get painted into bitmaps on the CPU. Then they are uploaded to something called the texture cache on the GPU. This cache is kept around from frame to frame because the glyphs usually don’t change.
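
The benefit of keeping that cache around can be sketched with a plain dictionary. In this model a Python dict stands in for the texture cache on the GPU, and `fake_rasterize` stands in for the real (expensive) glyph painting. Both are assumptions for illustration.

```python
class GlyphCache:
    """Keeps rasterized glyph bitmaps around between frames."""

    def __init__(self, rasterize):
        self._rasterize = rasterize  # the expensive CPU painting step
        self._bitmaps = {}
        self.misses = 0

    def get(self, font, size, char):
        """Return the bitmap for a glyph, painting it only the first time."""
        key = (font, size, char)
        if key not in self._bitmaps:
            self.misses += 1
            self._bitmaps[key] = self._rasterize(font, size, char)
        return self._bitmaps[key]


def fake_rasterize(font, size, char):
    """Stand-in for real glyph rasterization; returns a placeholder bitmap."""
    return f"<{font} {size}px '{char}'>"


cache = GlyphCache(fake_rasterize)
for _frame in range(3):  # draw the same text on three frames
    for char in "hello":
        cache.get("Fira Sans", 16, char)
```

Across three frames and fifteen glyph lookups, the painting step only runs four times: once each for h, e, l, and o.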

Even though this painting work is staying on the CPU, we can still make it faster than it is now. For example, when we’re painting the characters in a font, we split up the different characters across all of the cores. We do this using the same technique that Stylo uses to parallelize style computation… work stealing.

What’s next for WebRender?

We look forward to landing WebRender in Firefox as part of Quantum Render in 2018, a few releases after the initial Firefox Quantum release. This will make today’s pages run more smoothly. It also gets Firefox ready for the new wave of high-resolution 4K displays, because rendering performance becomes more critical as you increase the number of pixels on the screen.

But WebRender isn’t just useful for Firefox. It’s also critical to the work we’re doing with WebVR, where you need to render a different frame for each eye at 90 FPS at 4K resolution.

An early version of WebRender is currently available behind a flag in Firefox. Integration work is still in progress, so the performance is currently not as good as it will be when that is complete. If you want to keep up with WebRender development, you can follow the GitHub repo, or follow Firefox Nightly on Twitter for weekly updates on the whole Quantum Render project.

About Lin Clark

Lin works in Advanced Development at Mozilla, with a focus on Rust and WebAssembly.

63 comments

  1. Michael Aquilina

    Exciting stuff and well explained! Will webrender also work on Linux?

    October 10th, 2017 at 10:13

    1. Ryan Oswald

      Excellent article.

      While this should immediately make the web faster on firefox, it may be years before safari adopts a similar approach. In the meantime, most mobile devs will choose to build native apps for rich 60fps experiences.

      I’m curious if it may be possible to build an extremely complex polyfill purely in javascript/webgl 2.0 so that application developers don’t need to wait until 2020+ to start building rich 60fps mobile experiences in html/css.

      October 11th, 2017 at 01:14

  2. John

    > Will webrender also work on Linux?

    Of course it will.

    Today, the problem with Firefox in Linux is the lack of good hardware acceleration…

    October 10th, 2017 at 10:23

    1. Tom

      > Today, the problem with Firefox in Linux is the lack of good hardware acceleration…

      That’s not true anymore.

      During the last year, the Mesa project made a giant leap, and the current drivers for Intel and AMD are better than their equivalents on other platforms.

      October 10th, 2017 at 12:32

      1. jdjjdkdmfjfj

        It’s a lie. I say this as AMD owner. I still don’t have atomics on my Northern Islands card.

        October 16th, 2017 at 11:53

  3. Thanks for the nicely illustrated summary.

    Will I get support for any of that without having to use a firefox release which only supports the WebExtensions API?

    October 10th, 2017 at 10:36

    1. Lin Clark

      No, it won’t land until after the 57 release.

      October 10th, 2017 at 10:44

  4. Ben Sandeen

    How does this affect the performance of computers with only integrated graphics? Are there still performance wins, but just smaller ones?

    October 10th, 2017 at 10:55

    1. Sam Harrington

      There are performance wins either way – interestingly, WebRender will sometimes run faster on integrated graphics than on discrete cards!

      October 11th, 2017 at 15:26

      1. Camilo

        Only cheap discrete ones? Because cheap discrete ones are understandably worse than some integrated ones.

        October 13th, 2017 at 14:52

        1. fjdnhejdjd

          I guess he meant that there is no overhead from transferring data over the PCIe bus, since the GPU is integrated into the CPU.

          October 16th, 2017 at 11:57

  5. Kai König

    Nice visuals. Even though I only skimmed through the text, I still feel like I understood what the change is about. How long did it take to get that article together, including all the illustrations?

    October 10th, 2017 at 10:58

    1. Lin Clark

      They do take a very long time. Most of the time is in research and figuring out how to frame it and what metaphors to use.

      I was researching casually (an hour here or there) for about three months. Then I started the really in-depth, focused research and beginnings of the draft about a month ago.

      October 10th, 2017 at 11:02

  6. Vivek Gani

    This looks amazing! I’m still reading through the article, does this potentially enable the ability for Firefox to have a more fluid pinch/zoom like we see in native browsers (Safari / Edge)? If not, which area of the code would one look into to fix that?

    October 10th, 2017 at 11:46

    1. Permutator

      Zooming is exactly the sort of thing the layers-based model is bad for, so I would expect WebRender to help a lot there.

      October 12th, 2017 at 16:08

  7. caryelikarol

    Excellent

    October 10th, 2017 at 12:19

  8. Sang

    Since the GPU is going to play a larger part doesn’t this mean battery issues for mobile devices? Poor battery life performance in Chrome is one of the reasons I tend to use other browsers on my laptop. I guess my question is, is 60fps experience on the web worth the accompanying battery life drain that is sure to come with it? Much of the web does not require animation-heavy rendering.

    October 10th, 2017 at 12:51

    1. Cochon

      Same question. Apps should really care more about battery on mobiles. I would rather have longer battery time than 120 FPS on pages that most of the time have no animation.

      October 10th, 2017 at 19:43

      1. Lin Clark

        There was some discussion about this on the Hacker News thread.

        October 11th, 2017 at 06:45

      2. Marcus

        Also keep in mind: If the GPU is able to max out at a stable 60 FPS, there’s very likely some headroom (i.e. it COULD do >60 FPS but it won’t). With a less fluent solution (e.g. varying 20-40 FPS) the CPU and/or GPU is probably working at 100%, draining battery more quickly.

        October 11th, 2017 at 09:24

    2. Nikola

      It might use the built-in, energy efficient graphics most of the time, and use the discrete GPU only on heavier pages like games?

      October 11th, 2017 at 00:57

    3. PEPP

      When you move work from the CPU to the GPU, that is, you do less on the CPU and more on the GPU, you can actually save power. I believe that folks at Mozilla will definitely pay attention to this aspect because WebRender is part of the Servo project, and they carefully measured power usage on mobile devices in the past.

      October 11th, 2017 at 05:00

    4. Marcus

      Higher FPS does not necessarily mean more battery drain. In general, the GPU is much more power efficient (per pixel drawn) compared to the CPU. It would be interesting to see real life figures.

      October 11th, 2017 at 08:21

  9. Craig

    Wow, I don’t think I’ve ever read such an approachable explanation of a complex topic. Thanks for your work on this Lin!

    I’ll be looking forward to WebRender making it into Firefox for yet another big performance boost (as if FF 57 isn’t already enough!)

    October 10th, 2017 at 12:56

  10. Henrik Olsen

    Thank you for this great article (and previous articles), you explain complex concepts really well! Love the illustrations!

    October 10th, 2017 at 13:00

  11. Tushar Arora

    Thanks for putting this together Lin! You’ve got a knack for explaining complex topics in an approachable way. Can you explain in short what kind of work WebRender has to do while switching tabs?

    October 10th, 2017 at 13:29

  12. Tushar Arora

    Thanks for putting this together Lin! You’ve got a knack for explaining complex topics in an approachable way. Can you briefly explain what kind of work WebRender has to do while switching tabs?

    October 10th, 2017 at 13:37

  13. Kirill

    Incredible work! Thank you!

    October 10th, 2017 at 14:32

  14. Blake

    Hi! Great stuff.

    I was wondering about two things:

    1. Is there a list of rules of thumb for what not to do? Knowledge about what is heavy on the browser is not very accessible or talked about, like this background animation that can be costly.

    2. Are there any tools to really test and analyze rendering performance, so that one could change things and see the difference?

    Really glad to see FF getting bloomy

    October 10th, 2017 at 14:36

    1. Nexii

      Hi Blake,

      There are a lot of great resources out there, I’d particularly recommend checking out some of the videos from http://jankfree.org/ – a lot of the material is still relevant.

      For testing and analyzing, there’s nothing better than the DevTools provided in modern browsers; Firefox, Chrome, Edge and even recent versions of Safari all have great performance tools.

      I’m personally more accustomed to the Chrome DevTools and can assure you that they provide extensive performance analytics. It really depends on how far into the details you want to go.

      October 10th, 2017 at 16:34

  15. Eric Lengyel

    You should also look into the Slug library for glyph rendering on the GPU. It is at a production-ready stage of development, and its hardware requirements are much less demanding than Pathfinder’s.

    October 10th, 2017 at 16:18

  16. Nathan

    Very well written and informative. Thank you!

    Is there a reason why the jobs of the GPU cores are all enumerated 1?

    October 10th, 2017 at 16:23

    1. Lin Clark

      Thank you for the compliment! RE: everything enumerated 1… that was a mistake when copy/pasting from my editing tool, so should be fixed now :)

      October 10th, 2017 at 16:35

  17. Fredrik

    Great article! WebRender is definitely something I’m very much looking forward to.

    Is anyone familiar with similar efforts by the chrome teams? Status? Articles?

    October 10th, 2017 at 23:19

  18. Praan

    Well explained! Thanks a lot!

    October 11th, 2017 at 00:14

  19. Simon

    Thank you for this, it’s really great to better understand how things work (and how much hard work it demands to optimize them!)

    Will GPU acceleration work with canvas drawing as well?

    October 11th, 2017 at 01:32

  20. JON

    Hi, Thank you for this very pleasant and well-explained article. So good to read material like this. Much hope for the future of browsers.

    October 11th, 2017 at 01:50

  21. Josef

    Great article! I also like your article on Stylo.

    I especially like the way you use illustrations. It’s quite rare that you see playful illustrations inside a rather technical article. Makes the technical stuff feel a lot less complicated without dumbing it down until it’s unrecognizable :-) I can only imagine how much work went into this! Thanks!

    October 11th, 2017 at 02:27

  22. Hoony Chang

    I’ve been translating your articles on https://hacks.mozilla.or.kr (Korean ver.). Your drawings are so helpful for understanding what you’re explaining. Your explanations are understandable as well! Thank you always.

    October 11th, 2017 at 02:38

  23. Mathias

    Thank you very much for this article. It was very easy to follow because of its consistency and elegance and really sparked my interest in the topic.

    October 11th, 2017 at 03:19

  24. tim

    Are framerates higher than 60 going to be supported, e.g. 144hz?

    October 11th, 2017 at 05:05

    1. Lin Clark

      Yes, WebRender can go higher than 60 FPS. The WebRender project is independent of Firefox and can be used in other projects, so it is up to the embedding application that is using it to determine the framerate. But WebRender itself is capable of FPS in the hundreds (400+ FPS in some cases), as Patrick Walton explained in an early talk.

      October 11th, 2017 at 07:01

  25. Gabriel Konat

    Great article! A very detailed explanation of the problem and solution with great illustrations. I think taking inspiration from game engines to render web pages faster and more consistently is a great idea.

    My thought is that constantly rendering will increase the CPU and GPU utilisation, which increases power consumption. Is this an issue or is rendering relatively cheap?

    I also think it is important not to be limited to 60 FPS. While most people have screens that run at 60 FPS, some people have screens that can render up to 165 FPS. I’m running my main screen at 120 FPS and can definitely see the difference between 60 and 120 FPS, to the point where I don’t feel 60 FPS is smooth any more. Will it be possible to run at higher FPS when screens support it?

    October 11th, 2017 at 05:27

  26. budziq

    Thanks for an excellent article!

    One question though.
    > Single-platform browsers (Edge and Safari)

    Given that Safari has both MacOS and Windows releases, can we really call it Single-platform?

    October 11th, 2017 at 06:36

    1. Tyler Knott

      Apple stopped updating Safari for Windows five years ago.

      October 11th, 2017 at 13:39

      1. tzachs

        Given that Edge has both Windows and Android releases (and iOS coming soon), can we really call it single-platform?

        October 17th, 2017 at 07:52

        1. Kelly

          On iOS at least, Edge, just like FF on iOS and indeed every other browser, will be just a skin on WebKit. Apple still does not allow other rendering engines on iOS… Not sure about Edge on Android.

          October 19th, 2017 at 14:33

  27. Mike Ratcliffe

    I have heard these things explained by lots of people but never as clearly as you have done here… awesome explanation!

    October 11th, 2017 at 07:49

  28. theo

    Very well written and cool graphics. Came by the way of HN.
    Many thanks.

    October 11th, 2017 at 20:17

  29. Lynton

    I rate this as the most clear explanation of a tech subject I have ever read. Congratulations!
    I am wondering if direct rendering of the video frames would be satisfactory? ie ignoring any overlays from the page, paint each pixel as it is?

    October 12th, 2017 at 12:43

  30. syafriardimelayu

    Please improve it further.

    October 12th, 2017 at 20:32

  31. eggze

    Great article! After reading it I wonder how this will work together with Wayland? Wouldn’t there be two (or more) competing processes fighting for GPU resources at the same time? How does this work?

    October 13th, 2017 at 00:49

  32. Ivo

    Hi Lin. Excellent article! I am very pleased with all improvements that the Firefox team have been working on in recent months, making the Firefox experience much smoother and faster than it used to be. Documenting those changes in articles like this is amazing and not the sort of transparency we would see from other companies working on web browsers.

    I have a question: since v57, while its performance gains are undeniable, there have been many reports of much larger battery drains than with previous versions of Firefox, not to mention the competition (Blink-based browsers and Edge). Will the Firefox team tackle those power consumption issues at some point in the near future? Portable devices users would appreciate this immensely.

    October 13th, 2017 at 02:58

  33. ozzi

    Awesome article, very informative. Thanks!

    October 13th, 2017 at 03:50

  34. Xiaobei Meng

    Waiting & Expecting !!!

    October 14th, 2017 at 07:17

  35. judiw,jdicm

    Hello. Does this mean that Mozilla is dropping support for old GPUs without OpenGL 2? Please don’t do that!

    October 16th, 2017 at 11:18

  36. klop*cz

    Thank you for this great article. Illustrations are great and clear to understand. Well done, great job :-)

    October 17th, 2017 at 11:23

  37. Helen McInally Aitken

    Not sure I totally understand all this, but I’m learning, slow but sure… thank you x

    October 19th, 2017 at 03:07

  38. Andrea

    What about HTML canvases?

    Are the 2D rendering context drawing commands generated by the CPU or the GPU? If they are generated by the CPU, how do they get uploaded to the GPU each frame to be composited?

    October 23rd, 2017 at 01:12

    1. Lin Clark

      This is part of what is being tackled in the Pathfinder project.

      October 23rd, 2017 at 08:52

  39. Steve Knoblock

    Thanks for an incredibly clear and plain language explanation. Love the illustrations. What strikes me as ironic is that the display list was used on the Atari computers and game consoles for managing graphics display.

    October 27th, 2017 at 07:46

  40. Luka

    I’ve never encountered such a simple but still detailed explanation of the work and optimization going on in a browser and in the developers’ minds.

    Very well done. It’s a shame that more things related to science in general aren’t explained like this.

    October 31st, 2017 at 05:36

  41. Hillsie

    Thanks for the great informative article. How does this affect development of web apps?

    October 31st, 2017 at 16:12

Comments are closed for this article.