Quantum Up Close: What is a browser engine?

In October of last year Mozilla announced Project Quantum – our initiative to create a next-generation browser engine. We’re well underway on the project now. We actually shipped our first significant piece of Quantum just last month with Firefox 53.

But we realize that for people who don’t build web browsers (and that’s most people!), it can be hard to see just why some of the changes we’re making to Firefox are so significant. After all, many of the changes that we’re making will be invisible to users.

With this in mind, we’re kicking off a series of blog posts to provide a deeper look at just what it is we’re doing with Project Quantum. We hope that this series of posts will give you a better understanding of how Firefox works, and the ways in which Firefox is building a next-generation browser engine made to take better advantage of modern computer hardware.

To begin this series of posts, we think it’s best to start by explaining the fundamental thing Quantum is changing.

What is a browser engine, and how does one work?

If we’re going to start from somewhere, we should start from the beginning.

A web browser is a piece of software that loads files (usually from a remote server) and displays them locally, allowing for user interaction.

Quantum is the code name for a project we’ve undertaken at Mozilla to massively upgrade the part of Firefox that figures out what to display to users based on those remote files. The industry term for that part is “browser engine”, and without one, you would just be reading code instead of actually seeing a website. Firefox’s browser engine is called Gecko.

It’s pretty easy to see the browser engine as a single black box, sort of like a TV: data goes in, and the black box figures out what to display on the screen to represent that data. The question today is: How? What are the steps that turn data into the web pages we see?

The data that makes up a web page is lots of things, but it’s mostly broken down into 3 parts:

code that represents the structure of a web page
code that provides style: the visual appearance of the structure
code that acts as a script of actions for the browser to take: computing, reacting to user actions, and modifying the structure and style beyond what was loaded initially

The browser engine combines structure and style together to draw the web page on your screen, and figures out which bits of it are interactive.

It all starts with structure. When a browser is asked to load a website, it’s given an address. At this address is another computer which, when contacted, will send data back to the browser. The particulars of how that happens are a whole separate article in themselves, but at the end the browser has the data. This data is sent back in a format called HTML, and it describes the structure of the web page. How does a browser understand HTML?

Browser engines contain special pieces of code called parsers that convert data from one format into another that the browser holds in its memory ¹. The HTML parser takes the HTML, something like:

<section>
 <h1 class="main-title">Hello!</h1>
 <img src="http://example.com/image.png">
</section>

And parses it, understanding:

Okay, there’s a section. Inside the section is a heading of level 1, which itself contains the text: “Hello!” Also inside the section is an image. I can find the image data at the location: http://example.com/image.png

The in-memory structure of the web page is called the Document Object Model, or DOM. As opposed to a long piece of text, the DOM represents a tree of elements of the final web page: the properties of the individual elements, and which elements are inside other elements.
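To make the idea of "text in, tree out" concrete, here is a minimal sketch in Python. It is not how Gecko's HTML parser works; it leans on Python's standard `html.parser` module, and the `Node`, `TreeBuilder`, and `parse_html` names are hypothetical, made up for this illustration.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag (a small subset, enough for this sketch).
VOID_ELEMENTS = {"img", "br", "hr", "input", "meta", "link"}


class Node:
    """A hypothetical, simplified DOM node: a tag, its attributes, and its children."""

    def __init__(self, tag, attrs=None, text=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.text = text
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a tree of Node objects as the parser reports tags and text."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.stack = [self.root]  # the chain of elements that are currently open

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.stack[-1].children.append(node)
        if tag not in VOID_ELEMENTS:
            self.stack.append(node)  # later content goes inside this element

    def handle_endtag(self, tag):
        if len(self.stack) > 1 and self.stack[-1].tag == tag:
            self.stack.pop()

    def handle_data(self, data):
        if data.strip():  # ignore whitespace between tags
            self.stack[-1].children.append(Node("#text", text=data.strip()))


def parse_html(source):
    """Turn a string of HTML into a tree rooted at a '#document' node."""
    builder = TreeBuilder()
    builder.feed(source)
    builder.close()
    return builder.root


doc = parse_html(
    '<section>\n'
    ' <h1 class="main-title">Hello!</h1>\n'
    ' <img src="http://example.com/image.png">\n'
    '</section>'
)
section = doc.children[0]
```

After this runs, `section` has two children, the heading and the image, and the heading holds a text node containing “Hello!”. A real parser also handles malformed markup, scripts, and much more.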

In addition to describing the structure of the page, the HTML also includes addresses where styles and scripts can be found. When the browser finds these, it contacts those addresses and loads their data. That data is then fed to other parsers that specialize in those data formats. If scripts are found, they can modify the page structure and style before the file is finished being parsed. The style format, CSS, plays the next role in our browser engine.

With Style

CSS is a programming language that lets developers describe the appearance of particular elements on a page. CSS stands for “Cascading Style Sheets”, so named because it allows for multiple sets of style instructions, where instructions can override earlier or more general instructions (called the cascade). A bit of CSS could look like the following:

section {
  font-size: 15px;
  color: #333;
  border: 1px solid blue;
}
h1 {
  font-size: 2em;
}
.main-title {
  font-size: 3em; 
}
img {
  width: 100%;
}

CSS is largely broken up into groupings called rules, which themselves consist of two parts. The first part is a selector, which describes the elements of the DOM (remember those from above?) being styled. The second part is a list of declarations that specify the styles to be applied to elements that match the selector. The browser engine contains a subsystem called a style engine whose job it is to take the CSS code and apply it to the DOM that was created by the HTML parser.

For example, in the above CSS, we have a rule that targets the selector “section”, which will match any element in the DOM with that name. Style annotations are then made for each element in the DOM. Eventually each element in the DOM is finished being styled, and we call this state the computed style for that element. When multiple competing styles are applied to the same element, those that come later or are more specific win. Think of stylesheets as layers of thin tracing paper: each layer can cover the previous layers, but also let them show through.
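Here is a rough sketch of how "later or more specific wins" could be worked out for the heading in our example. It is a toy, not a real style engine: it only understands element-name and class selectors, it skips inheritance and unit resolution, and the function names are invented for illustration.

```python
def specificity(selector):
    """Rank a simple selector as (ids, classes, element names).

    Only class selectors (".name") and element-name selectors ("name")
    are supported in this sketch.
    """
    return (0, 1, 0) if selector.startswith(".") else (0, 0, 1)


def matches(selector, element):
    """Does this simple selector apply to the element?"""
    if selector.startswith("."):
        return selector[1:] in element["classes"]
    return selector == element["tag"]


def cascade(rules, element):
    """Return the winning declarations for one element.

    rules is a list of (selector, declarations) pairs in source order.
    """
    matching = [
        (specificity(selector), order, declarations)
        for order, (selector, declarations) in enumerate(rules)
        if matches(selector, element)
    ]
    result = {}
    # Apply the weakest rules first so that more specific (and, on a tie,
    # later) rules overwrite them.
    for _, _, declarations in sorted(matching, key=lambda m: (m[0], m[1])):
        result.update(declarations)
    return result


# The rules from the CSS example above, in source order.
RULES = [
    ("section", {"font-size": "15px", "color": "#333", "border": "1px solid blue"}),
    ("h1", {"font-size": "2em"}),
    (".main-title", {"font-size": "3em"}),
    ("img", {"width": "100%"}),
]

heading = {"tag": "h1", "classes": ["main-title"]}
winner = cascade(RULES, heading)
```

Both the `h1` rule and the `.main-title` rule match the heading, and the class selector is more specific, so `winner` ends up with a `font-size` of `3em`.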

Once the browser engine has computed styles, it’s time to put them to use! The DOM and the computed styles are fed into a layout engine that takes into account the size of the window being drawn into. The layout engine uses various algorithms to take each element and draw a box that will hold its content and take into account all the styles applied to it.
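As a very simplified picture of what a layout engine produces, the sketch below stacks boxes vertically and stretches each one to the width of the window. Real layout has to deal with text wrapping, margins, floats, and a great deal more; the heights used here are made-up numbers and `layout_blocks` is a hypothetical name.

```python
def layout_blocks(blocks, viewport_width):
    """Stack block boxes top to bottom; each box fills the viewport width.

    blocks is a list of (name, height) pairs. Heights are assumed to be
    known already, which a real layout engine would have to compute.
    """
    boxes = []
    y = 0
    for name, height in blocks:
        boxes.append(
            {"name": name, "x": 0, "y": y, "width": viewport_width, "height": height}
        )
        y += height  # the next box starts where this one ends
    return boxes


# Hypothetical heights for the heading and image from the earlier example.
boxes = layout_blocks([("h1", 45), ("img", 200)], viewport_width=800)
```

The result is a list of rectangles with positions and sizes: the blueprint that the next stage draws from.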

When layout is complete, it’s time to turn the blueprint of the page into the part you see. This process is known as painting, and it is the final combination of all the previous steps. Every box that was defined by layout gets drawn, full of the content from the DOM and with styles from the CSS. The user now sees the page, reconstituted from the code that defines it.

That used to be all that happened!

When the user scrolled the page, we would re-paint, to show the new parts of the page that were previously outside the window. It turns out, however, that users love to scroll! The browser engine can be fairly certain it will be asked to show content outside of the initial window it draws (called the viewport). More modern browsers take advantage of this fact and paint more of the web page than is visible initially. When the user scrolls, the parts of the page they want to see are already drawn and ready. As a result, scrolling can be faster and smoother. This technique is the basis of compositing, which is a term for techniques to reduce the amount of painting required.

Additionally, sometimes we need to redraw parts of the screen. Maybe the user is watching a video that plays at 60 frames per second. Or maybe there’s a slideshow or animated list on the page. Browsers can detect that parts of the page will move or update, and instead of re-painting the whole page, they create a layer to hold that content. A page can be made of many layers that overlap one another. A layer can change position, scroll, transparency, or move behind or in front of other layers without having to re-paint anything! Pretty convenient.
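The saving that layers provide can be sketched in a few lines. In this toy model a "bitmap" is just a list of rows, painting is the expensive step we want to avoid repeating, and scrolling only changes which slice of the already-painted layer is shown. The `Layer` class and `composite` function are hypothetical names for illustration.

```python
class Layer:
    """A painted surface that can be repositioned without being repainted."""

    def __init__(self, name, content_rows):
        self.name = name
        self.content_rows = content_rows
        self.bitmap = None    # filled in by paint()
        self.paint_count = 0  # how many times the expensive step has run
        self.offset_y = 0     # current scroll position, in rows

    def paint(self):
        # Stand-in for the expensive work of drawing every box in the layer.
        self.bitmap = list(self.content_rows)
        self.paint_count += 1


def composite(layer, viewport_height):
    """Show the visible slice of a layer, painting only if it has never been painted."""
    if layer.bitmap is None:
        layer.paint()
    start = layer.offset_y
    return layer.bitmap[start:start + viewport_height]


page = Layer("page", [f"row {i}" for i in range(10)])
first = composite(page, viewport_height=3)   # paints once, shows rows 0 to 2
page.offset_y = 2                            # the user scrolls down two rows
second = composite(page, viewport_height=3)  # shows rows 2 to 4, no repaint
```

After both frames, `page.paint_count` is still 1: the scroll was handled entirely by compositing.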

Sometimes a script or an animation changes an element’s style. When this occurs, the style engine needs to re-compute the element’s style (and potentially the style of many more elements on the page), recalculate the layout (do a reflow), and re-paint the page. This takes a lot of time as computer-speed things go, but so long as it only happens occasionally, the process won’t negatively affect a user’s experience.

In modern web applications, the structure of the document itself is frequently changed by scripts. This can require the entire rendering process to start more-or-less from scratch, with HTML being parsed into DOM, style calculation, reflow, and paint.
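One way to summarize the last few paragraphs is as a pipeline where each kind of change re-enters at a different stage and everything after that stage has to run again. The stage names and change names below are labels chosen for this sketch, not terms from any browser's source code.

```python
# The rendering steps described above, in order.
PIPELINE = ["parse", "style", "layout", "paint", "composite"]

# The earliest stage that must re-run for each kind of change (simplified).
FIRST_STAGE = {
    "structure": "parse",       # a script rewrote part of the document
    "style": "style",           # a script or animation changed an element's style
    "layer_move": "composite",  # a layer scrolled, moved, or changed transparency
}


def stages_to_rerun(change):
    """List the stages that have to run again after a given kind of change."""
    start = PIPELINE.index(FIRST_STAGE[change])
    return PIPELINE[start:]


style_work = stages_to_rerun("style")
layer_work = stages_to_rerun("layer_move")
```

A style change repeats everything from style calculation onward, while moving a layer only needs the final compositing step. That gap is why layers are such a useful trick.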

Standards

Not every browser interprets HTML, CSS, and JavaScript the same way. The effect can vary: from small visual differences all the way to the occasional website that works in one browser and not at all in another. These days, on the modern Web, most websites seem to work regardless of which browser you choose. How do browsers achieve this level of consistency?

The formats of website code, as well as the rules that govern how the code is interpreted and turned into an interactive visual page, are defined by mutually-agreed-upon documents called standards. These documents are developed by committees consisting of representatives from browser makers, web developers, designers, and other members of industry. Together they determine the precise behavior a browser engine should exhibit given a specific piece of code. There are standards for HTML, CSS, and JavaScript as well as the data formats of images, video, audio, and more.

Why is this important? It’s possible to make a whole new browser engine and, so long as you make sure that your engine follows the standards, the engine will draw web pages in a way that matches all the other browsers, for all the billions of web pages on the Web. This means that the “secret sauce” of making websites work isn’t a secret that belongs to any one browser. Standards allow users to choose the browser that meets their needs.

Moore’s No More

When dinosaurs roamed the earth and people only had desktop computers, it was a relatively safe assumption that computers would only get faster and more powerful. This idea was based on Moore’s Law, an observation that the density of components (and thus miniaturization/efficiency of silicon chips) would double roughly every two years. Incredibly, this observation held true well into the 21st century and, some would argue, still holds true at the cutting edge of research today. So why is it that the speed of the average computer seems to have leveled off in the last 10 years?

Speed is not the only feature customers look for when shopping for a computer. Fast computers can be very power-hungry, very hot, and very expensive. Sometimes, people want a portable computer that has good battery life. Sometimes, they want a tiny touch-screen computer with a camera that fits in their pocket and lasts all day without a charge! Advances in computing have made that possible (which is amazing!), but at the cost of raw speed. Just as it’s not efficient (or safe) to drive your car as fast as possible, it’s not efficient to drive your computer as fast as possible. The solution has been to have multiple “computers” (cores) in one CPU chip. It’s not uncommon to see smartphones with 4 smaller, less powerful cores.

Unfortunately, the historical design of the web browser kind of assumed this upward trajectory in speed. Also, writing code that’s good at using multiple CPU cores at the same time can be extremely complicated. So, how do we make a fast, efficient browser in the era of lots of small computers?

We have some ideas!

In the upcoming months, we’ll take a closer look at some of the changes coming to Firefox and how they will take better advantage of modern hardware to deliver a faster and more stable browser that makes websites shine.

Onward!

[1]: Your brain can do things that are like parsing: the word “eight” is a bunch of letters that spell a word, but you convert them to the number 8 in your head, not the letters e-i-g-h-t.

Potch is a Web Platform Advocate at Mozilla.

potch.me

19 comments

IdiotTake2

There is one thing I don’t like about Nightly performance: scrolling fps in the active tab can be affected by the tabs in the background. For example, when I open a bunch of bookmarks and they all start loading at once. Or when I restore previously saved session with browser.sessionstore.restore_on_demand set to false and Nightly starts to load 3 restored tabs at once until all restored tabs are loaded. That’s very annoying.

Will Project Quantum help to get rid of this problem?

May 9th, 2017 at 11:05
1. Potch
  
  The segment of Quantum we call Quantum DOM is all about this inter-tab interference! We’re using a mix of approaches to prioritize user input and foreground tab activity while throttling or even suspending background tabs. There will be a post that goes into much greater detail on this.
  
  May 10th, 2017 at 10:05
  1. IdiotTake2
    
    Thank you very much! When will Quantum DOM be enabled in Nightly? 2017? 2018? 2020?
    
    May 10th, 2017 at 23:47
    1. Potch
      
      Quantum DOM is an ongoing program, but the goal is to have the first pieces in place making a positive impact this year.
      
      May 11th, 2017 at 10:52
Alex Vincent

This is a very good explanation of the basics… nicely done. I’m going to keep this as a bookmark for people who don’t understand the many parts.

May 9th, 2017 at 17:18
Joshua

That’s a great write-up, man. Could you talk about what you guys are doing to improve your C++ codebase, in addition to the Rust parts that you are importing? That would be cool.

May 10th, 2017 at 05:36
1. Potch
  
  I’ll be talking about the C++ side of things in upcoming posts. There’s lots going on there! We have a general performance program we call Quantum Flow that is all about generalized improvements to our existing code base. There’s a great post series out there that goes way deeper than I ever can on it: https://ehsanakhgari.org/tag/quantumflow
  
  May 10th, 2017 at 10:04
Alex

Are Quantum and Servo the same thing?

May 10th, 2017 at 08:21
1. Potch
  
  Quantum and Servo are related, but not the same thing. Servo is a ground-up re-think (and re-write) of the browser in Rust. Servo is very much on a research track where we hope to use pure Servo code one day, but it’s multiple years out. Quantum is an effort to streamline existing components of Gecko and incorporate pieces of Servo that are ready and get the speed boost. This division lets Servo stay true to what it is while letting us improve Gecko in the meantime. Servo is really good at doing work quickly, and Gecko is really good at not doing work when it doesn’t have to, so the hybrid can be more than the sum of its parts!
  
  May 10th, 2017 at 10:00
Michael M.

The footnote is broken (missing quote after the ID). Yes, of course you just wanted to demonstrate how parsers work and how they can break ;-)

May 11th, 2017 at 00:52
1. Potch
  
  And the value of unobfuscated source for debugging! :)
  
  May 11th, 2017 at 10:51
Joshua Jacobson

Nice write up and nice diagrams. What tool did you use to make them?

May 11th, 2017 at 07:09
1. Potch
  
  I drew these in Sketch (after mocking them up in a Google apps drawing)
  
  May 11th, 2017 at 10:49
The Bionic Cyclist

AAC and DTS 5.1 playback please :)

I use plex an awful lot, but having no 5.1 audio is a must.
Do this and Mozilla is king, Mozilla rocks anyway, but 5.1 is just better

May 11th, 2017 at 08:14
Dago

It’s a great post.
PS. Actually I’ve read it so cautiously that I’ve caught ‘CSS is a programming language’ — is it?

May 11th, 2017 at 14:10
1. bjorn3
  
  Yes it is, it is a [declarative programming language](https://en.wikipedia.org/wiki/Declarative_programming).
  
  May 18th, 2017 at 06:55
Doug S.

As others have said this is a great high-level overview of the fundamentals, particularly for people new to the web. Broad in what it covers, but succinct and easy to understand. Look forward to the future posts in this series!

May 12th, 2017 at 18:20
Robert L

Articulated and displayed very well, even for a geekless geek like myself to appreciate.

May 13th, 2017 at 06:52
Brian

Project Quantum was a good name for a new browser engine.

We won’t know if the pages are displayed the same in Firefox and Chrome until we view them. Before we view them, we should expect that they are both working and broken at the same time:-)

May 15th, 2017 at 03:33

