Crossing the Rust FFI frontier with Protocol Buffers

My team, the application services team at Mozilla, works on Firefox Sync, Firefox Accounts and WebPush.

These features are currently shipped on Firefox Desktop, Android and iOS browsers. They will soon be available in our new products such as our upcoming Android browser, our password manager Lockbox, and Firefox for Fire TV.

Solving for one too many targets

Until now, for historical reasons, these features have been implemented in widely different ways. They've been tightly coupled to the products they support, which makes them hard to reuse. For example, there are currently three Firefox Sync clients: one in JavaScript, one in Java and another one in Swift.

Considering the size of our team, we quickly realized that our current approach to shipping products would not scale across more products and would lead to quality issues such as bugs or uneven feature-completeness across platform-specific implementations. About a year ago, we decided to plan for the future. As you may know, Mozilla is making a pretty big bet on the new Rust programming language, so it was natural for us to follow suit.

A cross-platform strategy using Rust

Our new strategy is as follows: We will build cross-platform components, implementing our core business logic using Rust and wrapping it in a thin platform-native layer, such as Kotlin for Android and Swift for iOS.

Rust component bundled in an Android wrapper

The biggest and most obvious advantage is that we get one canonical code base, written in a safe and statically typed language, deployable on every platform. Every upstream business logic change becomes available to all our products with a version bump.

How we solved the FFI challenge safely

However, one of the challenges of this new approach is passing richly-structured data across the API boundary in a way that's memory-safe and works well with Rust's ownership system. We wrote the ffi-support crate to help with this; if you write code that crosses an FFI (foreign function interface) boundary, you should strongly consider using it. Our initial versions were implemented by serializing our data structures as JSON strings, and in a few cases by returning a C-shaped struct by pointer. The procedure for returning, say, a bookmark from the user's synced data looked like this:

Returning Bookmark data from Rust to Kotlin using JSON (simplified)
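To make the shape of that FFI concrete, here is a minimal sketch of the JSON approach. The `Bookmark` fields and function names are illustrative, not the component's actual API, and the JSON is built by hand to keep the sketch dependency-free; the real code used serde_json.

```rust
use std::ffi::CString;
use std::os::raw::c_char;

/// Hypothetical bookmark record; the real component's types differ.
pub struct Bookmark {
    pub guid: String,
    pub url: String,
    pub title: String,
}

/// Serialize a bookmark as JSON and hand ownership of the C string to the
/// caller (the Kotlin/Swift layer copies it, then asks Rust to free it).
#[no_mangle]
pub extern "C" fn bookmark_get_json() -> *mut c_char {
    let b = Bookmark {
        guid: "abc123".into(),
        url: "https://example.com/".into(),
        title: "Example".into(),
    };
    let json = format!(
        r#"{{"guid":"{}","url":"{}","title":"{}"}}"#,
        b.guid, b.url, b.title
    );
    CString::new(json).unwrap().into_raw()
}

/// Matching destructor: the consumer must hand the pointer back so Rust
/// can free it with the same allocator that created it.
#[no_mangle]
pub extern "C" fn bookmark_destroy_string(s: *mut c_char) {
    if !s.is_null() {
        unsafe { drop(CString::from_raw(s)) };
    }
}
```

On the Kotlin side, a JNA binding would read the returned pointer as a UTF-8 string, parse the JSON by hand, and then call the destructor.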


So what’s the problem with this approach?

  • Performance: JSON serializing and de-serializing is notoriously slow, because performance was not a primary design goal for the format. On top of that, an extra string copy happens on the Java layer since Rust strings are UTF-8 and Java strings are UTF-16-ish. At scale, it can introduce significant overhead.
  • Complexity and safety: every data structure is manually parsed and deserialized from JSON strings. A data structure field modification on the Rust side must be reflected on the Kotlin side, or an exception will most likely occur.
  • Even worse, in some cases we were not returning JSON strings but C-shaped Rust structs by pointer: forget to update the Structure Kotlin subclass or the Objective-C struct and you have a serious memory corruption on your hands.

We quickly realized there was probably a faster and safer approach than our current solution.

Data serialization with Protocol Buffers v.2

Thankfully, there are many data serialization formats out there that aim to be fast. The ones with a schema language will even auto-generate data structures for you!

After some exploration, we ended up settling on Protocol Buffers version 2.

The relative safety comes from the automated generation of data structures in the languages we care about. There is only one source of truth, the .proto schema file, from which all our data classes are generated at build time, on both the Rust and consumer sides.
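As a sketch, a minimal proto2 schema for the bookmark example might look like this. The package and field names here are illustrative, not the actual schema our components use.

```protobuf
// bookmark.proto -- hypothetical schema for the examples in this post.
syntax = "proto2";

package msg_types;

message Bookmark {
  required string guid = 1;
  required string url = 2;
  optional string title = 3;
}
```

From this single file, protoc and prost generate the Kotlin, Swift and Rust data classes at build time.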

protoc (the Protocol Buffers code generator) can emit code in more than 20 languages! On the Rust side, we use the prost crate which outputs very clean-looking structs by leveraging Rust derive macros.

Returning Bookmark data from Rust to Kotlin using Protocol Buffers 2 (simplified)
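With Protocol Buffers, the FFI no longer passes a C string but a buffer of encoded bytes. Below is a dependency-free sketch of that mechanism: the `ByteBuffer` layout and function names are illustrative (ffi-support provides the real `ByteBuffer` type), and a fixed payload stands in for the bytes that `prost`'s `Message::encode` would produce.

```rust
use std::os::raw::c_uchar;

/// A length-prefixed buffer handed across the FFI, similar in spirit to
/// the ByteBuffer in ffi-support (the exact layout here is illustrative).
#[repr(C)]
pub struct ByteBuffer {
    pub len: i64,
    pub data: *mut c_uchar,
}

/// Return protobuf-encoded bytes to the consumer. In the real components
/// the bytes come from prost's Message::encode.
#[no_mangle]
pub extern "C" fn bookmark_get_buffer() -> ByteBuffer {
    let mut bytes = b"encoded-bookmark-bytes".to_vec().into_boxed_slice();
    let buf = ByteBuffer {
        len: bytes.len() as i64,
        data: bytes.as_mut_ptr(),
    };
    // Leak the allocation on purpose; bookmark_destroy_buffer reclaims it.
    std::mem::forget(bytes);
    buf
}

/// Reclaim the buffer once the consumer has copied the bytes out.
#[no_mangle]
pub extern "C" fn bookmark_destroy_buffer(buf: ByteBuffer) {
    if !buf.data.is_null() {
        unsafe {
            drop(Vec::from_raw_parts(buf.data, buf.len as usize, buf.len as usize));
        }
    }
}
```

The Kotlin layer copies the bytes out, feeds them to the protobuf-generated `Bookmark.parseFrom(...)`, and calls the destructor, so no hand-written parsing is involved on either side.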


And of course, on top of that, Protocol Buffers are faster than JSON.

There are a few downsides to this approach: it is more work to convert our internal types to the generated protobuf structs (e.g., a url::Url has to be converted to a String first), whereas back when we were using serde_json serialization, any struct implementing serde::Serialize was one line away from being sent over the FFI boundary. It also adds one more step to our build process, although it was pretty easy to integrate.
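That conversion step can be sketched as a `From` impl between the internal type and the generated message struct. The types below are stand-ins: `Url` substitutes for the url crate's type, and `MsgBookmark` for a prost-generated struct, so the sketch stays dependency-free.

```rust
/// Stand-in for url::Url; the real type lives in the url crate.
pub struct Url(pub String);

/// Internal, richly-typed bookmark used by the business logic.
pub struct Bookmark {
    pub guid: String,
    pub url: Url,
}

/// Stand-in for the prost-generated message struct, which only
/// holds protobuf scalar types.
pub struct MsgBookmark {
    pub guid: String,
    pub url: String,
}

impl From<Bookmark> for MsgBookmark {
    fn from(b: Bookmark) -> Self {
        // The extra work: richer internal types must be flattened into
        // protobuf scalars before crossing the FFI.
        MsgBookmark {
            guid: b.guid,
            url: b.url.0,
        }
    }
}
```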

One thing I ought to mention: Since we ship both the producer and consumer of these binary streams as a unit, we have the freedom to change our data exchange format transparently without affecting our Android and iOS consumers at all.

A look ahead

Looking forward, there’s probably a high-level system that could be used to exchange data over the FFI, maybe based on Rust macros. There’s also been talk about using FlatBuffers to squeeze out even more performance. In our case Protobufs provided the right trade-off between ease-of-use, performance, and relative safety.

So far, our components ship on iOS in both Firefox and Lockbox, and on Android in Lockbox, and will soon be in our upcoming Android browser.

Firefox iOS has started to oxidize by replacing its password syncing engine with the one we built in Rust. The plan is to eventually do the same on Firefox Desktop as well.

If you are interested in helping us build the future of Firefox Sync and more, or simply in following our progress, head to the application services GitHub repository.

About Edouard Oger



  1. Ari Ugwu

    Clean and concise explanation. Thanks for posting this and I will be adding it to our team discussions.

    It’s nice to see the thought space around Rust developing so quickly.

    April 2nd, 2019 at 23:49

  2. Dwayne Bradley

What was your team's reasoning for choosing v2 instead of the newer v3 for protocol buffers?

    April 3rd, 2019 at 06:25

  3. Dwayne

    One more question…why use the “prost” crate vs the “protobuf” crate?

    April 3rd, 2019 at 06:32

    1. Edouard Oger

      Hi Dwayne, thanks for your questions.

We use prost instead of protobuf because the API is cleaner and easier to use: “prost” auto-generated structs look like regular Rust structs [0], whereas the “protobuf” ones are closer to what you'd expect in an OOP language [1].

We chose v2 of Protocol Buffers because this version has better support for optionals (it's possible to emulate them in v3, but not convenient) and should be supported for a while.



      April 3rd, 2019 at 08:20

  4. Neville

    I did this 2 weeks ago, rewriting a CPU intensive function in Rust. The Rust function was much faster, but FFI overhead reduced the benefit.

    What FFI overhead have you noticed?
    I’ll try out the ffi-support crate, as I didn’t know about it until now.

    April 3rd, 2019 at 11:59

  5. dave

Could this be a use case for Apache Arrow?

    April 7th, 2019 at 02:23

    1. Dirk

      Apache Arrow is quite different, in that it is designed for sharing large amounts of columnar in-memory data between data-intensive applications. It actually tries to eliminate serialization/deserialization as much as possible by having each application use the same memory format.

      April 10th, 2019 at 05:38

Comments are closed for this article.