A faster Solidity compiler CLI in Rust

To make the most of my quarantine time, I decided to learn Rust, encouraged by the experiences of @frangio, @nventuro, and @alcuadrado (who has a great thread on his experience learning the language).

As a sample project to try it out, I wanted to pick something I was familiar with, but where Rust could make a difference. Eventually, I settled on building a small CLI wrapper for solc, the Solidity compiler. I’m happy to say that early tests show up to a ~10x speedup over other CLIs out there, but before I go into the details of this experiment, allow me to cover how solc works and how it’s used in most CLIs.

On solc, json interfaces, and emscripten

solc itself is quite impressive. It’s built entirely in C++, and it’s blazing fast. Try using it directly from the command line, and you’ll see compilation times an order of magnitude faster than what you’re used to.

$ time solc --bin SafeMath.sol
real	0m0.007s

But solc has a few shortcomings. Its command-line usage is intended only for simple projects with a few contracts. For more complex setups, the standard JSON interface is recommended. In this mode, the compiler accepts an input JSON via stdin, and outputs all compilation artifacts in another large JSON via stdout. Handling this by hand is quite impractical, so it’s common to use a dedicated CLI that takes care of preparing the input and processing the output of solc, such as truffle, buidler, or oz.
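To make the standard JSON mode more concrete, here is a minimal sketch of what a wrapper has to do: build the input JSON, pipe it to `solc --standard-json`, and read the output JSON back. The function names (`json_escape`, `standard_json_input`, `run_solc`) and the chosen output selection are my own for illustration, not taken from any of the CLIs mentioned, and the sketch assumes a `solc` binary is available in your PATH.

```rust
use std::io::Write;
use std::process::{Command, Stdio};

/// Escape a string so it can be embedded in a JSON string literal.
fn json_escape(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    for c in raw.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build a standard JSON input from (file name, source) pairs, requesting
/// the ABI and bytecode of every contract in every file.
fn standard_json_input(sources: &[(&str, &str)]) -> String {
    let entries: Vec<String> = sources
        .iter()
        .map(|(name, content)| {
            format!(
                "\"{}\":{{\"content\":\"{}\"}}",
                json_escape(name),
                json_escape(content)
            )
        })
        .collect();
    format!(
        "{{\"language\":\"Solidity\",\"sources\":{{{}}},\"settings\":{{\"outputSelection\":{{\"*\":{{\"*\":[\"abi\",\"evm.bytecode.object\"]}}}}}}}}",
        entries.join(",")
    )
}

/// Pipe the input JSON to a native solc process and return its raw JSON output.
/// Assumes `solc` is installed and available in PATH.
fn run_solc(input_json: &str) -> std::io::Result<String> {
    let mut child = Command::new("solc")
        .arg("--standard-json")
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()?;
    // The stdin handle is dropped at the end of this statement, which closes
    // the pipe and lets solc see end-of-file.
    child
        .stdin
        .take()
        .expect("stdin was piped")
        .write_all(input_json.as_bytes())?;
    let output = child.wait_with_output()?;
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}
```

Calling `run_solc(&standard_json_input(&[("SafeMath.sol", source)]))` would return the output JSON as a string, which a real tool would then parse with a proper JSON library rather than handle as text.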

However, as soon as you start compiling using any of these CLIs, you’ll notice much longer build times. In many scenarios, this is largely caused by an emscripten build of the compiler being used under the hood.

What is emscripten? Emscripten is a toolchain that compiles C++ code into asm.js or WASM, so that it can run in a browser or in node.js. For every new compiler version, the Solidity team releases an emscripten build to be used via solc-js, the Solidity compiler JavaScript bindings. This allows you to build your Solidity files in a browser (such as Remix) or directly in a node.js environment, like your favourite CLI.

Unfortunately, emscripten builds are slower than their native counterparts. Some CLIs, like truffle or oz, can call directly into the native compiler if it’s installed and available in your PATH, cutting compile times by more than half - but they are still on the order of seconds. We should be able to do much better.

~/openzeppelin-contracts$ time oz compile   # with solc-js
real	0m5.746s
~/openzeppelin-contracts$ time oz compile   # with solc native build
real	0m2.130s

While the base startup time of a node.js program is definitely a factor that contributes to these slow times, there is another cause to look into: parsing import statements.

On import statements and read callbacks

solc is smart enough to resolve most import statements. However, in some cases, it needs a helping hand. Have you ever wondered how solc, being platform-agnostic, figures out that @openzeppelin/contracts/math/SafeMath.sol is to be found somewhere in node_modules? The answer is that it doesn’t: it’s the responsibility of the caller to resolve it.

When used as a library, both solc and solc-js allow the caller to specify a read callback. The compiler will use it to ask for the sources of any import file it cannot resolve. This is the most efficient method to resolve contract dependencies.
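As a rough sketch of what such a read callback might do on the caller’s side, the function below looks for an unresolved import path under a list of search roots (for example, the project directory and its `node_modules`) and returns the file contents or an error. The name `read_import` and the `Result<String, String>` shape are assumptions of mine for illustration, not the actual signature exposed by solc or solc-js.

```rust
use std::fs;
use std::path::PathBuf;

/// Hypothetical read callback: given an import path the compiler could not
/// resolve on its own, try each search root in order and return the contents
/// of the first matching file, or an error message if none is found.
fn read_import(import_path: &str, roots: &[PathBuf]) -> Result<String, String> {
    for root in roots {
        let candidate = root.join(import_path);
        if candidate.is_file() {
            return fs::read_to_string(&candidate)
                .map_err(|e| format!("{}: {}", candidate.display(), e));
        }
    }
    Err(format!("File not found: {}", import_path))
}
```

With `roots` set to something like `[".", "node_modules"]`, a request for `@openzeppelin/contracts/math/SafeMath.sol` would be served from `node_modules` only when the compiler actually asks for it, so no upfront parsing is needed.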

However, most CLIs do not rely on the callback. One of the main reasons is that it only works when using the compiler as a library - whereas some CLIs prefer to retain the ability to spawn a new process running the solc native binary, which is not compatible with this usage. Another issue is that recursive dependencies are particularly hard to resolve with the callback (but we’ll leave this out of the picture for the time being).

This means that what most CLIs do is manually parse all import statements in all Solidity files before calling into solc, in order to resolve all imports themselves and provide solc with all the sources in advance. This parsing operation is slow, especially since it happens in JavaScript. And as a side note, different CLIs handle it differently: older versions of truffle used to fire a first fake compilation and grab the error messages on import statements to parse them, buidler uses an ANTLR-based parser, and for the OpenZeppelin CLI we built a manual parser that only retrieves pragmas and imports to speed up the process.
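To give an idea of what a manual import scan involves, here is a deliberately naive sketch. It is not the parser used by any of the tools above: `find_imports` is a made-up name, and it assumes one import per line, with no handling of comments or multi-line statements.

```rust
/// Naively collect the paths named in the `import` statements of a Solidity
/// source. Assumes each import sits on its own line, and takes the first
/// quoted string on that line as the imported path.
fn find_imports(source: &str) -> Vec<String> {
    let mut imports = Vec::new();
    for line in source.lines() {
        let line = line.trim_start();
        if !line.starts_with("import") {
            continue;
        }
        // Find the opening quote, which may be single or double.
        if let Some(start) = line.find(|c: char| c == '"' || c == '\'') {
            let quote = line[start..].chars().next().unwrap();
            let rest = &line[start + 1..];
            // The path ends at the matching closing quote.
            if let Some(end) = rest.find(quote) {
                imports.push(rest[..end].to_string());
            }
        }
    }
    imports
}
```

Even a scan this simple has to read every source file once before the compiler does, and the files it discovers need to be scanned in turn, which is where the extra time goes.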

The question now is whether it’s possible to have the best of both worlds: avoiding the initial parsing by relying on the callback, and using a native build of the compiler for increased speed.

Enter Rust

After this very long introduction, I hope it’s clear why Rust is a good choice for this problem. Rust has good support for interacting with native C libraries, and Solidity lead developer @axic had already built most of the bindings in the solc-rust project. Furthermore, a Rust binary has a much faster startup time than node.js, which has to load and interpret a zillion javascript text files.
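To illustrate the kind of C interop involved, here is a small sketch of a Rust function exposed with the C ABI and invoked through a function pointer, which is how a native library would call back into Rust. The callback type `CanResolve` and both functions are hypothetical and much simpler than the real read callback in the compiler’s C interface; the "native side" here is simulated in Rust so the example stays self-contained.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Hypothetical C-style callback type: receives a NUL-terminated import path
/// and returns 1 if the caller can supply that file, 0 otherwise.
type CanResolve = extern "C" fn(path: *const c_char) -> i32;

/// A Rust function with the C calling convention, usable as a callback.
extern "C" fn can_resolve(path: *const c_char) -> i32 {
    // Safety: the caller must pass a valid, NUL-terminated string.
    let path = unsafe { CStr::from_ptr(path) }.to_string_lossy();
    path.starts_with("@openzeppelin/") as i32
}

/// Stand-in for the native library: it only knows the function pointer and
/// calls back into Rust through it with a C string.
fn call_from_native_side(callback: CanResolve, path: &str) -> i32 {
    let c_path = CString::new(path).expect("path has no interior NUL bytes");
    callback(c_path.as_ptr())
}
```

In a real binding, the function pointer would be handed to the C library through an `extern "C"` declaration instead of being called from Rust, but the conversion between Rust strings and C strings looks much the same.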

I ended up building a very small tool to run the compiler through a fork of solc-rust with read-callback support. Needless to say, this is just an experiment, and is definitely not ready for production usage - but if you’re interested, you can check it out at spalladino/solc-rs.

To test it, I picked the latest version of OpenZeppelin Contracts (since the tool has solc 0.6.2 statically bundled), and added a few fake dependencies to exercise the read callback. I had expected to see some gains in build times, but I certainly did not expect this.

The Rust CLI wrapper around solc gets 10x faster compile times when compared with other tools using the solc-js build, and is 3x faster than the OpenZeppelin CLI using the native solc build. You can download a precompiled Linux binary here if you want to take it out for a spin.

Now, while this was a personal experiment and not related to the suite of OpenZeppelin tools, using native CLIs certainly shows promise, thanks to faster response times that could yield a better development experience.

With that in mind, I’m curious if anyone is familiar with Ethereum toolchains built in compiled languages - please share in this thread if you know any! Also, if you want to contribute to solc-rust, there is an incipient Telegram channel around it. Feel free to send me a DM with your Telegram user and I’ll add you to the group! Happy hacking!