How to speed up the Rust compiler some more in 2020

rkangel · on Aug 5, 2020

I've taken a non-technical lesson from these blog posts. It's a simple one - their very existence has made the rust compilation time less of an issue to me.

For most people, knowing that the relevant people (in this case the Rust team) agree with your problem and are doing something about it significantly reduces the pain of the problem, because you know it's getting better. You need both parts though - the Rust team could be quietly working on performance for years and double compile speed and I probably wouldn't notice. This blog series is excellent marketing for some forward progress, and means that I do notice, and feel better about it.

jfkebwjsbx · on Aug 5, 2020

The downside is that it is a very common marketing trick. Promise something so that customers feel good about it.

I know the Rust contributors are doing their best, and that this blog post is not writing a marketing ploy, but in the end, the result is what matters.

My take on this is that I am fine with compile times as long as some effort is being done to keep it near optimal.

But I really do not want the Rust team to feel pressure to improve the compile times so much that they will end up killing other features or designing the language differently just for that (like Go, for instance). Or even making the compiler so complex that adding new features is harder for them.

I really like Rust as a language that is more complex to write and to compile but in return gives you robustness and performance. Compilation times be damned. I am not writing Rust to compile fast!

bluGill · on Aug 5, 2020

I want two different languages. I want one that compiles fast so I can run my unit tests fast. I want one that takes however long needed to get the fastest binary. These need not share the same compiler so long as both compile the same source to the same result.

noch · on Aug 6, 2020

> I want one that takes however long needed to get the fastest binary.

Year after year, people keep saying this or things like it. But it's worth remembering that it is not the compiler that makes your code fast, but how you structure your data transformations that makes code fast. In 2014, Mike Acton taught us that the best compiler optimisations can only account for 1-10% of the performance optimisation problem space. "The vast majority of [performance] problems, are problems the compiler cannot reason about." https://youtu.be/rX0ItVEVjHc?t=2097

mlindner · on Aug 6, 2020

Is that really true in all cases though? That's certainly true for languages like c++ and C and other compiles languages with manual memory management because the compiler can't re-order things past a memory load. That's less true I think for languages like where Rust can get to (even if they're not there yet). In Rust you can reason about memory allocations because there's guaranteed invariants in safe rust that let the compiler reason more about what the code is doing and rearrange instructions more than what would be possible in C or C++.

noch · on Aug 9, 2020

> [...] because the compiler can't re-order things past a memory load.

Re-ordering of reads and writes is not really the problem either, even if we assume that one could automatically safely reorder reads and writes arbitrarily to yield significant optimisations. Fundamentally, the compiler does not know the semantics of your data and the relationships between pieces of data and the logic of their transformations. Knowledge of your specific problem and the specific data you are transforming yields the majority of the powerful optimisation opportunities. Similarly, a garbage collector doesn't know the meaning of your data or your algorithms.

> Is that really true in all cases though?

Consider: Given a set of data, if a compiler or garbage collector understood the data sufficiently, then there would be no need for you to exist to write the program, because the compiler could simply write the data transformations itself and generate the program.

However, one is encouraged to do experiments for oneself. Concrete profiling data, not abstract theory, is what matters when evaluating cost-benefit of any approach.

unrealhoang · on Aug 5, 2020

Hopefully we can use cranelift as backend soon. That will be the first milestone for fast compile - not as fast result.

whatshisface · on Aug 5, 2020

>The downside is that it is a very common marketing trick.

"The less you intend to do about something, the more you have to talk about it." However this maxim is not applicable here because compile times actually have been improving.

erik_seaberg · on Aug 7, 2020

This. Don't ever dumb down a language just to make tools faster. Anything the tools don't help with, I have to do with my brain, which is made of meat and runs at 0.0000001 GHz.

nnethercote · on Aug 5, 2020

Hi, I'm the author of the blog post.

There is definitely a marketing angle to these posts. Rust has a reputation for slow compilation, and one of my goals is to show that (a) people working on Rust care about compile times, and (b) improvements are being made.

I'll let you judge whether the marketing is backed by actual substance. It's a long-running blog post series, here are links to all the posts.

* https://blog.mozilla.org/nnethercote/2016/10/14/how-to-speed...

* https://blog.mozilla.org/nnethercote/2016/11/23/how-to-speed...

* https://blog.mozilla.org/nnethercote/2018/04/30/how-to-speed...

* https://blog.mozilla.org/nnethercote/2018/06/05/how-to-speed...

* https://blog.mozilla.org/nnethercote/2018/11/06/how-to-speed...

* https://blog.mozilla.org/nnethercote/2019/07/17/how-to-speed...

* https://blog.mozilla.org/nnethercote/2019/07/25/the-rust-com...

* https://blog.mozilla.org/nnethercote/2019/10/11/how-to-speed...

* https://blog.mozilla.org/nnethercote/2019/12/11/how-to-speed...

* https://blog.mozilla.org/nnethercote/2020/04/24/how-to-speed...

* https://blog.mozilla.org/nnethercote/2020/08/05/how-to-speed...

EDIT: I just saw that alilleybrinker already linked to all these and more below: https://news.ycombinator.com/item?id=24061346

fulafel · on Aug 5, 2020

I hope you're right and it gets better on the long term - the blog post says "since my last post, the compiler is probably either no slower or somewhat faster for most real-world cases".

ndesaulniers · on Aug 5, 2020

We just found a opportunity for improving compile times for the Linux kernel when built with LLVM targeting x86 by 13% (slowdowns due to inline asm and sillyness in LLVM). Appearantly you can `perf record` the invocation of `make` for an entire kernel build, and get a 5GB trace of mostly llvm (well, mostly clang). When you start measuring and profiling, easy wins pop up everywhere. WIP

I was just today discussing with clang's code owner the idea of lazy parsing for static inline functions.

mhaberl · on Aug 5, 2020

That sounds great!

>slowdowns due to inline asm and sillyness in LLVM

I just hope that isn't a Chesterton's fence

ndesaulniers · on Aug 5, 2020

It wasn't. For a given architecture you can enable/disable extensions via +sse,-mmx,+avx,-avx512 etc. Certain enables or disables imply enabling/disabling others. How that was being computed was accidentally quadratic, and further could be memoized.

db48x · on Aug 6, 2020

You should send that to Accidentally Quadratic (https://accidentallyquadratic.tumblr.com/). There have been no new stories there in a year, but perhaps they just need submissions.

MaxBarraclough · on Aug 5, 2020

No mention of the work being done on the CraneLift backend for Rust. [0] That seems like a great solution: have one compiler that runs quickly, for development, and have a different compiler for release builds, capable of state-of-the-art optimisation.

My understanding is that compile times can be greatly improved simply by using a faster (and less optimising) code-generator, even without doing the (presumably much harder) work of optimising Rust's borrower-checker.

Disclaimer: I don't know all that much about Rust.

[0] https://github.com/bjorn3/rustc_codegen_cranelift

Thaxll · on Aug 5, 2020

Having two different compiler is tricky because that means what you dev and what goes to prod are different?

MaxBarraclough · on Aug 5, 2020

Could be an issue, but it's one with various possible solutions. Devs could switch to the LLVM-backed compiler periodically, a continuous integration system could run both compilers, etc.

If both compilers are relatively bug-free it shouldn't be too much of a burden. Correct C/C++ code runs fine under both GCC and Clang, for example. If anything, things should be easier with Rust, as it gives you less opportunity to shoot yourself in the foot in strange compiler-specific ways than do C and C++. I recall having a C++ alignment issue that only arose in one compiler, for example. (Of course, my code was broken, but it worked ok with one of the compilers just 'by coincidence'.) I figure that kind of thing is less likely in Rust, but I don't know the Rust language well enough to say this with confidence.

tonyedgecombe · on Aug 5, 2020

That’s the case with other compiled languages as the debug and release builds have different optimisations.

edflsafoiewq · on Aug 5, 2020

My impression was cranelift doesn't improve compile time by enough to lift rustc out of the "slow" category, and borrowck isn't usually the expensive part of compilation.

MaxBarraclough · on Aug 5, 2020

If not the borrower-checker or the code-generator, what's the slow part of Rust compilation? The front-end?

trait · on Aug 5, 2020

The slowest part is code-gen indeed, that's why cargo-check[0] is really fast. [0]https://doc.rust-lang.org/cargo/commands/cargo-check.html

zozbot234 · on Aug 5, 2020

Typically the bottleneck is monomorphizing generic code. Also evaluating macros (if they're being used).

kanobo · on Aug 5, 2020

It's hard to not respect the devotion and love Rustaceans give to Rust.

est31 · on Aug 5, 2020

Not the author, but the reasons that brought me to Rust from C++ 5 years ago are still valid today. Consistent build system supporting an online repo (cargo), safety, C++ best practices encoded as compiler lints, consistent naming policy. The last one seems innocuous but it's actually really annoying to integrate multiple libraries with different naming styles, having to remember when to use PascalCase vs _.

Ygg2 · on Aug 5, 2020

Don't forget a hyper friendly compiler, with helpful error messages.

cft · on Aug 5, 2020

I feel that it's a future of compilers: it almost feels like in interactive session, where you can actually learn stuff. I suggest expanding this.

rowanG077 · on Aug 5, 2020

Have you heard of Idris? It's tagline is programming as a conversation.

cletus · on Aug 5, 2020

To be fair--and I say this as a Rust fan--the pages of inscrutable error messages caused by a misplaced parenthesis in a template are pretty much over. Clang spearheaded this change to the point where some companies would use clang for development and gcc for production builds.

This isn't to say that template errors are problem-free but the situation is much better than it was 5 years ago.

Ygg2 · on Aug 5, 2020

I didn't mean it as slight against C++ template errors.

Java is my daily professional development. That's the baseline. And it's errors are not good as Rust's.

I often joke that IntelliJ IDEA is better programmer than I am. But in Rust, the same can be said of the Rust compiler.

The amount of suggestions, and explanations is stellar.

jfkebwjsbx · on Aug 5, 2020

But you were adding to the parent's list, which was giving reasons to move from C++ to Rust.

Ygg2 · on Aug 7, 2020

I said, don't forget as in extra reasons to prefer it. It was 100% unrelated to C++ errors.

adsjhdashkj · on Aug 5, 2020

It's so helpful that it's jarring when it's not - hah.

Ie, i tried out Diesel for a few projects and after several months, i'm moving away from it. The compiler messages can be insane due to Diesel's rather.. excessive use of generics haha.

anw · on Aug 5, 2020

Just curious, but what are you replacing diesel with?

adsjhdashkj · on Aug 6, 2020

SQLx - unfortunately using some blocking wrappers for (currently) some non-async codebases.

SQLx has been really nice so far. I recommend it.

Tade0 · on Aug 5, 2020

It's hard to not love a language the compiler of which uses words like "perhaps" in error messages, suggesting solutions that are often what you wanted to do in the first place.

I've been writing in Rust in a very on-and-off fashion and thanks to the compiler messages I'm able to pick up where I left even after a few months.

All that while being a front-end developer or, in other words, as far removed from systems programming as possible.

dwheeler · on Aug 5, 2020

> Each Tuesday I have been looking at the performance results of all the PRs merged in the past week

I really appreciate the work described here, but it seems to me that the performance tests are in exactly the wrong place. The correct time to benchmark a PR is before it is merged; not after. It's the same for any other kind of test; your tests should pass before you merge, not after.

Sure, there will be exceptions where a performance regression is appropriate. But that should be known and agreed on before merging it.

I don't see a strong reason that the performance tests have to be done afterwards. I presume that it's okay to have a delay between creating the pull request and its merge, since humans will typically want to be able to review it!

dbaupp · on Aug 5, 2020

> It's the same for any other kind of test; your tests should pass before you merge, not after

Rust is very good about testing before merge, not after. It somewhat pioneered doing complete testing before merge (at least, built significant tooling and mind-share around doing it): https://graydon2.dreamwidth.org/1597.html https://bors.tech/

That is to say, if Rust isn't validating something pre-merge, it's more likely than not that there's a good reason for doing so. I don't personally know the reason for Rust, but, for performance testing in general, it's easy to overwhelm infrastructure because benchmarks should be being run on a single physical machine that's otherwise quiet. This means that testing is completely serialised and there's no opportunity for parallelism. Based on https://perf.rust-lang.org/status.html , it looks like the performance tests currently take a bit longer than an hour each.

As others have mentioned, there's automatic tooling for running tests on PRs that may be performance sensitive, and PRs are reverted if they accidentally cause a regression, to be able to reset and reanalyse.

nnethercote · on Aug 6, 2020

Coming up with a binary "is this acceptable?" heuristic is also harder for performance than it is for correctness.

dwheeler · on Aug 7, 2020

I think the heuristic is really easy: "Is it slower than before?"

If it's slower, then you require a manual override to accept the merge.

That will make performance regressions a whole lot less likely. Sure, in some cases they may be necessary, but in many cases they were unintentional; they can be fixed, and then merged.

MinusGix · on Aug 5, 2020

The linked llvm article about performance ( https://nikic.github.io/2020/05/10/Make-LLVM-fast-again.html ) says: "Rust does this by running a set of benchmarks on every merge, with the data available on perf.rust-lang.org. Additionally, it is possible to run benchmarks against pull requests using the @rust-timer bot. This helps evaluate changes that are intended to improve compile-time performance, or are suspected of having non-trivial compile-time cost." So, they do seem to have a bot to perform such tests, though I have no clue how often they use it.

dbaupp · on Aug 5, 2020

It looks like it's used on about 6% of PRs: since the start of this year, 204 PRs[0] match a search for "@rust-timer queue", among 3393 total PRs[1].

[0]: https://github.com/rust-lang/rust/pulls?q=is%3Apr+is%3Amerge...

[1]: https://github.com/rust-lang/rust/pulls?q=is%3Apr+is%3Amerge...

hermanradtke · on Aug 5, 2020

In practice I do not think this will work. There is already a queue of PRs waiting to be tested and merged.

Most changes are not going to change the compiler performance. Better to watch a trend line and revert any change that hurts performance in a significant way. Releases are every 6 weeks so there is plenty of time for this type of analysis and activity.

tester34 · on Aug 5, 2020

What's going on on this page?

When I'm scrolling then section titles like 'Weekly performance triage' have green border for a like 0.5sec.

daoxid · on Aug 5, 2020

In a linked blog post [0] and on the Rust performance page [1] the performance metric "instructions" is used. What exactly is meant by that? Number of instructions executed?

[0] https://nikic.github.io/2020/05/10/Make-LLVM-fast-again.html

[1] https://perf.rust-lang.org/

nnethercote · on Aug 5, 2020

Yes. The good thing about this metric is that there is very little variation, unlike wall times, which makes it easier to spot regressions.

alilleybrinker · on Aug 5, 2020

Nicholas has been doing yeoman's work on speeding up the Rust compiler. This is the latest in a long line of contributions to tackle what has been one of Rust's most persistent problems: long compile times.

    * 2020-08-05: "How to Speed Up the Rust Compiler Some More in 2020" https://blog.mozilla.org/nnethercote/2020/08/05/how-to-speed-up-the-rust-compiler-some-more-in-2020/ (this post)
    * 2020-04-04: "How to Speed Up the Rust Compiler in 2020" https://blog.mozilla.org/nnethercote/2020/04/24/how-to-speed-up-the-rust-compiler-in-2020/
    * 2019-12-11: "How to Speed Up the Rust Compiler One Last Time in 2019" https://blog.mozilla.org/nnethercote/2019/12/11/how-to-speed-up-the-rust-compiler-one-last-time-in-2019/
    * 2019-10-11: "How to Speed Up the Rust Compiler Some More in 2019" https://blog.mozilla.org/nnethercote/2019/10/11/how-to-speed-up-the-rust-compiler-some-more-in-2019/
    * 2019-07-17: "How to Speed Up the Rust Compiler in 2019" https://blog.mozilla.org/nnethercote/2019/07/17/how-to-speed-up-the-rust-compiler-in-2019/
    * 2019-06-25: "The Rust Compiler is Still Getting Faster" https://blog.mozilla.org/nnethercote/2019/07/25/the-rust-compiler-is-still-getting-faster/
    * 2018-11-06: "How to Speed Up the Rust Compiler in 2018: NLL Edition" https://blog.mozilla.org/nnethercote/2018/11/06/how-to-speed-up-the-rust-compiler-in-2018-nll-edition/
    * 2018-06-05: "How to Speed Up the Rust Compiler Some More in 2018" https://blog.mozilla.org/nnethercote/2018/06/05/how-to-speed-up-the-rust-compiler-some-more-in-2018/
    * 2018-05-17: "The Rust Compiler is Getting Faster" https://blog.mozilla.org/nnethercote/2018/05/17/the-rust-compiler-is-getting-faster/
    * 2018-04-30: "How to Speed Up the Rust Compiler in 2018" https://blog.mozilla.org/nnethercote/2018/04/30/how-to-speed-up-the-rust-compiler-in-2018/
    * 2016-11-23: "How to Speed Up the Rust Compiler Some More" https://blog.mozilla.org/nnethercote/2016/11/23/how-to-speed-up-the-rust-compiler-some-more/
    * 2016-10-14: "How to Speed Up the Rust Compiler" https://blog.mozilla.org/nnethercote/2016/10/14/how-to-speed-up-the-rust-compiler/ (the post that started it all!)

He's also written about tools and concepts for crate maintainers to understand and improve their own compile-times:

    * 2019-10-10: "Visualizing Rust Compilation" https://blog.mozilla.org/nnethercote/2019/10/10/visualizing-rust-compilation/
    * 2018-11-09: "How to get the size of Rust types with `-Zprint-type-sizes`" https://blog.mozilla.org/nnethercote/2018/11/09/how-to-get-the-size-of-rust-types-with-zprint-type-sizes/
    * 2018-07-24: "Ad Hoc Profiling" https://blog.mozilla.org/nnethercote/2018/07/24/ad-hoc-profiling/

He's also contributed to useful profiling tooling!

    * 2019-04-19: "A Better DHAT" https://blog.mozilla.org/nnethercote/2019/04/17/a-better-dhat/

All this to say: huge thank you to Nicholas, and if you're interested in learning how to do real-world performance work, these posts are an excellent resource!

nnethercote · on Aug 5, 2020

Thank you for the links and the kind words!