Speeding Up the Rust Compiler (blog.mozilla.org)
351 points by staktrace on Dec 11, 2019 | 129 comments



Note that `cargo check` is faster than doing a full compile. Also I use the `rust-analyzer` language server for IDE integration to catch errors as I write them. Between the two, my workflow usually avoids the need for actually compiling a binary until I'm ready to run tests.


While developing, I completely agree with you. However, when I'm debugging it tends to get messier in my experience: I often want to make small changes and see how they influence the symptoms. In this situation you have to do full builds every time, and if your application is performance-critical enough, you may not have the option of using the faster non-optimized builds.

Admittedly part of the blame falls on me, I'm a big fan of printf-debugging and I tend to only use debuggers as a last resort.


Also, due to this bug, adding a single printf line causes everything after it in the file to be needlessly recompiled: https://github.com/rust-lang/rust/issues/47389


Ah, I'm quite old-fashioned when it comes to debugging, so I'll attach a debugger and maybe even read through the generated assembly if need be. I've also been trying to make more use of static asserts where possible. If I can ensure that both the compiler and I have the same understanding of the code, then there's hopefully less room for errors at runtime. But of course this is never going to eliminate the need for debugging, even if (and this is a big IF) the compiler knows my exact intent (i.e. my reasoning might be wrong somewhere, perhaps very subtly so). And static asserts are still quite hacky.

All that said, I totally agree that there are times when faster compiles are really useful.


You’ve just given the reason I often use to explain why printf debugging is something I’m not a fan of. For native development, printf debugging takes a backseat to proper instrumentation and debuggers just for sheer productivity reasons IMO. The main thing it gives is a serializable log for debugging multithreaded bugs. However, for just viewing state during a run, I think people should just learn to use their debugger (which can inject prints and watches on the fly, no need to recompile).


I agree with you; the main reason I stick to printf debugging is that I can't be bothered to change my habits. It's not a very sensible choice.

That being said, if I had to defend myself, I'd point out that printf debugging has a few advantages:

- It's often more lightweight and less intrusive than debuggers, which makes it less likely that you'll encounter a "heisenbug" that disappears when you attempt to debug it. This is especially true for timing-sensitive bugs (which occur in multithreaded code as you point out, but not only there).

- Debuggers are often environment-specific; if you change language, environment or even simply editor, you might have to re-learn how to use your debugger. Printf will always be there for you.

- I do a lot of work in embedded environments, including low-level and bare-metal stuff (bootloaders, drivers, etc.). While these days there's generally some debugger support available for these targets, it's often more limited or much more intrusive. If I put a breakpoint in an interrupt handler, I basically freeze the entire kernel when it triggers, which can sometimes do more harm than good if I'm trying to figure out what's going on. And again, in these environments the debugging solutions are often proprietary and sometimes quite expensive.


Yea, I do graphics programming mainly and I suppose console development resembles embedded to a certain degree. I guess part of my preference is that I really don't have a choice. The codebase I'm currently in can take minutes to dozens of minutes to compile. This is compounded by the fact that just loading the engine takes a long time, and loading levels/assets to get to the point where I can reproduce the bug again takes more time. I have to deal with heisenbugs a lot, so necessity has more or less forced me to learn the toolchain (or never get anything done).

It is true though that if there are bugs in the toolchain, it affects me a lot more. For my personal projects, I actually avoid Windows and "fancy" GUI-based tools. I also painstakingly write all my code so that it either compiles fast or can hot-reload.


'cargo clippy' is another good option (also faster than full compilation).


Does the VSCode integration support this?

Though for webassembly I need the generated wasm to exist and be loaded before I can really see what's happening :\


rust-analyzer includes a VS Code plugin.


rustc might never be as fast as the Go compiler because the language has so many additional features, but it always makes me so excited to see the continued progress in making the compiler ever faster.

Thank you for the hard work here!

Btw, the article mentions these tools: “All of the above improvements (and most of the ones in my previous posts) I found by profiling with Cachegrind, Callgrind, DHAT, and counts, but there are plenty of other profilers out there.” Does anyone have any good resources on using these with Rust, or just in general with C or C++?


I think rustc will never be faster than Go's or even Java's compilers.

Go has made quite a few language concessions to be fast to compile and is designed with that in mind. And it really shows, using auto reloading in Go feels like using Ruby or Python which is just great.

From my usage standpoint Go has basically no compile time at all.

And awful compile times can be a huge hindrance to productivity. I used to do web dev in Scala, but waiting for the sleepy compiler is one of the reasons I switched to Go.

Scala is a nice language, but the long compile times in combination with the VM/Jetty cycle feel quite a bit slower than Rust.


> And awful compile times can be a huge hindrance to productivity. I used to do web dev in Scala, but waiting for the sleepy compiler is one of the reasons I switched to Go.

If the compiler catches bugs that would otherwise only be found at run time, then the additional compile time pays for itself many times over in terms of productivity.


It has yet to be proven. Yes, Rust catches more bugs than other languages, but is it worth the slow compile times? I'm not sure.


I think he is referring to the situation where the compiler both catches errors at compile time and is fast. After all, the validation happening in the frontend is rarely the most resource-consuming thing a compiler does.


And I think rapsey is referring to the fact that a compiler with a more powerful type system allows you to encode more logic in your types. This, in turn, means it will catch more errors at compile time at the cost of longer compile times.


Yes, but I think the OP brought up Go because he doesn't want one or the other. He wants both! And Go having near-zero compile times means the best of both worlds (from his perspective).


Well, there is run time during testing and run time in production. From my experience in Java, I have caught quite a few bugs at run time during testing. And it is not that bad.


> From my usage standpoint Go has basically no compile time at all.

I see you've never imported the Kubernetes client library.


Delphi, F#, ML, Ada, D prove otherwise.

What they have going for them is not depending on LLVM.


> F# proves otherwise

Not really...F# has a "slow" compiler too.

Any advanced type system, whether F#, C++, Haskell, Scala, Rust, TypeScript...is inevitably going to have a significantly slower compiler than a more basic system (Go, Java, C).

That's just part of the tradeoff; though obviously you can optimize within those bounds.

It's about as universal as the runtime "rule": compiled perf > interpreted perf. Not technically inviolable, but practically so.

---

P.S. And while the F# compiler is indeed faster than rustc, it's not apples-to-apples. F# compiles to CLR; Rust compiles through LLVM all the way to native. If Rust deferred work to runtime with an LLVM interpreter (http://llvm.org/docs/CommandGuide/lli.html ?), it would improve compile-time performance, at the cost of runtime overhead.


I don't think this is true. C compilers often have a whole bunch of optimizations built in, which could easily make them quite a lot slower than a less-optimized compiler for a language with a more advanced/complex type system. There are other factors as well, such as the efficiency of the compiler--the same C compiler implemented in C and compiled with an advanced C compiler will handily beat a C compiler written in Python and executed on CPython. But to your point, all-else-equal, a program in a language with an advanced type system takes longer to compile than a simpler type system.


I mention this at the end of my comment, but yes the second very significant factor is the nature of transformation.

A SASS (CSS preprocessor language) compiler can be very fast because it does very little.

Native compilers must do significantly more transformation than other compilers, e.g. F# CLR compiler. This is doubly so if you request optimized output (but that's probably not the case in this discussion).


F# compiles to native code via Mono AOT, NGEN (available since .NET 1.0 which many seem to forget), .NET Native (does require a hack and some care due to missing support for some MSIL opcodes), Unity's IL2CPP.

Naturally Rust could offer an interpreter of some sort, however it still isn't there today, so we got to use what is available.


A similar path might be MIR to Wasm, w/o hitting LLVM at all. Or just interpret MIR directly.


You should check out the OCaml compiler. It is wicked fast.


Indeed. LLVM seems like a big blocker for fast compiles, although I suspect Rust may additionally need some higher-level optimisation passes.

The alternative cranelift backend here has been making slow but steady progress, and seems to be approaching a useable state: https://github.com/bjorn3/rustc_codegen_cranelift/issues


I'm not 100% sure blaming LLVM is the way forward, as Jonathan Blow and his Jai language are able to compile and link a full 3D game in under a second using it.


EDIT: Although I've now verified that Jai no longer uses LLVM for debug builds, when it did, it would have been able to do a compilation of the same game within only a few seconds as well, so my post isn't _terribly_ incorrect, thankfully =b


Possibly not, but there does seem to be strong correlation between languages using LLVM and long compile times. Perhaps Jai is an exception, but it's hard to know how or why that is given that he has not released his language.


AFAIK Jai only uses LLVM in release mode, since it does slow down the build. This makes sense: if you have optimizations off, you only have to do codegen, and if you are using LLVM for codegen, then you have to codegen twice! Once for LLVM IR and once for machine code.


Even when not using LLVM, Jai still does "codegen twice." Most compilers today (even JITs) have at least one IR in between the syntax tree and the generated code.

The existence of LLVM IR is not the problem- rather, it's how it gets used (on both sides of the API). Generating a lot of naive IR and letting the optimizer clean it up, for example, has a large cost.

And, while I'm not too up to date on the details of Jai, the last thing I heard it was still very fast to compile even in release builds that did use LLVM. That is, the Jai compiler is smarter about how it generates IR.


This is basically just saying that LLVM has an IR. While going straight from AST to machine code may sound like a great thing for compilation speed, it makes compiler maintenance really tough. Many AOT-based compilers nowadays are converging on four levels of IR, which seems to be a sweet spot. (Swift has AST, SIL, LLVM IR, MachineInstr; Rust has AST, MIR, LLVM IR, MachineInstr; GCC has AST, GENERIC, GIMPLE, RTL.)


FWIW SBCL, which is an incrementally compiled implementation of common lisp also has 4 stages:

Read, IR1, IR2, Assembly.


The thing Pascal-family languages have going for them (and I'm including Go here, because there is a lot of Pascal-family influence in Go, although it's not immediately obvious) is first and foremost the way they resolve dependencies. A detailed explanation of this can be found in the talk "Go at Google" (https://talks.golang.org/2012/splash.article#TOC_5.). I'm not sure about Rust, but maybe while trying to ensure a high degree of interoperability with C/C++ they also "inherited" some of their dependency management issues?


Rust certainly does not use header files.


I only cared to list languages whose complexity is in the same ballpark as Rust, hence why I left Go out.


Sorry for offending you by mentioning Go; I was referring to the two parent comments who mentioned it. I'm not a good judge of language complexity (are you?), but honestly I don't think that the Go compiler is much less complex than Delphi's - the "Delphi language" hasn't evolved much in 20 years (which is not necessarily a bad thing!). Ok, they now have cough generics, and - yay! - closures, but that's about it...


No offence at all.

Well, even Turbo Pascal for Windows v1.5 (last TP version before Delphi happened) is more feature rich than Go. :)


AdaCore just built an Ada front end for LLVM, so maybe not for long :-).

I'm not sure I understand the problem with rust build times. In Ada, at least with GNAT, a full rebuild of 200KLOC can be a bit long, depending on your use of generics, and your number of cores (thanks to AMD, build times will soon be ridiculously short). But next builds with slight modifications are quite fast, thanks to modular compilation (every module built independently, same as for C, thanks to spec/body separation if you just change the body of a module you just recompile the specific module) and incremental compilation (only rebuild what changed and their dependencies).

Is there something inherently slow in the Rust compiler that disallows those?

I know that writing an Ada compiler that could compile units independently seemed impossible to everyone at first, until (the late, sadly) Robert Dewar worked it out with RMS: https://news.ycombinator.com/item?id=15880160

I don't know Rust enough, just that what they're doing is amazing, and I hope they're not too focused on the small scale optimizations (which are great!) and look at the high level optimizations too, and especially to what's been done elsewhere through sweat and pain.


Well, there are other Ada compilers around besides GNAT, although GNAT is the only affordable one to most mortals.

Most people might also not be aware that Rational Software started their business by selling Ada machines, where one could enjoy an experience somewhat similar to Lisp Machines, just with Ada instead.

Thanks for the link.


None of those do the same kind of static analysis Rust is doing. Also F# doesn't compile fast either, in part because it is written in F# with functional idioms that are sometimes pokey.


Ada with SPARK sounds pretty much the same to me, or OCaml.

The only thing missing being lifetimes.

F# might not win marathons, but it is still faster than rustc, even when adding NGEN or Mono AOT into the pipeline.


Notably, Delphi uses[1] LLVM now too...

[1] http://docwiki.embarcadero.com/RADStudio/Rio/en/LLVM-based_D...


Interesting, will have to have a look into it and see how it performs. Thanks.


It looks like it is only for mobile OS targets, and sadly Embarcadero got to make yet another bunch of incompatible language changes, oh well.


It's still very fast.


Eventually the Rust compiler will be fully incremental, possibly tied into the debugger so it can patch running code with just the diff, retaining state while doing so, but this is probably 5 years out.


> And awful compile times can be a huge hindrance to productivity

Agree a lot with that. I write quite a bit of code in Rust and I use the JetBrains CLion IDE with the IdeaVim and Rust plugins.

CLion is very helpful when working with Rust code. It understands the language quite well and because of that it can help you by pointing out things that aren't going to work without having to constantly recompile your code.

For students and faculty members, JetBrains give out licenses free of charge that are valid for 1 year, and which can be renewed while you are still a student or faculty member. https://www.jetbrains.com/student/


Does it work any differently from the Rust plugin for IDEA?


I haven't tried IDEA so I can't compare it. All I can say is that CLion is great :)

If you write a lot of Rust, I suggest that you download the 30 day evaluation version of CLion and install the Rust plugin and see how it compares.


It gives debug support in CLion, which IDEA does not have, last I knew.


There are some really interesting projects under way that will challenge your claim. Never say never!


Sounds interesting, can you elucidate?


Disclaimer: I am under the impression that the following is exploratory work and so this ought to be taken with a grain of salt until it is communicated through official channels.

Main idea: Essentially, compile only what has changed, re-use compiled forms of everything that hasn't.

Essentially, pre-compiled dependencies will be sourced and plugged in to a Rust project. This presents great security concerns, but also great opportunities to advance software development. A large system has a vast network of crate dependencies. Certain versions of these crates will be pre-compiled, each uniquely identifiable through a hash. Some of these crates will even be audited and potentially certified. Hundreds of black box, pre-compiled crates will each be sandboxed in a very secure fashion such that it does only as specified and no more. No sandboxed library can reach beyond its advertised behavior. Unchanged, locally-developed parts can also be compiled and used through this flow as well. WASM projects are facilitating much of this work.

I may be missing important parts from this explanation, so hopefully it is corrected by someone more knowledgeable.


> Essentially, compile only what has changed, re-use compiled forms of everything that hasn't.

That's what incremental compiling does, and it has been enabled for a while. Unfortunately, it doesn't help on fresh builds.

> Essentially, pre-compiled dependencies will be sourced and plugged in to a Rust project.

There are two aspects here. The more recent one, which you are thinking of, is building and shipping procedural macros as WASM binaries. Procedural macros are very powerful, but they take a token stream as input to allow for future changes to the language. Because of this, basically every macro uses a Rust parser crate called syn. Since it's a fully-fledged parser, syn takes a while to compile (30 seconds or so, certainly not minutes), which many people find annoying, and it gets worse if you end up using different versions of syn for different proc macro crates. The plan here is to build the proc macros (including syn) somewhere on the Rust infrastructure and ship the binaries to the users. The sandboxing story is somewhat complicated: proc macros and build scripts can legitimately read or download files, or produce non-deterministic output in other ways. A WASM macro would not be able to do this, so the whole thing would be opt-in. It also provides no benefit for crates that don't use procedural macros. See https://github.com/dtolnay/watt for more details.

The other possible avenue is MIR-only rlibs. When you compile a dependency, you get machine code (with the exception of generic code). It might be possible to compile crates to MIR (again, on the Rust infrastructure) and only do the final codegen on the user's computer. But that's still complex, and not necessarily much faster. See https://github.com/rust-lang/rust/issues/38913.


> That's what incremental compiling does, and it has been enabled for a while.

While this is true, the compiler is not yet fully incremental in my understanding; there's still a lot left to do here.


> rustc might never be as fast as the Go compiler because the language has so many additional features

Not necessarily true. The D compiler runs incredibly fast for compiling the type of code you'd write in Go, and it only slows down if you use a lot of complicated features like templates or CTFE.


One of the things with Rust is that while the compiler could be considered slow, once your Rust code compiles, if you stay away from `unsafe` code and `unwrap()`, the code is usually bug free (apart from logic bugs that no compiler could catch). At least that's been my experience with Rust.


The statement "if it compiles it works" is a feeling you get from small programs in very safe languages, but we all know it isn't true, and saying stuff like that immediately causes experienced programmers to think you are naive. It isn't good language PR to say it or anything like it.


"If it compiles, and the tests pass, it works" is more accurate; if your tests are pretty thorough, it's very unlikely that you've got any major logical errors, and Rust rules out a massive class of lifetime, double-use, mutability etc. errors just by being Rust.

I've translated pure-logic business code (with fairly thorough, albeit static, tests that all passed) from Python to Rust, and the compiler complained about subtle errors that would have taken down a business using the code within days had it been put into production. Turns out there was a flaw in the spec.

Leveraging Rust's type system to logically separate the meanings of different integers, and only permitting arithmetic operations where meaningful (e.g. no distance+time, distance/time=speed), also uncovered a couple of even more subtle bugs – though luckily assertions would've caught these in the event of them actually making it through to production.

If it compiles, it probably works. But, more importantly, Rust forces you to write code that works in the first place. You have to think through the ramifications of what you're doing, sometimes to hold the entire function in your head at once, and that means you have to understand what you're doing. If you don't, it complains, and the chances are you won't be able to get it to compile until you do understand it.

And if it does compile when it's wrong, it's obviously wrong and your tests will fail. Most of the time.

Isn't programming fun?


As I said, if it compiles it means that there should be no bugs other than logic bugs. No null pointer exceptions because there is no `null` in Rust, for instance. No passing a wrong pointer type because doing something like `void *` in Rust only works with unsafe code. Same with mutable pointers, which only work with unsafe code. No concurrency bugs due to shared memory because passing data between threads is not possible without specialized data types.

Basically, if you've programmed in unsafe languages such as C, C++, Assembly or any language that has `null`, you know what Rust brings to the table. The Rust compiler does prevent very large classes of bugs all the while being non garbage collected. That's quite a feat!

And I'm far from being a Rust fanboy. I work with PHP, Javascript and devops tools in my day job. I'm not doing PR here. Just sharing my experience. Feel free to share yours.


In my experience, though, the statement is largely true for refactorings. Even in a language like Java that doesn't have a very expressive type system, if I need to change the structure of my classes, rename some stuff, etc., just knowing that it compiles tells you that you probably didn't break anything.

Of course, once you add / change logic, I agree with your assessment that the statement is hopelessly naïve in most practical use cases.


Although Rust's type system does catch a lot, I wouldn't say it's that effective. I, for one, will usually make an off-by-one error or forget that I left a stub somewhere and didn't come back to write a proper implementation. But it's easily caught by the most rudimentary tests; so you don't have to bother yourself with writing them as elaborate as people usually do with e.g. Python.


A lot of those basic issues, Clippy often catches. One thing I enabled recently to make sure they're caught before release is checking for spurious uses of dbg! and unimplemented!, as well as println!, in production code.

For off-by-one errors, in Rust it's often better to turn to iterators where appropriate than to, say, use indexes (it's also more efficient in most cases). Clippy can also catch those issues.


In my experience, all Clippy does is find style issues. Perhaps it's more effective in other projects, but in the code I wrote all by myself it didn't find a single error (as in bug). That's not to say Clippy isn't useful. I value style consistency (in this case consistency among the general pool of all Rust programmers and not inside a single project). Of course, YMMV.

> For off by 1 errors, in Rust it’s often better to turn to iterators where appropriate than to say using indexes (it’s also more efficient in most cases).

The off-by-one errors I made weren't related to indexing collections. I don't remember anymore what it was exactly, it got caught by the very first tests I wrote before even trying to use anything. I do use iterators whenever they make sense and I write new iterators whenever it makes sense. Still, good note from you.


Some of those style things are better style specifically because they help avoid common bugs.

I’ve found that it is helpful in avoiding those and additionally directing you to more performant choices. YMMV.


An off-by-one error is a logic error.


That doesn't mean it isn't _also_ a software bug given that the result is still unintended / unexpected.


> the code is usually bug free (apart from logic bugs that no compiler could catch)

So basically your code is not even a little bug-free, it just probably doesn't have a certain small class of bug in it.


Worse, it probably calls libraries written by rust users who drank that compiles=bug-free koolaid...

It won't likely crash, beyond that: all bets are off.

I've now encountered several pieces of rust software that have no error / exceptional case handling and just panic at things like unexpected command-line arguments.


Panicking at unexpected input is not ideal, but when you encounter comparably bad software from a C/C++ programmer, that bad input could be an ACE (arbitrary code execution) instead.


With false claims like the ancestor here being commonly made it becomes plausible that software written in rust could be faulty at a higher rate than in less safe languages, specifically because of the lack of care and dismissal of risks (plus ecosystem considerations).

I wouldn't go so far as to argue that it is at this point. But I think rust advocates probably ought to stop arguing as though it's axiomatically true. Maybe Mozilla could be talked into funding an academic study comparing defect rates in rust vs other languages used for systems programming.

Beyond substantiating the rust-improves-reliability trope, it could also identify areas for improvement where it doesn't.


Huh? How exactly do you stay away from error handling? Even the most basic code on Earth has unwrap or expect in it. Show me one single library that doesn't unwrap somewhere along the line.


`unwrap()` is "handling" errors by crashing the program, so it shouldn't really be used. It's common in examples, because it's usually the shortest/laziest code possible. However, all practical uses of `.unwrap()` have better alternatives, e.g. instead of

    if foo.is_some() {
        let unwrapped = foo.unwrap();
    }
you write:

    if let Some(unwrapped) = foo {
    }

There are plenty of combinators like .map_or(), .ok_or()?, .filter_map() that deal with optional values gracefully.


> `unwrap()` is "handling" errors by crashing the program, so it shouldn't really be used

This isn't the right advice to give, and is only going to confuse beginners. Panicking via unwrap/expect is perfectly fine, whether in a library or an application, as long as the panic represents a bug. For example, accessing an element of a Vec with an incorrect index because of a logic error. If a panic occurs, then it should reflect a bug that is intended to be fixed.

Stated differently, it should be impossible for an end user to cause a Rust application to panic. If they can, then it's a bug.

This advice permits use of unwrap/expect anywhere, so long as its occurrence is never expected.


I mostly agree with you, but in practice I think using unwrap or panicking in libraries is usually wrong.

As an example, it's quite easy to go astray following your perspective: say you read a packet off the network and panic on malformed data. I know you'd agree that would be an inappropriate time to use panic, as that would leave any upstream program open to trivial DoS exploits. But this is easy to do if you, say, have an API that takes a value, translates it to something else, and panics on invalid inputs.

My advice is generally in line with the GP, pornel: use the tools at hand for avoiding panics where it's easy to do so, such as ?, map, etc. Only panic/unwrap on bad library API usage (which I think is what you're suggesting). Feel free to panic anywhere in main.

People often translate “use of unwrap/expect anywhere, so long as its occurrence is never expected” in ways that can create major bugs as the software sees more use.


I understand that perspective to an extent, but I really want to double down here because I think it's important. :-)

> but in practice I think using unwrap or panicking in libraries is usually wrong

Very strongly disagree. I've created dozens of Rust libraries, and probably all of them have dozens of code paths that can panic. Other core libraries do the same. Just a simple slice access, e.g., `slice[i]`, is a line of code that could panic. (Since it's just a shorter way of writing `slice.get(i).unwrap()`.) I don't think we should be giving advice that runs contrary to how important libraries actually work. A lot of people learn to code by reading others' code, and when we give advice like "don't use unwrap/expect," they get confused when they see that virtually every widely used piece of Rust code violates it.

The key here is really that one shouldn't panic unless the panic itself is indicative of a bug somewhere. Blanket advice like "don't use unwrap/panic in libraries" is bad because---as I argued in the Clippy issue I posted in a sibling comment---it's effectively a prohibition against runtime invariants themselves. As a programmer, you insert a panic when you've failed (for any number of reasons) to capture the invariant in the type system. Blanket advice saying that one shouldn't use unwrap/expect in these circumstances leads one toward a path of much more complex APIs with error types that are never constructed unless there is a bug in the code. (It's likely that actually adding all of those error types will be so annoying that it's plausible the programmer will give up on Rust.)

> As an example it’s quite easy to fall astray of taking your perspective to say read a packet off the network, and panic on malformed data. I know you’d agree that would be a inappropriate time to use panic, as that would crash any upstream program with trivial DOS exploits. But this is easy to do if you say have an API that takes a value, translates it to something else and panics on invalid inputs.

I think this is a reflection of error handling being hard. It's just as easy to code a program with the mistaken assumption that a particular file path will always point to a valid and readable file. It takes a bit of learning to understand which things you can rely on never happening and which you can't. In the mean time, I don't think we should be giving advice that is very easy to misinterpret into a conclusion that just isn't tenable. Personally, I think it's much easier to talk about end user behavior. If there's a panic, then it's a bug that ought to be fixed. If you follow that, it's hard to go wrong.

You could rephrase it with, "If there's a reachable code path that panics, then the code path should be removed." You could then talk about what "reachable" means, i.e., code paths that are determined based on data that the program doesn't control vs code paths that are enforced for all inputs to the program.


I unfortunately know very little about Rust, but is it possible to recover from panics in Rust?

Because the problem that I'm seeing in Swift code (which also has a "crash on logic errors" approach) is that bugs can bring your whole system down, which is especially bad on server-side apps with multiple threads. Yes, you can use supervisord and/or load balancers, but you still lose in-flight requests.

By contrast, in a language with a runtime like Java or Ruby, you can catch almost everything at the top level, so you could just have some logic that generates a "whoops something went wrong" response in case of a serious error.

The point here is resiliency. I agree that you want to catch bugs early, and you should immediately abort execution once bad things happen; I also fundamentally agree that you don't want to litter your code with error types or other constructs that are supposed to never actually be used. But faults do happen in practice, because we write buggy code, and in such a case, it's good if you can isolate the fault and recover at a higher level. Languages like Erlang take this idea to an extreme.

This is currently a real problem for us with Swift, so I was wondering whether Rust has a better solution here.


> I unfortunately know very little about Rust, but is it possible to recover from panics in Rust?

Yes: https://doc.rust-lang.org/std/panic/fn.catch_unwind.html

However, this comes with a critical caveat: unwinding can be disabled when compiling an application, which means one cannot guarantee that catch_unwind will actually work. (If unwinding is disabled, then panics turn into unrecoverable aborts.)

For this reason, and because the standard error handling mechanism is done through return values, panicking is not an acceptable way to do robust error handling in Rust. Recovering from panics is useful in niche scenarios, like keeping a web server running even if a request handler panics, or in test harnesses, to ensure all tests run even if one panics. Which basically matches your key concern here.

(To be clear, I think this is mostly orthogonal to my original comment in this thread. :-))


Yep, that sounds like kind of what I would need. :)

But in general, exception handling is hard. There is value in having locally-unrecoverable, but globally-recoverable faults, stuff like that is probably where languages like Erlang shine.


I can’t tell if you and I are agreeing or disagreeing at this point. And yes, I would not disagree with the quality of your code and the libraries you’ve published, they’re some of the best and most widely used in the community.

You’ve said don’t panic for code paths that the program doesn’t control. That is nuanced, and often not obvious or easy to reason about, but the language makes those exceptional cases obvious, so why not use that to your advantage?

edit: and by the way, I didn't say that on indexes, I totally agree with you. Though, sometimes it is valuable to go to the extreme and prevent malicious index values from untrusted sources from crashing your program...


>I can’t tell if you and I are agreeing or disagreeing at this point.

I think we are disagreeing mostly over pedagogy.

> That is nuanced, and often not obvious or easy to reason about

Right. I think this is what I meant by "error handling is hard." But this really boils down to understanding what a runtime invariant is, and that takes time to learn. It's hard to do good error handling without internalizing that.

> but the language makes those exceptional cases obvious, so why not use that to your advantage?

I think you are, specifically by using unwrap/expect. In many such cases, the unwrap/expect has a comment explaining why it's impossible to panic.

> edit: and by the way, I didn't say that on indexes, I totally agree with you. Though, sometimes it is valuable to go to the extreme and prevent malicious index values from untrusted sources from crashing your program...

Well, `&slice[i]` and `slice.get(i).unwrap()` are equivalent. So if we give blanket advice like "don't use unwrap/expect in libraries" then the latter gets caught up in that advice while the former doesn't, even though they are exactly equivalent. To me, this reveals the problem in that pedagogy because it focuses too much on one particularly common manifestation rather than the thing that actually matters: a panic visible to an end user is always a bug.

To be clear, the thing I am disagreeing with is the advice to "not use unwrap/expect; use case analysis instead." I can appreciate a pedagogy that simplifies things upfront with the cost of getting some corner cases wrong. But unwrap/expect are used too much in too many valid cases IMO for that type of strategy to be effective.


I’m pretty sure clippy “encourages” you to use .get() for Vectors returning an Option<> for you to deal with so accessing a bad index is avoidable in idiomatic rust.

I’m not sure that was the best example but I agree with your overall point.


It does not. I know because I advocated against it. :-) https://github.com/rust-lang/rust-clippy/issues/1300

Also, Clippy's defaults are developed to be quite opinionated. It makes many suggestions that I (and many others) disagree with. So Clippy's defaults should not be used as a bludgeon for what's idiomatic and correct.


(I also do not use clippy very aggressively due to disagreeing with a lot of lints)



One of the commits in one of the first PRs in this article shows an example of a legitimate use of `unwrap()`...

    pub fn new(mut streams: Vec<TokenStream>) -> TokenStream {
        match streams.len() {
            0 => TokenStream::empty(),
            1 => streams.pop().unwrap(),
            _ => TokenStream::Stream(Lrc::new(streams)),
        }
    }
Since `streams.pop()` will never return `None` when `streams.len()` is `1`.

But I agree, most of the time using `match` (or its simplified form `if let`) or one of these combinators instead of `unwrap()` is better.


As a sidenote, I've been thinking about instituting an internal styling guide to discourage `unwrap()` in your scenario, in favor of `expect("impossibly missing")` or etc.

The message helps a bit more, and I also want to avoid debug unwraps that were forgotten about.

Though, the more I think about it, the more a macro makes sense to me, purely for code search. If we used a macro, something like `impossible!(streams.pop())`, then I can still code search for `.unwrap` and `.expect` to find the forgotten ones.

Hmm


A compiler error if the code cannot be eliminated during optimization. Kind-of hacky, but it could be valuable as a new function.


> `unwrap()` is "handling" errors by crashing the program,

IIUC, panicking via unwrap() or expect() IS graceful error handling. It unwinds the stack and releases resources. We should be clearer because it's confusing to people less familiar with Rust who will confuse this with actually crashing and immediately terminating.

Personally, I think if there's nothing you can actually do to handle this error then unwrap() or expect() are the correct choices. There's no point writing more code just to bubble an error that you can't handle.


> IIUC, panicking via unwrap() or expect() IS graceful error handling. It unwinds the stack and releases resources.

AFAIK, only with the default panic=unwind mode. With the panic=abort mode, it aborts the whole process instead of unwinding.
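For reference, the abort behavior is opted into per profile in Cargo.toml, e.g.:

```toml
# In Cargo.toml: make panics abort instead of unwind in release builds.
[profile.release]
panic = "abort"
```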


panic=abort is only intended for embedded stuff, I think.


Nope, it's intended whenever you want this semantic, and do not want to pay for landing pads. Embedded is a common case where this is true, but it's not exclusively for that.


If the filesystem you are trying to open a log file on fails, only your logging thread needs to give up, not your whole program. If there is anything you can still accomplish, don't panic, carry on. Tesla let the logging exception prevent charging of the vehicle.


> If there is anything you can still accomplish, don't panic, carry on.

This is often poor advice, because continuing in a corrupt state can often be worse (and harder to debug) than a clean panic. E.g. it's better to crash than to overwrite a save file with corrupt data, or transfer money to the wrong account, or show one user data belonging to another user.


If we are talking microservices I agree, take down the service. I have written a lot of safety-critical stuff, and the advice for a given address space (virtual or physical) has been to throw on inputs that are out of range or otherwise bad. This makes you fail fast and notice things, but it makes the conglomerate of address spaces as a whole horribly unreliable, all coming down all the time. As we have become less monolithic, with little persistent data shared between concerns, it has become possible to kill and restart only the offending thread, service, hardware, etc. The result is degraded functionality with an alert that it has become so. Tesla shouldn't have just stopped; they should have alerted the user that maintenance was necessary and given up on logging those errors to the now-failed flash.


Let's say you have a server or daemon and you call some library function: if it was written with such a careless approach, the whole server will fail just because a user record had an invalid email address, or because some entered data was not in the expected format.

Let's say you have a CLI tool: instead of the message "XX is a directory - file expected", your tool will just fail silently.

There are many more examples. Graceful shutdown is something even the laziest of us have to do.


Validation would be an expected Error and handled by returning an Err() rather than using unwrap() or expect().

.expect(&format!("{} is a directory - file expected", path)) gives exactly the behaviour you've described for a CLI tool.

Graceful shutdown means releasing locks, deleting PID files etc when encountering an unrecoverable error. Not just any error.


> Validation would be an expected Error and handled by returning an Err() rather than using unwrap() or expect().

When the language allows you to make all errors "expected", it's really stupid not to use it. The only reason is laziness.


I don't understand what you're trying to say here. Laziness has nothing to do with it. Returning a Result vs using .expect("this should never be missing or an error") is about whether your program can continue after this error.


No, panicking is not a correct way to handle program execution; it's just "I don't want to care about this possible error case right now", nothing more. I gave you two examples and you preferred to ignore them or say "it's ok". Let's agree to disagree; I will never understand people who are lazy enough to allow panicking in their code and say it "works as intended".

Many languages don't have Result types to return, so there you sometimes can't do much to prevent data corruption or something worse. But in languages where you can return more than just "false", it's not acceptable.


In libraries, yes. In applications, no. A lot of errors are fatal.


It should be used when you can prove that the error never happens, although unfortunately the compiler can't check that.


You have it wrong. Only the most basic code on Earth has unwrap in it. Real-world code will either actually handle the error with `match`, forward it with `?`, or unwrap it with `expect` iff it's sure that an error will not occur.


> https://github.com/servo/servo/search?q=unwrap&unscoped_q=un...

522 results in servo. Granted, some (but not all) are in tests.

`.unwrap()` is expected to be in every code base out there, as there is rarely much point in handling errors such as poisoned mutex locks.


Most of the code you're linking to is unit tests, not application code.


No, I am thinking of application code. If you're trying to handle a poisoned mutex, you're doing something wrong.


Yeah, most of that code is stuff like examples and tests, and most errors should be forwarded, not unwrapped. But yeah, of course if a resource that you depend on is potentially in an invalid state thanks to a thread crashing, it makes perfect sense to unwrap / expect.


Unwrapping results with pattern matching in "match" or "if let" lets you handle error conditions safely.


unwrap() and expect() unconditionally kill program execution in case of an error. A properly written program should handle such errors, or use one of the unwrap_or_* methods as a fallback. The point is that unwrap, like unsafe, is breaking the safety rules; a user writing one either knows what they're doing or just isn't bothering.


Unwrap and expect don’t violate any safety rules! If they did you’d need to wrap it in an unsafe block.

https://doc.rust-lang.org/nomicon/unwinding.html


You're technically right. I mean, if a user aims for durable code, then using unwrap is not desired behavior most of the time. For example, pointer dereference in C is an implicit operation; in Rust the user must choose explicitly how to handle an Option/Result. Even if using unwrap is not as harmful as using unsafe and doesn't violate the rules, it is our deliberate choice to break "end-user safety".


> The PR gave some very small (< 1%) speed-ups on the standard benchmarks but sped up a microbenchmark that exhibited the problem by over 1000x, and made it practical for procedural benchmarks to use tokens.

Does this imply that existing benchmarks weren’t catching the problem here because they were avoiding making use of a feature because it was too slow? That seems like a strange way to write a benchmark, especially if tokens were actually in common use in the compiler itself.


Given that we're talking about token concatenation in macros, I doubt that's going to show up in that much day-to-day code, so the small gains come from benchmarks using stdlib functionality that uses this feature.

The bit about benchmarks using tokens sounds to me like he's talking about harness code, rather than the test subject code. You can speed up running the benchmarks without necessarily affecting the benchmark results themselves.


If your benchmark harness has performance properties not reflected in your benchmark suite, your benchmark suite is incomplete.

Benchmark harnesses are real programs too.


We have a fairly small but complex library written in Rust.

A debug compile takes about 5 minutes from clean. Release takes 18 minutes. (1.38.0)


Do you happen to be using a lot of trait bounds? I had a small library that was slow due to this, and was able to refactor from ~3 minute builds to ~6 seconds by grouping the trait bounds into marker traits.

There is a compiler flag you can use to list where time is being spent and it showed all my time was spent dealing with those trait bounds.

There is an open issue to make that not necessary.

If your time is spent linking, you can swap the lld linker in


Though note that LLD isn't supported on Mac, so no luck there.


That seems like a bug. Please file it.


How much of that is linking? I have a simple project where the linker used to take over an hour, but ld.lld is much faster.


Is the library still small if we count its dependencies too?


Looks like if you're using an 'old' release (<=1.36, like me) it is time to upgrade.

The performance improvements in this post are from November '19 to December '19 - IMHO the November version _already_ had impressive optimizations.


One thing that was not addressed is why the effort is not put into making the parallel compiler the default. It would give an 8x speedup on a developer machine, compared to the 10% speedups from these optimizations, especially when AMD releases the Zen 2 architecture for laptops.


Any gains in the single threaded model (usually) carry over to the parallel model. And maybe there is a separate effort on making the parallel compiler working better.

Maybe they are working on it in parallel.


I understand that there's a separate effort, but I don't see many blog posts about it, and I see that as something that should be prioritised.

Rust was partly created because most of the transistors in our computers are heavily underutilized, and doing multiprocessing correctly in C++ is extremely hard.

Rust compiler is written in Rust, so it would be a perfect showcase of taking advantage of the multi-processing safety of the language.

I know that there are global variables in the compiler that the compiler team is getting rid of, but at this point that should be the main focus, as I see that most of the easy huge gains of single-threaded improvements are over.


Haha, nice pun there!


The problem is that lots of bits are hard to parallelise. Also, Rust already does quite a bit in parallel; it can usually fill my 6 CPUs when building multiple crates.


I see, I'm looking at the Rustc parallel meeting videos right now, I just wish there was a more organized blog for that team, or at least it would be easier to find the meeting notes.

https://www.youtube.com/watch?v=Wh20eXfMOSk&t=8s

I found the meeting notes:

https://hackmd.io/_1S8_ChMSa2N8mRw6EsGPA



