Oberon certainly compiles quickly, and I believe Delphi (Pascal-ish) was fast too, but I was thinking of Microsoft's Java compiler before they got spanked.
Java without generics used to compile pretty fast.
When you add generics, lifetimes, and type inference, the amount of work the compiler does grows significantly. It lets the compiler check much more interesting invariants, leading to the "if it compiles, it runs correctly" effect.
Basic generics in Rust don't have that much overhead; it sounds like you're conflating generics with templating. Generics aren't Turing complete, unlike C++'s templates.
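To make the distinction concrete, here's a minimal sketch (my own illustrative example, not from the thread) of how Rust generics work: the function is type-checked once against its trait bounds, then monomorphized into one concrete copy per instantiated type, with no template-style metaprogramming involved.

```rust
// A generic function, type-checked once against its bounds.
fn largest<T: PartialOrd + Copy>(items: &[T]) -> T {
    let mut max = items[0];
    for &item in &items[1..] {
        if item > max {
            max = item;
        }
    }
    max
}

fn main() {
    // The compiler emits separate machine code for largest::<i32>
    // and largest::<f64>, but both come from the same checked source.
    println!("{}", largest(&[1, 5, 3]));       // prints 5
    println!("{}", largest(&[1.0, 0.5, 2.5])); // prints 2.5
}
```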
Other languages with powerful generics and type inference like modern C# clearly manage just fine, too.
Yes, but things like `fn do_something<T>(t: T) -> impl Future where T: Serialize` definitely take a while.
And C# has the added benefit of being JIT'ed: it doesn't have to flatten all the generic code during compilation; it can optimise later.
In my experience (with Rust programs that take 20mins+) it is all about the monomorphisation of generics. The great majority of the time is spent in LLVM, because Rust hands over many GB of LLVM IR for optimization, having done not very much (edit: in the way of code transformation) other than direct instantiation of the generic parameters.
Rust does have the opportunity to do more, and I suspect they are hoping to do more with MIR, but at the moment they rely on LLVM to optimize out all of the zero-cost abstractions that they introduce.
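As an illustration of the kind of code that produces this effect (a hypothetical example of mine, not from the thread): the pipeline below type-checks almost instantly, but rustc hands LLVM a deep stack of instantiated adaptor types (`Map<Filter<Range<_>>>` and the closures inside them) that the optimizer must flatten back into a plain loop.

```rust
// Cheap for rustc to check, but each adaptor and closure becomes its own
// monomorphized type and function in the LLVM IR handed to the backend.
fn sum_of_even_squares(n: u64) -> u64 {
    (0..n).filter(|x| x % 2 == 0).map(|x| x * x).sum()
}

fn main() {
    // 0 + 4 + 16 + 36 + 64
    println!("{}", sum_of_even_squares(10)); // prints 120
}
```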
If you're referring to the comment starting with "It's not a big-O thing," I think that comment says the opposite. "Generics, lifetimes, and type inference" aren't "static checks and optimization passes" (within rustc)—he says the bulk of the work is LLVM dealing with the quantity of IR rustc generates, which is exactly what you'd expect from having heavy source-level abstractions like generics and type-heavy patterns like libstd's approach to iterators that all need to be compiled out.
Edit: so, in the bit below about MIR vs non-MIR borrow checking, I went and asked niko. And he told me that -Z time-passes is pretty much misleading now. Gah. I'm leaving the comment below because I put a lot of work into it, but apparently it may not be correct.
https://github.com/rust-lang-nursery/rust-forge/blob/master/... is how you're supposed to do this these days. It's a ton of work that I don't have time to do right now, but if you look at the example HTML output linked there, for a hello world, my broader point still stands, which is "translation 78.1%". That is, taking the MIR and turning it into machine code, first through LLVM IR, is the vast, vast majority of the compile time. Not the safety checks, not some un-optimized algorithm in them.
This is for expanding out macros: over half a second of time.
time: 0.338; rss: 169MB coherence checking
I'm sorta surprised this takes even a third of a second; it's never come up when I've looked at these results before. Coherence is the "you can only implement a trait for a type if you've defined at least one of them in your crate" rule.
time: 0.596; rss: 205MB item-bodies checking
It's early so maybe I'm slightly wrong, but I believe this pass includes the type inference stuff, because that's only done inside of function bodies. 6/10ths of a second isn't nothing, but...
This is actually something I'm quite surprised by. Right now, we borrow-check twice, since we're still working on the MIR-based borrow checker. I'm not sure that MIR borrow checking is supposed to be this much slower; I've pinged the compiler devs about it. Regardless, before this was introduced, we'd have shaved 2.5 seconds off of the compile time, and after the old one is removed, even if the borrowcheck doesn't get faster, we'd be shaving off half a second.
Then we get into a ton of LLVM passes. As you can see, most of them take basically no time. Ones that stick out are mostly codegen passes, which is what I was referring to with my comments above. But then we have:
time: 5.625; rss: 413MB translate to LLVM IR
Even just turning MIR into LLVM IR takes five seconds. This completely dominates the half-second and third-of-a-second times from before. In all:
time: 7.418; rss: 415MB LLVM passes
So, all the other passes take two seconds out of seven, combined. Ultimately, none of this is Rust's checks as a language, this is burning away all of the abstractions into lean, mean code. Finally,
time: 9.437; rss: 414MB translation
this is the "turn LLVM IR into binary" step. This also dominates total execution time.
Note that all of this is for an initial, clean build, so a lot of that stuff is setting up incremental builds. I deleted src, git checked it out, and then built again, and got this: https://gist.github.com/steveklabnik/1ed07751c563810b515db3f... way, way, way less work, and a faster build time overall: just five seconds.
> Ultimately, none of this is Rust's checks as a language, this is burning away all of the abstractions into lean, mean code.
Right - my understanding is that Rust generates very large MIR and also LLVM IR because the "zero-cost abstractions" aren't yet zero-cost, and compiling them into zero-cost machine code is inherently expensive.
So it's not the safety checks per se, but it's things like "there are three wrapper objects here with various generic parameters where C/C++ would have started with just a pointer from the beginning". Those three wrapper objects couldn't have existed without the safety checks, and rustc very quickly identified that everything is in order, but then it monomorphized the generics into lots of tiny functions and LLVM has to optimize all of that into machine code that resembles what the (unsafe) pointer approach would have generated. (Which explains why it's not time taken in rustc proper, but also why we don't see LLVM being equally slow when compiling C/C++.)
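A tiny sketch of what "zero-cost only after optimization" means in practice (my own hypothetical example, not the parent's code): a newtype wrapper is free at runtime, but only because LLVM inlines and strips it away.

```rust
// A newtype wrapper: rustc verifies it in no time, but the abstraction
// exists in the emitted IR until the optimizer erases it.
#[derive(Clone, Copy)]
struct Meters(f64);

fn add(a: Meters, b: Meters) -> Meters {
    Meters(a.0 + b.0)
}

fn main() {
    let total = add(Meters(1.5), Meters(2.5));
    // After optimization this is just a plain f64 addition, the same
    // machine code a C function on raw doubles would produce.
    println!("{}", total.0); // prints 4
}
```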
I might split out C++ from C here though; template-heavy C++ is known to be pretty slow to compile too, for basically the same reason: you end up generating a bunch of code that needs to be crunched down.
There's also some possibly deeper reasons around heavy use of generics: if your code touches something that is generic, and you change the type, it's going to need to recompile that function too. And then anything generic it touches. Etc. Trait objects can help compile times for this reason, but they aren't considered idiomatic, so most Rust code tends to stress the compiler in this way.
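The trade-off can be sketched like this (an illustrative example of mine, assuming the hypothetical names `show_generic`/`show_dyn`): the generic version is recompiled per concrete type, while the trait-object version is compiled once and dispatched through a vtable, which is why it can help compile times.

```rust
use std::fmt::Display;

// Monomorphized: a fresh copy is compiled for every T used by callers,
// and changing a caller's type forces recompiling that copy.
fn show_generic<T: Display>(value: T) -> String {
    format!("{}", value)
}

// Trait object: one compiled copy, dynamic dispatch through &dyn Display,
// so caller-side type changes don't ripple into this function.
fn show_dyn(value: &dyn Display) -> String {
    format!("{}", value)
}

fn main() {
    println!("{}", show_generic(42));  // prints 42
    println!("{}", show_dyn(&3.5));    // prints 3.5
}
```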
Most Java compilers are quick because they compile down bytecode, leaving virtually all optimization to the JIT.
AOT Java compilers certainly exist, but I don't recall Microsoft having one -- I fully admit my recollection of their Java dev environment is quite foggy at this point, though.
Regardless, the Java AOT compilers I've tried didn't seem particularly fast in relation to their level of optimization.