Improving Rust compile times to enable adoption of memory safety (memorysafety.org)
241 points by todsacerdoti on Feb 3, 2023 | 187 comments



Another build time improvement coming, especially for fresh CI builds, is a new registry protocol. Instead of git-cloning metadata for 100,000+ packages, it can download only the data for your dependencies.

https://blog.rust-lang.org/inside-rust/2023/01/30/cargo-spar...


Great stuff. Now, if they can just have a globally shared (at least per $USER!), content-addressable target/ directory, two of my complaints with Cargo would be fixed nicely...


You can do that today: set CARGO_TARGET_DIR to an absolute path.


Huh? But the docs say this:

"Location of where to place all generated artifacts, relative to the current working directory."


You can set an absolute path.

The problem is that if two workspaces build a dependency with slightly different features or flags, it will always be rebuilt when you switch workspaces.


I really wonder how many Dockerfiles are out there that pull the entire cargo "metadata" without a cache on every PR merge, how wasteful that is from a bandwidth/electricity standpoint, and whether in the grand scheme of things it's a small drop in the bucket.


In my experience it's pretty significant from the bandwidth side at reasonable levels of usage. You'd be astounded at how many things download packages and their metadata near constantly, and the rise of fully automated CI systems has really put the stress on bandwidth in particular, since most things are "from scratch." And now we have things like dependabot automatically creating PRs for downstream advisories constantly which can incur rebuilds, closing the loop fully.

If you use GitHub as like a storage server and totally externalize the costs of the package index onto them, then it's workable for free. But if you're running your own servers then it's a whole different ballgame.


I think GitHub would have throttled that cargo index repository a long time ago if it wasn't used by Rust, i.e. they get some kind of special favour. Which is nice but maybe not sustainable.


Github employees personally reached out to various packagers (I know both Cargo and Homebrew for certain) asking them not to perform shallow clones on their index repos, because of the extra processing it was incurring on the server side.


You can also use something like this to cache build artifacts and dependencies between builds:

https://github.com/Swatinem/rust-cache


Why would a CI build need the index at all? The lock file should already contain all the dependencies and their hashes.


You're correct that Cargo doesn't check the index if it's building using a lockfile, but I think the problem is that a freshly-installed copy of Cargo assumes that it needs to get the index the first time that any command is run. I assume (but haven't verified in the slightest) that this behavior will change with the move to an on-demand index by default.


Good thing they will continue to support the original protocol. I don't like downloading things on demand like that, not good for privacy.


How is it bad for privacy?

Before:

Download all metadata, Download xyz package

After:

Download xyz's metadata, Download xyz

They already know you are using xyz.


I don't care much either way, but you have the privacy argument backwards. If you're downloading all the things, then no one knows if you are using xyz, only that you might be using xyz. If you're just downloading what you need and you're downloading xyz, then they know that you're using xyz.


I'm not sure I understand. This is talking about Cargo metadata download improvements. You still download individual packages regardless of receiving a copy of the entire registry, so privacy hasn't materially changed either way.

If knowing you use a crate is too much, then running your own registry with a mirror of packages seems like all you could do.


You're downloading specific packages either way, which can potentially be tracked, regardless of whether you're downloading metadata for all packages or just one.

Edit: A thought occurs to me. Cargo downloads metadata from crates.io but clones the package repo from GitHub/etc. So unless I'm missing something, downloading specific metadata instead of all metadata allows for crates.io to track your specific packages in addition to GitHub.


No, repos of packages are not used, at all. Crates don't even need to be in any repository, and the repository URL in the metadata isn't verified in any way. Crates can link to somebody else's repo or a repo full of fake code unrelated to what has been published on crates.io.

crates.io crates are tarballs stored in S3. The tarball downloads also go through a download-counting service, which is how you get download stats for all crates (it's not a tracker in the Google-is-watching-you sense, but just an integer increment in Postgres).

Use https://lib.rs/cargo-crev or source view on docs.rs to see the actual source code that has been uploaded by Cargo.


This has it backwards. crates.io has always hosted the crates themselves, but has used Github for the index. In the future, with the sparse HTTP index, crates.io will be the only one in the loop, cutting Github out of the equation.


I originally posted this on reddit[1], but figured I'd share this here. I checked out ripgrep 0.8.0 and compiled it with both Rust 1.20 (from ~5.5 years ago) and Rust 1.67 (just released):

    $ git clone https://github.com/BurntSushi/ripgrep
    $ cd ripgrep
    $ git checkout 0.8.0
    $ time cargo +1.20.0 build --release
    real    34.367
    user    1:07.36
    sys     1.568
    maxmem  520 MB
    faults  1575
    
    $ time cargo +1.67.0 build --release
    [... snip sooooo many warnings, lol ...]
    real    7.761
    user    1:32.29
    sys     4.489
    maxmem  609 MB
    faults  7503
As kryps pointed out on reddit, I believe at some point there was a change to improve compilation times by making more effective use of parallelism. So forcing the build to use a single thread produces more sobering results, but still a huge win:

    $ time cargo +1.20.0 build -j1 --release
    real    1:03.11
    user    1:01.90
    sys     1.156
    maxmem  518 MB
    faults  0

    $ time cargo +1.67.0 build -j1 --release
    real    46.112
    user    44.259
    sys     1.930
    maxmem  344 MB
    faults  0
(My CPU is a i9-12900K.)

These are from-scratch release builds, which probably matter less than incremental builds. But they still matter. This is just one barometer of many.

[1]: https://old.reddit.com/r/rust/comments/10s5nkq/improving_rus...


Re parallelism: I have 12 cores, and cargo indeed effectively uses them all. As a result, the computer becomes extremely sluggish during a long compilation. Is there a way to tell Rust to only use 11 cores or, perhaps, nice its processes/threads to a lower priority on a few cores?

I suppose it's not the worst problem to have. Makes me realize how spoiled I got after multiple-core computers became the norm.


`cargo build -j11` will limit parallelism to eleven cores. Cargo and rustc use the Make jobserver protocol [0][1][2] to coordinate their use of threads and processes, even when multiple rustc processes are running (as long as they are part of the same `cargo` or `make` invocation):

[0]: https://www.gnu.org/software/make/manual/html_node/Job-Slots...

[1]: https://github.com/rust-lang/cargo/issues/1744

[2]: https://github.com/rust-lang/rust/pull/42682

`nice cargo build` will run all threads at low priority, which is generally a good idea if you want to prioritize interactive processes while running a build in the background.


To add, in rust 1.63, cargo added support for negative numbers, so you can say `cargo build --jobs -2` to leave two cores available.

See https://github.com/rust-lang/cargo/blob/master/CHANGELOG.md#...


Small quality of life changes like this really make the cargo and rust community shine, I feel. I'm not a heavy rust user but following all the little improvements in warnings and hints, they build up to a great experience. I wish we had the mindshare and peoplepower to do that in my language and tooling of choice (I'm specifically talking about effort and muscle, because motivation is clearly already there).


Are they real cores or hyperthreads/SMT? I've found that hyperthreading doesn't really live up to the hype; if interactive software gets scheduled on the same physical core as a busy hyperthread, latency suffers. Meanwhile, Linux seems to do pretty well these days handling interactive workloads while a 32 core compilation goes on in the background.

SMT is a throughput thing, and I honestly turn it off on my workstation for that reason. It's great for cloud providers that want to charge you for a "vCPU" that can't use all of that core's features. Not amazing for a workstation where you want to chill out on YouTube while something CPU intensive happens in the background. (For a bazel C++ build, having SMT on, on a Threadripper 3970X, does increase performance by 15%. But at the cost of using ~100GB of RAM at peak! I have 128GB, so no big deal, but SMT can be pretty expensive. It's probably not worth it for most workloads. 32 cores builds my Go projects quickly enough, and if I have to build C++ code, well, I wait. ;)


    #!/bin/sh
    exec ionice -c 3 nice -n 20 "$@"

Make it a shell script like `takeiteasy`, and run `takeiteasy cargo ...`


Partly because of being a Dudeist, and partly because it's just fun to say, I just borrowed this and called it "dude" on my system.

  dude cargo ...
has a nice flow to it.


This also relates to something not directly about rustc: many-core CPUs are much easier to get than five years ago, so a CPU-hungry compiler needn't be such a drag if its big jobs can use all your cores.


It's true!

Steam hardware survey, Jan 2017 [1] vs Jan 2023, "Physical CPUs (Windows)"

           2017    2023
  1 CPU    1.9%    0.2%
  2 CPUs  45.8%    9.6%
  3 CPUs   2.6%    0.4%
  4 CPUs  47.8%   29.6%
  6 CPUs   1.4%   33.0%
  8 CPUs   0.2%   18.8%
  More     0.3%    8.4%
[1] https://web.archive.org/web/20170225152808/https://store.ste...


However, rustc currently has limited ability to parallelise at a sub-crate level, which makes for not-so-great tradeoffs on large projects.


Someone asked (and then deleted their comment):

> How many LoC there is in ripgrep? 46sec to build a grep like tool with a powerful CPU seems crazy.

I wrote out an answer before I knew the comment was deleted, so... I'll just post it as a reply to myself...

-----

Well it takes 46 seconds with only a single thread. It takes ~7 seconds with many threads. In the 0.8.0 checkout, if I run `cargo vendor` and then tokei, I get:

    $ tokei -trust src/ vendor/
    ===============================================================================
     Language            Files        Lines         Code     Comments       Blanks
    ===============================================================================
     Rust                  765       299692       276218        10274        13200
     |- Markdown           387        21647         2902        14886         3859
     (Total)                         321339       279120        25160        17059
    ===============================================================================
     Total                 765       299692       276218        10274        13200
    ===============================================================================
So that's about a quarter million lines. But this is very likely to be a poor representation of actual complexity. If I had to guess, I'd say the vast majority of those lines are some kind of auto-generated thing. (Like Unicode tables.) That count also includes tests. Just by excluding winapi, for example, the count goes down to ~150,000.

If you only look at the code in the ripgrep repo (in the 0.8.0 checkout), then you get something like ~13K:

    $ tokei -trust src globset grep ignore termcolor wincolor
    ===============================================================================
     Language            Files        Lines         Code     Comments       Blanks
    ===============================================================================
     Rust                   34        15484        13205          780         1499
     |- Markdown            30         2300            6         1905          389
     (Total)                          17784        13211         2685         1888
    ===============================================================================
     Total                  34        15484        13205          780         1499
    ===============================================================================
It's probably also fair to count the regex engine too (version 0.2.6):

    $ tokei -trust src regex-syntax                          
    ===============================================================================
     Language            Files        Lines         Code     Comments       Blanks
    ===============================================================================
     Rust                   29        22745        18873         2225         1647
     |- Markdown            23         3250          285         2399          566
     (Total)                          25995        19158         4624         2213
    ===============================================================================
     Total                  29        22745        18873         2225         1647
    ===============================================================================
Where about 5K of that are Unicode tables.

So I don't know. Answering questions like this is actually a little tricky, and presumably you're looking for a barometer of how big the project is.

For comparison, GNU grep takes about 17s single threaded to build from scratch from its tarball:

    $ time (./configure --prefix=/usr && make -j1)
    real    17.639
    user    9.948
    sys     2.418
    maxmem  77 MB
    faults  31
Using `-j16` decreases the time to 14s, which is actually slower than a from-scratch ripgrep 0.8.0 build, primarily due to what appears to be a single-threaded configure script for GNU grep.

So I dunno what seems crazy to you here honestly. It's also worth pointing out that ripgrep has quite a bit more functionality than something like GNU grep, and that functionality comes with a fair bit of code. (Gitignore matching, transcoding and Unicode come to mind.)


It was me, and thanks for the details. I missed the multi threaded compilation in the second part, I thought it was 46sec with -jx


In addition, it's worth mentioning here that the measurement is for release builds, which are doing far more work than just reading a quarter million lines off of a disk.


The most annoying thing in my experience is not really the raw compilation times, but the lack of - or very rudimentary - incremental build feature. If I'm debugging a function and make a small local change that does not trickle down to some generic type used throughout the project, then 1-second build times should be the norm, or better yet, edit & continue debug.

It's beyond frustrating that any "i+=1" change requires relinking a 50mb binary from scratch and rebuilding a good chunk of the Win32 crate for good measure. Until such enterprise features become available, high developer productivity in Rust remains elusive.


To be clear, Rust has an "incremental" compilation feature, and I believe it is enabled by default for debug builds.

I don't think it's enabled by default in release builds (because it might sacrifice perf too much?) and it doesn't make linking incremental.

Making the entire pipeline incremental, including release builds, probably requires some very fundamental changes to how our compilers function. I think Cranelift is making inroads in this direction by caching the results of compiling individual functions, but I know very little about it and might even be describing it incorrectly here in this comment.


As far as I remember, Energize C++ allowed doing just that (and VC++ does a similar thing), and it feels quite fast with VC++ incremental compilation and linking.


> It's beyond frustrating that any "i+=1" change requires relinking a 50mb binary from scratch

It’s especially hard to solve this with a language like rust, but I agree!

I’ve long wanted to experiment with a compiler architecture which could do fully incremental compilation, maybe down to function-level granularity. In the linked (debug) executable, use a malloc-style library to manage disk space. When a function changes, recompile it, free the old copy in the binary, allocate space for the new function and update jump addresses. You’d need to cache a whole lot of the compiler’s context between invocations - but honestly that should be doable with a little database like LMDB. Or alternately, we could run our compiler in “interactive mode”, and leave all the type information and everything else resident in memory between compilation runs. When the compiler notices some functions are changed, it flushes the old function definitions, compiles the new functions and updates everything just like when the DOM updates and needs to recompute layout and styles.

A well optimized incremental compiler should be able to do a “i += 1” line change faster than my monitor’s refresh rate. It’s crazy we still design compilers to do a mountain of processing work, generate a huge amount of state and then when they’re done throw all that work out. Next time we run the compiler, we redo all of that work again. And the work is all almost identical.

Unfortunately this would be a particularly difficult change to make in the rust compiler. Might want to experiment with a simpler language first to figure out the architecture and the fully incremental linker. It would be a super fun project though!


Here's Energize C++ doing just that in 1993.

https://www.youtube.com/watch?v=yLZwLSzkH3E

VC++ has similar kind of support nowadays.


Most of the time for most changes you should just be relying on "cargo check" anyway. You don't need a full re-build to just check for syntax issues. It runs very fast and will find almost all compile errors and it caches metadata for files that are unchanged.

Are you really running your test suite for every "i+=1" change on other languages?


> Are you really running your test suite for every "i+=1" change on other languages?

You don't have to run your testsuite for a small bugfix (that's what CI is for), but you DO need to restart, reset the testcase that triggers the code you are interested in, and step through it again. Rinse and repeat for 20 or so times, with various data etc. - at least that's my debug-heavy workflow. If any trivial recompile takes a minute or so, that's a frustrating time spent waiting as opposed to using something like a dynamic language to accomplish the same task.

So you would instinctively avoid Rust for any task that can be accomplished with Python or JS, a real shame since it's very close to being a universal language.


Can you explain why the user time goes down when using a single thread? Does that mean that there's a huge amount of contention in the parallelism?


This is caused by hyperthreading. It's not an actual inefficiency, but an artifact of the way CPU time is counted.

The HT cores aren't real CPU cores. They're just an opportunistic reuse of hardware cores when another thread is waiting for RAM (RAM is relatively so slow that they're waiting a lot, for a long time).

So code on the HT "core" doesn't run all the time, only when the other thread is blocked. But the time HT threads wait for their turn is included in wall-clock time, and makes them look slow.


Back in the early days of HT I was so happy to get a desktop with it, that I enabled it.

The end result was that doing WebSphere development actually got slower, because of their virtual nature and everything else on the CPU being shared.

So I ended up disabling it again to get the original performance back.


Yeah, the earliest attempts weren't good, but I haven't heard of any HT problems post Pentium 4 (apart from Spectre-like vulnerabilities).

I assume OSes have since then developed proper support for scheduling and pre-empting hyperthreading. Also the gap between RAM and CPU speed only got worse, and CPUs have grown more various internal compute units, so there's even more idle hardware to throw HT threads at.


To be honest, I don't know. My understanding of 'user' time is that it represents the sum of all CPU time spent in "user mode" (as opposed to "kernel mode"). In theory, given that understanding and perfect scaling, the user time of a multi-threaded task should roughly match the user time of a single-threaded task. Of course, "perfect" scaling is unlikely to be real, but still, you'd expect better scaling here.

If I had to guess as to what's happening, it's that there's some thread pool, and at some point, near the end of compilation, only one or two of those threads is busy doing anything while the other threads are sitting and idling. Now whether and how that "idling" gets interpreted as "CPU being actively used in user mode" isn't quite clear to me. (It may not, in which case, my guess is bunk.)

Perhaps someone more familiar with what 'user' time actually means and how it interplays with multi-threaded programs will be able to chime in.

(I do not think faults have anything to do with it. The number of faults reported here is quite small, and if I re-run the build, the number can change quite a bit---including going to zero---and the overall time remains unaffected.)


Idle time doesn't count as user-time unless it's a spinlock (please don't do those in user-mode).

I suspect the answer is: Perfect scaling doesn't happen on real CPUs.

Turboboost lets a single thread go to higher frequencies than a fully loaded CPU. So you would expect "sum of user times" to increase even if "sum of user clock cycles" is scaling perfectly.

Hyperthreading is the next issue: multiple threads are not running independently, but might be fighting for resources on a single CPU core.

In a pure number-crunching algorithm limited by functional units, this means using $(nproc) threads instead of 1 thread should be expected to more than double the user time based on these first two points alone!

Compilers of course are rarely limited by functional units: they do a decent bit of pointer-chasing, branching, etc. and are stalled a good bit of time. (While OS-level blocking doesn't count as user time; the OS isn't aware of these CPU-level stalls, so these count as user time!) This is what makes hyperthreading actually helpful.

But compilers also tend to be memory/cache-limited. L1 is shared between the hyperthreads, and other caches are shared between multiple/all cores. This means running multiple threads compiling different parts of the program in parallel means each thread of computation gets to work with a smaller portion of the cache -- the effective cache size is decreasing. That's another reason for the user time to go up.

And once you have a significant number of cache misses from a bunch of cores, you might be limited on memory bandwidth. At that point, also putting the last few remaining idle cores to work will not be able to speed up the real-time runtime anymore -- but it will make "user time" tick up faster.

In particularly unlucky combinations of working set size vs. cache size, adding another thread (bringing along another working set) may even increase the real time. Putting more cores to work isn't always good!

That said, compilers are more limited by memory/cache latency than bandwidth, so adding cores is usually pretty good. But it's not perfect scaling even if the compiler has "perfect parallelism" without any locks.


> Turboboost lets a single thread go to higher frequencies than a fully loaded CPU. So you would expect "sum of user times" to increase even if "sum of user clock cycles" is scaling perfectly.

Ah yes, this is a good one! I did not account for this. Mental model updated.

Your other points are good too. I considered some of them as well, but maybe not enough in the context of competition making many things just a bit slower. Makes sense.


User time is the amount of CPU time spent actually doing things. Unless you're using spinlocks, it won't include time spent waiting on locks or otherwise sleeping -- though it will include time spent setting up for locks, reloading cache lines and such.

Extremely parallel programs can improve on this, but it's perfectly normal to see 2x overhead for fine-grained parallelism.


I'd say there's still a gap in my mental model. I agree that it's normal to observe this, definitely. I see it in other tools that utilize parallelism too. I just can't square the 2x overhead part of it in a workload like Cargo's, which I assume is not fine-grained. I see the same increase in user time with ripgrep too, and its parallelism is maybe more fine grained than Cargo's, but is still at the level of a single file, so it isn't that fine grained.

But maybe for Cargo, parallelism is more fine grained than I think it is. Perhaps because of codegen-units. And similarly for ripgrep, if it's searching a lot of tiny files, that might result in fine grained parallelism in practice.


Well, like mentioned elsewhere, most of that overhead is just hyper threads slowing down when they have active siblings.

Which is fine; it’s still faster overall. Disable SMT and you’ll see much lower overhead, but higher time spent overall.


Yes, I know it's fine. I just don't understand the full details of why hyperthreading slows things down that much. There are more experiments that could be done to confirm or deny this explanation, e.g., disabling hyperthreading. And playing with the thread count a bit more.


Hyperthreading only duplicates the frontend of the CPU.

That's really it. That's the entire explanation. It's useful if and only if there are unused resources behind it, due to pipeline stalls or because the siblings are doing different things. It's virtually impossible to fully utilize a CPU core with a single thread; having two threads therefore boosts performance, but only to the degree that the first thread is incapable of using the whole thing.

That's why the speedup is around 20%, not 100%.


I know all of that. There's still a gap because it doesn't explain in full detail how contended resources lead to the specific slowdown seen here. Hell, nobody in this thread has done the experiments necessary to confirm that HT is even the cause in the first place.


Spinlocks are normal userspace code issuing machine instructions in a loop that do memory operations. It is counted in user time, unless the platform is unusual and for some reason enters the kernel to spin on the lock. Spinning is the opposite of sleeping.

edit: misparsed, as corrected below, my bad.


I think you're saying the same thing as the GP. You might have parsed their comment incorrectly.


User time is the amount of CPU time spent in user mode. It is aggregated across threads. If you have 8 threads running at 100% in user mode for 1 second, that gives you 8s of user time.

Total CPU time in user mode will normally increase when you add more threads, unless you're getting perfect or better-than-perfect scaling.


There are hardware reasons even if you leave any software scaling inefficiency to the side. For tasks that can use lots of threads, modern hardware trades off per-thread performance for getting more overall throughput from a given amount of silicon.

When you max out parallelism, you're using 1) hardware threads which "split" a physical core and (ideally) each run at a bit more than half the CPU's single-thread speed, and 2) the small "efficiency" cores on newer Intel and Apple chips. Also, single-threaded runs can feed a ton of watts to the one active core since it doesn't have to share much power/cooling budget with the others, letting it run at a higher clock rate.

All these tricks improve the throughput, or you wouldn't see that wall-time reduction and chipmakers wouldn't want to ship them, but they do increase how long it takes each thread to get a unit of work done in a very multithreaded context, which contributes to the total CPU time being higher than it is in a single-threaded run.


Faults also drop to zero. Might be worth trying to flush the cache before each cargo build?


As someone who uses Rust on various hobby projects, I never understood why people were complaining about compile times.

Perhaps they were on old builds or some massive projects?


Wait, like, you don't understand, or you don't share their complaint? I don't really understand how you don't understand. If I make a change to ripgrep because I'm debugging its perf and need to therefore create a release build, it can take several seconds to rebuild. Compared to some other projects that probably sounds amazing, but it's still annoying enough to impact my flow state.

ripgrep is probably on the smallish side. It's not hard to get a lot bigger than that and have those incremental times also get correspondingly bigger.

And complaining about compile times doesn't mean compile times haven't improved.


I do understand some factors, but I never noticed it being like super slow to build.

My personal project takes seconds to compile, but fair enough it's small, but even bigger projects like a game in Bevy don't take that much to compile. Minute or two tops. About 30 seconds when incremental.

People complained of 10x slower perf. Essentially 15min build times.

The fact that older versions might be slower to compile fills in another part of the puzzle.

That, and the fact that I have a 24-hyperthread monster of a CPU.


30 seconds isn't incremental, that is way too long.

I work on a large'ish C++ project and incremental is generally 1-2 seconds.

Incremental must work in release builds (someone else said it only works in debug for Rust), although it is fine to disable link-time optimizations as those are obviously kinda slow.


> 30 seconds isn't incremental

I don't recall exact numbers. But bevy can pull in a lot of dependencies. Enough for the `target` directory to rival NPM's worst offenders (e.g. ~1GB).


I'll echo Ygg2's comments. At my previous job the minimum compile times were around 30 minutes, so compile times under a minute feel like they're happening almost instantly. It's enough that I don't need to break my thought process every time I compile.


Surely you can see how 1) it's all relative and 2) different people work differently. Like is this really so hard to understand? As far as I can tell, your comment is basically, "be grateful for what you have." But I am already. Yet I still want faster compile times because I think it will help me iterate more quickly.

I truly just do not see what is difficult to understand here.


First, compile times can differ wildly based on the code in question. Big projects can take minutes where hobby projects take seconds.

Also, people have vastly different workflows. Some people tend to slowly write a lot of code and compile rarely. Maybe they tend to have runtime tools to tweak things. Others like to iterate really fast: try a code change, see if the UI looks better or things run faster. When you work like this, even a compile time of 3 seconds can be a little bit annoying, and 30 seconds maddening.


It's less about "big projects" and more about "what features are used". It's entirely possible for a 10kloc project to take much more time to build than a 100kloc project. Proc macros, heavy generic use, and the like will drive compile time way up. It's like comparing a C++ project that is basically "C with classes" vs one that does really heavy template dances.

Notably, serde can drive up compile times a lot, which is why miniserde still exists and gets some use.
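
As a rough, hypothetical illustration of where that time goes: each derive below invokes the serde_derive proc macro at build time, which parses the item with syn and emits full serializer/deserializer impls via quote, and all of that generated code then has to be type-checked and optimized like code you wrote by hand.

    // Hypothetical struct: the two derives expand (via syn/quote) into
    // sizeable impl blocks that get compiled as part of this crate's build.
    #[derive(serde::Serialize, serde::Deserialize)]
    struct Config {
        name: String,
        retries: u32,
        endpoints: Vec<String>,
    }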


People are enabling serde codegen on every type, for no reason. That's it, that's the whole story. Those of us who don't do this will continue to read these "rustc is slow!!1!" posts and roll our eyes. Rustc isn't slow, serde is slow.


Completely agree. Coming from a job where the C project I worked on took 30 minutes for basic software builds (you don't generally compile the code while writing it, and you spend a lot of time manually scanning for typos), Rust compile times are crazy fast.


Code gen takes quite a while. Diesel features are one way to see the effect...

diesel = { version = "*", features = ["128-column-tables"], ... }


I remember I would spend hours looking at my code change because it would take hours to days to build what I was working on. I would build small examples to test and debug. I was shocked at Netscape with the amazing build system they had that could continuously build and tell you within a short few hours if you’ve broken the build on their N platforms they cross compiled to. I was bedazzled when I had IDEs that could tell me whether I had introduced bugs and could do JIT compilation and feedback to me in real time if I had made a mistake and provide inline lints. I was floored when I saw what amazing things rust was doing in the compiler to make my code awesome and how incredibly fast it builds. But what really amazed me more than anything was realizing how unhappy folks were that it took 30 seconds to build their code. :-)

GET OFF MY LAWN


I dare to want better tools. And I build them when I can. Like ripgrep. ¯\_(ツ)_/¯


Keep keeping me amazed and I’ll keep loving the life I’ve lived


I wonder about the framing of the title here. Rust is great but realistically a lot of software with memory safety bugs doesn't need to be written in C in the first place.

For example Java has a perfectly serviceable TLS stack written entirely in a memory safe language. Although you could try to make OpenSSL memory safe by rewriting it in Rust - which realistically means yet another fork not many people use - you could also do the same thing by implementing the OpenSSL API on top of JSSE and Bouncy Castle. The GraalVM native image project allows you to export Java symbols as C APIs and to compile libraries to standalone native code, so this is technically feasible now.

There's also some other approaches. GraalVM can also run many C/C++ programs in a way that makes them automatically memory safe, by JIT compiling LLVM bitcode and replacing allocation/free calls with garbage collected allocations. Pointer dereferences are also replaced with safe member accesses. It works as long as the C is fairly strictly C compliant and doesn't rely on undefined behavior. This functionality is unfortunately an enterprise feature but the core LLVM execution engine is open source, so if you're at the level of major upgrades to Rust you could also reimplement the memory safety aspect on top of the open source code. Then again you can compile the result down to a shared native library that doesn't rely on any external JVM.

Don't get me wrong, I'm not saying don't improve Rust compile times. Faster Rust compiles would be great. I'm just pointing out that, well, it's not the only memory safe language in the world, and actually using a GC isn't a major problem these days for many real world tasks that are still done with C.


> you could try to make OpenSSL memory safe by rewriting it in Rust

Or just write a better crypto stack without the many legacy constraints holding OpenSSL back. Rustls (https://github.com/rustls/rustls) does that. It has also been audited and found to be excellent - report (https://github.com/rustls/rustls/blob/main/audit/TLS-01-repo...).

You're suggesting writing this stack in a GC language. That's possible, except most people looking for an OpenSSL solution probably won't be willing to take the hit of slower run time perf and possible GC pauses (even if these might be small in practice). Also, these are hypothetical for now. Rustls exists today.


OpenSSL was just an example. You could also use XML parsing or many other tasks.

Point is that the code already exists - it's not hypothetical - and has done for a long time. It is far easier to write bindings from an existing C API to a managed implementation than write, audit and maintain a whole new stack from scratch. There are also many other cases where apps could feasibly be replaced with code written in managed languages and then invoked from C or C++.

Anything written in C/C++ can certainly tolerate pauses when calling into third party libraries because malloc/free can pause for long periods, libraries are allowed to do IO without even documenting that fact etc.

I think it's fair to be concerned that rewrite-it-in-rust is becoming a myopic obsession for security people. That's one way to improve memory safety but by no means the only one. There are so many cases where you don't need to do that and you'll get results faster by not doing so, but it's not being considered for handwavy reasons.


I think the thing you’re missing is that opensource people love rewriting libraries in their favourite languages. Especially something well defined, like tls or an xml parser. Rustls is a great example. You wont stop people making things like this. Nor should you - they’re doing it for fun!

It’s much more fun to rewrite something in a new language than maintain bindings to some external language. You could wrap a Java library with a rust crate, but it would depend on Java and rust both being installed and sane on every operating system. Maintaining something like that would be painful. Users would constantly run into problems with Java not being installed correctly on macos, or an old version of Java on Debian breaking your crate in weird ways. It’s much more pleasant to just have a rust crate that runs everywhere rust runs, where all of the dependencies are installed with cargo.


> It is far easier to write bindings from an existing C API to a managed implementation than write, audit and maintain a whole new stack from scratch.

I’d agree, if rustls wasn’t already written, audited and maintained. And there are other examples as well. The internationalisation libraries Icu4c and Icu4j exist, but the multi-language, cross-platform library Icu4x is written in Rust. Read the announcement post on the Unicode blog (http://blog.unicode.org/2022/09/announcing-icu4x-10.html?m=1) - security is only one of the reasons they chose to write it in Rust. Binary size, memory usage, high performance. Also compiles to wasm.

Your comment implies that people rewrite in Rust for security alone. But there are so many other benefits to doing so.


> people looking for an OpenSSL solution probably won't be willing to take the hit of slower run time perf and possible GC pauses

Golang users would?

That aside, excellent points about rustls and libssl legacy cruft.


No, I’m imagining cross-language usage. Someone not using Go isn’t going to use the crypto/tls package from the Go std lib regardless of its quality. The overhead and difficulty of calling into Go make this infeasible.

To include a library written in another language as a shared lib, it needs to be C, C++ or Rust.


That's not feasible for the millions of devices that don't have the resources for deploying GraalVM or GraalVM compiled native images.

The other thing to consider is that in many applications, nearly every single bit of i/o will flow through a buffer and cryptographic function to encrypt/decrypt/validate it. This is the place where squeezing out every ounce of performance is critical. A JIT + GC might cost a lot more money than memory safety bugs + AOT optimized compilation.


Native images are AOT optimized and use way less RAM than a normal Java app on HotSpot does. And you can get competitive performance from them with PGO.

Using a GC doesn't mean never reusing buffers, and Java has intrinsics for hardware accelerated cryptography for a long time. There's no reason performance has to be less, especially if you're willing to fund full time research projects to optimize it.

The belief that performance is more important than everything else is exactly how we ended up with pervasive memory safety vulns to begin with. Rust doesn't make it free, as you pay in developer hours.


How big are GraalVM native images?


Because that won't help with software that "needs" to be written in C or C++.

This fallacy is why these languages are still in use. Time after time, designers of all the safer languages were deciding that GC makes everything so much easier, and it's perfectly fine for the overwhelming majority of programs. This is correct and rational if the goal is to get many people to use the language, but a total self-own if the goal is to replace C and C++ entirely.

This dodging of the low-level memory management problem was consistently avoiding exactly the types of programs that people felt they had to use C or C++ for. The easy majority of programs that don't need C is already well served, but the tough cases that needed C were left uncontested.

From Rust's first introduction:

http://venge.net/graydon/talks/intro-talk-2.pdf

> Go seems to be barking up a different tree?

> Everyone is dodging the niche I'm interested in


A lot of software might not need to be implemented in C... but it is. Not just one piece either, but there's ecosystem effects at play too.

Rust interops much better with C than most languages, including Java, and offers a smoother transition path to increased safety.


There is Go, too.


"There are possible improvements still to be made on bigger buffers for example, where we could make better use of SIMD, but at the moment rustc still targets baseline x86-64 CPUs (SSE2) so that's a work item left for the future."

I don't understand this. The vast majority (I would guess 95%+) of people using Rust have CPUs with AVX2 or NEON. Why is that a good reason? Why can't there be a fast path and slow path as a failover?


Because it requires some kind of fat binaries.

Some C and C++ compilers offer this and it requires some infrastructure to make it happen (simd attribute in GCC), or explicitly loading different kinds of dynamic libraries.


I found most precompiled C++ libraries target a specific minimum microarchitecture because it swells the binary size if you do fat binaries, and for most users they wouldn’t know anyway. Companies always get complaints when they bump up the minimum though - I remember reading some articles about some games requiring SSE4.1 or AVX and gamers complaining they couldn’t play it on their ten year old machine.


Not playing on their 10 year old machine is the main reason why OpenGL 3.3 keeps being the baseline when using GL.

Although maybe GL 4.1 would do it nowadays.


Rust already has runtime CPU feature detection. But that would require hand writing the SIMD code.
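
For reference, a minimal sketch of how that looks today; the detection macro and `target_feature` attribute are the standard pieces, but the AVX2 function body here is just a placeholder for hand-written SIMD:

    // Runtime feature detection with a scalar fallback (sketch).
    #[cfg(target_arch = "x86_64")]
    fn sum(data: &[u8]) -> u64 {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: only called after confirming AVX2 is available.
            unsafe { sum_avx2(data) }
        } else {
            sum_fallback(data)
        }
    }

    #[cfg(target_arch = "x86_64")]
    #[target_feature(enable = "avx2")]
    unsafe fn sum_avx2(data: &[u8]) -> u64 {
        // Placeholder body; real code would use core::arch AVX2 intrinsics.
        data.iter().map(|&b| b as u64).sum()
    }

    fn sum_fallback(data: &[u8]) -> u64 {
        data.iter().map(|&b| b as u64).sum()
    }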


Higher baseline and failovers are possible in Rust, just not the default.

The default configuration for the "x86-64" target is really meant to run on every x86-64 CPU. I know at least Debian has the same policy. SSE2 is as old as x86-64 itself, so it can be assumed to be available in every x86-64 CPU, but nothing else can.

There has been some movement to define pseudo-targets like x86-64-v1, x86-64-v2, x86-64-v3 for higher baselines.


I really wish there was some work on hermetic compilation of crates. Ideally crates would be able to opt-in (eventually opt-out) to "pure" mode which would mean they can't use `build.rs`, proc macros are fully sandboxed, no `env!()` and so on.

Without that you can't really do distributed and cached compilation 100% reliably.


That would help some stuff, but it wouldn’t help with monomorphized code or macro expansion. Those two are the real killers in terms of compilation performance. And in both of those cases, most of the compilation work happens at the call site - when compiling your library.


Eiffel, Ada, D, and C++ with extern templates (even better if modules are also part of the story) show ways this can be improved.

Naturally, someone has to spend time analysing how their approach maps onto Rust's compilation story.


Would it not allow macro expansion to be cached? Which I believe it can't be currently because macros can run arbitrary code and access arbitrary external state.


Those are simply different problems though. Macro expansion is not even exactly a problem.

For monomorphized code the compiler just needs a mode where it automatically does what the Momo crate does.
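
For anyone unfamiliar, the hand-written version of that transformation looks roughly like this (hypothetical function): keep a thin generic shim that only does the conversion, and push the real work into a non-generic inner function, so only the shim gets monomorphized per caller type.

    use std::path::Path;

    // Thin generic wrapper; duplicated per concrete `path` type, but tiny.
    pub fn read_name(path: impl AsRef<Path>) -> std::io::Result<String> {
        // The actual work is compiled exactly once.
        fn inner(path: &Path) -> std::io::Result<String> {
            std::fs::read_to_string(path)
        }
        inner(path.as_ref())
    }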

For proc macros the Watt crate (precompiled WASM macros) will make a big difference. It just needs official sanction and integration.

Anyway yeah those are totally separate problems to caching and distributed builds.


Haskell is one of the few languages that can compile slower than rust. But they have a REPL GHCI that can be used to fairly quickly reload code changes.

I wish there were some efforts at dramatically different approaches like this because there’s all this work going into compilation but it’s unlikely to make the development cycle twice as fast in most cases.


I've started liking evcxr (https://github.com/google/evcxr) for REPL. It's a little slow compared to other REPLs, but still good enough to be usable after initial load.


I agree, evcxr really needs to be advertised more. It might need a new name, I don't even know how to say it.



Phonetically, for a German that sounds like "eww, disgusting wanker"


Are there any articles/papers that explain how a mix of compiled and interpreted code works for Haskell? I wanted to play with this idea for my toy language, but don't know where to start.


This might give you some idea, but you basically just want to reload your top level module and ghci will automatically reload any dependent modules. https://chrisdone.com/posts/ghci-reload/

It’s not really a mix, you do one or the other although certainly ghci for local dev and compilation for prod.


When people complain about rust compile times are they complaining about cold/clean compiles or warm/cached compiles? I can never really tell because people just gripe "compile times".

I can see how someone would come to rust, type `cargo run`, wait 3-5 minutes while cargo downloads all the dependencies and compiles them along with the main package, and then say, "well that took awhile it kinda sucks". But if they change a few lines in the actual project and compile again it would be near instant.

The fair comparison would be something akin to deleting your node or go modules and running a cold build. I am slightly suspicious, not in a deliberate foul play way but more in a messy semantics and ad-hoc anecdotes way, that many of these compile time discrepancies probably boil down more to differences in how the cargo tooling handles dependencies and what it decides to include in the compile phase, where it decides to store caches and what that means for `clean`, etc. compared to similar package management tooling from other languages, than it does to "rustc is slow". But I could be wrong.


> But if they change a few lines in the actual project and compile again it would be near instant.

If it’s a big project and the lines you are changing are in something that is being used many other places then the rebuild will still take a little while. (30 seconds or a minute, or more, depending on the size of the project.)

Likewise, if you work on things in different branches you may need to wait more when you switch branch and work on something there.

Also if you switch between Rust versions you need to wait a while when you rebuild your project.

I love Rust, and I welcome everything that is being done to bring the compile times down further!


> (30 seconds or a minute, or more, depending on the size of the project.)

I'm working on a largeish modern java project using gradle, and this sounds great... Every time I start my server it takes 40 seconds just for gradle to find out that all the sub projects are up to date, nothing has been changed and no compilation is necessary...


Yeah, even Maven seems much faster than Gradle, even on a clean build.

Neither are especially fast though.


I am not discouraging efforts to make compile times faster. However, I also see a lot of things that would really make Rust soar not being worked on: syntax quality-of-life reworks being dropped once they get complex under the hood, partially complete features with half-baked PRs, IDE tooling and debugging support, interface types and much of the momentum behind wasm, async traits and the sorely lacking async_std, etc. It seems like every time I dive into something moderately complex I start hitting compiler caveats with links to issues that have been open for 5 years and a bunch of comments like "what's the status of this can we please get this merged?". It can ever so slightly give one the impression that the rust community has decided that the language is mature and the only thing missing is faster compile times.


> "what's the status of this can we please get this merged?"

Having written Rust professionally for a number of years, this didn't happen too much. Where it did it was stuff like "yeah you need to Box the thing today", which... did not matter, we just did that and moved on.

> It can ever so slightly give one the impression that the rust community has decided that the language is mature and the only thing missing is faster compile times.

That is generally my feeling about Rust. There are a few areas where I'd like to see things get wrapped up (async traits, which are being actively worked on) but otherwise everything feels like a bonus. In terms of things that made Rust difficult to use, yeah, compile times were probably the number one.


I mean this is what you have to do to access variables from an async block:

    let block = || {
        let my_a = a.clone();
        let my_b = b.clone();
        let my_c = c.clone();
        async move {
            // use my_a, my_b, my_c
            let value = ...;

            Ok::<success::Type, error::Type>(value)
        }
    };
And you can't use `if let ... && let ...` (two lets for one if) because it doesn't desugar correctly.

And error handling and backtraces are a beautiful mess. Your signatures look like `Result<..., Box<dyn std::error::Error>>` unless you use `anyhow::Result`, but then half the stuff implements std::error::Error but not Into<anyhow::Error>, and you can't add the silly trait impl because of language limitations, so you have to map_err everywhere.

It's not just "oh throw a box around it and you're good". It's ideas that were introduced to the language when there was lots of steam ultimately not making it to a fully polished state (maybe Moz layoffs are partly to blame IDK). Anyway I love Rust and we use it in production and have been for years, but I think there's still quite a bit to polish.


I just want to reiterate how ridiculous the async block example is. You have to add a wrapper block that captures the variables and clones them, because there is no way to specify how you want variables to be captured, and what works in a normal block does not work in an async block. Then, because of some other language/compiler limitations, the return type of the block can't be inferred, so you have to specify it manually even when you use the block in a context where it ought to be inferred trivially. All this adds up to the block syntax being more complicated than just defining a function to do the same thing (when blocks are supposed to be a convenience so that you don't need a bunch of functions with context pointers taking pointers to other functions everywhere like you do in C). Which you'd happily do, but wait: you're trying to curry a function for a specific use case where you pass it to an async iterator function, so the signature is fixed and you have to use a block to curry, because Rust doesn't support that either. So you end up with the above just to say "run this piece of code concurrently for each element in a list".


> I mean this is what you have to do to access variables from an async block:

I am clearly missing some context because that code is needlessly complex. Are you just trying to show that you have to clone values that you hold a reference to if you want to move them? Because yes, you do. But your example also needlessly borrows them in the outer closure.


It’s not needless. I explained it a bit in my previous comment but it’s not worth diving into in this forum. To get an idea, consider scenarios along the lines of transforming a list of data in parallel using an async collection/iterator where all the futures are spawned and joined within the local scope of a single function. Hit me up in the Rust Discord if you want to chat about it and go into more details. Otherwise all I can really say is that I have encountered a few scenarios where this is necessary but shouldn’t be with improvements to the compiler.


There are a few more fundamental missing pieces for me:

- It's impossible to describe a type that "implements trait A and may or may not implement trait B"

- It's impossible to be generic over a trait (not a type that implements a trait, the trait itself)


> - It's impossible to describe a type that "implements trait A and may or may not implement trait B"

So, specialization? Or something else? I haven't found a need for specialization. I remember when I came from C++ I had a hard time adjusting to "no specialization, no variadics" but idk I haven't missed it in years.

> - It's impossible to be generic over a trait (not a type that implements a trait, the trait itself)

Not sure I understand.


> So, specialization?

Basically yes. But that works with dynamic dispatch (trait objects) as well as static dispatch (generics).

> Not sure I understand.

A specific pattern I'd like to be able to represent is:

    trait AlgorithmAInputData {
      ...
    }

    trait AlgorithmA {
      trait InputData = AlgorithmAInputData;
      ...
    }

    trait DataStorage<trait AlgorithmA> {
      type InputData : Algorithm::InputData;
      
      fn get_input_data() -> InputData;

    }

    fn compute_algorithm_a<Storage: DataStorage<AlgorithmA>>() {
      ...
    }


> It's impossible to describe a type that "implements trait A and may or may not implement trait B"

How is this different from just describing a type that only "implements trait A" ?


It would allow you to call a function to check for trait B and downcast to "implements trait A and B" in the case that it does implement the trait.
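
For what it's worth, a minimal sketch of the manual workaround available today (all names made up): trait A has to carry an explicit query method, because there is no built-in way to ask a `dyn A` value whether it also implements B.

    trait B { fn extra(&self); }

    trait A {
        fn work(&self);
        // Opt-in hook: returns the same object as `&dyn B` if supported.
        fn as_b(&self) -> Option<&dyn B> { None }
    }

    struct Plain;
    impl A for Plain { fn work(&self) {} }

    struct Both;
    impl B for Both { fn extra(&self) {} }
    impl A for Both {
        fn work(&self) {}
        fn as_b(&self) -> Option<&dyn B> { Some(self as &dyn B) }
    }

    fn use_it(x: &dyn A) {
        x.work();
        if let Some(b) = x.as_b() {
            b.extra(); // only reached when the concrete type also implements B
        }
    }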


It seems like a way to ask "can this thing implement X, and if so how?" from, say, the Any trait would be what you want here. I have no idea how hard that would be to deliver, but I also don't see how the previous trait thing is relevant. Like, why do we need to say up front that maybe we will care whether trait B is implemented?


I’m still learning the language but couldn’t you use an enum containing two types to accomplish the same thing?


You can if you know all of the possible types in advance. But if you want to expose this as an interface from a library that allows users to provide their own custom implementation then you need to use traits.


> It can ever so slightly give one the impression that the rust community has decided that the language is mature and the only thing missing is faster compile times.

It's not the case; it's that the features are now good enough, and compile times are the one major, big sore point.

So, if you compare Rust to X you can make a very good case until you hit:

"... wait, Rust is THAT SLOW TO COMPILE?"

":(. Yes"


For the branch-switching usecase you might get some mileage out of sccache [1]. For local storage it's just one binary and two lines of configuration to have a cache around rustc, so it's worth testing out.

1: https://github.com/mozilla/sccache


Practically all Rust crates make heavy use of monomorphized generics, so every use of them in a new project is bespoke and has to be compiled on the spot. This is very different from how Go or Node work. You could compile the non-monomorphic portions of a Rust crate into a C-compatible system library (with a thin, header-like wrapper to translate across ABIs), but in practice it wouldn't amount to much.
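
A small, hypothetical example of what that means for build times: the generic below ships from the library crate essentially as a recipe, and each concrete instantiation is compiled and optimized as part of the downstream crate's build, not the dependency's.

    // In the dependency: generic code, nothing concrete is generated yet.
    pub fn largest<T: PartialOrd + Copy>(items: &[T]) -> T {
        let mut best = items[0];
        for &item in &items[1..] {
            if item > best {
                best = item;
            }
        }
        best
    }

    // In your crate: each distinct T produces a separate monomorphized copy,
    // compiled as part of your build.
    fn main() {
        let a = largest(&[1_i32, 5, 3]); // instantiates largest::<i32>
        let b = largest(&[1.0_f64, 0.5]); // instantiates largest::<f64>
        println!("{a} {b}");
    }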


Some popular proc-macros could be pre-compiled and distributed as WASM, and it would be impactful, since they tend to bottleneck the early parts of a project build. However I don't think that could be made entirely transparent, because right now there's a combinatorial explosion of possible syn features. For now I avoid depending on syn/quote if I can.


Incremental builds are what matter to me. On my 1240p, if I change one file and build, it takes ~11s. Changing one file and running tests takes ~3.5s. That's all build time; the tests themselves run in <100ms.

The incremental build performance seems to be really dependent on single-thread performance. An incremental build on a 2014ish Haswell e5-2660v3 xeon takes ~30s.


> On my 1240p if I change one file and build it takes ~11s to build. Changing one file and running tests takes ~3.5

`cargo test` and default `cargo build` use the same profile, so presumably the first number is referring to `cargo build --release`. Release builds deliberately forego compilation speed in favor of optimization. In practice, most of my development involves `cargo check`, which is much faster than `cargo build`.


> so presumably the first number is referring to `cargo build --release`

Both numbers are for debug builds. I don't know why `cargo test` is faster but I appreciate it.

Incremental release builds with `cargo build --release` are even slower, taking ~35s on the 1240p.


Honestly, it should be impossible; in the absence of some weird configuration, cargo test does strictly more work than cargo build. :P Can you reproduce it and file a bug?


I suspect most of that time is link time. Possibly the linker in use is not very parallel, so linking one big executable with cargo build takes longer than many smaller test executables whose linking can actually be done in parallel?
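If linking is the bottleneck, one thing that might be worth trying (my suggestion, not something the parent mentioned) is swapping in a faster, more parallel linker like lld or mold; a sketch for Linux, assuming clang and lld are installed:

    # .cargo/config.toml (sketch for an x86_64 Linux target)
    [target.x86_64-unknown-linux-gnu]
    linker = "clang"
    rustflags = ["-C", "link-arg=-fuse-ld=lld"]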



I contributed to Rome tools [1] and the build takes more than 1 min. This makes the write-build-test loop frustrating... So frustrating that I am hesitant to start a project in Rust...

My machine is 7yo. People tell me to buy a new one just for compiling a Rust project... That's ecologically questionable.

[1] https://rome.tools/


I'd say if you got a newer system, the SSD speed of the new system would be at least 2x of the old one. That's got to help at least the linking step substantially.


It's a few things:

1. Clean builds can happen more often than some may think. CI/CD pipelines can end up with a lot of clean builds - especially if you use ephemeral instances (to save money), but even if you don't it's very likely.

Even locally it can happen sometimes. For example, we used Docker to run builds. For various reasons the cache could get blown. Also, sometimes weird systemy things happen and 'cargo clean' fixes it, but you have to recompile from scratch. This can take 10+ minutes on a decent sized codebase.

2. On a large codebase even small changes can lead to long recompile times, especially if you want to run tests - cargo check won't be enough, you need to build.


Both, because sometimes a little change implies recompiling the world due to configuration changes.

Also it is quite irritating sometimes seeing the same crate being compiled multiple times as it gets referenced from other crates.

Ideally Rust could use a dumb compilation mode (or interpreter) for change-compile-debug cycles, and proper compilation for release, e.g. Haskell and OCaml offer such capabilities on their toolchains.


I primarily develop the Virgil compiler in interpreted mode (i.e. running the current source on the stable binary's interpreter). Loading and typechecking ~45kloc of compiler source takes 80ms, so it is effectively instantaneous.


But you aren't on our timeline and haven't opted into our bullshit.

HTML+Js projects used to be testable just by loading a web page, and that community has opted in to long build times.

Most people are so far away from flow-state that they can't even imagine another way of being.


Mhm, maybe twas a bad idea to start learning React with ReasonML. Being used to compiles in milliseconds made the collision with TS at $Job extra painful.


Yeah it works great having toolchains that support all possible execution models.


cargo run is a command you'd generally use to actually get something running, I guess. It's not going to be incremental development in many cases, which tend to focus on unit tests.

FWIW cold builds (i.e., in docker with no cache) of cargo are much slower than go, hanging for a long time on refreshing the crates.io index. I don't know exactly what that is doing, but I have a feeling it is implemented in a monolithic way rather than on-demand. Rust has had plenty of time to make this better, but it is still very slow for cold cargo builds, often spending minutes refreshing the crates index. But Go misses easy optimizations like creating strings from a byte slice.

So it is what it is - Go makes explicit promises of fast compile times. Thanks to that, build scripts in go are pretty fast. Any language that doesn't make that explicit might be slow to compile and might run fast - that's totally fine and I would rather have two languages optimized to each case than one mediocre language.


Refreshing the crates index has gotten quite slow because it currently downloads the entire index, regardless of which bits you need. There's a trial of a new protocol happening now, due for release in March, that should speed this up (https://blog.rust-lang.org/inside-rust/2023/01/30/cargo-spar...)
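If anyone wants to opt in during the trial, my understanding from the linked post is that it's a small config change, roughly:

    # .cargo/config.toml (sketch; opt-in while the sparse protocol is rolling out)
    [registries.crates-io]
    protocol = "sparse"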


You don't even need a separate language, there's already a "fast" compiler for Rust based on cranelift which is used in debug builds by default.


Cranelift is not used for debug builds by default. I think that's probably a goal (although I'm not actually 100% sure about that just because I'm not dialed into what the compiler team is doing). Even the OP mentions this:

> We were able to benchmark bjorn3's cranelift codegen backend on full crates as well as on the build dependencies specifically (since they're also built for cargo check builds, and are always built without optimizations): there were no issues, and it performed impressively. It's well on its way to becoming a viable alternative to the LLVM backend for debug builds.

And the Cranelift codegen backend itself is also clear about it not being ready yet: https://github.com/bjorn3/rustc_codegen_cranelift

(To be clear, I am super excited about using Cranelift for debug builds. I just want to clarify that it isn't actually used by default yet.)


The more immediate goal of "distribute the cranelift backend as a rustup component" has been making good progress and seems like it might happen relatively soon https://github.com/bjorn3/rustc_codegen_cranelift/milestone/...


That's amazing. Thanks for that update. Can't wait.


Great news.


I write a lot of Rust code and, outside of performance optimization with Release builds, I've had next to no issues with iterative compile times, even on fairly large projects. Honestly it feels like a bit of a meme at this point. My CPU is also 5 years old (4 cores at 5ghz), so it isn't like I have a super beefy setup either.


I love to see work being done to improve Rust compile times. It’s one of the biggest barriers to adoption today, IMO.

Package management, one of Rust’s biggest strengths, is one of its biggest weaknesses here. It’s so easy to pull in another crate to do almost anything you want. How many of them are well-written, optimized, trustworthy, etc.? My guess is, not that many. That leads to applications that use them being bloated and inefficient. Hopefully, as the ecosystem matures, people will pay better attention to this.


On the contrary, commonly used Rust crates tend to be well written and well optimized (source: I have done security audits of hundreds of deps and I curate https://lib.rs).

Rust has a culture of splitting dependencies into small packages. This helps pull in only focused, tailored functionality that you need rather than depending on multi-purpose large monoliths. Ahead-of-time compilation + generics + LTO means there's no extra overhead to using code from 3rd party dependency vs your own (unlike interpreted or VM languages where loading code costs, or C with dynamic libraries where you depend on the whole library no matter how little you use from it).

I assume people scarred by low-quality dependencies have been burned by npm. Unlike JS, Rust has a strong type system, with rules that make it hard to cut corners and break things. Rust also ships with a good linter, built-in unit testing, and a standard documentation generator. These features raise the quality of average code.

Use of dependencies can improve efficiency of the whole application. Shared dependencies-of-dependencies increase code reuse, instead of each library rolling its own NIH basics like loggers or base64 decode, you can have one shared copy.

You can also easily use very optimized implementations of common tasks like JSON, hashmaps, regexes, cryptography, or channels. Rust has some world-class crates for these tasks.


I see a lot of work going on to make the compiler faster (which looks hard at this point), but I wish I could at least make correct changes without needing to recompile.

The extract-function tool is very buggy. As I spend a lot of time refactoring, maybe putting time into those tools would have a better ROI than putting so much work into making the compiler faster.


Keep in mind that the people working on rustc are not the same working on rust-analyzer, even if there's some overlap in contributors and there's a desire to share libraries as much as possible. Someone working on speeding up rustc is unlikely to have domain expertise in DX and AST manipulation, and vice-versa.


Maybe you're right, but I think both are hard enough that people who are smart enough to do one can do the other if they really want :)

By the way, AST manipulation is easy; the really hard part of refactoring (that I had a lot of problems with) is creating the lifetime annotations, which requires a deep understanding of the type system.

I was trying to learn some type theory and read papers to understand how Rust's lifetimes work, but only found long research papers that don't even do the same thing as Rust.

I haven't found any documentation that documents exactly when a function call is accepted by the lifetime checker (borrow checking is easy).


Have you read the Rustonomicon section on lifetimes? I found it pretty useful.


It's cool, I just looked at the manual.

there's this part though:

// NOTE: `'a: {` and `&'b x` is not valid syntax!

I hate that I can't introduce a new lifetime inside a function; it would make refactoring so much easier. Right now I have to try to refactor, see if the compiler accepts it or not, then revert the change.

Desugaring would sometimes be a great feature in itself; sugaring makes interactions between functions much harder to understand.
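A trivially small example of the desugaring I mean (my own, not from the Rustonomicon): the elided signature is what you write, the explicit one is what the compiler actually infers, and for anything non-trivial I'd love a tool that shows me the second form.

    // What you write, with lifetime elision:
    fn first_word(s: &str) -> &str {
        s.split(' ').next().unwrap()
    }

    // The desugared form the compiler actually works with:
    fn first_word_explicit<'a>(s: &'a str) -> &'a str {
        s.split(' ').next().unwrap()
    }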


Maybe it's a dumb idea, but what about a mode where it won't check types and stuff? So we can just compile super fast when we are just tinkering around.

Or a mode where it compiles automatically every time you change a line? (With absolutely no optimization like inlining, etc., to make it fast.) Kind of like just compiling the new line from Rust to its ASM equivalent and adding that to the rest of the compiled code. Like a big fat-jar type of way, if that makes sense.


I don't know much about how the compiler works, so the answer here is probably that I should read a book, but can external crates from crates.io be precompiled? Or maybe compile my reference to a part of an external crate once and then it doesn't need to be done on future compilations?

If the concern is that I could change something in a crate, then could a checksum be created on the first compilation, then checked on future compilations, and if it matches then the crate doesn't need to be recompiled?


Cargo already does this when building incrementally, and there are tools for doing it within an organization like sccache.

> If the concern is that I could change something in a crate

It's possible for a change in one crate to require recompiling its dependencies and transitive dependencies, due to conditional compilation (aka "features" [0]). Basically you can't know which thing to compile until it's referenced by a dependent and provided a feature set.

That said, many crates don't have features and have a default feature set, but the number of variants to precompile is still quite large.

[0] https://doc.rust-lang.org/cargo/reference/features.html

Note that C and C++ have the exact same problem, but it's mitigated by people never giving a shit about locking dependencies and living with the horrible bugs that result from it.
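To make the feature problem concrete, here's a made-up manifest snippet: the same crate compiles to different code depending on which of these its dependents turn on, so there's no single artifact to pre-build.

    # Cargo.toml of a hypothetical crate "somecrate"
    [features]
    default = ["std"]
    std = []
    # Optional integration that pulls in an extra dependency and extra code paths:
    serde = ["dep:serde"]

    [dependencies]
    serde = { version = "1", optional = true }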


Hosted pre-compiled builds would need to account for

- Feature flags

- Platform conditionals

- The specific rust version being used

- (unsure on this) the above for all dependencies of what is being pre-compiled

There is also the impediments of designing / agreeing on a security model (do you trust the author like PyPI, trust a central build authority, etc) and then funding the continued hosting.

Compiling on demand like in sccache is likely the best route for not over-building and being able to evict unused items.


Sure, all the combos tail off into a practically infinite number, but with an 80/20 approach, if you limit those combinations to just the most common, wouldn't it hugely reduce the waste of rebuilding everything locally?


sccache seems awesome. I wasn't aware of it. Thanks.


Because of the lack of a stable ABI, they'd need to pre-compile it for however many versions of the Rust compiler they wanted to support.


I don't understand these overhyping folks. Instead of fixing the thousands of memory safety bugs on their tracker, they'd rather ignore them and go for more compiler performance. Well, why not, it's insanely slow. But then I wouldn't dare to mention their memory safety story, which would need a better compiler, not a faster one.


It's funny, any other post on HN about improvements to Rust I've seen are chock full of comments to the effect of "I guess that feature is nice, but when will they improve the compile times?" And now many of the replies to this post are "Faster compiles are nice, but when will they improve/implement important features?"

The Rust dev team can't win!


I used to be a grad student adjacent to Bjarne Stroustrup; he has a quip he's used a bunch: if no one's complaining, no one cares.

I see all of these complaints — both the volume, and the count — as great indicators of Rust's total health.


So true. I use a similar quip at work: "Take the shortcut and if we're not out of business when it becomes an issue, then it will be a good problem to have".


I guess that depends on the work. By day I repair houses and taking shortcuts can mean I could have an even bigger problem to solve in a few months. Luckily I have yet to be in such a situation myself but I've fixed other's shortcuts a number of times.


Yes, I should specify I’m talking about software development. Physical products and work rarely have the luxury of making mistakes or taking shortcuts, the universe is a harsh place.


This is when it is important not to model the comment section as some sort of single composite individual.

Since it is impossible to mentally model them as the number of humans they are, I find it helpful to model them as at least a few very distinct individuals, or sometimes just as an amorphous philosophical gas that will expand to fill all available comments, where the only question is really with what distribution rather than whether a given point will be occupied.


Rust is amazing, I truly believe a large number of people are intimidated by it and so go out of their way to shit on it and pretend like it's only for some niche IOT device... when it's just as easy to write out a full crud application in Rust as any other language at this point.


I wish extreme Rust solipsists would stop blankly stating as a fact that working in Rust is 'easy' just because they find it so. If you do, good for you, but your experience is not universal. Many don't.

I find Rust extremely difficult and slow to work with. It takes me a long, long time to get anything working at all. Easily 5x more than any other language I've used (and that's many, and a good handful professionally), and I've been learning Rust for over a year.

Figures are hard to come by of course, but anecdotally I'm the only person in my circle who's continued using Rust. All the others have dropped out because they just find it too hard to get anything done in. Not sure if this is true, but I heard on a podcast the other day that one of the big surveys showed Rust to have the biggest learning drop-out rate of any mainstream programming language. That wouldn't surprise me, and comports well with Rust being the 'most loved' (people tend to love skills they have gained with much effort!).


Do you remember which survey? I’d love to see it!


No I was half-listening and think it was mentioned in a loose way, and I think I'm conflating it with some other mention in a different recent podcast so that's more than a few levels of looseness.

I've just scrubbed back through - the podcast was https://syntax.fm/show/571/supper-club-rust-in-action-with-t..., which follows the pleasing practice of providing chapters (yay). It was Tim McNamara speaking from about 12'40": what he actually said was that about half of Rust learners who do drop out fall away because of the purported difficulty of the language. Very different from my faux summary, so apologies for the grievous misrepresentation.

I still find Rust as hard to use as others I've spoken to do though. I'm kind of keeping at it for reasons specific to projects I have in mind, as well as a certain dense stubbornness.


Thank you! It’s all good, I just hadn’t heard that before and was curious how it was figured out! The Rust Project has long acknowledged that Rust is tough to learn. That’s why I got to have a job back then! And there’s still new good work going on in that space. I don’t think the nut has been fully cracked yet.


> there’s still new good work going on in that space

Is there anything there someone like me (ie. having difficulty with Rust) might usefully contribute to? I'm a little overwhelmed right now trying to keep a roof over my head, but am compiling a list of things I might like to help with when the current storm has passed.


I don't know what's good to help with and what isn't, because I haven't been involved with Rust development for a pretty long time at this point.

Personally I think contributions work best when you're trying to solve a pain that you personally have or at least have some sort of connection to, so I'd encourage you to consider what/how/why you struggled to learn, and then try to fix that. I know it's vague, but it's the best I've got right now!


Fair enough, cheers. I've been keeping notes on and off. I may be able to turn them to use at some point.


Hey Steve, I was referring to the official Rust Survey in the podcast interview. I'll track down a specific link, but the stat comes from one of the PDFs rather than the analysis blog posts. I'm having difficulty finding it currently.


It’s all good, I know where that is, thanks!


1) HN is maybe one organism if you zoom out enough, but it consists of people with wildly different opinions; you'll have capitalists arguing with anarchists here, and any post is bound to have both sides, no sides and every side, all on the same page.

2) It's easier to complain about stuff, somehow. Not sure why, or if it's extra prominent on HN in particular, but people tend to start thinking "Why am I against this thing?" and then write their thoughts, rather than "Why do I like this thing?". Maybe writing something that can be challenged creates more engagement, and people like when others engage with them, so they implicitly learn to be that way.


I think pessimistic and cynical reactions are the literal lifeblood of news aggregator comments sections. It's been like this for as long as I can remember across as many aggregators as I've ever used.

Part of the problem is that news aggregators reward people who comment early, and the earliest comments are the kneejerk reactions where you braindump thoughts you've had brewing but don't have anywhere to put. (Probably without actually clicking through.)


Another part of it is just psychology. People seem much more inclined to join discourse to make objections than to pile on affirmative comments, which generally an upvote suffices for.


It's also partly the site's culture. Not saying it's wrong, because it adds some noise and not much new info, but I've been downvoted before for posting comments like "Thanks for saying this!"


I agree in general except HN also rewards quality a bit more. All new comments get a few minutes to bask at the top of their subthreads. So a really good, late comment can still get to the top and stay there.


To a degree, the comment ranking algorithm helps, though long/fast threads do often leave brand new replies buried upon posting.

Still, I believe what makes HN unusually nice is just the stellar moderation. It is definitely imperfect, but it creates a nice atmosphere that I think ultimately does encourage people to try to be civil, even though places like these definitely have a tendency to bring out the worst in people. Having a deft touch with moderation is very hard nowadays, especially with increasingly difficult demands put against moderators and absolutely every single possible subject matter turning into a miniature culture war (how in the hell do you turn the discussion of gas ranges vs electric ranges into a culture war?!) and the unmoderated hellscapes of the Internet wrongly painting all lightweight moderation with a black mark.

I definitely fear for the future of communities like HN, because the pressure from increasingly vile malicious actors as well as the counter-active pressure from others to moderate harder, stronger, faster will eventually break the sustainability of this sort of community. When I first joined HN, a lot of communities on the Internet felt like this. Now, I know of very few.


Some problems are good problems to have. If different groups of people are taking the time to substantively and constructively complain about different shortcomings of Rust and its ecosystem, it shows that there's a wide enough audience of serious users out there. My take is that the Rust devs are doing something right, even if it seems like they can't win.


That's what winning looks like.



