This isn’t the way to speed up Rust compile times (xeiaso.net)
128 points by cpach on Aug 26, 2023 | 155 comments



On topic: I agree that upstream-provided 3rd party binaries are not a good solution to this particular problem. That is the type of solution that works in a closed environment like a corporate CI system, but it should not be the default.

Off topic: I don't understand why the article repeatedly says the Rust compiler or procedural macros are already fast, even "plenty fast". Aren't they slower or about as slow as C++, which is notorious for being frustratingly slow, especially for local, non-distributed builds?

"Plenty fast" would mean fast enough for some simple use-case. e.g., "1G ethernet is plenty fast for a home network, but data centers can definitely benefit from faster links". i.e., fast enough you won't notice anything faster when doing your average user's daily activities, but certainly not fast enough for all users or activities.

By that standard, the Rust compiler is in no way "plenty fast". It has many benefits, and lots of hard work has gone into it, even optimizing it, but everyone would notice and benefit from it being any amount faster.


> Aren't they slower or about as slow as C++, which is notorious for being frustratingly slow, especially for local, non-distributed builds?

Yes. Significantly slower. The last rust crate I pulled [0] took as long to build as the unreal engine project I work on.

[0] https://github.com/getsentry/symbolicator/


Ehhhhhhhh. Are you talking full build or incremental? How long did it take?

Clean and rebuild of Unreal Engine on my 32-core Threadripper takes about 15 minutes. An incremental change to a .cpp takes… it varies, but probably on the order of 30 seconds. Their live coding feature is super slick.

I just cloned, downloaded dependencies, and fully built Symbolicator in 3 minutes 15 seconds. A quick incremental change and build took 45 seconds.

My impression is the Rust time was all spent linking. Some big company desperately needs to spend the time to port the Mold linker to Windows. Supposedly Microsoft is working on a faster linker, but I think it’d be better to just port Mold.


My 32 core threadripper builds ue5 in 12 minutes on Windows. Single file changes on our game are usually under 10 seconds due to unity builds, and a good precompiled header.

My first clone of symbolicator took the same length of time on my Windows machine. Even with your numbers, 4 minutes to build what is not a particularly large project is bonkers.


Sounds like we’re on basically the same page.

My experience across a wide range of C++ projects and wide range of Rust projects is that they’re roughly comparable in terms of compilation speed. Rust macros can do bad things very quickly. Same as C++ templates.

Meanwhile I have some CUDA targets that take over 5 minutes to compile a single file.

I feel like if Rust got a super fast incremental linker it’d be in a pretty decent place.


Is using a 32-core CPU as a baseline for speed comparisons some kind of satire in this thread?


No, threadrippers[0] are commonly used by people working in large compiled codebases. They're expensive, sure, but not incomparable to a top of the range Macbook Pro.

[0] https://www.cpubenchmark.net/cpu.php?cpu=AMD+Ryzen+Threadrip...


> No, threadrippers[0] are commonly used by people working in large compiled codebases.

I don't think that's true at all, unless you're using a very personal definition of "common".

In the real world, teams use compiler cache systems like ccache and distributed compilers like distcc to share the load through cheap clusters of COTS hardware or even vCPUs. But even that isn't "common".

Once you add CICD pipelines, you recognize that your claim doesn't hold water.


I know, I have one and it cost the company about 3x as much as a 16" MacBook Pro. An expense that's very unaffordable for most companies, not to mention most developers.

(Even most MBPs are unaffordable for a large set of developers.)

I don't think it's as accessible to average C++ or Rust developers as you expect.


My 3970x workstation, all in, cost about £3200.

I've also got a 14" Macbook Pro (personal machine) that was a _little_ cheaper - it was £2700.

> An expense that's very unaffordable for most companies

I think it's unaffordable for some companies, but not most. If your company is paying you $60k, they can afford $3500 once every 5 years on hardware.

> I don't think it's as accessible to average C++ or Rust developers as you expect.

I never said they were accessible, just that they are widespread (as is clear from the people in this thread who have the same hardware as I do).

FWIW, I was involved in choosing the hardware for our team. We initially went with Threadrippers for engineers, but we found that in practice, a 5950x (we now use 7950x's) is _slightly_ slower for full rebuilds but _much_ faster for incremental builds which we do most of.


It’s definitely not a baseline. It’s simply what I have in front of me.

Lenovo P620 is a somewhat common machine for large studios doing Unreal development. And it just so happens that, apparently, lots of people in this thread all work somewhere that provides one.

I don’t think the story changes much for more affordable hardware.


It kind of does, given that the C and C++ culture depends heavily on binary libs (hence the usual ABI drama). On more affordable hardware, building everything from source versus using binary libraries makes a huge difference, so C++ builds end up being quite fast unless they abuse templates (without extern template on libs).


The project itself may not be large, but if it's pulling in 517 dependencies, that's a lot to compile.

Rust also does significantly more for you at compile time than C++ does, so I don't mind the compiler taking some extra time to do it.


> The project itself may not be large, but if it's pulling in 517 dependencies, that's a lot to compile.

Why do you need to build hundreds of dependencies if you're not touching them?


Why do you assume they're not touching them?


Only if not taking metaprogramming and constexpr into account.


FYI Unity builds are often slower than incremental builds with proper caching.


What can I use to cache with MSVC that isn't Incredibuild? (a per-core license model isn't suitable - we'd spend more on incredibuild licenses every year than we do on hardware)

Also, I've spent a _lot_ of time with Unreal and the build system. Unreal uses an "adaptive unity" that pulls changed files out of what's compiled every time. Our incremental single file builds are sub-10-seconds most of the time.


> What can I use to cache with MSVC that isn't Incredibuild?

Ccache works, but if you use the Visual Studio C++ compiler you need to configure your build to be cacheable.

https://github.com/ccache/ccache/wiki/MS-Visual-Studio


Lack of Precompiled Header support kills this for us immediately. (We also currently use the unsupported method of debug info generation which we could change). A local build cache is no better than UnrealBuildTool's detection though.


> Lack of Precompiled Header support kills this for us immediately.

Out of curiosity, why do you use precompiled headers? I mean, the standard use case is to improve build times, and a compiler cache already does that and leads to greater gains. Are you using precompiled headers for some other use case?


Improved build times is _the_ reason.

> and a compiler cache already does that and leads to greater gains

Can you back that claim up? I've not benchmarked it (and I'm not making a claim either way, you are), but a build cache isn't going to be faster than an incremental build with ninja (for example), and I can use precompiled headers for our common headers to further speed up my incrementals.

You did encourage me to go back and look at sccache though; they have fixed the issues I'd reported with MSVC, and I'm going to give it a try this week.


Mold is ELF only. They'd have to rewrite it anyway. I don't see much point in a Windows port


It’s tentatively planned for 3.0. https://github.com/bluewhalesystems/sold/issues/8

The file formats are indeed totally different. But the operation of linking is the same at a high-level.


Huh, interesting


If they ever plan on supporting UNIXes like AIX, mainframes, or micros, ELF isn't there either.


Additionally, on Windows, when Rust is compiling, Microsoft's linker always launches the MS telemetry process vctip.exe, which establishes an internet connection :(

If anyone knows a way to avoid that launch (besides blocking the connection with a firewall after it starts), please share it.


I was curious about a couple numbers. Looks like symbolicator has about 20k lines of rust code and 517 crates it depends on (directly and indirectly).


> On topic: I agree that 3rd party binaries are not a good solution to this particular problem. That is the type of solution that works in a closed environment like a corporate CI system, but it should not be the default.

Why?

I mean, OS distributions making available whole catalogs of prebuilt binaries is a time-tested solution that's been in place for decades. What leads you to believe that this is suddenly undesirable?


I think there's a very obvious distinction between the kind of "3rd party" that is your chosen OS distribution and the kind of 3rd party that is "some person on github or crates.io", and I didn't feel the need to clarify this as I thought it was obvious.

I've clarified it to "upstream-provided 3rd party" in the original post.


It's a 3rd party package that rustc and cargo already internally depend on themselves. If you're okay getting those distributed to you as binaries, this package too may as well come with a precompiled binary.


Either way you're running their code, I don't understand the outrage here, frankly. People who look at the code they're running and the changes they're pulling in of course noticed this, but I seriously doubt even 1% of the rageposters are in that set of rust users.


I'm not outraged. I dislike it, and it would be a "con" when considering any use of the project.

Binaries are relatively inscrutable, and there are massive differences in ease of doing something nefarious or careless with OS binaries, and upstream checked in source code, and upstream checked in binaries.

That's just one or two reasons why I think this approach is not at all commonplace in open settings, and is generally limited to closed and controlled settings like corporate environments.

A more typical and reasonable approach would be having a dependency on such a binary that you get however you like: OS, build from upstream source, upstream binary, etc., and where the default is not the latter.

And now if you allow me to get a bit off-topic again: to me, having to "innovate" like this is a major sign that this project knows it is very slow, and knows it's unlikely to ever not be very slow.


> I think there's a very obvious distinction between the kind of "3rd party" that is your chosen OS distribution and the kind of 3rd party that is "some person on github or crates.io"

That tells you more about crates.io than consuming prebuilt binaries.

Some Linux distros ship both source code packages and prebuilt binaries. This has not been a problem for the past two decades. Why is it suddenly a problem with Rust's ecosystem?


crates.io is just a platform for people to make their Rust code distributable; it's not curated in any way really (other than removing obvious malware). An OS distribution is explicitly curated.


I never saw anyone complain that crates.io isn't curated. Why is this suddenly an issue when discussing shipping prebuilt binaries? If it's somehow ok to download the source code as-is, why is it a problem if crates.io builds it for you?

Some problems have solutions, but people need to seek solutions instead of collecting problems.


I'm not complaining that it's not curated, specifically because it's exclusively a source code package directory. Auditing source code for safety is hard, doing so with a binary is much harder.


I wouldn't mind if rust had reproducible builds, and the binaries had to be built+signed by both the original author and crates.io. But how the article describes it seems sketchy


Because it demonstrates a failure of the Rust ecosystem.

Proc macros have effectively become fundamental to Rust. This is unfortunate, but, given that it is reality, they should get compiler support. Part of what is making them so slow is that they need to process a firehose of information through a drinking straw. Placing proc macros into the compiler would not only speed them up but also remove the 50+ crate dependency chain that gets pulled in.

Serde, while slow, has a more fundamental Rust problem. Serde demonstrates that Rust needs some more abstractions that it currently does not have. The "orphan rule" means that you, dear user, cannot take my crate and Serde and compose them without a large amount of copypasta boilerplate. A better composition story would speed up the compilation of Serde as it would have to do far less work.
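
To make the boilerplate concrete, here's a minimal sketch of the newtype dance the orphan rule forces when neither crate is yours (the foreign `other_crate::Timestamp` type and its `unix_seconds()` method are hypothetical):

    // Hypothetical situation: `other_crate::Timestamp` comes from someone
    // else's crate and does not implement serde::Serialize. The orphan rule
    // forbids `impl Serialize for other_crate::Timestamp` in my crate, so I
    // have to wrap it in a local newtype and forward everything by hand.
    use serde::{Serialize, Serializer};

    pub struct TimestampWrapper(pub other_crate::Timestamp);

    impl Serialize for TimestampWrapper {
        fn serialize<S: Serializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {
            // Copypasta boilerplate: convert and delegate by hand.
            serializer.serialize_i64(self.0.unix_seconds())
        }
    }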

Rust has fundamental problems to solve, and lately only seems capable of stopgaps instead of progress. This is unfortunate as it will wind up ceding the field to something like Carbon.


> Rust has fundamental problems to solve

The orphan rule isn't a problem that needs solving, it is itself a solution to a problem. There's nothing stopping Rust from removing the orphan rule right this minute and introducing bold new problems regarding trait implementation coherence.


> The orphan rule isn't a problem that needs solving, it is itself a solution to a problem.

It is both.

The Orphan Rule was indeed chosen as a solution because other choices were dramatically more difficult to implement.

However, the Orphan Rule also has tradeoffs that make certain things problematic.

Engineering is tradeoffs. There are only two types of programming languages: those you bitch about and those you don't use. YMMV. etc.


Edward Kmett has a great explanation[1] of the considerations around this space. It's not that alternatives to the orphan rule are hard to implement, it's that they lead to things being much harder to reason about for the programmer.

[1]: https://youtube.com/watch?v=hIZxTQP1ifo


proc macros are user written code, how can you "place them into the compiler"?

What does the orphan rule have to do with the compilation speed of serde? How does it reduce the work it needs to do?


You have to pulverize the macro into syntax atoms and then reassemble it piece by piece into an AST that the compiler needs. All the processing to do all of that should be a part of the compiler--not a smattering of 50 crates (syn being the most oppressive one) and a weak compiler API.
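
For context, a minimal derive-macro skeleton showing the round trip being described; `MyTrait` and `describe()` are hypothetical, while the syn/quote usage is the standard pattern:

    use proc_macro::TokenStream;
    use quote::quote;
    use syn::{parse_macro_input, DeriveInput};

    #[proc_macro_derive(MyTrait)]
    pub fn derive_my_trait(input: TokenStream) -> TokenStream {
        // rustc hands the macro a raw token stream; syn re-parses it into an AST.
        let ast = parse_macro_input!(input as DeriveInput);
        let name = &ast.ident;
        // quote! builds a brand-new token stream, which rustc then parses again.
        let expanded = quote! {
            impl MyTrait for #name {
                fn describe() -> &'static str { stringify!(#name) }
            }
        };
        expanded.into()
    }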

Serde has to enumerate the universe even for types that wind up not used. Those types then need to be discarded--which is problematic if it takes a non-trivial amount of time to enumerate or construct them (say: by using proc macros). Part of this is the specification that must be done because Rust can't do compile-time reflection/introspection, which in turn is partly blocked by the way Rust resolves types--different versions of crates wind up with different types even if the structures are structurally identical, and the language relies on things like the Orphan Rule instead.

I am not saying that any of these things are easy. They are in fact hard. However, without solving them, Rust compilation times will always be garbage.

I'm beginning to think that the folks who claimed that compiler speed is the single most important design criterion were right. It seems like you can't go back and retrofit "faster compile times" afterward.


> You have to pulverize the macro into syntax atoms and then reassemble it piece by piece into an AST that the compiler needs. All the processing to do all of that should be a part of the compiler--not a smattering of 50 crates (syn being the most oppressive one) and a weak compiler API.

First of all it's not 50 crates, it's 3 main crates syn, quote and proc-macro2 and a few smaller helper crates which are used less often.

And the reason to not include them in the compiler is exactly the same reason why "crate X" is not included in std. Stuff in std must be maintained forever without breaking changes. Just recently syn was bumped to version 2 with breaking changes which would have not been possible if it was part of the compiler.

In any case, shipping precompiled proc macro WASM binaries would solve this problem.

> Serde has to enumerate the universe even for types that wind up not used. Those types then need to be discarded--which is problematic if it takes a non-trivial amount of time to enumerate or construct them (say: by using proc macros). Part of this is the specification that must be done because Rust can't do compile-time reflection/introspection...

Yes, it's a design tradeoff, in return Rust has no post-monomorphization errors outside of compile time function evaluation. I think it's worth it.
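
A small illustration of that tradeoff: generic code is checked against its declared trait bounds at the definition site, so instantiating it later can't surface new type errors (minimal sketch of the bound doing that work):

    use std::ops::Add;

    // Without the `Add` bound, this definition is rejected on the spot;
    // with it, every instantiation that satisfies the bound is already
    // known to type-check, so monomorphization can't spring new errors.
    fn sum<T: Add<Output = T>>(a: T, b: T) -> T {
        a + b
    }

    fn main() {
        println!("{}", sum(1, 2));
        println!("{}", sum(1.5, 2.5));
    }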

> I'm beginning to think that the folks who claimed that compiler speed is the single most important design criterion were right.

I don't agree. As long as the compile times are reasonable (and I consider Rust compile times to be more than reasonable) other aspects of the language are more important. Compile time is a tradeoff and should be balanced against other features carefully, it's not the end goal of a language.


> First of all it's not 50 crates, it's 3 main crates syn, quote and proc-macro2 and a few smaller helper crates which are used less often.

That may be technically true. However, every time I pull in something that uses proc macros it seems my dependency count goes flying through the roof and my compile times go right out the door.

Maybe people aren't using proc macros properly. That's certainly possible. But, that also speaks to the ecosystem, as well.


> seems my dependency count goes flying through the roof and my compile times go right out the door.

I have two dependencies (memchr, urlencoding) and two dev dependencies (libmimic, criterion). I have 77 transitive dependencies, go figure. People like to outsource stuff to other libraries, when possible.


I strongly agree with all your criticisms and factual assertions, but I think all your predictions are likely to end up being wrong.


I hope you are correct about predictions because I really want to see the day that C++ gets a spike put through it because something better exists.

I really don't want to see something like Carbon take off. It's not really that much "better" than C++, and it will take up the oxygen from something that could be a genuine replacement.


Definitely, as C++ allows using binary libraries from the get go, instead of rebuilding the world after each checkout, or clean build on the CI/CD pipeline.

There are workarounds like sccache, but again additional tooling to take into account, across all target platforms.


I think there's a different kind of "plenty fast" that more closely aligns with "as fast as it can be." If you develop some code that runs at 90% of the theoretical throughput, I'd call that plenty fast, even if the theoretical wait time was 100s of seconds. The rust compiler does a lot of stuff that simpler languages don't do. I would accept that level of checking to take longer while still being considered "plenty fast". Making that argument would require some serious benchmarking of the rust compiler though.


The Pre-RFC that came out of this situation is pretty interesting: https://internals.rust-lang.org/t/pre-rfc-sandboxed-determin...

Some points I hadn't considered:

- Current compile times aren't too bad, but compile times are a factor in how macros are developed. Making that less of an issue will likely make the macro ecosystem richer and more ambitious.

- Precompiled binaries can be compiled with optimizations, which anecdotally overcome the potential perf hit from running in WASM

- Nondeterminism in macros (especially randomness) has bitten a lot of people, and a more controlled execution environment is very beneficial

- Sandboxing doesn't have to be WASM, it's just the most capable, currently available option.

- Comments on the pre-RFC are exploring solving this in rustup/cargo instead of crates.io and that is also promising.

In general, I appreciate how this is creating awareness and driving exploration of solutions. In particular I really like the notion that there is currently implicit pressure on macro authors to make them somewhat quick to compile and releasing that pressure will enable innovation.


Please don't ship wasm binaries in cargo please don't ship wasm binaries in cargo please don't ship wasm binaries in cargo please don't ship wasm binaries in cargo please don-


Why not?

WASM with optimisations will mostly run faster than binaries without optimisations, which is the current strategy. Plus you won't need the compile step that's necessary at the moment.

Distributing arbitrary binaries right now is bad because it's not clear what those binaries contain, but the WASM environment is completely sandboxed, so it wouldn't be possible for them to read files they weren't meant to or phone home with telemetry.

The Cargo project already does some amount of automatic building e.g. of docs pages - this could be extended to proc macro crates to ensure that the uploaded binaries must be built from the uploaded source files. Alternatively, making sure that reproducible binaries are standard would help a lot from a trust perspective.

This isn't suitable for every situation, but as a tool to reduce the compile times of some of the bigger and slower macro crates, it sounds like a great option to have.


> However, this means that the most commonly used crate is shipping an arbitrary binary for production builds without any way to opt-out.

The nokogiri rubygem has been shipping a prebuilt windows dll for probably more than a decade. The sun has not imploded into a black hole at any point in the meantime.

> If we're going to be trusting some random guy's binaries, I think we are in the right to demand that it is byte-for-byte reproducible on commodity hardware without having to reverse-engineer the build process and figure out which nightly version of the compiler is being used to compile this binary blob that will be run everywhere.

I have some really, really fucking bad news about the O/S distros most of you all use...


One of the most used platforms nowadays is Java.

Java libraries are distributed as binary bytecode blobs. They can also include native libraries, which are usually extracted to /tmp and dynamically loaded into the process. The security of this approach is questionable, if you ask me, however that's the way things have been for a long time and that's the way things will be for a long time.

I think that striving for better security is a noble goal, but if it gets in the way of usability, security should step aside, given that the world around is not that secure anyway.


> that's the way things will be for a long time.

There are some changes coming where application developers will have to explicitly acknowledge that libraries they're using are using native code: https://openjdk.org/jeps/8307341 (the new, soon-to-be-stable FFM API already has this behavior).

It doesn't change the distribution mechanism of native code, but it's some sort of step towards better security.


> Java libraries are distributed as binary bytecode blobs.

I don't think it's fair to describe JARs this way. While it may be technically true, a JAR file is far less opaque than an actual binary executable.


In the sense that it's easier to decompile sure, but plenty of non-OSS java libraries run an obfuscator on their internals and in that case it's only marginally less opaque than machine code


The author uses NixOS.


Debian and Fedora have likewise both been pushing reproducible builds.


I used to use a FreeBSD desktop at work back in like 2002.

And that is a reasonable thing for those projects to try to achieve.

Still, we've been downloading binaries for decades and the world hasn't ended. This issue never warranted people treating it like nuclear war was imminent and hurling abuse into Dtolnay's GitHub issues like he was a child pornographer. There's a way to have a discussion about this being the wrong direction without doing that.

And I'd encourage everyone to take an honest audit of how many precompiled binaries are in their lives, even if they've switched to NixOS and how much stuff they trust downloading from the internet. Very few people achieve an RMS-level of purity in their technological lives.


The world hasn't ended, but I'd argue that the number of malicious actors within the space is rising, as are the consequences of being compromised. We're simply seeing more and more malware make its way into open source ecosystems, and I don't imagine the trend will be reversing. For this reason alone we should be striving to achieve full build reproducibility, though I agree with you that demonizing (or threatening) Dtolnay is wildly inappropriate.


Yeah, people have woken up to "supply chain" problems, particularly with javascript and npm which is an entire tirefire.

But we've gone from 0% to 100% overnight and as usual people have adopted it as their new religion and they want to burn all the heretics and there can be no compromise.

I seriously doubt that this one specific issue was all that important in the larger problem of securing the supply chain, and there was a very good reason why it was done (which has now been entirely thrown away, which will certainly harm adoption of rust). I don't think it was remotely comparable to the way all of npm is a security hole.


This is all true, but even if you don't care about reproducible builds that doesn't mean that nobody cares about reproducible builds. And the people that do care about such things care about them very much. To the point that if you don't offer it to them that they'll go somewhere else.


I didn't say not to try to get to reproducible builds, though, but not to treat Dtolnay like he's a criminal.

There is a very real problem that he was trying to solve, but the right way to do it will require core language support. And his heart is certainly in the right place, since "slow build times" is probably the #1 complaint about Rust.

If you actually polled most Rust users they probably care about their build times more than they care about this issue, particularly if you designed the survey impartially and just asked them to stack rank priorities. They probably all would prefer "both" being the right answer though.

Where the criticism really needs to be leveled is on the core language design (although I appreciate that their job is also extremely tough).

[And the one good thing to come out of this shitshow will probably be getting people focused on solving this issue the right way]


It's funny how opposed the 'Go' and 'Rust' ecosystems are: the 'Go' people optimized for compile speed from day 1 but have trouble making things safe, while the 'Rust' people optimized for safety from day 1 but have trouble making things fast.

Maybe they could learn a trick or two from each other?


I don’t program in Go, but this is the first I’ve heard of it being unsafe. I thought Go was a GC language implemented via the Go runtime and properly handles memory safety (memory corruption and OOB access). Are you using a different definition of safety here or is there something else that I’m not aware of?


"Maps are not safe for concurrent use: it’s not defined what happens when you read and write to them simultaneously."

https://go.dev/blog/maps


This is completely tangential to language “safety” as it’s commonly referred to when discussing languages like Rust and C or C++. Typically, I’ve only heard safety used to mean memory corruption, out of bounds memory access, and undefined behavior are not allowed, or throw runtime exceptions when they happen.

Java’s standard HashMap isn’t thread safe, it has ConcurrentHashMap[0] for this reason. And I’ve never heard somebody refer to Java as an unsafe language. Lemire wrote a blog article arguing Java is unsafe for similar reasons that it seems you’re using here, and I think the comments section has some great discussion about whether this conflation of language safety makes sense or not[1].

And one final point: a lot of the world lives on software segregated across several different machines, operating asynchronously on the same task, usually trying to retrieve some data in some central data store on yet another machine. Rust is great for locking down a single monolithic system, but even Rust can’t catch data races, deadlocks, and corruption that occurs when you’re dealing with a massively distributed system. See this article that talks about how Rust doesn’t prevent all data races, and why it’s probably infeasible to do that[2]. And I doubt anyone here is about to call Rust unsafe :)

This conflation of language safety often confuses me, and I hope as an industry we can disambiguate things like this. Memory safety is distinct from thread safety for a reason, and by all reasonable standards Go is as safe as a modern language is expected to be. (Obviously better thread safety is a goal to be lauded, but as an industry, it seems like we're still trying stuff out and seeing what sticks in regards to that.)

[0]: https://docs.oracle.com/javase/6/docs/api/java/util/concurre...

[1]: https://lemire.me/blog/2019/03/28/java-is-not-a-safe-languag...

[2]: https://doc.rust-lang.org/nomicon/races.html


> Memory safety is distinct from thread safety

From [2] you linked

> a race condition can't violate memory safety in a Rust program on its own.

From golang FAQ on maps not being atomic:

> uncontrolled map access can crash the program

This may be one very narrow case (or a more common theme), but at least in this case the thread safety becomes a memory safety concern in golang.
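
For contrast, a minimal Rust sketch of the same scenario: sharing a bare HashMap mutably across threads simply doesn't compile, and the Mutex version below is the one that builds:

    use std::collections::HashMap;
    use std::sync::Mutex;
    use std::thread;

    fn main() {
        // Spawning two threads that call `insert` on a bare `HashMap`
        // is rejected at compile time (two mutable borrows across threads).
        // Wrapping it in a Mutex makes the synchronization explicit.
        let map = Mutex::new(HashMap::new());
        thread::scope(|s| {
            s.spawn(|| { map.lock().unwrap().insert("a", 1); });
            s.spawn(|| { map.lock().unwrap().insert("b", 2); });
        });
        println!("{:?}", map.lock().unwrap());
    }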


If the maintainer didn't pledge themselves to providing reproducible builds, you can't truly expect them to.

Otherwise you get into stupid situations like: https://xkcd.com/1172/ (your optimization is killing children!)

In FOSS there is the principle of four Fs:

- Fix it

- Fork it

- Fund it

- Fuck it/off


"Most of you all", addressing the HN audience, not the author.


most distros are moving towards reproducible builds.


With a distro we're trusting a single trustworthy source for all the software, whereas with package managers for most programming languages we're trusting hundreds of individual developers, any one of which can suddenly decide to sell us out.


Distro maintainers also apply some level of audit prior to merging packages. I doubt it's a huge amount, but it's infinitely more than trusting a package maintainer directly.


The distro maintainers definitely had been in contact with the maintainer about the binary change, and I'm very thankful for them to have done that.


Sorry to clarify I’m talking generically me installing something from cargo, pip or npm, rather than any specific package.

I personally don't hold much value in 'reproducible builds' as being some silver bullet - back doors can almost as easily be hidden in source code, particularly if nobody gives it a once-over read; the same goes for compiled code, only it normally requires a sandbox and some R.E.


I disagree, it is easier to hide back doors in binaries than in source code. "Some RE" is way harder to do at scale than looking at diffs of source code. But yes, it's also possible to put back doors into source code as well.

Still, I think in the open source world, the primary distribution method, at least the one emanating from the original developer, should be in the form of source code, to ensure that the original developer can't include special closed source features. Also, for end users it's easier when the source code is available: they can rebuild things using a standardized methodology, since all the distro packages have a bunch of commands you can run to rebuild the distro package.


I think you're probably right, but it would depend on the language - something statically typed and well defined like Rust would be much easier to analyse than something like PHP, which could construct and execute a backdoor dynamically in a very non-obvious way. All this depends on code readability and simplicity; software which is a complete mess would probably need a sandbox to have at least some level of assurance (I'd prefer to see unreadable apps outright rejected from use, but that opinion won't fly in the face of the business world).

So in summary, I concede you are right in most, but not all cases.


Many distros are made up of hundreds of individual developers, not a single trustworthy source.


If you skipped over the header, like I did originally, it bears repeating: This change was reverted a couple days ago.


Note that the pre-compiled binary blob the blog post is referring to has since been removed [0].

[0]: https://github.com/serde-rs/serde/pull/2590


I thought I made that pretty clear in my article. Do I need to bring back the blink tag or something to make it more obvious in the future?


I missed it because the header to me looks like

> Something about ads

> Title

> Something I skipped

> Something about patreon

I dismissed it as "yet another part of the headers" and not as "part of the article".

I mean, to be fair, I was already aware it was removed from serde, but I'm pretty sure I didn't notice that box in the article the first time I read through it.


The top of the article looks like a screenshot of a chat, where someone said something, then some text and then the same person said something. I basically skipped that whole part to look for the article beginning, and when I didn't find the title, I skimmed back up.

The title is kind of hidden as it is right now, and the "notice" on the top doesn't look like a notice. Maybe try making it more different than other elements on the website, and put it below the article title rather than above?


I indeed missed the header, as other readers here did. I'd advise against using the characters' bubbles to provide important heads-ups, as they're often used for additional insights or humorous opinions, and so can be easily missed by quick readers.


Classic example of banner blindness https://en.m.wikipedia.org/wiki/Banner_blindness


I always preferred marquee


Swift's `Codable`[0] system seems very similar to serde in this regard. Structs that you define can be marked with the Codable protocol, and the Swift compiler automatically generates code during compilation that encodes from and decodes to the struct's properties, with the option for you to customize it using CodingKeys for different property names or completely custom coding behavior.

It seems built-in to Swift, as opposed to a dynamically executed crate like with serde. I wonder how it's implemented in Swift and if it leads to any significant slowdowns.

[0] https://developer.apple.com/documentation/foundation/archive...
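
For readers who haven't used serde, the Rust side of that comparison looks roughly like this (a minimal sketch using serde_json; the struct and field names are made up):

    use serde::{Deserialize, Serialize};

    // The derive macro generates the (de)serialization code at compile
    // time, analogous to the compiler-synthesized Codable conformance.
    #[derive(Serialize, Deserialize)]
    struct User {
        // Field renaming plays a role similar to Swift's CodingKeys.
        #[serde(rename = "user_name")]
        name: String,
        age: u32,
    }

    fn main() {
        let u = User { name: "ada".into(), age: 36 };
        let json = serde_json::to_string(&u).unwrap();
        println!("{json}");
    }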


Codable (along with other derived conformances like Equatable, Hashable, and RawRepresentable) is indeed built in to the compiler[0], but unlike Serde, it operates during type-checking on a fully-constructed AST (with access to type information), manipulating the AST to insert code. Because it operates at a later stage of compilation and at a much higher level (with access to type information), the work necessary is significantly less. (It's difficult to compare Serde and Codable in a way that isn't apples-and-oranges, but at the very least, Codable makes use of information that the compiler has done, and has already needed to do anyway; there's very little manipulation it needs to do to achieve its aims.)

With ongoing work for Swift macros, it may eventually be possible to rip this code out of the compiler and rewrite it as a macro, though it would need to be a semantic macro[1] rather a syntactic one, which isn't currently possible in Swift[2].

[0] https://github.com/apple/swift/blob/main/lib/Sema/DerivedCon...

[1] https://gist.github.com/DougGregor/4f3ba5f4eadac474ae62eae83...

[2] https://forums.swift.org/t/why-arent-macros-given-type-infor...


Last I checked[1], Swift JSON performance was terrible, well over an order of magnitude slower than serde. It's possible it's been improved since then but if so I didn't find any announcement.

[1]: https://github.com/xi-editor/xi-mac/issues/102


The problem wasn't Swift Codable, the problem was Apple's implementation of JSONEncoder / Decoder on top of Obj-C runtime, which created unnecessary round trips.

It was only fully addressed in Swift 5.9.


Both are problems, and it's far from fully addressed in Swift 5.9. The new Swift implementation of JSONEncoding is "only" 2-5x faster than the old one, so while it's a big improvement it's still very slow. Codable itself is also very performance-hostile, and a no-op encoder which fully walks the input but doesn't produce any output will be slower than serde producing json.


Can you go into more detail on why you think Codable is performance-hostile?


There's a few other options that I think in combination would make things much better.

1) Make using proc macros in your own application code faster: Encourage better macro re-export hygiene. Basically, one of the reasons serde + serde_derive is so slow is that serde re-exports serde_derive, so serde_derive has to finish compiling before serde itself can be built. A solution (I didn't come up with this myself) would be to have a crate that re-exports serde and serde_derive together (see the sketch after this list); see the top comment here: https://www.reddit.com/r/rust/comments/1602eah/associated_pr...

2) Make using libraries that use proc macros faster: Publish pre-expanded source code, such that no proc macros run for dependencies. Some work would have to be done around conditional compilation to make things just work (TM), but I think it could be done. (Who knows, maybe it'd be really hard to make the compiler deal with expanding proc macros on stuff behind conditional values. Also, there'd be issues regarding macro hygiene. Both solvable, but I'm not sure how much effort they'd take).
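
A hedged sketch of what such a re-export facade crate's lib.rs could look like (the facade crate itself is hypothetical; this just restates the suggestion from the linked thread):

    // lib.rs of a hypothetical facade crate that depends on both `serde`
    // (with the "derive" feature disabled) and `serde_derive`, and
    // re-exports them together. Downstream crates depend only on the
    // facade, so `serde` itself no longer has to wait for `serde_derive`
    // (and syn) to finish compiling; the two can build in parallel.
    pub use serde::*;
    pub use serde_derive::{Deserialize, Serialize};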


In case anyone missed the first line like I did: this has been resolved and it no longer works like this.


Caching needs to be brought to the ecosystem rather than point non-solutions. While sccache is a partial solution available now from the outside, the toolchain needs adjustment internally to deduplicate work.


I've been exploring local, cross-project caching for cargo, and my biggest concern is that users won't get enough cache hits. With maximal versions, and an arbitrary set of dependencies being updated when adding a new dependency, all projects on a system can end up in very different states.


That is the whole issue: sccache is more like a workaround versus a proper solution supported out of the box in cargo, like in Java, .NET, C++ (vcpkg and conan do bin libs), node, ...


> If we're going to be trusting some random guy's binaries, I think we are in the right to demand that it is byte-for-byte reproducible on commodity hardware

I don't think anyone has a right to demand anything of the project. The MIT license specifically has the whole "THIS SOFTWARE IS PROVIDED AS IS" spiel for a reason. Thinking that you can demand anything of an open source developer who, afaik, has no responsibilities in relationship towards you, is a rather toxic mindset that should be kept out of open source.


"No one has any right to complain because muh license agreement"

What a tiring argument. Why complain about anything? Why do anything differently? Maybe it's because people have invested time, money, and effort into supporting and using the project and because of your stupid decision they now have to do more work. There were a lot of comments about people showing up that monday to pin+vendor the old SerDe. Others planned to remove it entirely in favor of other libraries. Almost like actions have consequences even if your "license agreement" says otherwise.

> is a rather toxic mindset that should be kept out of open source.

I wish I lived in your echo chamber where everyone who uses your stuff is just totally, like, accepting of your garbage ideas. The precompiled binary idea was a classic garbage idea. Refusing to make it reproducible was just the corn on top of the cowpie. The author made a major screw up, doubled down, tripled down, and then finally gave in. The author has no idea what scale and scope of project this is used in. They also clearly did no evaluation on potential damage OR ask for feedback before moving it into mainline. For a library with 3M+ downloads this was the what third? Fourth? Classic ego-driven folly.

You know what else shouldn't exist in open source? Ego tripping morons.


I'm not sure why you are attacking the guy personally and saying he lives in an echo chamber. All he did is say that you can't demand anything of the guy as he has not obligation to do anything. That's in the agreement that you agree to when using the software.

He's free to tank his own project and you are free to use it, fork it, or not. He's allowed to be an ego maniac if he wants or do things in ways you disagree with. You got what he made for free, so why you want to demand things of him is beyond me.


It was demonstrated in the original issue (now locked and cleaned) that his precompiled binary saved almost no time.

I don't care what he does with his library. You're right. At the same time, you can't "do whatever you want" when you have a 3M+ download library, even if you do own it. It would be like suddenly deciding an entire metropolis is wrong and you're right. You need A LOT of evidence to do that. There are practical limitations to your personal freedom when so many people depend on you. Even if you want to believe that isn't the case, how many potential sponsors, contributors, etc. do you alienate with such a stupid idea in a pool of 3M? Even if it's 1%, that's 30,000 people who now will do absolutely nothing to help you.


Yeah maybe but he still has the right to do it and you have the right to not like it.


I'd say that the core of your point is legally correct, but you are ethically arguing over what is, at best, a grey area. I mean, if the author is explicitly (by you) "allowed to be an ego maniac", then why isn't the user you are replying to allowed to judge them for it? The options available here are not merely to use it or to fork it: people being afforded the same freedoms you believe this maintainer should have also have the option to call them out over the result, no?


I'm not here to argue the morality of it. I'm just saying it's silly to complain about something and demand something of him. I mean I can demand a super model will give me a date but that doesn't mean I'm entitled to it or that it's going to happen. It's pretty much pointless.

The difference between this guy and the maintainer is that the maintainer did this guy a huge favor by writing this software for him, whereas this guy did nothing except complain about the free meal he's getting.


Do we as humans owe anything at all to each other outside of a capitalistic transaction-based framework?

I think the burden of open source maintenance often goes too far in the other direction, especially when corporations demand free labor, but I think it's reasonable to expect maintainers of core libraries to not actively cause harm.


What harm did he cause? He did something the way he wanted to. People didn't like it. You seem to think that the maintainer owes you something. Sure he can't do something illegal but other than that it's all available to him. But he also has to live with any repercussions of his actions.


Beyond the reputational harm to the Rust community, he actively engaged in attention-seeking behavior, something that is against the Rust Code of Conduct. (He's far too intelligent to not know that it was attention-seeking behavior.)


The Rust Code of Conduct doesn't apply to him at all. That is something that the Rust maintainers are following.

I can't speak to his motivations, but nothing that you've said seems to be against any law or any sort of agreement he's made. This is his software, he can do with it what he wants. If you don't like it just don't use it.

You can't police him; you can complain about him, but you seem to be making up violations he has committed, rather than pointing to any actual agreement he has entered into.


Huh? He's a Rust project member and part of the Rust library API team: https://www.rust-lang.org/governance/teams/library. As such, the Rust Code of Conduct applies to him.


You are right he is. So then they can kick him off if they think he's in violation. This isn't some solemn oath he's taken.


> Maybe it's because people have invested time, money, and effort into supporting and using the project and because of your stupid decision they now have to do more work.

Less time and money than rewriting the thing from scratch.


The demands are provided "as is," too, so you have no right to complain about them.


That’s not true. I can, for example, demand that an open source project not bundle malware. The maintainer may not listen to me but I reserve the right to demand it and I don’t see how this is toxic at all.


This article's style is very hard to read.


I like the Go style better. I probably will not ever care about more than one or two encodings and being explicit is an upside in that regard. I do however really care about compile times. If performance really matters, I would not go for a generic serialization approach in the first place.


Apples and oranges. Go doesn't care about release or dev builds. Nothing about it applies here.

Rust needs specific toolchain improvements to cache partial compilations for reuse. sccache can cache some results, but it's not a comprehensive solution.

What specific proposal would replace serde? I'd like to know.


very different format to the blog with different characters chiming in. interesting!


Totally unrelated.

But the generated picture at the beginning of the article, the one with the Space Needle, rubs me the wrong way, as the real landscape doesn't look like that. It's like seeing the Golden Gate Bridge in New York.


Author here. Yeah, the model I use (Anything v3) has a really weird way of generating the Space Needle in particular. I'm originally from Seattle and I really miss the pacific northwest, so I like to put the Space Needle in my AI generations because of how much it can stick out or clash with the rest of the image.

In the future I'm gonna use more of my own photography like this: https://pony.social/@cadey/110956174300162810, I'm just building up a library of viable photos.


What's the story behind "-amputee" in the arguments? Were such images so common that it was necessary to exclude them!?


If you don't include it, sometimes you get otherwise perfect images that have only half of a leg in a really distracting way. I usually prune negative arguments (my usual prompts are close to a kilobyte long including negative prompt elements), but I thought it would be amusing to leave it in this time.


Huh. You inspired me to finally get around to installing a local Midjourney-like[0], and, yeah, what do you know - the first result I got from those prompts _without_ the `-amputee` negative prompt was this horrifying (though SFW, despite what Imgur claims) monstrosity[1]

[0] https://github.com/AUTOMATIC1111/stable-diffusion-webui

[1] https://imgur.com/a/bMp5aAB


Yeah, the anime models seem to produce more coherent outputs without amputee in the negative prompt, but I just always put it in there anyways.


So in this case it did work as intended :)


I know a lot of people feel very strongly about this thing, and thankfully Dtolnay reversed course, but if there is anyone out there with a social media mob mentality, please don't harass him. A lot of the packages I use have his name associated with them, and if we get the classic "I'm leaving open source because of the toxicity" and he is the author, the Rust community will be screwed.


Let's see if the Rust community learned anything from the actix-web fiasco.


Rage blogging is never by contributors


[flagged]


Please don't do this here.


The issue is the reliance on macros in the first place, which half the time are just a shitty substitute for overloading, variadics and inheritance that can’t be nested, doesn’t have editor highlighting, and takes longer to compile. Macros should be a last resort and not a first resort, “rust macros are better” isn’t a good counter argument. Rust’s big government philosophy of forcing design choices onto the programmer yet again comes back to bite it.


Please describe how you would implement the functionality of the serde::Serialize macro with overloading, variadics, and inheritance.


I’m referring to things like println, which isn’t a macro in any other language. If you want to stick with the serde example, then look at the diy name mangling in the deserializer trait:

https://docs.rs/serde/latest/serde/trait.Deserializer.html

Or the copy and pasting in Axum:

https://github.com/tokio-rs/axum/blob/24f0f3eae8054c7a495cd3...


I'm failing to see how this is relevant then. You say the issue is with macros in the first place, but then it turns out you're actually referring to other macros and have no alternative to the macro that actually created the problem. What am I missing in your reasoning?


The issue is with macros in general, and the problem is present with macros in general, even if there’s no better solution in this specific case. If someone is willing to take such drastic measures to combat the downsides of macros when they are needed, what would they be willing to do when they’re forced to use macros when they shouldn’t be needed?


The drastic measure is because of the large amount of code serde generates. The features you describe, by contrast, do not involve much code from a macro standpoint and such macros would not benefit from precompilation.

If you're complaining about something that TFA is not an example of, it's a good idea to at least provide something that is an example of it, because otherwise you're just yelling at clouds.


Println is also much more complicated and expensive at runtime in "any other language"


Are you sure? Java's println is pretty well optimized especially with the recent string builder optimizations.


Except for C++.


You mean the language which, in every major implementation, that particular function is a macro?


None of printf, iostreams, or std::format are macros in any implementation I'm aware of.


va_arg, which drives printf, is what I was referring to, and is generally a macro, and printf is what I was assuming alphanullmeric was referring to. iostreams defeat the point, since Rust is just as capable of the functionality and indeed the exact syntax, and std::format uses the properties of templates that are exactly like macros in both benefits and drawbacks, templates being an unusual cross between macros and generics.



[flagged]


"Please don't pick the most provocative thing in an article or post to complain about in the thread. Find something interesting to respond to instead."

"Please don't complain about tangential annoyances—e.g. article or website formats, name collisions, or back-button breakage. They're too common to be interesting."

https://news.ycombinator.com/newsguidelines.html


What ad are you talking about?


If you have an ad-blocker enabled, the first paragraph of the page asks you to turn it off or donate. I have no problem with creators being compensated for their work, or asking for that compensation, but I found it confusing.


uBlock Origin also blocks the adblock detection, so I saw neither the ads nor the paragraph that talks about disabling the blocker.


>If we're going to be trusting some random guy's binaries, I think we are in the right to demand that it is byte-for-byte reproducible on commodity hardware without having to reverse-engineer the build process and figure out which nightly version of the compiler is being used to compile this binary blob that will be run everywhere.

David Tolnay is not a random guy.


Then extrapolate "random guy trying to follow dtolnay's lead."


Well then complain about that, not this.

And anyway it doesn't really matter. As soon as you use anyone's crates you're more or less completely trusting them. It's not difficult to hide malware in a Rust crate even if you don't ship a binary.

And... come on. David Tolnay came up with Watt. He's clearly not intending to ship a binary forever - the long term solution is WASM.

This author comes across as an annoying naysayer - everything is impossible, even things that have already been done like WASI.


wasm is also a binary format, and I wouldn't like it to be included into the .crate file either. The .crate files should ideally only include source code, to preserve the source first nature of crates.io (even if it's never been officially confirmed).

Rustdoc is also automatically built by docs.rs and nobody distributes it in their .crate file. I think the same should be done for wasm proc macros, too: they should be built by public infrastructure, and then people can opt into using binaries provided by that infrastructure to do their development, and if they want also opt towards using native binaries instead of wasm. But the binaries, including wasm, should only be a cache.


> I think the same should be done for wasm proc macros, too: they should be built by public infrastructure, and then people can opt into using binaries provided by that infrastructure to do their development

That's obviously how it would work. Read dtonlay's proposal. Crates.io would compile the WASM.


Only as a verification step though, not to obtain the actual binary, which will be included in the .crate package, where it IMO should not be put.


/shrug He is to me.



