
> No problem, we thought and instructed the Rust compiler to disable its own ThinLTO pass when compiling for the cross-language case and indeed everything was fine -- until the segmentation faults mysteriously returned a few weeks later even though ThinLTO was still disabled. [...] Since then ThinLTO is turned off for libstd by default.

Instead of fixing the crash, they landed a workaround.

> We learned that all LLVM versions involved really have to be a close match in order for things to work out. The Rust compiler's documentation now offers a compatibility table for the various versions of Rust and Clang.

It's cool they got it working, but it sounds like this is currently proof-of-concept quality and not very productionized yet. To me, the overall tone of the article sounds like they ran into a bunch of issues and opted for duct tape instead of systemic fixes. Which is fine to get things off the ground, of course! But I hope they take the time to go back and fix the underlying issues they ran into too.




This isn't anything new. Rust has had to land workarounds for lots of LLVM issues in its history. For example, Rust had to stop using noalias on function parameters because LLVM miscompiled too many functions with it, as Rust can use it way more than C/C++ do and therefore it didn't receive much upstream test coverage.
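For the unfamiliar, here's a contrived sketch of why Rust can emit the attribute so much more often than C/C++ (my own example, not from the thread):

    // In Rust, a &mut parameter is guaranteed not to alias anything else the
    // function can reach, so the compiler can mark essentially every such
    // parameter noalias for LLVM. A C compiler only gets the equivalent when
    // the programmer writes `restrict` by hand, which is rare.
    fn add_from(dst: &mut i32, src: &i32) {
        *dst += *src;
        *dst += *src; // with noalias on `dst`, LLVM may reuse the first load of *src
    }

    fn main() {
        let mut a = 1;
        let b = 2;
        add_from(&mut a, &b);
        println!("{}", a); // 5
    }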


Too bad LLVM doesn't have a first-class Fortran frontend; then noalias would actually work.


Rust could fix upstream issues it runs into, no?


They could, and they do.

That said, they don't have infinite time, and if, as in this case, the upstream fix would: (1) be pretty involved and (2) be very likely to get regressed because upstream doesn't have the capability to run tests that would prevent that (e.g. because upstream only runs C++ compilation tests and there is no way to exercise the relevant bugs via C++ code), then investing in fixing upstream may not be the right tradeoff.

In theory, one could first change upstream's test harness to allow Rust code, but that involves upstream tests depending on the Rust compiler frontend, which apart from being a technical problem is probably a political one.

Maybe it would have been possible to do upstream tests via bitcode source instead of Rust or C++; I don't know enough about LLVM to say offhand. But in either case this is not as easy as just "fix a simple upstream bug"...


Upstream tests are generally done at the LLVM IR level actually. It's mostly just a question of (1) time; (2) worries about ongoing maintenance work upstream; (3) a general feeling that perhaps such optimizations are best done on MIR anyway, because they'll be more effective there than they would be in LLVM.


You're suggesting that rustc should do noalias optimizations on MIR? I'm skeptical of that idea... A lot of duplicate loads that would benefit from being coalesced are only visible after LLVM inlining.
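A contrived example of the pattern I have in mind (names made up):

    // Each call to `first` loads the first element of the slice. Only after
    // LLVM inlines both calls do the two loads of xs[0] sit in one function,
    // separated by the store through `out`; knowing that `out` can't alias
    // the slice data (noalias) is what lets the second load be dropped. A
    // MIR pass running before that inlining wouldn't see the duplication.
    fn first(xs: &[i32]) -> i32 {
        xs[0]
    }

    fn sum_first_twice(xs: &[i32], out: &mut i32) {
        *out += first(xs);
        *out += first(xs);
    }

    fn main() {
        let v = vec![10, 20];
        let mut total = 0;
        sum_first_twice(&v, &mut total);
        println!("{}", total); // 20
    }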


Obviously MIR inlining needs to happen first (and I think it already does?). But to me it's clearly the right solution going forward. LLVM's noalias semantics are much weaker than what we can have on MIR, with full knowledge of transitive immutability/mutability properties.
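To make the "full knowledge" point concrete, here's a rough sketch (again my own example) of a guarantee that LLVM's per-parameter attribute can't really express:

    // Through a shared reference to a &mut, the pointee is frozen for the
    // whole call: nothing, not even the opaque callback, is allowed to write
    // to it. A MIR-level optimizer can rely on that; the LLVM parameter
    // attribute has no way to say it across the unknown call.
    fn read_around_call(x: &&mut i32, opaque: impl Fn()) -> i32 {
        let a = **x;
        opaque();
        let b = **x; // could be folded to `a` at the MIR level
        a + b
    }

    fn main() {
        let mut v = 21;
        let r = &mut v;
        println!("{}", read_around_call(&r, || ()));
    }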


'Classic' LLVM noalias as a function parameter attribute is weak, but the metadata version is much more flexible. I looked into it in the past and IIRC it's not a perfect match for Rust semantics, but close enough that rustc could still use it to emit much more fine-grained aliasing info; it just doesn't. But there was also a plan on LLVM's end to replace it with yet a third approach, as part of the grand noalias overhaul that's also supposed to fix the bug. Not sure if there's been any progress on that.

As far as I can tell, MIR inlining currently happens with -Z mir-opt-level=2 or higher, and that is not implied by -O. But I have no idea what future plans exist in that area.
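If anyone wants to poke at it themselves, this is roughly how I'd check (nightly-only flags, and they may have changed since):

    // inline_demo.rs -- dump the optimized MIR and see whether `double` got
    // inlined into `main`. Hypothetical invocation, nightly toolchain assumed:
    //   rustc +nightly -O -Z mir-opt-level=2 --emit=mir inline_demo.rs
    #[inline]
    fn double(x: u32) -> u32 {
        x * 2
    }

    fn main() {
        println!("{}", double(21));
    }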

I admit I have a bias here: it feels to me like everyone (not just Rust) is running away from doing things at LLVM IR level, and the resulting duplication seems inelegant. But on reflection, part of the reason I have that feeling is that I've recently been spending time fixing up Clang's control-flow graph functionality... which itself is 12 years old, so it's not a new trend at all!


Wouldn't MIR be more portable to the future as well? It builds on Rust's own equity and all that, since future Rust will probably still use MIR but could replace LLVM(?)


They wouldn't want to do MIR without upstreaming the Rust frontend, which I don't see happening anytime soon.


MIR should stay with rustc, and that's the point: to do the optimizations in rustc itself, not later once the code has been handed over to LLVM or another backend.


That's often done, but Rust is shipped on all major Linux distros built against the system LLVM, which is often at least six months old and sometimes years old, so the workarounds are needed anyway to work with those. LLVM fixes take a while to percolate back, and Rust supports LLVM versions up to ~2 years old (LTS Linux distros). The workarounds can only be removed once the versions without the fix are no longer supported.


They actually do sometimes AFAIK


Do you think it's planned to bring back noalias in rustc?

This, along with const generics and SIMD, would make Rust the perfect language for me.



The upstream fix seems to be blocked on DannyBee in fact…


???????

I stopped working on LLVM about 2 years ago (give or take), as I now have way too many reports to be able to do any effective patch or design review, or, honestly, keep up with the mailing list. I'm also too far divorced from the work being done.

(I unsubscribed late last year)

I specifically reviewed and approved the llvm.noalias patches before I stopped, which is why they are marked as accepted by hal, back in 2016.

More than that, I was one of the people who basically showed that !noalias/etc is fundamentally broken and can't be fixed.

Nothing should be blocked on me at this point, and my reviews account is deliberately disabled so that people can't assign/add me to things.

If something is blocked on me, hal certainly hasn't let me know :)


Understood. Sorry, it was not clear to me what was going on. In any case, the fix seems to be blocked on something, as it hasn't landed yet, and it's unclear what.


It's just how it works out in practice, in my experience. LLVM is a large, fast-moving target that's incredibly complex to understand, because it has a complex job. It resolves many issues for you when you're developing a compiler, but you are gifted issues in return. One of those is that understanding, diagnosing, and properly fixing problems can take a very large amount of work. It has bugs! I mean, LTO has historically been fragile for a single source language when enough code gets thrown at it, much less two languages!

Another is that compiler developers often don't have infinite amounts of time to sort out shenanigans like this when they come up. Users generally prefer the compiler to work, even if suboptimally, over not working at all, so there is some tension between things and how long they take. So workarounds of various kinds -- sending patches upstream, using custom builds with bespoke patches, code generation workarounds -- all have to be handled on a case-by-case basis. Many LLVM clients do this to varying degrees, and a lot of features like LTO start off fragile because of these kinds of things. Over time things will get tightened up, hopefully.

When I worked on GHC (admittedly several years ago now), the #1 class of problems for the LLVM backend was toolchain compatibility, above all else, because we relied on the users to provision it. At the time it was nothing short of a nightmare: various incompatible versions across platforms causing compilation failures or outright miscompilation, requiring custom patches or backports at times (complicated by distros backporting, or not backporting, their own fixes), some platforms needing version X while others were better served by Y, flat-out code generation bugs in GHC and LLVM, etc. It's all much better these days, and many features/fixes got landed upstream to make it all happen, but that's just how it works. Rust made several design choices with their packaging/LLVM use that we didn't, and I think they were the right ones, but I'm not surprised they've had a host of challenges of their own to address. TINSTAAFL.


The long-term plan for Rust is to treat its standard library (almost) like any other crate, so it could be recompiled with your custom compiler settings if necessary.

However, libstd is by necessity tied quite closely to the compiler; it's one of the oldest and most fundamental parts of the stack, and there are tons of little complications around making it "just" a crate, so the changes to libstd/rustc/Cargo needed to make that possible will take longer.

In the meantime, changing one problematic compiler flag seems like a sensible solution.
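For reference, the unstable Cargo feature pointing in that direction is -Z build-std; if I remember the incantation right (nightly toolchain plus the rust-src component), it's roughly:

    # rebuilds libstd from source with the current profile/codegen settings
    cargo +nightly build -Z build-std --target x86_64-unknown-linux-gnu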


The low-level tools team consistently reports bugs against the upstream projects. I have no doubt that they did so during the course of this project.




