Mold 1.0: the first stable and production-ready release of the high-speed linker

gavinray · on Dec 15, 2021

FWIW, the author is also the original author of LLVM's "lld" linker, and has written multiple C compilers.

That makes them the author of both of the two fastest linkers in existence, AFAIK. This person has a very impressive resume.

https://github.com/rui314/mold/blob/main/docs/design.md

  > "Concretely speaking, I want to use the linker to link a Chromium executable (~1.8 GiB in size) just in 1 second. LLVM's lld, the fastest open-source linker which I originally created a few years ago, takes about 12 seconds to link Chromium on my machine. So the goal is 12x performance bump over lld. Compared to GNU gold, it's more than 50x."

There is good discussion as well in this prior post, and Reddit thread with the author in it a while back:

https://news.ycombinator.com/item?id=26233244

https://www.reddit.com/r/cpp/comments/kxvw5c/mold_a_modern_l...

jart · on Dec 15, 2021

Rui Ueyama is one of the most interesting coders of our age. chibicc in particular is one of the loveliest projects I've seen. I found it so captivating that when it got posted to HN last year I dropped what I was doing for a month just to hack on it. It's rare to find codebases with enough clarity to be educational. I learned so much reading his chibicc code. For example, in a world where the orthodox solution to these kinds of things is to use Bison or Antlr, the thought never would have occurred to me before that writing a parser in C could be so easy. That's how I was able to write an assembler for it. Rui is also living proof that the will and motivation to simplify also generalizes to the ability to create production-quality tools with superior performance, as evidenced by Mold. Very exciting to see his work take off.

gavinray · on Dec 16, 2021

This is some high praise coming from Justine Tunney.

rurban · on Dec 16, 2021

Yeah, but he refused to protect his unicode identifiers, and just went with the C committee solution to provide no solution to Unicode identifier security. I expected better.

rui314 · on Dec 16, 2021

Are you confusing me with someone else? I have no idea what you are talking about.

jart · on Dec 16, 2021

He's probably talking about https://justine.lol/dox/ansic-identifiers.txt I'm not sure what he wants to see happen here but I remember him asking me for a timing safe memcmp function in Cosmopolitan a while back and it's hard to refuse a small reasonable demand.

saurik · on Dec 16, 2021

I feel like this makes it even more confusing as to why this "drop-in replacement" is yet another new competing project rather than an upgrade/enhancement to an existing linker (such as lld)... has open source as a community concept failed so hard that people are unable to even contribute to and collectively improve projects they themselves started?

wyager · on Dec 15, 2021

Why is Chromium so huge?

rackjack · on Dec 15, 2021

This might be a nerd-sniping question because there are lots of reasons and people will fight over which ones are most relevant. e.g. V8, rendering, myriad web protocols, security measures, etc. From my understanding at least.

glandium · on Dec 15, 2021

Debug info is huge.

mrich · on Dec 16, 2021

Are they using split dwarf? This avoids linking the debug info.

simlevesque · on Dec 15, 2021

Back when I was using Gentoo, Chromium was the biggest dependency on my machine, by far. When there was an update I couldn't use my machine for like 30 minutes.

tomcam · on Dec 15, 2021

What would you do to shrink it?

wyager · on Dec 16, 2021

That would require me knowing the answer to the question I asked.

IshKebab · on Dec 15, 2021

It includes a mountain of stuff. Think of all the web APIs that exist.

nitsky · on Dec 15, 2021

I have been using mold every day for almost a year, and it has dramatically improved my quality of life by decreasing link times on the Rust project I work on from approximately 10 seconds to less than one second. This makes a big difference in keeping focus during the edit-compile-run loop. Thank you!

c0balt · on Dec 15, 2021

Maybe a dumb question but how would I go about using it with rust(c) and cargo? Would love to try it out, if it decreases total compile times.

the_duke · on Dec 15, 2021

RUSTFLAGS="-C linker=clang -C link-arg=-fuse-ld=/path/to/mold" cargo build

or persistent in your project:

.cargo/config.toml

[target.x86_64-unknown-linux-gnu]
linker = "clang"
rustflags = ["-C", "link-arg=-fuse-ld=/PATH/TO/mold"]

fbernier · on Dec 15, 2021

-fuse-ld has been replaced by --ld-path in clang 12+

the_duke · on Dec 16, 2021

Good to know, thanks.

fuse-ld still works though.

wyldfire · on Dec 16, 2021

I wonder -- "-fuse-ld" has some somewhat surprising behavior in how clang ends up discovering the linker. I think that even if clang has a sibling `lld` in the same distribution, "-fuse-ld=lld" will pick "ld.lld" from the $PATH if it's present in there before the directory where clang and lld are installed.

So maybe that "--ld-path" option helps resolve ambiguity by expecting an explicit path instead of a linker name.

c0balt · on Dec 16, 2021

Thank you. Will try it later

4khilles · on Dec 15, 2021

"mold -run cargo build" would also work.

the_duke · on Dec 15, 2021

Warning: don't do this if you use rust-analyzer or any other IDE that uses cargo check.

The flags are part of the cache hash, so your IDE and cli constantly invalidate the cache and compile from scratch.
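A minimal sketch of why that happens, assuming a toy cache that keeps one artifact per crate and keys it on a hash of the source plus the flags. This is not cargo's actual fingerprinting code; `cache_key` and `BuildCache` are hypothetical names made up for illustration.

```python
import hashlib


def cache_key(source, flags):
    """Hash the source text together with the compiler flags."""
    h = hashlib.sha256()
    h.update(source.encode())
    for flag in flags:
        h.update(b"\0")  # separator so ("ab",) and ("a", "b") hash differently
        h.update(flag.encode())
    return h.hexdigest()


class BuildCache:
    """Toy cache: one artifact per crate, like a shared target directory."""

    def __init__(self):
        self.entries = {}  # crate name -> key of the artifact currently stored
        self.rebuilds = 0

    def build(self, crate, source, flags):
        key = cache_key(source, flags)
        if self.entries.get(crate) != key:
            # Key mismatch: the stored artifact is considered stale.
            self.rebuilds += 1
            self.entries[crate] = key
        return key
```

If the command line builds with the mold flags and the IDE builds without them, every alternation between the two changes the key and triggers a rebuild, even though the source never changed.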

tomcam · on Dec 16, 2021

> decreasing link times on the Rust project I work on from approximately 10 seconds to less than one second.

But enough about writing “hello, world“ programs in Rust.

I kid.

Mizza · on Dec 15, 2021

I think this is an interesting model: https://github.com/rui314/mold/blob/main/LICENSE

I'm glad he's doing this as (A)GPL. It's great work and it's Free for the people. If you want the option for it to be private, feel free to step up and do the right thing.

jacobmarble · on Dec 16, 2021

> Note: I'm looking for a sponsor who wants to purchase the copyright of this work and relicense it under a more liberal license such as the MIT license. For now, mold is released under the GNU AGPL v3.

https://github.com/rui314/mold/blob/main/LICENSE

invokestatic · on Dec 15, 2021

Why Affero? Are there cloud linker tools? Or simply because it is one of the most restrictive open source licenses?

rui314 · on Dec 15, 2021

If it were GPL, it would have been just the same as GNU linker. If it were MIT, no one would have an incentive to purchase the project. Some companies have a policy to not allow AGPL software at all in their orgs, which should give them more incentive to relicense, no matter whether the policy makes sense or not.

eitland · on Dec 15, 2021

Given the goals stated I think it is because it is the most restrictive license.

In fact for many years this was the only value I saw in AGPL: to provide a best possible starting point for an upsell while claiming it was all about Free Software.

kzrdude · on Dec 15, 2021

It's pretty clear we need AGPL when anything can be compiled to wasm soon.

jcelerier · on Dec 15, 2021

Since WASM blobs would run in the end-user's browser, GPL is sufficient to allow the user to request source. AGPL is more useful for preventing e.g. cloud build services.

Starlevel001 · on Dec 15, 2021

> most restrictive

It's the least restrictive for user freedom.

vlmutolo · on Dec 16, 2021

AGPL constrains the rights of the user. That's its job. The whole point is to be GPL, which prevents users from distributing modifications to the original program, with the addition of preventing users from offering their modifications as a service over the Internet.

MIT, Apache, and BSD have none of those restrictions on users. How is AGPL the "least restrictive"?

sagarm · on Dec 16, 2021

GPL does not prevent users from distributing modifications: it simply requires them to give _their_ users the same freedom if they do so. The AGPL further updates this for the cloud era.

vlmutolo · on Dec 18, 2021

> it simply requires them

This is the part where the (A)GPL constrains the rights of its users. Requiring user A to give user B more liberal licensing is constraining the rights of A for the benefit of B. Even if you think it's the "right" way, it's still a constraint.

orra · on Dec 15, 2021

There's certainly cloud IDEs.

wmf · on Dec 15, 2021

I wonder about AGPL "infecting" proprietary CI pipelines.

singron · on Dec 15, 2021

Please read the AGPL very carefully before saying things like this. There is no reasonable interpretation where some CLI program that is part of a CI pipeline would infect the result of the pipeline (i.e. build artifacts) unless it actually put itself into the result. (E.g. gcc puts a small amount of code from libgcc into the executable, but that explicitly has a different license).

The "A" of the AGPL expands the distribution rules of the GPL to include network access but doesn't really expand the virality or combined-works parts. It "infects" your program just as much as any GPL program would.

wmf · on Dec 15, 2021

I don't mean the build artifacts; I mean the CI build scripts and such.

infogulch · on Dec 15, 2021

Is that CI provided as a proprietary service? If it applied, it would only apply to users of the CI service. So, publish your custom internal CI code to your own developers.

spekcular · on Dec 15, 2021

Dumb question: Why is this a separate project, instead of being an improvement to the LLVM linker (given they have the same author)? Does this have any chance of being adopted by LLVM?

nmstoker · on Dec 15, 2021

Perhaps because it's not just for LLVM?

Given the comments in the repo README about it being a drop-in replacement for the GNU linker, it looks like it could be useful for several other scenarios too (but I defer to those who can explain this more clearly/correctly than I!)

VWWHFSfQ · on Dec 15, 2021

According to the rejected patch to GCC to support arbitrary linkers [0], mold or any other linker is probably not working very well with GCC. So unless they're wrong, this doesn't really seem like a generic drop-in solution anyway, despite trying to be one.

[0] https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573833.h...

wyldfire · on Dec 16, 2021

At the very least, lld is pretty much drop-in. Aside from some subtle differences about resolution order (IIRC lld implicitly has "--start-group"/"--end-group"), everything just works with lld in place of ld.bfd. My recollection was that I had just about the same experience w/gold -- it was just a drop-in replacement.
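For anyone unfamiliar with the resolution-order difference, here is a rough Python model of it. It is only a sketch of the general rule (archives searched once, left to right, versus archives rescanned until nothing new is pulled in, which is what "--start-group"/"--end-group" asks for). It is not code from ld.bfd or lld, and `resolve` is a hypothetical helper.

```python
def resolve(undefined, archives, rescan):
    """Return the symbols still undefined after searching the archives.

    undefined: symbols needed by the object files on the command line
    archives:  list of archives; each archive is a list of
               (defined_symbols, needed_symbols) members
    rescan:    True models group behaviour (revisit archives until stable),
               False models a single left-to-right pass
    """
    undefined = set(undefined)
    defined = set()
    pulled = set()  # (archive index, member index) already linked in
    while True:
        progress = False
        for ai, archive in enumerate(archives):
            changed = True
            while changed:  # a single archive is searched until it settles
                changed = False
                for mi, (defs, needs) in enumerate(archive):
                    if (ai, mi) in pulled:
                        continue
                    if defs & undefined:  # member satisfies a pending symbol
                        pulled.add((ai, mi))
                        defined |= defs
                        undefined |= needs
                        undefined -= defined
                        changed = progress = True
        if not (rescan and progress):
            return undefined


# libA: one member defines a (needs b), another defines c.
# libB: one member defines b (needs c), so libB depends back on libA.
lib_a = [({"a"}, {"b"}), ({"c"}, set())]
lib_b = [({"b"}, {"c"})]
print(resolve({"a"}, [lib_a, lib_b], rescan=False))  # {'c'}: libA is not revisited
print(resolve({"a"}, [lib_a, lib_b], rescan=True))   # set(): the second pass finds c
```

With a circular dependency between two archives, the single-pass model leaves `c` undefined unless the archives are reordered or repeated, while the rescanning model resolves everything.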

All of my experience was with using clang as the driver, so I can't speak to why gcc has trouble making a similar feature work like expected.

spekcular · on Dec 15, 2021

The key part:

"Note, all these extra linkers (lld, mold) will not really work properly, gcc during configuration detects various assembler and linker properties on which it then relies on and I'm sure neither lld nor mold supports those features."

touisteur · on Dec 15, 2021

It can in fact link gcc-ada specific things and others.

rui314 · on Dec 15, 2021

It does not use any LLVM libraries and has no feature specific to LLVM, so I think it simply doesn't have a reason to be a subproject of it (or some other large umbrella project).

wyldfire · on Dec 16, 2021

That doesn't stop other subprojects currently in the monorepo.

Rui, regarding -- "a sponsor who wants to purchase the copyright of this work and relicense it under a more liberal license such as the MIT license." Would you accept sponsorship for someone who wants to relicense under LLVM(apache)? And do you have a ballpark asking price?

DenseComet · on Dec 16, 2021

If you take a look at the CONTRIBUTING.md, Mold has taken an approach that is pretty novel to me. Instead of requiring a CLA to allow for relicensing, all patches are required to be released as dual-license AGPLv3 and MIT. I personally love this approach, but it could make it difficult to relicense to anything but MIT.

rui314 · on Dec 16, 2021

MIT is very permissive and compatible with lots of other open-source licenses. For example, you can sublicense MIT-licensed code under GPL. Not sure if MIT is compatible with the LLVM license, though.

rui314 · on Dec 16, 2021

I don't want to write an asking price here, and honestly I don't know how to valuate an open-source project. But I could have been working for Google as a staff engineer and enjoying a decent salary instead of doing this project, so, well...

wyldfire · on Dec 16, 2021

Hmm, I thought you were at Google. Did you leave? Regardless of whether you're at Google I think you deserve fair compensation. But IMO the value of mold isn't limited by your opportunity cost of a salary. It's a state-of-the-art linker and someone like Apple/FB/etc can and should be willing to compensate you.

rui314 · on Dec 16, 2021

I left Google last year. And you are right; mold (or anything that I create) should not be valued at my opportunity cost but instead at its intrinsic value.

igorkraw · on Dec 15, 2021

Anyone know the state of link time optimisation on this? Last time I checked it was advised to use this for dev builds only

_dh54 · on Dec 15, 2021

I initially had the same question but seeing the author’s comments on this feature (which are generally positive) I personally have come to the conclusion that it is a misfeature.

LTO is not necessary for development builds, or builds where the speed of the write-compile-test loop matters. This is exactly the use case that mold was designed for and it is designed well. Adding LTO support would only add bloat to the code especially since mold’s techniques to improve the efficiency of linking don’t really apply to LTO (where link time is dominated by whole program analysis). I have no issue using the compiler’s native linker when doing production LTO builds and conceptually that makes more sense to me anyway.

I would prefer it if the author avoided features when doing so benefits innovating on the core use case.

tux3 · on Dec 15, 2021

LTO is planned, but not a priority. I think the author will work on macOS support next, before anything else.

jsnell · on Dec 15, 2021

Pre-1.0 discussion: https://news.ycombinator.com/item?id=26233244 (122 comments)

orra · on Dec 15, 2021

In the previous discussion we discussed how fast mold is at linking ELF files. I was also wondering (unvoiced) if it would also be possible to make a ridiculously fast linker for PE/COFF (Windows).

Somebody else asked the same question in an issue and the answer is yes, in principle, but it'll take a lot of work. That's not unlike how solid ELF support took a lot of work. I see in the v1.0 release notes, Windows support is nominally planned for v3.0.

citrin_ru · on Dec 15, 2021

> As soon as the second process writes a result file to a filesystem, it notifies the first process, and the first process exits. The second process can take time to exit, because it is not an interactive process.

Looks like a trade-off which may favor speed over reliability. If a syscall (like msync) in the 2nd process returns an error, there will be no way to abort the build if the 1st (main) process has already finished.

CJefferson · on Dec 15, 2021

The second process doesn't technically report until it has closed references to the binary it is writing (last time I checked). If anything errors after that, it doesn't affect your created executable in any way.
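A rough sketch of that two-process pattern in Python, assuming a POSIX system with `os.fork`. It is not mold's code; `link_in_child` is a hypothetical name, and the sleep only stands in for slow teardown work.

```python
import os
import time


def link_in_child(output_path, data):
    """Fork a worker, let it write the output, return as soon as it reports."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: do the real work, then notify the parent over the pipe.
        os.close(read_fd)
        status = b"1"
        try:
            with open(output_path, "wb") as f:
                f.write(data)
            status = b"0"  # only reached after the file has been closed
        finally:
            os.write(write_fd, status)
            os.close(write_fd)
            time.sleep(0.1)  # stand-in for slow cleanup nobody waits for
            os._exit(0)
    # Parent: block until the child reports, then return without waiting
    # for the child to exit.
    os.close(write_fd)
    status = os.read(read_fd, 1)  # b"" means the child died before reporting
    os.close(read_fd)
    return status == b"0"
```

The parent only sees what happened before the notification. Anything that fails in the child afterwards cannot change the result the parent has already returned, which is the trade-off raised above.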

PudgePacket · on Dec 15, 2021

https://abramov.io/rust-dropping-things-in-another-thread

U1F984 · on Dec 15, 2021

Lots of interesting things in the design document: https://github.com/rui314/mold/blob/main/docs/design.md

pantalaimon · on Dec 15, 2021

Does this also work for embedded targets?

Ecco · on Dec 16, 2021

It does not intend to support linker scripts, which would be a problem for embedded. It also does not support LTO, and I don't think that a non-LTO link of an embedded executable can take any noticeable amount of time with any linker.