Andrew is super sharp so I imagine he and the team will get there now that it's declared as a goal.
But man, it seems to me (uneducated as I am on the challenges Zig faces with LLVM) that this shifts the team's capacity away from Zig and towards things like binutils. When I read the headline I assumed they were throwing out the compiler (IIRC they had mostly/totally excised it already). But for a project like Zig it just seems like there's a lot to be gained from keeping it.
That said -- the prospect of rewriting a lot of the stuff that's in the LLVM project now in Zig instead of C++ -- that's pretty cool and ambitious. Just as ambitious as it was for Lattner to create LLVM, I suppose.
But accidentally quadratic code -- well, that's bound to happen to Zig too, if it becomes as popular and useful as the LLVM project is.
There's a document that I think was an unofficial project management document from NASA, I can't find it now, but one of the items was something along the lines of 'if your space mission isn't reusing an existing launch vehicle, it will be a launch vehicle development project and everything else, including what you think is the focus of your mission, will be secondary to this'.
Edit: an additional bit I remembered: this statement is prefixed with something like 'you may start thinking your project will be cheaper/more efficient if you develop a launch vehicle suited to your particular mission instead of compromising on an existing one, but', followed by the above.
I'm reminded of a recent interview with Chris Lattner about the success of Swift: he credits the success to the fact that you could start with a large Objective-C project and begin writing Swift without rewriting anything. Part of Rust's success rides on a similar compatibility story with C/C++.
It is difficult for me to imagine Zig succeeding without a similar capability - which is to say I hope they do not follow through on this milestone.
But Rust absolutely does not have any C/C++ compatibility besides arguably very good FFI. And that's where Zig shines. It has a C/C++ compiler built into the Zig compiler via zig cc, which is even easier to use than Clang or GCC thanks to the benefits of a build script and baked-in libcs for different architectures, making cross-compilation a breeze.
In Zig you _can_ actually start with a C codebase and rewrite it file by file in Zig, and you _can_ include C headers in your Zig files verbatim. Neither is possible in Rust.
This milestone would only remove the C++ (and Objective-C/C++) compiler from zig cc for now. So while you could argue this will ostracize people rewriting their C++ codebases in Zig, I don't imagine there are actually many people like that.
EDIT: I just looked at the discussion and there are actually a lot of people using the C++ capabilities.
> In Zig you _can_ actually start with a C codebase and rewrite it file by file in Zig, and you _can_ include C headers in your Zig files verbatim. Neither is possible in Rust.
Well, you can: I have done this with Rust. But yes, it's more the type of thing that well-resourced companies do than solo devs, because even with bindgen and the like, Rust really wants the atomic codegen unit to be the "library", not C's notion of a "compilation unit". Still, C FFI compatibility is exactly why porting from C to Rust incrementally is feasible, while C to, say, Java is probably a bigger leap.
You still need a separate compiler toolchain next to Rust to compile C, C++ and ObjC dependencies, which is a massive build-system headache (especially across UNIX-oids and Windows).
In Zig that all "just works" with the standard Zig install.
(here's how the sokol headers are integrated into a Zig project, note that there is only a build.zig which builds everything: https://github.com/floooh/kc85.zig)
But it's not so much about build systems, but requiring a separate compiler toolchain to build any C++ or ObjC dependencies (Rust needs this, Zig currently does not - unless that controversial proposal is implemented).
(also even complex C++ libraries can be wrapped in a build.zig, so you don't require a separate build system like cmake for the C++ dependencies)
I still don't see why this is a problem. Single-header dependencies are cute but blow up compile times, and most of the world needs CMake/Meson/Autotools anyway, so there's no added cost to using them for your own projects.
Similarly, I don't really understand why you'd want one toolchain for a multi-language project. It's not that useful or convenient, since it's going to need to be orchestrated by a build system somehow.
Like with Rust, you probably only have a build.rs file. It invokes one (possibly multiple) separate toolchains. Or with CMake: a simple project has a single CMakeLists.txt that can invoke any number of toolchains. I don't see why Zig can't do that with a build.zig file, or why it matters.
> But Rust absolutely does not have any C/C++ compatibility besides [..]
> In Zig you _can_ actually start with a C codebase and rewrite it file by file in Zig, and you _can_ include C headers in your Zig files verbatim. Neither is possible in Rust.
You can, either via FFI and bindgen-ing headers, or by using c2rust. The latter is not just an ambitious toy project but actually a very impressive piece of engineering, and it does produce a result where you can transpile a project or file and start rewriting file-by-file or function-by-function.
c2rust is definitely cool, but it transpiles for a single architecture, which often misses a lot of architecture-dependent code and specialisations in real-world C code, especially because it has to do macro expansion and you only get the expanded code.
It also doesn't support a good number of the more complicated C features.
It's a help for sure, but the few times I tried it I ended up just doing the rewrite by hand to actually cover all the cases.
Can Swift really be called a success when it's only used in a very narrow niche (the Apple software ecosystem), and only because it's essentially dictated by the platform owner?
Yes, that is how languages become a success on the market, either being pushed by the platform owner no matter what, or by having a framework written in them that everyone feels like using.
Everything else is an aspiring extra in the cinema of programming languages, waiting on the sidelines for that major role that will come someday. It really will; one just has to believe enough.
Not at all, given the way Google is pushing it on Android; they are even mischievous enough to compare Kotlin with Java 6 when explaining why Kotlin.
For all practical purposes ART has turned into the Kotlin Virtual Machine.
However it so happens, that to keep the advantage of using Java libraries in the Android ecosystem, they actually need to support newer Java libraries unless they feel like rewriting all of them into Kotlin.
So Android 12 and 13 got a subset of Java 11 LTS, and Android 14 is getting Java 17 LTS support.
That doesn't change the reasoning that Android is all about Kotlin nowadays, unless one is writing platform code itself.
The idea Kotlin wouldn't be successful without Android isn't right. It was taking off in a big way even before Android did anything with it officially. Google's stated rationale for adopting it was because developers were already migrating to it regardless of what Google wanted.
Kotlin wasn't created by Google and wasn't pushed by them. It's more like they accepted it because the Android developer base was adopting it organically en-masse anyway. That's also the case for many other places where Java is used.
> Just as ambitious as it was for Lattner to create LLVM, I suppose.
A big difference is there was an element of no choice to LLVM, at least in Apple’s takeover of the project: GCC didn’t want to allow / do what Apple wanted / needed out of it.
binutils is quite the jack-of-all-trades codebase for handling binary code formats in portable ways; a compiler doesn't need half of it (including the legacy parts) if you focus it more like TCC. The larger part of the work will be code generation across a bunch of architectures.
Two issues here. The first is code generation and the other is bootstrapping.
Ime, the optimizing passes of a compiler are easy and fun to write. You have to read research papers to understand how register allocation and SSA form work, but it's fun code to write. You just send the IR through various optimizing passes, each refining it in some way. You can write high-quality optimization passes without LLVM. But then you want to linearize your IR into machine code. This step is boring and mundane to write, unless you care about all the different ways x86-32/64 can encode a "mov [eax+8*ebx], 123" instruction. If you are optimizing for binary size, do you want to bother measuring on which platforms "push eax; push eax; push eax" is shorter than "sub esp, 12"? And this is just for x86; multiply it tenfold for all the other non-x86 architectures irrelevant to most developers. The chance of major bugs in the code generator for obscure architectures staying undetected for years is very high.
The second issue is bootstrapping. What compiles the Zig compiler written in Zig? You could, for example, have a non-optimizing minimal Zig compiler written in C that compiles the Zig compiler. But since it's non-optimizing, the Zig compiler would need to be recompiled with the optimizing Zig compiler. Not unsolvable problems, but ime long and complicated build processes risk driving away potential contributors.
Suppose you are implementing a new feature, how do you test it? First you compile the bootstrapping compiler. Then you use the bootstrapping compiler to compile an unoptimized optimizing compiler. Then you use the optimizing compiler to compile an optimized optimizing compiler. If the compiler doesn't rely on llvm then there will be more code to compile which will make this procedure slower. Especially since the bootstrapping compiler probably isn't very fast (though idk if this is the case with Zig).
Any change to the compiler can break the bootstrap or cause unintended side effects. For example, adding an import to a module in the compiler might break the bootstrap if the imported module uses syntactic constructs the bootstrap compiler doesn't support. The problems are often very subtle.
Given how much Zig promoted being able to use it to compile C (and maybe C++, I forget), deciding to drop LLVM entirely seems wild. Without a LOT more people chipping in, the odds of even coming close to LLVM's platform support seem incredibly slim as well. I would understand planning to add another backend of their own for those who want it, but just getting rid of LLVM seems... rash?
It might be rash if work had already started on it, but as with the other proposals on the issue tracker not labeled with "accepted", this is just a call for discussion and counter-proposals. At this point it's just the team collecting feedback, learning about affected use cases, and gauging feasibility -- opposite of rash, I'd say.
If you read the text in the linked issue, you'll see that they're not entirely removing the LLVM backend. They're just decoupling it from the main binary, so you can still easily use LLVM as a backend if you have it installed on your system.
Makes sense, kind of silly to have to bundle a 100MB+ copy of LLVM if you don't need it for the common case, and if you're a developer you'll probably have it installed already anyway.
The problem could be that it'll be hard to guarantee that your Zig installation works properly with the version of LLVM you have installed? We'll see..
I'd recently been reading a bit more about Zig, and even tried it out as an easy way to compile C++ with LLVM without having to fuss with system packages -- transitioning from C/C++ to Zig was a major selling point.
It just seems really abrupt, certainly unexpected. I'm not saying it's the right/wrong thing for the project -- just really really out of left field from my point of view
I did the pre-1.0 language thing once with Clojure and don't really feel like doing it again, but I've certainly been keeping my eye on Zig because it has some interesting ideas. But this set of changes, even if it is still possible to wire it up yourself, makes me way less interested.
Sure, but it's a proposal from Andrew Kelley, the language creator, and it's got a bunch of thought-out subitems that seem to have real progress being made. It seems more likely than not that this will come to pass, unless the community reaction really is heavily against.
I wouldn't assume that just because Andrew has written the proposal it will be accepted. There have been plenty of times where proposals from Andrew have been rejected and/or reworked.
The sub-items of this task are still valuable to complete even if this overarching proposal were declined. Most of them are not being completed as a prerequisite for this proposal. There are benefits gained even without the full removal of LLVM.
I can really respect the really wide-reaching views and goals of Andrew with zig and in proposals like this, even if I don't agree.
I mean, one of the checkboxes under LLVM is literally "optimization passes", so... I don't know how close they are to the end goal in terms of progress actually
I'm guessing you'll have two categories of tier 1 support:
- Built-in tier 1 support.
- Tier 1 support through the optional LLVM backend.
If they already have tier 1 support for an architecture, I don't see why they'd lose that support when they make the LLVM backend an optional dependency.
And as Andrew writes in the proposal, this way might provide a path to better support for more obscure architectures. I would happily work on a Zig backend for some interesting processor architecture. No way I will ever contribute to LLVM; working with C++ is not something I'll ever do for fun.
LLVM works well on the paths that Clang exercises; beyond that, LLVM itself is a wild beast, a can of worms that is becoming more and more painful to work with if you are a language creator/maintainer. A lot of untested paths, and hidden bugs that Zig has hit before... many times.
It will NOT reimplement everything. If you read the proposal, it's in favour of dropping the LLVM library dependency, not dropping LLVM IR generation. This will come with performance regressions, since it will now be up to the team to emit the correct IR and make decisions the LLVM libraries currently make during IR generation.
The problem is that Clang is being dropped, which means that, unless we get a new C++ frontend written in Zig (a la arocc for C), projects using C++ with Zig are going to suffer quite a bit.
Writing a C++ compiler is orders of magnitude more complicated than writing a C, Java, Go or Zig compiler. There's a very good reason only three exist despite how ubiquitous C++ is (and even then, it takes them years to keep up with the latest standards). C++'s grammar is type 0; there isn't even an EBNF definition of it because it's practically impossible to write a complete one. Clang only succeeded thanks to massive investment from the biggest players in the industry, and GCC/MSVC simply grew alongside the language. All other C++ compilers died a horrible death a long time ago.
Out of curiosity, does Intel's icc compiler see much use? It looks like it uses LLVM these days, but its frontend presumably still needs to handle all of C++'s complexity.
ICC is deprecated and will no longer see a release, but it uses the EDG front-end. Its replacement, ICX (the oneAPI compiler), uses clang as its front-end.
There are essentially only four extant C++ front-end implementations: GCC, Clang, MSVC, and EDG. All other C++ compilers are based on one of these four implementations, or have since gone extinct. (Except maybe Green Hills, but I can't recall anymore if their front-end is still in-house.)
Got it, thanks! I knew that Intel had a compiler for C and C++ from reading blogs about compiler research, but I didn't know any details about its current architecture.
One of Andrew Kelley's mantras has been "do not use Zig in production" until it hits 1.0.
People know that, but it's difficult to avoid using it for real things once you've tried it and it works :D.
I haven't done that with Zig, but with Kotlin things like serialization/coroutines/kotest/KAPT (it was the same feeling: oh this stuff is "Experimental" but so cool, I can't do real Kotlin without them!!)... well yeah, I spent many hours rewriting stuff due to that, and totally acknowledge that was on me.
Just wanted to point out that this is just a proposal, so if enough people voice their opinion, which many have already done, I am sure the core team will adjust their approach.
Ah, the obligatory comment from Walter Bright talking about D in posts about Zig
(Nothing wrong with it, just a bit funny how predictable it has become)
I guess the interesting difference in approach here, is that D seems to have completely separate compilers, while it looks like the main Zig compiler will support LLVM as a backend if you have it installed? If true, I like the approach Zig is going for.
Yesterday, in a thread on DMD, somebody quipped "Funny how WalterBright seems to comment in every single HN thread other than this one..." and I thought it was hyperbole. The next HN thread I open, I'm proven wrong.
That's cool. Another language with multiple compilers is Common Lisp, which has a dozen or so production-ready compilers (some commercial, but most free).
That allows people to use the fastest compiler during development, and the fastest runtime (or smallest memory footprint) compiler for the final release, which is really useful.
Also, just having a language specification which all compilers adhere to ensures that the language is stable and won't just break things under you (for better or worse, there are advantages for languages like Zig that can still change anything they want in order to make the language more consistent/cleaner/more powerful). I tend to value this stability a lot more these days - it lets me put all my effort in creating real value for users instead of constantly keeping up with development tools.
15 years later, it's a net negative. The user experience of having to download and switch compilers (or even think about it at all) is terrible, especially given that (after performance) one of the main motivators for switching to ldc (i.e. all serious use of D to make money) is either a bug or missing platform support in dmd.
Most importantly it results in a significantly increased workload for the maintainers, and spreads out effort over a bigger area.
So the most likely outcome is a language with multiple mediocre compilers, instead of one really good one that everyone works on.
Which is the exact opposite of what the intention behind this proposal is, which is to make the language more maintainable.
One of the main reasons Zig was interesting to me was the fact that I could drop it in as an alternative to a C/C++ compiler. On Windows, my friends have mentioned how it is easier to install Zig as a C/C++ compiler than any other alternative.
If this proposal is accepted, I personally think Zig will drop to the popularity level of Hare or other extremely niche languages. Getting my colleagues to even try Zig out required me sending them articles about how Uber was using it in production. There is no way my colleagues would even have given it a second thought if it didn't have immediate value to their existing projects.
But I get where the proposal is coming from. LLVM compile times can seem awful, and there's lots of neat optimization tricks you could implement with your own byte code. And dealing with bugs in LLVM is basically a no-go, I've seen this happen in the Julia ecosystem as well.
If my recommendations are worth anything, I think Zig should
1. Use a custom bytecode for debug builds - fast build times, fast debug times etc
2. Use LLVM for release builds - fast runtime performance, slow release performance
If they can manage 1.) while still maintaining support for cross compiling C/C++ (just pass that part off to LLVM?) I think that might be the best of all worlds, with the tradeoff that there's additional backend code to maintain.
> And dealing with bugs in LLVM is basically a no-go, I've seen this happen in the Julia ecosystem as well.
As one of the folks dealing with LLVM bugs in the Julia ecosystem:
Yes, it requires a distinct skill set from working on the higher-level Julia compiler, and yes, it can sometimes take ages to merge bugfixes upstream, but we actually have a rather good and productive relationship with upstream, and the project would get a lot less done if we decided to get rid of LLVM.
In particular GPU support and HPC support (hello PPC) depends on it.
But this is also why we maintain the stance that people need to build Julia against our patchset/fork, and we will not invest time in bugs filed against Julia builds that didn't use those patches. This happens in particular with distro builds.
Approaching this from the lens of an interested observer with experience with other projects working on non-LLVM backends for their languages, I'm just struck by the amount of hubris in the linked proposal.
If it weren't for the author, I'd have dismissed the whole thing as a low-effort GitHub issue by a Zig newbie. The underestimation of the work involved, the backhanded dismissal of all the work that has gone into LLVM, the "obviously we can do it cheaper, faster, better" confidence/machismo that exudes from every sentence and claim... I'm disappointed.
I've had nothing but respect for Andrew's work in the past so I'm going to extend him some grace and the benefit of the doubt and say that maybe it was just written in haste or on a whim without paying attention to how it came across. But it's not something that inspires me with confidence or lends me to give the proposal itself more of a chance.
I know nothing about LLVM backends, but it's pretty common that libraries aiming to solve 100% of users' problems eventually become unwieldy and slow compared to a more optimized alternative. webpack vs. esbuild is an example of this.
I get what you are trying to say, but I have experience with both web and native tooling, and this isn't a valid comparison. Moreover, LLVM isn't one tool; it's modular, fairly well-architected, and can be compiled with or without any of a million different features.
The correct time to build something without LLVM was years ago, but taking the C++ functionality out now will probably be the end of Zig. I'm surprised they didn't announce a plan to write their own C++ compiler in Zig instead of phasing it out.
Just writing a parser for C++ is a gargantuan project. I'm not even sure an individual could write a C++ compiler if they started on their 18th birthday; there may not be enough hours left in their life to write a functioning compiler.
Check the history of Circle: not only did Sean Baxter implement a full C++ compiler frontend, it also has lots of extensions from all the C++ wannabe replacements and a Rust-like borrow checker.
Zig wants to be the world's new C, not so much the world's new C++. You'll note that C cross-compilation would still be supported under this proposal.
I think this might be the right choice. C is currently used heavily in the embedded world, and that's a space where LLVM is not good. If I were Zig, I'd want to be able to target all the microcontrollers, and this is the only real way to achieve that.
One of Zig's biggest features is that its toolchain is vastly superior to any other and opens up new possibilities to devs of C/C++/Obj-C. One of its biggest selling points is being able to use build.zig inside projects with a C/C++/Obj-C codebase and thus get around using a shit ton of build tools.
If zig cc and zig c++ die I would instantly stop using the language!
(Rust user here, but admirer from afar of Zig and the principles it seems rooted in.)
My peanut gallery observation is that a very bright person named Andrew set out on a journey of hard work to come up with a new language. Along the way he (and others) solved many tough ancillary problems related to the toolchain of this new language and how it should provide a useful path for interoperability and migration from code written in other languages.
The toolchain innovations were noticed, embraced, and leveraged by users of other languages as it made their lives easier in contexts where they weren't even using the Zig language.
Now, for Andrew to make even more innovative progress on his endeavor, he needs to undo some of the toolchain innovation and remove complexity that was initially used to bootstrap his endeavor. Removing complexity (from language) is one of the admired traits of this bright person, and here they are applying that good trait again to move the whole project forward.
But lo! Other users of the toolchain innovation will be affected... some would say they even appear "entitled" to these innovations... as if their collective need somehow outweighs Andrew's right to direct his energies to his project in the way he decides.
This whole thing smells like a story I read once... I think it was called "The Fountainhead".
1. "We can attract direct contributions from Intel, ARM, RISC-V chip manufacturers, etc., who have a vested interest in making our machine code better on their CPUs."
They are nowhere near popular enough (or used enough by some particularly important customer) for any major architecture vendor to spend any time contributing except as someone's random side project. To think otherwise is ... really out there.
2. "We can implement our own optimization passes that push the state of the art of computing forward."
There are in fact, no magic bullets. Once you pass the baseline of optimization capability, the reason these compilers do well is because they've been worked on forever, and made better 0.2% at a time.
Also, anything you implement they can implement. Maybe it takes annotations, or whatever, but that's about speed and not capability.
3. "Compilation speed is increased by orders of magnitude."
Uh, not if you are doing #2. Most optimization passes, especially when you are first productionizing research, are quite bad. It takes a tremendous amount of applied engineering to make them fast.
This is what I did on both GCC and LLVM: implement and speed up, ad nauseam. I implemented plenty of high-optimization-value, never-before-productionized algorithms. It usually took a few versions and lots of slow-compiler bugs to figure out the best way to implement them. It turns out most researchers are not spending their time on compilation speed. At best, they care that it's passable.
For existing well-productionized algorithms (which don't push the state of the art), you will not get orders of magnitude speedup. You may get some percent depending on how you structure your compiler.
There are certainly slow parts of LLVM, but it's hubris to believe you are going to make something both better optimizing, and seriously faster, for this kind of language.
There are other languages for which it is true. Zig is super unlikely to be one of them.
The way you gain compilation speed for this kind of language is to optimize less. Spend as little time processing things into machine code as possible, using as fast of algorithms as possible, and where you can't, relying on heuristics and such more to help generate good enough code most of the time.
There is more, but man, this feels out there.
If they said "we want to get 90% of the performance at 60% of the cost", sure, maybe.
But saying, basically, we will get >100% of the performance at "orders of magnitude" (their claim) less cost is just, as i said, a wild idea.
I wish them the best of luck.
Everyone who tries to reinvent good infrastructure is doomed to discover why that infrastructure was invented in the first place
Long-time compiler hacker/engineer and compiler/programming-language PhD here; all great points. Worth saying out loud that much of the reason this stuff is slow is not bad code; it's that many of the best algorithms are in higher complexity classes and just scale poorly with program/translation-unit size. For example, when I was working on `rustc`, a big challenge was the heavy reliance on inlining: inlining has a cost, plus it increases program sizes, making everything else take longer as well.
I feel like Go already went through a whole saga of this where the community started with "LLVM and SSA are bad and slow", then a few years go by and they end up building their own SSA IR and spending a bunch of time trying to bring compilation time closer to what it was before as it made everything much slower.
> I feel like Go already went through a whole saga of this where the community started with "LLVM and SSA are bad and slow"
I've been a contributor since the Go compiler was a tree-based C program and I've never heard anyone say that. What they said (and it's in the Go FAQ page) is: "At the beginning of the project we considered using LLVM for gc but decided it was too large and slow to meet our performance goals." [1]
If you're building a language with the explicit goal to make it compile fast, it's objectively true that starting out with LLVM is not the best approach. You'll get incredible runtime performance of the generated code since the early days, but NOT fast compilation. The Go makers choose a different tradeoff.
> and they end up building their own SSA IR
They switched to an SSA IR because it was a good idea to begin with, after an initial phase with the tree-based prototype. I've also never heard anyone argue that "SSA is bad", despite what you claim. The first compiler was tree-based because they reused a simple tree-based C compiler from Plan 9.
> building their own SSA IR and spending a bunch of time trying to bring compilation time closer to what it was before as it made everything much slower
The new compiler was ported to Go (machine-rewritten from C) and that's the main reason it was ~2x slower than the old compiler. It's not due to the switch to a SSA-IR.
"Worth saying out loud that many reasons why this stuff is slow is not due to bad code, its due to the fact that many of the best algorithms are in higher complexity classes and just scale poorly with program/translation unit size"
Yes, this is totally true.
It's also possible to affect this in theory but hard in practice.
Usually you shoot for making it O(N^2) (or whatever) where N is the number of variables you want to try to optimize instead of N being number of blocks.
The complete GVN and GVN-PRE in LLVM are theoretically N^3 or N^4, but as engineered they're often much faster (while getting more optimization), or at least not more than a few percent slower than the existing O(N^2) GVN. They achieve this by being sparser in most cases. The old GVN has to iterate the algorithm, and iterates non-sparsely: everything is reprocessed because it doesn't know what can change. The new one iterates only the things that could change, using fine-grained dependency tracking (something akin to how sparse constant propagation works). This is often the best you can do.
I will say it's possible to affect these time bounds in theory because if you go down the single static rabbit hole, you realize it's more generally applicable. Kenny (and others) proved this in his thesis about sparse dataflow. That was the whole point. SSA is just one example of a form that enables things to be linear time (and in fact, you can do it without SSA at all using interval methods and such). There are papers that show this about other SS* forms and show examples, but they were mostly (sadly) ignored.
Concretely, for most optimizations like PRE/etc, there are linear time single static transforms of the IR that will make the optimization linear time (Or at least remove a factor of N for you) by explicitly exposing the dataflow in a way that matches what the optimization algorithm wants.
This is awesome in theory - like really cool when you think about it (if you are a compiler nerd), but also completely impractical (at least, AFAIK). Rewriting the IR form for each optimization to the exact requirements of the optimization (single static use, single static exposed uses, single static upwards exposed uses, single static exposed defs, etc) is much more expensive in practice than well engineered, higher complexity, "N in the number of variables" optimizations.
Equality saturation stands more of a chance, IMHO.
I agree that it's probably impossible to write something that is orders of magnitude faster than LLVM and better than LLVM at optimizing, but I don't see that claim in the original text.
It seems like a lot of work but also doable to create something that is orders of magnitude faster than LLVM for unoptimized debug builds. Furthermore, the Zig compiler might be an easier thing to work with than LLVM/Clang for doing productionizing of optimization research.
As the parent comment said, and as I mentioned in my reply, JavaScript was just incredibly poorly optimized/not compiled, and they applied 20-30 years' worth of compiler research to make it significantly faster. You also had an alliance of every hyperscaler working on the tooling for a decade-plus, with help from all major hardware vendors, to bring the best performance out of it. One driver of LLVM was Apple and WebKit, which at one point was using LLVM for its JIT compiler, so many improvements figured out in that period have also already been applied to LLVM.
LLVM already has decades of research applied to it to make it produce fast code, it will be incredibly challenging to even match its performance across all the targets it supports let alone improve on it in significant ways. It would be better to spend the time building an optimization pipeline for Zig itself and being more thoughtful about what code you send to LLVM versus trying to replace it wholesale.
My impression is that LLVM spends the bulk of its time chasing pointers, so one could fix the issue by "only" changing the layout of the data to one that is friendlier to computers.
There's a ton of C++ libraries that won't be rewritten in C anytime soon, and even loss of ObjC support is very painful for anybody caring about the Mac platform.
The ability to build mixed C/C++/ObjC/Zig projects without having to deal with multiple compiler toolchains was indeed one of the killer features of Zig to me and the one thing that separated Zig from other "better C" attempts.
Hopefully the Zig project is at least thinking about bundling a Clang toolchain with the vanilla Zig distribution and calling Clang from 'zig cc'. To the user this wouldn't make a difference compared to the current feature set (apart from losing backend support for some CPU architectures), but it would still decouple Clang and LLVM from Zig.
I see your comment is getting downvotes, but that's probably because people didn't get the joke. In addition to being the name of the programming language, Zig is a character in the old Toaplan 'Zero Wing' game, which is where the meme "All your base are belong to us" comes from [1]. "For great justice" is part of the same meme.
I'll admit that I've only considered using Zig for its ability to easily compile (and cross-compile) C/C++ code into static binaries.
Hopefully if any of that functionality gets removed someone manages to fork off the "C/C++ magic" part into its own project. Clang itself is much less convenient to use.
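For anyone who hasn't tried it, the "C/C++ magic" in question looks roughly like this (target triples are examples; the exact set supported depends on your Zig version):

```shell
# Compile a C file natively, using Zig's bundled Clang:
zig cc -o hello hello.c

# Cross-compile the same file for x86_64 Linux, statically
# linked against the musl libc that ships inside Zig:
zig cc -target x86_64-linux-musl -o hello-linux hello.c

# ...or for Windows, from the same machine, with no extra
# toolchain or sysroot to install:
zig cc -target x86_64-windows-gnu -o hello.exe hello.c
```

Doing the equivalent with a bare Clang means hunting down a sysroot, libc headers, and a linker setup per target, which is exactly the convenience that would be lost.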
Every time I hear about LLVM, it turns into a rant. Clearly there is a problem there that needs to be fixed. Maybe the LLVM project team should address those issues.
The "problem" with the LLVM project is its massive success, coupled with its incredibly difficult problem domain. Turns out it's actually really hard to write modular compiler infrastructure that serves as the optimizer and code generator for N different arbitrary programming languages. The fact that it works in this capacity at all, and still manages to be competitive with GCC in its original use-case (being a C/C++ backend) is a monumental and unmatched achievement.
As someone who works on an LLVM-based compiler at $DayJob and also has written a compiler front-end that uses LLVM in my free time, I do have a ton of gripes, but any time I feel particularly frustrated by them, I spend a little bit of time working on my non-LLVM backend. After a few days of angry swearing with little to show for it, I go back to working with LLVM with a much greater appreciation for what it's giving me.
You got pro tips for working with LLVM types after parsing LLVM IR? I can't for the life of me figure out where in the class hierarchy I am, and the doxygen is... interesting. Language wrangling is much nicer in a language with proper ADTs like Haskell, but I also feel like there's probably that bit of LLVM documentation that I haven't read.
I'm also interested in this. I tried using the Node bindings from https://github.com/ApsarasX/llvm-bindings in Typescript, its types look like they map pretty closely to the original LLVM types; figuring out even a basic example was a bit tricky with LLVM's documentation.
Btw, is there a good learning pathway for LLVM/MLIR for folks w/o a compiler background? I come mostly from HW but am super interested in the topic due to work-related stuff. Since your day job is LLVM-related, maybe you can give great advice.
Part of the issue is that LLVM is stuck between two very hard places.
1. Needing to support tons of different platforms with various different documented and undocumented behaviours
2. Needing to support many languages with specifically undefined behaviour.
Trying to bridge the undefined nature of the two sides means that things can be fragile. Assumptions that worked for one set of undefined problems may not work for another.
But so much depends on LLVM these days that I can’t think of it as anything but a success. A flawed success but one that is doing its best to bring order to a naturally chaotic problem space.
GCC and MSVC have similar problems because it’s inherent to the problem space. So while it’s frustrating hitting those bugs, everyone in the space knows they aren’t fundamental issues with the project itself.
It's not obvious that "LLVM, but not incredibly frustrating" is a thing which can exist. I don't think it's likely, but it's possible that in a few decades the widespread view will be that the very concept of LLVM was a mistake, and a universal compiler backend is just a trap which makes it easy to bootstrap a language but inevitably causes massive problems down the road.
LLVM does have plenty of incidental problems which are clearly fixable and just need a lot of work, but even if you fixed all of them you'd still have people who use LLVM ranting about it.
MLIR is "LLVM done better", in fact by the same person. It fixes many of LLVM's unforced problems, for example LLVM's inability to parallelize code generation.
The fact that Andrew is even thinking about dropping LLVM says all I need to know. It's going to become the next D language of the 2020s. LLVM has decades of mistakes FIXED, and rewriting the first time, even the 6th time, is always... filled with huge mistakes and security issues... HUGE.
brew uninstall zig. That's what I did when I read it. I do not want to waste any of my time on soon-to-be-dustbin tech.
Does anyone have any good examples of Zig being used as a build system for a large project?
Coming from a Windows world where my gigantic C++ project is hidden inside a vcxproj I don't delve into that often, I tried to compile a standard autoconf-based project into wasm the other week and eventually gave up.
Use autoconf to create your cmake to create your make which builds your lib? And wrap all those calls with emscripten calls... or something along those lines. It was exhausting. Eventually it errored out because it couldn't decide whether the size of a size_t was available. I was tempted to just remove that line from the cmake.
As it is known, Uber - and some other heavily-capitalized firms (BigTech or otherwise) - use Zig (at least) for "straightforward" C++ compilation.
- What about asking your managers to seriously fund the Zig project?
It's only fair, given all the value you're getting from it. And then reinventing an important part of the huge wheel that is LLVM wouldn't be such a seemingly unfeasible tall order.
I don't think the proposal says that llvm will be dropped as an optimising backend. It also doesn't say that you can't call clang at compile time using some other mechanism. It just says the zig compiler will not depend on llvm.
Since you're already not even close to on-topic, I just want to say that people on the internet commenting about "how they misread the title" are some of the worst kind of comments.
Titles could mean all kinds of various, new things if you substituted meaningful words for other meaningful words. Or in this case, acronyms hiding several words at a time.
The final result of your comment is that apparently you're critiquing LLMs in a thread that has literally nothing to do with them, and seemingly trying to talk about them instead.
I thought it was funny. They had an experience that amused them and they shared it. Not everything has to be to your personal tastes.
If you think that's one of the worst kind of internet comments, you might want to recalibrate a bit. I have a variety of things from my anti-abuse work that I'd be happy to send you.
Your theory is that because I have seen abusive content, my sense of humor is now so broken that the things I find funny are objectively bad? That makes no sense at all.
Also, it's wild to see somebody who's been using this site for about 4% of the time I have telling me that I just don't get the context.
Sure, bub. Your personal preferences just happen to be objectively correct, and mine are due to my ignorance and brokenness. Whatever you say.
> Your theory is that because I have seen abusive content, my sense of humor is now so broken that the things I find funny are objectively bad?
No, I did not present this theory, in my actual comment or any implication it made.
My theory is that if your job is dealing with abusive comments, perhaps you aren't judging the non-abusive comments properly in terms of what constitutes a bad comment. All kinds of bad comments can be funny. Jokes are generally considered bad comments here on HN, for instance.
Your theory was unclear. Don't get cranky when people do their best to interpret your lack of clarity.
I did not say my job is dealing with abusive comments. If it were, your theory would still make no sense; being a professional judge of content makes you better at evaluating content, not worse. Some jokes are considered bad comments here, many aren't. Which is irrelevant, because the comment you're grumping about was not a joke.
And I will note that you carefully dodged the heart of my point, which is that you are confusing your personal preferences with objective standards. And although you've quibbled in the neighborhood of the point quite a bit, you haven't addressed the ridiculous notion that these "are some of the worst kind of comments".
If you'd like to complain about bad HN comments, I'd say being querulous is not great, and carefully avoiding dealing with substantive criticism is even worse.
Given the barrage of negative comments on the PR and here as well, it really feels like Andrew is the only person in favor of that change.
It's his language, obviously, but I strongly suspect that if this PR goes through, it will completely kill any kind of momentum that Zig has today and will most likely doom the language to obscurity.
Unless they sort out the issues related to use-after-free, the value proposition is hardly any better than using C and C++, given the memory-corruption tooling that's been around for the last 30 years.