6502 back end for LLVM (2022) [video] (youtube.com)
159 points by Turing_Machine 10 months ago | 30 comments



My compliments for a very good 10-minute talk! I don’t think I would have clicked play on an hour-long presentation.

Also, I particularly like how they hacked the LLVM workflow to deal with the registers. I don’t know LLVM at all; it’s just a toolchain option to me, but this was interesting and engaging.


There is also a talk on the M68k backend in LLVM:

https://youtu.be/-6zXXPRl-QI


Huge fan of LLVM-MOS. I have a friend who built a W65C02S based console (gametank.zone), and I've been able to get some Rust demos running on it :)


For those who want to read further:

https://llvm-mos.org/wiki/Welcome


Static stack allocation is the approach that the 6502 really demands, and it's cool to see in a conventional compiler toolchain. See the Cowgol language for another example: https://cowlark.com/cowgol/
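
To make the idea concrete, here is a hand-written sketch of what "static stack" allocation amounts to; this only illustrates the concept and is not what llvm-mos or Cowgol actually emit:

    #include <stdint.h>

    /* With static stack allocation, the locals of a non-reentrant
       function live at fixed addresses instead of on a runtime
       (software) stack. On the 6502 this avoids slow pointer-relative
       addressing; the trade-off is that the function must never be
       live twice at once (no recursion, no reentrancy), which the
       compiler has to establish from the whole-program call graph. */
    uint8_t sum_row(const uint8_t *row, uint8_t len) {
        static uint8_t i;    /* fixed address, reused on every call */
        static uint8_t acc;
        acc = 0;
        for (i = 0; i < len; i++)
            acc += row[i];
        return acc;
    }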


There's another that uses what they call a "compiled stack" which I believe is the same concept. https://www.dustmop.io/blog/2019/09/10/what-remains-technica...


That requires the compiler to be aware of non-reentrant functions, though. It might be viable in the small-scale codebases that are common in embedded work, which is where a 6502 would most likely be used.


I used this to build a NES game in Rust https://github.com/kirjavascript/rust-nes-tmp


(lazy DDG): is there a tutorial that explains how to compile for the Apple II (+ or e)?


It would be great to hear how they've progressed, since this video was filmed a year ago, toward polishing and preparing their work for mainline inclusion.


Once again, my newbie test: I have installed it and generated code. Now how to run the code is my problem.

The doc tells you to use one of the many simulators, but which one, I wonder? For example, there is code generated for a 6502 simulator; which one should I use? It doesn't say, and saying that one should already know is not that helpful. Just one more example link and it would all be good.


Finally found after one day that it is quite easy: download VICE, right-click the program file generated for the C64, and open it with VICE's x64sc. It will output the Hello World message (as part of the BASIC).


m68k got merged. I am hopeful the 6502 will be too.

pic12/16/18 next? (a man can dream...)


It seems unlikely that it'll ever get fully upstreamed, but according to a post on the Discord that I'm pasting in full below, parts of it may well get upstreamed. I do a lot of NES-related development in my free time, and llvm-mos has been awesome for rapid development. I'd love to see parts of this get upstreamed in the hopes that it could reduce the maintenance burden on the small team, but I'm not trying to speak for them or anything haha.

> So, I wanted to do a little blurb on the topic of upstreaming LLVM. My previous answer to this question was "yeah, we'd like to, but we have more work to do." This implied that we were working on it. More accurate to reality is that we were keeping it in the back of our heads and doing work to decrease the diff from upstream. The latter is also useful for making merges from upstream easier, and that's closer to the real reason I was doing it.

> Well, I've lost some rather high-profile fights upstream. In particular, upstream now strips out __attribute__((leaf)) from LTO builds, which is the whole thing that makes our static stack solution work. I personally think this decision was totally bogus, and I wasn't alone in this, but the conservative voice won out. My experience with the LLVM community so far has been one of deep conservatism; the stranger you are, the more you need to justify your strangeness with real-world impact. We're a very strange hobby project, which just doesn't bode well. We could make our backend a lot less strange by making it a lot less good, but then it becomes impossible to compete with incumbents like KickC and cc65.

> Accordingly, I'm not keeping the goal of maintaining llvm-mos entirely upstream in the back of my head anymore. I don't oppose work along those lines, unless it interferes with making llvm-mos the best 6502 compiler we can.

> That being said, LLVM may independently change to be more amenable to us, so this may become easier in the future. This has already happened prior to us, with GlobalISel and AVR making development of the backend far simpler. If that happens, I'll definitely reexamine my opinion on this.

> Alternatively, I'd definitely be open to upstreaming the unobjectionable parts of the llvm-mos backend; we could then maintain the actual distribution as an increasingly thin fork from upstream. In fact, we could probably get started on that project today; I haven't yet spent much time considering the idea, but I'm starting to like it more and more, since it gives increased visibility, easier merges, and an excellent reference backend for upstream documentation. (We're really nice once you strip away all the crazy!)
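
For readers who haven't met the attribute mentioned above: a rough sketch of why __attribute__((leaf)) matters for the whole-program call-graph analysis behind static stacks. The function names are made up, and this is not llvm-mos code:

    /* leaf on an external declaration promises that the callee returns
       to this translation unit only by returning or unwinding, i.e. it
       will not call back into functions defined here. That keeps the
       call graph closed: a call to kernal_print() cannot secretly
       re-enter frame() or reach update_sprites(), so their locals can
       safely live at fixed static addresses. If LTO strips the
       attribute, the analysis must assume the external call could
       re-enter anything in this unit. */
    extern void kernal_print(const char *s) __attribute__((leaf));

    void update_sprites(void) { /* ... */ }

    void frame(void) {
        kernal_print("score");   /* known not to re-enter this unit */
        update_sprites();
    }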


It might be more accurate to say that OP "lost" that fight because the existing semantics of that LLVM attribute were half-baked, and would have resulted in miscompilation bugs if preserved in LTO. That's something that can be fixed, at least in principle, but it requires adding some other extension to LLVM IR that's closer to what OP is looking for.


The story is a little more complex than that; the semantics were internally consistent, but foot-gun-ey, and they could technically be taken to match GCC's documentation of how they behaved, depending on an unfortunately ambiguous phrase in their docs.

The old semantics also matched GCC's actual behavior; when we brought this up in a GCC issue, those present decided that GCC's behavior was wrong, but the appropriate maintainer couldn't be reached for a final say. The issue is still hanging there.

There were also a few other folks trying to do the same kind of whole-program call graph analysis this enables, IIRC for GPU purposes. So there were a lot of conflicting opinions about how this should work and a lot of uncertainty: all the ingredients for a big, long, endless thread.

EDIT: This is of course my extremely biased take on the proceedings. This was also the first and only "open source kerfuffle" I've so far been a direct party to; I've seen these come and go on mailing lists before, but I was surprised how different it felt to actually be inside one.


Even though it might make oldschool/retro developers recoil, I fancy the notion of being able to develop for older systems using modern languages and tools.

I know that even using C instead of ASM on these 8-bit systems typically results in painful size and performance penalties though. Are there effective tools/techniques for narrowing that gap?


After watching it (10 minutes is really limited), I guess we should point, as the last slide does, to https://llvm-mos.org/wiki/Welcome


As far as I know, LLVM uses 32-bit registers and operations even if you have a type like u8 or u16. Is this problem solved, or will bool and u8 still use 4 bytes on the 6502?


LLVM does this on x86 because partial register writes don't break dependency chains in many cases, meaning that you can get stalls due to false dependencies.

There's nothing in LLVM itself that makes it use larger sizes; it just depends on the ABI, and on what's fastest when the ABI doesn't matter.
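
To make the distinction concrete: the in-memory types stay one byte either way; the question is only which register width the backend computes in. A small, target-agnostic check (plain C, nothing llvm-mos-specific):

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* bool and uint8_t occupy one byte in memory on any common target,
       llvm-mos included. The answer above is about codegen: x86 will
       often do this add in a 32-bit register to avoid partial-register
       stalls, while the 6502 backend keeps it as genuine 8-bit ops. */
    uint8_t add8(uint8_t a, uint8_t b) { return (uint8_t)(a + b); }

    int main(void) {
        printf("sizeof(bool)=%zu sizeof(uint8_t)=%zu\n",
               sizeof(bool), sizeof(uint8_t));
        printf("add8(200, 100) = %u\n", add8(200, 100)); /* wraps to 44 */
        return 0;
    }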


We must have a Rust Apple IIe target, damnit! :) But will Hello, World fit on a floppy or will it be all stacktrace symbols even with no_std?


Build in release, optimize for size. LLVM-MOS produces some remarkably good assembly. Rust is still missing some optimizations though, since rust-mos is a few years out of date.


Amazing what he could present in 10min. I’d like to see a longer version.


I'm waiting for the day someone manages to get AI to do the job of a compiler...

It's easy to find a sequence of instructions to implement a program, but the challenge is finding a good, small, fast sequence of instructions. The difference between current compilers and the best possible sequence is often 10x, and that difference is worth tens of billions of dollars - since you can reduce your CPU budget by 10x if all your programs run 10x faster.

Searching all possible sequences is infeasible, but it seems very practical to get AI to assist with some kind of directed search. Either additively (starting with an empty program and adding instructions till it correctly implements the programmer's program), or subtractively (use a dumb compiler to make an inefficient program, and then use some ML model to decide how to adjust the instruction sequence to be smaller/faster, while remaining correct).
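
As a toy of the "additive" variant: exhaustively enumerate programs over a made-up three-instruction ISA, shortest first, and accept the first one that matches a reference function on every input. The mini-ISA and the spec are invented for illustration; an ML model would replace the blind enumeration with a guided search:

    #include <stdint.h>
    #include <stdio.h>

    /* Toy single-accumulator ISA, invented for illustration. */
    typedef enum { OP_ADD1, OP_SHL1, OP_NEG, OP_COUNT } Op;

    /* Spec we want to synthesize: f(x) = 2*x + 2 (mod 256). */
    static uint8_t spec(uint8_t x) { return (uint8_t)(2 * x + 2); }

    static uint8_t run(const Op *prog, int len, uint8_t x) {
        for (int i = 0; i < len; i++) {
            switch (prog[i]) {
            case OP_ADD1: x = (uint8_t)(x + 1);  break;
            case OP_SHL1: x = (uint8_t)(x << 1); break;
            case OP_NEG:  x = (uint8_t)(-x);     break;
            default: break;
            }
        }
        return x;
    }

    static int matches_spec(const Op *prog, int len) {
        /* Exhaustive over a uint8_t domain; a real superoptimizer
           would need a solver or a proof instead of testing. */
        for (int x = 0; x < 256; x++)
            if (run(prog, len, (uint8_t)x) != spec((uint8_t)x)) return 0;
        return 1;
    }

    int main(void) {
        static const char *names[] = { "ADD1", "SHL1", "NEG" };
        Op prog[4];
        for (int len = 1; len <= 4; len++) {            /* shortest first */
            long total = 1;
            for (int i = 0; i < len; i++) total *= OP_COUNT;
            for (long code = 0; code < total; code++) { /* enumerate programs */
                long c = code;
                for (int i = 0; i < len; i++) { prog[i] = (Op)(c % OP_COUNT); c /= OP_COUNT; }
                if (matches_spec(prog, len)) {
                    printf("found length-%d program:", len);
                    for (int i = 0; i < len; i++) printf(" %s", names[prog[i]]);
                    printf("\n");
                    return 0;   /* expected: ADD1 SHL1, i.e. 2*(x+1) */
                }
            }
        }
        printf("no program of length <= 4 found\n");
        return 0;
    }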


There seems to be a lot of research into compiler optimizations guided by machine learning: https://github.com/zwang4/awesome-machine-learning-in-compil...


> starting with an empty program and adding instructions till it correctly implements the programmers program

This would require an algorithm to check if the output program and input program have identical behavior. However, this is impossible as per Rice's theorem.


> However, this is impossible as per Rice's theorem.

You moved a _little_ too quickly.

There exist program pairs which can be proven equal, and others for which no proof exists. You can organize the production of new programs into finite steps, and likewise organize the construction of an equivalence proof between input and output into finite steps; then you execute one step of creating a new candidate program, followed by one step of proof search for each of the (finitely many) candidates created so far. In this way you are guaranteed to find an output program and its equivalence proof whenever such an (input program, output program, equivalence proof) tuple exists.

Finding the equivalence proof is recursive enumeration, the same as creating the program candidates, but over some machine-verifiable proof language.

Speeding this up by leaving out syntactically incorrect programs and equivalent duplicates, as well as defining the proof language and implementing the checker, is left as an exercise for the reader.
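
A sketch of the interleaving described above. The two helpers are meaningless stand-ins (a real system would enumerate all programs and all machine-checkable proofs); they exist only so the diagonal order is visible when you run it:

    #include <stdbool.h>
    #include <stdio.h>

    /* Stand-ins: candidate(i) would be the i-th program in a fixed
       enumeration of all programs, and check_proof(i, j) would check
       whether the j-th string, read as a machine-checkable proof,
       shows candidate(i) equivalent to the input program. Here the
       "proof" is arbitrarily found at (i=3, j=5) just to terminate. */
    static const char *candidate(unsigned i) { (void)i; return "<candidate program>"; }
    static bool check_proof(unsigned i, unsigned j) { return i == 3 && j == 5; }

    int main(void) {
        /* Dovetailing: walk diagonals n = i + j, so every (program,
           proof) pair is reached after finitely many steps. If some
           candidate has a finite equivalence proof, the loop halts
           with it; otherwise it runs forever. Rice's theorem only
           rules out a procedure that always halts. */
        for (unsigned n = 0;; n++) {
            for (unsigned i = 0; i <= n; i++) {
                unsigned j = n - i;
                printf("trying candidate %u with proof %u\n", i, j);
                if (check_proof(i, j)) {
                    printf("found: %s\n", candidate(i));
                    return 0;
                }
            }
        }
    }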


Researchers have been using machine learning to identify potential compiler optimizations for many years now. Naturally the problem with doing so is somehow ensuring, with a high degree of confidence, that such optimizations are correct wherever they are used.


Compilers have bugs of course, but I would be very surprised if any workload worth billions of dollars has a 10x problem. All the FAANGs spend millions of dollars on their respective compiler groups, and it is their job to profile and root out such problems.

Often they find fixes that improve this or that internal workload by 0.2% and consider that a big win.


If you have five orders of magnitude more computing power to spend stumbling into constraint-optimization solutions versus using very good constraint optimizers, go ahead and build it, while being 10^5 times slower and more expensive for no value.




