Introduction to LLVM [video] (fosdem.org)
271 points by matt_d on Feb 6, 2018 | 23 comments



YouTube mirror, for those who'd like to add it to their "Watch later" list:

https://www.youtube.com/watch?v=VKIv_Bkp4pk


Was just about to do the same. Shame about the audio quality.


Or for those who want to easily watch on 2x speed.


mpv does a nice job with that: '[' and ']' step the playback speed down and up to any arbitrary value, and '.' steps frame by frame (with sound).


You're my hero


Slides and code examples: http://www.mshah.io/fosdem18.html


Mike says in his talk that LLVM started out with the goal of having an intermediate representation to run optimizations on. Hasn't this always been the case with compilers in recent times? I seem to recall similar things being said about the GNU Compiler Collection at a presentation many years ago.

If this is true, what is (was) the appeal of the LLVM project at the time of the project inception?


GCC’s passes, at least back then (I don’t know about today), used numerous intermediate representations, many of which were simply dumps of their internal data structs that the following pass happened to be able to parse (usually by including the header of the previous pass).

LLVM’s IR is the same “thing” all the way through the pipeline, and that thing is standardized separately from the compiler, so you can write tooling that processes it and expect it not to break with new LLVM versions.


That same “thing” also round-trips to a text representation that looks like assembly. That means people can more easily talk about it, and programmers can use it to write test cases or even to write code in it.

I think that has helped llvm tremendously.
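To make the round-trip idea concrete, here is a minimal sketch of a toy IR that prints to an assembly-like text form and parses back to the same in-memory structure. It is not LLVM's actual IR or API: the `Instr` class, the `to_text` and `from_text` helpers, and the textual syntax are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """One instruction of a hypothetical toy IR (not LLVM's real format)."""
    dest: str    # name of the value being defined, e.g. "%sum"
    op: str      # opcode, e.g. "add"
    args: tuple  # operand names or literals, kept as strings


def to_text(instrs):
    """Print each instruction as one assembly-like line."""
    return "\n".join(f"{i.dest} = {i.op} {', '.join(i.args)}" for i in instrs)


def from_text(text):
    """Parse the format produced by to_text back into Instr objects."""
    instrs = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in hand-written input
        dest, rhs = line.split(" = ", 1)
        op, _, operands = rhs.partition(" ")
        args = tuple(a.strip() for a in operands.split(",")) if operands else ()
        instrs.append(Instr(dest, op, args))
    return instrs


program = [
    Instr("%sum", "add", ("%a", "%b")),
    Instr("%twice", "mul", ("%sum", "2")),
]

text = to_text(program)
# The text form carries everything needed to rebuild the same structure.
assert from_text(text) == program
```

Because the text form is lossless in this sketch, a test case can be written by hand as text and fed straight to the parser, which is the property the comment above credits for making the IR easy to discuss and test.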


As I understand it, the difference is more political than technical. GCC is designed to prevent people from plugging nonfree parts into the compiler. So you cannot plug in your own dynamic library or use part of GCC as a library, which is an artificial restriction. See https://gcc.gnu.org/ml/gcc/2014-01/msg00247.html and https://channel9.msdn.com/Events/GoingNative/GoingNative-201...


If you really wanted to, I imagine you could write a proprietary compiler that generated some serialisation of one of GCC's IRs (GENERIC or GIMPLE), and you could write a deserialiser for GCC that reads in that IR. Release the deserialiser under GPL, then you can legally use GCC as the backend for your proprietary compiler.

This sort of IPC-like hackery often seems to happen when someone is looking to work around copyleft - they can then break the spirit of the GPL without breaking the legal obligations.


That's not too far from how the first LLVM frontend worked, IIUC. llvm-gcc was a GCC fork (pre-Clang) that produced LLVM bitcode or IR. That IR would be fed into an LLVM backend for code generation.


You would still need to use gcc code to operate on the deserialized IR, which is GPL. Unless you then transform the IR into a proprietary one, in which case that's not breaking the spirit of the GPL.


> You would still need to use gcc code to operate on the deserialized IR, which is GPL

The GPL's terms would not extend to the program that generated the IR.

A GPL web server can't require a web browser to relicense under GPL, either. Same idea.


By your logic, every single bit of code that has passed through gcc is now GPL licensed.


Not just political: gcc’s IR was designed in the early 1980s and “just growed.” It’s not easy to understand.


At its inception the appeal was the "multi-stage optimisation system" and the virtual instruction set design (from https://llvm.org/pubs/2002-12-LattnerMSThesis.html)


At the time, I had a DSL I had written in C++, and I wanted to produce proper compiled binaries. I had started looking at the now defunct TenDRA - I was thinking it would be nice to compile to a "machine-independent binary" which could be converted to machine code on the actual runtime system. But soon after I found LLVM, and it seemed like a no-brainer comparatively.

I seem to remember that "machine-independent binaries" was Apple's first use of LLVM: distributing LLVM IR and having it be converted to machine code on the user's computer, back when they were supporting Motorola and Intel chips. And I think consequently that's how LLVM got a lot of its momentum.


> Mike says in his talk that LLVM started out with the goal of having an intermediate representation to run optimizations on. Hasn't this always been the case with compilers in recent times?

Not just recent times. The dragon book from 1986 covers IRs.


> Hasn't this always been the case with compilers in recent times?

Yes. LLVM used SSA form earlier than GCC did, and it makes its intermediate representation more accessible than GCC or other compilers do. But the general idea of having a mid-end IR for optimizations was not at all novel at the time that LLVM appeared. I think the point was mainly to emphasize the fact that LLVM IR allows more optimizations than the somewhat comparable IR of JVM bytecode.
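For readers unfamiliar with SSA (static single assignment) form, here is a minimal sketch of the core idea: every variable is assigned exactly once, so reassignments become fresh numbered names. This is a toy renamer for straight-line code only, not how LLVM or GCC build SSA. The `to_ssa` function and the tuple-based statement format are assumptions for illustration, and real compilers also need phi nodes at control-flow joins, which this omits.

```python
def to_ssa(statements):
    """Rename variables so every name is assigned exactly once.

    statements: list of (target, op, operands) tuples for straight-line code.
    Branches would additionally need phi nodes, which this sketch omits.
    """
    version = {}  # variable name -> latest version number assigned so far
    result = []

    def use(name):
        # Literals and never-assigned inputs are left unchanged.
        return f"{name}{version[name]}" if name in version else name

    for target, op, operands in statements:
        # Read operands first, so "x = x * 2" uses the previous version of x.
        renamed = tuple(use(o) for o in operands)
        version[target] = version.get(target, 0) + 1
        result.append((f"{target}{version[target]}", op, renamed))
    return result


code = [
    ("x", "add", ("a", "b")),
    ("x", "mul", ("x", "2")),
    ("y", "sub", ("x", "a")),
]
ssa = to_ssa(code)
for target, op, operands in ssa:
    print(f"{target} = {op} {', '.join(operands)}")
```

After renaming, each use points at exactly one definition, which is what makes many optimizations simpler to write on an SSA-based IR.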


I attended this talk, and it was one of the highlights of the conference for me. Mike is a great speaker and I found myself inspired to dive deeper into LLVM internals afterwards.


Shame the mic is so hot, it's really hard to listen to.





