Mapping High-Level Constructs to LLVM IR (lyngvig.org)
106 points by jcr on Nov 9, 2014 | 16 comments



"does not differentiate between signed and unsigned integer"

This isn't really accurate. It's accurate at some level, in that there are not two different types. But that's because two types aren't necessary: they have nsw/nuw (no signed wrap, no unsigned wrap) flags on the operations.
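To illustrate the "signedness is on the operations, not the types" point, here is a minimal C sketch. The function names are made up for illustration. Both views use the same 32-bit width, and the only thing that differs is which operation the compiler picks; a front end such as clang would typically emit sdiv / icmp slt for the signed versions and udiv / icmp ult for the unsigned ones before optimization.

```c
#include <stdint.h>

/* Signed view: division rounds toward zero and respects the sign bit.
   Typically lowered to an sdiv on an i32. */
int32_t div4_signed(int32_t x) { return x / 4; }

/* Unsigned view of the same width: typically lowered to a udiv on an i32. */
uint32_t div4_unsigned(uint32_t x) { return x / 4u; }

/* Comparisons follow the same pattern: one integer type in the IR,
   two different predicates (signed vs. unsigned less-than). */
int less_than_one_signed(int32_t x) { return x < 1; }
int less_than_one_unsigned(uint32_t x) { return x < 1u; }
```

Feeding the bit pattern 0xFFFFFFF0 to both views gives -4 from the signed division (the pattern reads as -16) and 1073741820 from the unsigned one, even though the operand width is identical.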


It is necessary - you cannot reconstruct your function prototypes from the IR declarations. That's why, for example, SPIR provides signedness information in the metadata. This ugly hack would not be necessary if IR had explicitly signed types.


"It is necessary - you cannot reconstruct your function prototypes from the IR declarations. "

Why would you want to do this? You are trying to reconstruct high level, language specific info from low level, language independent info. Of course you can't do this.

Your comment comes out to "lowering loses information", which is, of course, true.

So let's try again: What semantic information do you believe is lost here? IE where does it result in an incorrect translation, or the inability to optimize something?

Because that's what "necessary" would come out to.


> Why would you want to do this?

Because SPIR requires that you can load and enqueue an OpenCL kernel from an IR, without having a source. And you need this type information in order to be able to format your arguments properly.

Please note that I'm not commenting here on the SPIR committee's choice of LLVM IR as a common medium.

> Your comment comes out to "lowering loses information", which is, of course, true.

Exactly. But the current common uses of LLVM IR do require some of the information which is lost, and, in case of signedness, it was not even necessary to lower it.

> IE where does it result in an incorrect translation, or the inability to optimize something?

You're limiting IR uses to translation and optimisations. Fine. I would have welcomed this way of thinking. But, unfortunately, this is not the case.


" But the current common uses of LLVM IR"

Portability of a high level language is an explicit non-goal of LLVM :)

The fact that someone decided to do it just makes them silly :)


I know it's silly. I can go on forever on what I think about SPIR and RenderScript. But, unfortunately, now it's a fact, an objective reality we have to deal with.


Nice but damn, I'm about to implement closures in llvm, but there's only a TODO at that point ;)


Closures are trivial: a structure with a function pointer as a first element (i.e., the structure alignment is function pointer-compatible) and the captured environment following it. A pointer to this structure is passed as the first argument.

The only potentially funny bit with closures is the construction of a set of potentially mutually recursive closures - in that case you have to defer filling in the corresponding environment fields until all the closure structures are allocated.
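A rough sketch of that layout in C, in case it helps. All names here (struct closure, make_adder, make_parity_pair, and so on) are invented for the example, and a real compiler would emit the equivalent LLVM IR structs and calls rather than C. The first part is a closure capturing one int; the second shows the deferred fill-in for two closures that capture each other.

```c
#include <stdlib.h>

struct closure;
typedef int (*closure_fn)(struct closure *self, int arg);

/* Common header: the code pointer is the first element. */
struct closure { closure_fn code; };

/* Calling convention: the closure itself is passed as the first argument. */
int call_closure(struct closure *c, int arg) { return c->code(c, arg); }

/* A closure capturing a single int: header first, environment after it. */
struct adder { struct closure base; int captured_n; };

static int adder_code(struct closure *self, int arg) {
    struct adder *env = (struct adder *)self; /* header sits at offset 0 */
    return env->captured_n + arg;
}

struct closure *make_adder(int n) {
    struct adder *a = malloc(sizeof *a);
    if (!a) return NULL;
    a->base.code = adder_code;
    a->captured_n = n;
    return &a->base;
}

/* Two mutually recursive closures: each captures a pointer to the other. */
struct parity { struct closure base; struct closure *other; };

static int even_code(struct closure *self, int n) {
    struct parity *env = (struct parity *)self;
    return n == 0 ? 1 : call_closure(env->other, n - 1);
}

static int odd_code(struct closure *self, int n) {
    struct parity *env = (struct parity *)self;
    return n == 0 ? 0 : call_closure(env->other, n - 1);
}

/* Allocate every structure first, then fill in the environment fields,
   because neither pointer exists until both allocations are done. */
int make_parity_pair(struct closure **even_out, struct closure **odd_out) {
    struct parity *even = malloc(sizeof *even);
    struct parity *odd = malloc(sizeof *odd);
    if (!even || !odd) { free(even); free(odd); return -1; }
    even->base.code = even_code;
    odd->base.code = odd_code;
    even->other = &odd->base; /* deferred fill-in */
    odd->other = &even->base;
    *even_out = &even->base;
    *odd_out = &odd->base;
    return 0;
}
```

With this, call_closure(make_adder(5), 37) yields 42, and the parity pair answers even/odd questions for non-negative n by bouncing between the two structures. Freeing the closures is left out to keep the sketch short.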


What language are you working on?

I guess for closures you simply copy all the locals that they capture into a heap allocated structure that also has the pointer to the closure code.


I have never implemented this, so I'm sure this will be incomplete and possibly slightly inaccurate, but that is not quite true. Some of the issues: copying may not be appropriate if your data is mutable (multiple closures can share a value or it might be modified in the 'regular' code after the closure is created), or if the code does identity checks later on.

Also, for performance, it may be beneficial to skip the 'put a local on the stack' part and create a to-be-captured object directly on the heap.
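A small C sketch of the sharing issue, with hypothetical names (int_box, make_incrementer, make_reader): the captured local lives in a heap cell from the start, and each closure stores a pointer to that cell instead of a copy, so a write through one closure or through the 'regular' code is visible to the others.

```c
#include <stdlib.h>

/* Heap cell holding a mutable local that closures capture by reference. */
struct int_box { int value; };

struct box_closure {
    int (*code)(struct box_closure *self); /* code pointer first */
    struct int_box *cell;                  /* shared, not copied */
};

static int increment_code(struct box_closure *self) {
    return ++self->cell->value;
}

static int read_code(struct box_closure *self) {
    return self->cell->value;
}

/* The local is created directly on the heap, skipping the stack slot. */
struct int_box *int_box_new(int initial) {
    struct int_box *b = malloc(sizeof *b);
    if (b) b->value = initial;
    return b;
}

static struct box_closure *make_box_closure(
        int (*code)(struct box_closure *), struct int_box *cell) {
    struct box_closure *c = malloc(sizeof *c);
    if (!c) return NULL;
    c->code = code;
    c->cell = cell;
    return c;
}

struct box_closure *make_incrementer(struct int_box *cell) {
    return make_box_closure(increment_code, cell);
}

struct box_closure *make_reader(struct int_box *cell) {
    return make_box_closure(read_code, cell);
}
```

Had each closure copied the int instead, the reader would keep seeing the value from the moment it was created, which is the failure mode described above. Ownership of the cell (who frees it, and when) is deliberately ignored here.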

I see somebody posted https://news.ycombinator.com/item?id=8580501, which points to the Wikipedia entry on 'spaghetti trees', the conceptual view on the needed data structures (which one may recognize from reading SICP, although I do not remember it using the term)

Many implementations, at runtime, will implement the 'main line' of the tree as a 'real stack', but that can be risky, as you will have to make sure that no closures survive the point where any locals they refer to get removed from the stack (what does C++ do here? Declare it undefined behaviour or make it impossible?)


It's an in-house shader language that is compiled to different targets, one of which is LLVM, another is GLSL. In our case you are probably right, but it would have been nice to see a simple example. Of course there are many implementations in other open source languages, but it's much more work to analyse such large code bases.


You can always check how Haskell, Rust and Clojure-LLVM do it, just to cite three examples.


Can't wait until Apple discloses more information about the Swift IR that compiles to LLVM IR. Too bad they canceled this year's LLVM Devmeeting keynote.


I'm guessing that it's only conventions to work around the hardware specific parts of LLVM IR. Stuff like pointer and integer size, etc.

If you want to look at something similar, OpenCL has a standardized binary IR based on LLVM IR.


Actually, this is very wrong :)

SIL has a very large number of differences from LLVM IR. It's essentially built as an IR that they can do static analysis and high level optimization on.

This means it has a number of higher level constructs that LLVM doesn't, in order to be able to achieve the semantics they want for static analysis, and in order to be able to do things like optimize dispatch.


Just check the Haskell LLVM backend instead.



