Show HN: Simplexhc – Haskell to LLVM compiler (design phase) (pixel-druid.com)
55 points by bollu on June 25, 2017 | hide | past | favorite | 14 comments



> The way GHC currently compiles is to first reduce Haskell to a Core language, which is a minimal subset of Haskell, in some sense.

GHC Core is not a subset of Haskell; it's a simplification of it: the same thing, expressed with fewer constructs (constructors).

That being said, we completely agree that Spineless Tagless G-Machine is full of badassery. Just imagine being able to reduce any Haskell app to, like, 8 different instructions. Something about that fascinates me, even though I'm not quite sure what it is.


> Just imagine being able to reduce any Haskell app to, like, 8 different instructions.

It's a universal property of all programming languages that they can be reduced to a language with just one instruction, so it isn't a surprising result.


Compiling C to assembly code that uses only one instruction, namely mov: https://github.com/xoreaxeaxeax/movfuscator


Calling x86 'mov' a single instruction is a bit of a fiction to begin with, though.


There are other, more straightforward one-instruction instruction sets, such as subleq: subtract and branch if less than or equal to zero.


But the point is that STG preserves[1] the operational semantics of the program, not just the final calculated value.

[1] Indeed it defines the operational semantics


How badly does the function pointer technique affect interprocedural optimisation? Since LLVM would know the exact control flow, it should be able to do the optimisations anyway?

That match(int) technique seems naïve. How would you handle function arguments/application? The resulting function would also, presumably, be large enough either to be very bad for one's icache or to require more optimisation, which, if my previous paragraph is correct, would make this wasted effort?

Has this been profiled at all?


Regarding interprocedural optimization, an alternative is to return to the trampoline less frequently, as with Cheney on the MTA.

I wonder what it would take to add a wiki to lambda-the-ultimate.


As a person starting to learn Haskell, I only understood the steps down to machine code, and that you want to cut out the intermediate steps (the C-- step) and go directly to LLVM. If it makes Haskell better, you have my best wishes. Haskell is one of those languages that gets less attention than it deserves.


> Going to C-- and then from there going to LLVM loses a lot of the semantic information that one can provide to the LLVM optimiser.

What semantic information?


Types, for instance.


What type information exactly are you thinking of, that would be lost and that the optimizer could make use of?


The only type information I can think of is invariants from type definitions and their use (e.g. this ADT contains an Integer that is never less than 5). This information can be used to (say) constant fold branches, but it would have to be reinferred by the optimiser. However, I don't think this would do anything, because AFAIK LLVM can already infer this.

This could give a speed boost: LLVM has llvm.assume, with which one can annotate the IR; using it would mean the optimiser doesn't have to infer the invariant (and therefore saves time).


Great work! Minor typo: the first code snippet has a missing right parenthesis.



