Hacker News
Monads to Machine Code (Part 1) (stephendiehl.com)
151 points by primodemus on Jan 5, 2016 | 12 comments



What's cool about monads is that the computation (assembly code in this particular case) is just a plain old value.

So if you have a fragment like this:

    factorial :: Int64 -> X86 ()
    factorial n = do
      mov rcx (I n)
      mov rax (I 1)
      l1 <- label
      mul rcx
      loop l1
      ret
you can just compose this into a bigger program:

    two_factorials :: Int64 -> Int64 -> X86 ()
    two_factorials n1 n2 = do
      factorial n1
      factorial n2
Notice how the labels won't get mixed up. This concept (representing computation as values) is incredibly powerful: you can compose, combine, and abstract values into bigger values. For example, compare the compositional properties of RethinkDB's query language to SQL.
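One way to see why the labels can't collide is to sketch such a monad yourself. This is a hypothetical reconstruction, not the article's actual types: `X86` here is just a State monad over a fresh-label counter and an instruction list, so every call to `label` hands out a new number.

```haskell
-- Minimal sketch (not the article's implementation): an assembler monad
-- as State over (next fresh label, instructions emitted so far).
import Control.Monad.State

type Label = Int

data Instr = Mov String Int | Mul String | Loop Label | Lbl Label | Ret
  deriving (Show, Eq)

type X86 = State (Int, [Instr])

-- Append one instruction to the stream.
emit :: Instr -> X86 ()
emit i = modify (\(n, is) -> (n, is ++ [i]))

-- Mark the current position with a fresh label and return it.
-- Because the counter lives in the monad's state, composed fragments
-- can never reuse each other's labels.
label :: X86 Label
label = do
  (n, is) <- get
  put (n + 1, is ++ [Lbl n])
  return n

factorial :: Int -> X86 ()
factorial n = do
  emit (Mov "rcx" n)
  emit (Mov "rax" 1)
  l1 <- label
  emit (Mul "rcx")
  emit (Loop l1)
  emit Ret

assemble :: X86 () -> [Instr]
assemble m = snd (execState m (0, []))
```

Running `assemble (factorial 5 >> factorial 3)` yields one instruction list in which the two loops carry distinct labels (0 and 1), which is exactly the "computation is a plain old value" point.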


Correct me if I'm wrong, but isn't this basically the same as an HTML DSL, i.e. functions which transform something like (in LISP)

  (html
    (head
      (title "hello"))
    (body
      (p "world")))
into "<html><head><title>hello</title></head><body><p>world</p></body></html>"? Except this takes assembly keywords and outputs machine code.
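The same analogy can be sketched monadically in Haskell: a hypothetical `Html` monad (loosely modeled on libraries like blaze-html; the `tag`/`text` helpers here are made up) whose actions append markup, so nesting do-blocks nests tags.

```haskell
-- Sketch of the analogy: an Html monad whose actions append markup.
import Control.Monad.Writer

type Html = Writer String

-- Wrap a nested builder in an opening and closing tag.
tag :: String -> Html a -> Html a
tag t body = do
  tell ("<" ++ t ++ ">")
  r <- body
  tell ("</" ++ t ++ ">")
  return r

text :: String -> Html ()
text = tell

page :: Html ()
page = tag "html" $ do
  tag "head" (tag "title" (text "hello"))
  tag "body" (tag "p" (text "world"))

render :: Html () -> String
render = execWriter
```

`render page` produces the flat string `"<html><head><title>hello</title></head><body><p>world</p></body></html>"`, just like the Lisp example; the X86 monad in the article does the same with instructions instead of tags.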


Yes, there is some similarity in the expressive power of Lisp macros and Haskell monads (edit: raw Lisp macros are even more powerful, since they can glue and transform anything into anything). However, there is a huge difference in essence: Lisp macros work at the syntactic level, while monads compose at the semantic level.

Lisp macros are not bound by types, so they can be combined regardless of a program's semantics, potentially producing any kind of incorrect code as a result. Such errors get especially tricky with several layers of quasiquotation.

In Haskell, a program's meaning is partially encoded in its types, so the compiler understands your intention much better and assists you in getting it right. Getting some kinds of inter-part optimization is also much easier in Haskell: you know exactly what kinds of pieces you are gluing together.


Yes, and you can even put whole-block optimizations between the statements being glued (that's the magic of custom composition primitives, i.e. monads).
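Since the assembled program is just a value, such an optimization is an ordinary function over it. A minimal sketch (the `Instr` type and the pass are hypothetical, not from the article) is a peephole rewrite that drops a push immediately followed by a pop of the same register:

```haskell
-- Hypothetical instruction type for the sketch.
data Instr = Push String | Pop String | Mov String String
  deriving (Show, Eq)

-- Peephole pass: `push r; pop r` is a no-op, so delete the pair and
-- keep scanning; everything else passes through unchanged.
peephole :: [Instr] -> [Instr]
peephole (Push r : Pop r' : rest)
  | r == r' = peephole rest
peephole (i : rest) = i : peephole rest
peephole [] = []
```

For example, `peephole [Push "rax", Pop "rax", Mov "rbx" "rcx"]` reduces to `[Mov "rbx" "rcx"]`. Because the monad produces a plain list, a pass like this can be slotted in anywhere before the bytes are emitted.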


Always fun to see this type of thing implemented in Haskell.

Here are some other examples.

This one uses indexed monads to keep the instruction stream correct:

https://intoverflow.wordpress.com/2010/05/21/announcing-pote...

And this one is very similar to the OP but from 2013, and emits 6502 assembly: http://wall.org/~lewis/2013/10/15/asm-monad.html (of course without the JIT portion).


A bunch of people on HN were asking where to learn the conventions of x86 ASM - this is actually a pretty good place to start. He picked the 8 important registers and the 3 forms of addressing, demonstrated the hi/lo (ah/al) -> ax -> eax -> rax evolution, and kept to the acc, count, data, GP conventions for eax, ecx, edx, ebx, which is often overlooked (you can tell which era of x86 asm I grew up in ;)).

I'm scared this has the potential to be information overload for people who aren't familiar with an assembly language and Haskell, but for me this was a terribly neat (both in the "interesting" and the "structurally well done" sense) article. As usual, fantastic work by Stephen. Up there with Bartosz and Yang in quality of write-ups.


I don't know Haskell (yet) but I do have a small amount of assembly experience, and I managed to follow along fairly well. However, I deliberately didn't worry about fully understanding the monadic side of things.


Awesome article!

Seems like a reasonably complete assembler too, with one notable exception: forward jumps, conditional ones specifically. These may be a tad tricky to fit into a monad, because they require either knowing the future (e.g. how far away the jump target is) or rewriting the past (e.g. leaving a hole and filling it in later).

I'd really like to see both approaches in Haskell, since they're kinda different. My guess is that in the former case you'd have nested/delimited monads with buffers. However, I don't know Haskell well enough to figure out how to accomplish the hole-filling approach.

EDIT: On second thought, I guess both approaches (traditionally "two-pass" vs. "one-pass" assemblers) can be accomplished by making labels opaque rather than addresses, and storing a symbol table in either JITMem or the two types that would replace it for a two-pass approach.
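The two-pass idea can be sketched with opaque labels and a symbol table. Everything below is hypothetical (independent of the article's JITMem type): pass one records each label's address, pass two resolves symbolic jump targets against that table, so forward jumps fall out for free.

```haskell
import qualified Data.Map as Map

-- Symbolic program: labels are opaque names, not addresses.
data Instr = Jmp String | Define String | Op String

-- Resolved program: jump targets are concrete addresses.
data Resolved = RJmp Int | ROp String
  deriving (Show, Eq)

assemble :: [Instr] -> [Resolved]
assemble prog = concatMap resolve prog
  where
    -- Pass 1: walk the program counting addresses; a Define occupies
    -- no space, it just names the address of whatever follows it.
    symtab = Map.fromList (go 0 prog)
    go _ [] = []
    go addr (Define l : rest) = (l, addr) : go addr rest
    go addr (_ : rest) = go (addr + 1) rest
    -- Pass 2: replace each symbolic target with its recorded address.
    resolve (Define _) = []
    resolve (Jmp l)    = [RJmp (symtab Map.! l)]
    resolve (Op s)     = [ROp s]
```

A forward jump like `assemble [Jmp "end", Op "nop", Define "end", Op "ret"]` resolves to `[RJmp 2, ROp "nop", ROp "ret"]` even though `"end"` is defined after the jump that uses it.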


Look here: http://www.eliza.ch/doc/wadler92essence_of_FP.pdf (page 11)

The state in the monad in the quoted paper flows backward. The part of an assembler that works with label addresses can use that trick from 1992.


I haven't looked at this one in detail, but MonadFix and the recursive do syntax (https://wiki.haskell.org/MonadFix) will probably solve this problem by allowing forward references.
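A small sketch of what that looks like, under the assumption that the assembler is the lazy State monad (whose MonadFix instance ties the knot); the `emit`/`here` helpers are made up for illustration. With RecursiveDo, a label can be bound before the code that defines it:

```haskell
{-# LANGUAGE RecursiveDo #-}
-- Control.Monad.State defaults to the *lazy* State monad, which has a
-- MonadFix instance; that is what makes mdo's forward reference work.
import Control.Monad.State

type Addr = Int

-- State is (current address, emitted lines).
emit :: String -> State (Addr, [String]) ()
emit s = modify (\(a, out) -> (a + 1, out ++ [s]))

-- Return the current address.
here :: State (Addr, [String]) Addr
here = gets fst

program :: State (Addr, [String]) ()
program = mdo
  emit ("jmp " ++ show end)  -- forward reference: `end` is bound below
  emit "nop"
  end <- here                -- address computed after the jump is emitted
  emit "ret"

run :: [String]
run = snd (execState program (0, []))
```

This works because the address arithmetic never depends on the rendered strings, so the recursive binding is productive; `run` comes out as `["jmp 2", "nop", "ret"]` with the forward target already filled in.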


That'll do for the hole-filling approach :-)


You can do even more with Arrows. This very short presentation goes from Monads to Arrows, and explains why: https://www.dropbox.com/s/e3la5wueg1zsid7/Arrows%20for%20CPU...



