Minias – A mini x86-64 assembler for fun and learning

denton-scratch · on Oct 16, 2021

It's been a long time since I last wrote 8086 assembler.

The OP speaks of the assembler not freeing allocated memory, and refers to "manual calls to free". Back in the day, memory management was the programmer's responsibility; I don't know how an assembler might divine the programmer's intention.

Like, you could allocate stack memory in assembler, and deallocate it simply by returning. But heap management would be the responsibility of some library or OS (and the stuff I wrote ran on bare metal, this was before Windows).

ac42dgu · on Oct 16, 2021

I read that part as the assembler itself never freeing memory during its work assembling. The assembled program can do whatever including using the stack and manual memory management.
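For anyone wondering what "never freeing" looks like in practice, here is a minimal sketch assuming a simple bump allocator. The names `arena_alloc`, `ARENA_CHUNK` and `ARENA_ALIGN` are hypothetical and not taken from minias; it also assumes a C11 compiler for `max_align_t`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical chunk size and alignment; not taken from minias. */
#define ARENA_CHUNK ((size_t)64 * 1024)
#define ARENA_ALIGN _Alignof(max_align_t)

static char *arena_cur;   /* next free byte in the current chunk */
static size_t arena_left; /* bytes remaining in the current chunk */

/* Hand out n bytes and never take them back. The memory is only
 * reclaimed when the process exits, which suits a short-lived tool. */
void *arena_alloc(size_t n)
{
    assert(n > 0 && n <= (size_t)-1 - ARENA_ALIGN);

    /* Round up so every returned pointer stays suitably aligned. */
    n = (n + ARENA_ALIGN - 1) & ~(size_t)(ARENA_ALIGN - 1);

    if (n > arena_left) {
        /* Start a new chunk; the tail of the old one is abandoned. */
        size_t chunk = n > ARENA_CHUNK ? n : ARENA_CHUNK;
        arena_cur = malloc(chunk);
        if (arena_cur == NULL)
            abort();
        arena_left = chunk;
    }

    void *p = arena_cur;
    arena_cur += n;
    arena_left -= n;
    return p;
}
```

There is deliberately no matching free function: the program being assembled is unaffected, only the assembler process itself skips the bookkeeping.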

denton-scratch · on Oct 16, 2021

Oh, thanks, I guess I misread it.

I wrote a crude assembler in Pascal once (for fun). It didn't need a heap, it was all static allocation and stack - it didn't cross my mind that one might write an assembler that needs to do heap allocation. I was a callow youth, and probably didn't know what a heap was!

The only storage you need other than stack is for a symbol table. In the general case, you can't know in advance how big that's going to be, so I suppose heap allocation for the symbol table makes sense.
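A rough sketch of what I mean, assuming a plain array that doubles when it fills up. The names `symtab_define` and `symtab_find` are made up for illustration, and lookup is a linear scan to keep it short; a real assembler would more likely use a hash table.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One label and the address it resolves to. */
struct symbol {
    char *name;
    unsigned long addr;
};

/* Heap-backed table that doubles its capacity when full. */
struct symtab {
    struct symbol *syms;
    size_t len;
    size_t cap;
};

/* Return the symbol called name, or NULL if it is not defined. */
struct symbol *symtab_find(struct symtab *t, const char *name)
{
    for (size_t i = 0; i < t->len; i++)
        if (strcmp(t->syms[i].name, name) == 0)
            return &t->syms[i];
    return NULL;
}

/* Define name at addr. Returns 0 on success, -1 if already defined. */
int symtab_define(struct symtab *t, const char *name, unsigned long addr)
{
    assert(t != NULL && name != NULL);
    if (symtab_find(t, name) != NULL)
        return -1;

    if (t->len == t->cap) {
        /* The final size is unknown up front, so grow on demand. */
        size_t cap = t->cap ? t->cap * 2 : 16;
        struct symbol *p = realloc(t->syms, cap * sizeof *p);
        if (p == NULL)
            abort();
        t->syms = p;
        t->cap = cap;
    }

    /* Keep a private copy of the name; the caller's buffer may change. */
    size_t n = strlen(name) + 1;
    char *copy = malloc(n);
    if (copy == NULL)
        abort();
    memcpy(copy, name, n);

    t->syms[t->len].name = copy;
    t->syms[t->len].addr = addr;
    t->len++;
    return 0;
}
```

A zero-initialised `struct symtab` is a valid empty table, so no separate init function is needed.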

ngcc_hk · on Oct 16, 2021

I learned a lot doing https://skilldrick.github.io/easy6502/ and hope there are others at that level for Z80, x86 and ARM.

Maybe the 6502 is just simpler. But dealing with basic memory, asm, machine code, CPU and flags is so intuitive when you have a JavaScript emulator using canvas.

userbinator · on Oct 16, 2021

A word of warning for those wondering: this is not for Intel syntax, despite referencing the Intel doc.

> Minias can assemble itself

...but it's written in C and uses a parser generator? IMHO it feels a bit backwards --- and perhaps even a bit cheating if you're doing this for a "bootstrap pilgrimage" --- to write a lower-level tool in a higher-level language. On the other hand, the same author also links to a C compiler in C, without a parser generator: https://github.com/michaelforney/cproc

guerrilla · on Oct 16, 2021

> despite referencing the Intel doc.

I don't know why you would think referencing Intel's x86 software developer manual has anything to do with what syntax an assembler would use. Anyone writing an assembler, regardless of its input syntax would refer to it.
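To make that concrete, here is a small sketch (mine, not minias code; `encode_add_r64_r64` is a made-up name) that encodes the 64-bit register-to-register form of ADD, opcode `01 /r` in the Intel manual. AT&T syntax writes it `addq %rbx, %rax` and Intel syntax writes it `add rax, rbx`, but the bytes come out the same either way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register numbers as used in the ModRM and REX fields. */
enum reg { RAX, RCX, RDX, RBX, RSP, RBP, RSI, RDI,
           R8, R9, R10, R11, R12, R13, R14, R15 };

/* Encode ADD r/m64, r64 (opcode 01 /r) for two registers, dst += src.
 * Writes 3 bytes to out and returns the number of bytes written. */
size_t encode_add_r64_r64(uint8_t *out, enum reg dst, enum reg src)
{
    assert(out != NULL && dst <= R15 && src <= R15);

    /* REX prefix 0100WRXB: W=1 selects 64-bit operands,
     * R extends ModRM.reg and B extends ModRM.rm. */
    out[0] = (uint8_t)(0x48 | ((src >> 3) << 2) | (dst >> 3));

    /* Opcode byte for ADD r/m64, r64. */
    out[1] = 0x01;

    /* ModRM: mod=11 (register direct), reg=src, rm=dst. */
    out[2] = (uint8_t)(0xC0 | ((src & 7) << 3) | (dst & 7));

    return 3;
}
```

Calling it with `dst = RAX` and `src = RBX` gives the bytes `48 01 D8`. The input syntax only decides which operand the parser treats as the destination.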

> ...but it's written in C and uses a parser generator? IMHO it feels a bit backwards --- and perhaps even a bit cheating if you're doing this for a "bootstrap pilgrimage"

Who says they share your goals? It says they wrote this just for "fun and learning," i.e. to learn how an assembler works and how x86 instructions are encoded, maybe even to practice their C.

programmer_dude · on Oct 16, 2021

The readme says it can "assemble" itself.

guerrilla · on Oct 16, 2021

That's a valid complaint but not the one I'm responding to. I wonder if they mean it can assemble the output of the C compiler though.

MobiusHorizons · on Oct 16, 2021

That is how I read it. This is intended to be paired with cproc, a C compiler that uses QBE for its backend. QBE outputs assembly, not binaries, so an assembler could be used to assemble itself as part of that toolchain.

andrewchambers · on Oct 17, 2021

Author here, that's exactly what I meant. It can also assemble the cproc C compiler when it compiles itself - thus creating a loop. I updated the readme.

andrewchambers · on Oct 17, 2021

I updated the readme to better explain what I meant.

andrewchambers · on Oct 17, 2021

It can assemble all the assembly generated by cproc - which can compile the tools it depends on, including cproc itself. I will update the readme.

nathell · on Oct 16, 2021

Here’s mine, in Clojure, very incomplete (16-bit x86 only for now) and using an s-expression-based syntax, in 253 LOC:

https://github.com/nathell/lithium/blob/master/src/lithium/a...

MobiusHorizons · on Oct 16, 2021

Very cool! I’ve been playing around with QBE. It always felt like it defeats the purpose to compile the output with gas/gcc.

Are you interested in contributions for a custom parser?

andrewchambers · on Oct 17, 2021

First there is perhaps some slight refactoring that could be done. Though it does sound nice to reduce the dependency tree.

One thing is I like the declarative nature of the peg, so I considered trying to make a tiny peg parser generator to go with it (Also worth noting that peg/leg itself is very small).

All that being said, a hand written parser could probably dramatically increase performance and not be that much more code - so I am still unsure - but probably.
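As a rough idea of the size involved, here is a sketch of a hand-written parser for a single form, `mnemonic %src, %dst`. This is illustration only, not minias code: the names `parse_insn` and `struct insn` are hypothetical, and a real parser would also need immediates, memory operands, labels and directives.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Parsed pieces of one register-to-register instruction. */
struct insn {
    char mnemonic[16];
    char src[8];
    char dst[8];
};

/* Skip spaces and tabs. */
static const char *skip_ws(const char *s)
{
    while (*s == ' ' || *s == '\t')
        s++;
    return s;
}

/* Copy a run of alphanumerics into buf. Returns the position after
 * it, or NULL if the run is empty or does not fit in cap bytes. */
static const char *parse_ident(const char *s, char *buf, size_t cap)
{
    size_t n = 0;
    while (isalnum((unsigned char)s[n]))
        n++;
    if (n == 0 || n >= cap)
        return NULL;
    memcpy(buf, s, n);
    buf[n] = '\0';
    return s + n;
}

/* Parse "%name" after optional whitespace. */
static const char *parse_reg(const char *s, char *buf, size_t cap)
{
    s = skip_ws(s);
    if (*s != '%')
        return NULL;
    return parse_ident(s + 1, buf, cap);
}

/* Parse "mnemonic %src, %dst". Returns 0 on success, -1 on error. */
int parse_insn(const char *s, struct insn *out)
{
    assert(s != NULL && out != NULL);
    s = parse_ident(skip_ws(s), out->mnemonic, sizeof out->mnemonic);
    if (s == NULL)
        return -1;
    s = parse_reg(s, out->src, sizeof out->src);
    if (s == NULL)
        return -1;
    s = skip_ws(s);
    if (*s != ',')
        return -1;
    s = parse_reg(s + 1, out->dst, sizeof out->dst);
    if (s == NULL)
        return -1;
    s = skip_ws(s);
    return (*s == '\0' || *s == '\n') ? 0 : -1;
}
```

Each grammar rule becomes one small function that returns the new position or NULL, which is more code than the equivalent peg rule but has no generator dependency.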