Does anyone know what's a good macro-assembler these days? The article recommends NASM, but I don't know if it's comparable to MASM for larger, more complex projects.
I rather fancy getting back into writing asm, just to sharpen that skill. I haven't really written any since MASM 6.x on DOS, 20-ish years ago. I actually found it quite enjoyable and it's surprising how complex an application you can write from scratch in assembly without it becoming unmanageable, so long as you get into the right mindset and make effective use of macros.
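To illustrate the "effective use of macros" point, here's a minimal sketch in NASM syntax (since the article recommends NASM) for x86-64 Linux; the macro name `write_str` and the layout are my own invention, not from any particular codebase:

```nasm
; A macro hides the syscall boilerplate behind one readable line.
%macro write_str 2              ; %1 = buffer label, %2 = length
        mov     rax, 1          ; sys_write
        mov     rdi, 1          ; stdout
        mov     rsi, %1
        mov     rdx, %2
        syscall
%endmacro

section .data
msg:    db      "hello", 10
len:    equ     $ - msg

section .text
global _start
_start:
        write_str msg, len      ; expands to the five instructions above
        mov     rax, 60         ; sys_exit
        xor     rdi, rdi
        syscall
```

Once the low-level plumbing is wrapped up like this, the main body of an asm program can read almost like a high-level script.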
Of course, any significant piece of assembly code is likely to contain considerably more bugs than just about anything else of the same complexity. You'll also experience a lot more segfaults during development than perhaps most are comfortable with, but there's something rewarding about controlling precisely what the machine is doing at that level. This is especially true if you manage to find a novel solution that just wouldn't exist when the hardware capabilities are abstracted away by a high level language.
In the same way that everyone should learn a Lisp to think in terms of ASTs and code-as-data, everyone should write at least one whole application in assembly just to appreciate how the hardware really works. Also to see how often there are many ways to solve the same problem (especially with an x86 instruction set), sometimes with wildly different performance characteristics.
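As a small taste of the "many ways to solve the same problem" point, here are a few interchangeable x86-64 idioms (a sketch, not exhaustive):

```nasm
; Three ways to zero a register -- same result, different trade-offs:
mov     eax, 0              ; 5 bytes; leaves the flags alone
xor     eax, eax            ; 2 bytes; recognised by modern CPUs as a zeroing idiom
and     eax, 0              ; also zeroes, but misses the idiom recognition

; Two ways to multiply by 5:
imul    eax, eax, 5         ; an ordinary multiply
lea     eax, [rax + rax*4]  ; shift-and-add via the address unit; flags untouched
```

Which variant wins depends on code size, whether you need the flags preserved, and the microarchitecture you're targeting.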
A possibly interesting side-effect of really learning Asm is that you start discovering just how horrible compilers/high-level languages actually are at exploiting the full capabilities of the machine, despite "common wisdom" suggesting the opposite. I started with Asm, and when I eventually decided to learn C, I remember the first time I looked at the compiler output of a program I'd written, compiled with full optimisation: I was astounded. Unnecessary moves and other instructions, very poor register utilisation, and blindness to status flags were just some of the things that compiled programs regularly contained.
This was many years ago, but I still see the same today. I do RE so I've read a lot of compiler output, and I've seen some isolated instances where a compiler did something "clever" (Intel's is not bad at this), but it tends to be rare and it's easy to see the rest of the code still has that "compiler-generated" feel to it.
I said "really learning" above because I think there are two ways people learn Asm. The first, probably the more common, is to learn only the ways in which compilers generate instructions. Those who learn this way would likely do no better a job than a compiler if asked to write a program, nor see the inefficiency of compiler-generated code, so they wouldn't find any particular advantage in using Asm.
On the other hand, I believe that if you learn Asm by starting with the machine itself, independent of any HLL, then you don't get any preconceived notions of what it can and cannot do, which leads to what I'd call "real Asm programming." Then you can see the inefficiencies in compiler-generated code and what HLL abstractions introduce, and can easily beat the compiler in size or speed (often both). Good hand-written Asm has a very different look to it than compiler output.
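The status-flags point above is a good concrete example: the carry flag has no direct HLL expression, yet hand-written asm uses it constantly. A sketch of 128-bit addition on x86-64 (register assignment is my own choice):

```nasm
; Add the 128-bit value rcx:rbx into rdx:rax using the carry flag.
add     rax, rbx        ; low halves; sets CF if bit 63 overflows
adc     rdx, rcx        ; high halves plus the carry, in one instruction
```

A C compiler has to reconstruct this pattern from wide integer types or intrinsics; written by hand, it falls out of the instruction set for free.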
For some entertaining examples of what Asm can do that compilers cannot, look at the sub-1k categories in the demoscene:
everyone should write at least one whole application in assembly just to appreciate how the hardware really works
Unfortunately, with out-of-order execution and instruction-level parallelism, I doubt learning assembly teaches you much about how the hardware really works.
Microarchitecture doesn't change the fact that the instructions in your program - the ones that you can work with - still have the same programmer-visible behaviour (except perhaps running a little faster).
I don't dispute that. I'm saying the model that you learn from learning assembly is very different to what the hardware is doing.
Concretely, learning assembly, you might assume each core has a set of physical registers that correspond to the registers you see and that isn't the case.
NASM or Yasm are both good. NASM has really powerful macro support, and Yasm is a NASM clone/rewrite. Yasm additionally supports GAS syntax (if you're into that), although its documentation for non-NASM features is a bit lacking. Yasm is also a lot nicer to hack on due to its modular design.
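By "really powerful macro support" I mean things beyond simple text substitution; for instance, NASM's preprocessor can unroll loops at assembly time with `%rep`/`%assign` (a sketch; the 8-byte copy is just an example workload):

```nasm
; Copy four qwords from [rsi] to [rdi], fully unrolled at assembly time.
%assign i 0
%rep 4
        mov     rax, [rsi + i*8]
        mov     [rdi + i*8], rax
%assign i i+1
%endrep
```

The `%rep` block expands to eight straight-line `mov` instructions before the assembler proper ever sees them.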
Well, digging around the docs and FAQs on both sites I couldn't see much useful introductory information about what the unique features of either project are, but I did some further Googling and read a few discussions. For anyone else interested, my conclusions are:
NASM and FASM are really the only up-to-date and cross-platform capable assemblers. MASM is up to date, but Windows only. TASM is not up to date. Others appear to have been abandoned.
The differences:
NASM: Is written in C and generates object files. Requires a linker to produce executables. Slow, inefficient compilation. Has some syntax quirks. May be more flexible in some cases due to the multiple object formats available.
FASM: Written in FASM. Very fast compilation. Cleaner syntax, better debugging tools. Produces executables directly without a linker. Possibly limited due to smaller number of output formats, but likely good enough for most projects that would be written in pure asm anyway.
FASM looks like the best option to learn first and then move to NASM for any specific requirement that FASM cannot meet. The syntax is mostly compatible between the two, so porting code shouldn't be too much trouble in the worst case.
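The no-linker difference in concrete terms, assuming x86-64 Linux and illustrative file names:

```shell
# NASM: two steps -- assemble to an ELF object, then link it.
nasm -f elf64 hello.asm -o hello.o
ld hello.o -o hello

# FASM: given a 'format ELF64 executable' directive in the source,
# one step emits a runnable binary directly.
fasm hello.asm hello
```

For a pure-asm project the linker step buys you little, which is why FASM's direct output is attractive; the object-file route matters once you want to link against C code.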
I found some helpful thoughts regarding GAS - agreed, for source distributions targeting Linux it has a place, but it's not really a full-blown macro assembler. Using the C preprocessor seems like a poor hack to me: although I haven't tried it, heavy preprocessor use is generally discouraged even in C, never mind in something it was never intended for. Also, AT&T syntax: Yuck!
You don't have to use AT&T syntax in GAS. I wrote a blog post a while back showing how you can use Intel syntax instead, and skip a whole lot of % characters while you are at it.
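For anyone who hasn't seen it, the switch is a single directive; a sketch (the `add_one` function is just a made-up example):

```asm
# GAS accepts Intel operand order and drops the sigils after this directive:
.intel_syntax noprefix
.globl add_one
add_one:
        lea     eax, [rdi + 1]      # no % on registers, no $ on immediates
        ret
```

With `noprefix` you write `eax` rather than `%eax`, which removes most of the visual noise people object to.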
Are there compelling advantages to using one of the above rather than GNU 'as'? I ask out of ignorance rather than to say there is not. But 'as' is well documented, and if you access it as 'gcc -c foo.S' (with a capital S) it gets run through the C preprocessor first for macros and definitions. And if you are distributing Mac/Unix/Linux source, you can generally presume it or something compatible is preinstalled.
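To make the `gcc -c foo.S` route concrete, here's a sketch of what such a file might look like (AT&T syntax, x86-64 Linux; the file name and contents are illustrative):

```asm
/* foo.S -- the capital .S extension makes gcc feed the file through the
   C preprocessor first, so #define and #include work as usual.
   Build with: gcc -nostdlib foo.S -o foo */
#define SYS_exit 60

.globl _start
_start:
        mov     $SYS_exit, %rax     /* the #define expands here */
        xor     %rdi, %rdi
        syscall
```

You get cpp's `#include`/`#define`/`#if` for free, though as noted elsewhere in the thread, that's a fairly blunt substitute for a real macro assembler.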
It's got a lot of issues, and you probably don't want to actually use it. It's unmaintained, proprietary, DOS only, and according to the website, still distributed on a 3.5" floppy. But the syntax has a lot of appealing things about it. You can't actually read the real manual without buying the product, but a short-lived open source clone "nega" used a very similar one: http://webcache.googleusercontent.com/search?q=cache:7E6Ddug...
It's easier to update an external assembler than the system assembler. A lot of distros don't ship with updated binutils so you can't reliably compile for newer CPU extensions on them.
Earlier versions of clang's integrated assembler (which clang uses instead of as) weren't fully compatible with as, e.g. no .intel_syntax support.
Different operating systems can have subtly different behavior, e.g. the ancient as that ships with OS X uses $name for macro parameters while most (all?) other systems use \name. I think gcc on OS X is intentionally neglected so everyone will switch to clang.
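For reference, the usual (non-OS X) GNU as macro form looks like this; the `push2` macro is a made-up example:

```asm
# GNU as macro with backslash parameter references; the ancient
# OS X as expected $a / $b here instead.
.macro push2 a, b
        push    \a
        push    \b
.endm
```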
Cross platform x86 asm is a real headache no matter what. NASM/Yasm/fasm just make it less of one.
I don't count its fast compilation as much of a plus; I suspect you'd have to write a heck of a lot of assembler before you'd ever notice much of a difference.