Does anyone know what's a good macro-assembler these days? The article recommends NASM, but I don't know if it's comparable to MASM for larger, more complex projects.
I rather fancy getting back into writing asm, just to sharpen that skill. I haven't really written any since MASM 6.x on DOS, 20-ish years ago. I actually found it quite enjoyable and it's surprising how complex an application you can write from scratch in assembly without it becoming unmanageable, so long as you get into the right mindset and make effective use of macros.
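To illustrate the "effective use of macros" point, here's a minimal sketch in NASM syntax (since the article recommends NASM) for x86-64 Linux; the macro name `write_str` and the layout are my own invention, not from any particular codebase:

```nasm
; A macro hides the syscall boilerplate behind one readable line.
%macro write_str 2              ; %1 = buffer label, %2 = length
        mov     rax, 1          ; sys_write
        mov     rdi, 1          ; stdout
        mov     rsi, %1
        mov     rdx, %2
        syscall
%endmacro

section .data
msg:    db      "hello", 10
len:    equ     $ - msg

section .text
global _start
_start:
        write_str msg, len      ; expands to the five instructions above
        mov     rax, 60         ; sys_exit
        xor     rdi, rdi
        syscall
```

Once the low-level plumbing is wrapped up like this, the main body of an asm program can read almost like a high-level script.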
Of course, any significant piece of assembly code is likely to contain considerably more bugs than just about anything else of the same complexity. You'll also experience a lot more segfaults during development than perhaps most are comfortable with, but there's something rewarding about controlling precisely what the machine is doing at that level. This is especially true if you manage to find a novel solution that just wouldn't exist when the hardware capabilities are abstracted away by a high level language.
In the same way that everyone should learn a Lisp to think in terms of ASTs and code-as-data, everyone should write at least one whole application in assembly just to appreciate how the hardware really works. Also to see how often there are many ways to solve the same problem (especially with an x86 instruction set), sometimes with wildly different performance characteristics.
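As a small taste of the "many ways to solve the same problem" point, here are a few interchangeable x86-64 idioms (a sketch, not exhaustive):

```nasm
; Three ways to zero a register -- same result, different trade-offs:
mov     eax, 0              ; 5 bytes; leaves the flags alone
xor     eax, eax            ; 2 bytes; recognised by modern CPUs as a zeroing idiom
and     eax, 0              ; also zeroes, but misses the idiom recognition

; Two ways to multiply by 5:
imul    eax, eax, 5         ; an ordinary multiply
lea     eax, [rax + rax*4]  ; shift-and-add via the address unit; flags untouched
```

Which variant wins depends on code size, whether you need the flags preserved, and the microarchitecture you're targeting.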
A possibly interesting side-effect of really learning Asm is that you start discovering just how horrible compilers/high-level languages actually are at exploiting the full capabilities of the machine, despite "common wisdom" suggesting the opposite. I started with Asm, and when I eventually decided to learn C, I remember the first time I looked at the compiler output of a program I'd written, compiled with full optimisation: I was astounded. Unnecessary moves and other instructions, very poor register utilisation, and blindness to status flags were just some of the things that compiled programs regularly contained.
This was many years ago, but I still see the same today. I do RE so I've read a lot of compiler output, and I've seen some isolated instances where a compiler did something "clever" (Intel's is not bad at this), but it tends to be rare and it's easy to see the rest of the code still has that "compiler-generated" feel to it.
I said "really learning" above because I think there are two ways people learn Asm. The first, probably the more common, is to learn only the ways in which compilers generate instructions. Those who learn this way would likely do no better a job than a compiler if asked to write a program, nor see the inefficiency of compiler-generated code, so they wouldn't find any particular advantage in using Asm.
On the other hand, I believe that if you learn Asm by starting with the machine itself, independent of any HLL, then you don't get any preconceived notions of what it can and cannot do, which leads to what I'd call "real Asm programming." Then you can see the inefficiencies in compiler-generated code and what HLL abstractions introduce, and can easily beat the compiler in size or speed (often both). Good hand-written Asm has a very different look to it than compiler output.
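The status-flags point above is a good concrete example: the carry flag has no direct HLL expression, yet hand-written asm uses it constantly. A sketch of 128-bit addition on x86-64 (register assignment is my own choice):

```nasm
; Add the 128-bit value rcx:rbx into rdx:rax using the carry flag.
add     rax, rbx        ; low halves; sets CF if bit 63 overflows
adc     rdx, rcx        ; high halves plus the carry, in one instruction
```

A C compiler has to reconstruct this pattern from wide integer types or intrinsics; written by hand, it falls out of the instruction set for free.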
For some entertaining examples of what Asm can do that compilers cannot, look at the sub-1k categories in the demoscene:
everyone should write at least one whole application in assembly just to appreciate how the hardware really works
Unfortunately, with out-of-order execution and instruction-level parallelism, I doubt learning assembly teaches you much about how the hardware really works.
Microarchitecture doesn't change the fact that the instructions in your program - the ones that you can work with - still have the same programmer-visible behaviour (except perhaps running a little faster).
I don't dispute that. I'm saying the model that you learn from learning assembly is very different to what the hardware is doing.
Concretely, learning assembly, you might assume each core has a set of physical registers that correspond to the registers you see and that isn't the case.
NASM or Yasm are both good. NASM has really powerful macro support, and Yasm is a NASM clone/rewrite. Yasm additionally supports GAS syntax (if you're into that), although its documentation for non-NASM features is a bit lacking. Yasm is also a lot nicer to hack on due to its modular design.
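By "really powerful macro support" I mean things beyond simple text substitution; for instance, NASM's preprocessor can unroll loops at assembly time with `%rep`/`%assign` (a sketch; the 8-byte copy is just an example workload):

```nasm
; Copy four qwords from [rsi] to [rdi], fully unrolled at assembly time.
%assign i 0
%rep 4
        mov     rax, [rsi + i*8]
        mov     [rdi + i*8], rax
%assign i i+1
%endrep
```

The `%rep` block expands to eight straight-line `mov` instructions before the assembler proper ever sees them.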
Well, digging around the docs and FAQs on both sites I couldn't see much useful introductory information about what the unique features of either project are, but I did some further Googling and read a few discussions. For anyone else interested, my conclusions are:
NASM and FASM are really the only up-to-date and cross-platform capable assemblers. MASM is up to date, but Windows only. TASM is not up to date. Others appear to have been abandoned.
The differences:
NASM: Is written in C and generates object files. Requires a linker to produce executables. Slow, inefficient compilation. Has some syntax quirks. May be more flexible in some cases due to the multiple object formats available.
FASM: Written in FASM. Very fast compilation. Cleaner syntax, better debugging tools. Produces executables directly without a linker. Possibly limited due to smaller number of output formats, but likely good enough for most projects that would be written in pure asm anyway.
FASM looks like the best option to learn first and then move to NASM for any specific requirement that FASM cannot meet. The syntax is mostly compatible between the two, so porting code shouldn't be too much trouble in the worst case.
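The no-linker difference in concrete terms, assuming x86-64 Linux and illustrative file names:

```shell
# NASM: two steps -- assemble to an ELF object, then link it.
nasm -f elf64 hello.asm -o hello.o
ld hello.o -o hello

# FASM: given a 'format ELF64 executable' directive in the source,
# one step emits a runnable binary directly.
fasm hello.asm hello
```

For a pure-asm project the linker step buys you little, which is why FASM's direct output is attractive; the object-file route matters once you want to link against C code.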
I found some helpful thoughts regarding GAS - agreed, for source distributions targeting Linux it has a place, but it's not really a full-blown macro assembler. Using the C preprocessor seems like a poor hack to me: although I haven't tried it, heavy preprocessor use is generally discouraged even in C, never mind in something it was never intended for. Also, AT&T syntax: Yuck!
You don't have to use AT&T syntax in GAS. I wrote a blog post a while back showing how you can use Intel syntax instead, and skip a whole lot of % characters while you are at it.
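For anyone who hasn't seen it, the switch is a single directive; a sketch (the `add_one` function is just a made-up example):

```asm
# GAS accepts Intel operand order and drops the sigils after this directive:
.intel_syntax noprefix
.globl add_one
add_one:
        lea     eax, [rdi + 1]      # no % on registers, no $ on immediates
        ret
```

With `noprefix` you write `eax` rather than `%eax`, which removes most of the visual noise people object to.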
Are there compelling advantages to using one of the above rather than GNU 'as'? I ask out of ignorance rather than to say there is not. But 'as' is well documented, and if you access it as 'gcc -c foo.S' (with a capital S) it gets run through the C preprocessor first for macros and definitions. And if you are distributing Mac/Unix/Linux source, you can generally presume it or something compatible is preinstalled.
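To make the `gcc -c foo.S` route concrete, here's a sketch of what such a file might look like (AT&T syntax, x86-64 Linux; the file name and contents are illustrative):

```asm
/* foo.S -- the capital .S extension makes gcc feed the file through the
   C preprocessor first, so #define and #include work as usual.
   Build with: gcc -nostdlib foo.S -o foo */
#define SYS_exit 60

.globl _start
_start:
        mov     $SYS_exit, %rax     /* the #define expands here */
        xor     %rdi, %rdi
        syscall
```

You get cpp's `#include`/`#define`/`#if` for free, though as noted elsewhere in the thread, that's a fairly blunt substitute for a real macro assembler.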
It's got a lot of issues, and you probably don't want to actually use it. It's unmaintained, proprietary, DOS only, and according to the website, still distributed on a 3.5" floppy. But the syntax has a lot of appealing things about it. You can't actually read the real manual without buying the product, but a short-lived open source clone "nega" used a very similar one: http://webcache.googleusercontent.com/search?q=cache:7E6Ddug...
It's easier to update an external assembler than the system assembler. A lot of distros don't ship with updated binutils so you can't reliably compile for newer CPU extensions on them.
Earlier versions of clang's integrated assembler (which clang uses instead of as) weren't fully compatible with as, e.g. no .intel_syntax support.
Different operating systems can have subtly different behavior, e.g. the ancient as that ships with OS X uses $name for macro parameters while most (all?) other systems use \name. I think gcc on OS X is intentionally neglected so everyone will switch to clang.
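For reference, the usual (non-OS X) GNU as macro form looks like this; the `push2` macro is a made-up example:

```asm
# GNU as macro with backslash parameter references; the ancient
# OS X as expected $a / $b here instead.
.macro push2 a, b
        push    \a
        push    \b
.endm
```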
Cross platform x86 asm is a real headache no matter what. NASM/Yasm/fasm just make it less of one.
I don't count its fast compilation as much of a plus; I suspect you'd have to write a heck of a lot of assembler before you'd ever notice much of a difference.