Hacker News
Introduction to x64 Assembly (2011) [pdf] (nyu.edu)
110 points by networked on Dec 22, 2015 | 30 comments



Can someone recommend a tutorial on understanding the extreme basics of Assembly? For some reason I just cannot grok it, which is strange because it's just tiny building blocks that make up all other languages. Most tutorials assume you already have some basic understanding of low level computer hardware architecture and fail to point you anywhere if you don't.


Assuming you want to learn modern x86-64 assembly, your best bet is to start by learning C well enough to write simple toy programs on your own. If you know enough C to understand what a bit of totally uninteresting code like

  int foo(int a, int b) {
      int c = a + b;

      int d = a * b;

      return c + d;
  }

  int main() {
      return foo(4, 5);
  }
does, then you're more than ready to start diving into assembly. Using either Clang or GCC, take your simple programs (like the one above) and compile them with the following options:

  clang -O0 -S -f-no-asynchronous-unwind-tables foo.c
This will output a file called foo.s. Open it up and you should see the unoptimized assembly equivalent of the machine code the compiler would generate for your program (sans stuff involving linking). For the sake of brevity I'll omit it from this post.

Start off with very, very simple programs and examine their output. Change the source slightly and observe how the assembly changes. And as usual when learning new stuff, google everything that doesn't make sense. If you come across an instruction you've never seen before, google it; odds are just knowing what it does will make its purpose obvious for simple enough programs. Starting with simple stuff really is the key. Eventually you can even try writing your own version in assembly first and then compare it to what the compiler gives you for an equivalent C program. Sometimes the results will be very surprising and you'll learn something interesting.

One thing I will say you should read up on first is the System V ABI (http://wiki.osdev.org/System_V_ABI), assuming you're on Linux or OS X. It's the function calling convention used by modern Unix-like systems. You don't necessarily have to use it yourself, but when you examine compiler-generated assembly, or if you want to know how to call libc functions, understanding it is crucial.
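To make that concrete: under the System V convention, the two int arguments to foo above arrive in %edi and %esi, and the result goes back in %eax. Here's a hand-written sketch of roughly what unoptimized compiler output for foo looks like in AT&T syntax (exact stack offsets, labels, and directives vary by compiler and version; this isn't verbatim clang output):

  foo:
      pushq   %rbp                # standard frame setup
      movq    %rsp, %rbp
      movl    %edi, -4(%rbp)      # a arrives in %edi (System V ABI)
      movl    %esi, -8(%rbp)      # b arrives in %esi
      movl    -4(%rbp), %eax
      addl    -8(%rbp), %eax      # c = a + b
      movl    %eax, -12(%rbp)
      movl    -4(%rbp), %eax
      imull   -8(%rbp), %eax      # d = a * b
      movl    %eax, -16(%rbp)
      movl    -12(%rbp), %eax
      addl    -16(%rbp), %eax     # return c + d in %eax
      popq    %rbp
      ret
Notice how at -O0 everything makes a pointless round trip through the stack; that's exactly the kind of thing that changes dramatically when you recompile at -O2 and compare.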

I'm by no means an expert on assembly, but this is what I did and I'm at a place now where I'm comfortable looking at most pieces of assembly code and following along. Writing it is still hard, but mostly because it takes so damn long and higher level languages have spoiled me.


The option

  -fno-asynchronous-unwind-tables
will work slightly better.


Here is a short 6502 assembly language programming tutorial with emulator and development tools in the browser:

http://skilldrick.github.io/easy6502/

It's 6502, not x64, but simpler is better here. It is self-contained, but if you would like more, here is a 5-page introduction to the 6502 from another author:

http://people.cs.umass.edu/~verts/cmpsci201/spr_2004/Lecture...


I don't think the 6502 is a particularly good arch to start with for someone wanting to learn x86 assembly. It's simply too restrictive: the lack of registers, no mul or div, the lack of a decent accumulator, all the zero-page tricks, etc. are pointless "tricks" for learning a more modern architecture. If you want something simple but more modern, ARMv4 (AKA ARM7TDMI) is probably a good bet. It's only got a handful of instructions (unlike v8, which does away with the predication). Plus, it's 32-bit, and while the predication might be a little confusing, it probably won't damage a beginner the way the 6502 can. I say this as someone who learned 6502 assembly on an Apple ][ and then moved to a 286, and it felt like learning assembly all over again without all the restrictions.


If you want an introduction to assembly with hopes of achieving something super complicated in the future, I would suggest taking a look at the Intel manuals [1]. Volume 1 specifically offers an overview of the processor's architecture and underlying components (registers, ALUs, branching); it's actually a pretty useful textbook. Volume 2 explains each and every instruction and what it does. Volume 3 is a programmer's guide to the Intel processor, which might help you quite a bit; it provides examples of using stacks, function calling, memory management, and SIMD instructions.

[1] https://www-ssl.intel.com/content/www/us/en/processors/archi...


If you want an alternative to the good suggestions that others have already provided, you might like the recent game "Human Resource Machine"[1], by the creators of "World Of Goo". As they describe the game:

   Your office is a simple computer. ... You might describe this machine
   as Harvard Architecture with a single accumulator.
Don't think of it so much as a game, but as a fun interactive CPU that you program in assembler. It starts with how to use basic opcodes (I/O, math) and registers ("save a copy on the floor"). Conditional jumps (w/ an implicit flag register) are gradually added, and later puzzles use indirect addressing a lot.

[1] http://tomorrowcorporation.com/humanresourcemachine


Charles Petzold's book CODE will sorta teach this, literally, from the ground up. It's one of the best books I've read.


Another vote for CODE - very well written book that's especially great for total beginners and novices.


C is the 'portable assembly' of the world. I find it easiest when learning some CPU architecture to compile very simple C programs and then see what the resulting assembly looks like.

Some questions I look at (a small probe file illustrating several of these follows the list):

* When I cast a float to an int, what actually happens behind the scenes? Sometimes floating point instructions get generated. Sometimes the CPU doesn't have floating-point hardware, so the compiler emits calls to something like a 'float2int()' function that implements the conversion in software.

* What is the calling convention? Just write a function with 1,2,3,4 and 20 arguments and do something like add them together. Then see how the resulting function handles those arguments.

* What do some simple array accesses or pointer arithmetic generate?

* How are large structs passed to functions at higher optimization levels?

* If I write a loop to compute a dot product, do higher optimization levels emit vectorized instructions?
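A minimal probe file along those lines (the file name and function names are just illustrative):

  /* probe.c -- compile with e.g. `gcc -O2 -S probe.c` and read probe.s */

  /* Float-to-int cast: on x86-64 look for cvttss2si; on an FPU-less
     target, look for a call into a software conversion routine. */
  int f2i(float x) { return (int)x; }

  /* Calling convention: with seven int arguments on System V x86-64,
     watch the first six land in registers and the seventh hit the stack. */
  int sum7(int a, int b, int c, int d, int e, int f, int g) {
      return a + b + c + d + e + f + g;
  }

  /* Array access / pointer arithmetic: look for scaled addressing modes. */
  int idx(int *p, int i) { return p[i]; }

  /* Dot product: does -O2 or -O3 emit vectorized (SSE/AVX) instructions? */
  float dot(const float *x, const float *y, int n) {
      float acc = 0.0f;
      for (int i = 0; i < n; i++)
          acc += x[i] * y[i];
      return acc;
  }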

You can actually go pretty far with this. Just use tools like objdump or gcc to probe or create object files, and then start writing assembly once you understand what C code you'd write and what it would compile down to.


I remember studying `Computer Systems: A Programmer's Perspective'[1] in my undergrad, which was pretty good and covered how programs look in assembly (calling conventions, call frames, data representation, assembly ...).

Also, to get a better understanding of the x86 ISA I followed the old i386 manual[2]. It's old but much smaller. Perhaps once you feel comfortable, you can move to newer manuals. I never read the newer ones (because I never actually needed assembly for any project; I was just studying for fun).

[1] http://www.amazon.com/Computer-Systems-Programmers-Perspecti... [2] http://css.csail.mit.edu/6.858/2015/readings/i386.pdf


Have you played any of https://microcorruption.com ?


Don't start with the x86 or -64 architecture; there's a lot of cruft in there due to backwards compatibility. Start with a simpler architecture like MIPS (see e.g., Sweetman's "See MIPS Run" for an introduction), or the old 6800 chips (few extant examples in the wild not attached to old Macs -- unless you are willing to pay exorbitant prices because you're maintaining an old weapons system -- but there are some wonderful out-of-print textbooks that are great reads).


ARM is nice and simple, and relatively modern.

Even simpler, the 6502. After writing tens of thousands of lines of 6502 assembly, the ARM feels like writing in a high level language. But people got real work done with it.

[The Macs were 68000s, not 6800s. Typo, I assume]


An effect of browsing HN on a phone at 6AM while packing for a trip, I'm afraid. Using the 6502 (though I know nothing operational about either processor) is a great idea; I'm sure there are some good systems programming texts kicking around from those days that target that family, and would make a nice introduction to development on a simple 8-bit design. Are the 6502s still used in microcontroller applications?


I doubt that 6502s are still actively used. There are many better microprocessors available today (the last one I used was under twenty cents in quantity, with a clock rate 10X that of the 6502s back-when).


Another nice assembly that is fun to write in is the one for Parallax's 8-core microcontroller, the Propeller. It's a quite nice CISC with some high-level instructions, but still close enough to the metal to have the 'true assembly feel.'


I personally gained an interest in assembly programming when I started trying out The Legend of Random's reverse engineering tutorials [0].

It's not really a tutorial on asm, but he assumes you don't know much about it and it was a great starting point for me, with lots of practical applications, which helped motivate me to continue and learn more.

[0] http://thelegendofrandom.com/blog/sample-page


Link is not working.


Sorry. It was working when I posted it, but it doesn't seem to be working anymore.

Try : http://octopuslabs.io/legend/blog/sample-page.html


If you're into any electronics tinkering at all, a good way to get up to speed with assembly is to take a look at some microcontrollers, such as PIC, and make some simple hardware projects happen. There are tons of project blueprints out there along with code. The beauty of it is that you can account for pretty much every peripheral on the device with your eyeball. There are many versions of microcontrollers, but a grandfather one with RISC assembly is the venerable PIC16F84A. It has 1 KB of program memory, 68 bytes of RAM, and 64 bytes of EEPROM. That last part is like hard drive storage you can store settings in, or temperatures, or whatever.

The great thing is that you can download the IDE for the chips, write a program, and simulate its running for free. The debugger will light up pins as needed and let you simulate input. But, back to the understanding part: today's CPUs are very complicated and have tons of parts. The microcontroller is a very simplified version of it, and even so it's complicated.

Imagine a microchip in your hand with 18 pins on it. 2 of the pins are power supply (+5 v and ground), another 2 are supposed to be attached to a crystal oscillator (for its CPU clock), 1 pin is a reset button of sorts (or rather a permission to run connection). That leaves you with 13 pins to understand.

The remaining 13 pins are separated into ports "A" and "B", both of which you can address in software and either read or turn individually on or off. Port A has 5 pins and port B has 8 pins you can use for anything you like.

5 of the pins from port B can optionally be used as interrupts, just like on your desktop/laptop. A signal going into them, when configured for it, will interrupt your program and let you respond. Like an alarm sensor going off, needing immediate attention.

If I recall correctly, any of the pins can be configured to do either reading or writing, though I can't remember if you can change that mid-program or only configure it at startup.

Normally, you fire up the MicroChip (company name) IDE, write a program, put a blank microcontroller in a simple "programmer" device (USB, or old school serial) and tell it to copy the program over. These microcontrollers are reprogrammable, so you have plenty of room for trial and error.

Anyhow, at the very beginning of the assembly program, you tell the IDE what device you're about to program and how to configure it for the 'burn' and running:

  processor 16f84a
  #include <p16f84a.inc>
  __config _HS_OSC & _WDT_OFF & _PWRTE_ON
So far this is not programming yet. This last config line is telling the machine a few things, but only a few. Like "hey I have a crystal connected to it" or "I don't have a crystal, use some internal resistors to simulate a less accurate clock so it can run the program" or "turn code protection on -- meaning microcontroller is not allowed to be overwritten" or "I don't need a babysitter watchdog in case my code freezes because I wrote good code and it won't freeze and the hardware is good too and won't cause a freeze either". Anyhow, here's some actual code.

  movlw    B'00000011'
  tris     PORTA
  movlw    B'00000000'
  tris     PORTB
  clrf     PORTB
This part defines how I want the pins on the two ports to behave. B'' notation means binary, so you can clearly see pins in code. I load a byte that defines the pin directions (written in binary) into the working register (W), then copy the working register to port A/B with the TRIS command, and that sets the port operation. Anyhow, in this code I set two pins on port A as inputs for little contact switches. Specifically, pins 18 and 17 on the chip are buttons. The rest, including all of port B, are outputs, connected to LEDs.

After this point you have your chip up and running. If you had 8 LEDs connected to individual pins on port B (via appropriate resistors, of course), you could turn every other one on with this simple program:

  main:
    movlw    B'10101010'
    movwf    PORTB
    goto     main
The 1 corresponds to an LED being on, 0 to off. If you wrote B'11111111' they would all be on (actually maybe they'd be off, I can't remember if a 1 or a 0 is a voltage low or voltage high, but you get the picture!)

Here's what a programmer device looks like that lets you transfer a program from a PC to the microchip: http://pp19dd.com/_old/geocities/geocities.com/krusko.geo/jd... - you put the chip in the socket, plug it into a serial port, and hit a key on the keyboard.

Here's the same microchip controlling a bunch of LEDs for a clock display: http://pp19dd.com/_old/geocities/geocities.com/krusko.geo/7s...

Here's how great this stuff is. The microchip only has 8 outputs on port B, but with it I'm able to control 4 digits, each requiring 8 LEDs. In other words, I'm controlling 32 LEDs with only 8 pins. The way this happens is that I'm multiplexing at high speed through the 4 displays, and each one is only turned on for a fraction of a second.
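A rough sketch of what such a multiplexing loop can look like (the variable names, the delay routine, and the port A wiring here are assumed for illustration, not taken from my actual clock code):

  mux:
      movf     digit0, W      ; segment pattern for the first digit
      movwf    PORTB
      bsf      PORTA, 0       ; switch the first digit's common line on
      call     delay          ; leave it lit for a moment
      bcf      PORTA, 0       ; and off again before moving on
      ; ...same dance for digits 1 through 3, then start over
      goto     mux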

Anyhow, there are tons of resources on this chip and its version of assembly out on the web, including this repository of hundreds of projects with code and hardware descriptions: http://pic-microcontroller.com/project-list/

The fun stuff in this assembly language is that there are only about 35 instructions total, so writing simple algorithms like "divide this number by 2" or "multiply this number by 3" or "take the square root of 14" becomes a fun academic challenge.
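For instance, an unsigned divide-by-2 on this chip is just a rotate (a two-line sketch; the variable name num is assumed to be defined elsewhere):

  bcf      STATUS, C      ; clear the carry flag so it doesn't rotate in
  rrf      num, F         ; rotate right through carry = divide by 2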

If you're interested, take a look at this brief tutorial titled "PIC Assembly Language for the Complete Beginner": http://www.covingtoninnovations.com/noppp/picassem2004.pdf


This is how I first came to a good understanding of assembly language. A basic, working knowledge of how transistors and logic gates operate was really helpful. The PIC12F615 remains my favorite <$1 computer.


Stunning comment, thanks!


Knowing C certainly helps for many architectures.


As does having a machine-level visual debugger for at least one of those architectures. Being able to switch to the CPU pane and see how C-function call and return were handled at the register level in Borland's Turbo Debugger was profoundly enlightening for both my C and assembly coding.


There is a super-simplified assembly that is often used for teaching, called the Little Man Computer [1]. Playing with this is a good way to get a feel for how assembly and machine code work. On the Wikipedia page, there are a number of simulators you can try.

[1] https://en.wikipedia.org/wiki/Little_man_computer
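To give a flavor, here's a tiny LMC program that reads two numbers and prints their sum, using the standard LMC mnemonics (the label name is mine, and comment syntax varies between simulators):

  INP           ; read the first number into the accumulator
  STA FIRST     ; store it in a mailbox
  INP           ; read the second number
  ADD FIRST     ; add the stored value
  OUT           ; output the sum
  HLT           ; stop
  FIRST DAT     ; reserve a mailbox for the first number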


What languages do you know currently?


"Programming from the Ground Up" - a gentle introduction to x86 assembly on Linux - http://savannah.nongnu.org/projects/pgubook/


I teach a systems course at Marlboro College which covers some x86 assembler and C, using Bryant and O'Hallaron's "Computer Systems: A Programmer's Perspective" textbook (http://csapp.cs.cmu.edu/). The course materials include some great labs.

My students have particularly liked the "bomb lab" which requires reverse engineering an x86 binary to understand what input it expects, and the stack overflow exercises.

Highly recommended.


If you're developing x86-64 assembly in Sublime Text I suggest this (shameless plug) https://github.com/Nessphoro/sublimeassembly



