Code in ARM Assembly: Registers Explained

MarkSweep · on June 16, 2021

Related, Raymond Chen is doing a series of articles on the ARM instruction set this month: https://devblogs.microsoft.com/oldnewthing/2021/06/

saati · on June 16, 2021

That's about the deprecated Thumb-2; this article is about ARMv8. They are wildly different.

mhh__ · on June 16, 2021

I can recommend his articles to anyone interested in what they cover. It's very common to completely miss the bigger picture, or give an incomplete summary, when talking about processors and their design. If I can name names I've always found Lemire's blog anticlimactic - usually a good summary but not fleshed out until someone else comments with the juicy details.

johndoe0815 · on June 16, 2021

Related - M1 ARM assembly examples by Alex von Below: https://github.com/below/HelloSilicon

nonameiguess · on June 16, 2021

Would have posted exactly this if you hadn't already. So many pitfalls with M1 in the way Apple subtly veers from normal ARM64 conventions and the published spec. This guy has already figured them out for you. Also helpful to look at the XNU source code to see how they implement syscalls.

comex · on June 16, 2021

For the record, Apple does publish its own documentation describing how Apple’s ABI diverges from the ARM standard one:

https://developer.apple.com/documentation/xcode/writing-arm6...

saagarjha · on June 16, 2021

This is not true, Apple sticks pretty closely to AAPCS except for a handful of cases.

tyingq · on June 16, 2021

"Apple reserves X18 for its own use. Do not use this register."

Interesting. I wonder why that particular one. It's not exactly in the middle of the range. Maybe they searched for typical existing use and decided that was the least likely to have some conflict?

comex · on June 16, 2021

For more context, look at the standard ARM ABI spec. Apple’s ABI diverges from the standard one in some places, IIRC, but in this case it’s complying with it: the spec says that X18 (which it calls r18) can be reserved by platforms for platform-specific use. The spec also shows how the registers are divided into several categories, which helps explain the choice of X18:

https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aap...

monocasa · on June 16, 2021

x16, x17, and x18 are all different env pointers not really available for general use per se in the main ABI. x16 and x17 are used for PLT linkage there; x18 is reserved for the system to make use of as it sees fit, at its leisure. x18's use there seems to have survived into Apple's slightly modified ABI.

regularfry · on June 16, 2021

"All these registers are yours - except X18. Attempt no MOVing there."

saagarjha · on June 16, 2021

x18 is the standard platform register. On Darwin 20 and higher Apple pins the Rosetta context to it. When not used, it gets zeroed out on return to userspace in what I assume is a subtle prod to avoid its use.

saagarjha · on June 16, 2021

> Passing arrays, structures and other arguments which can’t simply be put into a single register requires a different method, in which a pointer to the data is passed: ‘call by reference‘.

Note that small structures will be passed in registers. Here's an example: https://godbolt.org/z/rsGEbefcv. a and b are 32 bits and passed packed in x0, and c is a long passed in x1.
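
A minimal C++ sketch of the kind of struct described above. The names `Small` and `sum` are hypothetical, and the register assignment in the comments is what AAPCS64 is expected to produce on an AArch64 target; it is not observable from C++ itself, so check the disassembly to confirm.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical struct for illustration: two 32-bit fields and one 64-bit field.
struct Small {
    std::int32_t a;
    std::int32_t b;
    std::int64_t c;
};

// 16 bytes in total, so the whole aggregate fits in two 64-bit registers.
static_assert(sizeof(Small) == 16, "expected a 16-byte aggregate");
static_assert(offsetof(Small, b) == 4, "a and b share the first 8 bytes");
static_assert(offsetof(Small, c) == 8, "c occupies the second 8 bytes");

// Passed by value. On AArch64 the compiler is expected to receive a and b
// packed in x0 and c in x1, with no pointer involved.
std::int64_t sum(Small s) {
    return s.a + s.b + s.c;
}
```

Calling `sum(Small{1, 2, 3})` returns 6 on any platform; only the generated code differs between targets.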

pdpi · on June 16, 2021

It's also wrong to call this "call (or pass) by reference." Pass by reference and pass pointer by value are different things (here's an example — https://godbolt.org/z/Kv7jPPvnz)

secondcoming · on June 16, 2021

A reference is just a pointer but with restrictions enforced by the compiler. ASM-wise, they're totally the same thing.

'Reference' doesn't mean C++'s reference, it means that you pass where the data is, not the data itself. Even Visual Basic had this concept.

MaxBarraclough · on June 16, 2021

Argument-passing semantics (evaluation strategies) are something defined by the source language, not by the calling convention of the target architecture/platform. pdpi is right to point out that what the article is describing is not pass-by-reference semantics.

Unlike C++, C does not support pass-by-reference (ignoring preprocessor silliness). Passing a pointer by value is not the same thing. edit To be clear, C++ supports both pass-by-value and pass-by-reference.

This topic cropped up a year ago: https://news.ycombinator.com/item?id=23553574
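
The distinction drawn above can be sketched in C++, which supports both strategies. All function names here are hypothetical and chosen only for illustration.

```cpp
#include <cassert>

// Pass-by-reference: the parameter is an alias for the caller's variable.
void set_by_reference(int& x) { x = 42; }

// Pointer passed by value: the callee gets its own copy of the address,
// and can still modify the pointed-to variable through it.
void set_through_pointer(int* p) { *p = 42; }

// Reassigning the pointer parameter changes only the callee's copy.
void reseat_pointer(int* p, int* other) {
    p = other;
    (void)p;  // silence unused-value warnings; the caller never sees this
}

// A reference to the pointer lets the callee change the caller's pointer.
void reseat_pointer_ref(int*& p, int* other) { p = other; }
```

Both `set_by_reference` and `set_through_pointer` typically compile to the same instructions; the difference shows up in what the language lets the callee do, as `reseat_pointer` versus `reseat_pointer_ref` illustrates.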

cat199 · on June 16, 2021

this is also a case of the later nomenclature redefining the previous nomenclature in a backwards incompatible way -

Wasn't there, but pretty sure in the 70s, the notions of pointer and reference (and 'handle', etc) were pretty much interchangeable, since AFAIK there wasn't any compiler capable of creating a compiler-checked 'reference' construct.

so, if this is true, in the realm of asm, since really we are just dealing with bits in a register that could be interpreted as data or addresses, it makes sense to treat pointer and reference interchangeably, since there is no such thing as a higher-order 'reference' anyway.

Joker_vD · on June 16, 2021

No, they used to use both "pointer" and "reference" interchangeably to mean an opaque handle that you could only pass around or dereference; for things that supported address arithmetic they used, well, "address". See for example [0] — the paper itself is from 1992, but it extensively quotes the papers from the seventies.

[0] Amir M. Ben-Amram and Zvi Galil. 1992. On pointers versus addresses. J. ACM 39, 3 (July 1992), 617–648. DOI:https://doi.org/10.1145/146637.146666

jhgb · on June 16, 2021

> are something defined by the source language, not by the calling convention of the target architecture/platform

In that case perhaps the response should have been "pass-by-reference doesn't exist on machine level", not "pass-by-reference is something different", as the latter would seem to be a category error.

Koshkin · on June 16, 2021

> the response should have been "pass-by-reference doesn't exist on machine level"

Which would barely make any sense.

MaxBarraclough · on June 16, 2021

Assuming the machine code has no concept of a function call, isn't it true to say that pass-by-reference doesn't exist at the machine level? Function calls are a high level concept that compile down to machine code.

A typical CPU deals with abstractions like jumps and copies, and has no real concept of a function call. It seems fair to say that, at the level of a typical assembly language/machine language, there is no argument passing, and therefore no evaluation strategy. Same goes for a Turing machine.

Koshkin · on June 16, 2021

This is similar to saying, for instance, that there are no pointers, at the machine level, just integers. (In fact, there are just bits and, maybe, bytes, for that matter.) But it is the usage semantics that matters. (Typically, CPU architectures are designed with it in mind.) An instruction that saves the next instruction's address somewhere before performing the jump cannot be thought of as simply combining two arbitrary operations into one...

jhgb · on June 16, 2021

How come? It would illustrate the notion that to a machine, everything is a value, and that (abstract) references get translated into manipulations of values (and not even always in the same way).

Koshkin · on June 16, 2021

Because "on a machine level" and "to a machine" are two different things (the former referring to human understanding).

jhgb · on June 17, 2021

They're absolutely the same thing to me.

secondcoming · on June 16, 2021

I'm reluctant to bikeshed, but if I read your argument correctly, then technically everything is pass-by-value since even for references the actual reference (a.k.a. pointer) is passed by value.

MaxBarraclough · on June 16, 2021

Pass-by-reference is distinct from passing a pointer by value. In my old comment [0] I gave example code fragments in C# to illustrate the difference. (C# supports both pass-by-value and pass-by-reference. It's like C++ in that regard, although the syntax is different.)

> I'm reluctant to bikeshed

This is the kind of topic where precise phrasing is important, so I wouldn't chalk it up as pedantry. It doesn't help that Java muddied the waters with its use of the word reference.

[0] Here's the link again: https://news.ycombinator.com/item?id=23553574

secondcoming · on June 16, 2021

You've shown a syntactic difference, not a functional difference. What does the 'ref' keyword change about how C# passes the argument to the function? I'd imagine (having zero C# experience) that it passes either a pointer, or an index, or an offset, to the function by value. It may also just copy the argument, pass it to the function, and then copy the copy back to the original argument (which would be daft).

MaxBarraclough · on June 16, 2021

> You've shown a syntactic difference, not a functional difference.

Kinda, but that difference is important. We can say that C allows us to simulate pass-by-reference by using pointers. It's still true that the C language does not support pass-by-reference semantics.

Taking the address of a variable is an operation that the C language permits us to do, yielding a new value (a pointer value). This isn't part of C's argument-passing functionality, though.

> What does the 'ref' keyword change about how C# passes the argument to the function?

The internals of C# compilers aren't really the point, but I believe .Net does it the copy-intensive way, copying to pass in and copying again to pass the new value back out. I don't think this is a real performance problem though, unless you overuse the feature.
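
The two implementation strategies mentioned in this exchange (passing an address versus copying in and copying back out) are not always interchangeable: they can be told apart when two parameters alias the same variable. The following is a hand-written C++ simulation with hypothetical function names; it makes no claim about what any C# implementation actually does.

```cpp
#include <cassert>

// True pass-by-reference: both parameters alias the caller's variables.
int by_reference(int& a, int& b) {
    a = 1;
    b = 2;
    return a;  // sees the write made through b when a and b alias
}

// Copy-in/copy-out (call-by-value-result), simulated by hand:
// work on local copies, then write them back before returning.
int by_copy_in_out(int* a_out, int* b_out) {
    int a = *a_out;  // copy in
    int b = *b_out;
    a = 1;
    b = 2;
    int result = a;  // the local copy is unaffected by the write to b
    *a_out = a;      // copy out
    *b_out = b;
    return result;
}
```

With a single variable `x` passed for both parameters, `by_reference(x, x)` returns 2 while `by_copy_in_out(&x, &x)` returns 1, even though `x` ends up as 2 in both cases.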

pdpi · on June 16, 2021

Yes exactly. I believe Fortran and Perl only do pass by reference, for example.

It's not a C++ concept, I only used C++ in that example simply because godbolt.org makes it really easy to show how the language treats the two concepts differently even though they compile to the same thing.

TheRealKing · on June 22, 2021

The Fortran standard is deliberately silent about the passing method. However, virtually all Fortran compilers pass by reference.

jhgb · on June 16, 2021

Your example seems too C++-specific. Not all languages are defined in the same way.

Erlangen · on June 16, 2021

I am familiar with a small set of programming languages, but they define reference and value passing similarly.

* Java: Java spec says "pass by value"

* C: pass by value

* C++: pass by value and reference (with ref qualifier)

* C#: pass by value and reference (with ref keyword)

Do you know a language that defines "pass by reference" in a very different way?

sry2dogpile · on June 16, 2021

In C and assembly the terms reference and pointer are sometimes used interchangeably. An lval is a reference to a value, * is the dereference operator, and so forth. I don't know the spec admittedly but I've seen the terms used interchangeably in compiler code and documentation. I think C++ began distinguishing between and defining references as distinct from pointers, and Java removed explicit pointers entirely and generates them and dereferences them automatically for objects.

jhgb · on June 16, 2021

Not sure if all language specs explicitly use the "pass-by-X" terminology but for example ANSI Smalltalk standard says things like "Each argument [of a message send] is a reference to an object."

> but they define reference and value passing similarly

Well, if all of them have been derived from C++ or designed by people trying to improve on C++, they would be very likely to have the notion defined similarly regardless of how anyone else would use the term, so there's not much surprise there.

MaxBarraclough · on June 16, 2021

They're accepted terms in computer science. They're not specific to any programming language.

https://en.wikipedia.org/wiki/Evaluation_strategy

jhgb · on June 16, 2021

I was simply saying that C++ using a certain terminology was a sufficient condition for C++-derived languages to use the same terminology. You wouldn't need a CS-wide definition for this, and your small set of languages wouldn't allow you to make the inference that such a common definition existed.

BTW the page you've linked seems to contradict some of your statements:

> so they are typically described as call by value even though implementations frequently use call by reference internally for the efficiency benefits

As you seem to be arguing that this is not call by reference you should probably correct the page.

EDIT: It also seems that there IS some amount of language specificity anyway, since in another place of the page it says:

> "In particular it is not call by value because mutations of arguments performed by the called routine will be visible to the caller. And it is not call by reference because access is not given to the variables of the caller, but merely to certain objects"

I'm reasonably sure that the first part would be called "call by value" in C++ even if in context of CLU it wasn't if the usual way of doing the same in C++, namely passing a pointer to an object, were to be used.

MaxBarraclough · on June 16, 2021

> As you seem to be arguing that this is not call by reference you should probably correct the page.

I don't really disagree with what the article says there, although I don't like the way it's phrased. The full quote:

> In purely functional languages there is typically no semantic difference between the two strategies (since their data structures are immutable, so there is no possibility for a function to modify any of its arguments), so they are typically described as call by value even though implementations frequently use call by reference internally for the efficiency benefits.

A purely functional language with the property of referential transparency [0] can indeed be treated as using pass-by-value, or pass-by-reference, or even some blend of the two. With referential transparency, nothing hinges on which of the two strategies is used. I'm not sure it's accurate to say that they're typically described as call by value.

> there IS some amount of language specificity anyway

As Wikipedia says, the term call by sharing is not as widespread. I hadn't seen it before. I can't say I really see where they're going with that term:

> In particular it is not call by value because mutations of arguments performed by the called routine will be visible to the caller.

No, they aren't. In Java, you can pass an object reference to the callee function, and the reference will be passed by value. The callee function can modify the members of the referred-to object, in a way that may later be visible to the caller function. So what? That's still pass-by-value. It's the same in C, where the callee can modify a pointed-to variable.

The Wikipedia article also appears to suggest that in Java, all arguments are boxed/unboxed when passed. That isn't true.

I can see some sense in having a term to emphasise that Java sometimes performs boxing, rather than simple pass-by-value. This is to say, I agree that it's a slight oversimplification to say that Java always simply passes by value. If we go too far down this road though we'll need to recategorise C, as it allows implicit conversions between many of its primitive types.

> you should probably correct the page

I don't have the will to get into a Wikipedia edit war, which is how this kind of thing often goes.

[0] https://en.wikipedia.org/wiki/Referential_transparency

Koshkin · on June 16, 2021

In assemblers and C passing by reference and passing by pointer value means the same thing.

MarkSweep · on June 16, 2021

You can also pass structs of floating point or SIMD types by register. They can have at most four elements, all of the same type. These are called HFA (Homogeneous Floating-point Aggregate) and HVA (Homogeneous Short-Vector Aggregate).

Source: the Procedure Call Standard for the Arm 64-bit Architecture https://github.com/ARM-software/abi-aa/blob/2021Q1/aapcs64/a...
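
A short C++ sketch of an HFA candidate. The names `Vec4`, `Mixed` and `dot` are hypothetical, and the register assignment mentioned in the comments is the behaviour expected from the AAPCS64 rules on an AArch64 target, not something the C++ code itself can verify.

```cpp
#include <cassert>

// HFA candidate: four members, all of the same floating-point type.
struct Vec4 {
    float x, y, z, w;
};

// Not homogeneous: the member types differ, so the HFA rule does not apply.
struct Mixed {
    float x;
    double y;
};

static_assert(sizeof(Vec4) == 4 * sizeof(float), "no padding between members");

// Both arguments are passed by value. On AArch64, each Vec4 is expected to
// occupy four consecutive floating-point registers rather than memory.
float dot(Vec4 a, Vec4 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}
```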

kps · on June 16, 2021

More importantly, it's just wrong. Large objects are not converted to references, they're passed on the stack.¹

¹ §5.4 https://developer.arm.com/documentation/ihi0055/b/

GeorgeTirebiter · on June 16, 2021

Seems to me that 'registers' are just really fast memory locations that get referenced by a 'stunted' address space in order to fit into most instructions. In other words, this is not a conceptual necessity. It exists because it's hard to build HW which can access all memory locations really fast. This hardware limitation leaks into software via hacks like 'registers'. Dealing with this hack is what forced Armv8 to remove PC as R15; "In the A32 and T32 instruction sets, the PC and SP are general purpose registers. This is not the case in A64 instruction set."

There is a lot of other HW crud that leaks into sw for dubious reasons. For example, although 'wasteful' if every memory location had tag bits -- hardware enforced -- capabilities would layer naturally. More sophisticated memory protection would be possible.

Instead, we have these von Neumann archs hellbent on winning some synthetic benchmark. Will the computer-using community ever demand some sacrifice in overall potential "adds-per-second" to enable better security and to support better abstractions? Sadly, I doubt it. Even Risc-V is a baby step.

saagarjha · on June 16, 2021

Do note that most processors have a physical register file that is several times larger than what they expose in the ISA.

ARM has various mechanisms in flight for tagging; there's already PAC shipping on certain chips and MTE seems like it might be on the horizon. More advanced efforts like Morello are being evaluated, too.

cesarb · on June 17, 2021

> Seems to me that 'registers' are just really fast memory locations that get referenced by a 'stunted' address space in order to fit into most instructions. In other words, this is not a conceptual necessity. It exists because it's hard to build HW which can access all memory locations really fast.

Registers are unavoidable. Even if you could directly address the whole memory space on every instruction, even if it was all "really fast", the hardware would still need to read the input operands from memory into internal registers, and store the result into an internal register before writing to memory.

gardaani · on June 16, 2021

Is there a reason for giving registers a different name depending on its value's length (D0 = 64-bit, S0 = 32-bit)?

I can never remember which names are mapped to the same register. I've had that issue on x86 and I'll have that issue on ARM.

I loved how Motorola 68000 made it simple. Each command had a dot-letter suffix indicating the length: move.b d0, d1

pm215 · on June 16, 2021

For 64-bit Arm FPU registers, the mapping is extremely simple: Q0 is the 128-bit vector register; D0 is the bottom 64 bits of it; S0 is the bottom 32 bits; H0 is the bottom 16 bits; B0 is the bottom 8 bits. Similarly Q1/D1/S1/H1/B1 are all the same underlying register.
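
The 64-bit mapping can be modelled in a few lines of C++. This is a toy model with a hypothetical `VReg` type, storing the 128-bit register as 16 bytes with the least significant byte first; it illustrates the aliasing only and does not touch real registers.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Toy model of one 128-bit SIMD&FP register (the "Q" view).
struct VReg {
    std::array<std::uint8_t, 16> bytes{};  // bytes[0] is least significant

    // Assemble the low n bytes (n <= 8) into an unsigned integer.
    std::uint64_t low(unsigned n) const {
        std::uint64_t v = 0;
        for (unsigned i = 0; i < n; ++i)
            v |= static_cast<std::uint64_t>(bytes[i]) << (8 * i);
        return v;
    }

    std::uint64_t d() const { return low(8); }                              // D view: bottom 64 bits
    std::uint32_t s() const { return static_cast<std::uint32_t>(low(4)); }  // S view: bottom 32 bits
    std::uint16_t h() const { return static_cast<std::uint16_t>(low(2)); }  // H view: bottom 16 bits
    std::uint8_t  b() const { return static_cast<std::uint8_t>(low(1)); }   // B view: bottom 8 bits
};
```

Every narrower view reads a prefix of the same storage, which is why Q1/D1/S1/H1/B1 all name the same underlying register.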

For 32-bit Arm, unfortunately, things are different: D0 is the bottom half of Q0, and D1 the top half; similarly S0 is the bottom half of D0 and S1 is the top half (and so Q0 is S0 S1 S2 S3, with S3 its most significant 32 bits). This is why "just indicate the length on the insn" wouldn't have worked for 32-bit: there is more than one 32-bit register in each 64-bit register (and some kind of 'high/low' notation would have been annoying when you really do want to just think of it as a collection of 32-bit registers sometimes, especially if your hardware doesn't even have double-precision support!). The 64-bit transition fixed up the awkward overlaid mapping, but retains the notation for the benefit of all the people who were already familiar with the Arm notational conventions for things.
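
The 32-bit overlay described above amounts to simple index arithmetic. A sketch with hypothetical helper names, generalising the D0/D1 and S0..S3 examples to register number n (note that only the S registers that exist overlay D registers; higher-numbered D registers have no S view):

```cpp
#include <cassert>

// Where a narrower register lives inside a wider one.
struct Slot {
    unsigned parent;  // index of the containing register
    unsigned half;    // 0 = bottom half, 1 = top half
};

// Sn is one half of D(n / 2): S0 is the bottom of D0, S1 the top.
constexpr Slot s_in_d(unsigned n) { return {n / 2, n % 2}; }

// Dn is one half of Q(n / 2): D0 is the bottom of Q0, D1 the top.
constexpr Slot d_in_q(unsigned n) { return {n / 2, n % 2}; }

// Sn is 32-bit lane (n % 4) of Q(n / 4), so Q0 holds S0, S1, S2, S3.
constexpr unsigned q_of_s(unsigned n) { return n / 4; }
constexpr unsigned s_lane_in_q(unsigned n) { return n % 4; }
```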

gardaani · on June 16, 2021

Thanks! I had a closer look at them and the naming is starting to make sense: Q=quad, D=double, S=single, H=half, B=byte (except the article claims that V=128bit and there's no mention of H or B)

For general purpose registers: X=long (64-bit), W=word (32-bit)

pm215 · on June 16, 2021

It's Qn when you're dealing with the register as a single 128-bit quantity (currently only the SHA256 and SHA512 insns need to do this, I think, judging from a quick search through the architecture reference manual), and Vn when you're dealing with the register as a vector of smaller units (eg "ADD V10.4S, V8.4S, V9.4S" does a vector-addition of V8 and V9 into V10, treating each 128-bit V register as a vector of four 32-bit integer lanes; FADD with the same operands is the single-precision float version): you write vector arguments as Vn.2D, Vn.4S, Vn.8H or Vn.16B. Hn is used for 16-bit floating point arithmetic insns. It looks like nothing's using Bn yet, but the manual defines the notation.
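
The lane-wise shape of an instruction like "ADD V10.4S, V8.4S, V9.4S" can be modelled in plain C++. This is a scalar sketch with hypothetical names (`Lanes4S`, `add_4s`), treating the register as four 32-bit integer lanes; it does not use NEON intrinsics or touch real registers.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// A 128-bit register viewed with the .4S arrangement: four 32-bit lanes.
using Lanes4S = std::array<std::uint32_t, 4>;

// Each lane is added independently; there is no carry between lanes,
// and each lane wraps modulo 2^32 on overflow.
Lanes4S add_4s(const Lanes4S& a, const Lanes4S& b) {
    Lanes4S out{};
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] = a[i] + b[i];
    return out;
}
```

The other arrangements (.2D, .8H, .16B) differ only in lane width and count over the same 128 bits.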

saagarjha · on June 16, 2021

You can ldr and str those registers too, if you're implementing a NEON memcpy for example.

Someone · on June 16, 2021

On 68k, that move writes 8 bits; it doesn’t really write to an 8-bit register. I make that distinction because, on many (I think) other systems, such a byte move sign extends the least significant byte in d0 to the width of the target register and writes the full register.
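
The two behaviours contrasted here can be sketched on 32-bit values in C++. The function names are hypothetical, and this models only the effect on the destination register, not condition codes or any other side effects of a real instruction.

```cpp
#include <cassert>
#include <cstdint>

// 68k-style byte move: only the low 8 bits of the destination change;
// the upper 24 bits keep their previous contents.
std::uint32_t move_b_merge(std::uint32_t src, std::uint32_t dst) {
    return (dst & 0xFFFFFF00u) | (src & 0xFFu);
}

// The alternative: sign-extend the low byte of the source and
// overwrite the whole destination register.
std::uint32_t move_b_sign_extend(std::uint32_t src) {
    std::uint32_t low = src & 0xFFu;
    return (low & 0x80u) ? (low | 0xFFFFFF00u) : low;
}
```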

Also, on 68k you can only address the least significant word or byte of a register that way. Some CPUs have, say, 8 8-bit registers that can alternatively be treated as 4 16-bit registers (https://en.wikipedia.org/wiki/Intel_8080#Registers). On such a CPU, one can write to the top half of a register in a single instruction by writing to the 8-bit register that shares its storage (some RISC CPUs support that by having a separate instruction “load top half of register”)

talideon · on June 16, 2021

My guess is that it's some effort to maintain some kind of compatibility with ARM32. Back when I used to write ARM asm, it was the ARM 26/32 days, and it was rXX for all the registers named in the APCS.

It doesn't seem to be as confusing as the x86 situation anyway: none of the registers are _really_ special purpose on ARM.

Here's a link to the details in the ARM64 APCS: https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aap...

jiri · on June 16, 2021

It's just a pity that the article ended just when I was most interested :-)

technicalya · on June 16, 2021

I thought the article contained something exciting, but the excitement ended when I reached the last lines.