It's cool to get a look at a piece of assembly code that seems oddly familiar from all those times when my app has crashed in objc_msgSend due to a prematurely released object... (I still write non-ARC code for compatibility reasons. Sigh.)
I find the following claim slightly misleading:
32-bit x86 was quickly overtaken by 64-bit, so this code received little attention after Tiger and many of those inefficiencies remain to this day.
There are still major 3rd party apps that run in 32-bit mode on the Mac. Google Chrome is probably the most popular.
It's perfectly correct from Apple's perspective (which is the perspective from which this article is written). Apple started shipping 64-bit Intel Macs less than a year after they started shipping Intel Macs at all. OS X 10.4 was the first version to support Intel, 10.5 was the first version to fully support 64-bit, and by 10.6 most of the system was running 64-bit. 10.7 dropped support for 32-bit Macs entirely.
Apple had little reason to improve 32-bit since apps can simply target 64-bit if they want the speed improvement. From Apple's perspective, Chrome makes a deliberate choice to run in that environment and they have no particular reason to help them out.
"X was quickly overtaken by Y" does not imply "X completely vanished."
Yes, a few notable apps are still 32-bit. But if you look at all Mac OS X programs in existence, a very healthy majority preferentially target the 64-bit platform. The 32-bit apps are the odd ducks, no matter how big a couple of those ducks have grown.
Does anyone know if this function is actually written in asm? Quoting the article:
The code was probably originally written for NeXT by an engineer who was familiar with load-store architectures like PowerPC but not so familiar with register-memory architectures like x86 ... Those of you who do know x86 better may be able to identify some of the inefficiencies in this code
Why wouldn't/couldn't this function be written in C (are there any instructions that would require non-portable intrinsics?), leaving it to an optimizing compiler to get the instructions right? Sure, sending messages is low-level and needs to be high-performance, but that, to me, doesn't necessitate "we have to do this by hand" asm instead of C.
Because objc_msgSend exists outside of the normal ABI and calling convention.
If you were to write it in C then, unless you manually tore out the asm generated by the compiler, you'd have objc_msgSend frames in the call stack between every method call, and even then the compiler would probably emit a bunch of register spills and argument-handling code.
The thing is, calls to objc_msgSend are _set up_ at the call site as if it were a call to the class's method's function pointer. objc_msgSend is a small trampoline that sits between the call site and the method call, it's not a proper function.
Besides, for something so small, getting a few good engineers (probably engineers who have a background in writing optimizing compilers) to hand-optimize the code and improve on it with new ideas every major release is probably the best option for everybody.
Consider also that objc_msgSend doesn't exist on its own. objc_msgSend_stret handles methods that return a struct, which is a pain in the butt most of the time, and there's also a bunch of hand-rolled variable-argument trampolines for dealing with ObjC blocks (IIRC the file is called a1a2_blocktramps.a, if you want to see something truly disturbing) that do things like calculating byte offsets and jumping to them based on how many arguments we have.
I'm going off on a tangent now but hopefully the point is made. These procedures are so far removed from normal function calls that writing them in C is not the best approach.
objc_msgSend would actually be impossible to implement in C.
Its role is to look up the function pointer that implements a method, then call that function pointer with the arguments passed to objc_msgSend, returning the result of the implementation.
Implementing objc_msgSend in asm guarantees that the argument & return registers aren't touched by the method lookup. Similarly to how ffi libraries and implementations of setjmp/longjmp need to be implemented in asm, it operates at a lower level than C's stack & function abstractions.
To elaborate on this, a C implementation of objc_msgSend would look something like this:
    id objc_msgSend(id self, SEL _cmd, args...) {
        IMP methodImplementation = ...; // look up the method however
        return methodImplementation(self, _cmd, args...);
    }
But there is no facility in C that lets you take arbitrary additional arguments and then pass them all to another function unchanged.
(In theory you could accomplish this using variadic functions. But then every method would have to take a va_list for its parameters instead of just taking a regular parameter list, breaking the idea that ObjC methods are just C functions with two implicit parameters. It would also be substantially slower.)
If you don't care about performance, you can do the method lookup in C, and have the asm implementation of objc_msgSend simply call that C function and then the returned method.
A working implementation of this minimal objc_msgSend can be found in Cocotron sources:
iTunes and other Apple software on Windows runs with code that looks pretty much exactly like that. Those code bases are compiled from Obj-C into plain C by the objc rewriter module in clang, then compiled into machine code by Visual Studio. You can take a look at what the clang rewriter does here: http://clang.llvm.org/doxygen/RewriteObjC_8cpp_source.html
That's very interesting, I've never heard of that (but assumed it would be something pretty much along those lines).
What's the Obj-C runtime used on Windows for those apps? Is it a descendant of the NeXT runtime which did run on Win32 at one point, or something else?
That approach, if inlined, #define-style, for every message send, would probably be better than going through objc_msgSend. Since each method call then gets its own branch instruction, rather than every single last one sharing the one jmp at the end of objc_msgSend, you'd get less branch target history contention.
Most message sends end up at the same place each time, most of the time, so I'd think this is what you want. But whether it's actually an issue in practice I couldn't say, and if nobody's done it already, I suppose it isn't...
>Sure, sending messages is low-level and needs to be high performance but that, to me, doesn't necessitate "we have to do this by hand" asm instead of C.
Well, if anything ever necessitates finely hand-tuned assembly code instead of leaving it to the compiler, it's precisely a case such as objc_msgSend.
Fascinating stuff. It's so weird to realize that such a critical piece of code is still changing significantly after 10 years at NeXT and 10 years at Apple.
I think it's weirder that so often such critical bits are still the makeshift, unoptimized first version. But I know how it works in the real world: it's good enough and there are no reported bugs, so don't touch it until you've implemented all these features and fixed all these issues...
While this was probably an interesting exercise, I should remind everyone that one can get the same functionality by setting the environment variable NSObjCMessageLoggingEnabled to YES.
Where do you get that idea? What's shown is it. There's a lot more to the message send system, but the rest is helpers that live outside of objc_msgSend and aren't on the fast path. There's no room for "a bunch of other tricks", because objc_msgSend needs to be as fast as possible, and the more it does, the slower it'll run.
In the Mavericks version there are two jumps to labels that are not present in the code shown. One is for tagged pointers, the other is for a cache miss. The cache miss is probably interesting, tricky code.
Apropos nothing… the cache search algorithm is "enter at predictable point, linear search everything". When one thinks of all the PhD time spent developing and analyzing search data structures, this is a good reminder that at small N, other factors dominate.
Also, adding a one byte instruction prefix that you know will be ignored because your processor is happier if instructions aren't so bunched up is so close to witchcraft that the author should avoid campfires with stakes in the center.
The cache miss may or may not be interesting, depending on your interests, but it's all written in C and is much longer and more boring. As soon as you miss the cache, you're off the fast path, so the cache miss code in objc_msgSend just does the requisite register preservation and then calls into C.
As you can clearly see from the code, it's perfectly acceptable in Objective-C to send a message to nil. This is a powerful feature that prevents the program from crashing, but it is confusing for beginners.
It doesn't seem particularly confusing. It's pretty much the first thing you learn: square brackets send a message, and if there's no recipient, the return value will be zero. (Granted, it used to be more complicated in the PowerPC era, when e.g. a selector that returns a double was not guaranteed to return 0.0 with a nil receiver, but those cases were fixed in the runtime.)
I suspect the messaging-to-nil behavior is mostly confusing to intermediate programmers who only have experience with Java's passive-aggressive null references.
There is no "natural" behavior to a beginner. Either way has to be learned and understood. Nil messaging behavior is only going to be confusing to people who learned one way and internalized it enough to think of it as "the way", then were exposed to the other way. Java programmers going to ObjC will find ObjC's nil handling confusing, and ObjC programmers going to Java will find Java's null handling confusing.
It's a trade-off, because it's convenient to not have to check that objects are non-nil when you send messages to them. With experience, you learn to deal with code that requires non-nil objects via assertions:
Well, you have to learn what the "zero value" for each and every type is in order to fully understand the language semantics. IMHO, this adds complexity beyond null pointer exceptions.
You also need to understand "zero values" to understand what a freshly created object looks like (all of its instance variables are initialized to zero). So I don't think it would make Objective-C any easier to learn if messaging to nil would throw an exception.
Personally I find that the nil behavior provides a useful base guideline for API design. When defining a method signature in Obj-C, you have to ask yourself if it's clear to the user of the method what happens when the method (inevitably) gets sent to nil. In that way, you have to think about the circumstances of the API's actual use, rather than just the ideal case.
Yes, thanks! (I still haven't learned the English names for the various brackets, it seems... In my mind, the ones with the straight angles are the "angle" ones.)
And for those learning etymology, parenthesis comes from ancient Greek:
Parenthesis, from "para" (next, beside, near) and "enthesis" (a putting in, insertion). So, essentially, to "insert beside/next to something" (which is what we do with a parenthetical phrase).
The ability to send messages to nil is not only there to prevent crashes. It can also be used to make clean APIs, or to use APIs with a lot of convenience. It also gives you a lot of flexibility... It allows you to do error handling on a case-by-case basis. If you are not interested in errors, simply do not check for nil and go on...