
Does anyone know if this function is actually written in asm? Case in point:

    The code was probably originally written for NeXT
    by an engineer who was familiar with load-store
    architectures like PowerPC but not so familiar with
    register-memory architectures like x86 ... Those of
    you who do know x86 better may be able to identify
    some of the inefficiencies in this code
Why wouldn't/couldn't this function be written in C (are there any instructions that would need some non-portable intrinsics?), leaving it to an optimizing compiler to get the instructions right? Sure, sending messages is low-level and needs to be high performance but that, to me, doesn't necessitate "we have to do this by hand" asm instead of C.



Because objc_msgSend exists outside of the normal ABI and calling convention.

If you were to write it in C then, unless you manually tore out the ASM generated by the compiler, you'd have objc_msgSend frames in the call stack in between every method call, and even then the compiler would probably emit a bunch of register spills and argument handling.

The thing is, calls to objc_msgSend are _set up_ at the call site as if they were calls to the class's method's function pointer. objc_msgSend is a small trampoline that sits between the call site and the method call; it's not a proper function.

Besides, for something so small, getting a few good engineers (probably engineers who have a background in writing optimizing compilers) to hand-optimize the code and improve on it with new ideas every major release is probably the best option for everybody.

Consider also that objc_msgSend doesn't exist on its own. objc_msgSend_stret handles methods that return a struct, which is a pain in the butt most of the time, and there's also a bunch of hand-rolled variable-argument trampolines for dealing with ObjC blocks (IIRC the file is called a1a2_blocktramps.a, if you want to see something truly disturbing) that do things like calculating byte offsets and jumping to them based on how many arguments we have.

I'm going off on a tangent now but hopefully the point is made. These procedures are so far removed from normal function calls that writing them in C is not the best approach.


objc_msgSend would actually be impossible to implement in C.

Its role is to look up the function pointer that implements a method, then call that function pointer with the arguments passed to objc_msgSend, returning the result of the implementation.

Implementing objc_msgSend in asm guarantees that the argument & return registers aren't touched by the method lookup. Like FFI libraries and implementations of setjmp/longjmp, which also need to be written in asm, it operates at a lower level than C's stack & function abstractions.


To elaborate on this, a C implementation of objc_msgSend would look something like this:

    id objc_msgSend(id self, SEL _cmd, args...) {
        IMP methodImplementation = ...; // look up the method however
        return methodImplementation(self, _cmd, args...);
    }
But there is no facility in C that lets you take arbitrary additional arguments and then pass them all to another function unchanged.

(In theory you could accomplish this using variadic functions. But then every method would have to take a va_list for its parameters instead of just taking a regular parameter list, breaking the idea that ObjC methods are just C functions with two implicit parameters. It would also be substantially slower.)
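To make the variadic workaround concrete, here is a minimal sketch of what it might look like. Everything in it (the toy_ names, the string selectors, the linear lookup, returning 0 for an unknown selector) is made up for illustration and is not how the real runtime works. The thing to notice is that every method has to accept a va_list, because that is the only form in which C can hand unknown arguments on to another function.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy runtime: every "method" takes a va_list instead of a
   normal parameter list, which is exactly the cost described above. */
typedef long (*toy_va_imp)(void *self, const char *sel, va_list args);

struct toy_method { const char *sel; toy_va_imp imp; };
struct toy_object { const struct toy_method *methods; size_t count; long value; };

/* "add:to:" has to unpack its own arguments from the va_list. */
static long toy_imp_add(void *self, const char *sel, va_list args) {
    (void)sel;
    struct toy_object *obj = self;
    long a = va_arg(args, long);
    long b = va_arg(args, long);
    return obj->value + a + b;
}

/* "value" takes no extra arguments and ignores the va_list. */
static long toy_imp_value(void *self, const char *sel, va_list args) {
    (void)sel; (void)args;
    return ((struct toy_object *)self)->value;
}

static const struct toy_method toy_methods[] = {
    { "add:to:", toy_imp_add },
    { "value",   toy_imp_value },
};

/* Linear search stands in for the real runtime's method cache. */
static toy_va_imp toy_lookup(const struct toy_object *obj, const char *sel) {
    for (size_t i = 0; i < obj->count; i++)
        if (strcmp(obj->methods[i].sel, sel) == 0)
            return obj->methods[i].imp;
    return NULL;
}

/* The dispatcher is a real C function: it gets its own stack frame and
   can only forward the extra arguments as a va_list. */
long toy_msg_send(void *self, const char *sel, ...) {
    assert(sel != NULL);
    if (self == NULL)
        return 0;                 /* messaging nil yields 0 */
    toy_va_imp imp = toy_lookup(self, sel);
    if (imp == NULL)
        return 0;                 /* simplification: unknown selector yields 0 */
    va_list args;
    va_start(args, sel);
    long result = imp(self, sel, args);
    va_end(args);
    return result;
}
```

With an object set up as `struct toy_object obj = { toy_methods, 2, 10 };`, a call like `toy_msg_send(&obj, "add:to:", 1L, 2L)` goes through the lookup and then into toy_imp_add. Note that the caller has to pass exactly the types the method pulls out with va_arg (hence the `L` suffixes), and the dispatcher stays on the stack for the whole call.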


It can't be written in C since the stack/registers need to be preserved for the call to the actual method implementation.


If you don't care about performance, you can do the method lookup in C, and have the asm implementation of objc_msgSend simply call that C function and then the returned method.

A working implementation of this minimal objc_msgSend can be found in Cocotron sources:

http://code.google.com/p/cocotron/source/browse/objc/platfor...

It's the commented-out piece of code at line 10. Here's the C function it calls:

http://code.google.com/p/cocotron/source/browse/objc/objc_ms...


Sure but that would probably be about as performant as: `(objc_lookupImp(obj, sel) ?: objc_nilImp())(obj, sel, ..params..)`

(I realize you said "if you don't care about performance", but ObjC would be a pretty terrible language if it wasn't fast)
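For anyone who wants to see the lookup-then-call shape spelled out in plain C, here is a rough sketch. The demo_ names are hypothetical stand-ins for objc_lookupImp and objc_nilImp, the class layout is invented, and it uses a standard conditional instead of the GNU `?:` shorthand. It only works because the call site knows the method's full signature and can cast the looked-up pointer back to it before calling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Generic method pointer; it must be cast back to the method's real
   signature before it is called. */
typedef void (*demo_imp)(void);

struct demo_method { const char *sel; demo_imp imp; };
struct demo_class  { const struct demo_method *methods; size_t count; };
struct demo_object { const struct demo_class *isa; int x, y; };

/* The signature the call site knows statically for "scaledSum:". */
typedef int (*demo_int_imp)(void *self, const char *sel, int arg);

/* A method is an ordinary C function with two implicit leading parameters. */
static int point_scaled_sum(void *self, const char *sel, int factor) {
    (void)sel;
    struct demo_object *p = self;
    return (p->x + p->y) * factor;
}

/* Hypothetical stand-in for objc_nilImp(): same signature, returns 0. */
static int demo_nil_imp(void *self, const char *sel, int unused) {
    (void)self; (void)sel; (void)unused;
    return 0;
}

static const struct demo_method point_methods[] = {
    { "scaledSum:", (demo_imp)point_scaled_sum },
};
static const struct demo_class point_class = { point_methods, 1 };

/* Method lookup in plain C. Returns NULL for a nil receiver or an
   unknown selector (a simplification of what a real runtime does). */
demo_imp demo_lookup_imp(const struct demo_object *obj, const char *sel) {
    assert(sel != NULL);
    if (obj == NULL)
        return NULL;
    for (size_t i = 0; i < obj->isa->count; i++)
        if (strcmp(obj->isa->methods[i].sel, sel) == 0)
            return obj->isa->methods[i].imp;
    return NULL;
}

/* What one message send expands to at the call site: look up, fall back
   to the nil IMP, then make a normal typed call through the pointer. */
int demo_send_scaled_sum(struct demo_object *obj, int factor) {
    demo_imp found = demo_lookup_imp(obj, "scaledSum:");
    demo_int_imp imp = found ? (demo_int_imp)found : demo_nil_imp;
    return imp(obj, "scaledSum:", factor);
}
```

Here the compiler sets up the arguments for the final call itself, so nothing has to be forwarded blindly, but the lookup is an ordinary function call that returns before the method is invoked.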


iTunes and other Apple software on Windows runs with code that looks pretty much exactly like that. Those code bases are compiled from Obj-C into plain C by the objc rewriter module in clang, then compiled into machine code by Visual Studio. You can take a look at what the clang rewriter does here: http://clang.llvm.org/doxygen/RewriteObjC_8cpp_source.html


That's very interesting, I've never heard of that (but assumed it would be something pretty much along those lines).

What's the Obj-C runtime used on Windows for those apps? Is it a descendant of the NeXT runtime which did run on Win32 at one point, or something else?


That approach, if inlined, #define-style, for every message send, would probably be better than going through objc_msgSend. Since each method call then gets its own branch instruction, rather than every single last one sharing the one jmp at the end of objc_msgSend, you'd get less branch target history contention.

Most message sends end up at the same place each time, most of the time, so I'd think this is what you want. But whether it's actually an issue in practice I couldn't say, and if nobody's done it already I suppose it might not be...


IIRC, C compilers in the '90s weren't all that good at optimising.


>Sure, sending messages is low-level and needs to be high performance but that, to me, doesn't necessitate "we have to do this by hand" asm instead of C.

Well, if anything ever necessitates finely hand-tuned assembly code instead of leaving it to the compiler, it's precisely a case such as objc_msgSend.



