ELF files on Linux

haberman · on May 16, 2019

I wrote Bloaty (https://github.com/google/bloaty) which involved writing a totally custom ELF file parser. Here are some epiphanies I had about ELF while writing it.

ELF (like Mach-O, PE, etc.) is designed to optimize the creation of a process image. The runtime loader mainly just has to mmap() a bunch of file ranges into memory with various permissions. This is quite different from loading a .jar file, .pyc, etc., which involves building a runtime heap and loading objects into that heap.
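
As a rough illustration of the "mmap() with various permissions" point, here is a minimal sketch of how a segment's permission flags could be turned into mmap() protection bits. It assumes Linux with glibc's <elf.h>; segment_prot is a made-up helper name for this example, not something taken from a real loader.

```c
#include <assert.h>
#include <elf.h>       /* Elf64_Phdr, PF_R, PF_W, PF_X */
#include <stddef.h>
#include <sys/mman.h>  /* PROT_NONE, PROT_READ, PROT_WRITE, PROT_EXEC */

/* Translate a program header's p_flags into mmap() protection bits.
 * Hypothetical helper for illustration only. */
static int segment_prot(const Elf64_Phdr *ph)
{
    int prot = PROT_NONE;

    assert(ph != NULL);
    if (ph->p_flags & PF_R) prot |= PROT_READ;
    if (ph->p_flags & PF_W) prot |= PROT_WRITE;
    if (ph->p_flags & PF_X) prot |= PROT_EXEC;
    return prot;
}
```

A real loader would pass the result to mmap() together with the segment's file offset and virtual address, after rounding both to page boundaries.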

ELF has two file-level tables: sections and segments (the latter are also called program headers). Things clicked for me when I realized: sections are for the linker and segments are for the runtime loader. Sections are the atomic unit of data that linkers operate on: the linker will never rearrange data within a section (though it may concatenate several input sections into a single output section). The loader doesn't even look at the section table AFAIK; everything needed to load the binary is put into segments / program headers.

Only some parts of the binary are actually read/loaded when the binary is executed. Debugging info may bloat the binary but it doesn't cost any RAM at runtime because it's never loaded unless you run a debugger. Bloaty makes this clear by showing both VM size and file size: https://github.com/google/bloaty#running-bloaty
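
To make the file-size versus VM-size distinction concrete, here is a minimal sketch that walks the program header table of an ELF64 image already read into memory and adds up p_filesz and p_memsz over the PT_LOAD segments. It assumes glibc's <elf.h> and a file whose byte order matches the host. The names load_sizes and sum_load_sizes are invented for this example, and this is not a description of how Bloaty computes its numbers.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct load_sizes {
    uint64_t file_bytes;  /* sum of p_filesz over PT_LOAD segments */
    uint64_t vm_bytes;    /* sum of p_memsz over PT_LOAD segments */
};

/* Sum the on-disk and in-memory sizes of all loadable segments.
 * Assumes a host-endian ELF64 image. Returns 0 on success, -1 if malformed. */
static int sum_load_sizes(const unsigned char *image, size_t len,
                          struct load_sizes *out)
{
    Elf64_Ehdr eh;

    assert(image != NULL && out != NULL);
    out->file_bytes = 0;
    out->vm_bytes = 0;

    if (len < sizeof eh) return -1;
    memcpy(&eh, image, sizeof eh);
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0) return -1;
    if (eh.e_ident[EI_CLASS] != ELFCLASS64) return -1;
    if (eh.e_phoff > len) return -1;

    uint64_t avail = len - eh.e_phoff;  /* bytes from table start to EOF */
    for (uint16_t i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;
        uint64_t rel = (uint64_t)i * eh.e_phentsize;

        if (eh.e_phentsize < sizeof ph) return -1;
        if (rel > avail || avail - rel < sizeof ph) return -1;
        memcpy(&ph, image + eh.e_phoff + rel, sizeof ph);

        if (ph.p_type != PT_LOAD) continue;  /* only loadable segments count */
        out->file_bytes += ph.p_filesz;
        out->vm_bytes += ph.p_memsz;
    }
    return 0;
}
```

Note that the loop never looks at the section table, and that sections holding debug info are not covered by any PT_LOAD segment, so they contribute to neither total here.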

AceJohnny2 · on May 16, 2019

I've been wondering about qualitative differences between Mach-O and ELF, after hearing a Mach developer trash the ELF format, but I don't know enough about either to comment. Do you have any insight?

haberman · on May 16, 2019

That is a deep and interesting question. I'm not sure I can give a great answer, but here are a few thoughts.

If we look at the file format itself (separate from the features/semantics of the linker and loader), I think ELF is simpler and more orthogonal. You can iterate over the section/segment tables of an ELF file without knowing anything about what each section/segment means. ELF nicely decouples the high-level "container" aspect of the file format from the lower-level semantics of how you interpret each section/segment in the linker and loader.

Mach-O, on the other hand, couples these two concepts together. The top-level table is an array of "load commands", each with its own type, but you can't even parse a load command until you know what type it is. Unlike ELF, the entries of this table do not have a generic format or even a consistent size. If you haven't written code to specifically recognize a given command type, all you can do as fallback behavior is skip it. To me ELF feels like a refactoring of Mach-O to make it a little more general and layered.
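
A minimal sketch of that fallback behavior, assuming only the two-field prefix (command type, then total size in bytes) that every load command starts with. The struct is redeclared locally under the made-up name lc_prefix so the example builds without Apple's headers; count_load_commands is likewise a hypothetical helper, and the buffer is assumed to be in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The common prefix of every Mach-O load command: a type and a total size.
 * Local stand-in for the struct Apple declares in <mach-o/loader.h>. */
struct lc_prefix {
    uint32_t cmd;      /* command type, e.g. 0x19 for LC_SEGMENT_64 */
    uint32_t cmdsize;  /* size of the whole command, prefix included */
};

/* Count the commands of type `wanted` among `ncmds` load commands stored
 * back to back in `cmds`. Commands of any other type are skipped using
 * cmdsize alone. Returns -1 if the table is malformed. */
static int count_load_commands(const unsigned char *cmds, size_t len,
                               uint32_t ncmds, uint32_t wanted)
{
    size_t off = 0;
    int found = 0;

    assert(cmds != NULL);
    for (uint32_t i = 0; i < ncmds; i++) {
        struct lc_prefix lc;

        if (len - off < sizeof lc) return -1;
        memcpy(&lc, cmds + off, sizeof lc);
        if (lc.cmdsize < sizeof lc || lc.cmdsize > len - off) return -1;

        if (lc.cmd == wanted) found++;
        off += lc.cmdsize;  /* the only generic operation: step over it */
    }
    return found;
}
```

Anything beyond the prefix has a layout that depends on cmd, so a tool has to carry a per-type struct for each command it wants to interpret.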

If we consider the actual semantics and features of the file formats, there are pros and cons to both. Mach-O has built-in support for fat (multi-architecture) binaries, which is kind of nifty, though I've never actually used it myself. Mach-O distinguishes between "dylib" and "bundle" for shared libraries -- for the life of me I can never remember the difference between these two -- whereas ELF just has one type of shared library (see https://docstore.mik.ua/orelly/unix3/mac/ch05_03.htm). The distinction seems to add complexity and I'm not sure I understand the benefit. Mach-O has two-level namespaces (dynamic symbols are resolved by both name and the library they come from) -- colliding symbols aren't generally a problem I've seen with ELF, but maybe it's useful in some cases. ELF makes symbol interposition easy with LD_PRELOAD, though Mach-O seems to have its own version of this that I've never tried: https://stackoverflow.com/questions/12609728/changing-functi.... Overall I prefer ELF.

AceJohnny2 · on May 16, 2019

Thanks!

Regarding multi-arch support, Ryan C. Gordon (Linux game porter extraordinaire, icculus.org) had proposed FatELF [1] back in 2009 (LWN coverage [2]). It seemed simple enough to implement, but never really picked up steam (IMHO for reasons that speak of the culture of the Linux ecosystem).

[1] http://icculus.org/fatelf/

[2] https://lwn.net/Articles/359070/

yjftsjthsd-h · on May 16, 2019

> reasons that speak of the culture of the Linux ecosystem

"Everyone ships source; just recompile"? It would be convenient, but with source and a compiler you can hit everything anyways.

AceJohnny2 · on May 16, 2019

Yep, that's indeed my perspective, and I think that mindset dismisses the effort required to deliver closed-source binaries with long-term support.

glandium · on May 16, 2019

Mach-O also has a more compact bytecode-like representation for relocations, while ELF just wastes tons of space. See https://glandium.org/blog/?p=1177

monocasa · on May 16, 2019

Not haberman, but I've written loaders for both Mach-O and ELF, and really prefer ELF.

ELF is mainly structured like descriptive tables of how the relevant pieces look in memory; Mach-O is more structured like a script of commands that you run to load the binary. There are a couple of places where the model breaks down for ELF (DWARF and GNU_STACK both feel more Mach-O), but if you're playing with binaries for non-standard uses, ELF just feels a lot cleaner IMO.

I'd love to hear the Mach developer's arguments though.

haberman · on May 16, 2019

Interesting, what was your job that required writing both a Mach-O and ELF loader?

monocasa · on May 16, 2019

Binary analysis and introspection tools.

jcranmer · on May 16, 2019

I only know the ELF format in detail, but I think the major complaint is that ELF uses a flat namespace for symbols whereas Mach-O has a two-level namespace. Furthermore, ELF lets you preload dynamic libraries such that you can override calls even to symbols provided in the same shared object.

saagarjha · on May 16, 2019

Note that it's possible to force a flat namespace for Mach-O through a variety of linker flags and DYLD environment variables. And I'm not sure if it does everything you'd want it to, but you can use DYLD_INSERT_LIBRARIES to preload Mach-O dynamic libraries as well.

monocasa · on May 16, 2019

I just want to say thanks for bloaty. I've used it all the way from 100MB backend server programs, to deeply embedded, bare metal STM32F apps measured in KB.

haberman · on May 16, 2019

That makes me really happy to hear. I'm glad Bloaty is useful for you!

CalChris · on May 16, 2019

I've worked with both ELF and Mach-O. They're really very similar from a loading standpoint although Mach-O is a little more wordy and precise. One difference is that Mach-O allows you to specify the initial stack (x86_thread_state64_t.rsp) and ELF (for some reason which I've never understood) doesn't; but maybe there are ELF extensions in the embedded space that I don't know about.

With most Mach-O files, rsp is set to 0 and a default is used. The code is in bsd_i386.c/thread_userstack():

  if (state25->rsp) {
    *user_stack = state25->rsp;
    if (customstack)
      *customstack = 1;
  } else {
    *user_stack = VM_USRSTACK64;
    if (customstack)
      *customstack = 0;
  }

This can be set on the OSX ld command line with -stack_addr. OSX uses an obscenely old fork of GNU ld for Mach-O. No linker scripts.

Things like zero pages are explicit (but required!) in OSX while they are implicit in ELF+Linux. Also llvm lld's ELF support is first class whereas its Mach-O support is less fleshed out. With Mach-O it helps to read the Darwin source files to figure out what it does with a file. Same to a lesser degree with Linux.

IshKebab · on May 16, 2019

Sections and segments are so badly named. They may as well have called them data and different-data.

I guess it is too late to rename them though.

zimbatm · on May 16, 2019

If you ever need to tweak or inspect an existing binary, https://github.com/NixOS/patchelf is great.

royragsdale · on May 16, 2019

LIEF (Library to Instrument Executable Formats, https://lief.quarkslab.com/) is another great programmatic option.

mzs · on May 16, 2019

/usr/ccs/bin/dump on Solaris is great for shell scripts too.

kccqzy · on May 16, 2019

One thing that's been on my mind, but that I haven't been able to spend time investigating, is the fact that on my machine (Ubuntu 19.04), almost all distribution-installed executables are not ELF executables per se, but ELF shared objects. Running `file /bin/ls` shows that it's an ELF 64-bit LSB shared object. Running `readelf -h /bin/ls` also says that the type is DYN. Is the executable type basically deprecated now?

jzwinck · on May 16, 2019

Position Independent Executable (PIE) files are detected as shared libraries because they use the same ELF file type (ET_DYN) as position-independent shared libraries. The ELF folks could have added a new type but did not, leading to some confusion like this: https://bugs.launchpad.net/ubuntu/+source/shared-mime-info/+...

PIE is a security feature, which is why it has proliferated on newer systems. See https://access.redhat.com/blogs/766093/posts/1975793
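
For reference, the type that `readelf -h` reports comes from the e_type field of the ELF header. Here is a minimal sketch of reading it from an ELF64 image held in memory, assuming glibc's <elf.h> and a file in host byte order; elf_type_name is a made-up helper name.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Return a short name for the e_type of a host-endian ELF64 image.
 * Hypothetical helper for illustration only. */
static const char *elf_type_name(const unsigned char *image, size_t len)
{
    Elf64_Ehdr eh;

    assert(image != NULL);
    if (len < sizeof eh) return "not ELF64";
    memcpy(&eh, image, sizeof eh);
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0) return "not ELF64";
    if (eh.e_ident[EI_CLASS] != ELFCLASS64) return "not ELF64";

    switch (eh.e_type) {
    case ET_REL:  return "REL";   /* relocatable object file (.o) */
    case ET_EXEC: return "EXEC";  /* fixed-address executable */
    case ET_DYN:  return "DYN";   /* shared library or PIE executable */
    case ET_CORE: return "CORE";  /* core dump */
    default:      return "other";
    }
}
```

Since a PIE executable and a shared library both come back as "DYN", a tool that wants to tell them apart has to look at something other than e_type.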

usr1106 · on May 16, 2019

Here is an example to try and see yourself:

  $ cat show-addr.c
  #include <stdio.h>
  
  int main(int argc, char **argv)
  {
    printf("main() is at %p\n", (void *)main);
  }
  $ gcc -o show-addr show-addr.c
  $ file show-addr
  show-addr: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/l, for GNU/Linux 2.6.32, BuildID[sha1]=60c8b61a7040adccc90934bc79e24342eecae15a, not stripped
  $ ./show-addr
  main() is at 0x400526
  $ ./show-addr
  main() is at 0x400526
  $ gcc -pie -fPIC -o show-addr.pie show-addr.c
  $ file show-addr.pie 
  show-addr.pie: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/l, for GNU/Linux 2.6.32, BuildID[sha1]=73ae8065a65aad4b829567b7ce3464fb5d1e3fc3, not stripped
  $ ./show-addr.pie 
  main() is at 0x55c722ae1750
  $ ./show-addr.pie 
  main() is at 0x5649bf97b750
  $

(edit: incomplete copy-paste fixed)

saagarjha · on May 16, 2019

A somewhat lesser known but (IMO) fun fact is that if you compile your executable with -export-dynamic, you can dlopen it and dlsym for functions inside of it just like you would with any other shared library (note that if your goal is to just load the executable, such as if it already has a constructor defined, you don't even need this flag).

vectorEQ · on May 16, 2019

Nice article, but I wish people would elaborate more on relocations instead of always skipping them. They're a very important part of understanding how ELF works when it's executed.

jcranmer · on May 16, 2019

Relocations boil down to this:

A relocation entry contains a location of the patch, the symbol to use in relocation, an optional addend, and a type. The type tells you how to compute the relocation and is completely defined by the processor-specific ABI.

The simple relocations boil down to "add the addend to the address of the symbol, subtract the address of the relocation (for PC-relative types), and store it as a signed N-bit number". There are more complex relocations that involve things such as symbol sizes, TLS relocations, or the GOT and the PLT.
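
A minimal sketch of the two simplest x86-64 cases, using the relocation type constants from glibc's <elf.h>: R_X86_64_64 stores S + A as a 64-bit value, and R_X86_64_PC32 stores S + A - P as a signed 32-bit value, where S is the symbol address, A the addend, and P the address of the patched location. The function name apply_reloc and its parameter layout are invented for this example, and it assumes a little-endian host.

```c
#include <assert.h>
#include <elf.h>     /* R_X86_64_64, R_X86_64_PC32 */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Patch one relocation into `section`, which is assumed to end up at
 * address `base`. `offset` is the location of the patch in the section.
 * Returns 0 on success, -1 for an unsupported type, an out-of-range
 * offset, or a value that does not fit the field. */
static int apply_reloc(unsigned char *section, size_t size, uint64_t base,
                       uint64_t offset, uint32_t type,
                       uint64_t sym_addr, int64_t addend)
{
    assert(section != NULL);
    uint64_t value = sym_addr + (uint64_t)addend;  /* S + A */

    switch (type) {
    case R_X86_64_64: {
        if (offset > size || size - offset < 8) return -1;
        memcpy(section + offset, &value, 8);
        return 0;
    }
    case R_X86_64_PC32: {
        if (offset > size || size - offset < 4) return -1;
        int64_t rel = (int64_t)(value - (base + offset));  /* S + A - P */
        if (rel < INT32_MIN || rel > INT32_MAX) return -1;  /* overflow */
        int32_t rel32 = (int32_t)rel;
        memcpy(section + offset, &rel32, 4);
        return 0;
    }
    default:
        return -1;  /* GOT, PLT, TLS, etc. are out of scope here */
    }
}
```

For example, with the section at 0x1000, a PC32 relocation at offset 4 against a symbol at 0x2000 with addend -4 stores 0x2000 - 4 - 0x1004 = 0xFF8.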

cmrdporcupine · on May 16, 2019

Absolutely, I once wrote an ELF loader for the Atari ST (never finished, but almost), and the documentation on relocation was absolutely arcane.

saagarjha · on May 16, 2019

> For those who love to read actual source code, have a look at a documented ELF structure header file from Apple.

Surely the Linux kernel would be a better source?