Please excuse my naivety, but I have only recently gotten interested in compiler...

vidarh · on Jan 20, 2014

The hard part is not the high level algorithm - most of the common ones can be prototyped for toy systems in trivial numbers of lines.

The hard part is system interactions. E.g. the GC needs a way to identify all the "roots" (references from outside the heap, such as registers, the stack, or pointers that may have been retained by external libraries etc.).

It also needs a way to identify which parts of an object are pointers and this varies with language. "Generic" GC's like Boehm's solve this by assuming you've done nothing to obscure the pointer, and then assumes everything that could be a pointer into the heap is, but this fails miserably for any language implementation that uses bits for type tagging, for example, and retains too much memory in many other situations because it isn't precisely determining which fields are pointers.

Then you have issues like threading models (threading and GC are intimately linked, as if you don't block all threads during marking or moving/collection, you have all kinds of additional exciting issues to deal with to ensure you don't break things)

saurabhnanda · on Jan 20, 2014

Thank you for the replies. A follow-up question, how does any GC/runtime/VM know which parts of the memory are actual data vis-a-vis pointers to other parts of the memory? Some sort of global pointer table, perhaps? If so, why cant the hypothetical GC lib define an API to share said pointer-table?

blt · on Jan 20, 2014

pointers that may have been retained by external libraries

Oh, if only this were easy...

jeffffff · on Jan 20, 2014

There are several issues in implementing gc as a library. The first thing a concurrent GC must do to make progress is identify the stack roots. This involves stopping all the threads and reading their stacks and register values. To stop all the threads safely you have to generate safe points in the code, which requires tight integration with the compiler. The second issue is that in a concurrent mark sweep collector for a language with mutable data (so everthing except haskell) you must put write barriers on all your writes. If you have object A that has been scanned and object B that hasn't been scanned, and you make object A have the only reference to object B, the write barriers will make sure that B is known to be referenced and that the things B references will be scanned. Emitting write barriers as well as their implementation also requires tight coupling between the compiler and the GC. Write barriers can alternatively be implemented with memory protection and signal handlers but usually with a performance penalty.

Generational GC introduces more problems. To be able to only scan the young generation in isolation, you must know either that there are no pointers from the old generation into the young generation or you must know precisely which objects in the young generation are referenced from the old generation. The first is the approach taken by haskell but unfortunately languages with mutable data can't take advantage of that so they must use the second approach, which is implemented with more write barriers (and more compiler-GC coupling).

Copying collectors are generally either stop-the-world or generational/thread local and they run in that thread while collecting. Stop the world again requires safe points but thread local doesn't. You still can't really do the thread local gc in a library because you have to be sure that the compiler isn't keeping pointers only in registers while you scan and you have to emit write barriers to know what's referenced in your thread local generation from older generations.

justincormack · on Jan 20, 2014

There are some general GC libraries eg Boehm. however there is a lot of specific detail in a language and allocator that means you may be able to get better performance.