Garbage Collection in Ruby 2.1 (tmm1.net)
163 points by tmm1 on Dec 30, 2013 | 30 comments



It's interesting to compare this to how python does it (reference counting with generations): http://patshaughnessy.net/2013/10/30/generational-gc-in-pyth...

"At first glance, Ruby and Python seem to implement garbage collection very differently. Ruby uses John McCarthy’s original mark and sweep algorithm, while Python uses reference counting. But when we look more closely, we see that Python uses bits of the mark and sweep idea to handle cyclic references, and that both Ruby and Python use generational garbage collection in similar ways. Python uses three separate generations, while Ruby 2.1 uses two.

This similarity should not be a surprise. Both languages are using computer science research that was done decades ago – before either Ruby or Python were even invented. I find it fascinating that when you look “under the hood” at different programming languages, you often find the same fundamental ideas and algorithms are used by all of them. Modern programming languages owe a great deal to the ground breaking computer science research that John McCarthy and his contemporaries did back in the 1960s and 1970s."


I think Henry Baker had a paper on why generational and refcounting GCs are ultimately equivalent. Couldn't find it quickly, though, to give you a link.


I think you might be thinking of this paper: A Unified Theory of Garbage Collection [1], which shows how most modern collectors are hybrids of tracing and reference counting.

[1] http://atlas.cs.virginia.edu/~weimer/2008-415/reading/bacon-...


For the record: This is a very good paper. It's extremely easy to read, even for somebody who knows nothing about the background, and the scientific result is absolutely beautiful!


I just wanted to say thank you for this comment. I'm reading the paper because of it. Can you please point me to your other favorites that are good + accessible?


I wouldn't even know where to start :-)


Haha you aren't getting off that easy. Could you please tell me, say, 3 that come to your mind?


Oh, it was David Bacon. Thanks for finding this link.


Isn't it actually kind of sad that such important research doesn't get considered until 40 or 50 years later?


I don't think such research went unconsidered; it's been used constantly in many projects ever since.

The reason some platforms end up with "silly" choices like a stop-the-world mark-and-sweep GC, or AST-walking interpreters, is likely that optimization wasn't a goal in the original implementation, and it's hard to retrofit while keeping compatibility.


Also interesting to compare this with how the G1 collector for the JVM handles it: http://www.infoq.com/articles/G1-One-Garbage-Collector-To-Ru...

EDIT: Better link: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.63....


So Ruby objects occupy 40 bytes each, inside pages on the eden heap.

That's pretty huge. That's several times larger than some other dynamic languages -- languages that have essentially the same features as Ruby.


Not really. Python primitives are smaller (24 bytes). Python strings are 37 bytes + 1 byte per character. Python objects are 72 bytes. Python classes are 104 bytes. (Got these numbers playing around with sys.getsizeof in the Python console.)

Primitives are a bit bigger in ruby, but they aren't really primitives anyway (everything is an object).
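
On the Ruby side you can poke at the same thing with ObjectSpace.memsize_of (rough sketch; the reported sizes are only hints, and whether the 40-byte slot itself is counted varies by Ruby version):

    require 'objspace'                   # ships with MRI, exposes memsize_of

    ObjectSpace.memsize_of(1)            # Fixnums are immediates, so ~0 extra bytes
    ObjectSpace.memsize_of(Object.new)   # a plain object: roughly one heap slot
    ObjectSpace.memsize_of("x" * 100)    # a String: counts its malloc'ed character buffer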


Python is another offender!


What languages are you referring to then? Perl and PHP are really the only others that are sort of in the same class as Ruby and Python.


Smalltalk, for one.


Perl objects are even bigger and need one indirection per access. A perl head is 32 bytes plus a 48-byte body, i.e. 80 bytes on 64-bit systems.

For my perl rewrite p2 I need one single word per object on the stack. Common Lisps usually need two words, the class pointer and the tagged value.

I really liked the visualization of the various GCs btw. Excellent work!


I have no idea of the sizes in other language implementations, but note that 40 bytes is on x86-64; on 32-bit it's either 20 or 24.
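
If you want to check your own build, newer MRIs expose the slot size directly; I believe this constant is available as of 2.1, so treat it as a sketch:

    GC::INTERNAL_CONSTANTS[:RVALUE_SIZE]     # => 40 on a 64-bit build
    GC::INTERNAL_CONSTANTS[:HEAP_OBJ_LIMIT]  # number of object slots per heap page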


This is why running large Ruby apps on 64-bit OSes can take up a lot more memory than on 32-bit.


Isn't that true of most platforms? I know the JVM has compressed oops to limit this issue, but I'm not aware of similar things elsewhere.


To me, it's interesting that after all this time and with the prevalence of languages that rely heavily on garbage collection, garbage collectors still don't seem to scale to large heaps.

Large heaps mean long pauses when doing the "mark" phase of the generation holding most of the objects. That seems like a big problem that will increase as memory sizes increase.

Maybe the way I'm looking at it is too simplistic and there are better methods now. But even in Java, which has had plenty of time to work these issues out, I have seen issues with long GC pauses.

My impression right now is that GCs just don't scale to large heaps. To use more memory, you need to either manage memory yourself (which not a lot of modern languages allow), or increase the number of independent heaps (by using more tasks/processes).

Please enlighten me if I'm wrong here.


That seems to be the case in Oracle Java at least. With 80+ GB heaps, you can experience multi-minute stop-the-world collections when a full GC strikes. Luckily, that can be tuned to be very rare (for the usual load).

For better latency without such hiccups, the only solution I know of is Azul's Zing (http://www.azulsystems.com/zing/pgc), which really did away with large pauses, at least with our software.


What is "the usual load"?

Interesting; I'll check out the Azul one. I'm a little skeptical, but it might be an improvement. Did you see better throughput or just reduced latency?


By usual I mean what our application was tuned for.

Which, sadly, may or may not be what it encounters. There are still some usage patterns that can cause a large amount of time to be spent doing GC (without Zing).

Latency was greatly reduced, I cannot really say about throughput (except that it was not clearly worse at least).


The important number is the average pause time. Here with Ruby 2.1 it's 7ms for a minor and 58ms for a major sweep, for a typical ~500K heap app.
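
You can pull those numbers out of MRI itself; a quick sketch with the built-in profiler (it adds some overhead, so only enable it while measuring):

    GC::Profiler.enable             # start recording per-GC timings

    # ... exercise the app for a while ...

    GC::Profiler.report             # per-run table: invoke time, heap use, GC time
    GC.stat[:minor_gc_count]        # minor (young-gen) runs so far (Ruby 2.1+)
    GC.stat[:major_gc_count]        # major (full) runs so far (Ruby 2.1+)
    GC::Profiler.disable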

In my potion-based GC (a stack-scanning, compacting, Cheney two-finger GC) the average GC pause is 3ms, but it's not concurrent (multi-threaded) yet.

All of these can be considered real-time, i.e. < 15ms. For bigger heaps the scans need to be incremental (saving and restoring GC state, which is easy with libgc or Cheney).

The best Java GC I found needed a 150ms pause time. Good Lisps have real-time GCs.


What's this RailsApp.preload_all method mentioned in this post? Is that a github thing?

Great post btw. I am looking forward to getting some production apps onto 2.1 in the coming weeks and seeing how performance characteristics have changed in the new release.


We implement preload_all, which loops over app/{models,controllers}/ and requires everything. This method is generally called from config.ru, and happens in the unicorn master before it forks off workers.
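
Roughly something like this (a simplified sketch of the idea, not our actual code; RailsApp and the exact glob are just illustrative):

    module RailsApp
      # Eagerly require app code in the unicorn master, before it forks
      # workers, so the loaded files are shared via copy-on-write.
      def self.preload_all
        Dir[Rails.root.join("app/{models,controllers}/**/*.rb").to_s].sort.each do |file|
          require file
        end
      end
    end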

I recently upstreamed a warmup method for Rack::Builder you can also use for this purpose: https://github.com/rack/rack/pull/617
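
Usage in config.ru looks roughly like this (a sketch from memory of the PR; check the Rack docs for the exact API):

    # config.ru
    warmup do |app|
      client = Rack::MockRequest.new(app)
      client.get('/')                # hit the app once before real traffic arrives
    end

    run MyApp                        # MyApp stands in for your actual Rack app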


Thanks for that link (and for submitting it!). I've done something similar in deploy scripts, to warm up a fresh instance before bringing it into rotation - even better to have it baked into Rack like that.


That's interesting. Thanks for sharing. I'll have to check this out.


And all of this is very similar to Google's V8 (Chrome) and Mozilla's SpiderMonkey. Both already have incremental GC done; for Ruby that is scheduled for v2.2.

Now just when will Ruby get a Method or Tracing JIT?



