Gosh, I hate to be that "old guy," but using bitmaps for mark & sweep GC dates from the early 70's.
In particular, in GC implementations for Lisp and Lisp-like languages (such as the ECL implementation we used at Harvard in the mid-70's) where CONS cells were two eighteen-bit halfwords (CAR and CDR) fitting in a 36-bit word, there was no place for mark bits, so you'd use a bitmap instead (as rediscovered here).
And we used the same technique for finding the page headers (containing, e.g., the mark bitmaps) for each part of the heap: by aligning pages to a power of two, you could chop any pointer down by and'ing off the lower bits to get to its page header.
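For the curious, here's a minimal sketch of that trick in C. The page layout and all names are made up for illustration, and pages are assumed to be allocated on boundaries aligned to their power-of-two size:

    /* Hypothetical page layout: a header (holding the mark bitmap)
     * at the start of each PAGE_SIZE-aligned page, cells after it. */
    #include <stdint.h>
    #include <stddef.h>

    #define PAGE_SIZE 4096u                      /* must be a power of two */
    #define CELL_SIZE (2 * sizeof(void *))       /* e.g., a two-pointer cons */
    #define CELLS_PER_PAGE (PAGE_SIZE / CELL_SIZE)

    struct page_header {
        uint8_t mark_bits[CELLS_PER_PAGE / 8];   /* one bit per cell slot */
        /* ... other per-page metadata ... */
    };

    /* Chop any interior pointer down to its page header by and'ing
     * off the low bits. */
    static struct page_header *page_of(void *p) {
        return (struct page_header *)
            ((uintptr_t)p & ~(uintptr_t)(PAGE_SIZE - 1));
    }

    /* Set the mark bit for the cell containing p. Indexing is relative
     * to the page start, so the few bitmap slots that overlap the
     * header itself simply go unused. */
    static void mark(void *p) {
        struct page_header *h = page_of(p);
        size_t i = ((uintptr_t)p - (uintptr_t)h) / CELL_SIZE;
        h->mark_bits[i / 8] |= (uint8_t)(1u << (i % 8));
    }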
There really ain't much new under the sun. Too bad every generation has to rediscover all this stuff.
Indeed - information about good garbage collectors is readily available. Paul R. Wilson's "Uniprocessor Garbage Collection Techniques" (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.47.2...) is an excellent survey, and Jones, Hosking, and Moss's _The Garbage Collection Handbook_ is an excellent reference. (The previous book by Jones and Lins is also great, but the new one adds important material on concurrent GCs, real-time GC, and interactions between the GC and runtime systems.)
As an exercise, I wrote a simple standalone mark-bit-page & lazy sweeping garbage collector for C. (https://github.com/silentbicycle/oscar) Marking occurs in a user-provided callback, since I don't assume any particular data structure. It's only ~300 LOC.
> Too bad every generation has to rediscover all this stuff.
Just because this technique existed 30 years ago doesn't mean that the Ruby folks only recently "rediscovered" it and excitedly implemented it while musing, "how did we miss this technique?!?" There are likely lots of reasons why this wasn't used before. The abstract of their paper seems to shed some light on this:
> [...] But Ruby interpreter runs on various platforms, and we do not have portable memory allocation API to obtain aligned memory region without wasting region. In this paper, we propose portable scheme to map from object pointers to corresponding bitmap table in constant time. [...]
I may be totally wrong, and the paper is entirely in Japanese, but I think there is more to it.
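For context, the classic portable workaround is to over-allocate with plain malloc and round the pointer up, which costs you up to align-1 wasted bytes per region; presumably that's the "wasting region" the abstract mentions. A sketch of that workaround (not the paper's actual scheme), assuming align is a power of two:

    #include <stdlib.h>
    #include <stdint.h>

    /* Portable aligned allocation without posix_memalign et al.:
     * over-allocate by align-1 bytes and round up. The caller must
     * keep *raw_out around to free() the block later. */
    void *alloc_aligned(size_t size, size_t align, void **raw_out) {
        void *raw = malloc(size + align - 1);  /* waste: up to align-1 bytes */
        if (raw == NULL)
            return NULL;
        *raw_out = raw;
        uintptr_t p = ((uintptr_t)raw + (align - 1)) & ~(uintptr_t)(align - 1);
        return (void *)p;
    }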
Ruby has been around since 1995. That's over 15 years.
Implementing bitmaps for mark-and-sweep is a project well within the skills of a final-year undergrad, or a first-year graduate student of CS.
You can say many things, but I doubt they didn't implement this because they didn't have time, or didn't have the skills to do so. They merely prioritized against it from day one.
Not really excited. Sorry. To me this is an incremental improvement to a method of GC that is just not going to cut it.
Rubinius uses a generational GC. There are definitely some parts of Rubinius that are (currently) slower than MRI, but memory management stands to be one of the places where Rubinius will demolish MRI in the long run.
JRuby/JVM uses (from what I understand) a Generational-first hybrid GC. There's been so much work and analysis done on the JVM that there's no way in hell any mark and sweep GC is going to compete.
Honestly if MRI is going to remain competitive it needs to completely rethink the way it does GC.
JRuby seems like it has the most potential in the GC department; most of the interesting GC papers I've seen describe work done on the JVM.
Most [1] modern GC implementations use generational copying collectors. A copying GC does not need to maintain a bit for each allocated object, and therefore will not break copy-on-write.
They are also much faster for many common workloads (a sketch of the copying approach follows the footnote below).
[1] I suppose this is debatable. Oddly, many modern scripting languages use ancient compiler technology for no discernible reason. See also: the GIL.
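For anyone unfamiliar with why a copying collector needs no mark bits: in Cheney-style copying, liveness is recorded by a forwarding pointer written into the old copy of each reached object, not by a bit in every object. A minimal sketch, with a made-up object layout:

    #include <stddef.h>
    #include <string.h>

    typedef struct obj {
        size_t size;             /* total size in bytes, incl. header;
                                    assumed to keep alloc_ptr aligned */
        struct obj *forward;     /* set once the object has been copied */
        size_t nfields;          /* number of pointer fields below */
        struct obj *fields[];    /* pointer fields, for tracing */
    } obj;

    static char *alloc_ptr;      /* bump pointer into to-space */

    /* Copy one object to to-space, leaving a forwarding pointer behind.
     * The forwarding pointer doubles as the "mark": no per-object bit. */
    static obj *evacuate(obj *o) {
        if (o == NULL || o->forward)
            return o ? o->forward : NULL;
        obj *copy = (obj *)alloc_ptr;
        memcpy(copy, o, o->size);
        copy->forward = NULL;
        alloc_ptr += o->size;
        o->forward = copy;
        return copy;
    }

    /* Cheney scan: to-space itself serves as the work queue. */
    static void collect(obj **roots, size_t nroots) {
        char *scan = alloc_ptr;
        for (size_t i = 0; i < nroots; i++)
            roots[i] = evacuate(roots[i]);
        while (scan < alloc_ptr) {
            obj *o = (obj *)scan;
            for (size_t i = 0; i < o->nfields; i++)
                o->fields[i] = evacuate(o->fields[i]);
            scan += o->size;
        }
    }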
> Oddly, many modern scripting languages use ancient compiler technology
Historical: most of them are 10~20 years old; "new compiler technologies" had not trickled down much when they were created, and they had no real need for the things those provided (not to mention those things have drawbacks: a GIL has better single-threaded performance than fine-grained locking, which is important when single-threaded is your primary workload).
Also manpower, probably: it's harder to implement a generational concurrent GC than refcounting or a basic M&S, and it's even harder to retrofit that into an existing codebase (hence PyPy having pluggable GC backends and defaulting to a hybrid GC).
And they both had to spend a lot of effort to get concurrent GC to work efficiently. The biggest downside of concurrent GC is that you need read barriers, not just write barriers as in single-threaded incremental garbage collection.
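To make that distinction concrete, here's a sketch of the two barrier flavors in C. gc_shade and gc_current_addr are hypothetical collector hooks; the write side is a Dijkstra-style insertion barrier:

    #include <stdbool.h>

    extern bool gc_marking;                  /* is a collection in progress? */
    extern void gc_shade(void *obj);         /* hypothetical: mark obj grey */
    extern void *gc_current_addr(void *obj); /* hypothetical: follow forwarding */

    /* Write barrier: enough for single-threaded incremental marking.
     * Runs only on pointer stores, which are comparatively rare. */
    static void write_field(void **slot, void *val) {
        if (gc_marking && val != NULL)
            gc_shade(val);
        *slot = val;
    }

    /* Read barrier: needed once the collector runs concurrently and may
     * be moving (or still marking) objects the mutator loads. Runs on
     * every pointer load, which is why it's so much more expensive. */
    static void *read_field(void **slot) {
        void *val = *slot;
        return (gc_marking && val != NULL) ? gc_current_addr(val) : val;
    }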
I have a theory, which has so far held up [1], that every remotely popular language that needs automatic storage reclamation ends up growing a generational and incremental/concurrent collector, either in its main branch or in a fork [2]. It's just a matter of time. Efficient automatic storage reclamation is hard.
[1] Language developers sometimes claim they don't need it. I believe they're on the wrong side of history.
[2] Forks are common for this, and for good reason; oftentimes languages that start with reference counting end up taking large dependencies on external libraries that manipulate reference counts (like Python with its Py_INCREF and Py_DECREF), at which point the language can't fully move to tracing GC without breaking its ecosystem.
Yes, but generally if you're forking an existing process, then maybe 90%+ of the objects have already been tenured into the old space. It's only young or newly created objects that are subsequently found to be long lived that will need to be copied.
The state of the art for garbage collection is two-space copying for the nursery, and mark-and-sweep for the tenured generation. That way you avoid copying the whole heap on every GC; you only get copying for the nursery. You may have a compaction pass for the tenured generation that runs very rarely (much more rarely than a regular minor or major GC). V8, Java, and OCaml all follow this model, and I believe .NET and Haskell do as well. Lua skips the generational part for simplicity's sake (as games care a lot more about latency than throughput); SpiderMonkey does too, because Mozilla hasn't implemented generational GC yet.
If fork() performance is important to you, you probably don't want compaction for the tenured generation at all. That's totally fine; while fragmentation will accumulate over time, it's no worse than C++. You'll still have copying for the nursery, but the nursery fills up and gets recycled so often that the COW optimization isn't really helping you there anyhow.
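That recycling rate is also why allocation is so cheap in these systems: the nursery fast path is just a bump pointer. A sketch, with hypothetical names:

    #include <stddef.h>

    extern char *nursery_ptr, *nursery_end;
    extern void *minor_gc_then_alloc(size_t size);  /* hypothetical slow path */

    /* Fast path: a couple of instructions. The slow path copies
     * survivors out of the nursery and resets nursery_ptr. */
    static void *nursery_alloc(size_t size) {
        if ((size_t)(nursery_end - nursery_ptr) < size)
            return minor_gc_then_alloc(size);
        void *p = nursery_ptr;
        nursery_ptr += size;
        return p;
    }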
Second most straightforward, perhaps. They previously had the bit for marking inside objects, rather than on a separate memory page of just mark bits, which interacts poorly with copy-on-write.
I'm pretty sure you got downvoted for writing something that didn't add anything to the conversation.
"Oh, verelo's never been excited about Ruby? Well that changes everything! I'm sure glad I know his personal opinion, offered without any commentary related to the content of the article or even off-topic supporting reasons!"
Fair enough, I'm honestly just a little tired of justifying why Ruby has never excited me (not so much on here, but in real life).
I find that other languages I'm already a lot more familiar with, and that have very well-documented functionality, offer everything I need.
My main issues have been installation headaches and deprecated calls in other people's code (without an easy way to swap in a new function). That's really just the start of it, but it's enough for me to have lost interest until I see some significant changes to the language, more significant than this one.
You would have the same problem with deprecations in any open-source project that is either unmaintained or that you can't update for whatever internal reasons. The language has nothing to do with that. As for installation issues, rvm solved those for me long ago.
RVM does help; deprecations... I don't know, though. I have found that the language has gone through a lot of changes in a short period of time, which is great, but it hasn't been very convenient for apps that have not been closely maintained.