It seems that the title of this submission does not actually reflect its content very well.
Title of submission: "Pauseless GC for OpenJDK"
Within the first minute of the presentation: "Shenandoah is not pauseless" (0:45, emphasis mine).
The presenter then continues, "It could be pauseless. We have plans to make it pauseless, but for now the first step is just doing concurrent evacuation."
Perhaps a more appropriate submission headline would be "Potentially pauseless GC for OpenJDK" or something along those lines?
For what it is worth, GCs are generally a major headache for systems-level code. The expense in terms of system behavior is far higher than the implied CPU time would suggest, because of the scheduling issues. This has always been the challenge of GC-centric environments.
I think there are two big challenges for garbage-collected environments with respect to core server systems code. First, an acceptable pause needs to be reliably sub-millisecond, or it breaks the userspace schedulers in a lot of high-performance server code. In the video presentation, they had a difficult time guaranteeing 10 milliseconds; they are working on it, but this is a problem.

Second, most high-performance server engines eliminate almost everything that looks like context switching, mostly because it is very expensive on modern hardware. The idea that there may be hundreds or even thousands of threads running on real servers does not comport with the reality of a lot of server environments, where context switching is severely restricted for good reason. Having a lot of background threads cleaning up memory contradicts many well-conceived software architectures. These days, everything is explicitly scheduled and bounded if server performance matters.
>In the video presentation, they had a difficult time guaranteeing 10 milliseconds.
In fairness, the estimate was 10 milliseconds for 100 GB heaps. Not everything (especially, I'd guess, programs that benefit from pauseless GC) deals with that much memory.
I used to be big into GC, especially due to my experience with the Oberon system, which showed me that workstations built in GC-enabled systems programming languages are possible.
This also led me to research systems like Mesa/Cedar, Modula-2+, Modula-3, and more recently Singularity.
Different approaches were taken: RC with a local GC for collecting cycles (Cedar/Modula-2+), GC as a kernel service (Modula-2+/3, Oberon), and static binaries with a runtime-enabled GC (Singularity).
So nowadays I tend to speak of automatic memory management in general, be it GC, RC, RAII, or via dataflow analysis.
System programming languages with GC trace back to Algol 68.
Yet, besides Apple's work with ARC or Microsoft's with Singularity/COM/WinRT, there isn't much happening in mainstream OSes, mostly due to backwards compatibility with the culture that followed UNIX rather than the Xerox PARC systems.
If systems like Mesa/Cedar had the focus and money spent that JVM or V8 enjoy, the computing world would be a bit different.
I am glad someone is giving Azul competition here; their products are ridiculously expensive. I hope Oracle also takes notice: great throughput is great, but sometimes latency bounds matter too.
Am I the only one who is a bit uneasy about forwarding pointers? In terms of memory use this is like disabling compressed oops, and in terms of performance it is like going back to Java 1.1 with an object table.
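For readers unfamiliar with the technique being discussed: Shenandoah's design adds one extra reference word per object (a Brooks-style forwarding pointer), and every access follows it. A minimal sketch of the idea, with illustrative names rather than actual HotSpot code:

```java
// Sketch of a Brooks-style forwarding pointer. Every object carries one
// extra reference slot that normally points to the object itself; during
// concurrent evacuation the collector swings it to the new copy, and a
// read barrier follows it on every access. Names here are illustrative.
final class BrooksObject {
    // The extra per-object word the comment above is uneasy about:
    BrooksObject forwardee = this;
    int payload;

    BrooksObject(int payload) { this.payload = payload; }

    // Read barrier: always dereference through the forwarding pointer.
    static BrooksObject resolve(BrooksObject obj) {
        return obj.forwardee;
    }
}

public class ForwardingDemo {
    public static void main(String[] args) {
        BrooksObject obj = new BrooksObject(42);
        // Before evacuation, the forwarding pointer is self-referential.
        System.out.println(BrooksObject.resolve(obj).payload);

        // The collector copies the object and updates the forwardee...
        BrooksObject copy = new BrooksObject(42);
        obj.forwardee = copy;

        // ...so stale references transparently reach the new copy.
        System.out.println(BrooksObject.resolve(obj) == copy);
    }
}
```

This is where both costs in the comment come from: the extra word per object (memory), and the indirection on every access (the object-table flavor of overhead).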
Personally, I'd deal with the low-hanging fruit first. For example, there are plenty of cases where an optimizing compiler could statically (read: at compile time) compute the lifespan of an object and thus bypass the GC entirely. (Or, for a related example: recognize that a thread-safe object can only ever be accessed by one thread, and so can be optimized in ways that would otherwise break update order.) And yet most compilers don't seem to do this sort of optimization, or do so in an extremely limited fashion.
In other words: deal with the slowness of garbage collection after you've reduced the amount of garbage to be collected.
This does happen partially with escape analysis within a function: objects/primitives can effectively be allocated on the stack instead. I totally agree that this is an area that could be explored more in depth.
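A sketch of the kind of allocation HotSpot's escape analysis can already eliminate (class and method names are mine, for illustration): the temporary object never escapes the method, so the JIT can scalar-replace it and skip the heap allocation entirely.

```java
// Escape-analysis candidate: `Point` is created, read, and dropped
// entirely within distSq's frame, so the JIT can keep x and y in
// registers and allocate nothing on the heap. Illustrative example.
public class EscapeDemo {
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    static int distSq(int x, int y) {
        Point p = new Point(x, y);   // does not escape this method
        return p.x * p.x + p.y * p.y;
    }

    public static void main(String[] args) {
        System.out.println(distSq(3, 4));
    }
}
```

Running a hot loop over distSq with -XX:-DoEscapeAnalysis versus the default should show the difference in allocation rate, since escape analysis is on by default in modern HotSpot.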
Not just stack allocation. Pool allocation also, especially when one has many threads. And inserting destructor calls directly into code, etc, etc. There are a lot of optimizations here.
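A hand-rolled pool of the sort described above, as a sketch: reuse instances instead of producing fresh garbage per request. A real pool would need thread safety (or per-thread pools) to avoid the very contention the thread is complaining about; names are illustrative.

```java
import java.util.ArrayDeque;

// Minimal object pool: buffers are recycled rather than collected.
public class PoolDemo {
    static final class Buffer {
        final byte[] data = new byte[4096];
    }

    static final class BufferPool {
        private final ArrayDeque<Buffer> free = new ArrayDeque<>();

        Buffer acquire() {
            Buffer b = free.poll();
            return (b != null) ? b : new Buffer(); // allocate only on a miss
        }

        void release(Buffer b) {
            free.push(b); // returns to the pool instead of becoming garbage
        }
    }

    public static void main(String[] args) {
        BufferPool pool = new BufferPool();
        Buffer a = pool.acquire();
        pool.release(a);
        Buffer b = pool.acquire();
        System.out.println(a == b); // the instance was reused
    }
}
```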
Take Java. One of its founding ideas was that you didn't need to worry about the stack/heap distinction; the JVM decides where to put things. But in practice, all this ends up doing is making (almost) everything end up on the heap. And then people wonder why it's slower and more memory-hungry.
(Also: I wish compilers would be smarter about function boundaries. Have functions be source-level constructs, that get turned into a single global control flow graph, which the compiler then inspects to determine where function boundaries "make sense". With manual overrides of course.)
I've always wondered why there isn't an open source clone of Azul's Zing JVM. The patent issues can be sidestepped by developing in software-patent-free jurisdictions.
Actually, Azul attempted to get others interested in their technology with a "Managed Runtime Initiative" (see e.g. http://lwn.net/Articles/392307/) and released a lot of code, under the GPL as I remember.
Pauseless GC for immutable objects could be achieved easily without any patent encumbrance or OS level support. Given how widespread immutable state is, I do wonder why it's never been attempted.