Azul launches pauseless gc jvm for unmodified linux

6ren · on Nov 8, 2011

How much would Oracle pay to acquire this?

It's disconcerting that they need to mention "512GB", as if they found that "pauseless garbage collection" was not a big enough selling point (perhaps because for most server applications - where the big money Java apps are - individual responsiveness doesn't matter all that much).

It's also disconcerting because (it seems to me) big apps are increasingly a thing of the past, as we move to massively distributed smaller, slower processes - each individual JVM with its own smaller memory needs. Big apps seem to be an upmarket niche, disappearing fast, whereas the great strength of pauseless gc is in realtime client-side apps - like those on mobile devices.

I think these guys have done something great here, and it would be a shame to see them pushed upmarket into oblivion, when they could create an entirely new disruption based on Java - perhaps (finally) fulfilling its promise... (and for the java-haters, also for other JVM-based languages).

EDIT: Azul tech previously on HN (links include explanations IIRC): http://news.ycombinator.com/item?id=810506 --- http://news.ycombinator.com/item?id=2022723 --- http://news.ycombinator.com/item?id=2058476

ryanpers · on Nov 9, 2011

So in theory you are correct, but the reality on the ground is that 96gb in a box is getting to be pretty standard now.

Also the other thing to remember is that the standard Java GCs don't reliably work under all workloads above 2-4GB. You can make an 8GB or 16GB heap work, but if you hit a full compaction your JVM will go MIA for MINUTES at a time. So you can't even scale to a 16GB heap.

The 512GB is for the previous customers who do in fact use heaps of that size.

6ren · on Nov 9, 2011

Thanks, I guess needing over 2-4GB is not terribly high-end; and even if I'm right about disruption, demand will keep creeping up over time at both high and low end. Even phones have 1GB these days.

jules · on Nov 8, 2011

How did they do it? I thought that their GC critically relied on virtual memory support that current OSes do not provide, hence the need to run virtualized.

bascule · on Nov 8, 2011

Looks like it uses a kernel interaction of some sort to implement memory barriers in software (whereas Vega provided hardware memory barriers):

http://www.azulsystems.com/products/zing/c4-java-garbage-col...

aidenn0 · on Nov 8, 2011

I bet they just sacrifice granularity and use the MMU to do what they need. Unmapping an entire page and handling the segfault would work, but would not be as good as the VTD stuff they use in the hypervisor version.
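
To make the page-granularity trade-off concrete, here is a minimal toy model in Python, not real MMU code. It assumes a 4096-byte page, and the names ProtectedHeap, protect, and read are made up for illustration; on a real system the protect step would be something like mprotect() and the handler would be a SIGSEGV handler.

```python
PAGE_SIZE = 4096  # assumed page size for this model


class ProtectedHeap:
    """Toy model: protection is tracked per page, not per object."""

    def __init__(self, num_pages):
        self.memory = bytearray(num_pages * PAGE_SIZE)
        self.protected = set()  # page numbers that currently trap
        self.faults = 0         # number of simulated faults handled

    def protect(self, page):
        """Stand-in for unmapping or protecting one whole page."""
        self.protected.add(page)

    def _fault_handler(self, page):
        """Stand-in for the segfault handler: fix up, then allow access."""
        self.faults += 1
        self.protected.discard(page)

    def read(self, addr):
        """Read one byte, taking a simulated fault if the page is protected."""
        page = addr // PAGE_SIZE
        if page in self.protected:
            self._fault_handler(page)
        return self.memory[addr]
```

With page 1 protected, a read of any address on that page takes the simulated fault, even if only one object there was of interest. That is the granularity being sacrificed: the trap fires per page, not per object.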

rayiner · on Nov 8, 2011

That's roughly what they do in the hypervisor version too (protection is at page granularity). The big problem is that this causes major suckage on an unmodified kernel because mprotect() changes mappings synchronously and only works on a linear region at a time. So basically every page that gets scanned involves a separate global TLB shootdown.

The Azul collector doesn't need the changes in page protection to be visible immediately, so their kernel patch allows batching together the changes and committing in a single operation.

As far as I can tell from reading their whitepaper, however, they still require kernel patches.
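
A rough way to see the cost described in this comment, as a Python sketch under stated assumptions: each mprotect() call covers one linear region and is charged one global TLB shootdown, while a batched interface is charged one shootdown per commit. The function names are hypothetical, and the batched model is only a stand-in for what the comment describes, not Azul's actual kernel interface.

```python
def contiguous_runs(pages):
    """Group page numbers into (start, length) runs of adjacent pages."""
    runs = []
    for page in sorted(set(pages)):
        if runs and page == runs[-1][0] + runs[-1][1]:
            # Page directly follows the last run, so extend that run.
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)
        else:
            runs.append((page, 1))
    return runs


def shootdowns_unbatched(pages):
    """Assumed cost: one call per linear region, one shootdown per call."""
    return len(contiguous_runs(pages))


def shootdowns_batched(pages):
    """Assumed cost: queue every change, then pay once at commit."""
    return 1 if pages else 0
```

For scattered pages such as [1, 2, 3, 10, 11, 40], the unbatched model needs three calls (one per contiguous run) against a single batched commit, and the gap widens with every additional isolated page.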

Jach · on Nov 8, 2011

Maybe they contributed a kernel patch?

moonboots · on Nov 8, 2011

Azul's last upstream patch request wasn't received well by the kernel community.

http://lwn.net/Articles/392307/

zerosanity · on Nov 8, 2011

Too bad it's not open source.

moonboots · on Nov 8, 2011

Azul's software jvm is based on openjdk, and they have an old source dump at managedruntime.org. I wonder how timely source code releases need to be under the gpl...

Palomides · on Nov 8, 2011

when you get the binaries, you need to have access to the source at the same time, I understand

chc · on Nov 8, 2011

Under GPLv2, under which OpenJDK is distributed, that is technically optional. You basically have two options:

a) Ship the source code alongside the binary

b) Offer to furnish the source code to people who ask for it

AFAIK, the second option was intentionally introduced to provide for a delay in releasing source code, since providing the source code to uninterested parties might be cost-prohibitive if it's 1992 and you're publishing on floppies.

itsnotvalid · on Nov 8, 2011

If so, then we need to wait until somebody gets a copy of it and then distributes it, since GPLv2 should have clauses prohibiting any party from barring redistribution as long as the same license is used.

But just one thing... according to Wikipedia [1] it is actually based on HotSpot, but they also entered into an undisclosed agreement. In that case, Azul may have licensed HotSpot not through GPLv2 but through different terms with Oracle. If that is true, they may be able to keep part of their code out of GPLv2.

Also they had some open source stuff released under GPLv2 at another website [2].

[1]: http://en.wikipedia.org/w/index.php?title=Azul_Systems&o...

[2]: http://www.managedruntime.org

rbanffy · on Nov 9, 2011

Is HotSpot GPL?

rbanffy · on Nov 9, 2011

A "yes" would do...

ajross · on Nov 8, 2011

It was intentional, but not to allow for "delay". It just recognizes that many forms of "software distribution" are to parties who don't want the source code. The point is you have to give it when asked; there's no grace period in the license.

dandrews · on Nov 8, 2011

OpenJDK? "The Zing™ JVM is a full-featured JVM based on Sun HotSpot" according to http://www.azulsystems.com/products/zing/virtual-machine

Edit: I upvoted you moonboots, thanks for the pointer below.

moonboots · on Nov 8, 2011

"We are basing ourselves now on OpenJDK and to do that we have to release our stuff as open source as well because it’s substantially derived work off at the Open JDK. So the managed runtime is essentially a source drop as of a couple of months ago, and we are due to put on another source drop before long here. Essentially what it is I am building and running on my desk every day. Zing is the productized version of that that we intend to sell for money. There are a few features we don’t have to put out in the public domain, and those are going to be extras, are going to be probably the high res profiling tools."

- Cliff Click, http://www.infoq.com/interviews/click-gc-azul

mtarnovan · on Nov 8, 2011

"The Zing™ JVM 'Pauseless' Garbage Collector implements a highly concurrent algorithm that is able to concurrently compact the Java heap, and to allow the application to continue to execute while remapping of memory is performed. This patented solution allows applications to completely separate heap size from response times for predictable, consistent Java GC behavior." (from http://www.azulsystems.com/zing/pgc)

Great, so you can patent a GC algorithm now.

tedunangst · on Nov 8, 2011

What's the problem? They didn't patent gc in general. And pauseless gc seems like one of those things everybody wants, so if it really were trivial and obvious, it would have been invented by now. The fact it took this long implies that their invention is, in fact, novel.

hga · on Nov 15, 2011

Indeed, it's been a Holy Grail since I showed up on the scene in 1979. I think I stopped using Lisp Machines before generational garbage collection was added to the ones I was using, which made them "less pauseless", but Azul's entire approach is novel:

Instead of deferring the hard cases as long as you can (e.g. in generational GCs doing collections beyond the nursery) at which point they're likely to be painful, concentrate on the hard cases first and foremost. Get those right and everything else falls into place. Here's their CTO discussing that in a short interview: http://www.artima.com/lejava/articles/azul_pauseless_gc.html

A relevant quote:

[...] Our collector really does the only hard thing in garbage collection, but it does it all the time. It compacts the heap all the time and moves objects all the time, but it does it concurrently without stopping the application. That's the unique trick in it, I'd say, a trick that current commercial collectors in Java SE just don't do.

Pretty much every collector out there today will take the approach of trying to find all the efficient things to do without moving objects around, and delaying the moving of objects around—or at least the old objects around—as much as possible. If you eventually end up having to move the objects around because you've fragmented the heap and you have to compact memory, then you pause to do that. That's the big, bad pause everybody sees when you see a full GC pause even on a mostly concurrent collector. They're mostly concurrent because eventually they have to compact the heap. It's unavoidable.
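
As a generic illustration of moving objects while the application keeps using them, here is a single-threaded Python toy that relocates an object and fixes stale references lazily through a read barrier. This is a textbook-style forwarding sketch, not Azul's algorithm; ToyHeap and its methods are invented for the example, and a real collector does this concurrently with far more machinery.

```python
class ToyHeap:
    """Toy heap: objects live at integer addresses, references are addresses."""

    def __init__(self):
        self.slots = {}       # address -> payload
        self.forwarding = {}  # old address -> new address after a move

    def allocate(self, addr, payload):
        """Place a payload at an address and return the reference."""
        self.slots[addr] = payload
        return addr

    def relocate(self, old_addr, new_addr):
        """Move one object and leave a forwarding entry behind."""
        if old_addr == new_addr:
            return
        self.slots[new_addr] = self.slots.pop(old_addr)
        # The destination now holds a live object, so it must not forward.
        self.forwarding.pop(new_addr, None)
        self.forwarding[old_addr] = new_addr

    def load(self, ref):
        """Read barrier: follow forwarding and return (healed_ref, payload)."""
        while ref in self.forwarding:
            ref = self.forwarding[ref]
        return ref, self.slots[ref]
```

The caller keeps the healed reference returned by load, so a stale address costs a lookup only the first time it is used. In this toy the mover and the reader simply take turns; the hard part the quote refers to is doing the same thing while application threads are running.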

nickik · on Nov 8, 2011

Does that mean their "patches" went into the Linux kernel, or did they work around that?

If they did, does anybody know more about what's going on in terms of moving these features into the kernel? I couldn't really find anything on it.

sausagefeet · on Nov 8, 2011

Where does it say you don't need to modify linux?

moonboots · on Nov 8, 2011

"Azul Systems have today announced Zing 5.0, eliminating their previous requirement for a hypervisor, and therefore bringing their pauseless JVM to unmodified 64-bit Linux for the first time."

http://www.infoq.com/news/2011/11/zing5-native

rayiner · on Nov 8, 2011

It sounds like there is still a kernel module (the system tools RPM).

sausagefeet · on Nov 8, 2011

Thank you

bascule · on Nov 8, 2011

Cliff Click rules everything around me