JVM performance tuning (notes)

azuriel · on Jan 19, 2012

I'm the author of this post. Note that it's entirely adapted from the talk given by Attila Szegedi (http://www.infoq.com/presentations/JVM-Performance-Tuning-tw...), who deserves all the credit. I just filled in some of the gaps and reworked it so it flows in textual form.

I'll try to update the post with any comments I get here. It's definitely not ground truth as I was a bit fuzzy on some parts of the presentation.

Yrlec · on Jan 19, 2012

Great article. Note that the point about Strings using 2 bytes per char isn't always true. The latest HotSpot VM has an option called "-XX:+UseCompressedStrings" (see http://www.oracle.com/technetwork/java/javase/tech/vmoptions...) where the VM will try to use ASCII-strings whenever possible to save memory.
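
For a rough sense of the saving being described, here is a minimal sketch that compares the encoded size of the same text at two bytes per char (UTF-16) and one byte per char (ISO-8859-1). It only measures encoded payloads with the standard library; it does not inspect the JVM's internal String layout, which depends on the VM version and flags. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

class StringSizeDemo {
    // Bytes needed when each char is stored as a 2-byte UTF-16 code unit (no byte-order mark).
    static int utf16Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    // Bytes needed when every character fits in a single byte (ISO-8859-1).
    static int latin1Bytes(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    public static void main(String[] args) {
        String s = "performance"; // 11 ASCII characters
        System.out.println("UTF-16 bytes:  " + utf16Bytes(s));
        System.out.println("1-byte bytes:  " + latin1Bytes(s));
    }
}
```

For ASCII-only text the one-byte form is half the size, which is where the memory saving would come from.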

babebridou · on Jan 19, 2012

> Don’t write your own memory manager, and stop if you find yourself doing something ugly with byte buffers.

What's wrong with byte buffers? Serious question, as I use them all the time whenever I do OpenGL ES 2.0 SurfaceViews for Android apps.

udp · on Jan 19, 2012

Well, you don't really have an alternative for that (short of writing your rendering code with the NDK, as I ended up doing).

The overhead of all the crap I had to do just to get coordinates from Java to OpenGL was downright ridiculous.
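
For readers who haven't seen it, the usual shape of that Java-side work is roughly the following: copy the coordinates into a direct, native-ordered buffer that the GL bindings can read. This is a minimal sketch using only java.nio; the class and method names are hypothetical, and the actual GL call that would consume the buffer is left out.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

class VertexBufferDemo {
    // Copies the coordinates into a direct (off-heap) buffer in the platform's byte order.
    static FloatBuffer toDirectFloatBuffer(float[] coords) {
        ByteBuffer bytes = ByteBuffer.allocateDirect(coords.length * 4); // 4 bytes per float
        bytes.order(ByteOrder.nativeOrder());
        FloatBuffer floats = bytes.asFloatBuffer();
        floats.put(coords);
        floats.position(0); // rewind so a consumer reads from the first coordinate
        return floats;
    }

    public static void main(String[] args) {
        float[] triangle = {0.0f, 0.5f, -0.5f, -0.5f, 0.5f, -0.5f};
        FloatBuffer fb = toDirectFloatBuffer(triangle);
        System.out.println("direct=" + fb.isDirect() + " floats=" + fb.remaining());
    }
}
```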

squarecog · on Jan 20, 2012

Nothing is wrong with byte buffers. Use them where appropriate. The advice is to stop when you find yourself implementing a full-blown memory manager / quasi-malloc in user code on top of byte buffers...

bdunbar · on Jan 19, 2012

FTA: Twitter had a service that would have a terrible GC pause every three days. Solution: just bounce the machine after less than three days.

That is not what I would call a solution. It's a work-around.

And not a very good one.

squarecog · on Jan 20, 2012

The context isn't clear from these notes.

Full context as explained in the talk: the service used to have a stop-the-world GC pause of 2 minutes every hour. After implementing ByteBuffer-based slab allocation, the pause is only several seconds and happens once every three days. The service runs on 200 nodes, with redundancies. Restarting the process on one node at a time, in a slow roll that finishes in under 3 days, works around the unresponsiveness window (a planned shutdown is easier to manage than an unpredictable pause).
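
To make "bytebuffer-based slab allocation" a bit more concrete, here is a minimal sketch of the general idea: one large direct buffer is carved into fixed-size chunks that are handed out and recycled, so the data lives outside the garbage-collected heap. This is not the implementation from the talk, whose details aren't in these notes; the class name, method names, and fixed chunk size are assumptions for illustration, and it is not thread-safe.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

class SlabPool {
    private final ByteBuffer slab;      // one large off-heap region
    private final int chunkSize;        // every chunk has the same size
    private final ArrayDeque<Integer> freeOffsets = new ArrayDeque<>();

    SlabPool(int chunkSize, int chunkCount) {
        this.chunkSize = chunkSize;
        this.slab = ByteBuffer.allocateDirect(chunkSize * chunkCount);
        for (int i = 0; i < chunkCount; i++) {
            freeOffsets.push(i * chunkSize);
        }
    }

    // Returns the offset of a free chunk, or -1 if the slab is exhausted.
    int allocate() {
        Integer offset = freeOffsets.poll();
        return offset == null ? -1 : offset;
    }

    // Returns a chunk to the pool; the caller must not use its view afterwards.
    void free(int offset) {
        freeOffsets.push(offset);
    }

    // A view restricted to one chunk; it shares memory with the slab.
    ByteBuffer view(int offset) {
        ByteBuffer dup = slab.duplicate();
        dup.position(offset);
        dup.limit(offset + chunkSize);
        return dup.slice();
    }

    public static void main(String[] args) {
        SlabPool pool = new SlabPool(64, 4);
        int handle = pool.allocate();
        pool.view(handle).putLong(0, 42L);
        System.out.println(pool.view(handle).getLong(0));
        pool.free(handle);
    }
}
```

Even this toy version shows why the notes advise caution: the caller now owns lifetime management, and a use-after-free goes undetected.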

It's totally a workaround and not a solution. Attila follows up this example with an anecdote of talking to Oracle folks about when they are going to have true pauseless GC, and they responded with "not that big an issue, really, everyone finds a workaround..." So, this is an example of a workaround. A pretty good one once you realize it's not "the one" machine, it's "some one" machine out of 200.

bdunbar · on Jan 20, 2012

Ah - in context, for that problem, I agree it's 'okay'.

Thanks for clarifying.