I attended a talk at Strata a few years back by a Spark committer, who described how Spark pushes JVM memory allocation far beyond what the JVM was originally designed for. Search for "spark JVM OOM" and you'll find discussions of similar problems.