The problem is that parallelism, and performance at this level in general, is not the kind of problem that you can solve by adding more layers of abstraction. I can only speak to FOS, and not the other projects, but the idea behind it is to optimize the stack all the way down. My guess is that if you tried to run the Erlang VM on a 1000-core CPU, it would run into Linux scalability issues, regardless of how parallel Erlang is. Instead, we need to think about the base primitives of how we interact with the hardware, and potentially redesign them in a way that lets us fully utilize it.
Hmm... is it possible to extend the Erlang turtles further down, as it were, and rewrite in Erlang some of the stuff the VM is built on, similar to what Rubinius does for Ruby?