
Multiprocessing is a terrible solution to the GIL. You gain parallelism, but you are then required to serialize/deserialize and duplicate every shared object.

In Scala/Java, I might build a single immutable object (taking up e.g. 1kb) and transmit it to 10 actors. They use it as needed and let the GC deal with it when finished. In Python, I need to serialize it, transmit it 10 times and use 10kb of memory to store the copies.
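For concreteness, here's a minimal Python sketch (names and sizes are made up) of the copy-per-task behaviour described above: multiprocessing pickles the object for every task, so each worker ends up holding its own deserialized copy rather than a reference to the original.

    # Sketch: passing one ~1 KB object to 10 multiprocessing tasks pickles it
    # per task, so each worker process holds its own copy.
    from multiprocessing import Pool

    shared_config = {"payload": "x" * 1024}  # roughly 1 KB of data

    def use_config(config):
        # 'config' is a deserialized copy living in the worker process,
        # not a reference to the parent's object.
        return len(config["payload"])

    if __name__ == "__main__":
        with Pool(processes=10) as pool:
            # The dict is serialized and shipped with every task; with 10
            # tasks you end up with 10 independent copies in memory.
            results = pool.map(use_config, [shared_config] * 10)
        print(results)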

The GIL is a flaw in the language. We should accept that. There are workarounds and hacks, but the GIL is still a flaw.

(Incidentally, my background is in Python. My usage of Scala/Java is far more recent.)




Technically, the GIL is a "flaw" in the implementation. The language does not specify a GIL; it's just an implementation detail of CPython.

I don't necessarily agree that it's a flaw, but that's another discussion entirely.


And in CPython it's less that the GIL is the flaw and more the refcounted GC. Back in '96 there was a patch to remove the GIL; run-of-the-mill single-threaded Python code ran 2-6x slower (largely depending on the threading implementation used) because of all the locking overhead around the constant refcount updates. When you have a language that is already perceived as slow, making the vast majority of typical scripts of the time that much slower so that multithreaded Python could be faster was going to be a very difficult thing to sell.


In this day and age, the inability to compute two things at once is a pretty major flaw, if you ask me.


Using Python for high performance computation is also a flaw, if you ask me.


> Multiprocessing is a terrible solution to the GIL. You gain parallelism, but you are then required to serialize/deserialize and duplicate every shared object.

"You" are not required to do the serialization. Multiprocessing does it automatically behind the scenes. The only time it becomes relevant is when something can't be serialised.

It is correct, though, that serialization consumes CPU and time while it is happening - something that doesn't happen when all actors are local to the process. However, the moment you do serialization you can also do it across machines, or across nodes within a machine, which gives far greater scope for parallelism, assuming the ratio of processing work to the size of the serialized data is large.
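A small sketch of that "behind the scenes" behaviour (assumed example, not from the comment): the pickling is implicit, and you only notice it when an object can't be pickled.

    # Sketch: multiprocessing pickles arguments and results for you; you only
    # notice when an object isn't picklable (e.g. an open file handle or a lambda).
    import multiprocessing as mp

    def work(item):
        return item * item

    if __name__ == "__main__":
        with mp.Pool(4) as pool:
            print(pool.map(work, range(8)))  # pickling happens implicitly

        # Something unpicklable surfaces the serialization you never wrote:
        # pool.map(work, [open("somefile")])  # -> TypeError: cannot pickle ...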


> However, the moment you do serialization you can also do it across machines, or across nodes within a machine, which gives far greater scope for parallelism, assuming the ratio of processing work to the size of the serialized data is large.

You can also do that with Akka, for example.

It's true that you can't avoid serialization when you need to work across multiple boxes. That doesn't mean serialization and IPC should be forced upon you the minute you want to parallelize. There are a LOT of jobs that can be handled by 2-8 cores, provided your language/libraries give support for it.
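To make that contrast concrete, here is an illustrative sketch (names are made up, not from the thread) of the shared-memory model being asked for: Python threads can already share one object with zero copies and no IPC, but for pure-Python CPU-bound work CPython's GIL keeps them from actually running in parallel.

    # Sketch: threads share one object by reference (no serialization, no
    # copies), but under CPython's GIL these CPU-bound tasks largely take
    # turns instead of using 8 cores at once.
    from concurrent.futures import ThreadPoolExecutor

    shared = {"payload": "x" * 1024}   # one object, shared by reference

    def cpu_bound(_):
        total = 0
        for i in range(10**6):
            total += i
        return total + len(shared["payload"])  # reads the shared object directly

    with ThreadPoolExecutor(max_workers=8) as pool:
        results = list(pool.map(cpu_bound, range(8)))
    print(results)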



