> Not everyone shares your narrow requirements when using a general purpose ...

cbs · on Oct 4, 2011

>remove the overhead of the interpreter, before you worry about overhead of a GIL.

The overhead of the GIL killing parallelism is nontrivial, even next to the interpreter overhead.

>(since the above argument can be made that those workloads would be better in a different language).

That's a pretty self-defeating argument. Every single possible Ruby workload would perform better in another language.

>So, asking for GIL removal is asking all users of the language to pay a 'threading tax',

First, the existence of the multithreading support already has a threading tax, the GIL. Second, sprinkling in the use of some more mutexes wouldn't make a significant performance impact, especially not in an interpreted language that we already agree is not good for performance.

Of course, if the runtime knew it was only running one thread it could ignore much of the synchronization code altogether, but we don't want to pay the tax of a single branch decision, do we?

>IMHO, if the resource difference between processes and threads is significant [...]

What about all situations where you want to use threads, but not due to resource constraints? I would assume your answer would be "don't use threads" but you say you're not rabidly anti-thread.

jshen · on Oct 4, 2011

"Second, sprinkling in the use of some more mutexes wouldn't make a significant performance impact"

The people who tried this in Python would beg to differ.

http://www.artima.com/weblogs/viewpost.jsp?thread=214235

"This has been tried before, with disappointing results, which is why I'm reluctant to put much effort into it myself. In 1999 Greg Stein (with Mark Hammond?) produced a fork of Python (1.5 I believe) that removed the GIL, replacing it with fine-grained locks on all mutable data structures. He also submitted patches that removed many of the reliances on global mutable data structures, which I accepted. However, after benchmarking, it was shown that even on the platform with the fastest locking primitive (Windows at the time) it slowed down single-threaded execution nearly two-fold, meaning that on two CPUs, you could get just a little more work done without the GIL than on a single CPU with the GIL."
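To spell out the arithmetic in that quote, here is a rough sketch. The function name and the 1.8x figure are my own illustration (the quote only says "nearly two-fold"), and it assumes perfect scaling across CPUs, which real workloads rarely get:

```python
def relative_throughput(cpus: int, single_thread_slowdown: float) -> float:
    """Best-case throughput of a GIL-free build, relative to one CPU with the GIL (= 1.0).

    Assumes work divides perfectly across CPUs; single_thread_slowdown is how
    much slower each thread runs because of the fine-grained locking.
    """
    return cpus / single_thread_slowdown


def break_even_cpus(single_thread_slowdown: float) -> float:
    """Number of CPUs needed just to match the single-CPU GIL build."""
    return single_thread_slowdown


# 1.8 is an illustrative stand-in for "nearly two-fold", not a measured number.
for slowdown in (1.8, 2.0):
    print(f"slowdown {slowdown}x on 2 CPUs -> "
          f"{relative_throughput(2, slowdown):.2f}x the GIL build")
```

So with a slowdown close to 2x, two CPUs only get you back to roughly where you started, which matches the "just a little more work done" remark.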

scott_s · on Oct 4, 2011

And my response from the last time that was posted (http://news.ycombinator.com/item?id=2916886):

We have plenty of examples of lock-based code scaling well. Even large, non-trivial systems, such as the Linux kernel. It used to rely solely on the BKL - big kernel lock - for synchronization, but they have moved to finer grained, data structure level locking. That makes me think that an implementation of Python that uses fine-grained (or at least finer grained) locks that scales is possible. It's just a question of how feasible it is to transform CPython into such an implementation. Beazley's experiment indicates that the changes may have to be fundamental, such as moving away from reference counting.
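For anyone unfamiliar with the distinction, a minimal sketch of "one big lock" versus "a lock per data structure" might look like the following. The class and function names are made up for illustration, and on CPython the GIL still serializes the bytecode, so this shows the structure of the two designs rather than any speedup:

```python
import threading


class CoarseCounters:
    """Every slot is guarded by one shared lock (the BKL/GIL-style design)."""

    def __init__(self, slots: int):
        self._lock = threading.Lock()
        self._counts = [0] * slots

    def incr(self, i: int) -> None:
        with self._lock:  # all threads contend here, whatever slot they touch
            self._counts[i] += 1

    def total(self) -> int:
        with self._lock:
            return sum(self._counts)


class FineCounters:
    """Each slot has its own lock, so threads on different slots never contend."""

    def __init__(self, slots: int):
        self._locks = [threading.Lock() for _ in range(slots)]
        self._counts = [0] * slots

    def incr(self, i: int) -> None:
        with self._locks[i]:  # only threads touching slot i contend
            self._counts[i] += 1

    def total(self) -> int:
        # Not an atomic snapshot across slots; fine once the writers have finished.
        result = 0
        for lock, idx in zip(self._locks, range(len(self._counts))):
            with lock:
                result += self._counts[idx]
        return result


def hammer(counters, slots: int, threads: int = 4, per_thread: int = 1000) -> int:
    """Run several threads incrementing slots, then return the grand total."""
    def work(tid: int) -> None:
        for k in range(per_thread):
            counters.incr((tid + k) % slots)

    workers = [threading.Thread(target=work, args=(t,)) for t in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return counters.total()


print(hammer(CoarseCounters(4), 4), hammer(FineCounters(4), 4))
```

Both versions are correct; the fine-grained one pays for extra lock objects (one per slot) in exchange for less contention, which is the trade-off being argued about here.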

jshen · on Oct 4, 2011

I don't think we disagree about the fundamentals, but there are trade-offs there too. C extensions become harder to write for example. This is fine for something like the JVM where people stay in JVMland for the vast majority of things.

I don't see why people are so insistent that other languages need to go down this route. Use one that does if you really want that stuff. Hell, use Ruby; then you can switch between C Ruby and JRuby depending on your needs.

jbert · on Oct 4, 2011

> What about all situations where you want to use threads, but not due to resource constraints? I would assume your answer would be "don't use threads" but you say you're not rabidly anti-thread.

But you can use threads now, with the GIL? The discussion about removing the GIL is a performance issue?

So - my answer would be "sure, if you want to use threads for convenience, go ahead - but note that in this case the GIL isn't a problem".

> Of course, if the runtime knew it was only running one thread it could ignore much of the synchronization code altogether, but we don't want to pay the tax of a single branch decision, do we?

That's an interesting one. I'd love to see the numbers. Well, if you remove the GIL it would be a branch per elided lock. You may well also find yourself increasing the size of all potentially-shared data structures (to hold the lock). It would be interesting to compare that to the case of uncontended mutexes.
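A crude way to poke at that comparison is sketched below. Everything here is hypothetical: `MaybeLocked` and its `multi_threaded` flag are invented names standing in for whatever the runtime would track, and flipping that flag safely when a second thread starts is a synchronization problem of its own that this glosses over. In Python the interpreter overhead swamps the lock cost, so the real measurement would have to be done in C inside the runtime:

```python
import threading
import timeit


class MaybeLocked:
    """A counter that can either always lock or branch around the lock."""

    def __init__(self):
        self._lock = threading.Lock()  # the extra per-object storage mentioned above
        self.multi_threaded = False    # hypothetical flag the runtime would maintain
        self.value = 0

    def incr_always_lock(self) -> None:
        # Uncontended mutex: acquire and release on every operation.
        with self._lock:
            self.value += 1

    def incr_elide(self) -> None:
        # One branch per elided lock: skip locking while single-threaded.
        if self.multi_threaded:
            with self._lock:
                self.value += 1
        else:
            self.value += 1


def measure(n: int = 100_000) -> dict:
    """Time n single-threaded increments with each strategy (seconds)."""
    obj = MaybeLocked()
    return {
        "always_lock": timeit.timeit(obj.incr_always_lock, number=n),
        "branch_elide": timeit.timeit(obj.incr_elide, number=n),
    }


for name, seconds in measure(10_000).items():
    print(f"{name}: {seconds:.4f}s")
```

I'm deliberately not quoting numbers from this, since they vary by machine and say more about the interpreter than about mutexes.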