Do we really need threads? From my limited Ruby experience, it'll happily fork new interpreters, and it has connectivity with pretty much all major message queue implementations as well as various serialising and networking libraries. In short, talking to other processes is easy, even if they are a bit slower than threads (but if speed is such an issue, it's unlikely Ruby would be your implementation language).
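To make "talking to other processes is easy" concrete, here is a minimal sketch using only core Ruby (fork, IO.pipe, Marshal). The helper name square_in_child is made up for illustration, and it assumes a platform where fork is available (so not Windows):

```ruby
# Sketch: hand a computation to a forked child and read the result
# back over a pipe. `square_in_child` is a hypothetical helper.
def square_in_child(n)
  reader, writer = IO.pipe
  reader.binmode
  writer.binmode

  pid = fork do
    reader.close                        # child only writes
    writer.write(Marshal.dump(n * n))   # serialise the result
    writer.close
  end

  writer.close                          # parent only reads
  result = Marshal.load(reader.read)    # read until the child closes its end
  reader.close
  Process.wait(pid)                     # reap the child
  result
end

puts square_in_child(12)
```

The same shape works with a socket or a broker queue in place of the pipe; only the transport changes.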
Threads only ever scale so far; when you need more processor cycles you'll have to go off-host eventually. By adopting a multi-process model with data shared over the network (with or without a broker queue in between) you can greatly improve the app's ability to scale and its robustness.
For the non-compute-intensive reasons to parallelise (e.g. chatty networking code), non-blocking code often performs better than threads anyway.
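As a rough illustration of the non-blocking style, here is a sketch where a single thread services several connections with IO.select and read_nonblock. Pipes stand in for sockets so it runs on its own, and drain_all is a hypothetical helper name:

```ruby
# Sketch: one thread multiplexes several readers instead of using
# one thread per connection. Pipes stand in for network sockets.
def drain_all(readers)
  received = Hash.new { |hash, io| hash[io] = +"" }
  pending = readers.dup

  until pending.empty?
    ready, = IO.select(pending)         # block until something is readable
    ready.each do |io|
      begin
        received[io] << io.read_nonblock(4096)
      rescue IO::WaitReadable
        next                            # spurious wakeup, try again later
      rescue EOFError
        pending.delete(io)              # peer closed, stop watching it
        io.close
      end
    end
  end

  readers.map { |io| received[io] }
end

pairs = Array.new(3) { IO.pipe }
pairs.each_with_index do |(_reader, writer), i|
  writer.write("message #{i}")
  writer.close
end
p drain_all(pairs.map(&:first))
```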
If threads aren't great (they aren't in Python), forget about them and move on. There are other tools in the toolbox, with the bonus that the other tools are actually better (in most if not all cases on Unix-like platforms).
Although you are right about threads only ever scaling so far, you need to remember that network I/O has a rather large overhead.
If you always assume your code is going to be run over a network, you might miss an opportunity to efficiently solve problems that could be handled on a single machine with a bunch of cores.
I think frameworks like Celluloid allow you to deal with this elegantly, but they need help from the language to realize this potential, which is why bascule requests these features.
An example: a computer game might be built concurrently by having the rendering system, the two physics engines, any AIs and the main game loop execute on separate threads. Obviously there is a bunch of information to be shared between these systems with as little delay as possible.
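A toy sketch of that kind of in-process sharing, with two threads handing values to each other through a Queue and no serialisation step. The "physics" and "render" roles and the run_frames helper are made-up stand-ins, not a real engine:

```ruby
# Sketch: two subsystems on separate threads sharing data in memory.
# `run_frames` and the physics/render roles are hypothetical.
def run_frames(frame_count)
  updates = Queue.new                   # thread-safe hand-off
  rendered = []

  physics = Thread.new do
    position = 0.0
    frame_count.times do
      position += 1.5                   # pretend integration step
      updates << position
    end
    updates << :done                    # tell the consumer to stop
  end

  render = Thread.new do
    while (position = updates.pop) != :done
      rendered << "draw at #{position}"
    end
  end

  [physics, render].each(&:join)
  rendered
end

puts run_frames(3)
```

Whether these threads actually run in parallel depends on the Ruby implementation, which is the point of the feature request being discussed.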
Simply put, if you map out storage levels like this:
L1 -> L2 -> (L3) -> Memory -> Disk/Network
These are orders of magnitude different in performance. Network can be faster than disk, but not generally by an order of magnitude.
So, everything you know about memory vs. disk for performance ought to translate fairly well to memory vs. network.
It's a good observation that extremely performance-bound jobs might want to look to other languages, but avoiding a level of that data storage hierarchy is no meager 2-3x speedup.
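If you want to see the gap on your own machine, a rough sketch like the one below compares reading a value that is already in memory with round-tripping the same value through Marshal and a pipe. The helper names are made up, and a local pipe involves no network at all, so treat the difference it shows as a lower bound on what a real network hop would cost:

```ruby
# Sketch: in-memory access vs. serialise + pipe round trip.
# All helper names here are hypothetical.
def read_in_memory(state)
  state
end

def read_via_pipe(state, reader, writer)
  payload = Marshal.dump(state)
  # Length-prefixed frame so the reader knows how much to read.
  writer.write([payload.bytesize].pack("N") + payload)
  length = reader.read(4).unpack1("N")
  Marshal.load(reader.read(length))
end

def seconds_for(iterations)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { yield }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

state = { player: [1.0, 2.0, 3.0], score: 42 }
reader, writer = IO.pipe
reader.binmode
writer.binmode
n = 20_000

puts format("in-memory: %.4fs", seconds_for(n) { read_in_memory(state) })
puts format("pipe:      %.4fs", seconds_for(n) { read_via_pipe(state, reader, writer) })
```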
You are right that threads are not the only way to do it, but they do have advantages for some Ruby applications. The process model is expensive in terms of memory usage. Take a look at the Sidekiq testimonials for specific examples of this: https://github.com/mperham/sidekiq/wiki/Testimonials
If you can replace 10 servers with 1 server, that is not just a cost saving in terms of hosting; it also makes your deployments so much simpler that you may find yourself making changes more frequently, as just one example.
The process model also really falls down when those threads interact. That isn't the most common model we see in, for example, Rails applications. More often we are thinking of either request/response or batch processing, and in either case it's essentially a single thread that only coordinates with others through a database and maybe memcached or redis. When you have large real-time processes, the communication overhead can be very detrimental to the process model.
You are right, though, that eventually you have to take that hit in some form or fashion to scale to another box, but I don't agree that you should take it up front for every process you ever develop. That said, I think Erlang is closer to this model (though it isn't always serializing on a network) and that has proven to be pretty successful and efficient when viewed at a macro scale.