I didn't mean that threaded servers don't exist; that's obvious. I meant, as my comment stated, that there aren't threaded servers popping up and demonstrating this fantastic performance/scalability. Hell, Apache moved away from threads.
Twitter and WhatsApp could be good examples, but no one outside those companies can really say how well each server scales. How many servers does Twitter run? Who knows? So who knows if they are each 'scaling to 100,000 threads and achieving excellent performance', as per the paper's claims?
As for which model is harder or not, I guess that is an arguable matter of opinion. When the paper claims that one model is a 'more natural' programming style, it's clearly moving beyond objective reasoning.
Erlang routinely runs 100,000 processes in the same memory space and scales very well. It can handle this at all because it is based not on shared memory but on message passing.
There are two things at play here: a) Is the solution easy to design/code? b) Is the solution fast enough?
There is far too much focus on (b) and relatively little attention is given to (a). The claim is that writing code in a direct style in a highly concurrent environment (Go, Erlang, Haskell, SML, ...) is easier and achieves roughly the same performance (within a factor of, say, 2).
My experience is that this is true. I have seen fast evented and threaded systems. But I do feel the threaded systems were easier to write.
Another side effect is that focusing on (a) frees up someone else (a runtime implementer) to focus on (b) independently. This is just another corollary of "build it, build it right, build it fast".
In particular, consider the new MIO IO manager included in GHC 7.8. It increases the performance of IO-bound lightweight threading by orders of magnitude, and all of that performance comes completely for free to consumers of these threading abstractions, who have spent their time making their algorithms correct, not fast.