Hacker News
Concurrency ≠ Threads (martincmartin.com)
9 points by spydez on July 8, 2008 | 13 comments



"But there are other forms of concurrency. Processes that communicate using files, pipes, sockets, or even shared memory."

I don't see how this solves the problem. It's true that some problems work well as pipelines where each stage does some finite, balanced amount of work before passing the data off to the next stage. But many don't.

And I don't follow the files and shared memory argument at all: those are both shared, unsynchronized-in-the-base-API abstractions just like threading. And they're subject to all the same deadlocks and race conditions. The only difference is that you use slightly different synchronization primitives (e.g. fcntl() locks instead of mutex locks).

It's correct to point out that traditional threaded code is difficult to get right. But it's naive and unhelpful to point to any kind of "secret sauce" as the solution. The difficulty of concurrency lies in the problem itself, not the API.


"The difficulty of concurrency lies in the problem itself, not the API."

Isn't it possible that it's both? Concurrency is hard, but I think the classical threading model makes it harder. There are simplifications that really do make it easier to reason about concurrency, like doing all inter-thread communication via queues, OpenMP-style "parallelize this for loop" constructs, and pure-functional immutable data structures.
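A minimal sketch of the queue discipline (Python here, just as an illustration): the worker thread owns its state outright, and other threads never touch it directly, they only put messages on the queue.

```python
import queue
import threading

def worker(q, results):
    total = 0                 # state touched by exactly one thread
    while True:
        item = q.get()
        if item is None:      # sentinel: shut down
            break
        total += item
    results.append(total)

q = queue.Queue()
results = []
t = threading.Thread(target=worker, args=(q, results))
t.start()
for i in range(5):
    q.put(i)                  # communication only via the queue
q.put(None)
t.join()
# results[0] == 10
```

No locks in user code; the only synchronized object is the queue itself.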


Sure. And a reasonable discussion about that stuff and the tradeoffs involved (queue handlers need to be carefully balanced, for instance, and parallel loop constructs tend to drown CPUs in locking overhead) would be welcome. But the linked post was about "threads are bad, use secret sauce instead". There is no sauce.


I think he is complaining more that so many other people think that threads are the secret sauce.


Yup, it's the problem space that is hard, not the implementation.

A good way of seeing this is to look at languages that have "solved" concurrency issues, such as Erlang. I mean, sure, you won't have concurrency problems in Erlang, but that comes at the price of only having immutable variables...

I think the problem with concurrency is that certain things we do in sequential programming are provably impossible to do reliably in concurrent programming. So we end up with two bad options. We can use traditional languages, such as C, Java, or Python, which are very error-prone because only a subset of their features can be used correctly under concurrency, and it's never clear what that subset is. Or we can use a language such as Erlang, which explicitly enforces concurrency-correct code, but then we find programming in it very restrictive.

Meh. I'm not enough of a language wonk to figure this one out!


I was just about to post a comment quoting the same line: "But there are other forms of concurrency. Processes that communicate using files, pipes, sockets, or even shared memory."

You are spot on about the problem itself being hard. As I understand it, the distinction between a process and a thread is that threads share memory. If you don't want to use the shared address space, simply don't use it - thread-local variables are fine. The real issue is that there are multiple threads of execution happening concurrently, and therefore communicating data between them is difficult. It's hard to believe that taking away the shared memory will do anything but make things harder. Like ajross says, parallel algorithms are just hard.
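To illustrate the "just don't use the shared address space" point (Python as a neutral example): threading.local gives every thread its own copy of a variable, even though the address space is shared.

```python
import threading

local = threading.local()   # per-thread storage in a shared address space

def work(name, out):
    local.value = name      # each thread writes "the same" variable...
    out[name] = local.value  # ...but only ever sees its own copy

out = {}
threads = [threading.Thread(target=work, args=(n, out)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()
# out == {"a": "a", "b": "b"} - no interference between threads
```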


With files and sockets it's easy to attempt reconnection if something blows up. Is it as easy with pipes and shared memory?

This article contains some interesting viewpoints about shared memory: http://nerdwisdom.com/2007/08/23/programming-erlang/


With files, you can go quite a long way with a good naming scheme. The Maildir protocol/format is probably the best-known and most widely used example, but it isn't difficult to imagine something more elaborate built on the same underlying ideas.
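A toy version of the trick (Python; heavily simplified relative to the real maildir spec): write the message under tmp/ with a unique name, then rename it into new/. rename() within one filesystem is atomic on POSIX, so readers never see a half-written message.

```python
import os
import socket
import time

_seq = [0]  # per-process delivery counter

def deliver(maildir, body):
    _seq[0] += 1
    # Unique-enough name: timestamp, sequence number, hostname.
    name = "%d.%d.%s" % (int(time.time()), _seq[0], socket.gethostname())
    tmp_path = os.path.join(maildir, "tmp", name)
    with open(tmp_path, "w") as f:
        f.write(body)
    # Atomic hand-off: the message appears in new/ all at once or not at all.
    os.rename(tmp_path, os.path.join(maildir, "new", name))
    return name
```

The naming scheme avoids collisions between writers; the tmp/-then-rename dance is what keeps readers from seeing partial deliveries.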


Mail in general (including maildir) is actually a classic example of how hard synchronization is even in the file regime. Take a look at the source of something like procmail sometime. It's a huge rat's nest of locking abstractions.

It's not something you can solve with file naming. That's just a technique for avoiding collisions.


Indeed. I've found it worthwhile to ask interview candidates how they would implement the equivalent of the maildir protocol - http://www.qmail.org/man/man5/maildir.html


For application-level concurrency I really like the way Lua explicitly excludes the most error-prone and system-dependent concurrency primitives (preemptive threads, mutexes, semaphores, shared memory, etc.) and instead offers APIs for efficient non-preemptive coroutines and states which share no memory.
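The key property is that control only changes hands at explicit yield points, so nothing can be preempted mid-step and no locks are needed. A sketch of that scheduling model (using Python generators to stand in for Lua coroutines, since the idea is the same):

```python
def task(name, steps, log):
    """A cooperative task: does one step, then explicitly yields control."""
    for i in range(steps):
        log.append((name, i))   # no lock needed: nothing runs concurrently
        yield                   # the only place a switch can happen

def run(tasks):
    """A trivial round-robin scheduler for non-preemptive coroutines."""
    tasks = list(tasks)
    while tasks:
        t = tasks.pop(0)
        try:
            next(t)             # resume the task until its next yield
            tasks.append(t)     # still alive: back of the queue
        except StopIteration:
            pass                # task finished

log = []
run([task("a", 2, log), task("b", 2, log)])
# log == [("a", 0), ("b", 0), ("a", 1), ("b", 1)]
```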


This paper makes a pretty compelling argument against threads as a model for concurrency:

http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1....? [pdf]


"But there are other forms of concurrency. Processes that communicate using files, pipes, sockets, or even shared memory."

Using multiple single-threaded processes means you're using multiple threads.

What he's saying is true, but he's worded it really clumsily. He's pointing out alternatives to multithreaded processes, not threads per se.



