Hacker News

I'm not convinced that parallelism is something that needs support at the language level. The only languages that do this are Go and Erlang (off the top of my head).

Others like Clojure and Scala have more support at the standard library level.




Erlang gets a lot of good mileage out of it, though. You start to really miss the OTP system when you try to write actually-important code in any other language. And there are some aspects that are hard to get right at a later time when it's not in the language from day one, cross-process messaging in particular, which is a primitive used to build a lot of other features. I do not know where Clojure or Scala have a generic cross-process messaging ability, and I'm not saying it's impossible by any means, but it's much easier if you start from scratch with that in mind.


"I do not know where Clojure or Scala have a generic cross-process messaging ability, and I'm not saying it's impossible by any means, but it's much easier if you start from scratch with that in mind."

I don't understand how. The only difference I see is whether the interprocess communication parts are coded into the language itself or whether they're coded into a library. It could theoretically make the problem more difficult to put it in the language itself. Instead of only affecting the programs that use that library, now your concurrency features could potentially affect any program written in your language.

Could you give me an example of a concurrency feature that really, truly benefits from being in the language and not the standard library? I simply can't think of any examples.


For the real answer to this question, attempt to implement a generic serialization syntax for Haskell that requires no work by the user to use. Not even declaring typeclasses. Not even requiring Typeable to be implemented. Just feed it a datatype, and even if the other end has a different version of the software installed (which can happen in a distributed system, after all), it'll all Just Work to some degree.

Erlang has that, because it has its datatypes and a defined serialization for them, and there is nothing in the language that is not those datatypes. It also doesn't permit you to layer any type-level assertions about those types into a user type. In fact, Erlang basically has no concept of user types. (Records are syntactic sugar around the built-in tuple.) Since the Erlang data types are so weak, an automated serialization can be built that requires no work to use. But that doesn't come without cost.

If you don't start with that, you have to use some sort of introspection to examine data types, and you probably don't have a good story for what to do if two ends have totally different ideas about those datatypes, or how to upgrade running processes where you literally want to change the datatype without shutting the process down. The stronger your type system, the harder that gets. The easier it is, the weaker your type system must be. Erlang's capabilities don't come for free, they exist because they wrote their (non-)type system so that their data structures never have any guarantees in them that don't trivially travel across the network. This has its own issues; Erlang has just about the weakest types you can have without actually having arbitrary casts, and that has consequences too.

There are a handful of characteristics in a language, like its type system, that have radical impacts throughout the language and no amount of library jiggery-pokery can completely paper over. Another example not entirely relevant to cross-process concurrency (but not entirely irrelevant) is immutability; no amount of library work can turn a mutable language into an immutable language.


This comment resonates with me especially well today.

I've been attempting to work with GWT, and the serialization issues are entertaining (did you implement IsSerializable? Is there a no-arg constructor? Do all of the classes your class depends on have the same? Did you remember to recompile the GWT/JS code to update the serialization whitelist? etc.).

It makes me appreciate all the more how simple it was to shoot data around when I played with Erlang a while back.


> Could you give me an example of a concurrency feature that really, truly benefits from being in the language and not the standard library?

Isn't that a bit of a loaded question? In some academic sense, an ideal programming system might have a kernel language that is probably rather small and certainly clean, flexible and extensible in its feature set, and then almost everything else built on top of that kernel in libraries. In practical programming systems, the perfect kernel has proven rather elusive, though.

It's rather like the question of whether a standard library should be small, clean and extensible or comprehensive with a good-enough version of everything. Many of us might be inclined toward the former approach from an academic/theoretical/intuitive point of view, but in practice, just about every widely used programming language from the past two decades has come with a batteries-included library. While those libraries are often criticised for various technical weaknesses, and many of those weaknesses are never addressed because it's too much work, these systems are still good enough for most users out of the box.


See the paper by Hans Boehm from PLDI 2005, "Threads Cannot Be Implemented as a Library": http://www.hpl.hp.com/techreports/2004/HPL-2004-209.pdf

A bit of further discussion: http://news.ycombinator.com/item?id=939364


Fortress makes a convincing argument that it must be supported at the language level. It has a smart work splitting algorithm that does for managing parallelism granularity what garbage collection did for managing memory allocation. It works much better when it's baked into the core and is available from the ground up.


> a smart work splitting algorithm

Not to detract from anything you said (which I agree with), but when Guy Steele gave a guest lecture at my university on Fortress a few months ago, he said that the work splitting algorithm still needed work, in particular the part that decided the right amount of granularity for the given task (i.e. when to stop splitting the task into smaller subtasks).


Shouldn't it also be set up so that different algorithms can be swapped in? I think it's pretty well accepted at this point that different application scenarios perform better with differently tuned schedulers.


?

Clojure has deep language support for concurrency - from syntactic constructs to its core, high-performance immutable data structures.


But the beauty is that because it's a Lisp, it's all implemented in terms of macros - Clojure's complete set of concurrency tools could be written as a library.

The only exception is the deref reader macro "@", since Clojure doesn't allow user-defined reader macros by default.


> it's all implemented in terms of macros

This is not true at present. The core data structures and the STM machinery are implemented in Java, not as macros.


He probably meant "distributed". Erlang has a better "distributed" story than Clojure.


Occam.

And a whole pile of others.

Did you use a parallel programming language?


True, the nice thing about libraries is that they can compete and make your ecosystem stronger. But the nice thing about building it into the language is that it becomes a common layer of compatibility.

We've gotten comfortable enough with our concurrency model, that we felt it deserved to be in the language.

As an aside, some features that benefit parallelism can be tricky to add retroactively, although possible (like an N:M threading model).


I believe race conditions can still occur in pure message-passing languages like Erlang (Joe Armstrong says so in Programming Erlang, p. 173). This doesn't get much press, perhaps because it's much easier to get right than shared-memory concurrency, and because not that many people are actually doing it yet.

Does Spin address this issue?


You can build message passing code that suffers from race conditions. In practice, though, it's almost always obvious if you're about to do something dangerous.

On p.173 Joe writes:

  When we say things such as this:
    Pid = spawn(...),
    on_exit(Pid, fun(X) -> ..),
  there is a possibility the process dies in the gap between these two statements.
Regarding process death, we allow multiple inter-process relationships to be specified at spawn time. Hence any number of processes may be configured to observe a newly-spawned process before it has even started. Additionally, all children are linked to (die with) their parent, which both eliminates the risk of orphaned processes and has other benefits.

So in Spin you wouldn't be tempted to write something like above (probably not Erlang either?).


It might not need language level support, but anything to make it easier to leverage is a plus in my book. If the programmer doesn't have to concern themselves with extra parallelism-specific library calls and is still able to write a binary that can take advantage of multiple cores, that seems like a good idea to me.



