Hacker News new | past | comments | ask | show | jobs | submit login

Compilers and CPUs only reorder over tiny instruction windows. He's talking about re-orderings over enormous windows, in a way that requires whole program analysis.

But that doesn't really happen in reality. FP languages promised auto-parallelisation for decades and never delivered. Plus you can get it in imperative languages too - like with Java's parallel streams. But I never see a parallel stream in real use.




It's not completely automatic but it is fairly close. If you enable the threaded runtime then "sparks" will be evaluated in parallel. You do have to say which expressions you want evaluated as separate "sparks" with the `par` operator but that's it—the runtime manages the threads, and common sub-expressions shared by multiple sparks will typically be evaluated only once. There are no race conditions or other typically concurrency issues to worry about since the language guarantees the absence of side effects. (That is the biggest difference between this and Java's parallel streams: If the reduction operation isn't a pure function then the result is undefined, and there isn't anything at the language level in Java to enforce this requirement.)

EDIT: An example of effective parallelism in Haskell:

    import Control.Parallel (par)

    fib n
       | n < 2   = 1
       | n >= 15 = b `par` a `seq` a + b
       | True    = a + b
       where a = fib (n-2); b = fib (n-1)

    main = print $ map fib [0..39]
Note that the implementation of `fib` has been deliberately pessimized to simulate an expensive computation. The only difference from the non-parallel version is the use of `par` and `seq` to hint that the two operands should be evaluated in parallel when n >= 15. These hints cannot change the result, only the evaluation strategy. Compile and link with "-threaded -with-rtsopts=-N" and this will automatically take advantage of multiple cores. (1C=9.9s elapsed; 2C=5.4s; 3C=4s; 4C=3.5s)


Yeah, I know how it works, and the level of automation is the same in all modern languages - as you note, Java's equivalent of "par" is writing ".parallelStream()" instead of ".stream()" so no real difference, beyond the language enforced immutability.

But it doesn't actually matter. How often is parallelStream used in reality? Basically never. I would find the arguments of FP developers convincing if I was constantly encountering stories of people who really wanted to use parallelStream but kept encountering bugs where they made a thinko and accidentally mutated shared state until they gave up in frustration and just went back to the old ways. I'd find it convincing if I'd had that experience also. In practice, avoiding shared state over the kind of problems automated parallelism is used for is very easy and comes naturally. I've used parallel streams only rarely, and actually never in a real shipping program I think, but when I did I was fully aware of what mutable state might be shared and it wasn't an issue.

The real problem with this kind of parallelism is that it's too coarse grained and even writing par or parallelStream is too much mental overhead, because you often can't easily predict when it'll be a win vs a loss. For instance you might write a program expecting the list of inputs to usually be around 100 items: probably not worth parallelising, so you ignore it or try it and discover the program got slower. Then one day a user runs it on 100 million items. The parallelism could have helped there, but there's no mechanism to decide whether to use it or not automatically, so in practice it wasn't used.

Automatic vectorisation attacked this problem from a different angle and did make some good progress over time. But that just creates a different problem - you really need the performance but apparently small changes can perturb the optimisations for unclear reasons, so there's an invisible performance cliff. The Java guys pushed auto-vectorisation for years but have now given up on it (sorta) and are exposing explicit SIMD APIs.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: