FWIW, the linked book, "Concepts, Techniques, and Models of Computer Programming"(CTM), is imo a fantastic book, and there are free classes on edX ("Paradigms of Computer Programming", I think it's now split in two now) that go over the material.
Highly recommended if you have an interest in this stuff.
As far as I remember one of the main ideas was "dataflow variables", which are lazily evaluated in parallel (?)
I've read about this paradigm in the literature, e.g. with the Lucid language and others. And the book goes into some depth about it.
But somehow this paradigm hasn't caught on, despite the constant production of new experimental languages. And I've never seen a real program in it -- e.g. one over 5K lines that people use.
Anyone have a conjecture on what happened? Is it flawed, or did nobody pay enough attention? I think there is some overlap with async/await which seems to be the dominant concurrency paradigm now (C#, JS, Python, Rust, etc.)
I think one issue is that real systems are composed of multiple languages, and it's hard to bridge programs with wildly different evaluation semantics with C or JavaScript code.
And I guess the paradigm is not "reactive". It wants to control the main loop, but in GUIs and networking code where async/await is appropriate, the app doesn't "own" the thread of control. A new paradigm for batch programs is perhaps of limited use.
I guess I sort of answered my own question -- the model doesn't solve enough problems to justify its cost. (Also, another issue is that it's a model of very fine-grained parallelism, where as coarse-grained parallelism with limited communication is faster. That's how people optimize in practice.)
> I think there is some overlap with async/await which seems to be the dominant concurrency paradigm now
Dataflow variables are exactly promises where every use that calls for the result is awaited; in a language where it's the core paradigm, you just don't write async and await everywhere (and often the runtime will be free to abandon calculations that won't be used), so, no, it doesn't just have “some overlap” with async/await; async/await is syntax to for using the paradigm in a language that is otherwise eager and synchronous.
So, it's also wrong to say that it hasn't caught on, the paradigm is pervasive and frequently used in industrial programming, which usually uses multiparadigm languages, not languages purely devoted to a single paradigm.
> (Also, another issue is that it's a model of very fine-grained parallelism, where as coarse-grained parallelism with limited communication is faster. That's how people optimize in practice.)
Dataflow variable don't require any parallelism, though they can leverage it.
Citation needed for both your first and second paragraphs :)
As mentioned in my sibling comment [1] there are lots of definitions of dataflow. It's plausible that the model in CTM can be expressed entirely with async/await, but I'd like to see it.
As for the second claim, if it's caught on, then it should be easy to point to code that uses it. Or name some industrial systems that use it. There are some programming models that live only in specific industries but they also tend to leak out into open source over time.
And what programming language do those systems use? Are they using Mozart/Oz or something else? I wasn't aware of any production usage of that language but I could be wrong.
For data flow in industrial programming look at IEC 1131 which is a major standard. It's a different world and, no, it doesn't seem to leak out to the mainstream despite there being some very interesting solutions to i/o heavy concurrent problems.
Haskell has dataflow support too (just to mention a better known language), but not sure how popular this compared to all the other concurrency libraries Haskell provides.
I worked with LabView enough to have an opinion, I think.
The issue is that it isn't actually any easier or faster. The machine still needs to do the same synchronizations and cache invalidations. The engineer becomes very limited in what they can effectively do (recursive structures? They become like {} in Go, performance nightmares).
Labview is very effective in its niche, factory monitoring and automation, but I found beyond that it was a pain and a half.
> Labview is very effective in its niche, factory monitoring and automation, but I found beyond that it was a pain and a half.
I’ve never used Labview but it’s worth pointing out that it’s only one possible implementation, just like we have many different takes on other paradigms. I’ve heard people say that labviews specific design is somewhat flawed, but I can’t comment on that since I’ve not used it myself.
I wouldn't include Excel -- it's similar to but not the same as the model presented in the CTM book.
You could call "excel" dataflow but the problem with that term is that are a dozen distinct models that can be called "dataflow". Similar to why "Google Cloud Dataflow" is a bad product name. That's also dataflow but the term is so general that it's not useful.
For example, in Excel, derived cells are recomputed when their sources change. This is true of some dataflow proramming languages but not others, including the one in CTM. This is expensive and has a lot of bearing on the implementation.
In fact I learned a few years ago Excel is occasionally wrong in both directions:
- it fails to recompute cells that are dirty in the name of speed
- it recomputes cells that are clean even though it's not strictly necessary
That's probably a fine tradeoff for Excel but you wouldn't want a programming language with those semantics.
What I found quite interesting, when I read CTM a few years back, was that they only introduced mutable state about half way through. That you could talk about all of these different aspects of programming before introducing mutable state was pretty enlightening.
It does but not to the exclusion of other computation models. I'd agree with GP - I found it a wonderful book. It builds different models of computation successively by incrementally adding features. It's a bit like SICP[0] in that it starts with declarative/functional fundamentals and brings in mutable state really late. The coverage of dataflow was the first one that gave me an "aha!" moment.
It's easily 10 years since I bought it and I still dip into it every few months or so. Every time I do, I get something new: insights, deeper understanding, etc.
Yes, I think concurrency is about 1/3 of the book, but it's approached the same way as the other paradigms; you have a basic language, add a single feature and see what happens: what new things can you do? How does the expressivity change? What problems arise? Is this equivalent to another paradigm?
I found the insights interesting, but I would agree they are not of immediate utility to many people's day job.
In this poster's classification, "basic" Forth (the data manipulation part without parsing words) to me seems to be "procedure" + "cell (state)", i.e., "Imperative programming". This classification is along general semantic lines, not "what syntax do you use to access values". In concatenative languages you use a stack implicitly while in C you use variable names and nested expressions, but this more of a "syntactic" issue than what the poster is concerned with, namely things like "does the language have native closures" which Forth, like C, does not.
Parsing words add a huge amount of metaprogramming power, but this classification doesn't deal with any kind of metaprogramming. That's not a dimension that is included here, and even if it were, it would be debatable if it's that different in Forth from Lisp macros, once you look past syntactic concerns as above. But I could very well be wrong here, I'm not an expert on this part.
By the taxonomy of the diagram, pure APL-style array programming is just first-order functional programming. Real array programming languages also introduce features from other branches of the tree (especially imperative programming and closures), of course. I don't think this diagram is supposed to be a clear-cut categorisation of languages, but more of a rough map of idealised paradigms. By the standards of this diagram, most languages are multi-paradigm.
It's also missing multiple dispatch (multimethods) from Common Lisp, Dylan and Julia (which did not exist in 2009), which is different enough from OOP to deserve a place (and hopefully enough recognition to appear in more new languages).
It's one of those things you can't really appreciate until you've tried it. Imperative programming languages are to assembly languages, as array programming is to imperative programming: it's a higher level of abstraction. The difference between the array and imperative paradigms may not be quite as pronounced as the difference between C and assembly, but it's still significant nonetheless.
There's no need for control structures like loops or if statements in array programming.
Highly recommended if you have an interest in this stuff.