Flow-based Programming (jpaulmorrison.com)
93 points by jarsin on Jan 10, 2015 | 32 comments



There's a good discussion in the Google group of the ideas that differentiate what we are now calling "classical" FBP (or CFBP) from the FBP-like systems that keep appearing. See in particular the topic "Underlying models": https://groups.google.com/forum/#!topic/flow-based-programmi... .

If you are interested in the larger-volume, data-processing, "classical" FBP implementations, a good place to start is my book, "Flow-Based Programming", 2nd ed., and the web site http://www.jpaulmorrison.com/fbp/ .


Hey Paul,

Glad to hear FBP is going strong. You may not remember me, but I had a chance to work with you many years ago on getting the Java-based FBP framework to input/output HTML (i.e. like a servlet node). I always loved the paradigm, the parallelism and the genericness of the whole concept. I wrote many, many networks and eventually wrote an FBP parser that would create programmatic FBP networks from XML files. I was wondering if this sort of thing (interpreter/parser/language) was ever incorporated into FBP itself? If not, are there any plans? Nowadays JSON would be the way to go. Thanks again - hope all is well. Chris S


On the topic of dataflow programming I found the book "Dataflow and Reactive Programming Systems: A Practical Guide to Developing Dataflow and Reactive Programming Systems" by Matt Carkci [0] to give a pretty good overview of the different types and implementations of dataflow programming systems. I think that flow-based programming sounds incredibly cool, but I haven't found any project where I could use it well... yet!

[0] https://leanpub.com/dataflowbook


I think the biggest problems with flow based or reactive programming are the occurrence of glitches, and, more importantly, the problem of how to compute updates without performing unnecessary work.

I wonder if these problems have been successfully addressed somewhere.


My work [1] addresses glitches head on (hence the system is called "Glitch") by tolerating glitches and at least making the extra work functionally benign (via logging and rollback, like a transaction). Glitch is not based on data flow, however; it sits somewhere in between, even if it is still very reactive.

I've built a few data flow systems in the past (my dissertation work [2] was based on one), but I decided early on to tolerate rather than avoid glitches because I found that approach to be more resilient. Flapjax [3] avoids glitching via a topological sort, which I see as an optimization (to reduce unnecessary work) rather than as a correctness issue.

[1] http://research.microsoft.com/apps/pubs/default.aspx?id=2265...

[2] http://research.microsoft.com/pubs/179366/mcdirmid06superglu...

[3] http://cs.brown.edu/~sk/Publications/Papers/Published/mgbcgb...
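
To make the topological-sort idea concrete, here is a minimal Python sketch. It is not Flapjax's actual implementation; the `Graph` class and its method names are hypothetical, and for simplicity it re-evaluates every derived node rather than only the ones affected by a change. The point it illustrates is that evaluating nodes in dependency order means no node ever sees a mix of old and new inputs, and each node runs once per update.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class Graph:
    """Hypothetical toy dataflow graph, for illustration only."""

    def __init__(self):
        self.values = {}    # node name -> current value
        self.formulas = {}  # derived node name -> (function, input names)

    def source(self, name, value):
        """Set (or update) an input node."""
        self.values[name] = value

    def derived(self, name, fn, inputs):
        """Declare a node computed from other nodes."""
        self.formulas[name] = (fn, list(inputs))

    def recompute(self):
        """Evaluate derived nodes in topological order.

        Returns the list of nodes evaluated, in order, so the caller can
        see that each node ran exactly once.
        """
        deps = {name: ins for name, (_, ins) in self.formulas.items()}
        evaluated = []
        for name in TopologicalSorter(deps).static_order():
            if name not in self.formulas:
                continue  # source node, nothing to compute
            fn, inputs = self.formulas[name]
            self.values[name] = fn(*(self.values[i] for i in inputs))
            evaluated.append(name)
        return evaluated


g = Graph()
g.source("x", 1)
g.derived("y", lambda x: x + 1, ["x"])
g.derived("z", lambda x, y: x + y, ["x", "y"])
g.recompute()

# Change x. A naive push could evaluate z with the new x but the old y
# (a glitch) and then evaluate it again; here y always runs before z.
g.source("x", 2)
order = g.recompute()
```

Under a tolerate-and-repair design like the one described above, the intermediate wrong value of `z` would be allowed to occur and then be corrected; the sorted version simply never produces it.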


Interesting. Do you know of any work that computes a data flow graph as the program runs (i.e. data flow is implicit in the source code) and then executes it reactively?


Ya, that is exactly how Glitch works, except I just call the data flow graph a dependency graph :).

Computing the dependency graph dynamically obviously leads to a lot more flexibility. I'm not really sure what you mean by "reactively" here: if you mean does it reactively update computations as state (and even code, if you want to get Bret Victor about it) changes, then ya.

Data flow by itself isn't necessarily reactive; as originally defined, it pushed things through very lazily. Continuous interpretations, where changes are propagated non-lazily, didn't come until much later.
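
As a rough illustration of computing the dependency graph dynamically, here is a small Python sketch. It is a generic dependency-tracking technique, not how Glitch is actually implemented; `Cell` and `Computation` are hypothetical names. Each computation records which cells it reads while it runs, so the graph is discovered at run time and can change from one run to the next.

```python
class Cell:
    """A mutable value that remembers which computations read it."""

    def __init__(self, value):
        self._value = value
        self.readers = set()

    def get(self):
        # Record a dependency edge if a computation is currently running.
        current = Computation.current
        if current is not None:
            self.readers.add(current)
            current.reads.add(self)
        return self._value

    def set(self, value):
        if value == self._value:
            return
        self._value = value
        # Copy the set: re-running a computation rewrites its edges.
        for computation in list(self.readers):
            computation.run()


class Computation:
    """A function that is re-run whenever a cell it read changes."""

    current = None  # the computation being evaluated right now, if any

    def __init__(self, fn):
        self.fn = fn
        self.reads = set()
        self.result = None
        self.runs = 0
        self.run()

    def run(self):
        # Drop the old edges; this run rediscovers the real ones.
        for cell in self.reads:
            cell.readers.discard(self)
        self.reads = set()
        previous, Computation.current = Computation.current, self
        try:
            self.result = self.fn()
        finally:
            Computation.current = previous
        self.runs += 1


use_a = Cell(True)
a = Cell(1)
b = Cell(10)
# Which of a or b is a dependency is only known once the code runs.
c = Computation(lambda: a.get() if use_a.get() else b.get())
```

With this setup, changing `b` does not re-run `c` while `use_a` is true, because `b` was never read. Note that this sketch makes no attempt to order updates, so it can glitch; it would need either a topological ordering or a tolerate-and-repair strategy on top.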


There is a similar framework in Python, which I am one of the authors of [1] (see [2] for more in-depth docs/examples). It's used in OpenStack (which some people may have heard of), where appropriate and desirable, to help with consistent, scalable execution, using a flow-like methodology to declaratively describe workflows (and execute them using various strategies/engines).

[1] https://github.com/openstack/taskflow

[2] http://docs.openstack.org/developer/taskflow/


Just gonna say that TaskFlow is wonderful software that is usable outside of the giant OpenStack ecosystem. I use it for some internal projects and am quite pleased with it.


The one I did know in Python was PaPy:

https://code.google.com/p/papy/

Will have to have a closer look at this one as well, though.


Author of PaPy here.

Thanks for giving PaPy a try. Although PaPy [https://github.com/mcieslik-mctp/papy] has never gained traction :(, I have been using it daily for over 5 years (no bugs in the scheduler so far). By now our PaPy based genomics pipelines have probably processed petabytes of data.


Cool, looks similarish; feel free to jump on freenode (usually during weekdays) and ping me, channel #openstack-state-management or #openstack-oslo

Pretty diagram also @ https://wiki.openstack.org/wiki/TaskFlow#Big_picture ;)


For anyone wanting to read up on or discuss FBP in depth, don't miss the mailing list. It's a gold mine of information:

https://groups.google.com/forum/#!forum/flow-based-programmi...


What specifically differentiates this from functional reactive programming?


Matt Carkci's book discusses the closely related topics of data flow, flow-based programming and reactive programming very clearly and in quite some detail:

http://dataflowbook.com/cms

Can highly recommend.


Matt Carkci here (the book author)...

Thank you for the kind words... everyone.

When I started writing the book "Dataflow and Reactive Programming Systems" I thought I completely understood the various forms of dataflow. Yet in doing research for the book I found there are many more subtle variations than I ever knew existed.

The book went through many iterations before I felt I accurately described the similarities and differences between the many forms of dataflow. I'm happy to hear that I may have succeeded.


Otherwise, a search for "reactive programming" in the FBP mailing list will give quite some info too:

https://groups.google.com/forum/#!searchin/flow-based-progra...


Wow, we just recommended the same book within seconds of each other. But seriously, it's a good book; if you have an interest in the topic, it's worth reading.


I believe the author started http://www.reddit.com/r/dataflowprogramming/ . It's a bit slow, but there's a lot of good stuff in there already.


Indeed :)


Not really related. FRP has as much in common with dataflow programming as generic functional programming does. Related incidentally in that both refuse side effects, but one is based on function application while the other is based on data flow connections.

I'm not really familiar with how flow-based programming differs from data flow programming, however.


On a related note, there's a new Apache Incubator project, built on these FBP principles: http://nifi.incubator.apache.org


This can be done in Clojure with core.async [1] and transducers [2]

[1] http://clojure.com/blog/2013/06/28/clojure-core-async-channe...

[2] https://www.youtube.com/watch?v=6mTbuzafcII


We implemented a flow-based programming language in Mozart Oz at github.com/fractalide/fractalide if you're interested.


I can testify to the usefulness of Mozart Oz. It is a multi-paradigm programming system. I didn't know it had dataflow as one of its paradigms. Is this something written in Mozart Oz rather than an enhancement to it?


Ask HN: Is Excel a dataflow programming tool?


Sure, but it is not "flow-based programming" (which is what this article is about) because there are no named output ports.
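
To show what "named output ports" means in practice, here is a minimal Python sketch of an FBP-style network. It is not taken from any of the FBP implementations; the component functions, the port names (`IN`, `OUT`, `EVEN`, `ODD`) and the `END` marker are all hypothetical. Components run as independent threads and only talk to each other through bounded connections attached to named ports.

```python
import queue
import threading

END = object()  # hypothetical end-of-stream marker


def generate(numbers, outports):
    """Component: sends each number to its 'OUT' port, then closes it."""
    for n in numbers:
        outports["OUT"].put(n)
    outports["OUT"].put(END)


def split_by_parity(inports, outports):
    """Component: routes each packet to the 'EVEN' or 'ODD' output port."""
    while True:
        packet = inports["IN"].get()
        if packet is END:
            break
        port = "EVEN" if packet % 2 == 0 else "ODD"
        outports[port].put(packet)
    for connection in outports.values():
        connection.put(END)


def collect(inports, sink):
    """Component: appends everything arriving on 'IN' to a list."""
    while True:
        packet = inports["IN"].get()
        if packet is END:
            break
        sink.append(packet)


def run_network(numbers):
    # Bounded queues stand in for fixed-capacity connections.
    to_split = queue.Queue(maxsize=2)
    evens_conn = queue.Queue(maxsize=2)
    odds_conn = queue.Queue(maxsize=2)
    evens, odds = [], []
    threads = [
        threading.Thread(target=generate, args=(numbers, {"OUT": to_split})),
        threading.Thread(
            target=split_by_parity,
            args=({"IN": to_split}, {"EVEN": evens_conn, "ODD": odds_conn}),
        ),
        threading.Thread(target=collect, args=({"IN": evens_conn}, evens)),
        threading.Thread(target=collect, args=({"IN": odds_conn}, odds)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return evens, odds


evens, odds = run_network(range(1, 7))
```

The wiring lives entirely in `run_network`: the splitter does not know what is attached to `EVEN` or `ODD`, which is the property a spreadsheet cell, with its single implicit output, does not have.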


Yes, cell-oriented.


Is this similar to Flux of ReactJS?


ITT: lots of people hawking other (semi-)related content.


    Bandwidth Limit Exceeded

    The server is temporarily unable to service
    your request due to the site owner reaching
    his/her bandwidth limit.

    Please try again later.




