The tradeoff, then, is that if someone's current problem maps exactly to the cur...

tim_h · on March 7, 2012

Pangool actually seems like a generalization of Hadoop. This doesn't necessarily make it more complex. If a problem maps exactly to the Hadoop API, then it should also map exactly to the Pangool API by setting m=2 (in the extended map reduce model described at http://www.datasalt.com/2012/02/tuple-mapreduce-beyond-the-c...).

scott_s · on March 7, 2012

I agree with your first sentence, but disagree with the second. That you can find an exact mapping does not prevent the underlying API from being more complex than what you need. That you had to realize "Oh, m=2" is more complexity.

I'm not arguing this is a terrible thing. In fact, I think this is an acceptable level of additional complexity for the power it buys you. But if we're going to make an honest evaluation of the trade-offs, I think we must mention this.

It may be relevant to the discussion to point out that I work on a tuple-based streaming system. Product: http://www-01.ibm.com/software/data/infosphere/streams/ Academic: http://dl.acm.org/citation.cfm?id=1890754.1890761, http://dl.acm.org/citation.cfm?id=1645953.1646061