Apache Pig is a very different beast from this project, from what I can tell reading the documentation for Pangool. While they both operate on tuples and work at a higher level than pure Hadoop, they accomplish their goals quite differently. Pig uses its own language called Pig Latin (http://pig.apache.org/docs/r0.9.2/basic.html), which is then compiled down into MapReduce jobs that run on Hadoop. Pangool is much closer to Hadoop, in that you are writing Java. Looking at one of their examples (http://pangool.net/introduction.html), I get the sense that the Pangool developers aim to make Hadoop easier to use, while Pig aims to make data analysis easier.

These goals diverge considerably. In Pig, Java code is written to create new functions that can be used for analysis--i.e., Java is written in support of Pig Latin. Pangool focuses instead on extending Hadoop by making the Java code easier to write. This means Pig could potentially be implemented on top of Pangool, if Pangool were to satisfy the requirements for the task. (Not that I am suggesting that Pig actually be rewritten that way--it might just be possible, depending on the technical requirements.)

Having used Hadoop in the past, I would be more inclined to use Pangool. Parts of Hadoop are poorly written--especially the reliance on singletons--and anything that makes it easier to write code that runs on a Hadoop cluster is a desirable goal in my eyes. I look forward to seeing how this project shapes up.