
Ok, so Cascading has a slow implementation of secondary sort, but is there any reason you believe that couldn't be improved? I don't think you're really comparing architectures there, just how well optimized particular implementations are.

I'm asking because in my experience the extra level of abstraction provided by Cascading, Crunch, etc. is a huge advantage, and if you're making a conscious choice to operate at a lower level, you'd better be getting something significant in return; it's not clear to me yet what that is.




Pangool is not an alternative to Cascading. For example, at this point Pangool does not help you manage workflows. If you are starting a MapReduce application, your best option is probably to start with a higher-level abstraction: Cascading, Hive, Pig, etc.

But if you are thinking about learning Hadoop through the standard Hadoop API, or if for some particular reason you need to use that API in your project, we recommend that you use Pangool instead.

Or, if you are considering implementing another abstraction on top of Hadoop, building it on Pangool would probably also be a good idea.

In fact, what we believe is that the default Hadoop API should look like Pangool.



