I expect that multicore MapReduce will become popular (e.g. http://mapreduce.stanford.edu/ or google for more literature).
I suppose it is strictly less powerful than regular MapReduce, but at least with Hadoop the system administration costs are too high for a lot of people, and machines are getting beefier, so you can get a lot more done on a single machine. In another recent thread there was an MS Research paper about "ill-conceived" Hadoop clusters processing 14GB of data...
The main benefits I see are:
1) You don't have to write in a specialized language. A good implementation should let you use whatever language you like, and scientific code often has Matlab, R, C++, and Python glued together.
2) MapReduce lets you write plain sequential code, which is far easier to learn and debug than explicitly threaded code.
3) You can adapt/port sequential legacy code easily, so you can use a lot of your existing code.
MapReduce is of course similar to "parallel for" but more powerful -- parallel for is essentially just the map stage, and the reduce stage adds a lot. For some reason, most people who haven't programmed with MapReduce think of it as just mapping and don't appreciate the reduce, which is where the per-worker results get combined into a final answer.
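To make the map/reduce split concrete, here is a minimal multicore word-count sketch using only Python's standard library (the file names are hypothetical). pool.map is the parallel map; the fold at the end is the reduce that "parallel for" alone doesn't give you:

    # Minimal multicore MapReduce sketch: word count over a set of files.
    from collections import Counter
    from functools import reduce
    from multiprocessing import Pool

    def map_one(path):
        # Map stage: runs in parallel, one file per worker process.
        with open(path) as f:
            return Counter(f.read().split())

    def merge(a, b):
        # Reduce stage: fold two partial counts into one.
        a.update(b)
        return a

    if __name__ == "__main__":
        files = ["part0.txt", "part1.txt", "part2.txt"]  # hypothetical inputs
        with Pool() as pool:                      # one worker per core by default
            partials = pool.map(map_one, files)   # the "parallel for" part
        totals = reduce(merge, partials, Counter())  # the reduce most people forget
        print(totals.most_common(10))

Note that map_one and merge are ordinary sequential functions -- all the parallelism lives in pool.map, which is exactly the appeal of benefit 2 above.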
If you want to do it quick and dirty, don't underestimate "xargs -P" :) That's a "parallel for" that works with any language -- you can run it over your Matlab, Python, C++, etc. You need a serialization library to pass intermediate results between processes, but there are a lot of those around. It works well and with a minimum of programming effort.
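For instance, assuming a hypothetical process_one.py that handles a single input file, something like this fans the work out across 8 processes:

    # -P 8: run up to 8 processes at once; -n 1: one file name per invocation
    ls data/*.csv | xargs -P 8 -n 1 python process_one.py

Each invocation is an isolated process, so there's no shared state to worry about; you just collect the per-file outputs afterwards.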