Yep. He's got half of the right idea - move the code to the data, process it where it is - but it's tied to the most awful way of storing the data. The right way is what today's stream-processing frameworks tend to do: the heavyweight global coordination piece is used only for the part that actually needs to be coordinated (temporal ordering, and only the partial form of it that's required), while the bulk data mostly lives in LevelDB or a similar store embedded in the application, which can process it directly where it is.
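
To make that split concrete, here's a rough sketch in Python. Everything in it is made up for illustration: `PartitionedLog` is a toy in-memory stand-in for the coordinated piece (it only promises order within a partition, i.e. a partial order), and each `Worker` keeps its state in a plain dict where a real application would embed something like LevelDB. It isn't how any particular framework implements this, just the shape of the idea.

```python
from zlib import crc32


class PartitionedLog:
    """Toy stand-in for the coordinated piece (hypothetical, in-memory).

    It guarantees ordering only within a partition, never across
    partitions, which is the "partial form" of temporal ordering.
    """

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # Deterministic hash so a given key always lands in the same partition.
        return crc32(key.encode("utf-8")) % len(self.partitions)

    def append(self, key, value):
        self.partitions[self.partition_for(key)].append((key, value))


class Worker:
    """Owns one partition and keeps the bulk data in a local store."""

    def __init__(self, log, partition):
        self.log = log
        self.partition = partition
        self.offset = 0   # position of the next unread record
        self.store = {}   # assumption: a dict standing in for an embedded LevelDB

    def poll(self):
        # Read new records in log order and update state right where it lives;
        # no remote database round trip per record.
        records = self.log.partitions[self.partition]
        while self.offset < len(records):
            key, amount = records[self.offset]
            self.store[key] = self.store.get(key, 0) + amount
            self.offset += 1


def run_example():
    log = PartitionedLog(num_partitions=2)
    workers = [Worker(log, p) for p in range(2)]

    for key, amount in [("alice", 5), ("bob", 3), ("alice", -2), ("carol", 7)]:
        log.append(key, amount)

    for worker in workers:
        worker.poll()

    # Each key is owned by exactly one worker, so merging cannot conflict.
    merged = {}
    for worker in workers:
        merged.update(worker.store)
    return merged


if __name__ == "__main__":
    print(run_example())
```

The only shared, coordinated thing here is the log, and all it decides is the order of records per partition. Balances never leave the worker that owns the key, so the processing happens where the data is.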