This was a straw man article in 2014, it was a straw man article the other times it’s been posted to HN in the intervening years, and it’s still a straw man article in 2020. As noted in another comment here, the contemporary technology of Apache Flink really isn’t far off command-line tools running on a single machine. Meanwhile, HDFS has made a lot of progress on its overhead, particularly unnecessary buffer copies. There are datasets where a Hadoop approach makes sense. But not for ones where the data fits in RAM on a single system. No one has ever argued that.