It feels like this should have a citation to *Scalability! At What COST?* https:...

It feels like this should have a citation to Scalability! At What COST?

https://www.usenix.org/system/files/conference/hotos15/hotos...

which evaluates distributed graph processing frameworks (circa 2015) on various problems and shows that they can be slower than a single threaded program on a laptop.

They have a data set with 105 M nodes and 3.7B edges that fits in 14 GB, which all fits in memory (at least 128-256 GB of memory should be taken for granted in these types of problems). Maybe with a clever encoding you can store the entire social network of the world, or at least a FB-sized one. (Storing nodes as integers is efficient and useful for many graph algorithms. You can later "join" with the ancillary data using a fast non-graph distributed processing framework).

This article mentions the challenges but doesn't mention the solution of simply using a single machine for big graphs:

Even seemingly minute HPAD variations, for example the graph's degree distribution, can have significant performance implications.

Graph processing systems rely on complex runtimes that combine software and hardware platforms. It can be a daunting task to capture system-under-test performance—including parallelism, distribution, streaming vs. batch operation

Anyone who has done programming with say MapReduce knows that it can be very elegant and fun, but also limiting. This is even more true of graph processing frameworks, as the article explains.

If you simply put everything in memory and write native code (which isn't hard at all for batch jobs), then you have so much more flexibility with ANY algorithm, and especially graph algorithms. This type of programming is also fun! You don't have to fit your problem into the framework and switch frameworks when it doesn't fit (or more likely just accept 10-1000x performance decrease when it doesn't fit).

---

Abstract:

We offer a new metric for big data platforms, COST, or the Configuration that Outperforms a Single Thread. The COST of a given platform for a given problem is the hardware configuration required before the platform outperforms a competent single-threaded implementation. COST weighs a system’s scalability against the overheads introduced by the system, and indicates the actual performance gains of the system, without rewarding systems that bring substantial but parallelizable overheads.