I wonder, why aren't graph databases used more often? Why is neo4j relatively al...

felixgallo · on June 9, 2017

Using the word 'literally' doesn't magically imbue speed into a system. Traversing a graph -- how does that work in a transaction? Is it going to be quicker than striding a packed in-memory hash?

EGreg · on June 9, 2017

Simple. You store the exact pointer to related data, so you go and get it in O(1). In a join, you have to do an O(log N) search through an index. And all indexes usually have to be loaded into memory, to boot.
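
To make the comparison concrete, here is a rough single-process sketch of the two access patterns. Everything in it is hypothetical and for illustration only: the `Node` class, `follow_pointers`, and `follow_keys` are made-up names, and a sorted Python list searched with `bisect` stands in for a real B-tree index. It is not how neo4j or any particular database is implemented.

```python
import bisect


class Node:
    """Hypothetical record that stores its relations in two ways."""

    def __init__(self, key):
        self.key = key
        self.neighbors = []      # direct object references ("pointers")
        self.neighbor_keys = []  # foreign keys, resolved through an index


def follow_pointers(node):
    # Each related record is reached by dereferencing a stored reference:
    # O(1) per neighbor, no index involved.
    return list(node.neighbors)


def follow_keys(node, sorted_keys, sorted_nodes):
    # Each related record is found by binary search over a sorted index:
    # O(log N) per neighbor, standing in for a join through a B-tree.
    found = []
    for key in node.neighbor_keys:
        pos = bisect.bisect_left(sorted_keys, key)
        if pos < len(sorted_keys) and sorted_keys[pos] == key:
            found.append(sorted_nodes[pos])
    return found


# Tiny data set: four records, already sorted by key.
nodes = [Node(k) for k in (10, 20, 30, 40)]
sorted_keys = [n.key for n in nodes]

first = nodes[0]
first.neighbors = [nodes[2], nodes[3]]  # pointer-style relations
first.neighbor_keys = [30, 40]          # key-style relations
```

Both functions return the same records; only the cost per hop differs. Note this only works because every reference is a valid in-memory address in one process.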

elvinyung · on June 10, 2017

> Simple. You store the exact pointer

How would that work in a scale-out, distributed cluster? What is a pointer? How do I figure out which machine an object is really located on? What happens if that machine is down? What if I want to move the object or rebalance the cluster? How do I keep multiple copies of an object (e.g. for fault tolerance)? How do I figure out which copy is the right one?

How do I organize the pointers? Would I use a hash table? A tree? A graph? How would that data structure be distributed? Would every machine store a copy of the lookup data structure, or just some specific machines? What if those machines fail? How do I maintain copies? How do I keep the lookup data structure up to date?