That makes a lot of sense. Regardless of where your data is, you want to best map your accesses to the underlying reality of whatever hardware is in use. And memory has significant locality on any even moderately big system.
I do think this glosses over the difference in data structures that someone focusing on efficiently storing and reading the data to and from disk might use versus the data structures that someone focused mainly on efficiently representing the data in memory might use, ignoring durability. Queries are one part of this, but dealing with updates and indexing can also be quite important.
I don't know if that difference is fundamental to the design of a database or just more of a lingering consequence of many databases being designed around most data being on disk, but it is what I observe looking at a variety of current database solutions.
Getting back to the main article here, I've been doing some basic testing against MemSQL today and, "world's fastest" aside, I like a lot of what I see, other than painfully long query parse/compile times. It does, however, appear to be true for my queries that most of the performance benefits are due to the distributed query engine and not due to any fundamental data structure differences compared to something like postgres or mysql/innodb. But my queries are very anti index.
SpaceCurve also sounds interesting, hopefully we can firm up our use cases enough and get far enough along in a technical evaluation that I can find time to play with it.
I do think this glosses over the difference in data structures that someone focusing on efficiently storing and reading the data to and from disk might use versus the data structures that someone focused mainly on efficiently representing the data in memory might use, ignoring durability. Queries are one part of this, but dealing with updates and indexing can also be quite important.
I don't know if that difference is fundamental to the design of a database or just more of a lingering consequence of many databases being designed around most data being on disk, but it is what I observe looking at a variety of current database solutions.
Getting back to the main article here, I've been doing some basic testing against MemSQL today and, "world's fastest" aside, I like a lot of what I see, other than painfully long query parse/compile times. It does, however, appear to be true for my queries that most of the performance benefits are due to the distributed query engine and not due to any fundamental data structure differences compared to something like postgres or mysql/innodb. But my queries are very anti index.
SpaceCurve also sounds interesting, hopefully we can firm up our use cases enough and get far enough along in a technical evaluation that I can find time to play with it.