The code is available on GitHub. If the benchmark is flawed, help us fix it.
Here are some ways it could be flawed:
* MongoDB and Cassandra were set to their default consistency levels. They are not configured to wait for a majority of replicas to respond. HyperDex waits until data is committed at all replicas. This clearly biases the benchmark against HyperDex.
* MongoDB and Cassandra operate on the key for the search benchmark, while HyperDex operates (solely) on secondary attributes. Another bias against HyperDex.
Most of our gain in the benchmarks comes from high GET throughput and high SEARCH throughput, although PUTs are competitive as well.
We are faster than MongoDB because they do idiotic stuff they don't have to. When I was investigating why HyperDex is faster, I looked solely at the client libraries (since MongoDB's default config just writes to socket buffers, or buffers the data in userspace). HyperDex has one function to create the request packet, one function that enqueues it with a constant number of operations, and one function to flush the queue. Once a request is created, it is not copied until the kernel moves it to a socket buffer. MongoDB, on the other hand, bounces through half a dozen different layers, some of which perform memmove to compact the data and keep it contiguous in memory. While I've not examined the whole MongoDB code base, I suspect that it's more of the same. I can tell you firsthand that the same diligence paid to making the HyperDex client efficient was paid at all layers of the HyperDex stack.
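To make the create / enqueue / flush shape concrete, here is a rough sketch in Python. It is not HyperDex's code: `create_request`, `RequestQueue`, the opcode-plus-lengths header, and `MAX_BUFFERS` are all made up for illustration. The only point is that the packet is built once, queued by reference in constant time, and handed to the kernel with a scatter-gather `sendmsg` instead of being compacted into one contiguous buffer first.

```python
import socket
import struct
from collections import deque
from itertools import islice

MAX_BUFFERS = 64  # hypothetical per-call batch size, kept well under typical IOV_MAX


def create_request(opcode: int, key: bytes, value: bytes = b"") -> bytes:
    """Build the packet once. The header layout here is invented, not a real wire format."""
    header = struct.pack("!BII", opcode, len(key), len(value))
    return b"".join((header, key, value))


class RequestQueue:
    """Holds references to finished packets until they are flushed to the socket."""

    def __init__(self, sock: socket.socket) -> None:
        self._sock = sock
        self._pending = deque()

    def enqueue(self, packet: bytes) -> None:
        # Constant time: only a reference is stored, the packet bytes are not copied.
        self._pending.append(packet)

    def flush(self) -> int:
        """Send everything queued (blocking socket assumed); return bytes written."""
        total = 0
        while self._pending:
            # Scatter-gather write: the kernel reads each buffer in place.
            sent = self._sock.sendmsg(list(islice(self._pending, MAX_BUFFERS)))
            total += sent
            # Drop packets that went out completely.
            while self._pending and sent >= len(self._pending[0]):
                sent -= len(self._pending.popleft())
            # On a partial write, keep a zero-copy view of the unsent tail.
            if sent:
                self._pending[0] = memoryview(self._pending[0])[sent:]
        return total
```

Usage would look like `q = RequestQueue(sock); q.enqueue(create_request(1, b"key", b"value")); q.flush()`. Python still copies when it assembles the packet in `create_request`; the sketch only illustrates that nothing touches the bytes again between creation and the kernel's copy into the socket buffer.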
Edit: Trying to get bullets to work