According to the setup page (http://hyperdex.org/performance/setup/), each YCSB workload was run with only 16 threads. I'm not sure how MongoDB and HyperDex handle concurrent connections, but Cassandra performance with YCSB will be much better if you increase the number of threads YCSB uses.
Also for the Cassandra benchmark, was caching turned on for the data column family in the usertable keyspace?
Additionally it would be great to see the operations per second for the "load" portion of workload A as this represents a 100% write workload.
Other than those few points HyperDex looks pretty cool and has some great read performance.
Where is the benchmark code for this? Cassandra and MongoDB are not key-value stores; you should have compared against Memcached and Redis for this. It's like comparing MySQL and Memcached for your key-store corner case and being surprised that it's 20x slower! Ridiculous.
Comparing HyperDex against Cassandra and MongoDB is appropriate. All three are distributed databases with secondary indexes and an emphasis on consistency and scalability.
The other two databases you mentioned are not in the same problem domain. Redis will not manage more data than it can address in memory (barring the deprecated virtual memory paging) and Memcached doesn't bother with persistence at all. Both of these are very much in-memory databases targeting ephemeral storage tasks. Neither of them has distributed capabilities beyond very minimal master/slave configurations.
The YCSB benchmarks [0] were developed for Yahoo to evaluate a wide range of NoSQL databases by abstracting a few common operations to measure the primitives shared by all of them.
There's ongoing work to write one. Tibor Vass started hacking on it [1] a while back. Scott Dunlop (swdunlop?) has recently started working on finishing the client. Hopefully he'll notice this and reply so maybe you won't have to write your own...
@tiborvass's client is 10 months old. Its handling of HyperDex's data structures works well, but some of the APIs have changed, and the client's handling of hyperclient_loop looks like it can enter a fast busy loop when it should idle.
I have started work on scavenging parts of @tiborvass's library to make a more minimal and maintainable Go library, but there are other projects between me and completion. :)