This looks like an interesting finding. Unfortunately, the trade-off for this type of efficient small-data storage is real:
> Note that LSM-trie uses hash functions to organize its data and accordingly does not support range search.
Range search, while not directly applicable to all data sets, is an important feature of the LSM-based stores it is compared against (LevelDB & RocksDB).
The authors acknowledge this and say:
> There are techniques available to support the command by maintaining an index above these hash-based stores
So don't plan on using an LSM-trie as a drop-in replacement for LevelDB or other LSM-tree-based projects that rely on range searches without first weighing the additional complexity of building and maintaining an index to perform those searches.
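To make that concrete, here's a minimal Python sketch (mine, not the paper's) of the index-above-the-store idea: a hash-organized store paired with a sorted secondary index over the keys. All the names here are hypothetical, and a real index would itself need to be persisted and kept in sync with compaction, which is exactly the extra complexity to weigh.

```python
import bisect

class RangeIndexedStore:
    """Hypothetical sketch: a hash-based KV store augmented with a
    sorted key index so range scans remain possible."""

    def __init__(self):
        self._store = {}   # stand-in for the hash-organized store
        self._keys = []    # sorted secondary index over the keys

    def put(self, key, value):
        if key not in self._store:
            bisect.insort(self._keys, key)  # keep the index ordered
        self._store[key] = value

    def get(self, key):
        return self._store.get(key)

    def range(self, lo, hi):
        # Binary-search the sorted index, then point-look-up each key.
        i = bisect.bisect_left(self._keys, lo)
        j = bisect.bisect_right(self._keys, hi)
        for key in self._keys[i:j]:
            yield key, self._store[key]

s = RangeIndexedStore()
for k in ["cherry", "apple", "banana"]:
    s.put(k, len(k))
print(list(s.range("apple", "banana")))  # [('apple', 5), ('banana', 6)]
```

The point of the sketch is the cost model: every put now pays for an ordered-index update, and the index's memory footprint grows with the key count, undoing part of what the LSM-trie saves.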
For the use-case that the paper presents, I don't think that range search is that important. Most use-cases for LSM-trees so far have been keeping in-memory metadata costs down (e.g. replacing storage engines such as Bitcask) and converting random writes into sequential writes, so as to make the most of disk throughput and not wear out SSDs. LSM-trees have had shortcomings in the past with write amplification, and I think this paper does well in addressing that (besides pushing the envelope in reducing in-memory metadata costs).
Every time I come across a new KV store, I go looking for a userland implementation from the Acunu guys: http://arxiv.org/abs/1103.4282. So far, I haven't found one.
I'd like to see Stratified B Trees kick LSMs' butts. Or the other way around. I can't code this yet. But I can hope that somebody's already on to this.
I haven't read the whole paper, but "LSM" is "log-structured merge"; I guess they merge runs over time to keep write amplification down? As eyan points out, Twigg's (Acunu's) "stratified B-trees" may have obsoleted the whole copy-on-write-B-tree family of data structures.
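For anyone who hasn't seen the merge step before, here's a toy Python sketch of that idea (my illustration, not the paper's algorithm): compaction merges several sorted runs into one, keeping only the newest value for each duplicated key, which is what bounds how much data gets rewritten.

```python
import heapq

def _tag(run, age):
    # Helper so each generator captures its own `age` (newer = smaller).
    return ((key, age, value) for key, value in run)

def merge_runs(*runs):
    """Merge sorted runs of (key, value) pairs into one sorted run.

    Runs are passed newest-first; on duplicate keys the newest value
    wins. A toy sketch of LSM compaction, not the paper's algorithm.
    """
    merged = []
    tagged = heapq.merge(*(_tag(run, age) for age, run in enumerate(runs)))
    for key, _age, value in tagged:
        if not merged or merged[-1][0] != key:
            merged.append((key, value))  # first occurrence = newest
    return merged

# Example: the newer run's value for "b" survives the merge.
newer = [("a", 1), ("b", 2)]
older = [("b", 9), ("c", 3)]
print(merge_runs(newer, older))  # [('a', 1), ('b', 2), ('c', 3)]
```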
This is a form of LSM-tree that sacrifices range queries in exchange for write amplification of at most 5x across its 5 levels. That translates into a significant increase in throughput compared to LevelDB and RocksDB.
This paper targets datasets of 1-10 billion keys, keeping write amplification down to at most 1x per level (i.e. 5x over 5 levels), and in-memory indexes to a minimum.
This is very different from LevelDB. But LevelDB's claims are actually valid, provided your dataset is not massive and you understand how your workload interacts with its compaction and write amplification. This paper presents a form of LSM-tree that addresses both of those issues (massive datasets, low write amplification).
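As a back-of-the-envelope comparison (my numbers: the ~10x-per-level figure is the usual rule of thumb for leveled compaction with a fan-out of 10, not a measurement from this paper):

```python
# Rough model: total write amplification ~= levels * per-level rewrites.
def write_amplification(levels: int, per_level: float) -> float:
    return levels * per_level

# Leveled compaction typically rewrites each item ~fanout (~10x) per level.
leveldb_like = write_amplification(levels=5, per_level=10)  # -> 50
# LSM-trie's claim is at most ~1x per level.
lsm_trie = write_amplification(levels=5, per_level=1)       # -> 5

print(leveldb_like, lsm_trie)  # 50 5
```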
Now that we know that "number of atoms in the universe" is not a good example of an extremely large number [1], maybe we can use "number of key-value stores" as the new reference?