The nodes of the tree resemble Clojure's HAMTs, just skipping the hash. It's interesting because Clojure itself opts for red-black trees when building sorted maps, but for keys that can be cast to a bit sequence (strings, integers), AMTs are a pretty good option.
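To make that concrete, here's a minimal sketch of what "skipping the hash" means, with an integer key used directly as the trie path (the function name and the 5-bit chunk width are mine, just for illustration):

    ;; Split a non-negative 64-bit key into 5-bit chunks, most-significant
    ;; first, so each chunk indexes a 32-way node and an in-order walk of
    ;; the trie visits keys in numeric order, with no hashing step.
    (defn key->path [k]
      (map #(bit-and (unsigned-bit-shift-right k %) 0x1F)
           (range 60 -5 -5)))

    (key->path 12345) ;=> (0 0 0 0 0 0 0 0 0 0 12 1 25)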
My side project is in some respects quite close to BTrDB: tree-based analytics. Stumbling across it made my day, because it appears to have vindicated some design choices I was unsure about.
I've reimplemented Clojure's data structures in Clojure, and made the nodes serialize to a garbage-collected KV backend (ardb or Redis, depending on requirements). This means they can persist between program runs, be sent over a network, and don't need to fit in memory. Thanks to structural sharing, you can take a big hash map, add a single key to it, and keep both the old and the new version side by side.
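The usage looks just like stock Clojure; the difference is that the trie nodes live in the KV store instead of on the heap. A rough sketch of the structural-sharing property using an ordinary in-memory map (the sizes and key names are only illustrative):

    ;; a plain Clojure map here; the same property holds for the KV-backed version
    (def big-map (into {} (map (fn [i] [(str "k" i) i]) (range 1000000))))

    ;; one assoc produces a new version; only the nodes along a single
    ;; trie path are allocated, everything else is shared with big-map
    (def bigger-map (assoc big-map "one-more-key" 42))

    (count big-map)                 ;=> 1000000, old version untouched
    (count bigger-map)              ;=> 1000001
    (get big-map "one-more-key")    ;=> nil
    (get bigger-map "one-more-key") ;=> 42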
The next piece is cached map/filter/reduce, which saves intermediate results on each node, so batch jobs compute incrementally. You can take a big hash map on disk (my main one holds ~20 GB, a map of string -> set of data points), add some more values to it, and recalculate the analytics for the new map in a few milliseconds, with both versions still available side by side.
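The caching idea, roughly (the node shape and keys here are made up for illustration, not my actual implementation): every node stores the reduce of its subtree, and an edit only discards the cached results along the path it touched, so re-aggregating costs roughly the depth of the change rather than the size of the map.

    ;; Bottom-up: store the sum of each subtree on its node. Nodes freshly
    ;; created by an edit carry no :sum yet; untouched subtrees keep theirs,
    ;; so a re-run after adding a few values only descends the changed paths.
    (defn annotate-sums [node]
      (cond
        (:sum node)   node                                         ; cached, skip
        (:leaf? node) (assoc node :sum (reduce + (:values node)))  ; recompute leaf
        :else         (let [kids (mapv annotate-sums (:children node))]
                        (assoc node
                               :children kids
                               :sum (reduce + (map :sum kids))))))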
The ultimate goal is to enable you to write more or less idiomatic Clojure programs that behave the same whether you're reducing a map with a thousand entries, or a trillion (e.g. by firing up a bunch of AWS Lambda instances).
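Concretely, the kind of code I want to work unchanged at either scale looks something like this (`open-map` is a hypothetical handle onto the KV-backed store, not an API that exists today):

    ;; hypothetical handle onto the on-disk/remote map; illustrative only
    (def points-by-key (open-map "analytics/points"))

    ;; plain reduce-kv, identical whether the map has 10^3 or 10^12 entries;
    ;; the runtime decides whether to fold locally or fan out to workers
    (reduce-kv (fn [acc _k points] (+ acc (count points))) 0 points-by-key)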
Not yet. I'm using it in prod for my own analytics, so I know it works, but it'll need some cleanup and features before release. I do plan on it, though.