Fast Ordered Collections for Swift Using In-Memory B-Trees (github.com/lorentey)
80 points by ingve on March 1, 2016 | 7 comments



Sounds very similar to the copy-on-write-friendly B-trees used in Btrfs, originally researched by Ohad Rodeh: http://liw.fi/larch/ohad-btrees-shadowing-clones.pdf


Absolutely awesome readme, comprehensive and covers a huge chunk of background. I know more about in-memory tree data structures than when I started. Beautiful API. This is how a data structure library should be done.


I'm very sceptical that in-memory B-trees can beat hashes, given their huge size overhead and cache unfriendliness.


Quoting the readme: "B-trees were originally invented in the 1970s as a data structure for slow external storage devices. As such, they are strongly optimized for locality of reference: they prefer to keep data in long contiguous buffers and they keep pointer dereferencing to a minimum. (Dereferencing a pointer in a B-tree usually meant reading another block of data from the spinning hard drive, which is a glacially slow device compared to the main memory.)"

Sounds like it could be pretty cache friendly. Besides, a B-tree can be used to implement a hash/dict/map.
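To make the "a B-tree can implement a map" point concrete, here is a minimal sketch of an ordered map built on a B-tree. It is an illustration only, not the API of the library under discussion: the names `Node`, `BTreeMap`, `put`, `get`, `items`, and the `max_keys` parameter are all made up for this example, and deletion and copy-on-write are left out. Each node keeps its keys in one sorted list, so a lookup does a binary search inside a node and follows only one child pointer per level.

```python
from bisect import bisect_left


class Node:
    """One B-tree node (hypothetical layout for this sketch)."""

    def __init__(self):
        self.keys = []      # sorted keys, stored together in one list
        self.values = []    # values[i] belongs to keys[i]
        self.children = []  # empty for a leaf, else len(keys) + 1 subtrees


class BTreeMap:
    """Ordered map sketch: insert, lookup, in-order iteration. No deletion."""

    def __init__(self, max_keys=7):
        self.max_keys = max_keys  # a node splits once it holds more than this
        self.root = Node()

    def get(self, key, default=None):
        node = self.root
        while True:
            # Binary search within the node, then at most one pointer hop.
            i = bisect_left(node.keys, key)
            if i < len(node.keys) and node.keys[i] == key:
                return node.values[i]
            if not node.children:
                return default
            node = node.children[i]

    def put(self, key, value):
        split = self._insert(self.root, key, value)
        if split is not None:
            # The root overflowed: grow the tree by one level.
            sep_key, sep_value, right = split
            new_root = Node()
            new_root.keys, new_root.values = [sep_key], [sep_value]
            new_root.children = [self.root, right]
            self.root = new_root

    def _insert(self, node, key, value):
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            node.values[i] = value  # key already present: overwrite
            return None
        if not node.children:
            node.keys.insert(i, key)
            node.values.insert(i, value)
        else:
            split = self._insert(node.children[i], key, value)
            if split is not None:
                # The child split: adopt its separator and new right sibling.
                sep_key, sep_value, right = split
                node.keys.insert(i, sep_key)
                node.values.insert(i, sep_value)
                node.children.insert(i + 1, right)
        if len(node.keys) <= self.max_keys:
            return None
        return self._split(node)

    def _split(self, node):
        # Move the upper half into a new sibling; the middle key goes up.
        mid = len(node.keys) // 2
        right = Node()
        right.keys, right.values = node.keys[mid + 1:], node.values[mid + 1:]
        if node.children:
            right.children = node.children[mid + 1:]
            del node.children[mid + 1:]
        separator = (node.keys[mid], node.values[mid], right)
        del node.keys[mid:]
        del node.values[mid:]
        return separator

    def items(self):
        """Yield (key, value) pairs in ascending key order."""
        def walk(node):
            for i, key in enumerate(node.keys):
                if node.children:
                    yield from walk(node.children[i])
                yield key, node.values[i]
            if node.children:
                yield from walk(node.children[-1])
        return walk(self.root)
```

Unlike a hash table, `items()` comes back sorted by key with no extra work, which is the niche an ordered collection is meant for. Whether this layout actually wins on cache behaviour depends on node size and the language runtime, so treat it as a structural sketch rather than a benchmark.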


I know B-trees. Still, Patricia trees or the optimized variant, Judy hashes, are more cache friendly, and non fucked-up hashes even more so. For OrderedDict it makes sense, but I would still consider Judy or Patricia better.


Tries are awesome; but they're more specialized. So, as long as you have to choose which one to implement first (and I do), B-trees provide better bang for the buck.


B-trees don't beat hash tables at their own game, but hash tables don't beat B-trees at theirs either. Both are general-use data structures that have their particular niches. The trick is to always select the correct tool for the job.

You might be thinking about red-black trees. The size overhead of B-trees is the same or better than that of hash tables. As for cache friendliness, B-trees were explicitly designed to work great on two-level storage; the advantage is clearly theirs.



