Some years ago I was playing around with the idea of an append-only splay tree for storing a database file. The thinking was that the algorithm would keep recently-accessed items at the business end of the log (in addition to any new/updated/deleted items).
This concept is clearly quite heavy on writes, but the tradeoff is that you would then have the ability to expire unused entries simply by picking an offset in the log and chopping everything that precedes it. Any record accessed more recently than that cutoff point would be guaranteed to have been written one or more times later on in the log.
Any access would result in a new modified tree & root node being written, but the prototype did batch IO using an MPSC queue abstraction which meant that I could amortize things a bit. Multiple transactions could fit in a single IO command if they are issued from different threads, occur within a small slice of time and are smaller than the block size.
This concept is clearly quite heavy on writes, but the tradeoff is that you would then have the ability to expire unused entries simply by picking an offset in the log and chopping everything that precedes it. Any record accessed more recently than that cutoff point would be guaranteed to have been written one or more times later on in the log.
Any access would result in a new modified tree & root node being written, but the prototype did batch IO using an MPSC queue abstraction which meant that I could amortize things a bit. Multiple transactions could fit in a single IO command if they are issued from different threads, occur within a small slice of time and are smaller than the block size.