
All serious databases do their own low-level buffer pool management for relations and tuples, bypassing e.g. malloc, for a bunch of reasons. There are various techniques for this. The papers I link to in my comment refer to some recent ones.

My point is that, in the end, what a DB needs to do at this level is translate references to tuples, btree nodes, etc. into physical memory references: either via a fault that pulls the page in from disk, via a fault that has the kernel map a virtual address to physical memory, or via an already-live reference, etc. And it feels to me that a lot of the logic there mirrors what already happens in the OS's VM subsystem. (For various reasons, the "automagic" facilities in the OS that do things like this, i.e. file-backed mmap, are a terrible way to implement a DB, BTW: https://db.cs.cmu.edu/mmap-cidr2022/)
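To make the "translate a reference into memory" step concrete, here's a toy sketch of a buffer pool, not how any particular DB does it. Everything in it (the `BufferPool` class, `PAGE_SIZE`, plain LRU eviction, no pinning or latching) is an assumption for illustration. The `frames` dict plays roughly the role a page table plays in the VM subsystem, and a miss in `get` is the "fault".

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size, purely illustrative


class BufferPool:
    """Toy buffer pool: maps page ids to in-memory frames with LRU eviction.

    Hypothetical sketch: no pinning, latching, or WAL, which a real DB needs.
    """

    def __init__(self, backing, capacity):
        self.backing = backing        # any seekable binary file-like object
        self.capacity = capacity      # max number of resident frames
        self.frames = OrderedDict()   # page_id -> bytearray, oldest first
        self.dirty = set()            # page ids modified since load
        self.faults = 0               # how many times we had to hit storage

    def get(self, page_id):
        """Translate a page id into a live in-memory frame."""
        frame = self.frames.get(page_id)
        if frame is not None:
            # Hit: the reference is already live, just refresh its LRU position.
            self.frames.move_to_end(page_id)
            return frame
        # Miss: the DB-level equivalent of a page fault.
        self.faults += 1
        if len(self.frames) >= self.capacity:
            self._evict()
        self.backing.seek(page_id * PAGE_SIZE)
        data = self.backing.read(PAGE_SIZE)
        frame = bytearray(data.ljust(PAGE_SIZE, b"\0"))
        self.frames[page_id] = frame
        return frame

    def mark_dirty(self, page_id):
        """Record that a resident page must be written back before eviction."""
        self.dirty.add(page_id)

    def _evict(self):
        """Drop the least recently used frame, writing it back if dirty."""
        victim, frame = self.frames.popitem(last=False)
        if victim in self.dirty:
            self.backing.seek(victim * PAGE_SIZE)
            self.backing.write(frame)
            self.dirty.discard(victim)
```

Unlike file-backed mmap, the DB here decides exactly when a page is read, evicted, and written back, which is the kind of control the linked paper argues you give up when the OS does it for you.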

That, and of course at the persistent storage layer the data structures are tuned around physical device performance characteristics. Back in the day that meant dealing with the mechanics of heads and cylinders; these days it's the interface and quirks of flash storage, write amplification, etc. Hence log-structured merge trees, B-epsilon trees, etc. The persistent storage layer inside the DB looks very much like a filesystem. I mean, it kind of is one, but with a different concept of namespace (relations, tuples, versions, etc. instead of files and directories).
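For anyone who hasn't looked inside one, here's a rough in-memory sketch of the log-structured merge idea: writes land in a small mutable memtable, which gets flushed as an immutable sorted run, and reads check newest data first. The `TinyLSM` name and the `memtable_limit` parameter are made up for illustration; a real LSM tree writes runs to disk sequentially and adds a WAL, bloom filters, levels, and tombstones.

```python
import bisect


class TinyLSM:
    """Toy log-structured merge store, entirely in memory (illustrative only)."""

    def __init__(self, memtable_limit=4):
        self.memtable = {}                 # mutable, absorbs all writes
        self.memtable_limit = memtable_limit
        self.runs = []                     # immutable sorted runs, newest last

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Freeze the memtable into a sorted run (on disk: one sequential write)."""
        if self.memtable:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        """Look in the memtable first, then in runs from newest to oldest."""
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):
            # (key,) sorts just before (key, value), so this finds the entry.
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        """Merge all runs into one, keeping only the newest value per key."""
        merged = {}
        for run in self.runs:              # oldest first, so newer values win
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []
```

The point of the shape is that storage only ever sees large sequential writes (flush and compaction), at the cost of reads possibly touching several runs.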

So, I dunno, I'd maybe rather drink straight from the spring than get a bottle of bubbly water at the restaurant. Or something something weird analogy.



