
> Therefore the database should handle its own I/O caching using O_DIRECT on Linux or the equivalent on Windows or other Unixes.

That's not wrong, but at the same time it adds complexity and requires effort that can't be spent elsewhere, unless you've got someone who really only wants to work on DIO and wouldn't work on anything else anyway.

Postgres has never used DIO, and while there have been rumblings about moving to DIO (especially following the fsync mess), as Andres Freund noted:

> efficient DIO usage is a metric ton of work, and you need a large amount of differing logic for different platforms. It's just not realistic to do so for every platform. Postgres is developed by a small number of people, isn't VC backed etc. The amount of resources we can throw at something is fairly limited. I'm hoping to work on adding linux DIO support to pg, but I'm sure as hell not going to be able to do the same on windows (solaris, hpux, aix, ...) etc.




PostgreSQL has two main challenges with direct I/O. The basic one is that it adversely impacts portability, as mentioned, and is complicated in implementation because file system behavior under direct I/O is not always consistent.

The bigger challenge is that PostgreSQL is not architected like a database engine designed to use direct I/O effectively. Adding even the most rudimentary support will be a massive code change and implementation effort, and the end result won't be comparable to what you would expect from a modern database kernel designed to use direct I/O. This raises questions about return on investment.


I have found that planning for DIO from the start makes for a better, simpler design when designing storage systems, because it keeps the focus on logical/physical sector alignment, latent sector error handling, and caching from the beginning. It is even better to design data layouts to work with block devices.

Retrofitting DIO onto a non-DIO design, and doing this cross-platform, is going to be more work, but I don't think that's the fault of DIO (when you're already building a database, that is).


Is there a known library with a cross platform abstraction that could help?


I wrote this for Node.js, which is a native binding in C, exposing cross platform functionality: https://github.com/ronomon/direct-io

Although if it's a new project and you're used to C, I would recommend also taking a good look at Zig (https://ziglang.org/), because it's so explicit about alignment compared to C, and makes alignment a first-class part of the type system. See this other comment of mine that goes into more detail: https://news.ycombinator.com/item?id=25801542

Something that will also help is setting your minimum IO unit to 4096 bytes, the Advanced Format sector size, because then your Direct IO system will just work, regardless of whether sysadmins swap disks of different sector sizes from underneath you. For example, a minimum IO unit of 4096 bytes will work not only for newer AF disks but also for any 512 byte sector disks, since 4096 is a multiple of 512.
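To make the 4096 byte point concrete, here is a minimal sketch of the alignment arithmetic in Python. The names (`SECTOR_SIZE`, `align_down`, `align_up`, `aligned_span`) are made up for illustration and are not taken from direct-io or any other library. The sketch only assumes that every offset and length passed to the disk must be a multiple of the minimum IO unit.

```python
SECTOR_SIZE = 4096  # assumed minimum IO unit (Advanced Format sector size)


def align_down(value: int, unit: int = SECTOR_SIZE) -> int:
    """Round value down to the nearest multiple of unit."""
    return value - (value % unit)


def align_up(value: int, unit: int = SECTOR_SIZE) -> int:
    """Round value up to the nearest multiple of unit."""
    return align_down(value + unit - 1, unit)


def is_aligned(value: int, unit: int = SECTOR_SIZE) -> bool:
    """True if value is a multiple of unit."""
    return value % unit == 0


def aligned_span(offset: int, length: int) -> tuple[int, int]:
    """Return (start, size) of the smallest aligned range that covers
    the byte range [offset, offset + length)."""
    start = align_down(offset)
    end = align_up(offset + length)
    return start, end - start


if __name__ == "__main__":
    # A 100 byte read at offset 5000 becomes one aligned 4096 byte read.
    print(aligned_span(5000, 100))
    # Anything aligned to 4096 is also aligned to 512.
    print(is_aligned(3 * SECTOR_SIZE, 512))
```

Because every 4096 byte boundary is also a 512 byte boundary, a range computed this way satisfies the alignment requirement of both sector sizes.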

Lastly, Direct IO is actually more a property of the file system, not necessarily of the OS (e.g. Linux). You will find some file systems on Linux that return EINVAL when you try to open a file descriptor with O_DIRECT, simply because they don't support O_DIRECT (e.g. a macOS volume accessed from within a Linux VM). So attempting the open should be your way of testing for support, rather than only checking the OS.
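A probe along those lines might look like the following sketch in Python. The function name `supports_direct_io` is hypothetical. It assumes the file at `path` already exists and lives on the file system you care about, and it treats a platform where Python does not expose `os.O_DIRECT` as unsupported, even though such platforms may offer a different mechanism.

```python
import errno
import os


def supports_direct_io(path: str) -> bool:
    """Probe whether the file system holding `path` accepts O_DIRECT.

    Returns False if the platform does not expose O_DIRECT at all, or if
    the open fails with EINVAL. Any other error is re-raised.
    """
    flag = getattr(os, "O_DIRECT", None)
    if flag is None:
        # Python does not define os.O_DIRECT here (e.g. macOS, Windows).
        return False
    try:
        fd = os.open(path, os.O_RDONLY | flag)
    except OSError as exc:
        if exc.errno == errno.EINVAL:
            # The file system rejected the flag.
            return False
        raise
    os.close(fd)
    return True
```

A successful open only shows that the flag was accepted. Reads and writes still have to respect the alignment rules of the underlying device, so this check complements alignment handling rather than replacing it.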



