> O_DIRECT is not a safe way to recover from the journal if you have decided you cannot trust fsync to do its job, because you need fsync to make O_DIRECT write-cache durable.
I wasn't referring to an fsync failure in your sense (where the disk or fs does not respect fsync at all, so that fsync is a no-op, or where the fs has a bug with O_DIRECT not flushing if it sees nothing dirty in the page cache; by the way, I think this is no longer an issue, and if it still is, it's a kernel bug you can report)
...but to handling an fsync error in the context of the paper from WISC that I linked to in the parent comment, where the kernel's page cache has gone out of sync with the disk after an fsync EIO error ("Fsyncgate"):
"when the page cache can no longer be trusted by the database to be coherent with the disk: https://www.usenix.org/system/files/atc20-rebello.pdf"
The details are all in the paper. Sure, some disks may not respect fsync, but O_DIRECT is still the only way to safely read and recover from the journal when the kernel's page cache is out of sync with the disk (again, details in the paper). It's another fantastic paper out of WISC.
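To make that concrete, here is a rough sketch of what reading a journal with O_DIRECT can look like on Linux. It is not taken from the paper; `read_direct`, `align_up` and the 4096-byte `BLOCK` are names and values I'm assuming for illustration, and the real alignment requirement depends on the device and filesystem.

```python
import mmap
import os

BLOCK = 4096  # assumed alignment; the real logical block size may differ


def align_up(n: int, block: int = BLOCK) -> int:
    """Round n up to the next multiple of block."""
    return (n + block - 1) // block * block


def read_direct(path: str, length: int, offset: int = 0) -> bytes:
    """Read up to `length` bytes at `offset`, bypassing the page cache (Linux only)."""
    if length <= 0:
        return b""
    if offset % BLOCK:
        raise ValueError("offset must be block-aligned for O_DIRECT")
    # O_DIRECT needs an aligned buffer and an aligned transfer size.
    # An anonymous mmap is page-aligned, which satisfies the buffer part.
    buf = mmap.mmap(-1, align_up(length))
    try:
        fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
        try:
            # Single read for brevity; a real recovery loop would handle
            # short reads and retry or fail on EIO explicitly.
            n = os.preadv(fd, [buf], offset)
        finally:
            os.close(fd)
        return buf[:min(n, length)]
    finally:
        buf.close()
```

Something like `read_direct("journal.bin", 512)` then returns what is on the device rather than what the page cache believes is there. Some filesystems reject O_DIRECT (the open or read fails with EINVAL), so treat that as an error to surface rather than silently falling back to buffered reads.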