
In the future this will be possible on Linux with the filesystems that support DAX. Currently this is all pretty experimental, with lots of work done in this space over the last two years.

But this will require you to have the right kind of flash storage, the right kind of fs, the right kind of mount options, and probably a different code path in userspace for DAX vs traditional storage.

So we're a little ways away from this.




DAX doesn't appear related here at all. That is about bypassing the page cache for block devices that don't need one.

That doesn't move anything from kernel land into userspace, certainly not in the app's process in userspace anyway.


If you bypass the page cache, you do not have to use read()/write(), and with mmap you avoid the syscall overhead. This matters a lot for high-IOPS devices. Also, these newfangled devices claim to support word or cache-line sync using normal CPU flush instructions, which also avoids the fsync syscall.


One does not follow the other. Where are any references to how this will let you bypass read & write? User-space applications are still interacting with a filesystem, which they access via read/write and not a block device.

There's no talk in the DAX information about how this results in a zero-syscall filesystem API, and I'm not seeing how that would ever work given there would then be zero protections on anything. You need a handle, and that handle needs security. All of that is done today via syscalls, and DAX isn't changing that interface at all. So where is the API to userspace changing?


Please re-read my above comment. There is no new API. The DAX userspace API is mmap.

This work is experimental, but you can mmap a single file on a filesystem on this device using the new DAX capabilities. Most accesses will no longer require a syscall.

This comes with all the usual semantics and trappings of mmap, plus some additional caveats depending on how the filesystem / DAX / hardware is implemented. Most reads/writes will not require a trip to the kernel via the normal read()/write() syscalls. Additionally, there is no RAM page cache backing this mmap; instead, the device is mapped directly at a virtual address (like DMA).
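To make the mmap part concrete, here is a rough sketch using Python's standard mmap module on an ordinary temp file. That means it still goes through the page cache; on a DAX mount the same calls would presumably map the device directly, which this sketch cannot show. The helper name mmap_roundtrip is made up for illustration.

```python
import mmap
import os
import tempfile


def mmap_roundtrip(payload: bytes, size: int = 4096) -> bytes:
    """Store payload through a shared mapping, then read it back with pread()."""
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, size)  # the file must be at least as long as the mapping
        # One mmap() call sets up the mapping...
        with mmap.mmap(fd, size, flags=mmap.MAP_SHARED,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE) as mm:
            # ...after which loads and stores are plain memory accesses
            # (page faults aside), not read()/write() calls.
            mm[:len(payload)] = payload
            assert mm[:len(payload)] == payload
            # On a conventional filesystem, durability still needs msync(),
            # which is what flush() issues here.
            mm.flush()
        # The store is also visible through the regular file API.
        return os.pread(fd, len(payload), 0)
    finally:
        os.close(fd)
        os.unlink(path)
```

The point is only that the accesses inside the with block are memory operations on the mapping; the setup, the flush and the final pread() are still syscalls.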

Finally, flushing on these kinds of devices is implemented at the block level using normal instructions, not fsync. The flush is going to be done using the CLWB instruction. See: https://software.intel.com/en-us/blogs/2016/09/12/deprecate-...

LWN.net has lots of articles and links in their archives from 2016/2017. It's a really good read. Sadly I do not have time to dig more of them up for you. Do a search for site:lwn.net and search for DAX or MAP_DIRECT.


Please re-read mine. How is the number of syscalls (which is the only thing that matters in this context) changing if there's no API change for apps? mmap already exists and already avoids the syscall. DAX "just" makes the implementation faster, but it doesn't appear to have any impact on the number of syscalls.

As in, if you call read/write instead of using mmap, you're still getting a syscall regardless of whether DAX is supported or not. Not everything can use mmap. mmap is not a direct replacement for read/write in all scenarios.
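A small sketch of that distinction, again with Python's standard os and mmap modules on a plain temp file. The record size and all function names are hypothetical, and nothing here depends on DAX: the pread() path makes one syscall per record either way, the mmap path makes one call up front, and a pipe cannot be mapped at all.

```python
import mmap
import os
import tempfile

RECORD = 8  # hypothetical fixed record size in bytes


def read_with_pread(fd: int, count: int) -> list:
    # One pread() syscall per record, with or without DAX underneath.
    return [os.pread(fd, RECORD, i * RECORD) for i in range(count)]


def read_with_mmap(fd: int, count: int) -> list:
    # One mmap() call up front; each record is then a copy out of the mapping.
    with mmap.mmap(fd, count * RECORD, prot=mmap.PROT_READ) as mm:
        return [mm[i * RECORD:(i + 1) * RECORD] for i in range(count)]


def same_records(count: int = 4) -> bool:
    """Check that both access paths return identical data."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"".join(i.to_bytes(RECORD, "little") for i in range(count)))
        return read_with_pread(fd, count) == read_with_mmap(fd, count)
    finally:
        os.close(fd)
        os.unlink(path)


def pipe_is_mappable() -> bool:
    """Try to mmap the read end of a pipe; streams like this need read()."""
    r, w = os.pipe()
    try:
        mmap.mmap(r, 4096, prot=mmap.PROT_READ).close()
        return True
    except (OSError, ValueError):
        return False
    finally:
        os.close(r)
        os.close(w)
```

Both readers return the same bytes; the difference is only in how many times the kernel is entered, and the pipe case is one example of an fd where mmap is not an option.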



