Wear leveling generally happens well below the user-level filesystem (and is quite complicated!), so altering your user-level behavior because you think it helps is a little bit silly. Zoned NVMe is an obvious exception to this, where the FTL takes advantage of the append-only zones (and even that is an abstraction only shown to filesystems), but it will still frequently remap your blocks to keep the wear even if you do a lot of read-modify-writes.
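(To make the "abstraction only shown to filesystems" part concrete: roughly, the zone count and zone size below are all the geometry a zoned device hands up the stack; the flash underneath is still the drive's business. A minimal sketch using Linux's blkzoned ioctls, assuming a kernel with CONFIG_BLK_DEV_ZONED; /dev/nvme0n2 is just a placeholder path, and this is an illustration, not tested against real hardware.)

    /* Query the zone geometry Linux exposes for a zoned block device. */
    #include <stdio.h>
    #include <stdint.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <linux/blkzoned.h>

    int main(void)
    {
        int fd = open("/dev/nvme0n2", O_RDONLY);  /* placeholder device */
        if (fd < 0) { perror("open"); return 1; }

        uint32_t nr_zones = 0, zone_sectors = 0;

        /* Number of zones and zone size (in 512-byte sectors): this is
         * essentially the whole picture a filesystem gets to see. */
        if (ioctl(fd, BLKGETNRZONES, &nr_zones) ||
            ioctl(fd, BLKGETZONESZ, &zone_sectors)) {
            perror("ioctl");
            close(fd);
            return 1;
        }

        printf("%u zones of %u sectors each\n", nr_zones, zone_sectors);
        close(fd);
        return 0;
    }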
True, which is why I called out "raw" flash in particular. I think there are embedded cases, for example, where it might make sense to skip that layer. Even on general-purpose machines I think it'd be interesting to see alternate filesystem models that avoid double logging for databases, but I don't know if it will ever happen...
> Zoned NVMe is an obvious exception to this
And host-managed SMR HDDs, as namibj pointed out. I still haven't managed to get my hands on one, though; they seem to be hyperscaler-only for now.
Even "raw" flash as exposed in most Linux servers is not anywhere near raw. For embedded use cases, it absolutely matters, but if you have Linux and an enterprise NVMe SSD, nothing you do in userland will matter.
Well, you still don't get fine-grained random deletions on Zoned NVMe.
Also, there is another storage target that expects an append-only style: SMR HDDs.
They don't support random writes into a zone.
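Both host-managed SMR and zoned NVMe show up as zoned block devices in Linux, so the userland view of the constraint is the same: you can read a zone's write pointer, write only at it, and "delete" only by resetting the zone as a whole. A rough sketch with the same blkzoned ioctls (again, /dev/nvme0n2 is a placeholder, needs CONFIG_BLK_DEV_ZONED and root; illustration only):

    #include <stdio.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <linux/blkzoned.h>

    int main(void)
    {
        int fd = open("/dev/nvme0n2", O_RDWR);  /* placeholder device */
        if (fd < 0) { perror("open"); return 1; }

        /* Report the first zone: start, length, and current write pointer.
         * Writes must land exactly at wp; everything behind it is immutable
         * until the zone is reset as a unit. */
        struct blk_zone_report *rep =
            calloc(1, sizeof(*rep) + sizeof(struct blk_zone));
        if (!rep) { close(fd); return 1; }
        rep->sector = 0;
        rep->nr_zones = 1;
        if (ioctl(fd, BLKREPORTZONE, rep)) {
            perror("BLKREPORTZONE");
            return 1;
        }

        struct blk_zone *z = &rep->zones[0];
        printf("zone @%llu len %llu wp %llu\n",
               (unsigned long long)z->start,
               (unsigned long long)z->len,
               (unsigned long long)z->wp);

        /* The only "delete": reset the whole zone. There is no finer-grained
         * erase or in-place rewrite, which is the constraint above. */
        struct blk_zone_range range = { .sector = z->start, .nr_sectors = z->len };
        if (ioctl(fd, BLKRESETZONE, &range))
            perror("BLKRESETZONE");

        free(rep);
        close(fd);
        return 0;
    }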