
Btrfs is still far less reliable than ZFS, even after _decades_ of development. This is unacceptable, IMHO. I've lost so much data to btrfs corruption issues that I've (almost) stopped using it completely. It's better to fight to keep the damned OpenZFS modules up to date and get an actually _reliable_ system than to accept that risk again.



> I've lost so much data due to btrfs corruption issues that I've (almost) stopped to use it completely nowadays.

Just out of curiosity: is there a specific reason you're not using plain-vanilla filesystems which _are_ stable?

Personal anecdote: i've only ever had serious corruption twice, 20-ish years ago, once with XFS and once with ReiserFS, and have primarily used the extN family of filesystems for most of the past 30 years. A filesystem only has to go corrupt on me once before i stop using it.

Edit to add a caveat: though i find the ideas behind ZFS, btrfs, etc., fascinating, i have no personal need for them so have never used them on personal systems (but did use ZFS on corporate Solaris systems many years ago). ext4 has always served me well, and comes with none of the caveats i regularly read about for any of the more advanced filesystems. Similarly, i've never needed an LVM or any such complexity. As the age-old wisdom goes, "complexity is your enemy," and keeping to simple filesystem setups has always served my personal systems/LAN well. i've also never once seen someone recover from filesystem corruption in a RAID environment by simply swapping out a disk (there's always been much more work involved), so i've never bought into the "RAID is the solution" camp.


ZFS is just too convenient, IMHO:

- ZStandard compression is a performance boost on crappy spinning rust

- Snapshots are amazing, and I love being able to quickly ship and store them with zfs send/receive (rough sketch at the end of this comment)

- I like not having to partition the disk at all while still being able to have multiple datasets that share the same underlying storage. LVM2 has way too many downsides for me to still consider it, like the fact that thin provisioning was quite problematic (ext4 and the like have no idea they're thin provisioned, ...)

- I like not having to bother with fstab anymore. I have all of my (complex) datasets under multiple boot roots, and I can import pools from a live environment with an altroot and immediately get all directories properly mounted

- AFAIK only ZFS and Btrfs support checksums out of the box. I hate the fact that most filesystems can bitrot and silently corrupt files. ZFS and Btrfs can't always repair the damage by themselves (that takes redundancy), but at least you'll know the data got corrupted and can restore it from a backup

- I like ZVOLs; I appreciate being able to use them as sparse disks for VMs that can be easily mounted without loopback devices (you get all partitions under /dev/zvol/pool/zvol-partN)

- If you have a lot of RAM, the ZFS ARC can speed things up a lot. ZFS is usually somewhat slower than "simpler" filesystems, but with 10+ GB available to the ARC it's been faster in my experience than any other FS

I do use "classic" filesystems for other applications, like random USB disks and stuff. I just prefer ZFS because the feature set is so good and it's been nothing but stable in day to day use. I've literally had ZERO issues with it in 8+ years - even when using the -git version it's way more stable than Btrfs ever was.


> Just out of curiosity: is there a specific reason you're not using plain-vanilla filesystems which _are_ stable?

I'd guess that it is the classic case of figuring out if something works without using it being a lot harder than giving it a go and seeing what happens. I've accidentally taken out my own home folder in the past with ill-advised setups and it is an educational experience. I wouldn't recommend it professionally, but I can see the joy in using something unusual on a personal system. Keep backups of anything you really can't afford to lose.

And one bad experience isn't enough to get a feel for how reliable something is. It is better to stick with it even if it fails once or twice.


> And one bad experience isn't enough to get a feel for how reliable something is.

For non-critical subsystems, sure, but certain critical infrastructure has to get it right every time or it's an abject failure (barring interference from random cosmic rays and similar levels of problems). Filesystems have been around for well over half a century, so they should fall into the category of "solved problem" by now. i don't doubt that advanced filesystems are stupendously complex, but i do doubt the _need_ for such complexity beyond the sheer joy of programming one.

> It is better to stick with it even if it fails once or twice.

Like a pacemaker or dialysis machine, one proverbial strike is all i can give a filesystem before i switch implementations.


snapshots every 15 minutes are a big selling point of ZFS for me; losing a file to a tired

    $ grep bar foo.txt | tr A-Z a-z > foo.txt
is much more common than losing a disk
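
The redirection truncates foo.txt before grep ever reads it, so the file is gone the instant you hit enter. With frequent snapshots, recovery is just a copy out of the hidden snapshot directory (the mountpoint and snapshot name below are only illustrative):

    # every snapshot is exposed read-only under <mountpoint>/.zfs/snapshot/
    $ cp /home/.zfs/snapshot/15min-2024-05-01-1215/me/foo.txt foo.txt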


> losing a file to a tired ...

If the file isn't in source control, a backup, or auto-synced cloud storage, it can't be _that_ important. If it was in any of those, it could be recovered easily without replacing one's filesystem with one that needs hand-holding to keep it running. Shrug.


ZFS is the mechanism by which I implement local (via snapshots) and remote (via zfs send) backups on my user-facing machines.

- It can do 4x 15-minute snapshots, 24x hourly snapshots, 7x daily snapshots, 4x weekly snapshots, and 12x monthly snapshots, without making 51 copies of my files (a bare-bones cron sketch is at the end of this comment).

- Taking a snapshot has imperceptible performance impact.

- Snapshots are taken atomically.

- Snapshots can be booted from, if it's a system that's screwed up and not just one file.

- Snapshots can be accessed without disturbing the FS.

In my experience it hasn't required more hand-holding than ext4 past the initial install. But then the OSes most of my devices use either officially support ZFS or don't use package managers that will blindly upgrade the kernel past what my out-of-tree modules support, which I think avoids the most common issue people have with ZFS.
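
For reference, a bare-bones version of that retention schedule needs nothing but cron and stock zfs commands; the pool name and snapshot labels below are assumptions, and dedicated tools (zfs-auto-snapshot, sanoid, zrepl) do the same thing more gracefully:

    # crontab sketch: rotate 24 hourly and 7 daily snapshots by reusing names
    0 * * * *  zfs destroy -r tank@hourly-$(date +\%H) 2>/dev/null; zfs snapshot -r tank@hourly-$(date +\%H)
    0 0 * * *  zfs destroy -r tank@daily-$(date +\%a)  2>/dev/null; zfs snapshot -r tank@daily-$(date +\%a)

The 15-minute, weekly and monthly tiers follow the same pattern with different labels.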


> is there a specific reason you're not using plain-vanilla filesystems which _are_ stable?

my personal reasons are raid + compression


Funny, because I have the opposite experience. The main issue with btrfs is a lack of tooling that would let a layperson fix issues without btrfs-developer-level knowledge.

I've personally had drive failures, fs corruption due to power loss (which is not supposed to happen on a CoW filesystem), and fs and file corruption due to RAM bitflips. Every time, btrfs handled the situation perfectly, with the caveat that I needed help from the btrfs developers. And they were very helpful!

So yeah, btrfs has a bad rep, but it is not as bad as the common sentiment makes it look.

(note that I still run btrfs raid 1, as I haven't found solid real-world experience reports on raid 5 or 6)
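
For what it's worth, the first-line tools do exist; the developer-level part is mostly interpreting their output and knowing which repair options are safe to run. The paths below are placeholders:

    # verify all data and metadata against checksums on a mounted filesystem (-B keeps it in the foreground)
    $ sudo btrfs scrub start -B /mnt/data
    # offline, read-only consistency check; don't reach for --repair without expert advice
    $ sudo btrfs check --readonly /dev/sdb1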


It's funny because Facebook uses btrfs for their systems & doesn't have these issues.

ZFS lovers need to stop this CoW against CoW violence.


Someone correct me if I'm wrong, but to my understanding FB uses Btrfs only in RAID 0, 1, or 10, and not with any of the parity options.

RAID56 under Btrfs has some caveats, but I'm not aware of any anecdata (or perhaps I'm just not searching hard enough) from the past few weeks or months about data loss when those caveats are taken into consideration.


> RAID56 under Btrfs has some caveats, but I'm not aware of any anecdata (or perhaps I'm just not searching hard enough) from the past few weeks or months about data loss when those caveats are taken into consideration.

Yeah, this is something that makes me consider trying raid56 on it. Though I don't have enough drives to dump my current data onto while re-making the array :D (perhaps this can be changed on the fly?)


What's your starting array look like? If you're already on Btrfs then I recall you could do something like `btrfs balance -d raid6 -m raid1c3 /`

https://btrfs.readthedocs.io/en/latest/Balance.html


Yeah I'm on btrfs raid 1 currently, with 1x1TB + 2x3TB + 2x4TB drives. Gotta love btrfs's flexibility regarding drive size :D

I'll have a look, thanks! I guess if that fails, it will make me test my backup strategy, which I have never actually tested before.


Out of curiosity, how much total storage do you get with that drive configuration? I've never tried "bundle of disks" mode with any file system because it's difficult to reason about how much disk space you end up with and what guarantees you have (although raid 1 should be straightforward, I suppose).


I get half of the raw capacity ((1 + 3 + 3 + 4 + 4) / 2), so 7.5TB. Well, a bit less due to metadata: 7.3TB as reported by df (6.9TiB).

For btrfs specifically there is an online calculator [1] that shows you the effective capacity for any arbitrary configuration. I use it whenever I add a drive to check whether it’s actually useful.

1: https://carfax.org.uk/btrfs-usage/?c=2&slo=1&shi=1&p=0&dg=1&...


Just want to follow up with a correction: the command to convert data to RAID 6 and metadata to RAID 1c3 in Btrfs is `btrfs balance start -dconvert=raid6 -mconvert=raid1c3 /`, not what I originally posted.
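
If anyone tries this: the conversion rewrites every chunk and can take many hours on a full array. Assuming a reasonably recent btrfs-progs, its progress can be checked from another shell with:

    $ sudo btrfs balance status /

`btrfs balance pause /` and `btrfs balance resume /` also exist if the machine is needed for something else mid-conversion.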


> It's funny because Facebook uses btrfs for their systems & doesn't have these issues.

they likely have a distributed layer on top which takes care of data corruption and losses happening on any specific server


fs corruption due to power loss happens on ext4 because the default settings only journal metadata, for performance. I guess this is fine if everything is on batteries all the time, but it's intolerable on systems without a battery.


The FS itself should not get corrupted, only the contents of files that were being written around the time of the power loss. Risking file contents but not the FS structure is a tradeoff between performance and safety where you only get half of each. You can set it to full-performance or full-safety mode if you prefer.
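
Concretely, that knob is the data= mount option on ext4: ordered is the default, journal also journals file contents (safest, slowest), and writeback drops even the ordering guarantee (fastest). The device and mountpoint below are just placeholders:

    # /etc/fstab, pick one line
    /dev/sda2  /home  ext4  defaults,data=ordered    0 2   # default: data written out before the metadata that references it
    /dev/sda2  /home  ext4  defaults,data=journal    0 2   # full data journaling: safest, slowest
    /dev/sda2  /home  ext4  defaults,data=writeback  0 2   # no ordering: fastest, files may contain stale blocks after a crash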


True, this is file corruption rather than filesystem corruption.


>It's better to fight to keep the damned OpenZFS modules up to date and get an actual _reliable_ system

Try CachyOS (or at least its ZFS kernel); it has excellent ZFS integration.


This. I may still give up on running ZFS on Linux due to the common (seemingly intentional on the Linux side) breakage, but switching my existing systems over to the CachyOS repos has been a blessed relief.


Well, i mainly use FreeBSD, but have been using CachyOS for about 3 months to get a systemd refresher :)


Hadn't heard of CachyOS before, looks very nice! Was looking to move to Arch from KDE Neon, but this might be a much better fit.


Well, or don't move from Arch and just use the cachyos repos:

https://wiki.cachyos.org/de/cachyos_repositories/how_to_add_...

No reinstall needed ;)


Fair point. Currently running KDE Neon though, which is Ubuntu based, so a reinstall is needed...



