
So last I checked, btrfs was the way of the future according to Ted, but every time I see it discussed, it's Here Be Dragons galore. Is there some timeframe where btrfs will take over? Or at least be stable enough for a Debian or Red Hat to switch to it as a default?



SuSE/OpenSuSE has been using BTRFS by default for a while and it seems to work well enough. There's a default schedule of running 'btrfs balance' every week, seemingly based on when the OS was installed (it relies on the timestamp of a file that gets updated with every run), that makes the system(1) virtually unusable for about 15 minutes.

(1) I've only seen this on one machine, so maybe it's a quirk of that machine's workload. But it sure does suck when it's in the middle of a workday.
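
(For anyone curious what that scheduled maintenance roughly amounts to, a minimal sketch of an equivalent weekly job follows; the actual openSUSE implementation lives in its own maintenance scripts, and the usage filters and mount point here are only illustrative:)

    #!/bin/sh
    # /etc/cron.weekly/btrfs-balance (illustrative only)
    # Rewrite only chunks that are mostly empty; this keeps the weekly
    # balance much cheaper than a full, unfiltered one.
    /usr/bin/btrfs balance start -dusage=50 -musage=50 /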


On btrfs causing latency: I've got a few systems with btrfs as the rootfs (on top of lvm, on top of dm-crypt, on a SSD).

I recently started using `snapper` to create snapshots on a schedule on them. I enabled quota support in btrfs so I could see how much space snapshots were using.
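
(Roughly what that setup looks like, in case anyone wants to reproduce it; the mount point is illustrative:)

    # let snapper manage timeline snapshots of the root subvolume
    snapper -c root create-config /
    # enable quota groups so per-snapshot space usage becomes visible
    btrfs quota enable /
    # show how much space each subvolume/snapshot accounts for
    btrfs qgroup show /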

I noticed that filesystem wide latency tended to spike when removing snapshots (several minutes of all fs access stalling).

Balancing with quotas enabled is even worse: my systems were hung for multiple days, until I forcibly restarted them and disabled quotas. Then the fs hangs were much smaller (a few seconds) and not too noticeable. Balancing finished in something on the order of an hour.

While I had quotas enabled, I was constantly having btrfs tell me the data was bad and needed rescanning (rescanning quotas would also induce fs wide latency).

The thing is, ZFS has snapshot space usage info, and doesn't have awful latency (it also doesn't have a "balance" operation, but I'm not sure how relevant that is).
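
(On ZFS that accounting is built in, e.g. something like the following; the dataset name is made up:)

    # USEDSNAP is the space that would be freed by destroying all snapshots
    zfs list -o space tank/home
    # or list the snapshots themselves with their unique space usage
    zfs list -t snapshot -r tank/home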

Given my experience with both btrfs & ZFS, I'll likely consider using ZFS as my rootfs in the future.


I have no idea what the state of ZFS on Linux is, but I've been using it on FreeBSD for a while now and it's fantastic. Comparing FreeBSD to Linux is a bit of an apples/oranges thing though.


Yeah, that happens to me too. I believe the reason why the lag is so bad on openSUSE / SUSE [aside: note the spelling :P] is because we carry patches that make quotas actually apply correctly (but it increases the big-O complexity by quite a lot -- making balances way more expensive).


I've seen it, too.

I have three machines running openSUSE, and the one I've seen it on is both the least powerful one (Lenovo IdeaPad 100, some Atom chip, 2GB RAM) and the only one with a non-SSD drive.


We've been running in production with our small-scale setup for (three?) years now. Mostly problem-free. But...

We did have one incident recently with an Ubuntu 14.04 system, which had RAID-1 across 3 drives. Lost one physical drive, and thereby lost the entire btrfs filesystem. Running the btrfs fsck wasn't able to fix it. I likely should have run the latest btrfs-tools to try to fix it, instead of letting the default version that came with 14.04 try.
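
(For reference, the recovery path that is supposed to work with a current kernel and btrfs-progs looks roughly like this; the device names and devid are hypothetical:)

    # mount the surviving devices without the dead one
    mount -o degraded /dev/sdb1 /mnt
    # find the devid of the missing drive, then rebuild onto a replacement
    btrfs filesystem show /mnt
    btrfs replace start 2 /dev/sdd1 /mnt    # "2" = devid of the missing drive
    btrfs replace status /mnt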

Still, we're not planning on switching anytime soon. Been using btrfs send/receive for snapshot backups, which is awesome.
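
(For anyone who hasn't tried it, the send/receive flow is roughly this; snapshot names and the backup host are just placeholders:)

    # snapshots must be read-only to be sent
    btrfs subvolume snapshot -r /data /data/.snap/monday
    # full send to a backup machine
    btrfs send /data/.snap/monday | ssh backup "btrfs receive /backups/data"
    # later sends can be incremental against the previous snapshot
    btrfs subvolume snapshot -r /data /data/.snap/tuesday
    btrfs send -p /data/.snap/monday /data/.snap/tuesday | \
        ssh backup "btrfs receive /backups/data"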


I was running 14.04 with its "stock" kernel (can't remember what version, but it was old) and had stability problems after losing a RAID1 drive. Upgrading the kernel made a huge difference in stability, and I was able to recover most of the data.


Hardware RAID, fake RAID, dmraid, or btrfs RAID?


It was btrfs RAID-1.


Actually it's well known that BTRFS RAID-1 has problems if it degrades to a single hard disk. Perhaps it's related to this.


I'm running btrfs in production with a very heavy workload with millions of files and all sorts of different access patterns. Regular deduplication runs, too. We're probably one of the largest btrfs users.

Had a LOT of unplanned downtime due to various issues with older kernel versions, but 4.10+ has been solid so far. You definitely need operational tooling (monitoring, maintenance like balance) and a good understanding of the internals (what happens when you run out of metadata space, etc.).
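
(If it helps, the kind of checks and maintenance I mean look roughly like this; the mount point and usage thresholds are just examples:)

    # watch metadata usage: exhausting the metadata pool gives ENOSPC
    # even when plenty of data space is still free
    btrfs filesystem usage /srv/data
    # compact mostly-empty chunks so the allocator gets space back;
    # the usage filters keep this far cheaper than a full balance
    btrfs balance start -dusage=60 -musage=60 /srv/data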

Happy to answer questions!

On a related note: Never ever use the ext4 to btrfs conversion tool! It's horribly broken and causes issues weeks later.


> On a related note: Never ever use the ext4 to btrfs conversion tool! It's horribly broken and causes issues weeks later.

Care to give some details about this and other failures? Part of what makes a FS reputation is not just people telling "it works" but also stories about how the thing crashed and how they recovered from it. IOW, it always works, until it doesn't, and then it still "works" because I can dig myself out of the hole this or that way.

Inspired by the way you can convert a Debian VM to Arch Linux on Digital Ocean, I happen to have been toying with it recently to auto-convert a blank Debian 8.x VM from ext4 to btrfs. Looks like things are fine, but only because the kernel is <4.x and the VM has very little data on it since it's blank.

WARNING: This is a toy. Do not use for production.

https://github.com/lloeki/digitalocean-ext4-to-btrfs
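
(The conversion tool in question is btrfs-convert from btrfs-progs; for the curious, the basic flow is roughly the following, with an illustrative device name. Given the warning above, treat it strictly as a toy:)

    # run against an unmounted ext4 filesystem
    btrfs-convert /dev/vda1
    # the old ext4 metadata is preserved in an ext2_saved subvolume,
    # so the conversion can be rolled back until that subvolume is removed
    btrfs-convert -r /dev/vda1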


It resulted in random, hard to reproduce ENOSPC errors down the line without either data or metadata being anywhere close to full. Neither us nor the btrfs developers that took a look at it were able to figure out what exactly went wrong, but it was something about new blocks not fitting anywhere despite lots of free space.

Someone on #btrfs said that the filesystem layout is a lot different when using the conversion tool and all of the regression testing happens with regular filesystems, not converted ones.

We reinstalled all machines from scratch. Never happened again.


It's stable on stable hardware, and has been for some time. The multiple-device stuff has missing features, mainly related to error handling when a device starts to go crazy, and that flat out requires a sysadmin who understands all of that. I.e. Btrfs won't consider a block device unreliable and just ignore it; it'll keep retrying to read or write, while filling up the system log with all of the errors. When there's redundancy, it does fix up these problems automatically, but it can drown in its own noise if a device is producing an overwhelming amount of spurious data. And there's no notification system for this: but note there's no standard notification for this on Linux at all either. LVM and md/mdadm RAIDs do not share the same error handling or notification.
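
(The closest thing to monitoring that btrfs itself offers is per-device error counters you have to poll yourself, roughly like this; the mount point is illustrative:)

    # read/write/flush error counts plus corruption and generation errors
    # per device; non-zero values mean a disk needs attention
    btrfs device stats /mnt/pool
    # -c exits non-zero if any counter is non-zero, handy for cron/monitoring
    btrfs device stats -c /mnt/pool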

The main issue for whether it's the default filesystem is whether the distro has the resources to support it for their users. Mostly this is in terms of documentation, and understanding what sort of backports to support. Really the only distros doing that work are SUSE and maybe Oracle. I don't expect the more conservative distros to support it for some time, not until they feel they can depend on upstream's backporting alone.


> Or at least be stable enough for a Debian or Red Hat to switch to it as a default?

SUSE / openSUSE has had btrfs as the default filesystem for a few years (and we have a bunch of tools built around it adding features like boot-to-snapshot and auto-snapshot of upgrades). Personally I have had issues with it, but I've also messed around with btrfs subvolumes quite a lot (developing container runtime storage drivers) so it might be self-inflicted.
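
(For anyone who hasn't poked at subvolumes: they're cheap to create and snapshot, which is what the boot-to-snapshot and upgrade-snapshot tooling is built on; the paths below are just illustrative:)

    # a subvolume acts like a directory that can be snapshotted on its own
    btrfs subvolume create /var/lib/machines
    # snapshots are themselves subvolumes that share extents with the source
    btrfs subvolume snapshot / /.snapshots/pre-upgrade
    btrfs subvolume list /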


If you stick to the green features, you're probably good. https://btrfs.wiki.kernel.org/index.php/Status


It depends on what you do with BTRFS. We are using it on our small-scale servers where we have many small VMs (20~40 GiB) with BTRFS as root and transparent compression. It makes expanding a VM's disk easy, as we only need to attach a new virtual hard disk and add it to the BTRFS filesystem. Some of our VMs are on Hyper-V, which means stopping the VM to add a new hard disk or resize the virtual disk. Others are on a new server running Proxmox, which can attach a virtual hard disk without stopping the VM, so we can add extra disk space without any downtime. I only need to schedule a weekly rebalance of the BTRFS filesystem (Ubuntu Server 16.04 doesn't do this!), and I have a script that checks free space and available chunks on the FS to avoid any problems with a full BTRFS partition.
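
(Growing the filesystem that way is basically two commands; the device name and mount point below are just illustrative:)

    # attach the new virtual disk to the existing btrfs filesystem
    btrfs device add /dev/sdb /srv
    # optionally spread existing chunks across both devices
    btrfs balance start /srv
    # sanity check: per-device allocation and unallocated space
    btrfs filesystem usage /srv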

I remember having an issue with BTRFS two years ago when we got an unexpected power-down (the UPS didn't help us, as the short-circuit was downstream of the UPS!) where a Hyper-V VM with BTRFS had its FS corrupted, but we managed to recover the data. We also have a physical server running Jenkins and GitLab that uses fake RAID + BTRFS + btrbk to schedule backups using snapshots and btrfs send; the backups are stored as compressed files on a network folder (and that folder is backed up to magnetic tapes).

I didn't notice any slowdown when I launched a manual rebalance, but we are operating at low scale so we don't have a lot of I/O. Our real bottleneck is the databases, which are on Windows servers.


The TLDR version:

BTRFS for a 'simple' use case is fine.

Push redundancy to another layer for the time being, e.g. MDADM or hardware RAID.


I don't use it, but in practical terms, I think btrfs is safe to use, from what I've been reading. There are some corner cases where something might bork, and you'll be staring at a recovery job, but hopefully no data will be lost.

The main thing that I think is stopping widespread adoption is that none of the developers seems to want to come forth and say "Here's a stable, rock solid version. Go nuts.". There's always a caveat or a disclaimer. Use at your own risk and all that.

When even the developers are kinda jittery about it, it isn't exactly reassuring.


It still looks like the FS exhibits perf degradation (well, uh, worse than Linux already degrades under I/O load anyway) under not-that-untypical workloads. A few years ago it had the same problems with bog-standard workloads such as using it on / and installing or updating packages. Though, like I mentioned, Linux generally does not shine when it comes to I/O scheduling and system responsiveness. Just a couple days ago I made my whole workstation bog down by writing to the /home SSD at 400 MB/s avg (ext4). It's just not very good there, and it feels like the desktop software is getting worse at dealing with it (probably due to more I/O in more spots, like delayed loading of resources or history files that are read/written in the UI thread and stuff like that)... especially considering that we were all on spinning rust a few years ago, and now everyone uses SSDs with orders of magnitude more IOPS and at least 2-4 times the read/write speed.


Synology uses btrfs by default for a NAS product, and I find it hard to believe they'd pick that for a product whose explicit goal is reliable storage if it were full of dragons.

Once something gets a reputation for having issues that reputation tends to stick pretty much forever. I'm thinking maybe older versions of btrfs were problematic and the FUD has never gone away.


What other choice did they have if they deploy on Linux? Btrfs is the best Linux land has, and it's no ZFS.


Just use ZFS. It's been battle tested for the last 13 years in production, and was written by careful, thoughtful, intelligent engineers to solve a problem. BTRFS was written sloppily to solve the problem of "oh no, Sun did a thing good".





