> ZFS seems more oriented towards pools, so I’m trying to find a good justification for using it on a standalone drive.
Even a single disk ZFS pool is still a pool. There is always a pool with ZFS, that’s the fundamental building block.
> Are there any advantages to using ZFS on a single drive compared to btrfs?
- better compression (btrfs compression doesn’t work on extents smaller than 128KiB, which excludes the majority of potentially-compressible data on MANY systems)
- better replication (btrfs replication is still rather buggy, and introduces latency spikes many orders of magnitude higher than ZFS replication does)
- better snapshots (issues of layout aside, both taking and destroying btrfs snapshots introduces latency spikes orders of magnitude higher than taking or destroying ZFS snapshots does)
- stronger checksums (btrfs supports only CRC32C on kernels before 5.5; on 5.5 and later it still defaults to CRC32C, but does at least support xxhash and blake2b. ZFS defaults to fletcher4, with a collision probability of roughly 1 in 2^77 vs CRC32C’s 1 in 2^32)
- the ability to replicate to large, multi-drive ZFS servers (remember: btrfs RAID, unlike single-disk btrfs, is NOT very safe and NOT recommended)
- frankly better organization. btrfs encourages outright chaos when it comes to subvolume layouts, which gets extremely confusing when you try to figure out a system you set up months or years ago and mostly have just let run as-is. You ALWAYS know where your snapshots, datasets, zvols, etc are with ZFS… on btrfs, they could be anywhere, and you really have to go hunting for them.
- easy, safe upgrading to a redundant-disk topology later if you want it: you can just `zpool attach poolname olddisk newdisk` to upgrade from a single disk to a mirror. (btrfs does allow similar reshaping, and in fact more comprehensive reshaping, but the end result is an unsafe btrfs multiple-disk array, which kinda defeats the point.)
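To make that last point concrete, here’s a minimal sketch of the single-disk-to-mirror path. The pool name and device paths are hypothetical; substitute your own (and use /dev/disk/by-id paths, not bare /dev/sdX, so devices don’t shuffle on reboot):

```shell
# Hypothetical pool name ("tank") and device paths; adjust for your system.
# Create a single-disk pool, with 4KiB-sector alignment and compression on:
zpool create -o ashift=12 tank /dev/disk/by-id/ata-OLD_DISK
zfs set compression=lz4 tank

# Later, attach a second disk to turn that single disk into a two-way
# mirror in place -- no data migration, no downtime:
zpool attach tank /dev/disk/by-id/ata-OLD_DISK /dev/disk/by-id/ata-NEW_DISK

# Watch the resilver run until the new disk is fully in sync:
zpool status tank
```

Note that `zpool attach` (mirroring an existing vdev) is different from `zpool add` (adding a new, non-redundant vdev) — mixing those up is a classic way to end up with a topology you didn’t want.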
I will note ahead of time that if you ask a btrfs fan about this list of drawbacks, they’ll likely complain mightily about most of them and claim they aren’t true. They will also point out that I’m “the world’s biggest ZFS stan” (no argument, lol) and question my biases, pointing out things like my career having benefited from writing sanoid/syncoid (which I originally intended to support both ZFS AND btrfs, but backed out of later because of btrfs’ chaotic, unpredictable layout… and btrfs’ tendency to eat data DESPITE snapshots, replication, etc, which I didn’t want to be left holding the bag for in my own tools).
If you have any questions about this list of contrasts, I’ll be happy to answer them. But ultimately, you’re probably going to have to decide what sources you trust, and decide accordingly.
Here are a couple of charts demonstrating the latency issues I referred to. In these tests, we’re comparing a btrfs-raid1 (remember, this is NOT the same as traditional raid1) across eight 12TB Ironwolf drives to a ZFS pool of mirrors using the same drives, in the same system. During the tests, snapshots are regularly taken and destroyed, and the pool/array is replicated regularly to another system WHILE the fio tests are going on. The fio test itself is rate-limited to 8.0MiB/sec read and 23.0MiB/sec write, which is both well within the capacity of the drives and a completely reasonable sustained load for nearly anybody’s system during active periods.
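I’m not reproducing the exact job file from those tests here, but for anyone who wants to run a similar experiment, a rate-limited mixed workload like that can be expressed with fio’s `rate=` option (which takes separate read,write caps). All names and sizes below are placeholder assumptions, not the original test parameters:

```shell
# Sketch only -- not the original job file. "mixedload" and the
# size/runtime values are arbitrary placeholders.
# rate=8m,23m caps reads at 8 MiB/sec and writes at 23 MiB/sec.
fio --name=mixedload \
    --directory=/tank/fiotest \
    --rw=randrw \
    --bs=4k \
    --size=4g \
    --rate=8m,23m \
    --time_based --runtime=600 \
    --group_reporting
```

The interesting numbers in the output are the latency percentiles, not the throughput (which is pinned by the rate limit by design) — that’s exactly why a rate-limited test exposes the snapshot/replication latency spikes so clearly.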
As you can see, during this test (which aims to model the workflow of one of my own production servers, which take and destroy automated snapshots regularly, and replicate to backup hosts regularly as well) btrfs write latency is often two orders of magnitude higher than ZFS… and read latency is in some cases THREE orders (1,000x) higher.
Equally importantly, note that the worst deltas aren’t at the extreme best-case or extreme worst-case ends of the chart… they’re dead in the middle, where you will experience the difference FREQUENTLY.
I do want to stress once more that these tests include automated snapshot creation, destruction, and replication. If your plan for your single-drive system involves none of those, these charts aren’t particularly relevant for you; ZFS still enjoys definite latency curve advantages over btrfs without snapshots and replication, but they aren’t anywhere near this dramatic.
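For reference, the snapshot-and-replicate pattern those tests model looks roughly like the following. This is a bare-bones sketch with hypothetical pool, dataset, and host names; in practice tools like sanoid/syncoid automate the scheduling, naming, and pruning:

```shell
# Hypothetical names throughout ("tank/data", "backuphost", "backpool").
# Take a timestamped snapshot:
zfs snapshot tank/data@snap-$(date +%Y%m%d-%H%M)

# Replicate incrementally to another machine: send everything between
# the last snapshot the backup host already has and the new one.
# "@snap-LAST" is a placeholder for that previously-replicated snapshot.
zfs send -I tank/data@snap-LAST tank/data@snap-$(date +%Y%m%d-%H%M) | \
    ssh backuphost zfs receive backpool/data

# Destroy snapshots that have aged out of retention:
zfs destroy tank/data@snap-OLDEST
</imports_placeholder>
```

It’s the `zfs snapshot`/`zfs destroy` churn plus the concurrent `zfs send` stream that the fio latency charts above are measuring the impact of.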
However, I will also note that the differences you see here are not just dry academic stuff. I first experienced these latency issues the hard way, when attempting to move from ZFS to btrfs at one production site on a trial basis. For the roughly one year that site ran a btrfs stack, it was a never-ending thorn in my side, due to exactly these latency issues frequently making the system almost unusable for its end users (it was a VM host). After nearly a year in trial prod, the whole thing grenaded itself, and when I restored from backup, I wiped the systems and went back to ZFS. All the issues that had been making that one site take more of my time than the entire rest of my fleet disappeared immediately: the exact same hardware served flawlessly for something like five years after shifting it back to ZFS.