ZFS Replication Compression options?

I’ll admit it. I have painted myself into a corner.

I’m currently using a Hetzner “Auction server” with two 4 TB disks hanging off of it as my backup destination, and it works great.

But I expect my source datasets to grow through the years. That’s the whole reason I switched to a “real” NAS in the first place: my 4 TB NAS was constantly filling up.

So my Hetzner server has ~500GB free, and I am on borrowed time.

With my old Synology, their “Hyper Backup” tool would compress the source data down dramatically on the destination: the ~4 TB on the source compressed down to less than 1 TB in Backblaze B2, if memory serves.

Are there any magic levers I can pull to achieve the same thing with ZFS? I already set the compression on the backup zpool to zstd.

root@rescue /backuppool # zpool list -v
NAME         SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
backuppool  3.62T  3.07T   569G        -         -     6%    84%  1.00x    ONLINE  -
  mirror-0  3.62T  3.07T   569G        -         -     6%  84.7%      -    ONLINE
    sda     3.64T      -      -        -         -      -      -      -    ONLINE
    sdb     3.64T      -      -        -         -      -      -      -    ONLINE
root@rescue /backuppool #

zfs get compressratio for the dataset(s) in question, please?

root@rescue ~ # zfs get compressratio backuppool
NAME        PROPERTY       VALUE  SOURCE
backuppool  compressratio  1.01x  -

Thanks for the response!

Yep, that’s what I expected. That’s not compressed at all. Compression only takes effect when blocks are written, essentially.

Basically the answer here is… not something you’re gonna like. You need to destroy the existing backup, then zfs create backuppool/backups ; zfs set compression=zstd backuppool/backups ; and THEN you can perform a new full replication from source to target, eg syncoid -r root@source:sourcepool backuppool/backups/sourcepool.

This way, the newly created (by inbound replication) datasets will inherit the compression setting on the backuppool/backups dataset, and you’ll get the compression you’re looking for.
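Put together, that sequence looks something like the following dry-run sketch. The dataset names follow this thread, and every command is prefixed with `echo` so nothing destructive actually runs; drop the `echo`s once you’re sure, and remember the destroy step is irreversible.

```shell
# Dry-run sketch of the destroy-and-restart approach. Names
# (backuppool, backups, sourcepool) are from this thread; adjust to yours.
# Each command is echoed, not executed, so this is safe to run as-is.
echo zfs destroy -r backuppool/sourcepool          # drop the old, uncompressed backup
echo zfs create backuppool/backups                 # new parent dataset
echo zfs set compression=zstd backuppool/backups   # children will inherit this
echo syncoid -r root@source:sourcepool backuppool/backups/sourcepool
```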

If all you wanted to do was compress the data where it lies, you’d have some other options, but they’d break the replication chain. Starting over with a full replication targeted at the child of a dataset with compression already enabled is the only way to get the data compressed without the next replication wiping out the change.

Alternately, if you’ve got an older common snapshot between source and target, you can create your new dataset with compression enabled, do local replication from the current datasets to children of the new backups dataset, then destroy the original and retarget your normal replication when you’re done.

But the ONLY way to make that work is with replication–not individual copies–and you said you’re already very close to your storage capacity limit, so you don’t have many options.
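As a dry-run sketch of that local-replication shuffle: the snapshot names @common and @latest and the dataset layout here are hypothetical stand-ins, and every line is echoed rather than executed.

```shell
# Dry-run sketch of the common-snapshot local-replication variant.
# @common and @latest are hypothetical snapshot names; "echo" keeps
# every step non-destructive.
echo zfs create -o compression=zstd backuppool/backups
# full local replication starting from the oldest snapshot you need...
echo 'zfs send backuppool/sourcepool@common | zfs receive backuppool/backups/sourcepool'
# ...then an incremental stream bringing it up to the newest snapshot
echo 'zfs send -I @common backuppool/sourcepool@latest | zfs receive backuppool/backups/sourcepool'
# only after verifying the new copy: destroy the original, retarget syncoid
echo zfs destroy -r backuppool/sourcepool
```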

Another option to avoid breaking the replication chain: zpool detach one drive from your mirror vdev, create a new pool using that drive, and use that as a mule to get things re-jiggered locally on the target. Once you’re done, verify everything is correct, verify it again, then destroy the old pool and zpool attach its drive to the singleton disk in the new pool, and Bob will be thy nuncle.
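A dry-run sketch of that split-the-mirror mule dance: the pool name mulepool and the sda/sdb device names are illustrative (in practice you’d want /dev/disk/by-id paths), and every command is echoed rather than run. Note that while one disk is detached you have no redundancy until the final attach resilvers.

```shell
# Dry-run sketch of the detach-one-mirror-disk "mule" approach.
# "mulepool", "@migrate", sda, and sdb are illustrative names;
# "echo" keeps every step non-destructive.
echo zpool detach backuppool sdb                  # old pool survives on sda alone
echo zpool create -O compression=zstd mulepool sdb
echo zfs snapshot -r backuppool@migrate           # send -R needs a snapshot
echo 'zfs send -R backuppool@migrate | zfs receive -d mulepool'
# verify, verify again, THEN:
echo zpool destroy backuppool
echo zpool attach mulepool sdb sda                # re-mirror onto the freed disk
```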


Nah, destroying and re-creating the backup dataset is just fine.

I was contemplating options like having to ditch zfs backups entirely and use an S3-like thing, so this is gentle by comparison. Thank you!

(When you try something for the first time you’re foolish if you think you’re gonna get it just perfect without iterating a bit!)


What about using zfs rewrite to add compression to an existing dataset:

zfs set compression=zstd tank/dataset
zfs rewrite -P tank/dataset

With this you could compress both source and destination. All snapshots stay intact, and subsequent send/receive will then carry the compressed data automatically.

Right. In other words, he would need to have enough space for both copies of his data, the old uncompressed copy and the new compressed copy, simultaneously.

Which he doesn’t. :upside_down_face:

He does not indeed!

I think Jim’s solution is the right answer here. I am not at all precious about my backup data, especially since the Hetzner server doesn’t seem to care about egress/ingress fees.

TBH I’m grateful that this is a thing. I rather thought I was SOL.

Adulting with your NAS is the gift that keeps on giving.