Accidental sudo zfs destroy -r

The details of why I executed sudo zfs destroy -r schwaemm/vm are unimportant, but after immediately realising my mistake and ctrl-c'ing a lot, I ended up with some of the child datasets gone.

I figured it was no big deal: I have a backup that syncoid pulls every 15 mins, so I'd just syncoid it back from the backup and all would be good.
prod = lorien
backup = proxima
Here’s the pull command that runs every 15 mins on backup: /usr/local/sbin/syncoid -r --skip-parent --sendoptions=w --no-sync-snap --no-privilege-elevation lorien:schwaemm tank/backup/schwaemm (I have some encrypted data on schwaemm so I just set ‘w’ for everything)

This started well enough. I ran syncoid in the reverse direction to restore the backup to prod, and here's how that went:

NEWEST SNAPSHOT: autosnap_2025-03-08_20:00:16_hourly
INFO: Sending oldest full snapshot tank/backup/schwaemm/vm/win11@autosnap_2025-01-01_00:00:11_monthly (~ 139.5 GB) to new target filesystem:
 139GiB 0:49:30 [48.2MiB/s] [====================================================================>] 100%
cannot mount 'schwaemm/vm/win11': Insufficient privileges
INFO: Updating new target filesystem with incremental tank/backup/schwaemm/vm/win11@autosnap_2025-01-01_00:00:11_monthly ... autosnap_2025-03-08_20:00:16_hourly (~ 114.9 GB):
cannot hold: permission denied
cannot send 'tank/backup/schwaemm/vm/win11': permission denied
 624 B 0:00:00 [2.01KiB/s] [>                                                                      ]  0%
cannot receive: failed to read from stream
CRITICAL ERROR: ssh      -S /tmp/syncoid-proxima-1741466254-2463 proxima ' zfs send  -I '"'"'tank/backup/schwaemm/vm/win11'"'"'@'"'"'autosnap_2025-01-01_00:00:11_monthly'"'"' '"'"'tank/backup/schwaemm/vm/win11'"'"'@'"'"'autosnap_2025-03-08_20:00:16_hourly'"'"' | lzop  | mbuffer  -q -s 128k -m 16M' | mbuffer  -q -s 128k -m 16M | lzop -dfc | pv -p -t -e -r -b -s 123409717208 |  zfs receive  -s -F 'schwaemm/vm/win11' failed: 256 at /usr/local/sbin/syncoid line 585.

I'm fine with the above error; I just needed to zfs allow the user the hold permission on schwaemm. However, this is when I noticed that my backup no longer had all the snapshots, and the latest data it held was dated mid-December.
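For reference, granting that is a one-liner, something like the following (the username is a placeholder, and the full permission set syncoid needs for unprivileged operation depends on your setup):

sudo zfs allow -u backupuser hold schwaemm
zfs allow schwaemm    # show what is currently delegated on the dataset

Here's what the backup actually held at that point: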

zfs list -t snapshot tank/backup/schwaemm/vm/win11
NAME                                                                 USED  AVAIL  REFER  MOUNTPOINT
tank/backup/schwaemm/vm/win11@autosnap_2025-01-01_00:00:11_monthly     0B      -   127G  -
tank/backup/schwaemm/vm/win11@autosnap_2025-03-08_21:30:02_monthly     0B      -   127G  -
tank/backup/schwaemm/vm/win11@autosnap_2025-03-08_21:30:02_daily       0B      -   127G  -
tank/backup/schwaemm/vm/win11@autosnap_2025-03-08_21:30:02_hourly      0B      -   127G  -

sudo ls -la /tank/backup/schwaemm/vm/win11/images/410/
total 133023603
drwxr----- 2 root root            5 May 18  2024 .
drwxr-xr-x 3 root root            3 May 18  2024 ..
-rw-r----- 1 root root 274920112128 Dec 14 17:26 vm-410-disk-0.qcow2
-rw-r----- 1 root root       917504 Dec 14 17:02 vm-410-disk-1.qcow2
-rw-r----- 1 root root      4194304 Dec 15 17:48 vm-410-disk-2.raw

I realised at this point that I probably should have stopped sanoid on prod and the syncoid pull service on my backup server, but I cannot quite understand why there would be an issue even if one syncoid was pulling backup → production while another was pulling production → backup. Production would always have been older than backup, wouldn't it? Unless sanoid created a new snapshot and that then wiped my backup on the next pull.

My questions:

  1. For the datasets that remained after ctrl+c, are they OK? They appear to have all the expected snapshots and the VMs are still running.
  2. What exactly happened here? Why was my backup overwritten?
  3. Obviously apart from being more careful with what I execute, is there a safer way to restore datasets?
  1. Yes, if you can see the snapshots, everything is fine with them. You can scrub to confirm if you’re feeling extra nervous about that.

  2. As soon as the next snapshot was taken locally, there was new data to replicate to backup. Replication patches from the most recent common snapshot, so you lost every snapshot more recent than that common snapshot during the zfs receive process on the target (see the sketch after this list).

  3. Consider replicating less frequently. This gives you a little more time to catch your errors before they replicate out. Next, realize you need to STOP your automated replication immediately if you’ve accidentally destroyed a bunch of snapshots (or rolled back to a very old snapshot) and you need to preserve them on your backup target.
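To make item 2 concrete, the sequence after the accident looked roughly like this. Snapshot names are placeholders and syncoid's lzop/mbuffer/pv plumbing is omitted, but the -F on the receive is the same one visible in the error output above:

# prod and backup still share @common; sanoid then creates @new on prod,
# and the next 15-minute pull on the backup box does roughly:
ssh lorien zfs send -w -I schwaemm/vm/win11@common schwaemm/vm/win11@new \
  | zfs receive -F tank/backup/schwaemm/vm/win11
# receive -F rolls the target back to the common snapshot, and (as described
# above) target-side snapshots newer than @common are discarded in the process.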

The funny thing is, if the entire zfs destroy had been allowed to complete, AND had finished prior to the next replication beginning, you wouldn't have lost data on your backup target, because there would not have been a common snapshot to base replication on.

One last note: you can place a zfs hold on snapshots to prevent them from being destroyed. If you had a hold on one of your newer snapshots on the target, replication would simply have failed rather than succeeding at the cost of destroying those snapshots.
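Concretely, that looks like this (the tag and snapshot names here are placeholders):

zfs hold -r keepme tank/backup/schwaemm/vm/win11@some_snapshot      # place a hold, recursing into child datasets' snapshots
zfs holds tank/backup/schwaemm/vm/win11@some_snapshot               # list holds on a snapshot
zfs release -r keepme tank/backup/schwaemm/vm/win11@some_snapshot   # remove the hold when you no longer need it

Any zfs destroy of a held snapshot fails with "dataset is busy" until every hold on it has been released.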

You could consider something like automatically placing a hold on every snapshot on the target and then automatically releasing that hold once the snapshot is twenty-four hours old. But while that would have saved you here, it might just as easily mess you up if, for example, you deliberately rolled back a dataset on the source. You'd then stop getting backups until you remembered to manually remove the holds on the target!
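A rough sketch of what that automation could look like on the target, run from cron or a timer; the tag, dataset, and 24-hour window are all assumptions:

#!/bin/sh
# Sketch: hold snapshots on the backup target for their first 24 hours, then release.
TAG=fresh
DATASET=tank/backup/schwaemm
NOW=$(date +%s)

zfs list -Hpr -t snapshot -o name,creation "$DATASET" | while read -r SNAP CREATED; do
    if [ $(( NOW - CREATED )) -lt 86400 ]; then
        # hold young snapshots; ignore the error if the hold already exists
        zfs hold "$TAG" "$SNAP" 2>/dev/null || true
    else
        # release once the snapshot is more than a day old
        zfs release "$TAG" "$SNAP" 2>/dev/null || true
    fi
done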

Thanks for the explanation. I'm a little concerned by how easy it was to wipe the backup, and I'll look into holds.

Is this what I should have done?

  1. Stop the backup replication and stop Sanoid
  2. Restore from backup using Syncoid
  3. Turn backup replication and Sanoid back on

I'm hoping the above (roughly sketched below) would have undone my destroy, and the only effect would have been no new snapshots taken between steps 1 and 3.
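Something like this, in command form; the unit names and the exact syncoid flags are assumptions about my setup, not gospel:

# 1. pause the automation on both ends
sudo systemctl stop sanoid.timer          # on lorien
sudo systemctl stop syncoid-pull.timer    # on proxima, or however the pull job is scheduled

# 2. pull the data back from the backup (plus whatever flags the datasets need,
#    e.g. --sendoptions=w for the raw encrypted ones)
/usr/local/sbin/syncoid -r --no-sync-snap proxima:tank/backup/schwaemm/vm schwaemm/vm

# 3. once everything checks out, resume
sudo systemctl start syncoid-pull.timer
sudo systemctl start sanoid.timer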

Not sure if this is correct, but when I was using checkpoints I had trouble removing datasets. The documentation does not say that checkpoints prevent dataset destruction, so maybe I experienced a bug - the datasets were unmounted. I do not often delete datasets, so I may test this tonight.

And yes, holds are awesome; I need to use them more, especially on my backup pool, where my snapshots persist, while on my NAS they drop off. (Yes, I use bookmarks to replicate datasets, thanks to the podcast.)
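For anyone curious, the bookmark flow is roughly this (dataset and snapshot names are placeholders):

zfs bookmark tank/data@snap1 tank/data#snap1     # bookmark the last snapshot that was replicated
zfs destroy tank/data@snap1                      # the snapshot itself can now age out locally
zfs send -i tank/data#snap1 tank/data@snap2 | ssh backup zfs receive pool/backup/data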

To create a checkpoint:

zpool checkpoint tank

And to discard the last checkpoint:

zpool checkpoint -d tank

You can only have one checkpoint at a time, though, which is why you need to discard the old one before creating a new one.
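And if you ever need to actually use one, the rewind happens at import time, something like this (pool name just matches the example above):

zpool export tank
zpool import --rewind-to-checkpoint tank

Note that rewinding discards everything written to the pool since the checkpoint was taken.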

zpool-checkpoint.8 — OpenZFS documentation