Hi,
I have found myself wondering what the best practice is for creating a disaster recovery backup of a zroot pool.
Can the built-in zfs tools back up an entire pool well enough to recover from a hardware failure, or should I consider tools that create a full disk copy instead, so the whole device can be restored?
I suspect the answer is “it depends”: if the entire drive is one pool, I’m guessing it should be possible to snapshot the pool and send it to a remote location?
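From what I’ve read, the built-in route would look something like this (untested on my end, and the pool/host names are just placeholders):

# Recursive snapshot of every dataset in the pool, then a full
# replication stream to a remote box. zroot, backuphost and
# backuppool/laptop are placeholders. -u keeps the received
# datasets from being mounted on the remote side.
zfs snapshot -r zroot@migrate
zfs send -R zroot@migrate | ssh backuphost "zfs receive -u backuppool/laptop"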
In my particular case I have a laptop with Ubuntu’s flavor of ZFS, zsys and the whole shebang, that won’t boot. zsys can’t perform a rollback to a previous good state (or I’m not familiar enough with zsys), and frankly I don’t give a damn at this point; I just want to get rid of it and reinstall the machine with ZBM instead.
Since my current setup isn’t “just” a zpool but also has an ESP partition, multiple pools (as per Ubuntu’s default installation) and a heap of zsys snapshots and states, I think a full disk clone to a remote host might be a feasible way to replicate the data, just in case I forgot something on the rpool. I have successfully booted the system from a thumb drive and rsync’ed my home directory, but the “what if” nags me a bit.
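Booted from the thumb drive, I imagine the crude version of that clone would be something like this (device and host names are placeholders, and nothing on the disk can be mounted while it runs):

# Raw image of the whole disk (ESP and all pools), compressed in
# transit and stored as a file on the remote host.
dd if=/dev/nvme0n1 bs=1M status=progress | zstd | ssh backuphost "cat > laptop.img.zst"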
I found an alternative to Clonezilla a while ago that seemed to be a better choice, but I’m not sure if it was Mondo Rescue, Relax-and-Recover or something else. Any suggestions would be much appreciated!
In my own root-on-zfs setup (ZFS-root) I specifically set up a clone of the root dataset. That becomes a bootable option in ZBM as a last-ditch recourse for recovering a borked system. It’s pretty simple …
# Optionally create a clone of the new system as a rescue dataset.
# This will show up in zfsbootmenu as a bootable dataset just in
# case the main dataset gets corrupted during an update or something.
# As the system is upgraded, the clone should periodically be replaced
# with a clone of a newer snapshot.
if [ "${RESCUE}" = "y" ]; then
zfs clone ${POOLNAME}/ROOT/${SUITE}@base_install ${POOLNAME}/ROOT/${SUITE}_rescue_base
zfs set canmount=noauto ${POOLNAME}/ROOT/${SUITE}_rescue_base
zfs set mountpoint=/ ${POOLNAME}/ROOT/${SUITE}_rescue_base
fi
In regular use one would re-create the rescue clone from a newer snapshot every so often.
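Refreshing it is just a destroy/snapshot/clone cycle, something like this (same variables as the script above, and the snapshot name is arbitrary):

# Drop the old rescue clone and rebuild it from a fresh snapshot
zfs destroy ${POOLNAME}/ROOT/${SUITE}_rescue_base
zfs snapshot ${POOLNAME}/ROOT/${SUITE}@rescue_$(date +%F)
zfs clone ${POOLNAME}/ROOT/${SUITE}@rescue_$(date +%F) ${POOLNAME}/ROOT/${SUITE}_rescue_base
zfs set canmount=noauto ${POOLNAME}/ROOT/${SUITE}_rescue_base
zfs set mountpoint=/ ${POOLNAME}/ROOT/${SUITE}_rescue_base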
Additionally, I set up a pre-snap hook for apt to force a snapshot of the root dataset before any apt install blah is done. Just in case something goes wrong, you can roll back to that snapshot.
# Set apt/dpkg to automagically snap the system datasets on install/remove
cat > /etc/apt/apt.conf.d/30pre-snap <<-EOF
# Snapshot main dataset before installing or removing packages
# We use a DATE variable to ensure all snaps have SAME date
# Use df to find root dataset
Dpkg::Pre-Invoke { "export DATE=\$(/usr/bin/date +%F-%H%M%S) ; ${ZFSLOCATION} snap \$(/usr/bin/df | /usr/bin/grep -E '/\$' | /usr/bin/cut -d' ' -f1)@apt_\${DATE}"; };
EOF
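If an install does go sideways, getting back is straightforward; the dataset and snapshot names below are just examples:

# Find the pre-apt snapshot, then roll back to it. Note that
# rollback -r destroys any snapshots newer than the target.
zfs list -t snapshot -o name,creation rpool/ROOT/noble
zfs rollback -r rpool/ROOT/noble@apt_2024-05-01-120000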
Finally, pretty much every system has more than one drive, even laptops. For laptops with only a single NVMe M.2 slot I’ll create a separate partition holding a dedicated backup pool. I’ve grown fond of zrepl for ZFS snapshot/replication … set that up to replicate the root dataset to the backup pool, triggered by time or whatever. That way you have a fully bootable dataset in a non-everyday-use location that ZBM can boot from.
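zrepl’s config is its own topic, but the moral equivalent done by hand looks roughly like this; the partition, pool and dataset names are all assumptions:

# One-time: a backup pool in the spare partition
zpool create -o ashift=12 -O mountpoint=none bpool /dev/nvme0n1p4
zfs create -o canmount=off bpool/ROOT

# First run: full send of a root-dataset snapshot
zfs snapshot zroot/ROOT/noble@repl1
zfs send zroot/ROOT/noble@repl1 | zfs receive -u bpool/ROOT/noble
zfs set canmount=noauto bpool/ROOT/noble   # bootable by ZBM, never auto-mounted
zfs set mountpoint=/ bpool/ROOT/noble

# Later runs: incremental from the previous snapshot
zfs snapshot zroot/ROOT/noble@repl2
zfs send -i @repl1 zroot/ROOT/noble@repl2 | zfs receive -u bpool/ROOT/noble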
Of course off-system replication is best … zrepl (or zfs-auto-snapshot/zfs-autobackup) can push out to backup systems. My own home dataset on every system is not only snapped on a 15-minute cadence and replicated locally to the backup drive/pool, but also pushed out to several backup boxes.
I haven’t yet had to recover a root dataset. Things are simple enough to just rebuild and then replicate home and whatever other datasets are critical back into place. If one had to, I think the process would be along the lines of this (rough commands sketched after the list):
Boot a rescue USB stick (probably ZBM)
Partition the new drive: EFI and ZFS partitions
Format and set up the EFI partition to load ZBM; maybe copy it from the USB stick? The stick could carry a complete copy of /boot/efi ready to copy over
Create a zroot pool in the ZFS partition
Replicate the backed-up root/home datasets into the new zroot pool
When ZBM boots the restored dataset, it should mount /boot/efi as normal
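Very roughly, with every device name, dataset name and EFI path below being an assumption rather than gospel:

# 1. Partition: 512M ESP plus the rest for ZFS
sgdisk -n1:1M:+512M -t1:EF00 -n2:0:0 -t2:BF00 /dev/nvme0n1
# 2. Format the ESP, stage ZBM on it, register a boot entry
mkfs.vfat -F32 /dev/nvme0n1p1
mkdir -p /mnt/efi && mount /dev/nvme0n1p1 /mnt/efi
cp -r /path/from/usb/EFI /mnt/efi/    # the pre-staged copy of /boot/efi
efibootmgr -c -d /dev/nvme0n1 -p 1 -L "ZFSBootMenu" -l '\EFI\ZBM\VMLINUZ.EFI'
# 3. New pool, then pull the datasets back from the backup box
zpool create -o ashift=12 -O mountpoint=none zroot /dev/nvme0n1p2
ssh backuphost "zfs send -R backuppool/laptop/ROOT@last" | zfs receive -u zroot/ROOT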