Syncoid, multiple hosts and --no-sync-snap

Good afternoon,
I’m still wrestling with snapshots created and managed by syncoid. This was thrashed about a bit in another thread https://discourse.practicalzfs.com/t/keeping-a-minimum-number-of-snapshots/1326 but I thought it was wandering far enough afield to warrant its own thread. Specifically, I’m questioning my use of --no-sync-snap and how it prevents me from easily cleaning up snapshots.

It appears to me that --no-sync-snap will choose the oldest available snapshot whether created by sanoid or syncoid. In my case one of my configurations looks like

A -> C 
B -> C
C -> D (with no-sync-snap)

My scripts run these once daily, back to back and in that sequence. The result is that on D I have a lot of snapshots that match syncoid_C. For this reason, I cannot use the script pasted below to clean up the “foreign” snapshots on D. I’m wondering if it makes more sense for me to eliminate the --no-sync-snap option for the C -> D transfer. That will allow syncoid to manage the snapshots it creates and I can remove the other snapshots w/out causing any problems. (*)

Does this sound right, or am I misunderstanding something? If it matters, I have other jobs that transfer filesystems around, including C -> E and C -> B (to different pools, not circular.) I’m inclined to think I should not be using --no-sync-snap with any of these, to make cleanup simpler.
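For context, the three daily jobs described above might look roughly like this (all hostnames and dataset names here are placeholders I've invented, not taken from the actual setup):

```shell
# Illustrative only -- hosts A, B, D and the dataset names are placeholders.
# Run on C: pull from A and B; syncoid creates its usual sync snaps
# (named syncoid_C_<timestamp>, since C is the host running syncoid).
syncoid root@A:tank/data tank/backup/A
syncoid root@B:tank/data tank/backup/B

# Run on C: push to D with --no-sync-snap, so syncoid replicates up to
# the newest *existing* snapshot instead of creating one of its own --
# which is how the syncoid_C snaps end up accumulating on D.
syncoid --no-sync-snap tank/backup/A root@D:tank/backup/A
```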

The script I plan to use to clean things up (which is still “in testing”, so USE AT YOUR OWN PERIL!) is:

hbarta@olive:~/Programming/shell_scripts/syncoid$ cat purge-foreign-syncoid-snaps.sh
#!/bin/bash

# Purge foreign syncoid snapshots that result from pulling in
# snapshots created on remote hosts by other syncoid jobs.
# AKA What a tangled web we weave!
#
# Only valid snapshots include the string "syncoid_$(hostname)"
# 
Usage="Usage: purge-foreign-syncoid-snaps.sh pool [pool] ... [pool]"

if [ $# -eq 0 ]
then
    echo "$Usage"
    exit 1
fi

echo "=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ start  " "$(/bin/date +%Y-%m-%d-%H%M)"

for pool in "$@"
do
    echo "Checking $pool"
    for snap in $(/bin/zfs list -t snap -o name -H -r "$pool"|/bin/grep syncoid|/bin/grep -v "syncoid_$(hostname)")
    do
        echo "destroying $snap"
        /bin/zfs destroy "$snap"
    done
done

echo "=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ finish " "$(/bin/date +%Y-%m-%d-%H%M)"
echo
hbarta@olive:~/Programming/shell_scripts/syncoid$ 
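To show what the script's filter actually selects, here is a small standalone sketch that runs the same grep pipeline over a made-up snapshot list (the names and the hard-coded hostname "D" are illustrative only; the real script uses "$(hostname)" and live `zfs list` output):

```shell
#!/bin/bash
# Simulate the script's filter on host D with sample snapshot names.
host="D"
snaps='tank/data@syncoid_C_2024-06-01:00:00:00-GMT00:00
tank/data@syncoid_D_2024-06-01:00:00:00-GMT00:00
tank/data@autosnap_2024-06-01_00:00:00_daily'

# Same pipeline as the script: keep only syncoid snapshots, then drop
# the ones created for this host. Whatever remains would be destroyed.
printf '%s\n' "$snaps" | grep syncoid | grep -v "syncoid_${host}"
# -> tank/data@syncoid_C_2024-06-01:00:00:00-GMT00:00
```

Note that the sanoid-created autosnap is never touched; only "foreign" syncoid_C snapshots survive the filter and get destroyed.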

(*) D is my remote backup and a 5 hour drive away. If I mess it up, I’m carrying a multi-TB HDD on a road trip to restore it and in the mean time have no remote backups.


Sort of.

Syncoid is just an orchestration wrapper for OpenZFS replication, and therefore works (exactly) like it does. It replicates all available snapshots, unless you go out of your way to make it do something different.

The only purpose of the sync snap is to make certain that there is a snapshot to replicate, and to give users (especially newer or more casual ones) a guarantee that the replication chain won’t be broken, forcing the abandonment of a long-running incremental replication backup scheme and its replacement with a new full.

Doing a full backup instead of incremental might not sound like a very big deal to some folks, but when you’re backing up hundreds of TiB or maybe even a PiB or two… it matters. It matters a lot. :slight_smile:
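To make the broken-chain point concrete: incremental replication only works while source and target still share a common snapshot. A rough sketch with plain zfs commands (pool, dataset, and host names are made up):

```shell
# Incremental send: only the changes between @common and @newest cross
# the wire. Requires @common to exist on BOTH sides.
zfs send -i tank/data@common tank/data@newest | ssh D zfs recv tank/backup/data

# If the common snapshot has been destroyed on either side, the only
# option left is a full send of the entire dataset:
zfs send tank/data@newest | ssh D zfs recv -F tank/backup/data
```

The sync snap exists precisely so that destroying "extra" snapshots on either end never deletes the last shared one.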
