Set Syncoid maximum dataset size

aaron · August 14, 2023, 9:00pm

Hello all,

Thanks for setting this up, Jim.

I am trying to find a way to make a Syncoid run skip dataset syncs that exceed a certain size (e.g. 1GB). If I can do that, I would run Syncoid twice, once with that option on and once with it off, so that all my small syncs are completed first.

I have my personal ZFS server backing up to a Raspberry Pi with a ZFS external HDD in a friend’s house using Syncoid. He is hosting it as a favour, so I am considerate and set the Raspberry Pi in his house to pull backups from me only in the early hours of the morning and at a limited bandwidth. (I do the time limiting by starting it at 1am with a 6 hour timeout on my Syncoid script.)
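
A minimal sketch of that kind of time-boxed, rate-limited pull, assuming GNU `timeout` is available and that Syncoid's `--source-bwlimit` option is the limit being used. The host name, pool names, `2m` limit and the `nightly_pull` function are hypothetical placeholders, not values from the setup described above:

```shell
#!/bin/sh
# Hypothetical nightly pull script, e.g. started from cron at 1am:
#   0 1 * * * /usr/local/bin/nightly-pull.sh
SYNCOID="${SYNCOID:-/usr/local/bin/syncoid}"   # path to syncoid (assumed)
WINDOW="${WINDOW:-6h}"                         # stop the run after 6 hours
BWLIMIT="${BWLIMIT:-2m}"                       # assumed limit, passed through to syncoid

# Pull $1 (remote source) into $2 (local target) inside the time window.
nightly_pull() {
    timeout "$WINDOW" "$SYNCOID" --recursive --source-bwlimit="$BWLIMIT" "$1" "$2"
}

# Example (assumed names):
#   nightly_pull root@myserver:tank/data backuppool/data
```

When `timeout` stops the run at the end of the window, an interrupted transfer can be picked up again the following night by Syncoid's resume support.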

That means that if I make a lot of changes on my server in a day, it will take more than one day for those changes to copy over to the Raspberry Pi in his house.

It doesn’t actually matter to me much if a big change takes a few days to be backed up to him (yay for Syncoid’s resume functionality!), but there are two things that I am not too happy with:

1. Ideally I would like to know that everything that hasn’t had massive change (which is normally my most important data) is backed up as normal every day.
2. I have Prometheus-style alerts set up on my Sanoid deployments (using this). If a dataset takes longer than a night to synchronise, the alerts for every dataset that would have been synchronised after the big one start to fire. My preference would be to only receive alerts about the big ones that could not sync in a night, as I would know at a glance why and could silence them for the period of time I expected the sync to take.

Looking at the Syncoid code, I suspect I could add a new option, --skip-larger-than or something, compare its value to the calculated size for the snapshot send around Line 522, and skip the dataset if the calculated send size exceeded the specified size.

Before I did that, I thought I would see if there was an obvious better approach or if other people would find such a thing useful.
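
For illustration, the size check could be sketched outside Syncoid roughly as follows. This is not Syncoid's code and `--skip-larger-than` does not exist; the function names and snapshot names are hypothetical. It assumes `zfs send -nvP` prints a final `size <bytes>` line with its estimate:

```shell
#!/bin/sh
# Hypothetical helpers; none of these names exist in Syncoid itself.

# Read `zfs send -nvP` output on stdin and print the estimated byte count
# from its final "size <bytes>" line (prints nothing if no such line).
parse_send_size() {
    awk '$1 == "size" { s = $2 } END { if (s != "") print s }'
}

# Estimate the incremental send from snapshot $1 to snapshot $2 (dry run only).
estimate_send_size() {
    zfs send -nvP -i "$1" "$2" 2>&1 | parse_send_size
}

# Succeed when the estimate ($1, bytes) is larger than the limit ($2, bytes).
exceeds_limit() {
    [ "$1" -gt "$2" ]
}

# Example use (assumed snapshot names, 1 GiB limit):
#   size=$(estimate_send_size tank/data@monday tank/data@tuesday)
#   if exceeds_limit "$size" 1073741824; then echo "skipping tank/data"; fi
```

A first pass would skip any dataset where `exceeds_limit` succeeds, and a second pass without the check would then pick up the large ones.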

mercenary_sysadmin · August 14, 2023, 10:17pm

Sounds like you should just do independent syncoid runs instead of a single recursive one, tbh.

You could script that up something like this:

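# List every filesystem and volume by name, then start one background syncoid run per dataset.
for DS in $(zfs get -H -o name -t filesystem,volume name); do
    /usr/local/bin/syncoid "$DS" "root@otherbox:otherpool/backup/$DS" &
done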

What that does is fire off parallel syncoid runs for each of the datasets on the source system. If your dataset pool/big/huge is still running a syncoid process when the time for the next one comes around, it’ll just refuse to run (because there’s a conflicting zfs receive process on the target), but any free datasets will get sync processes each time your script fires off.
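
A possible variation on the loop above, assuming you want the script to stay alive until every background run has finished (for example so a single `timeout` wrapper covers all of them). The `target_for` and `sync_all` names are hypothetical, and `TARGET_PREFIX` just reuses the example target from the loop:

```shell
#!/bin/sh
SYNCOID="${SYNCOID:-/usr/local/bin/syncoid}"                   # path to syncoid (assumed)
TARGET_PREFIX="${TARGET_PREFIX:-root@otherbox:otherpool/backup}"

# Map a source dataset name to its target on the backup host.
target_for() {
    printf '%s/%s\n' "$TARGET_PREFIX" "$1"
}

# Read dataset names on stdin, start one syncoid per dataset in the
# background, then block until all of them have exited.
sync_all() {
    while IFS= read -r ds; do
        # </dev/null keeps the child from consuming the dataset list on stdin
        "$SYNCOID" "$ds" "$(target_for "$ds")" </dev/null &
    done
    wait
}

# Example:
#   zfs get -H -o name -t filesystem,volume name | sync_all
```

Because each dataset gets its own process, a slow run for one large dataset does not hold up the small ones started alongside it.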