Coordinating syncoid pull jobs from multiple systems

lipton_tea · July 12, 2023, 12:28am

I’m probably thinking about this the wrong way, so hopefully someone can help me think about the problem differently.

I’ve got one source server running sanoid and two servers that pull data from the source. Both destination servers have a dataset that they pull in common.

Given that cron is the recommended way to run syncoid jobs, how would you go about running them so that they either A) run simultaneously, B) run in a way that they don’t interfere with each other, or C) at least let the weekly one retry in a couple of hours if it fails?

One destination server is set to run every 4 hours and the other weekly.

For example, I regularly get these cron messages:

cannot receive incremental stream: most recent snapshot of <parentdataset> does not match incremental source
CRITICAL ERROR: ssh -i  path/to/identity -S /tmp/syncoid-user@<ipaddress>-1689112885
    user@<ipaddress> 'zfs send -w -I '"'"'tank/vault/video'"'"'@'"'"'autosnap_2023-07-11_18:00:05_hourly'"'"'  \
    '"'"'<parentdataset>'"'"'@'"'"'autosnap_2023-07-11_22:00:08_hourly'"'"' \
    | lzop | mbuffer -q -s 128k -m 16M 2>/dev/null' | mbuffer -q -s 128k -m 16M 2>/dev/null \
    | lzop -dfc | zfs receive -s -F '<parentdataset>' 2>&1 failed: 256 at /usr/sbin/syncoid line 817. 
Cannot sync now: <childdataset1> is already target of a zfs receive process. 
Cannot sync now: <childdataset2> is already target of a zfs receive process. 
Cannot sync now: <childdataset3> is already target of a zfs receive process.

Not only due to each machine overlapping in time… but the sync job that runs every 4 hours will interfere with itself.

muay_throwaway · July 12, 2023, 6:07am

Are the two destination servers pulling to pools/datasets with the same name? Based on a quick look at the syncoid code, it seems that syncoid just checks if the remote host has a receive process to the destination filesystem of a given name. If the destination filesystems on both pulling servers have the same name, perhaps they are interfering with one another. If you receive in differently named destinations, does it work? (this is just a guess, I should warn)

Otherwise, you could implement a flock file over SMB, hosted on the source server. It should work in Linux kernel v5.5 and up (you didn’t specify an OS, but I assume it’s Linux; probably works on FreeBSD too). This ensures only one syncoid process is running at a time (if they are flocked to the same file). (Ideally, not internet exposed; maybe through WireGuard, etc.)
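
A minimal sketch of the flock idea, assuming the util-linux `flock` command is installed and that every puller can reach the same lock file. The `run_locked` helper, the lock path, and the host and dataset names are hypothetical placeholders. Whether locks are actually honoured across an SMB mount depends on the kernel and mount options, so that part needs testing before anyone relies on it.

```shell
#!/bin/sh
# run_locked LOCKFILE COMMAND [ARGS...]
# Runs COMMAND only if an exclusive lock on LOCKFILE can be taken right away.
# -n makes flock give up immediately instead of queueing behind the holder.
run_locked() {
    lockfile=$1
    shift
    flock -n "$lockfile" "$@"
}

# Hypothetical cron usage on each destination server (all names are placeholders):
#   run_locked /mnt/source-share/syncoid.lock \
#       syncoid -r user@source:pool/data backup/data
```

With `-n`, a job that finds the lock taken simply exits with a non-zero status and waits for its next cron slot. Dropping `-n` would make it queue behind the running job instead.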

Topslakr · July 12, 2023, 4:26pm

I wrestle with this too.

The obvious and simple answer is to PUSH the snapshots from the main server to the replicas, set up to run one after the other … but that means that if your primary is compromised, the attacker would also be able to wipe your backups. No go.

For me, I have my replicas pull data from the main server at set intervals. One server pulls VM snapshots at the top of the hour, and the other at 30 minutes past.

For bulk data, one server pulls at midnight and the other at noon.

This has been working great for me. It sounds like you’re trying to do 4 hour sync cycles, so if you offset one server by 2 hours, you should be able to keep the tick tock of replicas happening. You just need to manage the intervals so that you’re able to get the sync done within the time window available in most cases.
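
To make the offset concrete, here is a small sketch that builds the hour field for two staggered cron entries. The `cron_hours` helper and the host, user, and dataset names are made up for illustration; only the idea of shifting one schedule by two hours comes from the post above.

```shell
#!/bin/sh
# cron_hours INTERVAL OFFSET
# Prints the hour field for a cron entry that fires every INTERVAL hours,
# starting at hour OFFSET. INTERVAL must be a positive integer.
cron_hours() {
    interval=$1
    hour=$2
    list=""
    while [ "$hour" -lt 24 ]; do
        list="${list:+$list,}$hour"
        hour=$((hour + interval))
    done
    printf '%s\n' "$list"
}

# Hypothetical crontab lines for two pullers, offset by two hours.
# Host, user, and dataset names are placeholders.
printf '0 %s * * * syncoid -r user@source:pool/data backup/data\n' "$(cron_hours 4 0)"
printf '0 %s * * * syncoid -r user@source:pool/data backup/data\n' "$(cron_hours 4 2)"
```

The first line would go in one replica’s crontab and the second in the other’s, so the two pulls start two hours apart.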

I have also considered a more cumbersome idea, but have yet to need it. I considered having Replica 1 touch a file on Replica 2. When Replica 1 finishes its run, it touches this file, which Replica 2 can see. When Replica 2 sees that file, it then runs its sync job and deletes the file. You can then have cron jobs running that will coordinate between the two disparate systems, and you can manage the permissions so that Replica 1 and Replica 2 don’t have enough access to each other to do any damage. For instance, a limited user can be used to touch that file from R1 to R2 so that, even if that user was compromised, it would have no further access to either the main storage pools or the other Replica.
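
A rough sketch of that handoff, untested against real replicas. The `run_if_signalled` helper, the `handoff` user, and all paths and dataset names are hypothetical. One deliberate difference from the description above: the flag is removed before the sync starts rather than after, so a touch that arrives mid-run is not lost. The trade-off is that a failed sync will not retry until the next signal.

```shell
#!/bin/sh
# run_if_signalled FLAGFILE COMMAND [ARGS...]
# Meant for Replica 2's cron job: if Replica 1 has left the flag file,
# consume it and run COMMAND; otherwise do nothing this round.
run_if_signalled() {
    flag=$1
    shift
    [ -e "$flag" ] || return 0
    # Remove the flag first so a fresh touch during the run is not lost.
    rm -f "$flag"
    "$@"
}

# Hypothetical usage (user, host, and paths are placeholders):
#   On Replica 1, after its own sync finishes:
#     ssh handoff@replica2 touch /home/handoff/r1-done
#   On Replica 2, from cron every few minutes:
#     run_if_signalled /home/handoff/r1-done \
#         syncoid -r user@source:pool/data backup/data
```

The account that runs the job on Replica 2 needs write access to the flag’s directory so it can delete the file.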

mercenary_sysadmin · July 17, 2023, 5:45pm

Cannot sync now: is already target of a zfs receive

Just ignore it, basically. If you don’t want it popping up in your system logs, you can redirect STDERR to /dev/null to keep it from showing up in syslog.
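
In a cron line the plain redirect is just `2>/dev/null` appended to the syncoid command, but that also hides genuine errors. As a narrower variant (a substitution, not what was suggested above), the sketch below drops only the “already target of a zfs receive process” lines and lets everything else on STDERR through. The `hide_busy_notices` name is made up, and the pattern assumes the message wording shown in the original post.

```shell
#!/bin/sh
# hide_busy_notices COMMAND [ARGS...]
# Runs COMMAND, passes its stdout through untouched, and drops only the
# stderr lines that report a busy receive target. Other stderr lines are kept.
# Caveat: the return status is grep's, not COMMAND's.
hide_busy_notices() {
    {
        # Send stderr into the pipe, then point stdout back at the real stdout (fd 3).
        "$@" 2>&1 1>&3 |
            grep -v 'is already target of a zfs receive process' 1>&2
    } 3>&1
}

# Hypothetical cron usage (host and dataset names are placeholders):
#   hide_busy_notices syncoid -r user@source:pool/data backup/data
```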

Ultimately, what’s happening is your recursive syncoid run syncs everything it can, and ignores what it can’t. And if you’ve already got a syncoid run going to a particular dataset target, you certainly can’t start a new one to the same target–but if part of your recursive run is free, you can (and syncoid will) start syncing that part of it after ignoring the part that’s still busy.

There’s very little overhead involved in syncoid figuring out that the target is busy, so there’s not a lot of reason IMHO to bother with something like flock trying to keep it from running again at all–in fact, again IMHO, that would be a regression, because your flock would prevent a new syncoid instance from syncing or re-syncing the individual parts of the target in your recursive run that are free.

For example, let’s say you’ve got two VMs: data/images/A and data/images/B. And you use syncoid -r to replicate them offsite. But data/images/A generates ten times as much “churn” (new data) as data/images/B, so it usually can’t finish an offsite DR run in less than eight hours.

data/images/B doesn’t generate anywhere near as much “churn” (again, new data) so it finishes much more rapidly: let’s say, in half an hour typically.

Now, if you syncoid -r data/images every four hours, what happens is B gets replicated every time, and A gets replicated whenever it’s not already in the process of replicating.

But if you flock it, B won’t get its every-four-hour replication anymore–because syncoid won’t run again until A finishes.
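
The difference can be seen in a toy model. This is not syncoid’s code: a lock directory per dataset merely stands in for “a zfs receive into this dataset is already running”, and the `sync_tree` helper is invented for the illustration.

```shell
#!/bin/sh
# Toy model only: a lock directory per dataset stands in for
# "a zfs receive into this dataset is already running".
# sync_tree LOCKDIR DATASET...
sync_tree() {
    lockdir=$1
    shift
    for ds in "$@"; do
        lock="$lockdir/$(printf '%s' "$ds" | tr '/' '_').lock"
        if mkdir "$lock" 2>/dev/null; then
            echo "syncing $ds"      # a real job would replicate here
            rmdir "$lock"
        else
            echo "skipping $ds (busy)"
        fi
    done
}

# Simulate a long-running transfer of A, then start another recursive run.
locks=$(mktemp -d)
mkdir "$locks/data_images_A.lock"
sync_tree "$locks" data/images/A data/images/B
# A is skipped and B is synced; one global lock would have skipped both.
rm -rf "$locks"
```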

muay_throwaway · August 21, 2023, 5:27am

It seems that @mercenary_sysadmin addressed this issue in another thread. In brief, as a workaround, including the --quiet flag in the syncoid call prevents the processes from interfering with one another. He suggested filing a GitHub issue, so this appears to be a bug rather than intended behavior.