Syncoid options

I feel like every time I get halfway through writing a post here, I wind up figuring out where I went wrong.

So thanks for being my rubber duck debugger!

Anyway, now that I’ve realized what permissions my syncoid user was missing on the receiving server, I’m curious what options you all use for it?

This is what I’m using currently: syncoid --recursive --no-privilege-elevation --no-sync-snap 172.16.10.173:data/VMs backup/VMs

I’m using --no-sync-snap since I don’t intend on creating many manual snapshots, if ever, so there shouldn’t be any reason for that replication chain to get broken.
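With that setup, sanoid does the snapshotting and pruning and syncoid only ships snapshots that already exist, so the sending side pairs with something roughly like this (a sketch; the template name and retention numbers are illustrative, not recommendations):

```ini
# /etc/sanoid/sanoid.conf on the sending box
[data/VMs]
        use_template = production
        recursive = yes

[template_production]
        hourly = 36
        daily = 30
        monthly = 3
        autosnap = yes
        autoprune = yes
```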

Oh, here’s a few questions I’ve been wondering about:

Does Syncoid have a systemd timer the way Sanoid does? Or do you just run it as a cron job under the user you’ve delegated permissions to?

I assume Syncoid doesn’t need destroy privileges when you’re using --no-sync-snap, since pruning snapshots should be handled by Sanoid.

Does Syncoid automatically detect if there was an incomplete replication job and apply the resume token? Or is that something that needs to be done manually? I just tested this and it does, nice!

Nice to hear that it works for you, too!

I’m curious what options you all use for it?

I limit the bandwidth to avoid too big a performance impact on users (I/O, network load) during replication.
With --no-sync-snap it should be understood that only existing snapshots are synchronized, which means “new” data (not yet in a snapshot) will not be replicated.
After big changes, such as restructurings, I like to create a manual snapshot; it helps me find the spot later, and it ensures there is always a replication point to start from in case snapshots are lost, for example due to a misconfiguration.
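Something like this, for example (dataset names, the date format, and the 40M cap are purely illustrative; syncoid hands the rate string through to pv/mbuffer, so check their accepted suffixes):

```shell
# on the sending box: take a named anchor snapshot after a big restructuring
zfs snapshot -r data/VMs@manual-$(date +%F)

# on the backup box: pull with a bandwidth cap so users don’t feel it
syncoid --recursive --no-sync-snap --source-bwlimit=40M \
  172.16.10.173:data/VMs backup/VMs
```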

Does Syncoid have a systemd timer the way Sanoid does? Or do you just run it as a cron job under the user you’ve delegated permissions to?

I don’t think it has such a timer or service, but I think you can easily create one, for example from the files at https://github.com/jimsalterjrs/sanoid/tree/master/packages/debian by copying

  1. sanoid.timer to syncoid.timer
  2. sanoid.service to syncoid.service

to the place shown by systemd-analyze --user unit-paths. I usually pick ~/.config/systemd/user (since nowadays it seems to be advised to hide important files in dot-files for reasons I don’t even want to know).

From the second file, remove Wants, Before, and ConditionFileNotEmpty, and adjust ExecStart to /usr/local/bin/syncoid [...all your desired options...]. Then follow up with the normal enable command, systemctl --user enable syncoid.timer.
Don’t forget the --user, since by default systemctl run as a user still targets the system (root) instance (I wonder why they didn’t require --user $USER). Ah, and if this is your first user unit, you need to enable lingering for the user first with sudo loginctl enable-linger <syncoiduser>. Then you just need to tell the all-supervising daemon that you changed its config with systemctl --user daemon-reload, and you can start it with systemctl --user start syncoid.timer. To see what happened, you are supposed to run journalctl --user -e. There are many long texts to read about the great troubleshooting methods you will need, erm, I mean you can use.
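For reference, here’s roughly what the pair of user units could end up looking like (a sketch based on the sanoid units linked above; the 15-minute OnCalendar schedule and the ExecStart options are just my assumptions, adjust to taste):

```ini
# ~/.config/systemd/user/syncoid.timer
[Unit]
Description=Run syncoid every 15 minutes

[Timer]
OnCalendar=*:0/15
Persistent=true

[Install]
WantedBy=timers.target

# ~/.config/systemd/user/syncoid.service
[Unit]
Description=Replicate ZFS datasets with syncoid

[Service]
Type=oneshot
ExecStart=/usr/local/bin/syncoid --recursive --no-privilege-elevation --no-sync-snap 172.16.10.173:data/VMs backup/VMs
```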

It is so simple with systemd!

Alternatively you could add a single line to the crontab (as the syncoid user, just run crontab -e) and add

*/15 * * * * TZ=UTC /usr/local/bin/syncoid --quiet [...your options...]

You even get a free mail if there are any errors.
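That mail comes from cron itself: anything the job prints gets mailed to the crontab’s owner (or the MAILTO address), and --quiet keeps syncoid silent unless something actually fails. A sketch (the address is a placeholder):

```shell
MAILTO=admin@example.com
*/15 * * * * TZ=UTC /usr/local/bin/syncoid --quiet [...your options...]
```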

I assume Syncoid doesn’t need destroy privileges

Yes, I also think it does not need destroy privs then; but just run it on the command line to see if there are any errors.
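For reference, the delegation can be checked directly; here’s a sketch of what a minimal pull setup with --no-sync-snap might need (user and dataset names are placeholders, so verify against zfs allow output on your own pools):

```shell
# receiving box: the syncoid user needs to create, mount, and receive
sudo zfs allow backupuser create,mount,receive backup/VMs

# sending box: send is enough once --no-sync-snap skips snapshot creation
sudo zfs allow backupuser send data/VMs
```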

an incomplete replication job

Yes, this works really well! As long as the anchor snapshots don’t vanish :slight_smile:
I think you know that you must take special care if you have multiple jobs on the same dataset.


I’m only using it at home for now, looking forward to being able to use it in production.

Curious why the bandwidth limit, is there so much churn on the sending side that it’s taking more than a minute or two? Unless it’s going over a WAN link I don’t see much point in capping the speed.

Of course it will “usually” (whatever that means in context) be very fast and efficient; but you never know when users will accidentally copy a big tree only to delete it later, or whatever. To avoid ZFS replication saturating a link in such cases, and maybe even causing a network monitor to mail out warnings, I’m in the habit of setting a limit.

I don’t use --source-bwlimit these days anywhere near as frequently as I used to–when I first wrote syncoid, I frequently had only 5Mbps to work with on upload pipes!–but I do still use it.

If you’ve got an upload throughput cap that’s much smaller than the expected throughput of zfs send, it’s a good idea to use bwlimits even if you want maximum throughput for your replication process, because even a bwlimit slightly higher than your WAN throughput cap will potentially decrease the latency impact on the WAN, by queuing excess packets in the host’s buffer instead of stacking them up in the router’s.

It’s also a good way to minimize the storage impact of replication on a busy system. If you’ve got several hundred GiB of fresh data to replicate and would prefer a somewhat longer replication time in exchange for the lower point-in-time storage load that goes with it, you can use --source-bwlimit to make that happen.

Hm, something I should definitely keep in mind. I like having a dedicated link for the storage arrays to talk to each other but that’s not always in the budget.

For home use of syncoid, I need the limits to prevent knocking myself offline :slight_smile:

Previously, when replicating ZFS at work, we had a dedicated 100Mbps MPLS link, and even in that case I put a limit at 95% of that speed just to stop Nagios from complaining when the pipe was saturated and its checks were delayed.