USB backups with rotated drives

Hi all,

A mate of mine would like to move his system over to ZFS; however, he is attached to backing up via USB drives. Currently he has two drives that he rotates every day on his server, plus a monthly drive he only plugs in once a month. To make matters more interesting, the monthly drive sometimes doesn’t get plugged in for, say, three months if he forgets.

How would you achieve something like this using sanoid, without the risk of having to resend all the data because there are no common snapshots left when the monthly drive wasn’t plugged in soon enough?

Any help would be greatly appreciated,


Is the system that is being backed up using ZFS? (Or planned to use ZFS?)

A backup that requires manual intervention does not work for me. I need to have something that is automated and the only manual step is to check to see that it is working.

Are the USB drives all the typical ones bought in USB enclosures? My concern with those is that they may not be suitable to keep powered up 24x7. With that reservation, here’s what I would do.

Format the two “daily” drives as a ZFS mirror and use that as a local backup. My preference would be to leave them connected 24x7 and run periodic syncoid backups to the USB pool. If you let syncoid manage its own snapshots, there should not be a problem with snapshots.
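A minimal sketch of that setup; the pool name, dataset names, and device paths below are placeholders, not anything from your system:

```shell
# Create a mirrored pool from the two daily USB drives. Device paths are
# placeholders -- check `ls -l /dev/disk/by-id` for the real ones.
zpool create -o ashift=12 usbbackup mirror \
  /dev/disk/by-id/usb-DAILY_DRIVE_A /dev/disk/by-id/usb-DAILY_DRIVE_B

# Periodic replication of the production pool into the USB mirror;
# by default syncoid creates and manages its own sync snapshots.
syncoid -r PROD usbbackup/PROD
```

You would then run the syncoid line from cron or a systemd timer at whatever cadence suits you.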

I would format the “monthly” drive using ZFS and back the daily USB pool up to it. I don’t have direct experience with pool1 → pool2 → pool3 chains, so I’m not sure whether there would be problems with common snapshots across all three pools. You probably want to use the following syncoid option on one of the pathways to disambiguate the snapshot names.

      --identifier=EXTRA    Extra identifier which is included in the snapshot name. Can be used for replicating to multiple targets.

I would attach all three drives and perform all transfers daily, but that’s a personal choice.

Stretching a bit, I would consider putting the “monthly” drive on one of your systems and backing up over the Internet to provide a remote copy of the data. That has security implications, since it requires an SSH connection between the hosts. I address that with Tailscale, which uses WireGuard under the covers.
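For the remote leg, the push itself is a one-liner once SSH between the hosts works; the hostname, identifier, and pool names here are all placeholders:

```shell
# Replicate PROD to a pool on a remote box over SSH (reachable here by a
# Tailscale hostname; every name in this line is a placeholder).
syncoid -r --identifier=offsite PROD root@backup-host:OFFSITE/PROD
```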

You could employ other strategies that leverage ZFS but based on the information provided, this is what I would probably do.


That’s what sync snapshots (still the default in syncoid) are for. Unless you go out of your way to specify --no-sync-snap, syncoid will create (and destroy) its own sync snapshots in order to make certain that you don’t lose continuity.

By default, syncoid identifies its own snapshots by the hostname of the system running it. That won’t really work here, since you have the same host and multiple pools, but still need to ensure continuity. That’s what the --identifier argument HankB referred to above is for.

So, if you create a pool named BACKUPA on USB drive A, and a pool named BACKUPB on USB drive B, you could then back up your main pool PROD like this:

      syncoid -r --identifier='BACKUPA' PROD BACKUPA/PROD
      syncoid -r --identifier='BACKUPB' PROD BACKUPB/PROD

and syncoid itself will maintain snapshots that ensure neither BACKUPA nor BACKUPB will lose their last common snapshot with PROD.
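One way to tie this to the daily rotation is a small wrapper that backs up to whichever of the two pools is currently attached. This is only a sketch, assuming the PROD/BACKUPA/BACKUPB names from above and that the rotated drive's pool gets imported when it is plugged in (by hand, or via a udev rule):

```shell
#!/bin/sh
# Back up PROD to whichever rotated backup pool is present, then export
# that pool so the drive can be unplugged safely. Pool names are the
# hypothetical ones used in the examples above.
for pool in BACKUPA BACKUPB; do
    if zpool list -H -o name | grep -qx "$pool"; then
        syncoid -r --identifier="$pool" PROD "$pool/PROD" \
            && zpool export "$pool"
    fi
done
```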

Awesome, that’s exactly what I need thank you :-).

I was just wondering whether the USB drives need to be scrubbed regularly, or, more accurately, what is the worst that can happen if you don’t?

How often should they be scrubbed? They’re probably going to be 4 TB drives.


Ideally, monthly. What you’re trying to do is catch any corruption that’s happening on disk while you’ve still got odds of recovery. The longer you let corruption go on undetected, the worse your odds of having sufficient redundancy to recover from it once you do.
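As a sketch, for drives that stay attached, monthly scrubs can come from a pair of root crontab entries, staggered so they don’t overlap (pool names are the hypothetical BACKUPA/BACKUPB from earlier; a scrub only does anything while the pool is imported, and on 4 TB USB drives it can take many hours):

```shell
# m h dom mon dow  command
0 2 1  * *  /sbin/zpool scrub BACKUPA
0 2 15 * *  /sbin/zpool scrub BACKUPB
```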

Fantastic, I have started using syncoid with identifiers; however, drive A is accumulating drive B’s sync snapshots and vice versa. Is there any way to avoid this?

You have to manually destroy the unwanted snapshots, unfortunately. If you’re certain you’ll never put spaces in your pool/dataset/zvol names, that can be as simple as:

root@box:~# for snap in `zfs list -rt snap USBpoolA | grep syncoid | grep -v _USBpoolA | awk '{print $1}'` ; do echo zfs destroy $snap ; done

This assumes that you used “USBpoolA” as the identifier on the pool named “USBpoolA”, obviously. If the output looks good, hit up-arrow and take out the echo, or, better yet, just add an extra line that actually does the work, so you can see what’s going on as it happens:

root@box:~# for snap in `zfs list -rt snap USBpoolA | grep syncoid | grep -v _USBpoolA | awk '{print $1}'` ; do echo zfs destroy $snap ; zfs destroy $snap ; done
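Since the loop’s behavior hinges entirely on the two greps, here is the same filter run against a couple of hypothetical snapshot names (the exact names syncoid generates will differ), so you can see what survives; the leading echo keeps it a dry run:

```shell
# Demo of the filter logic with made-up snapshot names; in real use the
# list would come from: zfs list -H -o name -rt snap USBpoolA
printf '%s\n' \
  'USBpoolA/PROD@syncoid_USBpoolA_box_2024-01-01:00:00:01' \
  'USBpoolA/PROD@syncoid_USBpoolB_box_2024-01-01:00:00:01' \
| grep syncoid | grep -v _USBpoolA \
| while IFS= read -r snap; do
    echo zfs destroy "$snap"   # drop the 'echo' to actually destroy
  done
```

Only the snapshot tagged with the other pool’s identifier makes it through to the destroy.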

Fantastic, thanks very much for that :-) .

Now that I think I have the backups themselves sorted, I thought I would just add sanoid --configdir "directory of sanoid conf that includes the given backup drive" --monitor-snapshots. Fairly regularly (primarily when the daily drive is first plugged in) this gives me CRIT: no daily snapshots at all, even though the snapshots are visible with sudo zfs list -t snapshot; it does sometimes give the correct result. It is also now giving me:
      print() on closed filehandle FH at /usr/sbin/sanoid line 1519
      print() on closed filehandle FH at /usr/sbin/sanoid line 1520
      Could not write to $cache!\n at /usr/sbin/sanoid line 812

After leaving the pool imported for ten-ish minutes, the --monitor-snapshots command works as expected.

Thank you very much for the help provided, it has been invaluable.
