Common Snapshots

Question about common snapshots when doing incremental transfers. Does the requirement for a common snapshot mean that, in practice, you always need to keep the first snapshot on the source? If it has to be kept indefinitely to avoid destroying the target datasets and doing a new full transfer, won't it grow quite large over time?

I’m replicating snapshots to a backup server so that I’m not filling up my main server with a lot of snapshot data, but if I still need to keep the first snapshot on the main server, I guess it does not really matter.

Suppose you do:

zfs snapshot main/b@1
zfs send main/b@1 | zfs recv bkup/b
zfs snapshot main/b@2
zfs send -i main/b@1 main/b@2 | zfs recv bkup/b

Now b@1 and b@2 exist on both pools.

Now we are at a point where your question becomes relevant.

zfs snapshot main/b@3

To replicate the incremental changes between b@2 and b@3:

zfs send -i main/b@2 main/b@3 | zfs recv bkup/b

Once this is done, b@1 is no longer needed on either pool. In fact, it would have been fine to delete it from main and/or bkup earlier, as soon as b@2 existed on both.
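To make that concrete, here is a rough sketch that only prints the commands for one replication step and the pruning that follows it. Nothing is executed. The helper name plan_step and its argument order are made up for illustration; the dataset names are the ones used above.

```shell
# Hypothetical helper: print (do not run) the commands for one step.
# Arguments: source dataset, target dataset, the snapshot that is no
# longer needed, the current common snapshot, and the new snapshot.
plan_step() {
    src=$1
    dst=$2
    old=$3
    prev=$4
    next=$5
    printf 'zfs snapshot %s@%s\n' "$src" "$next"
    printf 'zfs send -i %s@%s %s@%s | zfs recv %s\n' "$src" "$prev" "$src" "$next" "$dst"
    # Only the newest common snapshot is needed for the next increment,
    # so the older one can go on both sides.
    printf 'zfs destroy %s@%s\n' "$src" "$old"
    printf 'zfs destroy %s@%s\n' "$dst" "$old"
}

plan_step main/b bkup/b 1 2 3
```

Whether you also destroy the old snapshot on bkup is a retention choice. Keeping it there costs nothing on main, which is the point of replicating to a backup server.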

Summing up @mgerdts's answer: basically, zfs needs a reference point in order to do an incremental sync.

If it doesn’t have one, it becomes impossible for zfs to know which blocks have already been transferred and which haven’t.

As long as they have a common snapshot it will work fine. You don’t need to always keep the first snapshot in the chain.

One final note on common snapshots: the most important common snapshot isn’t the oldest common snapshot, it’s the newest common snapshot.

Let’s say you’ve got three snapshots on your source: @0, @99, and @100. And your target has @1 through @98. There are no common snapshots, so you can’t incrementally replicate between the two.

But what if we change it up a bit, and say that we’ve got the same @0, @99, and @100 on the source (we destroyed all the rest in a fit of housecleaning), but we’ve got @0-@98 on the target. We can incrementally replicate, since the two have @0 in common… but in order to do so, the target has to destroy all the snapshots and all the data newer than @0 before it can begin!

On the other hand, if we’ve got @0, @99, and @100 on the source and @0-@99 on the target, our most recent common snapshot is @99. The target doesn’t need to start from @0 (the oldest common snapshot); it can start directly from @99 without destroying any local data first, so it completes much faster.
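The three scenarios above can be checked mechanically. This is a minimal sketch, not anything from the posts above. The function name newest_common is made up, and it assumes each side's snapshot names are passed as a newline-separated list ordered oldest first. It matches by name only, which is an assumption that the same name means the same snapshot.

```shell
# Hypothetical helper: print the newest snapshot name present in both lists.
# $1 = source snapshot names, one per line, oldest first
# $2 = target snapshot names, one per line
# Returns non-zero and prints nothing if there is no common snapshot.
newest_common() {
    result=
    while IFS= read -r snap; do
        [ -n "$snap" ] || continue
        # -x: match the whole line, -F: treat the name as a fixed string
        if printf '%s\n' "$2" | grep -qxF -- "$snap"; then
            result=$snap
        fi
    done <<EOF
$1
EOF
    [ -n "$result" ] || return 1
    printf '%s\n' "$result"
}

# Third scenario: source has @0, @99, @100 and target has @0-@99.
src=$(printf '%s\n' 0 99 100)
dst=$(seq 0 99)
newest_common "$src" "$dst"    # prints 99
```

On a real system the lists could come from something like `zfs list -H -d 1 -t snapshot -o name -s creation main/b` with the part up to the @ stripped off. I have not tested that pipeline here, so treat it as a starting point. Comparing the guid property of the snapshots instead of their names would be more robust, since two unrelated snapshots can share a name.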