Curious as to the utility of ZFS bookmarks

When I heard Alan mention ZFS bookmarks on the latest 2.5 Admins, I went looking in the OpenZFS documentation, as this was not a feature I had previously been aware of. I must admit I don’t know if my understanding is correct.

I also found this post, which I think clarified their use a bit more: Chris's Wiki :: blog/solaris/ZFSBookmarksWhatFor

So, is a bookmark essentially a marker that tells zfs send which transaction group to start the replication stream from, instead of needing to keep the old snapshot around to do a comparison?

I don’t currently have any interest in using this feature - I’m not anywhere close to being tight on space on any of my pools, but I do want to understand its possible uses.

So essentially, this is a feature that might come in handy if you’re backing up a small-but-fast SSD-based server to a slow-but-massive rust-based server. You can snapshot the dataset on the SSD pool, send it to the rust pool, bookmark the snapshot on the SSD source, then destroy the actual snapshot on the source.

Now you’ve reclaimed any divergent space that the snapshot itself was occupying. Say you’ve deleted 100GiB of data since the snapshot was taken: you bookmark the snapshot, you destroy the snapshot, and now you’ve got 100GiB free. The bookmark really is just an immutable record of the time that the destroyed snapshot was taken; this can in turn be used as the base for an incremental replication.
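As a rough illustration of the sequence described above, here is a minimal Python sketch that only builds the command lines and never runs them. The dataset names, snapshot names, and the two helper functions are hypothetical; the zfs subcommands themselves (snapshot, send, receive, bookmark, destroy) are the standard ones, but check the man pages for your OpenZFS version before running anything like this against a real pool.

```python
import shlex


def plan_initial_replication(src, dst, snap):
    """Dry run: commands to snapshot, fully replicate, bookmark, then destroy.

    src, dst and snap are hypothetical names chosen by the caller.
    """
    snapshot = shlex.quote(f"{src}@{snap}")
    bookmark = shlex.quote(f"{src}#{snap}")  # '#' gets quoted for the shell
    return [
        f"zfs snapshot {snapshot}",
        f"zfs send {snapshot} | zfs receive {shlex.quote(dst)}",
        f"zfs bookmark {snapshot} {bookmark}",
        f"zfs destroy {snapshot}",  # frees the space the snapshot was pinning
    ]


def plan_incremental_replication(src, dst, old_snap, new_snap):
    """Dry run: incremental send based on the bookmark left by the step above."""
    bookmark = shlex.quote(f"{src}#{old_snap}")
    snapshot = shlex.quote(f"{src}@{new_snap}")
    return [
        f"zfs snapshot {snapshot}",
        f"zfs send -i {bookmark} {snapshot} | zfs receive {shlex.quote(dst)}",
    ]


if __name__ == "__main__":
    for cmd in plan_initial_replication("fastpool/data", "slowpool/backup/data", "snap1"):
        print(cmd)
    for cmd in plan_incremental_replication(
        "fastpool/data", "slowpool/backup/data", "snap1", "snap2"
    ):
        print(cmd)
```

The sketch assumes both pools are on the same host; for a remote target the receive side would typically be wrapped in ssh.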

To understand the how and why on that, realize that incremental replication relies on the birth time of blocks. When you do a zfs send -i (or -I), zfs send finds all the blocks that were born in between the two snapshots you specify, and that’s what it sends to the other side. As long as the other side has all the blocks leading up to the birth time of the oldest snapshot being sent, everything’s fine!

So once we’ve replicated the snapshot from the source to the target, we can bookmark the snapshot on the source and destroy it. Our next incremental replication uses the bookmark instead of the actual snapshot, and that is enough to let zfs send -i know what birth time to begin sending blocks from.
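The birth-time reasoning above can be sketched with a toy model. This is not how ZFS is implemented internally; the Block class, the function name, and the transaction group numbers are all made up for illustration, and the model ignores how deletions are conveyed to the target. It only shows why a bare birth time is enough to pick the blocks for an incremental send.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    name: str
    birth_txg: int  # birth time: the transaction group the block was written in


def incremental_blocks(to_snapshot_blocks, from_birth, to_birth):
    """Select blocks born after from_birth and no later than to_birth."""
    return [b for b in to_snapshot_blocks if from_birth < b.birth_txg <= to_birth]


# The first snapshot was taken at (made-up) txg 100 and has since been destroyed.
# The bookmark keeps only that birth time, not the blocks the snapshot pinned.
bookmark_txg = 100

# Since then, block "b" was deleted and "c" and "d" were written.
# The second snapshot, taken at txg 150, references these blocks:
snap2_txg = 150
snap2_blocks = [Block("a", 10), Block("c", 120), Block("d", 130)]

to_send = incremental_blocks(snap2_blocks, bookmark_txg, snap2_txg)
print([b.name for b in to_send])  # prints ['c', 'd']
```

Block "a" is skipped because it was born before the bookmark, so the target is assumed to have it already.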

I have to look this feature up every time somebody asks me about it and refresh myself on it, because the thing is, it’s not very useful to anybody who’s already got a sane setup with automated snapshots and replication.

Essentially, this is a very helpful feature if you very rarely replicate a small system to a much larger one. But if you’re already doing automated replication on a routine basis, bookmarks won’t help much–you can already destroy every snapshot but the most recently replicated snapshot, with no need for bookmarks. So, say you’re replicating daily–well, don’t you want to keep a snapshot around locally for at least one day anyway? If the answer is “yes”, bookmarks won’t do you any good.
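To make the "destroy every snapshot but the most recently replicated one" point concrete, here is a minimal sketch of that pruning rule. The function and snapshot names are hypothetical, and it assumes you already know which snapshot was last replicated successfully.

```python
def snapshots_safe_to_destroy(snapshots_oldest_first, last_replicated):
    """Return the snapshots older than the most recently replicated one.

    The last replicated snapshot is kept as the base for the next incremental
    send; anything newer has not been sent yet, so it is kept as well.
    """
    if last_replicated not in snapshots_oldest_first:
        raise ValueError(f"{last_replicated!r} is not a known snapshot")
    cutoff = snapshots_oldest_first.index(last_replicated)
    return snapshots_oldest_first[:cutoff]


snaps = ["daily-01", "daily-02", "daily-03", "daily-04"]
print(snapshots_safe_to_destroy(snaps, "daily-03"))  # prints ['daily-01', 'daily-02']
```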


Hm, so it’s not the transaction group but rather the block metadata.

Yeah, I can’t see a lot of use for this.

With how fast replication is and how little space most snapshots tend to take, there’s not much point to it.

Thanks for taking the time to explain!