Sounds like you need more archive depth.
Replication requires a common snapshot. So, if you destroy all of your common snapshots–such as by rolling back to a snapshot that’s older than your oldest common snapshot–you’ll break the chain, and the only way out is to destroy the target (or rename it) and begin again with a full replication.
To avoid this, maintain a greater archive depth on both sides. If you keep, for example, 30 dailies on both sides–repeat, both sides–you’d have to roll back to some snapshot even older than that before you broke the chain.
For example, say you encounter a problem and roll back a week on the source. Your next replication is based on the snapshot from eight days ago, which is now the most recent common snapshot, so the target effectively rolls back to that eight-day-old snapshot automatically the next time you replicate in.
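If you want to see ahead of time which snapshot the next replication would be based on, you can compare the snapshot lists from both sides. This is only a rough sketch: it assumes both lists are sorted oldest to newest and that snapshot names match on both sides (ZFS itself matches snapshots by GUID, not by name, so this is a simplification). The function name, the temp file paths, and the `backuphost` and `backup/ds` names are hypothetical.

```shell
#!/bin/sh
# newest_common_snapshot SRC_LIST TGT_LIST
# Each file holds one snapshot name per line (the part after '@'),
# sorted oldest to newest. Prints the newest name present in both
# files, or returns non-zero if the two sides share no snapshot.
newest_common_snapshot() {
    src_list=$1
    tgt_list=$2
    # grep -Fx -f keeps the lines of src_list that exactly match a
    # whole line of tgt_list, in src_list order; tail keeps the newest.
    common=$(grep -Fx -f "$tgt_list" "$src_list" | tail -n 1)
    [ -n "$common" ] && printf '%s\n' "$common"
}

# Hypothetical usage (dataset and host names are assumptions):
#   zfs list -H -t snapshot -o name -s creation -d 1 pool/ds \
#       | sed 's/.*@//' > /tmp/src.txt
#   ssh backuphost zfs list -H -t snapshot -o name -s creation -d 1 backup/ds \
#       | sed 's/.*@//' > /tmp/tgt.txt
#   newest_common_snapshot /tmp/src.txt /tmp/tgt.txt
```

If the function prints nothing and returns non-zero, the chain is already broken and only a full replication will fix it.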
If you don’t want that to happen either, you need to make some special arrangements after rolling back on the source, before its next replication to the target.
Alternatively, avoid rolling back at all, and use a tool like rsync --inplace against the snapshot you’d like to “roll back” to. It will be slower than a rollback, but it simply patches the existing files in the existing filesystem with each block that differs in the snapshot, leaving you with a completely unbroken chain and no missing data on EITHER side.
zfs rollback -r pool/ds@old
– instantaneous but destructive; all snapshots newer than @old are irrevocably destroyed, and replication will destroy those snapshots on the target as well (assuming a common snapshot still exists)
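zfs snapshot pool/ds@before; zfs clone pool/ds@old pool/clonetmp ; rsync -hav --inplace --no-whole-file --delete --progress /pool/clonetmp/ /pool/ds/ ; zfs destroy pool/clonetmp
– slower, patches files in place with changed blocks from the referenced snapshot. Does not destroy snapshots on either source OR target. (The --no-whole-file flag is there because rsync defaults to whole-file copies between two local paths, which would rewrite every changed file in full rather than only its changed blocks.)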
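Because --delete removes anything in the destination that is missing from the source, it may be worth confirming that the clone is really mounted and populated before running the rsync step. A cautious sketch, assuming default mountpoints as in the commands above; the function name `clone_looks_populated` is hypothetical.

```shell
#!/bin/sh
# clone_looks_populated DIR
# Succeeds only if DIR exists and contains at least one entry
# (hidden files count). Fails for a missing or empty directory.
clone_looks_populated() {
    dir=$1
    [ -d "$dir" ] || return 1
    # ls -A lists everything except . and .. ; empty output means
    # the directory is empty, which suggests the clone is not mounted.
    [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

# Hypothetical usage, placed between the zfs clone and rsync steps:
#   clone_looks_populated /pool/clonetmp || exit 1
```

Note that a snapshot which was legitimately empty would also fail this check, so treat it as a safety net rather than a hard rule.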
For example, say you got hit with ransomware: a rollback instantly repairs the damage but permanently destroys the altered data, making it impossible to investigate HOW you got ransomwared unless you shell into your target and salvage that data before the next incoming replication wipes it out.
But an rsync --inplace with the steps shown above, although potentially much, much slower, repairs your data while leaving copies of the ransomwared data intact, both in any snapshots taken automatically while the ransomware was active AND in the snapshot you created manually, @before. Upon your next replication, @before will be backed up to the target along with any new snapshots you take afterward.
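For the investigation itself, zfs diff can list what differs between the known-good snapshot and the one you took manually. A minimal sketch, reusing the pool/ds, @old, and @before names from above; the wrapper function `modified_files` is hypothetical, and zfs diff generally needs root or the delegated diff permission.

```shell
#!/bin/sh
# modified_files DATASET OLD_SNAP NEW_SNAP
# Prints the paths of files that were modified between two snapshots
# of the same dataset. zfs diff -H emits tab-separated lines whose
# first field is the change type: M (modified), + (added),
# - (removed), or R (renamed).
modified_files() {
    zfs diff -H "$1@$2" "$1@$3" | awk -F'\t' '$1 == "M" { print $2 }'
}

# Hypothetical usage:
#   modified_files pool/ds old before
```

Changing the awk filter to match `+` or `R` would show added or renamed files instead, which can matter because some ransomware renames files as it encrypts them.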