Hi all,
I’m in the process of decommissioning my Synology NAS and finalising my migration over to TrueNAS. The only data left on my Synology is the Active Backup for Business backups from the couple of years leading up to when I set TrueNAS up and, irrationally, I’m keen to hold onto them if I can.
I think I’ve figured out a few ways I can do this but I’m not entirely sure how ZFS will handle the data from a de-duplication and wear-and-tear perspective.
There are two main folders for Synology’s Active Backup data:
/volume1/ActiveBackupforBusiness/@ActiveBackup/ (du -sh returns 39T)
/volume1/ActiveBackupforBusiness/ActiveBackupData/ (du -sh returns 1.3P)
The first is the ‘deduplicated’ backup folder. It appears that instead of using btrfs filesystem dedupe features it uses img.delta files and only backs up what has changed. It’s unclear how to re-assemble these img.delta files without the Active Backup software, but working that out could be another option.
The second is an expanded virtual folder containing the assembled disk img files for each backup date. There’s a device_spec file providing information about each image and these disk images will mount as expected using a command like mount -o ro,loop,offset=135266304 2.img /mnt/e
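For anyone wanting to repeat that mount on other images, here is a rough Python sketch of how such an offset is usually derived: partition start sector multiplied by sector size. The 512-byte sector size and the start sector 264192 are assumptions (264192 is just 135266304 / 512, not a value read from the device_spec file), so check the real partition table with something like fdisk -l first.

```python
import shlex

SECTOR_SIZE = 512  # assumed logical sector size; verify against the image's partition table


def partition_offset(start_sector: int, sector_size: int = SECTOR_SIZE) -> int:
    """Byte offset of a partition inside a raw disk image."""
    return start_sector * sector_size


def mount_command(image: str, mountpoint: str, start_sector: int) -> str:
    """Build a read-only loop mount command string for one partition."""
    opts = f"ro,loop,offset={partition_offset(start_sector)}"
    return " ".join(["mount", "-o", opts, shlex.quote(image), shlex.quote(mountpoint)])


# 264192 is a hypothetical start sector chosen to reproduce the offset above
print(mount_command("2.img", "/mnt/e", 264192))
```

This only builds the command string; running it still needs root and an existing mount point.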
I definitely have space for the deduplicated backups, and assuming I’m able to figure out how to reconstruct the image files that would be the preferable way to do this - as it presumably wouldn’t require ZFS dedupe at all.
However - if I can’t figure that out, how would ZFS handle the copying of data from the expanded backup folder? I’ve already prepared a dedicated pool: 330GB RAM, dedup enabled and an SSD dedup vdev.
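As a sanity check on whether that RAM and dedup vdev are in the right ballpark, here is a back-of-the-envelope Python sketch of dedup table size. It assumes the unique data ends up around the 39T of the deduplicated folder, that every record is full-sized, and the commonly quoted rule of thumb of roughly 320 bytes per dedup table entry. None of those figures are measured on this pool, so treat the result as an order-of-magnitude guess.

```python
KIB = 2**10
GIB = 2**30
TIB = 2**40

BYTES_PER_DDT_ENTRY = 320  # rule-of-thumb assumption, not a measured value


def ddt_size_bytes(unique_bytes: int, recordsize: int,
                   bytes_per_entry: int = BYTES_PER_DDT_ENTRY) -> int:
    """Estimate dedup table size: one entry per unique record."""
    entries = -(-unique_bytes // recordsize)  # ceiling division
    return entries * bytes_per_entry


unique = 39 * TIB  # assumed unique data, taken from the du -sh figure
for recordsize in (128 * KIB, 1024 * KIB):
    size = ddt_size_bytes(unique, recordsize)
    print(f"recordsize {recordsize // KIB}K -> about {size / GIB:.1f} GiB of dedup table")
```

Under these assumptions a larger recordsize shrinks the table considerably, at the cost of coarser matching between blocks.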
1 ) How well would ZFS dedupe handle img files compared to regular files?
If I were to copy the img files across to TrueNAS, would ZFS be able to dedupe them effectively, or would I be better off mounting the img files and using rsync to sync the contents across as files?
It sounds like being one big monolithic file can cause issues for detecting duplicated blocks… which in my case is fine - I don’t need bootable or re-imagable disk images - I’m mainly concerned about archival of the tree and file attributes.
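To illustrate the concern about monolithic files, here is a toy Python sketch of fixed-block deduplication. It is not ZFS itself: the 4096-byte block size, the SHA-256 hashing and the random test data are all stand-ins. It compares a copy where one block changed in place against a copy where the same content is shifted by 512 bytes.

```python
import hashlib
import random

BLOCK = 4096  # toy block size; ZFS would work on the dataset's recordsize


def block_hashes(data: bytes, block_size: int = BLOCK) -> list:
    """Hash each fixed-size block, roughly how a block-level deduper sees a file."""
    return [hashlib.sha256(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]


def shared_blocks(stored: bytes, incoming: bytes, block_size: int = BLOCK) -> int:
    """Count blocks of `incoming` whose hash already exists in `stored`."""
    seen = set(block_hashes(stored, block_size))
    return sum(1 for h in block_hashes(incoming, block_size) if h in seen)


rng = random.Random(0)  # fixed seed so the demo is repeatable
base = bytes(rng.getrandbits(8) for _ in range(64 * BLOCK))

# Same layout, one block overwritten in place (zeroed)
changed = bytearray(base)
changed[10 * BLOCK:11 * BLOCK] = bytes(BLOCK)
aligned = bytes(changed)

# Same content, but everything shifted by 512 bytes
shifted = bytes(512) + base

print(shared_blocks(base, aligned), "of 64 blocks shared when aligned")
print(shared_blocks(base, shifted), "of 65 blocks shared when shifted")
```

The aligned copy should share 63 of 64 blocks while the shifted copy shares none. So whether disk images dedupe well presumably depends on whether unchanged data stays at the same block-aligned offsets from one backup image to the next.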
2 ) Assuming ZFS dedupe did work here, what would the hardware impact be and would nop-write reduce hard disk usage?
My presumption is that copying the img files would be the worst option from a hardware perspective because even if ZFS manages to deduplicate everything as well as the original Synology backup folder - I will have written 1.3 petabytes of data to my disks in the process. Not only would this take forever, that’s something like three years’ worth of lifespan for the EXOS disks I use.
However, if I mount the images first then rsync the files - I can at least utilise snapshots and only sync changed files rather than the whole img file.
I’ve stumbled across ‘nop-write’ in the ZFS docs which seems to suggest that you can avoid writing data if it already exists and it specifies full-backups as one of its use cases. Is this applicable to copying disk images here? Though - I suspect that even if nop-write saves my hard drives from unnecessary writes I’m still probably looking at a monumentally large copy time since the blocks would have to be read into RAM, checksummed, compared to the dedupe table, repeat… for 1.3 petabytes of data.
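To put rough numbers on the copy time and wear, here is a small Python sketch. Every input other than the 1.3P total is a placeholder: the throughputs, the 8-disk count and the 550 TB/year workload rating are assumptions to be replaced with real values from the pool and the drive datasheet. The wear figure is the worst case where every byte actually lands on the disks, with parity overhead ignored.

```python
PIB = 2**50
TB = 10**12
SECONDS_PER_DAY = 86_400


def copy_days(total_bytes: float, bytes_per_second: float) -> float:
    """Days needed to push total_bytes through at a sustained throughput."""
    return total_bytes / bytes_per_second / SECONDS_PER_DAY


def workload_years(total_bytes: float, disks: int, rated_bytes_per_year: float) -> float:
    """Years of rated workload consumed per disk if writes spread evenly."""
    return total_bytes / disks / rated_bytes_per_year


total = 1.3 * PIB  # du -sh figure for the expanded folder

for mb_per_s in (250, 500, 1000):  # hypothetical sustained throughputs
    days = copy_days(total, mb_per_s * 10**6)
    print(f"{mb_per_s} MB/s -> {days:.1f} days")

# Hypothetical pool: 8 disks, assumed 550 TB/year workload rating per disk
years = workload_years(total, disks=8, rated_bytes_per_year=550 * TB)
print(f"worst case: {years:.2f} rated years per disk")
```

Even at the optimistic end this is a copy measured in weeks, which is the main practical argument for either the deduplicated folder or the mount-and-rsync route.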
In the process of writing this out I might have answered my own questions… but any thoughts are welcome and hopefully it helps someone out who is googling for answers to a similar conundrum.