Shipping my NAS overseas, what's the best way to maintain access to my data in the meantime?

Hi PracticalZFS,

Glad to see this community continuing post-Reddit!

The TL;DR is that I’m moving country soon, which will involve shipping my NAS, and I will be without it for a few months, but I want to maintain access to my data, probably on a single disk attached via USB.

The NAS is a simple affair - a zpool with a single vdev which is a 2x18TB mirror. It’s backed up to backblaze, but that’s not practical for daily access. The main client device is an M1 Macbook Pro, and the NAS is running Ubuntu 22.04. As the USB drive would be highly portable, encryption would be very nice to have, but is not a hard requirement. I won’t need network access for the duration of the move.

Bearing in mind that I’m experienced with Linux and IT generally but a newbie to ZFS, I can see a few options. The common thread amongst all of them is buying a single large USB drive and working from that:

  1. The traditional, most-compatible approach would be to format the USB drive as exFAT and use rsync. This would be simple, albeit potentially error-prone, although I’m fairly comfortable that all data from before the move would be recoverable (despite the possibility of rsync’ing bad data over it), thanks to snapshots. But it would be preferable to have integrity checking built in, and to only copy a delta back, à la zfs send/receive.
  2. A ZFS-centric approach would be to send a snapshot to the USB drive and work from a clone, then do the same in reverse when I unpack the NAS, promoting the received snapshot (see the sketch after this list). However, working with ZFS on a Mac looks iffy. openzfsonosx seems to be a thing, but I have no idea how reliable it is, and the documentation appears quite outdated, which doesn’t inspire confidence in the project.
  3. Use ZFS as above, but instead of openzfsonosx, set up a Raspberry Pi, duct-tape it to the USB drive, and create a “mini-NAS”. I’ve seen people do neat things with the PCIe interface of the Pi CM4 and SATA breakout boards, but the price for all the pieces adds up, and availability isn’t great (I’m in the UK).
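To make option 2 concrete, here’s a rough sketch of what I think the ZFS incantations would be (names like usbpool, pool1 and /dev/sdX are placeholders, and I haven’t tested any of this):

zpool create usbpool /dev/sdX
zfs snapshot pool1/photos@pre-move
zfs send pool1/photos@pre-move | zfs receive usbpool/photos
zfs clone usbpool/photos@pre-move usbpool/photos-live
# ...work from usbpool/photos-live during the move, then back on the NAS:
zfs snapshot usbpool/photos-live@post-move
zfs send -i usbpool/photos@pre-move usbpool/photos-live@post-move | zfs receive pool1/photos-new
zfs promote pool1/photos-new
# then rename pool1/photos-new into place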

So, how would you go about this? Honestly, I’m currently leaning towards the rsync approach. The pool is using 11.0 TB and has 5.5 TB free, and of that I really only care about my own photos and videos, which come to under 4 TB.

But is there something I’ve missed? Would a Pi 4 with a USB disk be unbearably slow, or could it get close to maxing out a single disk? Perhaps openzfsonosx is awesome and I should definitely use it? It feels like the odds are stacked against it, but if you’ve had any experience with any of these solutions, or have a better one, I’d love to hear it!

I gave up on Mac ZFS because I got some flaky behaviour.

I’d just rsync, TBH. Couldn’t you use encrypted APFS instead of exFAT?

1 Like

P.S. If you really want ZFS, you could try passing the disk through to an Ubuntu VM and then sharing the root folder back to the host…

2 Likes
  1. zpool scrub mypool

WAIT FOR IT TO FINISH. Now run zpool status to make sure there are no errors. Don’t skip this step!
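(If your OpenZFS is 2.0 or newer, you don’t have to keep polling: zpool wait will block until the scrub completes, so you can chain the status check.)

zpool wait -t scrub mypool && zpool status mypool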

  2. zpool split mypool myportablepool mydisk1

Now you’ve got two single-disk pools: mypool and myportablepool. Both have all of your data.

  3. zpool export myportablepool

Now you’re ready to pull myportablepool out of the system. Next step: grab a single-disk USB enclosure, and stuff 'er on in there.

  4. On your laptop, attach the USB disk, and zpool import myportablepool

Presto.

  5. Once you arrive, zpool export myportablepool, shuck the drive back out of the enclosure, and put it back in your NAS. When the NAS boots, zpool import myportablepool. Now you’ve got mypool (your original data) and myportablepool (your original data, plus anything you did while in transit).

  6. zpool scrub myportablepool

Once again, wait for it to finish, then zpool status myportablepool. No errors? Great.

  7. zpool destroy mypool

You no longer need the original version of your pool; that’s all on myportablepool.

  8. zpool export myportablepool ; zpool import myportablepool mypool

Now it’s named “mypool” again, for your sanity.

  9. zpool attach mypool mydisk1 mydisk2

Now your pool is once again composed of an intact two-disk mirror vdev, and a gentleman by the name of Robert is, in fact, thine uncle.

7 Likes

For what it’s worth, I’ve had very good luck with ZFS on a Mac. I used it for years for local photo storage, which could be easily backed up to a NAS with send/recv. One caveat: I changed my workflow during the pandemic, so I don’t know how ZFS fares in the Apple silicon world.

1 Like

Thank you Jim, this is brilliant. I’m a little squeamish about putting a 7200 RPM Exos into a portable caddy, but I’m sure it will be fine as long as I’m careful and don’t knock it about.

I’d just rsync, TBH. Couldn’t you use encrypted APFS instead of exFAT?

I could, but then I couldn’t plug the drive into the NAS and rsync directly; it would all have to go over the network. Although, now that I think about it, a gigabit link isn’t that far off the speed of a single disk anyway - roughly 110 MB/s after overhead, versus perhaps 250 MB/s sustained for a big 7200 RPM drive.

I assume you’re shipping the NAS with the bulk of your belongings, slow-boating overseas or something, which is why it’s taking months?

Personally, I would never do that. I would give the NAS enclosure to a friend or family member, without the disks, and ask them to send it by air to my new address. I would take the disks myself, bubble-wrapped, either in my suitcase or hand luggage.

P.S. Is the data encrypted? I would never leave home with unencrypted disks. Ever.

1 Like

I assume you’re shipping the NAS with the bulk of your belongings, slow-boating overseas or something, which is why it’s taking months?

Yep.

You make a good point. As of now it’s not encrypted. Fortunately I still have a ton of free space, and encryption can be done with a send | receive, so I’m going to get right on that.

The answers in this thread have been extremely helpful - I think I’ve been convinced to split the pool and use a caddy (after encrypting the data, of course). Now I just need to figure out the best way of accessing it from the Mac.

I thought something that uses Apple’s native virtualisation framework would be the way to go, but nothing I’ve found supports USB passthrough. UTM looks promising, but USB only works if you use its QEMU backend. That should be good enough, but is a layer of virtualisation going to be any better than a FUSE port?

On shipping, even though it’s backed up I’d rather not have them both with me. The boot drive is an m.2 ssd, so I think I’ll encrypt the zfs volumes, and extract the boot drive as well after scrubbing the volume key and any credentials from it. Then one hdd and the boot ssd travel with me, while the chassis and the other hdd (sans decryption key) go by sea.

nothing I’ve found supports USB passthrough

You don’t necessarily need USB passthrough. I’m not directly familiar with Mac-specific hypervisor options, but e.g. on Linux or in Hyper-V you wouldn’t necessarily pass the entire USB device through; you’d just pass the storage device through. In other words, it’s “just a disk”, passed through like any other disk (and it’s just on you to make sure that USB drive is plugged in before you start the VM and stays plugged in while the VM is running).

If that’s not entirely clear: again using Linux as an example, you can pass through actual physical USB ports (and therefore whatever device is plugged into that port). But if your USB drive is already plugged in, it already has a device name as a disk, e.g. /dev/sdf or what have you, and you can just pass /dev/sdf in, rather than doing USB passthrough.
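(To make that concrete, a minimal sketch on Linux; ubuntu.qcow2 and /dev/sdf are placeholder names, and the guest simply sees a second virtio disk:)

# boot an existing guest image, handing it the whole USB disk as a raw virtio drive
qemu-system-x86_64 \
  -enable-kvm -m 4G \
  -drive file=ubuntu.qcow2,if=virtio \
  -drive file=/dev/sdf,format=raw,if=virtio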

1 Like

you wouldn’t necessarily pass the entire USB device through; you’d just pass the storage device through.

Yeah, that’s true, although I haven’t found any with the option to pass a raw block device either. It seems that functionality is still in beta in Apple’s Virtualization framework, whereas USB passthrough isn’t available at all: VZDiskBlockDeviceStorageDeviceAttachment | Apple Developer Documentation

It’s probably possible with QEMU, though; I just haven’t found the right frontend - I’d like to avoid crafting a config by hand if possible.

QEMU can absolutely pass through raw block devices, though I have no idea what the state of QEMU/KVM is on a Mac.

Thank you for all the replies. I thought I would update, as the NAS was shipped today, so the main part is done. Honestly, I cannot tell you how glad I am that I migrated from BTRFS.

Firstly, I encrypted all the data. This was done as follows (for each dataset):

zfs snapshot pool1/photos@20231008-encrypt
zfs send pool1/photos@20231008-encrypt |
	zfs receive \
		-o encryption=on \
		-o keyformat=raw \
		-o keylocation=file:///etc/pool1.keyfile \
		pool1/photos-encrypted

I made a copy of the encryption key on my laptop (which is of course itself encrypted). For some datasets (not photos), I took the opportunity to enable compression with -o compression=zstd (you can set the property at any time, but it only applies to future writes).
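(A note for anyone copying this: keyformat=raw expects exactly 32 bytes of key material, and the keyfile has to exist before the receive. Generating one is a one-liner along these lines:)

dd if=/dev/urandom of=/etc/pool1.keyfile bs=32 count=1
chmod 600 /etc/pool1.keyfile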

In my paranoia I then checked the copy with rsync:

rsync -aic /pool1/photos/ /pool1/photos-encrypted/ --dry-run

-a you’re probably familiar with. -c means compare checksums rather than relying on date/time/size, which is obviously much slower, but that’s the point here. -i prints a summary of changes - no output means no files would be copied, i.e. the checksums match for all files.

Then I moved the new encrypted dataset into place over the old one:

zfs destroy -r pool1/photos
zfs rename pool1/photos-encrypted pool1/photos

All told, the encryption/compression and verification took me about a week of evenings, just because each step took multiple hours depending on the size of the dataset.

For data access while the NAS is in transit, I have not found an elegant solution for accessing the data on my Mac. I know it could be done, but it’s going to be clunky, so I’ve also rsync’d the essential data to a 2 TB SSD in a USB enclosure (encrypted with APFS) and will work from that. This is the first time in a while that I’ve wished I was still on Linux, but at all other times I’m very happy on macOS.
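(The copy itself was nothing fancy: with the SSD hanging off the Mac, the data just comes over the network, along these lines - the hostname and volume name here are placeholders:)

rsync -aP me@nas:/pool1/photos/ /Volumes/Transit/photos/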

But despite not having a plan for access, I did split the pool, simply because I want a copy with me in case the shipment gets lost, damaged, or even delayed in transit. Maybe I’ll try a Raspberry Pi 5 mini-NAS in the intervening months.

After the scrub had finished, I disabled anything that might write to a dataset:

systemctl disable --now smbd
systemctl disable --now duplicati
systemctl disable --now syncthing

Then split the pool:

zpool split pool1 pool1-portable

I didn’t specify the disk, as I didn’t really care which one became the portable one. The pool devices were specified by WWN, which is printed on the disk label, so it was easy to identify which was which.

ZFS didn’t automatically import the new pool (it wasn’t visible in zpool status), but zpool import with no other arguments lists the pools available for import.

I then imported it to inspect it (zpool import pool1-portable), and sure enough, I had two pools. It automatically started a scrub, but I had to cancel it (zpool scrub -s pool1-portable), as I didn’t have another 12 hours to let it run (I think this is OK…).

Then I simply did zpool export pool1-portable, deleted the keyfile, and ran fstrim on the root filesystem before powering down. One thing that almost tripped me up was etckeeper - that had to be removed, as it had preserved the keyfile in its daily commit.

My original plan was to pull the boot SSD, but it’s under the motherboard and that seemed like a faff, so I settled for removing the key. It’s not foolproof, but I’m not likely to be the target of a nation state, so if someone is going to go to the effort of extracting it from a trimmed SSD, then more power to them. There are other credentials, such as an SSH key for pulling backups from a web server and of course an API key for accessing the backup buckets, but those are easily rotated - I’ll regenerate them at the other end.

On that note, the plan for arrival is basically what Jim described. After copying the keyfile back:

zpool import  # should show pool1-portable
zpool import pool1-portable
zpool scrub pool1
# if successful:
zpool destroy pool1-portable
zpool attach pool1 wwn-0x5000c500db7d8492 wwn-0x5000c500db804c23

This of course assumes that I won’t change the data on the portable drive. If I do, the scrub, destroy, and attach would all be on the opposite pool (followed by importing it under the original name), as sketched below.
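(Spelled out, and assuming the same disk mapping as above, that opposite case would be something like:)

zpool scrub pool1-portable
# if successful:
zpool destroy pool1
zpool export pool1-portable
zpool import pool1-portable pool1
zpool attach pool1 wwn-0x5000c500db804c23 wwn-0x5000c500db7d8492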

Thanks for the help, everyone. I may post another update at the other end when it’s all running again, if anyone is interested.

2 Likes