Long-time lurker here. I thought I'd chime in because I just did that transition for my main server in the home lab.
Background: the server (running Debian) has been serving ZFS datasets for 10 years now (mainboard and pool drives replaced at some point). No problems ever. A drive in one of the pools (raidz2 at the time) died and was replaced, no sweat. The OS had always been on ext4 (Samsung SATA SSD).
Then, a couple of weeks ago, in a separate machine, a Samsung NVMe SSD died out of the blue. It was one of two drives in a mirror. That scared me enough to replace the Samsung NVMe drives with Intel ones. In the meantime, smartctl showed 90% health remaining on the boot drive of the main server. I couldn't tell how many hours of remaining life (or MB written) that would translate to, but it provided the necessary kick to change the boot drive(s).
For learning, I used a spare machine to install a minimal Debian system on a mirror. The instructions on ZFSBootMenu are good; it really helped me to go through them a couple of times (first with a single-drive ZFS boot, then with a mirror). It also helped to read up on efibootmgr. This here helped:
I suppose you can do some of this learning in VMs, but for me there was quite a bit of hassle getting some of my boards (Supermicro X10) to boot UEFI, so practicing on real hardware proved helpful (though painful).
For the actual migration, on the main server, I saved the list of installed packages:
sudo dpkg --get-selections > pkg_list
I also copied all the config files to the data ZFS pool so they would be available later.
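Something along these lines (the pool/dataset path is just an example, use whatever fits your layout):

sudo rsync -aAX /etc/ /tank/backup/etc-oldserver/

rsync with -aAX preserves permissions, ACLs and xattrs, which matters for things like Samba shares and NFS exports.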
On the new machine, I prepared for the package installation like this:
avail=$(mktemp)
sudo apt-cache dumpavail > "$avail"
sudo dpkg --merge-avail "$avail"
rm -f "$avail"
Then updated the dpkg selections:
sudo dpkg --set-selections < pkg_list
and asked apt to install the selected packages:
sudo apt-get dselect-upgrade
Note: you'll likely have to manually curate the list. In my case there were some old Linux header packages and other leftovers from the previous install that weren't available (or needed) anymore; see the sketch below for one way to weed them out.
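A rough sketch of how you might find and drop them (the package name pattern is just an example, and the first command may need tweaking if you use multi-arch packages):

comm -23 <(awk '{print $1}' pkg_list | sort) <(apt-cache pkgnames | sort)
grep -v 'linux-headers' pkg_list > pkg_list.curated

The first command lists selections that no longer exist in the new system's package archive; the second drops stale header packages before feeding the file back into dpkg --set-selections.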
After this, you have a system with all packages from the previous system installed.
I created a USB stick for ZFSBootMenu. Then I powered off the server, removed the previous boot drive (and stuck it into a hot-swap USB enclosure so I could mount it on /tmp if needed), installed the new mirror SSDs (set up on the test machine), booted the machine from the ZBM USB stick, and from there booted into the OS on the new mirror. Then I ran efibootmgr to add an EFI boot entry for the new mirror, ran it again to set the correct boot order, rebooted, and was pleasantly surprised that it actually booted.
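For reference, the efibootmgr calls looked roughly like this (disk, partition number, label and loader path are examples; adjust them to your ESP layout, and repeat the -c step for each mirror drive you put an ESP on):

sudo efibootmgr -c -d /dev/sda -p 1 -L "ZFSBootMenu" -l '\EFI\ZBM\VMLINUZ.EFI'
sudo efibootmgr -o 0003,0001,0000

-c creates a new boot entry on the given disk/partition, -L sets its label, -l points at the EFI executable, and -o sets the boot order (the entry numbers come from the plain efibootmgr listing).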
From there, all I had to do was copy over the config files for the various services. All in all, it took me less than an hour to get the server back up and running, including NFS, Samba and a few VMs.
In summary, it's a fair amount of work, and it definitely takes some practice to wrap your head around the boot manager.
For added safety, I installed a boot partition with ZFSBootMenu on each of the two mirror drives. The earlier experience with a failed drive was enough incentive to have a mirrored boot pool, so I can hopefully replace a single drive before catastrophic failure. Right now I'm doing test runs with sanoid and syncoid to back up the snapshots of the main server to a backup server.
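Roughly what I'm testing (the dataset names, retention numbers and the backup host are placeholders): a sanoid.conf stanza like

[zroot/data]
        use_template = production
[template_production]
        hourly = 24
        daily = 7
        monthly = 3
        autosnap = yes
        autoprune = yes

and then a recursive syncoid run to push the snapshots across:

syncoid -r zroot/data backuphost:backup/mainserver/data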
Sorry for the super long post, hoping it proves helpful.