Currently running OpenZFS 2.3 with Arch Linux and the LTS kernel. Native encryption is in use.
Starting this morning, I’m having a ‘null’ pool issue: ZFSBootMenu is able to see the pool and reports it as healthy, but when I try to boot, it states that the pool needs to be recreated.
Imported the pool at /mnt and loaded keys with zfs load-key -a -L prompt to force a passphrase prompt instead of the file-based keys used for ZFS encryption.
Regenerated initrd by force reinstalling linux-lts.
After that, during boot it crashed, stating that this pool was previously imported and used by “archiso”.
Exported the pool from the ZFSBootMenu recovery shell, and used the FORCE flag while re-importing the pool.
System booted.
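For anyone who hits the same thing, here is a rough sketch of the sequence, not a definitive recipe; rpool and rpool/ROOT/arch are placeholders for your pool and root dataset, and it assumes the kernel and initrd live on the pool, as is typical with ZBM:

```
# From an Arch live ISO (or the ZBM recovery shell):
zpool import -f -R /mnt rpool        # import with an altroot of /mnt
zfs load-key -a -L prompt            # force a passphrase prompt instead of file keys
zfs mount rpool/ROOT/arch            # mount the root dataset (placeholder name)
arch-chroot /mnt pacman -S linux-lts # force-reinstall the kernel to regenerate the initrd
zpool export rpool                   # unmount everything and drop the “in use by archiso” claim
zpool import -f -N rpool             # force re-import without mounting (done from the ZBM recovery shell before booting)
```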
Really weird, but it worked. It must have been something else, since the kernel had not been updated in a while and I even tried to boot a snapshot from 6 days ago; the last maintenance on this pool was the automatic weekly scrub, which still indicates 0B repaired…
I’m glad you got it sorted. I had a similar problem with an Ubuntu installation that just suddenly refused to boot, despite the pool being fine, and I never could get it running again.
Personally, I stopped using zfsbootmenu after that–it’s very nice to have a proper ZFS boot environment, but it’s easier for me to manage boot-and-root off of mdraid1, and leave ZFS for the after-the-system-is-booted data!
I suspect there is something going on with the cachefile here, like a bug, but I have no idea how to reproduce it again and maybe report it to OpenZFS.
The scrub was running while the laptop was under some load (nothing excessive, just gaming during the scrub, so some extra CPU and GPU load). The weekly scrub finished flawlessly, and the next day when I tried to boot, I was greeted with this null pool.
I’m studying the possibility of not relying on the cachefile anymore for this specific laptop setup: setting cachefile=none, disabling zfs-import-cache.service, and enabling zfs-import-scan.service to force a scan-based import of any pools on boot. A rough sketch of that change is below.
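Roughly, that would look like this (rpool is a placeholder; on Arch, the initrd hook may also embed its own copy of zpool.cache, so rebuilding the initrds afterwards is probably wise):

```
zpool set cachefile=none rpool              # stop maintaining /etc/zfs/zpool.cache for this pool
rm -f /etc/zfs/zpool.cache                  # drop the stale cachefile, if one is left behind
systemctl disable zfs-import-cache.service  # don't import from the cachefile at boot
systemctl enable zfs-import-scan.service    # scan devices and import pools at boot instead
mkinitcpio -P                               # rebuild initrds so no stale cachefile is embedded
```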
Welp…just ended my zfs-boot-menu rabbit hole. If Jim Salter stops using zfs-something, it’s not for me.
FWIW, I’m sitting on unRAID 6.11 w/ the ZFS plugin and reconsidering moving to 7.2. I really only use unRAID for Docker, and not even their CA…it was a good ramp into Linux though, so I’m thinking Proxmox or just Ubuntu Server with Docker.
I am definitely not trying to cancel ZBM here. Just because it isn’t currently the best choice for me personally does not mean it’s a bad choice for everybody.
ZBM is easily the best option I’m aware of for a proper ZFS boot environment under Linux, and I want to see the project prosper. It’s currently struggling in essentially the same way ZFS itself has traditionally suffered in Linux–it’s not directly supported by any distro I’m aware of, so it’s awkward to get it going by comparison with things directly supported in mainstream installers.
I personally have pretty specialized needs, and I depend very little on my root filesystem or host operating systems, because I segregate my important workloads into virtual machines, and am careful not to “nest” heavily into customizations in my desktop environments–I use as default an environment as possible, apps straight out of repos, and so forth.
If I hadn’t already carefully pruned my own dependencies on bare metal environments down to essentially nil, I’d be willing to put up with an awful lot more to get a proper ZFS boot environment–and to dig in and learn and document techniques to directly rescue a failed environment.
If you haven’t pruned your dependence on the bare metal operating system similarly, I definitely wouldn’t recommend writing off ZBM!
Definitely. On my laptop setup with self-deployed CAs and Secure Boot, ZFSBootMenu is way simpler than maintaining GRUB or generating UKIs, and fewer hooks are needed when upgrading the kernel, like adding new boot entries on every update.
Also, with UKIs outside the encrypted ZFS pool, they might not match the dataset selected as bootfs if I need to roll back a snapshot (kernel and module version mismatch), so keeping it simple is my best option here. Having the initrds and kernels inside the encrypted pool is also a nice bonus.
GRUB support for encrypted pools is still hit or miss, making ZBM the “necessary evil” here.
Rather than use a patched version of GRUB, I find it better to just use standard GRUB that loads an initramfs with the ZFS utilities built in. Any time the kernel and/or ZFS are updated, a new initramfs is built to match. No worrying about whether the bootloader is up to date with the latest version of ZFS. I suppose any bootloader that can load the initramfs can boot from any dataset.
Thank you so much for the detailed response. I’m mostly interested in learning (hence using the ZFS plugin on unRAID before it was “officially” supported). That and sanoid have saved my bacon so many times, thank you!
The reason ZBM interested me is that I often leveraged ZFS boot environments on pfSense. It makes experimenting, breaking, and fixing fun, and it speeds up learning.
Your distribution should have a package called zfs-initramfs or something similar. After install, you just have to update your initramfs, if it’s not done automatically.
Then add these options to your kernel options in grub: boot=zfs root=zfs=<your root dataset>
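On a Debian/Ubuntu-style system the whole thing looks roughly like this (the rpool/ROOT/ubuntu dataset name is just an example, adjust to your layout):

```
# Install ZFS support for the initramfs and rebuild it
apt install zfs-initramfs
update-initramfs -u -k all

# In /etc/default/grub, point the kernel at the root dataset:
#   GRUB_CMDLINE_LINUX="boot=zfs root=ZFS=rpool/ROOT/ubuntu"

# Regenerate the GRUB configuration
update-grub
```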
I have been using ZBM for a short while but just ran into some difficulty with KASLR. Would you say using BTRFS on root, with ZFS for data and for storing Docker containers/VMs, would be a sensible way of getting snapshots on root whilst keeping all the important stuff safe, or is BTRFS just too much of a mess? Is there a reason to use EXT4 over BTRFS for the OS partition?
I’m not trusting my root partition to btrfs at all. I don’t want to freely bash btrfs, because it has its place in the open-source OS space, but when I decided to go full ZFS, one of the reasons was that out of nowhere, when booting my laptop, btrfs started printing errno=-5 IO failure errors in an unrecoverable fashion. Volumes and sub-volumes would not even mount as read-only, and after extensive (2 days) research I decided to let go and format that drive.
btrfs “doesn’t need fsck” until you face some serious issue like this. What is the KASLR issue you are facing that is making you reconsider using ZFS as your root filesystem?
Good to know before I put it into production; yikes, that sounds rough. I installed Ubuntu within a Hyper-V VM following the ZBM instructions (Ubuntu (UEFI) — ZFSBootMenu 3.0.1 documentation) to the letter (I did the install over a little while rather than in one sitting), and I got “Invalid physical address chosen” and “Physical KASLR disabled: no suitable memory region” errors.
I haven’t been able to fix it yet; I have posted about it here:
I don’t have the issue when installing with EXT4 or BTRFS using the standard Ubuntu Server installer. The server isn’t in production yet, so I still have some time to make major changes.