Thanks to @Halfwalker for sharing his excellent Ubuntu build script, I was finally able to figure out a local build of ZFSBootMenu that has the latest Debian-packaged ZFS and remote SSH access during boot. I thought I’d share my notes in case they help anyone else working on this sort of thing.
I did all of this in an Incus VM on my workstation, and would recommend NOT doing it directly on a production or personal machine: while working on this I made a physical Debian test system unbootable by installing the wrong dracut package, and there are a few other footguns in here too.
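For reference, a throwaway build VM like that can be created with something along these lines (the image alias and instance name are just examples):

incus launch images:debian/12 zbm-build --vm
incus exec zbm-build -- bash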
Also, the resulting images are not signed for Secure Boot, so Secure Boot will need to be disabled in the firmware settings of any machine you use them on. I also ran into a problem where the build VM couldn’t modprobe the zfs-dkms module because of Secure Boot, and had to disable Secure Boot for the build VM before I could run the build below.
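If you hit the same Secure Boot problem, mokutil can confirm the state from inside the VM, and for an Incus VM it can be switched off from the host (using the example zbm-build name from above):

apt install -y mokutil
mokutil --sb-state    # “SecureBoot enabled” means unsigned dkms modules won’t load

# on the host:
incus config set zbm-build security.secureboot=false
incus restart zbm-build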
First, get the latest ZFS (2.2.6) from bookworm-backports and build a ZBM EFI image with it, just to check we can get that far:
sudo su
# add bookworm-backports (main and contrib) to the APT sources:
echo 'deb http://deb.debian.org/debian bookworm-backports main contrib' >> /etc/apt/sources.list
apt update
apt install -y linux-headers-$(uname -r)
apt install -t bookworm-backports -y curl zfs-dkms zfsutils-linux
rm -rf /tmp/zfsbootmenu && mkdir -p /tmp/zfsbootmenu
curl -L https://get.zfsbootmenu.org/source | tar xz --strip=1 --directory /tmp/zfsbootmenu
cd /tmp/zfsbootmenu
make install
apt-get -qq --yes --no-install-recommends install libyaml-pp-perl libsort-versions-perl libboolean-perl mbuffer
apt install -y dracut-core kexec-tools fzf systemd-boot
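At this point a quick sanity check that the backports ZFS actually built and loads is worthwhile (this is where the Secure Boot modprobe failure mentioned above shows up):

modprobe zfs     # fails if Secure Boot is enabled, since the dkms module is unsigned
zfs version      # should report 2.2.6 for both the userland tools and the kernel module
dkms status      # the zfs module should show as installed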
Add the following to /etc/zfsbootmenu/config.yaml
Components:
  Enabled: false
  ImageDir: /root
  Versions: 3
EFI:
  Enabled: true
  ImageDir: /root
  Versions: false
Global:
  BootMountPoint: /boot/efi
  DracutConfDir: /etc/zfsbootmenu/dracut.conf.d
  InitCPIOConfig: /etc/zfsbootmenu/mkinitcpio.conf
  ManageImages: true
  PostHooksDir: /etc/zfsbootmenu/generate-zbm.post.d
  PreHooksDir: /etc/zfsbootmenu/generate-zbm.pre.d
Kernel:
  CommandLine: ro quiet loglevel=7
then run:
generate-zbm --enable
generate-zbm --no-initcpio --debug
The resulting image, /root/vmlinuz.EFI, will have the latest backports ZFS version, but no SSH capability.
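When you eventually want to boot one of these images on a real machine, it just needs to be copied onto the ESP with a matching boot entry. A minimal sketch, where the disk, partition number and paths are assumptions for a typical setup:

mkdir -p /boot/efi/EFI/ZBM
cp /root/vmlinuz.EFI /boot/efi/EFI/ZBM/VMLINUZ.EFI
efibootmgr -c -d /dev/sda -p 1 -L "ZFSBootMenu" -l '\EFI\ZBM\VMLINUZ.EFI'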
Then, add in dropbear SSH support and build the image again. This uses a workaround from here to fix a recent dependency problem with dracut-network:
apt install dropbear-bin dracut-network openssh-server
rm -rf /tmp/dracut-crypt-ssh && mkdir -p /tmp/dracut-crypt-ssh
cd /tmp/dracut-crypt-ssh && curl -L https://github.com/dracut-crypt-ssh/dracut-crypt-ssh/tarball/master | tar xz --strip=1
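# workaround: comment out every 'inst "$moddir/...' line in module-setup.sh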
sed -i '/inst \"\$moddir/s/^\(.*\)$/#&/' /tmp/dracut-crypt-ssh/modules/60crypt-ssh/module-setup.sh
cp -r /tmp/dracut-crypt-ssh/modules/60crypt-ssh /usr/lib/dracut/modules.d
echo 'install_items+=" /etc/cmdline.d/dracut-network.conf "' > /etc/zfsbootmenu/dracut.conf.d/dropbear.conf
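# client keypair; the public half becomes dropbear's authorized key (dropbear_acl below)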
ssh-keygen -t ed25519 -f /root/.ssh/remote-zbm
echo 'add_dracutmodules+=" network-legacy "' >> /etc/zfsbootmenu/dracut.conf.d/dropbear.conf
echo 'dropbear_acl=/root/.ssh/remote-zbm.pub' >> /etc/zfsbootmenu/dracut.conf.d/dropbear.conf
mkdir -p /etc/cmdline.d
echo 'ip=dhcp rd.neednet=1' > /etc/cmdline.d/dracut-network.conf
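Since the first echo above truncates dropbear.conf (>) and the later ones append (>>), it’s worth double-checking the generated files before rebuilding:

cat /etc/zfsbootmenu/dracut.conf.d/dropbear.conf   # should show the three lines written above
cat /etc/cmdline.d/dracut-network.conf             # should show: ip=dhcp rd.neednet=1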
generate-zbm --no-initcpio --debug
It generates the same EFI file as last time. This image works as a normal console-based ZBM, but also starts an SSH server on a DHCP address (you’ll either need to create a static lease for it or look up the assigned address on your router). It can be accessed as root on port 222 using the SSH key that was generated:
ssh -i /root/.ssh/remote-zbm -p 222 root@10.4.6.108
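(Client-side aside: an entry in ~/.ssh/config saves retyping the options, and since the dropbear host key appears to be generated when the image is built, it changes on rebuilds and ssh will complain about it; relaxing host-key checking for just this host works around that. The alias and address are examples.)

Host zbm
    HostName 10.4.6.108
    Port 222
    User root
    IdentityFile /root/.ssh/remote-zbm
    UserKnownHostsFile /dev/null
    StrictHostKeyChecking no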
You then get a shell prompt
zfsbootmenu ~ >
and you have to type “zfsbootmenu” and press Enter to get the actual ZBM:
zfsbootmenu ~ > zfsbootmenu
From there it’s a regular ZBM, except that when you select an environment and boot it, the SSH session drops out and hangs when kexec hands off to the new kernel. That means that if the new kernel has any problems importing the pool (e.g. because it’s an old snapshot with an old ZFS version that doesn’t support the subsequently-upgraded pool), you won’t see the failure messages that would normally appear on the console; the new kernel can’t log them because it can’t open the pool, and I haven’t found a way to see them from the ZBM logs. I’d love to know how to fix this if anyone has ideas.
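One partial pre-flight check that seems possible from the SSH session before booting: since ZBM imports the pools itself, you can list the pool’s active features and compare them against what the boot environment’s older ZFS claims to support (pool name zroot is a placeholder):

zpool get all zroot | grep 'feature@' | grep -w active   # features the BE’s ZFS must understand
zpool upgrade -v                                         # feature list this ZBM build supports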