ZFS Auto Replaced Failed Disk with Zeroes for READ WRITE CKSUM

Hello Everyone!

I have a pool that automatically replaced a failed disk. However, when I look back and try to identify why the disk was replaced, there is no information in any of the logs, even though a resilver email notification was sent and the pool resilvered successfully. Additionally, the READ, WRITE, and CKSUM counters are all zeroes, and I can still query the disk via SMART, meaning it is not completely dead.
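For reference, here is roughly where I looked for clues (a sketch; "tank" and /dev/sdX are placeholders for my pool and the old disk):

# Kernel-side event log; the fault and the auto-replace should show up
# here, if the event buffer has not been cleared by a reboot:
zpool events -v

# Timestamped record of commands run against the pool, including any
# automatic "zpool replace" issued by ZED:
zpool history tank

# ZED's own logging, assuming it runs as the zfs-zed systemd unit:
journalctl -u zfs-zed

# SMART health of the disk that was replaced:
smartctl -a /dev/sdX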

zfs version:
zfs-0.8.3-1ubuntu12.17
zfs-kmod-2.1.5-1ubuntu6~22.04.4

Any ideas why this may be happening or where I can get more information regarding the failure?

Thanks!

It may well be completely unrelated, but I see that you are running mismatched ZFS userspace tooling and kernel module versions (probably 20.04 LTS running the HWE kernel from 22.04?).

This can cause problems. You should consider either upgrading the userspace tools (zfsutils-linux), not running the HWE kernel, or upgrading to the latest LTS (again, skipping the HWE kernel if you can).
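Something along these lines, sketched for a 20.04 system running the 22.04 HWE kernel (metapackage names are from memory; double-check for your release):

# Confirm the mismatch between userspace and kernel module:
zfs version

# Option A: switch back to the GA (non-HWE) kernel so the module
# matches the 0.8.x userspace again:
sudo apt install linux-generic
# ...then boot the GA kernel and remove the linux-generic-hwe-20.04
# metapackage.

# Option B: upgrade to the next LTS so the userspace tools and the
# kernel module both come from the same release:
sudo do-release-upgrade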


Definitely running the HWE kernel on this system; currently investigating the upgrade options. Thanks!