I’m running a raidz1 (RAID5-like) pool with 4 × 2TB data SSDs on CentOS.
Around midnight, 2 of my data disks experienced I/O errors.
When I investigated in the morning, zpool status showed the following:
  state: SUSPENDED
 status: One or more devices are faulted in response to IO failures.
 action: Make sure the affected devices are connected, then run 'zpool clear'.
    see: http://zfsonlinux.org/msg/ZFS-8000-HC
   scan: resilvered 1.36T in 0 days 04:23:23 with 0 errors on Thu Apr 20 21:40:48 2023
 config:

	NAME        STATE     READ WRITE CKSUM
	zfs51       UNAVAIL      0     0     0  insufficient replicas
	  raidz1-0  UNAVAIL     36     0     0  insufficient replicas
	    sdc     FAULTED     57     0     0  too many errors
	    sdd     ONLINE       0     0     0
	    sde     UNAVAIL      0     0     0
	    sdf     ONLINE       0     0     0

 errors: List of errors unavailable: pool I/O is currently suspended
I tried running
zpool clear, but I keep getting the error message:
cannot clear errors for zfs51: I/O error
Subsequently, I tried rebooting to see if that would resolve it; however, the system had trouble shutting down.
As a result, I had to do a hard reset. When the system booted back up, the pool was not imported.
zpool import zfs51 now returns:
Destroy and re-create the pool from a backup source.
With -F, I get the same error. Strangely, when I run
zpool import -F (without the pool name), it shows the pool and all the disks as ONLINE:
   pool: zfs51
     id: 12204763083768531851
  state: ONLINE
 action: The pool can be imported using its name or numeric identifier.
 config:

	zfs51       ONLINE
	  raidz1-0  ONLINE
	    sdc     ONLINE
	    sdd     ONLINE
	    sde     ONLINE
	    sdf     ONLINE
However, when importing by the pool name, the same error appears.
I even tried using
-fF, but that doesn’t work either.
After scouring Google and reading up on various ZFS issues, I stumbled upon the
-X flag (which has solved similar issues for other users).
I went ahead and ran
zpool import -fFX zfs51, and the command seems to be taking a long time. I noticed high read activity on all 4 data disks, which I assume is due to ZFS reading through the entire pool. But after 7 hours, all read activity on the disks stopped.
I also noticed a ZFS kernel panic message:
kernel:PANIC: zfs: allocating allocated segment(offset=6859281825792 size=49152) of (offset=6859281825792 size=49152)
Currently, the command
zpool import -fFX zfs51 still seems to be running (the terminal has not returned the prompt to me), but there doesn’t seem to be any disk activity. Running
zpool status in another terminal hangs as well.
I’m not sure what to do at the moment - should I continue waiting (it has been almost 14 hours since I started the import command), or should I do another hard reset/reboot?
Also, I read that I can potentially import the pool as read-only (
zpool import -o readonly=on -f POOLNAME) and salvage the data - can anyone advise on that?
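For reference, the read-only salvage sequence I have in mind would look roughly like this - the mountpoint and backup destination below are just placeholders on my part, not paths from my actual system:

```shell
# Import read-only: -o readonly=on avoids writing anything to the disks,
# -F rewinds to the last importable transaction group, -f forces import
# after the unclean shutdown.
zpool import -o readonly=on -F -f zfs51

# If the import succeeds, copy the data off to separate storage
# (destination path is a placeholder).
rsync -aHAX /zfs51/ /mnt/backup/zfs51/

# Detach the pool cleanly afterwards.
zpool export zfs51
```

My understanding is that a read-only import skips log replay and never modifies the on-disk state, so it shouldn’t make things worse - please correct me if that’s wrong.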
I’m guessing both of my data disks somehow failed (at roughly the same time) - how likely is that, or could it be a ZFS issue?
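Once the system is responsive again, I was planning to check the two affected disks along these lines (assuming smartmontools is installed; sdc and sde are the two disks that showed errors):

```shell
# Overall SMART health self-assessment for the two disks that errored.
smartctl -H /dev/sdc
smartctl -H /dev/sde

# Full attribute dump - things like reallocated sectors, SSD wear
# indicators, and CRC error counts should show whether the drives
# themselves are dying.
smartctl -a /dev/sdc

# Kernel messages from around the time of the failure may distinguish
# a controller/cabling problem (which could hit both disks at once)
# from genuine drive failures.
journalctl -k --since "yesterday" | grep -iE 'sdc|sde|I/O error|ata'
```

Does that sound like a reasonable way to narrow down whether it’s the disks or something upstream (HBA/cable/power)?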