I have a raidz1 pool with 3 disks. I powered down the server for some physical maintenance unrelated to the disks, and when it came back up the pool was missing. “zpool import” shows:
   pool: platters
     id: 9995614438679844999
  state: FAULTED
 status: One or more devices contains corrupted data.
 action: The pool cannot be imported due to damaged devices or data.
         The pool may be active on another system, but can be imported using
         the '-f' flag.
    see: http://zfsonlinux.org/msg/ZFS-8000-5E
 config:

        platters                               FAULTED  corrupted data
          raidz1-0                             FAULTED  corrupted data
            ata-WDC_WD80EDAZ-11TA3A0_VGG3KDVG  UNAVAIL
            ata-WDC_WD80EDAZ-11TA3A0_VGGEMKNG  ONLINE
            sdd                                ONLINE
The UNAVAIL disk (VGG3KDVG) appears to be physically dead (it does not enumerate on the bus). Attempting “zpool import -f platters” gives me:
internal error: Invalid exchange
Aborted
Checking this against the dbgmsg log:
1689207278 spa.c:6242:spa_tryimport(): spa_tryimport: importing platters
1689207278 spa_misc.c:418:spa_load_note(): spa_load($import, config trusted): LOADING
1689207279 vdev.c:152:vdev_dbgmsg(): disk vdev '/dev/disk/by-id/ata-WDC_WD80EDAZ-11TA3A0_VGG3KDVG-part1': open error=2 timeout=1000000881/1000000000
1689207279 vdev.c:152:vdev_dbgmsg(): disk vdev '/dev/sdd1': best uberblock found for spa $import. txg 4332344
1689207279 spa_misc.c:418:spa_load_note(): spa_load($import, config untrusted): using uberblock with txg=4332344
1689207280 vdev.c:152:vdev_dbgmsg(): disk vdev '/dev/disk/by-id/ata-WDC_WD80EDAZ-11TA3A0_VGG3KDVG-part1': open error=2 timeout=1000001090/1000000000
1689207280 vdev.c:155:vdev_dbgmsg(): raidz-0 vdev (guid 17603172039307171397): unable to read the metaslab array [error=52]
1689207280 vdev.c:155:vdev_dbgmsg(): raidz-0 vdev (guid 17603172039307171397): vdev_load: metaslab_init failed [error=52]
1689207280 spa_misc.c:403:spa_load_failed(): spa_load($import, config trusted): FAILED: vdev_load failed [error=52]
1689207280 spa_misc.c:418:spa_load_note(): spa_load($import, config trusted): UNLOADING
1689207284 spa.c:6242:spa_tryimport(): spa_tryimport: importing platters
1689207284 spa_misc.c:418:spa_load_note(): spa_load($import, config trusted): LOADING
1689207285 vdev.c:152:vdev_dbgmsg(): disk vdev '/dev/disk/by-id/ata-WDC_WD80EDAZ-11TA3A0_VGG3KDVG-part1': open error=2 timeout=1000000144/1000000000
1689207285 vdev.c:152:vdev_dbgmsg(): disk vdev '/dev/sdd1': best uberblock found for spa $import. txg 4332344
1689207285 spa_misc.c:418:spa_load_note(): spa_load($import, config untrusted): using uberblock with txg=4332344
1689207286 vdev.c:152:vdev_dbgmsg(): disk vdev '/dev/disk/by-id/ata-WDC_WD80EDAZ-11TA3A0_VGG3KDVG-part1': open error=2 timeout=1000631147/1000000000
1689207286 vdev.c:155:vdev_dbgmsg(): raidz-0 vdev (guid 17603172039307171397): unable to read the metaslab array [error=52]
1689207286 vdev.c:155:vdev_dbgmsg(): raidz-0 vdev (guid 17603172039307171397): vdev_load: metaslab_init failed [error=52]
1689207286 spa_misc.c:403:spa_load_failed(): spa_load($import, config trusted): FAILED: vdev_load failed [error=52]
1689207286 spa_misc.c:418:spa_load_note(): spa_load($import, config trusted): UNLOADING
1689207286 spa.c:6098:spa_import(): spa_import: importing platters
1689207286 spa_misc.c:418:spa_load_note(): spa_load(platters, config trusted): LOADING
1689207287 vdev.c:152:vdev_dbgmsg(): disk vdev '/dev/disk/by-id/ata-WDC_WD80EDAZ-11TA3A0_VGG3KDVG-part1': open error=2 timeout=1000000201/1000000000
1689207287 vdev.c:152:vdev_dbgmsg(): disk vdev '/dev/sdd1': best uberblock found for spa platters. txg 4332344
1689207287 spa_misc.c:418:spa_load_note(): spa_load(platters, config untrusted): using uberblock with txg=4332344
1689207288 vdev.c:152:vdev_dbgmsg(): disk vdev '/dev/disk/by-id/ata-WDC_WD80EDAZ-11TA3A0_VGG3KDVG-part1': open error=2 timeout=1000000921/1000000000
1689207288 vdev.c:155:vdev_dbgmsg(): raidz-0 vdev (guid 17603172039307171397): unable to read the metaslab array [error=52]
1689207288 vdev.c:155:vdev_dbgmsg(): raidz-0 vdev (guid 17603172039307171397): vdev_load: metaslab_init failed [error=52]
1689207288 spa_misc.c:403:spa_load_failed(): spa_load(platters, config trusted): FAILED: vdev_load failed [error=52]
1689207288 spa_misc.c:418:spa_load_note(): spa_load(platters, config trusted): UNLOADING
1689207288 spa_misc.c:418:spa_load_note(): spa_load(platters, config trusted): spa_load_retry: rewind, max txg: 4332343
1689207288 spa_misc.c:418:spa_load_note(): spa_load(platters, config trusted): LOADING
1689207289 vdev.c:152:vdev_dbgmsg(): disk vdev '/dev/disk/by-id/ata-WDC_WD80EDAZ-11TA3A0_VGG3KDVG-part1': open error=2 timeout=1000000796/1000000000
1689207289 vdev.c:152:vdev_dbgmsg(): disk vdev '/dev/sdd1': best uberblock found for spa platters. txg 4332331
1689207289 spa_misc.c:418:spa_load_note(): spa_load(platters, config untrusted): using uberblock with txg=4332331
1689207290 vdev.c:152:vdev_dbgmsg(): disk vdev '/dev/disk/by-id/ata-WDC_WD80EDAZ-11TA3A0_VGG3KDVG-part1': open error=2 timeout=1000000408/1000000000
1689207290 vdev.c:152:vdev_dbgmsg(): disk vdev '/dev/disk/by-id/ata-WDC_WD80EDAZ-11TA3A0_VGGEMKNG-part1': vdev_load: vdev_dtl_load failed [error=52]
1689207290 spa_misc.c:403:spa_load_failed(): spa_load(platters, config trusted): FAILED: vdev_load failed [error=52]
1689207290 spa_misc.c:418:spa_load_note(): spa_load(platters, config trusted): UNLOADING
This seems to suggest that it reads the config/label from one of the good drives and learns that this is a 3-disk array with <dead drive> as a member. Then it tries to open <dead drive> and hits a timeout.
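To separate the different failures in that log, here is a minimal sketch that counts the distinct `error=N` codes. The helper name `count_errors` is made up for illustration, and the log path in the usage comment assumes the usual ZFS on Linux location. On Linux, errno 2 is ENOENT and errno 52 is EBADE, whose message is "Invalid exchange" (the same text printed by the failed import); as far as I can tell, ZFS on Linux reuses EBADE for its internal checksum-error code.

```shell
#!/bin/sh
# count_errors: hypothetical helper. Reads dbgmsg-style text on stdin and
# prints each distinct "error=N" code followed by how often it occurs.
count_errors() {
    grep -o 'error=[0-9]*' | sort | uniq -c | awk '{print $2, $1}'
}

# Usage (assumed log location on ZFS on Linux):
#   count_errors < /proc/spl/kstat/zfs/dbgmsg
```

In the log above this would separate the open failures on the missing disk (error=2) from the load failures on the raidz vdev and on VGGEMKNG (error=52).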
Any other zpool command that references the pool fails with “no such pool”, e.g.:
zpool offline platters ata-WDC_WD80EDAZ-11TA3A0_VGG3KDVG
cannot open 'platters': no such pool
All the instructions I can find for replacing a disk seem to assume that the pool is online and working. I cannot import this pool, so I cannot “offline” the disk and replace it. So, how do I tell the remaining two disks to give up on the third?
I do have a physical replacement disk available, so I am open to either:
-Replace the failed drive and resilver the pool.
-Bring up the 2 remaining drives by themselves as read-only (this should be possible with raidz1, no?), transfer the data to a backup, and then build a new pool.
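For reference, a rough sketch of what the second option might look like. `-f`, `-F`, `-n`, `-R` and `-o readonly=on` are documented `zpool import` options; the `/mnt/recovery` altroot, the `run` wrapper and the function name are assumptions for illustration. By default the script only prints the commands so they can be reviewed first, and given the error=52 entries in the log there is no guarantee either command succeeds.

```shell
#!/bin/sh
# Hypothetical helper: prints the import commands instead of running them
# unless RUN=1 is set in the environment.
POOL=platters
ALTROOT=/mnt/recovery   # assumed mount point, not part of the original setup

run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

try_readonly_import() {
    # -n together with -F only reports whether a rewind import could work;
    # it does not perform the recovery.
    run zpool import -f -F -n "$POOL"
    # Read-only import under an alternate root, so nothing is written
    # to the pool while data is copied off.
    run zpool import -f -o readonly=on -R "$ALTROOT" "$POOL"
}
```

Calling `try_readonly_import` prints the two commands; setting `RUN=1` executes them for real.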
zdb’s output:
WARNING: ignoring tunable zfs_arc_max (using 4038935552 instead)
platters:
    version: 5000
    name: 'platters'
    state: 0
    txg: 4331450
    pool_guid: 9995614438679844999
    errata: 0
    hostid: 2088271309
    hostname: 'htpc'
    com.delphix:has_per_vdev_zaps
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        guid: 9995614438679844999
        create_txg: 4
        children[0]:
            type: 'raidz'
            id: 0
            guid: 17603172039307171397
            nparity: 1
            metaslab_array: 64
            metaslab_shift: 34
            ashift: 12
            asize: 24004646141952
            is_log: 0
            create_txg: 4
            com.delphix:vdev_zap_top: 129
            children[0]:
                type: 'disk'
                id: 0
                guid: 13660960398252877464
                path: '/dev/disk/by-id/ata-WDC_WD80EDAZ-11TA3A0_VGG3KDVG-part1'
                devid: 'ata-WDC_WD80EDAZ-11TA3A0_VGG3KDVG-part1'
                phys_path: 'pci-0000:00:1f.2-ata-2'
                whole_disk: 1
                DTL: 1402
                create_txg: 4
                com.delphix:vdev_zap_leaf: 130
            children[1]:
                type: 'disk'
                id: 1
                guid: 9598384422309144048
                path: '/dev/disk/by-id/ata-WDC_WD80EDAZ-11TA3A0_VGGEMKNG-part1'
                devid: 'ata-WDC_WD80EDAZ-11TA3A0_VGGEMKNG-part1'
                phys_path: 'pci-0000:00:1f.2-ata-3'
                whole_disk: 1
                DTL: 1400
                create_txg: 4
                com.delphix:vdev_zap_leaf: 131
                faulted: 1
            children[2]:
                type: 'disk'
                id: 2
                guid: 15626403881568574750
                path: '/dev/sdd1'
                devid: 'ata-WDC_WD80EDAZ-11TA3A0_VGG2SPZG-part1'
                phys_path: 'pci-0000:00:1f.2-ata-4'
                whole_disk: 1
                DTL: 1399
                create_txg: 4
                com.delphix:vdev_zap_leaf: 132
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
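If the pool can ever be imported, the first option would presumably address the dead disk by its guid from the zdb output (children[0]), since its device path no longer exists. A minimal sketch: the helper name is made up, and the replacement disk path is a placeholder that has to be supplied. It only prints the command.

```shell
#!/bin/sh
# Hypothetical helper: builds the replace command for the missing disk.
POOL=platters
DEAD_GUID=13660960398252877464   # guid of children[0] in the zdb output

replace_cmd() {
    # $1: path of the replacement disk (placeholder, must be supplied)
    if [ -z "$1" ]; then
        echo "usage: replace_cmd /dev/disk/by-id/NEW-DISK" >&2
        return 1
    fi
    echo "zpool replace $POOL $DEAD_GUID $1"
}
```

This only applies after a successful read-write import; a read-only import cannot resilver.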