So I got two new SSDs to put in front of my spinning rust as a special vdev for metadata and small blocks.
Then I noticed this one drive in zpool status:
NAME                               STATE   READ WRITE CKSUM
ata-HGST_HUH728080ALN600_ABCDWXYZ  ONLINE    26     0    12
and not long after:
NAME                               STATE   READ WRITE CKSUM
ata-HGST_HUH728080ALN600_ABCDWXYZ  ONLINE   117     1    71
I asked around over at STH and dug around myself some more.
SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure        90%            14067  1468064631
# 2  Short offline       Completed: read failure        90%            14067  1468064631
So I created a partition starting just before sector 1468064631 and let this run for an hour or so:
# badblocks -b 4096 -c 1024 -s /dev/disk/by-id/ata-HGST_HUH728080ALN600_VLJVKBYY-part1
Checking for bad blocks (read-only test): 9 0.00% done, 0:04 elapsed. (0/0/0 errors)
10 0.00% done, 0:07 elapsed. (1/0/0 errors)
11 0.00% done, 0:09 elapsed. (2/0/0 errors)
12 0.00% done, 0:11 elapsed. (3/0/0 errors)
13 0.00% done, 0:13 elapsed. (4/0/0 errors)
Nothing but errors, so I cancelled it.
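For anyone wanting to do the same, placing the partition is just a bit of arithmetic on the LBA from the self-test log. Here is a rough Python sketch of it. It assumes 4096-byte logical sectors (check what your own drive reports), and both the function name and the 1 GiB safety margin are made up for illustration; this is not necessarily how I picked my exact start sector.

```python
SECTOR_SIZE = 4096   # assumption: 4K logical sectors; use 512 if your drive reports that
MIB = 1024 * 1024


def partition_start_before(lba, margin_sectors=262144, sector_size=SECTOR_SIZE):
    """Return a start sector some margin before `lba`, rounded down to a 1 MiB boundary.

    The default margin (262144 sectors, 1 GiB at 4096 bytes per sector) is an
    arbitrary illustrative choice.
    """
    sectors_per_mib = MIB // sector_size
    start = max(lba - margin_sectors, 0)
    # Round down so the partition stays aligned to 1 MiB.
    return start - (start % sectors_per_mib)


first_error_lba = 1468064631  # LBA_of_first_error from the self-test log
start = partition_start_before(first_error_lba)
print(f"partition start sector: {start}")
print(f"byte offset of first bad LBA: {first_error_lba * SECTOR_SIZE}")
```

The resulting start sector can then be handed to whatever partitioning tool you prefer.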
Also this:
7 Seek_Error_Rate 0x000b 066 066 067 Pre-fail Always FAILING_NOW 1966194
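As far as I understand the smartctl output, FAILING_NOW simply means the normalized VALUE has dropped to or below THRESH (066 against 067 here). A minimal Python sketch of that check, assuming the usual ten-column attribute row that `smartctl -A` prints; the function names are my own and purely illustrative:

```python
def parse_smart_attribute(line):
    """Split one attribute row from `smartctl -A` into a dict (assumes 10 columns)."""
    # RAW_VALUE can contain spaces, so split at most 9 times and keep it whole.
    fields = line.split(None, 9)
    return {
        "id": int(fields[0]),
        "name": fields[1],
        "value": int(fields[3]),
        "worst": int(fields[4]),
        "thresh": int(fields[5]),
        "type": fields[6],
        "when_failed": fields[8],
        "raw": fields[9],
    }


def is_failing(attr):
    """True when the normalized value is at or below the vendor threshold."""
    return attr["value"] <= attr["thresh"]


row = "7 Seek_Error_Rate 0x000b 066 066 067 Pre-fail Always FAILING_NOW 1966194"
attr = parse_smart_attribute(row)
print(attr["name"], "failing:", is_failing(attr))
```

Note that the raw value (1966194) plays no part in the check; only the normalized value and the threshold do.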
So yeah, it’s dead. Stuff happens and that’s okay, except I didn’t have an 8TB spare …
I have some 4TB disks tucked away somewhere, which together are just about big enough to hold all the data in a raidz1.
Threw those in, syncoid -r ...
That was last night …
While playing around with the new SSDs, somewhat disappointed with the performance I was seeing and wondering what might be causing it … I got an alarm from one of the 4TB disks:
ID# ATTRIBUTE_NAME          FLAG    VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED  RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b  083   083   016    Pre-fail  Always   -            5636444
  5 Reallocated_Sector_Ct   0x0033  100   100   005    Pre-fail  Always   -            3
196 Reallocated_Event_Count 0x0032  100   100   000    Old_age   Always   -            3
Along with a bunch of ATA errors in the SMART error log.
So they are old (> 45000 hours), but really, did it have to happen today?
I had only just finished syncing one pool, and now I have to do it all over again.
I jumped the gun and ordered a box of SSDs to get rid of this rust once and for all.