Recover from READ and CKSUM errors without RAID

G’day,

I have a very low-budget machine with a single SSD and a single HDD. The SSD runs a Nextcloud VM, which is replicated to the HDD as a backup using sanoid and syncoid. Unfortunately, the SSD is now reporting 5 READ and 5 CKSUM errors. The data stored on the machine is not mission critical at all; it is more annoying to replace than irreplaceable.

I was just wondering what you guys think the next step should be. Ignore the errors :face_with_peeking_eye:? Overwrite the live data with a backup (and if so, how would I know which snapshot on the HDD is fine)?

Thanks so much,
Cheers

I’d be inclined to test out the new (OpenZFS 2.2) corrective receive option of zfs receive, i.e. -c. I’ve never had occasion to try it, so I’m not sure of the procedure. Create a checkpoint on your SSD pool first, just in case, and only discard it once you’re sure. In theory, I’d guess you could just send the latest snapshot from the hard drive, but that could be big, so I’d check the timestamps on the files ZFS reports as corrupted and try the minimal snapshot range(s) that include the last writes to the affected files.
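To make that a bit more concrete, here is a rough, untested sketch of the sequence I have in mind. Every pool, dataset and snapshot name in it (ssdpool, hddpool/backup/nextcloud, example-snap) is a made-up placeholder, and the run helper is just something I wrote for illustration. It only prints the commands unless you set DRY_RUN=0. As far as I understand the man page, the stream has to be of the same snapshot that exists on the damaged side, and corrective receive can only heal data blocks present in the stream, not metadata.

```shell
#!/bin/sh
# Sketch only: all names below are hypothetical placeholders, adjust to your layout.
SSD_POOL="ssdpool"
SRC_SNAP="hddpool/backup/nextcloud@example-snap"   # healthy copy on the HDD
DST_SNAP="ssdpool/nextcloud@example-snap"          # same snapshot on the SSD
DRY_RUN="${DRY_RUN:-1}"                            # 1 = only print the commands

# Print a command line, and execute it only when DRY_RUN=0.
run() {
    echo "+ $1"
    if [ "$DRY_RUN" = "0" ]; then
        sh -c "$1"
    fi
}

repair_plan() {
    # 1. List the files/snapshots ZFS currently reports as damaged.
    run "zpool status -v $SSD_POOL"
    # 2. Take a pool checkpoint so the whole pool can be rewound if this goes wrong.
    run "zpool checkpoint $SSD_POOL"
    # 3. Feed the healthy snapshot into a corrective receive on the damaged one.
    run "zfs send $SRC_SNAP | zfs receive -c $DST_SNAP"
    # 4. Scrub (waiting for completion) and re-check the error list.
    run "zpool scrub -w $SSD_POOL"
    run "zpool status -v $SSD_POOL"
    # 5. Deliberately not automated: discard the checkpoint once satisfied.
    echo "# when happy with the result: zpool checkpoint -d $SSD_POOL"
}

repair_plan
```

Keep in mind that rewinding to the checkpoint throws away everything written to the pool after it was taken, so I would not leave it in place for long.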

Otherwise, just restore the files manually with cp from the HDD.
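For the manual route, something along these lines might do. It is only a sketch: restore_file is a helper name I made up, and the example paths in the comment are placeholders that assume the backup snapshot is browsable through the hidden .zfs/snapshot directory of the backup dataset. It copies to a temporary name first so that a failed read from the backup never leaves a half-written file in place of the original.

```shell
#!/bin/sh
# restore_file SNAP_ROOT LIVE_ROOT REL_PATH
# Copy one file from a (read-only) snapshot tree over its live counterpart.
# Hypothetical example call:
#   restore_file /hddpool/backup/nextcloud/.zfs/snapshot/example-snap \
#                /ssdpool/nextcloud data/some/file
restore_file() {
    snap_root=$1
    live_root=$2
    rel=$3
    src="$snap_root/$rel"
    dst="$live_root/$rel"
    tmp="$dst.restore.$$"

    # Refuse to continue if the backup does not contain the file.
    if [ ! -f "$src" ]; then
        echo "missing in backup: $rel" >&2
        return 1
    fi

    mkdir -p "$(dirname "$dst")" || return 1

    # Copy to a temporary name, then rename over the damaged file.
    if cp -p "$src" "$tmp" && mv "$tmp" "$dst"; then
        echo "restored: $rel"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

You would call it once per path listed by zpool status -v, and then run a scrub to confirm the errors are gone.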

I don’t know for certain, but given that ZFS detects corrupt data when reading it, I assume zfs send will never send corrupt data. So if my assumption is correct, the data on your HDD might be out of date but not corrupt.

Thanks for getting back to me, I’ll have a look at that. I have come down with influenza, so it might take a bit of time before I get back.