Creating a mirror on a single disk pool with READ/WRITE errors

SirGeorge · October 10, 2025, 3:19am

Question: if an existing single-disk pool has experienced READ, WRITE, and CKSUM errors, can attaching a second disk to turn it into a mirror lead to corrupt data in the mirror?

Setup is:

tank pool at home
rent pool in a remote data center
Datasets use native ZFS encryption
tank has been running for a couple of years as a mirror and has never thrown any READ, WRITE, or CKSUM errors. Scrubs run on it monthly without issue or error.

# Tank
root@home:~# zpool status tank
  pool: tank
 state: ONLINE
  scan: scrub repaired 0B in 23:27:02 with 0 errors on Tue Sep 16 12:24:05 2025
config:

	NAME                        STATE     READ WRITE CKSUM
	tank                        ONLINE       0     0     0
	  mirror-0                  ONLINE       0     0     0
	    sda                     ONLINE       0     0     0
	    sdb                     ONLINE       0     0     0

errors: No known data errors

# Rent
user@remote:~$ zpool status -v rent
  pool: rent
 state: ONLINE
 status: One or more devices has experienced an unrecoverable error.  An
	attempt was made to correct the error.  Applications are unaffected.
 action: Determine if the device needs to be replaced, and clear the errors
	using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: scrub repaired 0B in 1 days 09:13:11 with 0 errors on Mon Sep 15 09:37:15 2025
config:

	NAME                          STATE     READ WRITE CKSUM
	rent                          ONLINE       0     0     0
	  sda                         ONLINE       0     0     2

errors: No known data errors

Data is replicated via syncoid:

syncoid --sendoptions="w" --no-privilege-elevation --no-sync-snap -r tank/ds1 user@remote:rent/ds1

rent was not reporting any corrupt data files via zpool status -v.

To add redundancy, I attached a second disk to rent to form a mirror vdev. Resilvering began, and during it I observed:

user@remote:~$ zpool status
  pool: rent
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Thu Oct  2 18:11:18 2025
	126G / 8.79T scanned at 572M/s, 31.2G / 8.79T issued at 141M/s
	31.2G resilvered, 0.35% done, 18:03:10 to go
config:

	NAME                            STATE     READ WRITE CKSUM
	rent                            ONLINE       0     0     0
	  mirror-0                      ONLINE       1     0     0
	    sda                         ONLINE       1     0     0
	    sdb                         ONLINE       0     0     1  (resilvering)

errors: No known data errors

Does a READ error on the source disk lead to a CKSUM error on the other disk, the one being resilvered?

At the completion of the resilver:

user@remote:~$ zpool status
  pool: rent
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
	attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
	using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: resilvered 8.81T in 15:41:41 with 0 errors on Fri Oct  3 09:52:59 2025
config:

	NAME                            STATE     READ WRITE CKSUM
	rent                            ONLINE       0     0     0
	  mirror-0                      ONLINE      90     0     0
	    sda                         ONLINE      98     0     0
	    sdb                         ONLINE       0     0    90

90 READ errors on the mirror vdev (98 on the source disk sda itself), and 90 CKSUM errors on sdb, the disk that was added to the mirror and resilvered.
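
To compare the counters between runs without eyeballing the tables, here is a rough sketch of how the per-device READ/WRITE/CKSUM values could be pulled out of `zpool status` text. The function name `zpool_errs` is my own invention and not part of any ZFS tooling. It only matches plain integer counts, so abbreviated values such as 1.2K would be skipped.

```shell
# Hypothetical helper (not part of ZFS): read `zpool status` output on stdin
# and print "device read write cksum" for each row of the config table.
zpool_errs() {
  awk '$2 ~ /^(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)$/ &&
       $3 ~ /^[0-9]+$/ && $4 ~ /^[0-9]+$/ && $5 ~ /^[0-9]+$/ {
         print $1, $3, $4, $5
       }'
}

# Example: the post-resilver table from above, pasted in as text.
zpool_errs <<'EOF'
    NAME                            STATE     READ WRITE CKSUM
    rent                            ONLINE       0     0     0
      mirror-0                      ONLINE      90     0     0
        sda                         ONLINE      98     0     0
        sdb                         ONLINE       0     0    90
EOF
```

On a live system the same function could be fed with `zpool status rent | zpool_errs`.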

Then I triggered a scrub of rent, after which:

user@remote:~$ zpool status
  pool: rent
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
	Sufficient replicas exist for the pool to continue functioning in a
	degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
	repaired.
  scan: scrub repaired 43.0M in 13:18:07 with 0 errors on Fri Oct  3 23:54:32 2025
config:

	NAME                            STATE     READ WRITE CKSUM
	rent                            DEGRADED     0     0     0
	  mirror-0                      DEGRADED    90     0     0
	    sda                         FAULTED    230     2     0  too many errors
	    sdb                         ONLINE       0     0    90

errors: No known data errors

Now the scrub is making repairs? zpool status -v does not report any corrupted or damaged files, though.

Questions:

Was data on sda in rent at risk of actual corruption, or did those READ errors simply indicate “disk not healthy”, with a retry of the READ then succeeding?
If there was a risk of corruption, am I correct in assuming that adding sdb to rent would copy any corruption to the new disk in the mirror? Is the pool actually protecting me from a failure of sda?
During the resilver, the READ errors on the source disk and the CKSUM errors on the newly added disk rose in step. Are they related? If there was a READ error, would the relevant checksum be wrong, leading to those CKSUM errors?

Appreciate any help.

I believe a bad SATA cable and a bad controller have been ruled out. The next step is a SMART long test on sda. If the result indicates a failing drive, I’m wondering whether the right course of action is to replace sda, nuke the pool, create a new mirror pool, and send the full datasets again from tank.
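
For reference, a minimal sketch of the SMART check I have in mind. It assumes smartmontools is installed on the remote host and that the pool member sda really is /dev/sda there (both assumptions on my part). The `run` wrapper and the `DRY_RUN` switch are just illustrative helpers: by default the commands are only printed, not executed.

```shell
# Assumptions: smartmontools is installed and pool member "sda" is /dev/sda.
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 to run them
# (smartctl normally needs root).
DISK="${DISK:-/dev/sda}"
DRY_RUN="${DRY_RUN:-1}"

# Illustrative helper: echo the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

smart_long_check() {
  run smartctl -t long "$DISK"      # start the extended self-test on the drive
  run smartctl -l selftest "$DISK"  # once it has finished: show the self-test log
  run smartctl -A "$DISK"           # vendor attributes (e.g. reallocated/pending sectors)
}

smart_long_check
```

The long test runs on the drive itself in the background, so the self-test log only shows a result after it finishes.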