I recently suffered a failed disk, resulting in some files getting corrupted. I replaced the drive and recovered the corrupted files; however, one file keeps showing checksum errors:
$ sudo zpool status -v | grep -v autosnap
  pool: mypool
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: scrub repaired 0B in 04:52:52 with 20 errors on Sun Sep 28 07:52:52 2025
config:

        NAME        STATE     READ WRITE CKSUM
        mypool      ONLINE       0     0     0
          sda       ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        mypool/mydataset:<0x52f2>
The other 20 errors are that same file, or whatever it is, in old snapshots.
Has anyone seen this before, and does anyone know how to clear that error? zpool clear mypool does nothing.
ZFS has done its job — it detected silent corruption. But because you’re using a single disk (no redundancy), it cannot fix the issue. The only fix now is restoring the corrupted files from backup.
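Once the affected file has been restored or deleted, the error entries themselves usually only disappear after a later scrub completes cleanly. A rough sketch, using your pool name:

sudo zpool clear mypool       # reset the error counters; this repairs nothing by itself
sudo zpool scrub mypool       # re-verify everything; stale entries normally drop off the error list after a clean scrub (it can take two)
sudo zpool status -v mypool   # check whether the <0x...> entry is gone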
Disclaimer: I have no experience repairing this. I have a few pools on one device, but when they fail or go wrong, I restore from backup. So if you do not have a backup, wait and see if anyone here can help.
I've replaced the disk and restored from backup; the 20 unrecoverable errors are
mypool/mydataset:<0x52f2>
and 19 snapshots of that corrupt thing. The question is whether anyone knows what that is: it's not a filename in my dataset, and trying to do anything with it just results in the terminal interpreting it as a newline character. I can't restore it from backup because it isn't a file (I think).
I guess I could try creating a new pool and moving the datasets over.
It’s a hexadecimal object ID representing the internal object number within the dataset that has an error.
ZFS stores data in a set of internal objects (files, directories, metadata, extended attributes, etc.), each with a unique ID. The object <0x52f2> is the one that ZFS couldn't read or verify due to checksum errors, likely because of underlying disk issues or corruption in the storage.
Try to locate the object:
sudo zdb -dddd mypool/mydataset 0x52f2
This will attempt to show information about that object — sometimes it will give you a file path or at least tell you if it’s a directory, file, xattr, etc.
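If zdb rejects the hex form (I'm not certain every build accepts a 0x-prefixed object number), convert the ID to decimal first:

printf '%d\n' 0x52f2                     # prints 21234
sudo zdb -dddd mypool/mydataset 21234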
This dataset holds exported JPGs, so by the type and size I assume it's some random photo. The path is listed as "on delete queue", which I haven't been able to find information on.
I can also dump the delete queue object itself with zdb (command below) and find my file in there as the only entry. The internet says that files getting stuck in the delete queue can happen if the file is open, and some people talk about having to mount-cycle the pool to have it flush the queue.
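The dump was roughly this (object 33 is the delete queue's object ID in my dataset; it may differ elsewhere):

# dump the delete-queue object; my file showed up in here as the only entry
sudo zdb -dddd mypool/mydataset 33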
I've unmounted the pool, exported it, imported and mounted it again with no luck, then rebooted (although this has been a problem since before the last boot), and that didn't work either.
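For reference, the mount-cycle was nothing more exotic than:

sudo zfs unmount mypool/mydataset   # unmount the dataset
sudo zpool export mypool            # export the whole pool
sudo zpool import mypool            # import it again
sudo zfs mount mypool/mydataset     # remount the dataset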
This tool is for advanced debugging and is where my skills end. The delete queue the file is sitting in is a metadata structure used by ZFS to track file deletions in progress (or pending deletion in snapshots). It's not user data, and its presence is entirely normal in an active filesystem.
My understanding is that when you delete a file, it goes to the ZFS delete queue, which has object ID 33 in my dataset, and stuck in that queue is my file with object ID 0x52f2 in hexadecimal, or 21234 in decimal.
In any case, since there's seemingly no way to flush the queue, I'll try making a new dataset, which I think would have a new (empty) delete queue, and then rsync the data over rather than zfs send it, so that the old delete queue doesn't get copied over.
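Something like this is what I'm thinking of (a sketch; mydataset-new is just a placeholder, and I'm assuming the default mountpoints under /mypool):

sudo zfs create mypool/mydataset-new                           # fresh dataset, fresh (empty) delete queue
sudo rsync -aHAX /mypool/mydataset/ /mypool/mydataset-new/     # file-level copy, so no internal ZFS metadata comes along
# once everything checks out, swap them
sudo zfs destroy -r mypool/mydataset
sudo zfs rename mypool/mydataset-new mypool/mydataset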
How did you restore the dataset, was it through zfs send/recv or another tool like rsync/cp command? Do you know this new device to be good?
zfs-send-ing the last known non-corrupted snapshot would have involved loss of some new data, and since it was only a handful of files that needed restoring, I used cp or rsync to restore individual files. I replaced the dying disk with a new Samsung EVO 1TB and no new data errors have appeared since then.
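Restoring the individual files was just file-level copies from the backup, roughly like this (paths are illustrative; backup-pool is wherever the backups live):

# copy just the affected files back from the backup copy
sudo rsync -aHAX /backup-pool/mydataset-backup/path/to/photo.jpg /mypool/mydataset/path/to/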
I tried rsync-ing from my backups to the new dataset and got input/output errors when doing so. I’m going to consider the pool toast, set up a new mirrored pool and restore from backup.
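For anyone finding this later, the replacement setup will be along these lines (pool name and device paths are placeholders):

sudo zpool create newpool mirror /dev/disk/by-id/ata-DISK1 /dev/disk/by-id/ata-DISK2   # two-way mirror, so checksum errors can actually be self-healed
sudo zfs create newpool/mydataset
sudo rsync -aHAX /backup-pool/mydataset-backup/ /newpool/mydataset/                     # restore from backup at the file level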