Offline and replace each drive with its by-id path so the pool reports scsi-ids instead of device names

After rebooting an Ubuntu server, zpool status showed that two SAS drives in a ZFS pool (data only; the OS is on ext4) were FAULTED:

# zpool status
  pool: zroot
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-4J
  scan: scrub in progress since Thu Aug  7 21:02:11 2025
        564G scanned at 1.04G/s, 149G issued at 282M/s, 763G total
        0B repaired, 19.57% done, 00:37:11 to go
config:

        NAME                        STATE     READ WRITE CKSUM
        zroot                       DEGRADED     0     0     0
          raidz3-0                  DEGRADED     0     0     0
            sda                     ONLINE       0     0     0
            sdb                     ONLINE       0     0     0
            scsi-35000039878613d45  ONLINE       0     0     0
            sdd                     ONLINE       0     0     0
            sde                     ONLINE       0     0     0
            4044550924801993745     FAULTED      0     0     0  was /dev/sdf1
            6246812777007092846     FAULTED      0     0     0  was /dev/sdg1
            sdh                     ONLINE       0     0     0

My best estimate was that the device names for the 6th and 7th drives switched on reboot, causing ZFS to get mixed up. I tried every which way to use zpool commands (replace, online, etc.) to integrate them back into the pool.

On a lark, I removed them and reseated them. The pool immediately began resilvering. Not sure what happened.

Notice that the 3rd drive has its scsi-id as the identifier, while everything else is listed by device name. This is from a previous drive replacement.

To avoid the possibility of device names switching around, to the peril of my zpool, I would like to remove each drive from the pool one by one and replace it by scsi-id instead of device name.

Would someone kindly give me the proper commands/syntax to do that?

You can export the pool and re-import it using the SCSI device IDs or another identifier (something that won’t change between reboots).

Example:
zpool import -d /dev/disk/by-id/scsi-id1(-part1) -d /dev/disk/by-id/scsi-id2(-part1) -d /dev/disk/by-id/scsi-id3(-part1) zroot
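
For reference, here is a minimal sketch of the full export/import sequence, assuming the pool is named zroot and nothing is using its datasets at the time. The helper name reimport_by_id and the DRY_RUN switch are made up for illustration. By default it only prints the commands so they can be checked first. Pointing -d at the whole directory lets zpool find the devices itself; which by-id alias ends up in zpool status (scsi-, wwn-, ...) may vary by system.

```shell
#!/bin/sh
# Sketch: re-import a pool so its vdevs are recorded by /dev/disk/by-id names.
# reimport_by_id is a hypothetical helper; it prints the commands unless
# DRY_RUN=0 is set, so they can be reviewed before anything is run.
reimport_by_id() {
    pool=$1
    dir=${2:-/dev/disk/by-id}
    if [ "${DRY_RUN:-1}" = "0" ]; then
        run=
    else
        run=echo
    fi
    # Export first: an imported pool has to be exported before re-importing.
    $run zpool export "$pool" || return 1
    # -d tells zpool import where to search for the pool's devices.
    $run zpool import -d "$dir" "$pool"
}

reimport_by_id zroot
```

Run it as is to see the two commands, then with DRY_RUN=0 (as root) to actually perform the export and import.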


I like to use by-partlabel for my devices. I give the partition a name using the model number and part of the serial number.


        NAME                      STATE     READ WRITE CKSUM
        datapool                  ONLINE       0     0     0
          mirror-0                ONLINE       0     0     0
            HUH721010ALE600_WXVC  ONLINE       0     0     0
            HUH721010ALE600_JUWD  ONLINE       0     0     0
          mirror-1                ONLINE       0     0     0
            HUH721010ALE600_9M2G  ONLINE       0     0     0
            HUH721010ALE600_29LD  ONLINE       0     0     0
          mirror-2                ONLINE       0     0     0
            HUH721010ALE600_V38G  ONLINE       0     0     0
            HUH721010ALE600_LMVD  ONLINE       0     0     0
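
A rough sketch of how such a label could be built, in case it helps. The helper make_label is hypothetical, and the model/serial split is an assumption based on how the by-id names in this thread appear to be composed. It takes the first 8 characters of the serial here because the serials in this thread all share the same ending; use whichever part differs between your drives.

```shell
#!/bin/sh
# Sketch: build a partition label from a model number and part of a serial.
# make_label is a hypothetical helper, not part of any ZFS tooling.
make_label() {
    model=$1
    serial=$2
    # Keep the first 8 characters of the serial (the part that differs
    # between the drives in this thread).
    part=$(printf '%s' "$serial" | cut -c1-8)
    printf '%s_%s\n' "$model" "$part"
}

make_label AL14SEB120NY 2880A02GFL0E

# To apply a label (assumptions: GPT disk, ZFS data on partition 1,
# sgdisk installed; <disk> is a placeholder):
#   sgdisk --change-name=1:"$(make_label MODEL SERIAL)" /dev/disk/by-id/<disk>
# Once udev has picked up the new name, the pool can be re-imported with
#   zpool import -d /dev/disk/by-partlabel <pool>
```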

Can this be done in-place? Or does it require a reboot?

If the former, does it take the pool offline for any significant amount of time?

If the OS isn’t a part of the pool, no need to reboot.

As long as all of your device paths are correct and show up on the system, it shouldn’t take any longer than usual to re-import the pool with the new device paths.

If your device paths have a “-part1” listed, use those. (The devices will show up in zpool status without the “-part1” appended.)
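
One way to check that the paths show up before exporting is to list the by-id links next to the kernel devices they currently point to. This is only a sketch: list_links is a hypothetical helper, and the default pattern scsi-*-part1 assumes your links are named like the ones in this thread.

```shell
#!/bin/sh
# Sketch: show which kernel device each by-id partition link points to.
# list_links is a hypothetical helper; directory and pattern are arguments
# so it can be tried on any directory of symlinks.
list_links() {
    dir=${1:-/dev/disk/by-id}
    pattern=${2:-scsi-*-part1}
    for link in "$dir"/$pattern; do
        # Skip the unexpanded pattern when nothing matches.
        [ -L "$link" ] || continue
        printf '%s -> %s\n' "${link##*/}" "$(readlink -f "$link")"
    done
}

list_links
```

Each output line pairs a stable name with its current sdX partition, which makes it easy to confirm that all eight drives are present.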

So I like the idea of using serial numbers like you did. This way we can easily correlate drives in iDRAC with the drives in the pool.

Just want to be 110% sure on the syntax. The pool is named ‘zroot’. So would it be:

zpool import zroot -d /dev/disk/by-id/scsi-STOSHIBA_AL14SEB120NY_2880A02GFL0E-part1 -d /dev/disk/by-id/scsi-STOSHIBA_AL14SEB120NY_2880A29TFL0E-part1 -d /dev/disk/by-id/scsi-STOSHIBA_AL14SEB120NY_28L0A0AMFL0E-part1 -d /dev/disk/by-id/scsi-STOSHIBA_AL14SEB120NY_2880A36ZFL0E-part1 -d /dev/disk/by-id/scsi-STOSHIBA_AL14SEB120NY_2880A2RPFL0E-part1 -d /dev/disk/by-id/scsi-STOSHIBA_AL14SEB120NY_2880A2T3FL0E-part1 -d /dev/disk/by-id/scsi-STOSHIBA_AL14SEB120NY_2880A2T4FL0E-part1 -d /dev/disk/by-id/scsi-STOSHIBA_AL14SEB120NY_2880A2T6FL0E-part1

These are in the order that the drives are in the physical slots on the server.

That should work, but the pool name is last:
zpool import -d <devices> <pool>
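
To avoid typos in a command that long, something like the following could assemble it with the pool name in the right place. It is a sketch only: print_import is a hypothetical helper that prints the command and runs nothing, and only two of the eight devices are shown; the rest would be passed the same way.

```shell
#!/bin/sh
# Sketch: print an import command with one -d per device and the pool
# name last. print_import is a hypothetical helper; nothing is executed.
print_import() {
    pool=$1
    shift
    cmd="zpool import"
    for dev in "$@"; do
        cmd="$cmd -d /dev/disk/by-id/$dev"
    done
    printf '%s %s\n' "$cmd" "$pool"
}

print_import zroot \
    scsi-STOSHIBA_AL14SEB120NY_2880A02GFL0E-part1 \
    scsi-STOSHIBA_AL14SEB120NY_2880A29TFL0E-part1
```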

Good tip. I may do this going forward. You can also add other tags, such as the in-service date.

Did this work without any other workaround?