I am in a bit of a predicament. I have inherited a RHEL 8 system running on a Supermicro chassis connected to two JBODs through an Avago controller. ZFS is running on this system and we have some failing drives, but nobody recorded which physical drive corresponds to which disk in the zpool. The zpool was imported using /dev/disk/by-id, so it displays the WWN of each disk (none of which match the WWN printed on the disks' labels). My friend and I have been racking our brains for the past two days trying to figure out how to identify which disk is which.
So here is my question: does anyone know a good and accurate way (it does not have to be easy) to tell which disk in our zpool is in which physical slot in our JBODs? So far I have limited information on the system, but I can try to give more info if needed.
There are a couple of programs that can light a disk's LED, e.g. ledctl (part of ledmon), which should help to identify which disk serial number is in which slot. Once that is documented, you know which slot even a dead disk is in, even if it can no longer light up.
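As a rough sketch of the ledctl approach: the helper below turns the locate LED on for one disk, waits, and turns it off again. The function name blink_disk and the 30-second default are made up for illustration, and whether ledctl can drive the LEDs on this particular controller and JBOD is something you would have to test. It needs to run as root.

```shell
# Blink the locate LED of one disk for a while, then switch it off.
# blink_disk is a hypothetical helper name; ledctl comes from the ledmon package.
blink_disk() {
    local dev="$1"          # e.g. /dev/sdX
    local secs="${2:-30}"   # how long to leave the LED on (assumed default)
    ledctl "locate=$dev" || return 1
    sleep "$secs"
    ledctl "locate_off=$dev"
}
```

Calling blink_disk /dev/sdX 60 while someone watches the JBOD, and writing down slot and serial number as you go, gives you the documentation for later failures.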
Do you have any links under /sys/class/enclosure? The JBODs my workplace buys end up with /sys/class/enclosure/*/SLOT*/device/wwid paths that correspond to the /dev/disk/by-id IDs.
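If those paths exist on your system, something like the following could dump them into one table. It is only a sketch: the function name list_slot_wwids is made up, slot directory names differ between vendors, and the wwid value may be formatted differently from the by-id name (for example a naa. prefix instead of wwn-0x), so compare the hex digits rather than the whole string.

```shell
# Print "enclosure<TAB>slot<TAB>wwid" for every slot that has a disk attached.
# list_slot_wwids is a hypothetical helper; the root is a parameter only so
# that the function can also be pointed at a copy of the sysfs tree.
list_slot_wwids() {
    local root="${1:-/sys/class/enclosure}"
    local f slotdir slot enc
    for f in "$root"/*/*/device/wwid; do
        [ -r "$f" ] || continue     # glob matched nothing, or slot is empty
        slotdir=$(dirname "$(dirname "$f")")
        slot=$(basename "$slotdir")
        enc=$(basename "$(dirname "$slotdir")")
        printf '%s\t%s\t%s\n' "$enc" "$slot" "$(cat "$f")"
    done
}
```

Running list_slot_wwids with no argument reads the live /sys/class/enclosure tree; the output can then be matched by hand against the names shown in zpool status.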
We will have to try this and see if we can blink the drives. For now we ended up just using dd to get them to blink, but not all of them would blink, so we will have to figure out another method.
Yes, we did find this, but the issue is that the numbers in each enclosure start at 00 and end at 20, and each enclosure in that directory restarts at 00. Not to mention our JBOD setup only has 2 enclosures, and both have 90 disks, 00 through 89.
Unless enclosure means backplane and it’s dividing them by backplane?
Could be. On ours there are generally two entries in /sys/class/enclosure per chassis since there are two IO modules per chassis and the host is cabled to both of them, so it’s not an “enclosure” in the physical sense. The IO modules are redundant, and you see the same set of sixty slots from both, but I want to say I’ve seen one where the chassis provides different SES for different groups of disks. I think on that one there were different labels in /sys/class/enclosure/*/device/model or lsscsi | grep enc. Anything illuminating in there?
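To get an overview of what each /sys/class/enclosure entry claims to be, a small loop like this might help. It is a sketch with a made-up function name (list_enclosures); it prints the id, model and component count that the kernel exposes for each entry. Entries with the same model, or with the same disks behind them, may turn out to be two paths to one chassis, but that is something to verify rather than assume.

```shell
# Print one line per enclosure entry: name, logical id, model, component count.
# list_enclosures is a hypothetical helper; missing attributes print as empty.
list_enclosures() {
    local root="${1:-/sys/class/enclosure}"
    local d id model comps
    for d in "$root"/*/; do
        [ -d "$d" ] || continue     # glob matched nothing
        d=${d%/}
        id=$(cat "$d/id" 2>/dev/null)
        model=$(cat "$d/device/model" 2>/dev/null)
        comps=$(cat "$d/components" 2>/dev/null)
        printf '%s\tid=%s\tmodel=%s\tcomponents=%s\n' \
            "$(basename "$d")" "$id" "$model" "$comps"
    done
}
```

If several entries show around 21 components each, that would fit the idea that the chassis presents its slots in groups rather than as one 90-slot enclosure.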
Even if the disk label is missing the WWN, I am guessing it does have a serial number. You may be able to correlate WWN to serial number and then to physical slot location using the following for each disk:
sudo smartctl -i /dev/sdX
Obviously, it will take an outage while you shut down the system to pull each disk to record the serial number and physical slot. Photos of each disk showing the serial number beside the open physical slot may be useful, too.
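To avoid running smartctl by hand for every disk, the lookup could be scripted roughly as below. This is a sketch under assumptions: the function names are made up, and the parser expects the field labels smartctl usually prints ("Serial Number" with "LU WWN Device Id" for SATA, "Serial number" with "Logical Unit id" for SAS). If your output uses different labels, adjust the patterns.

```shell
# Read `smartctl -i` output on stdin and print "serial<TAB>wwn".
# parse_smart_identity and scan_disks are hypothetical helper names.
parse_smart_identity() {
    awk -F': *' '
        { sub(/[ \t]+$/, "") }                        # drop trailing blanks
        tolower($1) == "serial number" { serial = $2 }
        $1 == "LU WWN Device Id" { wwn = $2; gsub(/ /, "", wwn) }
        $1 == "Logical Unit id"  { wwn = $2; sub(/^0x/, "", wwn) }
        END { printf "%s\t%s\n", serial, wwn }
    '
}

# Print "device<TAB>serial<TAB>wwn" for every whole disk (partitions skipped).
scan_disks() {
    local d
    for d in /dev/sd*[a-z]; do
        [ -b "$d" ] || continue
        printf '%s\t' "$d"
        smartctl -i "$d" | parse_smart_identity
    done
}
```

Run scan_disks as root and save the output; the WWN column should line up with the wwn-0x... names in /dev/disk/by-id, and the serial column with what is printed on the disk labels.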