Seagate Sea Chest Utilities and Stupidity

aciddensity · March 2, 2024, 12:27am

Greetings fellow storage enthusiasts!
Let me tell you a story of woe and humility.

Recently, after hearing about a lovely site called ServerPartDeals on the 2.5 Admins podcast, I decided it was past time for me to replace some of my aging drives while also kicking Western Digital to the curb (for personal reasons). So I picked up a set of very nice factory recertified Seagate Exos X14 12TB drives. A handful of days and a border crossing later, the drives showed up at my door. The packaging was fantastic and I was excited to try them out. While waiting for the drives to come up to temperature (it’s rather cold here currently), I was reading the datasheet and saw they were capable of 4K native sectors. This made me wonder, “why not convert them from the start?”

Anyone familiar with these things may already be getting a sense of dread for where I’m going.
I tossed the drives into my server, a Dell R710 with its RAID controller in HBA mode, and started getting familiar with the Sea Chest tools from Seagate.
This is where the “stupidity” part of the topic comes in.
I repeatedly ignored obvious and clear warnings against doing what I was doing and proceeded to give the command to change the sector size. The tool makes it very clear that if this process is interrupted, the drive is basically a paperweight unless I send it in for repair. I hit enter…
Process failed.
“Weird, must be a fluke” my ignorant monkey brain says. “Let’s try it on the other drive” …
Process failed.

Friends, I’m not proud to tell you that within hours of receiving these two drives I hosed both of them.
So now my options are to reach out to Seagate support and hope for a miracle, or cut my losses and consider it a lesson learned, or semi-blindly attempt a very sketchy serial console repair.

So please take my story as a word of caution in case you are considering doing something similar, and DO NOT try to do this on an HBA or RAID controller with multiple disks attached since it will almost certainly fail and leave you sad.
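
One way to act on that warning before issuing any sector-size change is a pre-flight check that refuses to continue while other disks are attached. What follows is only a minimal Python sketch, assuming a Linux system where whole disks show up as directories under /sys/block. The function names, the `allowed` parameter, and the list of virtual-device prefixes are illustrative choices, not part of the Sea Chest tools. The check cannot tell which controller a disk hangs off, so it flags every other disk it sees.

```python
from pathlib import Path

# Kernel-provided virtual or optical devices that are not hard disks.
# This prefix list is an assumption for illustration and is not exhaustive.
VIRTUAL_PREFIXES = ("loop", "ram", "zram", "dm-", "md", "sr", "nbd")


def other_disks(target, sysfs_root="/sys/block"):
    """List block devices other than `target`, skipping virtual ones."""
    names = sorted(p.name for p in Path(sysfs_root).iterdir())
    return [
        n for n in names
        if n != target and not n.startswith(VIRTUAL_PREFIXES)
    ]


def safe_to_proceed(target, sysfs_root="/sys/block", allowed=()):
    """True only if no unexpected disks are attached besides `target`.

    `allowed` is for devices you knowingly keep attached, such as the
    USB stick the rescue environment booted from.
    """
    extras = [d for d in other_disks(target, sysfs_root) if d not in allowed]
    return not extras
```

For example, `safe_to_proceed("sda", allowed=("sdb",))` would return True only when sda and the knowingly attached sdb are the sole real disks present.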

P.S. If any of you know how to fix this I would be very interested in hearing how.

mercenary_sysadmin · March 2, 2024, 2:29pm

Your story hurts my heart, but I think you already summed up your realistic choices. I’d probably at least try reaching out to Seagate support… But personally, I’d definitely go with the “I screwed up, can you help?” angle instead of the “you owe me help” angle. Good luck!

ZeroSignal9 · March 3, 2024, 6:27pm

Seems most storage devices these days still default to 512 sectors emulated instead of native 4k for compatibility with legacy operating systems.
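
To see which mode a given drive is currently in, Linux exposes the logical and physical sector sizes in sysfs. Here is a minimal Python sketch, assuming the standard `queue/logical_block_size` and `queue/physical_block_size` attributes under /sys/block; the function names and the labels are just for illustration.

```python
from pathlib import Path


def sector_sizes(device, sysfs_root="/sys/block"):
    """Return (logical, physical) sector sizes in bytes for e.g. 'sda'."""
    queue = Path(sysfs_root) / device / "queue"
    logical = int((queue / "logical_block_size").read_text().strip())
    physical = int((queue / "physical_block_size").read_text().strip())
    return logical, physical


def describe(logical, physical):
    """Map a (logical, physical) pair to the usual marketing label."""
    if logical == 4096 and physical == 4096:
        return "4Kn (4K native)"
    if logical == 512 and physical == 4096:
        return "512e (512-byte emulation on 4K physical sectors)"
    if logical == 512 and physical == 512:
        return "512n (512-byte native)"
    return f"other ({logical}/{physical})"
```

Calling `describe(*sector_sizes("sda"))` on a drive that still ships in its default emulated mode would be expected to report 512e. The same two numbers are also shown by `lsblk -o NAME,LOG-SEC,PHY-SEC`.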

Did the process fail instantly, or did it seem to make some progress before failing?

aciddensity · March 4, 2024, 2:22am

It was near instant failure.
It is on a Debian 12 install, and I had the journal streaming in another window to watch for errors, so I was able to see right away that the bus had reset.
The drives no longer register on the HBA. I’ve read that can be due to how the HBA goes about doing the detection. Something about checking for capacity and write-ability. In my desktop PC they do register but writes fail and SMART data is all but totally missing. I am hoping Seagate support has some magic I can try to recover them.

aciddensity · March 4, 2024, 2:24am

Thank you for the sympathies.
I will certainly reach out to them and for sure will own up to my mistake.

HankB · March 12, 2024, 10:28pm

I have had issues with SSDs on some HBAs in that they may not support trim. I wonder if you connected the drives to more conventional ports (like motherboard SATA ports) to see if the process may work. You might even increase the chance of success if you could do this on Windows.

If these are SAS drives, then probably not going to work.

aciddensity · March 12, 2024, 10:52pm

Thanks HankB, I appreciate your suggestion.

I did try them both on a standard SATA port with no luck. The HBA was actually the most likely cause of my whole problem it seems. Pretty much everything I was able to find in my research said that an HBA with multiple disks attached is almost guaranteed to cause a bus reset when issuing the sector resize command from the Sea Chest Utils. I just wish I had seen that warning before starting.
Neither of the drives were SAS, which actually limited some of the tools I could use to try and fix them.

I do feel like you are trolling me a little with the Windows comment, though. I know some tools target Windows only but for the functions I was working with I needed something as low level as I could get. I feel using Windows would have made it more challenging to ensure nothing else was trying to access the disks. Per the Seagate Sea Chest Utilities recommendations I could have booted into DOS but I didn’t trust my very aged knowledge of DOS so I didn’t attempt it.

I did reach out to Seagate though, and in a show of fantastic customer service they agreed to accept the drives for a potential RMA. It required much patience with their online “live chat” interface but it was worth it. Just waiting to hear back about the final verdict. Wish me luck!

HankB · March 21, 2024, 12:40am

No troll intended. I am a little frustrated that tools available for Windows are not always ported to Linux.

feinedsquirrel · December 16, 2024, 11:12pm

I know this is 9 months old, but here is my experience.

Also purchased my first set of Seagate drives from ServerPartDeals (one x24, one x20, two ironwolf pro nt models) and was excited to try out 4kn. As related here, the warnings about multiple disks, production (fully installed) OS, etc are real. Ironically, with 4 drives plugged into my motherboard (2 on direct sata connections to CPU, 2 on some realtek controller or something) my first drive (the x20) successfully converted.
“Awesome!” I thought, “warnings are for the legal team to cover their rears.”
Next drive (the x24) started successfully, going… going… seemed to hang… waited an hour per stdout info… “Crap. Warnings were real.”

I had read in the man pages, other locations, and early printout from the setSectorSize option that if it seems to fail, you can try setting sector size again, and it may be able to fix itself.

aside: I powered off my system, unplugged the other 3 drives, and upon reboot, it stalled in early boot with repeated errors accessing the ATA device. Eventually boot continued but the drive would not show up at all with lsblk. I hot swapped it from the realtek(? again, don’t remember exactly which controller my mobo uses, I could look it up pretty quickly, but don’t think it is relevant) controller to a direct CPU Sata port, and it appeared with lsblk. “phew! at least it isn’t completely destroyed, still have some sort of communication with the drive.”
I was unable to write to the disk, despite it showing up in lsblk. Not even dd if=/dev/zero of=/dev/sda worked. SMART also was not happy, with read errors increasing by 60 at a time on each reboot attempt.

So at this point, instead of running from my production system (up to date Arch Linux on the lts kernel), I decided it may be necessary to put in the effort to get Seagate’s linux boot image on a usb drive and boot to that. It uses tiny core linux, boots with refind (imo a pleasant surprise), and has all the SeaChest utils installed. I couldn’t find (I didn’t try that hard to look) a seagate util to install this image from linux. They have a windows .exe that will flash the selected usb drive, and I just used that out of ease and quickness to get on with the project.

Booted into the usb, and since the format to 4k failed, I decided to re-run the setSectorSize back to 512 to hopefully “reset” it of sorts. It started the process, then not-quite-immediately the stdout printout stated that it recognized there was something wrong with the drive, possibly a previous failed attempt to re-format, and stated it would attempt a recovery. (I don’t remember the exact phrasing.) To re-iterate: I had only one drive connected to the system, and I was using a sata port that was directly connected to the CPU (not using a chipset controller). After several minutes, the command completed and stdout printout reported a successful recovery. “YEE HAW!”

At this point, I think I did a sanitize overwrite, but don’t remember. As I understand it, the drive keeps running background operations after a format because it has to write each sector, but I don’t understand it entirely. vonericsen on Seagate’s GitHub has a bunch of info on it. I don’t remember exactly, but I know that after this, in the Seagate Tiny Core environment, I re-ran the format to 4k and it was successful. There were reports that x24 drives were failing format on the current version of SeaChest, and I was concerned about this, but those concerns were put to rest by the successful recovery and subsequent successful format to 4kn.

I rebooted the tinycore environment between each drive, and with each in turn connected to the CPU SATA port, the last two drives successfully formatted to 4kn.

After a reboot into my production system, with the drives still in the middle of the background operations, I used the recommendation from the Arch Wiki badblocks#alternatives section to cryptsetup and shred zeros across all 4 drives simultaneously, to force the background ops to complete. All four passed the cmp -b /dev/zero command.
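
For readers who want the same kind of read-back check without cmp, here is a minimal Python sketch of the idea: read the device in chunks and report the first byte that is not zero. It assumes the path you pass is whatever you would have handed to cmp (with the Arch Wiki approach that would be the opened cryptsetup mapping rather than the raw disk); the function name and chunk size are illustrative.

```python
from typing import Optional


def first_nonzero_offset(path: str, chunk_size: int = 1 << 20) -> Optional[int]:
    """Return the byte offset of the first non-zero byte, or None if the
    whole file or device reads back as zeros."""
    offset = 0
    with open(path, "rb", buffering=0) as dev:
        while True:
            chunk = dev.read(chunk_size)
            if not chunk:
                # End of device reached without finding a non-zero byte.
                return None
            stripped = chunk.lstrip(b"\x00")
            if stripped:
                # Leading zeros removed tells us where the mismatch sits.
                return offset + (len(chunk) - len(stripped))
            offset += len(chunk)
```

A return value of None corresponds to the "all zeros" pass described above; any integer is the offset where the read-back first differs.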

Note: The x24 shows failed shortDST. SMART reports 170-something un-correctable read errors, but zero re-allocated sectors. I believe the uncorrectable read errors were due to my reboots between the failed format and the successful repair. The read errors haven’t grown since the repair. I tried a SeaChest --dstAndClean, but the read errors are still recorded in the log. I am ignoring them, considering the drive in perfectly fine condition, and will just monitor for an increase from the current value, in addition to the other traditional indications of drive failure (reallocated sectors, etc).
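
Watching for an increase from a known baseline can be scripted. The sketch below is a hedged Python example that assumes smartmontools 7.0 or newer, whose `smartctl -A -j` prints the attribute table as JSON under `ata_smart_attributes.table`. The watched attribute IDs (5, 187, 197, 198) are an assumption about which counters matter; the post does not say which attribute holds the uncorrectable read errors, and the function names are made up for illustration.

```python
import json
import subprocess

# Assumed IDs: 5 Reallocated_Sector_Ct, 187 Reported_Uncorrect,
# 197 Current_Pending_Sector, 198 Offline_Uncorrectable.
WATCHED_IDS = (5, 187, 197, 198)


def read_report(device):
    """Run smartctl and return its parsed JSON report (needs root)."""
    # smartctl's exit status is a bitmask and may be non-zero even when
    # valid JSON was printed, so it is deliberately not checked here.
    result = subprocess.run(
        ["smartctl", "-A", "-j", device], capture_output=True, text=True
    )
    return json.loads(result.stdout)


def raw_values(report):
    """Map attribute ID to raw value from a smartctl JSON report."""
    table = report.get("ata_smart_attributes", {}).get("table", [])
    return {row["id"]: row["raw"]["value"] for row in table}


def grown(baseline, current, watched=WATCHED_IDS):
    """Return {id: (old, new)} for watched attributes that increased."""
    return {
        attr: (baseline.get(attr, 0), current[attr])
        for attr in watched
        if attr in current and current[attr] > baseline.get(attr, 0)
    }
```

In use, one would save `raw_values(read_report("/dev/sda"))` once after the repair and compare later readings against it with `grown`; an empty result means none of the watched counters has moved.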

This was long, hope it helps somebody.

mercenary_sysadmin · December 16, 2024, 11:40pm

I hope this helps somebody: on re-reading what you were trying to do and with what tools, I think–repeat, I think–there was an easier and fully open source way to do what you were trying to do: the sg_format utility, which is what one must use when converting (e.g.) 520-byte sectors to 512-byte (or 4,096-byte!) sectors on enterprise-targeted drives.

root@ubuntu:~# sg_format -v --format --size=512 /dev/sdwhatever

More information: 520 byte sectors and Ubuntu – JRS Systems: the blog

Sorry I didn’t think of this earlier, folks!

feinedsquirrel · December 17, 2024, 12:38pm

Ah ha! I was reading sg_* man pages today because I wanted to make sure I was working on the correct device, and saw that there was an sg_format. I didn’t read its man page as I no longer needed it this time around, but I wondered if it would/should have worked. Good to know that my musing was likely correct, and next time I need it, I’ll dive into its options.

mercenary_sysadmin · December 17, 2024, 6:26pm

The behavior when you sg_format a drive is essentially the exact behavior described up thread for the Seagate Sea Chest Utility (which I’ve never used)–it takes a very long time to complete, the drive gets seriously warm during the process, and sometimes it errors out shortly into the first run, but usually will then complete successfully on a second.

So yeah, I don’t know for certain, but I’d be honestly pretty surprised if they aren’t doing the exact same thing under the hood! And sg_format is not vendor specific; I’ve only needed to use it a few times but it’s never been on the same brand of SSD twice.