USB: Why we can't have nice things

I’ve advocated for not using USB-connected storage, preferring “proper” storage interfaces such as SATA and NVMe. (I have distant memories of SCSI, MFM, RLL … but those mostly predate ZFS.) Nevertheless, I do have what I consider an experimental file server running on USB-connected HDDs. It’s running Debian (not RpiOS), starting with Bullseye as of 2023-03-20 and upgrading to Bookworm (Testing) about a month later. The hardware is a Pi 4B. Initially it was a 4GB RAM model, but I swapped that board for another just because I had one on hand. The USB adapter is a "WAVLINK USB 3.0 to SATA Dual Bay External Hard Drive Docking Station" with an SD card reader, USB ports and USB charging ports. I like it because one charging port provides sufficient power for the Pi 4B and the other charging port powers a couple of 50mm fans that cool the drives. This drive bay identifies as

Bus 002 Device 004: ID 152d:0583 JMicron Technology Corp. / JMicron USA Technology Corp. JMS583Gen 2 to PCIe Gen3x2 Bridge

I populated it initially with 2x 6TB 7200rpm HDDs configured as a ZFS mirror. Overall, it’s been pretty solid. I back up a fair amount of my local stuff to it, including several other Pis that use ZFS. After a year of operation I migrated to 2x 8TB 7200rpm HDDs to get more space. At present it’s at 71% of capacity. It’s been pretty solid except …

One day I attached an SSD to one of the dock’s USB3 ports to transfer some bulk data. Operation became unstable and I needed to stop that Right Now. I do not recall if I tried using one of the USB3 ports on the Pi 4B instead; I just transferred the files over the network. Likewise, when migrating to the larger HDDs I could only do so with two HDDs connected at a time, replacing one drive at a time.
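For anyone curious, that one-at-a-time migration looks roughly like this. This is a sketch, not my exact commands; the old/new device IDs are placeholders.

# grow the pool automatically once both members are the larger size
zpool set autoexpand=on tank
# swap the first 6TB drive for an 8TB drive and wait for the resilver
zpool replace tank <old-wwn-1> /dev/disk/by-id/<new-wwn-1>
zpool status tank            # repeat until the resilver finishes
# then swap the second drive the same way
zpool replace tank <old-wwn-2> /dev/disk/by-id/<new-wwn-2>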

While I was away from my home lab for a few weeks, I found the following status:

root@piserver:/home/hbarta# zpool status
  pool: tank
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: scrub repaired 4M in 17:24:42 with 0 errors on Sun May 11 17:48:44 2025
config:

        NAME                        STATE     READ WRITE CKSUM
        tank                        ONLINE       0     0     0
          mirror-0                  ONLINE       1     0     0
            wwn-0x5000cca0bee6a900  ONLINE      10     0    35
            wwn-0x5000039a78c87b89  ONLINE       4     0    24

errors: No known data errors
root@piserver:/home/hbarta# 

On my return, I checked dmesg output and found problems starting with

[921468.327430] sd 1:0:0:1: [sdb] tag#12 uas_eh_abort_handler 0 uas-tag 7 inflight: CMD IN 
[921468.335675] sd 1:0:0:1: [sdb] tag#12 CDB: Read(16) 88 00 00 00 00 01 7c 26 f2 28 00 00 04 00 00 00
...

All of the errors, including the point where things settled down, can be viewed at https://pastebin.com/VHQxyNZX. I performed an update/upgrade and rebooted before proceeding. Without further action the status became

hbarta@piserver:~ $ zpool status
  pool: tank
 state: ONLINE
  scan: scrub repaired 4M in 17:24:42 with 0 errors on Sun May 11 17:48:44 2025
config:

        NAME                        STATE     READ WRITE CKSUM
        tank                        ONLINE       0     0     0
          mirror-0                  ONLINE       0     0     0
            wwn-0x5000cca0bee6a900  ONLINE       0     0     0
            wwn-0x5000039a78c87b89  ONLINE       0     0     0

errors: No known data errors
hbarta@piserver:~ $ 

The time for the scrub seemed a bit long but I haven’t been tracking it. I’m running a scrub now and it looks like it is going to take about as long.
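If I do start tracking it, something this simple would do; the log path is arbitrary.

# append the completed-scrub summary line to a log after each scrub
zpool status tank | grep 'scrub repaired' >> /var/log/tank-scrub-history.log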

I’ve seen a similar issue once in the past. It’s possible that a drive is failing, though there is nothing in the SMART stats to indicate that. There were errors logged, but I suspect they resulted from power issues or perhaps USB/firmware/driver problems; I’m blaming USB. I don’t recall any issues like this with any of the SATA-connected drives I’ve used. The occasional errors those drives did log were related to power issues and did not result in any apparent operational problems (once clean power was provided).

The upshot is, one can use USB, but there are likely to be occasional hiccups that are not otherwise encountered.

My $0.02US

1 Like

I’ve been using USB storage for nearly a decade. I haven’t had any real issues.

A couple things to keep in mind for anyone who wants to do this:

Make sure any storage devices have the power they need. Not an issue for 3.5" HDDs in an enclosure since they come with their own power supply. If you need a hub for bus-powered devices, make sure the hub’s power supply is enough for the number of devices you’re using.

Portable 2.5" HDDs typically have power-saving features. You’ll want to disable these if using the HDDs in a NAS-like setup. The drives unload the heads often and spin down after a short period of inactivity. The constant loading/unloading and spinning up will add unnecessary wear to the drive, and also hurt performance.
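Roughly, for drives whose USB bridge passes ATA commands through (not all do), something like the following. The values shown are the usual “disable” settings, and they may not persist across power cycles, so check your drive’s documentation.

# disable Advanced Power Management (frequent head parking)
hdparm -B 255 /dev/sdX
# disable the standby (spin-down) timer
hdparm -S 0 /dev/sdX
# confirm the current APM setting
hdparm -B /dev/sdX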

1 Like

CKSUM errors like that worry me. What would have happened if you didn’t have a mirror?

FWIW, I’ve seen similar with SAS, though it made me suspect faulty hardware.

1 Like

Me too! This is turning into a shaggy dog story for me. The host has crashed during the scrub a couple of times. I found it down this morning, apparently not having survived the “second Sunday scrub.” I had an SSH session open monitoring the pool status (watch -n 60 zpool status), which showed gobs of errors. Unfortunately the screen would not scroll back to that and I did not copy it, but it looked like the scrub was over 90% complete before the connection dropped. I power cycled the setup (drives and Pi 4B) twice and it did not come up. After attaching HDMI and USB cables, it came up, of course. :roll_eyes:
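Next time I’ll record the status on the server itself rather than just watching it over SSH, something along these lines; the path and interval are arbitrary.

# run on the Pi so the record survives a dropped SSH session
nohup sh -c 'while true; do date; zpool status tank; sleep 60; done' \
    >> /root/tank-status.log 2>&1 &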

There’s no obvious problem indicated in the SMART data for either drive, and zpool status came up clean, but the scrub shows only 44% complete.

hbarta@piserver:~ $ zpool status
  pool: tank
 state: ONLINE
  scan: scrub in progress since Sun Jun  8 00:24:01 2025
        2.42T scanned at 164K/s, 2.28T issued at 0B/s, 5.18T total
        0B repaired, 44.07% done, no estimated completion time
config:

        NAME                        STATE     READ WRITE CKSUM
        tank                        ONLINE       0     0     0
          mirror-0                  ONLINE       0     0     0
            wwn-0x5000cca0bee6a900  ONLINE       0     0     0
            wwn-0x5000039a78c87b89  ONLINE       0     0     0

errors: No known data errors
hbarta@piserver:~ $ 

I interrupted the scrub, shut the server down, and moved the drives to my desktop, which conveniently has two 3.5" hot-swap bays, and imported the pool there. I kicked off another scrub to see if there is a drive issue.
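For the record, the move was roughly this sequence; a sketch of the commands, not a transcript.

# on the Pi: stop the running scrub and export the pool
zpool scrub -s tank
zpool export tank
# (power down, move the drives to the desktop's hot-swap bays)
# on the desktop: import the pool and start a fresh scrub
zpool import tank
zpool scrub tank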

I’m wondering if the drive bay, the Pi 4B, or the Debian installation is having problems. I can’t rule out a power problem either; perhaps the wall wart is failing.

This does not make me happy. :worried:

:person_facepalming:

DO NOT IMPORT A POOL ON AN X86_64 HOST, WHEN THAT POOL WAS CREATED ON AN ARM64 HOST AND MOUNTS /var, WITHOUT THE -N (DO NOT MOUNT) OPTION

The host begins to behave strangely and eventually falls over. Sadly… I recall doing this before with similar results. Popping the drives out and power cycling the host seems to get everything back to normal, and if I discover issues, I’ll revert to an older snapshot. That’s a benefit of root on ZFS. (Yes, I’m all in.)
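For the record, the safer way to inspect a pool like this on another host is to import it without mounting anything, or confined under an alternate root. Roughly:

# import without mounting any datasets
zpool import -N tank
# or mount everything under an alternate root instead of the live filesystem
zpool import -R /mnt/inspect tank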

Back to the issue. Before the host completely fell over, I was able to kick off a scrub, and it completed without errors. I’m repeating that due to the questionable operation during the earlier scrub. The previous scrub took over 11 hours to complete. In the meantime I’m wondering what I can do to check out the Pi 4B and dock. Perhaps a fresh Debian install with some spare (retired) HDDs to exercise it. I can start with one HDD, which should put less strain on the power supply.

NB: I could find no emoji for ARRGGHH!

2025-06-10 morning update

A second scrub of the drives that hold the server pool has completed on my desktop without issue. It seems the drives themselves are not the problem.

In the meantime I’m exercising the Pi 4B and USB dock using a spare 8TB HDD (a single HDD for now) and a fresh Debian install. I should note that the original install was running from an SD card, and those have a poor reputation for reliability.

  • On first boot the drive was listed as /dev/sda, but I could not fetch SMART stats using smartctl -d sat -a /dev/sda, nor could I read the partition table. (The boot-time checks I’m running are sketched after this list.)
  • Following a reboot, the only USB devices listed were the two Linux Foundation root hubs (virtual hubs, AFAIK). I believe this indicates a malfunction of the Pi 4B USB chip, as at a minimum the internal hubs should be identified:
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
  • Following a power cycle (pulling wall power, not just a reboot) the USB devices were populated as expected and the HDD was identified; SMART stats could be queried, as could the partition table. It held a ZFS pool, which I imported and destroyed.
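The per-boot checks boil down to something like this, assuming the drive shows up as /dev/sda:

lsusb                         # are the dock and the Pi's hubs enumerated?
lsblk                         # does the HDD appear with its partitions?
smartctl -d sat -a /dev/sda   # can SMART stats be read through the bridge?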

For further testing I’m employing some scripts I developed for other testing (https://github.com/HankB/provoke_ZFS_corruption/tree/main/scripts) to populate and exercise the pool. At present the pool is at 16% of capacity and building.
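Not those scripts, but the flavor of the exercise is roughly this; the dataset path is assumed.

# write random files into the pool, record checksums, re-verify later
for i in $(seq 1 100); do
    dd if=/dev/urandom of=/tank/exercise/file$i bs=1M count=64 status=none
done
( cd /tank/exercise && sha256sum file* > SHA256SUMS )
# after a scrub or reboot, check that nothing has changed
( cd /tank/exercise && sha256sum -c SHA256SUMS )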

The previous issue leads me to suspect a problem with the Pi 4B USB chip, unless the dock is somehow interfering with the Pi 4B’s operation.

(The shaggy dog stood up, circled the rug a couple of times, and has lain back down.)

1 Like