So I was trying to download a torrent (while seeding about 5 others) when I noticed my rates kept gradually falling to 0B/s upload/download, spiking back up to 1-2MB/s, then falling again. I checked the SMART results for my drives in Proxmox and they showed one disk as degraded. When I try to view the overall “Disks” tab in Proxmox it just times out and shows an error: [communication failure (0)]

So I ran zpool scrub tank_name, which started Monday May 4 22:02:21 2026…

While scrubbing, the checksum errors on the online, repairing disk (wwn-0x5000c5004d033fc1) just kept climbing… so I took the degraded disk offline. Here’s the current output of zpool status tank_name:

root@nova:~# zpool status Orico2tera4
  pool: Orico2tera4
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: scrub in progress since Mon May  4 22:02:21 2026
        3.53G / 378G scanned at 36.9K/s, 3.47G / 378G issued at 36.3K/s
        9.61M repaired, 0.92% done, no estimated completion time
config:

        NAME                                              STATE     READ WRITE CKSUM
        Orico2tera4                                       DEGRADED     0     0     0
          mirror-0                                        ONLINE       0     0     0
            ata-ST2000NM0011_Z1P2D6SC                     ONLINE       0    13     1
            usb-External_USB3.0_DISK01_20170331000C3-0:1  ONLINE       0     0     3  (repairing)
          mirror-1                                        DEGRADED     0     1     0
            wwn-0x5000c500357c0b91                        OFFLINE      0     0    21
            wwn-0x5000c5004d033fc1                        ONLINE       0     1 2.00K  (repairing)

errors: 49 data errors, use '-v' for a list
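
To pull just the problem devices out of output like the above, here is a minimal sketch. The function name `zpool_error_rows` is made up for illustration, and it assumes the usual NAME / STATE / READ / WRITE / CKSUM column layout. It reads text on stdin, so it also works on saved output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list devices whose READ, WRITE, or CKSUM counter is non-zero.
# Reads `zpool status` text on stdin, so it can be tried on a saved paste.
zpool_error_rows() {
  awk '
    # Device rows look like: NAME STATE READ WRITE CKSUM [note]
    NF >= 5 && $2 ~ /^(ONLINE|DEGRADED|OFFLINE|FAULTED|UNAVAIL|REMOVED)$/ {
      # Counters are compared as strings because values like "2.00K" are abbreviated.
      if ($3 != "0" || $4 != "0" || $5 != "0")
        printf "%s state=%s read=%s write=%s cksum=%s\n", $1, $2, $3, $4, $5
    }
  '
}

# Usage against a live pool (pool name taken from the post):
#   zpool status Orico2tera4 | zpool_error_rows
# The affected files behind "49 data errors" are listed by:
#   zpool status -v Orico2tera4
```

Errors on all four disks at once, across both mirrors, are probably worth reading as a hint that something shared (enclosure, cable, heat) is involved rather than four independent drive failures.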

I haven’t used these disks for super long: my homelab has only been in actual use for about 5 months, and I wasn’t torrenting constantly until February. The disks are refurbished, 2TB each, and they sit in a USB-connected drive bay. My usage is pretty low, just 432.80 GB of 4TB (11.13%).

I’ve looked at my snapshots with zfs list -t snapshot, but I’m not sure when I should try to restore from a snap, and I’ve never done it before. I’ll make sure to take backups more seriously from now on, don’t be me…
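
For anyone else who has never restored from a snapshot: a rough sketch of the least destructive option, copying a single file back out of a snapshot instead of rolling back the whole dataset. It relies on ZFS exposing snapshots under `.zfs/snapshot/<name>/` at the dataset's mountpoint. The function name, the `.damaged` suffix, and the example dataset and snapshot names are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy one file back from a snapshot.
# Assumes the dataset is mounted at $1 and its snapshots are visible under
# <mountpoint>/.zfs/snapshot/<snapname>/ (the standard ZFS layout).
restore_from_snapshot() {
  local mountpoint=$1 snap=$2 relpath=$3
  local src="$mountpoint/.zfs/snapshot/$snap/$relpath"
  local dst="$mountpoint/$relpath"
  if [ ! -e "$src" ]; then
    echo "not in snapshot: $src" >&2
    return 1
  fi
  # Keep the current (possibly damaged) copy instead of overwriting it.
  if [ -e "$dst" ]; then mv -- "$dst" "$dst.damaged"; fi
  mkdir -p -- "$(dirname -- "$dst")"
  cp -a -- "$src" "$dst"
}

# List snapshots oldest-first to pick one (dataset name is an example):
#   zfs list -t snapshot -o name,creation -s creation Orico2tera4/data
# Restore one file from a snapshot named "daily-2026-05-01" (example name):
#   restore_from_snapshot /Orico2tera4/data daily-2026-05-01 media/file.mkv
# Whole-dataset alternative, which discards everything written after the snapshot:
#   zfs rollback Orico2tera4/data@daily-2026-05-01
```

One caveat: a snapshot shares unchanged blocks with the live dataset, so if a file has not changed since the snapshot and its block is corrupt on both sides of the mirror, the snapshot copy is the same bad block. Also, `zfs rollback` only goes back to the most recent snapshot unless `-r` is given, and `-r` destroys the newer snapshots.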

Update:

Turned off the machine and bay, realized it had shit ventilation and that the drives were pretty hot, so I let it cool and gave everything a quick dust down. Nothing seemed bad or visibly fucked up.

After letting it chill out for about 2-3 hours I put the drive bay in a better-ventilated spot and did a scrub, then resilvered the drive, then did another scrub. About to do some SMART tests.
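
For the SMART step, a small sketch of what to look for afterwards. It filters `smartctl -A` output down to a handful of attributes that commonly indicate trouble when their raw value is non-zero. The function name is made up, attribute names vary by vendor, and the column position of RAW_VALUE assumes the usual ATA attribute table.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print common "bad news" SMART attributes with a non-zero
# raw value. Reads `smartctl -A` text on stdin.
# Assumes the ATA table layout where field 2 is the name and field 10 is RAW_VALUE.
smart_red_flags() {
  awk '
    $2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable|UDMA_CRC_Error_Count)$/ && $10 + 0 > 0 {
      print $2, $10
    }
  '
}

# Start a long self-test, then inspect attributes once it finishes
# (device path uses one of the disk IDs from the post):
#   smartctl -t long /dev/disk/by-id/wwn-0x5000c5004d033fc1
#   smartctl -A /dev/disk/by-id/wwn-0x5000c5004d033fc1 | smart_red_flags
# Behind a USB bridge, smartctl may need the device type spelled out, e.g.:
#   smartctl -d sat -A /dev/sdX | smart_red_flags
```

As a rough guide, reallocated or pending sectors point at the disk itself, while a climbing UDMA_CRC_Error_Count usually points at the link (cable, enclosure, bridge), which would fit a USB bay.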

Here’s zpool status -v:

zpool status -v Orico2tera4
  pool: Orico2tera4
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: scrub repaired 0B in 00:56:51 with 0 errors on Wed May  6 23:37:43 2026
config:

        NAME                                              STATE     READ WRITE CKSUM
        Orico2tera4                                       ONLINE       0     0     0
          mirror-0                                        ONLINE       0     0     0
            ata-ST2000NM0011_Z1P2D6SC                     ONLINE       0     0   199
            usb-External_USB3.0_DISK01_20170331000C3-0:1  ONLINE       0     0   125
          mirror-1                                        ONLINE       0     0     0
            wwn-0x5000c500357c0b91                        ONLINE       0     0   100
            wwn-0x5000c5004d033fc1                        ONLINE       0     0   462

errors: No known data errors

And then again after a clear:

zpool status -v Orico2tera4 
  pool: Orico2tera4
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
        The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: scrub repaired 0B in 00:57:18 with 0 errors on Thu May  7 01:28:30 2026
config:

        NAME                                              STATE     READ WRITE CKSUM
        Orico2tera4                                       ONLINE       0     0     0
          mirror-0                                        ONLINE       0     0     0
            ata-ST2000NM0011_Z1P2D6SC                     ONLINE       0     0     0
            usb-External_USB3.0_DISK01_20170331000C3-0:1  ONLINE       0     0     0
          mirror-1                                        ONLINE       0     0     0
            wwn-0x5000c500357c0b91                        ONLINE       0     0     0
            wwn-0x5000c5004d033fc1                        ONLINE       0     0     0

errors: No known data errors
root@nova:~# 

What have we learned?

  • Do biweekly scrubs
  • Put your drives in a not shit location
  • Do trims like, once a month maybe
  • Make way more frequent snapshots
  • Back up your shit!!! NOW!!! To literally anywhere else but just do it!!!
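
The scrub, trim, and snapshot items above could be scheduled with something like the sketch below. The `maintain` function, the `POOL` and `DRY_RUN` variables, the snapshot naming scheme, and the script path in the cron lines are all hypothetical; only the `zpool scrub`, `zpool trim`, and `zfs snapshot -r` commands are real. It defaults to printing commands instead of running them.

```shell
#!/usr/bin/env bash
# Hypothetical maintenance helper; POOL and DRY_RUN are illustrative names.
POOL=${POOL:-Orico2tera4}
DRY_RUN=${DRY_RUN:-1}   # 1 = print the command, anything else = run it

run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

maintain() {
  case $1 in
    scrub)    run zpool scrub "$POOL" ;;
    trim)     run zpool trim "$POOL" ;;   # only useful on devices that support TRIM
    snapshot) run zfs snapshot -r "$POOL@auto-$(date +%Y-%m-%d_%H%M)" ;;
    *)        echo "usage: maintain scrub|trim|snapshot" >&2; return 2 ;;
  esac
}

# Example /etc/cron.d entries, assuming this is saved as a script that calls
# maintain "$1" (the path below is made up):
#   0 3 1,15 * * root DRY_RUN=0 /usr/local/sbin/zfs-maintain scrub
#   0 4 1 * *    root DRY_RUN=0 /usr/local/sbin/zfs-maintain trim
#   0 * * * *    root DRY_RUN=0 /usr/local/sbin/zfs-maintain snapshot
```

Two hedges: trim mostly matters for SSDs, so it may do nothing for spinning disks like these, and Debian-based systems such as Proxmox typically already ship a periodic scrub job with the ZFS packages, so it is worth checking what is scheduled before adding another. Snapshots on the same pool are also not a backup; they die with the pool.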
Comment by desentizised@lemmy.zip:

    I’ve tried recycling a USB-based 5-bay enclosure (which I previously used in hardware RAID mode) for my Unraid-based backup, and even in that lower-criticality use case it was an absolute showstopper. It’s kind of a shame that USB seemingly can’t be used this way. It would make redundant data storage so much more affordable.