
Whether an uncorrectable read error (bad sector) is reported by the drive firmware or dm-integrity detects corruption, the affected LBAs are propagated up to md. md can then locate a good copy (from a mirror, or by reconstructing from parity) and overwrite the bad location with it.

This mechanism is often thwarted with consumer drives, when their SCT ERC timeout can't be set or is longer than the kernel's SCSI command timer default of 30 seconds. Once a command hasn't returned a result of some kind within 30s, the SCSI driver does a link reset. On SATA this has the pernicious effect of clearing the entire command queue, not just the one command that was hung up in "deep recovery". No LBAs are returned, so it's indeterminate where the problem was or what caused it, and no fix-up happens. This results in bad sector accumulation. This misconfiguration is common, and routinely costs people their data.

https://raid.wiki.kernel.org/index.php/Timeout_Mismatch

This may be easier and more reliable to do with a udev rule, but the concept is the same. Also, while this is the linux-raid@ wiki, it doesn't apply only to mdadm RAID, but to LVM and Btrfs as well. I don't know if it applies to ZoL (ZFS on Linux) because I don't know about all the layers ZFS implements itself, separate from Linux. But if it depends at all on the SCSI driver for error handling, it would be at risk of this misconfiguration as well.
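For anyone who wants to check their own setup, here is a rough Python sketch of the comparison described above. It assumes the kernel's per-device command timer is readable at /sys/block/<dev>/device/timeout (in seconds) and that the drive's SCT ERC value is already known in tenths of a second, which is how smartctl reports it. The function names are made up for illustration, and treating "ERC disabled" as a mismatch is my assumption, since the drive then gives no upper bound on recovery time.

```python
from pathlib import Path
from typing import Optional


def read_kernel_timeout(dev: str, sysfs_root: str = "/sys/block") -> int:
    """Return the kernel SCSI command timer for a block device, in seconds.

    `dev` is a bare device name such as "sda" (hypothetical example).
    """
    return int(Path(sysfs_root, dev, "device", "timeout").read_text().strip())


def is_timeout_mismatch(erc_deciseconds: Optional[int], kernel_timeout_s: int) -> bool:
    """True when the drive may still be in recovery when the kernel gives up.

    erc_deciseconds: SCT ERC limit in tenths of a second, or None/0 when
    ERC is unsupported or disabled (assumed here to mean "unbounded").
    """
    if not erc_deciseconds:
        # No ERC limit: the drive can stay in "deep recovery" past the timer.
        return True
    # The drive must give up (and report the bad LBA) before the kernel does.
    return erc_deciseconds / 10 >= kernel_timeout_s
```

With the numbers from this thread, is_timeout_mismatch(70, 30) is False (the drive reports an error after 7 seconds, well inside the 30 second timer), while is_timeout_mismatch(None, 30) is True.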




It seems that an HGST/WD Ultrastar, which is their data center drive line, has ERC disabled by default. I've now set it to the suggested 7 seconds.
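For reference, a small sketch of how the 7 second setting maps onto a smartctl invocation. smartctl's -l scterc option takes the read and write limits in tenths of a second, so 7 seconds becomes 70,70. The helper name and the /dev/sdX placeholder are hypothetical, and the function only builds the argument list; it doesn't run anything.

```python
from typing import List


def scterc_command(device: str, seconds: float = 7.0) -> List[str]:
    """Build a smartctl command that sets SCT ERC read/write limits.

    smartctl expects the limits in tenths of a second, so 7.0 s -> 70.
    """
    deciseconds = round(seconds * 10)
    if deciseconds <= 0:
        raise ValueError("ERC limit must be positive")
    return ["smartctl", "-l", f"scterc,{deciseconds},{deciseconds}", device]


# Example: print the command for a placeholder device instead of running it.
print(" ".join(scterc_command("/dev/sdX")))
```

As far as I know, on many drives this setting doesn't survive a power cycle, which is why it's usually reapplied at boot (for example from a udev rule, as mentioned upthread).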



