HN went down for nearly all of Monday the 6th. I suspected failing hardware.
I configured a new machine that is nearly identical to the old one, but using ZFS instead of UFS. This machine can tolerate the loss of up to two disks. I switched over to it early morning on the 16th, around 1AM PST.
Performance wasn't great. Timeouts were pretty frequent. I looked into it quickly, couldn't see anything obvious, and decided to sleep on it. I switched back to the old server, expecting to call it a night.
Then the old server went down. Again. The filesystem was corrupted. Again. So I switched back to the new server. During this switch some data was lost, but hopefully no more than an hour's worth.
And here we are. I'm sorry that performance is poor, but we're up. I'll work to speed things up as soon as I can, and I'll provide a better write-up once things are over. I'm also really sorry for the data loss, both on the 6th and today.
Raidz2 is not fast. In fact, it is slow. It is also less reliable than a two-way mirror in most configurations, because recovering from a disk loss requires reading the entirety of every other disk, whereas recovering from a loss in a mirror requires reading the entirety of only one disk. The multiplication of the probabilities doesn't work out particularly well as you scale up the disk count (even taking into account that raidz2 tolerates a disk failure mid-recovery). And mirroring is much faster, since it can distribute seeks across multiple disks, something raidz2 cannot do: raidz2 essentially synchronizes the spindles of all the disks.
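The probability argument can be sketched numerically. A rough model, where p is a made-up per-disk probability of failing during the rebuild window (real values depend on drive failure rates, disk size, and rebuild duration):

```python
# Rough sketch of the rebuild-risk argument above; all numbers are
# hypothetical. p is the chance that any single surviving disk fails
# during the rebuild window - for raidz2 that window is long, since
# every survivor must be read in full.
def mirror_loss(p):
    # Two-way mirror: data is lost only if the one remaining partner
    # (the only disk the rebuild reads) fails during the rebuild.
    return p

def raidz2_loss(n, p):
    # raidz2, n disks: after one failure, all n-1 survivors are read
    # in full; data is lost only if at least TWO more of them fail.
    m = n - 1
    p0 = (1 - p) ** m                # no survivor fails
    p1 = m * p * (1 - p) ** (m - 1)  # exactly one survivor fails
    return 1 - p0 - p1

p = 0.1  # hypothetical per-disk failure chance during a long rebuild
print(f"mirror:           {mirror_loss(p):.4f}")      # 0.1000
print(f"raidz2, 5 disks:  {raidz2_loss(5, p):.4f}")   # 0.0523
print(f"raidz2, 11 disks: {raidz2_loss(11, p):.4f}")  # 0.2639
```

With these made-up numbers the double parity wins at 5 disks but loses to a plain mirror at 11: the survivor count grows faster than the extra parity helps, which is the scaling problem described above.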
Raidz2 is more or less suitable for archival-style storage where you can't afford the space loss from mirroring. For example, I have an 11-disk raidz2 array in my home NAS, spread across two separate PCIe x8 8-port 6Gbps SAS/SATA cards, and don't usually see read or write speeds for files[1] exceeding 200MB/sec. The drives are individually capable of over 100MB/sec - in a non-raidz2 setup, I could potentially be seeing over 1GB/sec on reads of large contiguous files.
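The throughput gap is just arithmetic. A back-of-the-envelope check using the figures above (per-disk and observed speeds as stated; the "ideal" line assumes reads stripe cleanly across every spindle, which a stripe of mirrors can approach for large contiguous files):

```python
# Back-of-the-envelope streaming-read estimates for an 11-disk pool,
# using the speeds reported in the comment.
disks = 11
per_disk_mb_s = 100          # sustained sequential speed per drive
observed_raidz2_mb_s = 200   # observed file-read speed on the raidz2 pool

# Upper bound if reads were distributed across all spindles:
ideal_striped_mb_s = disks * per_disk_mb_s
print(f"ideal striped read: {ideal_striped_mb_s} MB/s")  # 1100 MB/s, i.e. >1GB/sec

# The raidz2 pool is leaving most of that on the table:
print(f"observed / ideal:   {observed_raidz2_mb_s / ideal_striped_mb_s:.0%}")  # 18%
```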
Personally I'm going to move to multiple 4-disk raid10 vdevs. I can afford the space loss, and the performance characteristics are much better.
[1] Scrub speeds are higher, but not really relevant to FS performance.