Happy to answer any questions or discuss any comments here.
Once again, thank you to Allan Jude at Klara Systems for the advice and guidance with the new ZFS "special" vdev for metadata caching that is discussed this quarter ...
No questions, just wanted to say thank you for such a great service.
Trusting a service provider is really hard in most cases, but you make it easy to trust rsync with posts like this and backing it up with reliable services.
Your "industries" page has a broken "about" link in the footer—and maybe other issues, as it's very different from most of the rest of your site, and there a some other little not-broken-but-not-ideal things about it that I see with a quick once-over, like the button-look "sign up now" at the bottom only having the text clickable.
> We believe that the risk of "logical failure" of an SSD is higher than the risk of physical failure. This means that some pattern of usage or strange edge-case causes the SSD to die instead of a physical failure. If we are correct, and if we mirror an SSD, then it is possible the two (or three, or four) SSDs will experience identical lifetime usage patterns. To put it simply, it is possible they could all just fail at exactly the same time. The way we mitigate that risk is by building mirrors of SSDs out of similarly spec'd and sized but not identical parts.
This makes sense to me (and is a good example of looking at more abstract failure domains in addition to the basic ones we all know and love) -- I'm curious if there's data to support this. rsync.net is in a good position to possibly collect that data.
I think the "mix SSD vendors/batch numbers" advice is a holdover from the very early days of SSDs, when a handful of people got seriously slimed by having ~90% of their same-brand, same-batch drives in a single machine fail at once due to a bad SSD batch.
As a side effect, people generally now stagger SSDs a little to avoid something similar happening (of course, if you have multi-machine replication this is less of an issue, but a total machine loss can still hurt due to capacity loss or parts shortages in edge locations, etc.).
I've personally not seen a synchronised SSD array failure happen for a long time, but it's hard to know how much of that is because people now plan to avoid them.
"I think the "Mix SSD vendors/batch numbers" is a hold over from the very early days of SSDs where a handful of people get seriously slimed by having 90%~ of their same-brand-same-batch drives in a single machine fail at once due to some SSD batch failure."
I want to clarify - there is indeed the issue of a bad batch, wherein the drives' longevity is greatly reduced and they fail in a cluster, etc., etc.
But that is not what we are guarding against ...
Instead, the risk we're thinking about is that there is an actual bug in the firmware that causes a particular workload to brick the drive or destroy it or whatever.
The critical point is that if the drives are mirrored then they experience an identical workload over their lifespan and they could fail literally simultaneously.
So by all means - do indeed guard against bad batches or manufacturing defects by mixing drives. Just understand we're talking about something slightly different here ...
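To make that concrete, here is a minimal sketch of what such a mirror might look like (pool and device names are hypothetical, and this is only an illustration of the idea, not rsync.net's actual layout):

    # Hypothetical devices: da0 is, say, a 1.9 TB SSD from one vendor and
    # da1 a similarly sized/specced SSD from another. Both sides of the
    # mirror will see an identical lifetime workload, but a firmware
    # edge-case that bricks one of them is unlikely to brick the other
    # at exactly the same moment.
    zpool create tank mirror /dev/da0 /dev/da1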
I can confirm that there has definitely been, for example, a batch of enterprise SSDs from Intel a couple of years ago that failed en masse after a certain amount of powered-on time.
From my limited experience, I did have a pair of Intel SSDs in RAID 1 fail within two days of each other in the same way. Thankfully the first was replaced and recovered before the second failed.
I have a ZFS account - how do those work under the hood? Is it a VM backed by a ZFS volume? How does the overhead compare to a normal account? I suppose using a VM eliminates some advantages of the special device.
It sounds like you're still comfortable using HDDs for the bulk of your storage and adding SSDs as fast caches.
Do you think you'll ever get to a point where you run zpools entirely of SSDs? If so, what criteria are important for you? (Raw price per gigabyte? Power usage? MTBF? Something else?)
In the current landscape I do not foresee rsync.net using all-flash zpools.
All of our access is over WAN, so disk IO is not that important - what matters is raw price per GB, and even with the "nice" SAS drives we buy ... ~$400 for 16 TB is a huge difference vs. ~$700 for 1.8 TB (which is, roughly, the price of the Intel part mentioned in the list of cache drives).
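To put rough numbers on that: ~$400 / 16 TB works out to about $25 per TB for the SAS drives, while ~$700 / 1.8 TB is roughly $390 per TB for the flash part - something like a 15x difference in raw price per GB.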
Yes, that's correct. ALL of the metadata from that point forward gets placed on the cache.
That is, until you fill it up - which you could.
At that point new metadata goes onto the spinning disk vdevs, as it did prior to the cache. As files age in and out, space gets freed on the cache and some new metadata makes it back on there.
So if you fill it up, it behaves more like a cache.
ALSO, you can add multiple metadata caches to a pool ... so if you fill one up, you can add another ...
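For anyone curious what that looks like in practice, a minimal sketch (pool and device names are hypothetical, and this is not necessarily rsync.net's exact configuration):

    # Add a mirrored "special" vdev to an existing pool; ZFS places
    # metadata there from that point forward:
    zpool add tank special mirror /dev/nvd0 /dev/nvd1

    # Optionally also steer small file blocks (not just metadata) onto it:
    zfs set special_small_blocks=32K tank

    # If the first special vdev fills up, another can be added alongside it:
    zpool add tank special mirror /dev/nvd2 /dev/nvd3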
- They added fast metadata storage devices to their ZFS pools. Since failure of these metadata device groups means the whole storage pool is lost, they've set them up with a similar level of redundancy to the rest of their system (i.e. it's fine as long as 4 devices don't all fail in a very short span of time).
- Chia is a cryptocurrency that made drives ~double in price for a while, but that's mostly stopped, though they're still a little more expensive than they were before. Also, reading between the lines, Chia looks really, really scammy/pyramid-schemey, even by cryptocurrency standards.
- A 3rd party SQLite streaming replication project added SCP as a transport mechanism, so now it works with rsync.net
- rsync.net supports lots of remote-side checksumming tools, and they'd really like it if you'd stop using computationally expensive ones when all you're doing is verifying data integrity (a sketch of what that means is below).
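A minimal sketch of the checksumming point, assuming a hypothetical hostname and filename, and assuming (as the thread says) that rsync.net lets you run checksum commands on their side over SSH:

    # A cheap digest is fine for an integrity spot-check and is much
    # lighter on the remote CPU ...
    ssh user@usw-s001.rsync.net md5 backups/dump.tar.gz

    # ... than repeatedly asking for something like:
    ssh user@usw-s001.rsync.net sha512 backups/dump.tar.gz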