
This is actually just a non-scientific rule of thumb that I personally developed the first time I ever used an SSD as a boot mirror.

I have heard smart people confirm that this is a smart and reasonable practice but have never seen any data or supporting figures, etc.

It's basically cost-free and if you don't like other vendors, you can always pair up (current Intel drive) with (one generation ago Intel drive).




I think the "Mix SSD vendors/batch numbers" advice is a holdover from the very early days of SSDs, when a handful of people got seriously slimed by having ~90% of their same-brand-same-batch drives in a single machine fail at once due to some SSD batch failure.

As a side effect, people generally now stagger SSDs a little to avoid something similar happening (ofc if you have multi-machine replication this is less of an issue, but a total machine loss can still hurt due to capacity loss or parts shortages in edge locations, etc.)

I've personally not seen a synchronised SSD array failure happen for a long time, but it's hard to know how much of that is because people now plan to avoid them.

[edit]

with the exception of: https://www.engadget.com/2020-03-25-hpe-ssd-bricked-firmware...


"I think the "Mix SSD vendors/batch numbers" is a hold over from the very early days of SSDs where a handful of people get seriously slimed by having 90%~ of their same-brand-same-batch drives in a single machine fail at once due to some SSD batch failure."

I want to clarify - there's the issue of a bad batch of drives whose longevity is greatly reduced, so they fail in a cluster, etc., etc.

But that is not what we are guarding against ...

Instead, the risk we're thinking about is that there is an actual bug in the firmware that causes a particular workload to brick the drive or destroy it or whatever.

The critical point is that if the drives are mirrored then they experience an identical workload over their lifespan and they could fail literally simultaneously.

So by all means - do indeed guard against bad batches or manufacturing defects by mixing drives. Just understand we're talking about something slightly different here ...
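If you want to sanity-check a mirror for this, here is a minimal sketch of the idea: flag any members that share the same model and firmware revision, since those are the ones that could hit the same firmware bug under an identical workload. The `Drive` record, the `shared_firmware_groups` helper, and the model/firmware strings are all made up for illustration; in practice you would fill them in from whatever your inventory or SMART tooling reports.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Drive:
    # Hypothetical record; populate it from your own inventory data.
    name: str
    model: str
    firmware: str


def shared_firmware_groups(drives):
    """Return {(model, firmware): [names]} for groups with more than one drive."""
    groups = {}
    for drive in drives:
        groups.setdefault((drive.model, drive.firmware), []).append(drive.name)
    # Only groups of two or more drives share a firmware failure mode.
    return {key: names for key, names in groups.items() if len(names) > 1}


# Illustrative mirrors; the model and firmware values are placeholders.
identical_mirror = [
    Drive("sda", "VendorA-X1", "1.0"),
    Drive("sdb", "VendorA-X1", "1.0"),
]
mixed_mirror = [
    Drive("sda", "VendorA-X1", "1.0"),
    Drive("sdb", "VendorB-Y2", "3.2"),
]

if __name__ == "__main__":
    print(shared_firmware_groups(identical_mirror))
    print(shared_firmware_groups(mixed_mirror))
```

An empty result means no two members of the mirror share a model and firmware revision. It says nothing about bad batches, which is the separate issue discussed above.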


yup, the Intel failure I mentioned below was a firmware issue, not any actual failure of the flash modules


I remember the same "mix vendors and drive batches" suggestion for HDD RAID arrays. Ahhh.


I can confirm that there has definitely been, e.g., a batch of enterprise SSDs from Intel a couple of years ago which failed en masse after a certain amount of powered-on time.


From my limited experience, I did have a pair of Intel SSDs in RAID 1 fail within 2 days of each other in the same way. Thankfully the first was replaced and recovered before the second failed.



