The SSD Endurance Experiment: Only Two Remain After 1.5PB

tytso · on Sept 24, 2014

The definition of wear out is more than just the SSD declaring the cell bad, or the SSD failing suddenly. A cell is technically declared worn out when the chance that the cell suffer charge leakage after N months at temperature T exceeds probability P. (Where exactly what these parameters are are a secret that the SSD vendors don't disclose. There are some standards, but the SSD vendors don't necessarily use those standards when the make promises about their product's wear endurance.

So even though an SSD might last for 1.5 PB's worth of writes, there is no guarantee that if you were to then put the SSD on a shelf and wait nine months, that they data will still be good. This is probably one some vendors will declare themselves to be dead after so many gigabytes worth of writes, even if the flash cells haven't "failed" yet. Otherwise users might depend on the SSD's contents being retained, when in fact they might suffer data loss.

But of course, this doesn't really matter much, because you treat all data stored on SSD's as a cache, and do regular backups, RIGHT? :-)

brazzy · on Sept 24, 2014

> treat all data stored on SSD's as a cache, and do regular backups, RIGHT?

You meant: all data stored anywhere, RIGHT?

The best way I've heard it put: data that is not backed up is implicitly classified as "nice to have".

ilaksh · on Sept 25, 2014

"all data stored on SSDs as a cache" is absolutely wrong. The SSD reliability is not less when compared to HDDs.

The SSD is just as useful as permanent storage as an HDD.

Each SSD drive must be evaluated individually. Some batches or models are very unreliable. This can still be an issue with HDDs however.

All storage without backups is subject to loss, therefore all storage systems require backups of important data. RAID and other redundant systems can reduce the likelihood of data loss in SSDs just as well as HDDs.

If you statement was that all SSD data must be backed up regularly, just as HDD data is, then it would have been correct. As you stated it, incorrect and misleading.

bloat · on Sept 24, 2014

This does not invalidate your point of course, but note that the article describes one of the tests some of these drives have passed: write a large file and power the drive off for a week and then check the file.

allegory · on Sept 24, 2014

Try powering off for 5 years and keeping at 75oC and see if the data is still there.

gambiting · on Sept 24, 2014

Wouldn't most storage types, including regular hdds and optical media fail this test though? I am not talking about complete data loss, but some data would be corrupted after such time.

userbinator · on Sept 24, 2014

Magnetic media (hard drives, tape) will essentially retain data indefinitely unless exposed to magnetic fields that are strong enough or until the Curie point is reached, both of which are unlikely scenarios for long-term storage. Even in cases of fire that destroys the external components of an HDD, if the platters didn't get hot enough the data is still there:

http://istcolloq.gsfc.nasa.gov/fall2008/presentations/peders...

Flash-based memory is different - unlike magnetic media which can be thought to be bistable, flash is inherently unstable (monostable); the erased state of a cell is lower energy so the electrons stored in a programmed one are "under pressure", and due to tunneling effects, slowly leak out over time.

The consequence of this is that magnetic media will continue to store information long after it's obsolete; I'm almost willing to bet that the data on a modern HDD will still be there on the platters in 100+ years, even if the rest of the drive becomes inoperable. Ditto for optical media such as pressed CDs - in that case the bits are manifested physically, and unless the medium is degraded to the point where the bits are no longer distinguishable, the data stays (theoretically, even a CD whose reflective layer has degraded is still readable via SEM or other physical means, since the data is physically pressed into it.) On the other hand, flash will slowly and irreversibly erase itself over long periods of time, as each cell returns to its non-programmed stable state.

Retric · on Sept 24, 2014

This is not exactly true, the magnetic fields on HDD migrate around on the disk over time and eventually become unreadable by the disk. In theory the data remains recoverable for significantly longer than that but it's not 'stable'. While historically not much of a problem it's a larger issue as HDD keep increasing in capacity.

HDD actually internally refresh data to avoid this issue so their much better as 'hot' storage. Tape is designed to avoid most of these issues and is much better for long term storage.

"Magnetic media – such as floppy disks and magnetic tapes – may experience data decay as bits lose their magnetic orientation. Periodic refreshing by rewriting the data can alleviate this problem. " http://en.wikipedia.org/wiki/Data_degradation

allegory · on Sept 24, 2014

Decent quality hard disks and DLTs will probably be fine. I've had DLTs survive 100oC for over an hour inside a fire safe in a fire and some old Seagate Cheetah 10K U160 SCSI disks in an external SCSI case with a dead fan running at 80oC for a month quite happily. The latter was actually running and operational which is remarkable.

Some optical media will fail. It doesn't even have to be kept at 75oC from experience.

Flash will almost certainly fail (leakage increases with temperature). I've had dead CF cards, USB sticks, SD cards, the lot and corruption after only a couple of years stored in ideal conditions. Then there's the old Sun PROM crapfest to consider as well...

These are all anecdotes, but there is data out there to support this as well.

VLM · on Sept 24, 2014

"I've had dead CF cards, USB sticks, SD cards, the lot and corruption after only a couple of years stored in ideal conditions. Then there's the old Sun PROM crapfest to consider as well."

The story goes back much further with solid state if you talk to "retrocomputing enthusiasts" unfortunately eproms reading all 1 awhile is depressingly common. You only need a single bit error of course for software failure.

Another interesting point to bring up is it is Very well known in the retro community that the details of eprom programming technique and certification (or lack thereof) of eprom programers has a dramatic effect on burn lifetime, like multiple orders of magnitude difference in burn lifetime. Anyone can make an eprom burner that verifies an hour later. Much harder to make an eprom burner that verifies 10 years later.

This affects more than stereotypical retrocomputing due to embedded devices. Plenty of PBXes and machine tool controllers and scientific instruments and classic video arcade machines get scrapped because the eproms lost their minds.

allegory · on Sept 24, 2014

Very good points there. I worked for an aerospace and defence company for a bit as an electrical engineer. Our software was always read into RAM, checksummed, the RAM was write protected via a register and only then the code was executed. The bootloader was a mask ROM. That was all just to work around the possibility of bit flips in EPROMs.

camillomiller · on Sept 24, 2014

Of course, on magnetic tapes!

allegory · on Sept 24, 2014

I still use a Quantum VS160 DLT to back up my SSD...

The thing plugs straight into my laptop via no brand cheapo SCSI/USB bridge much to the laughing from my colleagues (apart from the ops team who understand why I do it).

Being an ex EE, I understand exactly how "transient" the state of an SSD is.

Electrons; gotta catch 'em all.

beagle3 · on Sept 24, 2014

Is it because you are not a CE that you don't understand how transient DLT tapes are?

Some people have experienced 10% failure/year and discovered it has to do with storage orientation.

kjs3 · on Sept 24, 2014

What does Civil Engineering have to do with tape reliability?

I back up to tape as well (write+verify). It's a risk mitigation strategy (as well as a cost savings). I don't expect to read the tape 10 years from now; I expect to read it an hour or so after a drive fails. The 3 or 4 times I've had to rely on this strategy, I've never lost data.

beagle3 · on Sept 26, 2014

I meant chemical engineering. Not versed in all the acronyms, I apologize.

Despite the commonly believed myths, tapes are not very durable unless you store them in the right temperature and humidity and the right orientation. And .. what do you know, the same holds true for magnetic drives (with slightly different, though no harder to achieve, environmental conditions).

Tapes are also generally more expensive before the 200TB mark or so these days (you did factor the cost of a drive into your cost saving, did you?) and actually around 400TB if you factor in two drives (which you should, because the tape drive also fails).

Trusting an SSD is stupid. But the economics of backup tapes are very different from what the vast majority of people believe. For 99.9% of the situations, backing up to a magnetic drive is a better solution.

userbinator · on Sept 24, 2014

The main issue I have with this form of testing is that it's basically measuring the ultimate endurance characteristics of the flash - running program/erase cycles until some piece of the flash becomes completely unusable. The majority of the time the first failure will occur in a user data block, but there's a nonzero chance that it's in a block mapping table or the firmware itself, and that will definitely cause catastrophic failure. The article seems to be implying that it's OK to write more data than the manufacturer specifies, but this is not something anyone should ever be doing in a real-world scenario, because retention is inversely proportional to endurance and also (exponentially!) to temperature. A drive that retains data for a week at 20C may not be able to at 30C or even 25C.

The 840 Pro's reallocated sector count appears to have started rising at 600TB, which is roughly 2400 P/E cycles, on average, of the whole flash - this is not surprising and agrees with the typical endurance figure of 2K-3K for 2x nm MLC.

I've never agreed fully with the reasoning behind MLC - yes, it's technically twice the capacity for the same die area/price as SLC (or alternatively, half the area/price for the same capacity), but it's also nearly two orders of magnitude less endurance/retention and requires far more controller complexity for error correction and bad-block management. In a storage device, I think reliability is more important than capacity - even with backups, no one wants to lose any data. The tradeoff doesn't make so much sense to me - theoretically, you could buy an MLC SSD that wears out after a few years (thus needing replace it and copy the existing data over to the new one, along with all the risks that causes, etc.), or for only twice as much, an SLC one that probably won't ever need replacing.

A 256GB SLC SSD with 100K P/E cycle flash is conceivably good for 25PB and 5-10 years, or <1PB and over a century... i.e. you could probably use one for archival if stored in a good environment. Part of me thinks the manufacturers just don't want to make such long-lasting products, hence the strong association of SLC to "enterprise" products. (And the much higher pricing of SLC SSDs, more than the raw price of NAND would suggest.)

vidarh · on Sept 24, 2014

The rationale is that most people will never approach those kind of number of P/E cycles, and so people would rathe pay for more space, or pay less. Even in many cases in enterprise settings.

We have some cheapish SSD's in use for some of our high traffic database servers. We lost some drives that failed catastrophically, and the company we bought it from "suggested" we might have worn them out and maybe we didn't have a reason to RMA them, and perhaps we just ought to buy more expensive enterprise models next time.

So we checked the SMART data, and after a year of what to us is heavy 24/7 use with a large percentage of writes, we'd gone through less than 10% of the P/E cycles.

(We did our RMA, and it was very clear that this was a problem with the model/batch - all the failed drives were OCZ Vertex drives from when their failure rate shot through the roof before the bankruptcy)

All our other SSD's are chugging along nicely; the oldest have suffered through 3-4 years of heavy database traffic. I am just waiting for the oldest ones to start failing.

At that rate it doesn't matter if they won't survive as long as SLC anyway: We'll end up replacing them with faster, higher capacity newer models soon anyway - we usually do on a 3-5 year cycle depending on hardware and needs -, because it's more cost-effective for us to upgrade regularly to increase our hosting density as it helps us avoid taking more rack space, and colocation space/power/cooling costs us more than the amortised cost of the hardware.

The consumer market is similar: Most people don't ever buy replacements for failed drives - they buy a newer computer.

userbinator · on Sept 24, 2014

The rationale is that most people will never approach those kind of number of P/E cycles

The flip side of that is most people could now have drives that don't cost all that much more, but last much longer. Most SLC tends to be rated for 100K cycles and 10 years of retention; assuming a roughly inverse correlation, at 10K or 1K cycles the retention goes up considerably to a century or more.

The consumer market is similar: Most people don't ever buy replacements for failed drives - they buy a newer computer.

That is true, but the long-term implications are more subtle; the fact is that most people don't backup, and quite a few of them keep the old drives (that were still working when they were replaced) around as "backup", with the implicit assumption that the data on them will likely still be there if they ever decide e.g. that they wanted to find an older version of some file they had. With flash memory, this assumption no longer holds.

On a longer timescale, we've been able to "recover data" from stone tablets, ancient scrolls and books, this being a very valuable source of historical information; and most if not all of that data was probably never considered to be worth archiving or preserving at the time. More recently, rare software has been recovered from old disks ( http://www.chrisfenton.com/cray-1-digital-archeology/ ). Only the default, robust nature of the media made this possible.

Despite modern technology increasing amount of storage available, and the potential to have it persist for a very long time, it seems we've shifted from "data will persist unless explicitly destroyed" to "data will NOT persist unless explicitly preserved", which agrees well with the notion that we may be living in one of the most forgettable periods in history. It's a little sad, I think.

darkmighty · on Sept 28, 2014

The fact is, even it wouldn't matter as-is if the data took 10000 years to degrade from the platter itself. Most consumer hard drives those days are made for laptops, which are probably used for less than 5 years on average. Even if you consider external HDs and desktop HDs, a long of a lifetime isn't much use: the control electronics themselves fail fast, and the mechanical reliability even faster.

It's an optimization rule of thumb: in an optimal trade-off for (e.g.) maximum reliability for cost, the reliability of each element will tend to be close (actually the derivative of the reliability vs cost will be equal, but this tends to imply the former) -- i.e. you improve the least reliable and sacrifice the most reliable, even if the most has very reliability.

sitkack · on Sept 24, 2014

Sounds like you want some multi-terabyte capacity Dec DiskPacks [0] completely separating the controller and the heads from the recording medium.

http://en.wikipedia.org/wiki/Disk_pack

dijit · on Sept 24, 2014

OCZ Vertex are the cheapest least reliable SSD's in recent memory, I can't imagine how you could ever justify those for database servers.

FWIW I'm not advocating expensive drives, but ones that are known to fail reliably are far better than the cheapest consumer SSD, I put Intel 513's in RAID10 on the databases at my last company with instructions to replace drives at 60% of their expected life.

databases are important, for many people it's the heart and soul of a business, recovering them can be very costly and especially time consuming.

for average Joe, have a computer for 4-5 years and then throw the machine away, you can't expect the same out of prod servers. Please, please please in future when purchasing things for servers check the failure rate, if there is no real world data then DO NOT BUY those things.. especially avoid consumer markets, they're cheap and cheerful for a reason.

vidarh · on Sept 24, 2014

> OCZ Vertex are the cheapest least reliable SSD's in recent memory,

Vertex 3's are perfectly fine. We've not had any of our Vertex 3's fail, in fact, which means they're one of our better performing models. Vertex 2's are known to have high failure rates, and so is 4 (and we tried some 4's and won't again).

> I can't imagine how you could ever justify those for database servers.

You answered your own question. Because they were cheap, and failure of individual drives does not matter.

For any given drive, we assume it will die. For any given RAID array, we assume the entire RAID array will die. For any given server, we assume the server will regularly crash or die. For any given data centre, we assume the data centre will eventually lose power or burn to the ground.

When you start out with those assumptions, that reflect real world risks any business should plan for, you then design your reliability around that:

Everything is in RAID's and can afford to lose at least one and often two drives. Everything is replicated, so if the RAID or the server it is attached to dies, another server can take over. Everything is replicated to a secondary location, so if the data centres loses power (has happened to us - a suspected fire forced the data centre operator to shut everything down before the fire brigade could enter), we can make the decision how long to wait before we reroute (we don't do that automatically at the moment, though we could - we've moved traffic transparently between the data centers in some instances).

To me, if you worry about data loss from failing drives, then your system is designed wrong.

If a system is designed for resilience, drive reliability becomes purely economic question: How much it costs us to expend the effort on RMA'ing drives and send someone down to replace them vs. price difference for drives. In that calculation, the Vertex 3's do just fine. We're not buying more OCZ at the moment, but we'll see what happens under new ownership, who knows.

> databases are important, for many people it's the heart and soul of a business, recovering them can be very costly and especially time consuming.

If database recovery is costly, then in most cases someone is not doing their job. Few businesses have data that is big enough to justify not having both database level replication, regular snapshots, and nightly backups. For every database, we have about 4-5 copies newer than 24 hours old, at a minimum (master, slave, <1 hour old replica of the whole container the master runs in, <1 hour old replica of the whole slave container, and the newest backup image), as well as older snapshots. For some databases we have more copies than that. It costs us peanuts compared to what it costs us to serve up the live versions of the sites those databases are for.

> for average Joe, have a computer for 4-5 years and then throw the machine away, you can't expect the same out of prod servers.

Yes, you can. For us, if we keep our production servers more than about 3 years, we lose money, since as long as we're growing, we have the option of taking more rack space vs. rotating out our oldest servers and replacing them with servers that have many times higher capacity in the same space.

Our current oldest generation servers can handle about <20% of the capacity per 1U of rack space than our newest generation. With the cost of taking an extra rack what it is, it's an absolute no-brainers for us to throw out those servers and replace them with new servers on about a 3 years cycle. Sometimes, if our growth is slower, we'll leave it a bit longer, until we need the space, but 5 years is pretty much the upper limit.

For businesses with an entirely static, and small, workload, sure, you may prefer to keep the servers for longer, and those have the option of buying more expensive drives, or deal with more failures over the lifetime of their server.

> Please, please please in future when purchasing things for servers check the failure rate, if there is no real world data then DO NOT BUY those things.. especially avoid consumer markets, they're cheap and cheerful for a reason.

Don't assume we don't check. And no, I most definitively will not avoid consumer markets. On the contrary. Enterprise components are sometimes worth it. But often they are priced and designed for people who are terrified of component failures or don't want the "hassle". When you have a system where component failure is an assumed "everyday" event, the consumer versions often have a far lower total cost of ownership.

Retric · on Sept 23, 2014

I normally find it annoying when they run endurance tests like this using only one drive of each brand and treat the results as particularly meaningful. However, in this case I think the failures may suggest things about the drives underlying architecture not just who picked the best sample from the bin.

rodgerd · on Sept 23, 2014

> I normally find it annoying when they run endurance tests like this using only one drive of each brand and treat the results as particularly meaningful.

The main takeaway for me from this is less around the reliability of individual drives, but that SSDs as a whole have moved into a space where I don't really need to worry about them being signifiantly less reliable than hard drives.

redangstrom · on Sept 23, 2014

While the conclusion is true and has been for some time, the supporting data is insufficient.

For example, the author doesn't even bother to gloss over key reliability parameters such as data retention, which is primarily impacted by cycling.

(By data retention, I mean the ability of drive to retain written data over a period of time, such as 6 months, since written. This is usually accelerated in testing by a bake cycle.)

That said, it's a fun read and I'm glad there's more exposure on this topic.

jychang · on Sept 24, 2014

Well, they could still improve.

My Macbook's SSD died in a blaze of reallocated sectors and write failures just last week, and that thing was just around 2 years old.

For comparison, I've only had 2 spinning hard drives fail on me (I've owned around 25), and those were >10 years old and mostly decommissioned.

It could be a statistical fluke, and the sheer speed of SSDs means that even now, a higher failure rate is acceptable, but SSDs in general don't have the longevity of older magnetic hard drives.

jahewson · on Sept 24, 2014

My Macbook's SSD lasted just six months before it failed, right before printing a boarding pass for a transatlantic flight. Afterwards it turned out Time Machine had been making corrupted backups, which wasn't fun, fortunately no user data loss, only Applications. I've learnt my lesions about SSDs: they don't give you any hints that they're about to fail, and one they do, it's game over.

fulafel · on Sept 24, 2014

> SSDs as a whole have moved into a space where I don't really need to worry about them being signifiantly less reliable than hard drives

The majority of SSD reliability problems never had anything do with flash endurance. You can tell this from the failure mode too: all the data disappears. Instead, they are caused by firmware and design bugs that cause occasional corruption of the FTL internals.

Regular benchmark-style IO patterns aren't a good way to tickle these bugs so the endurance test doesn't really say anything about this main problem.

VLM · on Sept 24, 2014

Could someone with experience on the technical side of the industry (not marketing or astroturf) tell us if "one drive of each brand" applied to a dozen pieces of hardware is a dozen different brand stickers attached to the eventual output of the same NAND chip foundry, or is it realistic that its really a dozen foundaries? Fundamentally rewrite testing does not test the physical parameters of a brand name sticker, it tests the physical parameters of one kind of IC cell. I would not be entirely surprised to hear there are very few foundries making this exact type of IC but I can't estimate how much larger or smaller than number is relative to the much more well known number of brand name stickers.

As a message to the marketing people I excluded in the above paragraph, I would be willing to pay a modest amount extra money to have the honest foundry name on the marketing material for a drive, and lets be honest I'm only going to buy multiple quantities of your product if I can order from separate foundries so please don't lie and also please give each foundry a different UPC code or something. Somehow I get the feeling this will end up like power supplies where "everyone knows" there is one model of supply and stacks of different wattage stickers applied to the same device to maximize revenue. The market is too corrupt and unequally informed for this to happen. Still, a guy can dream.

Mithaldu · on Sept 23, 2014

Given that return rates for these drives can go up to 2.5% even under normal usage conditions (though granted, that was OCZ), it's really quite meaningless to do this kind of test without having a whole batch of drives for each type.

simoncion · on Sept 24, 2014

Two words: IBM Deathstar.

pyre · on Sept 24, 2014

What does that have to do with commentary on needing to have larger cross-batch samples?

vidarh · on Sept 24, 2014

When bringing up OCZ' notorious high return rate, and using that to indicate a high failure rate for SSD's, bringing up the IBM DeathStar is a very relevant counterpoint.

Larger cross-batch samples is absolutely a good idea, but the dig at SSD's applies just as much to regular harddrives.

[For those who don't know, DeathStar was a nickname given to the IBM DeskStar line of harddrives after one of the models had a problem that gave them high odds of catastrophic head crashes - so bad in fact that they'd strip most of the magnetic material off the glass platters when they crashed[1] -; out of 10 IBM DeskStar drives we had at the time, all 10 failed within a week or two of each other, under a year after we bought them. The problem was is often considered to be one of the main reasons why IBM unloaded their hard drive business to Hitachi]

[1] http://en.wikipedia.org/wiki/HGST_Deskstar#mediaviewer/File:...

michaelt · on Sept 24, 2014

  The problem was is often considered to be one of the main 
  reasons why IBM unloaded their hard drive business to 
  Hitachi

Who, in a compete reversal of fortunes, now produce some of the most reliable drives [1].

[1] https://www.backblaze.com/blog/hard-drive-reliability-update...

pyre · on Sept 26, 2014

You're missing the fact that (IIRC) prior to that DeskStar were considered some of the best drives on the market.

jcampbell1 · on Sept 23, 2014

These SSDs are failing at roughly 3000 write-out cycles. Traditional hard drives can take 6 hours to write out, so doing a similar test would take ~2 years.

Spinning disks are so freaking slow that you could never test the reliability apples-to-apples. Any workload that wears out an SSD could never be run on a traditional HD.

CPAhem · on Sept 23, 2014

A very good point, but it would still be good to see a comparison, even if it does take 2 years to run on a traditional HD.

joshvm · on Sept 24, 2014

This is encouraging although even tests on early drives showed that an 'average' drive should last far longer than people need them for - purely based on the number of allowed writes. 750TB? That's more data than my department, an imaging research group, have on our cluster...

There are some more tests which are very hard to do because you need time and a large sample size, for instance what's the data retention time for a typical SSD?

As far as I can tell, nobody really knows because you'd need to leave the drive off for probably more than a year - and as soon as you turn the drive on, presumably you refresh the charge that's leaked out? Most of the time this isn't a problem because almost everyone turns on their PC or laptop weekly/monthly if not daily.

sounds · on Sept 24, 2014

> as soon as you turn the drive on, presumably you refresh the charge

No, bits lose their charge slowly but it doesn't matter whether the SSD is powered on or off. http://www.tomshardware.com/forum/256180-32-read-longevity

As a side note, the really interesting thing about this endurance test is that almost all the drives fail badly at the end of their life. Even though almost all the manufacturers specifically claim they "fail safe" into a read-only mode.

frozenport · on Sept 24, 2014

It's frequently assumed that breakdowns are geometric random variables and we can extrapolate the rate by testing many drives. For example, test 100 drives.and find when the first one breaks.

The numbers in this test should be interpreted as an average with some drives breaking earlier. Now consider the failure rates for drives where the file is spread across multiple disks.

dzhiurgis · on Sept 24, 2014

If you don't have much RAM I suppose you'd hit these limits much faster as OS would swap a lot of data onto drive.

reitzensteinm · on Sept 24, 2014

Two of the drives survived a year of continuous use, so even regular swapping is trivial.

Say you're using your computer 12 hours a day, half of that doing tasks where memory usage greatly exceeds your physical memory, and you're using 1/4 of the bandwidth of the drive to swap in/out (of which 1/2 are writes).

Under this quite heavy scenario, the writes will catch up to this endurance test after 32 years.

The speed, price & capacity of these drives are (at least for now) improving so quickly that by the time you wear one out, even with torturous use, replacing it will be trivial.

Cost per gigabyte has halved over the period of this test. So even when taking the drive that just died and using it for the most pathological use case imaginable, you'd be buying a $200 drive today, replacing it with a $100 drive in one year, $50 in two years, etc. Not a big deal.

fixedd · on Sept 24, 2014

I just wish someone could tell me one I could buy that would last longer than 1.5 years in my laptop. I'm on #3 now :(

simoncion · on Sept 24, 2014

The 100 "GB" OCZ Vertex LE in my laptop has 27,834 hours of power-on time. This is the only drive in the system, and it houses several always-on encrypted swap partitions. I'd be surprised if you could purchase a new one at this time, though. :P

I can get back to you in a little more than a year about the 750 "GB" Samsung 840 EVO in the lady's laptop, and 6->8 months on the Crucial M4 SSD in my gaming PC. :)

fixedd · on Sept 26, 2014

My last to fail was an OCZ Vector :(

toomuchtodo · on Sept 24, 2014

Any of the Intel SSDs should last you the life of the laptop.

roghummal · on Sept 24, 2014

I prefer $brand, they hold up better under extensive use.

alexbecker · on Sept 24, 2014

If you're swapping regularly then you have more immediate computer needs than a fast hard drive.

vidarh · on Sept 24, 2014

I actually wish I could turn swap off safely on my work computer, because I actually have enough RAM for normal situations. Swap may eventually be an old curiosity for most people, and a "niche thing" for others... When I start swapping, it's a sign some application is leaking memory and I'd rather it didn't get any and crashed instead of freezing the machine up for ages while trashing the drive.

But as it turns out OS X behaves really badly if you try to take away its swap (last time I tried, the kernel memory usage slowly increased until it consumed almost all memory; I re-enabled flash and rebooted and it got back to normal. It's a bit annoying that we've become so dependent on a kludge to work around too little memory. (Brings me back to my Amiga days and the long, heated discussions about whether or not swap was a good idea in the first place)

dmm · on Sept 24, 2014

One trick people in a similar situation on linux use is to make a small ram disk and then assign that as the swap. A little absurd but prevents the bad behavior.

darkr · on Sept 24, 2014

It is indeed absurd, when swapping can be avoided entirely by setting kernel swappiness to 0:

sysctl -w vm.swappiness=0

to make permanent, set vm.swappiness=0 in /etc/sysctl.conf

Though for a desktop machine, disabling swap probably won't help improve performance

dmm · on Sept 24, 2014

That's true, but this was a special circumstance where disabling swap was creating very bad system behavior.

darkr · on Sept 24, 2014

pre 3.5 it doesn't actually disable swap, it will only use it when it's about to OOM.

post 3.5 it does actually hard-disable, but setting swappiness to 1 gives the same behaviour as swappiness=0 in < 3.5

dmm · on Sept 24, 2014

That's good to know, thanks.

callesgg · on Sept 24, 2014

So not entering read only mode after life end seams like a very dangerus bug that is not realy accepteble.

Sidenote: Articles like this always scares the shit out of me I have a kingston ssd that has been in my main server for almost 3 years now.

The smart data seams to say that it is 100% fine but as it has been on for 24* 365*3 hours that seams unlikely.

vidarh · on Sept 24, 2014

If the chance of a drive failure scared the shit out of you, it is an indication that you don't trust your recovery processes... Drive failure shouldn't be more than a minor annoyance.

rocky1138 · on Sept 24, 2014

Yes, but your data is on more than one drive, yes? Presumably you have some sort of RAID setup whereas if the one drive were to truly fail, another would pick up in its place.

rsync · on Sept 24, 2014

What worries me about raid mirrors on SSDs is that a lot of SSD failures are not due to a part failure, but rather, a pattern failure ... meaning, if you subject this SSD to thus and such series of writes, then it fails.

So the worry is, if you mirror an SSD then you could (theoretically) inflict the exact same pattern on them over their lifetime and they would fail simultaneously.

That is why all of our SSD boot drives, which are indeed mirrors, are built from two different SSDs ... either two different generations of Intel SSDs (3xx and 5xx for instance) or one Intel and one equivalent Samsung. This way, their behavior cannot become correlated...

vidarh · on Sept 24, 2014

That should concern you for regular drives too.

The infamous IBM DeathStar problem was partly resolved with a firmware upgrade that added wear levelling, because the crashes were found to be due to material flaking off the platters when the head remained in the same location, leading to dust in the drive that had a high risk of trigger head crashes so bad it'd strip almost all the material off the glass platters.

There's a fun picture on Wikipedia:

http://en.wikipedia.org/wiki/HGST_Deskstar#mediaviewer/File:...

We had an array of 10 of them in 2001, and when the first one failed we weren't aware of the problem and didn't think much of it. Then the second failed a week later. The third a week after that. And so on, almost like clock-work until all of them were dead. At the time that array made up enough of our capacity that we couldn't afford to just take it out of rotation until we'd replaced the drives.

It's the event that taught me the hard way to always at a minimum mix batches, and preferably models and/or brands (just mixing drives from different batches would've been insufficient with the DeathStar, as far as I remember). As well as to favour multiple smaller arrays... We never lost any data, and the array remained available for the entire time period it took to cycle through all the drives, though.

VLM · on Sept 24, 2014

I've done the same strategy with spinning rust. Only a noob sets up a RAID mirror with two drives same model same mfgr and adjacent serial numbers. I say noob because you only experience this failure mode once before you'll professionally refuse to deploy that way.

I have had historical issues with corporate procurement of dell servers with this problem. There is a labor intensive strategy where you throw all the servers into one room and play switcheroo. They'll still be the same model and maybe even the same batch but at least they'll probably not be adjacent serial numbers anymore.

A coworker of mine had the strategy of buying RAID servers and then pulling and replacing one drive from all the servers after initial burnin and keeping the yanked drive as as a cold onsite spare.

I would gladly pay a little more money to dell or a competitor to put together turnkey servers with carefully intentionally mismatched raid.

callesgg · on Sept 24, 2014

Partialy, I do backups of everything that is mine, documents and stuff like that. But restoring the entire system with all tweaks and packets would probably take weeks.

bigiain · on Sept 24, 2014

Why?

Give the startlingly low cost of storage these days, I can't imagine a scenario where I'd risk "weeks" of productivity just to save on the raw storage cost of doing a complete (and i my case, bootable) "entire system" backup/archive.

callesgg · on Sept 24, 2014

Is not about money it is about ROI setting up a backup system would be some work and doing backups verifying that they work everything takes time (time to remake system - the time it would take to do backups)/(the risk of failure) I think it is to low

bigiain · on Sept 24, 2014

Fair enough - for me rsync(+bash+cron) and a $80 external usb drive (or just clicking the buttons on TimeMachine on Mac OS X boxen) is a small enough "I" to easily justify the "R" - even for data I could retrieve easily enough if needed. For data I actually care about, I put a lot more effort/money into ensuring I don't lose it.

sounds · on Sept 24, 2014

Unlike magnetic HDDs it does not matter what the "powered on" time is for an SSD. The most significant factor is the number of blocks erased (and presumably written to). High temps can also affect SSD life.

dragontamer · on Sept 24, 2014

http://www.overclock.net/t/1507897/samsung-840-evo-read-spee...

Incorrect. It has become apparent that the Samsung 840 EVO (Triple-level Cells... 3-bits per cell) have a "powered on" issue.

There seems to be significant slowdown when reading data that hasn't been touched for months on those drives.

plaguuuuuu · on Sept 24, 2014

I had an SSD totally brick on me. Not fail to write but just brick. No idea how to recover it

wazoox · on Sept 24, 2014

You can't, so far that's how SSDs die. Kroll Ontrack and similar services may get some of the data back, but that's all.

devindotcom · on Sept 23, 2014

I kind of anthropomorphize the devices in tests like this, so it's a bit sad to see the poor things made to run until their legs fall off. But it's nice to know they run farther than expected.

Scuds · on Sept 23, 2014

running a defragment operation on a few of these makes them emit some electronic noise if you listen closely. :D

darkstar999 · on Sept 24, 2014

For real though, don't defrag an SSD. It's pointless and wears it out for no reason.

dghughes · on Sept 24, 2014

The old Windows defrag with the small blocks was like visual bubblewrap.

We need more system utilities that are fun to watch.

reitzensteinm · on Sept 24, 2014

I agree that it's not a good idea to defrag an SSD. But I'd like to nitpick your assertion that there is no reason. Even on excellent drives, 4k random reads/writes are not as fast as sequential.

For example, the SSD in my machine, a 120gb Intel 330 series, can do ~88 MB/s of random 4k reads, but 500 MB/s of sequential, a multiple of ~6x.

Now, a Seagate 5 TB drive will do 146 MB/s of sequential reads and ~470 kB/s of random access, a difference of ~310x, so SSDs are punished for poor access patterns significantly less.

But it's probably possible to manufacture a situation where defrag would indeed give significant benefits. Whether it's ever seen in the real world is another story.

[1] http://ark.intel.com/products/67287/Intel-SSD-330-Series-120...

[2] http://www.seagate.com/www-content/product-content/desktop-h...

noselasd · on Sept 24, 2014

Random reads are still significantly slower than sequential reads on most SSDs.

colordrops · on Sept 23, 2014

The fact that they work better than spec indicates to me that we are still at the forefront of this technology with good engineering effort behind it. Once it matures and companies try to squeeze out every dollar, expect them to fail a lot more and the MTBF to be less than advertised, similar to printers etc

scrollaway · on Sept 23, 2014

It'd be nice to get some endurance testing on the sandisk internal solid state drives (eg. http://www.amazon.com/Sandisk-SDSA5JK-064G-Module-Laptop-Net...)

I've had one fail three days ago on a laptop that was barely a year old. I'm not even sure what actually failed - debugging a broken ssd is a pain, and when I mount it it just freezes for ages, making the matter worse.

Sandisk being the only real seller of those things there isn't exactly a lot of competition but I'd like to actually see how they hold up vs. regular SSDs. There is a bit of a false expectation when you buy a laptop with a ssd, expecting the endurance of a regular ssd and getting something potentially awfully bad.

kabdib · on Sept 23, 2014

Have seen quite a few Sandisk SSDs die over the past 3-4 years. Ditto, OCZ.

Yet to see an Intel or Samsung bite the dust. I'm sure it's only a matter of time, but I won't buy any other brands.

simoncion · on Sept 24, 2014

I have an OCZ Vertex LE that's still going strong after a little over 27,800 power on hours and ~18.6 TB of data written.

I would be shocked if "Judge reliability on drive models, rather than manufacturers." was any less true in the SSD era than it was in the HDD era.

vidarh · on Sept 24, 2014

You're right, sort of. Most of the time that's certainly the case. But OCZ QA appears to have fallen off a cliff at some point, to the extent where it was presumably a large factor in their bankruptcy, and it was a systemic problem with most OCZ models.

A huge list of models have failure rates way above average, with a number of them exceeding 5%, and some claiming that the failure rate for some of the Octane and Petrol models exceeded 30%.

The best ones have been in line with other manufacturers, though. Unfortunately the problems were so widespread that your odds of picking a "safe" OCZ drive for a while were ridiculously bad, unless you were prepared to wait for a year or two to get hard numbers on a model before buying.

The Vertex line has been a crapshot, for example. Several of the Vertex 2 models had unreasonably high failure rates (5%-10% range). Your Vertex LE is as far as I know based on a different design/controller than the regular Vertex 2 and might have "escaped" the Vertex 2 problems. The Vertex 3 appears to have been much better (and none of the Vertex 3's we have have failed). Vertex 4 has been disastrously bad for us - every single one failed hard after less than a year, to the point where even the SMART data is partially corrupted on most of the drives (claiming it's been powered on for 75 million years for example..), and marked the end of buying OCZ drives for our part.

kalleboo · on Sept 23, 2014

I'm happy to see the high write longevity these drives are achieving, but it frightens me a lot that it seems like the fail-safes where they're designed to go into read-only mode instead of just dropping off the controller and losing your data are failing on all of them, even the Intel!

cesarb · on Sept 23, 2014

In my experience with traditional HDDs, they often drop off the controller and lose your data when they fail. Which means that, at least, SSDs are not worse than HDDs in that regard. And they have the potential for a softer failure mode (going read-only).

qwerta · on Sept 24, 2014

There was test at czech site diit.cz. SSD survived several overwrites. But after it was left disconnected for several months without power, it lost all data. Apparently SSD needs to repower its cells periodically.

masklinn · on Sept 24, 2014

Note that the techreport includes a component of leaving SSDs unpowered to verify exactly that kind of issues. Although it's not a very long one.

higherpurpose · on Sept 23, 2014

I wish they tested a Crucial drive, too. Crucial drives tend to have great dollar/GB ratio.

Aoyagi · on Sept 23, 2014

I wish the number of samples was much, much higher...

CPAhem · on Sept 23, 2014

Yes and that SSDs had also been compared to normal spinning hard drives, too.

JoeAltmaier · on Sept 23, 2014

Here's somebody else's data: http://www.overclock.net/t/1284055/ssd-return-rates

gonzo · on Sept 24, 2014

ck2 · on Sept 24, 2014

The intel failed to reach the petabyte mark.

This is interesting because many datacenters use the intel.

daurnimator · on Sept 24, 2014

Though it is noted they intentionally fail:

> The Intel 335 Series is designed to check out voluntarily after a predetermined number of writes. That drive dutifully bricked itself after 750TB

dragontamer · on Sept 24, 2014

This is also the consumer drive instead of the enterprise drive.

rasz_pl · on Sept 24, 2014

hahaha Did you really believe that claim? Intel drive documentation states clearly drive should switch to READ ONLY mode, instead it DIED.

TheLoneWolfling · on Sept 24, 2014

I am concerned about the lack of read-only at EOL.

I, for one, would much prefer slightly less longevity and better reliability than vice versa.

I mean, I do backups, but backups only do so much.

chroma · on Sept 24, 2014

Backups only do so much? Could you elaborate on what your backups don't do?

My backups allow me to recover from theft, hardware failure, accidental deletion, and more. If my computer were to burst into flames right now, I would only lose 30 minutes of work.

Even if SSDs reliably went read-only at the end of their service life, I would still keep my existing backup strategy. There are so many ways to lose data without disk failure.

TheLoneWolfling · on Sept 25, 2014

I have a laptop, and as such tend to use it on-the-go.

As such, backups won't capture anything after the last time I had access to my backup volume.

Also, again, I don't always have access to my backup volume.

I'm not saying "I wish that they would go read-only so I didn't have to back up" - I am perfectly well aware of the need to back things up.

shdon · on Sept 24, 2014

Backups don't keep your system somewhat running in the event of failure. And unless you run a full image backup, it takes much more time and effort to restore from a data-only backup than to simply do a direct copy off of a write-locked but perfectly readable drive.

NoMoreNicksLeft · on Sept 24, 2014

If an SSD is written to once, and kept powered on (and at a temperature in the low 70s), while getting regular reads, how long can I expect this drive to last?

If that drive is powered on, but kept at low temperatures (say near or below freezing), does this help it survive longer?

What failure modes would occur in such an environment? Would it just be power surges frying the thing, static electricity?

listic · on Sept 24, 2014

They should have included Samsung 850 Pro, which sells since August. http://smile.amazon.com/s/ref=nb_sb_ss_c_0_7?url=search-alia...

OniBait · on Sept 24, 2014

That would've been a bit hard to do seeing as how the endurance test has been running for over a year. Even with SSDs, 1.5 petabytes of reads/writes takes a while to write.

gambiting · on Sept 24, 2014

I've always been curious - I understand that memory cells have their own durability, but how about the controllers that are used to transfer that data? Do they wear out too? In fact, can a CPU fail after having exobytes of data sent through it?

arenaninja · on Sept 23, 2014

Does anyone know if there's a utility that monitors SSD health? I've a 512GB SSD, which I'll probably keep for a while, but I don't like being in the dark about how far along its lifetime I am

emddudley · on Sept 24, 2014

Samsung drives come with Samsung Magician, which is a pretty nice utility.

AndyNemmity · on Sept 24, 2014

And the only time I've lost a drive was due to a Magician firmware upgrade that was wrong. I now avoid it out of general fear.

jewel · on Sept 23, 2014

smartmontools work good on the command line in linux. There's also a GUI called palimpsest which makes the SMART numbers easy to read.

On other operating systems any tool that can read the SMART data is what you should look for.

Eiriksmal · on Sept 25, 2014

smartmontools can be utter crap when it comes to interpreting SSD data. Some manufacturers seem to deliberately program the SSDs to report SMART data incorrectly, or report correct data under the wrong code[0], just to encourage you to use their proprietary software.

I have an 840 Pro in one server that works okay with smartctl, but the Corsair Neutron GTX in a different server has readings all over the board that are utterly useless.

[0] http://forum.corsair.com/v3/showthread.php?t=89316

alecco · on Sept 24, 2014

I can't find in the article if those were sequential 1.5PB writes or random small writes (i.e. < 4KB). If the later case, this article should be flagged.

arihant · on Sept 24, 2014

They link to a post with details of the test in this post, which mentions it is sequential - http://techreport.com/review/24841/introducing-the-ssd-endur...

alecco · on Sept 25, 2014

Thanks!

abvdasker · on Sept 24, 2014

Really excellent writing on these pieces. I lol'd at "dutifully bricked itself". If only all tech writing were as colorfully engaging.