Pure: No more hard drives will be sold after 2028 (blocksandfiles.com)
34 points by jerryjerryjerry on May 11, 2023 | 48 comments



Could someone help me understand something I noticed recently while shopping around for SSDs: the pretty large difference in power usage between consumer-targeted models and datacenter models. How much of it is attributable to larger memory sizes versus other factors?

Examples:

- Intel 660p 2TB: 100mW active. Older and slower, but only 100mW!

- Samsung 990 Pro 2TB: 5-6 watts

- Micron 9400 >=8TB: 15-25 watts!

Intel: https://www.intel.com/content/www/us/en/products/docs/memory...

Samsung: https://download.semiconductor.samsung.com/resources/data-sh...

Micron: https://media-www.micron.com/-/media/client/global/documents...


The 100mW active power rating for the 660p is almost certainly measured while running MobileMark, a general system performance benchmark that is very lightweight and leaves the storage completely idle for the bulk of the test duration. So that power rating is really more of an indicator of idle power in a machine sitting at the desktop doing nothing (but not in a suspend/sleep state at the system level). The 990 Pro power rating and the enterprise SSD power rating are both for a drive that's actually busy. The Micron 9400 has a controller with twice as many channels and none of the performance-sacrificing power optimizations that consumer SSD firmware comes with.


The Micron does 1.6M/0.6M read/write IOPS. There's probably still some difference, but IOPS/watt is probably a more reasonable way to look at these.
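
To make that concrete, here's a rough back-of-envelope comparison (a sketch only; the Micron figures come from the comment above, while the consumer-drive numbers are assumed ballpark values, not datasheet figures):

    # Rough IOPS-per-watt comparison.
    # Micron 9400 figures come from the parent comment; the consumer NVMe
    # figures are assumed ballpark values for illustration only.
    drives = {
        "Micron 9400 (enterprise)": {"read_iops": 1_600_000, "watts": 20},
        "Consumer NVMe (assumed)":  {"read_iops":   700_000, "watts": 6},
    }

    for name, d in drives.items():
        print(f"{name}: {d['read_iops'] / d['watts']:,.0f} read IOPS per watt")

On numbers like these the efficiency gap looks much smaller than the raw wattage gap, which is the point of comparing IOPS/watt rather than watts alone.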


It is important to note that Pure Storage is a vendor of all-flash storage solutions, so there may be some exaggeration in their claims.


For one, they totally neglect that NAND loses data over time. Granted, not fast, but a 10-year-old flash drive is more likely than not to be unreadable. This problem has not been resolved.

Hard drives, while they do have a finite lifespan, can probably survive 100 years or more. Furthermore, the components likely to fail can be replaced.

Surprisingly, CD-ROMs (and DVD-ROMs and so on) are going to last for centuries or millennia. Even writable CD-Rs will do. They also have the advantage that, if necessary, you can read them through a microscope, so failing drive components don't matter that much: with enough patience you could recover the data with a microscope, pen and paper.


> NAND loses data over time

All kinds of data media lose their information over time. Even the novel you wrote in high school 10 years ago (as an example) with pen & paper would lose its ink over time due to evaporation. There is a reason books are lost to time.

There is one, and probably only one, exception to this: flowing data. Only data that is constantly refreshed can persist indefinitely.


Ok, sure. But NAND loses data FAST compared to hard drives.


The more limited life span, lack of warning about impending failure, and greater cost (at the drive sizes I want) are why I haven't moved to SSDs yet. The speed advantage over platters is nice and all, but isn't really a factor for me.


hard disks generally don't do well when left to rot unused. and if you need to repair one, replacement parts may not be easy to come by anymore. writable cd-roms are also not very reliable for long term storage. quality of the material surely makes a big difference, but lifespan is about 10 to 20 years. that being said, I still have floppies from the end of the 80's that still work


Yes, I would take it with a very large can of salt.


One thing that HDDs seem to have and SSDs seem to lack is that they fail _reliably_. Some SMART values get higher, then warning, then error. Sometimes they go directly to error. Tick tick tick tick. But you can often still recover some data.

SSDs on the other hand: from one day to the next they are dead, don't do NVMe or SATA handshakes anymore, and you're done.

I wonder if SSD firmware still needs to mature to the point of allowing at least some connectivity and data recovery after a failure.


"I wonder if SSD firmware still needs to mature to the point..."

This is a major issue; I've had almost brand new SSDs fail catastrophically without warning. In one instance I'd gone to great trouble to create an organized data disk for archiving and it failed just as I was about to back it up, and nothing I did could get it even to show up on the SATA interface, let alone give me access to the data.

This is most annoying. First, why does the interface have to die as well when something goes wrong with the memory? Second, can't the firmware at least tell us what's wrong with the memory? (Given that the interface is essentially the smallest part of the drive's real estate, it's unlikely to be the bit that fails; similarly, the firmware would be the second smallest. Even then, why isn't redundancy built into drives for these two vital components (two of each)?)

So what's going on? To maintain the integrity of our data we not only need these issues solved, we also need to know much more about the architecture of SSDs.

As users, we ought to pay much more attention to the fact that much of this problem stems from the inordinate amount of secrecy SSD manufacturers adopt. We really do need some new manufacturer to come to market with a drive built around data integrity, whose selling point would be 'it's open, here's what we do, and here are our tools to help you recover your data'.

There's also the longevity problem that needs resolution; I've put SSDs into storage as backups only to come back several years later and find the data missing.

As it is, the data on HDDs is ephemeral enough, but relying on a 'well' of electrons trapped in 'glass' is not only risky but almost foolhardy. If it had not been for the speed and convenience of SSDs we'd never have contemplated using them in the first instance.

Before we scrap HDDs we really do need much more reliable data storage than SSDs, especially for archival data. It remains to be seen what eventuates, but whatever it is, it should be based on a technology (physics) that's inherently reliable and stable, crystals for instance.


> This is most annoying, first, why does the interface have to also die when something goes wrong with the memory?

I think you've read too many horror stories about unreliability of flash memory and are now blaming the flash memory for everything that goes wrong with an SSD even in instances that look very likely to be a firmware bug rather than a direct result of flash memory failing.

Flash memory is error-prone, but largely in predictable ways. SSD firmware is a pile of overcomplicated software developed in a hurry and squeezed into an embedded system with limited resources. Do you really think the former is more likely to randomly cause catastrophic failures?

> Given the interface is essentially the smallest part of the drive's real estate it's unlikely the bit that fails, similarly the firmware would be second smallest—even then, why isn't redundancy built into drives for these two vital components (two of each)?

Maybe you're judging "smallest" in terms of transistor count or something, but your metric should be complexity. A NAND flash memory chip is a massive array of highly-regular memory cells that have been thoroughly characterized both in terms of aggregate/average behavior and localized defect rates. By contrast, SSD firmware is among other things implementing at least half the functionality of a filesystem, and the host interface of the SSD has to interact with third-party components that only loosely comply with protocol specifications that are themselves sometimes shockingly vague or silent on important questions.


"think you've read too many horror stories about unreliability of flash memory..."

Given my collection of dozens of dead SSDs, I have my own horror stories; my comments are based on painful experience.

In most instances when I've had an HDD fail (and that's many over the years) I have been able to recover the data, but I cannot say that about SSDs or USB thumb drives, as they mostly fail catastrophically and without warning, and the data is irrecoverable. Moreover, with SSDs, in cases where data has been accidentally erased, wear leveling often works against successful recovery.

I've lost far more data to SSDs than I ever have to HDDs yet I've only been using them for about one third as long. I've now reached the point where I don't bother to recover data from a dead SSD as I know it'll likely be a futile effort. That's still not so with HDDs but there are exceptions.

Moreover, whether it's hardware failure or accidental data erasure, ALL SSD and USB thumb drive manufacturers either provide no recovery tools, or if they do, the tools are inadequate almost to the point of being useless (for instance, SanDisk's recovery tool is hopeless; I've just unsuccessfully tried it on SanDisk's own 256GB SD cards). My contention is that if the firmware architectures of these devices were actually designed to be aware of file structures then recovery would be easier. Essentially, they're still not user-friendly, and manufacturers have made little or no effort to rectify the problems.

"SSD firmware is a pile of overcomplicated software developed in a hurry and squeezed into an embedded system with limited resources."

Limited resources, why? A 500GB SSD is orders of magnitude larger than even the most bloated firmware would occupy. (The world's most bloated software, Windows, only takes about 7GB!) If SSD firmware is developed in such a hurry that it jeopardizes users' data, then I would contend the products are not fit for purpose. The fact remains that SSDs are intentionally made as 'black boxes' by their manufacturers, and I'd maintain that this is highly unsatisfactory as far as users are concerned.

We ought to be long past the point where users' data are put at risk by shoddy or limited design efforts and/or the desire of manufacturers to keep their designs secret. Nowadays there's too much at stake to hold the world's data inside black boxes that no one understands, ones that not even manufacturers can recover (or are willing to recover) data from when they fail. We users deserve much better. [In saying that, I'm no apologist for HDD manufacturers, who adopt similar tactics.]

"A NAND flash memory chip is a massive array of highly-regular memory cells that have been thoroughly characterized both in terms of aggregate/average behavior and localized defect rates."

First, your point about SSD firmware complexity. Perhaps you are correct, but I'd not be convinced until I saw the architecture and source code. (It's not only SSDs where secrecy and the lack of source code are a problem; it's a serious issue with many products, and it likely won't be resolved until AI is usefully deployed to reverse engineer them.)

Clearly, you have considerably more faith in NAND technology than I do. That's fine; your objectives are likely very different from mine. I'll just say this: the current trend in NAND technology, where more and more bits are stored in smaller and smaller cells crammed into more and more stacked layers, and where each data bit consists of fewer and fewer bottled-up charges that eventually leak away despite the quantum potentials trying to keep them in place, is not my idea of a reliable, permanent storage system.

I've been recovering data from dead drives of both types for quite some time now, so my views are informed by how difficult it can be. I'm not defending HDDs either, except to say that as a more mature technology, and given their storage architecture, recovery is easier. I'll also grant that recovering data from some HDDs is intractable, but such instances are comparatively rare.

Incidentally, I remember one such failure very well as it was so obviously connected with what I did. It was a Conner Peripherals 20MB drive (yes, it was a long while ago) in a desktop PC. I dropped a small book onto the table where the PC was sitting and the drive instantly gave read errors. Despite strenuous efforts, its data was irrecoverable.


Yes, I have suffered data loss from both HDDs and SSDs, paid for the recovery services available, and HDDs have always been the ones to see any degree of success. SSDs? You're probably screwed.


I've sent three drives to a remote recovery service for clients: the first recovery failed and the other two were fully successful, but all of them were SSDs. HDDs I never sent out for recovery because generally they were not complete failures (i.e. some bad blocks, and a copy-around was possible) or the board could be replaced with one from the same model and firmware to fix it.


Meaningful data recovery from a failed SSD could need tens of millions of dollars' worth of equipment, depending on how it failed.


Can you give a source for why you think that?


The headline seems a bit sensationalized, but there certainly are major forces pushing hard drives toward being a niche storage technology rather than a mainstream or default choice. Hard drive prices ($/TB) haven't been improving as quickly as SSD prices. SSDs have already overtaken hard drives for low capacities. For datacenters, SSDs provide more TB per rack unit. As drives get bigger, hard drive performance characteristics become more tape-like; some workloads will be unable to benefit from better $/TB of larger drives because they cannot tolerate the performance hit of consolidating more data behind a single actuator.

At scale, whether to use hard drives or SSDs is not so much a question of how much capacity or performance you need, but of the ratio of capacity to performance. Hard drives are best for cold data; SSDs have been the choice for hot data and are taking over for warm data.
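
To illustrate the "more tape-like" point: a hard drive delivers a roughly fixed number of random IOPS per actuator regardless of capacity, so IOPS per TB falls as drives grow. A minimal sketch, with the per-actuator figure being an assumed ballpark value rather than anything from the article:

    # IOPS per TB for a single-actuator hard drive.
    # ~150 random IOPS per actuator is an assumed ballpark figure.
    IOPS_PER_ACTUATOR = 150

    for capacity_tb in (4, 10, 20, 30):
        print(f"{capacity_tb:>2} TB drive: {IOPS_PER_ACTUATOR / capacity_tb:5.1f} IOPS per TB")

A 30TB drive behind a single actuator can serve each terabyte far less often than a 4TB drive can, which is why workloads with any real access rate can't take advantage of the bigger drive's better $/TB.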


As a regular joe who wants to store some TBs of data, it does not matter to me whether the $/TB has been improving faster or slower. I just care that I can still read my data after X years, and which option has the lower $/TB.

SSDs improved in price by 60% and HDDs by 2%? I don't care, so long as HDDs are still half as expensive per TB as SSDs per unit of space taken in my enclosure...


I prefer my data medium-rare.


I just can't see this. HDDs sit at a spot on the Pareto frontier in their niche not matched by tape (slower access times, no real random access, so fetches are expensive) or by SSDs (which will retain a higher $/GB for the foreseeable future). Maybe electricity costs are significant, but if you're a hyperscaler storing exabytes, what else are you going to use to store the vast majority of infrequently accessed content?

Am I missing something?


Fitting in between two other technologies on the Pareto frontier doesn't guarantee the niche will stay wide enough to be economically viable. Optane failed because it couldn't carve out enough of a market in between flash and DRAM. But I don't expect hard drives to die out entirely within the foreseeable future. Their niche will get smaller, but they're able to steal some market share from tape and can sacrifice performance to save power, while probably staying cheaper on an up-front $/TB basis than flash.


TLDR: I think the electricity costs of HDDs will come down rather than HDDs going away, because the gap between flash and tape is so large on both performance and $/GB. Optane is a different story.

I can see drives switching to 5400 RPM, or even more complex schemes being developed, like variable spin rates (write out at one RPM, only spin up to 15k RPM for reads when there's sufficient traffic, and park when idle) to conserve energy. Additionally, by looking at what data is cold and "defragging" it in a distributed fashion, you could arrange for entire host machines attached to the disks to shut down. I would imagine that the electricity hog is the host machine itself rather than HDDs idling with their heads parked, and thus you'd see more savings that way.
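
A rough back-of-envelope on that last point; all the wattages below are assumed ballpark figures, not measurements:

    # Assumed ballpark power figures for a storage node with 24 drives.
    HOST_IDLE_W = 200     # server chassis, CPUs, RAM, fans (assumed)
    HDD_IDLE_W = 5        # per drive, spinning with heads parked (assumed)
    HDD_STANDBY_W = 1     # per drive, spun down (assumed)
    DRIVES = 24

    print(f"host on, drives idling:    {HOST_IDLE_W + DRIVES * HDD_IDLE_W} W")
    print(f"host on, drives spun down: {HOST_IDLE_W + DRIVES * HDD_STANDBY_W} W")
    print("host powered off entirely:   0 W")

With numbers like these, spinning drives down saves far less than being able to power off the whole host, which is the argument for packing cold data so that entire machines can be shut down.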

Re Optane, I agree not all Pareto frontier niches survive just because they exist, but Optane is different here because it wasn't on a strict Pareto frontier. It was more expensive, used more electricity, and had less capacity than SSDs. The introduction of NVMe made flash performance competitive enough with Optane that there wasn't much room to maneuver (https://ssd.userbenchmark.com/Compare/Samsung-980-Pro-NVMe-P... - granted there's a 3 year gap, but I think Intel wasn't necessarily able to sustain continued performance improvements and could only really improve $/GB). Intel also failed to invest in software to take advantage of its byte addressability by reducing RAM usage through XIP, although I'm not sure how big of a difference that would have ended up making.


Can we step back and appreciate both how impressive and how bizarre it is that, in the year 2023, tape media is still somehow the cheapest per byte?


Nothing impressive for me: tape has the worst access latency (especially in the worst-case scenario) and tolerates the fewest accesses (I mean, it cannot survive even one million reads).


Its usage scenario is completely different though. Generally, LTO is best suited for backups and appending changes to the tape rather than reading it back. It's the backup method you hope you never need to use, but you'll be glad it's there if the shit hits the fan that badly.


I have a prediction/belief that in the blockchain era a lot of people are going to use those huge LTO tapes for mining, which is going to become something as important to us as smartphones are in 2023.


Actual summary:

> Shawn Rosemarin, VP R&D within the Customer Engineering unit at Pure, told B&F: “The ultimate trigger here is power. It’s just fundamentally coming down to the cost of electricity.” Not the declining cost of SSDs and Pure’s DFMs dropping below the cost of disks, although that plays a part.

> HDD vendors sing a different tune, of course. Back in 2021, HDD vendor Seagate said the SSD most certainly would not kill disk drives.

So the claim being made here is that SSDs will "soon" achieve a lower lifetime cost than HDDs per unit of storage, due to high electricity prices.
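
A minimal sketch of what such a lifetime-cost comparison looks like; every number below (prices, wattages, electricity cost, service life) is an assumed illustrative figure, not one from Pure or the article:

    # Lifetime cost per TB = purchase price + electricity over the service life.
    # All inputs are assumed illustrative figures.
    YEARS = 5
    KWH_PRICE = 0.30                    # $/kWh (assumed high-price scenario)
    HOURS = YEARS * 365 * 24

    def lifetime_cost_per_tb(price_per_tb, watts_per_tb):
        electricity = watts_per_tb / 1000 * HOURS * KWH_PRICE
        return price_per_tb + electricity

    hdd = lifetime_cost_per_tb(price_per_tb=15, watts_per_tb=0.5)    # assumed
    ssd = lifetime_cost_per_tb(price_per_tb=40, watts_per_tb=0.25)   # assumed
    print(f"HDD: ${hdd:.2f}/TB over {YEARS} years; SSD: ${ssd:.2f}/TB")

With these particular assumptions the HDD still comes out ahead; the argument hinges entirely on the inputs (electricity price, service life, how far SSD $/TB falls), which is exactly what the HDD vendors dispute.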


Except that for infrequently accessed archival storage, you can power off the drives. (Which is, presumably, what AWS Glacier is.)


They might be right that spinning rust isn't going to be used for online storage due to the energy cost, but that doesn't mean it can't still be useful for offline storage. I'm not buying this.


SSDs are useless for offline storage: if they aren't powered on and accessed often enough, the data just disappears. They're totally unsuitable for archiving.


On a tangent: is there any data about NAND data retention at low temperatures, such as putting a microSD card in a freezer at -15°C?


Whaaa? How much time? Does this also apply to SD cards and such?


Here's a link I found with a quick google search: https://www.minitool.com/partition-disk/what-happens-to-data...

According to this, the JEDEC Solid State Technology Association's standard is that SSDs should be able to retain data for one year at 30°C. Not great for an archival medium.

Here's a Quora discussion about it: https://www.quora.com/How-long-can-SSD-store-data-without-po...

Vendors will claim much better numbers than this, but the standard isn't very good.


It is also not at all clear what is needed to "top up" an offline SSD to ensure the data is maintained. Some people say that merely connecting the drive to power for a few minutes to hours lets it refresh the fading bits, but others think nothing short of a full rewrite of the data will do. I haven't seen any official information or testing on this either, so the conservative full-rewrite assumption seems the most reasonable.


What's big in offline storage these days? I did some work for UPS in the 2000s, and they used giant tape reels. Is that still the norm?


Well, at that point, why not LTO? Much better price per TB, despite the comparatively expensive initial buy-in price for the drive/library.


If you're not an enterprise user, LTO makes no financial sense whatsoever. In addition, it just isn't very convenient; it's not a random-access medium. With an HDD, you can just plug it in, use rsync to update only the things that have changed, unplug it, and set it on a shelf. The use case is entirely different.


When HDDs fade away, what can we use for reliable storage? The endurance of SSDs has dropped from 100K cycles for SLC to 1K cycles for QLC. That makes an SSD essentially a consumable. Some say you should replace them every couple of years. Meanwhile, I've got HDDs that are 10 years old and still going strong. Is there anything out there, or even on the horizon, that would be as reliable long-term as an HDD?
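
For a sense of what those cycle counts mean in practice, here's a rough endurance calculation; the write-amplification factor is an assumed illustrative value:

    # Total host writes a drive can absorb, roughly:
    # capacity * P/E cycles / write amplification.
    # Write amplification of 2 is an assumed illustrative value.
    def total_host_writes_tb(capacity_tb, pe_cycles, write_amp=2):
        return capacity_tb * pe_cycles / write_amp

    for nand, cycles in (("SLC", 100_000), ("QLC", 1_000)):
        print(f"2 TB {nand}: ~{total_host_writes_tb(2, cycles):,.0f} TB of host writes")

Real drives' TBW ratings vary a lot with over-provisioning and workload, but the hundred-fold drop in cycles translates fairly directly into a hundred-fold drop in total write endurance.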


I don't think so. Eventually, maybe, but €220 for 14TB is not in the realm of SSDs yet, and it won't be by 2028.


Most of my computers use SSD or NVMe drives, except for my 16TB file server, which is a mixture of 12 hard drives and 6 SSDs. I would upgrade it to a system of 4/8 NVMe drives, but I'm still on the fence as to whether they could survive 5 years before needing replacement again.


What's preventing datacenters from using free solar energy? Once you produce your own clean power, the electricity price becomes irrelevant.


Has anyone done the maths on the square footage of the average data centre roof against the average power consumption of a modern-day data centre? It's either going to be disappointing relative to the cost of the panels, or very cool. I tend to be very conservative about these things, but in the right climate it probably starts to make sense for some data centres, depending on what you're running in them. If it's mining crypto or running Xbox cloud gaming then lol, probably not.

Cooling is probably the biggest pain; the sunnier the climate, the more demanding cooling is going to be.


Solar irradiance is 1361 W/m^2 before accounting for losses due to the atmosphere, the sun not being directly overhead, solar panels not converting all wavelengths to electricity, etc. But none of those inefficiencies matter, because 1.3kW in a square meter means we have an upper bound on solar power generation for a footprint roughly equal to that of a server rack, and that power is roughly what one server draws (for compute; storage servers can have lower power density). Since servers are in fact operated in racks, we immediately know that solar power on the roof of a datacenter will fall short by about two orders of magnitude, at a minimum.
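
A back-of-envelope version of that argument; panel efficiency, rack footprint, and rack power below are assumed round numbers:

    # Rooftop solar vs. rack power, using assumed round numbers.
    SOLAR_IRRADIANCE_W_M2 = 1361   # top-of-atmosphere figure from the comment
    PANEL_EFFICIENCY = 0.20        # assumed typical panel efficiency
    RACK_FOOTPRINT_M2 = 1.0        # assumed roof area per rack footprint
    RACK_POWER_W = 10_000          # assumed draw of a loaded rack

    solar_w = SOLAR_IRRADIANCE_W_M2 * PANEL_EFFICIENCY * RACK_FOOTPRINT_M2
    print(f"Rooftop solar per rack footprint: ~{solar_w:.0f} W")
    print(f"Rack draw vs. rooftop supply: ~{RACK_POWER_W / solar_w:.0f}x short")

That's already a shortfall of roughly 40x before accounting for night, weather, and the sun rarely being directly overhead, which is how you end up at the two-orders-of-magnitude gap mentioned above.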


Solar does make a lot of sense for data centres. But they also run continuously, so you need another power supply as well. Running servers only when you get decent solar output is not exactly a very sellable idea.

So at other times you still need to buy power.


You can heat bricks and use the heat at night.


And just like that, I now know "Pure Storage" is a company that exists!



