Micron's 232-layer NAND enables 2TB flash chips that deliver data 50% faster (ieee.org)
221 points by samizdis on July 26, 2022 | 125 comments



I see phrases like this: "doubles the density of bits stored per unit area versus competing chips, packing in 14.6 gigabits per square millimeter. Its 1-terabit chips are bundled into 2-terabyte packages, each of which is barely more than a centimeter on a side", and I just imagine sending that sentence back to my 12-year-old self, sitting in front of my Commodore VIC-20, and how much like science fiction it would have sounded.

I mean, it sounds like science fiction to me now. Amazing.


And affordable for common people.

I remember tiny “spy cameras” being a movie trope, now everyone has one in their pocket.


Yeah, human technological progress has come a long way!

The human mind acclimates to improvements remarkably fast, and the joy found in such novelties tends to be fleeting and quickly fade as it becomes just another normal part of everyday life we take for granted (e.g. fancy new tech, or really anything new and nice). I like to frequently reflect on how amazing it is that I am alive and even exist at all!

Gratitude is truly the only effective defense I've found to slow the fading of the true joys in life, like my friends and family.


> The human mind acclimates to improvements remarkably fast, and the joy found in such novelties tends to be fleeting and quickly fade as it becomes just another normal part of everyday life

The good (?) news is that this also works in reverse. So if we ever suffer a serious downgrade in living standards it’s not a serious issue. We’ll get used to it in no time :)


Does it? I get painfully nostalgic even about stuff that was objectively worse, don't even want to try to imagine how it must be with things that were better.


Well, we just had a global experiment wherein people all over the world suddenly had to adjust their lives in a major way, and I think we came through alright.


I think you forgot a /s. A lot of people's mental health deteriorated quite a bit over the past few years.


I get nostalgic too, but it's about my once young age that is forever in the past.


>The human mind acclimates to improvements remarkably fast

I bought a Fitorch P25 last November: an LED flashlight that runs on a single li-ion cell and costs about $70. Fits in the palm of your hand. It has a light output of three thousand lumens! (A T8 fluorescent tube, which is four feet long and runs on mains power, does about 2600 lumens.)

It's got a couple of brightness modes, which it makes you step through in series, for safety. At full grunt, if you hold your hand in the beam you have to squint against the glare and you can feel heat on your hand, all from visible-spectrum photons, no infrared! It's bright! Of the many warnings silkscreened onto the tube, one ought to be "do not look into flashlight with remaining eye".

I keep it on my desk, to keep me humble. I remember incandescent flashlights, and I remember alkaline batteries. Will my children remember them? Of course not.


What is truly amazing is how all our devices have been hampered by copyright laws and competing interests. On my iPhone I can't load a YouTube song, then switch over to Snapchat and start a video call while listening to that song. That is just one example; there are many, many more. But we have been trained to think <40% is normal. This is why for years I followed the jailbreak scene: there were so many cool features one could get by jailbreaking that just weren't offered to normal users.


I have a different stance: I go frugal and stay away from tech, because at some point I become confused by the quantities involved and the lack of joy.


> The human mind acclimates to improvements remarkably fast, and the joy found in such novelties tends to be fleeting and quickly fade as it becomes just another normal part of everyday life we take for granted

This is why I am not afraid automation is going to leave us without jobs. We're going to get accustomed to better, and still need to work. AI's exponential increase can't keep up with human entitlement.


> still need to work

Us needing to work does not mean that the jobs will still be there for us to work at.

The reason so much money goes into automation is simply that it cuts the payroll. The argument that there have always been jobs in the past, so there will always be jobs in the future, simply does not follow.


There will always be jobs, their nature will just be different. Even if you make the machines, someone has to design them, program them and set them up. And even if we automate the automating, we will still be able to create art and as long as it brings any value to others, it will be worth paying for, making it good old work.


Companies want automation for many more reasons besides cost cutting:

- latency - respond sooner to a request

- speed - work faster

- volume - produce more

- scalability - just deploy more models

- vigilance - the bot doesn't get tired and start making mistakes

- consistency - the bot does things one way, many people do things in many ways

- fewer human problems - no cutting corners, slacking off, office politics, etc

- lower environmental footprint

- increased work safety

- improved tracking - collect more data to analyse in order to improve your processes


I envy you. All I read there was "write endurance and data retention halved again"


I have no idea why you'd read that when the cells are the same size as before.


That outcome would be expected from either decreasing cell size, or increasing bits/cell (what used to be commonly called SLC/MLC/TLC/QLC).

I don't see TFA mentioning either directly, but there are some hints that one of the two (or both) did happen: the number of layers increased from 176 to 232 (that explains a 32% increase), but bits per chip area have doubled (total density increased by 100%). Where is the rest of this increase coming from, you reckon?
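
To put rough numbers on that gap (a quick sketch; the only inputs are the figures quoted in the article and this thread):

    # Where could the density doubling come from?
    layers_old, layers_new = 176, 232
    layer_gain = layers_new / layers_old    # ~1.32x from extra layers alone
    total_gain = 2.0                        # claimed bits-per-area doubling
    residual = total_gain / layer_gain      # ~1.52x still unaccounted for
    print(f"{layer_gain:.2f}x from layers, {residual:.2f}x from elsewhere")
    # That ~1.5x would have to come from lateral scaling, more bits per
    # cell, or layout changes such as moving periphery logic under the array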


From a much better Anandtech article: 32% increase in layers, 40% increase in die area.

https://www.anandtech.com/show/17509/microns-232-layer-nand-...

A bunch of the cost is test and packaging, which is constant, so there will be a small cost-per-bit reduction. But the new chip will cost significantly more to manufacture than a 512Gbit device. No free lunch.


I don't see these as problems as long as software is smart enough.

It merely requires the device to be always powered up (read: it has a backup battery that can do periodic flash rewrites for 5 years), and it requires the controller to keep track of the current endurance of every page. A 'worn out' page can still hold data; it just holds either less data, or holds that data for fewer hours before needing to be rewritten.

Long gone are the days of "you can rewrite this data 100,000 times, and then leave it 10 years and it'll still be there". The new normal is "rewrite this data every 20 days to begin with, and after doing 1000 writes you'll need to rewrite the data every hour, or double the ECC bits and then you'll need to rewrite every day".
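
As a toy model of that policy (a sketch; all thresholds and intervals are this comment's hypotheticals, not real NAND specs):

    # Refresh interval shrinks as a page accumulates P/E cycles,
    # unless extra ECC is spent to stretch it back out.
    def refresh_interval_hours(pe_cycles, extra_ecc=False):
        if pe_cycles < 1000:
            return 20 * 24               # fresh page: rewrite every ~20 days
        return 24 if extra_ecc else 1    # worn page: daily with 2x ECC, else hourly

    print(refresh_interval_hours(10))          # 480 hours
    print(refresh_interval_hours(1500, True))  # 24 hours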


It's amazing to me that you don't consider any of these major drawbacks to be problems. The Samsung 840 Evo was a disaster, and here we are celebrating more of the same.

While I completely agree endurance can be (and is) managed with software, rewriting data every 20 days is a major, major drawback. To me, it is a regression.


We can live with DRAM, OK. But I cannot have my computer turned off for 3 weeks? Or I lose data?


Well, the SSD should have a built-in battery capable of doing what's necessary for years.

Just like your computer has a CMOS battery that's capable of running the clock for years. It's no different. In fact, many computers will refuse to boot if the clock loses time because all the digital certificates on drivers aren't valid yet.

Computer requires battery backup to remain functional. SSD requires backup battery to remain functional. I don't see the difference.


That's an extremely load-bearing "should"; are any current drives shipping like this?


The purpose of these drives is not for archive. In fact, individual SSDs in general should not be used for archiving. They are for performance. Since we're probably talking about laptops or phones typically, there's also the loss of the device or catastrophic failure (fire?) to worry about so there needs to be a disaster recovery plan regardless.


No need for any "archiving". I live in a country where you get drafted for a mandatory army service once you reach 18 years of age. The user leaves his desktop (or laptop) at home, goes to the army, returns back in a year, and — whoops — there's no data left on the drive. You know the user should have kept backups, and I know that, but your average Joe is going to be pretty unpleasantly surprised.


That's why the drive should have a battery backup built in, lasting many years...

Years of battery life isn't tricky, because the drive can, upon being unpowered for a week, rewrite data to be more durable at the expense of access time. 'Recovery blocks' can be created which allow the recovery of any unreadable block on the drive. The drive can then wake up and rewrite data only as often as needed, based on temperature and the error rate found on the last rewrite. For example, on a 1TB SSD, you might only rewrite 5GB/day, taking just 10 seconds at 1 watt. On a 5 Wh battery, that's 5 years. And that's a worst case: keep it somewhere cool, and the rewrite rate might be halved, doubling battery life. Have the drive half full, and the battery life is doubled again (due to half as much to rewrite), and multiplied by 8 (due to the ability to use the spare space for ECC data). So a half-full drive stored in a cool place might be able to last 160 years. Obviously at that point, battery self-discharge and the power to run timers will dominate.
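
A quick check of that arithmetic (a sketch; every figure is the hypothetical one from the paragraph above, not a real drive spec):

    # 1TB drive, 5GB/day rewritten in a 10-second burst at 1 watt, 5Wh battery
    burst_seconds, power_w = 10, 1
    joules_per_day = burst_seconds * power_w      # 10 J/day
    battery_joules = 5 * 3600                     # 5 Wh = 18,000 J
    print(battery_joules / joules_per_day / 365)  # ~4.9 years
    # cool storage (x2), half-full drive (x2 less data, x8 from using the
    # spare half for ECC): 5 * 2 * 2 * 8 = ~160 years, as claimed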


>or double the ECC bits

Are you aware of even a single SSD product allowing the user to reconfigure capacity, outside of obtaining secret/proprietary, non-public manufacturer service software? The only technical way for a drive to automagically shrink its size on its own is by starting to mark TRIM-freed sectors as BAD and hoping the OS running on top will be able to recognize this and transfer the BAD status to the filesystem; that's a lot of assumptions. No such drive exists either: drives ship with spare capacity, and the existence of BAD sectors is treated as a catastrophic event signaling that the drive has run out of backup NAND.


You can use hdparm(8)'s --dco-setmax option to reduce the visible size of most SATA drives. This is not particularly useful for single drives — leaving unpartitioned space works just as well, assuming an OS that never writes to it — but it does stop TRIM-ignorant RAID controllers from potentially kneecapping SSD GC by "initializing" intentionally unused space.

Though I imagine most RAID controllers initialize with zeroes, and I hope SSD firmware doesn't reserve flash to store large extents of zeroed sectors, so this shouldn't be terribly useful for underprovisioning purposes. But without extensive testing or access to SSD firmware algorithms, it's hard to say.


This is one layer too high: SSD firmware has no way of knowing your intentions purely from filesystem/sector activity.


Drives internally do this with their reserved sectors. They also reallocate sectors between 'fast access' and 'long term storage', where the long term storage uses larger logical pages and hierarchical ECC, allowing a lower percentage of ECC overhead at the cost of performance.

At some point, I wouldn't be surprised if an SSD manufacturer releases a drive with a driver which can reduce the user-visible disk size as it ages. It would probably involve using a 'balloon file', in a similar way to Memory ballooning [1]

[1]: https://en.wikipedia.org/wiki/Memory_ballooning



On SAS disks you can reduce the capacity. I don't know if this has any influence on spare sectors and write cycles though.


You can reserve space by setting a host protected area (HPA). There are even people theorizing this might add provisioning area to an SSD: https://www.thomas-krenn.com/en/wiki/SSD_Over-provisioning_u... . But so far there is zero evidence for it. The HPA is not a magic area nobody touches; it's an area reserved for special superuser use, and no drive will simply start reusing it for its own internal processes.


You don't even have to go that far. The minimum hardware requirements for Microsoft Windows 95 were:

- A personal computer with a 386DX 20MHz or higher processor

- 4MB of memory (8MB recommended)

- At least 70MB of available hard disk space for installation

A hard drive at the time had 250-500MB of space, not even 1GB (that was a luxury). That's not even 1/4000 of this small chip, so you would have to stack 4000 of those Windows 95 PC hard drives to match the capacity of this chip. Only supercomputers/clusters of the time had that capacity.


Honestly, what I think is more amazing is how far we didn’t come given the enormous increase in processing power.

Windows 95 was pretty good for its time, but windows 11 isn't essentially different.


> windows 11 isn’t essentially different

That's only if you ignore all of the differences! 8)

There are many big and small differences that people just gloss over, because they feel it doesn't apply to them, or they don't even realise it's there.

Windows 95 used to crash if you looked at it wrong. It was horrendously vulnerable to malware, even across the network. Its "maximums" were hilariously low due to many remaining 16-bit internals. Programming for it was a PITA, even in 32-bit mode. Its network capabilities were woeful. Multilingual? Yes, but only in some combinations at a time. No hope of mixing, say, Chinese and Hebrew in a single text box. Poor text input in general, poor accessibility, etc, etc, etc...


Yeah, but would you call all those framework improvements fundamentally different?

It’s more stable, secure, and drivers are plentiful (and included!), but those feel ancillary. Most people don’t need multiple languages in the same text box.

Obviously it's fundamentally different in a Ship of Theseus way, in that pretty much all of what was Win 95 has been replaced.

But the way people use their PC is the same.


That's more a people problem than a technology problem. Many user interface paradigms have been tried over the decades. We've largely settled on what we have because it is what people were able to adapt to. Similar to how driving a car isn't that different from 40 years ago, or even 80 years ago. There's minor iterative changes with improved reliability and accessibility, but not fundamental changes.


Except for:

- Kernel was modularized in the MinWin project

- A C++ subset is now used in kernel as well (see WIL library), since Vista

- COM, and now WinRT, took over the role of Longhorn ideas, now done with C++ instead of .NET

- Virtualization is now used to protect key kernel areas (see kernel data protection)

- Focus on user space drivers

- The GUI stack has been rewritten multiple times

One can program Windows 11 like Windows 95, but there are plenty of APIs you'd be missing; it just happens to work thanks to Microsoft's way of dealing with backwards compatibility.


Windows from 7 onwards is worse, even by 1995 standards.


A much better article on Anandtech [1]: dual stack, 116 layers. And as noted inside the article 100+ layer per stack isn't exactly new.

Personally I am interested in the cost-per-bit reduction. Hopefully there will be news at the Flash Memory Summit next week. I wouldn't mind having a slow, 100-write-cycle NAND that costs a third of today's price. Think a 2TB USB stick for $49.

[1] https://www.anandtech.com/show/17509/microns-232-layer-nand-...


That would be a death knell for hard drives if they could actually get it that cheap (~$25/TB). In addition to the obvious benefits for consumers, more and more in the enterprise space (which is the dominant buyer of HDDs by unit volume), HDDs are being relegated to write-few, read-lots if not outright WORM workloads, and focused on sequential speeds. You can knock an order of magnitude of speed off modern NAND flash and still compete with spinning disks on sequential speed, and have a few more orders of magnitude to shave off of random I/O. It wouldn't kill HDD's overnight, but it would definitely give them a terminal diagnosis.

I would love that. I doubt that's what they're doing, but I would love that.


> wouldn't kill HDD's overnight

What's killing HDDs for me (looking at desktop use) is noise. It seems to have gotten weirder lately, with wear-levelling kicking in every five seconds, etc.

Today I just "rebuilt" a brand-new HP box by adding RAM and swapping the original 512 GB M.2 SSD for a 2 TB one (Crucial P5 Plus, nice specs). I'm thinking of adding a 6 or 8 TB HDD as a data landfill, but almost every customer review mentions aggravating disk noise.

So I'm holding off on the landfill for a while, peering at SATA SSD prices...


When I built a quiet computer, I was surprised that even at max load it was the HDD that was by far the loudest component.


That depends on how much those supposed 50TB hard drives cost in four years.

Though even if hard drive prices stay completely stagnant at $14/TB during sales, I'd still get a hard drive for videos and backups rather than pay $25.


> I'd still get a hard drive for videos and backups rather than pay $25.

So would everyone whose only motivation for buying storage is $/TB and nothing else. The vast majority of those people think 8 or even 80TB is a lot and are perfectly happy with subpar drives packaged in external USB enclosures. And there's nothing wrong with that, but that demographic of consumers is not the driving economic force behind what gets developed, invested in, and brought to market.

HDDs' ~$15/TB bargain values won't be sustainable if enterprise drops HDDs for WORM/WFRM workloads because their TCO is too high (and many are dropping them). Disk shelves chew through kilowatts like candy while capacity-focused flash hosts seem comparatively allergic to electricity, and repairing/rebuilding arrays on spinning rust compared to flash arrays is alone impetus enough for a lot of shops to dump rust and embrace solid state.

> That depends on how much those supposed 50TB hard drives cost in four years.

For sure. I'm highly skeptical they can pull that off in that time frame, though; the pace of innovation in the HDD space is woefully lacking. They need a win like that, with competitive pricing, or the industry's days are numbered. If they can pull it off, though, that's great. I'm all for cheaper/better storage options, even if it's mechanical (or optical or whatever).


Not really sure about that. Isn't the write endurance of hard drives still much higher than that of flash?


Not really.

If I go look at some 20TB hard drives, I see them promising 1.5 to 2.5PB of endurance.

On the other hand there are 4TB SSDs promising 5PB of endurance.

In full drive writes that's 75-125 vs. 1250.
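
(That conversion, as a one-line sanity check using only the figures above:)

    # rated endurance in bytes / capacity in bytes = full drive writes
    print(1.5e15 / 20e12, 2.5e15 / 20e12)    # 20TB HDD: 75 to 125 writes
    print(5e15 / 4e12)                       # 4TB SSD: 1250 writes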

Even if you ignore the hard drive warranty, I'd say the maximum reasonable workload is a constant write at 50% of the minimum transfer speed. At that speed you might get over 1000 writes, depending on drive size. It's a struggle to even beat the worst case of a mainstream SSD chosen for endurance.

A QLC SSD will only withstand a couple hundred writes, but even that's moderately competitive.


Samsung rates their QVO (QLC flash) at about 0.36 DWPD for 5 years, which works out to over 600 full drive writes before failure. Even if we halve that, that's still easily 3x high end HDD's, with superior drive health monitoring and far superior recovery/rebuild performance. TLC flash is 2-3 times higher than that, and exact endurance depends on the specific drive in question, as there's more to endurance than single/double/triple/quadruple bit layering.


It seems to me that initially people were worried about SSD endurance, but now nobody seems to care anymore, because endurance hasn't turned out to be a significant issue. In fact, I suspect developers care a lot less about I/O patterns now because 1) the performance impact of bad I/O patterns is still high, but the performance floor is way higher on SSDs, so it basically "doesn't matter" for most, and 2) for end-user applications, users don't hear your I/O any more, so how are they going to notice?

For example, I have this C:\ SSD on a Windows 10 machine that has accrued more than 100 TBW (>400 drive writes) in 2-3 years [1]; that's pretty much only possible if the system is writing to it all the time (and it is). That's not something you would have done in the spinning rust days, simply because the constant crunchy hard drive noises would have annoyed people.

[1] And whaddaya know, that SSD still says it's got 70% of its life left in it, so it's probably good for some five years of this, at which point the average PC will go into the dumpster anyway. Success achieved.


> now nobody seems to care anymore,

I presume you're talking about layman consumers here.

> because endurance hasn't turned out to be a significant issue

It used to be a big issue. Keep in mind SSD capacities were pretty small back in the day, with 80-120GB being pretty typical. Combined with less mature flash technology overall, that meant per-cell endurance was not great, and drives lacked the controllers and capacities to do proper wear leveling, so the massive speed boost meant you could burn through even "good" SSDs of the day pretty quickly, especially in the days before TRIM and whatnot, when the OS was doing unnecessary, excessive writes to disk because it just didn't know how to handle flash. At first, the tech kinda brute-forced its way through this problem with increased capacities (a 100GB SSD, for example, will often actually have more than 100GB of raw storage to account for cell degradation, though to what extent depends on compounding factors like the flash generation, the controller, whether there's DRAM, and so on), and then through improved overall endurance, controllers, cache, etc., in combination with improved software and OS policies that reduce unnecessary wear on drives. So it hasn't really been an issue for a while, but it absolutely was one back in the day, even for the layman with a healthy budget.

> That's not something you would have done in the spinning rust days,

Yes, people did. They didn't really have a choice (unless they were having some fun with a RAMdisk or something), plus HDD I/O was so much slower that it took orders of magnitude longer. Even when you weren't actively doing something drive-intensive, the OS was consistently doing work in the background (remember defragging?).

> simply because the constant crunchy hard drive noises would have annoyed people

It's wild to me to see so many people share this sentiment. Then again, I come from a background of having 8x 10k RPM 300GB VelociRaptors in my personal workstation back in the day. Those were... loud. lol.


> endurance hasn't turned out to be a significant issue

To be fair, I think it's a bit like Y2K. It's not an issue because steps have been taken to mitigate it.


I don't know. Simple error correction and wear leveling were required to get off the ground, and that by itself was enough to give early drives sufficient endurance.

The real mitigation work was done in the service of storing more bits per cell, with endurance allowed to slip.

So sure, effort has been put in, but the alternative wasn't disaster; it was using SLC more.


>0.36 DWPD for 5 years

So a mechanical drive running for ~40K hours with ~5% spent writing.

>easily 3x high end HDD's

how? high end HDD offer 2M MTBF, and with something like HGST 7K6000 you end up with pallet loads being retired at 50K hours run time 100% defect free.

>superior drive health monitoring

how? SSDs usually just die without any warning

>and far superior recovery/rebuild performance

You mean recovery/rebuild at the storage box level, because your data on a dead SSD is unrecoverably gone forever.


> how? high end HDD offer 2M MTBF, and with something like HGST 7K6000 you end up with pallet loads being retired at 50K hours run time 100% defect free.

The warranty says you only get 75 or 125 full drive writes on the two higher end 20TB drives I looked at.

I can't tell you why it's that low, but that's what it says.

> how? SSDs usually just die without any warning

Sometimes SSDs just die, sometimes they go read-only. Sometimes HDDs just die too. Do you have any numbers?


Hmm. Thanks for sharing. Definitely food for thought.


No. Premium flash (write-endurance-optimized flash) is far, far superior to HDDs for endurance and reliability. And modern flash controllers are far better at providing granular, detailed health statistics for cells/devices and letting you recover data from failing devices, as well as far faster at rebuilding or recovering data for array rebuilds/resilvers.

In the context of flash replacing HDDs: HDDs have already been relegated to write-few, read-many (WFRM), or even write-once, read-many (WORM) usecases, and flash has largely replaced spinners where write endurance is an important feature. This is true for consumers and enterprise alike. As such, low write endurance isn't necessarily an inherent flaw in a flash drive. You have to be mindful that you can't just drop it into your databases as a hot spare (not if you expect good results, anyway), but similar special-use considerations are made for SMR spinners, so that's not exactly an atypical restriction, and it's a trade-off that still heavily favors flash.

Technically you can probably do more writes to an HDD than you could to a 100 P/E cycle endurance flash drive (that's 1/10th the endurance of a typical QLC NAND cell), but the HDD will be orders of magnitude slower and cost you several times as much in electricity just to idle, not to mention under load (again, we're talking capacious flash, not NVMe speed demons that suck down power). Given the usecase in question is specifically scenarios where writes are already severely limited, the reduced write endurance of such NAND cells isn't really a drawback, especially given how much better flash is at providing granular drive and cell health data to monitor device and array health, and how much faster/easier it is to do rebuilds/recovery with flash than with spinning rust.

EDIT: To quantify how good flash is these days: nearly all consumer flash and most non-high-endurance enterprise flash drives typically have 0.5-1 DWPD of endurance. Entry-level QLC flash (e.g. the Samsung 870 QVO) can be as low as 0.3 DWPD [0], but that's still pretty good compared to HDDs. DWPD means that if you have a 1TB drive, you can write 500GB to 1TB to that drive every day for 5 years before the drive fails. This is a warranty-level rating, not a guaranteed minimum lifespan, so YMMV, but that's still very good. [1]

Meanwhile, HDDs are too slow to even do 1 DWPD at 20TB-and-up capacities, period (keep in mind throughput for spinning disk degrades as data is allocated closer to the inner ring of the platter, so those >200MBps sequential figures are unrealistic for full-disk writes unless you plan to short-stroke your drive, which massively reduces capacity). And if it's SMR? lol. Even if you do short-stroke them to give them the best possible chance, you're looking at about 15TB maximum written per day, theoretically, for the best 3.5" spinners out there. Additionally, the main impetus for spinning-disk failure is mechanical, not degradation of the write medium as in flash, so while it's hard to do apples-to-apples testing, if your real-world write workload works out to less than half a disk per day, even basic bitch bargain-bin flash that Sabrent scraped off Samsung's fab floor will last you the better part of a decade in 24/7 use. Let's not forget that HDDs will continue to degrade at idle by virtue of being mechanical, whereas an SSD will suffer very limited degradation, if any, in the same idle case.
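
Two of those claims are easy to sanity-check (a sketch; the capacity, DWPD, and throughput figures are the rounded ones used in this comment, not vendor specs):

    # 1) DWPD -> total bytes written over a 5-year warranty period
    def tbw(capacity_tb, dwpd, years=5):
        return capacity_tb * dwpd * 365 * years

    print(tbw(1, 0.3))    # ~548 TB for a 1TB drive rated at 0.3 DWPD

    # 2) Can a 20TB HDD even sustain 1 DWPD?
    seq_mb_per_s = 200                 # optimistic sustained rate
    print(seq_mb_per_s * 86400 / 1e6)  # ~17.3 TB/day < 20 TB: no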

High End Consumer/Entry level enterprise flash is 1-3DWPD. [2]

High Endurance enterprise flash is 5-10DWPD. [3]

The Endurance champion however is 100DWPD [4]

TL;DR: So, setting money aside, HDDs really only beat SSDs (even cheap QLC ones) in small-capacity (<10TB) usecases. For a lot of consumers, that makes HDDs an easy pick, especially considering HDDs are still typically cheaper per terabyte. Otherwise, SSDs are better in pretty much every way except price, and SSDs are improving far faster than HDDs on that front as well.

[0] Samsung 870 QVO: https://semiconductor.samsung.com/consumer-storage/internal-...

[1] Micron/Crucial MX500: https://www.crucial.com/ssd/mx500/CT2000MX500SSD1

[2] Samsung 980 Pro: https://semiconductor.samsung.com/consumer-storage/internal-...

[3] HGST SS530: https://documents.westerndigital.com/content/dam/doc-library...

[4] Intel Optane P5800X: https://ark.intel.com/content/www/us/en/ark/products/201840/...


Yes. Flash can’t be stored unused for long periods without losing data.


That's what tape is for, and that's not the usecase in question.


Isn't tape better here?


Taking into account that they don't say anything about data retention and reliability, I wouldn't hold my breath.


What really interests me long-term is longevity. I am sure they will keep increasing speeds and decreasing prices, but what about this stuff wearing out? NAND that lasts as long as a well-cooled and unstressed CPU would be epic!


NAND wear is a function of write cycles, which itself is a function of both how much data you write and how much the controller's wear-levelling and garbage collection amplifies those writes.

If your benchmark is "lasting as long as well-cooled electronics", then flash already beats hard drives, since hard drives have bearings that will lose lubrication and fail unless you open the drive up in a clean room and replace them.


And NAND has an electrical charge that will dissipate, eventually leading to data loss. This time period is on the order of 90 days (I think I’ve seen some data sheets saying 30 days) up to one year.

https://www.jedec.org/sites/default/files/Alvin_Cox%20%5BCom...

https://www.ibm.com/support/pages/potential-ssd-data-loss-af...

The OCP NVMe Cloud SSD Specification states that drives must not have data loss when powered off for up to 30 days, and that when powered on, data must be rewritten to refresh anything that may have started to lose charge while powered off. Search for "retention" in:

https://www.opencompute.org/documents/nvme-cloud-ssd-specifi...


Your overall point is valid (there are lifetime constraints that mean you shouldn't rely on unpowered SSDs for long term archiving) but IMHO your post is overly pessimistic.

The spec is 1 year minimum for client and 3mo for server, but most drives exceed that.

This thread includes a response from a Micron rep:

https://community.spiceworks.com/topic/944779-enterprise-ssd...

"The very most important thing to realize when looking at these charts is that the data retention number stated are for SSDs which have reached the end of their useful life. They're worn out. If you have an enterprise-class SSD which is rated for 5 petabytes of writes over its lifetime (just to pick a typical number) , then you will not see this 3 month data retention number until you've actually written 5 petabytes to the drive. When NAND Flash is brand new out of the box, the data retention is actually several years (although we typically don't rate data retention when new).

Honestly, Flash-based storage probably isn't optimal for unpowered archiving. That three-month, EOL spec reflects this. In enterprise computing, when the NAND lifetime is used up, the intent should be to move any important data to a new device (or wipe the old device for security reasons).

In client computing (notebooks and desktops), we rarely see the typical user getting to the point where NAND wearout is a concern, so we rarely see data retention as an issue, either.

So, you needn't be concerned about data retention until your SSD gets really, really old."


That makes sense. The numbers from standards and spec sheets do look quite pessimistic compared to the retention I’ve experienced with cheap USB drives and old SD cards that are rarely powered.


What manufacturing changes do you imagine would make that possible? Making cells smaller gets unusable pretty quickly, and adding more bits to TLC/QLC is a mild price improvement at best.

Prices will go down over time as manufacturing improves, but if you're waiting for that then you don't need to sacrifice speed and durability.


>Prices will go down over time as manufacturing improves

That is not always true, though. Look at DRAM: there is a reason we are still stuck with 8GB as the entry level, while 99.99999% of the tech community, including HN, has been saying for more than a decade that DRAM is cheap and only going to get cheaper. The cost to produce DRAM has only dropped 10-20% over the past decade, while pricing fluctuates a lot, as with any commodity. Microsoft knew this way back in 2006 (or even earlier) when they were designing the Xbox.

>What manufacturing changes do you imagine would make that possible?

AFAIK, not any time soon. You could probably read up on all the issues on thememoryguy [1] or SemiEngineering [2]. But even PLC wouldn't bring that much cost saving, if any. There is nothing on the roadmap for the next 5 years, not even in R&D labs, which suggests it may not even happen by 2030. That is why I am hoping for some news, some progress like [3].

[1] https://thememoryguy.com/how-3d-nand-makes-qlc-and-plc-feasi...

[2] https://semiengineering.com

[3] https://ieeexplore.ieee.org/document/9830513


DRAM markets have been notoriously price fixed in the past[1], and it would not be surprising if there's still a lot of winking, nudging, and backdoor handshakes going on. The chip shortage makes it even more excusable now.

[1] https://en.wikipedia.org/wiki/DRAM_price_fixing_scandal


The main reason the price of DRAM is not plunging is because DRAM faces stark physics problems. I don’t think it’s market failure.


Yeah, there's a minimum volume required to store an electric charge and memory has reached it.

Global demand is collapsing due to the Great NotARecession; whatever price we reach by Christmas will be the lowest planar DRAM ever gets. So buy up all the DDR4 you need. The ArF immersion tools that sit in DRAM fabs are now fully amortized, and memory lines are not likely to be earning much more than the marginal cost of the maintenance, power, and materials they consume; any lower in price and they get turned off.

I hear good things about vertically stacked thin film transistors as a future technology, but they haven't left the lab yet.


I bought 32GB for 100€. That is roughly 3€/GB, which is 24€ for 8GB, a ridiculously low price compared to what I would have paid 10 years ago.



So same price in 2012.

Dunno how HNers still keep falling for this.


How about Optane?


> And as noted inside the article 100+ layer per stack isn't exactly new.

Per deck, not stack. The current prevalent 176-layer (total) NAND also consists of two 88-layer decks, as the same article mentions (check the table), so 116 layers per deck is new!


I suspect it would be difficult to properly balance writing across blocks with only 100 cycles before wearout.


Depends. If you're trying to use it as a drop-in replacement for a scratch/boot/database drive, sure, that'd probably not be a good plan. But if you consider that HDDs are primarily deployed in WORM or WFRM (write few, read many) workloads these days, a 100 P/E cycle NAND flash with comparable $/TB value to spinning disk could definitely have a viable market.

It'd be kinda like flash's version of shingled HDDs. You've got to be a bit more careful with the usecase, and you can't just drop them in wherever and expect it all to work fine, but as long as you use them in WORM/WFRM roles, they'd be excellent for greater physical data density and far superior power efficiency.


For some kind of write once archival storage, it would probably still have a use. But with flash that fragile, I'd be worried about how sensitive to read disturb it would be.



At what point will the wear resistance be so low from increasing the layers (and bits) per cell that flash becomes effectively write-once, read many (WORM)?


There was a paper a while back that promised a healing process for flash:

https://www.zdnet.com/article/self-healing-flash-for-infinit...

I have no idea what happened to that though, because it sounded extremely promising.


Probably they're more interested in selling you new flash.


Keep in mind, these are layers of cells, not levels within a cell. Each of the layers is made up of individual multi-level cells.

That said, Samsung's quad-level cells only have a 1k write lifetime, so we're likely not far off.


Remember, even 1000 writes means roughly 3 years of one full drive write per day, with naïve wear leveling and the most brutal write load. For most write loads, drives made of this kind of NAND can easily be warranted for 3 years.


This doesn’t account for the write amplification factor, which is quite unlikely to be 1 with a naive wear leveling implementation and the most brutal write workload.


Note that there are data structures (e.g. B^epsilon trees) that can guarantee a write amplification factor of log_b(disk size) for very large b (over 1000). This means that if you are willing to sacrifice a small amount of sequential read speed, you can guarantee write amplification of no more than 4 (with a few MB of RAM).


I've added [1] to my reading list. Do you have suggestions for others that relate specifically to WAF?

1. http://supertech.csail.mit.edu/papers/BenderFaJa15.pdf


The TL;DR is that the design of B^e means that (other than the root node, which you can store in RAM) you never need to write in chunks smaller than b^e (smaller writes get batched and pushed down the tree only when you have that much data to write to a child node). With the typical e=1/2, you can set b = (SSD block size)^2. This means that any small write will turn into 1 write per level of the tree, and each of those writes will be a full block of data (so the maximal WAF is the depth of the tree). Also, for bigger writes, you can just write to the bottom directly, so this works out such that large sequential writes get a WAF of 1, and small/random writes get a WAF of d, the tree depth.
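
To make that concrete, a small sketch (the block size, drive size, and the b = blocksize^2 choice are all illustrative):

    import math

    block = 4096                # SSD write block, bytes
    b = block ** 2              # node size chosen so b**0.5 == block (e = 1/2)
    disk = 4e12                 # 4 TB drive
    depth = math.ceil(math.log(disk) / math.log(b))   # log_b(disk size)
    print(depth)                # worst-case (small/random write) WAF, here 2
    # large sequential writes bypass the upper levels, so their WAF -> 1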


1k writes sounds good enough for consumer hardware, I suppose, as long as there's some type of S.M.A.R.T. system to warn the user before it wears out.


Usually it goes into read-only mode when it wears out, so there's no data loss.


That's fantastic, actually! One problem might be that programs don't anticipate this behavior, like if it wears out while a filesystem is still syncing.


All modern filesystems are either journalled or always consistent on disk, so it shouldn't be a problem if the drive dies mid-write.

Worst case, you have to copy it to another disk and replay the journal.


Yeah, 1K is still in the range of reasonable (if the wear leveling isn't terrible), another order of magnitude decrease though and we're getting on the edge of not.


Yeah, they are moving to stacked cells because the endurance is so bad at smaller process nodes.

SSD prices have stalled for quite a while. By now they should be getting within shouting distance of hard drives, but unfortunately memory makers are much more of a colluding trust than hard drive manufacturers.

What I really want is a high-lifetime write medium to go with SSDs. Hard disks aren't really it. It's too bad the DVD/CD form factor and its drives died with high-speed internet; holographic storage would have been great. But there was no investment/demand to push it.


I love how someone said there's a septuple-level cell out there... but you have to immerse it in liquid nitrogen for it to work.


PLC NAND is already on the way to commercial hardware.


> write-once, read many

But not too many; read disturb is also a NAND problem.


Levels per cell and wear I understand. What's the link between circuit layer depth and wear?


Increasing layers should cause roughly zero problems.

As far as bits per cell, 4 already gives you very slow writes and not a lot of them. Let's say that maxes out at 5 or 6.


Thermals? SSDs have slowly but surely increased in TDP.


Flash likes to run pretty hot and we still rarely attach heat sinks, even less often proper heat sinks with fins. There's a lot of headroom.

And if your main goal is capacity you could decide to just not run it faster, which will avoid basically all the extra heat.


Indeed, flash write endurance is actually significantly higher at higher temperatures (I presume because of greater charge mobility) with modern flash. But data retention, naturally, gets shorter at high temperatures and quite a bit longer at low temperatures, and it can vary by orders of magnitude between temperature extremes. Again, I think, due to the charge mobility effect.


That's from controller chips.


I would actually love that. I need fast random access to data that never changes. SSDs are overkill and too expensive; HDDs are too slow.


Even the reads aren't truly nondestructive --- look up "read disturb".


This is TLC flash, don't worry.


Why is Micron constantly first to market with new high-density NAND, yet their market share doesn't reflect their progress? Unless I'm misinformed.


Note that Crucial also belongs to Micron, and some other vendors may use their controllers and/or chips too; there aren't that many companies that can actually make those high-end NAND chips end-to-end (Micron, Samsung, WD and Intel, IIRC).


Neither does the stock outperform. It's almost like investing in a commodity market (which doesn't seem that outlandish given the reach of storage now)


Is there a (known) theoretical limit on density/mm^3?


The Bekenstein bound [1]? There is a maximum number of bits that can be stored within a sphere of a given size and energy; beyond that, it collapses into a black hole.

Mind you, one or two other practical limits might come into play before a memory chip hits that one. :-)

[1] https://en.wikipedia.org/wiki/Bekenstein_bound
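
For scale, here is the bound evaluated for a chip-sized object (a sketch; the 5mm radius and 1 gram mass are made-up round numbers, plugged into the standard form I <= 2*pi*R*E / (hbar*c*ln 2)):

    import math

    hbar, c = 1.0546e-34, 2.998e8      # SI units
    R, m = 0.005, 0.001                # 5 mm radius, 1 gram
    E = m * c**2                       # rest-mass energy
    bits = 2 * math.pi * R * E / (hbar * c * math.log(2))
    print(f"{bits:.1e} bits")          # ~1.3e38 bits, so some headroom left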

---

Edit:

The maximum information capacity of a radio antenna is limited by the surface area enclosing the antenna. One could theorise that, in addition to a maximum density of information storage, there will also be a limit on the error-free rate at which information can be sent to/from a storage system, probably related to the surface area enclosing the system.


Well, under normal conditions (no degenerate matter), the material with the lowest molar volume should give a good idea. At least on Earth, this happens to be diamond: ~1.7 * 10^20 atoms / mm^3.

With one bit per atom: 20 million TB.
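
Checking that figure (a sketch; assumes diamond at 12 g/mol and 3.51 g/cm^3, i.e. ~3.42 cm^3/mol):

    avogadro = 6.022e23
    molar_volume_mm3 = 3.42e3              # 3.42 cm^3/mol expressed in mm^3
    atoms_per_mm3 = avogadro / molar_volume_mm3
    print(atoms_per_mm3)                   # ~1.76e20 atoms/mm^3, as claimed
    print(atoms_per_mm3 / 8 / 1e12)        # ~2.2e7 TB at one bit per atom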


I think we would be in trouble way before that. Quantum effects would mean you would change a decent percentage of adjacent bits while setting one. We’re probably struggling with such effects already.


Maybe now I can get a 2TB SSD for my Steam Deck.


You likely can very soon. Linus Tech Tips just fitted a Steam Deck with an engineering sample of a Micron M.2 2230 2TB SSD: https://www.youtube.com/watch?v=JXhT13oZchA&t=122s


Any word on DWPD or other endurance facts?


It's still TLC, so it should be pretty decent.


F*k density. I'm for reliability!


Amazing. Here we have a tier 2/3 company actually innovating, yet the engineers get paid peanuts.


How is Micron a "tier 2" company? They are probably the oldest surviving DRAM company and have always been in the top 5.

No idea about their pay, but I met some of the engineers who work there and they are extremely dedicated. I don't think they shop around for other opportunities to maximize their pay, like seems common in the Bay Area.


Branding.


Where the median home price is probably $500K


How is it cooled?



