I always liked the embedded system model where you get flash hardware that has two operations -- erase block and write block. GC, RAID, error correction, etc. are then handled at the application level. It was never clear to me that the current tradeoff with consumer-grade SSDs was right. On the one hand, things like error correction, redundancy, and garbage collection don't require attention from the CPU (and, more importantly, don't tie up any bus). On the other hand, the user has no control over what the software on the SSD's chip does. Clearly vendors and users are at odds with each other here; vendors want the best benchmarks (so you can sort by speed descending and pick the first one), but users want their files to exist after their power goes out.
It would be nice if we could just buy dumb flash and let the application do whatever it wants (I guess that application would be your filesystem, but it could also be direct access for specialized use cases like databases). If you want maximum speed, adjust your settings for that. If you want maximum write durability, adjust your settings for that. People are always looking for a one-size-fits-all setting, but that's hard here. Some users are cloud providers who already have software to store a block on three different continents. Others are embedded systems with a fixed disk image that changes once a year, plus some temporary storage for logs. There probably isn't a single setting that gets optimal use out of the flash memory for both use cases. The cloud provider doesn't care if a block, flash chip, drive, server rack, availability zone, or continent goes away. The embedded system may be happy to lose logs in exchange for having enough writes left to install the next security update.
It's all a mess, but the constraints have changed since we made the mess. You used to be happy to get 1/6th of a PCI Express lane for all your storage. Now processors directly expose 128 PCIe lanes and have a multitude of underused efficiency cores waiting to be used. Maybe we could do all the "smart" stuff in the OS and application code, and just attach commodity dumb flash chips to our computer.
1. Contemporary mainstream OSes have not risen to the challenge of dealing appropriately with the multi-CPU, multi-address-space nature of modern computers. The proportion of the computer that the "OS" runs on has been shrinking for a long time, and there have only been a few efforts to try to fix that (e.g. HarmonyOS, nrk, RTKit)
2. Hardware vendors, faced with proprietary or non-malleable OSes and incentives to keep as much magic in the firmware as possible, have moved forward by essentially sandboxing the user OS behind a compatibility shim. And because it works well enough, OS developers do not feel the need to adjust to the hardware, continuing the cycle.
There is one notable recent exception in adapting filesystems to SMR/zoned devices. However, this is only on Linux, so desktop PC component vendors do not care. (Quite the opposite: they disable the feature on desktop hardware for market segmentation.)
Are there solutions to this in the high-performance computing space, where random access to massive datasets is frequent enough that the “sandboxing” overhead adds up?
HPC systems generally use Lustre, where multiple servers handle metadata and objects (files) separately. These servers have multiple tiers of drives: metadata servers are SSD-backed, while file servers run on SSD-accelerated spinning-disk boxes with a mountain of 10TB+ drives.
When this structure is fed by an EDR/HDR/FDR InfiniBand network, the result is a blazingly fast storage system where a very large number of servers can make a massive number of random accesses simultaneously. The whole structure won't even shiver.
There are also other tricks Lustre can pull for smaller files to accelerate access and reduce overhead even further.
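For instance, Data-on-MDT and composite file layouts can keep small files entirely on the SSD-backed metadata servers so they never touch the OSTs. A rough sketch of the client-side commands, assuming a Lustre release with DoM support (the directory path and sizes are only illustrative):

  # first 64K of each new file lives on the MDT, the rest is striped normally
  lfs setstripe -E 64K -L mdt -E -1 -c 1 /mnt/lustre/small_files
  lfs getstripe /mnt/lustre/small_files    # verify the layout took effect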
In this model, the storage boxes are somewhat sandboxed, but the whole thing is mounted via its own client, so the OS sits very close to the model Lustre provides.
On the GPU servers, if you're going to provide big NVMe scratch spaces (à la NVIDIA DGX systems), you soft-RAID the internal NVMe disks with mdadm.
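A minimal sketch of that scratch setup, assuming four local NVMe devices and that /scratch is just an example mount point:

  # stripe the internal NVMe drives into one fast, disposable scratch volume
  mdadm --create /dev/md0 --level=0 --raid-devices=4 \
      /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1
  mkfs.xfs /dev/md0
  mount /dev/md0 /scratch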
In both models, saturation happens at the hardware level (disks, network, etc.); processors and other soft components don't impose a meaningful bottleneck even under high load.
Additionally, in the HPC space, power loss is not a major factor; backup power systems exist, and rerunning the last few minutes of a half-completed job is common, so on either side you are unlikely to encounter the fallout of "I clicked save, why didn't it save?"
In my experience, hours and days of jobs need to be rerun; our researchers do a poor job of checkpointing.
Of all the issues we have with Lustre, data loss has never been one while I have been on the team.
> In my experience, hours and days of jobs need to be rerun; our researchers do a poor job of checkpointing.
Enable pre-emption in your queues by default and that'll change: after a job has been scheduled and run for 1-2 hours, it can be kicked out and a new one run instead once the first one's priority decays a bit.
> When would I want to use preemption? When would I not want to use it?
> When a job is designated as a preemptee, we increase the job's priority, and increase several limits, including the maximum number of running processors or jobs per user, and the maximum running time per job. Note that these increased limits only apply to the preemptable job. This allows preemptable jobs to potentially run on more resources, and for longer times, than normal jobs.
> Enable pre-emption in your queues by default and that'll change.
We run preemptive queues, and no, not all jobs are compatible with that, especially the code researchers developed themselves.
My own code doesn't support checkpointing either. Currently it's blazing fast, but for bigger jobs it might need that support, and it would take many more cogs inside the pipeline to make it possible.
This is absolutely correct. The cattle vs. pets analogy [0] applies perfectly there. On the other hand, HPC systems are far from being unprotected. Storage systems generally disable write caches on spinning drives automatically and keep all in-flight data in either battery-backed or flash-based caches. So FS-level corruption is kept to a minimum.
Also, yes, many longer jobs do checkpoint and restart where they left off, but it's not always possible, unfortunately.
Is it though? IBM hates GPFS and has been trying to kill it off since its initial release, but every time it tries, the government (by NSF/tertiary proxy) stuffs more money into its mouth. It lives despite being hated by its parent. Both GPFS and Lustre have their warts.
I can recommend the related talk "It's Time for Operating Systems to Rediscover Hardware". [1]
It explores how modern systems are a set of cooperating devices (each with their own OS) while our main operating systems still pretend to be fully in charge.
Fundamentally the job of the OS is resource sharing and scheduling. All the low level device management is just a side show.
Hence why SSDs use a block-layer abstraction (or, in the case of NVMe key/value, hello 1964/CKD) above whatever pile of physical flash, caches, non-volatile caps/batts, etc. That abstraction holds from the lowliest SD card to huge NVMe-oF/FC/etc. smart arrays which are thin provisioning, deduplicating, replicating, snapshotting, etc. You wouldn't want this running on the main cores for performance and power-efficiency reasons. Modern M.2/SATA SSDs have a handful of CPUs managing all this complexity, along with background scrubbing, error correction, etc., so you would be talking fully heterogeneous OS kernels knowledgeable about multiple architectures, etc.
Basically it would be insanity.
SSDs took a couple orders of magnitude off the time values of spinning rust/arrays, but many of the optimization points of spinning rust still apply. Pack your buffers and submit large contiguous read/write accesses, queue a lot of commands in parallel, etc.
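A quick way to see that for yourself is fio, assuming the target device holds nothing you care about (these commands hit the raw device directly; the device name, block sizes, and queue depths are illustrative):

  # shallow random 4K reads vs. deep queues of large sequential reads
  fio --name=qd1-rand --filename=/dev/nvme0n1 --direct=1 --rw=randread --bs=4k --iodepth=1 --runtime=30 --time_based
  fio --name=qd32-seq --filename=/dev/nvme0n1 --direct=1 --rw=read --bs=1M --iodepth=32 --runtime=30 --time_based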
So, the fundamental abstraction still holds true.
And this is true for most other parts of the computer as well. Just talking to a keyboard involves multiple microcontrollers, scheduling the USB bus, packet switching, and serializing/deserializing the USB packets, etc. This is also why every modern CPU has a management CPU that bootstraps it and manages its power/voltage/clock/thermals.
So, hardware abstractions are just as useful as software abstractions like huge process address spaces, file IO, etc.
And if the entire purpose of computer programming is to control and/or reduce complexity, I should think the discipline would be embarrassed by the direction the industry has been going the past several years. AWS alone should serve as an example.
> And if the entire purpose of computer programming is to control and/or reduce complexity
I honestly don’t know where you got that idea from. I always thought the whole point of computer programming was to solve problems. If it makes things more complex as a result, then so be it. Just as long as it creates fewer, less severe problems than it solves.
What are some examples of complex solutions to simple problems? That is, where a solution doesn't result in a reduction of complexity? I can't find any.
And this is where the increased complexity is necessary for a solution, not just Perl Anti-golf or FactoryFactory Java jokes.
More complex systems are liable to create more complex problems...
I don't think you can get away from this - yes, you can solve a problem, but if you model problems as entropy, increasing complexity increases entropy.
It's like the messy room problem - you can clean your room (moving it out of an arguably high-entropy state), but unless you are exceedingly careful doing so increases entropy overall. You merely move the mess to the garbage bin, expend extra heat, increase your consumption in your diet, possibly break your ankle, strain your muscles...
Arguably cleaning your room is important, but decreasing entropy? That's not a problem that's solvable, not in this universe.
> but unless you are exceedingly careful doing so increases entropy
In an isolated system, entropy can only increase. Moving at all heats up the air. Even if you are exceedingly careful, you increase entropy when doing any useful work.
An interesting approach would be to standardize a way to program the controllers in flash disks, maybe something similar to OpenFirmware. Mainframes farm out all sorts of IO to secondary processors, and it was relatively common to overwrite the firmware in Commodore 1541 drives, replacing the basic IO routines with faster ones (or with copy-protection shenanigans). I'm not sure anyone ever did that, but it should be possible to process data stored in files without tying up the host computer.
But it's still an abstraction, and would have to remain that way unless you're willing to segment it into a bunch of individual product categories, since the functionality of these controllers grows with the target market. I.e., the controller on an eMMC isn't anywhere near similar to the controller on a flash array. So like GP-GPU programming, it's not going to be a single piece of code, because it's going to have to be tuned to each controller for perf as well as power reasons, never mind functionality differences (e.g. it would be hard to do IP/network-based replication if the target doesn't have a network interface).
There isn't anything particularly wrong with the current HW abstraction points.
This "cheating" by failing to implement the spec as expected isn't a problem that will be solved by moving the abstraction somewhere else, someone will just buffer write page and then fail to provide non volatile ram after claiming its non volatile/whatever.
And it's entirely possible to build "open" disk controllers, but it's not as sexy as creating a new processor architecture. Meaning RISC-V has the same problems if you want to talk to industry-standard devices (i.e. the NVMe drive you plug into the RISC-V machine is still running closed-source firmware on a bunch of undocumented hardware).
> the controller on an eMMC isn't anywhere near similar to the controller on a flash array
That’s why I suggested something similar to OpenFirmware. With that in place, you could send a piece of Forth code and the controller would run it without involving the CPU or touching any bus other than the internal bus in the storage device.
Adding a JIT to the mix only makes the problem worse. It's a question of resources: you're not going to fit all that code into a tiny SD microcontroller, which has a very limited CPU/RAM footprint even in comparison to SATA SSDs. The resource availability takes another huge jump when you move to a full-blown "array", which not only has all this additional disk-management functionality but also network-management functionality, etc. Some of these arrays have full-blown Xeons in them.
Plus, disks are at least partially sold on their performance profile, and you're going to create another problem with a cross-platform IR versus coding it in C/etc. and using a heavyweight optimizing compiler targeting the microcontroller in question directly.
You also have to remember these are effectively deeply embedded systems in many cases, which are expected to be available before the OS even starts. Sometimes that includes operating over a network of some form (NVMe-OF). So it doesn't even make sense when that network device is shared/distributed.
Consumer SSDs don't have much room to offer a different abstraction from emulating the semantics of hard drives and older technology. But in the enterprise SSD space, there's a lot of experimentation with exactly this kind of thing. One of the most popular right now is zoned namespaces, which separates write and erase operations but otherwise still abstracts away most of the esoteric details that will vary between products and chip generations. That makes it a usable model for both flash and SMR hard drives. It doesn't completely preclude dishonest caching, but removes some of the incentive for it.
There is no strong reason why a consumer SSD can't allow reformatting to a smaller normal namespace and a separate zoned namespace.
Zone-aware CoW file systems allow efficiently combining FS-level compaction/space-reclamation with NAND-level rewrites/write-leveling.
I'd probably pay for "unlocking" ZNS on my Samsung 980 Pro, if just to reduce the write amplification.
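For drives that do expose zones, stock Linux tooling can already inspect them; a sketch, assuming a zoned namespace shows up as /dev/nvme0n2 (the ZNS-specific commands come from nvme-cli's zns plugin):

  nvme zns id-ns /dev/nvme0n2          # zone size, max open/active zones
  nvme zns report-zones /dev/nvme0n2   # per-zone state and write pointers
  blkzone report /dev/nvme0n2          # same view through the block layer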
Enabling those features on the drive side is little more than changing some #ifdef statements in the firmware, since the same controllers are used in high-end consumer drives and low-power data center drives. But that doesn't begin to address the changes necessary to make those features actually usable to a non-trivial number of customers, such as anyone running Windows.
Isn't this a chicken and egg problem? Why would OS vendors spend time implementing this on their side if the drives don't support it?
The difference here being that it's not clear to me that there's much cost on the drive side to actually allow this. Aside maybe from the desire to segment the market.
To me, this looks like the whole sector-size situation. OSes, including regular Windows, have supported 4K drives for quite a while now. I bought a Samsung 980 (non-Pro) the other day that still pretends to have 512B sectors. The OEM drive in my laptop (some kind of Samsung) can be formatted with a 4K namespace, but the default is also 512B. The 980 doesn't even support this.
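For anyone curious what their own drive advertises, nvme-cli can list the supported LBA formats and, where the drive allows it, switch to the 4K one; a sketch (the LBA format index varies per drive, and nvme format destroys the data on the namespace):

  nvme id-ns -H /dev/nvme0n1 | grep 'LBA Format'   # list supported sector formats
  nvme format /dev/nvme0n1 --lbaf=1                # destructive: reformat to the 4K entry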
It's not quite a chicken and egg problem. Features like ZNS come into existence in the first place because they are desired by the hyperscale cloud providers who control their entire stack and are willing to sacrifice compatibility for small efficiency gains that matter at large scale.
The problem for the rest of the market is that the man-hours to rewrite your software stack to work with a different storage interface that allows eg. a 2% reduction in write amplification isn't worthwhile if you only have a fleet of a few thousand drives to worry about. There's minimal trickling down because the benefits are small and the compatibility costs are non-zero.
Even simple stuff like switching to shipping drives with a 4kB LBA size by default has very little performance impact (since drives are tracking things with 4kB granularity either way) and would be problematic for customers that want to apply a 512B disk image. The downsides of switching are small enough that they could easily be tolerated for the sake of a significant improvement, but for most of the market the potential improvement is truly insignificant. (And of course, fancy features are a market segmentation tool for drive vendors.)
> Why would OS vendors spend time implementing this on their side if the drives don't support it?
In the case of Microsoft, forcing the adoption of a de facto standard they create (and refusing to support competing ones OOTB) is immensely beneficial in terms of licensing fees.
> Consumer SSDs don't have much room to offer a different abstraction from emulating the semantics of hard drives and older technology.
From what I understand, the abstraction works a lot like virtual memory. The drive shows up as a virtual address space pretending to be a disk drive, and then the drive's firmware maps virtual addresses to physical ones.
That doesn't seem at all incompatible with exposing the mappings to the OS through newer APIs so the OS can inspect or change the mappings instead of having the firmware do it.
The current standard block storage abstraction presented by SSDs is a logical address space of either 512-byte or 4kB blocks (but pretty much always 4kB behind the scenes). Allocation is implicit upon writing to a block, and deallocation is explicit but optional. This model is indeed a good match for how virtual memory is handled, especially on systems with 4kB pages; there are already NVMe commands analogous to eg. madvise().
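The "explicit but optional" deallocation is what TRIM/Deallocate hints are; on Linux they are usually issued through the filesystem rather than against the drive directly. A sketch:

  fstrim -v /              # discard the free space of a mounted filesystem
  blkdiscard /dev/nvme0n1  # discard an entire (unmounted!) block device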
The problem is that it's not a good match for how flash memory actually works, especially with regards to the extreme disparity between a NAND page write and a NAND erase block. Giving the OS an interface to query which blocks the SSD considers as live/allocated rather than deallocated and implicitly zero doesn't seem all that useful. Giving the OS an interface to manipulate the SSD's logical to physical mappings (while retaining the rest of the abstraction's features) would be rather impractical, as both the SSD and the host system would have to care about implementation details like wear leveling.
Going beyond the current HDD-like abstraction augmented with optional hints to an abstraction that is actually more efficient and a better match for the fundamental characteristics of NAND flash memory requires moving away from a RAM/VM-like model and toward something that imposes extra constraints that the host software must obey (eg. append-only zones). Those constraints are what breaks compatibility with existing software.
If anything, consumer-level SSDs are moving in the opposite direction. On the Samsung 980 Pro it is not even possible to change the sector size from 512 bytes to 4K.
It's called the program-erase model. Some flash devices do expose raw flash, although it's then usually used by a filesystem (I don't know if any apps use it natively).
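On Linux, that raw flash shows up as MTD devices, and the erase/program split is visible directly in the tooling; a sketch with mtd-utils (device numbers and the image name are illustrative):

  cat /proc/mtd                     # list the raw flash partitions
  flash_erase /dev/mtd2 0 0         # erase the whole partition (offset 0, all blocks)
  nandwrite -p /dev/mtd2 image.bin  # program pages from an image
  # or hand the device to a flash-aware filesystem such as UBIFS or JFFS2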
There's a _lot_ of problems doing high performance NAND yourself. You honestly don't want to do that in your app. If vendors would provide full specs and characterization of NAND and create software-suitable interfaces for the device then maybe it would be feasible to do in a library or kernel driver, but even then it's pretty thankless work.
You almost certainly want to just buy a reliable device.
Endurance management is very complicated. It's not just a matter of whether any given block's P/E cycles will meet the UBER spec at the data-retention limit with the given ECC scheme. Well, it could be in a naive scheme, but then your costs go up.
Even something as seemingly simple as error correction is not. Error correction is too slow to do on the host for most IOs, so you need hardware ECC engines on the controller. But those become very large if you give them a huge amount of correction capability, so if errors exceed their capability you might fall back to firmware. Either way, the error rate is still important for knowing the health of the data, so you would need error-rate data to be sent side-band with the data by the controller somehow. If you get a high error rate, does that mean the block is bad, or does it mean you chose the wrong Vt to issue the read with, the retention limit was approached, the page had read-disturb events, dwell time was suboptimal, or the operating temperature was too low? All these things might factor into your garbage collection and endurance management strategy.
Oh and all these things depend on every NAND design/process from each NAND manufacturer.
And then there's higher level redundancy than just per-cell (e.g., word line, chip, block, etc). Which all depend on the exact geometry of the NAND and how the controller wires them up.
I think a better approach would be a higher-level logical program/free model that sits above the low-level UBER guarantees. GC would have to heed direction coming back from the device about which blocks must be freed and which blocks must be allocated next.
> Clearly vendors and users are at odds with each other here; vendors want the best benchmarks (so you can sort by speed descending and pick the first one), but users want their files to exist after their power goes out.
I don't know, maybe if there was a "my files exist after the power goes out" column on the website, then I'd sort by that, too?
Ultimately the problem is on the review side, probably because there's no money in it. There just aren't enough channels to sell that kind of content into, and it all seems relatively celebrity-driven. That said, I bet there's room for a YouTube personality to produce weekly 10-minute videos where they torture hard drives old and new - and torture them properly, with real scientific/journalistic integrity. So basically you need an idealistic, outspoken nerd and a little money to burn on drives and an audio/video setup. Such a person would definitely have such a "column" included in their reviews!
(And they could review memory, too, and do backgrounder videos about standards and commonly available parts.)
>I don't know, maybe if there was a "my files exist after the power goes out" column on the website
more like, "don't lose the last 5 seconds of writes if the power goes out". If ordering is preserved you should keep your filesystem, just lose more writes than you expected.
I wouldn't expect ordering of writes to be preserved, absent a specific way of expressing that need; part of a write cache's job is reordering writes to be more efficient which means ordering is not generally preserved.
But then again, if they're willing to accept and confirm flush commands without flushing, I wouldn't expect them to actually follow ordering constraints.
>part of a write cache's job is reordering writes to be more efficient which means ordering is not generally preserved.
You can use some sort of WAL mechanism to ensure that the parallel writes appear as if ordering was preserved. That will allow you to lie and ignore fsyncs, but still ensure consistency in case of a crash.
>But then again, if they're willing to accept and confirm flush commands without flushing, I wouldn't expect them to actually follow ordering constraints.
It depends on which type of liar they think you are. If they're the "don't care, disable all safeguards" type, then yes, they're probably ignoring ordering as well. However, it's also possible they're the methodical liar, figuring out what they can get away with. As I mentioned in another comment in this thread, as long as ordering is preserved the lie wouldn't be noticed in typical use cases (i.e. not using it for some sort of prod DB, and not using it as part of a multi-drive array). Power losses are relatively common, and a drive that totally corrupts the filesystem on it will get noticed much more quickly than a drive that merely loses the last few seconds of writes.
The flip side of the tyranny of the hardware flash controller is that the user can't reliably lose data even if they want to. Your super-secure end-to-end messaging system that automatically erases older messages is probably leaving a whole bunch of copies of those "erased" messages lying around on the raw flash on the other side of the hardware flash controller. This can create a weird situation where it is literally impossible to reliably delete anything on certain platforms.
There is sometimes a whole device erase function provided, but it turns out that a significant portion of tested devices don't actually manage to do that.
But then you have to find a place to store the key that can be securely erased. Perhaps there is some sort of hardware enclave you can misuse. Even a tiny amount of securely erasable flash would be the answer.
A TPM can only store a limited number of keys. You need a separate key for anything you want to securely delete, and in a lot of applications you might have a lot of things you want to delete separately.
You can pretty easily expand one secure, rotatable key into N. 1. Don't use TPM key directly, use it to encrypt the list of working keys. 2. Store the TPM-encrypted list of working keys on disk. 3. When you need to drop a working key, remove it from the list, rotate the TPM key and reencrypt all the working keys, and store the new list on disk again. Remnants of the old list are irrecoverable because the old TPM key doesn't exist anymore, and the new list is inaccessible without the new TPM key. There, now you have an arbitrary number of secure keys and can drop them individually.
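A minimal sketch of the re-wrap step, with openssl standing in for the TPM (in a real design the wrapping key would live in, and be rotated by, the TPM rather than a file; the file names and the dropped-key filter are hypothetical):

  # generate a new wrapping key, then re-wrap the list minus the dropped key
  openssl rand -hex 32 > kek.new
  openssl enc -d -aes-256-cbc -pbkdf2 -pass file:kek.old -in keys.enc \
    | grep -v "$KEY_ID_TO_DROP" \
    | openssl enc -aes-256-cbc -pbkdf2 -pass file:kek.new -out keys.enc.new
  mv keys.enc.new keys.enc   # old list + old KEK together are now useless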
Great point. This assumes that the TPM does secure deletes. Their primary purpose is to protect keys, not get rid of them. I think in practice a TPM is a small enough system that the deletion would be secure simply because that is the simplest way to implement it. If you do this enough, some overwriting will likely occur. I guess media endurance could be a problem in some cases.
This is the theory, where you never have to store the key on disk. In reality you store the key on disk while performing actions that would block the TPM chip from releasing the key, such as upgrading the firmware.
Great, we'll just store the key persistently on... Disk? Dammit! Ok, how about we encrypt the key with a user auth factor (like passphrase) and only decrypt the key in memory! Great. Now all we need to do is make sure memory is not persisted to disk for some unrelated reason. Wait...
Swap on zram instead of disk-based swap prevents persisting memory to disk and also dramatically improves swap performance. It's enabled by default on Fedora. I use it everywhere - on my desktop and on production servers.
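A sketch of setting it up by hand (Fedora ships this via zram-generator and a config file instead; size, algorithm, and priority here are illustrative and depend on the kernel):

  zramctl --find --size 8G --algorithm zstd   # allocates e.g. /dev/zram0
  mkswap /dev/zram0
  swapon --priority 100 /dev/zram0            # prefer it over any disk swap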
For sure, I'm not saying it's unsolvable, just that the defaults are insecure. Even if I, as an app developer, wanted to provide security for my users, I can't confidently delete sensitive data since this happens below layers I can or should control. We can argue about who is responsible, but it's not a great situation.
> Maybe we could do all the "smart" stuff in the OS and application code, and just attach commodity dumb flash chips to our computer.
Yeah, this is how things are supposed to be done, and the fact that it's not happening is a huge problem. Hardware makers isolate our operating systems in the exact same way operating systems isolate our processes. The operating system is not really in charge of anything; the hardware just gives it an illusory sandboxed machine to play around in, a machine that doesn't even reflect what the hardware truly looks like. The real computers are all hidden and programmed by proprietary firmware.
Flash storage is complex in the extreme at the low level. The very fact we're talking about microcontroller flash as if it's even in the same ballpark as NVMe SSDs in terms of complexity or storage management says a lot on its own about how much people here understand the subject (including me.)
I haven't done research on flash design in almost a decade, back when I worked on backup software, and my conclusions back then were basically this: you're just better off buying a reliable drive that can meet your own reliability/performance characteristics, making tweaks to your application to match the underlying drive's operational behavior (coalesce writes, append as much as you can, take care with multithreading vs HDDs/SSDs, et cetera), and testing the hell out of that with a blessed software stack. So we also did extensive tests on what host filesystems, kernel versions, etc. seemed "valid" or "good". It wasn't easy.
The amount of complexity needed to manage error correction and wear leveling on these devices alone, including auxiliary constraints, probably rivals the entire Linux I/O stack. And it's all incredibly vendor-specific. An auxiliary case, e.g. the OP's case of handling power loss and flushing correctly, is vastly easier when you only have to consider some controller firmware and some capacitors on the drive, versus the whole OS being guaranteed to handle any given state the drive might be in, with adequate backup power, at the time of failure, for any vendor and any device class. You'll inevitably conclude the drive is the better place to do this job precisely because it eliminates a massive number of variables like this.
"Oh, but what about error correction and all that? Wouldn't that be better handled by the OS?" I don't know. What do you think "error correction" means for a flash drive? Every PHY on your computer for almost every moderately high-speed interface has a built in error correction layer to account for introduced channel noise, in theory no different than "error correction" on SSDs in the large, but nobody here is like, "damn, I need every number on the USB PHY controller on my mobo so that I can handle the error correction myself in the host software", because that would be insane for most of the same reasons and nearly impossible to handle for every class of device. Many "Errors" are transients that are expected in normal operation, actually, aside from the extra fact you couldn't do ECC on the host CPU for most high speed interfaces. Good luck doing ECC across 8x NVMe drives when that has to go over the bus to the CPU to get anything done...
You think you want this job. You do not want this job. And we all believe we could handle this job because all the complexity is hidden well enough and oiled by enough blood, sweat, and tears, to meet most reasonable use cases.
No, they look like any normal flash drive, actually. Traditionally, for any drive you can buy at the store, the storage controller sits on the literal NVMe drive next to the flash chips, mounted on the PCB, and the controller handles all the "meaty stuff"; it's what the OS talks to. The reason for this is obvious: you can just plug it into an arbitrary computer, and the controller abstracts the differences between vendors, so any NVMe drive works anywhere. The key takeaway is that the storage controller exists "between" the two.
Apple still has a flash storage controller that exists entirely separately from the host CPU and the host software stack, just like all existing flash drives do today. The difference? The controller just doesn't sit on a literal, physical drive next to the flash chips, because there is no drive; they just solder the flash directly onto the board without a mount like an M.2 drive. Again, no variability here, so it can all be "hard coded." The storage controller instead sits next to the CPU in the "T2 security chip", which also handles things like in-line encryption on the path from the host to the flash (which is instead normally handled by host software before being put on the bus). It also does some other stuff.
So it's not a matter of "architecture", really. The architecture of all T2 Macs which feature this design is very close, at a high level, to any existing flash drive. It's just that Apple is able to put the NVMe controller in a different spot, and their "NVMe controller" actually does more than that; it doesn't have to be located on a separate PCB next to the flash chips at all because it's not a general 3rd party drive. It just has to exist "on the path" between the flash chips and the host CPU.
I would absolutely love to have access to "dumb" flash from my application logic. I've got append only systems where I could be writing to disk many times faster if the controller weren't trying to be clever in anticipation of block updates.
The ECC and anything to do with multi- or triple-level-cell flash is quite non-trivial. You don't want to have to think about these things if you don't have to. But yes, better control over the flash controllers would be nice. There are alternative modes for NVMe like those specifically for key-value stores: https://nvmexpress.org/developers/nvme-command-set-specifica...
This is like the statement that if I optimize memcpy() for the number of controllers, levels of cache, and latency to each controller/cache, it's possible to make it faster than both the CPU microcoded version (rep stosq/etc) and the software versions provided by the compiler/glibc/kernel/etc. Particularly if I know what the workload looks like.
And it breaks down the instant you change the hardware, even in the slightest ways. Frequently the optimizations then made turn around and reduce the speed below naive methods. Modern flash+controllers are massively more complex than the old NOR flash of two decades ago. Which is why they get multiple CPUs managing them.
IMO the problem here is that even if your flash drive presents a "dumb flash" API to the OS, there can still be caching and other magic that happens underneath. You could still be in a situation where you write a block to the drive, but the drive only writes that to local RAM cache so that it can give you very fast burst write speeds. Then, if you try to read the same block, it could read that block from its local cache. The OS would assume that the block has been successfully written, but if the power goes out, you're still out of luck.
"Clearly vendors and users are at odds with each other here; vendors want the best benchmarks (so you can sort by speed descending and pick the first one), but users want their files to exist after their power goes out."
Clearly the vendors are at odds with the law, selling a storage device that doesn't store.
I think they are selling snake oil, otherwise known as committing fraud. Maybe they made a mistake in design, and at the very least they should be forced to recall faulty products. If they know about the problem and this behaviour continues, it is basically fraud.
If we allow this to continue, the manufacturers that actually do fulfill their obligations to the customer suffer financially, while unscrupulous ones laugh all the way to the bank.
I agree, all the way up to entire generations of SDRAM being unable to store data at their advertised speeds and refresh timings. (Rowhammer.) This is nothing short of fraud; they backed the refresh off WAY below what's necessary to correctly store and retrieve data accurately regardless of adjacent row access patterns. Because refreshing more often would hurt performance, and they all want to advertise high performance.
And as a result, we have an entire generation of machines that cannot ever be trusted. And an awful lot of people seem fine with that, or just haven't fully considered what it implies.
I don't know if a legal angle is the most helpful, but we probably need a Kyle Kingsbury type to step into this space and shame vendors who make inaccurate claims.
Which is currently all of them, but that was also the case in the distributed systems space when he first started working on Jepsen.
Sure, of course. But even if you did want to seek a legal remedy, someone would have to do the work to clearly document the issue for the purposes of making it clear to a non-technical courtroom.
And at the point where that documentation had been done, that on its own might be enough to right the ship without anyone actually having to get sued.
The manufacturers warrant these devices to behave on a motherboard with proper power hold up times, not in whatever enclosures.
If the enclosure vendor suggests that behavior on cable pull will fully mimic motherboard ATX power loss, then that is fraud. But they probably have fine print about that, I'd hope.
"The manufacturers warrant these devices to behave on a motherboard with proper power hold up times"
That's an interesting point; doesn't 'power failure' also include potential failure of the power supply, in which case you might not get that time?
Or what if a new write command is issued within the holdup time? Does the motherboard/OS know about power loss during those 16 milliseconds that the power is still holding?
'Power loss' or 'power failure' for a part designed to operate at ATX specs does not mean supply failure. Supply failure can cause anything up to and including destruction of all components and even death of operator.
Anyway, let's firm up how an SSD works and what the OS knows.
SSDs have volatile DRAM buffers as a staging area to use before writing to the flash.
Flush (OS ioctl) means the data is successfully residing in the volatile DRAM of the SSD.
This is all the OS knows and usually ever knows in the ioctl cycle.
If power is lost, there is some time before the >16ms is up that the power-good signal is lost on the motherboard. The voltage on the 3.3V rail will probably also drop enough from nominal to let the SSD controller know it had better get its housekeeping in order. In other words, dump the DRAM somewhere permanent and deal with it on the next power up.
Anything the OS is doing in the interim will not likely be acknowledged as flushed so that's not a concern. The OS userspace write will never complete. That loop works fine.
The thing that gets people up in arms is that flush means the SSD has the data only in volatile memory and not necessarily in non-volatile storage.
All performant SSDs seem to work this way. They need buffers.
The larger form-factor enterprise drives, which are maybe 25% more expensive, have PLP capacitor banks. These supply a solid 50ms of power. Some manufacturers supply oscilloscope screenshots and such.
Anything else seems to be variable in its approach to power loss, particularly the smaller, hotter M.2 parts.
Capacitor banks have issues like taking up space, causing inrush currents, gaining impedance over time, and mediocre reliability at the high temperatures that latest M.2 sticks experience.
> The larger form factor enterprise drives, which are maybe 25% more expensive, have PLP capacitor banks.
I wonder if there would be a market for a small board that contains the capacitors and passes the signals down to a M.2 female connector. The physical disconnect would probably help with the temperature as well (and the board could come with its own heatsink).
I want to revise my comments. There are indeed some capacitors on many M.2 boards -- not sure how much capacitance. It takes several mF or more to drive a few amps at 3.3V for some tens of milliseconds, which is not insignificant, so larger form factors are certainly at an advantage.
Nothing says that you can't both offload everything to hardware, and have the application level configure it. Just need to expose the API for things like FLUSH behavior and such...
Yeah, you're absolutely right. I'd prefer that the world dramatically change overnight, but if that's not going to happen, some little DIP switch on the drive that says "don't acknowledge writes that aren't durable yet" would be fine ;)
> the embedded system model where you get flash hardware that has two operations -- erase block and write block
> just attach commodity dumb flash chips to our computer
I kind of agree with your stance; it would be nice for kernel- or user-level software to get low-level access to hardware devices to manage them as they see fit, for the reasons you stated.
Sadly, the trend has been going toward smart devices for a very long time now. In the very old days, stuff like floppy disk seeks and sector management were done by the CPU, and "low-level formatting" actually meant something. Decades ago, IDE HDDs became common, LBA addressing became the norm, and the main CPU cannot know about disk geometry anymore.
I think the main reason they did not expose lower-level semantics is that they wanted a drop-in replacement for HDDs. The second is liability: unfettered access to arbitrary-location erases (and writes) can let you kill (wear out) a flash device in a really short time.
SATA vs NVMe vs SCSI/SAS only matters at the lowest levels of the operating system's storage stack. All the filesystem code and almost all of the block layer can work with any of those transports using the same HDD-like abstractions. Switching to a more flash-friendly abstraction breaks compatibility throughout the storage stack and potentially also with assumptions made by userspace.
I've actually run into some data loss running simple stuff like pgbench on Hetzner due to this -- I ended up just turning off write-back caching at the device level for all the machines in my cluster:
Granted, I was doing something highly questionable (running Postgres with fsync off on ZFS). It was very painful to get to the actual issue, but I'm glad I found out.
I've always wondered whether it would be worth starting a simple data product with tests like these on various cloud providers, so you know where these corners are and what you're really getting for the money (or lack thereof).
[EDIT] To save people some time (that post is long), the command to set the feature is the following:
nvme set-feature -f 6 -v 0 /dev/nvme0n1
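And to check what the drive currently reports before and after, something like the following should work (feature 6 is the Volatile Write Cache):

  nvme get-feature -f 6 -H /dev/nvme0n1       # current enable/disable state
  nvme id-ctrl /dev/nvme0 | grep -i '^vwc'    # whether the drive reports a volatile cache at all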
The docs for `nvme` (nvme-cli package, if you're Ubuntu based) can be pieced together across some man pages:
Unlike eg. ATA and SCSI, the NVMe specs are freely available to the public. They're a little more complicated to read now that the spec has been split into a few modules, but finding the descriptions of all the optional features isn't too hard.
The nvme-cli tool and its documentation are written with the assumption that the user is at least somewhat familiar with the NVMe spec or protocol itself, because a large part of the purpose of that tool is to expose NVMe functionality that the OS does not currently understand or make use of. It's meant to be a pretty raw interface.
From reading your vadosware.io notes, I'm intrigued that replacing fdatasync with fsync is supposed to make a difference to durability at the device level. Both functions are supposed to issue a FLUSH to the underlying device, after writing enough metadata that the file contents can be read back later.
If fsync works and fdatasync does not, that strongly suggests a kernel or filesystem bug in the implementation of fdatasync that should be fixed.
That said, I looked at the logs you showed, and those "Bad Address" errors are the EFAULT error, which only occurs in buggy software, or some issue with memory-mapping. I don't think you can conclude that NVMe writes are going missing when the pg software is having EFAULTs, even if turning off the NVMe write cache makes those errors go away. It seems likely that that's just changing the timing of whatever is triggering the EFAULTs in pgbench.
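One way to at least pin down what the software side is doing, assuming you can attach to a backend process, is to watch the sync calls directly (the PID is a placeholder):

  strace -f -tt -e trace=fsync,fdatasync,sync_file_range -p <postgres_backend_pid>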
> From reading your vadosware.io notes, I'm intrigued that replacing fdatasync with fsync is supposed to make a difference to durability at the device level. Both functions are supposed to issue a FLUSH to the underlying device, after writing enough metadata that the file contents can be read back later.
Yeah I thought the same initially which is why I was super confused --
> If fsync works and fdatasync does not, that strongly suggests a kernel or filesystem bug in the implementation of fdatasync that should be fixed.
Gulp.
> That said, I looked at the logs you showed, and those "Bad Address" errors are the EFAULT error, which only occurs in buggy software, or some issue with memory-mapping. I don't think you can conclude that NVMe writes are going missing when the pg software is having EFAULTs, even if turning off the NVMe write cache makes those errors go away. It seems likely that that's just changing the timing of whatever is triggering the EFAULTs in pgbench.
It looks like I'm going to have to do some more experimentation on this -- maybe I'll get a fresh machine and try to reproduce this issue again.
What led me to suspect the NVMe drive was dropping writes was the complete lack of errors on the pg and OS side (dmesg, etc).
I think this is something LTT could handle with their new test lab. They already said they want to set new standards when it comes to hardware testing, so if they can live up to what they promised and hire enough experts, this should be a trivial thing to add to a test regimen for disk drives.
LTT's commentary makes it difficult to trust they are objective (even if they are).
I loved seeing how giddy Linus got while testing Valve's Steamdeck, but when it comes to actual benchmarks and testing, I would appreciate if they dropped the entertainment aspect.
GamersNexus seems to really be trying to work on improving and expanding their testing methodology as much as possible.
I feel like they've successfully developed enough clout/trust that they have escaped the hell of having to pull punches in order to ensure they get review samples.
They eviscerated AMD for the 6500xt. They called out NZXT repeatedly for a case that was a fire hazard (!). Most recently they've been kicking Newegg in the teeth for trying to scam them over a damaged CPU. They've called out some really overpriced CPU coolers that underperform compared to $25-30 coolers. Etc.
I bet they'd go for testing this sort of thing, if they haven't already started working on it. They'd test it and then describe for what use cases it would be unlikely to be a problem vs what cases would be fine. For example, a game-file-only drive where if there's an error you can just verify the game files via the store application. Or a laptop that's not overclocked and is only used by someone to surf the web and maybe check their email.
> for starters i think the lab is going to focus on written for its own content and then supporting our other content [mainly their unboxing videos]... or we will create a lab channel that we just don't even worry about any kind of upload frequency optimization and we just have way more basic, less opinionated videos, that are just 'here is everything you need to know about it' in video form if, for whatever reason, you prefer to watch a video compared to reading an article
Ah, I see how my comment was misleading--it really meant to highlight that at times I do appreciate LTT's entertainment aspect, not that I expected there to be a technical review of the steamdeck.
I'd really like to see one of the popular influencers disrupt the review industry by coming up with a way to bring back high quality technical analysis. I'd love to see what the cost of revenue looks like in the review industry. I'm guessing in-depth technical analysis does really bad in the cost of revenue department vs shallow articles with a bunch of ads and affiliate links.
I think the current industry players have tunnel vision and are too focused on their balance sheets. Things like reputation, trust, and goodwill are crucial to their businesses, but no one is getting a bonus for something that doesn't directly translate into revenue, so those things get ignored. That kind of short sighted thinking has left the whole industry vulnerable to up and coming influencers who have more incentive to care about things like reputation and brand loyalty.
I've been watching LTT with a fair bit of interest to see if they can come up with a winning formula. The biggest problem is that in-depth technical analysis isn't exciting. I remember reading something many years ago, maybe from JonnyGuru, where the person was explaining how most visitors read the intro and conclusion of an article and barely anyone reads the actual review.
Basically you need someone with a long term vision who understands the value you get from in-depth technical analysis and doesn't care if the cost of it looks bad on the balance sheet. Just consider it the cost of revenue for creating content and selling merchandise.
The most interesting thing with LTT is that I think they've got the pieces to make it work. They could put the most relevant / interesting parts of a review on YouTube and skew it towards the entertainment side of things. Videos with in-depth technical analysis could be very formulaic to increase predictability and reduce production costs and could be monetized directly on FloatPlane.
That way they build their own credibility for their shallow / entertaining videos without boring the core audience, but they can get some cost recovery and monetization from people that are willing to pay for in-depth technical analysis.
I also think it could make sense as bait to get bought out. If they start cutting into the traditional review industry someone might come along and offer to buy them as a defensive move. I wonder if Linus could resist the temptation of a large buyout offer. I think that would instantly torpedo their brand, but you never know.
They rigorously test their hardware and you can filter/sort by literally hundreds of stats.
I just built a PC and I would have killed for a site that had apples-to-apples benchmarks for SSDs/RAM/etc. Motherboard reviews especially are a huge joke. We're badly missing a site like that for PC components.
Just google userbenchmark bias. Basically, when AMD shook things up a few years ago with huge numbers of cores, UserBenchmark responded by weighting down the importance of multithreaded workloads in their scores, so Intel would stay on top. Now their site is banned from most subreddits, including both r/intel and r/amd.
It wasn't one event, more like the ratings for CPUs just became laughably, transparently, utterly worthless, to the point where Intel i3 laptop CPUs were scoring higher than top-end AMD Threadripper CPUs. And they refused to acknowledge any issues.
Within a month or two after AMD started shipping CPUs with more than 8 cores, they tweaked the algorithm to ignore >8 cores. And various other ridiculous changes that hurt AMD's rankings.
Unfortunately Userbenchmark is totally useless for comparing components. They don’t even attempt to benchmark one change at a time while keeping all other parts of the testbench identical.
Worse yet every time I benchmark one of my machines, I score significantly higher than the average user results for the same hardware. Perhaps the average submitter has crapware/antivirus installed or their machines are misconfigured (e.g. XMP disabled) which makes all their data suspect.
I appreciate the links. But it's tough to believe stats uploaded by random users, especially when we're only talking a few percent difference. Not to mention, if you sort by "avg bench %", apparently WD released an NVMe drive that's faster than Intel Optane. You'd think that would have made the news.
fwiw the best motherboard comparison I found was on overclock.net[1]. It didn't list everything I cared about, but it was a great starting point
Individual benchmarks uploaded by random users would be hard to trust yes, but UserBenchmark collects thousands. If you click through to the page for a given product it'll even show you a distribution graph of the collected scores from different real-world machines.
> apparently WD released an NVMe drive that's faster than Intel Optane
"Faster" is a matter of opinion; it depends on what you're optimizing for. Optane obviously has faster random reads, but it's not so great at sequential writes. The UserBenchmark score tries to take all of that into account: https://ssd.userbenchmark.com/Compare/Intel-905P-Optane-NVMe...
I mentioned this in another comment, but I think GamersNexus is doing exactly what you want.
Regarding influencers: they're being leveraged by companies precisely because they are about "the experience", not actual objective analysis and testing. 99% of the "influencers" or "digital content creators" don't even pretend to try to do analysis or testing, and those that do generally zero in on one specific, usually irrelevant, thing to test.
I hope they do a good mix of entertainment and GamersNexus's depth. I'm struggling to watch GN without zoning out after a couple minutes. It's good deep content for sure, but if it was in written form I'd just skim and get the interesting bits.
You wrote: <<bring back high quality technical analysis>>
How about Tom's Hardware and AnandTech? If they don't count, who does? Years ago, I used to read CDRLabs for optical media drives. Their reviews were very scientific and consistent. (Of course, optical media is all but dead now.)
He’s recently pivoted a ton of his business to proper lab testing, and is hiring for it. It’ll be interesting to see, I think he might strike a better balance for those types of videos (I too am a bit tired of the clickbait nature these days).
So he says; I wish the funding were available to other groups who already have a more proven / technical track record in the area, though.
Like... what if LTT bought out Anandtech instead of trying to spin up a new 'labs' to replicate what has largely been lost (but still exists to an extent) in a few dusty corners of the tech journalism world.
I'm willing to give the benefit of the doubt, but there's so far been a lot of "just try me" and "we hired someone amazing" but I'll believe it when I see results!
But audience is also important. If it is only super-technical sources that are reporting faulty drives, then the manufacturers won't care much. However, if you get a very popular source that has a large audience, especially in the high-margin "gamer" vertical, then all of a sudden the manufacturers will care a lot.
So if LTT does start providing more objective benchmarks and reviews it could be a powerful market force.
I would personally leave this kind of testing to the pros, like Phoronix, Gamers Nexus, etc. LTT is a facade for useful performance testing and understanding of hardware issues.
I think "the problem with LTT" is that, as time goes on, they've slid off the purely informational stuff and into the whatever-gets-the-most-clicks stuff. I don't mind a little bit of humor or personality (Digital Foundry is great in that regard), but when Linus started uploading videos that defended his use of click-baity thumbnails and the bribes he received from Nvidia/Intel, his credibility fell off a cliff for me. If you're not going to stand for the objective truth, why even bother reviewing hardware? I'd imagine that pressure is what pushed them to invest in this lab, but even then I have a hard time trusting them.
Linus is welcome to chase whatever niche market he wants, but as a "purely informational source" he's got a pretty marred track record these days.
Why do clickbaity thumbnails matter more than the content of the video? Linus has made it clear that he hates using them, but videos with them consistently get way better viewership, which is kinda essential to keep the channel running.
I'm also very curious to see a source on the "bribes he received from Nvidia/Intel", because I'm not finding anything that looks relevant on Google.
> Why do clickbaity thumbnails matter more than the content of the video?
Take for example this recent video: "We ACTUALLY downloaded more RAM" [1] complete with grinning youtube face holding a stick of RAM marked '10TB' - and it's complete bullshit.
How can I trust the opinion of someone who publishes such embarrassing nonsense?
I like how you quoted my question and then completely ignored it. The fact that you disliked the title of a video is not in any way a meaningful criticism of its content.
But okay, let's take a closer look at that video. When I saw it in my YouTube recs, I rolled my eyes at the clickbait and skipped over it, but I didn't see how it makes LTT look bad. In fact, I just gave it a fair shot and skimmed through it for myself, and it actually looks like a pretty solid explanation of memory hierarchy and swap space for beginners, packaged in a format that will increase its reach. I don't see what's bullshit about that.
Look, say what you will about clickbait, the unfortunate truth is that it gets views, which channels like LTT need to survive and grow. Linus is on the record saying he hates it, but they've done the tests to confirm that the stupid thumbnails and titles just perform better.
And come on, let's be honest here: How many people are going to click on a video titled "What is swap space?" or "You can use Google Drive for swap space on Linux" or something similarly boring? Even the best explanation in the world isn't going to get traction with a title like that. I looked for comparable videos and it looks like "What is Linux swap?" by Average Linux User (https://youtu.be/0mgefj9ibRE) is the next most-viewed video on the same topic. That video has gotten about 90,000 views since it was posted in 2019. By comparison, the LTT video has averaged about 100,000 views per day in the 16 days since it was posted.
So it looks to me like LTT took a technical topic that most people would never think about, found an angle to make it interesting to random people browsing YouTube, and tricked potentially thousands of people into learning something. What exactly is embarrassing about that?
Not sure about the other stuff you claim; I'm not a big video guy for tech things (just let me skip and search ahead easily), but this came up with my friends a few years ago after people started noticing many videos from various creators moving to this format of thumbnail.
I developed SSD firmware in the past, and our team always made sure the drive would write the data and check the write status. We also used to analyze competitor products with bus analyzers and could determine that some didn't do that. Also, in the past, many OS filesystems would ignore many of the errors we returned anyway.
Edit: Here is an old paper on the subject of OS filesystem error handling.
There is a 970 Evo, a 970 Pro and a 970 Evo Plus, but no 970 Evo Pro as far as I am aware. It would be interesting to know what model OP is actually talking about and whether it is the same for other Samsung NVMe SSDs. I also prefer Samsung SSDs because they are reliable and they usually don't swap parts for lower-spec ones while keeping the same model number like some other vendors.
And watch out with the 980 Pro, Samsung has just changed the components.
Samsung have removed the Elpis controller from the 980 PRO and replaced it with an unknown one, and also removed any speed reference from the spec sheet.
It's OK for them to do this, but then they should give the new product a new name, not re-use the old name so that buying it becomes a "silicon lottery" as far as performance goes.
Link seems to be broken, shows a picture of the note 10 for me. I guess you wanted to link this one [1].
I knew about the changed controller in the 970 Evo Plus, but I wasn't aware they also changed the 980 Pro. That's disappointing. Is there anyone left that isn't doing those shenanigans?
I mostly buy Samsung Pro. Today I put an Evo in a box which I'm sending back for RMA because of damaged LBAs. I guess I'm stopping my tests on getting anything else but the Pros.
But IIRC Samsung was also called out for switching controllers last year.
"Yes, Samsung Is Swapping SSD Parts Too | Tom's Hardware"
I see they don't want a "thing", but that hardly seems to be a reason to not name names. Is there some special status of companies that the non-conformant status of their devices should be private?
It turning into a "thing" sounds like a net win for consumers.
> I see they don't want a "thing", but that hardly seems to be a reason to not name names.
I see you've never experienced the shit-storm of abusive messages sometimes received from fans when you say something bad about the products from a company they are unreasonably attached to. Or the rather aggressive stance some companies themselves take when something not complimentary is said. Or in the middle, paid shills (the company getting someone to pretend to be one of those overly attached people).
Everything you listed are external chilling factors.
You would blame the person neutrally shining the flashlight for the obscene response of others? Simply identifying a list of tested drives should not cause fear for someone's wellbeing. Anything otherwise is a successful stifling of knowledge.
> You would blame the person neutrally shining the flashlight for the obscene response of others?
Absolutely not, you seem to have misread me there.
I'm saying I understand the bad results not being published to avoid the potential for the obscene response from others.
Publishing the good results is enough of a public service. More than required, in fact. The test results could have been kept to themselves as nothing is owed to the rest of us.
Perhaps the author does not want to name the vendors that failed, giving them time to contact him with some attractive suggestions. Or am I too suspicious? :)
I'm curious whether the drives are at least maintaining write-after-ack ordering of FLUSHed writes in spite of a power failure. (I.e., whether the contents of the drives after power loss are nonetheless crash consistent.) That still isn't great, as it messes with consistency between systems, but at least a system solely dependent on that drive would not suffer loss of integrity.
Enterprise drives with PLP (power loss protection) are surprisingly affordable. I would absolutely choose them for workstation / home use.
The new Micron 7400 Pro M.2 960GB is $200, for example.
Sure, the published IOPS figures are nothing to write home about, but drives like these 1) hit their numbers every time, in every condition, and 2) can just skip flushes altogether, making them much faster in uses where data integrity is important (and flushes would otherwise be issued).
b...but the M in MLC stands for multi... as in multiple... right?
checks
Oh... uh; Apparently the obvious catch-all term MLC actually only refers to dual layer cells, but they didn't call it DLC, and now there's no catch-all term for > SLC. TIL.
The "L" stands for level, and that makes it even more wrong. MLC should have been called QLC or DBC, and TLC should have been OLC or TBC. It's two bits and four levels, or three bits and eight levels. The latest "QLC" flash has 16 levels (and is a massive step down in performance and reliability; it seems TLC is the sweet spot, at least right now, unless you really want the absolute cheapest, performance be damned).
Interestingly enough, Apple has a patent for 8-bit cell flash (256 levels!), going full blown analog processing and error correction, but I don't think that ever became a product.
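Spelling out the relationship the naming obscures (nothing new here, just the arithmetic from the comments above, in LaTeX):

    \text{levels per cell} = 2^{\text{bits per cell}}
    \quad\Rightarrow\quad \text{SLC}=2,\quad \text{MLC}=4,\quad \text{TLC}=8,\quad \text{QLC}=16,\quad \text{8-bit cell}=256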
I'm not sure there were ever consumer TLC SSDs that didn't use SLC caching (except for a handful that use MLC caching). SLC caching was starting to show up even in some MLC drives when the market was transitioning from MLC to TLC, and now there are even some enterprise drives that use SLC caching.
Samsung has a hardware testing lab where all new storage products (SSDs/memory cards) are rigorously put through (automated) tests with a ridiculous number of reads, writes and power scenarios. The numbers are then averaged out and dialed down a bit to provide some buffer and finally advertised on the models. I'm not surprised that they maintain data integrity. They also own their entire stack (software and hardware) so there is less scope for an untested OEM bug to slip through.
Per the title, four vendors were tested. Samsung was already mentioned as a non-loser, so it can't be one of the two losers (or else the title would be wrong and the SSDs would be from 3 vendors at most).
I didn't pay careful attention to the wording of the submitted title. I may have been confused because of the wording of the actual tweet: "I tested a random selection of four NVMe SSDs from four vendors."
The word "random" meant to me that Samsung drives could have been selected twice. But, yes, then there wouldn't be four distinct vendors.
Unstated but implied by you is there are only two (major) Korean vendors to choose from.
So if Samsung is a Korean winner then Hynix must be the Korean loser. Which is now clear to me.
Is it possible there's a third (minor) Korean player? Could I possibly still have a chance? :)
> Is it possible there's a third (minor) Korean player? Could I possibly still have a chance? :)
Well supposedly Zalman (another Korean company) makes SSDs, but I don't think I've ever seen one in the wild. Their specialty is heatsinks and fans, last I checked.
Please stop replying with misinformation all over this thread.
The NVMe spec is available for free; you should read it.
And you're 100% wrong about the enclosure too. It's driven by an Intel TB bridge JHL6240 and the drives are PCIe NVMe m.2 devices. Power specs are identical to on-board m.2 slots with PCIe support (which is all modern ones). There is no USB involved.
See my other reply to you where I explain what Flush actually does (your comments about it are also completely wrong).
Your TB test sounds valid but did you verify with manufacturer that power loss protection or power failure protection works in your TB enclosure? Is that a fair assumption or do you need to ask?
I'm a systems engineer but I've never done low level optimizations on drives. How does one even go about testing something like this? It sounds like something cool that I'd like to be able to do.
My script repeatedly writes a counter value "lines=$counter" to a file, then calls fcntl() with F_FULLFSYNC against that file descriptor which on macOS ends up doing an NVMe FLUSH to the drive (after sending in-memory buffers and filesystem metadata to the drive).
Once those calls succeed it increments the counter and tries again.
As soon as the write() or fcntl() fail it prints the last successfully written counter value, which can be checked against the contents of the file. Remember: the semantics of the API and the NVMe spec require that after a successful return from fcntl(fd, F_FULLFSYNC) on macOS the data is durable at that point, no matter what filesystem metadata OR drive internal metadata is needed to make that happen.
In my test while the script is looping doing that as fast as possible I yank the TB cable. The enclosure is bus powered so it is an unceremonious disconnect and power off.
Two of the tested drives always matched up: whatever the counter was when write()+fcntl() succeeded is what I read back from the file.
Two of the drives sometimes failed by reporting counter values < the most recent successful value, meaning the write()+ fcntl() reported success but upon remount the data was gone.
Anytime a drive reported a counter value +1 from what was expected I still counted that as a success... after all there's a race window where the fcntl() has succeeded but the kernel hasn't gotten the ACK yet. If disconnect happens at that moment fcntl() will report failure even though it succeeded. No data is lost so that's not a "real" error.
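For anyone who wants to reproduce something similar, here is a minimal C sketch of that kind of loop (not the author's actual script; the file path is illustrative, and F_FULLFSYNC is macOS-specific):

    /* Minimal sketch of a write+flush torture loop (not the author's actual
     * script). macOS only: F_FULLFSYNC asks the OS to flush its own buffers
     * and then issue a FLUSH to the drive. Yank the cable/power while this
     * runs, then compare the printed value with the file's last line. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        unsigned long counter = 0;
        char buf[64];
        int fd = open("counter.txt", O_WRONLY | O_CREAT | O_APPEND, 0644);
        if (fd < 0) { perror("open"); return 1; }

        for (;;) {
            int len = snprintf(buf, sizeof(buf), "lines=%lu\n", counter + 1);
            if (write(fd, buf, len) != len) break;       /* device gone? */
            if (fcntl(fd, F_FULLFSYNC) == -1) break;     /* flush failed */
            counter++;   /* counter == last value the drive claims is durable */
        }
        /* After remount, a last line with a value < counter means the drive
         * lost an acknowledged flush; counter or counter+1 is a pass. */
        printf("last successfully flushed value: lines=%lu\n", counter);
        return 0;
    }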
On very recent Linux kernels you can open the raw NVMe device and use the NVMe pass thru ioctl to directly send NVMe commands (or you can use SPDK on essentially any Linux kernel) and bypass whatever the fsync implementation is doing. That gives a much more direct test of the hardware (and some vendors have automated tests that do this with SPDK and ip power switches!). There's a bunch of complexity around atomicity of operations during power failure beyond just flush that have to get verified.
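As a rough illustration of that passthrough route (a sketch only: the device path and namespace ID are assumptions, and this sends a single Flush rather than being a full power-fail harness):

    /* Sketch: send a raw NVMe Flush via the Linux passthrough ioctl,
     * bypassing the filesystem/fsync path. Requires root; /dev/nvme0n1 and
     * nsid=1 are assumptions for illustration. */
    #include <fcntl.h>
    #include <linux/nvme_ioctl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/dev/nvme0n1", O_RDWR);
        if (fd < 0) { perror("open"); return 1; }

        struct nvme_passthru_cmd cmd;
        memset(&cmd, 0, sizeof(cmd));
        cmd.opcode = 0x00;   /* NVM command set: Flush */
        cmd.nsid   = 1;      /* namespace to flush */

        /* The controller must not complete this until previously completed
         * writes for the namespace (and its metadata) are on durable media. */
        int err = ioctl(fd, NVME_IOCTL_IO_CMD, &cmd);
        if (err < 0)
            perror("NVME_IOCTL_IO_CMD");
        else if (err > 0)
            fprintf(stderr, "flush failed, NVMe status 0x%x\n", err);
        else
            puts("flush completed");
        close(fd);
        return err ? 1 : 0;
    }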
Is it possible the next write was incomplete when the power cut out? Wouldn't this depend on how updates to file data are managed by the filesystem? The size and alignment of disk and filesystem data & metadata blocks?
Yes, kinda. If the drive completes the flush but gets disconnected before the kernel can read the ack then I can get an error from fcntl(). In theory it's possible I could get an error from write() even though it succeeded but I don't know if that is possible in practice.
In any case the file's last line will have a counter value +1 compared to what I expected. That is counted as a success.
Failure is only when a line was written to the file with counter==N, fcntl(fd, F_FULLFSYNC, 1) reports success all the way back to userspace, yet the file has a value < N. This gives the drive a fairly big window to claim it finished flushing as the ack races back to userspace but even so two of the drives still failed. The SK Hynix Gold P31 sometimes lost multiple writes (N-2) meaning two flush cycles were not enough.
This seems like it would only work with an external enclosure setup. I wonder if a test could be performed in the usual NVMe slot.
Of course, it seems it would be much harder to pull main power for the entire PC. I'm not sure how you'd do that - maybe high speed camera, high refresh monitor to capture the last output counter? Still no guarantee I'm afraid.
If you have a host system that has reasonable PCIe hotplug support and won't panic at a device dropping off the bus, then you can just use a riser card that can control power provided over a PCIe slot.
Quarch makes power injection fixtures for basically all drive connectors, to be paired with their programmable power supply for power loss testing or voltage margin testing (quite important when M.2 drives pull 2.5+A over the 3.3V rail and still want <5% voltage drop).
There's plenty of network controlled power outlets. Either enterprise/rackmount PDUs, or consumer wifi outlets, or rig something up with a serial/parallel port and a relay. You'd use an always on test runner computer to control the power state.
The computer under test would boot from PXE, on boot read from the drive and determine the last write, send that to the test runner for analysis, then begin the write sequence and report ASAP to the test runner at each flush. The test runner turns the power off at random, waits a minute (or 10 seconds, whatever) and turns it back on and starts again.
In a well functioning system, you should often get back the last reported successful write, and sometimes get back a write beyond the last reported write (two generals and all), but never a write before the last reported write. You can't use this testing to prove correct flushing, but if you run it for a week and it doesn't fail once, it's probably not lying.
I haven't evaluated the code, but here's a post from 2005 with a link to code that probably works for this. (Note: this doesn't include the PXE booting or the power control... This just covers what to write to the disk, how to report it to another machine, and how to check the results after a power cycle.)
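In case it helps, the pass/fail rule described above boils down to something like this (a sketch; the names are illustrative and not from the linked code):

    #include <stdio.h>

    /* last_acked: highest counter the machine under test reported as flushed
     *             before power was cut.
     * on_disk:    counter read back from the drive after power-up. */
    int check_cycle(unsigned long last_acked, unsigned long on_disk)
    {
        if (on_disk >= last_acked)
            return 0;    /* OK: equal, or a write whose ack was still racing back */
        fprintf(stderr, "FAIL: acked write lost (on_disk=%lu < last_acked=%lu)\n",
                on_disk, last_acked);
        return -1;
    }

    int main(void)
    {
        check_cycle(42, 42);   /* pass: exactly the last acked write        */
        check_cycle(42, 43);   /* pass: a write raced ahead of its ack      */
        check_cycle(42, 41);   /* fail: an acknowledged write has vanished  */
        return 0;
    }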
That kind of hotplug setup is more difficult (and sometimes slower) than STONITH-style devices which just kill power to the entire machine. The latter allow you to program the whole thing and run test cycle after test cycle where the device kills itself the moment it gets a successful flush.
The problem is you can't trust a model number of SSD. They change controllers, chips, etc after the reviews are out and they can start skimping on components.
This needs to be cracked down on from a consumer protection lens. Like, any product revision that could potentially produce a different behavior must have a discernable revision number published as part of the model number.
>Like, any product revision that could potentially produce a different behavior must have a discernable revision number published as part of the model number.
AFAIK samsung does this, but it doesn't really help anyone except enthusiasts because the packaging still says "980 PRO" in big bold letters, and the actual model number is something indecipherable like "MZ-V8P1T0B/AM". If this was a law they might even change the model number randomly for CYA/malicious compliance reasons. eg. firmware updated? new model number. DRAM changed, but it's the same spec? new model number. changed the supplier for the SMD capacitors? new model number. PCB etchant changed? new model number.
That's why the examples I listed are plausible reasons for changing the model number. For firmware, it's plausible that it warrants changing the model number because firmware can and does affect performance, as other comments have mentioned.
Also I really don't see this being something that judges will stop. You see other CYA behavior that has persisted for decades, eg. drug side effects (every possible symptom under the sun), or prop 65 warnings.
Doubtful. That already happens with the "known to the state of California to cause cancer" labelling on products sold in California. Some companies just put that on everything when they have no idea if it contains those chemicals or not.
No fine print. The first line is the same font size as the second line. You think that's going to help the average joe figure out whether it's been component swapped or not? Oh, by the way, "MZ-V8P6T0B/AM" isn't the model number from the last comment. I swapped one digit. Did you catch that? If you were already on the lookout for this sort of stuff, you'd already be checking the fine print at the back. This at best saves you 3 seconds of time. It also does nothing for the "randomly changing model numbers for trivial changes" problem mentioned earlier. In short, the proposed legislation would save 1% of enthusiasts 3 seconds when making a purchase.
Actually yeah, a random joe will be able to see that 980 isn't the whole deal, and if he does care, he might dig in. Most people don't even realise that this is a possibility.
>Actually yea, a random joe will be able to see that 980 isn't the whole deal, and if he does care, might dig in.
If that works that'll probably be because of the novelty factor. Once it wears off everyone will just tune out the meaningless jumble of letters and only look at much memorable marketing name. See also: prop 65 warnings.
The PC laptop manufacturers have worked around this for decades by selling so many different short-lived model numbers that you can rarely find information about the specific models for sale at a given moment.
This does mitigate the benefit. But it still provides solid ground for a trustworthy manufacturer to step in and break the trend.
Right now if a trustworthy manufacturer kept the same hardware for an extended period of time they would not be noticed, and no one could easily tell. Because many manufacturers are swapping components within the same model number, it is poisoning the well for everyone. If the law forced model number changes then you could see that there are 20 good reviews for this exact model number while all of the other drives only have reviews for different model numbers. All of a sudden that constant model number is a valuable differentiator for a careful consumer.
True. It’s the Gish Gallop of model numbering. Fortunately, it is the preserve of the crap brands. It’s sort of like seeing “in exchange for a free product, I wrote this honest and unbiased review”. Bam! Shitty product marker! Asus GL502V vs Asus GU762UV? Easy, neither. They’re clearly both shit or they wouldn’t try to hide in the herd.
I agree that long model numbers can be used to obscure a product, but a wide product line with small variations between products sensibly encoded in the model number isn't necessarily bad for the consumer. MikroTik switches and routers have long model numbers but segments of it are interpretable once you get to know them, and the model number is also the name of the product with a discoverable product page that describes its features in detail.
Don't throw the "meaningful long model numbers"-baby out with the "intentionally opaque model numbers"-bathwater.
>Shitty product marker! Asus GL502V vs Asus GU762UV? Easy, neither. They’re clearly both shit or they wouldn’t try to hide in the herd.
Is this based on empirical evidence or something? My sample size isn't very big, but I haven't really noticed any correlation between this practice and whether the product is crappy. I just chalked this up to manufacturers colluding with retailers to stop price matches, rather than because "clearly both shit or they wouldn’t try to hide in the herd".
Years ago, that kind of behavior got Dell crossed off my list of suppliers I'd work with for clients. We had to set up 30+ machines of the exact same model number, from the same order and set of pallets -- yet there were at least 7 DIFFERENT random configurations of chips on the motherboards & video cards -- all requiring different versions of the drivers. This was before the days of auto-setup drivers. Absolute flaming nightmare. It was basically random - the different chipsets didn't even follow serial number groupings, it was just hunting around versions for every damn box on the network. Dell's driver resources & tech support, even for VARs, were worthless.
This wasn't the first incident, but after such a blatant set of quality control failures I'll never intentionally select or work with a Dell product again.
What has changed is that most of the drivers are self-updating, so they can configure that at the factory. So, yes, the basic configuration has gotten better.
What has not changed is that Dell basically does grab-bag components in their systems and just because you have two machines of the same model# and rev#, does NOT mean that you have the same two machines.
This means that Dell has substandard abilities to 1) control their supply chain, 2) track and fix problems, 3) diagnose problems in the field, 4) prevent problems.
But sure, because Microsoft is now doing a far better job of auto-configuration in its OS products, the setup problem is mitigated.
I avoid Dell like the plague, and advise everyone else to do the same. Sure, you may know someone who got away with it for a long time — there's usually some good items in a grab-bag — but the systemic reliability is just not there.
Exactly. This was not the first such trustworthiness problem I saw with Dell, only the culmination of many over the years. So there is no reason to believe that Dell has changed one iota in this regard, and they should still be regarded as untrustworthy.
While I agree with the sentiment, even a firmware revision could cause a difference in behavior and it seems unreasonable to change the model number on every firmware release.
Er, no? Whataboutism is an attempt to claim hypocrisy by drawing in something else with the same flaw. This is pointing out a way for this exact proposal to fail.
Okay, I thought it was something along the lines of arguing against a proposal that is better but not perfect because "what about this edge case". I had a colleague who was a master at this craft and managed to get many good ideas shot down this way.
It seems unreasonable to me that there is unknown proprietary software running on my storage devices to begin with. This leads to insanity such as failure to report read errors or falsely reporting unwritten data as committed. This should be part of the operating system, not some obscure firmware hidden away in some chip.
It's complicated. Nowadays we have a shortage of electronic components and it's difficult to know what will not be available next month. So it's obvious that manufacturers have to make different variants of a product that can mount different components.
That complicates a lot of things. You are basically making a new product, with all the complications (and costs) associated with it. And where do you stop? What decides whether it has an impact or not? If I change the brand of a capacitor do I need a new model? Of course not. If I change the model of a switching controller? Well, it still shouldn't change anything, but it's less obvious. If I change another integrated circuit (i.e. a small SPI flash used to hold the firmware, or an i2c temperature sensor, or a supporting microcontroller)? Likely doesn't have much impact.
I don't want to live in a world where electronic components can't be commoditized because of fundamentally misinformed regulation.
There are alternatives to interchangeable parts, and none of them are good for consumers. And that is what you're talking about - the only reason for any part to supplant another in feature or performance or cost is if manufacturers can change them
"The Consumer Financial Protection Bureau (CFPB) is an agency of the United States government responsible for consumer protection in the financial sector. CFPB's jurisdiction includes banks, credit unions, securities firms, payday lenders, mortgage-servicing operations, foreclosure relief services, debt collectors, and other financial companies operating in the United States. "
It's definitely fraud. The only reason to hide the things they do is to mislead the customer as evidenced by previous cases of this that caused serious harm to consumers.
What do you expect? These companies are making toys for retail consumers. If you want devices that guarantee data integrity for life or death, or commercial applications, those exist, come with lengthy contracts, and cost 100-1000x more than the consumer grade stuff. Like I seriously have a hard time empathizing with someone who thinks they are entitled to anything other than a basic RMA if their $60 SSD loses data
There's a big difference in this depending on why the SSD lost the data.
If it was fraudulently declaring a lack of a write-back cache despite the observed lack of crash consistency, and not just innocent negligence in firmware development, that's far different from some genuine bug in the FTL messing up the data mapping.
Personally, I expect implementing specifications properly. That's it.
About "commercial applications", let's face it. Those "enterprise solutions" cost way more not because they are 10-1000x "better", but because they contain generous "bonuses" for senior staff.
Not really a problem when your computer has a large UPS built into it. Desktop macs, not so good.
But really, isn’t the point of a journaling file system to make sure it is consistent at one guaranteed point in time, not necessarily without incidental data loss?
From the filesystem's perspective looking at storage, the flush does indeed preserve the semantics. It is just that you can't rely on the contents of anything if the power goes out.
I don't have a clue how a journaling FS works. But any ordering violation should not be observable unless you have a power outage. Can you give an example of how a journaling FS could observe something that should not be observable?
The simplest answer is that the journal size isn't infinite, and not everything goes into the journal (like often actual file data). Therefore, stuff must be removed from the journal at some point. The filesystem only removes stuff from the journal once it has a clear message from the drive that the data that has been written elsewhere is safe and secure. If the drive lies about that, then the filesystem may overwrite part of the journal that it thinks is no longer needed, and the drive may write that journal-overwrite before it writes the long-term representation. That's how you get filesystem corruption.
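A toy sketch of that ordering (all functions are illustrative stubs, not any real filesystem's API; the point is where the flush barrier has to sit):

    /* Toy model of journal checkpointing. Stub functions only illustrate
     * ordering; no real I/O happens here. */
    #include <stdio.h>

    static void write_home_copies(void)   { puts("write data to its final locations"); }
    static void flush_drive(void)         { puts("FLUSH: wait for the drive to ack durability"); }
    static void reuse_journal_space(void) { puts("overwrite/free the journal records"); }

    static void checkpoint(void)
    {
        write_home_copies();
        flush_drive();          /* barrier: home copies must be durable first */
        reuse_journal_space();  /* safe only because of the barrier above; if the
                                   drive lied about the flush and power is lost
                                   now, both copies can be incomplete -> corruption */
    }

    int main(void) { checkpoint(); return 0; }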
No, during a crash or lockup, acknowledged writes are not lost. (Because the drive has acknowledged them, they are in the drive's internal queue and thus need no further action from the OS to be committed to durable storage.) Only power loss/power cycle causes this.
Why? During a crash or lockup acked writes still reached the drive. They will be flushed to the storage eventually by the SSD controller. As long as you have power that is.
The key word is ‘eventually’. How long? Seconds, or even minutes? If your machine locks up, you turn it off and on again. If the drive didn’t bother to flush its internal caches by then, that data is lost, just as in a power failure.
That would be a power failure. Kernel crash is not equivalent to that.
System reboot doesn't entail power failure either. The disks may be powered by an independent external enclosure (e.g. JBOD). All they see is that they stopped receiving commands for a short while.
It's at least 5 seconds on Apple SSDs. Not sure if they even lazily flush at all. They might rely on the OS to periodically issue flushes (I heard launchd does that every 30 seconds).
I measured the performance of different size direct I/Os on some Samsung SSDs, and the write performance when switching from one test to another was significantly affected by the previous test parameters, or not, depending on whether a "sleep 2" was inserted between each test.
The only explanation I can think of is that the flash reorganises or commits cached data during that 2 seconds.
If you were using consumer SSDs, you have multiple layers of caching to worry about: in the controller's SRAM, and in a portion of the flash that's operating as SLC. There are also power saving modes to consider; waiting long enough for the drive to drop to a sleep state means you'll incur a wakeup penalty on the next IO.
> Not really a problem when your computer has a large UPS built into it.
Actually it is (though a small one). To name some examples where it can still lose data without a full sync:
- OS crashes
- random hard reset, e.g. due to bit flips from cosmic radiation (happens), or someone putting their magnetic earphone case or similar on your laptop.
Also any application which cares about data integrity will do full syncs and in turn will get hit by a huge perf penalty.
I have no idea why people are so adamant about defending Apple in this case; it's pretty clear that they messed up, as performance with a full flush is just WAY too low, and this affects anything which uses full flushes, which any application should at least do on (auto-)save.
The point of a journaling file system is to make it less likely that the file system _itself_ is corrupted. Not that the files are not corrupted if they don't use full sync!
I had an NVMe controller randomly reset itself a few days ago. I think it was a heat issue. Not really sure though, may be that the motherboard is dodgy.
But the thread OP is about it not being a problem that SCSI-level flushes are super slow, which is only not a problem if you don't do them (e.g. only use fsync on Mac)?
But reading it again there might have been some confusion about what was meant.
Hard drive write caches are supposed to be battery-backed (i.e., internal to the drive) for exactly this reason. (Apparently the drives tested are not.) Data integrity should not be dependent on power supply (UPS or not) in any way; it's unnecessary coupling of failure domains (two different domains nonetheless -- availability vs. integrity).
The entire point of the FLUSH command is to flush caches that aren't battery backed.
Battery-backed drives are free to ignore such commands. Those that aren't need to honor them. That's the point.
Battery- or capacitor-backed enterprise drives are intended to give you more performance by allowing the drive and indeed the OS to elide flushes. They aren't supposed to give you more reliability if the drive and software are working properly. You can achieve identical reliability with software that properly issues flush requests, assuming your drive is honoring them as required by the NVMe spec.
You said caches should be battery backed, implying that it's wrong for them not to be. I'm saying FLUSH is what you use to maintain data integrity when caches are not battery backed, which is a perfectly valid use case. Modern drives are not expected to have battery backed caches; instead the software knows how to ask them to flush to preserve integrity. We've traded off performance to make up the integrity.
The problem is these drives don't provide integrity even when you explicitly ask them to.
As a systems engineer, I think we should be careful throwing words around like “should”. Maybe the data integrity isn’t something that’s guaranteed by a single piece of hardware but instead a cluster or a larger eventually consistent system?
There will always be trade-offs to any implementation. If you're just using your M.2 SSD to store games downloaded off Steam I doubt it really matters how well they flush data. However if your financial startup is using them without an understanding of the risks and how to mitigate them, then you may have a bad time.
The OS or application can always decide not to wait for an acknowledgement from the disk if it's not necessary for the application. The disk doesn't need to lie to the OS for the OS to provide that benefit.
Accidental drive pulls happen -- think JBODs and RAID. Ideally, if an operator pulls the wrong drive, and then shoves it back in in a short amount of time, you want to be able to recover from that without a full RAID rebuild. You can't do that correctly if the RAID's bookkeeping structures (e.g. write-intent bitmap) are not consistent with the rest of the data on the drive. (To be fair, in practice, an error arising in this case would likely be caught by RAID parity.)
Not saying UPS-based integrity solutions don't make sense, you are right it's a tradeoff. The issue to me is more device vendors misstating their devices' capabilities.
It doesn't need to, kernel panic alone does not cause acknowledged data not to be written to the drive.
UPS is not perfect though, it's better if your data integrity guarantees are valid independent of power supply. All that requires is that the drive doesn't lie.
> Not really a problem when your computer has a large UPS built into it.
Except that _one time_ you need to work until the battery fails to power the device, at 8%, because the battery's capacity is only 80%. Granted, this is only after a few years of regular use...
In Apple's defense, in the laptop form factor they probably have enough power even in the worst case to limp along long enough to flush, even if the power management components refuse to power the main CCXs. Speccing out enough caps in the desktop case would be very Apple as well.
Apple do not have PLP on their desktop machines (at least not the Mac Mini). I've tested over 5 seconds of written but not FLUSHed data loss, and confirmed via hypervisor tracing that macOS doesn't do anything when you yank power. It just dies.
You can design your FTL to be resilient to arbitrary power loss, as long as the NAND chips don't physically go off corrupting unrelated data on power down. That only requires extremely minimal capacitance. I believe Apple SSDs do have some of that in the NAND PMIC next to the storage chips themselves; it probably knows to detect falling voltage rails and trigger a stop of all writes to avoid any actual corruption due to out-of-spec voltages.
I've absolutely heard storage vendors talking about protecting just the FTL during power loss as PLP. You could have an FTL where any writes are atomic, but that gets in the way of throughput practically. The storage vendors don't seem to generally be on board that tradeoff except for 'industrial' branded SKUs that also make throughput tradeoffs anyway.
That's not necessarily true, it just comes down to the design. Apple seem to use a log-structured FTL which works great for performance and just requires a (very fast) log replay after a hard shutdown. You can see the syslog messages from the NVMe controller (via RTKit) talking about rebuilding the table when this happens.
Micron also talks about power-loss-resistant FTL design in their whitepaper, and although their older SSDs had caps, I think their recent ones mostly do away with them entirely.
The Micron whitepaper you cited talks about how their higher-tier strategy involves keeping enough capacitance around to write out the FTL, because its DRAM copy is allowed to get out of sync when the write cache is enabled.
> The hold-up circuitry also preserves enough time and energy to ensure that the FTL addressing table is properly saved to the NAND. This thorough amount of data protection not only ensures data integrity in unexpected power-loss events, but it also enables the system designer to leave the SSD’s write cache enabled, giving a significant advantage in data throughput speeds.
It's the other way around. The PMIC cuts the main system off at a certain voltage, and even in the worst case you have the extra watt-seconds to flush everything at that point.
I'm really hoping the battery has a low-voltage cutoff... I guess the question is: does the battery cut the power, or does the laptop? In the latter case, this may be "ok" for some definition of ok. The former, there's probably not enough juice to do anything.
> does the battery cut the power, or does the laptop?
Last time I checked (and I could very well be out of date on this), there wasn't really a difference. It wasn't like an 18650 where the cells themselves have protection, but a cohesive power management subsystem that managed the cells more or less directly. It had all the information to correctly make such choices (but, you know, could always have bugs, or it's a metric that was never a priority, etc.).
Batteries can technically be used until their voltage is 0 (it would be hard to get any useful current below ~1 volt for lithium cells, but still). The cutoff is either due to the BMS (battery management system) cutting off power to protect the cells from permanent damage or because the voltage is just too low to power the device (but in that case there's still voltage).
Running lithium cells under 2.7v leads to permanent damage. But, I'm sure laptops have a built-in safety margin and can cut off power to the system selectively. That's why you can still usually see some electronics powered (red battery light, flashing low battery on a screen, etc) even after you "run out" of battery.
I've never designed a laptop battery pack, but from my experience in battery packs in general, you always try to keep safety/sensing/logging electronics powered even after a low voltage cutoff.
Even in very cheap commodity packs that are device agnostic, the basic internal electronics of the battery itself always keep themselves powered even after cutting off any power output. Laptops have the advantage of a purpose built battery and BMS so they can have a very fine grained power management even at low voltages.
Batteries don't die instantly; you always have a few seconds where voltage is dropping fast and can take emergency action (shut down unnecessary power consumers, flush NVMe) before it's too late. Weak batteries die by having higher internal resistance; by lowering power consumption, you buy yourself time.
I don't know if macOS explicitly does this, but it does try to hibernate when the battery gets low, and that might be conservative enough to handle all normal cases.
That's why I gave the death at 8%. I imagine the OS controls the hibernation, so it probably won't think that is low enough to trigger hibernation, just a warning. My current laptop will die at 70%, and I have hibernation triggering at 80% just to be safe (it lasts about 10 minutes after being unplugged). From experience, when the power dies it just turns off. The OS has no idea the battery is about to die, and when rebooting it requires an fsck (or whatever it's called) to get back to normal. So I assume there is data loss.
Also, thanks for all you're doing with Linux on the new macs. I love your blog posts!
More complex systems are liable to create more complex problems...
I don't think you can get away from this - yes, you can solve a problem, but if you model problems as entropy, increasing complexity increases entropy.
It's like the messy room problem - you can clean your room (arguably ending up in a lower-entropy state), but unless you are exceedingly careful, doing so increases entropy overall. You merely move the mess to the garbage bin, expend extra heat, increase the consumption in your diet, possibly break your ankle, stress your muscles.
As a frame of reference, how much loss of FLUSH'd data should be expected on power loss for a semi-permanent storage device (including spinning-platter hard drives, if anyone still installs them in machines these days)?
I'm far more used to the mainframe space where the rule is "Expect no storage reliability; redundancy and checksums or you didn't want that data anyway" and even long-term data is often just stored in RAM (and then periodically cold-storage'd to tape). I've lost sight of what expected practice is for desktop / laptop stuff anymore.
The semantics of a FLUSH command (per NVMe spec) is that all previously sent write commands along with any internal metadata must be written to durable storage before returning success.
Basically the drive is saying "yup, it's all on NAND - not in some internal buffer. You can power off or whatever you want, nothing will be lost".
Some drives are doing work in response to that FLUSH but still lose data on power loss.
A flush command only guarantees, upon completion, that all writes COMPLETED prior to submission of the flush are non-volatile. Not all previously sent writes. NVMe base specification 2.0b section 7.1.
That's a very important distinction. You can't assume just because a write completed before the flush that it's actually durable. Only if it completed before you sent the flush.
I'm not very confident that software is actually getting this right all that often, although it probably is in this fsync test.
Nope. The reason this is so complex is that these devices are actually highly parallel machines with multiple queues accepting commands. It's quite difficult to even define "before" in terms of command sequence. For example, if you have a device with two hardware queues for submitting commands and a software thread for each, if you submit a flush on one queue, which commands on the other queue does it affect?
Or what if the device issues a pci write to the completion entry that passes a flush command being submitted on the wire?
I think the only interpretation that makes sense is from the perspective of a single software thread. If that particular thread has seen the completion via any mechanism and then that thread issues the flush, then you know the write is durable. Other than that, the device makes no promises.
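The same discipline at the OS/block layer looks roughly like this with io_uring (a sketch assuming Linux with liburing installed; the path is illustrative and most error handling is trimmed):

    /* Sketch: treat a write as covered by a flush only if its completion was
     * seen before the flush was submitted. Linux + liburing assumed. */
    #include <liburing.h>
    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        struct io_uring ring;
        struct io_uring_sqe *sqe;
        struct io_uring_cqe *cqe;
        static char buf[4096];
        memset(buf, 'x', sizeof(buf));

        int fd = open("testfile", O_WRONLY | O_CREAT, 0644);  /* illustrative path */
        io_uring_queue_init(8, &ring, 0);

        /* 1. Submit the write and WAIT for its completion... */
        sqe = io_uring_get_sqe(&ring);
        io_uring_prep_write(sqe, fd, buf, sizeof(buf), 0);
        io_uring_submit(&ring);
        io_uring_wait_cqe(&ring, &cqe);     /* check cqe->res in real code */
        io_uring_cqe_seen(&ring, cqe);

        /* 2. ...and only then submit the flush. A flush already in flight
         * when the write completed is not guaranteed to cover it. */
        sqe = io_uring_get_sqe(&ring);
        io_uring_prep_fsync(sqe, fd, IORING_FSYNC_DATASYNC);
        io_uring_submit(&ring);
        io_uring_wait_cqe(&ring, &cqe);
        io_uring_cqe_seen(&ring, cqe);

        io_uring_queue_exit(&ring);
        close(fd);
        return 0;
    }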
As far as I understand (which is little more than this twitter thread) the flush command should only return a success response once any data has been written to non-volatile storage.
If the storage still requires power after that point to maintain the data, that storage area is volatile, no?
So if the device has returned success (and I'm not going to claim that they've ensured that it was the device returning success and not the adapter, or that they even verified what the response was - those seem like valid questions) presumably the power wind-down should not be an issue?
That said, I presumed by "disconnect the cable" the test involved some extension cable from the motherboard straight to drive to make it easier to disconnect - would that therefore make it a valid test of the NVMe?
You understand incorrectly. Flush means data is in volatile DRAM in the device. That's how SSDs work.
Extension cable from motherboard would certainly make it invalid. These devices are not hot swap and may expect power hold up and sequencing from the supply.
You're wrong. Please read the NVMe spec, section 7.1:
> the Flush command shall commit data and metadata associated with the specified namespace(s) to non-volatile media
It has nothing to do with DRAM. The spec is explicit about what Flush does. The drive is not allowed to ack the flush until the data and drive metadata are on durable storage.
For a discussion on what happens when the drive has power loss protection, see 5.24.2.1 - in that case the write cache (aka DRAM) is considered non-volatile and Flush can be a no-op. None of the drives I tested fell into that category however.
I agree with all that. Thanks for your knowledge. I am not a storage or driver engineer so I apologize for imprecise comments. Let's take a peek at section 5.2x.x (depends on spec version):
"Note: If the controller is able to guarantee that data present in a write cache is written to non-volatile media on loss of power, then that write cache is considered non-volatile and this feature does not apply to that write cache"
You would likely be more in a position to raise the issue with OEMs but I believe they will claim they have power loss protection with heavy asterisks attached and that is why they ack your fcntl FLUSH while not actually flushing out of the [actually volatile] DRAM.
I don't know why a guarantee is not actually a guarantee and why there may actually be gradations of PLP. Sounds fishy but that is one possible explanation for what you are seeing. Do you know? Anyway if you want the real PLP guarantee you have to get a drive where the documentation specifies backup capacitors and preferably oscilloscope traces. Ultrastar drives have such info, for example.
Laptops, especially the likes of macOS machines with a T2 chip, in which all I/O goes through the T2, can do some clever things. They can essentially turn the underlying NVMe SSD into battery-backed storage. Even if the OS on the main CPU crashes and dies, the T2 chip with its own independent OS can ensure the SSD does a full flush before the battery runs out of power. Now, I don't know if Apple does this, but I sure hope they do. It would be great if they published the details so that even Linux on a MacBook can do this well.
SK Hynix Gold P31 2TB SHGP31-2000GM-2, FW 31060C20
Sabrent Rocket 512 (Phison PH-SBT-RKT-303 controller, no version or date codes listed)
I've ordered more drives and will report back once I have results:
Intel 670p
Samsung 980
WD Black SN750
WD Green SN350
Kingston NV1
Seagate Firecuda 530
Crucial P2
Crucial P5 Plus
These are just my results in my specific test configuration, done by me personally for fun in my own time. I may have made mistakes or the results might be invalid for reasons not yet known. No warranties expressed or implied.
Flush performance varies by 6x and is not necessarily correlated with overall perf or price. If you are doing lots of database writes or other workloads where durability matters don't just look at the random/sustained read/write performance!
High flush perf: Crucial P5 Plus (fastest) and WD Red
Despite being a relatively high end consumer drive the Seagate had really low flush performance. And despite being a budget drive the WD Green was really fast, almost as good as the WD Red in my test.
The SK Hynix drive had fast flush perf at times, then at other times it would slow down. But it sometimes lost flushed data so it doesn't matter much.
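If you want a rough feel for flush performance on your own drive, here's a minimal sketch (Linux shown; per the rest of this thread you'd use fcntl(fd, F_FULLFSYNC) instead of fsync() on macOS; numbers won't match raw NVMe FLUSH rates but track the same behaviour):

    /* Rough flush-rate micro-benchmark: count how many 1-byte write + fsync
     * cycles complete in ~10 seconds. File path is illustrative. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <time.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("flushbench.dat", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) { perror("open"); return 1; }

        struct timespec start, now;
        clock_gettime(CLOCK_MONOTONIC, &start);
        unsigned long flushes = 0;
        char byte = 'x';

        do {
            if (pwrite(fd, &byte, 1, 0) != 1) { perror("pwrite"); break; }
            if (fsync(fd) != 0)               { perror("fsync");  break; }
            flushes++;
            clock_gettime(CLOCK_MONOTONIC, &now);
        } while (now.tv_sec - start.tv_sec < 10);

        printf("%lu flushes in ~10s (%.1f per second)\n", flushes, flushes / 10.0);
        close(fd);
        return 0;
    }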
No surprise here. I've never encountered a redundancy feature in storage that worked. Power failure, drive controller failure, connection failure - and data is kaput. Regardless of what was promised.
How can it be so bleak? Can it be that nobody's data redundancy is real? Sure. If you don't test it, regularly, then by the hoary rules of computing it doesn't work.
How do these NVMe SSDs fare when setting the FUA or Force Unit Access bit for write through on Linux (O_DIRECT | O_DSYNC) instead of F_FULLFSYNC on macOS?
I imagine that different firmware machinery would be activated for FUA, and knowing whether FUA works properly would provide comfort to DB developers.
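I don't have numbers, but the Linux side of that test would look roughly like this (a sketch: whether the kernel turns this into FUA writes or write+flush depends on what the device advertises, and the file path, block size, and alignment are assumptions):

    /* Sketch of a write-through loop: O_DIRECT bypasses the page cache and
     * O_DSYNC makes each write durable before it returns (via FUA or an
     * implicit flush, depending on the device). */
    #define _GNU_SOURCE   /* for O_DIRECT */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("testfile", O_WRONLY | O_CREAT | O_DIRECT | O_DSYNC, 0644);
        if (fd < 0) { perror("open"); return 1; }

        void *buf;
        if (posix_memalign(&buf, 4096, 4096)) return 1;  /* O_DIRECT alignment */

        unsigned long counter = 0;
        for (;;) {
            memset(buf, 0, 4096);
            snprintf(buf, 4096, "lines=%lu\n", counter + 1);
            if (pwrite(fd, buf, 4096, 0) != 4096) break;
            counter++;   /* on a conformant drive this value is now durable */
        }
        printf("last write-through value: lines=%lu\n", counter);
        free(buf);
        close(fd);
        return 0;
    }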
Aren't there filesystem options that affect this? Wasn't there a whole controversy over ext4 and another filesystem not committing changes even after flush (under specific options/scenarios)?
The ext4 "issue" was in userspace. Certain software wasn't calling fsync() and had assumed the data was written anyway, because ext3 was more forgiving.
IIRC the "solution" was to give ext4 the ext3 semantics, i.e. not insist that every broken userspace program needed to be fixed.
This is an old topic, but I disagree that any program that doesn't grind your disk into dust is broken. Consistency without durability is a useful choice.
It is, but those programs are definitely "broken" if they expect any sort of durability guarantees, which was the problem with the programs that ran into problems with early versions of ext4.
I.e. it's fine to open(), write() and close() without an appropriate fsync() (note that you'll also need to fsync() the relevant directory entry or entries, which most people get wrong).
It's not fine to do so and come complaining to the OS maintainers when the kernel lost your data, you didn't tell it you wanted the data flushed to disk, and you didn't wait around for that to happen until you reported success to the user.
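For reference, the full dance being described looks roughly like this (a sketch; paths are illustrative, and on macOS you'd want F_FULLFSYNC instead of plain fsync() per the rest of this thread):

    /* Durably create a file: fsync the file contents AND the containing
     * directory, since the new directory entry lives in the directory's data. */
    #define _GNU_SOURCE   /* for O_DIRECTORY on older toolchains */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    static int durable_create(const char *dirpath, const char *filepath,
                              const char *data, size_t len)
    {
        int fd = open(filepath, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) return -1;
        if (write(fd, data, len) != (ssize_t)len) { close(fd); return -1; }
        if (fsync(fd) < 0) { close(fd); return -1; }     /* file contents */
        close(fd);

        int dfd = open(dirpath, O_RDONLY | O_DIRECTORY);
        if (dfd < 0) return -1;
        if (fsync(dfd) < 0) { close(dfd); return -1; }   /* directory entry */
        close(dfd);
        return 0;
    }

    int main(void) { return durable_create(".", "./new.txt", "hello\n", 6) ? 1 : 0; }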
If you don't want to deal with any of that there's a widely supported way to get around it: Just mount your filesystems with the "sync" option, or run sync(1) after your program finishes, but before reporting success.
You'll grind your "disks into dust", but you'll have data safety, sans any HW or kernel issues being discussed in this thread. But hey, consumer hardware sucks. News at 11! :)
All that being said we live in the real world. IIRC the ext4 issue was that the delay for the implicit sync was changed from single-digit to double-digit seconds.
So people were experiencing data loss due to API (mis)use that they were getting away with before. After the "do it like ext3" change they might still be, it's just that the implicit sync window was narrowed again.
There's simply no way to avoid caring about any of this while still having some items from the "performance" column as well as the "data safety" column.
All of modern computing is structured around these trade-offs, even SSD I/O is glacially slow compared to the CPU throughput.
You need to juggle performance and data safety to get any reasonable I/O throughput, just like you can't have a reasonably performing OS without something like the OOM killer, "swap of death" or similar.
You can always opt-out of it, but it means mounting your FS with sync, replacing your performant kernel with a glacially slow (but guaranteed to be predictable) RTOS etc.
He is trusting consumer grade devices which don't have power loss protection by design. That is a "feature" for enterprise devices so they can increase the price for datacenter usage.
This has nothing to do with PLP. If the drive reports PLP then Flush is allowed to be a no-op because all acknowledged writes are durable by design - the OS need only wait for the data write and FS metadata writes to complete without needing to issue a special IO command. This is covered in 5.24.1.4 in the NVMe spec 2.0b
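For what it's worth, you can check which case a given drive claims to be in by reading the Volatile Write Cache bit from Identify Controller (byte 525, bit 0). A sketch, assuming Linux, root, and a /dev/nvme0 controller node:

    /* Sketch: issue Identify Controller (admin opcode 0x06, CNS=1) and
     * report whether the controller claims a volatile write cache. If it
     * doesn't, Flush may legitimately be a no-op. */
    #include <fcntl.h>
    #include <linux/nvme_ioctl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <unistd.h>

    int main(void)
    {
        unsigned char id[4096];
        int fd = open("/dev/nvme0", O_RDONLY);   /* controller character device */
        if (fd < 0) { perror("open"); return 1; }

        struct nvme_admin_cmd cmd;
        memset(&cmd, 0, sizeof(cmd));
        cmd.opcode   = 0x06;                              /* Identify */
        cmd.cdw10    = 1;                                 /* CNS=1: controller */
        cmd.addr     = (unsigned long long)(uintptr_t)id;
        cmd.data_len = sizeof(id);

        if (ioctl(fd, NVME_IOCTL_ADMIN_CMD, &cmd) != 0) { perror("identify"); return 1; }

        printf("volatile write cache present: %s\n",
               (id[525] & 1) ? "yes (Flush must do real work)"
                             : "no (Flush may be a no-op)");
        close(fd);
        return 0;
    }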
He is trusting that drives are conformant to their specs. This is an issue of non-conformance that increases marketable performance at the cost of data security. PLP is great, but in lieu of that the drives should be honest about the state of writes. How can you trust your data will be there after an ACPI shut down?
Quote: "Then I manually yanked the cable. Boom, data gone."
He never said what cable he yanked. If he did it to the internal one that goes from the PSU to the drive then that's a very niche test, not relevant. But if he did it to the one from the wall socket to the PC then yeah, that's a good test.
Regarding this, first of all Raymond Chen warned us more than a decade ago that vendors lie all the time about their hardware capabilities. He had one test on exactly this flushing behavior, where the HDD driver (written by the manufacturer, of course) was always returning S_OK regardless. Do note that this was in a time when HDDs were common, not SSDs.
Secondly, I always buy a PSU rated for more than double the system's power requirement. If, let's say, the system has a 500W requirement, my PSU for that system will be at least 1200W. That will definitely have big capacitors to keep your system alive for a couple of seconds after the power goes off. Those 2 seconds might seem small to us, but the drives, lying as they do to the OS, will still correctly flush pending data. I've never experienced data loss going this route.
I'm confident this is a means to cheat performance benchmarks, because some of those will run tests using those flushes to try and get 'real' hardware performance instead of buffered/cached performance.
I wonder if a small battery or capacitor on these devices would work to avoid data loss.
Surprised this is being rediscovered. Years ago only "enterprise" or "data center" SSDs had the supercap necessary to provide power for the controller to finish pending writes. Did we ever expect consumer SSDs to not lose writes on power fail?
As I said in the recent Apple discussion, pretty much all drives are lying and have been for decades at this point. The good brands just spec out enough capacitance that you don't see the difference externally.
The more you look at speculative execution and these drive issues, the more you see that we're giving up a lot of what makes computing "safe" just for performance.
So if SSDs rely solely on capacitors for data integrity and lie about flushes, what do they do on a flush that takes any amount of time? Are they just taking a speed hit for funsies? Heck, from this test, the magnitude of the speed hit isn't even correlated with whether they lose writes...
At one point it was different barriers on the different submission queues inside the drive. Not externally visible queues, but between internal implementation layers.
It's been a few years since I've checked up on this and it was for the most part pre SSDs though.
Never said they were, and you don't need several seconds of power buffer for this purpose. They might just as well serve a different purpose; it's just an observation. Either explain why you definitely need farad-range capacitance for this or don't comment. They are rated at 16V, so take that into account in your explanation.
This is not about pulling power during writes. Flush is supposed to force all non-committed (i.e. cached) writes to complete. Once that has been acknowledged there is no need for any further writes. So those drives are effectively lying about having completed the flush. I also have to wonder when they intended to write the data...
I’m actually interested in testing this scenario, a drive getting power loss. Is there a thing which will cut power to a server device on command? Or do you just pull the drive out of its bay?
If your server has IPMI or similar, you can use that to cut power (not to the BMC though), otherwise, network PDUs or many UPSes or consumer level networked outlets or something fun with a relay (be careful with mains voltages).
Pulling the drive is also worth testing though; you might get different results. Requires more human involvement though.
> The models that never lost data: Samsung 970 EVO Pro 2TB and WD Red SN700 1TB.
The others would probably be SK Hynix and Micron/Crucial, right? Curious why he's reluctant to name and shame. A drive not conforming to requirements and losing data is a legitimate problem that should be a "thing"!
Crucial seems plausible, but there's a surprising number of US brands for NVMe SSDs. I was able to find: Crucial, Corsair, Seagate, Plextor, Sandisk, Intel, Kingston, Mushkin, PNY, Patriot Memory, and VisionTek.
Micron/Crucial is the 3rd largest manufacturer of flash memory, most of the other brands in your list just make the PCB and whitelabel other flash chips and controllers (perhaps with some firmware customization, but they're usually not responsible for implementing features like FLUSH).
Toshiba/Kioxia is another big one, but they're based in Japan. The US brand could be Intel instead of Crucial, I suppose.
Looks like he works at Apple. Maybe what he's testing is work related or covered by some sort of NDA (e.g. doesn't want to risk harming supplier relations for the brands misbehaving)
I thought Crucial specifically designed some power loss protection as a differentiating selling point? Well at least that was the reason why I bought one back in M.2 days (gosh my PC is ancient...)
I think the most they ever promised for some of their consumer drives was that a write interrupted by a power failure would not cause corruption of data that was already at rest. Such a failure can be possible when storing multiple bits per memory cell but programming the cells in multiple passes, especially if later passes are writing data from an entirely different host write command.
It changed [1] from capacitors and "power loss protection" to something else described as "power loss immunity" with the MX500. I don't think I've ever seen it explained very well.
> With the release of the MX500, Crucial has included a new replacement for the traditional power loss protection feature, power loss immunity. Instead of relying on a bank of capacitors for power loss protection, Crucial was able to work the new 3D TLC NAND and the code to allow for more efficient NAND programming so that the capacitors are no longer needed.
That's just a regurgitated press release IMO.
A lot of consumer drives also stopped reporting DRAT/RZAT [2] around the Crucial MX500, Samsung 850 timeframe. They swap internals as others in this thread have pointed out and the write endurance has dropped since reviewers stopped reporting on it. I have a Crucial MX500 in my system right now with 11% life remaining and only 37TBW even though it's advertised as having 180TBW of endurance.
Edit: I actually found [3] an explanation of "power loss immunity".
> The impact is still the same: you don't get the full protection that is standard for enterprise SSDs, but data that has already been written to the flash will not be corrupted if the drive loses power while writing a second pass of more data to the same cells.
I always thought write operations on SSDs were more or less to write a new page or block or whatever the terminology is and to flag the old one(s) for garbage collection. I don't understand how it would be possible to lose the old data by doing that. Did they just invent a term that sounds like power loss protection, but doesn't actually do anything special?
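For reference, here's roughly the model I have in my head: a toy page-mapped sketch in C, not any real controller's FTL.

    #include <string.h>

    #define N_LOGICAL  16
    #define N_PHYSICAL 64
    #define PAGE_SIZE  4096

    static unsigned char flash[N_PHYSICAL][PAGE_SIZE]; /* stand-in for raw NAND */
    static int l2p[N_LOGICAL];           /* logical block -> physical page map  */
    static int page_state[N_PHYSICAL];   /* 0 = free, 1 = live, 2 = stale (GC)  */
    static int next_free = 0;

    void ftl_init(void) {
        memset(l2p, -1, sizeof l2p);     /* no logical block is mapped yet */
    }

    /* Remap-on-write: new data always lands in a fresh page, the map is
     * updated, and only then is the old page marked stale for GC. In this
     * model a write can never hurt data already at rest. The catch, per
     * the comments above, is that real TLC NAND may program the same
     * physical cells in multiple passes, so an interrupted later pass can
     * damage bits written earlier -- apparently that's the gap "power
     * loss immunity" is meant to close. */
    void write_logical(int lba, const unsigned char *data) {
        int new_page = next_free++;               /* naive allocator, no GC */
        memcpy(flash[new_page], data, PAGE_SIZE); /* "program" the new page */
        page_state[new_page] = 1;                 /* new copy is live       */
        if (l2p[lba] >= 0)
            page_state[l2p[lba]] = 2;             /* old copy becomes stale */
        l2p[lba] = new_page;                      /* remap                  */
    }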
In my opinion, as a consumer, this is up to you. If you need this, get a UPS battery backup (or a laptop which has its own battery). Or, you can get a super specialized SSD. Ultimately though, most consumer SSDs DON’T need this feature. And if they did include it by default, it would likely be environmentally questionable for a feature most people will never use (because most consumer SSDs these days go into laptops with their own batteries).
You didn't understand the issue. It's not that these drives lose data with sudden power loss. It's that you tell the drive "please write all data that is currently in your write cache to persistent storage now" and then the drive says "ok I'm all done, your data is safe" and then when you cut power AFTER this, your data is still sometimes gone. This has nothing to do with any batteries, or complicated technology. It just means make your drive not lie.
Correct me if I'm wrong, but if these drives are used for consumer applications, this behavior is probably not a big deal? If you made changes to a document, pressed control-S, and then 1 second later the power went out, you might lose that last save. That'd suck, but you would have lost the data anyway if the power loss had occurred 2 seconds earlier, so it's not that bad. As long as other properties weren't violated (e.g. ordering), your data should mostly be okay, aside from that 1 second of writes. It's a much bigger issue for enterprise applications, e.g. a bank's mainframe responsible for processing transactions told a client that a transaction went through, but a power loss occurred and the transaction was lost.
Modern SSDs, and especially NVMe drives, have extensive logic for reordering both reads and writes, which is part of why they perform best at high queue depths. So it's not just possible but expected that the drive will be reordering the queue. Also, as batteries age, it becomes quite common to lose power without warning while on a battery.
In general it's strange to hear excuses for this behavior since it's obviously an attempt to pass off the drive's performance as better than it really is by violating design constraints that are basic building blocks of data integrity.
>Modern SSDs, and especially NVMe drives, have extensive logic for reordering both reads and writes, which is part of why they perform best at high queue depths. So it's not just possible but expected that the drive will be reordering the queue.
If we're already in speculation territory, I'll further speculate that it's not hard to have some sort of WAL mechanism to ensure the writes appear in order. That way you can lie to the software that the writes made it to persistent memory, but still have consistent ordering when there's a crash.
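Something like this is what I'm picturing -- a toy C sketch of the general WAL idea, not a claim about how any actual firmware works:

    #include <stddef.h>
    #include <stdint.h>

    struct wal_record {
        uint64_t seq;            /* monotonically increasing sequence number */
        uint32_t len;            /* payload length                           */
        uint32_t checksum;       /* covers the payload                       */
        uint8_t  payload[4096];
    };

    static uint32_t fletcher32(const uint8_t *p, size_t n) {
        uint32_t a = 0, b = 0;
        for (size_t i = 0; i < n; i++) {
            a = (a + p[i]) % 65535;
            b = (b + a) % 65535;
        }
        return (b << 16) | a;
    }

    /* Recovery: a record counts only if its sequence number is the next
     * one expected and its checksum matches. Stop at the first gap or
     * torn record, so whatever gets replayed is a strict prefix of what
     * was acknowledged -- the tail may be lost, but write N+1 is never
     * applied without write N. */
    size_t wal_replay(const struct wal_record *log, size_t count) {
        uint64_t expect = 1;
        for (size_t i = 0; i < count; i++) {
            if (log[i].len > sizeof log[i].payload)
                break;
            if (log[i].seq != expect ||
                fletcher32(log[i].payload, log[i].len) != log[i].checksum)
                break;
            /* apply(&log[i]) would go here */
            expect++;
        }
        return (size_t)(expect - 1);  /* number of records safely applied */
    }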
>Also, as batteries age, it becomes quite common to lose power without warning while on a battery.
That's... totally consistent with my comment? If you're going for hours without saving and only save when the OS tells you there's 3% battery left, then you're already playing fast and loose with your data. Like you said yourself, it's common for old laptops to lose power without warning, so waiting until there's a warning to save is just asking for trouble. Play stupid games, win stupid prizes. Of course it doesn't excuse their behavior, but I'm just pointing out that for the typical consumer, the actual impact isn't as bad as people think.
> As long as other properties weren't violated (eg. ordering), your data should mostly be okay, aside from that 1s of data.
That's the thing though: ordering isn't guaranteed as far as I remember. If you want ordering, you do syncs/flushes, and if the drive isn't respecting those, then ordering is out the window. That means FS corruption and such. Not good.
The tweet only mentioned data loss when you yanked the power cable. That doesn't say anything about whether the ordering is preserved. It's possible to have a drive that lies about data written to persistent storage, but still keeps the writes in order.
> If you made changes to a document, pressed control-S, and then 1 second later the power went out, then you might lose that last save.
If you made changes to a document, pressed control-S, and then 1 second later the power went out, then the entire filesystem might become corrupted and you lose all data.
Keep in mind that small writes happen a lot -- a lot a lot. Every time you click a link in a web page it will hit cookies, update your browser history, etc., all of which triggers writes to the filesystem. If one of those writes modifies the superblock, a FLUSH is ignored while the superblock is in a temporarily invalid state, and the power goes out, you may completely hose your OS.
Nope, the problem here is that it violates a very basic ordering guarantee that all kinds of applications build on top of. Consider all the setups with these hybrid drives, or just multiple drives, where you fsync on one to journal that you did something on the other (e.g. Steam storing actual games on another drive).
This behavior will cause all kinds of weird data inconsistencies in super subtle ways.
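Concretely, the cross-drive pattern looks something like this (a minimal C sketch; the paths and the install_step() name are made up):

    #include <fcntl.h>
    #include <unistd.h>

    /* The journal entry on drive A must not become durable before the
     * payload on drive B, and fsync is the only barrier the application
     * has. If either drive acknowledges the flush without honoring it,
     * the two drives can tell different stories after a power cut. */
    int install_step(void) {
        /* 1. write the payload on drive B and make it durable */
        int data_fd = open("/mnt/driveB/game/asset.pak",
                           O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (data_fd < 0) return -1;
        if (write(data_fd, "...payload...", 13) != 13) return -1;
        if (fsync(data_fd) < 0) return -1;              /* barrier #1 */
        close(data_fd);

        /* 2. only then record "asset.pak is complete" on drive A */
        int log_fd = open("/mnt/driveA/install-journal.log",
                          O_WRONLY | O_CREAT | O_APPEND, 0644);
        if (log_fd < 0) return -1;
        if (write(log_fd, "asset.pak done\n", 15) != 15) return -1;
        if (fsync(log_fd) < 0) return -1;               /* barrier #2 */
        close(log_fd);
        return 0;
    }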
> As long as other properties weren't violated (eg. ordering)
That is primarily what fsync is used to ensure. (SCSI provides other means of ensuring ordering, but AFAIK they're not widely implemented.)
EDIT: per your other reply, yes, it's possible the drives maintain ordering of FLUSHed writes, but not durability. I'm curious to see that tested as well. (Still an integrity issue for any system involving more than just one single drive though.)
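A test along those lines seems doable: keep appending fsync'd sequence numbers and report each acknowledged one out-of-band, then after the cut compare the highest acknowledged number against what's actually on disk. A missing number below the on-disk maximum would also show reordering. A rough C sketch, assuming you have an external way to cut power and a made-up mount point for the drive under test:

    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        /* hypothetical mount point for the drive under test */
        int fd = open("/mnt/testdrive/seq.log",
                      O_WRONLY | O_CREAT | O_APPEND, 0644);
        if (fd < 0) { perror("open"); return 1; }

        for (uint64_t seq = 1; ; seq++) {
            char buf[32];
            int n = snprintf(buf, sizeof buf, "%llu\n", (unsigned long long)seq);
            if (write(fd, buf, n) != n) { perror("write"); return 1; }
            if (fsync(fd) < 0)          { perror("fsync"); return 1; }

            /* Only after fsync returns is seq considered "acknowledged".
             * Report it out-of-band (serial console, ssh, a second
             * machine) so the record of acknowledgements survives the
             * power cut. */
            printf("acked %llu\n", (unsigned long long)seq);
            fflush(stdout);
        }
    }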
> That'd suck, but you would have lost the data anyways if the power loss occurred 2s before,
But if you knew power was failing, which is why you did the ^S in the first place, it wouldn't just suck; it would be worse than that, because your expectations were shattered.
It's all fine and good to have the computers lie to you about what they're doing, especially if you're in on the gag.
But when you're not, it makes the already confounding and exasperating computing experience just that much worse.
Go back to floppies; at least you know the data is saved when the disk stops spinning.
>But if you knew power was failing, which is why you did the ^S in the first place, it wouldn't just suck; it would be worse than that, because your expectations were shattered.
The only situation I can think of where this is applicable is a laptop running low on battery. Even then, my guess is that there's enough variance in battery chemistry/operating conditions that you're already playing fast and loose with your data if you're saving when there are only a few seconds of battery left. I agree that having it not lose data is objectively better than having it lose data, but that's why I characterized it as "not a big deal".
Contractor: "Hi, we need to kill the power to the house now."
Me: "Oh, ok, let me shut down my computer."
And everything I've been reading lately says there's simply nothing safe about this. How is shutting down a computer ever safe now? How long do we have to wait to ensure our data is flushed correctly, by everything?
>Contractor: "Hi, we need to kill the power to the house now."
>Me: "Oh, ok, let me shut down my computer."
And they're killing the power 0.1 seconds after you told them the computer was shut down? If the drive only loses the last 2 or 3 seconds of writes, you'll be fine.
>And everything I've been reading lately says there's simply nothing safe about this. How is shutting down a computer ever safe now? How long do we have to wait to ensure our data is flushed correctly, by everything?
That's a good point. Maybe the drives still get some residual power from the motherboard even when the computer is "off", and that's enough to finish the writes?