Back in the mid-90s I was working in IT at a midsized company. We used Compaq desktops (this is the Pentium Pro timeframe). The HDs in the systems were Quantum Fireballs (IIRC). They failed at an amazing rate: 30%, all in the first 3 months. We called Compaq at one point and said, look, we want replacement drives for all of our remaining systems. After much back and forth (stupid on their part, as we had 1000+ machines across their complete line, including servers, so a good customer) they refused to help, saying “we will only replace them as they fail.” Fine, as you wish. A week later we called into support and reported the failure of 50 HDs. They did not believe us. Sent 3 technicians. After they tested the 20th HD the look on their faces was priceless. At some point one of the techs noticed the taser on one of the other workbenches in IT. An hour later we received a call from the sales rep telling us that they would replace all the current drives in every system with this model as long as there were no more “mass failures.”
Had a similar experience at a previous job. I'd debugged a problem with one system down to a flaky drive, which I extracted and took to our sysadmin. Without hesitation, he slammed it down on the floor.
"It's not flaky now."
The thing is, I knew a lot more than he did about how hard drives work, but he knew a lot more about how vendors work. A flaky drive might be hard to replace. A dead one wouldn't. Lesson learned.
My boss at a drug store did the same thing to a moderately damaged case of product X (I no longer remember what). He dropped it a few times to make sure it was unequivocally damaged and called the vendor.
I recall a similar experience with a particularly poor-performing $10k+ 3D graphics card for CAD/CAM back in the 90s. It had gone back and forth multiple times for crashing the system, locking up and just plain chugging along relative to the other cards we purchased. The problem was definitely the card, but the vendor claimed there was nothing wrong. Every time it was sent back to us there were more wire traces added to the card and other rework done with no explanation.
After running a pair of probes hooked up to 110 VAC along the fingers of the VESA bus connector on the video card (exploding a few microprocessor chips in the process), the scorch marks were cleaned up and it was sent back one more time. We finally did get a new card that worked perfectly, along with a long, vaguely threatening letter.
IIUC it's the same as in other contexts: the subtext being that, in the face of their refusal to replace these doomed HDs before a failure had registered, the failures were made to happen prematurely with a gentle stream of electrons.
To the other comments: we were a good customer that experienced an amazingly, stupidly high failure rate in a part that others also reported the same issue with. The cost of deploying these systems and having them fail randomly broke planning, wasted the team's time and impacted our IT cost metrics. The cost of the support ticket, the time, and replacing and re-imaging a system was not free. We of course tried very hard to be preemptive in sorting this once we ID'd the issue. The vendor did not care, so another path was taken in order to help express our unhappiness to the vendor, which resolved the issue. We paid additional money for a brand name vs. a white box with the idea that we would prevent exactly this type of failure.
We've had to fry our failed HDDs because otherwise the vendor would replace them with the ones we'd returned previously. Those would then not survive a RAID rebuild and the cycle would repeat.
IANAL, but Wikipedia says that you just need some sort of gain, not necessarily financial benefit. Warranty service entitles you to a replacement only if it's broken, not when you feel like it. In that respect you are gaining, since you're potentially getting a replacement you're not entitled to.
I would assume it had a purpose similar to an EtherKiller [0]. Most experienced shop techs have at least a couple of similar tools to assist with recalcitrant RMA departments.
Unless it's a Creative Labs product. You'll never RMA one of those. Ever.
In the early 2000s I 'successfully' RMAd a Creative Labs Nomad Muvo MP3 player. The replacement was a different color, capacity, and had someone else's music on it.
Hmm, connecting directly to 110 V is both dangerous, as the current is not limited in amperage (except by the fuse, which is rated way too high for this purpose), and in some cases not enough (e.g. some Ethernet gear has 1.5 kV protection).
It would be better to slowly raise the voltage until it blows, leading to minimal pyrotechnics (smoke/burning) on the components.
Sibling comments have answered this correctly, but I like to imagine it was there as a threat to the Compaq visitors, like one might leave a loaded gun in a visible spot to facilitate negotiation.
In these parts we threaten to withhold biscuits from meetings. Sometimes we really mean it. Escalating straight to firearms sounds a little bit excessive.
You don't throw them. It's the threat of having to drink your tea (yes, coffee too) without a biccy that does the damage. Withholding tea is unthinkable.
Good reference. My first coding teacher made us read all the early BOFH stories while learning Unix and Java. Really brought that shit to life, thanks Jonathan.
Read the same and could not believe that the author calls themselves a "good customer" while meanwhile also intentionally failing hard drives with a taser. Some mental gymnastics there ;)
It is an amusing story from a completely different time and place in IT. Today people would just loudly and publicly complain on Twitter and other social media to embarrass the vendor into sorting it. Sharing a story, not speaking to the nature or the correctness of the solution. I have worked on both sides. Once, when I was on the vendor side, a very large customer returned a very expensive switch as an RMA. A set of 16 ports had stopped working. The cause, in the end: the customer had gotten an SFP stuck and tried to rip it out, pulling the cage from its solder joints on the board. 3 switches in 2 months. We smiled, swapped all of them, and then the sales team had a chat with the customer to help resolve the issue (their 3rd-party SFPs with a crappily designed latch). Is what it is.
Ours was Western Digital. I worked at a VAR before the turn of the century and there was a rumored clean room problem at WD. We had over a hundred drives that failed but they flat refused to acknowledge the problem.
Ended up having to kill the drives on a dedicated rig and send them back to keep our customers from having issues. They knew they were bad but refused to accept them back prior to failure. For a while they were returning a failing drive on the RMA too.
Yep. We had maybe 60 of these. I think 2GB, but can’t remember for sure. We were all Compaq desktops and servers (Toshiba laptops for sales) but much smaller so no clout.
We bought some drives to rotate while we waited for RMAs, but the ones we got back failed too, so we just ended up keeping a stack of drives ready to swap.
They sat next to the spare dongles for the PCMCIA Ethernet cards, which the sales folks would inevitably break by tilting the laptop onto them to get at a piece of paper on their desk.
That was a fun time being a small IT shop playing with WinNT and early Linux. We did everything from writing internal apps to running Ethernet.
3 guys in a glorified closet using post-its as our ticketing/user story system. Very IT Crowd.
"In the case of the Fireball SE, the mounting of the [power] socket was so flimsy that it was all too easy to break the PCB and thus ruin the drive. (With most other drives, you would have to be a gorilla and a drunken one at that.)"
I worked tech support for a large desktop line. This was around '96, in Pentium 133 days. We had a large batch of HDs that would fail. Funny thing is, a nice slap on the side of the tower would get them working again. Many times the customer would be satisfied with that solution, but I had to keep them on the line so I could get them a replacement. Can't remember which manufacturer the HDs were, but rumor was that a truck broke down in the desert and the heat degraded the grease inside. So they would work for a while, but then lock up if the drive stayed off for a while.
I remember a time when Compaq used the Quantum Bigfoot (a 5.25" hard drive) in their workstations. They were noisy and slow. They were also mounted upside down. The failure rate was very high.
A massive drive. I worked in computer retail at the time, and I lost count of how many customers brought their Compaq Presarios in, DOA, with one of those drives inside.
We'd sell them a new hard drive off the shelf, install it and send them on their way (they were out of warranty, of course).
In 2012 I was working at the Internet Archive. I remember the 2011 Thailand floods restricted the supply of hard disks and pushed up prices.
The best deal we could find was buying these Seagate hard disks as external drives at retail from Costco. We picked up a truckload. Then we had somebody with a screwdriver remove them all from the enclosures and install them in new servers.
The error rate was high, the drives didn't last long in production.
I did precisely this too, as I needed storage for a few thousand hours of video footage I’d shot for a documentary, needed it now, and nobody had anything other than external HDDs in any kind of useful size.
I was really careful - I had two NASes, each with RAID 5 volumes holding a full copy of the footage, as it was irreplaceable.
I got up one morning to find a disk had failed in one of them. Swapped it. During the restripe, another failed. Oh well, I have the other NAS, I thought, and after replacing the busted disks and reformatting the failed NAS, I started copying the data over again - and then a disk failed in the backup NAS. And then another. And then another in the previously failed one. And that was that. Five disks, all dead, within a day of each other, all the same model. The WD drives in them were all absolutely fine.
I haven’t bought a seagate drive since, and never will again.
That's the reason I never put the same brands/models in RAID. It is logistically a bit more complicated, but I've often noticed failures like that really close in time.
Yeah, copying sick HDs always makes me nervous. It is like taking your really old, scrappy car on one last cross-country ride to the salvage yard.
Hah, I remember a pallet of the now-empty IA enclosures showing up at Noisebridge. They were super useful - each one effectively a USB3-SATA adapter. I still have mine for occasional data spelunking.
I don't know if Backblaze's usage predates it becoming the common way to describe this practice, but it's definitely caught on. For example see any deal post on an external drive on reddit /r/datahoarder
I worked at an ISP in 1996, where we would buy external Boca 28.8k modems in bulk from retailers and take the boards out to put them in our racks, because they were cheaper than - yet identical to - the "ISP-grade" versions that Boca sold as bare boards.
Back then we referred to it as shucking the modems, so it's definitely not a new term in IT/computing (never mind the actual origin of the word).
The plausibility of a conspiracy theory diminishes as the number of people necessary to keep the secret increases.
How many people work in Thailand's hard drive manufacturing plants? Hundreds? Thousands? How many people in the logistics channels? How many people are vending food to these employees? How many people in the US Government? Oh, and pulling this off would also require co-ordination with Thai Government. That's hundreds of people right there.
At the time it was kind of an open secret in the industry. The floods were a convenient explanation, but a very large order from a client with a confidentiality clause (very common in the industry) was widely known to be the main reason for the short supply. It is possible that this client was someone other than the NAS, er... NSA, but it is not very likely, and at the time it was conventional wisdom among insiders that the "big spin" was the cause. There were even jokes about how all those spinning drives had to be oriented so that the precession wouldn't slow down the earth's rotation, etc.
Just because a customer orders a few gazillion of product X and wants to keep it quiet doesn't make it some kind of nutty conspiracy theory. It happens all the time. The Thailand floods drew down production even more, but without the giant order, there wouldn't have been the same impact.
Fun fact: any time two or more people want to keep something on the DL, it's a conspiracy. Nearly everything that an intelligence agency does is, de facto, a conspiracy.
One of my favorite memes (in the sense of memetics) is the meme "conspiracy theorist". This phrase was weaponized by the CIA as a memetic self-defense mechanism. (Their own documents admit this!) But if you tell someone this, the meme itself is triggered in their mind, and their instinct is to not believe you, because you sound like a conspiracy theorist.
>This phrase was weaponized by the CIA as a memetic self-defense mechanism. (Their own documents admit this!)
That seems... totally expected? The problem comes when you're trying to use it as proof of something. Some conspiracy theories do turn out to be true, but that doesn't mean every conspiracy is true.
> but a very large order from a client with a confidentiality clause (very common in the industry) was widely known to be the main reason for the short supply.
So, just for the fun of it: Forbes estimates a capacity (at the time) of up to 12 EB [1]. Assuming they used 3 TB drives, they'd need 4 million drives for this [2]. This article [3] shows the market for consumer drives at about 150 million in 2013 (figure 7). So the NSA order would've been in the ballpark of 2% of consumer drive sales - insanely large on the face of it, yes, but probably not enough to cause such a shortage.
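To make the estimate reproducible, here it is as a trivial Python sketch (all inputs are the rough figures cited above - the 12 EB Forbes estimate, uniform 3 TB drives, the 150 million consumer drives from [3] - not authoritative data):

  # Back-of-the-envelope check, decimal units (1 EB = 1,000,000 TB).
  capacity_tb = 12 * 1_000_000       # Forbes' upper estimate for the data center
  drive_tb = 3                       # assuming uniform 3 TB drives
  consumer_drives_2013 = 150_000_000

  drives_needed = capacity_tb // drive_tb          # 4,000,000 drives
  share = drives_needed / consumer_drives_2013
  print(drives_needed, f"{share:.1%}")             # 4000000 2.7%

So roughly 2-3% of a single year's consumer drive sales, as claimed.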
I know of people who were delivering software to certain customers that were required/intended to work on EB scale datasets over a decade ago, and this was well before the Utah data center was a thing.
I have a feeling that just comparing the numbers without looking at any of the logistics is just a convenient way to dismiss a theory you don't like.
There's all kinds of things to think about like how 2% may be less than 100% but in the inelastic supply of a globally distributed good, 2% of (total yearly) supply called upon all at once can be a _huge_ departure from the norm. Maybe the shortage isn't just a hard disk shortage but a shortage of shipping space to take those hard drives anywhere else but to the NSA. Maybe it's that the factory(s) don't make 100% of the drives in an instant and then just distribute them for the rest of the year, but slowly make a constant stream of drives, all of which were swallowed up for a period of time by the massive order, causing the shortage.
There's a million other factors besides just "this number is smaller than that number".
Can't edit on phone, but we can also consider the size of the drives. We can only guesstimate what sizes the NSA wanted, but per your source we can say they needed 12 EB. Out of 150 million drives, there might not have been 4 million 3TB drives out there; this was 2014 - maybe there were, but they didn't want all the exact same drive, and if you calculate that some of the drives are 2TB or 1TB the estimated numbers only get larger. I think it's totally reasonable that a single order of 12,000,000TB to a single location in Utah could grind the widespread consumer market to a halt for a bit, especially if they're almost all made in one place, which it seems most of this discussion is taking for granted.
It looks like you've picked only the most favorable statistics and ignored everything else in formulating your opinion/response.
> I have a feeling that just comparing the numbers without looking at any of the logistics is just a convenient way to dismiss a theory you don't like.
I really have no feelings for this theory either way, I simply don't think it's realistic.
> There's all kinds of things to think about like how 2% may be less than 100% but in the inelastic supply of a globally distributed good, 2% of (total yearly) supply called upon all at once can be a _huge_ departure from the norm.
I generally agree with that. However, for one, we're talking about consumer goods within a growing market - a demand jump of 2% should be within expected deviation and not generate the problems we've seen there.
> Maybe the shortage isn't just a hard disk shortage but a shortage of shipping space to take those hard drives anywhere else but to the NSA.
3,5" hard drives have a volume of ~ 390.000 mm^3. Let's say that's 3.000.000 mm^3 shipped. That would allow ~10.854 drives in a standard container, for a total of 370 (assuming 4 million drives). A large container ship can pack 850,000 containers. So yes, it's a large volume, but for global shipping it's a drop in the bucket.
> Maybe it's that the factory(s) don't make 100% of the drives in an instant and then just distribute them for the rest of the year, but slowly make a constant stream of drives, all of which were swallowed up for a period of time by the massive order, causing the shortage.
Sure they do. But, again: if we're assuming 2% of total drives, it would be the output of about one week. Not ideal, but not a major shortage. And that's the worst-case scenario: orders in the billions of dollars don't usually work on a time frame of one week; it's quite likely that the manufacturers knew of this well in advance and were able to upscale/order accordingly.
> Out of 150 million drives, there might not have been 4 million 3TB drives out there; this was 2014 - maybe there were, but they didn't want all the exact same drive, and if you calculate that some of the drives are 2TB or 1TB the estimated numbers only get larger.
Sure. But 4TB and probably 6TB drives were also available at the time, and it's reasonable that for extremely large storage they'd shoot for larger drives - if only for space and heat efficiency. There surely were a few smaller drives as cache etc. here and there, but I'd presume the lion's share to be large drives.
> I think it's totally reasonable that a single order of 12,000,000TB to a single location in Utah could grind the widespread consumer market to a halt for a bit,
I won't disagree that it surely did impact the market a bit, but I highly doubt it was a major cause of the shortage. Especially when there's a perfectly fine explanation: a flood hitting the region where most drives are manufactured.
> especially if they're almost all made in one place, which it seems most of this discussion is taking for granted.
Most likely they'd order from multiple manufacturers, if only for capacity and failure-probability reasons. In fact, if they had actually ordered from one place, you could easily get evidence for this theory by showing the problems originating from one specific manufacturer.
> It looks like you've picked only the most favorable statistics and ignored everything else in formulating your opinion/response.
I highly disagree with that. I picked the largest estimate of the data center size and the capacity commonly available to consumers at the time; additionally, I only used a (low!) estimate for the size of the consumer market and completely disregarded the enterprise one. If anything, I probably overestimated the impact.
As said above, the massive buying surely did not help the shortage, and I dislike the NSA as much as the next guy. But unless my estimation is off by an order of magnitude, the numbers simply don't add up - even assuming they ordered with a delivery date of yesterday! - and given that there's a very reasonable alternative explanation for the shortage, I don't think this is a sound theory.
All big manufacturers bought whatever supplies they could get their hands on as soon as the impact of the floods was clear.
Companies are uniquely positioned to minimise risk in their ability to execute.
I remember this period well and am aware that a number of large players basically bought up whatever supply was available to cover their needs in the short to medium term. The result was that retail pricing more or less doubled (if you could even get drives).
> The plausibility of a conspiracy theory diminishes as the number of people necessary to keep the secret increases.
Not if you can discredit those who dare to speak out by labeling them "conspiracy theorists," thereby discouraging others from coming out as well.
I remember an episode from an old American sitcom where people make fun of the dad (an airline pilot) when he claims to have seen a UFO. It is the same thing. Any deviation from the standard narrative isn't tolerated by most people.
... but aren't the observations themselves part of the standard narrative, and the nonstandard part is how people explain them? (One standard explanation being that they are just some next-level high-tech stuff being tested.)
> but aren't the observations themselves part of the standard narrative
Only in a technical sense.
When people talk about UFOs, they are not referring to known things like weather balloons, planes or fighter jets, but, instead, to something out of the ordinary. However, every "standard" explanation offered is one of these things.
I assume you mean UFOs as in "alien flying saucers".
I think the rationally correct belief is that we simply have no evidence of aliens at this time, and since we know virtually nothing about what pilots saw (all we know is that there are these observations), we just fall back to our priors (base assumptions). Sure, it might be aliens. And hence the ancient-aliens meme. It might all be secret military stuff. (Or a mix of the two, hence the storm-Area-51 meme.)
And in this case I don't see what's wrong with the standard narrative.
If there are truly aliens flying regularly in our atmosphere we should strap a few 4K recorders on regular airline planes, and we just have to wait.
Not to mention the increasing number of space launches and spacecraft up there. And increasing number of downward looking observatories.
You can come up with any number of scifi-ish explanations: aliens, humans from the future, lost/forgotten civilizations. But it could also be someone testing an uncommon craft.
Frankly, we don't know, and one of the reasons is, I believe, people are hesitant to take a serious look at it (or even talk about it) because they don't want to be classified as kooks.
You overthink this. The flooding might not have led to a crisis on its own if the NSA or any other big buyer hadn't placed a huge order at that time. You also don't need a conspiracy to keep the procurements of a government agency confidential.
Because of Big Santa (i.e. parents). And that promptly fails when the cost of maintaining the Santa system gets too big (i.e. as the child's mental faculties develop, plus as the child interacts with the outside world not controlled by the parents).
So unless we all are living in a very well orchestrated NSA Truman show, it's unlikely that your argument applies.
That doesn't matter. Millions of adults orchestrate a secret conspiracy without even any direct coordination. The faculties of the target will only change the coordination and sophistication required.
If everything in the world were an illusion, I doubt that any group would be able to orchestrate that for all of us. But large groups can coordinate secret happenings, and they do, all the time.
The difference is whether the entire organisation would have to know or not. "Actually, our HDD factory is operating" is both a thing that lots of employees would necessarily know (unlike CIA assignments in Canada), and also those employees would not be as trusted as high-level intelligence agency operators. Particularly for low-level employees, like contracted cleaners or whatever, "Don't tell anyone the factory is operating" seems like it would be a great way to get the information out there that it actually is.
You tell the employees that they are working for a HDD factory. A small number of people at the top of the company are in on it and keep sales limited to one or a small number of clients, all of whom are the clandestine buyer.
That would be similar to the way in which the fake Swiss cryptographic machine manufacturer was run. Their only leak risks were from employees who realized that their cryptosystem was easily broken. An HDD manufacturer would not have that problem because all of its work would be perfectly legitimate.
Perhaps you should read Hogfather or Making Money by Terry Pratchett. In which the importance of the lies that society tells itself is touched upon.
"All right," said Susan. "I'm not stupid. You're saying humans need... fantasies to make life bearable."
REALLY? AS IF IT WAS SOME KIND OF PINK PILL? NO. HUMANS NEED FANTASY TO BE HUMAN. TO BE THE PLACE WHERE THE FALLING ANGEL MEETS THE RISING APE.
"Tooth fairies? Hogfathers? Little—"
YES. AS PRACTICE. YOU HAVE TO START OUT LEARNING TO BELIEVE THE LITTLE LIES.
"So we can believe the big ones?"
YES. JUSTICE. MERCY. DUTY. THAT SORT OF THING.
"They're not the same at all!"
YOU THINK SO? THEN TAKE THE UNIVERSE AND GRIND IT DOWN TO THE FINEST POWDER AND SIEVE IT THROUGH THE FINEST SIEVE AND THEN SHOW ME ONE ATOM OF JUSTICE, ONE MOLECULE OF MERCY. AND YET—Death waved a hand. AND YET YOU ACT AS IF THERE IS SOME IDEAL ORDER IN THE WORLD, AS IF THERE IS SOME...SOME RIGHTNESS IN THE UNIVERSE BY WHICH IT MAY BE JUDGED.
"Yes, but people have got to believe that, or what's the point—"
I'd still recommend the book though. There are so many things that are little more than flaky veneers taken for granted in adult human life that the fact anything works at all generally comes down to an intentioned group of people 'conspiring' to make it so, and no one ever bothers to ask about it, as it 'just werkz' and creates value. It's only a problem when too many externalities get ignored, in which case, everyone starts to notice.
Once you understand this, the beauty of the 'conspiracy theorist' meme coming out of an intelligence agency to get people to stop taking seriously those who ask inconvenient questions starts to take on a stroke of genius. You get people who rock the boat discredited out of the gate, and reinforce the status quo.
...Until that document gets declassified and blows up in your face anyway.
But where would a person speak to disclose the secret? The internet is very effectively censored by the US monopolies now.
If something is not indexed by Google, it's effectively non-existent. FAANG censorship is fast and effective. E.g. there were many important US bureaucrats involved in the Theranos scandal (even the current US president). When I try to find anything related to that matter now, Google returns pages with many links removed by censors. It's almost impossible to find the people responsible for withdrawing money from the US budget via Theranos.
Also, assuming that three-letter agencies were involved in the operation, any whistleblower would understand that the fate of Epstein would await him or her in the near future.
Also, there's no "independent" journalism; all media belong to a few influential entities, and only approved articles appear in newspapers, magazines, on TV and on YouTube.
Literally anywhere; the internet isn't censored anywhere near as much as you think.
If someone posted a plausible/verifiable account of what happened in an HN thread, it would've made it halfway round the world before the NSA could take it down.
One thing I learned recently, which made the effect of the 2011 flood on the HD market worse: Aramco purchased a big chunk of several months of the world supply of hard drives to recover from Shamoon in 2012, by replacing every HD in the company [1].
Or they managed to fill it with hardware without having to coordinate an intricate conspiracy involving supply chain and manufacturers across the globe.
Nobody is alleging that but you. Personally I don't think either "surveillance system needs a way to store what is collected" or "high-volume customers have economic leverage to demand confidentiality" are particularly intricate concepts.
Colordrops said, "The Thailand story was a cover." This is the allegation. Not that the NSA bought a lot of drives and kept it secret, but that this purchase alone, rather than the floods, was responsible for the drive shortage, and the flood story was fabricated. That is the alleged conspiracy.
I find that a number of conspiracy theories are 'debunked' by creating an obviously ridiculous strawman/rendition of the theory that nobody actually claimed.
The Snowden effect has worn off, it seems. Nowadays the NSA is back to being able to pull the most extreme stunts and everyone who thinks it is even remotely possible is called a crazy conspiracy theorist.
To claim that the NSA caused or faked the Thailand flooding is conspiracy theorist bullshit of the highest order.
To claim that the NSA took advantage of an existing event to help their already-planned ordering fly under the radar more easily is perfectly plausible. And, honestly, pretty boring.
Which one are you alleging? Choosing your words carefully might be wise.
The NSA has no shortage of front companies to do such jobs. For instance, when the CIA needed to place torture experts in Brazil so they could better support the Brazilian military dictatorship, companies like Ford and Coca-Cola were more than happy to offer them management jobs.
Good thing NSA et al aren't also competing for parallel-computing hardware or it would be even harder to buy a new GPU. Luckily all participants in online discussion forums are real humans who can tell me what to blame :)
Because it doesn’t sound like one of those crazy conspiracy theories. They need to feel privy to some knowledge “the others” don’t have and refuse to see.
First time I hear this, but I think the timelines match up, and at this point I find it completely believable. (Not saying I'm sure it is actually true, only that it would surprise me no more than if it were untrue.)
Before the ST3000DM001, the ST31500341AS became infamous, both for its failure rate, and its firmware bug.
The latter, if memory serves, was the bug where the drive had an internal event log implemented as a ring of 256 entries; if the log was at the last entry on boot, the drive's bootup code for rolling the pointer back to entry 0 was (usually) broken, causing the drive to not start up any more. (I say usually because, regardless of whether I'm associating this bug with the right model, there were anecdotes of repeatedly power cycling the drives eventually causing them to boot normally, whereupon you could apply the firmware update.)
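For illustration, the failure mode described above amounts to an off-by-one in the ring-buffer wrap-around. Seagate's firmware is not public, so this Python sketch is purely hypothetical - it only shows why such a bug would strike on roughly 1 boot in 256:

  LOG_SIZE = 256  # internal event log: a ring of 256 entries

  def next_slot_intended(last_used):
      # Correct behavior: wrap from entry 255 back to entry 0.
      return (last_used + 1) % LOG_SIZE

  def next_slot_buggy(last_used):
      # Hypothetical broken bootup path: the wrap case is mishandled.
      if last_used < LOG_SIZE - 1:
          return last_used + 1
      raise RuntimeError("boot fails: invalid log pointer")  # drive bricks

  # Boots with the pointer at entries 0..254 work fine, which is why failures
  # looked random: only powering off with the log at entry 255 hit the bug.
  for slot in range(LOG_SIZE):
      try:
          next_slot_buggy(slot)
      except RuntimeError:
          print("bricks when log pointer is at", slot)  # prints 255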
If you look up that model, you can find people commenting on stories about the ST3000DM001 asking "so when's the class action for the ST31500341AS?"
edit: I see someone else in this discussion remembers that family even better than I do, because I forgot (or didn't update fast enough to encounter) the bug that was present in the first firmware update to fix this.
In the late 1990s there was a slew of Conner hard drives with leaking seals. The outside of the platter would become corrupted, and the corruption would proceed inward.
Seagate bought Conner in the middle of this and refused to honor the warranties. From what I recall, a lawsuit followed and Seagate eventually capitulated.
Ha, it's one of the few HDDs that have their own Wikipedia page [0], and even more so, it's one of only two Seagate drives [1] - the other being the ST506/ST412, Seagate's first product! It just shows how infamous this 3 TB Seagate drive really is.
Other HDDs on Wikipedia include: IBM Deskstar (equally infamous), and other historical milestones like the DEC RK05 (classic HDD for PDP-8 and PDP-11).
I just hope the upcoming Heat-Assisted (Seagate) and Microwave-Assisted (Western Digital) Magnetic Recording drives won't repeat history.
Had a deathstar before the drives were known to be widely failing.
Decided one day to consolidate my backups from CDR to DVDR. Copied CDRs, deduped the contents and could fit about 15 CDRs worth onto one DVDR. Satisfied, I cut the CDRs and rebooted my system to prepare for a DVD burn.
The drive died on reboot.
Ended up taking a sledgehammer to the drive to get the anger out of my system.
I had a 75GB "Deathstar" at the time and was really worried that it would fail. I was a teenager and couldn't really afford replacing it, nor have good backup solutions.
I accidentally pulled a power pin while removing the molex connector and had to solder on a modified power extension cable.
The drive kept on living for many years and IIRC it never failed while in active use.
Thanks for the details! I just came here to post them, because I found them interesting. I certainly didn't realize that the image was just added nor that I would have the chance to thank the contributor.
I was responsible for maintaining some large glusterfs clusters built on top of these drives, and they certainly added excitement to the experience.
1) At one point we were experiencing disk failures often enough that we wrote a cronjob to detect drive failures via smartctl and automatically send an email to our hosting company requesting they replace the drive (a minimal sketch of the idea follows these two points). This saved engineering time and, more importantly, reduced time to drive replacement, because
2) On at least one occasion, we had a third drive in a RAID6 array fail before we had rebuilt from the initial failure, leading to loss of the array. We think the increased load of the rebuild increased the chance of subsequent failures. Needless to say, recovering from this destroyed all plans (and sleep) for the weekend it happened.
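Something in the spirit of that cronjob, as a minimal Python sketch - assuming smartmontools is installed; the device list and the plain print in place of the actual email to the hosting company are placeholders, not our original setup:

  #!/usr/bin/env python3
  import subprocess

  DEVICES = [f"/dev/sd{c}" for c in "abcdefgh"]  # hypothetical device list

  def health_ok(dev):
      # `smartctl -H` prints an overall-health line containing PASSED/FAILED.
      out = subprocess.run(["smartctl", "-H", dev],
                           capture_output=True, text=True).stdout
      return "PASSED" in out

  for dev in DEVICES:
      if not health_ok(dev):
          # The real job emailed the hosting company's ticket queue here.
          print(f"replacement request: {dev} failed SMART health check")

Run that from cron every few hours and failed drives get reported without anyone staring at dashboards.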
That model number sends shivers down my spine. I also used to work on this system, and still do! At one point Seagate admitted that the firmware was faulty. It would periodically stop responding under load, causing the RAID controller to think the drive had failed and to remove it from the array.
I’ve never seen this posted publicly by them, but I’m fairly sure a revised firmware was offered. We deemed it too risky to upgrade the drives one at a time, so built a new cluster on Western Digital drives.
These things always happen on the weekend, don't they? When I managed a large network of servers, it seemed that all the outages happened during parties and/or on the weekend.
This makes sense though doesn’t it? Rebuilding an array is a very heavy load op, and putting components that have been around for a while under heavy load seems like a good and fast way to expose failing parts.
> These things always happen on the weekend, don't they?
That's because servers know... oh, they know. Tuesday morning when you get into the office? Naw... that's too easy. Let's shit the bed on Christmas morning at 3:49am. That will let those stupid humans know who is boss around here.
I remember this fiasco. There was a class action lawsuit that was filed. I got curious as to what happened with that, and it turns out... nothing. From the wiki:
> On June 15, 2018, Judge Joseph Spero ruled that the class action plaintiffs must separate into multiple classes, as there was too much variability in failure rates to combine all claims into a single class. In 2019, the plaintiffs were denied class certification a second time.
Fujitsu, not being a US corporation, didn't get so lucky. Their 2005 annual report listed HDD litigation-related expenses at 10,220 million yen (~$100M).
Mmmm Fujitsu, the sweet sweet smell of conifer forest. The flux they used on those drives smelled amazing!
I worked at a European Fujitsu distributor in the late nineties. Absolutely the best non-IBM(1) hard drives you could buy at the time: cheap, fast, dead silent, super reliable. Then this happened:
“blame was laid on the supplier of epoxy mould compound used in the manufacture of Cirrus’ Himalaya 2.0 and Numbur chips”
Symptoms were the drive not being detected, reporting garbage corrupted ID strings, not spinning up, or not even clicking. The irony is they were 100% perfect mechanically. Btw, I read somewhere that swapping PCBs without making sure they have the same firmware rev on board might result in service area corruption, requiring actual specialist knowledge to recover (or running a premade script in PC3000).
(1) and we all know how IBM Deskstars turned out ;-( Amazing and fast drives, until the sad conclusion.
I worked at a small web hosting company back then, and we used a bunch of IBM Deathstars. They certainly lived up to their name! I disassembled one just to see what went wrong, and I found the cause (of that drive's failure, at least).
The bearing that serves as the pivot for the actuator arm is attached to the chassis of the hard drive with a bolt on the bottom of the drive. At that time, most drives also had a bolt that went through the cover; the Deathstar did not - it was only attached to the chassis. Which was fine, and that's the method commonly used since then. But this particular drive had no thread lock on the bolt, and it had come loose, leading to a catastrophic head crash.
I don't know if other Deathstars suffered the same fate for the same reason. The Wikipedia article on the subject says it was due to the magnetic coating of the platters coming off.
Back in the day, my employer sold "disk cartridges", which were 24"-diameter magnetic-coated aluminium disks in a plastic case. Head crashes were fairly common, and made a horrible noise.
We'd open the case of a crashed cartridge, extract the platter, and hang it on the wall; the crash would produce a rather spectacular sunburst pattern in the magnetic coating.
Ah man, there’s something about reading about physical hard drive failure that really makes me cringe. Really beautifully intricate technology, but man, when things come out of place…
Hard drives are one of those pieces of precision engineering that we encounter on a regular basis - it's quite amazing they work at all given the conditions we subject them to.
And then they cleaned up their act, sold to Hitachi, and are now the most reliable disks under the HGST brand, currently owned by WD. Somebody suggested you should buy disks from manufacturers that have a relatively recent history of a catastrophe; those should be cutting fewer corners.
I had one of those in my Amiga. I remember friends telling dull stories about those drives, but mine ran for 8-12 hours a day, every day, for years. It is a bit flaky today, but it still spins up.
Every time a submission like this pops up I expect someone who worked at Seagate during that era to pipe up and explain what went wrong and why ... but I haven't seen it yet. Surprising.
I think it's sad that the hacker/maker community doesn't have enough knowledge/skill/inclination to disassemble this kind of thing and figure it out for themselves.
It's a bad sign when tech complexity grows beyond the expertise of a hobbyist - I believe that's the start of stagnation for that technology, as new young engineers never get excited about disassembling and modding hard drives in their free time, and in turn it becomes paid engineers only who, I suspect, are less innovative.
When was the last time you heard someone who had modded their hard drive to, for example, seek faster?
The technical answer of what part failed and why is only a small part of the overall answer, though.
For example, with the Challenger explosion, it's O-rings vs. the culture and organizational deficits that led to: the O-rings, failure to revise the seal structure despite evidence of problems on prior launches, and the final decision to launch Challenger.
All of that is arguably the most interesting part.
Equipment cost is a huge problem there. You need a proper clean room if you're going to tear apart HDDs and expect them to work again when you reassemble them. Not to mention, the extremely precise tolerances involved would make mechanical mods a tall order indeed.
Firmware mods would be more doable, assuming it isn't locked down.
If you want to give it a go, I've still got a bunch of these drives lying around. We also have some still running in our zpool which I would gladly replace with something else, we're running out of space yet again... ;)
The difference is that the hacks we've mentioned are software and solid-state electronics hacks. Doing hard drive failure analysis in the mechanical realm is a totally different challenge; a hard drive is hard.
> The Seagate 3 TB models failed at a higher rate than other drives during the Backblaze deployment, but in fairness, the Seagate drives were the only models that did not feature RV (Rotational Vibration) sensors that counteract excessive vibration in heavy usage models -- specifically because Seagate did not design the drives for that use case.
Do drives in this type of stationary assembly (i.e. a server rack or a stationary PC) ever trigger their vibration protection, barring of course a bit of bumping in the server rack? I mean, these drives are not sitting in laptops. I have a hard time thinking the lack of this feature bears any weight on the huge failure rates Backblaze and individual customers saw, though I'd accept the argument if there were a 2.5" laptop model of that drive seeing the same failure rate.
As I understand it, another drive in the same rack doing work is enough to affect the performance of the other drives in the rack. This is why you use fancy, expensive "enterprise grade" drives, as they can cope better with it.
> In the presence of an external vibration caused by adjacent disk drive seek operations the bandwidth performance of a consumer-grade disk drive "feeling" these vibrations will decrease about 10%-15% when reading data and about 25%-40% when writing data
Don't really know if this is enough to actually cause damage if the disk does not have the hardware to detect this vibration.
I don't know if the latency spike is caused by sensors detecting the extra vibrations and delaying a read or seek operation, or if it's just vibrations causing minor read-head errors which either require a second pass (that isn't reported through to the OS via SMART) or some sort of extra processing to pull the correct signal out of the raised noise floor.
Edit: person shouting at the disks is Brendan Gregg, not Bryan Cantrill.
Is there any way to tap these HDD sensors' data on Linux? I remember there were some apps on Windows that read the HDD sensor data and used it as an accelerometer/gyro, which was pretty neat when run on a laptop (this was when iBeer and virtual lighter apps were still a hot thing).
A search through data recovery forums like hddguru.com (they tend to have a secretive "trade union" attitude, but some useful stuff leaks through...) and the Internet reveals some more info like this:
Most of them seem to be head problems, and there are some outright head crashes too. Seems like bad media (leading to head failure) is mainly to blame:
Huh. So many fixes in there are listed as 'replace head(s)'. I didn't realize that was a common fix for hard drives, thinking it would require near-cleanroom conditions to ensure no dust got into the drive as it's being worked on, and that it would be much cheaper/faster to maintain backups and swap in a new drive.
It's a "fix" in the context of data recovery business. The aim is to get the drive in a working condition for a few hours, just enough to get an image of it. After that the drive is trashed (or more precisely, put in a big box so it can be used as a donor for other drivers in future)
> Paul Alcorn of Tom's Hardware argued that Backblaze used the drives in a manner that "far exceeded the warranty conditions" and questioned the "technical merits" of the lawsuit.
I would question this. Backblaze knows what they're doing. And at least they provide a relative reliability index as they treat all drives the same.
At my previous employer we had a 24 disk zfs storage server built using these disks (thanks to an external consultant who replaced enterprise grade disks because "they're basically the same").
While they were still under warranty we replaced 50% of them and after warranty 100% of them were replaced using another brand.
This model was really bad, but he wasn't wrong in general - while consumer drives have a higher failure rate, you can't pretend the enterprise versions can't fail - and if you've built your system to withstand failure, whether 2 or 5 fail every year makes little difference if you have competent IT.
I do make an effort to source them from as many different batches as possible, though - based on a DeskStar (“DeathStar”) experience a couple of decades ago. And there have been occasional bad batches for many models.
(The IIRC 500MB DeathStar and this 3TB Seagate are special in having mostly bad batches.)
> I do make an effort to source them from as many different batches as possible, though
If you can, also stagger the initial power-ons. There's a history of disk firmware bugs that are triggered by runtime; if it makes sense, you want enough of a difference in power-on time to do a lossless replacement of your first disk before your second disk hits the magic number of time on.
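A quick way to see how staggered your array actually is - a sketch assuming smartmontools; it scrapes SMART attribute 9 (Power_On_Hours), whose raw-value formatting varies a bit between drives:

  import subprocess

  def power_on_hours(dev):
      out = subprocess.run(["smartctl", "-A", dev],
                           capture_output=True, text=True).stdout
      for line in out.splitlines():
          if "Power_On_Hours" in line:
              # Raw value is the last column; some drives append "h+mm+ss".
              return int(line.split()[-1].split("h")[0])
      raise ValueError(f"no Power_On_Hours attribute on {dev}")

  hours = [power_on_hours(d) for d in ("/dev/sda", "/dev/sdb")]
  print("power-on spread:", max(hours) - min(hours), "hours")

You want that spread comfortably larger than the time a replacement plus rebuild takes.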
Synology, which I’ve been using for the last 10 years or so when the project cannot afford NetApp/EMC class storage, does this out of the box. (I’m sure NetApp and EMC do too, but it’s not my problem when the disks inside them fail.)
I remember using a bunch of Seagate 7200.12 HDDs around 10-15 years ago.
After a couple years, I had multiple of them failing. Some of them were replaced by the shop I bought them at, some of them were replaced by Seagate directly. Every time, the data was lost.
At some point, when the 5th drive failed, I realized that the failure was due to a firmware issue, that could be fixed by upgrading the firmware using a TTL connection and a serial console. I did exactly that, my data was accessible again, and the drive never failed again (it's still running now).
I was very angry at Seagate and other vendors for sending replacement drives with the old firmware when it was known that the firmware would cause the problem again, while the new firmware was out and working correctly.
I own a pair of those in RAID-1 mode, now retired. I upgraded the array start of this year, as one of the drives developed a fault. In all honesty, they operated continuously for 8 years without much of an issue, despite the reported flaws. They were pretty fast, too.
The 1TB Barracuda ES.2 was hardly better. Around 2009-2010, I set up 3000 of them. 1800 had to be replaced under warranty.
After that we switched to HGST and never looked back. Of the thousands of helium HGST/WD drives I've set up since 2014, I had to replace only a handful overall. In fact my support account regularly needs a password reset because it's been sitting unused for a year straight... Contrast that with the awful time when I was spending half my days replacing Seagate drives, and the other half asking for RMAs...
I won't buy Seagate hard drives unless I really can't avoid it. I missed this 3TB drive, but a while back I bought a Seagate 8TB SMR drive (ST8000AS0002) to store my backups on. It failed within months, so I got it replaced under warranty. The replacement failed within months, and I gave up on it. I was storing large files on it (just what it was designed for), each one with a par2 error correction file. Every now and again I would run the error correction, and it would usually find a few sectors that just returned the wrong data.
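For reference, that par2 workflow looks roughly like this (a sketch assuming par2cmdline is installed; the file name and the 10% redundancy level are made-up examples, not my actual setup):

  import subprocess

  archive = "backup-2016-01.tar"  # hypothetical large backup file

  # Create recovery data: -r10 adds ~10% redundancy for repairs.
  subprocess.run(["par2", "create", "-r10", f"{archive}.par2", archive],
                 check=True)

  # Later, "running the error correction": verify, repair if rot is found.
  if subprocess.run(["par2", "verify", f"{archive}.par2"]).returncode != 0:
      subprocess.run(["par2", "repair", f"{archive}.par2"], check=True)

On a healthy drive, verify should pass for years; on that SMR drive it kept finding sectors that silently returned wrong data.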
I don't know why anyone bothers to buy anything other than WD; Seagate has been making bad products for at least 10 years, if not 15. Barracuda 7200.11, anyone? That was 12 years ago. I got a pair of IronWolf 10TBs; they failed in 6 months. How many Seagates have I seen fail? All of them - 30, 50, God knows how many exactly. I have 10-year-old WDs still running, with 6 years or so of actual powered-on time. I don't know why people are buying the same badly engineered products over and over again. Seagate should have ceased to exist a long time ago.
WD is temporarily dead to me after their shenanigans of selling Red drives that are SMR, which in my opinion is flat-out fraudulent. When WD pledges to stop selling SMR drives to NAS customers, I'll reconsider using them again.
A Seagate may (or may not! see Backblaze stats) die sooner than a WD drive, but at least it's fit for purpose until then.
After the WD / Hitachi merger, there was an anti-trust case and WD was forced to sell some assets to Toshiba.
Toshiba took on a huge portion of Hitachi's former business, and seems to have a solid business. Backblaze doesn't buy Toshiba because the sticker is in the wrong location (and Toshiba doesn't want to move the sticker), but otherwise... their Toshiba tests look pretty good.
When you have 60 hard drives in a server, and server says "Hard Drive SerialNumber #51233142 has failed", you need a quick-and-easy way to find that hard drive.
A sticker with the serial number in the correct location will help you find which of those 60 hard drives is broken. Toshiba didn't have their stickers in a location that worked for Backblaze.
On the one hand, you can make 600-labels per rack.
On the other hand, you could just buy WD hard drives at virtually the same price and not have to worry about those labels at all since the sticker is in the correct spot.
It's not that it's an unsolvable problem. It's that the solution is not worth the price difference between the drives.
I use Seagate all the time, and while there are failures, I find the doom-saying about them to be greatly exaggerated. Moreover, because they are so much less expensive than the competition, and because I am using them in applications where redundancy is already a requirement, and because they have a warranty, everything has worked out.
Every Seagate drive I've ever bought has failed. After the 3rd failure on an external drive used for a multimedia PC, I switched to WD. I still have 10+ year old WD drives that work.
Other infamous drives: Sun used the Quantum 105S for the main drive in SPARCstations, and that drive turned out to have a stiction problem (would stop spinning up). At a workstation ISV that was primarily Sun for its own workstations, on the rare occasion we had to power off large numbers of workstations, always at least one machine wouldn't come back. I vaguely recall slapping a drive, or dropping it and its caddy a short distance onto a desk, in the best tradition of "percussive maintenance".
Given my luck managing to find the worst lemon drives, for my subsequent home server, you can be sure that I intentionally used RAID mirroring of two different brands of drive. (Yes, it turned out that one make&model ran hot, while the other ran cool, and the hot one died.)
I bought 4 WD Green 2TB disks 7-8 years ago. All failed within the next few years because their firmware was parking the heads after 8 seconds of inactivity, and they were rated for 300,000 cycles. By the time that became common knowledge I already had one failed disk and 3 others with way more cycles than they should have had. Sadly, flashing new firmware didn't stop the rest from failing.
I bought a 500GB WD Green drive in 2010, and in 2011 I needed a second 500GB disk in a hurry. I went to a local supplier and all I could get was a Seagate 500GB disk. Oh well, beggars can't be choosers.
I retired them both earlier this year, replacing them with a single 2TB WD Blue drive to augment my SSD. They are both still fully functional, not so much as a SMART error.
I chalk it up to two things: the first is an extremely low number of power cycles. The second is that I never moved the computer they were in until after the drives had finished spinning down.
I also generally tend to keep drives powered up, and they indeed tend to live long. Still, I have to be on alert for failing drives. I had four of the WD30EFRX together in a 2x2 pool (roughly equivalent to RAID10). One day, one of them began showing checksum errors after a scrub, and I replaced it with a Toshiba DT01ACA300.
Once the bad drive was out of the pool, I ran a full self-test on it, which it failed. I did at least get five years of service out of it, but in any case, it's time for me to think about some newer hardware.
For my next NAS, I'm kind of leaning towards using 2x8TB mirrored, with a third drive to be rotated into the mirror, splitting off the rotated-out drive as an offline backup.
Mine were in a home NAS server, almost never powered down or moved; they died nonetheless. The replacements I bought, mostly Toshiba models, still work at this very moment - NONE of them have failed.
Back in the day I used IBM drives and they were super reliable. Too bad they stopped making hard drives and all the Maxtors, Seagates, Hitachis, and WDs came along with all their crap.
Back in those days, Maxtor was on my good list and WD was definitely not. At work, we had to deal with prematurely failed WDs on our desktops all too often. It took a very long time before I'd consider a WD drive again, and then in the 2000s I finally tried them again. I've been mostly happy, but now I look kind of side-eyed at them after WD Red SMR fiasco.
I was none too pleased when Seagate bought out Maxtor and turned it into their lowball brand.
I still have one of these, and it has been losing a little bit of data here and there for the last seven or eight years. Maybe even from day one, and it just took me some time to notice. You can write a bunch of data to it, check it, and it will be fine, but after a few months there are probably a few hundred or thousand new unreadable sectors. Oddly consistent.
I have this drive. It throws a "caution" in CrystalDiskInfo due to too many reallocated sectors. It has about 58,000 power on hours. I used it extensively for years for video recording.
It held up well for me, but I don't really use it much since it threw that error. SSDs seem better for reliability these days - the TBW numbers are big.
I genuinely don’t know why anyone buys Seagate drives.
Every single one I have owned with any decent level of usage since the mid-90s has died on me after a couple of years. I actually bought a 4-disk Seagate NAS, and all 4 Seagate drives have since been replaced with HGST drives, one by one.
I have 1 of these still in an array and my second to last one died a few weeks ago. They had high failure rates for sure but some of them held on for a good 5+ years for me. All of these drives came from shucking and a little over a year ago I moved from Seagate externals to WD externals and my pain increased 10-fold. Not only did I have to deal with the 3.3V pin issue but I've had drives die in under a year. Since then I dropped externals for internals (and the warranty that comes with it) and I went back to Seagate (Exos). Of course now buying a hard drive is insane with the prices where they are (easily $100 over what I paid before the price spike).
Wow, what a coincidence. I did some maintenance on my NAS yesterday and finally removed one of my oldest hard drives, which I had unplugged a while ago because it was too noisy (it still worked when I unplugged it). I thought the name was familiar, I checked and yes, it is the very same model. I guess I was lucky not to lose any data on it, I wasn't very good with backups back when I first bought it.
I got three of these for a home NAS, not knowing any better.
One failed within a day, the next failed within the first month, the third one lasted several years. For a long time, I didn't trust the NAS, making regular second backups (always a good idea) figuring it was the common point of failure, but the replacement drives were all working fine.
> This particular drive model was reported to have unusually high failure rates, approximately 5.7 times higher fail rates in comparison to other 3 TB drives.
Some anecdata: I have exactly two of these drives in my home NAS. The other two were WDC_WD30EZRX, one of them already failed, while the ST's are still going. All purchased in 2016.
I had one that came in a Macintosh G4 tower, and it died suddenly with a lot of photos that I wanted to save. I put the drive in the freezer in a plastic bag for about an hour and was able to boot the OS long enough to burn a DVD with the files I wished to save.
I'm much better about backing data up regularly now.
I had a moment of "Well, that's weird, I recognize that name." I have one sitting in this very PC right now. Used it as a gaming drive until I was able to buy a 1TB SSD, but it's still trucking along.
Hard drives tend to fail early or after a long, long period of use, without much of an in-between. If it survived the first year of use then it should last as long as any other spinning disk drive.
I knew that looked familiar...
https://i.imgur.com/ITCFzeu.png (order: middle, bottom, top). Lucky me (not sarcastic, DOA is far from worst-case scenario).
I would love to know what the specific physical difference is with Seagate drives in general that makes them less reliable compared to WD, Toshiba, etc.
Nobody ever seems to describe that other than saying vaguely that they use cheaper parts.
I had a backup JBOD with 36 of these drives. I thought that since it was a backup machine that would rarely be used, I could buy non-enterprise drives...
Big mistake. There were 47 disk failures from 2013 to 2018.
I guess I got lucky then! I've had one of these for quite a long time, maybe since 2013/14. A good number of those years it's been running 24/7, and I actually only recently replaced it
You're probably thinking of the Barracuda 7200.11's event-log suicide-roulette firmware bug. It was the first time I ever had to update the firmware of a fixed disk, and I remember it being a huge pain, since the revised "SD1A" firmware came with a brand new bug and had to be immediately updated again:
The IBM Deskstar drives from the early 2000's are the most infamous to me. I remember having to put one in the freezer for a bit in a last ditch attempt to get one more boot out of it and copy data off before it was lost for good. Early 2000's computer hardware in general was a total nightmare at times.
Me too. I missed the 3TB debacle because my Seagate 750GBs and 1.5TBs had all croaked, so Seagate had lost me as a customer by the time the bigger drives came into my price range. Silver lining, I guess. I've had great luck with Hitachis and Toshibas.
> Please don't do things to make titles stand out, like using uppercase or exclamation points, or saying how great an article is. [...] please use the original title, unless it is misleading or linkbait; don't editorialize.
Most of the time, this rule does more good than harm by preventing clickbait, but it's not perfect and occasionally causes problems if the original title is not descriptive. In fact, there have been multiple incidents where edited titles were reverted by the mods, much to the complaints of the readers; I guess it's a small price to pay. Sometimes you can be sneaky so it won't be noticed: saying "Seagate's infamous ST3000DM001" is unacceptable, but the mods don't care if you say "Seagate ST3000DM001".
I once replaced all of my friend's filing box labels with new labels, whose cryptic text had an aura of mystery and adventure. Strangely, my friend did not enjoy the adventure I had set him upon. If you know what I mean.
I like to think the flood was a cover for some government buying a mind boggling quantity of hard drives. All for some massive spying operation. Obviously I have zero proof or even the tiniest bit of evidence. Just my personal conspiracy theory.