
> As usual, I wanted to check if any firmware update is available

Updating HDD firmware is something you do to resolve a very specific problem, not ... just because it's available. People are blind to the fact that updates can and do introduce new bugs.




> Updating HDD firmware is something you do to resolve a very specific problem, not ... just because it's available.

It is important to check if there is an update and what has been fixed. Like with any software, it may introduce new bugs, but blindly suggesting "don't touch it if it's not broken" is harmful too. Some time ago Samsung rolled out SSDs that were self-destructing after a very short period of time and fixed this in firmware. If your SSD breaks or starts having problems, it is already too late to update; you have to be proactive. And hardware vendors don't release firmware updates for nothing; in most cases there is a very good reason for it.
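For what it's worth, here's a minimal sketch of how you might check a drive's current firmware revision so you can compare it against the vendor's release notes. It assumes smartmontools is installed and usually needs root; /dev/sda is just a placeholder:

    import subprocess

    def firmware_revision(device="/dev/sda"):
        # "smartctl -i" prints the drive's identity info, including the firmware
        # revision ("Firmware Version:" on ATA drives, "Revision:" on many SAS drives).
        out = subprocess.run(
            ["smartctl", "-i", device],
            capture_output=True, text=True, check=True,
        ).stdout
        for line in out.splitlines():
            if line.startswith(("Firmware Version:", "Revision:")):
                return line.split(":", 1)[1].strip()
        return None

    print(firmware_revision())  # compare against the latest version in the release notes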


That wasn't an actual "fix", it was just a workaround --- the flash they were using was far too leaky and lost its charge very quickly, so they decided to have the firmware constantly rewrite in the background. Even the updated firmware won't help for machines that are powered off for months at a time.

https://forum.acelab.eu.com/viewtopic.php?t=8735


Stop encouraging the manufacturers to ship bad firmware that "will fix later".


I think the manufacturers needed no encouragement, and at this point it would take multi-national intervention to get the genie back in the bottle. The poster you replied to simply recognized the situation for what it is: Samsung is still a going concern after releasing SSDs with firmware that would brick them after a relatively short service lifespan.


>Some time ago Samsung rolled out SSDs that were self-destructing after very short period of time and fixed this in firmware.

If it broke, fix it.

If it ain't broke, don't fix it.

The latter became a rule of life because many people decided to fix what ain't broke and got burned for their troubles.


But how would you know it's broken unless you proactively check for updates?


Just checking for updates is fine; actually installing every single firmware version is bad.

This is because embedded firmware has fewer moving parts, packed more tightly together, which makes its failure modes inevitably catastrophic. It can't degrade progressively like an Electron app; one typo and the whole system spontaneously crashes into the wall and dies, and you don't want that.


It depends on what is in the changelog. If there are performance or durability improvements, you might want to change the firmware even if you are not facing a significant issue. The downside is the risk to data on the disk - don't do that (or anything, really) on a drive that holds the only copy of something important.

If the drive is being moved to a different array or machine and the data is to be lost anyway, the risk is very low and, if the process is easy (unlike this), it might be a sensible move.

I agree "just because it's new" is a poor justification to risk data.


More often than not, the changelog just states "performance improvement" or "bug fix", without any detail on what they've done.

For example, I've been bitten before by Dell due to plundervolt updates that removed the undervolting capability under the umbrella of "security fixes".


> I've been bitten before by Dell due to plundervolt updates that removed the undervolting capability

That's enterprise computing for you.

Been bitten by that kind of shenanigan more often than I can remember.


Happened to a colleague. Not sure if he ever found out how to roll back to the older version.


Usually BIOSes can be flashed with the machine off, using a "chip-clip" programmer.


They can but the data on the chip is a combination of machine specific identifiers, config data, and BIOS code. If you chip flash it you have to get an image from some sketchy place and it will nuke your serial number etc. You can also have problems if your RAM is made by a different manufacturer. Chip flashing is a last resort.


Usually there isn't, unless there is a sufficiently serious issue with the update.


Why wouldn't there be a rollback option, especially on enterprise hardware? If it breaks something, you need to revert, and quickly.


HP has a bootable Linux CD that updates all the firmware on their devices, since the early 00's. We used to regularly update firmware across our entire fleet of servers during maintenance cycles. Never had a failure.


You are speaking about enterprise grade hardware which is meant to be updated. Consumer stuff not as much.


Back in the day I'd just wing it: hey, firmware update the only hard drive I own, what could go wrong.

Now if it's not on at least two drives within a few hours of something new (photo, video, work done, whatever) it seems like a terrible risk.

Part getting old, part being able to afford to have redundancy, I guess. Storage feels so much cheaper now.


And in part because you remember all the times YOLO went wrong and you bricked stuff / lost data


Well, it is nice to stay current with the news about your hard drive. Weren't there, at some point, some SSDs from Intel or Samsung that would eventually brick themselves unless you applied a firmware update?

I remember having patched some SSDs for this very reason the last time I worked on on-premise bare-metal systems.


The SSD firmware failure mode you're describing was a thing: https://www.cisco.com/c/en/us/support/docs/field-notices/705...


Crucial M4s certainly had one around 2010/2011. They would “fail” after 5,000 hours.
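If you're wondering how close a drive is to that kind of threshold, here's a rough sketch that reads the Power_On_Hours SMART attribute. It assumes smartmontools is installed and usually needs root; /dev/sda is just a placeholder:

    import subprocess

    def power_on_hours(device="/dev/sda"):
        # "smartctl -A" dumps the SMART attribute table; the raw value of
        # Power_On_Hours is the last column (some drives report it as
        # "12345h+32m+10s", hence the split on "h").
        out = subprocess.run(
            ["smartctl", "-A", device],
            capture_output=True, text=True, check=True,
        ).stdout
        for line in out.splitlines():
            if "Power_On_Hours" in line:
                return int(line.split()[-1].split("h")[0])
        return None

    print(power_on_hours())  # the M4 bug hit around the 5,000-hour mark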


A year or two after that I had an M4 that after a certain period of uptime would just stop responding to commands, thus acting like a dead drive until power cycled (a reboot was not enough). There was of course a fw update available and it did fix it, although I don't recall it being listed in the changelog.


It killed a whole fleet of processing units in our lab back then; they had all been put in service at the same time, ran 24/7, and then all failed within the same night.


> Weren't there, at some point, some SSDs [...] that would eventually brick themselves unless you applied a firmware update?

Yeah, that was a thing with older enterprise SAS SSDs a few years ago. HP, Dell, Lenovo, Cisco, etc., were all affected, as they all rebadge the available hardware manufacturers' drives. And several manufacturers have had bricking-firmware problems over the years.


Staying current with the news is good; jumping onto unproven firmware isn't.


> Updating HDD firmware is something you do to resolve a very specific problem, not ... just because it's available.

Indeed. As an anecdotal data point: number of personal and server HDD firmwares upgraded over four decades (!)... Zero.

Zero isn't a lot.

We should ask people who know what it's like to have a lot of HDDs: does Backblaze upgrade their HDDs' firmware?


Here's another datapoint:

number of personal and server HDD firmwares upgraded over three decades: a couple thousand.

Our storage systems at work regularly ship firmware updates with bugfixes. I have never seen a FW update introduce new bugs. But of course there has been the occasional HDD that didn't spin up during the (required) power cycle.


It's just software like any other. It's vulnerable and it has defects. So it's very reasonable to at least check if there's an update available.

> People are blind to the fact that updates can and do introduce new bugs.

I do agree that this fact is often underappreciated - the single biggest cause of breakages is usually the rollout of changes itself. But again, it's just like with any other software.


changelogs are generally useless and manufacturers hate having to release firmware updates, so the mere existence of an update is pretty strong evidence there's something seriously wrong

it's not like web or app dev where you can ship trivial upgrades every day for free


Case in point: We update the HDDs and SSDs on our storage systems regularly. Every month there's new disk drive firmware coming out for various (OEM) drives.

I have never seen a disk fw update introduce new bugs. Firmware development works quite a bit differently than regular software development. There's usually no new "feature" that can possibly be added to a disk drive. It's not like you're suddenly getting a new desktop environment on it.

But yeah, I have seen a handful of HDDs die during the mandatory power cycle because they wouldn't spin back up. And sometimes, disks fail at a higher rate after the firmware update, but that is not due to introduced bugs but rather because the error detection/reporting functionality has improved and is now reporting issues that it didn't before. For enterprise storage systems that is actually a good thing, because a predicted failure is always better than a sudden failure.

Case in point: A few years ago, the HGST Cobra drives were leaking fluid from the motor onto the platters. There was nothing that could be done, so the firmware was updated to move the heads more (to prevent fluid buildup) and the error detection was changed to detect the slowed head movement and report that via sense codes. Of course that caused more early failures, but that's better than having a handful of disks fail all at once, and the disks would have failed anyway at some point.


Depends if they've read the changelog and saw something useful or not, right?


Don't fix it if it ain't broke.


According to the release notes, 0602 fixed some SMART failures. Seems pretty important to me.



