I am very curious about how long this hack took to complete. I write firmware for SSD controllers for a living, and this would probably take me many months of full-time work to pull off with an unknown controller (granted, I generally work on algorithms at a slightly higher abstraction layer in the firmware, and some of my colleagues who are more focused on the hardware interfaces could figure something like this out much faster than me). I am incredibly impressed by this effort.
Also, I want to mention that it's common to have multiple processors in storage controllers. I can't talk about the specifics of the drives that I work on, but for SSDs at least there are several layers of abstraction: the host interface to receive the data, a middle layer to perform management of the data (SSDs require things like wear leveling, garbage collection etc in the background, to ensure long life and higher I/O speeds), and a low level media interface layer to actually write to the media. These tasks are often done by different processors (and custom ASICs).
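To make the middle "management" layer a bit more concrete, here is a toy sketch of a flash translation layer with naive wear leveling. Everything here (class name, structure) is invented for illustration; real SSD firmware is vastly more complex, with garbage collection, power-loss safety, and parallel flash channels.

```python
# Toy flash translation layer (FTL) sketch: logical-to-physical remapping
# with naive wear leveling. Purely illustrative -- not how any real
# vendor's firmware works.

class ToyFTL:
    def __init__(self, num_blocks):
        self.erase_counts = [0] * num_blocks        # wear per physical block
        self.free_blocks = list(range(num_blocks))  # unallocated physical blocks
        self.l2p = {}                               # logical -> physical map

    def write(self, logical_block, data):
        # Flash can't overwrite in place, so every write goes to a fresh
        # block; picking the least-worn free block levels out erase cycles.
        old = self.l2p.get(logical_block)
        self.free_blocks.sort(key=lambda b: self.erase_counts[b])
        new = self.free_blocks.pop(0)
        self.l2p[logical_block] = new
        if old is not None:
            # The old copy is now stale; erase it and return it to the pool.
            self.erase_counts[old] += 1
            self.free_blocks.append(old)
        return new
```

The point of the sketch: repeated host writes to the same logical block land on different physical blocks, which is exactly why the drive needs background bookkeeping that the host never sees.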
It took about half a year, mostly slaving away in the evenings and on my days off. That was mostly because the WD firmware is pretty complex: you gotta remember that almost 20 years of ATA/SATA/... cruft is stored away in the firmware, and the firmware doesn't help you by having any strings embedded in it.
Yep, I know guys who worked at WD and they mentioned how cluttered their firmware is. They have an incredibly rigorous code check-in process, which means not a lot of code gets cleaned up, only more code gets added. It sounds like a mess.
"At this point mechanical drives are pretty much dinosaurs, it would be only useful to tinkerers"
That may be true for end users / consumers, but I think that the advent of ZFS makes this particularly interesting for anyone that uses that filesystem.
The reason is that, unlike most "RAID" we've all used these past 20 years, ZFS does not want you to give it a RAID set, or to put a RAID controller between the disks and the OS.
Instead, the best practice is to present the raw disk to ZFS and let it do all the work. But that means you're more exposed to funny business on the part of the drive, etc.
I have, mostly. All my code, minus the directly-applicable shadow-password-hack thing (to thwart script kiddies; if you're a bit knowledgeable you should be able to re-implement it yourself), can be gotten from http://spritesmods.com/hddhack/hddhack.tgz
If you were looking at uncommented C with meaningless variable and function names, do you think you'd be faster? I think part of what makes it seem daunting is being familiar enough with the disassembly to read it quickly and easily.
If you already knew roughly how a HDD controller would need to work - what kind of I/O it would need to do, both at the controller end and the head movement end, couldn't you work backwards from there?
Reversing something like the GC needed in an SSD I can see as being harder, but again, wouldn't it largely be a matter of knowing that a certain number of parts of particular shapes - whether it's marking, tracing, copying / compacting, etc. - need to exist, and finding them?
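To illustrate the kind of recognizable "shape" being described, here is a minimal copy/compact garbage collection sketch over flash blocks. It is hypothetical and heavily simplified; real FTL garbage collection also weighs wear, separates hot and cold data, and runs incrementally in the background.

```python
# Minimal copy/compact GC sketch -- the structural skeleton one might
# look for when reversing SSD firmware: live pages in mostly-stale
# blocks get copied forward so the source blocks can be erased.

def compact(blocks, live):
    """blocks: dict of block_id -> list of page ids stored in that block.
    live: set of page ids still referenced by the mapping table.
    Returns (new_block_contents, freed_block_ids)."""
    new_block = []
    freed = []
    for block_id, pages in blocks.items():
        # The "mark" phase is implicit: liveness comes from the mapping table.
        survivors = [p for p in pages if p in live]
        new_block.extend(survivors)   # copy live pages forward
        freed.append(block_id)        # whole source block is now erasable
    return new_block, freed
```

Even stripped of names and comments, the copy-loop-plus-free-list pattern is the sort of landmark that could anchor a disassembly reading.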
I think that if you really got stuck into the problem, it wouldn't take as long as you think. A lot of it is familiarity and fluency with the disassembly, I reckon.
(The closest I've gotten to this professionally was in reversing the Windows kernel and related user-side DLLs to debug a particularly thorny issue we had with the Delphi compiler, when I was at Borland. Not quite the same, but still a lot of staring at disassembly with not a huge amount of help. Ultimately, it only took a few hours of effort, in large part because there's so much tooling support, and things like messages and public symbols were available.)
Being more familiar with assembly would definitely help. The only time I look at assembly is when I'm using a JTAG debugger, which is not my primary activity.
Our firmware has a lot of ASIC pathways that are only accessible through register reads/writes, and in some circumstances only statuses are available through the registers, so I believe reverse engineering an SSD would be more difficult than it first appears.
I work on storage systems employing SSDs at a higher level still; hearing more details about how things work underneath is immensely useful, and it rarely gets discussed with the FSEs, which is a shame. If there are things you can share from your experience, it would be great to learn more about the inner workings of SSD firmware.
So far I only get to learn through the (few) failures that we actually see. That's the only time I get sufficient access to technical folks to ask the pointed questions and get a sensible response.
Request: Please add a "bypass mode" so operating systems that want to can use the flash blocks directly (in other words, a /dev/mmcblk-like interface).
The business problem with this is that the vendor will no longer be able to provide any guarantee on the product, since it will depend heavily on how the user handles it. The vendor will also lose most, if not all, of the traceability needed to show that the user did something wrong, or failed to do something that ought to be done to maintain reliability.
If some enterprising hardware folks wanted to create a PCIe- or SATA-attached flash device with minimal interference for external control, it would be very interesting to me too. I wanted to buy an OpenSSD device for experimenting, but at a price tag of $3000 it is well beyond reach for pure experimentation.
I wonder if crowd-funding can help bring such a thing to life.
This could be solved with a simple fuse or lockout bit that gets set: "if you enable this mode you void your warranty, and your drive will be marked as such."
I appreciate the spirit of the request, but to be blunt any large company would probably never spend resources on this. Basically large SSD manufacturers are focused on billion dollar markets, and the demographic who would appreciate a "bypass mode" is probably so small that the company would never recoup money spent on implementing the feature.
I think it's rather unfortunate that the workings of modern HDDs (and other storage devices, like SSDs, microSD cards, etc.) are all hidden behind a wall of proprietariness, as this is mainly a form of security through obscurity; and government agencies probably know about such means of access already, while not many others do.
Although they're largely obsolete today, for many years the most well-documented and open storage device that could be connected to a standard PC was the floppy drive. The physical format was standardised by ECMA, the electrical interface to the drive was nothing more than analog read/write data and "dumb" head-positioning commands, the controller ICs (uPD765 and compatible) interfacing it to the PC were based on simple gate arrays (no need for any firmware), and all the processing was otherwise handled in software. The documentation for the earliest PCs included the schematics for the drive, and the ICs on it were documented elsewhere too - e.g. https://archive.org/details/bitsavers_westernDigorageManagem... A lot of the technical details of early HDDs were relatively open too.

I've interfaced a floppy drive to a microcontroller before, and being able to see how the whole system works, to understand and control how data is read/written all the way down to the level of the magnetic pulses on the disk, is a very good feeling.
It's not really aimed at security; that's not a priority for consumer hard disks where cost/GB is the main criterion. It's more a question of three factors:
- laziness: publishing quality documentation costs money
- fear of competition: publishing info also helps your competitors
- latency: given that far more computing power can now be fitted on a chip, and the relative cost of sending data versus processing it locally has changed dramatically, a modern computer is effectively a distributed system cooperating over network-like links.
True, I was mostly referring to design security (the competition factor), but most HDDs do support setting a password, which is not hard to get around (the data itself isn't encrypted with it, since that would introduce other problems with being able to change the password). Consumer disks are also arguably better off not encrypting by default, as otherwise the whole contents could easily be lost and unrecoverable if the key becomes unreadable - and for the majority of users and data, the availability of the data is more important than its absolute security.
There were several amazing talks at hacker conferences last year about reprogramming storage devices so that they can tamper with their contents. This researcher's talk was one of those. Another significant one was
and I think there were at least two others that I can't find right now (plus recent stuff on USB devices that attack their hosts in various ways). In light of these and other firmware and hardware-borne threats, a good overview of the bigger verification and transparency problems is
"An Arduino, with its 8-bit 16 MHz microcontroller, will set you back around $20. A microSD card with several gigabytes of memory and a microcontroller with several times the performance could be purchased for a fraction of the price. While SD cards are admittedly I/O-limited, some clever hacking of the microcontroller in an SD card could make for a very economical and compact data logging solution for I2C or SPI-based sensors."
"The embedded microcontroller is typically a heavily modified 8051 or ARM CPU. In modern implementations, the microcontroller will approach 100 MHz performance levels, and also have several hardware accelerators on-die."
Was discussed on HN, but Algolia search looks to be down at the moment.
Most people are surprised when I tell them that their computer is a lot of little computers working together on a sort of internal network.
This is why if your machine is compromised, and you have a threat model that involves serious (state or otherwise well funded) attackers, you really should just send it off to be recycled.
It also makes security of the supply chain extremely difficult. So many components, so many microprocessors and microcontrollers. You'll have a hard time proving that nothing has been tampered with from factory to integration...
"cheap and multitudinous commodity parts, each with a processor, memory, and a fast communication interface"
This reminds me of when I first went into business and bought some machinery. It actually surprised me (at that young age) to learn that the production machine I bought used standard parts that I could buy anywhere (bolts, screws and the like), and that if I needed one I didn't have to order it from the company I bought the machine from. That seems obvious to me today, but it wasn't obvious back then. ("Back then" was well before the web, of course, when info was not readily available.)
Get a cheap JTAG debugger to play around with. Basically this entire hack hinged on the fact that he was able to connect via JTAG to the drive controller. Obviously it took a lot of knowledge to understand how to interpret the data he got, but learning a JTAG debugger is a good start.
I really like his article about dumb to managed switch conversion. I wonder if more projects like this exist perhaps with some existing community. Would be really cool if one could buy a cheapo switch and hack it to a managed one in a similar fashion like you can flash OpenWrt on some cheap routers and make them 100x better.
Yeah, my database schema was fucked up and the work MySQL had to do slowed everything down. I did some quick optimization of everything, meaning a few minutes of downtime, but now MySQL only takes 10% CPU instead of pegging one core at 100%.
I'm just speculating here, but Cortex-M series are known for their DSP capabilities.
DSP is used in hard drive control. From wikipedia [1],
"Typically a DSP in the electronics inside the hard drive takes the raw analog voltages from the read head and uses PRML and Reed–Solomon error correction to decode the sector boundaries and sector data, then sends that data out the standard interface.
That DSP also watches the error rate detected by error detection and correction, and performs bad sector remapping, data collection for Self-Monitoring, Analysis, and Reporting Technology, and other internal tasks."
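As an illustration of the bad sector remapping the excerpt mentions, here is a threshold-based sketch. All names and the policy are invented for illustration; real drive firmware bases this on ECC statistics and vendor-specific heuristics.

```python
# Illustrative sketch of threshold-based bad sector remapping.
# Hypothetical policy, not any vendor's actual algorithm.

class RemapTable:
    def __init__(self, spare_sectors, error_threshold=3):
        self.spares = list(spare_sectors)  # reserved spare sector pool
        self.remap = {}                    # bad LBA -> spare sector
        self.error_counts = {}
        self.threshold = error_threshold

    def record_error(self, lba):
        self.error_counts[lba] = self.error_counts.get(lba, 0) + 1
        if self.error_counts[lba] >= self.threshold and lba not in self.remap:
            # Sector looks unreliable: transparently redirect it to a spare.
            self.remap[lba] = self.spares.pop(0)

    def resolve(self, lba):
        # Host-visible addresses never change; the drive redirects internally.
        return self.remap.get(lba, lba)
```

The key property is in resolve(): the host keeps addressing the same LBA while the drive silently serves it from a spare, which is also why this activity only becomes visible through SMART counters.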
And one of the comments on the blog[2] mentions "...whereas the Cortex's ID turns up relevant boundary scan file..."
He mentioned that disabling it didn't seem to do anything, however. ("The Cortex-M3 handles... nothing? I could stop it and still have all hard disk functions.")
So SMART and bad sector remapping, perhaps. But not decoding.
Probably nothing. Creating the masks needed to mass produce chips is expensive, so manufacturers will often use the same chip for multiple products. The M3 may be used by higher end products (e.g. enterprise drives), or it may just be the vestigial leftover of some feature which was planned but didn't ship.
You mean the extra M3 core I guess? I wondered too, but I suspect it was just part of the vendor's chip design and went unused - if they could have saved some money by not including it, they probably would.
I wouldn't trust the data on a hard drive anyway, since the hard drive can be removed and the data changed. If you want to make sure you're reading _your_ /etc/shadow, it needs a message authentication code. If you want to prevent others from reading your disk, it needs to be encrypted.
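The MAC idea can be sketched with Python's standard library. The caveat (my addition, not the commenter's): the key must live somewhere the attacker can't reach, such as a TPM or a separate machine, since storing it on the same disk defeats the purpose.

```python
# Sketch of a message authentication code over a file's contents, so that
# offline tampering (e.g. by a malicious drive) is detectable on read.
import hmac
import hashlib

def file_mac(contents: bytes, key: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the file contents."""
    return hmac.new(key, contents, hashlib.sha256).digest()

def verify(contents: bytes, key: bytes, expected_mac: bytes) -> bool:
    # compare_digest is constant-time, so the comparison itself doesn't
    # leak where the mismatch occurs.
    return hmac.compare_digest(file_mac(contents, key), expected_mac)
```

Any single-bit change to the stored file makes verification fail, which is exactly the property needed against a drive that rewrites /etc/shadow behind the OS's back.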
There is a developing theme about how parts of a PC can avoid trusting other parts of the PC. For example, the PrivateCore folks (whose company was later acquired by Facebook) were describing a wide range of attacks where one part of your PC attacks another.
(That page is mostly focused on someone coming into your data center and seizing or tampering with your device, but they've also talked about the idea of counterfeit or backdoored hardware components, and they do allude to that a bit there.)
I find this kind of sad, because it adds overhead (for people designing systems, for people building and setting up systems, for people administering systems, and in terms of computational and memory overhead) and maybe reduces flexibility, but it seems like a well-justified threat model.
Most people would regard a hard drive that they have received from somewhere and formatted themselves as safe. This hack shows that this is not necessarily the case.