It's things like these that sometimes make me think of all those businesses that rely on some ancient Windows XP machine, because it's the only thing that supports the proprietary drivers for the engineering machine they bought nearly two decades ago for hundreds of thousands of pounds. If the drivers (and the relevant controller software for that machine) were open source instead, they could more easily move to a more recent operating system that doesn't run the risk of being infected by some virus (think WannaCry).
I imagine also how more difficult it would be for someone to reverse engineer the drivers for something a lot more complex than a VGA-to-webcam box.
Well yes, open source firmware and drivers would be cool, but for most companies that's part of their proprietary IP, and they think that making it open would make their product easier for competitors to copy or exploit. Since it cost them lots of time and money to develop, there's no way you're going to convince management to just make it open unless you're some stinking-rich FAANG co. and those HW product sales are not your main source of income.
Just look how locked down Broadcom documentation and even datasheets are. Everything is under NDA.
HW businesses make money by selling more HW, so making you buy their latest product just to get the latest updates, even though the 10-year-old one still does the job, is a viable business model for them.
Source: I worked in the hw industry and it's how they think.
I think there is a lot of ... cross pollination ... in the hardware industry, and companies are pursuing every angle to make sure that it's hard to figure out what IP they may have "independently discovered" that their competitors patented. I believe this is the reasoning NVidia gave for not open-sourcing their Linux drivers; everyone is copying everything and looking the other way. (It all started with the MOS/Motorola thing. Engineers took documents from Motorola to MOS, Motorola noticed, Motorola sued. Now people are more careful.)
As for datasheets, I feel like from certain vendors, they are "living documents" and every customer gets a different datasheet. You might mention in a meeting with the sales engineers "oh, we need a 100MHz SPI bus" and then they will edit the datasheet to say that the SPI bus works up to 100MHz. This is why there is an NDA, every customer gets a different datasheet exactly tailored to their needs.
That's an interesting explanation that I've never heard before, for why the hardware industry is reluctant to publish this stuff. It certainly seems credible.
You say:
>I believe this is the reasoning NVidia gave for not open-sourcing their Linux drivers
Do you happen to have a source for that? Or at least an explanation of why you think this is the case?
AMD is a company that has largely gone from proprietary to open source, and without looking at their financials, it seems to have been going really well for them. AMD GPUs just work out of the box on Linux without having to install anything, so I, and almost every Linux gamer, buy an AMD card now even if it isn't the best performance per $.
I do all my gaming on Windows and I still only buy AMD GPUs, so that when I switch back to Linux to start working, everything's rock solid and stable. I don't know if maybe I've just had bad luck, but nVidia cards have always been a bit off, even using the proprietary drivers. It was particularly noticeable with screen captures, I'd get this odd flickering that I could never completely solve. Switched to AMD and it cleared right up.
Even if the performance isn't the best, the openness is worth that price to me. I don't feel like the stability of the rest of my system is hampered by a third party that doesn't seem to care.
Didn't that strategy start before ATI was bought by AMD? How much of it is just following through on what they were doing at ATI before, and how much of it is in line with what AMD does in general?
I'm really curious how you came to believe that, given that AMD started supporting the development of open source GPU drivers in 2007 by collaborating with Novell/SuSE and releasing low-level documentation. By 2013 the open-source AMD GPU drivers were far better than any reverse-engineered NVidia driver has ever been, and only two years later AMD started the process of deprecating part of their proprietary AMD driver in favor of a hybrid approach (proprietary userspace module, open-source kernel driver).
I just looked into this, and it looks like what I'm thinking of is the AMDGPU driver, which Wikipedia says was released in 2015. I'm not sure what driver you're talking about or what the difference between it and AMDGPU is. I just remember people in 2013 saying nvidia was best for Linux use, and then, with the AMDGPU driver, saying AMD now is.
The AMDGPU driver was the first one where the "main" consumer driver was the FOSS one. Prior to this, "fglrx" was for many years the fully-featured, proprietary driver, intended both for professional purposes where certified OpenGL drivers are needed and for consumers. The "radeon" open source drivers tended to lag behind in hardware support, features, and performance. Initially they were largely a community effort, but then AMD started publishing some GPU documentation, and then also put more of its own resources into developing the FOSS driver.
The UX of the "fglrx" drivers tended to be pretty awful, all told; typically worse than nvidia's proprietary drivers, so they were wildly unpopular among users. The performance gap eventually started closing, to the extent that some games/apps would be faster on the "radeon" drivers, and some on "fglrx".
Eventually, AMD switched to the AMDGPU model, where the kernel driver is always FOSS (and part of the upstream kernel), and there is the FOSS userland driver as part of Mesa for consumers, and a proprietary certified OpenGL driver for workstation users.
(I believe Valve also had a hand in this somewhere along the way - they were interested in shipping Steam Machines/SteamOS and understandably were keen on having a stable and fast graphics stack and so contributed to Mesa.)
>The UX of the "fglrx" drivers tended to be pretty awful, all told; typically worse than nvidia's proprietary drivers, so they were wildly unpopular among users.
This is my memory of the time. Both options being undesirable and proprietary but nvidia's at least working better.
> AMD is a company that has largely gone from proprietary to open source and without looking at their financials
"without looking at their financials" does not make sense. Do they open anything in the Windows space? If a HW company wants to sell their stuff to Linux desktop users, they are generally forced to open the source; it's not out of goodwill. In terms of openness, Intel does far more. nVidia seems to be the only exception.
Freesync comes to mind and I think they had something related to physics/hair rendering.
There is no requirement to open source on linux.
Nvidia does not. It just means you have to constantly update your driver for the latest kernel version and spend time keeping up with all of the latest developments, like Wayland.
> There is no requirement to open source on linux.
This is wrong. If you publish a Linux kernel module, you might be obligated to release the source. Fundamentally, this is how Linux differs from *BSD: Linux is intended to be generally hostile to binary blobs.
But it's not universal. Linus thinks nVidia's module may be exempt from the GPL's 'virality' because it wasn't originally written specifically for the Linux kernel.
Isn't that just planned obsolescence? Isn't the EU very much opposed to that kind of business strategy? Shouldn't there be some law in the EU prohibiting that?
I mean, I wouldn't force the manufacturers to keep supporting the hardware after a decade, but shouldn't there be a law compelling them to open source the drivers and firmware upon end-of-life? If a product reaches end-of-life, surely the manufacturer already has a demonstrably better, upgraded version of the product to attract new sales?
You'd think so, but what we are wrestling with now is that vendors can create products (microelectronics) that are so difficult to reverse engineer that their operation is effectively a secret to all consumers.
We're probably going to see more of this in the coming years around SecureBoot/TrustZone on ARM.
Is it just how they think, or is it the reality of the business?
Would be cool if manufacturers made old versions of source code public after a couple of years, and the final version when a product goes out of production.
I think for a lot of software businesses there is a truism that "your competition doesn't want your code, they think you're morons and you think they're morons."
But in the hardware game there is a lot of counterfeiting. There is rarely any shame or consequence when it comes to outright IP theft. Making it easier to access the firmware makes it easier for someone to counterfeit your widget.
You think the public facing representatives (management etc.) are morons, because
you think their strategy for working in the same space sucks unless
you think your company is totally blowing it and you are on your way out
but you don't think their programmers are morons so much (except when you examine site and it performs worse than yours for obvious reasons that you fixed on yours) but
if they roll out a really cool feature you can implement in your product you will write code to implement that feature and not wish you had their code because anyway your stuff is incompatible codebases unless
you get bought by them or they get bought by you or you both merge because you are in fact compatible and this way something beautiful will emerge
Firmware and drivers are different though. If the firmware was totally locked down then the drivers would be no use for a clone and if the firmware was taken but the drivers were proprietary the clone could just use the exact same driver binary.
Revealed preferences show that, when given the choice among economically sustainable options, customers prefer to pay value-based pricing and rebuy things they still use, instead of paying more upfront for forever products.
> when a product goes out of production.
By the time this happens, the company has lost their throwaway source code.
In terms of taking care of the environment and fighting planned obsolescence, this should be mandatory. If you stop supporting your product, you must enable your customers to support themselves.
It is sadly the reality in many cases, especially when their business is small compared to FAANG. When a piece of HW becomes popular, it is natural that Chinese makers begin to produce clones (google "Saleae clone"). In this case their proprietary HW driver can filter out these clones. Making it open source means they give their business away to the Chinese clone makers.
Less maintenance and testing when OSes like Linux distros bundle the drivers in and keep them working on future versions of the OS. This is also huge for mobile hardware, since OEMs won't have to put in any effort to keep patching to the latest upstream Android.
Of course they may not want any of this since it means they can sell a new phone every 2 years.
I used to work for a Power Company, that had a hydroelectric dam. There were sensors in the dam that needed Windows 98 to read data from. I remember we ended up having a fleet of old Asus Eee PCs specifically for that, that obviously never ever ever connected to the Internet, or really any network.
ReactOS sounds really promising for these specific areas, and I hope it gets better and better with time. It's just a shame that its development cycle is slow right now, though I understand it's hard to reverse engineer Windows.
I'm surprised a manufacturer of such engineering machines hasn't differentiated itself by offering an SLA on driver (and general software) availability for future operating system versions. It would be reasonable to require users to pay a subscription fee for extended support. I'd think the resale value of machines with drivers still available would be much higher than for those without, and that this would eventually allow the manufacturer to charge a premium price for the machine at initial sale.
I own a Fujitsu Scansnap s1500. It's out of support, and the existing drivers are incompatible with Catalina. I now must pay a 3rd party $100 for drivers or fiddle around with a Linux scanner server or similar. Never again will I pay $400 for a Fujitsu scanner, that's for sure.
It's probably because when faced with a choice between paying X for continuous support (where X is a big number) and "we will keep around a few computers for when it's a problem in 20 years", most companies are going to go with the latter.
Even if it would make financial sense long term, how likely is it that the person making the decision will be there in 20 years to deal with the mess?
On the other hand, it will look much better on their next quarter results.
It happens all the time, but it's generally not "visible" to the typical HN developer.
PC and peripheral manufacturers in the embedded space often provide availability guarantees e.g., "this version of this hardware will be available for sale at least until YYYY."
Sometimes it matters, usually it doesn't. But it does at least make ordering spare parts easier.
If the drivers were open source, they would need good programmers to maintain them. And if they had good programmers, they could have reverse-engineered proprietary drivers. But I guess that buying an old PC might be cheaper than hiring a good programmer.
It's more that buying old PCs is safer than trying to reverse engineer drivers. Buying enough extra hardware to have spare parts for the life of the product is a pretty well known and cheap way to ensure no downtime.
I have experienced that even when the original driver writer (me) is still on staff, it can be less expensive to just keep buying old-spec PCs than to rewrite the code for a new OS. I make the company more money writing new code than supporting code that's a decade old.
Hang on, did he just buy essentially a freely reprogrammable FPGA with built-in RAM, A/D converter and USB support - for $20? Seems to me, you could do a lot more with those devices, now that the wire protocol is known.
(Edit) It doesn't appear that he decoded the bitstream format though, so some more work would be needed before you could use it as a general-purpose FPGA.
Yeah, a 16K Spartan 6, 64 MBytes of DDR3 RAM, and an FX2LP USB High-Speed controller make a pretty terrific repurposed FPGA dev board in the vein of the chubby75 or panologic, even if the high-speed ADC turns out to have too many limitations / too-weird signal conditioning (i.e. is too VGA-focused) to be very useful (but VGA probably at least means a pair of I2C pins is exposed on the DSUB-15, right?). Considering the FX2LP is already wired up to load the bitstream to the FPGA, and expects to on every power-up, that's a huge bonus for a dev board.

Even if the binary blob of FX2LP firmware provided in the manufacturer's drivers does checksums, signature checks, etc. to make changing the bitstream data difficult, it would be pretty simple to write new firmware to allow uploading of any custom bitstream (see: fx2lib), since it's such a well-known and widely-hacked chip -- maybe you could even manage to trick Xilinx's ISE into believing it's a real Xilinx "USB Platform Cable" (FPGA JTAG programmer), since at least some of them use an FX2 for their USB interface.

With a generic bitstream uploader utility for the FX2 you wouldn't even need to deal with any hardware modifications, or need to buy and connect a JTAG programmer, if you were just starting out. And with everything being uploaded on every reset of the dev board, you don't have to worry about bricking anything either: you just power-cycle the board to bring the programmer AND the FPGA back to their default waiting-for-firmware/"rescue" mode.
But while the article says he got a lot of them (at least 3 from the photo) for $25 total, it looks like they're more typically ~$50+S/H each on ebay, which is less exciting.
The same company seems to have moved on to 'av.io'-branded USB3 devices now, which from a quick glance/binwalk of the downloadable firmware package (inspired by this article) appears to have FX3 USB Super-Speed controllers, and Spartan FPGAs. But it's designed to store the FX3 firmware and FPGA bitstream on the device itself (to allow it to boot up as a UVC webcam with no drivers required on the host, like this article was hoping would be the case for the older device), so it would require a bit more reverse engineering to figure out how to kick the FX3 into firmware update mode to replace the FX3 firmware with a custom bitstream uploader. But it's a ton more expensive, and who knows what generation and size of FPGA or how much and what sort of memory it would have.
Correct, that's actually a pretty nifty piece of general-purpose hardware for the price. It's essentially an SDR (software-defined radio) for baseband, meaning it provides simultaneous sampling on three channels from DC to whatever a suitable Nyquist rate for VGA decoding would be. Several MHz at least.
The bitstream format isn't anything special. It's whatever Xilinx says it is for the Spartan 6. Any bitfile generated by ISE for the XC6SLX16 could be uploaded via the (generic) Cypress FX2LP chip.
This is all relatively old hardware, very well-documented and well-understood but still quite capable. It has a lot in common with the earlier-generation Ettus USRP receivers that used the FX2.
A lot more, actually. Manufacturer's specs for the ISL98002CRZ-170 [1] say it'll do 170 MSPS, with an analog bandwidth of 780 MHz. It's actually a rather impressive ADC! The question would be whether it can be convinced to sample continuously.
That diagonally skewed image is familiar to me from old game writing days when sprites were written straight to screen memory - you soon learn that if it comes out skewed, you've probably got an off-by-one bug on the horizontals.
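The skew-from-off-by-one effect described above is easy to reproduce: if the row stride used when blitting is one pixel too large, each successive row lands one pixel further over, shearing a vertical bar into a diagonal. A tiny made-up framebuffer demo:

```python
WIDTH = 8  # framebuffer width in pixels (illustrative)

def blit_column(fb, rows, stride):
    """Draw a vertical bar in column 0 using the given row stride."""
    for y in range(rows):
        fb[y * stride] = 1

def render(fb, rows):
    """Render the framebuffer as rows of '#' and '.' for inspection."""
    return ["".join("#" if fb[y * WIDTH + x] else "." for x in range(WIDTH))
            for y in range(rows)]

good = [0] * (WIDTH * 4)
bad = [0] * (WIDTH * 4)
blit_column(good, 4, WIDTH)      # correct stride: straight vertical bar
blit_column(bad, 4, WIDTH + 1)   # off-by-one stride: diagonal skew
print("\n".join(render(good, 4)))
print("\n".join(render(bad, 4)))
```

The "good" bar stays in column 0 on every row; the "bad" one steps one column right per row, which is exactly the diagonal smear you see on screen.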
FWIW, I haven't done a lot of reverse engineering but I have written code to drive the USB peripherals for a few microcontrollers in the ST and NXP series and the techniques discussed in the article are just as applicable when trying to debug what the heck is going on in your USB code. I always get something interesting out of them.
I coincidentally just finished writing some userspace libusb code for the first time today!
Nothing as impressive or interesting as TFA, just enough to let the thing boot on Linux to the point where proprietary half-working drivers can take over. It'd be a stretch to call my epson scanner abandoned hardware, but considering the quality of official drivers, only a small one.
Today I learned that you can use wireshark to analyze USB packets. Neat. Besides the cool-hack-factor, this kind of article is especially great when it teaches you something.
+1, i like the Linux Device Drivers book _much_ better than the Linux Kernel Module Programming Guide, which is currently recommended in the top comment of this thread.
LDD3 goes way more in-depth and teaches you about the specifics of many different classes of device drivers, their interfaces and data structures, as well as where to find the appropriate in-tree documentation.
(both LDD3 and LKMPG are written for the 2.6 kernel. it's a shame there are no newer resources of the same caliber.)
Something a little more recent, but still dated unfortunately: https://lwn.net/Kernel/LDD3/ I guess there is not much of an audience for books these days.
It's been about 10 years since I've done any Linux kernel development, but often you can get the rest from the documentation folder in the kernel tree, plus the source code of a similar module also in tree.
This feels a bit too simplistic to actually be helpful to someone. If you want to learn Golang for example (from 0 previous experience), don't jump into reading Docker or Kubernetes, start with the guide on the website that explains how it works, look over the reference manual, write some smaller examples to solve some of your own minor pet peeves then once you have a little knowledge about the language, start looking for popular and big codebases, maybe implement some part of them yourself.
Now I don't know how to write userspace USB drivers, but there has to be something between 0 knowledge and just "trying and immitating"[SIC] kernel source code. At least I'd be happy for more pointers on similar resources as the submitted link, but more beginner oriented.
For userspace USB drivers, you want to look at libusb. I don't know of a straightforward tutorial, but the API matches up really closely to the operations you'd expect to be able to do against a USB peripheral.
For learning more about how USB itself works, I highly recommend Jan Axelson's USB Complete[1]. I used an older version of this book to build a from-scratch USB device (on an STM32F0 but not using the built-in USB example code) and libusb-based usermode driver for it.
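One concrete first exercise in that direction: the standard USB device descriptor is a fixed 18-byte little-endian structure, and decoding one by hand (say, from a Wireshark capture) teaches a lot about how enumeration works. A sketch, using the well-known default VID/PID of an unprogrammed Cypress FX2 as illustrative sample data:

```python
import struct

# USB 2.0 standard device descriptor: 18 bytes, little-endian.
DEV_DESC = struct.Struct("<BBHBBBBHHHBBBB")
FIELDS = ("bLength bDescriptorType bcdUSB bDeviceClass bDeviceSubClass "
          "bDeviceProtocol bMaxPacketSize0 idVendor idProduct bcdDevice "
          "iManufacturer iProduct iSerialNumber bNumConfigurations").split()

def parse_device_descriptor(raw: bytes) -> dict:
    """Unpack an 18-byte device descriptor into named fields."""
    return dict(zip(FIELDS, DEV_DESC.unpack(raw)))

# Illustrative sample: roughly what an unprogrammed Cypress FX2 reports
# (VID 0x04b4, PID 0x8613), packed here instead of captured from hardware.
raw = struct.pack("<BBHBBBBHHHBBBB", 18, 1, 0x0200, 0, 0, 0, 64,
                  0x04B4, 0x8613, 0x0100, 1, 2, 0, 1)
d = parse_device_descriptor(raw)
print(hex(d["idVendor"]), hex(d["idProduct"]))  # 0x4b4 0x8613
```

With libusb, the equivalent lookup is what `libusb_get_device_descriptor` does for you, but doing it once by hand makes the captured bytes much less mysterious.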
It's tough though, because any book you'd read will not be up-to-date, things are constantly changing/improving inside the kernel, etc. For some things you'll just not find any documentation whatsoever.
And what to learn depends on what driver you want to write, so you best start by looking into the code in the relevant subsystem, and read it, understand it, and use it, and learn what you need on the go, based on the needs you discover you have for your driver. Then you may realize what you need to learn, and if you can't grok it from the code, go for the books/presentations/linux kernel docs.
So diving into kernel code is a way to discover what you'll need to learn to write your particular kernel driver. The kernel's API surface is huge, and you'll never learn/need everything, so it's best to only learn what you'll be using soon. And this is one way to discover what that might be for your driver.
For kernel wide resources that the kernel offers (like threads, workqueues, locking primitives, of interfaces, kobj/device interfaces, sysfs attributes, timers, etc.) you can learn them from books, but it's also just as effective to learn them by example simply by going to https://elixir.bootlin.com/linux/latest/source and searching for all references of given function, and learn by example/implementation.
USB userspace drivers are not kernel drivers, so there the situation is different. Interface for writing them is part of kernel-userspace ABI, so you'll be able to use them even with half a decade old guides on the internet.
> If you want to learn Golang for example (from 0 previous experience), don't jump into reading Docker or Kubernetes...
I agree that to learn C, you'd probably not start by reading Linux kernel code.
That's not the correct analogy, though. Learning to write a kernel driver would be more analogous to learning how to extend Kubernetes.
So if that was the goal, and it was known that Kubernetes internals and its extension interfaces are constantly evolving/changing, then there'd be more value in diving into the Kubernetes code, instead of reading up on possibly outdated howto guides or whatever.
That's true enough too, of course. But too often people want to be spoon-fed, and so you want to point them at decent resources which they can self-study, and then be on hand to answer specific questions.
There are books such as "Linux Device Drivers", which are great resources, and there is a lot of existing code out there to read too.
The person who asks "How do I do x?" without having appeared to carry out the minimum of self-searching themselves is often impossible to help in any detail anyway.
I recently wrote a user space driver for my Clevo laptop's backlit keyboard. I'm currently trying to convert it into a Linux kernel driver. I'll describe my approach.
First of all, I had to realize that the keyboard was in fact controlled via USB. It seems obvious but other Clevo laptops did it via ACPI:
I assumed my laptop worked the same way and wasted a lot of time dumping my laptop's ACPI tables and trying to figure them out. I even installed some Windows tools to decompile the WMI buffers. Didn't find anything. I was about to give up when someone on IRC set me on the right path and I was able to make progress. Even though I saw my keyboard on lsusb output, I never thought to look for a USB protocol. Learned this one the hard way.
Before I started, I emailed Clevo and asked for technical documentation on the keyboard. Their marketing department replied: "Ubuntu was not supported". I emailed Tuxedo Computers (mentioned above) and they were much more helpful: their developers shared a fan control application they wrote with me! It was great but the keyboard drivers were missing.
So I decided to reverse engineer it.
While not as complex as the VGA-to-USB device, the process of reverse engineering the keyboard was similar:
1. Boot Windows 10 with the proprietary driver
2. Start Wireshark
3. Capture USB traffic
4. Use the manufacturer's keyboard control program
5. Correlate the data sent to the keyboard with what I did
6. Document the structure of the messages
For example, to set the color of a specific RGB LED on my keyboard, the following bytes must be sent to the device through a USB control transfer:
0xcc01kkrrggbb7f
01 = some kind of operation code
kk = target LED
rr = red
gg = green
bb = blue
cc, 7f always surround all bytes (part of USB protocol?)
I used hidapi-libusb to create a small program that sends those bytes to the keyboard:
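A sketch of what that small program does, reconstructed from the byte layout given above (cc 01 kk rr gg bb 7f). Only the message construction is shown and testable here; the actual send would go through hidapi-libusb, which needs the hardware, so it's left as a hedged comment with a hypothetical handle.

```python
def led_message(led: int, r: int, g: int, b: int) -> bytes:
    """Build the 7-byte set-LED-color message: cc 01 kk rr gg bb 7f."""
    for v in (led, r, g, b):
        if not 0 <= v <= 0xFF:
            raise ValueError("each field must fit in one byte")
    return bytes([0xCC, 0x01, led, r, g, b, 0x7F])

msg = led_message(0x00, 0xFF, 0x00, 0x80)  # LED 0 -> purple-ish
print(msg.hex())  # cc0100ff00807f

# With the hidapi bindings the send would look something like this
# (hypothetical VID/PID and API usage, untested without the keyboard):
#   import hid
#   dev = hid.Device(vid=CLEVO_VID, pid=CLEVO_PID)
#   dev.send_feature_report(msg)
```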
This user space driver gave me access to most of my keyboard's functionality on Linux. I used to have to boot into Windows in order to configure the keyboard lights -- not anymore!
The keyboard also generates some custom key codes that are used by the proprietary Windows driver to implement some Fn hotkeys. I know this because the kernel logs these unknown key events. To implement this, I'll have to make a real kernel module that maps them into something user space can make use of.
To this end, I've been studying the Linux kernel's documentation:
The input_mapping, input_mapped, event and raw_event hooks seem especially relevant to my case: the custom driver just needs to process the custom key codes and let the generic Linux keyboard driver handle the rest. Might as well add sysfs entries for the LEDs but I don't know how to do that yet.
I expect to end up with something like the open razer driver:
I know virtually nothing of low-level programming, so bear with me; to a lay person, can someone explain why drivers are ever closed source? Isn't a driver inherently a loss-leader for some kind of hardware product, and wouldn't it be better for the manufacturer to have it open-source so that more people could contribute to it?
I'm not asking this passive-aggressively, if someone knows the answer to this I'd really like to hear it.
Sometimes, it just doesn't make sense. For example, I write and maintain the Linux kernel modules for one of the largest, most popular, and most powerful land vehicles in the world. It wouldn't make sense for us to release these to the open source community because it requires specific hardware that we make, specific FPGA firmware that we interface with, and it will only work with specific software we make. So there's at least one case where it wouldn't make sense to release this, and that's not even getting into the security aspect of things.
In addition to the very good replies you already got, sometimes it just wouldn't make a difference.
In some environments, if the code was not developed using a particular process (my experience is with regulated medical devices), it may as well not exist, because putting it into production would be in violation of Federal law.
There are also small niche products where most of the IP is in the driver: lose control of that and you're on your way to being out of business.
I have a few devices with no public drivers of any kind. Thinking about how to develop drivers on Linux, I'm guessing I should look into the chipsets used on the board and see if those companies have drivers I can use as a starting point?
Does it have Windows drivers? If it does, you can set up a VM, set the device as passthrough, and sniff the traffic to it. Then it's not really any different than reverse engineering a network protocol or file format.
tl;dr: Ben Cox bought a VGA-in => webcam-out device whose manufacturer hadn't updated the Linux drivers in a while. This details his journey of using tcpdump, et al., to analyze the sent packets. Eventually, he simulates the USB firmware and FPGA download such that he can use the device. A quick, engaging read that was an appropriate level for this non-C programmer.
Looking through his archive, I recognize his bgp battleship post from last year on HN.
It's one of the contexts in which I'm willing to be the change I want to see.
It's disappointing to click on the comments for a post and no one (especially the submitter) has told me what I'll learn/experience. And yet, there it was, four hours old, ten points up and no comments. So I give it a skim and it's really good. Now I want others to read it. Spending five or so minutes polishing three sentences is basically free for me and positive sum (unless I do a bad job).
Cynically, tl;dr submission summaries are how I've chosen to feed the beast until it grants me downvoting power.
What if the vendor would prefer not to disclose source code? Not all drivers can be included in the kernel, and even if they were, that would be a lot of code, and who is going to maintain it?
This document shows how unfriendly Linux is to third-party code. Either you contribute to the giant monolithic kernel, or your device won't work with the next major kernel release.
> Depending on the version of the C compiler you use, different kernel
> data structures will contain different alignment of structures
That doesn't make sense. How does Microsoft use structures in their API then?
Furthermore, unlike Windows software, you cannot compile the Linux kernel with any compiler except GCC [1]. It doesn't comply with C standards.
> Depending on what kernel build options you select, a wide range of
> different things can be assumed by the kernel
> - different structures can contain different fields
That looks like poor design. Why care about non-standard configurations? Let those who use them solve the problems themselves, which would be the true open source way.
> Linux runs on a wide range of different processor architectures.
> There is no way that binary drivers from one architecture will run
> on another architecture properly.
Supporting i386 is enough. Nobody is going to insert a PCI device into a smartphone. If there are new CPU architectures, one can use an i386 emulator.
> As a specific examples of this, the in-kernel USB interfaces have
> undergone at least three different reworks over the lifetime of this
> subsystem
They could make three versions of the API, with adapters that would support drivers written against the older versions.
I think the real reason is that designing API is hard, and Linux developers have no interest in this. They care only about server platforms and those platforms have open-source drivers because they need Linux. Nobody cares that your USB webcam doesn't work because you don't connect webcam to a server.
I think that Linux could also rely on existing Windows drivers that exist for almost every device. Linux could provide an environment similar to Windows kernel, and run those drivers in a sandbox. For many devices, this is the only sane way to have them working with Linux.
There are many ways to get somewhere, but here's a general list of things you have to do:
1) Think for a little bit about what you have got and what you want to do
2) Do research. Has this been done before? Have similar things been done before? (Usually the answer is yes to either question)
3) Think whether it's theoretically possible to do what you want to do with the information and parts you have (in this case: there is a working driver for Linux and Windows, but they're closed-source)
4) Fixing computer things usually involves a lot of debugging (tracing). Will tracing help us out here? What tracing tools/skills do you have, what tracing tools/skills would you like to have? Often you'll need to extend the tracing skills/tools you already have, but normally you don't want to spend too much time on this. (In this case, we can trace without any specialized equipment -- we just need a virtual machine.)
5) Once you have the appropriate traces, take a good look at them (you'll probably want some scripting abilities here) and see if you notice any patterns (e.g. "lots of data after this", "just a little data after this", "this bit of data is always the same", etc.)
6) If you don't know what to do at this point, maybe go back to the tracing step
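The pattern-spotting step above can be partially automated with a few lines of scripting: given several captured packets of the same type, report which byte positions never change (likely framing/opcodes) and which vary (likely payload). The packets below are made up for illustration.

```python
def byte_diff(packets):
    """Split byte positions into constant (position -> value) and varying."""
    length = min(len(p) for p in packets)
    constant, varying = {}, []
    for i in range(length):
        vals = {p[i] for p in packets}
        if len(vals) == 1:
            constant[i] = vals.pop()
        else:
            varying.append(i)
    return constant, varying

packets = [bytes.fromhex(h) for h in ("cc0100ff00007f",
                                      "cc010100ff007f",
                                      "cc01020000ff7f")]
const, var = byte_diff(packets)
print(sorted(const))  # positions that never change: [0, 1, 6]
print(var)            # candidate payload positions:  [2, 3, 4, 5]
```

Running this over a real capture (exported from Wireshark, say) quickly surfaces the "this bit of data is always the same" observations by hand-counting-free means.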
I would have given up long before the point this person got to. This is pretty damn amazing, and could give a little bit of life to these old, closed, binary-blob, everything-in-software devices. Tons of kudos for doing this.
On a side note, I wish there were a more standard way to deal with Linux kernel modules. The ABI incompatibility leaves a ton of dead old hardware out there on non-maintained kernels. In an ideal world without tons of binary blobs and .o files out there, this wouldn't matter: you could recompile everything for the latest version.
You might have to fill in some missing symbols or swap out deprecated functions, but it'd be considerably easier than this. That was the ideal behind open source: that we could always figure out how our stuff works. Yet it seems like today we only have open source middleware, and so much closed source stuff built on top of open source stuff.