Reverse engineering software licensing from early-2000s abandonware (yingtongli.me)
307 points by whack on Aug 30, 2021 | 94 comments



After spending years in manufacturing IT, there are hundreds if not thousands of systems like these running, where the company that created the software is long gone.

This is where you end up with DOS, Windows 3.1 and even Windows NT computers controlling machines that make millions of dollars worth of product, 24 hours a day.

We've spent hours scouring eBay or industrial auction sites finding parts of computers to keep 'just in case'. None of this software virtualizes easily or can even be moved to a machine of similar vintage without relicensing. Some of it is hardware dongles, some of it software keys.

It seems like you could create quite a business being able to crack this software. Companies would pay tens of thousands to get these machines running. Often times the 'new' version of the hardware and software is $100,000 US.

In a few years, internet-based licensing will be the thing to crack.


> It seems like you could create quite a business being able to crack this software. Companies would pay tens of thousands to get these machines running.

That would be against the copyright holder's interests because, as you point out:

> Often times the 'new' version of the hardware and software is $100,000 US.

And as per the Disney-lobbied US copyright law, you would have to wait at least the life of the author + 70 years, or 95 years from publication, depending on the circumstances.


The supplier of the "new" software often isn't the same as the (usually defunct, hence the trouble) legacy supplier.


That doesn't mean the copyright ownership disappeared. Those either got acquired/merged into another entity or liquidated in case of bankruptcy.


And the software is most likely owned by the competitor that sells said "new" software that most likely doesn't fit the niche anymore. They normally refuse to re-license even without having to give updates/support.


The hardware failing doesn't take away their legal right to operate the software, does it? (Maybe the contract says "for the life of this dongle?")


Right, but that's unlikely to be a scenario where there's any enforcement going on.


If this were tried, the present copyright holders would come out of the woodwork and enforcement would happen pretty quickly.


Are there actual cases of this on the books?

I'd be pretty surprised if a judge was like "ah yes the defendant's consultancy modified a piece of industrial control software that you haven't given a thought to in three decades to make it run on a modern computer and not require a parallel port dongle, and that's definitely a DMCA violation and you've been harmed by it and deserve all the money."


"Laches" https://en.wikipedia.org/wiki/Laches_(equity) is probably a relevant concept here. In a scenario like the one you're describing, the delay is likely to be taken into account, but it's unlikely to be the whole of the argument.


As I understand it (IANAL), the delay period for laches only starts when the complainant becomes aware of the violation.


That's irrelevant in many cases where you want to support existing hardware. Clean room reverse engineering for interop purposes is allowed under copyright law.


Sure, but in fairness, this thread is about bypassing the copy protection in the existing software. Which I'm also arguing is safe, but it is not as obviously safe as a clean room reverse engineering effort.


I'll reiterate my comment from a few parents up: This is currently happening, and no, the copyright holders either don't even know, don't care, or are truly gone (even as far as being deceased.)


Is it even illegal to crack software? Cracking copy protection is not the same as software piracy.


Laws will vary worldwide. For the US I believe the DMCA makes circumvention a crime. This was why the DeCSS number was illegal.

There may be b2b contracts in addition to what the law says, but I believe the DMCA has an interoperability exception. I'm curious if "getting it to run on new hardware" is enough for that to trigger.


*Deep breath* *Opens can of worms*

What about defining the equipment as "repairably broken" and asserting that circumventing the license protection falls within right to repair?

*Opens even bigger can of worms*

What about invoking this in a situation where someone's using an older piece of equipment and does not want to pay for example $500k, $1m, or more to green-field replace an entire installation when 99% of the existing system has perhaps a decade or more life left in it?

--

I'm not entirely read up on the finer points of the John Deere tractor situation (which is kind of the current poster child for this whole thing), but I think the above arguments resonate with the precedent set by that particular case.


I did try to search for John Deere to remind myself before posting because I wanted to draw that link, but the best I could find was that farmers were installing software from Russia (where it's widely available, unlike the US) to enable repairs. The articles didn't say this was illegal (which makes sense under my limited understanding of the DMCA), but did mention it'd break warranty under current rules.


It's a violation unless you have rights to use the work. This was added after the first revision, probably because they were worried about a challenge to the DMCA on these grounds striking more of the law's text.

So, below, you have people wondering about cracking machine equipment. Totally legal to do.


You can turn a Windows HDD into a VHD, but I was never able to get it to boot up properly on a VM, so I just use Linux to view old files instead. I forgot the approach I used, but it probably isn't too complicated to google.


I recall reading that Windows historically built its hardware abstraction layer at install time and that's why you couldn't simply move a disk from like hardware to unlike hardware without a reinstall. You may want to try a fresh install of Windows, then copy over the application files. It might be tough if the vendor modified Windows libs and did not provide any installer.


In the days of XP I could not slap an old hard drive from one computer into another and expect it to boot. With Windows 10 I have done so many times, to my surprise.

We have also virtualised many physical servers at my job. We use Veeam to create the VHD.


I'm kind of screwed cause I don't have the original hardware, I wonder if there's some sort of QEMU hackery I could pull then.


Boot Windows into safe mode and run sysprep. This strips Windows of hardware-specific drivers, so that QEMU should have an easier time running it. The whole thing is crusty, and it may or may not work. It does change Windows, so make a copy and keep the original image unchanged.


Thank you! I might wind up trying this. It's an old computer my wife used to own that I imaged, and I've been wanting to make it bootable for the nostalgia effect that it would produce. I just want the least amount of damage to the OS as possible. Getting the files is one thing; being able to boot it is something far more interesting.


I think there was a way in NT (at least with NT5/2000) to replace HAL components with more 'generic' components (eg standard SATA drivers, uniprocessor kernel) so you could more easily move it to different hardware.

I've only read about it but never tried it though, it may have been using the SYSPREP tools?


For Windows XP, you have to go to device manager and uninstall the CPU and/or the motherboard, that will reset them to generic ACPI something if I recall well.

I've done it once, migrating a very old machine to new hardware by just moving the disk.


I just thought about this, but do you think upgrading the OS on the VM might trigger it?


You can totally do this, at least since Windows 8.1. The main hurdles are the boot menu settings (if you're unfamiliar with them) rather than any hardware-related stuff.


> millions of dollars worth of product, 24 hours a day

> Often times the 'new' version is $100,000 US

Is the price really the issue here? It almost sounds like the company could afford it if it wanted to.


A $100,000 machine is probably a machine with an entire workflow and facility engineered around it.

So it's $100,000 plus:

* The costs to disassemble and remove the old machine

* The costs to install the new machine, which could include specialized contractors.

* The costs to realign any supporting equipment to fit the new machine (i. e. if a conveyor belt feeding parts in or out has to be moved)

* The costs to retool the rest of the manufacturing process that was dependent on quirks of the old machine. This could have a significant and expensive research phase.

* Retraining to operate the new machine

* Any parts scrapped by use in the testing or teething-trouble adjustment of the new machine

* Downtime of the actual production process, potentially extending to failure to meet contracts and associated penalties

So that "$100,000" machine could have end-to-end costs well into seven or eight figures.


Yes, such services exist. What a coincidence that I just posted a relevant comment about it on a different article yesterday:

https://news.ycombinator.com/item?id=28351754


Oh hey, I'm the author of the post! Happy to chat about any aspects of it.

This project is a spiritual successor to an earlier project reverse engineering a gaming DRM system, so if you enjoy this post you might enjoy that older one too: https://yingtongli.me/blog/2018/11/16/drm1-1.html


Given that the application was written in Delphi, I'd bet it's using some form of Partial Key Verification [0], which I wrote a fun blog post about a couple months back [1]. :)

[0]: https://www.brandonstaggs.com/2007/07/26/implementing-a-part...

[1]: https://keygen.sh/blog/how-to-generate-license-keys-in-2021/
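
For anyone unfamiliar with the idea, here is a minimal Python sketch of partial key verification. It is not the scheme from either linked post, and nothing suggests it is what the software in the article does. The key layout, the `SECRETS` values and the `subkey` mixing function are all made up for illustration. The only point is that each shipped binary checks a subset of the subkeys, so a keygen derived from one release fails against a release that checks a different subkey.

```python
import hashlib

# Hypothetical per-slot secrets. The vendor's key generator knows all of them;
# each shipped binary would embed only the ones it checks.
SECRETS = [b"alpha", b"bravo", b"charlie", b"delta"]


def subkey(seed: int, secret: bytes) -> str:
    """Derive a 4-hex-digit subkey from the seed and one secret (illustrative mixing only)."""
    digest = hashlib.sha256(seed.to_bytes(4, "big") + secret).hexdigest()
    return digest[:4].upper()


def make_key(seed: int) -> str:
    """Vendor side: emit the seed followed by every subkey."""
    parts = [f"{seed:08X}"] + [subkey(seed, s) for s in SECRETS]
    return "-".join(parts)


def check_key(key: str, checked_slots: list[int]) -> bool:
    """Client side: verify only the subkeys listed in checked_slots."""
    parts = key.split("-")
    if len(parts) != 1 + len(SECRETS):
        return False
    try:
        seed = int(parts[0], 16)
    except ValueError:
        return False
    return all(parts[1 + i] == subkey(seed, SECRETS[i]) for i in checked_slots)


def forge_key(seed: int, known_slots: list[int]) -> str:
    """A keygen built from one release can only fill in the slots that release checked."""
    parts = [f"{seed:08X}"]
    for i in range(len(SECRETS)):
        # "XXXX" is a placeholder that can never equal a real hex subkey.
        parts.append(subkey(seed, SECRETS[i]) if i in known_slots else "XXXX")
    return "-".join(parts)
```

With this toy layout, a key from `forge_key(seed, [0])` passes `check_key(key, [0])` but fails `check_key(key, [2])`, while a genuine key from `make_key` passes both.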


Wow, that's super interesting reading! Thanks for the links! I will certainly be keeping this all in mind if I ever jump ship to proprietary software land ;)

The key validation algorithm in this software is extraordinarily simple, so I'm leaning away from there being anything fancy. I was unable to correlate keys used in later versions of the software with this algorithm, though, so you might be on to something. (I don't have a copy of a later version, but would love to check if I ever get my hands on one.)


Oh boy there are some interesting possibilities here with the partial key verification stuff.

What if you release a new version where, if the key is valid under the old check but not under the new check (indicating a keygen-ed licence), you start subtly screwing with the user. Like EarthBound or Spyro...

Quite off topic but very interesting!


Wine dev, here :) Using Ghidra & winedbg is something I do quite often for Wine development, it's super cool to see someone using those tools for other purposes, too.


Happy to oblige :P You know, now that you've mentioned that, it only just occurred to me that winedbg is probably mostly used for Wine-related debugging, not debugging things that happen to run in Wine!


Sorta. Winedbg mostly exists because most native debuggers won't have support for the situation Wine creates (Windows PE files with a non-native memory layout co-existing with native libraries). Just turns out that debugging Windows software in Wine is not a very common usecase outside of Wine dev :)


Interesting post! I recall a lot of older copy protection instructions around eax:edx register/space. Is there a reason you don’t just JMP around the license validation entirely?

Also, I love your anti-cv!


Totally, putting some small patches into the binary would definitely work in the case of just wanting to get rid of the licence validation. The goal of my project, though, was to get to a state where the software could be used in its original unmodified state, with a "real" licence. Just felt more authentic! So the process over the 3 parts of the blog series is guided by that final destination.

Re: anti-CV – Thanks! Imposter syndrome is a big problem in medicine, as it is in IT and probably every field, and I wanted to do my little bit to combat it. (Not my idea, got it from my seniors, who got it from some uni professors.)


I think it's pretty cool you're in the medical field and doing this stuff lol. I've already noticed that "hacking" tends to attract a lot of random people from different disciplines lol


What's the name of the app? Why the secrecy?


It's a good question – I'll copy what I wrote for the Redditors who had the same thought:

> Copyright law is pretty scary around anti-circumvention rules – putting the name of the software right in an article about how to break its DRM/licensing just sounds like asking for trouble, so I never do. (Not legal advice – just my personal musings!)

> At least if the software is unnamed, the article is clearly more for educational purposes – you won't find the article if you've got the software and you're trying to break it, and you won't have access to the software if you're just reading the article.


To be honest, it wasn't too hard to figure out what the target software was from all the subtle clues; after all, crackers are the sort of people who will enjoy such a challenge too! I won't reveal it here either for your sake, but don't think that the "anonymisation" was even close to being complete...

Then again, I also exercised my skills from the Fravia/Searchlores era ;-)


Makes a lot of sense.

Out of benign curiosity, was the software...?

- Industrial/control oriented (talking to bespoke hardware)

- An "internal" B2B line of business thing

- An off-the-shelf/productized/marketed piece of software

I suspect the latter.

I'm naturally also curious what it was for, but I suspect that even generally scoping that would make identification significantly easier for a large majority of people, so I'll leave it there :)


I mean, if the company that wrote the software doesn't exist anymore, who's going to bring that copyright claim?


On a technical point, even if the company has ceased to exist, its assets might have been sold, or it might have assigned its copyrights at some point, or perhaps a third party has a copyright interest, and there would be no way for me to know about that.

The broader point to make is that this is a general policy of mine – I deidentify all software that I discuss in any of my RE writeups. Having a blanket policy avoids needing to make ultimately arbitrary decisions about what to name and what not to name – and in any case, not naming the software doesn't prevent anyone from reading the writeup and taking inspiration from it if they choose.


I get what you are saying. These days, the first thing lawyers pay others to do is run comprehensive internet searches.

I've always felt the biggest mistake people make is thinking no one is looking at their ramblings.


Many companies don't just cease to exist, but rather the rights to their IP are purchased. Some of that IP is viewed as not valuable and ignored... but they still hold the rights to it.


It's a bit like landmines left over after a war.


In some cases IP like this can be even more dangerous because there is some disgruntled CEO potentially sitting around with ownership of all the IP and he/she sees your infringement as a quick cash grab.


Some other company or individual could have bought their IP portfolio and now own the rights. They have no obligation to publicize this, as far as I know.


If you ever get into reverse engineering Mac PPC copy protection, I’d be interested in your approach.

You might be able to guess why I write this.


> However, the decompilation of the next part of the function is incorrect

How (long) did (it take) you (to) find out?


It was fairly straightforward to see in this case honestly. I made a habit of looking at both the disassembly and decompiled code – my previous project was in IDA Free which had no decompilation, so I was used to referring to the disassembly. The addresses to use for breakpoints also come from the disassembly, so one naturally spends a lot of time looking at it.

In the first case, the decompiled code reports a function call, but in the disassembly it is preceded by pushing some suspicious-looking magic numbers onto the stack which are not reported in the decompiled code – clearly, something was going on there.

In the second case, the "ret" instruction at the supposed end of the function was immediately preceded by pushing an address to the stack – so again fairly simple to determine that the return must necessarily jump to that address, rather than return from the function.


As someone who is significantly smarter than me, how does Ghidra compare to IDA? I'd love to get into decompilation, but I've heard that the free tools leave a bit to be desired.


Ghidra is as good as IDA with caveats, in my opinion. If you're reversing a less-common architecture (not ARM/x86) which Ghidra supports well, it's much more effective than IDA simply by virtue of having a pseudocode decompiler (IDA's Hex-Rays is architecture-specialized).

The IDA GUI and scripting functionalities are much more common in tutorials and the ecosystem, so the Ghidra learning curve can be greater, but it's not really inferior.

IDA has fewer decompilation/disassembly bugs but in both IDA and Ghidra, bugs are usually fairly easy to spot and not a huge detriment to achieving a goal.

IDA deals with C++ better than Ghidra (imo).

Anyway, for free Ghidra eats IDA's lunch, and the IDA home edition offering is weak - so for a hobbyist, Ghidra is a clear home run.


I haven't ever been able to test IDA's decompiler or debugger, as IDA Free only does x64, and all the RE I've done is on 32-bit binaries.

Ghidra's decompiler worked fine for this project. It made 2 relevant mistakes which I talk about in the blog posts, but they were fairly easy to identify when comparing with the disassembly.

As I discussed in the post, Ghidra did have some difficulty (which IDA did not have) locating all the functions, so I did end up using both Ghidra and IDA in the initial stages.

The progress that Ghidra is making though (e.g. the recent implementation of debugger support) is promising for the future.


More precisely, this is about reversing the license key generation/verification algorithm.

I got a bit confused by the title at first, thinking they were trying to deduce the specific licensing terms or something.


This tangentially reminds me of when I used to work on some commercial software which linked against a FlexLM binary blob for licence checking. We had a customer bug report where the software was occasionally crashing on start up on 64-bit Windows and it turned out to be happening in the licence checking code.

I disassembled the blob and it turned out that it was down-casting a NT handle to 32-bits. This seemed to be fine in practice as I never observed the higher bits set. Unfortunately however, the code then used a signed load to read it in from memory and hence corrupted the handle if the 32nd bit was set, causing a crash.

I made a patched blob which fixed the problem but sadly the legal department vetoed shipping it in case it violated our license with Flex :-P.
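
To illustrate the failure mode described above, here is a small Python sketch. It only models the arithmetic; the handle values are made up and the function names are hypothetical, not anything from the actual blob. Truncating a 64-bit handle to 32 bits is harmless while the upper bits are zero, but reading those 32 bits back as a signed integer and widening to 64 bits fills the upper half with ones whenever the top bit of the 32-bit value is set.

```python
import struct


def truncate_to_32(handle: int) -> bytes:
    """Store only the low 32 bits of a 64-bit handle (the down-cast)."""
    return struct.pack("<I", handle & 0xFFFFFFFF)


def load_signed(stored: bytes) -> int:
    """Read the 32 bits back as signed and widen to 64 bits (the buggy load)."""
    (value,) = struct.unpack("<i", stored)
    return value & 0xFFFFFFFFFFFFFFFF  # view the sign-extended result as 64 unsigned bits


def load_unsigned(stored: bytes) -> int:
    """Read the 32 bits back as unsigned and widen to 64 bits (zero-extension)."""
    (value,) = struct.unpack("<I", stored)
    return value
```

A handle like 0x1234 survives both loads unchanged. A handle like 0x80000010 survives `load_unsigned`, but `load_signed` turns it into 0xFFFFFFFF80000010, which no longer refers to the same object.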


Fun read. I wrote a DRM system in the early 90's for try-before-you-buy. Instead of a gateway, we would percolate portions of code APIs through a lattice. Somewhat like a one-way hash. I think there were 512 keys -- one for each node. You couldn't disassemble static code, you had to set breakpoints. But, there was a bug. Instead of extracting 512 keys, you only had to extract 9. So, it got cracked sooner than expected.


(Regarding footnote "3" of the post):

Nice! Next time when I encounter an "Enter license key:" dialog, I'll simply try some simple registration codes first.

I'll start with clever variations of the value "1", e.g. 00001, 00010, 00100 ...


I wonder, from a legal point of view, if putting a random text in a textbox and getting a valid license could be proven to break any copyright law or something.


You could be breaking copyright law even if the software had no DRM.


I wish I knew how to do this. There’s Mac software I bought 10 years ago and found myself using it again today, but is buggy. The developer released a new major version in the meanwhile and then retired it due to low sales.

I contacted him to sell me a license but he refused categorically, telling me I “should have bought it when it was being sold.”

Now I find myself using a buggy version and hoping I’d get around to cracking the new version myself. Heck I’d pay to get it cracked.


I can take a look at it. How do I contact you?


Sorry about the delay, I was hesitant to paste my email address on HN. If you see this within 2 days you can email me at wwqs@tmp.mail.e1645.ml or samanyue8@delaysrnxf.com


I had to do this in a past job too - a vendor provided a module with a license check which wouldn't allow the binary to run on Windows Server, but the "enterprise" solution which was licensed for Windows Server was not only no longer sold but lost!

Ghidra didn't exist yet and I didn't care to deal with the IDA demo, so I used OllyDbg and then just manually hex-patched the binary. Simpler times :)


You had another problem too: Windows XP x64 Edition was built on NT 5.2, the same branch as Windows Server 2003.

A few programs (especially antivirus software) did a simple version check: Windows reports version 5.2, so they assumed it must be Windows Server 2003 and refused to run, because you have to pay more for a server edition.
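
A rough Python model of that mistake, assuming the check looked only at the major/minor version. The `OsVersion` class and both function names are hypothetical; the three constants are the `wProductType` values from the Win32 `OSVERSIONINFOEX` structure. Nothing here calls the real Windows API.

```python
from dataclasses import dataclass

# Values of wProductType in the Win32 OSVERSIONINFOEX structure.
VER_NT_WORKSTATION = 1
VER_NT_DOMAIN_CONTROLLER = 2
VER_NT_SERVER = 3


@dataclass(frozen=True)
class OsVersion:
    """Hypothetical stand-in for the version data an installer would query."""
    major: int
    minor: int
    product_type: int


def naive_is_server(v: OsVersion) -> bool:
    """The flawed check: treat every NT 5.2 system as Windows Server 2003."""
    return (v.major, v.minor) == (5, 2)


def is_server(v: OsVersion) -> bool:
    """Decide from the product type instead of the version number."""
    return v.product_type != VER_NT_WORKSTATION


XP_X64 = OsVersion(5, 2, VER_NT_WORKSTATION)
SERVER_2003 = OsVersion(5, 2, VER_NT_SERVER)
```

Here `naive_is_server(XP_X64)` is true even though XP x64 is a workstation edition, while `is_server` tells the two 5.2 systems apart.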


I don’t know why someone would do that. VerifyVersionInfo has existed since W2K.


OllyDbg was basically the best-in-class of freeware tbqh; it's a shame the developer never really got the 64-bit version out. Other software, such as IDA, is leaps and bounds ahead of OllyDbg, but IDA's crazy expensive. I've not yet tried Ghidra even though it's been out for a while. I hear it's great.


x64dbg is probably the spiritual successor to OllyDbg: https://x64dbg.com/


Oh that's good to know. I've since moved on from the dumpster fire of Windows. Let me move the goalpost and desire this for Linux.


Ah ollyDBG! Trip down memory lane. Also brings back memories of softICE.


SoftICE was awesome! Did you study at the +Fravia/+ORC +H.C.U. (High Cracking University, if I remember correctly)?


I've actually never seen reverse-engineering explained in a more straight-forward manner. I was able to skim the article and understand exactly what was done in a minute or so. Excellent article!


Thanks for the feedback! I get this comment a bit, and I'm not really sure what it is that I'm supposedly doing right, but I'll do my best to keep doing it!

I actually don't know all that much about binary RE – my usual work is generally high-level Python stuff – so I try to write how I would like things explained to me, which I think helps.


I think not being an expert can contribute to clearer communication[1]. I sometimes joke that physics is such a hard class primarily because it's being taught by physicists...

1: Of course not being an expert can contribute to communicating the wrong thing clearly, which is its own problem.



Not sure why this got downvoted, it's a spectacular example of why OP's post was good


The way you explain things doesn't require a ton of existing domain knowledge. Having basic intuitive understanding of binaries and the fact that different codepoints have different memory addresses was sufficient, whereas most articles on this topic get super technical super fast.


Great series of articles. I also went through the other DRM article linked in part 3. I love the insight into how something like this was reverse engineered.

Does anybody know any similar articles? Maybe something where the software is named and it's possible to follow along step-by-step? Seems like it'd be a fun exercise.


Glad you enjoyed!

It sounds like what you might be after is some content on crackmes/specific RE challenges. I'm not involved in that space, so someone else probably would have better links, but one challenge that was my start in RE was the Synacor Challenge: https://challenge.synacor.com/

It starts off just as a programming challenge, no real RE knowledge required, but if you see it through to the end you'll definitely wind up with a bunch of foundational RE skills. And there are a whole bunch of public writeups online if you want to follow along with someone else's approach.

(Just to note, though, that it's based on a custom CPU architecture – implementing that is the programming part of the challenge – so very much from the ‘learn it the hard way so when you do regular stuff it feels easy’ school of thought.)

The Youtube channel LiveOverflow also has some videos going step-by-step through some RE puzzles, and his content is very digestible.


I wrote a pair of articles earlier this year about hacking a GameCube game

https://www.smokingonabike.com/2021/01/17/hacking-super-monk...


Very nice. I'm up to part 2 (https://yingtongli.me/blog/2021/08/29/drm5-2.html) and I had a thought.

The SEH pattern (PUSH 32bit address then RET) should be identifiable with a plugin, and a code flow override should fix the decompilation.

I wonder, did you try this, and did it help fix the Ghidra decompilation?
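
As a rough illustration of the detection half of this idea, here is a standalone Python sketch. It is not a Ghidra plugin and does not use Ghidra's scripting API. It scans raw x86 bytes for `PUSH imm32` (opcode 0x68) immediately followed by `RET` (opcode 0xC3) and reports the pushed address, which is where control actually goes. A plain byte scan like this can give false positives when the pattern shows up inside data or in the middle of another instruction. The function name `find_push_ret` and the sample bytes are made up.

```python
import struct

PUSH_IMM32 = 0x68  # x86 opcode: push a 32-bit immediate
RET = 0xC3         # x86 opcode: near return


def find_push_ret(code: bytes, base: int = 0) -> list[tuple[int, int]]:
    """Return (address_of_push, pushed_target) for every PUSH imm32; RET pair."""
    hits = []
    # The pair is 6 bytes long: 1 opcode + 4 immediate bytes + 1 RET.
    for i in range(len(code) - 5):
        if code[i] == PUSH_IMM32 and code[i + 5] == RET:
            (target,) = struct.unpack_from("<I", code, i + 1)
            hits.append((base + i, target))
    return hits


# Made-up sample: nop; push 0x00401234; ret; nop
SAMPLE = bytes([0x90, 0x68, 0x34, 0x12, 0x40, 0x00, 0xC3, 0x90])
```

Calling `find_push_ret(SAMPLE, base=0x400000)` reports one hit at 0x400001 with target 0x401234. A real plugin would then have to tell the decompiler that the `RET` is really a jump to that target.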


Good thought! I don't have enough understanding of Ghidra to attempt this myself I think, but it looks like it is already on the radar of the Ghidra folks: https://github.com/NationalSecurityAgency/ghidra/issues/2477

Sounds like try-catch handling is not implemented in general yet, but is on the cards.


The easiest way that I used to do this same kind of thing is to either patch the code to add a JMP instruction bypassing the registration check altogether, or use an in-memory patcher that runs the program and does the same thing... Validating the whole registration process seems a lot more tedious than just patching the binary to skip it completely.
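
For readers who haven't seen this kind of patch, here is a toy Python sketch on a made-up byte string, not on any real program. It assumes the branch in question is a short `JE` (opcode 0x74) and rewrites it to an unconditional short `JMP` (opcode 0xEB), leaving the one-byte displacement alone. The function name and sample bytes are hypothetical.

```python
JE_SHORT = 0x74   # x86 opcode: jump short if equal
JMP_SHORT = 0xEB  # x86 opcode: unconditional short jump


def patch_je_to_jmp(code: bytes, offset: int) -> bytes:
    """Return a copy of code with the JE at offset turned into an unconditional JMP."""
    if code[offset] != JE_SHORT:
        raise ValueError(
            f"expected JE (0x74) at offset {offset}, found {code[offset]:#04x}"
        )
    patched = bytearray(code)
    patched[offset] = JMP_SHORT  # the 8-bit displacement that follows is left untouched
    return bytes(patched)


# Made-up sample: test eax, eax; je +5; five filler nops; ret
SAMPLE = bytes([0x85, 0xC0, 0x74, 0x05, 0x90, 0x90, 0x90, 0x90, 0x90, 0xC3])
```

`patch_je_to_jmp(SAMPLE, 2)` changes exactly one byte, so the file size and every other offset stay the same.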


Yep absolutely. What I wrote for someone else who had the same thought:

> Totally, putting some small patches into the binary would definitely work in the case of just wanting to get rid of the licence validation. The goal of my project, though, was to get to a state where the software could be used in its original unmodified state, with a "real" licence. Just felt more authentic! So the process over the 3 parts of the blog series is guided by that final destination.


I believe this is what is done for a lot of old games (you know, the ones that would ask you to enter page 7, line 3, word 4 from the manual, etc.) If you buy them from Good Old Games or something, you simply enter any answer for the copy protection and it lets you in.


Very nice writeup, brings up memories with Delphi.

Looks like Armadillo protection at first sight, but not 100% sure, been too long :)


This was a great article and makes me want to dig in to some of my own ancient software.

Could these tools and techniques be used on older PowerPC Mac executables? I have some old software that was protected by ADB dongles. I own the software, and even have the dongles, but I don't have any PowerPC ADB-equipped machines.


I’m guessing most software from this era has been cracked already. I know my niche B2B software would get cracked within a few days of a new release.


messing around with soft ice was fun lol



