If, hypothetically, someone were to take a copy of that code just before the business destroyed it, they could do almost whatever they liked with it, because the original business wouldn't be able to prove it was stolen from them.
Let me use Microsoft as an example. (And, I'd rather not identify the company in question, but I can say that it wasn't Microsoft – I have absolutely zero information on what Microsoft's data retention policies might be.)
Suppose Microsoft's policy was to destroy all copies of the Windows 3.0 source code. Having followed that policy, they have no copies left. Now suppose an ex-Microsoft employee has a copy of the Windows 3.0 source code in their possession, and decides to distribute it. You really think that Microsoft couldn't prove in court that said source code is in fact the Windows 3.0 source code, even if they no longer possessed any copies of it themselves?
> I'd rather not identify the company in question, but I can say that it wasn't Microsoft – I have absolutely zero information on what Microsoft's data retention policies might be.
While earlier things may have been lost, Microsoft operates a Microsoft museum. Anything they have today is pretty likely to be intentionally preserved.
It's been fifteen years since I last darkened MSFT's doorsteps, but they had an internal server that had everything back to DOS 2.0, IIRC. Now, as for source code, I cannot say with any real knowledge, but the bits were (and I'm guessing, still are) there.
The easiest one would just be to prove that the person in question didn't have the many thousands of programmer-years at their disposal necessary to make something so large and therefore the only possible source is the one entity that ever did, and used it to make Windows. Since "some guy who leaked Windows" is obviously not in possession of that sort of programmer-power, it's a pretty open-and-shut case.
That one is so obviously going to win that speculating further isn't worth much. There are plenty of other ways to prove it too, based on architectural similarities to the existing binary code (which I presume has not been wiped off the face of the Earth). That approach, while quite sufficient to prove the case as well, is more abstract and prone to internet snipers trying to prove their smarts by quibbling about endless leaves while missing the obvious location and sheer mind-blowing size of the forest. But we don't have to go there, because the previous paragraph would do just fine.
>The easiest one would just be to prove that the person in question didn't have the many thousands of programmer-years ... //
We're talking about a copyright infringement case here. They have to prove that they own it; proving he doesn't gets them a little way along the road, but nowhere near far enough.
Compile with different flags and I don't see how your "current binary" argument works; also they said the code had been changed a little, at least.
Whilst the necessary proof should only be "beyond reasonable doubt" the burden of proof is still with "Macrosoft" and they have to prove their ownership, with no original materials.
IIRC you're legally limited in what you can recover from an alleged infringer if you haven't registered the work in question with the US copyright office. If MS had destroyed all source code for Windows 3.0, then presumably they never registered the work (and if they had, why bother destroying it?). I agree that an infringer would likely not get away with it scot-free, but MS also wouldn't be able to collect statutory damages or attorney fees.
US Copyright registration doesn't require you to lodge the whole source code. You are allowed to print it out and submit a 50 page sample of the printout. (I think you are allowed to send the Copyright Office the whole thing if you want, but 50 pages is all that is required.)
A vendor could submit 50 pages of their source code, then delete the rest. If an ex-employee later distributes a retained copy of the deleted source code, the 50 pages are sufficient for statutory damages, even for parts of the source code not included in that 50 page sample. (The point of the 50 pages is to be a big enough sample to identify the work.)
> Wouldn't MS also have to prove that the code is not a derivative of some other PC desktop solution of that era?
That would be pretty trivial. If you had the Windows 3.0 source code, you could compile it, and the binaries you'd get would be the same as the shipped binaries. (Your version of the source code might not be exactly the same version as the shipped binaries, but even if not 100% identical, a binary diff or disassembly would show great similarity.) Even if someone has modified the code a bit, there would still be immense similarities there, which would be difficult to explain as anything other than copying.
Also, GeoWorks and Windows 3.0 have quite different APIs. If you find code exposing an API which is bug-for-bug compatible with Windows 3.0, then it almost surely is a derivative of the real Windows 3.0 source code. It could be a derivative of some compatible implementation of the Windows API, such as Wine or Sun Wabi, but neither of those would be bug-for-bug compatible; and, as jerf pointed out, a defendant who claimed that they (or someone else other than Microsoft) wrote the code from scratch would have to provide some evidence that they (or someone else) actually carried out the Herculean task of creating a bug-for-bug compatible clean room clone of the Windows API. And not just the application-level API – Windows 3.x has heaps of internal APIs, which largely weren't used by applications and which newer Windows versions don't support (for example, the legacy Windows 3.x device driver models) – any modified copy of the Windows 3.x source is going to support all that, whereas a clean room reimplementation is unlikely to do so. Courts decide civil cases on the balance of probabilities, and such a claim must be viewed as improbable unless some concrete positive evidence is put forward to demonstrate it is true.
(IANAL, but you don't need to be a lawyer to know that Microsoft would win this one.)
>That would be pretty trivial. If you had the Windows 3.0 source code, you could compile it, and the binaries you'd get would be the same as the shipped binaries. //
That's not how that works, AIUI. Repeatable, provable builds are, I understand, a recent phenomenon. Use a different compiler, or different flags, and you get different binaries.
You're right on BoP, but the burden of proof still lies with the alleged copyright holder of a work which, as they certified in their internal processes, does not exist anymore.
> Repeatable, provable builds are, I understand, a recent phenomenon. Use a different compiler, or different flags, and you get different binaries.
You are right that I overstated my case somewhat. You are not guaranteed to get the exact same binary even with the exact same build system, and reproducing the exact same build system decades later may not be easy.
However, using the same version of the same compiler with the same flags, you'll get very close to the same binary even without repeatable builds. Not exactly the same – some binaries embed compilation timestamps, sometimes compilers have a bit of non-determinism in their processing. People who want repeatable builds for security need to produce exactly the same binary. For a copyright lawsuit, you don't need the exact same binary, just a binary which is as close as possible – expert human analysis will compare the two binaries and their disassembly in order to demonstrate copying. (So, while ideally you'd have the exact same compiler version, even if you don't, it can still work – the binary doesn't have to be exactly the same, just close enough that a human expert can determine that it is more likely than not produced from the same source code). The whole point of repeatable builds is you don't need an expert forensic analysis to determine that the two binaries are compiled from the same source, you just compare the hashes.
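To make the distinction concrete, here is a minimal sketch of the two kinds of comparison: an exact hash match (what repeatable builds give you) versus a rough similarity score (a crude stand-in for the first thing an expert might look at). The byte strings and the helper names `sha256_hex` and `similarity` are made up for illustration; real forensic analysis would work on disassembly and be far more involved than a byte-level ratio.

```python
import hashlib
from difflib import SequenceMatcher


def sha256_hex(data: bytes) -> str:
    """Exact-match test: equal digests mean identical bytes (collisions aside)."""
    return hashlib.sha256(data).hexdigest()


def similarity(a: bytes, b: bytes) -> float:
    """Rough byte-level similarity between 0.0 and 1.0.

    autojunk=False stops difflib from discarding very frequent bytes
    (such as 0x00 padding) as junk on longer inputs.
    """
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


# Hypothetical stand-ins for a shipped binary and a rebuild from leaked
# source: same body, different embedded build timestamp.
body = bytes(range(256)) * 4
shipped = b"MZ" + body + b"BUILD 1990-05-22"
rebuilt = b"MZ" + body + b"BUILD 2019-01-01"

# The hashes differ because of the timestamp alone...
print("identical:", sha256_hex(shipped) == sha256_hex(rebuilt))
# ...while the similarity score stays close to 1.0.
print("similarity:", round(similarity(shipped, rebuilt), 3))
```

The hash check is all-or-nothing, which is why it only helps when builds are bit-for-bit repeatable; the similarity score degrades gradually, which is closer to the "more likely than not" question a court would be asking.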
> the burden of proof still lies with the alleged copyright holder of a work which, as they certified in their internal processes, does not exist anymore.
Microsoft will pay an expert witness a lot of money to perform a forensic analysis of the distributed source code and compare it to the surviving Windows 3.0 binaries. That expert witness will testify that the copying occurred. It is up to the defendant to find their own expert witness to testify to the opposite. If they do so, it then comes down to which expert witness the judge and/or jury finds more convincing.
I think I misunderstood the original idea: I thought that the person still having the source would have released it with some possible modifications, but I didn't realize these mods would only be to hide the code origins, and keep it otherwise fully compatible.
I sincerely hope that you do not ever put that to the test and I think you should stay miles away from advising people on legal matters. To put it bluntly: you are clueless about this stuff.