> Yes I have no idea where they got the 32TB stuff. We had a big leak of Win10 builds yes, but these were all Windows Insider stuff that were collected over time available to all Windows Insider members at one time or another.
Hiya - I wrote the article. What's happened is that the Beta Archive folks have now deleted (or are in the process of deleting) the private material that was uploaded to the BA FTP. There most definitely were non-officially-released internal Microsoft files in the archive, regardless of BA's intentions, such as the Shared Source Kit, the ARM64 Windows Server build, the Mobile Adaptation Kit, and various prerelease versions of Windows.
We've updated the story to explain why things aren't what they seem. Essentially, the files at the heart of the matter were there (we screenshotted them and saved copies of the forum posts) at time of writing, and they were removed later on Friday.
In terms of the 32TB: that's the full decompressed dump of Windows files uploaded to BA. From what I understand, Microsoft hasn't released 32TB of public Insider material, so obviously there's extra sauce in the mix.
That includes, yes, copies of officially released Insider builds plus confidential private stuff that should never have left Microsoft, let alone turned up in BA. We make this clear in the story - I'm starting to feel the headline could have been better to make this clearer rather than grabbing the biggest figure. I am beginning to regret this.
BA can twist and complain all it likes - but stuff that was confidential within Microsoft ended up in their FTP archive (and some is still in there, such as the ARM64 stuff). The next stage of this story will be to uncover exactly how this material escaped Redmond.
All the old builds of Windows 10 listed were presumably grabbed via public Unified Update Platform (UUP) infrastructure or the Ecosystem Engagement Access Program (EEAP), but I haven't confirmed yet. It's common knowledge in the Windows enthusiast community that builds (yes, even arm64 desktop Insider builds) can be pulled from Microsoft via these channels. It's not confidential, and not useful to share with anyone other than a build vault like Beta Archive.
Debugging symbols for most of those builds are available on symsrv.
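For anyone who hasn't pulled symbols by hand: as far as I know, the public symbol server uses the usual symstore layout of PDB name, then the PDB's GUID with its age appended, then the PDB name again. Here's a minimal sketch of building such a URL. The `symbol_url` helper and the GUID in the usage note are made up for illustration; only the server address and the path layout are meant to reflect the real thing, and only public (stripped) symbols are served there.

```python
import uuid

# Public Microsoft symbol server (assumed to be what "symsrv" refers to above).
SYMBOL_SERVER = "https://msdl.microsoft.com/download/symbols"


def symbol_url(pdb_name: str, guid: str, age: int, server: str = SYMBOL_SERVER) -> str:
    """Build a symstore-style URL: <server>/<pdb>/<GUID><age>/<pdb>.

    `guid` may be written with or without hyphens/braces; `age` is the PDB age.
    Hypothetical helper, not part of any Microsoft tooling.
    """
    # Normalise the GUID to 32 uppercase hex digits with no separators.
    signature = uuid.UUID(guid).hex.upper()
    # The age is appended in hexadecimal, without padding.
    return f"{server}/{pdb_name}/{signature}{age:X}/{pdb_name}"
```

Calling `symbol_url("ntkrnlmp.pdb", "3844dbb9-2017-4967-be7a-a4a2c20430fa", 2)` with that placeholder GUID yields a path ending in `ntkrnlmp.pdb/3844DBB920174967BE7AA4A2C20430FA2/ntkrnlmp.pdb`. The real GUID and age have to be read out of the binary's debug directory.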
The Windows Mobile Adaptation Kit (like the OEM Preinstallation Kits and OEM Adaptation Kits) is shared with a similarly sized audience, which used to include self-attested Microsoft Partners. Again, not confidential. Just gated stuff.
The Shared Source stuff is a slight unknown here because it's not clear what was in the ZIP. I presume this was a sampling of materials shared via the Shared Source Initiative (https://www.microsoft.com/en-us/sharedsource/), none of which includes high-value intellectual property, cryptographic code, third-party code, etc. It could still be damaging but Microsoft has clearly calculated the risk here; this stuff is shared with mere community MVPs.
So with all this knowledge, it's hard to digest the "omg more exploits coming" and "Microsoft lost 32TB of private IP" angles in The Register write-up. I don't think there's a story here, frankly.
Clarify the 32TB and 8TB figures please. People with access to the archive who successfully downloaded the confidential stuff did not get nearly that much.
Do you consider Windows installation images to be "compressed files" in this context?
Looks like the 32TB size reported is the total size of all the various Windows installation images, prior to de-dupe. 8TB after de-dupe. Not a very useful figure, however.
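To make the before/after de-dupe distinction concrete, here's a rough sketch of how two such totals can come out of the same pile of files. It assumes de-dupe means "count byte-identical files once", which is my guess at what was meant and not something anyone in the thread has confirmed; `archive_sizes` and the file names are hypothetical.

```python
import hashlib


def archive_sizes(files):
    """Return (raw_total, deduped_total) in bytes.

    `files` is an iterable of (name, content_bytes) pairs. Files with
    identical content (same SHA-256) are counted once in the deduped total.
    """
    raw_total = 0
    unique_sizes = {}  # digest -> size of the first file seen with that content
    for _name, data in files:
        raw_total += len(data)
        digest = hashlib.sha256(data).hexdigest()
        unique_sizes.setdefault(digest, len(data))
    return raw_total, sum(unique_sizes.values())


if __name__ == "__main__":
    sample = [
        ("build_a/install.wim", b"x" * 100),
        ("build_b/install.wim", b"x" * 100),  # same bytes as build_a
        ("build_c/install.wim", b"y" * 50),
    ]
    print(archive_sizes(sample))
```

With lots of builds sharing most of their files, the raw figure can be several times the deduped one, which is why the larger number says little about how much distinct material there is.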
Compressed, it is ~8TB. Fully expanded it is ~32TB. I think the bigger issue is not the final size, but that internal Microsoft material - particularly source code - has escaped into public FTP. That, to me, is the main thing, right?
Windows sources have escaped before. I doubt that Windows is buildable outside of Microsoft (and the bits are definitely not signable, since you need access to a key vault for that).
Useful for research, and finding security issues. Not much else.
You can break patents without ever knowing the patent existed. So looking at this code wouldn't trigger a new patent problem.
And simply looking at some code, closing it, then later writing code that does the same functionality is not breaking copyright. So looking at this code would not trigger copyright.
Clean room reverse engineering. The idea is that if you build something with a specified interface (the Windows API in this case) without prior knowledge of the implementation details, and you haven't broken any patents in doing so, then you haven't broken copyright either and you are free to do business. This is a gross oversimplification. See the Intel vs. AMD case for more details.
But what data does this 8TB refer to specifically? Is this the source + all the windows builds from a plethora of sources? Did you download 8TB of data from BA and expand it to 32TB or was this a figure provided to you by one of the raided hackers or their associates?
> I think the bigger issue is not the final size, but that internal Microsoft material - particularly source code - has escaped into public FTP
Happens regularly, although usually it's MS employees leaving stuff in public FTPs or inside released ISOs, updates, whatever. redmond\ domain is huge and the (accidental or not) leaks never stop.
There does seem to be some source, as (now, not when you posted) discussed in that thread. Here's a pastebin (taken from that thread) with some filenames, including, for example, USB drivers.
If you really want to see some Windows source code, you can just ask them - and they will send it to you. It's not open source and there are limits to what you can do with it.
They call it their Shared Source Initiative. They want a reason for sharing it with you, but I have used "I am just curious." With that excuse (this was a long time ago, when I still used Windows) I got the specific code I wanted for Outlook Express.
I haven't worked directly with Microsoft for well over 15 years, but this sounds similar to what I remember. Back then I worked for a partner who was doing direct integration work against low-level SQL Server and Windows libraries. Often when we encountered obscure bugs, they'd just give us the SQL Server or Windows source code and basically say, "Fix it, and we will release a hot fix." All of the comments would be replaced with white space which made things more difficult.
But the point is you need to already have a relationship, right? It seems you can't just say "I'm curious" (or even better, "I want to track down X bug") and expect to get source access, contrary to what was claimed earlier... Enterprise specifically says you need to "Maintain 10,000 Windows seats" which is not something a lone developer would do...
And you got it through the Enterprise license? Through your company or personally? Nothing related to 10,000 Windows seats? That's pretty weird if so since they say you must meet that requirement...
I didn't misrepresent myself, was logged in, and had no issues. They may have changed it, but that is what I selected. You'll probably have to sign an NDA. Give it a shot.
This is a gross exaggeration. As far as I can see, what "leaked" was the "shared driver source kit" that nearly any hardware vendor (like chipset manufacturer) can get; basically anyone who puts up a few thousand bucks and signs an NDA.
All I did was change a registry setting (or maybe it was a gpedit) to prevent automatic reboots. That was enough for me though, as I didn't appreciate my running apps being shut down during the night.
Can you maybe recall exactly what you did to stop your computer from automatically shutting down (and starting up)?
I "resolved" the issue by dual booting. The second OS (previously Ubuntu, moving to Debian) changes something that takes away Windows' ability to automagically turn on my machine for updates.
The private symbols in these builds could actually be very useful. The article alluded to there being private symbols. So, even if only 1% of the overall Windows code was leaked, if there were, say, private symbols for the kernel's heap allocator, for a practiced reverse engineer that is pure gold. Not as good as code, but a hell of a lot better than having to figure everything out and name functions and symbols themselves.
There are two levels of debugging symbols. One is released with every Windows build, for end-users reporting back stack traces and the like. These are of the second more granular level - private debugging symbols available to developers only.
This will make Windows less secure in the short term, but as good and bad actors find bugs and Microsoft patches them, they will end up with a hardened product. Their OS is now effectively open-source.
Yup. It's the Windows Shared Source Kit, which is already mostly public. Many of the big security firms and government agencies already have licences to the full source code anyway.
The only thing this really gains anyone is that it's possible some non-public debug symbols might have been left in some builds. Not earth shattering.
My favorite takeaway was, "With every day that passes, that stolen source code is more and more out-of-date."
I remember hearing about Windows source code leaks in the past (I see articles from 2000 and 2004) and remember hearing about problems with "clean room" open source SMB implementations.
Yeah, the fundamentals and much of the source code will probably stick around for many, many years. But this has happened before and I don't see why this is any more of a big deal.
Given the other article I read today about US companies bowing to Russian requirements to review source code, I wonder if MS has also already given away code that can be studied for security gaps.
Microsoft does indeed make its sources available for review by major customers, including governments. From what I heard, this is done under NDA and reviewers are forbidden from taking the code away from the MS facility.
Looks like the page where the source kit was listed was altered since the screenshot that's in the article was taken. I hope the files surface somewhere.
Throwaway account for obvious reasons. Does anyone have a link to the leaked data?
At this point avoiding links is pointless, as the source code will be essentially public knowledge in a matter of days/weeks. Damage control is the only strategy left. The sooner security researchers outside Microsoft can start analyzing and reporting vulnerabilities, the better.
Some of the leaked data seems to currently only be accessible on https://www.betaarchive.com/, but your account won't be able to access it for a month or so.
It seems that the "leak" was what you need to develop a driver. You can sign up for MSDN and get that, right? Does that come with the $3000/year it now costs to subscribe to MSDN?
I'm imagining that other OSes will be able to interop better with Windows, increasing the value of Windows and improving the experience of Windows for users of other OSes?
Quite the opposite. WINE developers will have to go the extra mile to avoid getting anywhere even remotely close to the proprietary source code, otherwise they may get sued for copyright infringement -- even if they didn't intentionally copy any of the code.
> 1. How would Microsoft prove that they saw the code?
Get the court to order discovery on all of your computers. They could probably also get subpoenas for the source code hosting sites to reveal relevant access logs. Or someone could admit to reading the source code someplace public, like a bug tracker. Or they could argue that the choice of variable names and minor algorithm details are too close to be coincidence. A jury found that Google infringed over rangeCheck, after all.
> 2. If Microsoft sued wine devs it would be horrible for Microsoft's public image. They won't do it.
No it wouldn't, particularly not if they had a strong case (e.g., someone bragging about it). If you think MS looks bad for suing people for stealing the code, then you'd have to think the FSF looks bad for suing people for violating the GPL and stealing the code of, say, Linux.
> 3. I hope the WINE devs don't listen to you.
I hope they don't listen to you. The repercussions are quite large--it's not unimaginable that shutting down the WINE project could result from a lost case. These cases do happen, and defendants do lose (Oracle v. Google is a notable recent one, and that's based on IMHO fairly weak evidence). There's a reason that projects that do major reverse engineering for interoperability have rather elaborate procedures for doing so.
And I'm one of them. We don't want to alienate the already small number of people who develop Free Software. I would rather see companies who violate the GPL comply rather than seek damages. Actually bringing a suit, in my mind, is basically the nuclear option.
When I was involved with Mono we avoided reflecting .NET for much the same reason as elaborated by the poster who you (pretty jerkishly) dismiss. It's very easy to say that other people should undertake extreme personal risk.
It looks like mainly kernel and drivers were leaked. WINE is emulating the Win32 API, which is distinct from the Windows NT kernel & drivers - kind of like how the Linux API is distinct from the Windows NT kernel, despite Windows supporting it.
This might be a boon for the ReactOS folks, who are trying to implement the NT kernel, except that seeing the Windows source code automatically disqualifies you from being a contributor.
Exactly. A few years ago the ReactOS developers had to stop all development for several months to perform a source code audit. This was meant to deflect accusations that they had derived code from disassembled Windows binaries.
If anything, this could make their legal situation more sticky.
That's unfortunate and doesn't fit my understanding of the situation - is disassembling a binary not fair game, in the same way that Samsung buying an iPhone and cracking it open is? (Assuming we're worried about copyright and not patents).
I vaguely remember that either the Wine or the ReactOS developers rewrote parts of their source a while after the Win2k source leak in 2004, because there were some contributors who had been exposed to Microsoft's source code, and a rewrite of the parts those devs had touched was apparently the only way to make sure they were "clean".
IIRC, they had to go through some trouble to find people for that rewrite who had not looked at Microsoft's source code AND the parts of the Wine/ReactOS source code that needed to be rewritten.
So I am convinced that they will make extra sure not to even get in the position where someone could imply they might have looked at that source code.
So far I haven't seen any links to source code.
Quote from one of the admins:
> Yes I have no idea where they got the 32TB stuff. We had a big leak of Win10 builds yes, but these were all Windows Insider stuff that were collected over time available to all Windows Insider members at one time or another.
Edit: BA's official statement: https://www.betaarchive.com/forum/viewtopic.php?f=1&t=37283