Reflections on Distrusting xz (joeyh.name)



One thing that comes to mind is that “Jia Tan” might be more accurately seen as a “sleeper” of some sort: a foot soldier who infiltrates a juicy open source project and waits for further instructions; backdooring sshd might not have been part of the original plan.

Which raises the concerning question of how many more sleeper maintainers there are.


>Which raises the concerning question of how many more sleeper maintainers there are.

Given how easy the infiltration is and how extremely hard it is to detect, likely a lot.

For an intelligence operation it is also extremely cheap, you just need a few knowledgeable developers spending some time each week on the project. The upside being a backdoor into a significant portion of infrastructure, the downside being wasted time.

I do not think it is unlikely that in many important open source software projects there are one or two people assigned to keep an eye on things. They don't even need to be malicious; just being somewhat trusted contributors is enough. I would be extremely surprised if the NSA doesn't have a couple of guys keeping watch on the Linux kernel.


The irony is that this would be an oddly effective way of having paid open source devs who for the most part just honestly improve projects, with the massive downside that they undermine it at a critical moment.


Yeah, then maybe thanks to them we'll finally have The Year of the Air-Gapped Linux Desktop.

I dunno what to do if e.g. Debian gets compromised, as in, I can't trust the collective of maintainers.

I assume any Windows machine is backdoored. Trivially proven by forced auto updates.

Maybe air gapping some home computer for sensitive data might be a good idea.


The ideal situation is having multiple intelligence agencies all working on one project and spotting each others' backdoors, so at the end of the day we just have a really secure and well-maintained project.


> Given how easy the infiltration is and how extremely hard it is to detect, likely a lot.

I read it exactly the other way around: the infiltration took years and detecting it was the default with a fuzzer, which had to be disabled for the exploit to succeed.

It speaks to the value of hardening, and that hardening should be required more. And of course to the precarious roles of maintainers, which have been discussed elsewhere.


The exploit could be detected with a fuzzer perhaps. The infiltration seems like it was something that would be very easy for a funded intelligence agency to do, and would be nigh undetectable if they had been subtly introducing bugs rather than shipping a sophisticated backdoor to every Linux distro.

You have to assume if this was an intelligence agency they didn’t burn their only agent like this.


> they didn’t burn their only agent like this.

I think I'd characterize that as "their only identity." What's the chance that some number (>1) of actual agents were sharing this identity? If that's the case, I'd extrapolate that to multiple identities. In fact the social engineering to gain the trust of the original maintainer likely involved several identities.

I suspect that detectability of intentionally injected bugs would be very low.


You are right, it took quite some time. On the other hand, it looks like the legitimate part of contributing to xz was only a part-time job for the attacker. The rest of the time, they either worked on the exploits or on other things, like infiltrating other projects using a different handle.

Basically I can imagine the attackers being a well-organized group, using work sharing and pipelining. Some members of the group would be preparing exploits, some would infiltrate projects, and some would make sure not to get caught. And since infiltrating takes time, they would make sure to have multiple projects in the pipeline: some in the early contributor stage, some in the social pressure stage, and some in the exploiting stage.


>a fuzzer, which had to be disabled for the exploit to succeed.

According to this comment, the fuzzer wouldn't have detected it. It wasn't necessary to disable the fuzzer:

>https://news.ycombinator.com/item?id=39911249


At this point I wouldn’t be surprised if the NSA also has a couple of Microsoft employees on their payroll.


> Given how easy the infiltration is and how extremely hard it is to detect, likely a lot.

This particular case seems to be an example of the exact opposite. It took "Jia Tan" two years to conduct all of the social engineering necessary to get into a position to introduce the backdoor, then upon doing so, was caught almost immediately, with the initial discovery not even coming from a dedicated security researcher, but from a sysadmin who kept digging when he saw unusual performance issues.

And the threat actor here deliberately went after the weakest link in the chain. Assuming that the goal was to compromise sshd, the antagonist evidently found it too much of a challenge to attempt to infiltrate OpenSSH itself, so targeted a small-team compression project that only some deployments even link to, and still got caught extremely rapidly.


I don't think you are looking at it the right way. Two years matter if you are a hobbyist or your goal is to compromise the system for some individual gain.

For a nation state actor two years is nothing. Likely the entire attack didn't cost more than a couple of thousand hours of developer time. I would guess it was easily cheaper than $100k in financial terms; that is extremely efficient for an intelligence operation, with the upside being access to a large number of servers.

The method by which he got caught also depended largely upon random chance: some major "ifs" needed to happen. The performance reduction was an unfortunate side effect from the perspective of the attacker; really, a minor mistake exposed him. Even if he had been exposed a couple of months later, the damage would have been enormous: if that version had become a stable part of any major distro for the next few years, hundreds of thousands of machines would have been vulnerable.

There is absolutely no reason to assume that if another attack of this quality happens it will not find its way into some stable distro.


> For a nation state actor two years is nothing.

Perhaps it is, perhaps not. If they are playing a long game, two years might be a straightforward investment. But the point is that it still took two years either way -- the fact that something is only feasible for big institutional actors implies that it is not in itself "easy".

And the kinds of organizations that have the motivation and resources to engage in this kind of long-game infiltration by definition have the resources to influence, infiltrate, and manipulate in a variety of ways. It is absolutely not clear here whether having formal organizations involved in directing the development of xz would have made it easier or harder for whomever was behind this attack.

> The method by which he got caught also depended largely upon random chance,

Why do you presume it's random chance, and that it wouldn't be better represented by more complex probability models? On the surface, it seems like the chance of overall detection would be reasonably represented as an equivalent of an MTBF calculation, with the number of skilled users capable of detecting the vulnerability through normal use as an input. (Not to say that we have enough data to calculate this probability, or that this necessarily even happens enough to have a statistically significant sample -- it might all be "black swans" -- but there's certainly non-random stochastic inputs into this.)

> There is absolutely no reason to assume that if another attack of this quality happens it will not find its way into some stable distro.

There's no guarantee for any of this. All rules can be gamed, all incentives can be manipulated, all organizations can be compromised. There is no perfect solution to security, and no guarantee that any prescriptive solutions based on generalizing from the specifics of one incident would not create more opportunities for attackers rather than fewer.


I don't disagree, I am not even sure what exactly you are arguing against.

I completely agree with your comment about the detection being the result of iterative low-probability chances of detection. My point was that, even if that is the case, there was a major chance that it would have gone into many stable distros or would have been exploited.


> I don't disagree, I am not even sure what exactly you are arguing against.

I am arguing that this incident demonstrated the resilience of the FOSS model to even extremely strategic, long-term compromise attempts. Some antagonist invests years' worth of time, effort, and money to introduce this backdoor, and the openness of the entire process allowed a single engineer to unravel the whole thing almost immediately after the backdoor was introduced, and every distro immediately sprang into action and neutralized it entirely.

All of the people complaining about the vulnerability of small projects, the pseudonymity of contributors, the need for more institutional involvement, etc. are getting things exactly backwards.

Imagine if this were a closed-source project funded and managed by an opaque institution, and the attacker used a different set of social engineering tactics to backdoor the code from within the org. Suppose then that Andreas Freund noticed the exact same behavior and set out to investigate it. How far would he have gotten?

This incident validates the "many eyes" concept and is a point in favor of the FOSS model.


fwiw, characterizing Andres as a sysadmin isn't really the whole picture; he's a postgres developer that conducts benchmarking operations with some frequency (and he's quite good at what he does)... he's perhaps naturally a bit more sensitive to things like the cumulative effect of 500ms or so over a number of sshd invocations.


You're right -- I went back and changed "sysadmin" to "engineer". Either way, though, he was not a dedicated security researcher, and managed to unravel this entire thing upon noticing an anomaly in the course of his regular work.


I believe LLMs could be useful here as a component of a pre-commit hook.


I once got a (probably scam) offer for adding a cryptominer to a library that I maintained at that time. And a more serious offer to add trackers to a popular >1M installs app.

In both cases I obviously ignored it. But it made me aware of a nasty attack vector: someone who's thanklessly building a WordPress plugin, pip or npm package, or whatever software, dealing with issues, PRs, support, and maintenance, often for no pay, suddenly gets offered three figure sums to add a few lines of "affiliate stuff" or such. There are many places in the world, or people in situations where this amount of money really makes a compelling case.


“Given enough underfunded maintainers, all security is shallow.”[0]

0. https://en.wikipedia.org/wiki/Linus%27s_law


I wouldn't say "funding" is necessarily the problem.

Most maintainers do it because they like doing it. Their main limiting factor is time. I can drop a million dollars an hour into a maintainer's lap; that doesn't mean they can dedicate every waking moment to a project. They still have human needs that money can't buy like sleep, family obligations, and health concerns. And that's making the assumption that the maintainer uses that million/hr to quit their job.

No, the problem is a lack of trustworthy candidates for maintainership and a lack of time. There are components of a GNU userland that are now too complex for a single human to both maintain and enhance at the same time. We now need to target multiple distros (really, more than are necessary, strictly speaking) and ISAs. Most are written in systems programming languages like C that are more complex than the average software engineer in 2024 works with.

We need consolidation, simplification, maintainer redundancy, and a trust/governance framework for packages.


We need to utilise a specialised AI to scan through the code looking for bugs and security holes. Imagine if OpenAI donated server time to this.


I believe the problem of thankless maintenance is best solved with two things: the thanks (yes, we are all human and want recognition and appreciation from fellow humans)[0], and a stable employment (work for a good large business while open-sourcing what’s possible)[1].

If you do OSS for profit, then it can become a question of where is more money; but if you work a reliable job with insurance, relationships and other implications then the stakes may be a bit different.

Many of the biggest OSS projects today were started by people who had no money in mind whatsoever. Some had other jobs, others were students, etc. If we feel relatively secure, we are driven by our innate desire to tinker, create cool things and show it off.

[0] Undermined by LLMs that are used to gobble up your code and suggest it to others commercially and without attribution.

[1] Undermined by low employment protections (if you can expect to be fired at any time, you would be less loyal), and by LLMs (whatever you open-source now more directly benefits Microsoft or whatever).


> and a stable employment (work for a good large business while open-sourcing what’s possible)

Even if it was hypothetically possible to open-source basically everything that the team in which I work produces:

The software that I work on is very specialized software that is used by the company's employees and customers for specialized purposes. Imagine some nice LoB application that is actually somewhat comfortable to use. It basically does "what the users need" and is thus deeply ingrained in some parts of the company's workflows. The only use someone outside the industry might have for it is "cosplaying being employed in this industry".

A lot of software that is developed (in particular in companies that don't sell or rent software) is of this kind.

Thus: in my opinion, the open-source scene does not have any use for a huge amount of the software that is actually developed and actively used.


> The software that I work on is very specialized software that is used by the company's employees and customers for specialized purposes

Oh really? Welcome to the club. Our very specialized software for very specialized purposes used Django with a certain auth provider. So I refactored that into a standalone Django app that painlessly handles this specific OAuth provider, configurable via settings with sane defaults, and open-sourced it. (The refactoring was very beneficial to myself, that part of the project got instantly nicer to work with.)

Of course, it is small beans compared to something algorithmically hardcore (I was a junior myself back then), but it’s just an example.

Any software, no matter how specialized and bespoke, can be expressed as many self-contained isolated components that individually know nothing about that specialization and bespokeness. In fact, such factoring is generally a sign of good design: you may have heard of the loose coupling & high cohesion principle—once you follow it, open-sourcing a particular component is very straightforward.

Note, though, that if your contract has certain licensing provisions in certain countries you may not be allowed to unilaterally open-source anything during the full term of employment (even if it is unrelated to your dayjob). You may need to get approval first. However, many good tech companies are reasonable when it comes to open-sourcing non-core components.


3. More maintainers

Days of 100+ notifications aren't easy. Things will slip through


Agreed, but especially in light of recent events it’d be important to know who they are, and that’s not always easy.


This is what worries me more.

It's easy to point a finger at a specific Bad Guy® and shout "He did it!" It's much harder to face the reality that any maintainer of any open-source project can slowly burn out to a point where they become accomplices in an attack, or at least turn a blind eye.

The pool of open-source developers does not split cleanly into honest contributors and evil agents. The boundary is quite fluid -- more so in some circles than in others -- and there are always temptations to move from one side to the other and back again.


> suddenly gets offered three figure sums to add a few lines of "affiliate stuff" or such

Back of the envelope calculation: you're looking at 2 orders of magnitude more money from "affiliate stuff" than you would be from generous user donations


Well, yes. But it's also something you can do once. When (not if) it comes out, all credibility is lost.

Whereas donations, regardless of how puny, are recurring and potentially forever.


That's definitely true. And a lot of times it could be a random open source project that is under the radar and rarely thought about, e.g. The Great Suspender Chrome extension, which was sold to an unknown buyer who later turned it into malware: https://www.bleepingcomputer.com/news/security/the-great-sus...


It's why I actually always encourage app devs to charge for their apps, even open source ones. It creates an exchange of value, helps the author feel valued, and detracts from these attack vectors.


> There are many places in the world, or people in situations where this amount of money really makes a compelling case.

It’s especially easy to imagine that using the classic intelligence agency playbook: monitor high-impact maintainers and look for leverage before making the approach (“hey, saw your post about the divorce settlement and that $%#@ cleaning you out. My affiliate marketing pays in bitcoin…”) just as they’ve done for ages.


I believe this is a nation state actor and there is a fleet of 'Jia Tans' working on other OSS projects to backdoor operating systems.

And some have probably succeeded.


I wouldn't limit that to OSS projects. How many of them managed to get hired and are working for Microsoft, Apple, Google, Oracle or Amazon?

In some cases they don't even need to introduce backdoors themselves, but just review code and spot bugs that they don't correct or raise issues for, but instead communicate to the mothership. They could even work as a team, with one building the backdoor and the other approving the code.

Most companies have more thorough processes to avoid this, but that doesn't mean those processes are applied correctly every time, especially if more than one malicious engineer is involved.


I'm now imagining a department where every single worker is a spy for a different government and they play an endless game of "add exploit, close off the other people's exploits". And all of them think they are the 10x developer because, in their view, all the other people do is push shoddy code.


I worked on a project just like this once... a mobile phone network built in the Middle East before the Arab Spring. 10/10 would not repeat the experience.


Most of the time they don't need infiltrators. Governments can just pressure companies with export controls or warrantless surveillance to get backdoors into commercial systems. OSS projects require different methods because the more direct method would be discarded by the community and forked.


> Most of the time they don't need infiltrators. Governments can just pressure companies with export controls or warrantless surveillance to get backdoors into commercial systems.

Or they simply pay companies with a “support contract” in return for embedding spyware that sells out customers. Seen that first hand (private key exfiltration), resigned the same day.

Lots of comments saying we need to do something about the OSS supply chain but in my estimation the problem is much worse with closed source commercial software.


> I wouldn't limit that to OSS projects. How many of them managed to get hired and are working for Microsoft, Apple, Google, Oracle or Amazon?

They don't have to sneak into those companies tho, they just hand over something like a national security letter and do whatever they want while making it clear to the heads of the company that anyone who talks or pushes back will rot in gitmo. Why wouldn't there be at least an equivalent to Room 641A (https://en.wikipedia.org/wiki/Room_641A) in every major US corporation that deals with massive amounts of people's sensitive data and communication?


> Most companies have more thorough processes to avoid this, but that doesn't mean those processes are applied correctly every time, especially if more than one malicious engineer is involved.

I’d also bet that you could exploit the tiers at many companies: how many places have more robust review for the staff engineers but then assume that some lowly “ops monkey” will take care of the build environment, etc.? I’d hope that wouldn’t work at Google, Microsoft, etc. but have heard enough stories about disparities between which jobs are contracted out and which have the coveted FAANG benefits that I wouldn’t exactly be shocked if it turned out otherwise.


Deleted.


The thing is they only need one member sometimes, to observe what is in use.

Example scenario: "malicious engineer in say Microsoft, finds out that office365 is using xz internally and the library is pulled directly without code review. Same engineer or another member of same group would be that Jia Tan doing the necessary backdooring in xz to target office365. And bam all worlwide Office365 accounts would be backdoored."

I am not saying Office365 is using xz, I have no idea really, but this would be a possible scenario. I know MsTeams is using ffmpeg for example.

So I think having this discussion while only scoping Linux distributions is a big mistake. The xz project was particularly interesting as a target as it is distributed under the BSD Zero-Clause license, which is pretty much a public domain license. You don't have the attribution part of the BSD license, so there are probably myriads of proprietary software packages using it too without acknowledging it.


I believe the term "nation state actor" is a term that means "country" but with the bonus of connoting that the writer is an armchair infosec wizard. This speculation is not valuable without adding information, otherwise it's just McCarthyist bluster.


Nation state actor is a standard info sec term. Using it does not imply any kind of wizardry.

Edit: most threat actors do not have the patience or the motive to behave in this way. It is reasonable to suppose that this is a nation state actor.


There are organised crime networks which buy 0-days to run ransomware, and actively target companies to do it.

Why couldn't this be an attempt at finding or selling exploit access on the black market?

The problem here is no one is looking properly at the scope. It's more than trivial, so everyone is leaping to "nation-state" as though that's the only threat actor with motivation and patience.


Granted, other threat actors remain a possibility; there is no proof.

It looks like the sort of thing nation states would do and develop. It could be some other group hoping to make money as you say.

Whoever they are, they seem to have good opsec, over multiple years.


It's also a term that implies the competent, official hacking departments or espionage agencies of that country, rather than government-supported amateurs (e.g. an untrained policeman) or generic people from that country.


Wales is a country but it is not a nation-state and probably does not have its own APT.


I'll bite. Wales is a country only in a "traditional" sense, since it does not currently hold the type of sovereignty we require of "countries" in the usual sense. As Voltaire said, "This body which called itself and which still calls itself the Holy Roman Empire was in no way holy, nor Roman, nor an empire.".

Wales, when independent, could reasonably be described as a nation state, being roughly associated with the Welsh people and their culture, language, history etc. But most countries are not nation states! The USA, Russia, and China for instance are explicitly plurinational.

If you think "nation" and "state" are synonyms (along with "country") then it's redundant to use both. If you think that "state" alone might lead someone to think of Alabama or Minas Gerais then say "nation". If you think "nation" will make people think of Cherokee, then say "country" or "sovereign state".

"Nation state" has a specific and well-established meaning, misusing it is like misusing any other jargon and just comes across as a failed attempt to seem part of the ingroup and hence authoritative.

(yes, apparently I will die on this hill)


Are there any actual nation states? Even small countries, like Greece, contain other nationalities.


Greece is certainly a nation state, being primarily occupied by "Greeks" (referring to the nebulous concept of a national identity). It doesn't matter if some French or Turks live there.

A country that strictly limited residency by ethnic identity might be called an "ethnostate" and indeed it's hard to find a pure example of one of those.


At this point, considering the apparent ease with which a project that is used pretty much everywhere was taken over, that seems like a reasonable position.


I can walk out on the street and stab someone to death if I wanted to. This is surprisingly easy.

Just because something is relatively easy to pull off doesn't mean it happens a lot.

It's also not that easy to pull off because you need to have a project with relatively few eyes and a place to hide it. In this case: binary tests. But most projects don't have those.

There is no evidence for any of this, including that it's a nation-state actor. There's also a case to be made that it's NOT a nation-state actor as nation states use Linux and want a secure Linux. The NSA and such have somewhat conflicting interests here. We just don't know. It's likely we will never know.

All of this is starting to resemble the spy paranoia of the first world war. A few spies got caught and suddenly everyone was now a suspected German spy (including a general, if I recall correctly, who was detained for a while because he couldn't answer a question about baseball or some such).

I suspect that very soon people will start demanding maintainers put some of their blood in a Petri dish to be tested with a hot needle. Just in case.


> There's also a case to be made that it's NOT a nation-state actor as nation states use Linux and want a secure Linux. The NSA and such have somewhat conflicting interests here. We just don't know.

I agree that we do not know that it’s a nation-state but this point seems to work in the opposite direction: this attack was very carefully constructed so only someone with a particular key pair could exploit it. That’s reminiscent of what the NSA did with the Dual EC constants, and they were confident enough about that to push it into the FIPS requirements for federal IT.


Motive, opportunity, means - and consequences: it is primarily the absence of a motive, and secondarily the likelihood of consequences, that keeps the prevalence of street stabbings way lower than if means and opportunity were the only factors.

The argument against nation-states being involved has some problems: a state can avoid becoming victim to its own work, while its own restraint would not prevent developments elsewhere.


You're commenting under a link where commits to the xz-decoder are discussed. Some level of paranoia is warranted.

The binary files look like a sideshow in comparison. Maybe we're lucky the attacker was tempted to hide something in there.


> I believe this is a nation state actor

It is certainly possible, but we don't really have a good indication for that. This whole thing would be definitely doable by a single individual.


Doable, yes, but what seems to me like a strong indication is the duration (multiple years) and the effort to set up a quasi patch infrastructure for the backdoor, which I can't remember ever having seen in an amateur or ransomware hack.


My assumption is that this was a state-sponsored mass surveillance campaign of some kind, but God knows what exactly they were looking for.

I think if the backdoor had been discovered 2 or 3 months later, we could maybe understand better what they wanted to do. My speculation is that they wanted to build a massive botnet and then snoop on machines' processes and traffic looking for something. It's hard to speculate because luckily they were caught soon enough.


I find it intriguing that out of all the speculative comment threads I've read so far, none of them have suggested it was Microsoft attempting to make FOSS look bad/vulnerable.


How would that benefit Microsoft, who owns GitHub, the home of OSS? It's not a secret that oss is vulnerable, the opportunity for MS is to sell the solution to a captive audience.


Microsoft making the decision to own GitHub in the first place also speaks to my suspicion. Embrace, Extend, Extinguish.


I've never been concerned about spies infiltrating open source projects compared to legitimate maintainers being hacked, even now after this whole xz incident.

I'll put it this way. Let's say a bad guy had a decent budget to spend on paying agents/criminals to break into maintainers' homes on their behalf with a rubber ducky, etc. I'd expect a pretty high success rate compromising their hardware...


You’re ignoring scale.

A single Jia Tan can be infiltrating 10s or more OSS projects each week without needing to travel around the world physically stealing hardware from various maintainers whom they then need to impersonate.

They can just impersonate some anons with no real lives or connections and just get the keys to OSS projects given time.


> A single Jia Tan can be infiltrating 10s or more OSS projects each week

Single? 10s or more per week?! I can't help but think you are underestimating the cost of developer time. How many hours of work did it take JT to infiltrate to the point of finally implementing a backdoor? How much does that time cost?

> just get the keys to OSS projects given time.

This is not what JT did though, and for good reason. Trust of anons in open source is generally built through contributions of real developer work over time. That does not scale.

> without needing to travel around the world physically stealing hardware from various maintainers

I wasn't suggesting stealing hardware to impersonate someone. I'm talking about hiring petty criminals or using field agents to break into a house and use physical hardware access to install a backdoor, etc. into the legit maintainer's hardware. The field guy's goal is not to get caught, so the maintainer is unaware they are compromised.

I suppose the limitation with both approaches (maintainer plant vs compromising maintainers) is cost. My educated guess is that the cost of hiring skilled developers from a very limited pool for multiple years is more than it would cost to hire criminals that are already breaking into houses for low risk jobs where they don't even need to steal anything.


When you find one cockroach, you can be sure there are thousands more you haven’t found.


We could all be Jia Tan.

Someone could be bought, killed and replaced, or simply shadowed when they die or go to jail. Anonymity makes this even easier.


> killed [...] die or go to jail

All of my commits are signed with a PGP key that is on hardware security tokens and password-protected. In the event of my death, my digital identity could not be stolen without backdoors in my hardware security tokens.

That being said, $5 wrenches and large sums of money are still possible attack vectors.


Also don’t forget that not everyone expects perfection and a canny attacker can exploit that. It’s really easy to focus on how you’d avoid trojans, keyloggers, etc. but I’d also ask how likely it is that if someone sent a message from your email address claiming you’d lost your token in a minor accident, etc. that they’d believe it - or simply accept it if commits started showing up with a new key (maybe with an upgraded crypto system) since 99% of Git users never check those.


A cool tax-free, no-questions-asked $500k can convince a lot of people


One thing I’ve learned, not from direct experience but from observation. These things are way cheaper than the more ethical and optimistic of us in society think. Your point is totally valid but the number is probably more like $5k-10k.


Tax free $500k? I don't want the IRS to come after me. Please mark all your bribes as regular income thanks


Everyone working on important open source code should have a real identity associated with them. The fact that "Jia Tan" was able to become a maintainer without anyone ever trying to figure out their real identity shows a huge weakness in our trust model in OSS (everyone real would have something like a LinkedIn page, Facebook, Twitter, Instagram, or better, their own website with stuff that could be used to ensure they're a real person - that could be faked as well, but the amount of effort would be high, and checking this would be much better than just allowing effectively anonymous users to be maintainers - there's just no need for anonymity in this scenario!).


> everyone real would have something like a LinkedIn page, Facebook, Twitter, Instagram, or better, their own website with stuff that could be used to ensure they're a real person

Oof, I guess I’m not real then, as I have none of those things.


On top of what you mentioned, I also dislike the TSA-like response the OSS community is taking to this happenstance.

I have anonymously contributed to many projects because I enjoy my privacy. All of the projects I've founded have also been done anonymously.

Just because someone wants their anonymity and privacy does not mean they're nefarious, and I find it funny that the group that holds these principles most dearly is the one now turning on those ideas.


Personally, I do find it hard to trust an open source project maintained by an anonymous person. (I'm talking about maintainership, not regular contributions that need to be code reviewed by another maintainer.) I may toy around with them, but I will probably not use them in a manner where I need to trust them continuously.

It's totally cool for you to do whatever you want, since it's a free world after all, but if you want other people to use your code, then it's a two-way street, no? Your code has a direct effect on their computers, and so they are placing their trust in you. You may value your privacy, but you need to balance that with other people valuing their own security, and it's likely that whatever project you maintain may have an alternative as well.

If you just want to commit some code and not have people use them then that's another issue altogether.

I guess what I'm saying is: it's a two-way street. You can do things anonymously, but big companies / projects also don't have an obligation to use your code.


> You can do things anonymously, but big companies / projects also don't have an obligation to use your code.

You're not wrong here, but I'm not forcing anyone to use my code bases or contributions.

Also, think about how many systems you blindly trust on a daily basis.

When you drive over a bridge, did you research the maintenance procedures and check that compliance was up to date?

When you got a house or apartment, did you look into the engineering sign-offs and construction companies? And that maintenance has been done up to snuff? Even down to hoping the inspector knows what they're doing?

When you step into an elevator do you check the recent inspection plaque?

When you get on an airplane, are you aware of its maintenance history? And to further my point by referring back to the house example, did the company even QA the plane before they shipped it?

And most importantly, did you check into whether the people actually did these things versus just saying they did them?

What kind of trust does having a person's name attached to the project actually provide? I would argue it's a pseudo-facade trust basis that gives a false sense of security.

The truth of the matter is that you blindly trust millions of things on a daily basis, including the very system you type from, which I guarantee you has more than one anonymous maintainer attached to its underlying software.

I totally get where you're coming from, but the same problems exist in every industry, supply chain, and political system; every facet of your life is based on many blind-trust principles.

The one difference here with anonymous open source contributors is that they give you the code to read through yourself (and hope that you help ;) )

Very much unlike the proprietary software you're running beside it.


I have a Facebook account, and what's on it is no one's -ing business in a professional context.


I wouldn't post anything on Facebook (or on social media generally) that could be professionally embarrassing but I also don't generally accept invites from people who are solely professional acquaintances or use it in a purely professional context at all.


> I wouldn't post anything on Facebook (or on social media generally) that could be professionally embarrassing

The age of self censorship :)

I don't post anything on my FB. I'd still reject any employer who wanted to take a look.


Self-censorship is probably a good thing in many cases. And there are certainly things I don't care to share in writing on any public or semi-public medium.

But I agree that even if I can't keep an employer from sleuthing generally, I don't consider Facebook part of my professional record even if there's nothing on there I'd have a problem with a co-worker or potential co-worker seeing.


>> what's on it is no one's -ing business in a professional context.

> The age of self censorship :)

I find this amusing.

I would also like to know what you hoped to achieve by self censoring the word fucking in your message above.

- HN doesn't block posts with any kind of "profanity" filter

- You didn't spare anyone from the profanity, since we all knew exactly what you were saying/thinking

So I'm really curious what that actually achieved.


That one amuses me because of a character in a (iirc) fantasy book that swore all the time but used just -ing everywhere. Sadly I don’t remember what character of which book…


It's Mr Tulip, of Terry Pratchett's The Truth.


Thank you! I was pretty sure it was Pratchett (but not which book and character) but I self-censored in case I was wrong :)


Also this can all be faked


No-one working on open source code on their own time owes anything to anyone using the code. If you want an important open source project to be maintained by a non-anonymous person, surprise-surprise, hire that person and pay them.

Besides, some of the best open source contributors I know are almost-anonymous people behind nicknames and anime girls avatars.


At the same time no one is obligated to use your source code. I think the point here is from now on people (companies and large projects) may be more paranoid about anonymous contributors and refuse to sign off on using code maintained exclusively by them. It's fine for people to stay anonymous, but they just run the risk of not having the credibility for adoption and need to accept that.

But yes, I do trust certain figures like that, e.g. Asahi Lina. It's a fine ambiguous line. But at least in Asahi Linux there are real known human figures and they know who Asahi Lina is.


It is not about owing someone... it's about having provenance of code.

If you're just an anonymous guy doing stuff for free and want to remain anonymous, that's fine, but then your software shouldn't be used by anyone who cares about toolchain attacks as there's just no way to trust you, and no way to verify every single commit you make on new releases.

For software that gets used by many, which is a goal of OSS (otherwise just don't even bother to publish stuff, what's the point?), there needs to be a face behind it.

I do agree with others that identity is a hard problem, but people here are pretending there's no solution to that (or misinterpreting what I wrote to mean people should have a Facebook or Twitter account, which is absolutely not what I was trying to say - I just mentioned the most popular websites real people are likely to be found on, as those could be used to prove their identity... for example, I have a Keybase account where my proof of identity, which is tied to my public keys, can be found on my GitHub profile - but they let you choose Facebook or Twitter for that purpose as well) when obviously there is. I should know, I work in this space.


> If you're just an anonymous guy doing stuff for free and want to remain anonymous, that's fine, but then your software shouldn't be used by anyone who cares about toolchain attacks as there's just no way to trust you, and no way to verify every single commit you make on new releases.

What is, from security point of view, the difference between a toolchain attack performed by an anonymous contributor and by an identifiable real person?

> For software that gets used by many, which is a goal of OSS

It's not. The goal of OSS is to give users the possibility to study, change and improve the software. And that includes giving you ability to independently audit the code. All of that does not need any person behind it.


> everyone real would have something like a LinkedIn page, Facebook, Twitter, Instagram, or better, their own website with stuff that could be used to ensure they're a real person

Have you seen the campaigns people have run building fake LinkedIn profiles and slowly adding "connections"? There was one a few years ago which roped in a lot of infosec people who should have known better, and it's gotten much worse with AI generators. Even before LLMs, what you described would have been a godsend for intelligence agencies - who has more time for it, an open source developer writing actual code or the dedicated social media team at the IRA? – and now that's increasingly worse.


I believe the solution to identity on the Internet needs to be tied to governments, that's unfortunate but in real life, that's always been the case and I can see no alternative here. Blockchain is a pipedream and no serious work is going to associate a person's identity to a key which cannot be revoked, cannot be recovered in case of "loss", can be tracked on a public ledger etc. etc...

But there's actual good work going on in the identity industry, like Verifiable Credentials, so this will become a reality soon: you will be able to verify someone's identity as long as you trust the issuer of their "credential" (which in the case here would mean basically a username and a public key, or a reference to a JWKS which can be used to verify the signature of the person, very much like the digital version of an identity card which can be used to check the signature on some piece of paper, but actually cryptographically safe)... so you would need to add a few governments to your list of "approved issuers", or something more indirect like universities (which themselves would rely on the government-issued identity) or traffic authorities (if you rely on driving licenses). Sure, governments can lie, and people go to great lengths to steal others' identities in real life, but in the current world we're still able to get bank accounts, passports etc. based on this model... just because the system is not perfect doesn't mean it's not good enough, especially when there's no better alternative at all.


What is a real identity? Anything online can be faked. A state-issued ID? How does that protect against a nation state?


Nothing protects you if you're up against a state. That doesn't mean we should give up completely.

Do you have a passport? That's a real identity in most places. Soon, it may be possible to use that to link your identity to a set of public keys which you can then use to identify yourself.

There's a lot of work to be done to make this a reality, but work is surely going on right now and this is going to be possible one day.

Check this out, as a starting point: https://curity.io/resources/learn/verifiable-credentials/


Why would I trust an "important open source project" with my identity?

It goes both ways.

Besides, the 'state actor' the security theater people keep mentioning would have no trouble creating such real identities.


If you don't trust the project, you wouldn't contribute to it.

The state actor may be able to fake identities, but that would still allow tracking the identity to a particular state... and if caught multiple times, that state would start losing credibility, and projects may choose to stop trusting people of that nationality, unfortunately, or at least require stronger evidence that the person is real and trustworthy if they come from known rogue nations.


> If you don't trust the project, you wouldn't contribute to it.

Trust them to merge a bugfix is different from trusting them with my identity isn't it?

There are degrees of trust. For example I have a gmail address in my profile because the spam filter on there is better than what I have on my personal domain. People I've known for longer, business or otherwise, get the other (that I read more often).


News just in: NSA et al. defeated after having to create a LinkedIn and Instagram profile for their agents.


If you want security, pay for independent code audits (not compliance bullshit). Repeatedly. Don't offload your desires onto one-man-shows that the world decided are useful tools.


None of those things prove identity. A well funded or just patient attacker can spoof all of those. Sure, it raises the bar a tiny bit, but it's no proof of identity.


Yeah, it's not enforced (and certainly not with LinkedIn and Facebook), but it's really not uncommon to require the use of real names for contributions.

Linux doesn't allow anonymous contributions:

https://www.kernel.org/doc/html/latest/process/submitting-pa...

and this guide has been adopted by a lot of GPL-licensed projects (at least openwrt, glibc and gcc).
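
For context, the mechanism behind that rule is the Signed-off-by trailer that every kernel patch carries (git adds it with "git commit -s"), certifying the Developer's Certificate of Origin under the submitter's name; a placeholder example (name and address are made up):

  Signed-off-by: Jane Developer <jane.developer@example.org>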


Weren't there some controversies around this before? I remember there was some talk of why Asahi Lina (anonymous VTuber working on Asahi Linux) can contribute code to Linux. From a casual search: https://www.spinics.net/lists/kernel/msg4888830.html

FWIW I like Asahi Lina, just trying to understand the discrepencies


Interesting. My understanding is that these projects don't allow anonymous contributions to make their copyright situation clear, so in theory if marcan42 sent a letter to the linux project saying that contributions from Asahi Lina are actually theirs, they might reasonably be fine with that.

It seems like this is how the ASF runs: you can be anonymous publicly, but you have to sign their CLA (or whatever they call it) properly.

To me, the people trying to unmask Asahi Lina are being simultaneously mean and silly. If it's so obvious that it's marcan42 doing a voice, do you really need to point it out? That's kind of the joke.


I'm not sure why the downvotes. That seems to be a statement of fact.

You can do a certain amount of identity obfuscation online but for anyone with a real professional profile you're generally not really anonymous if anyone really cares to find out your true name.


Me neither, I even provided a source, and it's easy to find other examples. There certainly are projects that allow anonymous contributions, but I doubt it's the majority of projects that one would consider important.

For these kinds of projects you could make up an identity relatively easily and nobody would know, but you're screwing over the project (as they may need to remove your contributions if they find out), so it's not something to be doing if you actually want to contribute (instead of inserting backdoors).

The original idea (not being able to contribute without a verified identity) is still wrong, but it's wrong because it's impractical to prove identity in a way that people find acceptable (and works), not because people will not give up anonymity, as many of the replies state.


There are people who downvote things that present facts that aren't in accordance with how they think the world should be.

I do think it's difficult to verify identity in any reasonably acceptable lightweight way. That said, for the larger projects I'm most familiar with, a lot of people work for companies, attend conferences, etc. They may go by nicknames day to day, but they have known real identities and their professional existence wouldn't be possible without one.


Who are you to propose requirements onto people who work for free?


I am not imposing anything on anyone. I am only saying that an OSS project that aims to be used as part of important infrastructure should impose at least some sort of identity vetting and not just make random anonymous users maintainers of anything.

If your project is not important and you don't care about any of this security stuff, feel free to continue publishing your untrustable projects.


I took a look at the diff linked in the article with code that "we are all running". The top of the diff certainly looks interesting. They remove the bounds check in dict_put() and add a safe version dict_put_safe().

This kind of change is difficult to make without mistakes because it silently changes the assumptions made when code calling dict_put() was originally written. ALL call sites would need to be audited to ensure they are not overflowing the dictionary size.

The diff I am referring to is here:

https://git.tukaani.org/?p=xz.git;a=commitdiff;h=de5c5e41764...
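
To make the risk concrete, here is a minimal sketch of the pattern being described (hypothetical names and struct layout, reconstructed from the description above rather than taken from the actual liblzma source):

  #include <stdint.h>
  #include <stddef.h>
  #include <stdbool.h>

  typedef struct {
      uint8_t *buf;   /* dictionary buffer */
      size_t pos;     /* next write position */
      size_t limit;   /* writes must stay below this */
  } dict;

  /* Before (sketch): the bounds check lives inside dict_put,
     so no caller can overflow the buffer. */
  static bool dict_put_old(dict *d, uint8_t b) {
      if (d->pos >= d->limit)
          return false;          /* dictionary full */
      d->buf[d->pos++] = b;
      return true;
  }

  /* After (sketch): the check is gone, so every call site now
     silently carries the obligation to prove pos < limit. */
  static void dict_put(dict *d, uint8_t b) {
      d->buf[d->pos++] = b;
  }

Any single call site that miscalculates the remaining space then becomes a silent out-of-bounds write, which is why all callers would need to be re-audited.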


Also because the 'safe' version only checks

  dict->pos == dict->limit
and not

  dict->pos >= dict->limit
if you can get one call of dict_put somewhere to go past the limit, all later calls of dict_put_safe will happily overwrite memory and not actually be safe.
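
A simplified sketch of that failure mode, reusing the hypothetical dict struct from the sketch above (whether the real code can actually get pos past limit is a separate question): an equality guard fails open once pos has already skipped past limit, while a >= guard still stops any overshoot.

  /* Guard as described above: only catches an exact hit. */
  static void dict_put_safe_eq(dict *d, uint8_t b) {
      if (d->pos == d->limit)
          return;                 /* pos > limit slips through */
      d->buf[d->pos++] = b;
  }

  /* A >= guard would still stop any overshoot. */
  static void dict_put_safe_ge(dict *d, uint8_t b) {
      if (d->pos >= d->limit)
          return;
      d->buf[d->pos++] = b;
  }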


No, because dict_put will update the limit value if the new pos exceeds it.


I don't see anything like what you are describing. What line exactly are you talking about?


Wow, that is 1000% obviously malicious


Agree, nice catch. Also, there are many other opportunities in this patch to hide memory safety bugs.

This is the kind of optimization I might have done in C 10 years ago. But coming back from Rust, I wouldn't consider it any more. Rust, despite its focus on performance, will simply not allow it (without major acrobatics). And you can usually find a way to make the compiler optimize it out of the critical path.


I agree, this looks extremely sketchy. Especially because the code is just writing a fully controlled byte in the buffer and incrementing its index.

This would give you a controlled relative write primitive if you can repeatedly call this function in a loop and go OOB.


I think at this point it is clear that everybody has to assume that xz is completely rotten and can no longer be trusted. Is xz easy to replace with some other compression tool? Or has it been so widely adopted that it is going to take a huge effort to move off it?


There is no reason to assume that. Even if you assume every commit since Jia became a maintainer is malicious, the version from 3 years ago is perfectly fine.

Zstd has a number of benefits over Xz that may warrant its use as a replacement of the latter, and this will likely be a motivating factor to do so. But calling it entirely rotten is going way too far IMO


There is an interesting argument to be made that pre-JT xz code is probably pretty secure due to the fact that the threat actors would have already audited the code for existing exploits prior to exerting effort to subvert it.


I always use "zstd --long=31 -T0 -19" to compress disk images, since that is a usecase where it generally offers vastly superior compression to xz, deduplicating across bigger distances.

XZ offers slightly better compression on average, but decompression is far slower than Zstd.


IIRC memory consumption is generally worse for Zstd at comparable levels of compression. Which, these days, is generally fine, but my point is you can't thoughtlessly substitute the two.


What keeps ringing in my head is the "." that was found that invalidates compilation. I personally don't buy it (but that is my opinion).


What do you mean "don't buy it"?


My bad. I thought that the person who made that commit was someone other than JT. Can't delete the comment nor self-downvote it.


Huge effort, because it is the default .deb compressor in Debian for example


Arch Linux already replaced it with zstd back in 2020. It's doable for the next major release of Debian.


Certainly, but we need an xz decompressor to read the current debian repo versions for the next decades, when they are oldstable or archived.


Decoding is easy.


This is either 100% malicious or a novice coder. And we surely know it's not the latter.

If you need an unsafe call, you add a dict_put_unsafe(). That again should of course be rejected in a code review.


I think Joey's right that we should all go back to the "pre-Jia-Tan" xz, and I've raised this with Red Hat too. It's actually not a big deal as xz and liblzma is relatively stable and the version from 2 years ago is fine, although I understand that Debian's dpkg uses some new API(s) from liblzma which makes this a problem albeit a minor one.

(Unfortunately the Debian bug report that Joey filed got derailed with a lot of useless comments early on.)


How do you know what 'pre' means, given that pseudonymous identities are free and Tan is already suspected of having some (e.g. Hans Jansen and Jigar Kumar: https://research.swtch.com/xz-timeline)?


I mean we go back before all possible sockpuppets. We do have a reasonably good idea of when the attempt started.


The point that the person you replied to is trying to make is how do you know when the repo is clean? How can you ever be sure that someone hasn't introduced a backdoor at some point? It's bigger than just what has been discovered.


How do you know anything has not been compromised? You go and look at the commits and the code. It's hard work with no easy answers despite what many think.


By not letting the perfect be the enemy of the good-enough-for-now


The greater concern should be how many other sleeper contributors are out there. Anonymous contributions are accepted every day, and we know of cases with malicious intent, such as by "James Bond" (https://lore.kernel.org/lkml/20200809221453.10235-1-jameslou...).

I am not specifically worried about other contributions by "Jia Tan", those are being extensively looked at right now. They and other sleepers may just as well have contributed to any project with a different name and therefore "Jia Tan" does not pose more danger than any other contribution whose submitter cannot be held responsible.


What's malicious about that patch? From reading the thread it looks like an attempt to fix a FP from some tooling.


This is one of the patches for which the University of Minnesota was banned from contributing to the Linux kernel. They were trying to introduce a use-after-free (Fig. 9 in their paper).

https://news.ycombinator.com/item?id=26887670


I just had to think about how ironic it would be if "Jia Tan" turned out to be a Post-Doc from the University of Minnesota continuing that research on hypocrite commits.


Consider that “Jia Tan” started working on xz because they already found a critical vulnerability and wanted to maintain it, or, more tinfoil, they burned xz to get upstreams to use another compression library that is also already backdoored. When dealing with state actors there’s really no limit to how complex the situation can be.


This is something I also wondered about but haven't seen discussed anywhere. This could all be a smokescreen to get distros to switch to the next best compression library, which already contains malicious code. Hopefully maintainers of any upstream compression libraries are all looking hard at their code bases right now.


Seems like a sensible thing to do, assuming this is a state-level threat actor there’s really no easy way to prove that their contributions are free of back doors. Seems not worthwhile risking the security of a large part of the Internet over a few thousand lines of code.


But why would the entire operation behind this submit all their attacks through the same single identity? Removing all this code could just be removing 1% of their harmful code. How do you deal with the rest? How do you discover the other identities?


You start with what you know about, and you investigate other projects carefully at the same time. There's no easy answer here, you do what you can.


Full on tinfoil hat here. But warranted and practical.

I'm wondering what fallout we'll see from this backdoor in the coming weeks, months or years. Was the backdoor used on obscure build servers or obscure pieces of build infrastructure somewhere? Lying dormant for a moment in the future to start injecting code into built packages, maybe? Are distros going to go full-on tinfoil-hat and lock down their distribution, halting progress for a long time? Are software developers (finally?) going to remove dependencies (now proven to be liabilities!), causing months of refactoring and rewriting, without any other progress?


>Are software developers (finally?) going to remove dependencies (now proven to be liabilities!), causing months of refactoring and rewriting, without any other progress?

How is that even a possibility? xz was very useful software, which can only exist if people with significant knowledge put effort into it. Not every OSS project has the ability or resources to duplicate that. The same goes for many, many other dependencies.

I believe that there is essentially nothing you can do to prevent these attacks with the current software creation model. The problem here is that it is relatively simple for a committed actor to make significant contributions to a publicly developed project, but this is also the greatest asset of that development model. It is extremely hard to judge the motivation of such an individual; for most benign contributors it is interest in the project, which they project onto their co-contributors.


Agreed. More than that, there's not much of a way to prevent these kinds of attacks, period, whether in software or otherwise, if the perpetrator is some intelligence agency or such.

For threats lesser than a black op, the standard way of mitigating supply chain attacks in the civilized world is through contracts, courts, and law enforcement. I could, in theory, get a job at a local food manufacturer, and over the course of a year or two, reach the point where I could start adding poison to the products. But you're relatively confident that this won't happen, because should it ever happen, the manufacturer will be smeared and sued and they'll be very quick to find me and hand me over to the police. That's how it works for pretty much everything; that's how trust is established at scale.

Two key components of that: having responsibility over quality/fitness for use of your product, and being able to pass the blame up to your suppliers, should you be the victim too. In other words: warranty and being an easily identifiable legal entity. Exactly the two components that Open Source development does away with. Software made by a random mix of potentially pseudoanonymous people, offered with zero warranties. This is, of course, also the reason OSS is so successful. Rapid, unstructured evolution. Can't have one without the other.

Or in short: the only way I see to properly mitigate these kinds of threats is to ditch OSS and make all software commercial again (and legally force vendors to stop with the "no warranty" clause in licensing). That doesn't seem like a worthwhile trade-off to me, though.


>the only way I see to properly mitigate these kinds of threats is to ditch OSS and make all software commercial again (and legally force vendors to stop with the "no warranty" clause in licensing).

Which just pushes the problem to commercial companies getting a 'friendly' national security letter they can't talk about to anyone, stating they should add REDACTED to the library they provide.


Correct. Hence the disclaimer in my first paragraph, which could also be stated as the threat duality principle, per James Mickens[0]:

"Basically, you’re either dealing with Mossad or not-Mossad. If your adversary is not-Mossad, then you’ll probably be fine if you pick a good password and don’t respond to emails from ChEaPestPAiNPi11s@virus-basket.biz.ru. If your adversary is the Mossad, YOU’RE GONNA DIE AND THERE’S NOTHING THAT YOU CAN DO ABOUT IT."

--

[0] - https://www.usenix.org/system/files/1401_08-12_mickens.pdf


Code is law. As such, the "standard way" you mention is appropriate for people with zero strategic foresight. There is no absolute need to depend on third parties to solve your problems and the possibility to limit and disperse trust to mostly yourself is real. Sure, glowies can always get to you but they can't get to you everywhere nor all the time. Security/Assurance models and both proprietary and free software architecture are already adapting to such facts.


Not all dependencies are of "xz" complexity.

Minimizing dependencies probably means keeping a few libs for things like crypto or compression and such.

But do you need a library to color console output? Even if colored console output is a business-critical feature, you don't need a (tree of) dependencies for that. I see so much rather trivial software that comes with hundreds or thousands of dependencies; it's mind-boggling, really. Why have 124 million people downloaded a rubygem that loads and parses a .env file, something I do in a bash one-liner? Why do 21k public npm packages depend on a library that does "rm -f"?
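
For illustration, a minimal sketch of the kind of one-liner I mean (assuming a plain KEY=value .env file with shell-compatible quoting, no library involved):

    set -a; . ./.env; set +a   # auto-export every variable the file assigns, then turn auto-export off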

The answer, I'm afraid, is mostly that people don't realize this isn't just some value added to their project but rather a liability.

Some liabilities are certainly worth it. XZ is probably one of them. But a library that does "rm -f" certainly isn't.


It's impossible to insure against in any practical terms.

The way forward is to invest heavily in a much more security-oriented kernel(s) and make sure that each program has the bare minimum to achieve what it offers as a value-add.

The human aspect of vetting seems like an impossibly difficult game of whack-a-mole. Though realistically I doubt that the bad actors have infinite agents everywhere, this also has to be said. So maybe a "sweep" could eliminate 90% of them, though I'd be skeptical.


Agreed, as a developer: minimize your dependencies while providing your core function. Don't grant dependencies permissions they don't need. Be granular about it. Austral lets you select what filesystem, network, etc. access each library gets.

Also, in big organizations, risk assessment is more about making sure there is someone to point the finger at, than actual security. Treating libfubar as golden because it ships with something you paid another company money for makes sense in that light. But not from an actual security mindset.


"reduce the attack surface" is Security 101. Noting again that sshd doesn't natively use xz/liblzma (just libz) or systemd, so I don't think I need to point out where the billowing attack surface is ;)

Apache (by way of mod_systemd) is similarly afflicted, as is rsyslogd; I guess most contemporary daemons that need systemd to play fair are (try: "fuser -v /lib64/liblzma.so.?" and maybe "ldd /lib64/libsystemd.so.?" too).

Like a Luddite I still use Slackware and prefer to avoid creeping dependencies, ever since libkrb5 started getting its tentacles into things more than a decade ago.


Yeah, it's almost like "do one thing and do it well" had security benefits...

SELinux has the desired sort of granular permissions at the OS level, but if everything is dynamically linked to everything else, that doesn't help, as the tiniest lib is now part of every process and can hence pick and choose permissions.

But even if we go full monolith OS, where systemd takes over the job of the kernel and the browser, that just changes where we need those permissions implemented. We can't practice zero trust when there is no mechanism for distrust in the system.


> Agreed, as a developer: minimize your dependencies while providing your core function. Don't grant dependencies permissions they don't need. Be granular about it. Austral lets you select what filesystem, network, etc. access each library gets.

Still wouldn't help for this particular exploit.


If systemd could deny liblzma any syscall or filesystem access, that would have prevented it. It is only used to compress a data stream; it only needs read access to one buffer and write access to another. I realize there is no current mechanism for these granular permissions; that is what I was proposing be addressed.


We don't have the way to apply any restrictions on a per-library basis. This is generally quite difficult to do.


I know. That's what's missing from our current technology. I am honestly tired of everyone collectively pretending this is not a problem. Periodically we get very grim reminders that it's in fact a problem, everyone pretends to care for a month then it's all back to where it was.

It's depressing. (And no, this comment does not imply you are such. I am responding + ranting.)


I suspect most of your frustration comes from reading "we don't think this is a good place to spend our effort" as "there are no problems here".


Likely. Though people not seeing the problem is quite frustrating by itself.


In a way it would.

If a software project has hundreds of dependencies, finding the one that was compromised is hard, impossible even. But if it has three dependencies (that aid in the core functionality), keeping a keen eye on them is much easier.

When I look at a typical `node_modules` or `pipenv` directory, I see there's absolutely no way I can vet that all is safe in there. When I look at my typical cargo tree, it's doable to just go over the four or five dependencies (of dependencies) every so often.

Automation helps. But it doesn't give me the same confidence that just opening the project pages of the stuff I use, once every few months, does.


Since I didn't keep as current as I wanted to be (work and life happen a lot lately), what could have prevented it?


> The way forward is to invest heavily in a much more security-oriented kernel(s)

While I don't disagree that kernels should be secure, I also don't see how that would have helped in this case, given that (AFAICT) this attack didn't rely on any kernel vulnerabilities..


True, I wasn't specific enough. The attack exploited that nobody thinks security is a serious enough problem. It's a failure of us (the technical community) as a whole, and a very shameful one at that.


IIRC Debian has wiped and is rebuilding all their build hosts, so yes.

But while I understand what you mean, I would not call improving the security of a piece of software “halting progress”. Security improvements are progress, too. Plus, revisiting processes and assumptions can also give opportunities to improve efficiency elsewhere. Maintenance can be an opportunity if you approach it properly.


What I meant with "halting progress" is what commonly happens when a piece of software is "rewritten from scratch". Users (or clients or customers) see no improvements for years or weeks, while the business is burning money like mad.

The main reason why I am firmly opposed to "rewrite from scratch" or "we'll need some weeks to refactor¹".

Removing upstream dependencies and replacing them with other deps, with no-code, or with self-written code² is a task that takes a long time, during which stakeholders see no value added other than "we reduced the risk"; in the case of e.g. SaaS that's not even a risk these stakeholders are exposed to, so they then see "nothing improving". I'm certain a lot of managers, CTOs and developers suddenly realize that, wow, dependencies really are a liability.

¹ I am not against refactoring, just very much against refactoring as standalone, large task. Sometimes it's unavoidable because of poor/hard choices made in the past, but it's always a bad option. The good option would be "refactor on touch" - refactoring as part of our daily jobs of writing software.

² Too often do I see dependencies that are ridiculously simple, left-pad being the poster child. Or dependencies that bring everything and the kitchen sink, but all we use is this one tiny bit that would've cost us less than 100 lines to write. Or dependencies that solve -- nothing, really? Just that no one took the time to go through it and remove it. And so forth and so on.


Not fallout, but increased vigilance is the expected most significant outcome. Some 5+ years ago I listened to a Debian guy's talk about reproducible builds and security; he was urging the audience, in a very detailed manner, to be aware of exactly the kind of thing that just happened. One of the details he mentioned was glowies having moved their focal point to individual developers and their tooling & build systems. At least some people who matter have been working on these threats for many years already; maybe more people will start to listen to them, in which case this entire debacle could have a net positive effect in the long run.


> Was the backdoor used on obscure build servers or obscure pieces of build infrastructure somewhere?

And developer machines. The backdoor was live for ~1 month on testing releases of Debian and Fedora, which are likely to be used by developers. Their computers can be scraped for passwords, access keys and API credentials for the next attack.


> Are software developers (finally?) going to remove dependencies (now proven to be liabilities!), causing months of refactoring and rewriting, without any other progress?

We've been here before, with e.g. event-stream and colors on npm. So I don't think it will change much. Except maybe people will stop blaming it on JS devs being script kiddies in their mind, when they realise that even the traditional world of C codebases and distro packages is not immune.


You can't really remove dependencies in open source. It is so intertwined at this point that doing it would be too expensive for most companies.

I think the solution is to containerize, containerize and then containerize some more, and make it all with zero trust in mind.


Containerizing is entirely the worst response here. Containers, as deployed in the real world, are basically massive binary blobs of completely uncertain origin, usually hard to reproduce, that easily permit the addition of unaudited invisible changes.

(Yes yes, I know there are some systems which try to mitigate this, but I say as deployed in the real world.)


Your application is already most likely a big binary blob of uncertain origin that's hard to reproduce. Containers allow these big binary blobs of uncertainty to at least be protected from each other.


Pretty much; updating, say, libssl in a "traditional" system running an app, or maybe 2-3 dependent apps, fixes the bug.

Put all of them in containers and now every single one needs to be rebuilt with the dep fixed, and instead of having one team (ops) responsible, you now need to coordinate half of the company to do so. It's not impossible, but in general much more complex, despite containers promising "simpler" operations.

...that being said, I don't miss playing the whack-a-mole game with developers who do not know what their apps need to be deployed in production and for some inexplicable reason tested their app on unstable Ubuntu while all of the servers run some flavour of stable Linux with somewhat older libs...


Docker containers are not really a security measure.


It is a security measure. Sure it doesn't secure anything in the container itself. But it secures the container from other containers. Code can (as proven) not be trusted, but the area of effect can be reduced.


Only with additional hardening between the container and the kernel and hardware itself.


> What if xz contains a hidden buffer overflow or other vulnerability, that can be exploited by the xz file it's decompressing?

If you generalize this problem further, to all packages, then the only reliable solution is security through compartmentalization. On Qubes OS, any file I open, including .jpg and .avi, can't have the access to my private data or attack the admin account for the whole computer. This is ensured by hardware-assisted virtualization.


> the only reliable solution is security through compartmentalization

I hope we get there eventually. Not just for standalone processes, but for individual libraries. A decompression library could run inside a WebAssembly sandbox, with the compressed file as input, the uncompressed file as output, and no other capabilities.


What does this have to do with WebAssembly? That is another runtime that adds complexity. Apple has been sandboxing codecs for a long time. They run in a sandboxed process that is only communicating through stdin and stdout or something similar, if I remember correctly. You can run native code directly. Adding a runtime with a JIT compiler makes it harder to understand what is going on.


WebAssembly can be run in-process rather than requiring a process switch, and it can be easier to port library code to run inside a WebAssembly sandbox than a completely separate process. Also, sandbox mechanisms for separate processes are not always as robust, since they have to give access to any direct syscalls the process makes, whereas WebAssembly completely insulates a library from any native surface area.


There's AOT wasm too. Firefox uses it to sandbox some stuff. https://hacks.mozilla.org/2021/12/webassembly-and-back-again...


WebAssembly is the new Rust, I think

Hey, no one proposed to rewrite xz in Rust yet! I'm sure that would automatically protect any project from social engineering attacks!


Running xz in a sandbox would not prevent an attack that causes it to modify source code in a .tar.xz that is being streamed through it.


No, it wouldn't, but that wasn't the attack here. And code outside the sandbox could check a checksum of the uncompressed data, to ensure that the decompression can't misbehave.
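
For example, trusted code outside the sandbox could compare the decompressed output against an independently published digest; a minimal sketch (filename from the example upthread, the expected digest is a placeholder that must come from a source independent of the .xz artifact):

    xz --decompress --stdout gcc.tar.xz > gcc.tar
    echo "<expected-sha256>  gcc.tar" | sha256sum -c -   # fails loudly if the decompressor altered anything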


It's a bit ironic that after a trust attack this person ends the article saying

> I do have a xz-unscathed fork which I've carefully constructed to avoid all "Jia Tan" involved commits.

He may be fully legitimate, and perhaps a famous person in OSS (which I was unfamiliar with), but still ironic :)


There seems to be a fundamental misunderstanding with a lot of these writeups. Are they 100% sure history was not rewritten at any point? Going back in time on the repo prior to listed involvement doesn't do anything as the attacker had full control. Starting from the last signed release prior to their involvement is the only way to actually move this forward (history may be fully lost at this point), the rest is posturing.


Even history rewrites would be visible with GitHub's new Activity tab; e.g., see the two force-pushes in llama.cpp: https://github.com/ggerganov/llama.cpp/activity So, while, yes, git history can be rewritten, commits pushed to GitHub can effectively never be deleted. Personally, I find this to be a downside. Think personal information, etc. But, in this case, it is helpful. Of course, the repository is suspended right now, so the Activity cannot be checked.


While it's certainly possible to rewrite git history, it's tricky to do it without other maintainers or contributors noticing, since anyone trying to pull into an existing local repo (rather than cloning fresh) would be hit with an unexpected non-fast-forward merge.

It seems likely to me that Lasse Collin would have one or more long-standing local working copies.

So IMHO injecting malicious changes back in time in the git history seems unlikely to me. But not strictly impossible.


Based on how this has gone (remember xz has effectively been orphaned for years, and the majority of long-standing setups were using the release archives), unless Lasse has never run any code from Jia (unlikely), I'd consider the entire machine untrusted (keys, etc). Provided the tarballs from before that date are still signed and available from another immutable source, that's really the only starting point here for rebuilding.


In any case Debian has its own archive of every xz-utils version they've used in the past.


The attacker had access to the GH mirror of the repo. The original repo remained at https://git.tukaani.org/


> Are they 100% sure history was not rewritten at any point?

With git, one way to check is if other people still have clones of the xz repository from a time when it was trusted.

If you suspect the repo history has been tampered with, you can check against those copies.

I believe it would be hard to introduce such a history rewrite, since people pulling from the xz repo would start getting git error messages when things don't match up?

I don't know to what degree intentional SHA-1 hash collisions could be used to work around that?


You can create pairs of SHA-1 hash collisions, but not a collision for a particular existing SHA-1 hash (the git one).


People think git is immutable. It is not.


Yes and no.

A local GIT repo can be changed (including its history) however you please. But once you have shared it with others you can't take that back. If you try to, then others will notice that the hashes mismatch and that their HEAD diffs uncleanly.

I know the term is infamous here, but GIT is essentially a blockchain. Each commit has a hash, which is based on the hashes of previous commits, forming a linked list (+ some DAG branching).
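
You can see that structure directly: each commit object embeds the hash of its parent(s), so rewriting anything upstream changes every hash downstream. For example:

    git cat-file -p HEAD   # prints the tree hash, the parent commit hash(es), author, committer and message
    git rev-parse HEAD     # the commit's own hash, computed over all of the above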


> If you try to, then others will notice that the hashes mismatch and that their HEAD diffs uncleanly.

So it relies on a human noticing and acting upon it. People not noticing backdoors being merged into the project is kinda the source of this problem.


You can automate checks for whether a large part of the previous git history suddenly changed.

You can't automate checks for malicious code.


That relies on some heuristics which can be worked around, unless you disallow rewriting history.

But the bigger issue is that this is some theoretical system which is not present in most git repositories.


The heuristic would be "sound the alarm if the main branch is rewritten". And maybe also "if a release tag that we have used for our distro is moved".

Wouldn't that catch most problems, and not generate too many false alarms?
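
A minimal sketch of that heuristic (bash; the tag name, branch and remote are illustrative, and the "known" files would be recorded by the distro at packaging time):

    # recorded at packaging time
    git ls-remote origin refs/tags/v5.4.6 > tag.known      # what the release tag pointed to
    git rev-parse origin/master           > branch.known   # the branch tip we last saw

    # later, run periodically
    git fetch origin
    diff <(git ls-remote origin refs/tags/v5.4.6) tag.known >/dev/null \
        || echo "ALARM: release tag moved"
    git merge-base --is-ancestor "$(cat branch.known)" origin/master \
        || echo "ALARM: branch history rewritten (not a fast-forward)"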


You can rename/switch branches. You can change what branch is considered main/master. You can find valid reasons why you'd want to do stuff which raises the alarms so that other people become deaf to them, and only then execute the rewriting attack. Relying on people noticing (even with alarms) is just super fragile.


> You can rename/switch branches. You can change what branch is considered main/master.

Sure, in the project repo the branches are just simple text files that contain the hashes of the commits they point to.

So they are trivial to change in the project repo. But it is also trivial for the distro project to keep copies of the branch/tag info and check against those. I guess what you mainly care about are the previous release tags. They should never change after a release.

> Relying on people noticing (even with alarms) is just super fragile.

I'd say there's plenty of motivation now for the major distros to put infrastructure in place to automate this (keeping track of previous releases) and to actually keep looking at the alarms.

> You can find valid reasons why you'd want to do stuff which raises the alarms so that other people become deaf to them

I'm sure the attackers would try things like that.

But let's say you have an open source application/library that is part of Debian.

How common has it been in the past that the app/lib project had a bunch of tagged releases, and then wanted to rewrite the history so that the tagged releases now point to different commits? I assume it has been very uncommon, but maybe I'm wrong?

And even if that is the case, new infrastructure tools can keep local copies of the source code for previous releases, and check against that.

Repo checking is not trivial, perfect, or sufficient. But I'd say it's a necessary component in guarding against attacks.

The big challenge is still that there is so much code added/changed for each new release of apps/libs that it is very difficult to check against attacks. The obfuscated C contest has proven again and again how hard it is.


It's a Merkle tree. They were invented 3 years before blockchains: https://en.wikipedia.org/wiki/Merkle_tree


It also uses a Merkle tree to compress the snapshot versions associated with commits. But the actual commit structure builds on top of that. A pure Merkle tree or forest would only give you a set of overlapping snapshots, without any directionality. So, I think it is fair to call it a blockchain as well.


Blockchains were invented in 1982?


In short, yes: https://en.wikipedia.org/wiki/Blockchain#History

People conflate blockchains, distributed networks and cryptocurrencies.


Well, it is and it isn't: It has mutable pointers (branches and tags) to immutable nodes in a graph (commits).


Can you elaborate? Are you thinking of intentional SHA-1 hash collisions? Would that work in practice?


The history. Every time something like this attack happens people think they can read the complete git history in the repo.


If some commits are signed by people you trust, can the chain before that still be compromised?


Concerning history rewrite, it makes sense to point to Fossil and its major difference to Git:

https://fossil-scm.org/home/doc/trunk/www/fossil-v-git.wiki#...

There is also a link to "Is Fossil a Blockchain?", an interesting read because the term was mentioned elsewhere in this thread.


Trusting anything from that actor is full on ignorant, let alone "a new decoder". It's insane.


Trusting people in general is inadvisable. I haven't trusted anyone for years and I am richer than ever.


You probably still rely on trusting others a lot more than you realize.

If you really, really, didn’t rely on trusting anyone, I don’t even see how it would be possible to exist on earth.


You're trusting millions of people just to be able to write this comment.


Those two topic don't have much in common, trusting a state level hacker actor vs. trusting people in general.


> Hopefully, Lasse Collin will consider these possibilities and address them in his response to the attack.

Here's the thing: Lasse Collin was overloaded back in 2021. I've no particular reason to believe that isn't still the case. He needs help. Dealing with this solo is an incredible amount of work. Also, he needs help from a verifiably trustworthy source, verifiable in a way that doesn't require a lot of effort. In practice, that almost certainly means help from a major open source company.

I seriously doubt that's going to happen, because the people who really need to learn this lesson won't, because it's probably not in their financial interest to realize that supply chain problems start with them doing things on the cheap. While we keep on running the world's infrastructure like XKCD 2347, every so often everything's going to topple over.


> Also, he needs help from a verifiably trustworthy source, verifiable in a way that doesn't require a lot of effort.

PGP signing parties?


I remember when they were big. People signed anyone’s key, didn’t need to know them. Yes, sensible people thought this was an issue. Still happened.


... but `xz` is pretty much feature complete to me.

Lasse Collin was doing bug-fix-only releases just fine.


The start of the attack was a few fake accounts trying to shame the maintainer for not "developing" it constantly, so that he would hand maintainer rights to someone else.

And there wasn't really anyone to say "nope, it's fine, fuck off"


> And there wasn't really anyone to say "nope, it's fine, fuck off"

That's because, for people for whom the project is doing fine and who haven't experienced any bug, why would they go to the mailing list, forum or whatever other communication channel the project has?

One has to understand and keep in mind that places that can gather feedback will invariably attract more of the negative kind than the positive kind because people who are happy are not motivated to say they're happy. Those people wouldn't even know people were complaining about xz.

There's thousands of libraries/independent software projects installed on any computer. No one has the time to check the place of all those software projects and go there just to say "hey, I'm happy, no need to change anything, thanks?", right?

People who are discontent with something on the other hand are sure to be vocal about it. But just because they're the most vocal doesn't mean they are the majority of your users.


I can't help but feel open source's responses ultimately don't address the root of the problem.

Yeah okay, reverting to 5.4.6 or some version from over 2 years ago might "solve" the immediate problem that is the backdoor, but it's not going to solve anything else.

More specifically, I've not heard so much as a rumor that any of the dependents will contribute time and manpower to the project they rely so heavily on. I find it amusing that it was someone from Microsoft, a company reviled by a lot of the open source (and particularly FOSS) community, who brought this problem to light.

Producing something needs time and manpower, and time and manpower ultimately are not free (both beer and libre).


>Yeah okay, reverting to 5.4.6 or some version from over 2 years ago might "solve" the immediate problem that is the backdoor, but it's not going to solve anything else.

The author suggested going back to 5.3.x, and I tend to agree with this. From what I read, "Jia Tan" had a hand in 5.4.x; if true, I would revert to a version earlier than 5.4.x. "Jia Tan" proved to be quite skilled at obfuscation.


I am wondering if the person who sent the patch to disable systemd reliance on lzma shortly after the release of the backdoored xz knew about the plan. Maybe an agent of a competing entity?


My gut instinct is that xz needs to be rolled back to its pre-attack state, but obviously that would probably also reintroduce some bugs and likely break some things. Still, I'm very curious to see some analysis of the impact of doing so, because as this article points out, xz is in the critical path for lots of system-level processes.


There were symbol changes in recent releases, and things like apt link to liblzma. If liblzma were downgraded without also updating apt at the same time you could be left with a non-functioning apt.


That’s a good start. In the long run probably three things are necessary:

1) Writing critical software in a language that protects better against such exploits. Might be Rust, Go, perhaps also C# and Nim.

2) Making reproducible builds the norm, builds that start from the original source code repositories (e.g., pinned to a Git hash); see the sketch after this list.

3) Making maintainers more resilient against social attacks. This means more appreciation, fewer demands, and zero tolerance for abuse. If the maintainer can be pressured, I am at risk.

The last one is probably the most difficult.
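
For point 2, a rough sketch of what starting from the repository rather than a maintainer-made tarball could look like (the pinned hash is a placeholder, the paths are illustrative, and the digest only means something when compared with someone else's independent rebuild of the same commit):

    git clone https://git.tukaani.org/xz.git
    git -C xz checkout <pinned-commit-hash>          # pin the exact source state instead of trusting a tarball
    (cd xz && ./autogen.sh && ./configure && make)   # regenerate the build scripts yourself rather than using shipped ones
    sha256sum xz/src/liblzma/.libs/liblzma.so*       # compare against an independent rebuild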


> It feels good to not need to worry about dpkg and tar. I only plan to maintain this fork minimally, eg security fixes.

This is exactly the problem in the first place: lack of support for maintainers.

OP themselves say "I will only minimally maintain this fork". Okay, but it's so easy in hindsight to criticize what has happened.

> Hopefully Lasse Collin will consider these possibilities and address them in his response to the attack.

I can't even imagine how he's feeling these days.


> I can't even imagine how he's feeling these days.

None of this is his problem or fault. I see no reason he should feel anything about this. Keeping everyone safe was never his job. He wrote code and gave it away for free. That should be enough.


Anyone who has ever been pickpocketed or robbed or worse will know that reason and feelings are different things.


> I see no reason he should feel anything about this.

Absolutely agree, but from the sounds of the emails at least, he was going through a bad time then, and nobody feels good when they realise they were taken advantage of.


I'm here wondering why big tech companies that have Too Much to Lose didn't already massively fund a project that freaking sshd depended on (through systemd).

Like, how does it hurt Google to assign 100 people to review and investigate commits of some project as basic and fundamental as a compression tool?


Poke at any large company at all, and you'll find that their in-house critical fundamental infrastructure thing is chronically underfunded, understaffed, bug-ridden, everyone is worried but no budget is ever approved.


Within a given part of an organization, just about everyone thinks they're underresourced and understaffed.



Google also does code reviews for some commonly used projects (or maybe that's part of the same thing? I don't know). I went through that last year with one of my Go libraries.

The idea is good, but the entire process is so bureaucracy-heavy and time-consuming that I found it both frustrating and entertaining in equal parts; like something out of Brazil (the film, not the country). So many emails, so many video meetings, so many people involved, so much talking. And all for looking at a 4,500 line Go library.

"Here's the code; just clone and look at it, and let me know if you find something"... It's not like you need my permission to do any of this *shrug*.


Brazil the country is also known for onerous bureaucracy


MS funded people and pipelines to analyze it. Jia Tan convinced them, using social engineering, to disable the fuzzing that was designed specifically to find malicious behavior, and one MS engineer did find it.


What were the MS analysis projects? The fuzzing was google-sponsored.


You are correct. The article I read led me to conflate the Azure engineer's valgrind-triggered activity with oss-fuzz, which is, as you say, a Google effort.



sshd didn't depend on it, to be fair. Not officially at least.


Is XZ embedded affected by any of this?


Good question. Embedded systems are harder, sometimes impossible, to upgrade, and there's a chance that a backdoor in a small board inside an appliance that doesn't offer easy physical access could take years to be found and removed.


It's like security 101. If a system has been infiltrated, you can't trust any part of it. So it's better to discard any part that has been reached or possibly affected.

Perhaps it's the correct action to distrust xz/lzma or any source code this team has control over and switch to alternatives. If there are no alternatives, to start new ones.


Hiding more backdoors in the library would only increase the risk of getting discovered. Care is certainly advised on the source level, but I'd leave the paranoia to the state of systems where the code has run.

From the attackers' perspective, what they'd want to do is use their project infiltration success as little as possible, only enough to squeeze in other backdoors completely unrelated to xz. But that's all operations, not development.


What are your definitions of 'system' and 'any part'? Any big company has been breached at some point. They don't throw away all their hardware every time, even though it was connected. You have to draw the line somewhere.

You're assuming a world with separate hardware and software. That's not the case any more. We have closed sourced firmware running anywhere and no way to verify what's running.


Sure, that's a problem for the threat assessment process. And I totally agree that in today's world software/hardware and wetware are too interconnected. And that's another avenue for these kinds of attacks.

In this case, is the whole git repo a threat? Or just the manually created distribution files? The threat actors' reach defines that. As time passes, we see that reach was not too limited: they even reached other software with patches. So that assessment should be done.


> If a system has been infiltrated, you can't trust any part of it. So it's better to discard any part that has been reached or possibly affected.

systemd! let's discard systemd!


They just added an example to the documentation[0] of how to implement the sd_notify protocol without linking to libsystemd, so a little bit of discarding systemd (or at least parts of it) does seem to be part of the solution.

[0] https://github.com/systemd/systemd/pull/32030/files
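
The protocol itself is tiny: readiness is just a datagram containing "READY=1" sent to the unix socket named by $NOTIFY_SOCKET. A minimal sketch (not the example from the systemd docs; it assumes a filesystem socket path, since abstract sockets, whose name starts with '@', need slightly different handling):

    printf 'READY=1' | socat - UNIX-SENDTO:"$NOTIFY_SOCKET"   # one datagram, no libsystemd needed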


People have been bandying about "10 lines of C", but I'm curious if you know why the protocol is not "2 characters" of shell, namely ":>PATH" (ok, ok, PATH is probably something like /run/serviceName/I-B-ready). At the user (i.e. service daemon) -level this seems much simpler. (EDIT: and systemd would unlink the file as soon as it "gets the message", of course.)

There's just a 40-year culture of using some "official" lib to implement socket protocols - even if the docs suggest you roll your own. I feel like file creation escapes that "reach-for-the-official-lib TCP/UDP/datagram" culture.

It's probably not harder for systemd either if they just use/require the Linux inotify and incorporate that into its select or poll or whatever. I mean, if they wanted to be portable to non-inotify kernels some timeouts/stat-loop would be an ok fallback that would probably be rarely-to-never needed.

It sounds like it's not even hard to add this simpler channel in after the fact just as an alternative option for `whateverd` and then deprecate the datagram one for 10 years (if they even care to).
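
To make that concrete, a sketch of what both sides could look like under such a scheme (paths from the example above; inotifywait from inotify-tools stands in for whatever systemd would do internally):

    # service side: "I'm ready" is just creating a file
    : > /run/serviceName/I-B-ready

    # manager side: block until something gets created in the service's runtime directory
    inotifywait -e create /run/serviceName/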


But this suggests reimplementing xz/lzma. Which would cost money. Hence, won't be done.


> But this suggests reimplementing xz/lzma.

If there is a known good copy of the repo from before the attacker had sufficient access to alter history, then that is an acceptable starting point.

From there you look at each update since and assess what they do to decide if you want to keep (as they are valid improvements/fixes) or discard them. If some are discarded, then later ones that are valid may need further work to integrate them into the now changed codebase. Similar to Debian assessing upstream security patches to the latest version to possibly back-port them to the version they have in stable, when there is significant disparity (due to a project being much faster moving than Debian:Stable).

As xz/xzutils is a relatively stable package, with very few recent changes, this should be quite practical. A full rewrite shouldn't be needed at all here.


> If there is a known good copy of the repo from before the attacker had sufficient access to alter history, then that is an acceptable starting point.

I heard someone calling themselves “Honest Ivan” has just the thing, totally trustworthy.


Given how widespread the copies could be, and that we know when the bad actor gained the level of control needed to tamper with history, or, if we want to go further back, when that user started making contributions, it is likely that by comparing many claims we can prove to a reasonable level of assurance[1] that a given version is untouched in that regard.

Furthermore the original main maintainer seems to have a repository with an untouched lineage. While true paranoia says they can't be trusted without verification (he could be under external influence, as could anyone) I think we can safely give their claims more credence than those of Honest Ivan.

--

[1] to the level where a clean-room implementation is not significantly less likely to be compromised by external influence with bad motives.


It should be easy to go back to https://snapshot.debian.org/ and one more repository and verify old untainted releases between the two archives.


But there are alternatives, most notably zstd.


It's a different algorithm made for a different purpose.


sadly, the zstd cli tool links to lzma right now (as installed by some distros) :/


a half-arsed search resulted in this half-baked rust library: https://github.com/gendx/lzma-rs


It seems detecting holes in Jia's code could be extremely difficult. Given the stakes, as a precaution, would it be viable to simply wipe and rewrite (from scratch) the last ~2 years of commits to xz?


Why not add another layer to the tinfoil hat?

How do we know "Jia Tan" is not a Facebook op to "nudge" people to switch to Zstandard?


Yeah, it's left me a little disappointed in Arch in particular that they didn't follow the lead of Debian and Fedora and revert to a much older version, instead just building 5.6.1 from the git repo and basically defending it with "the hacked build script checked for dpkg/rpm anyway".


Is this what you're referring to?

> Regarding sshd authentication bypass/code execution

> Arch does not directly link openssh to liblzma, and thus this attack vector is not possible. You can confirm this by issuing the following command:

> However, out of an abundance of caution, we advise users to remove the malicious code from their system by upgrading either way. This is because other yet-to-be discovered methods to exploit the backdoor could exist.

https://archlinux.org/news/the-xz-package-has-been-backdoore...

I'm not finding anything from Arch mentioning dpkg/rpm, the linked article above is the latest article about the xz compromise from the Arch homepage.


- https://bbs.archlinux.org/viewtopic.php?pid=2160841#p2160841

- https://gitlab.archlinux.org/archlinux/packaging/packages/xz...

- There were some comments in the middle of the giant openwall mailing list thread I can't find now because they're in the middle of 30,000 replies


Arch Linux is not vulnerable to this specific attack, which requires sshd to be linked to liblzma. This link is introduced by patches from outside upstream sshd, which Arch does not apply to their build.


The point here is that there is uncertainty in all commits by Jia Tan. Arch's focus is on this specific hack, but are there other vulnerabilities in the hundreds of commits to the git repo from the same author?


But as this article points out, liblzma is used in other crucial processes, and is generally trusted, often probably being run as root. The known bad actor contributed lots of code to xz that isn’t involved in the SSH backdoor. To assume it’s all innocuous would be truly foolish.


Arch tries to always be as current as possible, for better or for worse. So this definitely makes sense for arch


Wow. So for the xz package it looks like they changed the upstream to this git repo (edit: the original maintainer Lasse Collin's personal repo), which still contains Jia Tan's commits: https://git.tukaani.org/?p=xz.git

tl;dr they re-enabled the sandboxing previously disabled by Jia Tan.


What if over 80% of all open source projects are secretly sleeper agents for various malicious actors, states, terrorists and whatnot, and they pretend to give us precious software updates for free so they can attack later?

What if proprietary software is running the same way, except they don't even give you free updates and you can't audit the source code and have to trust them when they push updates?

What if your mother gave birth to you just so she can slap you in the face when you're 30?

Yeah, we can go very far, but at this moment, xz is under so much scrutiny that in 2-4 weeks, I'd trust it with my life unless the big orgs looking at it issue more reverts (hence, the delay). So if there are issues, they're everywhere else.


Nit-picking but, eh, png does not use lzma at all.

> PNG compression method 0 (the only compression method presently defined for PNG) specifies deflate/inflate compression with a sliding window of at most 32768 bytes. Deflate compression is an LZ77 derivative used in zip, gzip, pkzip, and related programs.


I don't think joeyh wanted to imply that PNG uses liblzma. PNG is just a convenient place to put opaque binary stuff that'd trigger an xz compression bug.


I still don't understand how would that work. The post said:

> Let's say they want to target gcc. Well, gcc contains a lot of documentation, which includes png images. So they spend a while getting accepted as a documentation contributor on that project, and get added to it a png file that is specially constructed, it has additional binary data appended that exploits the buffer overflow. And instructs xz to modify the source code that comes later when decompressing gcc.tar.xz.

It says "when decompressing", and I would imagine that such a bug needs specifically constructed lzma stream to trigger. If you want to do it by changing a source file (a png here) you need to make "second-order" bugs: i.e. the compressor needs to output a broken lzma stream which when later decompressed would exploit (not simply cause) a memory corruption bug. This is too brittle [1] and are very likely to be detected.

[1] Disclaimer: I'm not an expert in writing backdoors. I consider myself reasonably competent for writing exploits, and I've written deliberately buggy programs (for CTFs) before.


Consider this backdoor:

- if the decompressed stream contains a magic keyword, run the rest of the file as an x86-64 binary.

Now you just need any opaque binary file to host the payload. A PNG works fine, because most decoders don't care about extra bytes at the end.


When decompressing gcc.tar.xz, which contains foo.png followed by main.c, the decompressor is instructed by the hidden data in the png how to alter the code.


The build script decoded precompiled backdoor code from a binary test file that wasn't really an archive, but was encrypted with a Caesar cipher. Any blob can be used like this as a trivial steganographic container.


Any code that extracts data from such a blob would look very suspicious.


I'm disappointed, to put it very mildly, in how Arch Linux handled the matter. They still use version 5.6.1 and assume that switching from GitHub to the repo hosted by Lasse fixes the issue. They say "our sshd isn't compromised", but as the author of this article wrote, who knows what else might be affected. There's a post on the Arch Linux forum which was closed by ewaller, an administrator account, with the odd justification that the thread was only meant to inform people; when people started calling out the malpractice of the Arch Linux maintainers, the thread got locked.

To me, this is very suspicious.


Why not get rid of xz completely? How about using a simpler piece of software, which could be maintained by more people?


> Why not get rid of xz completely?

... you mean the compression package used by most big distros to make their packages? Do you really need to ask?

> Hoe about using a simpler piece of software, which could be maintained by more people?

It's not a complex piece of software. The lib itself is ~15k lines of code.

It does its job well and it needs little work. It didn't have any outstanding bugs lingering unfixed for years.

Complexity has nothing to do with the problem; it's just... uninteresting enough that there is no reason to contribute.


Complexity is one of the root causes of the problem. The script was added via messy autoconf scripts inscrutable to most people. That is decades-old tech which has alternatives.


I've seen for example here a lot of issues raised:

https://www.nongnu.org/lzip/xz_inadequate.html


> ... you mean compression package used by most big distros to make their packages? Do you really need to ask

Because of the existence of gzip, zlib, bzip2, and many others, it's trivial to drop xz.

So yeah, why not get rid of xz completely?


From reading comments here alone, I can predict the following (unhealthy) effects on software at large:

first, huge scaremongering like this writeup. It all hinges on a notion that who knows who is writing walls of who knows what code, and the code and its purpose is absolutely impenetrable, inscrutable by anyone else at all. It's not true, as evidenced by analyses of this hack alone and by the reverse engineering community at large. Of course, it requires doing what Andres has done, meaning rolling up your sleeves and actually reading the source and trying to understand it, and not just hoping someone else will do it. Whatever one person tangles, another can always untangle.

Then, there is going to be a witch hunt and people jumping on any innocent change with pitchforks (already happening; the satanic panic over ibus is in a nearby discussion). The code review theater will be in full swing. Previously, people were conserving their brain energy (or masking their incompetence) by skimming walls of changes and stamping LGTM on them based on how well-formed they were, how long they had known the author, and whether the builds were not broken. Now it will become extremely hard to get any changes done at all, because the new mental shortcuts will be: too long, didn't approve; explanations too complex; just plain Reviewer Says No; and so on. Any sloppy PR denial will be followed by patting oneself on the back: look at me, I have just thwarted a KGB agent.

Some people thinking they know a lot, without basing such assertions in reality, will try to become overnight "wonder experts in security", barking at every shell script they don't understand, or every piece of generated text, like generated Makefiles.

Vulnerabilities akin to CVE-2022-3786 and CVE-2022-3602 — which got introduced by writing a whole new email address parser from scratch (which IIRC also got checked in as a whole wall of code at once, and I read someone blaming exactly this as the culprit) — will lead to questioning by police at least once.

Automated codebase scans with bogus reports like Daniel Stenberg wrote about in [0] will become even more abundant. Everyone will just jump on any "unsafe" function call without actually understanding its context, and will keep pestering authors to "make a fix" because potentially something (gasp!) may happen.

Later everyone will get tired of this charade, and everything will come back to "normal", probably with added processes and red tape to make FOSS maintainership even more of a liability than it is today. Nothing will be done at all to make it less of a burden.

[0] https://daniel.haxx.se/blog/2024/01/02/the-i-in-llm-stands-f...


The US spends $30-60 billion a year on agriculture and subsidies [1]. This is controversial for all the obvious reasons. Without subsidies we'd waste less food, but overproduction of food is an intentional objective of these programs. Why? Because if there's a major drought or crops are lost to ice or snow or flooding, Americans won't starve. It's why we have things like the US government having a reserve of over a billion pounds of cheese [2].

People not starving is a national security interest.

It's getting to the point where the software we rely on is also a national security interest. The US government should be paying to maintain and improve this software. Security risks in Linux and core packages threaten to shut down key infrastructure.

Paying developers to maintain open source projects that are actually used could be an incredibly effective use of tax dollars.

[1]: https://usafacts.org/topics/agriculture/#581290c9-a960-49aa-...

[2]: https://www.deseret.com/2022/2/14/22933326/1-4-billion-pound...


It's not valid to regard the potential of negative consequences in any sphere of human life as a "national security risk".

It's not valid to presume that the only way to mitigate risks is through top-down political intervention.

And it's absolutely not valid to presume that making FOSS communities dependent on political subsidies would not have much worse and much longer term consequences than the problems we are trying to mitigate.

In fact, it's entirely possible that the political intermediaries who control the purse-strings would have an even greater capacity to introduce their own backdoors or otherwise compromise security in pursuit of their own ambitions.

What you're proposing here might well represent trying to keep out one set of threat actors by handing the keys to the castle over to another set of threat actors. And this would be on top all of the other problems it would cause: convergence toward homogeneous monocultures, project priorities being distorted by political incentives, vested interests using political influence to suppress competition from FOSS projects, etc.

It's worth pointing out that the swift detection and remediation of the xz backdoor by the community almost immediately after the threat actor pulled the trigger on their two-year long con represents a resounding success of the FOSS "many eyes" model, and it's not clear what politicians throwing money around would add to the equation.


Your argument falls apart when you remember that the U.S. federal, state, and local governments are critically dependent on open source software – not just directly in things like Linux servers or Chrome/Edge/Firefox but also open source components used in appliances or compiled into commercial software. It is quite reasonable to argue that even a narrow approach of improving only components they run would be justifiable on those grounds and it’d be a tiny part of, say, NIST’s budget to fund developers directly or to pay some group like the Linux or Apache foundations to support an open source maintainership team.


Organizations of all types -- government and otherwise -- are dependent on a wide variety of externally-sourced solutions for mission-critical operations. They can and do develop their own processes for testing and vetting potential solutions against their own criteria for performance, reliability, maintainability, and security.

Government orgs can and do contribute the results of the work they do in this regard upstream to FOSS projects. This has never not been the case, and when government-employed developers release the work they do to meet their own security requirements to the broader community, everyone benefits.

But this is drastically different from the scenario that the preceding poster was proposing, in which government officials would assume effective responsibility for the entire project, not just act as participants in the FOSS community.

That proposal would invert the situation, and change it from government devs adhering to the norms and conventions of the community to the community adhering to the rules and priorities defined by the government, which is where the negatives I outlined above would come into play.


I fail to see how you addressed the previous comment here. You ignored 80% of the points made.

Donations are fine, but they need to be "no strings attached"; otherwise I agree with the GP that the risk of weaponising FOSS may become even greater.

I do agree that the US government is critically dependent on FOSS by now though. But “why throw money at it if it ain’t broken?” is the prevalent mentality, especially when everyone can have their own definition of “broken”…


The first sentence was a category error. It is a national security issue as long as systems which national security depends on are running the code in question. I totally agree on the FOSS side, but consider that the proposal could be as simple as having, say, NIST give the Linux Foundation money annually to pay a supporting team of developers rather than the government taking over maintenance of anything.


> It is a national security issue as long as systems which national security depend on are running the code in question.

This is itself a category error. By this standard anything and everything upstream of certain government agencies implicates "national security", and you might as well say that the supply chain for staplers is a national security issue because government officials staple documents together, or that the manufacture of socks is a national security issue because government agents wear socks.

Naturally, it's up to organizations themselves to make sure that their particular usage of any specific resource meets their exceptional security needs, not to expand the definition of "national security" to encompass the entire upstream supply chain independently of their use cases.

> NIST give the Linux foundation money annually to pay a supporting team of developers rather than the government taking over maintenance on anything.

This in itself would create a nexus of influence that could ultimately function as a vector of social engineering attacks, including from factions within our own government. It's just not cut-and-dried enough to presume that government money is some sort of magic solution and wouldn't itself actually make things worse.


> you might as well say that the supply chain for staplers is a national security issue because government officials staple documents together, or that the manufacture of socks is a national security issue because government agents wear socks.

You’re leaving out the key part: this only works if there’s a way for a flaw in those staples or socks to impact national security functions. Once you correctly make the analogy you can easily see why that’s true for a server operating system but not either of your examples.


What if a threat actor embedded secret listening devices into staplers? What if wool socks specifically designed to maximize ESD discharge were used to disable sensitive equipment?

Organizations that are concerned with outlandish risks like these implement their own policies and procedures to safeguard against them. They might x-ray office supplies before allowing their use in secure facilities; they might maintain a short list of approved fabrics in an ESD-sensitive environment. The point is that organizations that have exceptional security requirements apply their own policies and procedures to mitigate risk, and don't expect parties upstream of them to do so for them.


Socks and staplers are not part of the security infrastructure. Computers are, and that means it’s in the interests of everyone to keep them secure.


"Computers" is too broad to be meaningful. Compression tools are no more a part of the security infrastructure per se than socks and staplers are.


It's worth pointing out that we got lucky with a savvy tester stumbling on the backdoor in the dark. It wasn't anyone's job to find this backdoor, and arguably if it had been designed just a little better, no one would have noticed.

I wouldn't have gov't be maintainers of the main repos; rather, I'd have them either assigned to vetting critical repos, or mirroring them and co-maintaining copies endorsed for security (if I agreed gov't should be involved).

The ultimate question is: are our critical systems (kernel, systemd, core userland) safe *enough* against future similar attacks that we are okay trusting the global economy to another lucky valgrind test?

You incorrectly implied that a national security risk is "the potential of negative consequences in any sphere of human life". If we hadn't gotten lucky, this could have been an economic and human catastrophe. That's not an inconvenience.
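
For context on the "lucky valgrind test": the discovery reportedly began with a developer noticing that failed sshd logins were burning noticeably more CPU than expected, alongside valgrind errors pointing into liblzma. Below is a minimal sketch of that kind of crude latency check, assuming a local sshd is running; the user and host names are placeholders for illustration, not the actual setup involved.

    import subprocess
    import time

    SAMPLES = 5
    durations = []
    for _ in range(SAMPLES):
        start = time.monotonic()
        # A login attempt expected to fail; BatchMode avoids an interactive
        # password prompt. "nosuchuser" and "localhost" are placeholders.
        subprocess.run(
            ["ssh", "-o", "BatchMode=yes", "-o", "StrictHostKeyChecking=no",
             "nosuchuser@localhost", "true"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.monotonic() - start)

    print(f"mean failed-login wall time: {sum(durations) / len(durations):.3f}s")

This only measures wall-clock latency from the client side; the actual investigation went further, profiling sshd's CPU usage and running it under valgrind. But a jump in a number like this is often the first hint that something upstream changed.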


> It's worth pointing out that we got lucky with a savvy tester stumbling on the backdoor in the dark.

I believe in Bayesian probability a lot more than I believe in luck. The fact that a random sysadmin investigating performance issues was able to rapidly unravel this whole thing is at least a minimal indicator that either (a) this particular attack was largely incompetent, and any of the many "savvy testers" in the community would have uncovered it within short order, but more sophisticated attacks might remain undetected, or (b) this was a sophisticated attack, and the community is generally resilient to this form of infiltration.


I think we’re lucky that the compromise was ham-fisted enough to be trivially detectable. I’m sure other, better attacks exist, implemented by smarter actors.

Software is a national security concern because everything is increasingly dependent on computing. Plenty of civilian tech is being used in Ukraine to good effect.

Smart adversaries are going to watch where stuff is used and target the supply chain. It may be Russians, Mideast actors, or domestic extremists.

Software is to 2024 what a truck bomb was in 1994.


I'm not sure why people keep misidentifying the problem as "lack of funding". Lasse Collin was doing fine as a maintainer up until Jia Tan showed up. He was psy-op'd into believing there was a crowd of angry people eagerly awaiting a new release when there wasn't. No real person was unhappy with the way he'd been maintaining xz.


Funding aside, single individuals being responsible for software is not a good thing; see bus factor.


In fact, there was no issue with Lasse Collin maintaining xz as a single individual, and creating a false impression to the contrary was the primary tactic used by the antagonist to gain access to the project.


Fortunately there was Jia Tan to help him. /s


I'm not saying money would absolutely fix the issue, but I could also see it helping. If Collin was approached by a government that said "Hey, the thing you're maintaining is important, if you want, we'll fund 2 additional full-time maintainers that can contribute based on your guidance", maybe Collin would be in a better position to ensure the Jia Tan contributions were genuine and proper.


There's probably a number of things that could improve the situation. Mindlessly throwing money and government at a problem almost never improves things.

Which government bureaucracy decides how much Lasse Collin should be paid? Based on what metrics? This is a giant can of worms.


If I were the maintainer and was approached by the government telling me, "hey, here are two folks who're going to be two new full-time maintainers and we're funding them," I certainly would be worried.


Similarly, if the government approached me and said "Here, embed this black-box binary into your build process", I'd be worried too. But luckily, no one suggested this, nor what you wrote about :)


Getting help for mental health issues is a whole lot easier if you have the money.


IIRC the xz package was maintained by an individual in a place with at least some socialized healthcare, but correct me if I'm wrong (I'm not trying to be snarky here, please do).


Perhaps socialized medicine is the solution.


Socialized or politicized?


But if he had been paid, he might not have given up control.


Okay, sure, but rather than trying to solve a problem by throwing money at it (or worse, trying to solve it with government intervention), maybe it's better to think of other mitigations.

For example, maybe developers need to be made aware of potential psyops by attackers (the publicity surrounding this issue probably made some progress on that front).


He never mentioned any financial problems; it was more about his mental health.


And would mental health issues have kept him from working on xz if that had been part of his day job?


I’ve been in similar burnout situations, and the difference between work and side projects did not matter to my mental health. The money was not an issue, it was headspace and fatigue.


Agricultural subsidies are a terrible example because they're so corrupt. The cheese reserve doesn't exist because it's gonna protect Americans from starvation. It exists because the dairy industry massively overproduces. Only a small amount gets converted into cheese. Millions of gallons just get dumped[1]. The dairy industry is entirely unsustainable, especially given that the government keeps the price of milk low because it's considered vital for children's development, a claim that is dairy lobby propaganda with no basis in fact. And these massive subsidies don't even help small farmers. The only way to keep up with the absurd artificial demand is massive factory farming of genetically engineered super cows bred in a lab to produce as much milk as possible, with a life expectancy a third that of a normal cow.

[1]: https://www.wsj.com/articles/americas-dairy-farmers-dump-43-...

(I guess this isn't really relevant to the OP, but I recently read a fantastic book about the dairy industry and now I can't shut up).


Why are you advocating for the government to take control of open source projects? Is anyone here naïve enough to believe that, after being persuaded of the national security interests of these projects, they're just going to hand over money to the random people who maintain them to keep doing what they're doing? The U.S. Govt isn't Santa Claus. Look at what they did to the farming industry. Most people used to be farmers and now there aren't many farms at all, since most of it's being done by big companies. Applying that idea to open source means the government would use regulation to prevent community developers from having their software used in production, and all future work on open source code would have to be done by engineers at big tech companies. In many ways that's already the de facto system we have today. So if you get the government involved, it'll just become law, and the lone wolves in open source who big tech doesn't want to hire will be fined, sent to jail, etc. Read "Everything I Want To Do Is Illegal" by Joel Salatin.


It seems very stupid to me.

How long will it be before the government starts pressuring these maintainers to do the things the government wants?


For example, introducing their own backdoors.


The EU realised this like a decade ago, and has had a couple of programs around it that have been small steps in the right direction, but not enough - such as EU-funded bug bounty programs, grants, and mandates that the EU should use specific open source tooling for specific needs (e.g. VLC).


Doesn't .gov do a bit of that already?

Part of the problem is, no one knows who's really working on what. If you asked some of the most knowledgeable people in the GNU/Linux ecosystem who maintained xz before last week, there's a real chance they couldn't have told you, not without some investigating first. And that would only have gotten them a name, not the maintainer's personality, resource situation, etc.

There needs to be a census of sorts over the stuff that goes into the GNU/Linux ecosystem to see who needs what.


It may mean paying for the maintenance of minimal (including the SDK) but good-enough, ultra-stable-over-time reference software, network protocols, and file formats.

That excludes nearly all software out there (even open source; closed source is de facto excluded), because most "developers" are just a bunch of scammers heavy on planned obsolescence.

That includes software "maintained" by the academic sector, like what you have with the MIT Media Lab and the nice "donations" from Bill Gates (probably to steer it the way he wanted; I would not be surprised if this is not alien to C++ in GCC... one of the biggest mistakes in open source software), as revealed in the Epstein files.

To say the least, it is far from ez. If you are an _honest_ dev, you know it is excruciatingly hard to justify a permanent income.



