The Ken Thompson Hack (c2.com)
114 points by yla92 on Feb 8, 2015 | 39 comments



The reference point is [Trusting Trust]. It was his 1983 Turing Award lecture. Honestly, it and similar materials should be required reading somewhere in university CS and Software Engineering curricula.

[Trusting Trust]: http://cm.bell-labs.com/who/ken/trust.html


In my experience Trusting Trust is required reading in many CS programs.


Well, it wasn't required reading back when I was in CS in the 80s, but it was a discussion that we had in class.

I nervously laugh whenever somebody talks about "open source electronic voting"


> I nervously laugh whenever somebody talks about "open source electronic voting"

Uh, why? I don't think anyone has seriously asserted that open source voting systems would be foolproof. On the other hand, we have actual cases where an open source voting system would at least have made serious flaws public before the system was used in an election.

Besides, when you're talking "Trusting Trust", you have to worry just as much about the hardware you're voting on. Who designed and built that?

Don't let the perfect be the enemy of the good.


> Don't let the perfect be the enemy of the good.

Also, don't replace a secure, well-tested and practically proven system that works (paper and a ballot box) with something that fosters large-scale malice (electronic voting), in the name of modernization or, even worse, of preventing small-scale malice.

Voting is a process with unique constraints not seen elsewhere. Not in banking, not in aviation, you name it. Mainly because each voter must be able to cast his/her vote and verify that it was counted correctly and exactly once, without letting anyone else know what he/she voted for.


With paper and a ballot box, you need to trust the people doing the counting.

And since you usually need a lot of them when it's done manually, it's very likely that there's always some amount of fraud.

Now, it's probably easier to commit fraud if you can hack the voting machines. But let's not say that the current system really prevents fraud, because it doesn't.


Hence my argument about large-scale fraud (only possible with electronics) vs. small-scale fraud. Fraud with paper and ballot boxes is extremely hard to organize at scale, because it would require a couple of orders of magnitude more people to conspire.


Electronic voting has its advantages, and given that this discussion is about trust and security, your laugh relates to the means of ensuring such security, not to the idea of electronic voting itself. The means can and should be treated separately.


As opposed to "closed source electronic voting"?


Perhaps as opposed to not doing it at all.

If you must have assistance for people who can't mark a ballot paper due to disability, have the machine mark their ballot paper and count it as per normal.

Everyone knows what cheating looks like in ballot counting: the evidence of ballot stuffing is harder to hide, easier to spot, and generally understood.

Even if electronic voting could be made absolutely secure, safe and fair, it would not be understood by most voters, and that alone means it really shouldn't be used. But as pointed out, it can't be made secure, so just don't use it. Ban it. Destroy it. Elections must be fairly counted, be seen to be fairly counted, and be understood to be fairly counted.


I read the comment as "open source doesn't ensure trustworthiness" rather than "open source is inherently worse than closed".

It's that even open source would not be sufficient.


You heard right!


Required reference:

David A. Wheeler’s Page on Fully Countering Trusting Trust through Diverse Double-Compiling (DDC) - Countering Trojan Horse attacks on Compilers

http://www.dwheeler.com/trusting-trust/


I don't think he said that he actually put a backdoor in the C compiler. He was just explaining how he could have done it. A very interesting read, though.


In the Eric Raymond-era Jargon File (http://www.catb.org/jargon/html/B/back-door.html) there's a claim that the hacked version was distributed and used at least once.


According to [1], the original Multics trap door that Thompson cites as his inspiration was distributed, to an Air Force installation no less.

Tangentially, I'm more fascinated by the fact that Paul Karger, the lead author on [1], was a principal on a Class A1 secure hypervisor for the VAX, which DEC essentially finished but cancelled in 1991. [2]

[1] (see paragraph 3.1) http://hack.org/mc/texts/classic-multics.pdf

[2] http://www.cs.dartmouth.edu/~ccpalmer/classes/cs55/Content/r...


Thanks for this - I'm reading the Multics retrospective. Do you know of any follow-up work? I'd be very interested to see a detailed account of why some of these features never got implemented.

The paper kind of just leaves it at:

> With the growth of individual workstations and personal computers, developers concluded (incorrectly) that security was not as important on minicomputers or single-user computers.


I don't know of any explicit follow up, but there is a ton of interesting info at multicians.org.

I think that, for multi-million-dollar mainframe timesharing systems, it was easier to make the cost argument for good security, since the customer would pay for it in order to spread the cost of the machine across many users, between whom there was no trust.

But once you got $100k minis and $10k single user micros, why not just buy a second machine?

Of course, things turned out a little bit differently, but from a time before the ubiquitous internet, I can see it making sense to many people.

It's amazing how much more diverse the hardware/OS ecosystem was in say 1985 than it is now. A lot of good stuff has happened since then, but I think a lot of good ideas got lost, or at least are waiting to be dug up again.


John Gilmore (https://en.wikipedia.org/wiki/John_Gilmore_%28activist%29) has related to me that the backdoored version was distributed and persisted in the AT&T production environment for several months before someone noticed the unexpected debug symbols, which had been left in to aid discovery.


The idea of KTH'd binaries being virtually everywhere is fascinating and frightening. But I highly doubt that this is the case.

Sure, in theory, a perfect KTH scheme would be undetectable, since it suborns every means of detection. But in practice it often wouldn't be. A KTH virus would have to anticipate all tools which may be written to detect it, and given the modern open and closed source software world, the complexity would explode.
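
For anyone who hasn't read the lecture, the core trick is a compiler that special-cases two kinds of input: the login program and the compiler itself. Here's a toy sketch of the shape of it in C; the string matching and the compile_* helpers are made-up stand-ins, not anything from the actual paper, and real trigger logic would have to be far more robust:

    #include <stdio.h>
    #include <string.h>

    /* Made-up stand-ins for the real code-generation paths. */
    static void compile_normally(const char *src)            { (void)src; puts("normal codegen"); }
    static void compile_with_backdoor(const char *src)       { (void)src; puts("codegen + login backdoor"); }
    static void compile_with_self_insertion(const char *src) { (void)src; puts("codegen + re-inserted trap"); }

    /* The "compiler": decides what to emit based on what it thinks it is compiling. */
    static void compile(const char *src)
    {
        if (strstr(src, "int login(")) {
            /* Compiling login(1): also accept a fixed secret password. */
            compile_with_backdoor(src);
        } else if (strstr(src, "static void compile(")) {
            /* Compiling the compiler itself: re-insert this whole check,
             * so the trap survives even after every trace of it has been
             * removed from the compiler's source code. */
            compile_with_self_insertion(src);
        } else {
            compile_normally(src);
        }
    }

    int main(void)
    {
        compile("int login(const char *user, const char *pw) { ... }");
        compile("static void compile(const char *src) { ... }");
        compile("int main(void) { return 0; }");
        return 0;
    }

The second branch is the part that defeats source review: a perfectly clean compiler source keeps producing a compiler binary that still carries the trap.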


Not only anticipate, but also guard against. For example, someone asked about a debugger. A specific debugger is simple: patch the debugger's source at compile time to hide the presence of the malware. Of course, that means you have to handle the case where the debugger is used to debug itself, too. That may pose a challenge, but given the existence of quines, I think it is possible. Doing that not just for _a_ debugger but for every debugger in the world is left as an exercise.

Next, someone will look at timing and notice something strange. So, when the OS gets built with your compiler, add code that tweaks the real-time clock to correct for this whenever your binary runs. That way, running 'time' on your binary will not show anything suspicious.

Next, someone may notice a discrepancy between real-time clocks and their OS. Solution? Make sure analog clocks aren't that precise, hook all digital clocks up to a time signal, and tweak that signal if you hit slower code paths that the user shouldn't see. Also, make sure that all other computers in the world slow down whenever one of them hits one of these paths.

Next, someone will notice a discrepancy between clock time and solar time.

For that, the NSA made up the leap second.


The way to detect a KTH is listed in the Wiki article: https://www.schneier.com/blog/archives/2006/01/countering_tr...


It's interesting that when Bruce talks about using an older, simpler alternative compiler as a "checksum" compiler, the first thing I thought of was pcc [1][2], but it comes from exactly the right lab and era, so using it might not actually be a good test.

[1] http://pcc.ludd.ltu.se

[2] https://en.wikipedia.org/wiki/Portable_C_Compiler

EDIT: in case you don't follow the links and realize the implications yourself, from Wikipedia:

"It was very influential in its day, so much so that at the beginning of the 1980s, the majority of C compilers were based on it."


Also interesting re: reproducible builds: http://lwn.net/Articles/630074/


Yup. I'm pretty sure it's provable that a perfect KTH virus must necessarily pass the Turing test.

Not sure if that's a case for or against tinfoil hats.


> a perfect KTH scheme would be undetectable

And would be at least powerful enough to solve the halting problem.

There's nothing to doubt. A 'perfect' KTH virus is impossible.


> A KTH virus would have to anticipate all tools which may be written to detect it

I wonder if you couldn't reduce this to the halting problem to prove that this is actually impossible.

edit: Ah, never mind. It's already been mentioned in the OP that this has to be the case.


How about a debugger? Would it be possible to even hide from it? The inconsistencies would give it away (address discontinuities/size issues)?


If your debugger is built with the compromised toolchain, it can theoretically lie about anything, including that.

Though yes, to accomplish this the original infection would have to basically #include a godlike AI that would be able to intelligently infect all future debugging tools.


I giggle every time I think about how Ken's compilers are alive and well even today ... I mean, who'd trust a compiler written by such a twisted mind? Something to chew on as you use the Go toolchain ;)


A KTH virus doesn't have to be ubiquitous to do huge damage. All you need to do is land a malicious compiler on a machine used to produce widely distributed binaries.


This inspired a major plot element in Ramez Naam's Nexus, a near-future transhuman sci-fi thriller. It's risky to run other people's code on your brain.

http://www.goodreads.com/book/show/13642710-nexus


Could you not define a set of operations on your potentially compromised hardware/software/system that should take time x if the system is uncompromised, and verify with a separate clock that the real time is either x or x+delta? This is based on the assumption that whatever the KTH does must require some amount of computation time to achieve.

Yes, the clock could be compromised too - but building your own clock from scratch is a lot easier than building your own computer, OS, lexer, compiler, etc.

[pre-posting edit - I see others have discussed this possibility too. I thus leave this comment purely as a testament to my own ascension to the rank of "Flaw-spotter".]
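
A minimal sketch of that kind of probe, assuming a POSIX system; the workload, the iteration count and the "external stopwatch" procedure are all arbitrary choices here, and the expected duration would have to be calibrated on hardware you already trust:

    #define _POSIX_C_SOURCE 199309L
    #include <stdio.h>
    #include <time.h>

    /* Fixed workload whose duration you have calibrated on trusted hardware. */
    static unsigned long long workload(void)
    {
        volatile unsigned long long acc = 0;
        for (unsigned long long i = 0; i < 500000000ULL; i++)
            acc += i ^ (acc >> 3);
        return acc;
    }

    int main(void)
    {
        struct timespec t0, t1;

        puts("start an external stopwatch, then press Enter");
        getchar();

        clock_gettime(CLOCK_MONOTONIC, &t0);
        workload();
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double secs = (double)(t1.tv_sec - t0.tv_sec)
                    + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;

        printf("the machine claims the workload took %.3f s\n", secs);
        puts("stop the stopwatch and compare: a compromised system can lie");
        puts("about its own clocks, but not about one it cannot touch");
        return 0;
    }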


I'll repost something another HN poster posted at me in a previous conversation about this. Thompson's attack can be defeated:

   http://www.dwheeler.com/trusting-trust/


I may be reading this wrong, but it appears to rely on trusting at least one (if not more) of the other compilers to not be compromised as well.

Given that so many modern compilers can be traced back to GCC or LLVM these days, this requirement would seem problematic.

EDIT: As pointed out below, it would take a nearly AI-level compromise to defeat this kind of check, but at a theoretical level it would be possible.

Of course, if you control the GCC compiler, you control the Linux kernel, and no programs run on Linux that don't rely on the kernel for reading and writing files...


It does not rely on you trusting any of the compilers to be uncompromised. It relies on you trusting that at least one pair of the compilers is not identically compromised.
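
To make the "pair" idea concrete, here's a toy sketch of the DDC comparison step; the file names (cc-under-test, cc-trusted, cc.c) are made up, and the real procedure in Wheeler's paper also has to pin down every compilation input so the builds are bit-for-bit reproducible:

    #include <stdio.h>
    #include <stdlib.h>

    /* Byte-for-byte comparison of two files. */
    static int same_bytes(const char *a, const char *b)
    {
        FILE *fa = fopen(a, "rb"), *fb = fopen(b, "rb");
        int same = (fa != NULL && fb != NULL);
        while (same) {
            int ca = fgetc(fa), cb = fgetc(fb);
            if (ca != cb) same = 0;
            if (ca == EOF || cb == EOF) break;
        }
        if (fa) fclose(fa);
        if (fb) fclose(fb);
        return same;
    }

    int main(void)
    {
        /* Stage 1: build the suspect compiler's source with the trusted,
         * "diverse" compiler. */
        if (system("./cc-trusted -o stage1 cc.c") != 0) return 1;

        /* Stage 2: use stage1 to build the same source again. */
        if (system("./stage1 -o stage2 cc.c") != 0) return 1;

        /* If cc-under-test really was produced from cc.c (and the builds
         * are deterministic), stage2 must be bit-identical to it. */
        puts(same_bytes("stage2", "cc-under-test")
                 ? "match: binary corresponds to source"
                 : "MISMATCH: binary does not correspond to source");
        return 0;
    }

The comparison only gives a false "match" if cc-trusted carries exactly the same trigger and payload as the compiler under test, which is the "identically compromised" case above.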


At a theoretical level, it wouldn't be too difficult to write a C interpreter from scratch, with enough simple obfuscation that detecting what it is interpreting really would start looking like strong AI.

And even if such a sophisticated program were written and somehow managed to not take up too much CPU time, it would at least probably result in rather large binary size increases in whatever it's injected into... even Linux isn't so big that a few hundred kilobytes wouldn't stick out like a sore thumb.







