How security flaws work: the buffer overflow (arstechnica.com)
154 points by guardian5x on Aug 26, 2015 | 45 comments



One of the best explanations ever written is the now classic "Cult of the Dead Cow issue #351 - The Tao of Windows [NT|95|98] Buffer Overflow" @ http://www.cultdeadcow.com/cDc_files/cDc-351/


>but if you don't have automatic bounds checking like Java, I guarantee you that those 'A's are going somewhere unfortunate.

20 years later and we're still using C++. I hope Rust takes off. The status quo today is terrible.
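
To make the quote concrete, a minimal sketch of the classic bug (hypothetical code, not from the article): a fixed-size stack buffer filled by an unchecked copy, so a long enough run of 'A's spills past the buffer into whatever sits next to it on the stack.

    #include <string.h>

    void greet(const char *name) {
        char buf[16];            /* fixed-size buffer on the stack */
        strcpy(buf, name);       /* no bounds check: a long name overflows buf */
    }                            /* ...and can overwrite the saved return address */

    int main(int argc, char **argv) {
        if (argc > 1)
            greet(argv[1]);      /* e.g. ./a.out AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA */
        return 0;
    }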


It's also important that you can avoid the bounds checks safely in Rust. For example, if you're iterating over elements, you don't pay the cost of a bounds check on each element, because you can know statically that it's safe.
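
To illustrate the idea in C terms (a hypothetical sketch, not Rust): a per-access bounds check looks like the first function below, but when the loop bound is the length itself, the check can never fail and can be dropped entirely; that's essentially what iterating buys you.

    #include <stddef.h>

    /* Per-element bounds check: one compare paid on every access. */
    long sum_checked(const int *a, size_t len, size_t n) {
        long s = 0;
        for (size_t i = 0; i < n; i++) {
            if (i >= len) return -1;   /* the check you pay for each access */
            s += a[i];
        }
        return s;
    }

    /* Iterating over exactly len elements: the loop condition already
       proves i < len, so no per-element check is needed. */
    long sum_all(const int *a, size_t len) {
        long s = 0;
        for (size_t i = 0; i < len; i++)
            s += a[i];
        return s;
    }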


Also of note is the lack of investment in exploit mitigation at the operating system and compiler level. C++ and C are not going away from our systems any time soon. Yet exploit mitigation technology still remains on the sidelines of the security world while bug squashing, AV, and IDS are flush with cash.


How realistic is a compiler-level change that is 100% transparent and backward compatible with existing codebases? I suspect this is a tough nut to crack without making fundamental changes to the language and perhaps invalidating past codebases. You might as well move to a different language then.

I suspect we'll have a Rust compiler that compiles fast applications with a very minor performance hit before this happens to aging languages like C/C++.


The question is not just when we will have the tech, though. It is also when the tech will have an impact. A perfect Rust compiler only improves the safety of new (or rewritten) code.


I daresay that some codebases might rely on overflowing behaviour even if they don't know it. I've worked with a codebase that assumed unused RAM was always zeroed out. Moving to a different platform (Linux) invalidated that assumption, and there was lots of weird behaviour in the codebase because of it.

I'll bet there are lots of legacy codebases that would actively resist using such a compiler technology for C/C++ because of unknown behaviours happening in their codebase (and because there is no test suite, or just poor test coverage).


Certainly, though usually that winds up being perceived as brittleness when moving between compiler versions in general. Ideally, the tech we are discussing could help make it easier to understand those unknown behaviors.

But even if not, and even if this accounts for 60% of existing C/C++ codebases, improving the remaining 40% is a huge win that would take tremendously more time to reimplement in Rust.

This is not to say that I am not also very excited about Rust.


Legacy codebases like that tend to run on outdated operating systems, so addressing issues with buffer overflows in the software would probably do little to improve the security of the systems as a whole.


Alternatively (just for the concept; it shouldn't work as outlined on modern systems): http://phrack.org/issues/49/14.html


I used to use Aleph One's article as a kind of shibboleth when interviewing candidates for security researcher positions. I'd just casually say "Smashing the stack..." and see if they would fill in the rest of the title.

Not the best data point, but it showed pretty quickly who had a self-taught (and usually practical) understanding of security vulnerabilities versus a purely academic one.


That's a class move and you are probably a cool person to work with.


That was not sarcasm.


Some updates:

Smashing the Stack in 2010 (linux, windows) - http://www.mgraziano.info/docs/stsi2010.pdf

Smashing the Stack in Windows 8 - https://archive.org/details/SmashingW8Stack


I like this one better.


The worst part is that this error only happens because we are trying to save 4 bytes. If all arrays also stored their own size, we could eradicate this bug with limited memory impact.


You have overlooked the CPU cost. Given that most memory objects already have at least an over-approximate size somewhere, and that we are more than happy to annotate stacks with canaries, I really don't think anyone is as concerned with the memory cost of adding a size to every array as with the CPU cost of adding a read and a compare to every array access.


I think that the CPU cost could be reduced if the CPU had some kind of instruction for array access: if you 'hardcode' that an 'array' has its address stored in register Rx and its length stored in Rx+1, you can easily have an instruction "checked_load Ry <- Rx, <index>". This instruction can have a latency similar to a normal load instruction, as it could check the value of the index against Rx+1 in parallel with the load (1).

And if you have a 'load aligned double word' instruction which loads two consecutive words into Rx and Rx+1 (aligned, to avoid the thorny issue of a page miss in the middle of an instruction), you can reduce the cost of having to manage both the array address and its length.

Sure, this isn't free: more registers used, more cache used. Still, it doesn't look THAT bad from a performance POV, no? Nor very complex to implement. Is there something I'm not understanding?

(1): and if the check fails, a TRAP is executed, which is compatible with C/C++ semantics.
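
A rough software analogue of what that instruction would do (a sketch with made-up names; __builtin_trap is the GCC/Clang builtin standing in for the hardware fault):

    #include <stddef.h>
    #include <stdint.h>

    /* Software sketch of the proposed "checked_load": base and length travel
       together (like Rx and Rx+1), and an out-of-bounds index traps. */
    static inline uint32_t checked_load(const uint32_t *base, size_t len, size_t idx) {
        if (idx >= len)
            __builtin_trap();   /* hardware version would raise the fault in parallel with the load */
        return base[idx];
    }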


I think the problem a lot of people would have is that introducing hardware, which costs millions to design and bring to market, when it's possible to write code that isn't prone to overflow... just doesn't seem like a good idea.

When I was in school, my professor for CPU Architecture had a good saying: "If you're going down to NAND gates to debug your software you're doing something wrong".

I think another issue is that arrays, trees, and other structures are a complete abstraction. The CPU doesn't need to know a thing about them for them to work, because all it actually sees is addresses and instructions (I get that I'm grossly oversimplifying things), and so introducing that type of "check" at the hardware level is pushing the abstraction down, and I don't think that's a good idea either.

But I do think that's one of the great reasons for using a higher level language, where there are array bounds checks, etc.

To me, C/C++ is like "pretty" assembly language (now with new features like Undefined Behavior!!!), and I like that about it. But I'm not sure how other people feel.

Anyways, that's just my take on it, take it with a grain of salt.


Intel is adding hardware array bounds checking called MPX in Skylake, so this could become a reality.


There already was a BOUND instruction. Nobody ever seems to remember this when they propose adding such an instruction.


I presume it would only be used for code compiled for Skylake?


I am not necessarily suggesting checking array boundaries on every array access, but at least providing an easy and safe way to refer to the array size.

Array boundary checking is I think desirable but I can understand that for some applications it is an unacceptable performance hit.


> Array boundary checking is I think desirable but I can understand that for some applications it is an unacceptable performance hit.

Given the severe potential security implications of not using it, enabling bounds checking is a sensible default in my opinion. People always mention the speed of C, for example, but ignore what is being sacrificed for it.



You can get pretty tough protection with AddressSanitizer (technically it works a bit differently). It doesn't come for free, however. It's much faster and better than anything that was there before (e.g. Valgrind), but the performance impact is still significant. Still, I think AddressSanitizer is an amazing project that should definitely get more publicity.
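
For anyone who hasn't tried it: it's mostly just a compiler flag (this sketch assumes GCC or Clang), and the instrumented program aborts with a heap-buffer-overflow report at the bad write instead of silently corrupting memory.

    /* Compile with: gcc -g -fsanitize=address overflow.c   (clang works too) */
    #include <stdlib.h>

    int main(void) {
        char *buf = malloc(16);
        buf[16] = 'A';      /* one past the end: ASan reports heap-buffer-overflow here */
        free(buf);
        return 0;
    }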


What!? The size of the array is always known to the programmer in C. In fact, if the programmer wants to get rid of that information, he/she must do so deliberately. Once one does that, the array becomes useless.

You also haven't explained how that size information, which we already have, would magically solve the problem.


A large share of overflows comes from accessing an array beyond its size. If the size of the array were already contained in the array (think of C# arrays, which always have a Length property), developers could build more robust code that wouldn't rely on the size of the array being inferred from some other part of the code.
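
In C terms that would look something like the sketch below (a hypothetical type, not a standard one): the length travels with the data, so code checks against a->len instead of guessing.

    #include <stddef.h>

    /* Hypothetical array-with-length, so the size travels with the data. */
    struct int_array {
        size_t len;
        int   *data;
    };

    /* Returns 0 on success, -1 if the index is out of bounds. */
    int array_get(const struct int_array *a, size_t i, int *out) {
        if (i >= a->len)
            return -1;
        *out = a->data[i];
        return 0;
    }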


It's not just their size; wouldn't all arrays need to know the address of their first element too? For example, consider a simple char buffer:

    char buf[512];
where we want to read x bytes from a file descriptor starting from an offset other than the first element:

    read(fd, buf + 256, sizeof(buf)); // Whoops, incorrect sizeof, buffer overflow!
Could the compiler still detect this buffer overflow error without knowing the address of the first element in the array?

What about:

    char *dst = buf + 256;
    ...
    read(fd, dst, sizeof(buf));
perhaps the compiler could trace back all the calculations performed on the dst pointer to see whether this operation is valid?


The solution does involve storing a length, and this is called a Pascal string: https://en.wikipedia.org/wiki/String_%28computer_science%29#...

Many, many, many moons ago, C strings won out over Pascal strings. If the competition was somehow re-run from scratch today, with modern machines, it would be no contest: Pascal strings would win. But there was a time when every byte was precious. And not just "a bit precious", but very precious.
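
Roughly the difference, as a sketch: a C string spends a byte on a terminator and its length has to be recomputed by scanning, while a Pascal-style string spends its bytes on an explicit length that every operation can check against.

    #include <stddef.h>

    /* C string: NUL-terminated, length implicit. */
    const char *c_str = "hello";     /* 6 bytes: 'h' 'e' 'l' 'l' 'o' '\0' */

    /* Pascal-style string: length stored up front. */
    struct pstring {
        size_t len;
        char   data[];               /* flexible array member */
    };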


Wow, that's incredible, thanks for the link!


For this case, you can use "fat pointers" that include the size of their destination.
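
A sketch of what a fat pointer could look like for the buffer case above (hypothetical names): the slice carries the address and the number of bytes remaining, so the sizeof(buf) mistake can't happen.

    #include <stddef.h>

    /* Hypothetical fat pointer: address plus the size of what it points at. */
    struct slice {
        char  *ptr;
        size_t len;
    };

    /* Take a sub-slice; the remaining length is computed once, here. */
    struct slice slice_from(char *buf, size_t buflen, size_t off) {
        struct slice s;
        if (off > buflen)
            off = buflen;            /* clamp so the slice never exceeds the buffer */
        s.ptr = buf + off;
        s.len = buflen - off;
        return s;
    }

    /* Usage:
           char buf[512];
           struct slice dst = slice_from(buf, sizeof(buf), 256);
           read(fd, dst.ptr, dst.len);   // can't overflow: dst.len is 256, not 512
    */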


You're headed in the right general direction. The replies to you are misleading and show the common problem of INFOSEC people having no knowledge of what was achieved in the past or present for immunizing systems. Systems as far back as 1961 were immune to this, with that one using 3 bits to do it [plus code/data separation]. The CPU cost, when done in parallel, is almost zero. See my main comment.

https://news.ycombinator.com/item?id=10125348


There were so many critical vulnerabilities attributed to that... and in spite of tons of countermeasures this is still a potential danger. I've known projects that decided to use Java instead of C++ (which has array bounds checks) mostly to limit the attack surface.

Maybe if we could start CPU architecture from scratch, arrays with sizes would make more sense? Maybe we could even do it so that memory segmentation would no longer be needed: https://en.wikipedia.org/wiki/X86_memory_segmentation

Haven't figured out the details, but maybe we can even gain some performance that way.


Friendly reminder: it's not just a choice between C and Java. In order to get bounds checking on arrays it's not necessary to give up the advantages of native programming and use a virtual machine.

Rust is obviously a great example. So is Go. Bounds-checked arrays in languages that compile to native code are not a new thing.


Swift is another new language that has it. It's getting to be the norm now.


Smart thinking. It has already been done in a dozen different ways. Two are cited in my comment:

https://news.ycombinator.com/item?id=10125348

Also see Capability-Based Computer Systems by Levy; it's a free book. Pointer protection by itself can give all kinds of benefits. A modern version of that is Cambridge's CHERI processor, which is open source and already runs FreeBSD. :)

https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/


"The buffer overflow has long been a feature of the computer security landscape. In fact the first self-propagating Internet worm—1988's Morris Worm—used a buffer overflow in the Unix finger daemon to spread from machine to machine."

I thought the daemon exploited was sendmail? ...quick search later... so fingerd was indeed used (the worm also abused sendmail's DEBUG mode, but the buffer overflow was in fingerd). Good explanation here: Eugene H. Spafford, "The Internet Worm Program: An Analysis", http://spaf.cerias.purdue.edu/tech-reps/823.pdf
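
The hole in fingerd was about as simple as they come; per Spafford's analysis it was roughly this pattern (a paraphrased sketch, not the original source):

    #include <stdio.h>

    int main(void) {
        char line[512];
        gets(line);   /* no length limit: an over-long request overruns the 512-byte buffer */
        /* ... look up the requested user ... */
        return 0;
    }

gets() has since been removed from the C standard, in no small part because of bugs like this.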


Are "buffer overflows" possible in systems without without virtual memory?

I know how I would answer this question but I am curious how others would answer it.

EDIT: s/possible/known to occur


Yes. Why would you think not?

If you're thinking that memory for the program is less likely to be contiguous, then, yes, that is a way it may end up being less likely. But there are already techniques that do exactly that for systems with virtual memory (see https://plasma.cs.umass.edu/emery/diehard.html). But as long as you have a stack frame contiguous with a previous stack frame, you're susceptible to buffer overflows, and I think that will be likely even without virtual memory. (That is, the stack frames are not that large, and are likely to be well within the contiguous block size for the memory system.)


As long as you have the ability to address memory directly, I would say that it's possible. I'm not 100% sure about how memory is allocated though. Outside of virtual memory, are the actual chunks of memory sequential, or would overflowing an offset get you a chunk of another process? (I imagine the answer might be implementation dependent)


It's more important to have bounds checked access and to disallow pointer arithmetic. If you do that, pointers (addresses) are fine.



Seriously, this is old news, 20 years too late and not even good enough. Check out the Phrack article "Smashing the Stack for Fun and Profit", written in '96. This is what opened the floodgates, with sample code to boot.


The Burroughs architecture (1961) had a CPU that bounds-checked array and pointer accesses. It also differentiated code and data to prevent... all kinds of security issues. Total memory overhead was 6%, with almost zero CPU overhead in such a design if you run the security checks in parallel with the rest of the pipeline and only commit a write if the security check is OK. This is the modern approach.

http://www.smecc.org/The%20Architecture%20%20of%20the%20Burr...

No matter what naysayers claim, the ability to inexpensively immunize a system against buffer and stack attacks has existed for decades while being deployed commercially on 1960's-1980's era hardware. The market went for what was backward compatible with IBM, UNIX, and later DOS software plus highest raw performance at lowest price. Systems designed for excellent security and maintenance (esp written in HLL) had low market share with many getting acquired, disappearing, or having security advantages removed in later releases. It's a market thing, not a technical thing.

Here's a modern example, the SAFE architecture, that's being developed by BAE & academia.

http://www.crash-safe.org/assets/ieee-hst-2013-paper.pdf

It's actually more heavyweight as it tries to fix... computing in general, haha. It specifically mentions buffer overflows and stack risks along with the ways they try to immunize against them. The atomic groups with associated processing are actually lightweight, a la the Burroughs model. They alone provide quite a bit of protection, seeing how many types of attack leverage pointer, array, or stack flaws. The other tag will cause more of a hit but could do type enforcement of calling functions. It's also very helpful given that interface errors underlie most attacks on systems. I've suggested porting the Oberon System and its compiler to this architecture to get a fully usable box with automated protection. And then see what attacks can still get through.

In any case, buffer overflows have been preventable since before they even had a name, back when they were just an example of "insufficient argument validation" (MULTICS security evaluation). They only exist because (a) the market wants them to exist, via how it votes with its wallet, and (b) legacy software refuses to take the hit that automated protections create. The latter situation improved greatly with tools such as SoftBound + CETS, which enforces full memory safety on C code with an average 50% hit; limited tools such as Code-Pointer Integrity, which protects pointers with a tiny hit; and tools such as ASTREE/SPARK that prove C/Ada code free of many types of errors that lead to crashes or hacks. Pretty much no uptake.

There are counterexamples though. I've seen a reverse stack on x86, a DNS server (IRONSIDES) written in SPARK, security kernels that extensively used segmenting for OS-level POLA, a web application server reducing risk w/ FSM's + a separation kernel, and recently MirageOS building a whole stack in a language that prevents so many issues at compile-time while designing for isolation in partitions. So, we can do it better, some are doing it better, and I encourage others to do the same. Look for work that knocks out issues, use it, improve it, and get our baseline up to at least OK instead of FUBAR (current status).



