One of the best explanations ever written is the now classic "Cult of the Dead Cow issue #351 - The Tao of
Windows [NT|95|98] Buffer Overflow" @ http://www.cultdeadcow.com/cDc_files/cDc-351/
It's also worth noting that you can avoid the bounds checks safely in Rust as well. For example, if you're iterating over elements, you don't pay the cost of a bounds check on each element, because the compiler can know statically that the accesses are safe.
Also of note is the lack of investment in exploit mitigation at the operating system and compiler level. C++ and C are not going away from our systems any time soon. Yet exploit mitigation technology still remains on the sidelines of the security world while bug squashing, AV, and IDS are flush with cash.
How realistic is a compiler-level change that is 100% transparent and backward compatible with existing codebases? I suspect this is a tough nut to crack without making fundamental changes to the language and perhaps breaking old codebases. You might as well move to a different language then.
I suspect we'll have a Rust compiler that produces fast applications with only a very minor performance hit long before anything like that happens for aging languages like C/C++.
The question is not just when we will have the tech, though. It is also when the tech will have an impact. A perfect Rust compiler only improves the safety of new (or rewritten) code.
I daresay that some codebases rely on overflow behaviour even if their authors don't know it. I've worked with a codebase where the assumption was that unused RAM was always zeroed out. Moving to a different platform (Linux) broke that assumption, and there was lots of weird behaviour in the codebase because of it.
I'll bet there are lots of legacy codebases that would actively resist using such a compiler technology for C/C++ because of unknown behaviours happening in their codebase (and because there is no test suite, or just poor test coverage).
Certainly, though usually that winds up perceived as brittleness moving between compiler versions in general. Ideally, the tech we are discussing could help make it easier to understand those unknown behaviors.
But even if not, and even if this accounts for 60% of existing C/C++ codebases, improving the remaining 40% is a huge win that would take tremendously more time to reimplement in Rust.
This is not to say that I am not also very excited about Rust.
Legacy codebases like that tend to run on outdated operating systems, so addressing issues with buffer overflows in the software would probably do little to improve the security of the systems as a whole.
I used to use Aleph One's article as a kind of shibboleth when interviewing candidates for security researcher positions. I'd just casually say "Smashing the stack..." and see if they would fill in the rest of the title.
Not the best data point, but it showed pretty quickly who had a self-taught (and usually practical) understanding of security vulnerabilities versus a purely academic one.
The worst part is that this error only happens because we are trying to save 4 bytes. If all arrays also stored their own size, we could eradicate this class of bug with a limited memory impact.
You have overlooked the CPU cost. Given that most memory objects already have at least an over-approximate size somewhere, and how we are more than happy to annotate stacks with canaries, I really don't think anyone is as concerned with the memory cost of adding a size to every array as the CPU cost of adding a read and compare to every array access.
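To make the trade-off concrete, here is a minimal C sketch of the idea (the fat_array type and checked_get helper are invented for illustration): the size travels with the pointer, and every checked access pays exactly the extra compare-and-branch that the CPU-cost concern is about.

    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical "fat" array: the length travels with the pointer. */
    typedef struct {
        int    *data;
        size_t  len;
    } fat_array;

    /* Checked access: one extra compare-and-branch per element read. */
    static int checked_get(fat_array a, size_t i)
    {
        if (i >= a.len) {
            fprintf(stderr, "out-of-bounds access: %zu >= %zu\n", i, a.len);
            abort();
        }
        return a.data[i];
    }

    int main(void)
    {
        int backing[4] = {1, 2, 3, 4};
        fat_array a = { backing, 4 };

        printf("%d\n", checked_get(a, 3));  /* fine */
        printf("%d\n", checked_get(a, 4));  /* aborts instead of silently overflowing */
        return 0;
    }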
I think that the CPU cost could be reduced if the CPU had some kind of instruction for array access:
if you 'hardcode' that an 'array' has its address stored in register Rx and its length stored in Rx+1, you can easily have an instruction "checked_load Ry <- Rx, <index>". This instruction could have a latency similar to a normal load, because it could check the index against Rx+1 in parallel with the load (1).
And if you have a 'load aligned double word' instruction which loads two consecutive words into Rx and Rx+1 (aligned, to avoid the thorny issue of a page fault in the middle of an instruction), you can reduce the cost of having to manage both the array address and its length.
Sure, this isn't free: more registers used, more cache used. Still, it doesn't look THAT bad from a performance POV, nor very complex to implement. Is there something I'm not understanding?
(1) And if the check fails, a trap is executed, which is compatible with C/C++ semantics.
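Purely as a software sketch of what that hypothetical instruction would do (ordinary function arguments stand in for the register convention, and abort() stands in for the trap):

    #include <inttypes.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Software model of the hypothetical "checked_load Ry <- Rx, <index>":
     * base stands in for Rx, len for the value in Rx+1, and abort() for the
     * trap in (1). Real hardware could overlap the compare with the load. */
    static inline uint32_t checked_load(const uint32_t *base, size_t len, size_t index)
    {
        if (index >= len)
            abort();
        return base[index];
    }

    int main(void)
    {
        uint32_t a[4] = {10, 20, 30, 40};
        printf("%" PRIu32 "\n", checked_load(a, 4, 2));  /* in bounds: loads */
        printf("%" PRIu32 "\n", checked_load(a, 4, 4));  /* out of bounds: traps */
        return 0;
    }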
I think the problem a lot of people would have is that introducing new hardware, which costs millions to design and bring to market, when it's possible to write code that isn't prone to overflow... just doesn't seem like a good idea.
When I was in school, my professor for CPU Architecture had a good saying: "If you're going down to NAND gates to debug your software you're doing something wrong".
I think another issue is that arrays, trees, and other structures are a complete abstraction. The CPU doesn't need to know a thing about them for them to work, because all it actually sees is addresses and instructions (I get that I'm grossly oversimplifying things), and so introducing that type of "check" at the hardware level is pushing the abstraction down, and I don't think that's a good idea either.
But I do think that's one of the great reasons for using a higher level language, where there are array bounds checks, etc.
To me, C/C++ is like "pretty" assembly language (now with new features like Undefined Behavior!!!), and I like that about it. But I'm not sure how other people feel.
Anyways, that's just my take on it, take it with a grain of salt.
I am not necessarily suggesting checking array boundaries on every array access, but at least providing an easy and safe way to refer to the array size.
Array boundary checking is, I think, desirable, but I can understand that for some applications it is an unacceptable performance hit.
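For what it's worth, the usual C idiom for referring to an array's size only works until the array decays to a pointer, which is part of why people want the size attached to the array itself. A small illustration (the ARRAY_LEN macro and takes_pointer function are just made up for the example):

    #include <stdio.h>

    #define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

    /* Inside a function the "array" parameter is really just a pointer,
     * so sizeof reports the pointer size, not the element count. */
    static void takes_pointer(int a[16])
    {
        printf("inside callee: %zu\n", sizeof(a) / sizeof(a[0]));  /* 2 on a typical 64-bit machine */
    }

    int main(void)
    {
        int a[16] = {0};
        printf("at definition: %zu\n", ARRAY_LEN(a));  /* 16, as expected */
        takes_pointer(a);
        return 0;
    }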
> Array boundary checking is, I think, desirable, but I can understand that for some applications it is an unacceptable performance hit.
Given the severe potential security implications for not using it, enabling bounds checking is a sensible default in my opinion. People always mention the speed of C for example but ignore what is being sacrificed for it.
You can get pretty strong protection with AddressSanitizer (technically it works a bit differently). It doesn't come for free, however.
It's much faster and better than anything that came before it (e.g. Valgrind), but the performance impact is still significant. Still, I think AddressSanitizer is an amazing project that should definitely get more publicity.
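As a rough illustration of the kind of bug it catches, assuming a gcc or clang build with the -fsanitize=address flag:

    /* Build with: cc -fsanitize=address -g overflow.c -o overflow */
    #include <stdlib.h>

    int main(void)
    {
        int *buf = malloc(4 * sizeof(int));
        buf[4] = 42;    /* one element past the end: ASan reports a heap-buffer-overflow */
        free(buf);
        return 0;
    }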
What!? The size of the array is always known to the programmer in C. In fact, if the programmer wants to get rid of that information, he/she must deliberately do so. Once one does that, the array becomes useless.
You also haven't explained how that size information, which we already have, would magically solve the problem.
A large share of overflows comes from accessing an array beyond its size. If the size were already contained in the array itself (think of C# arrays, which always have a Length property), developers could build more robust code that wouldn't rely on the size of the array being inferred from some other part of the code.
Many, many, many moons ago, C strings won out over Pascal strings. If the competition was somehow re-run from scratch today, with modern machines, it would be no contest: Pascal strings would win. But there was a time when every byte was precious. And not just "a bit precious", but very precious.
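Roughly, the difference looks like this; here's a length-prefixed "Pascal-style" string sketched in C (the pstring type and make_pstring helper are invented for the example):

    #include <stdio.h>
    #include <string.h>

    /* A C string is just bytes terminated by a NUL: the length is implicit,
     * and finding it means walking the whole buffer.
     * A Pascal-style string stores the length up front, so the bounds are
     * known without scanning, at the cost of an extra byte or two. */
    typedef struct {
        unsigned char len;      /* classic Pascal strings spent a single byte here */
        char          data[255];
    } pstring;

    static pstring make_pstring(const char *s)
    {
        pstring p;
        size_t n = strlen(s);
        if (n > sizeof(p.data))
            n = sizeof(p.data);
        p.len = (unsigned char)n;
        memcpy(p.data, s, n);
        return p;
    }

    int main(void)
    {
        const char *c_str = "hello";        /* 5 bytes of data + 1 NUL byte    */
        pstring p = make_pstring("hello");  /* 5 bytes of data + 1 length byte */
        printf("%zu vs %u\n", strlen(c_str), (unsigned)p.len);
        return 0;
    }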
You're headed in the right general direction. The replies to you are misleading and show the common problem of INFOSEC people having no knowledge of what was achieved in the past or present for immunizing systems. Systems as far back as 1961 were immune to this, with that one using 3 bits to do it [plus code/data separation]. The CPU cost, when done in parallel, is almost zero. See my main comment.
There have been so many critical vulnerabilities attributed to that... and in spite of tons of countermeasures it is still a potential danger. I've known projects that decided to use Java instead of C++ (since Java has array bounds checks) mostly to limit the security attack surface.
Maybe if we could start CPU architecture from scratch, arrays with sizes would make more sense? Maybe we could even do it so that memory segmentation would no longer be needed:
https://en.wikipedia.org/wiki/X86_memory_segmentation
Haven't figured out the details, but maybe we can even gain some performance that way.
Friendly reminder: it's not just a choice between C and Java. In order to get bounds checking on arrays it's not necessary to give up the advantages of native programming and use a virtual machine.
Rust is obviously a great example. So is Go. Bounds-checked arrays in languages that compile to native code are not a new thing.
Also see "Capability-Based Computer Systems" by Levy, a free book. Pointer protection by itself can give all kinds of benefits. A modern take on that is Cambridge's CHERI processor, which is open source and already runs FreeBSD. :)
"The buffer overflow has long been a feature of the computer security landscape. In fact the first self-propagating Internet worm—1988's Morris Worm—used a buffer overflow in the Unix finger daemon to spread from machine to machine."
I thought the daemon exploited was sendmail? ... quick search later ... so fingerd was the one used. Good explanation here: Eugene H. Spafford, "The Internet Worm Program: An Analysis", http://spaf.cerias.purdue.edu/tech-reps/823.pdf
If you're thinking that memory for the program is less likely to be contiguous, then, yes, that is a way it may end up being less likely. But there are already techniques that do exactly that for systems with virtual memory (see https://plasma.cs.umass.edu/emery/diehard.html). But as long as you have a stack frame contiguous with a previous stack frame, you're susceptible to buffer overflows, and I think that will be likely even without virtual memory. (That is, the stack frames are not that large, and are likely to be well within the contiguous block size for the memory system.)
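The classic shape of the problem, assuming the usual contiguous downward-growing stack; this is only meant to show the adjacency, not to be a working exploit:

    #include <stdio.h>
    #include <string.h>

    /* With a typical contiguous stack layout, writing past `buf` clobbers
     * whatever the compiler placed next to it: other locals, the saved
     * frame pointer, or the return address. */
    static void vulnerable(const char *input)
    {
        char buf[16];
        strcpy(buf, input);     /* no bounds check: a longer input overflows buf */
        printf("%s\n", buf);
    }

    int main(void)
    {
        vulnerable("this string is considerably longer than sixteen bytes");
        return 0;
    }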
As long as you have the ability to address memory directly, I would say that it's possible. I'm not 100% sure about how memory is allocated though. Outside of virtual memory, are the actual chunks of memory sequential, or would overflowing an offset get you a chunk of another process? (I imagine the answer might be implementation dependent)
Seriously, this is old news, 20 years too late, and not even good enough. Check out the Phrack article "Smashing the Stack for Fun and Profit", written in '96. That is what opened the floodgates, with sample code to boot.
The Burroughs architecture (1961) had a CPU that bounds-checked array & pointer accesses. It also differentiated code and data to prevent... all kinds of security issues. Total memory overhead was 6%, with almost zero CPU overhead in such a design if you run the security checks in parallel with the rest of the pipeline and only commit a write if the security check passes. This is the modern approach.
No matter what naysayers claim, the ability to inexpensively immunize a system against buffer and stack attacks has existed for decades while being deployed commercially on 1960's-1980's era hardware. The market went for what was backward compatible with IBM, UNIX, and later DOS software plus highest raw performance at lowest price. Systems designed for excellent security and maintenance (esp written in HLL) had low market share with many getting acquired, disappearing, or having security advantages removed in later releases. It's a market thing, not a technical thing.
Here's a modern example, the SAFE architecture, that's being developed by BAE & academia.
It's actually more heavyweight, as it tries to fix... computing in general, haha. It specifically mentions buffer overflows and stack risks along with the ways they try to immunize against them. The atomic groups with associated processing are actually lightweight, a la the Burroughs model. They alone provide quite a bit of protection, seeing how many types of attack leverage pointer, array, or stack flaws. The other tag will cause more of a hit but could do type enforcement of calling functions. That's also very helpful given that interface errors underlie most attacks on systems. I've suggested porting the Oberon System and compiler to this architecture to get a fully usable box with automated protection, and then seeing what attacks can still get through.
In any case, buffer overflows have been preventable since before they even had a name, back when they were just an example of "insufficient argument validation" (MULTICS security evaluation). They only exist because (a) the market wants them to exist via how it votes with its wallet and (b) legacy software refuses to take the hit that automated protections create. The latter situation improved greatly with tools such as SoftBound + CETS, which enforce full memory safety on C code with an average 50% performance hit; more limited tools such as Control Pointer Integrity, which protects pointers with a tiny hit; and tools such as ASTREE/SPARK, which prove C/Ada code free of many types of errors that lead to crashes or hacks. Pretty much no uptake.
There are counterexamples though. I've seen a reverse stack on x86, a DNS server (IRONSIDES) written in SPARK, security kernels that extensively used segmenting for OS-level POLA, a web application server reducing risk w/ FSM's + a separation kernel, and recently MirageOS building a whole stack in a language that prevents so many issues at compile-time while designing for isolation in partitions. So, we can do it better, some are doing it better, and I encourage others to do the same. Look for work that knocks out issues, use it, improve it, and get our baseline up to at least OK instead of FUBAR (current status).