What, exactly, do you think OpenBSD can do to prevent an attacker from compromising your responder binary's logic that everyone else isn't already doing? OpenBSD's sandboxing model is superb, but that's not what I'm criticising.
C is a bad choice for handling untrusted input precisely because it makes it very difficult to prevent logic errors that disclose user data in unexpected ways. The security community has done its best to prevent the even more disastrous class of breakout errors that compromise the entire resource (and OpenBSD is great for this, way better than containers).
But as my comment was specifically about the choice of C, I don't feel like I need to sweet-talk an OS I already say lots of nice things about.
Maybe you'd like to respond with all sorts of great literature about how the C spec is not full of holes and gotchas?
And don't even get me started on that "simple example binary."
> it makes it very difficult to prevent logic errors that disclose user data in unexpected ways
Oh that's a new one. But now that you mention it, I'm starting to recall all the operators that will fprintf the contents of ~/.ssh/ to the network socket upon misuse.
Wouldn't it be nice if we could use a simple assert to test the state and inputs before we proceed to shovel private keys in a totally not privsep'd process handling public, unauthenticated connections. Heck, even a simple condition would do, if we could just return an error and stop further processing when things look bad. But you're right, that kind of code would've been too advanced for 1958 or whenever we got this language...
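Something like this would have been possible even back then. A rough sketch, not the actual code; names like handle_request and MAX_REQ are made up, the point is only that untrusted input gets rejected before anything sensitive is touched:

    /* Rough sketch with made-up names: validate untrusted input and
     * bail out before any sensitive data is reachable. */
    #include <stddef.h>

    #define MAX_REQ 16384

    int handle_request(const unsigned char *req, size_t req_len)
    {
        if (req == NULL || req_len == 0 || req_len > MAX_REQ)
            return -1;          /* stop here, never reach the keys */

        /* ... only now proceed to the privileged work ... */
        return 0;
    }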
So, at the root of this bug we have a broken implementation of a circular buffer that fails to account for one case. That is the kind of binary logic error I can repeat in practically every single programming language. I fail to see how C makes it very difficult to prevent this type of error.
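For the record, the classic one-case trap in a ring buffer looks roughly like this, and it looks the same in any language. I'm not claiming this is the exact bug from the disclosure; it's just the textbook version of "forgot one case":

    /* When head == tail can mean either "empty" or "full", one of the
     * two cases tends to get forgotten. Keeping an explicit count
     * removes the ambiguity. */
    #include <stddef.h>

    #define RB_SIZE 1024

    struct ring {
        unsigned char buf[RB_SIZE];
        size_t head, tail, count;
    };

    int rb_put(struct ring *r, unsigned char c)
    {
        if (r->count == RB_SIZE)
            return -1;                      /* full: the easy case to miss */
        r->buf[r->head] = c;
        r->head = (r->head + 1) % RB_SIZE;
        r->count++;
        return 0;
    }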
And on the other hand, we have code that doesn't bother to check that the inputs from the outside world are valid and that they do not cause integer wraparound. The former is a problem you can repeat in just about any language, while the latter is relevant to many languages. Guess what, I know how to check inputs. I know how to prevent integer wraparound. And C doesn't make it hard for me to do so.
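A sketch of the kind of check I mean, nothing fancy; the function name is made up:

    /* Refuse untrusted lengths that would wrap around, before doing
     * the arithmetic. */
    #include <stdint.h>
    #include <stddef.h>

    int safe_total(size_t a, size_t b, size_t *out)
    {
        if (a > SIZE_MAX - b)
            return -1;          /* would wrap, reject the input */
        *out = a + b;
        return 0;
    }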
The next and last interesting part of the disclosure is mostly concerned with leaking private keys. Now what did I say about juggling private keys in an internet facing process? Just because the ssh devs didn't isolate that part into a separate process doesn't mean it can't be done (and honestly I have no idea why I would be juggling private keys during the generation of a web page).
So there you have it, old code from a time before explicit_bzero, juggling private keys, not checking inputs and running on a system without malloc_options. You can lament it all you like but that doesn't mean everyone has to do it wrong. It shows that you can do it wrong, not that C makes it hard to do it right.
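And for completeness, this is all it takes nowadays on a system that has explicit_bzero(3) (OpenBSD does, newer glibc too); the helper and buffer names here are made up:

    /* Hypothetical helper: wipe key material in a way the compiler is
     * not allowed to optimize away. */
    #include <stddef.h>
    #include <strings.h>    /* explicit_bzero(3); <string.h> on glibc */

    void done_with_key(unsigned char *key, size_t key_len)
    {
        explicit_bzero(key, key_len);
    }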
> So, at the root of this bug we have a broken implementation of a circular buffer that fails to account for one case. That is the kind of binary logic error I can repeat in practically every single programming language. I fail to see how C makes it very difficult to prevent this type of error.
Not that any other language is on trial here, but there are languages that would naturally make such a bug into a compile time error.
I fail to see how C helps you avoid making such an error. C's general standpoint is that there is no such thing as an error, there is no such thing as a type, and it is perfectly acceptable to have undefined behavior sitting within trivial edit distance of common code patterns.
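A minimal example of what I mean: the very same "did it wrap?" idiom is well defined on unsigned types and undefined behavior on signed ones, and an optimizing compiler is allowed to delete the signed check entirely.

    #include <stddef.h>

    int fits_unsigned(size_t a, size_t b)
    {
        return a + b >= a;      /* defined: unsigned arithmetic wraps */
    }

    int fits_signed(int a, int b)
    {
        return a + b >= a;      /* UB on overflow; may become "always 1" */
    }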
> The former is a problem you can repeat in just about any language
Actually, no, that's false. A lot of popular languages check arithmetic. Faulting in such a case would have saved the day, but hey, faster execution. Even old languages like Lisp did this.
Not many languages are as lackadaisical as C is about this. But error handling (at any stage) has always been C's weakest point. I can't think of a C successor that hasn't called out C on this front and then tried to improve on it.
> It shows that you can do it wrong, not that C makes it hard to do it right.
Insufficient abstraction means that you can't reuse code, so instead of getting it right once you have to get it right every time.
And what's the compelling reason to use C? It's "simple" and "close to the metal" but the problem domains are anything but. Availability and instrumentation bias encourage people to use C for "efficiency", trading off correctness for faster code. It's a tradeoff one can make, but if you're working with other people's data you should think twice.
How many times faster does a piece of code need to be to make up for violating a user's privacy? How many times "simpler" does code need to be for someone reading it to justify not making every effort to avoid security faults?
But then, I've worked with financial data a lot, and my work ended up being associated with a national-scale bank with an API. The sheer volume of attacks my code had to endure was on a completely different scale from what most people will ever experience.
> Not that any other language is on trial here, but there are languages that would naturally make such a bug into a compile time error.
I'm sure there are languages where the use of any condition forces you into making sure there is some explicitly taken branch for all possible inputs -- and perhaps that language also magically knows what you must do inside each branch. Show me all the projects that are using these for web development, which is the context for this discussion. Otherwise it is not fair to bash C over it.
All the mainstream languages I see in web development allow you to make the exact same mistake.
> I fail to see how C helps you avoid making such an error. C's general standpoint is that there is no such thing as an error, there is no such thing as a type, and it is perfectly acceptable to have undefined behavior sitting within trivial edit distance of common code patterns.
But there are errors. There are types. UB is not relevant to what you are replying to; the problem in question was a condition that was never considered. Again, show me how your average web language tells the programmer that they forgot an if condition, or stop bashing C over it -- you're dreaming of features from some unicorn language nobody uses in the real world anyway.
> Actually, no, that's false. A lot of popular languages check arithmetic.
Please read again. I said "the former", referring to input validation. That is relevant to every language accepting untrusted input.
> Insufficient abstraction means that you can't reuse code, so instead of getting it right once you have to get it right every time.
That statement is so wrong I can only conclude that you're either smoking something or you haven't programmed in C and are completely oblivious to the work the OpenBSD folks (and many others) are doing to fix these issues in existing, reusable library code and to introduce new, safer APIs. Sure you can pretend that everyone who wants a buffered output stream to a socket has to write their own circular buffer and repeat the same mistake. You are wrong, and if you had paid attention you would have seen counterexamples (libevent is a popular one) that prove you wrong. You're just hating on C but don't know it.
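To be concrete, here is roughly what using libevent's evbuffer looks like. A sketch against libevent 2.x, not anyone's actual server code:

    /* evbuffer gives you a growable, reusable output buffer, so there
     * is no wrap arithmetic to get wrong. */
    #include <event2/buffer.h>

    int queue_reply(struct evbuffer *out, const char *body, size_t len)
    {
        return evbuffer_add(out, body, len);    /* 0 on success, -1 on error */
    }

    /* later: evbuffer_write(out, sockfd) flushes whatever the socket
     * will currently take. */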
> And what's the compelling reason to use C?
I'm not trying to convince anybody to use it, and my reasons are my reasons -- the performance strawman you're building isn't the key one. But it doesn't matter.
> Show me all the projects that are using these for web development, which is the context for this discussion. Otherwise it is not fair to bash C over it.
I'm not sure why I'd play the game when "mainstream" is basically a way to discount any offering. But in C++, C#, or basically any class-based language you can code to guard against this. Functional languages with static types provide strong guarantees against it; OCaml and Haskell come to mind as well-known examples.
> But there are errors. There are types.
Not according to the compiler. Anything can become anything else.
> That statement is so wrong I can only conclude that you're either smoking something or you haven't programmed in C and are completely oblivious to the work the OpenBSD folks (and many others) are doing to fix these issues in existing, reusable library code and to introduce new, safer APIs
I'm aware of the work, but C's problem is not that it lacks more library code.
> Sure you can pretend that everyone who wants a buffered output stream to a socket has to write their own circular buffer and repeat the same mistake.
I'm not saying they have to; it's just that C's design makes it easier for people to end up doing so. Those are very different statements.
> You're just hating on C but don't know it.
I see where this is now going. "If you knew it, you'd like it." I'm not going to waste any more of either of our time if this is the new talking point.
> Not according to the compiler. Anything can become anything else.
You mind clarifying that? I constantly have compile-time errors from type mismatches. You can cast a variable to a different type, but you can do that in any language.
You can't silently narrow an int into a char without the compiler being able to warn about it, and mismatched pointer types get flagged outright. About the closest you could get is mixing char and short, since they're both small integer types, and even then the compiler may warn about the implicit conversion.
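A toy example of the kind of diagnostics I mean (compile with something like cc -Wall -Wextra -Wconversion); nothing here is from real code:

    /* Both the narrowing assignment and the mismatched pointer
     * assignment get flagged by the compiler. */
    int main(void)
    {
        int big = 70000;
        char small = big;   /* -Wconversion: value may not fit in char */
        int *ip = &small;   /* incompatible pointer types, flagged */
        return (int)small + (ip != 0);
    }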
That said, JavaScript doesn't really have static types at all. Every value can be implicitly coerced to another type, and yet Node is still used server side.
> You mind clarifying that? I constantly have compile-time errors from type mismatches. You can cast a variable to a different type, but you can do that in any language.
Well, for one, people overload types: using return codes to carry error values, for example. The second problem is unions. They exist to make people converting byte streams into structures happier, but they often get misused elsewhere.
And void* is essentially a black hole, but it's a very common black hole to see in programs.
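A toy example, not from any real codebase, of both holes at once:

    /* The compiler accepts all of this, and none of it remembers what
     * the data really was. */
    #include <stdio.h>

    union overloaded {
        int   err_code;     /* sometimes an error number... */
        char *msg;          /* ...sometimes a message pointer */
    };

    void log_anything(void *p)      /* void * erases the type entirely */
    {
        printf("%p\n", p);
    }

    int main(void)
    {
        union overloaded u;
        u.err_code = -1;
        log_anything(&u);   /* any object pointer silently becomes void * */
        return 0;
    }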
> That said, JavaScript doesn't really have static types at all. Every value can be implicitly coerced to another type, and yet Node is still used server side.
Yeah, and that's the most solid criticism against it! While it's true that it's quite a bit harder to cause fatal errors in a Node program, the language does little to help you solve these problems once you expose that functionality to it.
That's why TypeScript has been doing so well, I think. It's consuming the JavaScript ecosystem faster than anything I've ever seen, and having written a fair amount of it (for a server-side application, no less) I'm constantly surprised by how capable it is.
And of course, I'm on record as a big fan of PureScript and Elm, with more bias towards the former.