There are plenty of codebases (including large ones) written in C that have a low number of vulnerabilities, even compared to projects written in higher level languages (setting aside Rust et al because they aren't that popular yet). It requires discipline and care, but it's not impossible or even that hard if you're a skilled C programmer. OpenSSL has louder vulnerabilities than most software because it's (1) very old, (2) very bad code, and (3) relied upon by heaps of software. There are lots of things wrong with OpenSSL, and those are the reasons that it's vulnerable. We should address the real issues instead of using C as a scapegoat.
> It requires discipline and care, but it's not impossible or even that hard if you're a skilled C programmer.
I have been reading about this mythical "skilled C programmer" since I first learned C in 1993, before eventually joining the C++ ranks instead and leaving my favorite Turbo Pascal behind.
Never met one to this day, but I fixed lots of memory corruption issues left behind by not-so-skilled ones back when I worked daily with C and C++ codebases.
We know they exist. Bernstein and Cutler are two probables based on reviews of their code. Thing is, these people are mentally not even human: it's like Human++ in terms of technical proficiency. So, maybe skilled C programmers only exist among the super-humans. ;)
There are plenty of freeways with speeding/puddle jumping drivers that have a low number of crashes, even compared to slower drivers (setting aside AI drivers, as they aren't that popular yet). It requires discipline and care, but it's not impossible or even that hard if you're a skilled driver at speed. Old cars have more accidents than most because they are (1) very old, (2) made of very poor parts, and (3) there are lots of them out there. There are lots of things wrong with old cars, and those are the reasons that they have accidents. We should address the real issues instead of using speeding drivers as a scapegoat.
I thought I was being clever. This is so well-written I can't tell if you're agreeing with C opponents, mocking them, or baiting someone like me into writing a comment like... Moving on.
I think the problem is partly with unsafe equipment, and partly with people who, in seeking to go just a little faster or get a little more of a thrill, make things quite a bit less safe for all those around them. To some degree, we all do this in different parts of our lives. People tend to vastly overestimate their ability to sustain high output, if not in one area (defensive coding) then in others (driving, procrastination, etc). Some of these affect the people around you more than others.
To clarify my original comment somewhat, I was talking less about general speeding of a few miles over the speed limit, and more about those people who are going significantly faster than surrounding traffic and weaving in and out of it to advance (puddle jumping). I do not enjoy having my chance of an accident increased by many orders of magnitude because of someone else's (impossible!) sense of competency, and I think that's very relevant in these discussions.
That's very interesting. I agree that effect is there. Passing down incorrect cultural knowledge about C is another one, which I counter with my Pastebin. There are others, like SPARK, Rust, and ATS, countering the notion that safety means too slow or nothing low-level. Many things.
Back to your analogy, it seemed to start with both types of drivers: IBM vs Burroughs; C vs Wirth languages. The puddle-jumper style got pretty popular, with most roads and Interstates being built between their cities. The safer drivers have to be on those roads, too, but in fewer numbers, since "Move Fast and Break Things" wasn't popular with small-town folk. It's gotten to the point that the majority of traffic, and of the causes of accidents, are puddle jumpers who mostly don't see that their cars and driving styles are the problem.
And now we gotta find a way to re-design roads and cars to make their style less damaging while encouraging others to drive more wisely. Wow, put it in your analogy and I suddenly have less hope. ;)
The analogy breaks down the deeper you go, but there are some interesting parallels when you consider driving habits in other cultures. In the US (and probably most western countries) we have highly standardized and policed roads to prevent injury and accidents. I would argue this provides a more efficient system overall, where in the end more people are able to get to their destination not just safer, but also faster, because the lack of accidents and assurance about others' likely actions on the road allow for mostly smooth operation. With computers, how much less (less, because it would never be none) hardware, software, CPU time and memory would need to be devoted to security (firewalls, IDS, malware/virus detection and cleanup) if we had sacrificed a small amount of performance to ensure a more secure operating environment most of the time?
We've achieved this with some aspects of society by making rules that hamper some for the benefit of all (traffic laws, employment laws, etc), because we've recognized in some places the failure of individual people and groups to correctly assess risk and danger at wider levels and over longer time frames. These laws and regulations can go overboard, but they're needed to some degree because people are very poor stand-ins for the rational actors with good information which, in a pure market economy, could make these decisions correctly. We've had little to nothing like this for software engineering, which has led to great advancements in short times, but I think we are nearing the point (if we haven't already passed it) when our past decisions to prioritize efficiency and performance over safety are resulting in a poorer relative outcome than if we had made different choices in the past.
"but also faster, because the lack of accidents and assurance about other's likely actions on the road allow for mostly smooth operation."
Yes, yes. It would seem so. There was a counter-point here showing that eliminating traffic controls reduced crashes and congestion because people paid more attention. It was done in quite a few cities. I'd agree some standardization of behavior and design definitely improves things, though, as you know what to expect while driving.
"how much less (less, because it would never be none) hardware, software, CPU time and memory would need to be devoted security (firewalls, IDS, malware/virus detection and cleanup) if we had sacrificed a small amount of performance to ensure a more secure operating environment most the time?"
BOOM! Same message I've been preaching. My counterpoint to Dan Geer gave a summary of examples with links to more specific details.
"We've had little to nothing like this for software engineering, which has led to great advancements in short times, but I think we are nearing the point (if we haven't already passed it) when our past decisions to prioritize efficiency and performance over safety are resulting in a poorer relative outcome than if we have made different choices in the past."
Total agreement again. It's time we change it. We did have results with Walker's Computer Security Initiative and are getting them through DO-178B. Clear standards with CompSci- and industry-proven practices plus financial incentives led to many products on the market. The Bell paper below describes the start, middle, and end of that process. Short version: NSA killed market by competing with it and forcing unnecessary re-certifications.
Anyway, I've worked on the problem a bit. One thing is a modern list of techniques that work and could be in a certification. Below, I have a list I produced during an argument about there being no real engineering or empirical methods in software. Most were in the Orange Book, but each was proven in real-world scenarios. Some combo of them would be a nice baseline. As far as the evaluation itself, I wrote up an essay on that based on my private experience and problems with government certifications. I think I have solid proposals that industry would resist only because they work. :) The DO-178B/C success makes me think it could happen, though, because they already do it piece by piece with a whole ecosystem formed around re-usable components.
Of course, I'd be interested in your thoughts on the empirical stuff and security evaluations for further improvement. For fun, you might try to apply my security framework or empirical methods to your own systems to see what you find. Only for the brave, though. ;)
> Of course, I'd be interested in your thoughts on the empirical stuff and security evaluations for further improvement. For fun, you might try to apply my security framework or empirical methods to your own systems to see what you find. Only for the brave, though. ;)
If only I worked in an environment where that was feasible. I write Perl for a very small off-market investment firm (event ticket brokerage) as the only software engineer. My work time is split between implementing UI changes to our internal webapp (which we are thankfully going to be subbing out), reporting and alerting tools for the data we collect, manipulating and maintaining the schema as data sources are added, removed, or changed, and maintaining the tools and system that stream the data into the model. While getting to a more secure state would be wonderful, I'm still working to reduce the number of outright blatant bugs I push into production every day due to the speed at which we need to iterate. :/
> Counter point to Dan Geer: Hardware architecture is the problem
This is interesting, and aligns quite well with the current discussion. We live with the trade-offs of the past, which while they may have made sense in the short term, are slowly strangling us now.
> Bell Looking Back
An interesting paper on the problems of security system development and how market changes have helped and hampered it over time. I admit I skimmed portions of it, and much of the later appendix sections, as my time is limited. The interesting take-away I got from it is that our government security standards and certifications are woefully inadequately provided for, where they aren't outright ignored (both at a software level and at an organizational policy level). I now feel both more and less secure, since I wasn't aware of the specifics of government security certification, so seeing that there are many and that they are somewhat well defined encourages me to believe the problem has at least received rigorous attention. Unfortunately it looks like it's a morass of substandard delivery and deployment, so there's that. :/ It is a decade old though, so perhaps there have been positive developments since?
> Essay on how to re-design security evaluations to work
and from that your "nick p on improving security evaluations" and
> List of empirically-proven methods for robust, software engineering
These all look quite well thought out, at least given my relative inexperience with formal computer security and exploitation research (I've followed it more closely at some times than others, and it's a path I almost went down after college, but did not). The only thing I would consider is that while these are practices for developing secure systems, and they could (and should in some cases) be adopted for regular systems, I think there is a place for levels of adherence to how strict you need to be, and how much effort needs to go into your design and development. Just as we require different levels of expertise and assurance for building a small shed, a house, an office building, and a skyscraper, it would be useful to have levels of certification for software that provided some assurance that specific levels of engineering were met.
"While getting to a more secure state would be wonderful, I'm still working to reduce the number of outright blatant bugs I push into production every day due to the speed at which we need to iterate. :/"
Interesting. What do you think about the tool below, designed to handle apps like yours with a low defect rate?
It originally delivered the server parts on Java, I think. It switched to Node due to low uptake, client/server consistency, and all the RAD stuff being built for Node. The main tool is written in an ML-family language by people who take correctness seriously. I doubt you can reboot your current codebase but it seems it should be applicable for a similar set of requirements or a spare-time app. Also, not a write-only language. ;)
Jokes on Perl aside, you might find it fascinating, and even ironic given current usage (eg UNIX hackery), to know that Perl only exists due to its creator working... on the first high-assurance VPN for the Orange Book A1 class (the highest). Took me 10-20 minutes of Google-fu to dig it out for you:
I'm sure you'll find his approach to "secure" configuration management entertaining.
" We live with the trade-offs of the past, which while they may have made sense in the short term, are slowly strangling us now."
Yep. Pretty much. Outliers keep appearing that do it better, but they get rejected for cost or lack of feature X. Some make it, but most don't.
" I now feel both more and less secure..."
"I think there is a place for levels of adherence to how strict you need to be"
Very fair position. :) The early standard (the Orange Book) did that to a degree by giving you levels with increasing features and assurance: C1, C2, B1, B2, B3, A1. Really easy to understand, but the security features often didn't match the product's use case. ITSEC let you describe the product, its security features, and its assurance rating separately to fit your product. It was less prescriptive on methods, too. Common Criteria did that as well, but its authors knew people would screw up the security requirements. So, they added (and preferred) Protection Profiles for various types of product (eg OS, printer, VPN) with the threats specified, a baseline of countermeasures, and the minimal level of assurance applicable. You could do "augmented" (EAL4 vs EAL4+) profiles that added features and/or assurance. The CIA's method was more like the Orange Book in that it put simple descriptions on the cover, but like this: Confidentiality 5 out of 5, Integrity 5 out of 5, Availability 1 out of 3. Each number corresponded to a specified level of risk or protection, with the methods for achieving it left up to the manufacturer and evaluators.
So, your expectation existed in the older schemes in various ways. It can definitely be done in the next one.
"different levels of expertise and assurance for building a small shed, a house, an office building, and a skyscraper"
Nah, bro, I want my shed and house built with the assurance of an office building or skyscraper. I mean, I spend a lot of time there with some heavy shit above my head. I just need the acquisition cost to get to low six digits. If not, then sure... different levels of assurance... grudgingly accepted. :)
> I doubt you can reboot your current codebase but it seems it should be applicable for a similar set of requirements or a spare-time app.
No kidding. At 85k+ LOC (probably ~70k-75k after removing autogenerated ORM templates) in a language as expressive as Perl... well, I wouldn't look forward to that. And really, if it didn't have a DB abstraction layer at least approaching what I can do with DBIx::Class, I'm not going to contemplate it. Mojolicious takes care of most of my webapp needs quite well. As for type checking, what I have isn't perfect, but it's probably worlds better than what you are imagining. I'll cover it below.
> Also, not a write-only language. ;)
use Moops; # [1]. Uses Kavorka[2] by default for function/method signatures

role NamedThing {
    has name => (is => "ro", isa => Str);
}

class Person with NamedThing;

class Company with NamedThing;

class Employee extends Person {
    has job_title => (is => "rwp", isa => Str);
    has employer  => (is => "rwp", isa => InstanceOf["Company"]);

    method change_job ( Object $employer, Str $title ) {
        $self->_set_job_title($title);
        $self->_set_employer($employer);
    }

    method promote ( Str $title ) {
        $self->_set_job_title($title);
    }
}

# Now to show off Kavorka's more powerful features
use Kavorka;                 # fun() isn't available outside Moops class/role blocks without this
use Types::Common::Numeric;
use Types::Common::String;

fun foo(
    Int $positional_arg1 where { $_ % 2 == 0 } where { $_ > 0 }, # Composes a subset of Int on the fly
    Str $positional_arg2,
    ArrayRef[HashRef|MyObject] $positional_arg3,                 # Complex type
    DateTime|NonEmptySimpleStr :$start = "NOW",                  # Named param with default
    DateTime|NonEmptySimpleStr :stop($end)!,                     # Named param bound to $end inside the function; the ! marks it required
) {
    ...
}
It's not compile-time checking, but man is it useful. If you've been following Perl 6 at all, it's mostly a backport of those features. Particularly useful is the ability to define your own subtypes and use those in the signatures to keep it sane. E.g. declare HTTPMethod, as Str, where { m{\A(GET|POST|PUT|PATCH|DELETE)\Z} }; Perl 6, where at least some of this is compile-time checked (obviously complex subtypes cannot be entirely), fills me with some hope. I'm not entirely sold though, that's one beast of a language. It really makes you put your money where your mouth is when it comes to espousing powerful, extensible, expressive languages. I guess time will tell whether all that rope can be effectively used to make a net instead of a noose more often than not. :)
> I'm sure you'll find his approach to "secure" configuration management entertaining.
Ha, yeah. I think secure is a bit of a misnomer here though, as it is a fairly novel way to do authorized configuration management, for the time.
> So, your expectation existed in the older schemes in various ways. It can definitely be done in the next one.
And after you've filled me with such confidence that they are capable of both speccing a good standard and incentivizing its use at the same time! ;)
> Nah, bro, I want my shed and house built with the assurance of an office building or skyscraper. I mean, I spend a lot of time there with some heavy shit above my head.
I'm a little disappointed that I just image searched for "overengineered shed" and all the results ranged from a low of "hmm, that's what I would probably do with the knowledge and time" to "oh, that's a job well done". The internet has failed me...
" If you've been following Perl 6 at all, it's mostly a backport of those features. "
Definitely an improvement over my Perl days. Not sure whether they'll eventually get it fully under control. It is better, though.
" I think secure is a bit of a misnomer here though, as it is a fairly novel way to do authorized configuration management, for the time."
Definitely haha. Quite a few things were improvised back then, as no tooling existed. Secure SCM was eventually invented and implemented in various degrees. Wheeler has a nice page on it below. Aegis (maintenance mode) and esp OpenCM (dead link) implemented much of that.
" I just image searched for "overengineered shed""
Bro, I got you covered. I didn't subscribe to Popular Science for the layperson-oriented science news. It was for cool shit like the SmartGarage: a truly-engineered garage. They took it down, but I found an article and a video of its construction. Enjoy! :)
That is absolutely true and I don't disagree that it is possible to write C securely. But I would contend that the statistics you mention reflect differences in programmer attitudes and skill, and should be controlled for in any comparison. The same level of care and skill in another language will very likely lead to even fewer vulnerabilities, as long as that language doesn't have its own even worse foot-guns.
The priorities of the language are just different. When I use C, I get fairly decent performance basically for free, but I have to work and maintain discipline to obtain safety, high-level abstractions, etc. I'd much rather work in a language where I get memory safety and basic type-sanity checks for free and have to work for performance. This should really be the default, as it's much more in line with the end-user's needs. When I vet software, it's extremely easy to tell whether it is performant, but it takes serious auditing in C to tell whether it is even memory-safe, let alone actually correct and relatively free of side-channels, etc.
The thing about memory safety is that it can be compromised almost anywhere in your code. Your attack surface compared to what _should_ be exposed to attackers is incredibly large. If I can trust the memory-safety of a codebase I am auditing, I don't have to spend nearly as much time on it because I can focus on the parts that actually deal with the intended purpose of the code rather than having to go through every allocation and every memory access in the entire codebase with a fine-toothed comb.
Why do we continue to prefer a language where it's easy to achieve the only goal that is obvious when not achieved, at the expense of requiring great care and discipline 100% of the time in order to not instantly and fully compromise the harder-to-evaluate goals that actually matter (especially in security-oriented software such as openssl)? In this case, bad buffer handling in a data structure deserializer gave attackers a free ride past man-years (perhaps man-centuries) of work writing and auditing security-critical code.
Surely it'd be better if these kinds of failures were at least interesting rather than an endless parade of careless mistakes that could literally have been caught by trivial automation in the compiler. Especially when there are so many good compilers for so many good languages that already do it.
Oops, forgot to make the explicit connection to your comment - the TL;DR is really the last paragraph.
OpenSSL is a beautiful case study in this. If it were written in a memory-safe language, you're right that many things would still be wrong with it - the author(s) very well may have botched "heartbleed" anyway with their custom allocator, but the parade of other buffer handling errors would not have been exploitable except potentially as denials of service. The weaknesses would have at least been interesting (and less numerous, all other things being equal).
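To make that bug class concrete, here's a toy sketch (a hypothetical echo/deserializer routine, not OpenSSL's actual heartbeat code) of the pattern: trust the length field inside the message rather than the number of bytes actually received, and the reply drags whatever sits next to the request in memory along with it.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wire format: [1-byte claimed payload length][payload bytes] */
unsigned char *build_echo_reply(const unsigned char *msg, size_t msg_len,
                                size_t *reply_len)
{
    size_t claimed = msg[0];   /* attacker-controlled length field */
    (void)msg_len;             /* the missing bounds check would compare against this */

    unsigned char *reply = malloc(claimed ? claimed : 1);
    if (reply == NULL)
        return NULL;

    /* BUG: copies 'claimed' bytes even when the real payload (msg_len - 1)
     * is shorter, so the reply leaks whatever memory follows the request. */
    memcpy(reply, msg + 1, claimed);
    *reply_len = claimed;
    return reply;
}

int main(void)
{
    /* Pretend this arrived off the wire: it claims 64 payload bytes but
     * actually carries only 4. */
    static const unsigned char wire[5] = { 64, 'p', 'i', 'n', 'g' };
    unsigned char *request = malloc(sizeof wire);
    if (request == NULL)
        return 1;
    memcpy(request, wire, sizeof wire);

    size_t reply_len = 0;
    unsigned char *reply = build_echo_reply(request, sizeof wire, &reply_len);
    if (reply != NULL) {
        printf("echoed %zu bytes; only %zu were actually sent\n",
               reply_len, sizeof wire - 1);
        free(reply);
    }
    free(request);
    return 0;
}

In a memory-safe language the equivalent copy would fail loudly (an exception or a refused compile) instead of quietly handing back 60 bytes of whatever the allocator had lying around, which is the whole "exploitable leak vs. boring denial of service" distinction.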
I think that we need an improved C compiler more than we need new languages. You put it quite eloquently:
>Surely it'd be better if these kinds of failures were at least interesting rather than an endless parade of careless mistakes that could literally have been caught by trivial automation in the compiler.
You're right - they CAN be caught by trivial automation in the compiler, so let's add that.
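As a sketch of what "already caught by trivial automation" looks like today (assuming a GCC or Clang toolchain, and the hypothetical build_echo_reply() from the earlier example saved as demo.c): the opt-in instrumentation those compilers ship will abort the run at the bad memcpy and point at the line, and the manual check it substitutes for is a single comparison.

/*
 * With the earlier toy deserializer in demo.c:
 *
 *     cc -g -fsanitize=address demo.c -o demo && ./demo
 *
 * AddressSanitizer reports a heap-buffer-overflow (read) at the memcpy().
 * The checked version simply refuses lengths the message can't back up:
 */
#include <stdlib.h>
#include <string.h>

unsigned char *build_echo_reply_checked(const unsigned char *msg,
                                        size_t msg_len,
                                        size_t *reply_len)
{
    if (msg_len < 1)
        return NULL;

    size_t claimed = msg[0];
    if (claimed > msg_len - 1)      /* reject length fields the payload can't cover */
        return NULL;

    unsigned char *reply = malloc(claimed ? claimed : 1);
    if (reply == NULL)
        return NULL;

    memcpy(reply, msg + 1, claimed);
    *reply_len = claimed;
    return reply;
}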