Preventing heartbleed bugs with safe programming languages (bluishcoder.co.nz)
116 points by cpeterso on April 11, 2014 | 112 comments



To take it one step further, it might be a good idea to specify the whole thing in Coq, prove correctness, and derive the C code from there. Coq is a theorem prover based on a dependent type theory, related to the Martin-Löf type theory behind the ATS mentioned in the article. Xavier Leroy (the man behind OCaml) managed to write a whole verified C compiler in Coq[1], so while it's slower than writing C directly, it should still be feasible for projects of such critical importance.

On a related note, mathematicians are also finding it increasingly difficult to trust unverified proofs. Papers by Vladimir Voevodsky, a Fields medalist and one of the most prominent mathematicians in his field, were discovered to contain errors more than ten years after publication, having been cited hundreds of times with nobody noticing the errors in the proofs. This was one of the principal motivations for him to initiate the homotopy type theory project, which also has roots in Martin-Löf type theory and is also being developed formally in Coq.

[1] http://compcert.inria.fr/diagram.png


I've said it before and I'll say it again.

Formalizing the TLS specification and proving that an implementation is consistent with it only shows that the implementation is logically correct. However, it does NOT show that the implementation is secure. Your implementation can be vulnerable to side-channel attacks (particularly timing attacks) while still being logically correct.

Proofs of correctness are a (big) step in the right direction, but they're not a silver bullet for building a secure implementation.


Heartbleed was not a side-channel attack. It was not a subtlety of the TLS specification. It was an instance of the most well-known and common class of C security flaws: reading past the end of a buffer. Better languages may not solve everything, but they are absolutely the solution to this class of errors.
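
To make the class concrete, here is a minimal sketch of the heartbeat pattern in a bounds-checked language like Rust (not OpenSSL's actual code; the function and parameter names are made up). The over-read that C performs silently becomes a recoverable `None`:

    // Hypothetical heartbeat handler: `claimed_len` is the attacker-
    // controlled length field, `record` is what actually arrived.
    fn heartbeat_response(claimed_len: usize, record: &[u8]) -> Option<Vec<u8>> {
        // In C, memcpy(out, record, claimed_len) reads past the end of
        // the buffer whenever claimed_len exceeds the bytes received,
        // leaking whatever sits in adjacent heap memory. Checked
        // slicing refuses instead of leaking:
        let payload = record.get(..claimed_len)?;
        Some(payload.to_vec())
    }

The point is not this particular fix (C programmers can write the length check too, and OpenSSL eventually did); it's that in the safe language, forgetting the check is a compile-time or controlled run-time failure rather than a silent information leak.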


The same could be said for Goto Fail and the GnuTLS bug. I wonder how something like Coq would handle the Debian random number bug, though. Still, 3 out of 4 ain't bad.

Do any functional or logic programming languages or libraries have a concept of entropy, and entropy generating/reducing operations?


Well-said!


You can use mathematics to prove other things as well. You could, e.g., use a special bool type for all secret data that you can't branch on, so that no secrets could be exposed by examining the runtime path of execution. Or build a model of your CPU's caching and prefetching behaviour and prove that your code doesn't expose any secrets through that channel.
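
As a rough sketch of the first idea, here is a hypothetical `SecretBool` in Rust (real libraries along these lines also have to fight the optimizer, which can reintroduce branches; this only illustrates the type-level discipline):

    // A boolean you cannot `if` on: the mask is private and no
    // conversion back to `bool` is exported, so the only way to
    // consume it is a branchless select, and the execution path
    // never depends on the secret.
    pub struct SecretBool(u64);

    impl SecretBool {
        pub fn new(b: bool) -> Self {
            // Expand true/false into an all-ones/all-zeros mask, branch-free.
            SecretBool((b as u64).wrapping_neg())
        }
        pub fn select(&self, if_true: u64, if_false: u64) -> u64 {
            (if_true & self.0) | (if_false & !self.0)
        }
    }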


Right, this will work against some types of side-channel attacks. For example, I could prove that my decryption function runs in a constant amount of time no matter the input.

However, you can't proactively defend against side-channel attacks you haven't thought of. By definition, a novel side-channel attack exists because the designer didn't know that the cryptosystem leaked information through that channel when (s)he designed and proved the correctness of the implementation.

Proving that the cryptosystem doesn't leak information at all is well-nigh impossible, and often outside the scope of software engineering. My favorite example is acoustic analysis (https://www.cs.tau.ac.il/~tromer/acoustic/). Before this was shown to be possible, how were cryptosystem designers supposed to know to defend against it?

The best we can do is come up with and formalize a threat model, and then prove that the cryptosystem is secure against the adversaries in the threat model. The problem is that threat models don't help you with adversaries in the future, who have means of getting information from your cryptosystem that you didn't think of.


I take your point, but I don't think _anything_ can defend against unknown attacks. However, we could defend against all known attacks using formal methods, which is a lot better than we're doing at the moment. Further, when a new attack becomes known, I would wager that having something that is verifiable today will be easier to verify or fix in the future, as compared to a code-base that is not provable.

Just because we can't achieve theoretical perfection doesn't mean we should rule out methods that would solve many of the actual problems we encounter today.

Of course, if you can find a way to defend against unknown future problems, I'm all in favor of using your method!


Unknown-unknowns as it were.


A formal-methods solver for an engineering description language was the thinking behind Z notation ("Zed"): https://en.wikipedia.org/wiki/Z_notation

But the ideas of a trusted OS and trusted instructions might be necessary to compartmentalize secrets and reduce their potential exposure to the least possible opportunity. It's basically impossible to completely eliminate plaintext and secrets held in the clear unless the maths allows it, but we should still try.


How complex must that model be?

At least as complex as a real CPU, so we're back to square one.


TLS has so many features that they combinatorially increase the attack surface to the size of the Milky Way.

A much simpler, non-repudiable authenticated encryption scheme implemented in a safer language would be an ideal way forward. It would also help to have a reference implementation and a reference test suite to lock down behavior.
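
As a sketch of what the surface of such a scheme might look like (the trait and method names here are invented for illustration, not taken from any existing library):

    // A deliberately tiny authenticated-encryption interface: one
    // cipher suite, no negotiation, no renegotiation, no heartbeats.
    // A reference test suite would then pin down every observable
    // behaviour of an implementation of this trait.
    pub trait AuthenticatedChannel {
        type Error;
        /// Encrypt and authenticate; nonces are managed internally
        /// so a caller cannot reuse one by accident.
        fn seal(&mut self, plaintext: &[u8]) -> Result<Vec<u8>, Self::Error>;
        /// Verify and decrypt, rejecting tampered or replayed records.
        fn open(&mut self, ciphertext: &[u8]) -> Result<Vec<u8>, Self::Error>;
    }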


It occurs to me that if I was a (black hat) hacker looking for vulnerabilities, I could reimplement critical software in a safe language, leveraging the type checker to search for bugs. Is anyone aware of people doing this already?


Your reimplementation would either be very different from the original, and hence not suffer the same bugs, or the compiler wouldn't get further than a couple of function calls before giving you a cryptic error message since it doesn't understand the control flow.

Static analysis tools are useful for finding bugs in existing code, but safe programming languages won't magically point out all of the errors in some arbitrary code written in another language. They're safe precisely because they do not allow you to write code in the way that an unsafe language would.

As an analogy, if you were looking for use-after-free errors in some C code, it wouldn't do you any good to check a Python version.


Actually, your reimplementation can be identical to the original, in a language like ATS, as demonstrated by the article in question. This is one of the things I find very compelling about ATS.


You are on your way to reinventing grammar/model/spec-based fuzzing. It works pretty well. E.g., it was used to find the Heartbleed bug (with Codenomicon's tools at least; I don't know how Google fuzzed it).


You can do specification-as-types in ATS2 too. My biased view is that I find ATS2 far more practical than using Coq.


In an age where "most" web servers are idling at 10% load and yawning most of the hours away, it makes no sense for security-critical network code to be written in unsafe languages that can expose memory, such as C.

Due to improved hardware, we've now reached the point where we can have reasonable security and performance at the same time. We no longer need to sacrifice security by using unsafe languages. So why are we using them?

Yes, some companies like Google and Facebook have busy servers, but for the rest of us, for the rest of the entire internet now having to redo all our keys and certs and passwords due to an archetypal C-style bug in OpenSSL, the cost of using C compared to the potential performance benefits is absolutely not worth it. Not even close.

TLDR: Stop writing network code in C unless you bloody well need to.


I think the point of TFA is that we can write this code and it will run as efficiently as C in a language which is safer than C.


It was already possible in the 70's with PL/I, Mesa and Modula-2, but alas C's designers thought it would be better not to offer bounds checking in the language at all, instead of offering a way to disable it if required, like those other languages did.


Is electrical power free for you, then?


With just a little bit of brain power and abstraction C can be made "safe" (assuming you don't cock up the API).


Certainly. But abstract it enough and the abstractions and syntax extensions required become another language. Abstract it to the point where you don't have any pointer arithmetic, a good type checker and so on, and the C code has turned into something else. Just adding wrappers around buffers to prevent overflows etc. doesn't come near the type of safety you can get in languages where you can prove formally what is going to happen when the program executes.

Before you have the kind of security you can reason about like http://www.mitls.org (rather than just abstractions for buffers etc.) it can't look like C any more.


While miTLS is a step in the right direction (as are proofs of logical correctness as a general development strategy), one must keep in mind that miTLS in particular is implemented in F#, meaning that it runs on top of the CLI. The CLI is written in an "unsafe" language and is not formally verified for correctness, yet to use miTLS you must trust the CLI fully.

I'm not saying that the CLI is inherently insecure. I'm just saying that you can't ignore the security implications of the layers beneath your TLS implementation.


I see what you are saying: any language that has a runtime has its security dependent on that runtime. F# can compile and run on a variety of platforms (not sure which ones miTLS is compatible with), not necessarily only the MS CLI.

Normally I'd argue that running safe programs on top of a runtime/VM would make everything safer since you would only have to prove ONE system (the VM) to be safe, but with the Java debacle of recent years I think that argument isn't really as good as it used to be.

I agree with what you are saying: for proven security you need to prove properties of all the layers down to the metal. Not sure how difficult this is for good "safe" languages such as e.g. Haskell.


"I agree with what you are saying: for proven security you need to prove properties of all the layers down to the metal."

Proving all layers logically correct would be a great start :) We could probably have avoided this whole Heartbleed debacle.

If you'll allow me to elaborate on the security properties, it's worth pointing out that even if your TLS implementation ran on top of a formally-proven correct VM and used formally-proven correct libraries, and ran on a formally-proven correct kernel and processor, you still have to prove that it resists side-channel attacks before it can be said to be secure. Doing so is open-ended, and likely impossible, since side-channel attacks come in all shapes and sizes, and some are outside the scope of software engineering. For example, how do we prove that a logically-correct TLS implementation is secure against acoustic analysis? [1] How would we have done this before it was known that acoustic analysis was possible? This is the essence of what makes writing secure code hard.

[1] https://www.cs.tau.ac.il/~tromer/acoustic/


You can only prove it secure against known side-channel attacks. In a language with a good type system you would probably use types that support only constant-time operations.

Protecting against everything including side-channel attacks that are unknown at the time of implementation will likely never be possible, regardless of language. Thus for language choice this is a non-issue. Still: if you can avoid the stupid bug-holes and just leave the exotic side-channels that would be a big win.


> I agree with what you are saying: for proven security you need to prove properties of all the layers down to the metal. Not sure how difficult this is for good "safe" languages such as e.g. Haskell.

I wonder if optimizations affect the area of code that would need to be verified? Haskell is pure, and this fact is exploited in a lot of optimizations and code generation. I guess one would have to prove that all of these transformations preserve the integrity of the Haskell semantics.

I guess you could have a small (compared to GHC) implementation of Haskell which is formally verified, and which is small because it skips nearly all optimizations?


Side-channel attacks are my primary concern with writing secure code. For example, if Haskell's optimizer makes some decryption paths run faster than others for an attacker-chosen ciphertext, the attacker could use the runtime differences to perform a timing attack [1] and potentially learn your secrets.

[1] https://en.wikipedia.org/wiki/Timing_attack
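
For the curious, the canonical shape of the leak, sketched in Rust (illustrative only; a production constant-time comparison also has to stop the compiler from optimizing the loop back into an early exit):

    // Leaky: returns at the first mismatching byte, so the running
    // time reveals how many leading bytes of the guess were correct.
    fn leaky_eq(a: &[u8], b: &[u8]) -> bool {
        if a.len() != b.len() { return false; }
        for i in 0..a.len() {
            if a[i] != b[i] { return false; }
        }
        true
    }

    // Constant-time: always scans every byte and ORs the differences
    // together, so timing is independent of where the inputs differ.
    fn ct_eq(a: &[u8], b: &[u8]) -> bool {
        if a.len() != b.len() { return false; }
        let mut diff = 0u8;
        for i in 0..a.len() {
            diff |= a[i] ^ b[i];
        }
        diff == 0
    }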


We really need to stop writing network code in unsafe languages, period.


But even 'safe' languages have unsafe VMs. There is no question that over the years these VMs (or the runtimes) have had equally severe vulnerabilities.

The thing that really sets Heartbleed apart is not the details of the bug; it's the scale of the "infection". OpenSSL is a core dependency of so many distributions and so many pieces of software.

I think we could argue for lots of different solutions (diversity of implementations, safe languages, more tests) and all of them might be good in one way or another but none of them are a silver bullet for any possible bug either.

I think the takeaway is that you need to be in a position to upgrade any part of your software stack at a moment's notice, not just the obvious top level (e.g. Rails/Django/Jetty).


> these VMs (or the runtimes) have had equally severe vulnerabilities

I just looked at all the CVEs for .NET (62 of them). I did not find any related to reading outside memory bounds or running arbitrary code. All the code-execution vulnerabilities were related to loading code or escaping sandboxing: irrelevant unless you're running untrusted code in the first place.

A handful of them were due to calling out to an unsafe native library, such as for rendering fonts.

The other serious ones were logic errors, for instance, ASP.NET returning file contents when it should not.

So while technically the VMs/runtimes have bugs, they aren't remotely of the same severity.


.NET isn't something I work with but that's good to hear.

Maybe you could tell me why this one doesn't count though? http://technet.microsoft.com/en-us/security/bulletin/ms10-06...

This is just the first one I found. Sorry, I'm not trying to be awkward; I just don't work with the CLR/Silverlight. What in your mind prevents this remote-execution exploit from being serious? Its CVE rates it 9.3, and Microsoft claims it allows remote execution on a server too (under some circumstances).


> The vulnerabilities could also allow remote code execution on a server system running IIS, if that server allows processing ASP.NET pages and an attacker succeeds in uploading a specially crafted ASP.NET page to that server and executing the page

Like he said, this matters if you're running untrusted code from potentially malicious people. It's not a serious bug if you're running well-intentioned but potentially buggy code, like OpenSSL.


>A remote code execution vulnerability exists in the Microsoft .NET Framework that can allow a specially crafted Microsoft .NET application

An attacker has to get the user to run their application. If you can get the user to run arbitrary executables, usually you've already won. It's only news in this case because .NET, Silverlight, Flash, Browser JS, Java Applets, etc. offered a sandbox.

It would not have any impact on applications a user is running.


Rust allows you to write safe code without a VM. It is statically checked to be memory and concurrency safe.


1. I think a lot of people believe Rust will just type-check any old program and tell you when it has faults. So you can start with a bit of Ruby/C/Python, translate it to Rust and presto, all your bugs are exposed for the world to see.

In practice Rust's type checker accepts only a _very_ small subset of correct programs. I've been in a position to write some decent sized Rust code recently and it takes a shift in your mindset to start writing decent Rust code.

Even now there are patterns I'm unsure how to model in Rust. Arena allocation is a good example, because it was partly a cause of Heartbleed too. Arena allocation in Rust seems to require unsafe pointers and unsafe code blocks; you can look at Rust's standard library and see this.

2. The point being that the Rust language exposes unsafe code blocks and pointers. At some point you're going to hit those blocks (if nothing else in 3rd party code) and you're back to square one: You need to trust unsafe code that it is correct. It doesn't matter if that code is a VM or unsafe code.

*edited for some legibility.


The argument Rust devs make is that most of the time you will not need unsafe code, and that when you do, being explicit about it makes you more careful and think twice about it.

To me it makes sense, and the example you give here is very relevant. First you'd try to do it within the standard language bounds, and only when you realize you can't do it that way would you resort to unsafe code. But now you're very aware that this part of the code needs to be treated with extra care. So, to me, you're not completely back to square one.

Nicholas Matsakis makes this very point near the end of this talk: https://www.youtube.com/watch?v=9wOzjbgRoNU

I would even add that if care is taken to keep that unsafe code really small, it could even be generated by Coq, for instance, as stated in some comments here.

That said, Rust might not be the best tool out there for the job, but IMHO it shouldn't be dismissed too fast either. It is similar enough to C++ to allow a less painful transition for devs with the domain knowledge.


Would you not assume that the entire OpenSSL library would count as being in need of extra scrutiny? The point is that any time you let people directly access memory, they can, and often will, screw it up.


Ok, maybe we should make a distinction between, let's say, the plumbing code and the algorithms. Rust could help with the former. According to some comments I've read here, it seems OpenSSL is using its own abstraction over malloc/free (not that I have actually read the code). I suppose that this part of the code would be a suitable candidate for unsafe code with special extra care taken; then the rest of the algorithm does not need to be unsafe code. If you watch the video you might understand better what I mean: the Arc is implemented with unsafe code but provides you with a safe abstraction to use in the checked part of the language.

Of course such a project would require extra scrutiny at all levels, and Rust does not resolve all the problems. I'd say pick your battles. Rust provides some interesting middle ground between C/C++ and a completely different language like Ada.


Oh, I'm not dismissing Rust. I think my overall point is that nothing is ever going to be perfect. Plan for imperfection.


> It doesn't matter if that code is a VM or unsafe code.

What does matter is the amount of unsafe code to trust. It's much easier to check that a small area of clearly-marked unsafe blocks does "the right thing" than if your entire program is a gigantic unsafe block.
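
As a concrete illustration of "small, clearly-marked unsafe", here is a from-scratch sketch of the pattern Rust's own standard library uses for split_at_mut:

    use std::slice;

    // A safe API over a tiny unsafe core: hand out two non-overlapping
    // mutable views into one slice. The borrow checker cannot see that
    // the halves are disjoint, so that argument lives in the few lines
    // of unsafe code below instead of being smeared across the program.
    fn split_mut<T>(v: &mut [T], mid: usize) -> (&mut [T], &mut [T]) {
        let len = v.len();
        assert!(mid <= len); // the invariant the unsafe block relies on
        let ptr = v.as_mut_ptr();
        unsafe {
            (
                slice::from_raw_parts_mut(ptr, mid),
                slice::from_raw_parts_mut(ptr.add(mid), len - mid),
            )
        }
    }

An auditor only has to convince themselves of those few lines; every caller gets memory safety for free.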


I haven't used it, but Rust provides a safe Arena abstraction for you: http://static.rust-lang.org/doc/master/arena/index.html


Those aren't the only "dangerous" kinds of mistakes a programmer can make though


Just because you can't design the perfect car, it doesn't mean that you should not improve the brakes.


No, but it eliminates a large chunk of the more common cases.


ATS doesn't have a VM. It compiles directly to C. The example given in the article doesn't use a runtime and the resulting C code has no more overhead than the original C.


The VM is a much smaller attack surface than all applications and all their libraries.


That is true, but I don't think this invalidates my central point that the rational response to Heartbleed is to update your contingency plans. The irrational response is to plan for the perfect world.

After 20 years of Java we still don't have our perfect VM. It still sees critical security vulnerabilities. I don't think I'm picking on Java unfairly here. Java is a well written code base, it has plenty of unit tests, a proper code review process, a sound architecture. Pretty much all of these have been put forward as ideas that would 'cure' the OpenSSL project. Yet it doesn't seem to be a perfect cure. At some point no matter what your language, VM, OS is you are going to experience something similar.


> After 20 years of Java we still don't have our perfect VM. It still sees critical security vulnerabilities.

Sure, but how many, and how often? The last advisory for Java's SSL I can find is from 2009, and that was quite a limited flaw (allowed an attacker to inject a prefix into SSL data). Indeed the kind of exposure we see with heartbleed - leaking all of the process's memory including the private key - is more or less impossible by design. At this point maybe using Java for your internet-facing service might do more to improve your security than shaving a day off your response time.


Sorry, I'm not limiting the discussion to SSL vulnerabilities.

A remote code exploit is at least as bad as a memory disclosure.

I posted these two: CVE-2013-1493 and CVE-2013-0809 in another reply. These 2 were memorable to me just because visiting a page (or a compromised page) would allow the exploit to proceed without any password/prompt/warning.


CVE-2013-1493 is more an argument for Java - the vulnerability exists because that part of the standard library is implemented as native code rather than in Java itself.


As others have pointed out above (https://news.ycombinator.com/item?id=7572092), the 'perfect JVM' is missing the point. The JVM aims to provide two things:

1: A high-level development environment which allows well-intentioned developers to avoid, say, buffer-overflow bugs

2: A sandbox, in which untrusted code can be safely run

Java has a truly awful track-record on point 2 (running untrusted applets by default? awful idea), but a much better one on point 1, which is what's actually relevant here.

> At some point no matter what your language, VM, OS is you are going to experience something similar.

No. If all/nearly all of your OS is written in a safe language, it's going to be much safer from, say, buffer-overflow attacks. Unfortunately there aren't any such languages in major production use, so it's hard to point to concrete numbers.


s/languages/OSs/


So we offload the safety to the people that write the runtimes.

I understand that that centralises the potential problem area and may make it easier to address... but it still means that someone has to do the 'hard' bits, and if they get it wrong then everyone using the runtime is screwed. Just like what happened here (too many people depending on a single implementation).

I don't know about you, but I'm not 100% comfortable with the idea that other, cleverer people will take care of all that for me, so I don't have to worry my pretty little head about the details of what's really going on with the machine.

And look at all the JVM vulnerabilities we've seen recently...


Speed or security?... Age old question.



Actually, even "modern" C++ is a step up from plain C. That being said, I am really looking forward for Rust.


LOL "* In theory. Rust is a work-in-progress and may do anything it likes up to and including eating your laundry."


It's a shame to see that comment has been downvoted. That's a quote directly from the bottom-right corner of the Rust website itself!

Rust is promising, without a doubt. But it's not yet truly usable in the same sense that C, C++, Java, Python, Haskell, Go and so many other languages are.

Maybe it'll start to get to that point once 1.0 is released, once we see at least some language and library stability, and then perhaps some adoption. But that just hasn't happened yet.


I downvoted it because "LOL" is not the kind of comment I'd like to see here. The point could have been made in a more substantial way. Like you just did.


Wow, sorry I can't laugh at something, jeez. So, you have never in your life just felt like re-posting a quote off something and adding a little something to it to show the spirit in which it was meant? Now you are just being nitpicky and, to be honest, rude in a sense. I have just joined this community, I am trying to fit in, and you just come along and see the comment and you "don't like it" because it's short, sweet and to the point. I am laughing at the quote the programmer of Rust put on his site, and now you have just totally bashed me because you felt it necessary to not like my simplistic comment. Wow.


> you have never in your life just felt like re-posting a quote off something ...

I do, but I do that on Twitter, because, as you've found out, HN will downvote you into oblivion.

> I have just joined this community I am trying to fit in

Ah ha! Sorry, I didn't see that: usually, new users are in green. (also, your account is 163 days old?) If you haven't checked it out, you should check out the community guidelines: http://ycombinator.com/newsguidelines.html

For what it's worth, I am not trying to 'bash [you]'... but I don't think this was a great comment. Try to keep them more substantial here. Different forums are appropriate for different kinds of discourse, and short little comments are generally not taken very well here.

The same happens to "+1", "thanks", and "interesting!" comments. If you can't write more than two sentences, you probably shouldn't post.


What someone - I think kibwen - has brought up is that early adopters can benefit in the sense that the language design is still in flux. So these early adopters can uncover weaknesses in the design before they get to the stage where they have to consider backwards compatibility.

So although early adopters might not get any useful software out of learning Rust at this stage, they might indirectly improve their future Rust code by having a small influence on the direction of the language.


If you are speaking about execution speed, you got the idea wrong. From a quote in the article:

> If you use the high level typing stuff coding is a lot more work and requires more thinking, [...] (but) you can even hope for better performance than C by elision of run time checks otherwise considered mandatory, due to proof of correctness from the type system. Expect over 50% of your code to be such proofs in critical software and probably 90% of your brain power to go into constructing them rather than just implementing the algorithm. It's a paradigm shift.

The idea is to formally prove that the code is not doing unexpected things. The process is relatively simple to understand:

First you define the assumptions you make about the program, its execution environment, and the acceptable/expected results of your program. This is known as "formal specification" of the program. It is a critical part. If your specification is wrong, then the whole approach breaks down. However, this part should be much smaller than your whole codebase, and hence you can be extra careful on it.

Next, using this specification, you write proofs showing that the code cannot do anything unintended (such as accessing a buffer out of its valid range). The compiler goes through these proofs and checks that everything is provably correct (according to the specification). Then it can generate code without the runtime checks that you would otherwise probably implement, because it is sure that certain things cannot happen. As a result, the code may end up being actually faster.
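
The article demonstrates this with explicit ATS proofs; as a rough and much weaker analogue in a conventional language, establishing a fact once lets the per-access checks disappear (here the optimizer stands in for the proof checker, so this is only a sketch of the flavour):

    // One bounds check up front establishes that n <= xs.len(); inside
    // the loop every index is provably in range, so the compiler is
    // free to elide the per-iteration checks instead of running them
    // n times.
    fn sum_first_n(xs: &[u64], n: usize) -> u64 {
        let head = &xs[..n]; // panics here, once, if n is out of range
        let mut total = 0;
        for i in 0..n {
            total += head[i];
        }
        total
    }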

Although a bit involved, the idea should be pretty intuitive. It is exactly what you are doing in your mind when programming. The main differences are:

1. We humans are pretty comfortable working with inexact and/or incomplete specifications. Then some undefined behavior happens, and our programs bug out. For instance, it is very easy for us to think about the division operator as something that always yields a value, ignoring the "division by zero" edge case. Computers are not, and force you to specify what exactly should happen when you encounter such edge cases.

2. We are also pretty bad at exhaustively checking every possibility, whereas computers excel at it. With the help of human-written proofs, of course (otherwise verifying a program would involve checking every possible input for it, which is obviously intractable).

TL;DR: The tradeoff here is between development and compilation speed versus correctness, which implies improved security and execution speed.


The type safety shown in the article doesn't come at a speed cost. The types are erased during code generation. The generated C code is much like hand-crafted C, but with the safety confirmed via the types.


no. read the article.


I read the article; it does not mention speed or performance once, so it has nothing to do with what I stated. I was simply stating that higher-level languages will make the library slower and also less easy to use from other high-level languages like Python.


The author has several articles about ATS (http://bluishcoder.co.nz/tags/ats/) and from what I've read, it outputs C code that is proven (for the parts that are in ATS) to be correct. There's a bit more detail in "Safer handling of C memory in ATS" (http://bluishcoder.co.nz/2012/08/30/safer-handling-of-c-memo...) and the end of the article contains some generated C code.


I do not understand why people keep pushing unsafe code when computers keep getting faster and we have more and more headroom (cpu, memory, bandwidth). There is no excuse to keep running unprovable crypto.


Maybe because most of the code currently out there being used by the biggest companies in the world is still in these "unsafe" languages, and tons of the job market is still in these "unsafe" languages.


I think you should read the article again. The language in question isn't really higher level; it just has compile-time type checking, which has no overhead.


That's still not my point. At the time OpenSSL was started, I don't believe ATS was around. In any case, my point is that back then C was the best choice for performance, and it is still revered as the "fastest", as nearly all other languages are built on top of it either directly or indirectly. In any case, I would love to see someone tell the whole OpenSSL community to just drop C and switch to a different language.


C being loved in the UNIX community is one of the primary reasons that these libraries are in C.

Ada has been around since the beginning of the eighties, has and had performance that is near that of C, does not use a garbage collector, provides C linkage, and is far more safe than C.

If you do allow garbage collection, there were many performant and safe alternatives in the '90s, such as ML.

It's culture as much as performance.


Ada is a higher-level language whether it has linkage to C or not. The UNIX community cares about performance, performance, performance.


'Higher level' doesn't necessarily say anything. Rust and ATS are both higher level than C, but they can both do everything that C does.

Is Ada less performant than C? I know it has bounds checking, but that can be turned off for "shipped" software. Does it have some features that incur a runtime cost and that can't be disabled?


Ada with all the runtime features left on is slower than comparable quality C. It's faster than most languages though.

With all the runtime checks turned off, GNAT can/should produce code within a percent or two of GCC's (they share the same backend).

And Ada has a thing called SPARK which is a set of compiler checks to formally verify your code so you can provably turn off those runtime features safely. https://en.wikipedia.org/wiki/RavenSPARK


So rust & ATS & ADA & higher level languages can modify memory space? That I am aware of most higher level languages stray from being able to modify memory space on purpose as it's dangerous but, someone has to do it for the operating system is all I am saying about low level now that we are completely off topic here.


Don't know about Ada, but Rust and ATS can. The Rust code would need to be written in an unsafe block in order to modify memory as freely as C, but regular Rust can still do a lot, safely, without requiring automatic memory management.

As you can see from the article (not that you seem to have read any of it), ATS can express C, and optionally prove low-level stuff about it.


No. I think that the article clearly shows:

execution speed, safety, programmer productivity ← pick any two


This isn't a fundamental dilemma like the consistency/scalability dilemma of databases. This is (or was) just a limitation of languages and compilers. The arguments for using C are many, but in this case the most common involve the need for low-level access (for perf, timing).

C is certainly very well suited for some parts of an SSL implementation, e.g. when you need absolutely deterministic performance to avoid timing attacks etc. (Although performance should certainly be good enough with modern compilers for most languages, and avoiding side-channel attacks by having deterministic execution time is also possible without resorting to C.)

Using the execution speed as an argument for writing the whole thing in C is just wrong. I haven't heard any good arguments as to why a library such as OpenSSL shouldn't be written in Haskell (or say 98% Haskell and 2% C).

Did someone at some point say

"There is 2% of the code that is performance critical and/or needs low-level code for cryptographic reasons so I'll write everything including the network code, command line argument parser, world, dog and kitchen sink in C" ?


> This isn't a fundamental dilemma like the consistency/scalability dilemma of databases. This is (or was) just a limitation of languages and compilers.

You're right. Null pointers are a nuisance in some languages, but other languages have shown that you can remove them and still have just as expressive a language (and the compiler can still translate pointers that might be "null" down to actual nullable machine pointers, so there is no performance cost). Rust might show that a stronger type system can remove certain raw-pointer flaws from the language while still retaining both execution speed and programmer productivity.
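
For instance, in Rust the "pointer that might be null" is spelled Option<&T>, and it is guaranteed to cost nothing over a raw pointer (a quick check, runnable as-is):

    use std::mem::size_of;

    fn main() {
        // The None case is stored in the forbidden null value of the
        // reference, so the option is still one machine word; the type
        // system just forces you to handle None before dereferencing.
        assert_eq!(size_of::<Option<&u8>>(), size_of::<&u8>());
    }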

Dependent types might mature to the point that you can use them and gain both execution speed, productivity and safety - time will tell.

> Using the execution speed as an argument for writing the whole thing in C is just wrong. I haven't heard any good arguments as to why a library such as OpenSSL shouldn't be written in Haskell (or say 98% Haskell and 2% C).

> Did someone at some point say

> "There is 2% of the code that is performance critical and/or needs low-level code for cryptographic reasons so I'll write everything including the network code, command line argument parser, world, dog and kitchen sink in C" ?

...

The article makes the argument that, assuming the whole program needs to be incredibly performant, you can write say 2% of it in verified ATS code and the rest in C-ish ATS code (i.e. without proofs).

I guess you could also choose to write the 2% as verified low-level code, and the rest in more high-level ATS - ATS is a functional language with garbage collection and, I presume, other high-level goodies that functional programmers are used to.


I think that C++ is so useful in practice because it gives you most of each of those, rather than just some of two (or even just one) of them.

Programs written in C++ aren't necessarily the fastest out there, but they're usually pretty close. They're at least almost always better than what you'd get when using most other languages.

And the same goes for safety. It may not allow you to write bulletproof code, but using modern C++ techniques can go a very long way toward avoiding many common problems with relative ease.

C++ may not be the most productive language for some developers, but it still does quite a good job of offering a wide variety of functionality, reasonably high-level constructs, good library support, and decent tooling.


C++ has just as much capacity to be unsafe as C, precisely because it accepts nearly all C code, not to mention that it has bare pointers, null references, and any number of other unsafe objects as first-class citizens. Of course, you're unlikely to use most of those if you're following best practices, but the same can be said of C.

You might argue that C++ makes it easier to be safe because of its plethora of features and classes, but the massive size of C++ makes it a very hard language to master, and it's nearly impossible to guarantee that everyone will conform to best practices. C++ is quite possibly the largest language out there in terms of features, with about the most gotchas (things that don't work the way you'd intuitively expect them to) that it's possible to put in a language. It's a step above C in terms of safety, but it is hardly a safe language, and it's only slightly more productive, because the benefits of its class system and standard library are so much at odds with its huge mental overhead.


Maybe I should add another attribute to my original three - language complexity. :)


Since the 70's we have had lots of safe programming languages with native compilers to choose from:

Ada, Modula-2, Modula-3, Oberon, Oberon-2, Active Oberon, Component Pascal, Delphi, Turbo Pascal, Turbo Basic, Quick Pascal, D, Haskell, OCaml, Eiffel, Go, ...

C only got widespread into the industry as a side effect of UNIX's adoption.

Nowadays we pay the price for it.

Security conscious developers should only use C when there isn't any way around it.


I'm trying to learn ATS right now, and indeed I immediately thought of Ada. One of the advantages of ATS is that it is a lot more functional, but it is far from being anywhere near as mature yet.


There seems to be a misconception that a safe language implies inefficient and slow code. It is not so in ATS2: the generated code is quite efficient, and it is safe even when manipulating memory from ATS2.


There will always be a compromise between cost, performance, features, usability and security.

'Safe' programming languages will improve security, presumably at the cost of usability through decreased performance. A bit like how a Ferrari is faster than a Volvo, but not as safe.

Performance has been a major driver in the choices made so far. I'm sure that the Heartbleed affair will move the needle towards the "security" end of the spectrum, but I doubt it will move enough to drop C as the main workhorse of systems software coding.

Reducing the complexity of the protocols used would seem to me a better way to reduce the exposed attack surface. No matter the language used, if you make a system extremely complex, bugs are going to be more numerous and, due to the interactions between the various parts, much harder to detect.


>'Safe' programming languages will improve security, presumably at the cost of usability through decreased performance. A bit like how a Ferrari is faster than a Volvo, but not as safe.

This is fundamentally untrue. Safety of the kind described in the article does not come at the cost of performance -- all type checks, invariant conditions and general formal proofs of correctness are discharged at compile time. The produced code is indistinguishable from a correctly written, efficient C program. You are allowed to play as much as you like with direct memory accesses and use all the dirty tricks you like, as long as you can prove that the resulting program is formally correct.

What could be argued is that price you pay for all this is the difficulty of writing such programs. But definitely not their performance.


Ok, point taken: if you take the pre-processor or code-generator approach of the article, the performance hit might be manageable, but when I look at that 'memcpy' example I can't help but suspect that it is not a free lunch (as in, that all the extra work is really done at compile time). For that you'd need to look at the generated code.

Even if there is a zero performance hit in this case, the majority of "safe" languages are neither pre-processors for C nor code generators for a C-like language (which would presumably also require the linked libraries to be rewritten in something safer).

So in the specific case outlined here this may be true but in the more general case there are usually run-time trade-offs involved.

I think the key operative word in your comment is "correctly": writing correct C is extremely hard, and this approach makes it harder to create a certain class of bugs at the expense of making it harder to write the program in the first place. A tough choice, even in the absence of a performance hit!


In the example given in the article there are no run time costs. The generated C code is much the same as the original C code from OpenSSL. There is no ATS runtime that gets included either.


For anything operating over internet latencies and bandwidth, writing it in an unsafe language makes no sense. RTT will dominate; that compromise is a false dichotomy. For real-world scenarios a safe language will be indistinguishable in performance and orders of magnitude closer to the goal of robust encryption.

Speed gets you this: http://www.openssl.org/news/vulnerabilities.html. And now where do you stand?


OK, help me out here...

If I understand the heartbleed bug correctly, isn't there an RFC that it is implementing? Does the RFC say that you can ask for a length, and you get that length? That is, isn't this really a bug in the RFC?

If the security flaw's in the spec, there's not too much that the programming language can do to help you. The best it could do is point out that you're overrunning a buffer, at which point the programmer has a choice: Proceed to implement the spec in code that is declared to be unsafe (useless), scream about the spec (the right answer, and useful if anybody listens), or find some other way to generate the extra bytes that are supposed to be in the reply (which makes this implementation safe, and still satisfies the flawed spec).


I feel like every one is using heartbleed to pitch their products.


I don't think the author makes any claim to have invented ATS. And why not show how an alternative technology would have avoided this issue?


Just because this alternative language would have avoided this bug does not mean a much worse bug would not have been created with a higher-level language or even ATS.


Well, people can write bad/unsafe code in any language. But ATS can remove entire classes of bugs from a program, while C is notorious for its lack of safety. Though obviously, it has no more built-in protection from side-channel attacks than C.


Basically what it all cuts down to is the laziness of how it is developed and how many people are actually looking over the entire code. ATS lets the developers know that they can be even more lackadaisical about coding, as ATS will remove bugs for them...


"There are collisions because people don't pay enough attention. Imagine if we installed a collision detection system, people would pay even less attention!"

I hadn't heard this line of argument before.


So Rust has unsafe code blocks that you use when you can't use the type system to enforce that a piece of code is safe. I wonder if it would be viable to write those things in ATS with all the proofs to guarantee that the code is safe, compile them to C, and then call that trusted C code through the FFI?

It might not work in all cases, since I have the impression that unsafe Rust code is even more expressive when it comes to low level code (I guess this also means more unsafe) than C.


Rust code within "unsafe" blocks is still way, way safer than C code, while still being almost uniformly more powerful than C.


You can write unsafe code in any language. It's not the language's job to protect you from yourself, that's what tests are supposed to do.


Actually that's what types are supposed to do ;)

Tests are still useful, as sanity checks and for checking empirical properties (eg. wall-clock time usage, timing attacks, etc.)


Consider the scenario where my programme is dangerous for values of an unsigned integer less than or equal to 2. As far as I know, there is no such type for unsigned integers equal to 3 or more. Sure, in object oriented languages you can essentially define your own type, but this flexibility is where you are not protected against yourself.

I see your point about types though. That said, both types and testing exist to protect you. I stand by what I said, however; languages do not exist to protect you from yourself.


The usual idea is that while tests demonstrate the existence (and subsequent removal) of particular bugs only types (or proofs) show the non-existence of whole classes of bugs.

Such integer checks as you suggest are nothing more than a number paired with a proof that the number lies within some bound. The properties of those proofs cannot be faked, and their proper management and construction is all handled at compile time... then erased.

I challenge you to write a red-black tree with the RB invariants exposed in the types and then continue to say languages do not exist to protect you from yourself. The moment these things end up in types the language starts to pull back on you.


http://faculty.cs.wwu.edu/reedyc/AdaResources/bookhtml/ch05....

Ada, and probably other languages, allows subtypes that restrict usage to a range of some other type. Two sections below that are modular types, so you don't have to put in the `mod 500` or whatever your code needs every time; the compiler will fill that in for you as long as you are using that type.


In ATS2 you can do it like this:

    fun foo {n : int | n > 2} (x : int (n)) : void = println! (x)

Now calling foo (10) will type-check, but foo (2) or foo (1) will not.


> Consider the scenario where my programme is dangerous for values of an unsigned integer less than or equal to 2. As far as I know, there is no such type for unsigned integers equal to 3 or more. Sure, in object oriented languages you can essentially define your own type, but this flexibility is where you are not protected against yourself.

You can specify that type in a dependently typed language. Then, if you can prove that values of that type cannot violate your requirement, there is no runtime overhead.

Are there things which are unknowable or unprovable in a dependently typed system? Sure. But I think your initial assertion - that all languages are unsafe to a degree and therefore it can't be their job to protect you against mistakes - is unhelpful; it muddies the water by appealing to the fact that "nothing is perfect/no approach is perfect". The whole point of type systems is to eliminate certain classes of bugs; the rest can hopefully be caught by other, less rigorous means, like fuzz testing and unit testing. All other bugs are relegated to problems which are (in general) undecidable.
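
For comparison with the ATS example elsewhere in the thread: in a language without dependent types you can get most of the practical benefit with a smart constructor, at the cost of one run-time check at the boundary (a minimal Rust sketch; the type name is made up):

    // The field is private, so the only way to obtain an AtLeast3 is
    // through `new`, which enforces n > 2 exactly once. Everything
    // downstream takes an AtLeast3 and never re-validates; with
    // dependent types even this single check is discharged statically.
    #[derive(Clone, Copy)]
    pub struct AtLeast3(u32);

    impl AtLeast3 {
        pub fn new(n: u32) -> Option<Self> {
            if n > 2 { Some(AtLeast3(n)) } else { None }
        }
        pub fn get(self) -> u32 { self.0 }
    }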


I think there is an important and quite general point here: it is not just about programming languages, but also about programming knowledge and skills. You are replying to someone who was unaware of the state of the art in program verification (to be fair, he recognized the issue to be solved, which is an important start.)

As the example in the original post demonstrates, programming in languages that have this level of support for verification is very different from programming as it is currently commonly practiced. Not everyone will be capable of making the switch, and for an organization to simply say 'from now on, we are going to use this safe language', without addressing the skills issue, is setting up for failure.


> I think there is an important and quite general point here: it is not just about programming languages, but also about programming knowledge and skills. You are replying to someone who was unaware of the state of the art in program verification (to be fair, he recognized the issue to be solved, which is an important start.)

Well let me be clear that I knew of program verification because I'm a PL geek, not because of any skill whatsoever.

> Not everyone will be capable of making the switch, and for an organization to simply say 'from now on, we are going to use this safe language', without addressing the skills issue, is setting up for failure.

Well, let's keep the discussion to programmers who really need the things we are after - safety and efficiency. It's not all programmers, just programmers in some domains. Some people might even think that some people can't adjust to writing low-level code a la C, period. But some domains need these things, which means that we just need the programmers who are motivated enough/have the patience to learn it. Just those programmers, not all programmers.

If we can't get them, then maybe some one will actually have to offer some incentives like money - instead of a purely volunteer effort as I think was the case in this debacle. :)


I wasn't intending to cast doubts on any particular individual's skills, which is why I wrote 'knowledge and skills'. I am learning this stuff myself.

You make some good points about where safety matters most, but I think a greater general awareness would help drive adoption where it matters. Furthermore, while this problem had widespread consequences due to it being in widely-deployed system- or middle-level software, 'ordinary' programming can have quite serious vulnerabilities, too.

I think schools, especially below the first tier, could do more to promote awareness of static verification and other safe practices, and that might modify the way their graduates approach development, even though they probably will not be using formal methods.

There are things that can be done to improve safety in general-purpose programming languages. I feel certain that garbage collection and the avoidance of raw pointers have made programming safer, but I suspect "duck" typing has had the opposite effect.

In the past, the DOD has been a driver of code safety, though it has backed down from its possibly ill-advised 'nothing but Ada' position. In fact, Ada might be the counter-example to the idea that you can drive safety through language choice.

You would think the banks would have a vested interest in improving things. Perhaps they could divert a fraction of their bonus payments to create incentives...



