I showed this to my wife to see if the cartoon worked with an educated but non-technical person. She subconsciously glossed over the (n LETTERS) part of Meg's requests as just an annotation on the cartoonist's part, not realizing that it was actually part of the request.
Once I rephrased the final request as "Server, reply with the 500 letters of HAT", we finally had that light-bulb moment.
And don't forget to immediately revoke your old marriage certificate, reissue a new one, and invalidate all your old keys and cookies, or else your relationship is vulnerable to a man-in-the-middle attack, like Dr. Frank-N-Furter did to Brad and Janet! [1]
The two top-voted answers include WAAAAAY more information than the XKCD comic, so they're not really comparable. The third one explains it pretty simply.
That's exactly the point: some experts, when asked for a simple explanation, produce a huge info-dump in simplified terminology, and most experts thought that was an appropriate response (see the upvotes), failing to realize that the result is anything but simple.
It takes a special skill to eliminate all the non-essential information and produce something truly simple to understand, like the comic.
Sure, it's simple if you realize that memory heaps are sequential and the server's private key can be found after the address of a short-lived packet buffer... It is not trivial to explain all that with a simple analogy my grandmother would understand.
> it's simple if you realize that memory heaps are sequential and the server's private key can be found after the address of a short-lived packet buffer…
Your comment seems to imply an out-of-bounds access (a read past the allocated buffer), but heartbleed has no out-of-bounds access.
Instead, it's a problem of malloc (and even more so OpenSSL's freelist scheme) returning non-zeroed memory, which can (and often does) hold previously allocated data. Combine that with read(2) not overwriting the whole buffer and the code not checking read(2)'s return value, and that previously allocated data gets sent back.
I had no problem explaining it to my non-technical family members. The way I described it is that the server would return whatever happened to be there.
That's likely to be an example of DDOS, but isn't at all an explanation of DDOS. It's an explanation of the difference between "the CIA website" and anything actually sensitive.
> The heart beat request sends the text as well as the length it wants back?
The heartbeat sends a payload prefixed by its size. That's perfectly normal design (for variable-size payloads), that way the handler reads the size, allocates a buffer[-1] and uses read(2) to read the payload into the buffer. Otherwise the handler would have to "guess" the payload size, and that never ends well.
The problem here is twofold:
1. read(2) may read less than requested, if an attacker gave a bigger size than the actual one for instance. That's why read(2) returns the number of bytes actually read
2. malloc(3) hands out a bunch of memory, without clearing it[0]. Depending on the exact allocator and application runtime, chances are this bit of memory is at least in part freed memory, which is filled with the content of previous allocations such as SSH keys or passwords or whatever
(2.) is compounded by OpenSSL having its own freelists on top of malloc which it does not clear, making it certain to hit previously allocated data
You're supposed to check the result of read(2) and adjust your payload size and only copy that to the output buffer. Or just error out if the sizes differ.
And ideally, unless you have very specific reasons not to, you'd want to use calloc(3), so that if you forget to check read(2) you return zeroed memory anyway. The first part was forgotten and the second not done (because "needs fasts!"), so the whole input buffer was copied into the output buffer and an attacker gets 64kb[1] worth of previous allocations' data. There's a sketch of the buggy pattern after the footnotes.
[-1] possibly adding its own constraints on top of that, here the payload's 64KiB so it's not relevant, in other contexts the server could refuse overly large payloads
[0] except on BSD with a malloc.conf using the J or Z options
[1] because the user-provided length is a 16 bit uint
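Here is a minimal sketch of that buggy pattern, with illustrative names (not the actual OpenSSL code):

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical handler: claimed_len is the attacker-supplied payload size. */
ssize_t handle_payload(int fd, uint16_t claimed_len, unsigned char *out)
{
    unsigned char *buf = malloc(claimed_len);  /* may contain stale heap data */
    if (buf == NULL)
        return -1;

    ssize_t got = read(fd, buf, claimed_len);  /* may read LESS than claimed_len */
    (void)got;  /* BUG: return value never checked */

    /* BUG: copying claimed_len instead of got bytes sends back whatever
     * stale data malloc left in the tail of the buffer. */
    memcpy(out, buf, claimed_len);
    free(buf);
    return (ssize_t)claimed_len;
}
```

Either fix closes the leak: error out when `got != claimed_len`, or allocate with `calloc(1, claimed_len)` so a missed check can only ever leak zeroes.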
I think you have misunderstood the bug, actually. It has nothing to do with read(2) returning less data than requested. There are two length fields at play here, the SSL record layer length field, and a second length field in the heartbeat field. Both are 16 bits.
OpenSSL correctly reads the whole packet using the SSL record length. It then passes the packet off to the `tls1_process_heartbeat` function, which uses the second length field to do (variable names changed because the originals were terrible):
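(A reconstructed sketch of that logic, not the verbatim source; `n2s` is OpenSSL's macro for reading a 16-bit length off the wire:)

```c
unsigned char *p = &s->s3->rrec.data[0];  /* incoming packet, already fully read */
unsigned short claimed_len;

hbtype = *p++;
n2s(p, claimed_len);                      /* attacker-controlled heartbeat length */
unsigned char *payload_start = p;

buffer = OPENSSL_malloc(1 + 2 + claimed_len + padding);
bp = buffer;
*bp++ = TLS1_HB_RESPONSE;
s2n(claimed_len, bp);
memcpy(bp, payload_start, claimed_len);   /* reads past the incoming packet
                                             when claimed_len > actual payload */
```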
So when the two length fields disagree, SSL's network code correctly reads the short packet, and the heartbeat code incorrectly reads past the end of the incoming packet buffer and copies it into the outgoing packet buffer.
Edit: As a shortcut to establish my bona fides, I wrote a honeypot for this issue so I ought to know what I'm talking about:
Zeroing memory does not help in the general case unless you allocate an input buffer large enough that you can never overrun it; otherwise you can still hit whatever follows your input buffer.
Whether the input buffer was guaranteed to be large enough in this particular case I don't know, but I can imagine an implementation that does not allocate the buffer for the 64 kiB worst case but only large enough to contain the actual request.
> If the input buffer was guaranteed to be large enough in this particular case
It was: since it used the same size to allocate the buffer and to call read(2), it would never put more data than expected in the buffer.
> otherwise you can still hit whatever follows your input buffer.
Yes, but that's not the issue in heartbleed. My comment was about heartbleed, not about covering all the ways in which you can fuck up memory access in C.
> I can imagine an implementation that does not allocate the buffer for the 64 kiB worst case but just large enough to contain the actual request.
"just large enough" is impossible, you'll always over-allocate by at least 1 byte, and then to get the actual best precision you have to read the input data a byte at a time, performing a read(2) per byte. That's both slow and less readable.
"just large enough" is impossible, you'll always over-allocate by at least 1 byte, and then to get the actual best precision you have to read the input data a byte at a time, performing a read(2) per byte. That's both slow and less readable.
You can do this. First allocate for and read the fixed-length part, then determine the length of the variable-length part, and finally allocate for and read the variable-length part. This may of course still return less data than expected and leave you with uninitialized memory. And you may of course receive a larger buffer than you asked for.
> Yes, but that's not the issue in heartbleed. My comment was about heartbleed [...]
Of course, I just wanted to say that zeroing memory may not be sufficient in the general case without bounds checking, because sometimes people have or get the impression that this would be a good and easy fix.
> then determine the length of variable length part
That's the part you can't do: you're reading data from a socket, so you can't skip around with fseek(3); you read(2) or you recv(2), and if you don't store your data somewhere you lose it.
Allocate a fixed-length buffer for the header and read the header, inspect the length field and allocate a second buffer with this length, and finally read the variable-length part into this second buffer. Maybe check that there is no trailing data. This is what you would probably do anyway if the variable-length part could be way larger than 64 kiB and just blindly allocating for the maximum length is not an option.
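A sketch of that two-phase read, assuming a 3-byte header (1-byte type, 16-bit big-endian length) roughly like the heartbeat message:

```c
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Read exactly n bytes, looping over short reads; 0 on success, -1 on error. */
static int read_full(int fd, unsigned char *buf, size_t n)
{
    size_t off = 0;
    while (off < n) {
        ssize_t got = read(fd, buf + off, n - off);
        if (got <= 0)
            return -1;               /* error, or peer closed early */
        off += (size_t)got;
    }
    return 0;
}

int read_message(int fd, unsigned char **payload, uint16_t *len)
{
    unsigned char header[3];
    if (read_full(fd, header, sizeof header) != 0)
        return -1;
    *len = (uint16_t)((header[1] << 8) | header[2]);  /* declared length */

    *payload = malloc(*len ? *len : 1);  /* sized for the actual request */
    if (*payload == NULL)
        return -1;
    if (read_full(fd, *payload, *len) != 0) {  /* short read is an error, not a leak */
        free(*payload);
        return -1;
    }
    return 0;
}
```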
Right. Now you've got significantly more code (and even more chances of getting it wrong), more allocations, and you can still fail to correctly handle read(2) not completely filling a buffer and leaking data. Although most likely less than in heartbleed.
How did we end up here? My initial point was that zeroing memory does not prevent leaks in the general case and we both agree on that. I never claimed OpenSSL should or should not have done something differently.
The bug is really that simple. The fact that you actually get back sensitive information in a high proportion of requests is due to a few other factors that are a little more difficult to explain. Either way it's a severe potential security breach that is directly attributed to this error.
I'm not a programmer but .. why bother asking you what length it's supposed to be if you're going to strlen() it anyway? Sure you might need to allocate resources if this was carrying a proper payload but here we're just talking about an index. The spec not specifying the max length of the payload seems to be the problem??
The maximum length is specified, 65535 bytes, because that is the largest number that fits into the length field. If you have a variable length field you always have to specify the length, otherwise you will never know the actual length and are for example unable to read the next field because you don't know where it starts. There is no real next field in the case of the heart beat packet, only padding, but you still have to figure out the length in order to know what to return.
You have the option to specify the length implicitly by using a terminator, like the zero byte in a zero-delimited string, but this is sometimes a bad idea. First, you have to scan the entire field just to figure out its length, while with an explicitly stored length before the actual data you can just read the length and, for example, skip to the next field without ever looking at the field. Forgetting to properly terminate the field will of course get you in serious trouble, too. And last but not least, using a terminator only works if you have an unused symbol (or are willing to perform escaping). In the case of the heartbeat extension there are no restrictions on the content, you can put in arbitrary binary data, and therefore something simple like a zero terminator does not work.
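A quick illustration of why a terminator fails on binary data (hypothetical framing, not the TLS wire format):

```c
#include <stdio.h>
#include <string.h>

int main(void)
{
    /* A payload with an embedded zero byte. */
    const unsigned char data[] = { 'H', 'A', 'T', 0x00, 'X' };

    /* Terminator-based framing stops at the first zero byte: */
    printf("strlen sees %zu bytes\n", strlen((const char *)data));   /* 3 */

    /* Length-prefixed framing carries the real size out of band: */
    unsigned char framed[2 + sizeof data];
    framed[0] = (unsigned char)(sizeof data >> 8);   /* 16-bit big-endian length */
    framed[1] = (unsigned char)(sizeof data & 0xff);
    memcpy(framed + 2, data, sizeof data);
    printf("prefix says %d bytes\n", (framed[0] << 8) | framed[1]);  /* 5 */
    return 0;
}
```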
Using UDP over IPv4, isn't the max payload size for a packet 65,507 bytes? That would mean that the max length that can be specified is already going to be a potential buffer overflow. What happens if you specify a zero length? It seems that if the protocol were being defined loosely, it should allow for "any length of field that fits within the transport packet" and so allow for larger datagram packets (a la Jumbogram).
Is there a trend to leave definitions of these protocols as loose as possible (eg not specifying the response field in the heartbeat as a double/long or whatever) so as to allow hacking it to serve some other use later?
Guess I'm going to have to look at the DTLS protocol definitions ...
TLS is usually used on top of TCP, and therefore UDP limits do not apply. I have no idea how implementations on top of UDP or other datagram protocols deal with that issue. For a payload length of zero I would expect to just get back a payload of length zero, but I am by no means a network or even TLS expert and the standard may well disallow this. And I am too lazy to look it up.
It is probably a really bad idea to make your protocol depend on details of an underlying protocol layer. Just imagine sending something over Ethernet and receiving it at the other end over WLAN, where both protocols may disagree on the maximum packet size. You usually want your standard as precise as possible without imposing unnecessary and artificial constraints. Being allowed to stuff undefined amounts of data into a heartbeat packet just to make it easier to abuse the protocol later really seems to be asking for trouble.
It is common for ping-pong protocols to let the sender specify some data in the ping packet that is returned in the pong packet.
It allows the client to build more abstractions without the server being aware of them at all. For example:
* Setting as payload the time at which the packet was sent allows the client to calculate the round-trip time without adding complexity (you don't need to remember the send time client-side); see the sketch after this list.
* Setting as payload a unique incremental ID allows an event-driven client that may reorder packets application-side to understand the order of the arriving replies.
* Setting the payload to a client-known pattern is useful to detect server-side or network data corruption.
And so forth... ICMP ping-pong packets use the same mechanism.
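A sketch of the first use, assuming a peer that echoes the payload verbatim:

```c
#include <string.h>
#include <time.h>

/* Build a ping payload carrying the send time; the peer echoes it back. */
size_t make_ping(unsigned char *buf)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    memcpy(buf, &now, sizeof now);   /* raw copy is fine: only we read it back */
    return sizeof now;
}

/* On the echoed payload, recover the RTT without storing any client state. */
double rtt_seconds(const unsigned char *echoed)
{
    struct timespec sent, now;
    memcpy(&sent, echoed, sizeof sent);
    clock_gettime(CLOCK_MONOTONIC, &now);
    return (double)(now.tv_sec - sent.tv_sec)
         + (double)(now.tv_nsec - sent.tv_nsec) / 1e9;
}
```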
The heartbeat extension was originally designed for DTLS, which is TLS over UDP. UDP datagrams don't arrive in a guaranteed order, or within any bounded time. If heartbeat used a fixed message, then when you sent a request and received a response over DTLS, you would have no way of knowing whether that response was a response to that request, to some other request you sent ten minutes ago, or what.
Allowing the request and response to carry a payload lets you mark each request with a unique identifier, so that you can recognise the resulting responses.
It's basically like an HTTP cookie. In reverse.
I have no idea why the payload is variable-sized, rather than being (say) a fixed 32 bits. I would guess the specifier didn't want to bake in assumptions about how much space a unique identifier would need.
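A sketch of that matching scheme (assumed bookkeeping, not anything specified by DTLS): the client tags each request with a fresh identifier and keeps a small table of outstanding ones.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_OUTSTANDING 16

static uint32_t outstanding[MAX_OUTSTANDING];
static size_t   n_outstanding;
static uint32_t next_id;

/* Tag a new heartbeat request; the returned ID goes into the payload. */
uint32_t new_request(void)
{
    uint32_t id = next_id++;
    if (n_outstanding < MAX_OUTSTANDING)
        outstanding[n_outstanding++] = id;
    return id;
}

/* A response only counts if its echoed identifier is one we actually sent. */
bool match_response(uint32_t echoed_id)
{
    for (size_t i = 0; i < n_outstanding; i++) {
        if (outstanding[i] == echoed_id) {
            outstanding[i] = outstanding[--n_outstanding];  /* unordered remove */
            return true;
        }
    }
    return false;  /* stale, duplicate, or forged */
}
```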
I'm probably missing something, but if it's going to be of variable size, why not just a null-terminated string? Why does the size itself need to be specified along with the string?
EDIT: Nevermind, this thread [0] has more discussion about it.
> I would guess the specifier didn't want to bake in assumptions about how much space a unique identifier would need.
Yep. Although an up-to-64KB buffer seems like overkill: you'd get one guy getting by on a single byte, while another has a distributed system and wants a UUID or something.
This has puzzled me too. The RFC that describes the extension [1] seems to be silent on the actual rationale for allowing an arbitrary message.
All I can think of is that the designer of the extension wanted to specify a mechanism with enough extensibility to allow requester-generated sequence numbers with arbitrary formats to be sent. Alternatively, if the sender makes the message unpredictable and compares the request and response messages and finds them the same, then this provides some weak "proof" that the responding server is the one that the request was sent to.
I don't find either of these particularly convincing though.
It verifies that you're actually connected to the server and it is listening to you rather than just some dumb box that keeps nodding and replying with "OK" with glazed over eyes regardless of what you say.
That's precisely the point that riles our tinfoil-hatted friends: heartbeat should be a simple matter of hi-ho's ad infinitum (i.e. no or constant-space parameters).

This is, of course, glossing over the fact that TLS over a TCP link doesn't actually need a heartbeat. TLS/UDP does, granted. Ever heard of DTLS?
Here's a use case it allows: computing latency without storing the time locally. Drop the current time into the heartbeat, plus a cryptographic signature so the server can't change it. The signature requires a variable-sized payload, or it won't be able to fit in ten years.
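A minimal sketch of that trick using OpenSSL's HMAC (assuming a secret the client keeps to itself, so it can trust its own echoed timestamp):

```c
#include <openssl/evp.h>
#include <openssl/hmac.h>
#include <string.h>
#include <time.h>

/* Payload = timestamp || HMAC(key, timestamp); the server just echoes it. */
size_t make_payload(const unsigned char *key, size_t key_len, unsigned char *out)
{
    time_t now = time(NULL);
    memcpy(out, &now, sizeof now);
    unsigned int mac_len = 0;
    HMAC(EVP_sha256(), key, (int)key_len,
         out, sizeof now, out + sizeof now, &mac_len);
    return sizeof now + mac_len;
}

/* On echo: recompute the MAC; if it matches, the timestamp really is ours. */
int echoed_age(const unsigned char *key, size_t key_len,
               const unsigned char *echoed, time_t *age)
{
    unsigned char mac[EVP_MAX_MD_SIZE];
    unsigned int mac_len = 0;
    HMAC(EVP_sha256(), key, (int)key_len, echoed, sizeof(time_t), mac, &mac_len);
    if (memcmp(mac, echoed + sizeof(time_t), mac_len) != 0)
        return -1;                  /* tampered with, or not ours */
    time_t sent;
    memcpy(&sent, echoed, sizeof sent);
    *age = time(NULL) - sent;
    return 0;
}
```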
It's hard to believe that even with all of our slavish mantra repetition about not trusting user-submitted data... the freaking web server trusts user-submitted data.
We're all going to have to start reading more source code...
I just began appreciating all the hoops I jump through just to concentrate on things hardly anyone else cares about. It takes me nearly ten times as long to complete a program, taxes my mind ten times more, and frustrates me twice as much about pursuing programming, but after a very steep learning curve it makes me a hundred times better than the rest of the programmers out there. But still, I wonder if it's worth it. Especially considering my boring-as-hell job.