OpenSSL bug exposed up to 255 bytes of client heap and existed since 2011

hannob · 2024-07-30T13:30:04 1722346204

An interesting aspect of this is that it's a bug in NPN, which is more or less a historic artifact.

When SPDY, the precursor of HTTP/2, got introduced, it needed a mechanism to signal that within TLS, a different protocol (SPDY instead of HTTP/1.1) was spoken. That mechanism was originally NPN. I don't know the exact details and motivation, but eventually people seem to have decided that NPN wasn't exactly what they wanted, and they invented a new mechanism called ALPN.

Now, that was a decade ago (the ALPN RFC is from 2014), so the question is: why do we still have NPN code in OpenSSL? I don't think anyone uses it any more. Shouldn't it have been removed long ago?

To put this in a larger context: it appears to me that OpenSSL still has a strong tendency to bloat. Heartbleed was essentially a "we added this feature, although noone knows why we need it" kind of bug, but it doesn't look to me they've changed. I still get the feeling that OpenSSL adds many features that they probably just should ignore (e.g. obscure "not invented here"-type algorithms), and they don't remove features that are obsolete.

fullspectrumdev · 2024-07-30T13:56:51 1722347811

Why it’s not removed: backward compatibility I bet.

I have to maintain a few VM’s and statically linked tools specifically for interacting with old network appliances that modern Linux hosts won’t “talk to” without crippling their SSL/TLS configuration or doing other horrendous workarounds like specific OpenSSL/OpenSSH config files for those things.

It’s actually an interesting problem when performing security assessments - some tools for scanning will have false negatives because they can’t “talk to” old shit.

hannob · 2024-07-30T14:31:21 1722349881

You have some VMs that talk to old network appliances that require NPN? And if NPN is not supported they don't just fall back to simple HTTP/1.1? I find that sounds extremely unlikely.

I am of course well aware of backwards compatibility issues. But I don't see how they'd impact NPN.

You might have to update some code that hard-codes NPN functionality, but well, OpenSSL has done API changes in the past, quite significantly so.

throwway120385 · 2024-07-30T16:10:27 1722355827

Not OP, but I have some old network-connected appliances in the field that I can't upgrade and that our current OpenSSH clients have deprecated all of the ciphers and methods for. You still have to be able to talk to these things over the network, and ideally you'd want to upgrade at least some of the systems out there.

You have to understand that outside of silicon valley and the hobbyist realm there are tons and tons of systems out there where the security posture is "don't allow anyone to have physical or network access to these systems who doesn't belong here" because the companies that made them deprecated them a decade ago or they were built before the widespread practice of supporting the Linux BSP for more than a couple of months.

These systems still work and you won't convince very many users of these systems to spend a bunch of money upgrading them for some notional security risk. You may also be working for a company that is contractually obligated to support these systems in some way. So what do you do?

In many cases, you do the best you can to mitigate the risks you know are there with tools like firewalls, port knocking, non-standard ports, and so on. And you hope that the 2-3% of attackers that actually know what they're doing never try to crack these things.

hannob · 2024-07-30T16:18:59 1722356339

I understand what you write, yet it doesn't have anything to do with NPN.

Look, I get it. Deprecating things has tradeoffs. But NPN looks like an incredibly safe thing to deprecate. It has only been used for a very short timeframe. Its only use case (SPDY) has a fallback (HTTP/1.1) making sure things still work if it's not supported. Your story about OpenSSH ciphers has nothing to do with it.

fsckboy · 2024-07-30T20:17:40 1722370660

OP couldn't talk to the old devices he used to, he installed some VMs with old stuff, that solved the problem. It's an adequate solution, why spend any more time on it?

throwway120385 · 2024-07-30T19:15:14 1722366914

It's hard sometimes to even upgrade the software to the version that deprecates it. I think that was hard to get from my nonsensical story about tying an onion to my belt.

fullspectrumdev · 2024-07-30T20:12:27 1722370347

I have the exact same issue with OpenSSH on an absurdly regular basis, which is why I maintain a set of statically compiled versions and VM’s with old versions - just so I can actually talk to old appliances/hardware.

I noticed also that some of the security scanning tools we use “silent fail” on some old appliances because they can no longer negotiate the appropriate SSH connection due to library updates :)

fullspectrumdev · 2024-07-30T20:09:34 1722370174

I was speaking in the more general case with regards OpenSSL - not NPN specific: where changes in OpenSSL have caused stupid devices that I have no control over to become “unmanageable”, hence having to either build static tools or use an ancient VM.

I wish I could replace these old devices, and I recommend to do so, but usually replacing them won’t happen until they physically break.

Think: Industrial IoT devices at client sites and other such security horrors.

Am4TIfIsER0ppos · 2024-07-30T14:36:30 1722350190

> without crippling their SSL/TLS configuration

On that point: why is the previously allowed crypto settings now considered as good as nothing? I have to force SECLEVEL=0 for openvpn/openssl to allow connecting to my company's vpn. From my reading that would allow any old cipher or hash rather than the previous minimum. Why is the previous level not kept as M and the library bumps the default to N? I know that a theoretical weakness means M is vulnerable to breaking but you force people to make it even worse.

I think my boss hasn't updated his openvpn so he doesn't have new openssl so he never saw the problem. His company, his problem.

tialaramex · 2024-07-30T15:08:29 1722352109

People aren't anywhere close to sophisticated enough to make meaningful use of this more complicated functionality.

What they're going to do if given this is call your hypothetical "SECLEVEL=M" feature secure and then be outraged when it isn't.

If nobody is attacking you, no security will work fine. If you are being attacked, obsolete security likely won't help anyway.

throwway120385 · 2024-07-30T16:15:36 1722356136

If you think of the ideal attacker as having a range of abilities and knowledge between "I found this tool on BitTorrent and am now going to try cracking your network from the outside with it" to "I spread my port scans out to multiple exit nodes over several days so you don't even notice me doing it" then having any security is better than nothing. All you have to do is make the sliver on the venn diagram of people who are trying to attack your system versus people who know how to attack your system as small as possible. It's not rocket science, and there are a lot of factors to balance here beyond the security level of a particular cipher.

Am4TIfIsER0ppos · 2024-07-30T19:08:01 1722366481

I am forced to make my connection less secure than someone who hasn't updated. What's the logic behind that?

pixl97 · 2024-07-30T18:42:15 1722364935

I'm not sure if you've ever plugged into the internet, but there are constant probes occurring. They may not be an attack but simple information gathering that is used later (say someone finds a weakness in your configuration) and you and everything like you is attacked at once.

jopsen · 2024-07-30T14:06:23 1722348383

Couldn't you just fallback to HTTP/1.1?

I guess it could be that it's not possible.

ctz · 2024-07-30T14:47:41 1722350861

> Why it’s not removed: backward compatibility I bet.

I mean, in the intervening period there was OpenSSL 3 which was a large backwards-incompatible (API and ABI) release, with a huge amount of effort on the part of OpenSSL developers and users to follow along. It was the ideal opportunity to drop this sort of old stuff.

toast0 · 2024-07-30T14:20:39 1722349239

> I don't know the exact details and motivation, but eventually people seem to have decided that NPN wasn't exactly what they wanted, and they invented a new mechanism called ALPN.

Looking at the spec for NPN and knowing about ALPN from implementing just enough of if to enable TLS false start way back when, ALPN is better because it's less steps.

With ALPN, the client includes its supported protocols in the Client Hello, and the server picks one or discards the extension in the Server Hello. With NPN, the client indicates it supports NPN in the Client Hello, the server sends back a list of supported protocols in the Server Hello, and the client sends an extra message to indicate its selection between change cipher spec and finished.

There's perhaps more client profiling information on the wire with ALPN, but it's not very interesting when 99% of connections just advertise h2,http/1.1. During the development of http/2, you could probably have a pretty tight range on client versions depending on what pre-standard versions were declared.

tialaramex · 2024-07-30T16:59:22 1722358762

It was and remains a stamp collecting library. "Oooh, the Half Cent South African Reverse Yellow Eagle. Only six hundred of these were ever issued and since all postage by that point already cost at least four rand the half cent stamp was entirely useless. Nine in mint condition unperforated were sold at auction in 1956 but none have been seen since"

So it's useless? Yes! So why do you want it? Because it's rare!

The fact that lots of real world code uses OpenSSL is one of those things it's going to be hard for our descendants to understand. Like explaining that people used to Smoke cigarettes on aeroplanes. But why though, that's crazy. Yes, yes it is.

znpy · 2024-07-30T17:06:15 1722359175

I remember the early days of libressl, they bragged A LOT about removing tens of thousands of lines of code that they deemed useless.

touisteur · 2024-07-30T18:18:50 1722363530

Heavy use of unifdef there. Following their progress was amazing from an 'let's own this thing now' perspective, seeing the size of code changes, how they decided which things had to go.

I would have been bragging too, were I removing ifdefs by the shovelload every day for weeks.

sillywalk · 2024-07-30T19:27:18 1722367638

The OpenSSL Valhalla Rampage. It was kind of amusing, though horrifying with what they found.

https://opensslrampage.org/

asveikau · 2024-07-30T16:18:24 1722356304

I could read that in the opposite direction: why were the SPDY people pushing their half-finished protocols into a bunch of upstream projects? Wasn't that irresponsible? Shouldn't they have pushed those same projects to remove support for WIP protocols?

lazide · 2024-07-30T13:44:01 1722347041

Removing old code, especially in open source, has a problem.

Unless it is obviously breaking something someone wants to do, it has roughly a zero cost (in maintenance) being left alone, especially since bugs can often be ignored, but a non-zero cost to remove it - as it generally will only garner complaints and whining from any remaining users.

And in open source in particular, it’s damn near impossible finding remaining users in advance, or even telling if there are no remaining users.

RetpolineDrama · 2024-07-30T13:47:27 1722347247

Is it really "zero cost in maintenance" if that old code presents attack surface?

c0balt · 2024-07-30T13:53:23 1722347603

No, it's zero upfront cost in maintenance. Owning code always generates some maintenance/ tech debt but it often is opaque nor easily quantifiable.

lazide · 2024-07-30T13:57:57 1722347877

IMO the secondary cost is really more ‘friction’ that results when making changes.

If no changes get made, then no friction.

And open source code can and often is just abandoned. Defacto, sometimes even de jur. User beware, use at your own risk, etc.

crngefest · 2024-07-30T18:35:00 1722364500

Proprietary code is just as often if not more abandoned - you just don’t notice it

lazide · 2024-07-30T18:49:21 1722365361

Sure, but proprietary code you can (somewhat) see who is calling it, and the group maintaining it is at least sometimes the group calling it - so has an incentive to not make it a bigger mess. Somewhat. So that part of the equation has less weight.

They have a counter balancing thing which is no one can see it to shame them.

lazide · 2024-07-30T13:56:24 1722347784

Does the open source code maintainer have liability for it, or necessarily need to do work in that case?

At worst they generally just suffer reputational damage, not actual cost or lost revenue, like a business would.

tadbit · 2024-07-30T16:06:58 1722355618

LibreSSL removed NPN support seven years ago.

https://marc.info/?l=openbsd-announce&m=150996307120987&w=2

I wonder how many memory leaks it'll take for OpenSSL to finally get their act together or for major projects to drop it entirely.

doublepg23 · 2024-07-30T16:49:17 1722358157

I feel like a lot of projects did drop OpenSSL post-Heartbleed and then went back to OpenSSL some years later.

I know Gentoo did https://bugs.gentoo.org/762847

Python 3.10 did https://peps.python.org/pep-0644/

Void Linux did https://voidlinux.org/news/2021/02/OpenSSL.html

Etc.

Seems like a lot of that was due to OpenSSL breaking API compat that LibreSSL promised not to break though.

extraduder_ire · 2024-07-31T10:45:15 1722422715

Why does TLS need to know what's being passed through it? I thought the encryption it provides is meant to be transparent.

olliej · 2024-07-30T17:00:44 1722358844

Old protocols truly are the gift that keeps on giving (see the numerous protocol and cipher downgrades over the years as well)

megadal · 2024-07-30T13:28:10 1722346090

The title: "bug exposed up to 255 bytes of server heap and existed since 2011"

The post: "Silently sends up to 255 bytes of the client’s heap to the server."

tux1968 · 2024-07-30T15:22:34 1722352954

You're right. The HN title really needs to be changed, to say it's the client's heap that is exposed.

belter · 2024-07-30T16:34:14 1722357254

Agree. Only the mods can do it.

dang · 2024-07-30T18:39:54 1722364794

OK, we'll s/server/client/ the title above. Thanks!

seba_dos1 · 2024-07-30T13:39:04 1722346744

Not to mention that the HN submission title never appears in the linked article.

colmmacc · 2024-07-30T15:11:10 1722352270

I started writing s2n the day after Heartbleed and the first lines of code were for the stuffer interface. A stuffer is a buffer for stuff, and it's like Java buffered I/O for C. You can get a flavor from reading the header: https://github.com/aws/s2n-tls/blob/main/stuffer/s2n_stuffer...

The implementation is incredibly simple. Treat all blocks of memory as blob with a known size and then read/write into those blobs with a cursor to track progress and bounds checks on every access. Fence all serialization/deserialization through a safe low level interface. Not only do you get memory safety (which we later proved using formal reasoning) ... but when you're parsing message formats it lends itself to a declarative coding style that makes it very clear what the structure is. You can also do lifecycle things, like erasing sensitive memory with zeroes when you're done with it, making sure things don't show up in core dumps, etc. BoringSSL introduced a Crypto_bytes API that also did some of this plus bounds checking, and retrofit it into OpenSSL.

OpenSSL on the other hand is a horrific mash up of raw pointer arithmetic, ad-hoc parsers interleaved with business logic and control flow. I could never keep it straight, and it always scared me to review.

mrngm · 2024-07-30T13:22:49 1722345769

A few excerpts from the blog post (CVE-2024-5535 was assigned):

> Meeting those constraints is quite unlikely nowadays:

> NPN is a precursor to ALPN and was abandoned in 2012. It is very uncommon on internet servers now.

> Node.js 10 and later removed NPN support, and is well past end-of-life.

> Python 3.10 and later removed NPN support.

OpenSSL advisory: https://openssl-library.org/news/secadv/20240627.txt

> Severity: Low

> Issue summary: Calling the OpenSSL API function SSL_select_next_proto with an empty supported client protocols buffer may cause a crash or memory contents to be sent to the peer.

BoringSSL fix: https://boringssl.googlesource.com/boringssl/+/c1d9ac02514a1...

The heap leak was independently observed in 2014 in the Android okhttp library: https://github.com/square/okhttp/issues/437#issuecomment-358...

sebstefan · 2024-07-30T14:25:34 1722349534

Funny that this bug was discovered by rewriting it in Rust!

MaxBarraclough · 2024-07-30T14:34:53 1722350093

Not the first time this kind of thing has happened. A rewrite in SPARK Ada uncovered a flaw in the reference C implementation of the Skein cryptographic algorithm.

• https://www.adacore.com/press/spark-skein (ctrl-f for flaw)

• [PDF] https://www.cl.cam.ac.uk/archive/mjcg/Meeting.SecurityTools.... (page 12)

• [PDF] https://www.adacore.com/uploads/techPapers/SPARKSkein_SBMF.p... (page 5)

Pet_Ant · 2024-07-30T14:34:15 1722350055

I had to check if that was a meme joke or that is actually what happened, and it seems that it actually was found by rewriting it in Rust:

> 2024-04-23 - Discovery of SSL_select_next_proto memory unsafety while rewriting it in rust. [ https://github.com/rustls/rustls-openssl-compat/tree/main/ru... ]

tialaramex · 2024-07-30T14:48:42 1722350922

Specifically this is an attempted ABI compatibility layer. That is, given rustls, which is a perfectly nice Rust TLS implementation, what if we built the OpenSSL API and then shipped the C ABI compatible library ?

In principle the result is definitely safe if your C code (which previously called OpenSSL) is safe, regardless of whether OpenSSL itself is riddled with bugs (which it likely is)

A safety focused person rewriting tricky C pointer banging code in Rust is likely to inadvertently fix bugs - what's interesting is whether you notice that as you're doing it or it's just silently not buggy any more in Rust.

A few hours ago I had some code that was doing bignum arithmetic and one call returned None where I hadn't realised it could and in a haze I thought "Oh, if it's None we can treat it as Zero" so I changed unwrap() to unwrap_or(Zero::zero()) † but nope, it was None because that function assumes we're getting integers and it's just figured out it has a proper fraction instead, so hence there is no such integer, None. Treating None as Zero meant all the related tests blew up and I realised my mistake after a couple of minutes. No, e to the power 0.5 is not 1.

† Yes that's suboptimal because it will make the zero bignum even when it doesn't need it, I should have called unwrap_or_else instead but also no I shouldn't because it's wrong.

anthk · 2024-07-30T15:40:42 1722354042

Does this happen with LibreSSL?

tadbit · 2024-07-30T16:05:39 1722355539

No. They removed NPN support seven years ago.

https://marc.info/?l=openbsd-announce&m=150996307120987&w=2

rurban · 2024-07-30T13:52:53 1722347573

> (though likely with a huge false-positive rate in other code)

That's what I doubt at all. I rather trust the compilers than the openssl devs.

guipsp · 2024-07-30T14:32:13 1722349933

Bounds analysis are notoriously hard to get right on C code.