The real blocker is that the various solutions are almost never ABI-compatible with existing code, and for most people it's unrealistic to recompile the world (and even if you can forbid inline asm elsewhere, libc is nasty). (edit: I suppose WASM is sort of forcing that though, but unfortunately it didn't take the opportunity to allow dynamically fixing this)
A solution is mostly possible with a segmented allocator, which is quite reasonable on 64-bit platforms (32-bit allocation ID, 32-bit index within the allocation).
But keep in mind that "buffer overflow within a struct" is often considered a feature.
I think the best approach is "design a new, 'safe' language that compiles to reasonable C code, and make it easy to port to that new language incrementally".
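To make the segmented idea above concrete, here is a minimal sketch of a 64-bit handle split into a 32-bit allocation ID and a 32-bit index, with a bounds check on every translation. All names (seg_ptr, seg_table, seg_register, seg_deref) and the fixed-size table are assumptions made for illustration, not any existing allocator's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 64-bit "segmented pointer": the high 32 bits name an
   allocation, the low 32 bits are a byte index into it. */
typedef uint64_t seg_ptr;

struct segment {
    unsigned char *base;  /* start of the allocation, NULL if unused */
    uint32_t       size;  /* size of the allocation in bytes */
};

#define MAX_SEGMENTS 16u
static struct segment seg_table[MAX_SEGMENTS];

static seg_ptr seg_make(uint32_t id, uint32_t index) {
    return ((uint64_t)id << 32) | index;
}

static uint32_t seg_id(seg_ptr p)    { return (uint32_t)(p >> 32); }
static uint32_t seg_index(seg_ptr p) { return (uint32_t)(p & 0xFFFFFFFFu); }

/* Record an allocation under a given ID. */
static void seg_register(uint32_t id, void *base, uint32_t size) {
    assert(id < MAX_SEGMENTS);  /* caller bug, not a runtime condition */
    seg_table[id].base = base;
    seg_table[id].size = size;
}

/* Translate to a real address, or return NULL if an access of `len`
   bytes would leave the allocation. */
static void *seg_deref(seg_ptr p, uint32_t len) {
    uint32_t id = seg_id(p), idx = seg_index(p);
    if (id >= MAX_SEGMENTS || seg_table[id].base == NULL) return NULL;
    if (idx > seg_table[id].size || len > seg_table[id].size - idx) return NULL;
    return seg_table[id].base + idx;
}
```

Note that the check is per allocation, so an overflow from one field into the next inside the same struct still passes, which is the caveat raised above. It also shows why this isn't ABI-compatible: a seg_ptr can't be handed to code expecting a plain pointer without going through seg_deref first.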
The last item of a struct could be a variable-sized array. That's sound, and enough for things such as network packets. It's important to avoid bikeshedding.
The "buffer overflow within a struct" is used a lot when you have built a "header" struct that you are just pinning to the start of some record and the bottom of the struct has a line like:
char data[0];
The purpose is to give you a handle on the remainder of the data even though you don't know the size beforehand.
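Technically this is not valid under strict C (zero-length arrays are a GNU extension; the standard spelling since C99 is the flexible array member, char data[];), but it's also incredibly common.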
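For reference, a minimal sketch of the header idiom using the standard C99 flexible array member instead of data[0]. The struct record layout and the record_new helper are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record header; the payload bytes follow it in memory. */
struct record {
    uint16_t type;
    uint16_t length;  /* number of valid bytes in data[] */
    char     data[];  /* C99 flexible array member, must be the last member */
};

/* Allocate header and payload as one block; release it with free(). */
static struct record *record_new(uint16_t type, const void *payload,
                                 uint16_t length) {
    assert(payload != NULL || length == 0);
    struct record *r = malloc(sizeof *r + length);
    if (r == NULL) return NULL;
    r->type = type;
    r->length = length;
    if (length > 0) memcpy(r->data, payload, length);
    return r;
}
```

sizeof *r covers only the header, so the payload size has to be added by hand at allocation time. Nothing in the type itself ties length to data[], which is what the rest of the thread is arguing about.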
There must be an expression which can be evaluated to determine the length of the array, and which can thus be used for checking. Without that, the code has little chance of working, since something had better define the size of that array.
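A small sketch of what "an expression which can be evaluated to determine the length" might look like when the length is derived rather than stored directly. The struct packet layout and the helper names are assumptions for illustration; the check is hand-written here, whereas a bounds-checking compiler would presumably generate something similar from an annotation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wire header: total_len counts the header plus the payload,
   so the payload length is an expression, not a stored field. */
struct packet {
    uint16_t      total_len;
    uint16_t      kind;
    unsigned char payload[];
};

/* The length expression a checker would evaluate. */
static size_t packet_payload_len(const struct packet *p) {
    return p->total_len >= sizeof *p ? p->total_len - sizeof *p : 0;
}

/* Checked read: refuses any index the expression does not cover. */
static bool packet_get(const struct packet *p, size_t i, unsigned char *out) {
    assert(p != NULL && out != NULL);
    if (i >= packet_payload_len(p)) return false;
    *out = p->payload[i];
    return true;
}

/* Zero-filled packet with room for n payload bytes; release with free(). */
static struct packet *packet_alloc(uint16_t n) {
    assert(n <= UINT16_MAX - sizeof(struct packet));  /* total_len must fit */
    struct packet *p = calloc(1, sizeof *p + n);
    if (p != NULL) p->total_len = (uint16_t)(sizeof *p + n);
    return p;
}
```

The check is only as trustworthy as total_len itself. If that field comes straight off the network, it still has to be validated against the number of bytes actually received.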
> If you can't tell how big something is at all, the program is broken and will probably fail randomly.
That's not what I said at all. It's the exact opposite of what I said. Why this strawman? I already replied above and explained very clearly that I'm talking about when the size is known but not via the struct itself:
>>> the expression doesn't have to come from the same struct, though. It could be provided somewhere else.
>> It might very well be straightforward to obtain, just not located in that struct itself.
The size could be communicated in a different struct, no? Or passed back to the caller via a pointer argument? Or a million other ways beside the same struct itself?
>> And if the code isn't available to you to change?
> Then you don't get the improvements in safety yet.
Huh? This isn't a limitation with current implementations like -fbounds-safety. It's just a limitation with the proposal I was pointing out this issue with [1]. The existing implementations decorate the function/usage sites rather than the struct, which gives you access to information outside the struct. And there's no need to change every single use of that struct, which you obviously don't generally have access to.
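To illustrate the difference, a sketch of a bind()-style interface where the length travels beside the pointer rather than inside the struct. SIZED_BY is a no-op stand-in macro with a made-up name, not the real -fbounds-safety annotation, and addr_v4, FAMILY_V4 and read_port are likewise hypothetical. The check is written by hand to show what a compiler could enforce at the function boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* No-op stand-in for a call-site bounds annotation (hypothetical name). */
#define SIZED_BY(len)

enum { FAMILY_V4 = 1 };

/* Hypothetical address type in the spirit of struct sockaddr: callers pass
   a pointer to it plus a separate length, as with bind()'s third parameter. */
struct addr_v4 {
    unsigned short family;  /* FAMILY_V4 */
    unsigned short port;
    unsigned char  host[4];
};

/* The struct definition is untouched; only the function signature says
   how many bytes `addr` points to. Returns 0 on success, -1 if `len` is
   too small for the claimed family. */
static int read_port(const void *SIZED_BY(len) addr, size_t len,
                     unsigned short *port_out) {
    assert(addr != NULL && port_out != NULL);
    unsigned short family;
    if (len < sizeof family) return -1;
    memcpy(&family, addr, sizeof family);
    if (family != FAMILY_V4 || len < sizeof(struct addr_v4)) return -1;
    struct addr_v4 v4;
    memcpy(&v4, addr, sizeof v4);  /* copy only after the size check */
    *port_out = v4.port;
    return 0;
}
```

Existing callers already pass the length, so annotating the function site changes neither the struct layout nor the ABI.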
I'm saying to deal with it. Change the code to be compatible. It's not that important to keep it the way it is.
Now you're referring to better designs, which is great. Have the best of both worlds if that's possible.
But when you were just pointing out that difficulty, my response is that it's a very small difficulty so that's not a big mark against the idea. If it was that proposal or nothing, that proposal would be much better than nothing, despite the forced code changes to use it.
> I'm saying to deal with it. Change the code to be compatible. It's not that important to keep it the way it is.
> But when you were just pointing out that difficulty, my response is that it's a very small difficulty so that's not a big mark against the idea.
In what alternate timeline do we exist where HNers believe you can just recompile the entire world for the sake of any random program? Say you're a random user calling bind() or getpeername() in your OS's socket library. Or you're Microsoft, trying to secure a function like WSAConnect(). All of these are susceptible to overflows in struct sockaddr. Your proposal is "just move the length from the 3rd parameter into the sockaddr struct" because "it's not that important to keep these APIs the way they are"?! How exactly do you propose making this work?
I can't believe you think changing the world isn't a big deal.
So say I'm on board and decide sockaddr Must Be Changed. Roughly how long do you think it will be from today before I can ship to my customers a program using the new, secure definition?
And how does the time and effort required compare against the more powerful implementation that's already out there?
The C standard defines which array accesses are valid or not in the C abstract machine. This definition isn’t simple at all. A C implementation can in principle add runtime machinery to check all accesses during execution. C implementations generally don't, due to performance and ABI compatibility reasons. But C the language doesn’t prohibit it. Most existing programs making use of "buffer overflow within a struct" probably aren't actually conforming C programs.