The real blocker is that the various solutions are almost certainly never ABI-co...

Animats · on Oct 7, 2023

> "buffer overflow within a struct" is often considered a feature.

That's lying to the compiler, because there's no way to say what you meant.

That's why I proposed the syntax

    typedef struct msgitem {
        const size_t len;
        char itemvalue[len];
    };
    struct msgitem firstmsg = { 100, 0 }; // empty msgitem, size 100

The last item of a struct could be a variable sized array. That's sound, and enough for things such as network packets. It's important to avoid bikeshedding.

jandrese · on Oct 7, 2023

The "buffer overflow within a struct" is used a lot when you have built a "header" struct that you are just pinning to the start of some record and the bottom of the struct has a line like:

      char data[0];

The purpose is to give you a handle on the remainder of the data even though you don't know the size beforehand.

Technically this is not valid under strict C, but it's also incredibly common.

heywhatupboys · on Oct 7, 2023

> Technically this is not valid under strict C, but it's also incredibly common.

char data[] is valid C in recent versions if it is the last entry of a struct.

layer8 · on Oct 7, 2023

To elaborate, this is called "flexible array member" and was introduced with C99.
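As a rough sketch of what that looks like in practice (the struct reuses the `msgitem` name from the proposal upthread; `msgitem_new` and `msgitem_set` are hypothetical helpers, not part of any standard or proposal), a length field plus a flexible array member with a hand-written bounds check might be:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Header followed by a C99 flexible array member. */
struct msgitem {
    size_t len;        /* number of bytes available in itemvalue */
    char itemvalue[];  /* flexible array member: must be the last member */
};

/* Hypothetical helper: allocate a zero-filled msgitem with room for
   len bytes. Returns NULL on allocation failure or size overflow. */
static struct msgitem *msgitem_new(size_t len)
{
    if (len > SIZE_MAX - sizeof(struct msgitem))
        return NULL;  /* sizeof(header) + len would overflow */
    struct msgitem *m = calloc(1, sizeof *m + len);
    if (m != NULL)
        m->len = len;
    return m;
}

/* Hypothetical helper: checked store. Returns 0 on success,
   -1 if idx is outside the array described by m->len. */
static int msgitem_set(struct msgitem *m, size_t idx, char c)
{
    assert(m != NULL);
    if (idx >= m->len)
        return -1;
    m->itemvalue[idx] = c;
    return 0;
}
```

The check in `msgitem_set` is written by hand here; nothing in standard C ties `len` to `itemvalue`, which is the gap the proposals in this thread are trying to close.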

dataflow · on Oct 7, 2023

What do you do when the length isn't a field in the same struct?

Animats · on Oct 7, 2023

There must be an expression which can be evaluated to determine the length of the array, and which can thus be used for checking. Without that, the code has little chance of working, since something had better define the size of that array.

dataflow · on Oct 7, 2023

I mean the expression doesn't have to come from the same struct, though. It could be provided somewhere else.

Dylan16807 · on Oct 7, 2023

If the somewhere else can't be located in a straightforward way, I'd say just change the code.

dataflow · on Oct 7, 2023

> If the somewhere else can't be located in a straightforward way

It might very well be straightforward to obtain, just not located in that struct itself.

> I'd say just change the code.

And if the code isn't available to you to change?

Animats · on Oct 7, 2023

If you can't tell how big something is at all, the program is broken and will probably fail randomly.

dataflow · on Oct 7, 2023

> If you can't tell how big something is at all, the program is broken and will probably fail randomly.

That's not what I said at all. It's the exact opposite of what I said. Why this strawman? I already replied above and explained very clearly that I'm talking about when the size is known but not via the struct itself:

>>> the expression doesn't have to come from the same struct, though. It could be provided somewhere else.

>> It might very well be straightforward to obtain, just not located in that struct itself.

The size could be communicated in a different struct, no? Or passed back to the caller via a pointer argument? Or a million other ways besides the same struct itself?

Dylan16807 · on Oct 7, 2023

> And if the code isn't available to you to change?

Then you don't get the improvements in safety yet.

dataflow · on Oct 7, 2023

>> And if the code isn't available to you to change?

> Then you don't get the improvements in safety yet.

Huh? This isn't a limitation with current implementations like -fbounds-safety. It's just a limitation with the proposal I was pointing out this issue with [1]. The existing implementations decorate the function/usage sites rather than the struct, which gives you access to information outside the struct. And there's no need to change every single use of that struct, which you obviously don't generally have access to.

[1] https://news.ycombinator.com/item?id=37799444
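A minimal sketch of the usage-site idea, assuming a stand-in macro: `COUNTED_BY` is hypothetical and expands to nothing so the code builds with any compiler, and `copy_addr` is an invented example function. As I understand it, under Clang's -fbounds-safety the corresponding annotation is spelled `__counted_by(n)`, and the compiler would insert the check that is written by hand below.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in: expands to nothing, so this is documentation only.
   With a bounds-checking compiler the real annotation would go here. */
#define COUNTED_BY(n)

/* The length travels next to the pointer as a separate parameter,
   not inside the pointed-to object, so the object's layout is unchanged.
   Returns 0 on success, -1 if src does not fit in dst. */
static int copy_addr(char *COUNTED_BY(dst_len) dst, size_t dst_len,
                     const char *COUNTED_BY(src_len) src, size_t src_len)
{
    assert(dst != NULL && src != NULL);
    if (src_len > dst_len)
        return -1;  /* manual check standing in for a compiler-inserted one */
    memcpy(dst, src, src_len);
    return 0;
}
```

Because the annotation sits on the function signature, callers and the struct layout stay as they are, which is the ABI point being made above.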

Dylan16807 · on Oct 7, 2023

It's a limitation of the proposal, sure.

I'm saying to deal with it. Change the code to be compatible. It's not that important to keep it the way it is.

Now you're referring to better designs, which is great. Have the best of both worlds if that's possible.

But when you were just pointing out that difficulty, my response is that it's a very small difficulty so that's not a big mark against the idea. If it was that proposal or nothing, that proposal would be much better than nothing, despite the forced code changes to use it.

dataflow · on Oct 7, 2023

> I'm saying to deal with it. Change the code to be compatible. It's not that important to keep it the way it is.

> But when you were just pointing out that difficulty, my response is that it's a very small difficulty so that's not a big mark against the idea.

In what alternate timeline do we exist where HNers believe you can just recompile the entire world for the sake of any random program? Say you're a random user calling bind() or getpeername() in your OS's socket library. Or you're Microsoft, trying to secure a function like WSAConnect(). All of these are susceptible to overflows in struct sockaddr. Your proposal is "just move the length from the 3rd parameter into the sockaddr struct" because "it's not that important to keep these APIs the way they are"?! How exactly do you propose making this work?

Dylan16807 · on Oct 7, 2023

The same way people switch to safe string functions and recompile?

Gradually, for one.

dataflow · on Oct 7, 2023

> Gradually, for one.

I can't believe you think changing the world isn't a big deal.

So say I'm on board and decide sockaddr Must Be Changed. Roughly how long do you think it will be from today before I can ship to my customers a program using the new, secure definition?

And how does the time and effort required compare against the more powerful implementation that's already out there?

Dylan16807 · on Oct 7, 2023

Probably a while. But that doesn't stop the safety from being in the other 95% of the program.

And again, I wasn't comparing to any other implementations, because you hadn't brought them up yet!

raverbashing · on Oct 7, 2023

> But keep in mind that "buffer overflow within a struct" is often considered a feature.

Yes, one of those "features" designed to work around limitations of the language

C is "simple" in the same way a chainsaw without guards or brakes is "simpler" than one with those attachments

layer8 · on Oct 7, 2023

The C standard defines which array accesses are valid or not in the C abstract machine. This definition isn’t simple at all. A C implementation can in principle add runtime machinery to check all accesses during execution. C implementations generally don't, for performance and ABI compatibility reasons. But C the language doesn’t prohibit it. Most existing programs making use of "buffer overflow within a struct" probably aren't actually conforming C programs.

mlindner · on Oct 7, 2023

> But keep in mind that "buffer overflow within a struct" is often considered a feature.

Yes this is used rampantly in any kind of code that's running in basically all of the world's network appliances that do packet processing.

I'd even say it's considered "best practice" by most C programmers.

layer8 · on Oct 7, 2023

It's been an explicit language feature (flexible array members) since C99.