
Honest question. What has stopped people from creating a standard with no undefined behavior? Is such a thing impossible? Have people done it?



See for example here https://blog.regehr.org/archives/1287

> After publishing the Friendly C Proposal, I spent some time discussing its design with people, and eventually I came to the depressing conclusion that there’s no way to get a group of C experts — even if they are knowledgable, intelligent, and otherwise reasonable — to agree on the Friendly C dialect. There are just too many variations, each with its own set of performance tradeoffs, for consensus to be possible.


> "What has stopped people from creating a standard with no undefined behavior? Is such a thing impossible?"

I don't understand the "undefined behavior" bandwagon.

1. We have a perfectly defined list of "undefined behaviors".

2. Said list also happens to be relatively small and scoped.

3. "Undefined behaviors" exist because the language can't make certain runtime guarantees, which largely depend on compiler/OS/platform/hardware-specific promises. If you have to, just roll your own runtime checks. C won't force those on you...
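As a rough illustration of "roll your own runtime checks", here is a minimal sketch for signed division. The helper name `checked_div` is made up for this example; it simply refuses the two `int` division cases that C leaves undefined (division by zero, and `INT_MIN / -1`).

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a / b in *out and returns true,
 * or returns false (leaving *out untouched) for the two inputs
 * where int division is undefined behavior. */
bool checked_div(int a, int b, int *out) {
    if (b == 0) {
        return false;               /* division by zero */
    }
    if (a == INT_MIN && b == -1) {
        return false;               /* quotient does not fit in int */
    }
    *out = a / b;
    return true;
}
```

The caller decides what a `false` return means (abort, saturate, propagate an error), which is the "no one size fits all" point made above.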


The language could make those guarantees if it wanted to. This might add overhead on some architectures, but it would be possible. An example is integer overflow. If we limit ourselves to machines using two's complement (that is, any machine architecture in use during the last thirty years), this could easily be fully defined. And if C is ever used on another architecture, they could build a workaround using some form of overflow trap. (Since CPU design would take C into consideration.)
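To make "fully defined" concrete, here is a sketch of what wrapping signed addition could look like when written in today's C without touching undefined behavior. The name `wrap_add` is hypothetical, and the code assumes `UINT_MAX == 2 * INT_MAX + 1`, which holds on ordinary two's complement targets. Unsigned arithmetic is defined to wrap, so the sum is computed there and mapped back by hand.

```c
#include <limits.h>

/* Hypothetical helper: two's complement wrapping addition for int.
 * Assumes UINT_MAX == 2 * INT_MAX + 1 (typical two's complement). */
int wrap_add(int a, int b) {
    /* Unsigned addition wraps modulo UINT_MAX + 1 by definition. */
    unsigned int u = (unsigned int)a + (unsigned int)b;

    if (u <= (unsigned int)INT_MAX) {
        return (int)u;              /* non-negative result, fits as is */
    }
    /* Negative result: map u back without any out-of-range conversion.
     * UINT_MAX - u is in [0, INT_MAX], so both steps stay in range. */
    return -(int)(UINT_MAX - u) - 1;
}
```

A language-level guarantee would give `a + b` this behavior directly; the sketch only shows that the semantics are easy to state.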

Or evaluation order: `i = ++i;` could easily be defined in some way, though that might prevent some niche optimisations by the compiler.

Of course, by C's nature there are limits (C won't be able to detect use after free or similar without changing the language notably), but there is room where UB could be reduced, if it was seen as necessary.


> "The language could make those guarantees if it wanted to"

> "Of course by C's nature there are limits"

You seem to acknowledge the fact that most of the undefined behaviors in C are essentially born out of compromise. Those compromises were driven by principles such as "keep the power in the hands of developers", "don't impose unnecessary restrictions", "keep the language clean", "avoid hidden runtime magic". The end results reflect that.

As I've mentioned in my previous comment, there's no "one size fits all", so the language makes it trivial for you to roll out your very own runtime magic (à-la Zig/Nim) which suits you best. Why is that a bad thing?


You don't actually have to make the undefined behaviour go away completely, which is difficult.

An easier approach would be to put a bound on what is permissible undefined behavior.

Sounds a bit like an oxymoron, doesn't it? After all, it is "undefined", right?

However, the C standard does exactly that!

> Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).

(section 3.4.3)

In fact, in the first version of the ANSI/ISO C standard, this was a binding part of the standard. In later versions, it was made non-binding, though the language is still in the standard.

Yes, the standard has language that says what permissible undefined behavior is, but you are free to ignore it and still call yourself compliant. Which is what just about everybody does nowadays.

Make it binding again and most of the mess disappears.

https://blog.metaobject.com/2018/07/a-one-word-change-to-c-s...


It doesn't fix anything. In most compilers, the treatment of undefined behavior that everyone complains about boils down to "the compiler creates implicit __builtin_assumes that undefined behavior does not happen." That is "behaving during translation [...] in a documented manner characteristic of the environment" (especially given that the constant discussion by compiler writers of how the compiler treats undefined behavior means it's better documented than other mandatory-documented things, such as the actual implementation choices for implementation-defined behavior). It's just that the environment isn't the environment people thought it was.
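A small sketch of what such an implicit assumption looks like in practice (function names are made up for illustration). Because signed overflow is undefined, an optimizer is allowed to treat `x + 1 > x` as always true and fold the first function to a constant; whether a given compiler actually does so depends on the compiler and flags. The second function states the intended condition explicitly and has no undefined case.

```c
#include <limits.h>
#include <stdbool.h>

/* Relies on "overflow never happens": a compiler may fold this to
 * `return true;`. Calling it with x == INT_MAX is undefined behavior. */
bool next_is_greater(int x) {
    return x + 1 > x;
}

/* Same intent, written so that every input has defined behavior. */
bool next_is_greater_checked(int x) {
    return x != INT_MAX;
}
```

For every input where the first function is defined, both return the same answer; they only differ in what the compiler is entitled to assume.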


> characteristic of the environment

Setting "the compiler" = "the environment" and therefore anything the compiler does is part of the environment and thus legit seems at best the type of sleight of hand the compiler writers use to justify their actions.

When defining the behaviour of the compiler, "the environment" obviously cannot be "the compiler".


> What has stopped people from creating a standard with no undefined behavior?

It's a lot of work.

> Is such a thing impossible? Have people done it?

It's possible; Ada has a really good take -- in that standard, there's a class of errors called "bounded errors" which on the surface look like "undefined behavior", but are a lot different in that they list out the possible results and thus preclude the "nose demons" problem of C -- see: http://www.ada-auth.org/standards/2xaarm/html/AA-1-1-5.html


> What has stopped people from creating a standard with no undefined behavior?

Because making C safer would introduce a lot more complexity in the runtime environment and instrumentation, which is hard to get right and has varying performance implications across all the ISAs that C compilers currently target.

Consider out of bounds indexing. To determine that an instruction is touching a memory region that is not, in abstract terms, a C array or a memory allocation by malloc and friends, now you need to insert long traps in memory, and even then, there is nothing stopping you from accessing array `b` through, for example, `a + 42`.
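One way to see the cost: a bare `int *` carries no length, so a check is only possible if the bounds travel with the pointer. A minimal sketch, assuming a hypothetical "fat pointer" type `int_span` and accessor `span_get` (neither is part of standard C):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fat pointer: the base address plus the element count. */
typedef struct {
    int *data;
    size_t len;
} int_span;

/* Stores s.data[i] in *out and returns true, or returns false
 * (leaving *out untouched) if the index is outside the span. */
bool span_get(int_span s, size_t i, int *out) {
    if (s.data == NULL || i >= s.len) {
        return false;
    }
    *out = s.data[i];
    return true;
}
```

Every access now pays for an extra comparison and every pointer doubles in size, which is the kind of overhead a standard would have to impose on all targets to define out-of-bounds indexing.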


Removing undefined behavior is not actually the same as making C safer. In fact, in a sense, well-defined C already exists (in many versions, in fact): it's whatever your C compiler spits out with whatever combination of flags has been passed in.


If you were to write this standard, how would you define what happens when you write to a random address in memory?



