
In some sense the language is the compiler and the compiler is the language; the language is much like a human language, used for its utility in expressing things (ideas, programs). You can tell if your human language words work by determining if people understand you. If people start being obtuse and refusing to understand you because of an arbitrary grammar rule that isn't really enforced, you'd be right to be upset with the people just as much as the grammar.

It in fact doesn't matter at all what the standard says if GCC and LLVM say something different, because you can't use the standard to generate assembly code.

The standard deliberately imposes no requirements on what UB does, so it's the compiler's responsibility to handle it in the most reasonable, least surprising way possible. If I'm a GCC developer and you ran GCC on one of these fairly mundane examples, and it compiled without error, then ran rm -rf / or stole your private RSA keys and posted them on 4chan, and I said "well, you can't be mad, it's undefined, it's the standard's fault", you'd probably punch me in the face after some quick damage control.

If it silently deletes an if statement or breaks a spinlock out of its loop early, that's potentially even worse than those two examples.




>In some sense the language is the compiler and the compiler is the language; the language is much like a human language, used for its utility in expressing things (ideas, programs). You can tell if your human language words work by determining if people understand you. If people start being obtuse and refusing to understand you because of an arbitrary grammar rule that isn't really enforced, you'd be right to be upset with the people just as much as the grammar.

The shortcoming of this interpretation is that programs are not (only) consumed by humans; they're consumed by computers as well. Computers are not at all like humans: there is no such thing as "understanding" or "obtuseness" or even "ideas." You cannot reasonably rely on a computer program, in general, to take arbitrary (Turing-complete!) input and do something reasonable with it, at least not without making compromises on what constitutes "reasonable."

Along this line of thinking, the purpose of the standard is not to generate assembly code; it's to pin down exactly what compromises the compiler is allowed to make with regard to what "reasonable" means. It happens that C allows an implementation to eschew "reasonable" guarantees about behavior for things like "reasonable" guarantees about performance or "reasonable" ease of implementation.

Now, an implementation may choose to provide stronger guarantees for the benefit of its users. It may even be reasonable to expect that in many cases. But at that point you're no longer dealing with C; you're dealing with a derivative language and non-portable programs. I think that for a lot of developers, this is just as bad as a compiler that takes every liberty allowed to it by the standard. The solution, then, is not for GCC and LLVM to make guarantees that the C language standard doesn't strictly require; the solution is for the C language standard to require that GCC and LLVM make those guarantees.

Of course, it doesn't even have to be the C language standard; it could be a "Safe C" standard. The point is that if you want to simultaneously satisfy the constraints that programs be portable and that compilers provide useful guarantees about behavior, then you need to codify those guarantees into some standard. If you just implicitly assume that GCC is going to do something more or less "reasonable" and blame the GCC developers when it doesn't, neither you nor they are going to be happy.



