
If I'm understanding correctly, I think the difference between "undefined behavior" and "returns an unspecified value" is consistency. If you had something like:

    int i = [UNDEFINED BEHAVIOR];
    if (i > 0)
        COND1;
    if (i > 0)
        COND2;
then most people would expect that either COND1 and COND2 both execute, or they both don't execute. But I believe that a compiler is theoretically free to produce code that executes one but not the other since the value of i is undefined. In other words, the compiler doesn't have to act as though i has one specific value after that assignment. It can assume any value it likes independently each time i is used. It can assume that i is positive at the first check and then assume it's negative the next time, even though it might be provably true that no code between those two checks can possibly change the value of i. The change to "unspecified value" would mean that the compiler can give i whatever value it wants to, but it has to be a single defined value, and subsequent code must not behave as though i had multiple changing values.
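
For comparison, here is a minimal sketch of behavior that C leaves unspecified without making the program undefined: the order in which the two operands of + are evaluated. The names mark, order, pos and sum_in_some_order are made up for illustration. The compiler may pick either order, but the program stays valid and the sum is always the same.

```c
/* Hypothetical helper names; only the evaluation order is unspecified here. */
static char order[3];   /* records which call ran first; zero-initialized */
static int pos;

/* Record a tag and return the value, so the call order becomes visible. */
static int mark(char tag, int value) {
    order[pos++] = tag;
    return value;
}

/* The two calls may run in either order, but they are never interleaved,
   so the result is always 3 and order ends up as "ab" or "ba". */
int sum_in_some_order(void) {
    pos = 0;
    return mark('a', 1) + mark('b', 2);
}
```

Whichever order a given compiler chooses, every later read of order sees that one choice, which is the kind of consistency that undefined behavior does not promise.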

But I'm not really a C programmer, so feel free to correct me if I'm interpreting this wrong.




The classic example of undefined behavior is "nasal demons": when you hit undefined behavior, the compiler would be completely within its rights to make demons fly out of your nose.

"Undefined behavior" means the compiler can do anything. In your example, the compiler could execute one statement but not the other. Or it could execute both of them fifteen times. Or it could reformat your hard drive, or start a war. All would be legal according to the spec. (Feasibility of implementing such things is, of course, another question.) The more mundane consequences are more likely to be what you actually see, but the point is that you can't really reason about it in the abstract. You have to know exactly how your particular compiler handles it, and it can go well beyond just executing your code in a funny way.

For one real-world example that approaches "nasal demons", early versions of gcc would start a game of nethack if they encountered a #pragma directive in your program.


Thanks for that, I thought I knew a fair amount about C but I'd never run across the nethack easter egg.


Modern compilers have a tendency to just remove code that can be proven to result in undefined behavior. This can make it very difficult to track down a problem because you'll be staring at the source code, assuming the code was run, unable to fathom how it could possibly have ended up in a particular state.

If the compiler were required to return an "unspecified value", you'd at least know that your code DID run (and generated an incorrect result).
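
A commonly cited shape of this problem is a null check that comes after the dereference. The sketch below uses made-up function names (read_then_check, check_then_read); whether a given compiler actually drops the check depends on the compiler and the optimization level, so treat it as an illustration rather than a guarantee.

```c
#include <stddef.h>

/* Dereferences first, checks second. If p is NULL, the dereference is
   already undefined behavior, so an optimizer is allowed to assume
   p != NULL and remove the check below. */
int read_then_check(const int *p) {
    int value = *p;
    if (p == NULL)      /* may be deleted as "provably dead" code */
        return -1;
    return value;
}

/* Checks first, so the NULL case is well defined and must return -1. */
int check_then_read(const int *p) {
    if (p == NULL)
        return -1;
    return *p;
}
```

With a valid pointer both functions return the same value. The difference only shows up for NULL, where the first version gives the compiler license to behave as if the check were never written.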


This is definitely possible. Here's an example using LLVM's IR:

input:

  declare void @g()

  define void @f() {
    br i1 undef, label %if.end, label %if.else

  if.else:
    call void @g()
    br label %if.end

  if.end:
    ret void
  }

  
after optimizations:

  declare void @g()

  define void @f() {
  if.else:
    tail call void @g()
    ret void
  }


Most responses are hyperbole. Undefined behavior likely doesn't include making up new code; it's not certain where you will end up, but someplace in existing code is a certainty.


Even excluding obvious fantasy like nasal demons, and real but unusual cases like starting nethack, there are completely mundane things that are reasonably common in real compilers but that don't go "someplace in existing code", like aborting on integer overflow, or crashing when you access a bad pointer.


Agreed; or like doing nothing, or choosing whatever value the CPU produces when overflowing.
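
Since signed overflow can end up as any of those outcomes, one way to stay out of undefined territory is to test the operands before adding. This is a minimal sketch; checked_add is a hypothetical helper name, not a standard library function.

```c
#include <limits.h>
#include <stdbool.h>

/* Stores a + b in *out and returns true when the sum fits in an int.
   Returns false and leaves *out untouched when it would overflow.
   The comparisons never overflow themselves. */
bool checked_add(int a, int b, int *out) {
    if (b > 0 && a > INT_MAX - b)
        return false;   /* would exceed INT_MAX */
    if (b < 0 && a < INT_MIN - b)
        return false;   /* would go below INT_MIN */
    *out = a + b;
    return true;
}
```

Because the overflowing addition is never executed, the result no longer depends on what a particular compiler or CPU happens to do.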


Undefined behaviour means that anything can happen: the program might exit, start displaying random numbers, or format your hard drive.



