This one is interesting... it looks like sibling call optimization was a bit too...

caf · on Aug 29, 2019

I think the incorrect propagation is actually from one instance of walk() to the next recursive invocation (which has been replaced by a loop).

tedunangst · on Aug 28, 2019

But the proof that it's non null is the deref in main?

I'm not sure I understand why people say this has nothing to do with the UB already derefed therefore not null optimization. How else is the compiler inferring the pointer could be not null?

klodolph · on Aug 28, 2019

> But the proof that it's non null is the deref in main?

This is an incorrect interpretation, it only proves that 'node' is non-null. Recall that 'walk' is called twice—once for 'node', and once for 'node->child'. The compiler bug (and this IS a compiler bug, make no mistake) is that the the proof that 'node != NULL' is incorrectly treated as a proof that 'module != NULL', which is not sound, and the bug manifests as incorrect program behavior during the recursive call 'walk(module->child, ...)'.

tedunangst · on Aug 28, 2019

Yeah. The bug is that the compiler propagated not null from main.node to walk.module.

But how did the compiler determine main.node is not null?

klodolph · on Aug 28, 2019

> But how did the compiler determine main.node is not null?

The test case in the bug happens to use UB to generate the constraint, but this is just an example of how to trigger the bug. The bug itself is not related to UB, and there are other ways to generate the same constraint.

Here is an example on godbolt showing what I mean: https://godbolt.org/z/pRW97L

You'll notice that I removed the UB, but the bug is still there. You can see that the 'return;' statement is not emitted because it is colored with a white background. So the bug is not related to UB at all.

saagarjha · on Aug 28, 2019

> You can see that the 'return;' statement is not emitted

Uh, what? The code looks correct to me:

  func:
    test rdi, rdi
    je .L7
    mov esi, 1
    jmp walk
  .L7:
    ret

There's a conditional branch on $rdi that jumps to .L7 (which returns) when node is NULL.

klodolph · on Aug 28, 2019

You're looking at 'func', the error is in 'walk'. Here is the top of 'walk':

  walk:
    push rbx
    mov rbx, rdi
    test esi, esi
    je .L3
    mov rdi, QWORD PTR [rdi]
    call walk

Note that esi = cleanup = 1 so the branch to .L3 is not taken, and the recursive call to 'walk' is taken. The crash will occur on the instruction before the call,

    mov rdi, QWORD PTR [rdi]

Because rdi = 0 on the second call to 'walk'.

saagarjha · on Aug 29, 2019

Oh, yes, walk is still broken. I was focusing on func because you had added that and it coincidentally also had a white return statement.

tedunangst · on Aug 28, 2019

Ok, this is the clarification I was seeking.

bialpio · on Aug 28, 2019

There's a deref of main.node and dereferencing null pointer is UB so it couldn't have been null?