This happened over 20 years ago: I was helping a co-worker debug an issue they were having with a Windows application written in Delphi. This was before Google was a thing, and waaay before Stack Overflow, so getting help with these kinds of problems was a bit more involved.
As for the issue: if they ran the offending code in the debugger, it worked flawlessly, but it failed every time in the production build. Usually this would point to some kind of race condition, but the code section was innocuous. It was essentially running the Delphi equivalent of strpos on a local variable.
I was comparing the build flags between the debug version and the release version, and one thing that caught my eye was the compiler's optimization flags. Lo and behold, if you brought the optimization level down two notches, the bug went away.
I don't think I ever dug into the disassembly to submit a bug report, even though the optimizer was almost certainly doing something it shouldn't, but at least we had found the source of the issue.
Since we didn't want to actually disable optimizations on our release build, the "permanent fix" was to write our own strpos equivalent in such a way that compiler optimizations didn't break it.
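For a rough idea of what such a replacement looks like, here is a minimal sketch in C rather than Delphi. It is not the code we actually wrote; the name find_substring and the "return -1 when not found" convention are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for a hand-written strpos-style routine.
   Returns the zero-based index of the first occurrence of needle in
   haystack, or -1 if needle does not occur (or either pointer is NULL). */
long find_substring(const char *haystack, const char *needle)
{
    if (haystack == NULL || needle == NULL)
        return -1;

    for (size_t i = 0; ; i++) {
        size_t j = 0;

        /* Advance while the needle still matches at offset i. */
        while (needle[j] != '\0' && haystack[i + j] == needle[j])
            j++;

        /* Reached the end of the needle: full match starting at i. */
        if (needle[j] == '\0')
            return (long)i;

        /* Hit the end of the haystack first: no later match is possible. */
        if (haystack[i + j] == '\0')
            return -1;
    }
}
```

With this sketch, find_substring("hello", "lo") gives 3 and find_substring("hello", "xyz") gives -1. The plain nested loop is deliberately unclever, which leaves less for an optimizer to get wrong.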
Similarly, found a bug in clang around 2010 that would only happen with max optimization. Actually did manage to track down the root cause: an array access would fail if the index was greater than 255. It went something like this: on ARM (this was building for iPhone) the LDR (Load Register) instruction can store the memory offset (array index) within the instruction itself if it fits within 1 byte; otherwise the offset has to be supplied through a register. One of the two cases was faulty.
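For illustration, a check along these lines would exercise both sides of the 255 boundary. This is a sketch in C, not the original code: the function names, the table size, and the fill pattern are all made up, and whether a compiler really encodes these accesses with an immediate offset depends on the target and instruction set.

```c
#include <stddef.h>
#include <stdint.h>

enum { TABLE_SIZE = 512 };  /* hypothetical size, just large enough to cross 255 */

/* Value that element i is expected to hold (arbitrary pattern). */
uint32_t expected_value(size_t i)
{
    return (uint32_t)(i * 3 + 1);
}

/* Fill the table so every element encodes its own index. */
void fill_table(uint32_t *table)
{
    for (size_t i = 0; i < TABLE_SIZE; i++)
        table[i] = expected_value(i);
}

/* Constant indices on either side of the boundary; an optimizer is free
   to turn each of these into a single load with a fixed offset. */
uint32_t read_at_255(const uint32_t *table) { return table[255]; }
uint32_t read_at_256(const uint32_t *table) { return table[256]; }

/* Returns 1 if both loads produce the expected values, 0 otherwise. */
int boundary_reads_ok(const uint32_t *table)
{
    return read_at_255(table) == expected_value(255)
        && read_at_256(table) == expected_value(256);
}
```

On a correct compiler boundary_reads_ok returns 1 at every optimization level; a bug like the one described would show up as a mismatch at only one of the two indices, and only in the optimized build.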
Was just about to report this, but my Mac got upgraded to the next version of OS X, which magically solved the problem. What does the OS upgrade have to do with compiling? In the world of Macs, Xcode was also upgraded along with the OS, and in the newer version it was already fixed. Dangit!