> Just to be clear, using volatile on the magic number will ensure this is avoid...

vlovich123 · on Oct 20, 2021

Best bet is to use the APIs that standards/compilers guarantee things about. The compiler is totally free to optimize volatile variables under very specific situations. Volatile has a very specific definition, but it does not mean “compiler is not allowed to optimize this”.

Trying to structure code to trick the compiler is a bad idea. The compiler authors know the standard better than you and eventually the compiler will exploit your misunderstanding.

DSMan195276 · on Oct 20, 2021

I agree, but FWIW I'm not claiming that volatile means "the compiler can't optimize this". This program would still function anyway even if the volatile is optimized out, so this is much more of a hint rather than a "this has to happen". A simple volatile store like this is also very unlikely to be optimized out, I don't know of any compiler that would attempt to do it and frankly it would break lots of stuff if it did (just because it can't be optimized out doesn't mean it can't cause other problems though). But when you get down to it trying to catch these use-after-free errors is never going to be guaranteed to work since as we've established use-after-free already breaks the standard itself. Still, using one of the various 'secure zero' APIs if you have one is definitely better, though the logic would need to be changed slightly.

vlovich123 · on Oct 21, 2021

> A simple volatile store like this is also very unlikely to be optimized out,

I think there's been godbolt links posted in this thread showing this assumption is wrong. Dead store elimination applies just fine to volatiles.

DSMan195276 · on Oct 21, 2021

I'd really like to see the example if that's the case, that would be very surprising to me and frankly sounds like a bug if it's really just a simple usage. The gcc documentation[0] suggests it will always emit a load and store, even when the result is completely ignored and is effectively dead code. I (and others) have interpreted this to mean they will never optimize out the actual load or store regardless of context (though reordering and such is still on the table in some cases, obviously, but that doesn't matter for this usage).

As context, the Linux Kernel uses volatile to ensure loads and stores happen, that's ultimately how READ_ONCE and WRITE_ONCE work[1]. If that's actually broken in such a simple case I think they'd like to know xD

[0]: https://gcc.gnu.org/onlinedocs/gcc/Volatiles.html

[1]: https://elixir.bootlin.com/linux/latest/source/tools/include...

Edit: To be clear, I looked for the example you mentioned but couldn't find it. I'm somewhat wondering if you were thinking of the example I posted, since I used volatile to get gcc to not optimize the store out :P

vlovich123 · on Oct 21, 2021

It’s literally in the top level link I supplied [1]. You may trick some compilers today but there’s no guarantee that tomorrow’s compilers won’t get smart and leave you scratching your head about what went wrong. Memory allocation and deallocation is special in the standard. I agree it’s a bit weird but there are reasons for it (this is a form of dead store elimination that isn’t the same as normal dead store elimination which the compiler can’t optimize for volatiles because of what volatile means semantically). Your example with the kernel doesn’t apply because there’s no free happening there.

I’m genuinely amazed at the response. There’s literally an API defined that has the contract you want and your response is “yeah, but I want to write it a totally other way the standard doesn’t allow”. Just use memset_s. It’s a compiler builtin so the generated code is as efficient (more so) as compared with a volatile version except actually safe. Volatile has a totally different purpose and isn’t suitable to try to write a value before calling free.

I’ll leave writing a godbolt example of writing to a volatile right before a free in the same compilation unit at O3 for you to try out.

[1] http://www.daemonology.net/blog/2014-09-04-how-to-zero-a-buf...

DSMan195276 · on Oct 22, 2021

First I will point out, as I said memset_s is a good solution, I have no problem with that and would suggest it's use if it's possible. My complaint is simply the suggestion that volatile doesn't work here, it does.

As far as that article goes, the example for `secure_memzero` works and you will not find any compiler that will 'optimize that out', it would be a bug. And as I linked, the gcc documentation says as much. With that, memory allocation is not as special as you're making it out to be, normal memory can be volatile in perfectly valid situations (even ones mentioned in the standard), and just because it's related to a free() does not mean the compiler is now allowed to remove a volatile dead store - and even if you think it does, gcc will not do that.

Here's an example of such a case[0]. A signal handler is able to view the object being set right before the free() call, and a signal could trigger at that point, but the compiler still optimizes it out (which is correct). Using volatile on the variable to ensure all loads and stores actually happen (and are visible to the signal handler) is the suggested way, and if you do that then the code does set the value before the free().

As for your suggestion of writing to a volatile right before a free(), I'm not sure if you tried but it works just fine as expected, look[1]. I am perfectly confident in saying you will never find an example where the volatile store doesn't happen. With that, if it was willing to make such an optimization in the first place, don't you think my original example that used it to avoid dead store elimination and memory allocation elimination wouldn't have worked in the first place? ;)

[0]: https://godbolt.org/z/WWPz5Gqjo [1]: https://godbolt.org/z/anc1cfnPs

MauranKilom · on Oct 21, 2021

I have at times (ab)used volatile to aid in debugging sessions, something like

  volatile bool doCheck = false;

  if (doCheck)
  {
    // code I want to enable at some point during debugging
  }

The idea is that I attach a debugger, and then only at a certain point enable doCheck.

I was baffled to learn that MSVC will happily constant-fold the false into the if, as long as the variable is function-local. The variable still exists and I can change it in the debugger, but it doesn't actually impact control flow as intended. The "solution" is to move it to e.g. global scope (this is a debugging hack, remember).

Not an exact match for what you asked, but I think a good reminder that optimizers work in mysterious ways, and sprinkling in volatile may confuse the programmer more than the optimizer...