It’s been a long time since I worked with C, but my recollection was that a) strict aliasing allows for optimizations that are actually worthwhile, and b) it’s really easy to type pun in a defined way using unions anyway.
Type punning with unions was actually forbidden by C89: you were only ever supposed to read the union member that was last written to. This may have been relaxed by C17; I can only find a draft online, but it allows type punning through unions as long as the member being read is no larger in size than the member last written.
What the standard says doesn't really matter. Only what major compilers do matters. GCC has decreed that type punning through unions is supported, therefore it might as well be standard.
That code does not violate the aliasing rules in any case.
The two functions you wrote are not the same: the first re-reads *x, which may return a value other than 2 if *x was modified between the first and third lines of the function by another thread (or by hardware). However, since x is not marked volatile, the compiler will usually optimize the first function to behave the same as the second.
#include <stdint.h>
#include <stdlib.h>

typedef struct Msg {
    unsigned int a;
    unsigned int b;
} Msg;

void SendWord(uint32_t);

int main(void) {
    // Get a 32-bit buffer from the system
    uint32_t* buff = malloc(sizeof(Msg));
    // Alias that buffer through message
    Msg* msg = (Msg*)(buff);
    // Send a bunch of messages
    for (int i = 0; i < 10; ++i) {
        msg->a = i;
        msg->b = i + 1;
        SendWord(buff[0]);
        SendWord(buff[1]);
    }
}
The explanation is: with strict aliasing the compiler doesn't have to think about inserting instructions to reload the contents of buff every iteration of the loop.
The problem I have is that when we rewrite the example to use a union, the generated code is the same whether or not we pass -fno-strict-aliasing. So this isn't a working example of an optimization enabled by strict aliasing. It makes no difference whether I build it with clang or gcc, for x86-64 or ARMv7; I don't think I did it wrong. We still have a memory load instruction in the loop. See https://godbolt.org/z/9xzq87d1r
Knowing whether a C compiler will make an optimization or not is all but impossible. The simplest and most reliable solution in this case is to do the loop hoisting optimization manually:
uint32_t buff0 = buff[0];
uint32_t buff1 = buff[1];
for (int i = 0; i < 10; ++i) {
    msg->a = i;
    msg->b = i + 1;
    SendWord(buff0);
    SendWord(buff1);
}
Note 1: The first thing that goes wrong for the Stack Overflow example is that the compiler spots that malloc returns uninitialized data, so it can omit reloading buff in the loop anyway; in fact it removes the malloc too. Here's clang 18 doing that: https://godbolt.org/z/97a8K73ss. I had to replace malloc with an undefined GetBuff() function, so the compiler couldn't assume the returned data was uninitialized.
Note 2: Once we're calling GetBuff() instead of malloc(), the compiler has to assume that SendWord(buff[0]) could modify the buffer's contents, and therefore it has to reload them in the loop even with strict aliasing enabled.
Alias analysis is critical. Knowing which loads and stores can alias one another is a prerequisite for reordering them, hoisting operations out of loops, and so forth. The compiler needs to do that work regardless, and it needs to do it on values of the same type as each other, not only on types that happen to differ.
Knowing that different types don't alias is a fast path in that analysis, or a crutch for a lack of link-time optimisation. The price is being unable to write code that does things like initialise an array using normal stores and then operate on it with atomic operations, implement some floating-point operations, access network packets as structs, mmap hashtables from disk into C structs, and so forth. An especially irritating one is the hostility to arrays that are sometimes a sequence of SIMD types and sometimes a sequence of uint64_ts.
Though C++ is slowly accumulating enough escape hatches to work around that (std::launder et al.), C is distinctly lacking in the same.
Alias analysis is important. It's the C standard's type-based "strict aliasing" rules which are nonsense and should be disabled by default.
This is C. Here in these lands, we do things like cast float* to int* so that we can do evil bit level manipulation. The compiler is just gonna have to put that in its pipeline and compile it.