I'm curious what exactly makes this undefined behavior. And in particular, what ...

pascal_cuoq · on April 14, 2020

I think this is plenty ok. For one thing, If a struct as a member of type T, it's ok to access it through a pointer to T (and also the address of the struct is guaranteed to be identical to the address of the first member). For another, you are using dynamically allocated memory, so the only thing that matters is the type of the pointer when the access is finally made. It doesn't matter that it was a Foo* before, if what you dereference is an int*.

This is different from pretending that the address of a struct s { int a; double b; } is the address of a struct t { int a; long long c; } and accessing it through a pointer to that. If you do that, C compilers will (given the opportunity) assume that the write-through-a-pointer-to-struct-t does not modify any object of type “struct s”. This is what the example st1 in the article illustrates.

The latter is what I suspect plenty of socket implementations still do (because there are several types of sockets, represented by different struct types with a common prefix). It is possible to revise them carefully so that they do not break the rules, but I doubt this work has been done.

flatfinger · on April 17, 2020

The ability to use pointers to structures with a Common initial Sequence goes back at least to 1974--before unions were invented. When C89 was written, it would have been plausible that an implementation could uphold the Common Initial Sequence guarantees for pointers without upholding them for unions, but rather less plausible that implementations could do the reverse. Thus, the Standard explicitly specified that the guarantee is usable for unions, but saw no need to redundantly specify that it also worked for pointers.

If compilers would recognize that operation involving a pointer/lvalue that is freshly visibly based on another is an action that at least potentially involves the latter, that would be sufficient to make code that relies upon the CIS work. Unfortunately, some compilers are willfully blind to such things.