
> pointers by definition do not alias and it is impossible to make them alias

This is equivalent to saying that pointer arithmetic is disallowed. Pointers are by their nature in the C virtual machine offsets into a linear memory space, so for any two pointers, x and y, there exists a c such that (ptr_t)x+c == (ptr_t)y, and thus there can always be aliasing.




> Pointers are by their nature in the C virtual machine offsets into a linear memory space

Historically, many platforms - such as Multics or Windows 3.x - didn’t have a linear memory space; they had some form of memory segmentation. The industry has largely moved away from that towards the flat address space model, but back in the 1980s segmentation was still much more common. People used C on those platforms, and the standard was written to support them. The detailed control of memory segmentation is inherently non-portable, so the standard cannot address it, but it defines pointer arithmetic in such a way as to support those platforms: pointer arithmetic on unrelated pointers is undefined, because if the pointers belong to different memory segments the results can be meaningless.


Even on segmented memory architectures there is still a requirement that a void pointer can be cast (reversibly) to an integral type (7.18.1.4 in C99, 3.3.4 in C90, the first ISO standard).


This is not true in either of those C versions. C99§7.18.1.4 describes (u)intptr_t as optional, which (per §7.18) means that <stdint.h> need not provide those typedefs if the (compiler) implementation doesn't provide an integer type that allows reversible casting from a void pointer. Similarly, it's not clear in C90§3.3.4 that the implementation has to implement these casts in a reversible manner, although that is the obvious way of implementing it.

That being said, I can't think of an implementation that didn't support this, even if they did have to pack segment and offset information into a larger integer.
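A minimal sketch of how code can account for uintptr_t being optional. C99 §7.18 says the associated macros are defined only when the type is provided, so UINTPTR_MAX can serve as a feature test. The function names roundtrip_ok and demo_roundtrip are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if a void * survives a round trip through uintptr_t,
   0 if it does not, and -1 if the implementation has no uintptr_t.
   (Hypothetical helper, not part of any standard library.) */
int roundtrip_ok(void *p)
{
#ifdef UINTPTR_MAX
    uintptr_t bits = (uintptr_t)p;   /* pointer -> integer */
    void *back = (void *)bits;       /* integer -> pointer */
    return back == p;
#else
    (void)p;                         /* no suitable integer type exists */
    return -1;
#endif
}

/* Usage: where uintptr_t exists, the round trip must compare equal. */
void demo_roundtrip(void)
{
    int x = 0;
    assert(roundtrip_ok(&x) != 0);
}
```

Note that the guarantee covers only an unmodified value going there and back; it says nothing about integers produced by arithmetic.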


It wouldn’t work on capability-based architectures where pointer validity is enforced by tag bits. Cast the pointer to an integer, lose the tag; cast it back, tag is missing and pointer is invalid (except for privileged system software which has the authority to enable the tag bit on an arbitrary pointer.)

Do such architectures still exist? Do/did they support C? Well, 128-bit MI pointers on IBM i (fka AS/400) are like this - a hardware tag bit protects them against forgery - and ILE C lets you manipulate such pointers (it calls them “system pointers”, _SYSPTR), so that would be a real world example of a pointer in C which can be cast to an integer but cannot be cast back. (IBM i also has 64-bit pointers which aren’t capabilities and hence aren’t tag-protected and can be cast to/from integers - but they don’t point into the main operating system address space, which is a single-level store single address space shared by all non-Unix processes, they only point into per-process private address spaces, so-called “teraspaces”.)

I think some UB in C is motivated by allowing C to be used on these kinds of architectures, even if they are now exceptionally rare. When C was initially being standardised in the 1980s, many people thought these kinds of architectures were the future; I think they were surprised that they never went mainstream.


ARM Morello (on the front page earlier this week: https://news.ycombinator.com/item?id=30007474) is a capability-based architecture, with 129-bit pointers. Compilers for it provide a uintptr_t that is appropriate, but it is far stricter about the kinds of operations that can be done in the reverse direction.


Does that matter? If you change the integer then you are not allowed to cast the altered value to a pointer.

And a compiler could enforce this if it really wanted to.

When you only have to worry about pointer arithmetic on actual pointers, it's straightforward to make sure they never alias.


Actually no, the C abstract machine is not contiguous. If x and y point inside distinct objects (recursively) not part of the same object, there is no valid c that you can add to x to reach y.
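As a rough illustration of the distinction, here is a sketch in which the offset c exists only because both pointers point into the same array. The names distance_in_array and demo_same_object are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Distance in elements between two pointers.
   Defined only when both point into (or one past) the SAME array. */
ptrdiff_t distance_in_array(const int *from, const int *to)
{
    return to - from;
}

void demo_same_object(void)
{
    int a[8] = {0};
    int b[8] = {0};
    const int *x = &a[1];
    const int *y = &a[6];

    /* Same object: a valid c exists and x + c reaches y. */
    ptrdiff_t c = distance_in_array(x, y);
    assert(x + c == y);

    /* Distinct objects: distance_in_array(&a[0], &b[0]) would be
       undefined behavior, so it is deliberately not evaluated here. */
    (void)b;
}
```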

edit: allowing that would prevent even basic optimizations like register allocation.

edit: s/virtual/abstract/
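The register-allocation point can be sketched as follows, assuming an ordinary optimizing compiler. The function name sum_after_store is hypothetical.

```c
#include <assert.h>

/* `local` never has its address taken, and no valid pointer
   arithmetic on `out` can reach it. A compiler may therefore keep
   `local` in a register across the store through `out`. If any
   pointer could reach any object, `local` would have to be reloaded
   from memory after every such store. */
int sum_after_store(int *out, int n)
{
    int local = n * 2;   /* candidate for a register */
    *out = 7;            /* cannot modify local in the abstract machine */
    return local + *out;
}

void demo_sum_after_store(void)
{
    int target = 0;
    assert(sum_after_store(&target, 5) == 17);  /* 10 + 7 */
}
```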


The C abstract machine requires reversible transformations from pointer to integral types (7.18.1.4 in C99, 3.3.4 in C90, the first ISO standard).

Practically speaking, modern devices are in fact mostly flat, so you can of course do this, although you do brush up against undefined behavior to get there.


Reversible transformations don't imply a flat address space. All it means is that there's an integral type with enough bits to record any address in them.
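To make that concrete, here is a sketch of a made-up segmented pointer packed into a 32-bit integer. The struct far_ptr layout and both function names are assumptions for illustration, not any real compiler's representation; it also assumes the implementation provides uint16_t and uint32_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical far pointer: 16-bit segment selector plus 16-bit offset. */
struct far_ptr {
    uint16_t segment;
    uint16_t offset;
};

/* Pack both fields into one integer: enough bits to record any address. */
uint32_t far_to_int(struct far_ptr p)
{
    return ((uint32_t)p.segment << 16) | p.offset;
}

/* Unpack again; the conversion is exactly reversible. */
struct far_ptr int_to_far(uint32_t bits)
{
    struct far_ptr p;
    p.segment = (uint16_t)(bits >> 16);
    p.offset  = (uint16_t)(bits & 0xFFFFu);
    return p;
}

void demo_far_roundtrip(void)
{
    struct far_ptr p = { 0x1234, 0xFFFF };
    struct far_ptr q = int_to_far(far_to_int(p));
    assert(q.segment == p.segment && q.offset == p.offset);
}
```

The round trip is lossless, yet adding 1 to the packed integer for offset 0xFFFF yields a different segment selector rather than "the next byte", so integer arithmetic on the packed value does not behave like a flat address space.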


Pointer provenance is more complex and subtle.


What is the C virtual machine? I thought there wasn't one.


They mean the abstract machine, in terms of which the semantics are defined.



