C++ also pays a price for insisting not only that objects have addresses, but that those addresses are distinct.
If you've got 1.6 billion empty tuples in variable A, 1.4 billion in variable B and 1.8 billion in variable C, C++ can't see a way to do that on a 32-bit operating system. It needs to give each empty tuple an address, so it must think of 4.8 billion distinct integers between 0 and 2^32, and it can't do that, so your program won't work.
Carbon is still far from finished, but if objects needn't have addresses it can do the same as Rust here and cheerfully handle A, B and C as mere counters; counting 1.6 billion, 1.4 billion and 1.8 billion respectively is fine. Empty tuples are indistinguishable, so I needn't worry about giving you back the "wrong" empty tuple when you remove one; I can just give you a fresh one each time and decrement the counter.
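A rough sketch of that counter behaviour on the Rust side. The helper name `units` is made up for illustration, and the count is scaled down so the example runs quickly; the point is only that a `Vec<()>` stores no elements and tracks a length.

```rust
use std::mem::size_of;

// Hypothetical helper: build a Vec holding `n` empty tuples.
fn units(n: usize) -> Vec<()> {
    vec![(); n]
}

fn main() {
    // The empty tuple occupies no storage at all.
    assert_eq!(size_of::<()>(), 0);

    let mut a = units(1_600);
    // No heap allocation is needed for zero-sized elements,
    // so the reported capacity is usize::MAX.
    assert_eq!(a.capacity(), usize::MAX);

    // Removing an element hands back a fresh `()` and decrements the length.
    assert_eq!(a.pop(), Some(()));
    assert_eq!(a.len(), 1_599);
    println!("len = {}", a.len());
}
```

Nothing here depends on which empty tuple comes back from `pop`, which is the reason a bare counter is enough.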
Gamedev, presumably. It's not unusual to have a LOT of triangles and coordinates flying around and GameDev is basically locked to C++ (for many good reasons even if I find C++ distasteful).
Admittedly, those probably aren't running on 32-bit systems, nowadays ...
Carbon is probably Gamedev's best chance of moving beyond c++
But you are going to need some minimal buy-in from Microsoft first, if not Sony and Nintendo too.
The great thing about carbon is that you can incrementally shift a codebase over from c++, and there are no problems interacting with existing c++ APIs, so the required amount of buy-in is pretty much just "yes, you can use a 3rd party compiler" and maybe some improvements to the debugger.
You actually have to wrap all the c++ functions in c functions, and then call those c functions from rust. Which requires either making manual wrappers, or automated wrapping tools that handle the specific idiosyncrasies of the library you are calling.
Which is a massive barrier; Gamedev standardised on c++ and there are so many c++ libraries.
In comparison, Carbon is designed from the ground up to automatically have bi-directional inter-op with c++. It's what the language is designed to do.
I agree with you that Rust isn't a perfect match for video game development. Jonathan Blow's Jai is targeting that sort of program, it's not offering Rust's safety promises (Jon is confident this doesn't matter) but it can hardly be less safe than C++.
However, the purpose of unsafe is to mark code that does something which a human needs to check for correctness. We're not saying "This isn't OK" but "The compiler can't check this is OK, so a person needs to do so". If you write C++ today, all of your code is like that. I'd be astonished if more than a tiny proportion of the actual game code in a modern video game needed that treatment in Rust.
For example, Rust's approach to late initialization is std::mem::MaybeUninit<T>, a wrapper type which says to the compiler hey, I am not initialized yet, it's OK to write a T value into me, but you can't read my value until somebody says they're done initializing me. The "say you're done initializing" part is indeed unsafe, but that's a small part of the program. That intern writing a zone preview gadget? They don't need to be writing unsafe initialization code, when they try to access preview_zone.orb_color the compiler tells them this is MaybeUninit and so they can't read it. "Huh, apparently orb_color is MaybeUninit ?" "Oh, just show all the orbs in preview as orange, it'll be fine". You just avoided Undefined Behaviour and possibly a trip to the land of "But it works in debug builds".
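A minimal sketch of that split, assuming a made-up `PreviewZone` struct with an `orb_color` field stored as a `u32` (neither the struct nor the helper `finish_init` comes from a real codebase). Writing is safe; only the "I promise this is initialized" step needs `unsafe`.

```rust
use std::mem::MaybeUninit;

// Hypothetical type, loosely modelled on `preview_zone.orb_color` above.
struct PreviewZone {
    orb_color: MaybeUninit<u32>,
}

// Hypothetical helper: store a colour, then declare initialization finished.
fn finish_init(mut zone: PreviewZone, rgb: u32) -> u32 {
    // Writing into a MaybeUninit is safe: nothing is read.
    zone.orb_color.write(rgb);
    // SAFETY: the field was written on the line above.
    unsafe { zone.orb_color.assume_init() }
}

fn main() {
    let zone = PreviewZone { orb_color: MaybeUninit::uninit() };
    // let c: u32 = zone.orb_color; // rejected by the compiler: mismatched types
    let color = finish_init(zone, 0xFFA500);
    println!("orb color: {:#X}", color);
}
```

The commented-out line is the intern's situation: the type mismatch shows up at compile time, not as a garbage read at run time.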
There are a zillion ways to handle sparse arrays or redundant values that don't depend on esoteric language features. You're not even losing any efficiency, you're probably gaining some because you know your domain.
I may be overlooking something, but I don’t see a realistic use case for having multiple empty tuples without an address.
If you have those, I don’t see any way to discriminate between them. If so, why would you ever want to have more than one of a given type? Is there some template code that might accidentally try to create them?
Rust `Set<T>` is implemented as `Map<T, ()>`. Go is similar, but you have to do it manually. Zero-sized types mainly have use in generics (so yes-ish by templates, but not accidental).
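To make the generics point concrete, here is a toy set built on a map with `()` values. `UnitSet` is a name invented for this sketch and is not how the standard library spells it; it only illustrates why a zero-sized value type makes the set come for free.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::mem::size_of;

// Hypothetical toy set: a map whose values are the zero-sized `()`.
struct UnitSet<T> {
    map: HashMap<T, ()>,
}

impl<T: Eq + Hash> UnitSet<T> {
    fn new() -> Self {
        UnitSet { map: HashMap::new() }
    }

    // Returns true if the value was not already present.
    fn insert(&mut self, value: T) -> bool {
        self.map.insert(value, ()).is_none()
    }

    fn contains(&self, value: &T) -> bool {
        self.map.contains_key(value)
    }

    fn len(&self) -> usize {
        self.map.len()
    }
}

fn main() {
    // A (key, ()) pair is no bigger than the key alone.
    assert_eq!(size_of::<(u64, ())>(), size_of::<u64>());

    let mut s = UnitSet::new();
    assert!(s.insert("orb"));
    assert!(!s.insert("orb"));
    assert!(s.contains(&"orb"));
    assert_eq!(s.len(), 1);
}
```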
If you write the equivalent in C with GCC extensions, or Rust, sizeof(Foo) would be 8, the same as sizeof(int64_t); `marker` doesn't take up any extra space. In C++, however, sizeof(Foo) is 16, because `marker` must take up at least 1 byte to have a unique address, which gets expanded to 8 bytes due to alignment.
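The Rust half of that claim can be checked with a small sketch. The snippet being discussed is not shown here, so the shapes below are guesses: only the names `Foo` and `marker` come from the text, while `Marker` and the `value` field are assumptions.

```rust
use std::mem::size_of;

// Assumed shape: an empty marker type with no fields.
struct Marker;

// Assumed shape: one 64-bit integer plus the zero-sized marker.
struct Foo {
    value: i64,
    marker: Marker,
}

fn main() {
    let foo = Foo { value: 42, marker: Marker };
    let Foo { value, marker: _ } = foo;

    // The marker contributes nothing to the size of the struct.
    assert_eq!(size_of::<Marker>(), 0);
    assert_eq!(size_of::<Foo>(), size_of::<i64>());
    println!("value = {}, size_of::<Foo>() = {}", value, size_of::<Foo>());
}
```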
Now, as of C++20, you can reduce sizeof(Foo) to 8 by tagging `marker` as [[no_unique_address]]. However, this has drawbacks. First of all, it's easy to get situations like this in highly generic code, so it's hard to predict where [[no_unique_address]] needs to be applied (and applying it everywhere would be verbose).
Second of all, [[no_unique_address]] is dangerous, because it doesn't just allow empty fields to be omitted, it also allows nonempty fields to have trailing padding bytes reused for other fields. Normally that's okay, but if you have any code that performs memcpy or memset or similar based on the size of a type, such as:
I don’t really get the concern here. [[no_unique_address]] seems to be designed for use with empty types. As such, what does it mean to write to such a field? These are really meant to be tags, no?
> C++ also pays a price for insisting not only that objects have addresses, but those addresses are distinct.
With only three exceptions I can't think of a case where by default one would not want that. In the case you describe you would want all those objects to have unique addresses. If you wanted to have them overlap you should go to the effort of making it happen the way you want it to -- how could the compiler guess on its own?
The exceptions BTW are union elements, the first class/struct element, and base object addresses (in `class foo : bar ...`, when you make a foo, the address of its bar is the same as the address of the foo itself).
> In the case you describe you would want all those objects to have unique addresses.
I certainly don't. If you want unique addresses for indistinguishable objects I guess C++ is the perfect language for you, but what are you expecting to do with these addresses?
I don't want an existing object? You seem to be having a lot of trouble with this very basic idea. Objects do not necessarily need distinct addresses. "But that's how C++ works" is just a fact about how C++ works, it has no larger significance, and where it conflicts with the sensible way to do things it's actually a defect.
How would you define object identity if different objects can have the same value and the same address?
To my mind, an object is a value with a unique identity. How else would you define it?
And if you want a billion empty values, then yes, those could all be implemented by the same object - it's pretty easy to implement, even though it would be nice for a compiler to do it automatically (like how Java normally gives you the same Integer object if you box the same int value in two different places, even though it will give you a different Integer of the same value if you explicitly ask for it with new).
fn main() {
    let vs = [(), (), (), (), ()];
    for v in vs {
        println!("{:?}", v);
    }
}
This loop will print "()" five times, because `v` will iterate through all five elements of `vs`. Are these values identical or not? I don't know, and it doesn't matter, does it? But I think of them as different values: they are different members of `vs`.
They are the same value, but with different identities. vs[0], vs[1] etc. are different objects, but they are all initialized with the same value. The difference is kind of irrelevant for a constant object like an empty tuple.
But imagine the following program:
fn main() {
    let mut vs = [(1,), (1,), (1,), (1,), (1,)];
    vs[0].0 = 2;
    for v in vs {
        println!("{:?}", v);
    }
}
Here we can see that identity is in fact important: `vs[0].0 = 2` only modifies one of the objects, even if all of them initially had the same value.
By the way, note that your example should be completely equivalent to the following C++ program:
#include <iostream>
#include <tuple>
#include <vector>

int main() {
    using namespace std;
    auto vs = vector<tuple<>>{tuple<>(), tuple<>(), tuple<>(), tuple<>(), tuple<>()};
    for (auto v : vs) {
        (void)v; // std::tuple has no operator<<, so print the empty tuple by hand
        cout << "()\n";
    }
    return 0;
}
> They are the same value, but with different identities. vs[0], vs[1] etc. are different objects
What allows you to say that they have different identities? They are zero-sized types. Literally zero. `()` is a type and `()` is its only possible value. log2(1) = 0 bits. If you look into the machine code you will not find anything that you can call an object. The very existence of `()` is a shared dream of the programmer and the compiler, and `()` ceases to exist after compilation.
> But imagine the following program
In this program you are using types of size > 0 bits, which allows more than one value of that type. But even then I wouldn't bet that they are different objects only because you changed one. If you didn't, it would be completely logical to replace them all with just one value in memory, while pretending that there are many copies of it.
In this particular case I don't believe rustc would manage to do it even if we dropped the mutation from the example. And I can't think of a case where it would manage. But I wouldn't bet that such a case doesn't exist.
> By the way, note that your example should be completely equivalent to the following C++ program
Hmm... And if we write the C++ equivalent of something like:
fn main() {
    let vs = [(), (), (), (), ()];
    for v in vs.iter() {
        println!("{:?} {:?}", v, v as *const ());
    }
}
Will all addresses be equal? If they are, then the proposition "C++ also pays a price for insisting not only that objects have addresses, but those addresses are distinct" is false.
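For the Rust side of that question, here is a rough sketch that checks whether the elements of an array share one address. The helper `all_same_address` is invented for illustration; it compares each element's address with the first element's.

```rust
use std::mem::size_of;

// Hypothetical helper: true when every element of the slice sits at the
// same address as the first one (trivially true for an empty slice).
fn all_same_address<T>(items: &[T]) -> bool {
    items.iter().all(|item| std::ptr::eq(item, &items[0]))
}

fn main() {
    let units = [(), (), (), (), ()];
    let ones = [(1,), (1,), (1,), (1,), (1,)];

    // Five zero-sized elements add up to a zero-sized array.
    assert_eq!(size_of::<[(); 5]>(), 0);

    // With a stride of zero, every element has the same address.
    assert!(all_same_address(&units));

    // Non-zero-sized elements are laid out one after another.
    assert!(!all_same_address(&ones));
}
```

So in Rust the five `()` values are not distinguishable by address, while the five `(1,)` values are.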
> What allows you to say that they have different identities?
Syntactic analogy with non-zero sized types. If you want a special case that says "there is a single object of the zero-sized type" that's ok, but it's a special case. All other types have a difference between object identity and value equality.
> Will all addresses be equal?
No, because C++ doesn't optimize for ZSTs, and it doesn't modify semantics for const. I agree that C++ pays a price for these two things, but I don't think it's because it "insists all objects have different addresses"; I believe that is just a consequence of not giving special semantics to const beyond disallowing writes.
comex mentions this above. It's a nice hack, but it doesn't actually fix the problem I was talking about. You can give this attribute to a data member, which allows you to do the ZST marker type trick that is available in other languages, but you can't apply the attribute (at least, not as documented) to a type itself, so those empty tuples are obliged to take up one byte each.
But why not use an empty base class instead? That seems both more idiomatic and simpler. And since C++ supports multiple inheritance, there's no limit to how many empty base classes you could have.
Using an empty base class is idiomatic C++ for solving the problem comex is talking about (marker types), although it has some pretty annoying consequences. Whether it's "simpler" probably depends on whether "simpler" means "what I'm used to in C++" to you.
But it doesn't have any bearing on the problem I was talking about, if you were to instantiate your empty class it has non-zero size. C++ just doesn't have ZSTs.
Ok, I misunderstood the thread a bit and was only commenting on the marker issue. I do think that a type-level marker is "simpler" in a more general sense than checking whether a struct has a particular field, even if that field takes up no space in the struct.
Now, related to ZSTs, I think the main reason why C++ doesn't have this is that C++ really doesn't have any good support for constant values. Sure, you can mark something `const` but that rarely means too much - specifically, it can never add any new semantics to a type, it only removes some options.
One consequence of this is your example - a `const vector<T>` can't be a simple counter of how many T are in the vector even if the T type has a single possible value: the language can't really use the fact that the array is `const` to change its layout.
An even worse consequence is that a `const vector<T> * const` (a const pointer to a const vector of T) is not covariant (it can't be initialized with a const pointer to a const vector of a subtype of T), even though it should be: the language just won't use the fact that it is `const` in this way.