I wanted to like this article because I'm been thinking about this a lot in the context of game development, noticed a few things. One thing I'll say from briefly playing with this - the code leaves lots out a looks ostensibly simpler than it really is. Would very much appreciate tips / pointers on this or a more fleshed out and working implementation of the code.
For the bulk data with holes code:
First, there's an initialization step that has to happen the first time you allocate your bulk_data_t. Namely, you need to iterate through every item in the list and set its next_free item to the item following it, looping the last item back around to zero. You also need to do this for all the items between the new size and old size every time you resize your item list.
Second, safe iteration over all of the bulk data doesn't seem possible without adding some sort of flag to indicate whether or not an item is free.
No need to preinitialize the list. When allocating an element you have two cases: a) the free list is non empty: you pop an element by making the head point to its successor; b) the free list is empty: you increase the size of the vector by one.
To deallocate an element you simply set the elemnt next to the whatever is the current head, then make head point to this element.
You start with an empty vector.
An element is either allocated or in a free list, that's why the next pointer can be kept in an union.
I would say it depends on how one might want to handle it, like when you create an item_t you set next_free = -1 as a flag to indicate that it is not free. And have bulk_data_t's 0th position's next_free be 0.
I think I've figured out roughly what the author intended, code below [0].
First, it looks like he's relying implicitly on data stored in std::vector. Namely vectors have both a capacity and a size. The capacity is total number of allocated elements. The size is the total number of elements stored actually stored.
Second, vector::resize won't reallocate until it runs out of capacity, but it will give you access to extra elements if you need them. So this is used to lazily re allocate while bumping up the size of the vector.
Both of these effectively make it "do the right thing" by leaning on the vector storing both size and capacity.
If you hand manage those values yourself you can get a pretty compact C implementation without a lot of code.
One last thing: Using a union here for the item_t is pretty much guaranteed to get you a segfault. The whole thing should really be a struct. This also allows for setting next to sentinel value if necessary.
The author's pseudo-code implies that bd->items is full already, since it is probably initialized to the initial number of items that are added. This is probably for memory optimization for games. Also explains why resize increases the size by 1, just enough memory to add the new item.
This way you don't need to worry about keeping track of size and capacity either.
The reason I suggested -1 is because when we iterate through bd->items, we need a way to know if it's a valid value or just "holed".
> This way you don't need to worry about keeping track of size and capacity either.
The only reason you don't have to worry about this is because std::vector handles it for you, at least in the code examples provided by the author. If you choose to go with a pure C implementation (which is what I'm trying out) then you will have to keep track of these.
> The reason I suggested -1 is because when we iterate through bd->items, we need a way to know if it's a valid value or just "holed".
Yep, I was able to get an example using -1 as a sentinel working and passing fuzz testing.
If you want to iterate through your objects without some external mean to track the live objects (like an embedded next pointer in the object itself), then the swap and pop idiom is a better solution (iterating via indexing is going to be significantly faster than following the next pointers).
For the bulk data with holes code:
First, there's an initialization step that has to happen the first time you allocate your bulk_data_t. Namely, you need to iterate through every item in the list and set its next_free item to the item following it, looping the last item back around to zero. You also need to do this for all the items between the new size and old size every time you resize your item list.
Second, safe iteration over all of the bulk data doesn't seem possible without adding some sort of flag to indicate whether or not an item is free.
Am I missing something here?