I found this excellent talk to be complementary to my talk[1] on data-oriented G...

apitman · on Sept 13, 2018

I highly recommend Raph's talk to anyone writing Rust apps that manage non-trivial state. Ah heck, just watch it no matter what; it's great.

raphlinus · on Sept 13, 2018

Aww, thanks :)

MichaelMoser123 · on Sept 13, 2018

# Use a structure-of-arrays rather than an array-of-structures.

Could you please explain this in more detail?

# To build graph-like structures, reach for a Vec of components, and indexes into the Vec for relationships

And that also allows you to reference deleted nodes?

steveklabnik · on Sept 13, 2018

> Use a structure-of-arrays rather than an array-of-structures.

instead of this:

  struct World {
      players: Vec<Player>
  }

  struct Player {
      name: String,
      health: i64,
  }

which is an "array of structures" (see Vec<Player>), you do this:

  struct World {
      player_names: Vec<String>,
      player_health: Vec<i64>,
  }

"A structure of arrays".

"Player zero" is no longer an index into a players array, but an index into many arrays, all of which hold certain kinds of data about a player.

> And that also allows you to reference deleted nodes?

... which is why the talk then references generational indices as a way of dealing with it.
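
For anyone who hasn't seen generational indices before, here is a minimal sketch of the idea, not the talk's code or any crate's API. The names `Handle`, `Slot`, and `Arena` are made up for illustration. Each slot carries a generation counter that is bumped on removal, so a handle issued before the removal no longer matches and lookups return None instead of silently hitting whatever reused the slot.

```rust
/// Hypothetical handle: a slot number plus the generation it was issued in.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Handle {
    index: usize,
    generation: u64,
}

struct Slot<T> {
    generation: u64,
    value: Option<T>,
}

/// Hypothetical arena, invented for this sketch.
pub struct Arena<T> {
    slots: Vec<Slot<T>>,
    free: Vec<usize>, // indexes of empty slots that can be reused
}

impl<T> Arena<T> {
    pub fn new() -> Self {
        Arena { slots: Vec::new(), free: Vec::new() }
    }

    /// Stores a value, reusing a free slot if one exists.
    pub fn insert(&mut self, value: T) -> Handle {
        if let Some(index) = self.free.pop() {
            let slot = &mut self.slots[index];
            slot.value = Some(value);
            Handle { index, generation: slot.generation }
        } else {
            self.slots.push(Slot { generation: 0, value: Some(value) });
            Handle { index: self.slots.len() - 1, generation: 0 }
        }
    }

    /// Removes a value; bumping the generation invalidates old handles.
    pub fn remove(&mut self, handle: Handle) -> Option<T> {
        let slot = self.slots.get_mut(handle.index)?;
        if slot.generation != handle.generation {
            return None;
        }
        let value = slot.value.take()?;
        slot.generation += 1;
        self.free.push(handle.index);
        Some(value)
    }

    /// Returns None for out-of-range or stale handles.
    pub fn get(&self, handle: Handle) -> Option<&T> {
        let slot = self.slots.get(handle.index)?;
        if slot.generation == handle.generation {
            slot.value.as_ref()
        } else {
            None
        }
    }
}
```

In this sketch a handle to a deleted node can still exist, but `get` refuses to resolve it, even after the slot has been reused for a different value.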

philipov · on Sept 13, 2018

This sounds like it is just going to trade one set of problems for another. It makes it impossible to write generic container types. What if elements of the same collection need to have different structure? What benefits justify this extremely tight coupling?

db48x · on Sept 13, 2018

It's frequently done that way for performance. Imagine a game of Starcraft, with 1000 zerglings rushing your base. The game has to repeatedly loop over all the zerglings to move them. Since there are lots of other fields tracking all of the other data about each zergling, the normal AOS approach has poor data locality; you load a cache line and then you only touch a few bytes of it. With the SOA approach you're looping over an array of positions, so every byte that you fetch from memory ends up being used.
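
A rough sketch of what that loop could look like, assuming a made-up `Units` struct with one Vec per field (the field names are mine, not from any real game). The movement step only walks the position and velocity arrays; names and health sit in separate arrays and are never pulled into cache by it.

```rust
/// Hypothetical SoA storage: one Vec per field, all indexed by unit id.
pub struct Units {
    pub pos_x: Vec<f32>,
    pub pos_y: Vec<f32>,
    pub vel_x: Vec<f32>,
    pub vel_y: Vec<f32>,
    // Cold data lives in its own arrays and is never touched by `step`.
    pub names: Vec<String>,
    pub health: Vec<i64>,
}

impl Units {
    /// Advances every unit by `dt`, reading and writing only hot arrays.
    pub fn step(&mut self, dt: f32) {
        for (x, vx) in self.pos_x.iter_mut().zip(&self.vel_x) {
            *x += vx * dt;
        }
        for (y, vy) in self.pos_y.iter_mut().zip(&self.vel_y) {
            *y += vy * dt;
        }
    }
}
```

Whether this is actually faster for a given workload is something to measure; the sketch only shows the access pattern being described.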

LBarret · on Sept 14, 2018

I agree with you, but I am not sure this is the right example. In this case, the position of the zergs would indirectly be a component. You would have an array of Zerg structs, each of which would hold only a ref (or an index) into an array of positions. That array would be updated efficiently.

db48x · on Sept 15, 2018

Yes, you could do that, but then you don't have SOA or AOS, you have SORTA (struct of references to arrays).

Tuna-Fish · on Sept 13, 2018

> What if elements of the same collection need to have different structure?

They don't. If you ever end up in a situation where you feel they do, the correct solution is typically instead to split the component into multiple, different components, only some of which will be used for any given entity. This is basically the same as defining a schema for an SQL table. A component is just a set of state, there is no requirement for it to map 1-to-1 to a specific functionality.
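
To make the table analogy concrete, here is a small sketch under my own assumptions: the `Position` and `Flight` components and the `World` layout are invented, and storing `Option<T>` per entity is just one simple way to express "only some entities have this component". Each Vec is a column, each index is a row.

```rust
/// Hypothetical components; the names are invented for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Position {
    pub x: f32,
    pub y: f32,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Flight {
    pub altitude: f32,
}

#[derive(Default)]
pub struct World {
    // One column per component, like columns in a table.
    pub positions: Vec<Option<Position>>,
    pub flights: Vec<Option<Flight>>,
}

impl World {
    /// Adds an entity (a row) and returns its index.
    pub fn spawn(&mut self, position: Option<Position>, flight: Option<Flight>) -> usize {
        self.positions.push(position);
        self.flights.push(flight);
        self.positions.len() - 1
    }

    /// Ids of entities that have both a Position and a Flight.
    pub fn flyers(&self) -> Vec<usize> {
        self.positions
            .iter()
            .zip(&self.flights)
            .enumerate()
            .filter(|(_, (p, f))| p.is_some() && f.is_some())
            .map(|(id, _)| id)
            .collect()
    }
}
```

A ground unit and a flying unit live in the same collection here; they differ only in which columns are filled in.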

> What benefits justify this extremely tight coupling?

The most commonly stated one is speed. ECS was adopted first in game design because it is just so much faster. On modern OoO CPUs it's typically something like 5-10x faster than traversing an object graph. On the previous generation consoles (PS3, XB360) with their in-order CPUs and crappy load/store subsystems, it could easily be 20x-50x faster. I don't know how relevant this is to your typical GUI, though; the speedup in an ECS comes from linear memory access, which means that the prefetchers make sure every memory access is an L1 access. That is great when you have a game with thousands of entities that don't fit into any cache. But just how many GUIs have enough state to overflow the L1 anyway?

However, speed is not the only benefit. This is somewhat subjective, but having implemented similar logic for ECS and OO based games, I feel that the logic is almost always much clearer, more understandable and less buggy in the ECS versions. Basically, in OO doing things that have cross-cutting concerns tends to get split into many small parts done in multiple places, and it's hard to understand the whole system at once. In an ECS, the logic for one system is implemented in one place, it is always just a transformation that reads in some data, does some computation on it, and writes out some data, without complex control flow. It's so much easier to understand and test.
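
As an illustration of "one system, one place", here is a hypothetical poison system (the name and the component layout are assumptions of mine). It is a plain function from input slices to an output slice, which is what makes it easy to test in isolation.

```rust
/// Hypothetical system: reads the poison column, writes the health column.
/// `poison[i]` is the damage entity `i` takes this tick, if it is poisoned.
pub fn poison_system(poison: &[Option<i64>], health: &mut [i64]) {
    for (hp, dose) in health.iter_mut().zip(poison) {
        if let Some(damage) = dose {
            // Clamp at zero so health never goes negative.
            *hp = (*hp - damage).max(0);
        }
    }
}
```

Nothing here knows what kind of entity it is damaging; anything with a poison entry and a health entry is affected.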

> It makes it impossible to write generic container types.

An example of a generic container type is AnyMap, which holds one value of each type (and each value will typically be either a straight Vec for small/common components, or some kind of more complex set for components that hold a lot of data.)
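
To show the idea without claiming anything about the actual AnyMap API, here is a stand-in built only on the standard library. `TypeMap` and its methods are hypothetical names; the technique is a HashMap keyed by `TypeId` holding `Box<dyn Any>` values, so there is at most one value per type.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

/// Hypothetical stand-in for an AnyMap-style container, std only.
#[derive(Default)]
pub struct TypeMap {
    entries: HashMap<TypeId, Box<dyn Any>>,
}

impl TypeMap {
    /// Stores `value`, returning the previous value of the same type, if any.
    pub fn insert<T: Any>(&mut self, value: T) -> Option<T> {
        self.entries
            .insert(TypeId::of::<T>(), Box::new(value))
            .and_then(|old| old.downcast::<T>().ok())
            .map(|boxed| *boxed)
    }

    /// Borrows the value of type `T`, if one is stored.
    pub fn get<T: Any>(&self) -> Option<&T> {
        self.entries.get(&TypeId::of::<T>())?.downcast_ref::<T>()
    }

    /// Mutably borrows the value of type `T`, if one is stored.
    pub fn get_mut<T: Any>(&mut self) -> Option<&mut T> {
        self.entries.get_mut(&TypeId::of::<T>())?.downcast_mut::<T>()
    }
}
```

With something like this, a `Vec<Health>` and a `Vec<Position>` can sit in the same container and be fetched by type, without writing per-type fields by hand.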

(edit: looked up old numbers and found that 100x was pushing it, even on Xenon. 50x ought to be realistic.)

philipov · on Sept 13, 2018

Thanks, I'm starting to see how this naturally encourages organizing type extension around composable traits rather than inheritance.

steveklabnik · on Sept 13, 2018

I mean, that's sorta the point of the talk. Did you watch it?

> This sounds like it is just going to trade one set of problems for another.

Sure, that's exactly what a tradeoff is.

> what substantiates the claim that it is generally superior?

I don't think the claim is that it is always superior.

Oh, and I would see this refactor as a decoupling. The issue is that, if you’re trying to process Players in certain ways, the fact that the name and health are coupled together in a single struct is an issue. This pulls them apart.

raphlinus · on Sept 13, 2018

As steveklabnik says, it absolutely is about tradeoffs. The anymap crate may provide enough of "generic container types" to be useful, and can avoid a lot of repetition of per-type code. A graph with heterogeneous node types is definitely possible; Box<Any> is one solution, and there are others.

s73v3r_ · on Sept 13, 2018

"What if elements of the same collection need to have different structure?"

The answer is, they usually don't.

meheleventyone · on Sept 13, 2018

And even if you did, with Rust you could theoretically use enums to achieve it (or a union in C).
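
A quick sketch of the enum route, with invented node kinds (`Label`, `Button`, `Container`) and an assumed acyclic tree. One Vec holds differently shaped nodes, and children are referenced by index into that same Vec.

```rust
/// Hypothetical node kinds; an enum lets one Vec hold differently shaped data.
#[derive(Debug, PartialEq)]
pub enum Node {
    Label { text: String },
    Button { text: String, enabled: bool },
    Container { children: Vec<usize> }, // indexes into `Tree::nodes`
}

pub struct Tree {
    pub nodes: Vec<Node>,
}

impl Tree {
    /// Counts enabled buttons reachable from `root`.
    /// Assumes valid indexes and no cycles; a bad index panics.
    pub fn enabled_buttons(&self, root: usize) -> usize {
        match &self.nodes[root] {
            Node::Button { enabled: true, .. } => 1,
            Node::Container { children } => {
                children.iter().map(|&c| self.enabled_buttons(c)).sum()
            }
            _ => 0,
        }
    }
}
```

The cost is that every element is as large as the biggest variant, which is part of the tradeoff being discussed.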

tdsamardzhiev · on Sept 14, 2018

Everything we ever write as programmers is trading one set of problems for another. The key thing is to identify which problems matter the most.

axilmar · on Sept 14, 2018

Doesn't using indexes into arrays introduce, more or less, the same problems as C pointers? After all, a C pointer is an index into a huge array, the current process's memory space.

steveklabnik · on Sept 14, 2018

Not quite; you can't cause memory unsafety with the indexing version.
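
A small sketch of the difference, reusing the World layout from upthread with two helper methods I made up (`health_of`, `remove`). A stale index can give you the wrong player, which is a logic bug, but a checked lookup never reads outside the Vec.

```rust
pub struct World {
    pub player_names: Vec<String>,
    pub player_health: Vec<i64>,
}

impl World {
    /// Checked lookup: returns None instead of reading out of bounds.
    pub fn health_of(&self, id: usize) -> Option<i64> {
        self.player_health.get(id).copied()
    }

    /// Removes a player by moving the last player into the vacated slot.
    /// Panics if `id` is out of range (that is how `swap_remove` behaves).
    pub fn remove(&mut self, id: usize) {
        self.player_names.swap_remove(id);
        self.player_health.swap_remove(id);
    }
}
```

After `remove(0)`, an old index 0 now refers to whichever player was moved into that slot. That is the dangling-reference problem in logical form, and it is what generational indices are meant to catch.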