Question
The usual example of Data Oriented Design is with the Ball structure:
struct Ball
{
float Radius;
float XYZ[3];
};
and then they write some algorithm that iterates over a std::vector<Ball>.
Then they give you the same thing, but implemented in Data Oriented Design:
struct Balls
{
std::vector<float> Radiuses;
std::vector<std::array<float, 3>> XYZs;
};
That is good and all if you're going to iterate through all the radiuses first, then all the positions, and so on. However, how do you move the balls in the vector? In the original version, if you have a std::vector<Ball> BallsAll, you can just move any BallsAll[x] to any BallsAll[y].
However, to do that in the Data Oriented version, you must do the same thing for every property (two times in the case of Ball: radius and position). It gets worse if you have a lot more properties. You'll have to keep an index for each "ball", and when you move it around, you have to do the move in every vector of properties.
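To make the per-property move concrete, here is a minimal sketch of what it might look like. The helper names Add, Swap and Size are hypothetical and only for illustration, and the position is stored as std::array<float, 3> on the assumption that this is what XYZ[3] was meant to express.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// SoA layout from the question, with the position kept as one unit.
struct Balls
{
    std::vector<float> Radiuses;
    std::vector<std::array<float, 3>> XYZs;

    std::size_t Size() const { return Radiuses.size(); }

    // Hypothetical helper: append one ball to every property vector.
    void Add(float radius, std::array<float, 3> xyz)
    {
        Radiuses.push_back(radius);
        XYZs.push_back(xyz);
    }

    // Hypothetical helper: exchange ball x and ball y.
    // Every property vector has to be touched, one swap per property.
    void Swap(std::size_t x, std::size_t y)
    {
        assert(x < Size() && y < Size());
        std::swap(Radiuses[x], Radiuses[y]);
        std::swap(XYZs[x], XYZs[y]);
    }
};
```

With this sketch, balls.Swap(x, y) plays the role that swapping BallsAll[x] and BallsAll[y] plays in the original version, and each extra property adds one more std::swap line and one more memory region touched per move.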
Doesn't that kill any performance benefit of Data Oriented Design?
Explanation / Answer
Another answer gave an excellent overview of how you'd nicely encapsulate the row-oriented storage and give a better view. But since you also ask about performance, let me address that: SoA layout is not a silver bullet. It's a pretty good default (for cache usage; not so much for ease of implementation in most languages), but it's not all there is, not even in data oriented design (whatever that exactly means). It's possible that the authors of some introductions you've read missed that point and present only SoA layout because they think that's the entire point of DOD. They'd be wrong, and thankfully not everyone falls into that trap.
As you've probably already realized, not every piece of primitive data benefits from being pulled out into its own array. SoA layout is advantageous when the components that you split into separate arrays are usually accessed separately. But not every tiny piece is accessed in isolation. For example, a position vector is almost always read and updated wholesale, so naturally you don't split that one. In fact, your example didn't do that either! Likewise, if you usually access all the properties of a Ball together, because you spend most of your time swapping balls around in your collection of balls, there is no point in separating them.
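As a rough illustration of letting the access pattern pick the layout, here is a sketch with both forms side by side. The functions TotalRadius and SortByRadius are hypothetical examples of a "one property at a time" workload and a "whole ball at a time" workload, and std::array<float, 3> stands in for the position; no performance numbers are implied.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <numeric>
#include <vector>

// AoS: one record per ball. Suits workloads that move whole balls around.
struct Ball
{
    float Radius;
    std::array<float, 3> XYZ;
};

// SoA: one array per property. The position stays whole because it is
// normally read and written as a unit.
struct Balls
{
    std::vector<float> Radiuses;
    std::vector<std::array<float, 3>> XYZs;
};

// Reads only the radii: in SoA form this scans one dense float array
// and never pulls positions into the cache.
float TotalRadius(const Balls& balls)
{
    assert(balls.Radiuses.size() == balls.XYZs.size());
    return std::accumulate(balls.Radiuses.begin(), balls.Radiuses.end(), 0.0f);
}

// Reorders whole balls: in AoS form each move relocates one contiguous
// record instead of one element per property vector.
void SortByRadius(std::vector<Ball>& balls)
{
    std::sort(balls.begin(), balls.end(),
              [](const Ball& a, const Ball& b) { return a.Radius < b.Radius; });
}
```

If most of the time goes into functions shaped like TotalRadius, the split layout is the natural fit; if it goes into functions shaped like SortByRadius, keeping the Ball together is.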
However, there's a second side to DOD. You don't get all the cache and organization advantages just by turning your memory layout 90