Just look at the source code for Quake. There is a solution in there: run the simulation on only one computer, and that computer makes the final decision on how soft-body physics affects gameplay. Use the fast local path for all physics that doesn't affect gameplay, and send a sync once in a while for the physics that does.
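A rough sketch of how that split might look, with every type and function name here invented just for illustration:

    #include <cstdint>
    #include <vector>

    struct BodyState { uint32_t id; float pos[3]; float vel[3]; };

    std::vector<BodyState>& gameplayAffectingBodies(); // bodies whose state can change outcomes
    void simulatePhysics(float dt);                    // full local simulation (the fast path)
    void broadcastCorrection(const BodyState& body);   // authoritative state pushed to clients

    // The authoritative host simulates everything locally, but only the
    // gameplay-affecting bodies get periodic hard corrections sent out;
    // purely cosmetic physics never crosses the wire.
    void hostTick(float dt, uint32_t tickNumber) {
        simulatePhysics(dt);
        const uint32_t kSyncInterval = 10;   // resync every N ticks (tunable)
        if (tickNumber % kSyncInterval == 0) {
            for (const BodyState& b : gameplayAffectingBodies())
                broadcastCorrection(b);      // clients snap or blend toward this
        }
    }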
Also, you can run your simulation on integer math, which does give deterministic results. Incidentally, it's probably how the universe works too: quantum physics specifies that units (quanta) are not infinitely divisible. Even the frequency of light shouldn't be infinitely divisible if you consider the quantum of time to be the Planck length divided by the speed of light.
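For what it's worth, here's a minimal sketch of what that integer (fixed-point) math can look like, using a hypothetical 16.16 format where 1.0 is stored as 65536:

    #include <cstdint>

    // 16.16 fixed point: integer adds and multiplies are bit-identical on
    // every CPU and compiler, which is exactly the determinism you want.
    struct Fixed {
        int32_t raw;
        static Fixed fromInt(int32_t v) { return Fixed{ v * 65536 }; }
        Fixed operator+(Fixed o) const  { return Fixed{ raw + o.raw }; }
        Fixed operator-(Fixed o) const  { return Fixed{ raw - o.raw }; }
        Fixed operator*(Fixed o) const  {
            // widen to 64 bits so the intermediate product doesn't overflow
            return Fixed{ static_cast<int32_t>((static_cast<int64_t>(raw) * o.raw) >> 16) };
        }
    };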
All you need to do is send authoritative updates on object locations frequently enough that the error is too small to be noticeable by players. In networked gameplay the overriding error problems stem from network lag rather than FP divergence. For the most part it does not matter that your player is 0.001 millimeters from where they should be.
Everyone who has sniped in Team Fortress knows that you have to lead your target a little for 'lag'. Ostensibly this is because the bullet takes time to travel, but realistically it probably stems from lag.
In all honesty, just write the game using whatever, running on whatever, and if it looks realistic it's probably good enough. If you really can't solve the problem, find someone in creative to augment your storyline to match the weirdness in your physics simulation.
The issue of floating-point determinism only really comes up in lock-step networking, where you are in effect simulating a distributed state machine. Rather than continually transmitting the current state, you synchronize the state at startup and afterwards transmit only actions. In an RTS, where lock-step networking is still most commonplace, an action might be "order unit to move from A to B". In this context determinism is absolutely paramount, because any mismatch in the simulation between clients will compound itself over time, so even if the divergence starts small, it will soon grow to be very large. Whereas with client-server networking, as in Quake, a mismatch in the floating-point specifics between clients will only manifest itself as a constant level of error. The error does not contaminate the simulation since it's client-side only, and the server is authoritative.
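To make the distinction concrete, this is roughly the shape of a lock-step tick (all names invented for illustration, not taken from any particular engine):

    #include <cstdint>
    #include <vector>

    struct Command { uint32_t playerId; uint32_t unitId; int32_t targetX, targetY; };

    std::vector<Command> collectCommandsForTick(uint32_t tick); // local + remote, in an agreed order
    void applyCommand(const Command& c);                        // e.g. "order unit to move from A to B"
    void advanceSimulation();                                   // one fixed timestep, bit-identical everywhere

    void lockstepTick(uint32_t tick) {
        // Only the commands cross the wire; every client applies the same
        // commands in the same order to the same state, so the simulations
        // stay in sync, provided the simulation itself is fully deterministic.
        for (const Command& c : collectCommandsForTick(tick))
            applyCommand(c);
        advanceSimulation();
    }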
The issue of latency is totally unrelated and something intrinsic to the workings of distributed systems.
Still, he brings up a good point. If the source of non-determinism is known to be exactly in the physics engine, maybe it's possible to share the global physics state every once in a while.
I don't know much about game network code, but is this feasible at all? Another post brought up that the reason for using lock-step networking is the "thousands" of units, so I assume the biggest data is the physics data (x, y, z, velocity, etc. of said units).
If some physics errors could be tolerated for a few ticks, maybe a binary search of the gamespace could be used to figure out where the error came from, and sync up the game state in that area. Could be fairly low bandwidth, right?
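What I'm imagining is something like this sketch (every name here is made up, and it's flattened to a per-region comparison rather than a true binary search, which would hash progressively smaller halves of the space):

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    uint64_t hashRegionState(std::size_t regionIndex);     // checksum of all physics state in a region
    void     requestRegionResync(std::size_t regionIndex); // pull authoritative state for one region

    // Peers periodically exchange one small hash per spatial region; only
    // regions whose hashes disagree get a full state resync, so bandwidth
    // stays low unless something has actually diverged.
    void reconcile(const std::vector<uint64_t>& remoteHashes) {
        for (std::size_t i = 0; i < remoteHashes.size(); ++i) {
            if (hashRegionState(i) != remoteHashes[i])
                requestRegionResync(i);
        }
    }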
Yes, you can mix these approaches. Almost every client-server game does that to some extent, e.g. Unreal Engine 1, 2 and 3 support both replicated state and replicated function calls, although it's mostly based on state replication.
However, even a small amount of constant error with lock-step networking means that you can't distribute the authority. Imagine a case where a unit is low on health and a tiny error in the unit's position makes the difference between whether it's out of range or in range of an enemy unit's fire. Thus it either lives or dies based on the presence or absence of that error. With fully deterministic lock-step networking, a client can make that determination on its own, without checking in with anyone else for confirmation. Without perfect determinism, the best it can do is make a tentative decision, which it might have to reverse soon thereafter if the server's view of the situation disagrees.
I used to develop for the classic Game Boy, which used a serial cable to connect two units for multiplayer. The serial port could exchange exactly one byte per frame, simultaneously in both directions. That was just enough to send the input state for the frame (U, D, L, R, A, B, Select, Start = 8 bits) from each device to the other. So this sort of blind deterministic simulation was the only option.
Identical hardware and predictable latency made this somewhat easier, but it was still devilishly tricky to get right. The simulation could not be allowed to depend on which player was local and which was remote. That's a tall order for 8-bit assembly language on a 1 MHz CPU.
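The whole per-frame exchange really was just that single input byte. Translated out of assembly into C-style code for illustration (the serial and update helpers are hypothetical stand-ins for the hardware port and the game's update routine):

    #include <cstdint>

    // One bit per button: the entire per-frame network payload is one byte.
    enum Pad : uint8_t {
        PAD_UP = 1 << 0, PAD_DOWN = 1 << 1, PAD_LEFT   = 1 << 2, PAD_RIGHT = 1 << 3,
        PAD_A  = 1 << 4, PAD_B    = 1 << 5, PAD_SELECT = 1 << 6, PAD_START = 1 << 7,
    };

    uint8_t serialExchange(uint8_t localPad);                      // send our byte, receive theirs
    void    updateSimulation(uint8_t localPad, uint8_t remotePad); // deterministic step using both inputs

    void frame(uint8_t localPad) {
        uint8_t remotePad = serialExchange(localPad);
        // Both units run the identical update with the same pair of inputs,
        // so the simulations stay in lock-step without ever exchanging state.
        updateSimulation(localPad, remotePad);
    }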
FP operations can be non-deterministic across different systems, architectures and compilers unless you take great care and know exactly what you are doing. This is widely applicable in multiplayer game programming.
Run some FP calculations on an AMD CPU and then on an Intel CPU and you'll get slightly different results. In 3D rendering for movies they use all the same CPUs because otherwise you get weird colorspace distortions between frames.
> Run some FP calculations on an AMD CPU and then an Intel CPU you'll get slightly different results.
Are you saying that the exact same x87 or SSE instruction sequence will yield different results on AMD vs Intel? As far as I know that isn't true and would be a very serious CPU bug.
IEEE specifies exact rounding for +,-,*,/ and sqrt, but it allows the transcendental functions to be inaccurate in the least significant bit (creating a fast exact algorithm for these is apparently quite hard).
At one job, we were running our test suite on both Pentium4 and Athlon machines, and got into trouble because the exp() function returned different values about once every 10,000 numbers. We changed the regression tests to add a fuzz factor to the floating-point comparisons.
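Something like this, as a sketch of the kind of fuzzed comparison rather than the actual code we used:

    #include <algorithm>
    #include <cmath>

    // Treat two doubles as equal if they agree within a small relative
    // tolerance, so a last-bit difference in exp() between CPUs no longer
    // fails the regression suite.
    bool nearlyEqual(double a, double b, double relTol = 1e-12, double absTol = 1e-15) {
        const double diff = std::fabs(a - b);
        return diff <= absTol || diff <= relTol * std::max(std::fabs(a), std::fabs(b));
    }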
Your point about IEEE transcendentals is a good one, and it ties into the so-called tablemaker's dilemma. But this isn't just an IEEE issue since we're talking about specific CPU-level instruction sets. I'd be surprised if AMD wasn't precision-compatible with Intel on x87 and SSE at the lowest level. Were you writing x87/SSE instructions using assembly code or intrinsics so you're sure the same instruction sequence was being generated in both cases of the example you mentioned?
Yeah, I've encountered this before while diffing outputs of two versions of the same thing (like a testbench comparing behavioral code vs. netlist vs. C simulation).
In fact, this is one of the challenges in doing relational databases on GPU hardware. With relational databases you want to make sure the result of the same query is the same across every run. Relational databases are otherwise perfect use cases for GPU-style parallelization, i.e. running large numbers of parallel, read-only, lockless operations without cross dependencies (for the most part, for most relatively simple queries).
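One concrete way the non-determinism shows up (nothing database-specific, just floating-point addition combined with a reduction order that varies between runs):

    #include <cstdio>

    // Floating-point addition is not associative, so a parallel sum whose
    // grouping changes from run to run can give different answers for the
    // same data.
    int main() {
        float a = 1e20f, b = -1e20f, c = 1.0f;
        std::printf("%g\n", (a + b) + c); // prints 1: the large terms cancel first
        std::printf("%g\n", a + (b + c)); // prints 0: c is absorbed before the cancellation
        return 0;
    }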
"vocabulary of the title/premise is a little misguided."
What, "Floating Point Determinism"? Seems like what was discussed in great detail to me. How much more explanation of exactly what he meant are you looking for exactly? Your criticism rings very hollow.
I cannot read the original post, as it is deleted, but I may agree with it. Most random number generators are 100% deterministic. Yet, if I use one on each of two systems, I cannot guarantee that both produce the same sequence. And no, seeding does not guarantee that; at least GNU does not think so: http://www.gnu.org/s/hello/manual/libc/ISO-Random.html states:
— Macro: int RAND_MAX
The value of this macro is an integer constant representing the largest value the rand function can return. In the GNU library, it is 2147483647, which is the largest signed integer representable in 32 bits. In other libraries, it may be as low as 32767.
Barring bugs (a problem that the writer does not touch on), floating-point arithmetic should be deterministic. The problem is that different machines may produce different results given the same inputs.
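On the rand() point: if you need the same sequence on every machine, the usual fix is to ship your own small, fully specified generator instead of relying on the C library. A sketch using Marsaglia's xorshift32:

    #include <cstdint>

    // Given the same seed, this produces the same sequence on every platform,
    // unlike rand(), whose algorithm and RAND_MAX vary between implementations.
    struct Xorshift32 {
        uint32_t state;
        explicit Xorshift32(uint32_t seed) : state(seed ? seed : 1) {}
        uint32_t next() {
            state ^= state << 13;
            state ^= state >> 17;
            state ^= state << 5;
            return state;
        }
    };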
Though it was written a long time ago, this is a great article for the early game developer. Hopefully Glenn will continue writing awesome intro/theory material and giving talks.