timer_freq.QuadPart /= 1000; // To convert frequency from 'ticks per second' to 'ticks per millisecond'
Timer frequency is usually in the range of millions, but there's _nothing_ to prevent Microsoft from changing it to a value in the thousands, or whatever they please; that's why the Win32 QueryPerformanceFrequency [1] API call exists in the first place! Such a change would break any code with this "fix".
It also reduces accuracy — a second might no longer be 1000 ms, but maybe 1001 ms.
I think a better fix would be either to simply use doubles instead of floats, or to at least offset the value to the start of the game, as long as a single session can't last months.
Agreed. Not sure how they concluded that dividing the denominator was a good idea; this exact problem (minus the original float bug) seems to show up all the time.
The solution is clearly to multiply by 1000 before performing floor division:
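Something along these lines (a minimal sketch with hypothetical names; timer_begin is the QPC value captured at startup):

    #include <windows.h>
    #include <cstdint>

    // Sketch: multiply the tick delta by 1000 first, then divide by the
    // full ticks-per-second frequency. No frequency precision is thrown
    // away, and with an offset counter the product stays within 64 bits
    // for weeks even at GHz-range frequencies.
    uint64_t ElapsedMs(LARGE_INTEGER timer_begin)
    {
        LARGE_INTEGER now, freq;
        QueryPerformanceCounter(&now);
        QueryPerformanceFrequency(&freq);
        uint64_t ticks = (uint64_t)(now.QuadPart - timer_begin.QuadPart);
        return (ticks * 1000u) / (uint64_t)freq.QuadPart;  // floor division, in ms
    }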
Values are already offset (in the quoted snippet, timer_begin is approximately the time the process started, i.e. it was queried during initialization).
While I'll defend the idea of modifying the QueryPerformanceFrequency result to achieve the desired resolution (in this case, ticks per ms), you are absolutely right about the possibility of the value becoming several orders of magnitude smaller in the future. In that case, multiplying the subtraction result by 1000 instead of dividing the frequency is indeed far safer.
> While I'll defend the idea of modifying the QueryPerformanceFrequency result to achieve the desired resolution (in this case, ticks per ms)
Did a quick back-of-the-envelope calculation, and the suggested solution is off by 42 milliseconds per minute on a hypothetical 3 GHz system. It might not sound like much (~2.5 frames at 60 fps), but I've seen smaller errors cause massive issues when it comes to time.
It's better to do it right than to later debug these often-surprising issues we didn't have sufficient imagination for.
Unless one is looking for an amazing bug hunting war story, of course. :)
And as for returning the result - the original code does return a float, so while I could return a double instead, it probably wouldn't make any difference, as I expect the game to store those results in floats too. Moreover, since the result is basically "session time", it is expected to span at most several hours - in which case floats are still fine.
Either way, it may indeed be beneficial to return time as a double - code which stores it in floats won't care about the change, while code which uses it in intermediate calculations will potentially benefit from the extra precision. Notes taken!
EDIT:
It will generate more compiler warnings, of course, but I believe that's a good tradeoff, since the game has a ton of them anyway and going through them all is a task for another day.
Well, there's always the other side as well: if the frequency is in the billions, it'll degenerate back to the original bug.
On single-CPU-socket systems, Windows seems to compute QueryPerformanceCounter by simply shifting the CPU's RDTSC count right by 10 bits, effectively dividing the value by 1024. So an exactly 3.0 GHz system would have QueryPerformanceFrequency return 2929687.
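For illustration, the arithmetic (the 3.0 GHz figure is hypothetical):

    #include <cstdio>
    #include <cstdint>

    int main()
    {
        uint64_t tsc_hz = 3000000000ull;  // hypothetical 3.0 GHz TSC
        uint64_t qpf    = tsc_hz >> 10;   // shift right by 10 = divide by 1024
        printf("%llu\n", (unsigned long long)qpf);  // prints 2929687
        return 0;
    }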
On NUMA machines, QueryPerformanceCounter uses HPET or the APIC timer or whatever is available instead of RDTSC, because RDTSC is not hardware-synchronized between CPU sockets. I bet those HPET divisors land in a different range.
No matter what clock source Windows might use, simply using doubles would have fixed the issue. There was no point in using floats in the first place.
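A sketch of what that looks like (hypothetical helper):

    // Hypothetical helper: a double's 53-bit significand keeps raw tick
    // deltas exactly representable for decades at MHz-range frequencies,
    // whereas a float's 24 bits start losing whole milliseconds after
    // mere hours of accumulated ticks.
    double TicksToMs(long long ticks, long long freq)
    {
        return (double)ticks * 1000.0 / (double)freq;
    }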
Rationals might be a better solution, since they could use whatever divisor they want without throwing away precision. I'm not very familiar with the MS APIs, but if they provide some facility for telling you how many 'ticks' there are per second, you could use that to generate the divisor as well.
There is. To save people the click, there's a pair of functions - QueryPerformanceCounter and QueryPerformanceFrequency. QPC gives you ticks, QPF gives you ticks per second, so QPC / QPF is the rational you seek.
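Minimal usage sketch - splitting the division into whole seconds plus a remainder keeps the rational exact:

    #include <windows.h>
    #include <cstdio>

    int main()
    {
        LARGE_INTEGER freq, t0, t1;
        QueryPerformanceFrequency(&freq);  // ticks per second (the denominator)
        QueryPerformanceCounter(&t0);
        Sleep(100);                        // stand-in for the work being timed
        QueryPerformanceCounter(&t1);
        LONGLONG ticks = t1.QuadPart - t0.QuadPart;
        // The rational ticks/freq, as whole seconds plus a remainder,
        // loses no precision at all:
        printf("%lld s + %lld/%lld s\n",
               ticks / freq.QuadPart, ticks % freq.QuadPart, freq.QuadPart);
        return 0;
    }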
You actually don't need Detours hacks to fake QueryPerformanceCounter; this is built into Application Verifier as a check called "TimeRollOver". If you're writing Windows apps or games, it's highly recommended to run your app through AppVerifier - it'll catch lots of mistakes by turning all of the super-permissive Win32 APIs into super-vicious strict versions that try to find mistakes in your code.
Quoting from the article:
"Application Verifier doesn’t have such an option (all it can do is forcibly wrap around GetTickCount, which is not what we need here). "
That said, I failed to observe this issue with TimeRollOver enabled for the process. The only way I could reliably reproduce it on my machine (with low uptime) was to fake QPC results, like so.
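Roughly along these lines (a sketch of such a hook using Microsoft Detours, not the exact code; the offset is an arbitrary illustrative value):

    #include <windows.h>
    #include <detours.h>

    static BOOL (WINAPI *TrueQPC)(LARGE_INTEGER *) = QueryPerformanceCounter;

    // Pretend the machine has been up much longer than it really has.
    BOOL WINAPI FakeQPC(LARGE_INTEGER *count)
    {
        BOOL ok = TrueQPC(count);
        count->QuadPart += 1ll << 48;  // arbitrary large offset for testing
        return ok;
    }

    void InstallHook()
    {
        DetourTransactionBegin();
        DetourUpdateThread(GetCurrentThread());
        DetourAttach(&(PVOID &)TrueQPC, FakeQPC);
        DetourTransactionCommit();
    }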
Windows CE has a 32-bit uptime counter (GetTickCount()) that wraps after ~49 days, but on startup it sets the value so that it wraps around in three minutes, increasing the likelihood that overflow bugs will be sussed out during development. https://news.ycombinator.com/item?id=3231781
> One unanswered question remains – why didn’t this code break right away, even when compiled with VS2003?
I suspect VS2003 didn't enable SSE by default and was compiling the code using the x87 FPU, in which case all intermediate operations were done using 80-bit floats with only a final cast at the end.
While it is not mentioned in the article, I am aware of /arch:IA32 (don't use SSE instructions) and I have tried it - no luck.
On the other hand, using /fp:fast with VS2017 did give me results similar to what VS2003/VS2010 produce - so if I were to guess, VS2003 generates x87 math in a manner similar to fast math, while precise math breaks it?
Then again, it may not be strictly due to /fp:precise - I tried compiling with VS2010 too, and even though it used /fp:precise, the bug did not manifest.
Perhaps comparing the assembly code would have given a definite answer, but since the cause is so obvious from the source, I don't think it is worth the effort.
The epic Turbo Pascal Runtime Error 200 is similar. The Turbo Pascal for DOS runtime sleeps by running a no-op loop some number of times. It calibrates this number during startup, and the calibration immediately crashes with a division by zero if your computer is faster than a typical PC from the late nineties. http://wiki-errors.com/runtime-error-200-%E2%80%93-the-pasca...
Calculate mult and shift based on the timer frequency. The multiplication likely needs to be done with a 96- or 128-bit output. If shift is forced to be >= 64 (on a 64-bit machine), the resulting assembly code is short and fast.
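A sketch of that approach (hypothetical names; uses GCC/Clang __int128, on MSVC you'd reach for _umul128 instead, and it assumes the frequency is comfortably above 1 kHz so mult fits in 64 bits):

    #include <cstdint>

    struct TickConv { uint64_t mult; };  // shift is fixed at 64

    TickConv MakeConv(uint64_t freq_hz)
    {
        // mult = ceil(2^64 * 1000 / freq); rounded up so exact multiples
        // of the frequency don't land just below a whole millisecond.
        unsigned __int128 m =
            (((unsigned __int128)1000 << 64) + freq_hz - 1) / freq_hz;
        return TickConv{ (uint64_t)m };
    }

    uint64_t TicksToMs(uint64_t ticks, TickConv c)
    {
        // 64x64 -> 128-bit multiply; ">> 64" is just the high half of the
        // product, so this compiles to one widening multiply, no division.
        return (uint64_t)(((unsigned __int128)ticks * c.mult) >> 64);
    }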
This appears to be the same issue BlazBlue: Calamity Trigger on Steam has. The game's unplayable unless you reboot before playing. Unfortunately, it's never getting patched.
No modding community of note. It's the first game in a series with three direct sequels, two spinoffs, and a crossover title that all run well on Steam. As much as I miss some things about that version of the game, the sequels are better in more than just playability on Windows.
I'll just have to remember to reboot before I play my favorite version of Taokaka, when I get the urge.
*shrug* I would just try to represent it with more precision. Maybe then you wouldn't even need the complexity of a floating-point number.
If you're lucky, your compiler can generate 64/128-bit math for your target even if there are no native registers/ALUs that wide. If not, it's not exactly rocket surgery to do yourself. Or, in dire cases, use an arbitrary-precision library.
Floats are one of the worst things that ever happened to computing, particularly when they use base 2 for the exponent (e.g. there is no exact "0.03" among binary floating-point numbers).
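Easy to see by asking for more digits than the nearest double actually carries:

    #include <cstdio>

    int main()
    {
        // 0.03 has no finite base-2 expansion, so the stored value is the
        // nearest representable double - something like 0.02999999999999999889.
        printf("%.20f\n", 0.03);
        return 0;
    }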
That’s exactly my point. Every base has numbers it can represent that are not representable in other bases. Every base has numbers it cannot represent that are representable in other bases.
Yes, and when I say 1/3, I mean 0.3333333333333333333333333333333333333333333333333333... and so on. Unfortunately, I can't tell you exactly, since we're using decimal.
As a workaround, you simply approximate to one significant figure beyond any other value in your calculation. So in practice it doesn't really matter that there's no exact decimal representation, because there's always an approximation precise enough that the error never shows up in the result.
[1]: QueryPerformanceFrequency documentation: https://msdn.microsoft.com/en-us/library/windows/desktop/ms6...