The question, though, is which gives you more accurate results? :-)



The one that let us build a library of bit-for-bit regression tests ;-) but that takes a bit more explaining. There were computational "features" that could be turned on and off to tune the computation for different problems and for different speed/accuracy trade-offs, and some of these features had only very slight effects, so a regression could show up as nothing more than a very small numerical error.

An experienced eye could easily tell the tiny x87-related indeterminacy apart from other kinds of changes, but we were uncomfortable automating that comparison, and it took a while for someone without strong domain knowledge (such as myself or any newly hired software engineer) to become comfortable eyeballing it. With deterministic output, we could use automated tests to verify, for example, that the work we did to add a new computational feature did not change the output when that feature was not enabled, or that small changes intended as performance optimizations did not inject tiny numerical errors.
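
Here is a minimal sketch in C of what such a bit-for-bit check looks like. solve() and its feature flag are hypothetical stand-ins for the real solver, and in practice the baseline came from a recorded golden file rather than a second call:

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>

    /* Toy stand-in for the real solver; 'feature' toggles an optional
       refinement step, mimicking the on/off computational features. */
    static double solve(int feature) {
        double x = 1.0 / 3.0;
        double y = x * 3.0 - 1.0;   /* tiny residual */
        if (feature)
            y += 1e-300;            /* the "feature" slightly perturbs the result */
        return y;
    }

    /* Bit-for-bit equality: reinterpret each double as an integer, so even
       1-ulp drift or a -0.0 vs 0.0 flip fails the test; exactly the kind of
       change an epsilon comparison would silently accept. */
    static int bits_equal(double a, double b) {
        uint64_t ua, ub;
        memcpy(&ua, &a, sizeof ua);
        memcpy(&ub, &b, sizeof ub);
        return ua == ub;
    }

    int main(void) {
        double baseline = solve(0);  /* in practice: loaded from a golden file */
        double current  = solve(0);  /* rebuilt binary, feature still off */
        if (!bits_equal(baseline, current)) {
            fprintf(stderr, "regression: %a != %a\n", baseline, current);
            return 1;
        }
        puts("bit-for-bit identical");
        return 0;
    }

Comparing the integer images rather than the floating-point values is the point: it is the only comparison strict enough to prove that a change meant as a pure optimization left the numerics untouched.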

Our customers were also a lot more comfortable when they could use "diff" to validate that our latest X-times-faster release was really doing the same computation as the last one :-)

EDIT: We also got a noticeable speed-up by enabling the SSE2 instructions. The bulk of the numeric work was done in hardware, so it wasn't dramatic, but it was measurable.
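
For reference, the switch is mostly a compiler flag. A sketch assuming GCC on 32-bit x86 (solver.c is a placeholder file name; on MSVC the rough equivalent is /arch:SSE2):

    # route scalar double-precision math through SSE2 instead of the x87 stack
    gcc -m32 -O2 -msse2 -mfpmath=sse solver.c -o solver

Besides the speed, -mfpmath=sse is also what buys the determinism discussed above, since intermediate results no longer pass through the x87's 80-bit registers.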


Yes, that makes sense.

As for speed, Intel has neglected to keep the x87 up to date.



