Some algorithms guarantee that an arithmetic operation (or sequence of operations) applied to two or more floating-point inputs produces a list of floating-point outputs which, when summed, give exactly the correct result. This gets all screwed up if you mess with the order of operations or the rounding of intermediate results.
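(A tiny self-contained demo of the reordering problem, assuming IEEE 754 `f64`; both groupings are equal in real arithmetic but not in floating point:)

```rust
fn main() {
    let big = 1.0e16_f64;
    // In real arithmetic both groupings equal 1.0.
    let left = (-big + big) + 1.0;  // 0.0 + 1.0 == 1.0
    let right = -big + (big + 1.0); // big + 1.0 rounds back to big, so 0.0
    assert_eq!(left, 1.0);
    assert_eq!(right, 0.0);
}
```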
Fair. I think it would be very helpful for Rust if some expert could point to a specific example where this is the case. I think there is an RFC about enabling floating-point contraction by default, which would "silently" be able to do some of these transformations depending on the optimization level.
The one very important thing that often gets destroyed by compilers using associativity is the TwoSum error-free transform, which is a vital component of several algorithms that deal with numerical error (most notably Kahan summation).
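For concreteness, here is a minimal sketch of Kahan summation in Rust (my own illustration, not code from the thread). The compensation term `c` is algebraically zero under reassociation, so a compiler that assumes associativity is free to delete exactly the part that tracks the rounding error:

```rust
/// Kahan (compensated) summation: `c` accumulates the low-order bits
/// that a plain `sum += x` loop would lose to rounding.
fn kahan_sum(xs: &[f64]) -> f64 {
    let mut sum = 0.0_f64;
    let mut c = 0.0_f64; // running compensation for lost low-order bits
    for &x in xs {
        let y = x - c;     // apply the correction from the previous step
        let t = sum + y;   // the low-order bits of `y` are lost here...
        c = (t - sum) - y; // ...and recovered here; algebraically this is 0.0
        sum = t;
    }
    sum
}
```

Reassociate `(t - sum) - y` and it folds to `0.0`, turning the whole thing back into naive summation.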
I think the key issue with the optimizations that ICC is performing for C++ but Rust is not doing in this case is just FP-contraction, which is related to, but not the same as, assuming associativity.
The RFC about that is https://github.com/rust-lang/rfcs/pull/2686 , where you can see users roughly split into the "I want faster binaries" and "I want more deterministic execution" camps. Neither is wrong, TBH.
Some people have tried to show there that enabling FP-contraction by default isn't always better / more precise, but I'm not sure if they succeeded.
> I think the key issue with the optimizations that ICC is performing for C++ but Rust is not doing in this case is just FP-contraction, which is related to, but not the same as, assuming associativity.
I think associativity is necessary to vectorize reduction operations like:
r += (c[i] - a[i]*b[i]) * a[i] * (c[i] - a[i]*b[i]);
I haven't looked at the code generated by ICC, but I would expect it to vectorize this by computing tuples of "partial sums", roughly as follows:
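Something in the spirit of the following (a hand-written Rust sketch, not actual ICC output; four scalar accumulators stand in for SIMD lanes, and I assume slices of equal length divisible by 4):

```rust
/// Reassociated reduction: four independent partial sums, combined at the end.
/// This is only legal if the compiler is allowed to pretend that
/// floating-point addition is associative.
fn reduce4(a: &[f64], b: &[f64], c: &[f64]) -> f64 {
    assert!(a.len() == b.len() && b.len() == c.len() && a.len() % 4 == 0);
    let mut acc = [0.0_f64; 4]; // one "lane" per partial sum
    for i in (0..a.len()).step_by(4) {
        for lane in 0..4 {
            let j = i + lane;
            let e = c[j] - a[j] * b[j];
            acc[lane] += e * a[j] * e; // same expression as the scalar loop
        }
    }
    // Horizontal reduction of the lanes; this reordering is what changes
    // the rounding relative to a strict left-to-right sum.
    (acc[0] + acc[1]) + (acc[2] + acc[3])
}
```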
For the error-free transform, the important part is the associativity and not the fusion (which would not degrade the operation).
As illustrated on the Wikipedia page, if you move terms around according to associativity rules you can show that one of the terms is always zero, conclude that it is useless, and drop it from the computation.
However, in practice, floating-point operations are not associative, and that term contains the numerical error of the previous operation.
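Concretely, TwoSum looks like this (a standard textbook formulation, sketched here in Rust, not code from the thread):

```rust
/// TwoSum: returns (s, e) where s = fl(a + b) and s + e == a + b exactly.
/// Every term of `e` cancels to zero under real-number associativity, so an
/// optimizer that reassociates will happily "simplify" e to 0.0 and throw
/// away the rounding error the algorithm exists to capture.
fn two_sum(a: f64, b: f64) -> (f64, f64) {
    let s = a + b;
    let a_rounded = s - b;          // the part of `a` that survived the addition
    let b_rounded = s - a_rounded;  // the part of `b` that survived the addition
    let e = (a - a_rounded) + (b - b_rounded); // what was rounded away
    (s, e)
}
```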
e.g. https://www.cs.cmu.edu/~quake/robust.html
Some keywords to look for: “compensated arithmetic”, “error-free transformations”.
FMAs generally speed up these tools, but you need to be careful and deliberate about how they are used.
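For example, one deliberate use of an FMA is computing the exact rounding error of a product (the classic TwoProd-with-FMA trick, sketched here in Rust with `f64::mul_add`, which performs a fused multiply-add with a single rounding):

```rust
/// TwoProd with an FMA: returns (p, e) where p = fl(a * b) and
/// p + e == a * b exactly.
fn two_prod(a: f64, b: f64) -> (f64, f64) {
    let p = a * b;
    // mul_add computes a*b - p with a single rounding; because the true
    // value of a*b - p is representable, `e` is the exact product error.
    let e = a.mul_add(b, -p);
    (p, e)
}
```

If you instead wrote `a * b - p` and hoped the compiler would contract it, you would get either `0.0` or the exact error depending on the optimization settings, which is exactly the kind of non-determinism the RFC thread is arguing about.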
(Disclaimer: I am not an expert on this, just some guy on the internet.)