
Some algorithms guarantee that some arithmetic operation(s) applied to 2+ floating-point inputs will produce a list of floating-point outputs which, when summed, give exactly the correct result. This all gets broken if you change the order of operations or the rounding of intermediate results.

e.g. https://www.cs.cmu.edu/~quake/robust.html

Some keywords to look for: “compensated arithmetic”, “error-free transformations”.

FMAs generally speed up these tools, but you need to be careful and deliberate about how they are used.
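
For a concrete flavor, a minimal C sketch of the two textbook error-free transformations: TwoSum recovers the exact rounding error of an addition, and with a hardware FMA the exact error of a product is one extra instruction, which is the main way FMAs speed these algorithms up.

    #include <math.h>

    /* TwoSum (Knuth): returns s = fl(a + b) and stores in *e the exact
       rounding error, so that a + b == s + *e holds exactly.
       Reordering or "simplifying" these operations breaks the guarantee. */
    double two_sum(double a, double b, double *e) {
        double s  = a + b;
        double bp = s - a;            /* the part of b that made it into s */
        double ap = s - bp;           /* the part of a that made it into s */
        *e = (a - ap) + (b - bp);     /* whatever was rounded away */
        return s;
    }

    /* TwoProd: with an FMA, the exact error of a product is one instruction
       (assuming no overflow/underflow). Without FMA it takes many more
       operations via Dekker/Veltkamp splitting. */
    double two_prod(double a, double b, double *e) {
        double p = a * b;
        *e = fma(a, b, -p);           /* exact: a*b - fl(a*b) */
        return p;
    }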

(Disclaimer: I am not an expert on this, just some guy on the internet.)




Fair. I think it would be very helpful for Rust if an expert knew of a specific example where this is the case. I think there is an RFC about enabling floating-point contraction by default, which would "silently" be able to do some of these transformations depending on the optimization level.


The one very important thing that often gets destroyed by compilers assuming associativity is the TwoSum error-free transform, which is a vital component of several algorithms that deal with numerical error (most notably Kahan summation).

The problem is mentioned on the Wikipedia page for Kahan summation (and I have been able to reproduce it with gcc): https://en.wikipedia.org/wiki/Kahan_summation_algorithm#Poss...
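
For reference, a C sketch of the textbook Kahan loop (following the Wikipedia pseudocode); the compensation c = (t - sum) - y is exactly the expression a compiler assuming associativity can "prove" to be zero and delete:

    #include <stddef.h>

    double kahan_sum(const double *x, size_t n) {
        double sum = 0.0;
        double c   = 0.0;                 /* running compensation */
        for (size_t i = 0; i < n; i++) {
            double y = x[i] - c;          /* re-inject the previously lost bits */
            double t = sum + y;           /* big + small: low-order bits of y are lost */
            c = (t - sum) - y;            /* algebraically 0, numerically the lost bits */
            sum = t;
        }
        return sum;
    }

Under real-number associativity, (t - sum) - y rewrites to ((sum + y) - sum) - y = 0, so an "unsafe math" compiler silently degrades the whole loop to naive summation.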

This is actually my area of research, I could contribute if you point me to an RFC.


I think the key issue with the optimizations that ICC is performing for C++ but Rust is not doing in this case is just FP-contraction, which is related to, but not the same as, assuming associativity.

The RFC about that is https://github.com/rust-lang/rfcs/pull/2686 , where you see users roughly split into the "I want faster binaries" and "I want more deterministic execution" camps. Neither is wrong, TBH.

Some people have tried to show there that enabling FP-contraction by default isn't always better / more precise, but I'm not sure if they succeeded.
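
One hedged illustration (my own example, not taken from the RFC thread) of why contraction isn't automatically more precise: with contraction, an expression that is exactly zero when evaluated as written can come out slightly nonzero, because only one of the two products keeps its full precision.

    #include <math.h>
    #include <stdio.h>

    /* Mathematically x*x - y*y == 0 when x == y. Without contraction both
       products round identically and the difference is exactly 0. If the
       compiler contracts this to fma(x, x, -(y*y)), the result is instead
       the rounding error of y*y (with some sign); when it comes out
       negative, the sqrt returns NaN. */
    double leg(double x, double y) {
        return sqrt(x * x - y * y);
    }

    int main(void) {
        volatile double v = 1e16 + 1.0;   /* volatile: prevent constant folding */
        printf("%g\n", leg(v, v));        /* 0 without contraction, possibly NaN with it */
        return 0;
    }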


> I think the key issue with the optimizations that ICC is performing for C++ but Rust is not doing in this case is just FP-contraction, which is related to, but not the same as, assuming associativity.

I think associativity is necessary to vectorize reduction operations like:

    r+=(c[i]-a[i]*b[i])*a[i]*(c[i]-a[i]*b[i]);
I haven't looked at the code generated by ICC, but I would expect it to vectorize this by computing tuples of "partial sums", roughly as follows:

    r0 += (c[i+0]-a[i+0]*b[i+0])*a[i+0]*(c[i+0]-a[i+0]*b[i+0]);
    r1 += (c[i+1]-a[i+1]*b[i+1])*a[i+1]*(c[i+1]-a[i+1]*b[i+1]);
    r2 += (c[i+2]-a[i+2]*b[i+2])*a[i+2]*(c[i+2]-a[i+2]*b[i+2]);
    ...
and then doing a horizontal sum r = r0 + r1 + r2 + ... in the end. But this requires associativity. (And commutativity, but that's a given.)


Yes, you are right, for those reduce operations you need associativity.


For the error-free transform, the important part is associativity, not fusion (which by itself would not degrade the operation).

As illustrated on the Wikipedia page, if you move terms according to associativity rules, you can show that one of the terms is always zero, conclude that it is useless, and drop it from the computation.

However, in practice, floating-point operations are not associative, and that term contains the numerical error of the previous operation.
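
Concretely, the three-operation variant (Dekker's Fast2Sum, valid when |a| >= |b|) makes the dropped term easy to see:

    /* Fast2Sum (Dekker): requires |a| >= |b|.
       Returns s = fl(a + b); *e receives the exact rounding error,
       so that a + b == s + *e. */
    double fast_two_sum(double a, double b, double *e) {
        double s = a + b;
        double z = s - a;   /* under real-number associativity, z would equal b ... */
        *e = b - z;         /* ... so this term "is always zero" and gets dropped;
                               in floating point it is exactly the error of a + b */
        return s;
    }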


It isn't a matter of being a Rust expert; fused multiply-add is an instruction on CPUs.



