it isn't always that simple. FMA instructions are tricky to use in a way that ac...

jcl · on Oct 25, 2019

Something I found surprising: some AVX2 and AVX-512 instructions consume so much power that Intel chose to have its chips dynamically lower their clock frequency while those instructions are executing. So naively switching to SIMD instructions can not only fail to improve performance, it can also hurt the performance of unaltered code executed afterwards -- even unrelated code running on other cores.

https://blog.cloudflare.com/on-the-dangers-of-intels-frequen...

lovasoa · on Oct 24, 2019

What do you mean by "manually"? `mul_add` is a Rust function that operates on a single f64; it's still up to LLVM to choose which instructions to use and whether to vectorize.
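To make that concrete, here is a minimal sketch of what the scalar form looks like in a loop. The function name `axpy` and the slice-based signature are just for illustration and are not from the article. The only thing the source code asks for is a fused multiply-add per element; whether that becomes a scalar FMA, a packed FMA, or a call into a software `fma` routine depends on the compiler and the target features enabled.

```rust
/// Computes y[i] = a * x[i] + y[i] for each pair of elements.
/// `axpy` is a hypothetical helper name used only for this example.
pub fn axpy(a: f64, x: &[f64], y: &mut [f64]) {
    // `zip` stops at the shorter of the two slices.
    for (yi, &xi) in y.iter_mut().zip(x) {
        // f64::mul_add(self, a, b) computes self * a + b with a single rounding.
        // Instruction selection and vectorization are left to LLVM.
        *yi = a.mul_add(xi, *yi);
    }
}
```

If the target doesn't have FMA enabled (for example, a default x86-64 build without `-C target-feature=+fma` or `-C target-cpu=native`), `mul_add` typically can't be lowered to a hardware instruction and may end up slower than a plain `a * x + y`.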