I wonder if an FP compiler should actually be able to produce faster code than for imperative style: IMHO, knowing intention and context could lead to far superior optimizations (for loop with var for summing vs. List.sum)
Yes, it definitely can! The most important optimisation (IMO) is to replace pure code on immutable data structures with imperative code on mutable data structures.
Why not just write imperative code? Because it’s really hard to reason about.
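To make that concrete, here's a rough sketch in Python (just for illustration; Python itself does not perform this optimisation, and all function names are made up). The "pure" versions are what you'd write and reason about, and the "imperative" versions are roughly the shape of code an optimising FP compiler could emit when it can prove nothing else observes the old data.

```python
from __future__ import annotations
from functools import reduce


def total_pure(xs: tuple[int, ...]) -> int:
    """Pure style: fold over an immutable tuple, nothing is reassigned."""
    return reduce(lambda acc, x: acc + x, xs, 0)


def total_imperative(xs: tuple[int, ...]) -> int:
    """Same result, written as a loop over one mutable accumulator."""
    acc = 0
    for x in xs:
        acc += x
    return acc


def incremented_pure(xs: tuple[int, ...]) -> tuple[int, ...]:
    """Pure style: returns a fresh tuple and leaves the input untouched."""
    return tuple(x + 1 for x in xs)


def incremented_in_place(buf: list[int]) -> None:
    """Imperative style: overwrites the buffer, so no new allocation."""
    for i in range(len(buf)):
        buf[i] += 1


if __name__ == "__main__":
    data = (1, 2, 3, 4)
    print(total_pure(data), total_imperative(data))
    print(incremented_pure(data))
    buf = list(data)
    incremented_in_place(buf)
    print(buf)
```

The in-place version is only safe if nobody else still holds a reference to `buf`. That is exactly the kind of fact that is easy for a compiler to establish about pure code and hard for a human to keep track of in hand-written imperative code.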