Sure, but it's not really a scheduling decision. I think the GP is correct inasmuch as compilers now have to make the hard choice of whether to use any AVX at all, and it's a global trade-off. Even though using a few 64-byte moves might be locally optimal, you now need a higher license and hence a slower CPU, and you can only evaluate whether that trade-off makes sense in the scope of the larger program: how much speedup do you get, and does it compensate for the lower frequency?
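
To put rough numbers on that, here is a back-of-the-envelope sketch. It is a toy model, not a measurement. It assumes the lower frequency applies to the whole program (in reality the license drops and recovers over time), and the function names and the example figures (a 2x speedup on the vectorized part, a clock at 85% of baseline) are made up for illustration.

```python
def net_speedup(vector_fraction, vector_speedup, freq_ratio):
    """Whole-program speedup under a simplified, Amdahl-style model.

    vector_fraction: share of baseline run time spent in code that gets vectorized
    vector_speedup:  speedup of that code at equal clock frequency
    freq_ratio:      licensed frequency / baseline frequency (assumed to apply
                     to the entire program, which is a simplification)
    """
    if not 0.0 <= vector_fraction <= 1.0:
        raise ValueError("vector_fraction must be in [0, 1]")
    if vector_speedup <= 0.0 or freq_ratio <= 0.0:
        raise ValueError("vector_speedup and freq_ratio must be positive")
    # Run time relative to baseline, measured at the baseline clock.
    scaled_time = (1.0 - vector_fraction) + vector_fraction / vector_speedup
    # Everything then runs slower by the frequency ratio.
    return freq_ratio / scaled_time


def break_even_fraction(vector_speedup, freq_ratio):
    """Fraction of run time that must be vectorized just to break even.

    Solves freq_ratio == (1 - p) + p / vector_speedup for p.
    A result above 1.0 means the wide vectors can never pay off in this model.
    """
    if vector_speedup <= 1.0:
        raise ValueError("vector_speedup must exceed 1 for a break-even point")
    return (1.0 - freq_ratio) / (1.0 - 1.0 / vector_speedup)


if __name__ == "__main__":
    # Hypothetical figures: 2x local speedup, clock at 85% of baseline.
    print(break_even_fraction(2.0, 0.85))
    # A few 64-byte moves covering 5% of run time: net result is below 1.0.
    print(net_speedup(0.05, 2.0, 0.85))
```

With those made-up figures, roughly 30% of the run time has to benefit from the wide vectors before the program comes out ahead, which is why the choice can't be made by looking at one loop or one `memcpy` in isolation.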