It isn't, because those microcontrollers are not that smart about branch prediction and instruction ordering, so the penalty is often bigger.
Same with reordering: it might not matter that much on a big modern CPU, but it can matter far more on a device with a tiny cache and very slow division, for example.
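
To make that concrete, here is a rough sketch of the kind of rewrite that can pay off on a small core. It advances a ring buffer index three ways. The function names and `BUF_SIZE` are made up for illustration, and which version actually wins depends on the target and the compiler, so measure on the real hardware.

```c
#include <stdint.h>

/* Hypothetical buffer size; the mask version only works for powers of two. */
#define BUF_SIZE 64u

/* Straightforward version: with a runtime size, the modulo typically
   compiles to a division, which is slow on cores without a hardware
   divider. */
static inline uint32_t next_index_mod(uint32_t i, uint32_t size)
{
    return (i + 1u) % size;
}

/* No division, but a conditional that may compile to a branch,
   which can be costly without good branch prediction. */
static inline uint32_t next_index_branch(uint32_t i, uint32_t size)
{
    i++;
    if (i >= size) {
        i = 0u;
    }
    return i;
}

/* Neither division nor branch: wrap with a bit mask.
   Only valid because BUF_SIZE is a power of two. */
static inline uint32_t next_index_mask(uint32_t i)
{
    return (i + 1u) & (BUF_SIZE - 1u);
}
```

All three return the same index for a 64-entry buffer; the difference is only in what the compiler has to emit. On a desktop CPU you would probably never notice, which is the point.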