That is a serious mistake here, where the definition of a correct optimization is of significantly more interest than the result that they can be composed. We want to apply optimizations alone even more than we want to apply them together.
Putting things another way, this definition is clearly not given as part of a proof; it is given for its own sake, and the proof uses it.