One of the difficulties is that the broader the scope your optimiser has to push towards a solution, the better your measurements need to be. And an accurate measure of which thing is "better" can be prohibitively expensive to obtain.

The cost of errors varies drastically across domains and use cases, so it is important to understand how and why different models typically fail, and to make tradeoffs accordingly.
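
As a rough illustration of that tradeoff, here is a minimal sketch in which the "better" of two models flips depending on how a domain prices each kind of error. Everything in it is a made-up assumption for illustration: the error counts, the cost figures, the domain names, and the helper names `total_cost` and `preferred_model`.

```python
# Hypothetical error counts for two models on the same test set.
# Model B makes fewer errors overall, but more false negatives.
MODELS = {
    "A": {"fp": 30, "fn": 10},  # 40 errors in total
    "B": {"fp": 5, "fn": 25},   # 30 errors in total
}

# Hypothetical per-error costs. The numbers are invented; only the
# asymmetry between the two domains matters for the illustration.
DOMAIN_COSTS = {
    "spam_filter": {"fp": 10.0, "fn": 1.0},        # losing a real email hurts most
    "disease_screening": {"fp": 1.0, "fn": 20.0},  # a missed case hurts most
}


def total_cost(errors, costs):
    """Weight each kind of error by what it costs in this domain."""
    return errors["fp"] * costs["fp"] + errors["fn"] * costs["fn"]


def preferred_model(domain):
    """Return the name of the model with the lowest total cost in a domain."""
    costs = DOMAIN_COSTS[domain]
    return min(MODELS, key=lambda name: total_cost(MODELS[name], costs))


if __name__ == "__main__":
    for domain in DOMAIN_COSTS:
        print(domain, "->", preferred_model(domain))
```

With these invented numbers, model B comes out ahead for the spam filter and model A for screening, even though B makes fewer errors overall. A raw error count would not show that difference.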