
Specifically for neural networks, is there any alternative to backpropagation and gradient descent that guarantees finding the global minimum?



Unlikely, given the dimensionality and non-convexity of the search space. Besides, we probably don't even care about the global minimum: the loss we're optimising is a proxy for what we really care about (performance on unseen data). Counter-example: a model that perfectly memorises the training data can be globally optimal (ignoring regularisation), but is not very useful.
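
A minimal sketch of that counter-example, on a toy 1-D regression task rather than a neural network (the data, degrees, and noise level here are illustrative assumptions): a high-degree polynomial can drive the training loss to (near) zero, i.e. reach a global minimum of the training objective, while generalising worse than a simpler fit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression task: y = sin(x) plus a little noise, 15 training points.
x_train = rng.uniform(-3, 3, size=15)
y_train = np.sin(x_train) + 0.1 * rng.normal(size=x_train.shape)
x_test = rng.uniform(-3, 3, size=100)
y_test = np.sin(x_test)

def fit_poly(degree):
    # Least-squares polynomial fit of the given degree.
    return np.poly1d(np.polyfit(x_train, y_train, degree))

for degree in (3, 14):
    model = fit_poly(degree)
    train_mse = np.mean((model(x_train) - y_train) ** 2)
    test_mse = np.mean((model(x_test) - y_test) ** 2)
    print(f"degree={degree:2d}  train MSE={train_mse:.4f}  test MSE={test_mse:.4f}")

# The degree-14 polynomial interpolates all 15 training points (train MSE ~ 0,
# a global minimum of the training loss), yet its test MSE is typically much
# worse than the simpler degree-3 fit: optimal on the proxy, poor on unseen data.
```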



