Implementers of multilayered neural nets often have to deal with the problem of getting stuck in local minima. One commonly used method for finding the best of many local minima is to train the neural net several times, with randomly initialized weights on each run, in the hope of searching across different regions of the parameter space.
For example, if we look at the image above: if our weights were initialized such that we started searching somewhere on the far left of the function, we would end up at one of the local minima. But if we happened to initialize our weights such that we started searching somewhere on the far right, we would end up at the global minimum.
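To make the random-restart idea concrete, here is a minimal sketch in Python. The 1D "loss" function, learning rate, step count, and number of restarts are all illustrative assumptions standing in for an actual network's training loss and optimizer settings:

```python
import numpy as np

# A made-up 1D nonconvex "loss" with several local minima
# (hypothetical; stands in for a neural net's training loss).
def f(x):
    return np.sin(3 * x) + 0.1 * x**2

def df(x):
    return 3 * np.cos(3 * x) + 0.2 * x

def gradient_descent(x0, lr=0.01, steps=1000):
    """Plain gradient descent from a single starting point."""
    x = x0
    for _ in range(steps):
        x -= lr * df(x)
    return x

# Random restarts: run the same optimizer from several random
# initializations and keep the best local minimum found.
rng = np.random.default_rng(0)
candidates = [gradient_descent(x0) for x0 in rng.uniform(-3, 3, size=10)]
best = min(candidates, key=f)
print(f"best x = {best:.4f}, f(best) = {f(best):.4f}")
```

Each restart can only find the minimum of whatever basin it starts in, which is exactly why the initialization matters in the example above.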
kmcrane
True. There are lots of heuristics people use in practice to try to get good solutions to hard nonconvex optimization problems, even if it's impossible to always guarantee that you'll find a globally optimal solution. These kinds of schemes are important, because in the real world you sometimes have to "solve" impossible problems! (Though perhaps the fact that it's impossible should inform what you expect out of the solution!)
kmcrane
Also, for a 1D problem on a bounded interval there's an obvious way to find a critical point that's guaranteed to work: do binary search on the derivative! (This only works, of course, if you have access to the derivative...)
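Here is a minimal sketch of that bisection idea in Python, assuming the derivative is continuous and takes opposite signs at the two endpoints, so the intermediate value theorem guarantees a sign change (and hence a critical point) inside the interval:

```python
def bisect_derivative(df, a, b, tol=1e-10):
    """Binary search for a critical point of f on [a, b].

    Assumes df (the derivative of f) is continuous and that
    df(a) and df(b) have opposite signs.
    """
    fa = df(a)
    if fa == 0.0:
        return a
    while b - a > tol:
        m = 0.5 * (a + b)
        fm = df(m)
        if fm == 0.0:
            return m
        if (fm > 0) == (fa > 0):
            a, fa = m, fm  # same sign as df(a): the root lies in [m, b]
        else:
            b = m          # sign change: the root lies in [a, m]
    return 0.5 * (a + b)

# Example: f(x) = x^2 - 2x has derivative 2x - 2, which changes
# sign on [0, 3]; bisection recovers the minimizer x = 1.
print(bisect_derivative(lambda x: 2 * x - 2, 0.0, 3.0))  # ~1.0
```

If the derivative does not change sign on the interval, the extremum sits at one of the endpoints, and you can simply compare the two.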