Initialization – bad correspondence between the local and global structure of the objective

To start a numerical optimization algorithm such as SGD, we need to initialize the weights. Suppose the objective function looks like the one in the following figure. SGD makes only small local moves, so if we start on the side of the mountain opposite the one where the true minima lie, we will waste a lot of time. In this scenario, the local structure of the objective function gives no hint about where the minima lie. Proper initialization avoids the problem: if we could start SGD at some point on the other side of the hill, the optimization would be much faster.

Explaining a scenario of bad initialization and the effect on gradient-based optimization
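
The effect can be reproduced with a small sketch. The one-dimensional objective below is a hypothetical toy function chosen for illustration (it is not the function from the figure), and plain gradient descent stands in for SGD to keep the run deterministic. The function has a hill near x = 0.13, a deeper minimum to its left, and a shallower one to its right. The names objective, gradient, and descend, the learning rate, and the step count are all assumptions made for this example.

```python
def objective(x):
    # Toy objective (an assumption for illustration): two basins separated by a hill.
    return x**4 - 2 * x**2 + 0.5 * x


def gradient(x):
    # Analytic derivative of the toy objective.
    return 4 * x**3 - 4 * x + 0.5


def descend(x0, learning_rate=0.05, steps=200):
    # Plain gradient descent: repeatedly take a small step against the local slope.
    x = x0
    for _ in range(steps):
        x -= learning_rate * gradient(x)
    return x


# Start on the wrong side of the hill: the local slope points away from the deeper minimum.
x_bad = descend(0.5)
# Start on the other side of the hill: the local slope leads straight to the deeper minimum.
x_good = descend(-0.5)

print(f"bad start  -> x = {x_bad:.3f}, objective = {objective(x_bad):.3f}")
print(f"good start -> x = {x_good:.3f}, objective = {objective(x_good):.3f}")
```

In this toy setting, the run that starts at 0.5 only ever sees a slope pointing to the right, so it settles in the shallower basin, while the run that starts at -0.5 reaches the deeper one. The two starting points are equally far from the hill; only the side they start on differs.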