How it works...

The objective function can be minimized with stochastic gradient descent, which indirectly modifies (and optimizes) the weight matrix. The gradient splits into two terms according to the probability density under which each expectation is taken: a positive gradient and a negative gradient. The positive gradient depends primarily on the input data, whereas the negative gradient depends only on the model itself.

The positive gradient increases the probability the model assigns to the training data, while the negative gradient decreases the probability of configurations sampled from the model's own distribution.
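
Written out for a visible vector v and hidden vector h, the log-likelihood gradient makes this split explicit (standard RBM notation, which the recipe may abbreviate; the angle brackets denote expectations under the indicated distribution):

∂ log p(v) / ∂W = ⟨v hᵀ⟩_data − ⟨v hᵀ⟩_model

The first term is the positive gradient and the second is the negative gradient; the model expectation is intractable to compute exactly, which is what contrastive divergence approximates.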

The contrastive divergence (CD) technique is used to approximate the negative phase. In CD, the weight matrix is adjusted at each reconstruction iteration; the new weight matrix is computed with the following update rule, where the learning rate is denoted alpha:

W_new = W_old + α ( ⟨v hᵀ⟩_data − ⟨v hᵀ⟩_recon )

Here, the reconstruction expectation ⟨v hᵀ⟩_recon is obtained after k Gibbs sampling steps (k = 1 in the common CD-1 variant).
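
As an illustration, the following is a minimal NumPy sketch of a single CD-1 update implementing the rule above (hypothetical helper code, not from the recipe; biases are omitted for brevity):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(W, v0, alpha=0.1):
    """One CD-1 weight update for an RBM (biases omitted for brevity)."""
    # Positive phase: hidden probabilities driven by the training data
    h0_prob = sigmoid(v0 @ W)
    h0_sample = (rng.random(h0_prob.shape) < h0_prob).astype(float)

    # Negative phase: one Gibbs step reconstructs the visible units,
    # then recomputes the hidden probabilities from the reconstruction
    v1_prob = sigmoid(h0_sample @ W.T)
    h1_prob = sigmoid(v1_prob @ W)

    # Positive gradient (data term) minus negative gradient (reconstruction term)
    pos_grad = v0.T @ h0_prob
    neg_grad = v1_prob.T @ h1_prob

    # alpha is the learning rate; dividing by the batch size averages the gradient
    return W + alpha * (pos_grad - neg_grad) / v0.shape[0]

# Toy usage: a 6-unit visible layer, 3-unit hidden layer, batch of 4 binary vectors
W = rng.normal(0.0, 0.01, size=(6, 3))
v0 = rng.integers(0, 2, size=(4, 6)).astype(float)
W = cd1_update(W, v0)
```

Dividing by the batch size keeps the effective step size independent of how many training vectors are processed per update.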