Setting up a Deep Belief Network

Deep belief networks (DBNs) are a type of Deep Neural Network (DNN), composed of multiple hidden layers (or layers of latent variables). Connections exist only between successive layers, not between the nodes within a layer. A DBN can be trained as either an unsupervised or a supervised model.

The unsupervised model is used to reconstruct the input while removing noise, and the supervised model (after pretraining) is used to perform classification. Because there are no connections between the nodes within a layer, a DBN can be viewed as a set of unsupervised RBMs or autoencoders stacked on top of one another, where each hidden layer serves as the visible layer for the next RBM in the stack.

This kind of stacked RBM improves input reconstruction: contrastive divergence (CD) is applied layer by layer, starting at the actual input training layer and finishing at the last hidden (or latent) layer.
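As a concrete illustration of the CD step that drives each layer's training, here is a minimal NumPy sketch of a single CD-1 update for one RBM. The shapes, learning rate, and Bernoulli-unit assumption are all illustrative; this is not the book's own recipe code.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_bernoulli(p):
    return (rng.random(p.shape) < p).astype(float)

def cd1_step(v0, W, b_vis, b_hid, lr=0.01):
    """One CD-1 update for a single RBM with visible batch v0 (n, n_vis)
    and weight matrix W (n_vis, n_hid); W and the biases are updated in place."""
    # Positive phase: hidden activations driven by the data.
    p_h0 = sigmoid(v0 @ W + b_hid)
    h0 = sample_bernoulli(p_h0)
    # Negative phase: one Gibbs step produces a reconstruction.
    p_v1 = sigmoid(h0 @ W.T + b_vis)
    p_h1 = sigmoid(p_v1 @ W + b_hid)
    # CD-1 gradient estimate: data statistics minus reconstruction statistics.
    n = v0.shape[0]
    W += lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / n
    b_vis += lr * (v0 - p_v1).mean(axis=0)
    b_hid += lr * (p_h0 - p_h1).mean(axis=0)

# Illustrative shapes only: 64 visible units, 100 hidden units.
v = rng.integers(0, 2, size=(32, 64)).astype(float)
W = rng.normal(0.0, 0.01, size=(64, 100))
b_v, b_h = np.zeros(64), np.zeros(100)
cd1_step(v, W, b_v, b_h)
```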

DBNs are graphical models that train their stacked RBMs in a greedy, layer-wise manner. Such a network learns a deep hierarchical representation by modeling the joint distribution between the input feature vector i and the hidden layers h^1, h^2, ..., h^m:

\[ P(i, h^1, \ldots, h^m) = \left( \prod_{k=1}^{m-1} P(h^{k-1} \mid h^k) \right) P(h^{m-1}, h^m) \]

Here, i = h^0; P(h^{k-1} | h^k) is the conditional distribution of the (reconstructed) visible units given the hidden units of the RBM at level k, and P(h^{m-1}, h^m) is the joint distribution of the hidden and reconstructed visible units in the final RBM layer of the DBN. (The accompanying figure, not reproduced here, illustrates a DBN with four hidden layers, where W represents the weight matrix between consecutive layers.)
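To make the factorization concrete, the sketch below draws a sample from a DBN of this shape: alternating Gibbs steps in the top RBM approximate a draw from P(h^{m-1}, h^m), and the remaining layers are then sampled top-down through P(h^{k-1} | h^k). The weights here are random and untrained, and the layer sizes, step count, and zeroed hidden biases are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
sample = lambda p: (rng.random(p.shape) < p).astype(float)

# Untrained random weights for illustration; Ws[k] connects layer h^k
# (rows) to h^(k+1) (columns), with h^0 = i. Hidden biases are omitted
# (treated as zero) to keep the sketch short.
sizes = [64, 900, 500, 300]
Ws = [rng.normal(0.0, 0.01, size=(a, b)) for a, b in zip(sizes, sizes[1:])]
b_vis = [np.zeros(a) for a in sizes[:-1]]

# Alternating Gibbs steps in the top RBM approximate a draw from the
# joint P(h^(m-1), h^m).
h_below = sample(np.full((1, sizes[-2]), 0.5))
for _ in range(50):
    h_top = sample(sigmoid(h_below @ Ws[-1]))
    h_below = sample(sigmoid(h_top @ Ws[-1].T + b_vis[-1]))

# Ancestral pass down the directed layers: h^(k-1) ~ P(h^(k-1) | h^k).
h = h_below
for W, b in zip(Ws[-2::-1], b_vis[-2::-1]):
    h = sample(sigmoid(h @ W.T + b))
print("generated visible sample shape:", h.shape)  # (1, 64)
```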

DBNs can also be used to enhance the robustness of DNNs. DNNs trained with backpropagation alone can suffer from poor local optimization: when the error surface features numerous troughs, gradient descent may settle into a deep local trough rather than the globally deepest one. DBNs, on the other hand, pretrain on the input features, which steers the optimization toward the region of the globally deepest trough, and then use backpropagation to perform gradient descent and gradually minimize the error rate.
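The split between pretraining and fine-tuning can be sketched as follows. In this simplified illustration, randomly initialized matrices stand in for pretrained RBM weights, and only a softmax output layer is trained on top of the frozen stack; full fine-tuning would backpropagate the same gradients into every layer's weights. All shapes, data, and hyperparameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# Stand-ins for pretrained RBM weights and hidden biases (random here).
sizes = [64, 900, 500, 300]
Ws = [rng.normal(0.0, 0.01, size=(a, b)) for a, b in zip(sizes, sizes[1:])]
bs = [np.zeros(b) for b in sizes[1:]]

def features(X):
    # Deterministic upward pass through the pretrained stack.
    h = X
    for W, b in zip(Ws, bs):
        h = sigmoid(h @ W + b)
    return h

# Toy labeled data (random values; shapes only).
X = rng.random((128, 64))
Y = np.eye(10)[rng.integers(0, 10, size=128)]

# Gradient descent on a softmax output layer; full fine-tuning would
# also propagate (P - Y) @ W_out.T back through the sigmoid layers.
W_out = np.zeros((sizes[-1], 10))
F = features(X)
for _ in range(100):
    logits = F @ W_out
    P = np.exp(logits - logits.max(axis=1, keepdims=True))
    P /= P.sum(axis=1, keepdims=True)
    W_out -= 0.1 * F.T @ (P - Y) / len(X)
```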

Training a stack of three RBMs

In this recipe, we will train a DBN using three stacked RBMs, where the first hidden layer has 900 nodes, the second has 500 nodes, and the third has 300 nodes. A minimal sketch of this greedy schedule follows.
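The NumPy sketch below wires the three layer sizes into a greedy loop, training each RBM with CD-1 on the hidden activations of the layer below. The 784-unit input width (for example, flattened 28x28 images), the random stand-in data, the epoch count, and the learning rate are all assumptions for illustration; they are not the book's recipe code.

```python
import numpy as np

rng = np.random.default_rng(3)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
sample = lambda p: (rng.random(p.shape) < p).astype(float)

def train_rbm(data, n_hid, epochs=5, lr=0.01):
    """Train one RBM with CD-1 and return its parameters plus the
    hidden probabilities that become the next layer's input."""
    n_vis = data.shape[1]
    W = rng.normal(0.0, 0.01, size=(n_vis, n_hid))
    b_v, b_h = np.zeros(n_vis), np.zeros(n_hid)
    for _ in range(epochs):
        # Positive phase on the data, negative phase on a reconstruction.
        p_h0 = sigmoid(data @ W + b_h)
        h0 = sample(p_h0)
        p_v1 = sigmoid(h0 @ W.T + b_v)
        p_h1 = sigmoid(p_v1 @ W + b_h)
        n = len(data)
        W += lr * (data.T @ p_h0 - p_v1.T @ p_h1) / n
        b_v += lr * (data - p_v1).mean(axis=0)
        b_h += lr * (p_h0 - p_h1).mean(axis=0)
    return W, b_v, b_h, sigmoid(data @ W + b_h)

# Random binary stand-in for the training set; a 784-unit input width
# (for example, flattened 28x28 images) is an assumption.
layer_input = rng.integers(0, 2, size=(256, 784)).astype(float)

# Greedy schedule: 900, then 500, then 300 hidden nodes, each layer
# trained on the hidden activations of the one below it.
stack = []
for n_hid in (900, 500, 300):
    W, b_v, b_h, layer_input = train_rbm(layer_input, n_hid)
    stack.append((W, b_v, b_h))
```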
