Fine-tuning the parameters of the autoencoder

An autoencoder has several hyperparameters to tune, and the relevant set depends on the type of autoencoder being used. The major hyperparameters include the following:

  • Number of nodes in any hidden layer
  • Number of hidden layers (applicable to deep autoencoders)
  • Activation function, such as sigmoid, tanh, softmax, or ReLU
  • Regularization parameters or weight decay terms on hidden unit weights
  • Fraction of the signal to be corrupted in a denoising autoencoder
  • Sparsity parameters in sparse autoencoders that control the expected activation of neurons in hidden layers
  • Batch size, if using mini-batch gradient descent; learning rate and momentum for stochastic gradient descent
  • Maximum number of training iterations
  • Weight initialization scheme
  • Dropout rate, if dropout regularization is used
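
To make the list above concrete, the following is a minimal sketch that writes a few of these hyperparameters as a search space and enumerates every combination. The parameter names and candidate values are hypothetical and chosen only for illustration; they do not come from any particular library.

```python
from itertools import product

# Hypothetical candidate values; names and ranges are illustrative only.
search_space = {
    "hidden_units": [32, 64, 128],
    "num_hidden_layers": [1, 2, 3],
    "activation": ["sigmoid", "tanh", "relu"],
    "weight_decay": [0.0, 1e-4, 1e-3],
    "corruption_fraction": [0.1, 0.3],  # only meaningful for denoising autoencoders
    "learning_rate": [0.01, 0.1],
}


def enumerate_configs(space):
    """Yield one dict per combination of candidate values."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


configs = list(enumerate_configs(search_space))
print(len(configs))  # 3 * 3 * 3 * 3 * 2 * 2 = 324 combinations
```

Even this small space yields 324 configurations, and each one would need its own training run, so the number of candidate values per hyperparameter has to be chosen with the training budget in mind.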

These hyperparameters can be tuned by framing the problem as a grid search. However, each hyperparameter combination requires training the weights of the hidden layer(s), so the computational cost grows with the number of layers and the number of nodes in each layer. To address this cost and the associated training difficulties, stacked autoencoders have been proposed: each layer is first trained separately to obtain pretrained weights, and the whole model is then fine-tuned starting from those weights. This approach can substantially improve training compared with the conventional mode of training.
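
The layer-by-layer pretraining idea can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: it assumes sigmoid units, a squared-error reconstruction loss, and full-batch gradient descent, and the function names `train_autoencoder_layer` and `pretrain_stack` are hypothetical. The fine-tuning step that follows pretraining is not shown.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_autoencoder_layer(X, n_hidden, lr=0.5, epochs=200, seed=0):
    """Train a single sigmoid autoencoder layer on X.

    Returns the encoder weights, encoder bias, and the per-epoch
    reconstruction error (mean squared error).
    """
    rng = np.random.default_rng(seed)
    n_samples, n_in = X.shape
    W_enc = rng.normal(0.0, 0.1, (n_in, n_hidden))
    b_enc = np.zeros(n_hidden)
    W_dec = rng.normal(0.0, 0.1, (n_hidden, n_in))
    b_dec = np.zeros(n_in)
    losses = []
    for _ in range(epochs):
        # Forward pass: encode, then reconstruct the input.
        H = sigmoid(X @ W_enc + b_enc)
        X_hat = sigmoid(H @ W_dec + b_dec)
        err = X_hat - X
        losses.append(float(np.mean(err ** 2)))
        # Backward pass for 0.5 * squared error, averaged over samples.
        d_out = err * X_hat * (1.0 - X_hat)
        d_hid = (d_out @ W_dec.T) * H * (1.0 - H)
        W_dec -= lr * (H.T @ d_out) / n_samples
        b_dec -= lr * d_out.mean(axis=0)
        W_enc -= lr * (X.T @ d_hid) / n_samples
        b_enc -= lr * d_hid.mean(axis=0)
    return W_enc, b_enc, losses


def pretrain_stack(X, layer_sizes, **kwargs):
    """Greedily pretrain one layer at a time; each layer's codes feed the next."""
    pretrained = []
    inputs = X
    for n_hidden in layer_sizes:
        W, b, _ = train_autoencoder_layer(inputs, n_hidden, **kwargs)
        pretrained.append((W, b))
        inputs = sigmoid(inputs @ W + b)
    return pretrained


# Synthetic data in [0, 1), used only to exercise the code.
rng = np.random.default_rng(42)
X = rng.random((100, 8))
pretrained = pretrain_stack(X, layer_sizes=[6, 4])
print([W.shape for W, _ in pretrained])  # [(8, 6), (6, 4)]
```

The returned encoder weights would then serve as the initial weights of the full deep autoencoder, which is fine-tuned end to end with backpropagation.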
