Fine-tuning the parameters of the autoencoder

Autoencoders involve a number of parameters to tune, depending on the type of autoencoder we are working with. The major hyperparameters include the following (illustrated in the sketch after the list):

  • Number of nodes in each hidden layer
  • Number of hidden layers (applicable to deep autoencoders)
  • Activation function, such as sigmoid, tanh, softmax, or ReLU
  • Regularization parameters or weight decay terms on the hidden unit weights
  • Fraction of the signal to be corrupted in a denoising autoencoder
  • Sparsity parameters in sparse autoencoders that control the expected activation of neurons in the hidden layers
  • Batch size, if batch (mini-batch) gradient descent is used; learning rate and momentum parameters for stochastic gradient descent
  • Maximum number of iterations used for training
  • Weight initialization
  • Dropout rate, if dropout regularization is used
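
The following is a minimal sketch of how these hyperparameters might be exposed when building an autoencoder, assuming a Keras/TensorFlow setup; the function name, layer sizes, noise fraction, and other values are illustrative placeholders rather than recommended settings.

```python
# Hedged sketch: an autoencoder whose main hyperparameters are arguments.
# All default values below are arbitrary examples, not tuned recommendations.
from tensorflow import keras
from tensorflow.keras import layers, regularizers

def build_autoencoder(input_dim,
                      hidden_units=(128, 64),   # nodes per hidden layer
                      activation="relu",        # sigmoid / tanh / relu, etc.
                      weight_decay=1e-4,        # L2 weight decay on hidden weights
                      dropout_rate=0.2,         # dropout rate, if dropout is used
                      noise_fraction=0.3):      # corruption level (denoising AE)
    inputs = keras.Input(shape=(input_dim,))
    # Corrupt a fraction of the input signal, as in a denoising autoencoder.
    x = layers.Dropout(noise_fraction)(inputs)
    # Encoder: one Dense layer per entry in hidden_units.
    for units in hidden_units:
        x = layers.Dense(units, activation=activation,
                         kernel_regularizer=regularizers.l2(weight_decay))(x)
        x = layers.Dropout(dropout_rate)(x)
    # Decoder: mirror the encoder back up to the input dimension.
    for units in reversed(hidden_units[:-1]):
        x = layers.Dense(units, activation=activation,
                         kernel_regularizer=regularizers.l2(weight_decay))(x)
    outputs = layers.Dense(input_dim, activation="sigmoid")(x)
    model = keras.Model(inputs, outputs)
    # Learning rate and momentum are the SGD hyperparameters from the list above.
    model.compile(optimizer=keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
                  loss="mse")
    return model
```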

These hyperparameters can be tuned by framing their selection as a grid search problem. However, every hyperparameter combination requires training the neuron weights of the hidden layer(s), so the computational cost grows rapidly with the number of layers and the number of nodes within each layer. To deal with these critical parameters and training issues, the stacked autoencoder approach was proposed: each layer is trained separately to obtain pretrained weights, and the full model is then fine-tuned starting from those weights. This approach improves training performance tremendously over the conventional mode of training the whole network at once.
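
The greedy layer-wise idea can be sketched as follows, again assuming a Keras/TensorFlow setup; the function name pretrain_stacked_autoencoder and the layer sizes are hypothetical choices for illustration. Each shallow autoencoder is trained to reconstruct the output of the previous layer, and the resulting encoders supply the pretrained weights for subsequent fine-tuning.

```python
# Hedged sketch of greedy layer-wise pretraining for a stacked autoencoder.
# X is assumed to be a NumPy array of shape (n_samples, input_dim).
from tensorflow import keras
from tensorflow.keras import layers

def pretrain_stacked_autoencoder(X, layer_sizes=(128, 64, 32),
                                 activation="relu", epochs=10, batch_size=64):
    """Train one shallow autoencoder per layer; return the pretrained encoders."""
    encoders = []
    current_input = X
    for units in layer_sizes:
        input_dim = current_input.shape[1]
        inputs = keras.Input(shape=(input_dim,))
        encoded = layers.Dense(units, activation=activation)(inputs)
        decoded = layers.Dense(input_dim, activation="linear")(encoded)
        ae = keras.Model(inputs, decoded)
        ae.compile(optimizer="adam", loss="mse")
        # Train this single layer to reconstruct its own input.
        ae.fit(current_input, current_input,
               epochs=epochs, batch_size=batch_size, verbose=0)
        encoder = keras.Model(inputs, encoded)
        encoders.append(encoder)
        # The codes from this layer become the input to the next layer.
        current_input = encoder.predict(current_input, verbose=0)
    return encoders
```

The weights of the returned encoders can then initialize the corresponding layers of the full deep autoencoder, which is fine-tuned end to end on the original data.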
