An autoencoder consists of the following two networks:
- Encoder: An encoder encodes its input, xi, into a hidden representation, h. The output of an encoder unit is as follows:
h = g(Wxi + b)
where xi ∈ R^n, W ∈ R^(d×n), b ∈ R^d.
- Decoder: A decoder reconstructs the input from the hidden representation, h. The output of a decoder unit is as follows:
x̂i = f(W*h + c)
where W* ∈ R^(n×d), h ∈ R^d, c ∈ R^n.
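The encoder and decoder maps above can be sketched in NumPy as follows. This is a minimal illustration, assuming a sigmoid for the encoder activation g and the identity for the decoder activation f (the text does not fix these choices), with randomly initialized weights:

```python
import numpy as np

n, d = 8, 3  # input dimension n, hidden dimension d (d < n: under-complete)

rng = np.random.default_rng(0)
W = rng.standard_normal((d, n))       # encoder weights, W  in R^(d x n)
b = np.zeros(d)                       # encoder bias,    b  in R^d
W_star = rng.standard_normal((n, d))  # decoder weights, W* in R^(n x d)
c = np.zeros(n)                       # decoder bias,    c  in R^n

def g(z):
    # sigmoid activation (an assumed choice for g)
    return 1.0 / (1.0 + np.exp(-z))

x_i = rng.standard_normal(n)  # one input sample, xi in R^n

h = g(W @ x_i + b)       # encoder: h = g(W xi + b), h in R^d
x_hat = W_star @ h + c   # decoder: x̂i = f(W* h + c), with f = identity

print(h.shape, x_hat.shape)  # → (3,) (8,)
```

Note how the shapes compose: the encoder maps R^n to R^d, and the decoder maps R^d back to R^n, so the reconstruction x_hat has the same dimension as the input.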
An autoencoder neural network tries to reconstruct the original input, xi, from the encoded representation, h, of dimension d, to produce an output, x̂i, such that x̂i approximates xi. The network is trained to minimize the reconstruction error (loss function), which measures the difference between the original input and the predicted output and can be denoted as L(xi, x̂i). An autoencoder whose encoded representation has a lower dimension than the input (d < n) is known as an under-complete autoencoder, whereas one whose encoded representation has a higher dimension than the input (d > n) is known as an over-complete autoencoder.
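As a concrete sketch of the reconstruction error L(xi, x̂i), one common choice is the mean squared error between the input and its reconstruction (the squared-error form is an assumption here; other losses, such as cross-entropy, are also used):

```python
import numpy as np

def reconstruction_loss(x, x_hat):
    # mean squared error between the input and its reconstruction
    return np.mean((x - x_hat) ** 2)

x = np.array([1.0, 0.0, 2.0, -1.0])      # original input
x_hat = np.array([0.9, 0.1, 1.8, -1.2])  # hypothetical reconstruction

print(reconstruction_loss(x, x_hat))  # → 0.025
```

Training the autoencoder amounts to adjusting W, b, W*, and c (for example, by gradient descent) so that this loss, averaged over the training set, is minimized.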
The following diagram shows examples of an under-complete and an over-complete autoencoder:
In the following section, we will implement an under-complete autoencoder.