Efficient data representation with autoencoders

A big problem that plagues all supervised learning systems is the so-called curse of dimensionality: a progressive decline in performance as the dimensionality of the input space increases. This occurs because the number of samples needed to cover the input space adequately grows exponentially with the number of dimensions. To overcome these problems, some optimizing networks have been developed.
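To get a feel for this exponential growth, here is a minimal sketch of our own (not taken from the text): it counts how many samples are needed just to place ten points along every axis of a d-dimensional grid:

```python
# Illustrative only: covering each axis with 10 samples
# requires 10**d points, i.e. exponentially many in the dimension d.
points_per_axis = 10

for d in (1, 2, 3, 10):
    total_samples = points_per_axis ** d
    print(f"{d:>2} dimensions -> {total_samples:,} samples")

# Output:
#  1 dimensions -> 10 samples
#  2 dimensions -> 100 samples
#  3 dimensions -> 1,000 samples
# 10 dimensions -> 10,000,000,000 samples
```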

The first are autoencoder networks: these are designed and trained to transform an input pattern into itself, so that, in the presence of a degraded or incomplete version of an input pattern, it is possible to recover the original pattern. The network is trained to reproduce at the output the data presented at the input, while the hidden layer stores the data in compressed form, that is, a compact representation that captures the fundamental characteristics of the input data.
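As a minimal sketch of this idea (our own illustration, assuming TensorFlow/Keras is available and that the input is a 784-dimensional vector such as a flattened 28x28 image), an autoencoder can be built as an encoder that compresses the input into a small hidden layer and a decoder that reconstructs it, with the input itself used as the training target:

```python
import tensorflow as tf

input_dim = 784   # e.g. a flattened 28x28 image (assumed for illustration)
code_dim = 32     # size of the compressed internal representation

# Encoder: maps the input to a compact code.
encoder = tf.keras.Sequential([
    tf.keras.layers.Dense(code_dim, activation="relu", input_shape=(input_dim,))
])

# Decoder: maps the code back to the original input space.
decoder = tf.keras.Sequential([
    tf.keras.layers.Dense(input_dim, activation="sigmoid")
])

autoencoder = tf.keras.Sequential([encoder, decoder])

# The network is trained to reproduce its own input, so the inputs
# are also the targets.
autoencoder.compile(optimizer="adam", loss="mse")
# autoencoder.fit(X_train, X_train, epochs=10, batch_size=256)
```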

The second type of optimizing network is the Boltzmann machine: these networks consist of a visible input/output layer and one hidden layer. The connections between the visible layer and the hidden one are undirected: data can travel in both directions, visible-to-hidden and hidden-to-visible, and the neural units can be fully or partially connected.
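The following sketch (our own NumPy illustration, restricted to a Restricted Boltzmann Machine with binary units) shows how a single shared weight matrix is used in both directions, visible-to-hidden and then hidden-to-visible:

```python
import numpy as np

rng = np.random.default_rng(0)

n_visible, n_hidden = 6, 3
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))  # shared, undirected weights
b_visible = np.zeros(n_visible)
b_hidden = np.zeros(n_hidden)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

v = rng.integers(0, 2, size=n_visible).astype(float)   # a binary visible pattern

# Visible -> hidden: sample the hidden units given the visible layer.
p_h = sigmoid(v @ W + b_hidden)
h = (rng.random(n_hidden) < p_h).astype(float)

# Hidden -> visible: the same weight matrix (transposed) drives the reverse pass.
p_v = sigmoid(h @ W.T + b_visible)
v_reconstructed = (rng.random(n_visible) < p_v).astype(float)

print("original pattern:      ", v)
print("reconstructed pattern: ", v_reconstructed)
```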

Let's see an example. Decide which of the following series you think would be easier to memorize:

  • 45, 13, 37, 11, 23, 90, 79, 24, 87, 47
  • 50, 25, 76, 38, 19, 58, 29, 88, 44, 22, 11, 34, 17, 52, 26, 13, 40, 20

Looking at the preceding two series, it seems the first would be easier for a human to memorize, because it is shorter and contains fewer numbers than the second one. However, if you take a careful look at the second series, you will find that each even number is followed by its half, and each odd number is followed by three times itself plus one. This is a famous number sequence known as the hailstone (or Collatz) sequence.
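As a quick check of this rule (a small illustration of our own, not part of the original text), the following function regenerates the second series starting from 50:

```python
def hailstone(start, length):
    """Generate `length` terms of the hailstone (Collatz) sequence."""
    sequence = [start]
    while len(sequence) < length:
        n = sequence[-1]
        # Halve even numbers; map odd numbers to 3n + 1.
        sequence.append(n // 2 if n % 2 == 0 else 3 * n + 1)
    return sequence

print(hailstone(50, 18))
# [50, 25, 76, 38, 19, 58, 29, 88, 44, 22, 11, 34, 17, 52, 26, 13, 40, 20]
```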

The point is that if you can recognize patterns in the data easily and quickly, you can also memorize long series with little effort. During the 1970s, researchers observed that expert chess players were able to memorize the positions of all the pieces in a game after looking at the board for just five seconds. This might sound surprising, but chess experts do not have a more powerful memory than you and I do; they simply recognize chess patterns more easily than a non-chess player does. An autoencoder works in a similar way: it first observes the inputs, converts them to an efficient internal representation, and then outputs something that looks very close to what it has learned:

Figure 5: Autoencoder in chess game perspective

Take a look at a more realistic figure for the chess example we just discussed: the hidden layer has two neurons (that is, the encoder itself), whereas the output layer has three neurons (in other words, the decoder). Because the internal representation has a lower dimensionality than the input data (it is 2D instead of 3D), the autoencoder is said to be undercomplete. An undercomplete autoencoder cannot trivially copy its inputs to the codings, yet it must still find a way to output a copy of its inputs.

It is therefore forced to learn the most important features of the input data and drop the unimportant ones. In this way, an autoencoder can be compared to Principal Component Analysis (PCA), which is also used to represent a given input using fewer dimensions than originally present.
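To make the analogy concrete, here is a sketch of our own (assuming TensorFlow/Keras and NumPy are available, with hypothetical 3D data lying close to a 2D plane) of the undercomplete 3D-to-2D autoencoder described above; with linear activations and a mean squared error loss, the 2D codings it learns behave much like a PCA projection:

```python
import numpy as np
import tensorflow as tf

# Hypothetical 3D data lying close to a 2D plane (for illustration only).
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 2)) @ rng.normal(size=(2, 3)) + 0.05 * rng.normal(size=(1000, 3))

# Undercomplete autoencoder: 3 inputs -> 2-unit coding -> 3 outputs.
# Dense layers without an activation are linear, so this mimics PCA.
encoder = tf.keras.Sequential([tf.keras.layers.Dense(2, input_shape=(3,))])
decoder = tf.keras.Sequential([tf.keras.layers.Dense(3)])
autoencoder = tf.keras.Sequential([encoder, decoder])

autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(X, X, epochs=20, verbose=0)

codings = encoder.predict(X)            # 2D representation of the 3D inputs
reconstructions = autoencoder.predict(X)
print("coding shape:", codings.shape)   # (1000, 2)
print("reconstruction error:", np.mean((X - reconstructions) ** 2))
```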

Up to this point, we have seen how an autoencoder works. Now, it is worth learning how to perform anomaly detection using outlier identification.
