Multi-Layer Perceptron

We saw that the AND and OR gate outputs are linearly separable, so a single perceptron can model this data. However, not all Boolean functions are linearly separable. In fact, the linearly separable ones are a small minority, and their proportion of all possible Boolean functions tends to zero as the number of input bits increases. As we anticipated, the XOR gate is one such case: linear separation is not possible. The crosses and the zeros are positioned so that no single line can separate them, as shown in the following figure:

[Figure: XOR outputs plotted in the input plane; no single straight line separates the two classes]

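To make this claim precise, consider what a single perceptron with weights $w_1, w_2$ and bias $b$ would have to satisfy to reproduce the XOR truth table. The following short argument is a sketch, assuming a step activation that outputs 1 when its argument is positive:

$$
\begin{aligned}
(0,0)\mapsto 0:&\quad b \le 0\\
(1,0)\mapsto 1:&\quad w_1 + b > 0\\
(0,1)\mapsto 1:&\quad w_2 + b > 0\\
(1,1)\mapsto 0:&\quad w_1 + w_2 + b \le 0
\end{aligned}
$$

Adding the two middle inequalities gives $w_1 + w_2 + 2b > 0$, while the first and last give $w_1 + w_2 + 2b \le b \le 0$. The two conclusions contradict each other, so no choice of weights works: XOR is not linearly separable.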
We could think of combining more perceptrons. The resulting structure could thus learn a greater number of functions, but all of them would still belong to the subset of linearly separable functions.
In order to achieve a wider range of functions, intermediate layers must be introduced between the input layer and the output layer, allowing the network to build some kind of internal representation of the input. The resulting network is called a Multi-Layer Perceptron (MLP).

We have already seen this structure as feedforward networks in Chapter 1, Neural Network and Artificial Intelligence Concepts. An MLP consists of at least three layers of nodes: an input layer, a hidden layer, and an output layer. Except for the input nodes, each node is a neuron that uses a non-linear activation function. An MLP is trained with a supervised learning technique, typically backpropagation. Its multiple layers and non-linear activations distinguish the MLP from the simple perceptron, and they are what make it suitable for data that is not linearly separable.
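To make these ideas concrete, here is a minimal sketch in plain base R (not the book's code; the variable names are purely illustrative) of a single forward pass and one backpropagation update for a tiny 2-2-1 MLP with logistic activations and a squared-error loss:

```r
sigmoid <- function(z) 1 / (1 + exp(-z))

set.seed(1)
W1 <- matrix(rnorm(4, sd = 0.5), nrow = 2)   # input -> hidden weights (2 x 2)
b1 <- rep(0, 2)                              # hidden biases
W2 <- matrix(rnorm(2, sd = 0.5), nrow = 1)   # hidden -> output weights (1 x 2)
b2 <- 0                                      # output bias
lr <- 0.5                                    # learning rate

x <- c(1, 0); target <- 1                    # one XOR training pattern

# Forward pass: input -> hidden -> output
h <- sigmoid(W1 %*% x + b1)                  # hidden activations (2 x 1)
y <- sigmoid(W2 %*% h + b2)                  # network output (1 x 1)

# Backward pass: error signals from the squared-error loss 0.5 * (y - target)^2
delta_out <- as.numeric((y - target) * y * (1 - y))
delta_hid <- as.vector(t(W2) * delta_out) * as.vector(h * (1 - h))

# Gradient-descent updates of weights and biases
W2 <- W2 - lr * delta_out * t(h)
b2 <- b2 - lr * delta_out
W1 <- W1 - lr * delta_hid %*% t(x)
b1 <- b1 - lr * delta_hid
```

Repeating such updates over many patterns and epochs is, in essence, what the training routine of an MLP library performs for us.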

For example, an MLP, such as that shown in the following figure, is able to realize the XOR function, which we have previously seen cannot be achieved through a simple perceptron:
[Figure: A three-layer MLP that realizes XOR by combining OR and AND neurons in the hidden layer]
XOR is achieved using a three-layer network that combines OR and AND perceptrons. The output layer contains a single neuron, which gives the XOR output. A configuration of this kind allows each hidden neuron to specialize in a particular logic function. In the case of XOR, the two hidden neurons can carry out the OR and AND logic functions respectively, and the output neuron fires only when the OR neuron is active and the AND neuron is not, which is exactly the XOR behaviour.
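As an illustration of this decomposition (a hand-wired sketch with weights chosen for clarity, not the weights from the figure), the following base R code builds the hidden OR and AND neurons with step activations and combines them into XOR:

```r
step <- function(z) as.integer(z > 0)        # threshold activation

xor_mlp <- function(x1, x2) {
  h_or  <- step(x1 + x2 - 0.5)               # hidden neuron 1: OR(x1, x2)
  h_and <- step(x1 + x2 - 1.5)               # hidden neuron 2: AND(x1, x2)
  step(h_or - h_and - 0.5)                   # output neuron: OR and not AND = XOR
}

# Check the full truth table
inputs <- expand.grid(x1 = 0:1, x2 = 0:1)
cbind(inputs, xor = mapply(xor_mlp, inputs$x1, inputs$x2))
```

The last column of the resulting table is 0, 1, 1, 0, matching the XOR truth table.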

The term MLP does not refer to a single perceptron that has multiple layers. Rather, an MLP contains many perceptrons organized into layers, which is why some prefer the alternative name MLP network.

Applications of MLPs include the following:

  • MLPs are widely used in research to tackle complex problems that simpler linear models cannot solve.
  • MLPs are universal function approximators, so they can be used to build mathematical models by regression analysis; they also make good classification algorithms.
  • MLPs are used in diverse fields, such as speech recognition, image recognition, and language translation. They form the basis for deep learning.

We will now implement an MLP using the R package RSNNS, which provides an interface to the Stuttgart Neural Network Simulator (SNNS).
