A basic neural architecture – perceptrons

The perceptron is the simplest neural network architecture. Proposed by Frank Rosenblatt in 1957, it has just one layer of neurons that receives a set of inputs and produces a set of outputs. This was one of the first representations of neural networks to gain attention, largely because of its simplicity:

[Figure: A basic neural architecture – perceptrons]

In our Java implementation, this is illustrated with one neural layer (the output layer). The following code creates a perceptron with three inputs and two outputs, using a linear activation function at the output layer:

int numberOfInputs = 3;
int numberOfOutputs = 2;

Linear outputAcFnc = new Linear(1.0);
NeuralNet perceptron = new NeuralNet(numberOfInputs, numberOfOutputs,
        outputAcFnc);
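
To make clear what such an object computes, here is a minimal, standalone sketch of the forward pass: each of the two output neurons takes a weighted sum of the three inputs plus a bias and applies the linear activation. The weight values and the class name below are illustrative assumptions, not the internals of the book's NeuralNet class:

// Illustrative sketch only; not the internals of the book's NeuralNet class.
public class PerceptronForwardSketch {
    public static void main(String[] args) {
        double[] x = { 1.0, 0.5, -0.5 };            // three example inputs (arbitrary)
        double[][] w = { { 0.2, -0.1, 0.4 },         // weights of output neuron 1 (assumed)
                         { 0.3,  0.7, -0.2 } };      // weights of output neuron 2 (assumed)
        double[] bias = { 0.1, -0.3 };               // one bias per output neuron (assumed)

        for (int j = 0; j < 2; j++) {
            double sum = bias[j];
            for (int i = 0; i < 3; i++) {
                sum += w[j][i] * x[i];               // weighted sum of the inputs
            }
            double y = 1.0 * sum;                    // assuming Linear(1.0) means f(u) = 1.0 * u
            System.out.printf("output %d = %.3f%n", j + 1, y);
        }
    }
}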

Applications and limitations

However, it did not take scientists long to conclude that a perceptron could only be applied to equally simple tasks. At that time, neural networks were being used for simple classification problems, but perceptrons usually failed when faced with more complex datasets. Let's illustrate this with a very basic example (an AND function) to better understand the issue.

Linear separation

The example consists of an AND function that takes two inputs, x1 and x2. That function can be plotted in a two-dimensional chart as follows:

[Figure: Linear separation]

Now let's examine how the neural network evolves during training using the perceptron rule, considering two weights, w1 and w2, both initialized to 0.5, and a bias also valued at 0.5. Assume the learning rate η equals 0.2:

Epoch | x1 | x2 | w1    | w2    | b      | y      | t | E      | Δw1    | Δw2    | Δb
1     | 0  | 0  | 0.5   | 0.5   | 0.5    | 0.5    | 0 | -0.5   | 0      | 0      | -0.1
1     | 0  | 1  | 0.5   | 0.5   | 0.4    | 0.9    | 0 | -0.9   | 0      | -0.18  | -0.18
1     | 1  | 0  | 0.5   | 0.32  | 0.22   | 0.72   | 0 | -0.72  | -0.144 | 0      | -0.144
1     | 1  | 1  | 0.356 | 0.32  | 0.076  | 0.752  | 1 | 0.248  | 0.0496 | 0.0496 | 0.0496
2     | 0  | 0  | 0.406 | 0.370 | 0.126  | 0.126  | 0 | -0.126 | 0.000  | 0.000  | -0.025
2     | 0  | 1  | 0.406 | 0.370 | 0.100  | 0.470  | 0 | -0.470 | 0.000  | -0.094 | -0.094
2     | 1  | 0  | 0.406 | 0.276 | 0.006  | 0.412  | 0 | -0.412 | -0.082 | 0.000  | -0.082
2     | 1  | 1  | 0.323 | 0.276 | -0.076 | 0.523  | 1 | 0.477  | 0.095  | 0.095  | 0.095
...   |    |    |       |       |        |        |   |        |        |        |
89    | 0  | 0  | 0.625 | 0.562 | -0.312 | -0.312 | 0 | 0.312  | 0      | 0      | 0.062
89    | 0  | 1  | 0.625 | 0.562 | -0.25  | 0.313  | 0 | -0.313 | 0      | -0.063 | -0.063
89    | 1  | 0  | 0.625 | 0.500 | -0.312 | 0.313  | 0 | -0.313 | -0.063 | 0      | -0.063
89    | 1  | 1  | 0.562 | 0.500 | -0.375 | 0.687  | 1 | 0.313  | 0.063  | 0.063  | 0.063
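
To make the arithmetic behind this table concrete, the following standalone sketch applies the same perceptron (delta) rule: a linear output, weights and bias initialized to 0.5, learning rate 0.2, and the updates Δw = η·E·x and Δb = η·E after each pattern. The class name is an illustrative assumption; the book's NeuralNet class is not used here:

// Standalone sketch of the perceptron (delta) rule used in the table above.
public class PerceptronAndTraining {
    public static void main(String[] args) {
        double[][] inputs = { { 0, 0 }, { 0, 1 }, { 1, 0 }, { 1, 1 } };
        double[] targets = { 0, 0, 0, 1 };      // AND function
        double w1 = 0.5, w2 = 0.5, b = 0.5;     // initial weights and bias
        double eta = 0.2;                       // learning rate

        for (int epoch = 1; epoch <= 89; epoch++) {
            for (int i = 0; i < inputs.length; i++) {
                double x1 = inputs[i][0], x2 = inputs[i][1];
                double y = w1 * x1 + w2 * x2 + b;   // linear output
                double error = targets[i] - y;      // E = t - y
                w1 += eta * error * x1;             // Δw1 = η·E·x1
                w2 += eta * error * x2;             // Δw2 = η·E·x2
                b  += eta * error;                  // Δb  = η·E
            }
        }
        System.out.printf("w1=%.3f, w2=%.3f, b=%.3f%n", w1, w2, b);
    }
}

Running the inner loop by hand for the first few patterns reproduces the rows of epoch 1 above.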

After 89 epochs, we find that the network produces values close to the desired outputs. Since in this example the outputs are binary (zero or one), we can assume that any value produced by the network below 0.5 is considered 0 and any value above 0.5 is considered 1. So, with the final weights and bias found by the learning algorithm, w1=0.562, w2=0.5, and b=-0.375, we can draw the boundary line w1·x1 + w2·x2 + b = 0.5 in the chart:

[Figure: Linear separation]

This boundary defines all the classifications given by the network. You can see that the boundary is linear, given that the activation function is also linear. Thus, the perceptron network is only suitable for problems whose patterns are linearly separable.
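
As a quick check with the final values from the table, w1=0.562, w2=0.5, b=-0.375, and the 0.5 threshold: y(0,0) = -0.375 → 0, y(0,1) = 0.125 → 0, y(1,0) = 0.187 → 0, and y(1,1) = 0.687 → 1, which reproduces the AND truth table.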

The XOR case

Now let's analyze the XOR case:

[Figure: The XOR case]

We can see that in two dimensions it is impossible to draw a line that separates the two patterns. What would happen if we tried to train a single-layer perceptron to learn this function? Suppose we did; the following table shows what happens:

Epoch | x1 | x2 | w1     | w2     | b     | y     | t | E      | Δw1    | Δw2    | Δb
1     | 0  | 0  | 0.5    | 0.5    | 0.5   | 0.5   | 0 | -0.5   | 0      | 0      | -0.1
1     | 0  | 1  | 0.5    | 0.5    | 0.4   | 0.9   | 1 | 0.1    | 0      | 0.02   | 0.02
1     | 1  | 0  | 0.5    | 0.52   | 0.42  | 0.92  | 1 | 0.08   | 0.016  | 0      | 0.016
1     | 1  | 1  | 0.516  | 0.52   | 0.436 | 1.472 | 0 | -1.472 | -0.294 | -0.294 | -0.294
2     | 0  | 0  | 0.222  | 0.226  | 0.142 | 0.142 | 0 | -0.142 | 0.000  | 0.000  | -0.028
2     | 0  | 1  | 0.222  | 0.226  | 0.113 | 0.339 | 1 | 0.661  | 0.000  | 0.132  | 0.132
2     | 1  | 0  | 0.222  | 0.358  | 0.246 | 0.467 | 1 | 0.533  | 0.107  | 0.000  | 0.107
2     | 1  | 1  | 0.328  | 0.358  | 0.352 | 1.038 | 0 | -1.038 | -0.208 | -0.208 | -0.208
...   |    |    |        |        |       |       |   |        |        |        |
127   | 0  | 0  | -0.250 | -0.125 | 0.625 | 0.625 | 0 | -0.625 | 0.000  | 0.000  | -0.125
127   | 0  | 1  | -0.250 | -0.125 | 0.500 | 0.375 | 1 | 0.625  | 0.000  | 0.125  | 0.125
127   | 1  | 0  | -0.250 | 0.000  | 0.625 | 0.375 | 1 | 0.625  | 0.125  | 0.000  | 0.125
127   | 1  | 1  | -0.125 | 0.000  | 0.750 | 0.625 | 0 | -0.625 | -0.125 | -0.125 | -0.125
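
To reproduce this behavior with the training sketch shown after the AND table, only the target array changes; the weights, bias, and learning rate stay the same:

double[] targets = { 0, 1, 1, 0 };      // XOR instead of AND

No matter how many epochs we let it run, the per-pattern error stops shrinking and the weights keep cycling, which is what the rows for epoch 127 show.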

The perceptron simply could not find any pair of weights that would drive the error below 0.625. This can be explained mathematically: as we already noticed in the chart, this function is not linearly separable in two dimensions. So what if we added another dimension? Let's see the chart in three dimensions:

[Figure: The XOR case]

In three dimensions, it is possible to draw a plane that separates the patterns, provided that this additional dimension properly transforms the input data. Okay, but now there is an additional problem: how could we derive this additional dimension, since we have only two input variables? One obvious, if somewhat contrived, answer would be to add a third variable derived from the two original ones. With this third variable being a derivation of the inputs, our neural network would take the following shape:

[Figure: The XOR case]

Okay, now the perceptron has three inputs, one of them being a composition of the other two. This also leads to a new question: how should that composition be computed? We can see that this component could act as a neuron, giving the neural network a nested architecture. If so, there would be another new question: how would the weights of this new neuron be trained, since the error is measured at the output neuron?
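
One concrete choice for such a composition, used here purely as an illustration and not necessarily the one depicted in the figure, is the product x3 = x1·x2. With this third input, XOR becomes linearly separable, since XOR(x1, x2) = x1 + x2 - 2·x1·x2, and a single linear output neuron can represent it exactly:

// Illustration: adding x3 = x1*x2 as a third input makes XOR linearly separable.
public class XorWithDerivedInput {
    public static void main(String[] args) {
        double[][] inputs = { { 0, 0 }, { 0, 1 }, { 1, 0 }, { 1, 1 } };
        double w1 = 1.0, w2 = 1.0, w3 = -2.0, b = 0.0;   // hand-picked weights

        for (double[] in : inputs) {
            double x1 = in[0], x2 = in[1];
            double x3 = x1 * x2;                          // derived third variable
            double y = w1 * x1 + w2 * x2 + w3 * x3 + b;   // linear output
            System.out.printf("XOR(%.0f, %.0f) = %.0f%n", x1, x2, y);
        }
    }
}

In this sketch the weights were chosen by hand; the open question above, how to train such an intermediate component from the output error, is precisely what motivates multi-layer networks.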
