The need for multilayer networks

A multi-layer perceptron (MLP) contains one or more hidden layers in addition to one input layer and one output layer. While a single-layer perceptron can learn only linear functions, an MLP can also learn non-linear functions.
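
To make the linear versus non-linear distinction concrete, consider XOR, a standard example of a function that no single-layer perceptron can represent, because its two classes cannot be separated by a straight line. The sketch below hand-picks the weights of a tiny network with one hidden layer that does compute XOR. The step activation, the weight values, and the function names are illustrative assumptions, not taken from the text.

```python
def step(z):
    """Threshold activation: 1 if the weighted sum is positive, else 0."""
    return 1 if z > 0 else 0


def xor_mlp(x1, x2):
    """XOR computed by a hand-weighted MLP with one hidden layer.

    The weights are hypothetical values chosen by hand, not learned.
    """
    # Hidden node 1 behaves like OR: fires when at least one input is 1.
    h1 = step(-0.5 * 1 + 1.0 * x1 + 1.0 * x2)
    # Hidden node 2 behaves like AND: fires only when both inputs are 1.
    h2 = step(-1.5 * 1 + 1.0 * x1 + 1.0 * x2)
    # Output node fires when OR is on and AND is off, which is XOR.
    return step(-0.5 * 1 + 1.0 * h1 - 1.0 * h2)


for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, "->", xor_mlp(x1, x2))
```

Each hidden node on its own draws a single straight boundary; it is the combination of the two in the output node that produces the non-linear result.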

Figure 8 shows an MLP with a single hidden layer. Note that all connections have weights associated with them, but only three weights (w0, w1, and w2) are shown in the figure.

Input Layer: The Input Layer has three nodes. The bias node has a value of 1. The other two nodes take X1 and X2 as external inputs (numerical values that depend on the input dataset). As discussed before, no computation is performed in the Input Layer, so the outputs from its nodes are 1, X1, and X2 respectively, which are fed into the Hidden Layer.

Hidden Layer: The Hidden Layer also has three nodes, with the bias node having an output of 1. The output of the other two nodes in the Hidden Layer depends on the outputs from the Input Layer (1, X1, and X2) as well as the weights associated with the connections (edges). Remember that f refers to the activation function. These outputs are then fed to the nodes in the Output Layer.
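
As a rough illustration of the computation described above, a single hidden node can be sketched as the activation function f applied to the weighted sum of the Input Layer outputs, that is, f(w0·1 + w1·X1 + w2·X2). This assumes the usual weighted-sum form and uses the sigmoid as f. The function names and the numeric values are hypothetical, chosen only for the example.

```python
import math


def sigmoid(z):
    """One common choice for the activation function f (an assumption here)."""
    return 1.0 / (1.0 + math.exp(-z))


def hidden_node_output(x1, x2, w0, w1, w2, f=sigmoid):
    """Output of one hidden node fed by the Input Layer outputs 1, X1, X2."""
    # The bias node always outputs 1, so it contributes w0 * 1.
    weighted_sum = w0 * 1 + w1 * x1 + w2 * x2
    return f(weighted_sum)


# Hypothetical inputs and weights, for illustration only.
output = hidden_node_output(x1=0.5, x2=-1.0, w0=0.1, w1=0.4, w2=0.2)
print(output)
```

Swapping in a different activation only requires passing another function as f.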

  
Figure 8: A multi-layer perceptron having one hidden layer

Output Layer: The Output Layer has two nodes; they take inputs from the Hidden Layer and perform computations similar to those shown for the highlighted hidden node. The values calculated (Y1 and Y2) as a result of these computations act as the outputs of the multi-layer perceptron.
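
Putting the three layers together, a forward pass through the network described above might look like the following sketch. It assumes a sigmoid activation in both the Hidden Layer and the Output Layer. All weight values and the helper names `layer_forward` and `mlp_forward` are hypothetical.

```python
import math


def sigmoid(z):
    """Assumed activation function f."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, f=sigmoid):
    """Compute the outputs of every non-bias node in one layer.

    inputs  -- outputs of the previous layer, with the bias value 1 first
    weights -- one list of weights per node in this layer
    """
    return [f(sum(w * x for w, x in zip(node_weights, inputs)))
            for node_weights in weights]


def mlp_forward(x1, x2, hidden_weights, output_weights):
    """Forward pass: Input Layer -> Hidden Layer -> Output Layer."""
    input_layer = [1.0, x1, x2]           # bias node, X1, X2
    hidden = layer_forward(input_layer, hidden_weights)
    hidden_layer = [1.0] + hidden         # prepend the hidden bias node
    return layer_forward(hidden_layer, output_weights)


# Hypothetical weights: two hidden nodes and two output nodes,
# each with three incoming connections (bias first).
hidden_weights = [[0.1, 0.4, 0.2], [-0.3, 0.25, 0.6]]
output_weights = [[0.05, 0.7, -0.5], [0.2, -0.4, 0.9]]

y1, y2 = mlp_forward(0.5, -1.0, hidden_weights, output_weights)
print(y1, y2)
```

With untrained weights like these, Y1 and Y2 carry no meaning yet; training adjusts the weights so that the outputs match the targets.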

Given a set of features X = (x1, x2, …) and a target y, a multi-layer perceptron can learn the relationship between the features and the target for either classification or regression.

Let's take an example to understand multi-layer perceptrons better. Suppose we have the following student marks dataset:

Table 1 – Sample student marks dataset

Hours studied   Mid term marks   Final term result
35              67               Pass
12              75               Fail
16              89               Pass
45              56               Pass
10              90               Fail

The two input columns show the number of hours the student studied and the mid term marks the student obtained. The final result column can take two values, Pass or Fail (which can be encoded as 1 and 0), indicating whether the student passed the final term. For example, a student who studied 35 hours and obtained 67 marks in the mid term went on to pass the final term.

Now, suppose we want to predict whether a student studying 25 hours and having 70 marks in the mid term will pass the final term:

Table 2 – Sample student with unknown final term result

Hours studied   Mid term marks   Final term result
25              70               ?

This is a binary classification problem, where an MLP can learn from the given examples (training data) and make an informed prediction for a new data point. We will soon see how an MLP learns such relationships.
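
As a preview, the sketch below fits a small MLP to the five rows of Table 1 using scikit-learn, which is an assumption on our part since the text does not name a library. Pass is encoded as 1 and Fail as 0. The hidden layer size, solver, and other settings are arbitrary illustrative choices.

```python
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Table 1 as features: [hours studied, mid term marks].
X_train = [[35, 67], [12, 75], [16, 89], [45, 56], [10, 90]]
# Final term result, encoded as Pass -> 1 and Fail -> 0.
y_train = [1, 0, 1, 1, 0]

# Scale both features to zero mean and unit variance, which generally
# helps neural network training.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X_train)

# One hidden layer with three nodes (an arbitrary illustrative choice).
model = MLPClassifier(hidden_layer_sizes=(3,), solver="lbfgs",
                      max_iter=1000, random_state=0)
model.fit(X_scaled, y_train)

# The new student: 25 hours studied, 70 marks in the mid term.
new_student = scaler.transform([[25, 70]])
prediction = model.predict(new_student)[0]
print("Pass" if prediction == 1 else "Fail")
```

Five examples are far too few to train a reliable model, so the prediction here should be read as a demonstration of the workflow rather than a meaningful result.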
