Now, let's implement all the theory that we've discussed so far. Here, we use the classes that define the ANN structures: NeuralNet, Layer, Neuron, and so on. We now add the HiddenLayer and OutputLayer classes, which inherit from the Layer class, to implement multilayer neural networks.
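As a rough sketch of this hierarchy (each class lives in its own file; the Layer fields shown here are only the ones used later in this section, and the bodies of the subclasses are abbreviated), the two new classes simply extend Layer:

import java.util.ArrayList;

// Sketch only; the actual classes carry more state and behavior.
public class Layer {
    private ArrayList<Neuron> listOfNeurons = new ArrayList<Neuron>();
    private int numberOfNeuronsInLayer;

    public ArrayList<Neuron> getListOfNeurons() { return listOfNeurons; }
    public int getNumberOfNeuronsInLayer() { return numberOfNeuronsInLayer; }
}

public class HiddenLayer extends Layer {
    // hidden-layer-specific initialization and computation are added here
}

public class OutputLayer extends Layer {
    // output-layer-specific initialization and computation are added here
}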
We also implement the two learning algorithms that we've presented in this chapter: backpropagation and Levenberg–Marquardt. In the Training class, we add two new values to the enum of training types: BACKPROPAGATION and LEVENBERG_MARQUARDT.
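A minimal sketch of that change might look as follows; the exact enum name and the values carried over from the previous chapters (such as PERCEPTRON and ADALINE) are assumptions here:

public class Training {
    // training algorithms supported so far;
    // the last two values are new in this chapter
    public enum TrainingTypesENUM {
        PERCEPTRON,
        ADALINE,
        BACKPROPAGATION,
        LEVENBERG_MARQUARDT;
    }
    // ... rest of the Training class ...
}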
In order to make the execution of the Levenberg–Marquardt algorithm possible, we add a new package called edu.packt.neuralnet.util and two more classes, namely Matrix and IdentityMatrix. These classes implement the matrix operations applied in the Levenberg–Marquardt algorithm. However, we are not going to detail these classes now; we are just going to use their basic matrix operations.
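We will not reproduce those classes here, but a minimal sketch of the kind of operations the Levenberg–Marquardt algorithm needs (the names and signatures below are illustrative assumptions, with each class in its own file inside edu.packt.neuralnet.util) could look like this:

// Illustrative sketch; the actual Matrix and IdentityMatrix classes may differ.
public class Matrix {
    protected final double[][] values;

    public Matrix(double[][] values) { this.values = values; }

    // transpose: t[j][i] = values[i][j]
    public Matrix transpose() {
        double[][] t = new double[values[0].length][values.length];
        for (int i = 0; i < values.length; i++)
            for (int j = 0; j < values[0].length; j++)
                t[j][i] = values[i][j];
        return new Matrix(t);
    }

    // standard matrix product, needed for terms such as J^T * J
    public Matrix multiply(Matrix other) {
        double[][] r = new double[values.length][other.values[0].length];
        for (int i = 0; i < r.length; i++)
            for (int j = 0; j < r[0].length; j++)
                for (int k = 0; k < other.values.length; k++)
                    r[i][j] += values[i][k] * other.values[k][j];
        return new Matrix(r);
    }
}

// identity matrix of a given order, used for the damping term in Levenberg-Marquardt
public class IdentityMatrix extends Matrix {
    public IdentityMatrix(int order) {
        super(new double[order][order]);
        for (int i = 0; i < order; i++) values[i][i] = 1.0;
    }
}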
The following table shows a list of relevant attributes and methods of the classes used in this chapter:
The class diagram changes are shown in the following figure. Attributes and methods already explained in the previous chapters, as well as their configuration methods (getters and setters), have been omitted.
We have seen in the flowchart that the backpropagation algorithm has two phases: the forward propagation of signals and the backward propagation of errors.
So, the backpropagation class will have one special method for each of these phases: forward() and backpropagation(). The train() method of the backpropagation class calls these two functions, so let's analyze it first:
public NeuralNet train(NeuralNet n) {
    int epoch = 0;
    setMse(1.0);
    // train until the MSE falls below the target error
    while (getMse() > n.getTargetError()) {
        if (epoch >= n.getMaxEpochs()) break;
        int rows = n.getTrainSet().length;
        double sumErrors = 0.0;
        // present each training record: forward pass, then backpropagation
        for (int rows_i = 0; rows_i < rows; rows_i++) {
            n = forward(n, rows_i);
            n = backpropagation(n, rows_i);
            sumErrors = sumErrors + n.getErrorMean();
        }
        setMse(sumErrors / rows);
        System.out.println(getMse());
        epoch++;
    }
    System.out.println("Number of epochs: " + epoch);
    return n;
}
First, this code gets the training parameters and sets the MSE (which stands for mean squared error), which will serve as the stop condition. The outer loop keeps training until the MSE falls below the target error; inside this loop, there is also a break in case the number of epochs already executed reaches the maximum.
The inner loop goes over every data point in the training dataset, repeating the training process for each one: it first calls the forward function and then the backpropagation function, both of which are detailed later in this section, and sums up the errors. After going over all the data points in the training set, the method sets the current MSE, prints it on the screen, and increments the number of epochs.
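In other words, if e_i denotes the error mean returned for the i-th training record and N is the number of records, the value this loop assigns as the MSE is:

\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} e_i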
Now, let's analyze the forward and backpropagation functions. Since they are quite long, we are going to explore the most important parts.
The forward function executes the neural computation from the input layer to the output layer. For simplicity, this implementation handles only one hidden layer and one output layer, given that this simple architecture has been shown to work quite well compared to networks with multiple hidden layers. The function receives as parameters the neural network and the row of the dataset to be forwarded:
private NeuralNet forward(NeuralNet n, int row)
It initializes some parameters, such as the error sum and the estimated and real outputs. There is basically one major loop containing two minor loops: one for the hidden layer and the other for the output layer:
for (HiddenLayer hiddenLayer : listOfHiddenLayer) {
    int numberOfNeuronsInLayer = hiddenLayer.getNumberOfNeuronsInLayer();
    for (Neuron neuron : hiddenLayer.getListOfNeurons()) {
        for (int layer_j = 0; layer_j < numberOfNeuronsInLayer - 1; layer_j++) {
            // hidden layer computation (detailed in the attached source code)
        }
        for (int outLayer_i = 0; outLayer_i < n.getOutputLayer().getNumberOfNeuronsInLayer(); outLayer_i++) {
            // output layer computation (detailed in the attached source code)
        }
        // sumError and hiddenLayer_i are initialized earlier in the function
        double errorMean = sumError / n.getOutputLayer().getNumberOfNeuronsInLayer();
        n.setErrorMean(errorMean);
        n.getListOfHiddenLayer().get(hiddenLayer_i)
                .setListOfNeurons(hiddenLayer.getListOfNeurons());
    }
}
After computing the outputs for the hidden and output layers, this function finally calculates the error, which will be used for backpropagation. The computation for the hidden layer and the output layer is detailed in the source code attached to this chapter.
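Although the full bodies ship with the chapter's source code, the per-neuron computation is essentially a weighted sum of the inputs passed through the activation function. A hypothetical version of the elided hidden-layer body (assuming an activationFnc helper that mirrors the derivativeActivationFnc used below, and a setOutputValue setter) might look like this:

// Hypothetical sketch of the elided hidden-layer computation:
// net value = weighted sum of the inputs; output = activation(net).
double netValue = 0.0;
for (int j = 0; j < n.getInputLayer().getNumberOfNeuronsInLayer(); j++) {
    double inputValue = n.getTrainSet()[row][j];
    netValue = netValue + neuron.getListOfWeightIn().get(j) * inputValue;
}
neuron.setOutputValue(activationFnc(n.getActivationFnc(), netValue));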
The backpropagation function also receives as parameters the neural network and the row indicating the data point to be trained.
private NeuralNet backpropagation(NeuralNet n, int row)
For an easier understanding, this function is divided into six parts. Let's focus on parts 2 to 5: computing the sensibility of the output layer, computing the sensibility of the hidden layer, updating the weights of the output layer, and updating the weights of the hidden layer. The sensibility for the output layer is quite simple; the line computing the sensibility parameter is a direct application of the delta rule:
//sensibility output layer
for (Neuron neuron : outputLayer) {
    error = neuron.getError();
    netValue = neuron.getOutputValue();
    sensibility = derivativeActivationFnc(
            n.getActivationFncOutputLayer(), netValue) * error;
    neuron.setSensibility(sensibility);
}
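Written as an equation, with e_o the error of output neuron o, net_o the value the code passes as netValue, and f' the derivative of the output layer's activation function, this is the delta rule:

\delta_o = f'(\mathit{net}_o) \cdot e_o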
For the hidden layer, there is a need to sum up the products of the output layer's weights and sensibilities. The local variable called tempSensibility accumulates this sum, which is then used in the calculation of the sensibility. Note that this parameter is calculated inside a loop that runs over all the neurons contained in that layer:
for (Neuron neuron : hiddenLayer) {
    sensibility = 0.0;
    if (neuron.getListOfWeightIn().size() > 0) { //exclude bias
        ArrayList<Double> listOfWeightsOut = neuron.getListOfWeightOut();
        double tempSensibility = 0.0;
        int weight_i = 0;
        // sum the products of each outgoing weight and the sensibility
        // of the output neuron it connects to
        for (Double weight : listOfWeightsOut) {
            tempSensibility = tempSensibility
                    + (weight * outputLayer.get(weight_i).getSensibility());
            weight_i++;
        }
        sensibility = derivativeActivationFnc(n.getActivationFnc(),
                neuron.getOutputValue()) * tempSensibility;
        neuron.setSensibility(sensibility);
    }
}
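In equation form, for a hidden neuron h with output y_h and outgoing weights w_{ho} toward the output neurons o, the sensibility just computed is:

\delta_h = f'(y_h) \cdot \sum_{o} w_{ho} \, \delta_o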
The weight update in the output layer is as simple as its respective sensibility. There is a loop inside this part that walks over all the hidden layer neurons connected to each output neuron. The local variable called newWeight receives the new value for the respective weight:
//fix weights (teach) [output layer to hidden layer]
for (int outLayer_i = 0; outLayer_i < n.getOutputLayer().getNumberOfNeuronsInLayer(); outLayer_i++) {
    for (Neuron neuron : hiddenLayer) {
        double newWeight = neuron.getListOfWeightOut().get(outLayer_i)
                + (n.getLearningRate()
                        * outputLayer.get(outLayer_i).getSensibility()
                        * neuron.getOutputValue());
        neuron.getListOfWeightOut().set(outLayer_i, newWeight);
    }
}
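This loop applies the weight update rule, with learning rate \eta, output sensibility \delta_o, and hidden neuron output y_h:

w_{ho} \leftarrow w_{ho} + \eta \, \delta_o \, y_h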
For the hidden layer, the sensibility parameters are used according to the equations shown in the backpropagation section. There is also an inner loop that walks over all the neural inputs:
//fix weights (teach) [hidden layer to input layer]
for (Neuron neuron : hiddenLayer) {
    ArrayList<Double> hiddenLayerInputWeights = neuron.getListOfWeightIn();
    if (hiddenLayerInputWeights.size() > 0) { //exclude bias
        int hidden_i = 0;
        double newWeight = 0.0;
        for (int i = 0; i < n.getInputLayer().getNumberOfNeuronsInLayer(); i++) {
            newWeight = hiddenLayerInputWeights.get(hidden_i)
                    + (n.getLearningRate() * neuron.getSensibility()
                            * n.getTrainSet()[row][i]);
            neuron.getListOfWeightIn().set(hidden_i, newWeight);
            hidden_i++;
        }
    }
}
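Finally, to tie the pieces together, a hypothetical training call could look as follows. The setter names and the Backpropagation class name mirror the getters and prose used above, and the construction of the layers is omitted because it depends on the NeuralNet initialization API:

// Hypothetical usage sketch; setters mirror the getters used in this section.
double[][] trainData = { {0.0, 0.0}, {0.0, 1.0}, {1.0, 0.0}, {1.0, 1.0} }; // illustrative records
NeuralNet n = new NeuralNet();
n.setTrainSet(trainData);   // one row per training record
n.setTargetError(0.002);    // stop once the MSE falls below this value
n.setMaxEpochs(1000);       // hard limit on the number of epochs
n.setLearningRate(0.1);

Backpropagation backprop = new Backpropagation();
n = backprop.train(n);      // runs forward() and backpropagation() per record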