Amazing, it learned! Or, did it really? A further step – testing

Well, we might now ask: the neural network has learned from the data, but how can we verify that it has actually learned? Just as students are subjected to exams, we need to check the network's responses after training. But wait! Do you think it is likely a teacher would put on an exam the same questions he/she presented in class? There is no sense in evaluating somebody's learning with examples that are already known; a suspicious teacher would conclude the student might have memorized the content instead of having learned it.

Okay, let's now explain this part. What we are talking about here is testing. The learning process we have covered so far is called training. After training a neural network, we should test whether it has really learned. For testing, we must present the neural network with another fraction of data from the same environment it has learned from. This is necessary because, just like the student, the neural network could respond properly only to the data points it was exposed to during training; this is called overtraining. To check that the neural network has not fallen into overtraining, we must check its response to data points it has never seen.
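As a rough illustration of how such a split might be prepared, here is a minimal, self-contained sketch in plain Java that shuffles a dataset and reserves part of it for testing. The 80/20 proportion and all the names (TrainTestSplitSketch, trainingData, testingData) are illustrative choices for this sketch, not part of any particular framework:

import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class TrainTestSplitSketch {
    public static void main(String[] args) {
        // Each row holds {input, expected output}
        Double[][] data = {
            {-2.0, 4.0}, {-1.5, 2.25}, {-1.0, 1.0}, {-0.5, 0.25},
            { 0.0, 0.0}, { 0.5, 0.25}, { 1.0, 1.0}, { 1.5, 2.25}
        };
        // Shuffle so neither subset is biased toward one region of the domain
        List<Double[]> rows = Arrays.asList(data);
        Collections.shuffle(rows);
        // Keep roughly 80% of the points for training, the rest for testing
        int splitIndex = (int) (rows.size() * 0.8);
        Double[][] trainingData = rows.subList(0, splitIndex).toArray(new Double[0][]);
        Double[][] testingData = rows.subList(splitIndex, rows.size()).toArray(new Double[0][]);
        System.out.println("Training points: " + trainingData.length);
        System.out.println("Testing points: " + testingData.length);
    }
}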

The following figure illustrates the overtraining problem. Imagine that our network is designed to approximate some function f(x) whose definition is unknown. The neural network was fed some data from that function and produced the result shown on the left of the figure. But when we expand to a wider domain, for example, by adding a testing dataset, we note that the network's response no longer follows the data (on the right of the figure):

[Figure: the network's output fits the training data (left) but diverges from f(x) over the wider testing domain (right)]

In this case, we see that the neural network failed to learn the whole environment (the function f(x)). This can happen for a number of reasons:

  • The neural network didn't receive enough information from the environment
  • The data from the environment is nondeterministic
  • The training and testing datasets are poorly defined
  • The neural network has learned the training data so thoroughly that it fails to generalize to the testing data

Throughout this book, we will cover how to prevent this and other issues that may arise during training.

Overfitting and overtraining

In our previous example, the neural network seemed to have learned amazingly well. However, there is a risk of overfitting and overtraining. The difference between these two concepts is subtle. Overfitting occurs when the neural network memorizes the problem's behavior, so that it provides good values only at the training points, thereby losing its generalization capacity. Overtraining, which can be a cause of overfitting, occurs when the training error becomes much smaller than the testing error, or, more precisely, when the testing error starts to increase as the neural network continues (over)training:

[Figure: as training proceeds, the training error keeps decreasing while the testing error starts to increase]

One way to prevent overtraining and overfitting is to check the testing error as training goes on. When the testing error starts to increase, it is time to stop. This will be covered in more detail in the next chapters.
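To make that stopping criterion concrete, here is a minimal, self-contained sketch. The error curves are synthetic stand-ins (a training error that steadily shrinks and a testing error that dips and then rises); in a real setting these values would come from your training algorithm, so the point here is only the stopping rule itself:

public class EarlyStoppingSketch {
    public static void main(String[] args) {
        double previousTestError = Double.MAX_VALUE;
        int maxEpochs = 100;
        for (int epoch = 1; epoch <= maxEpochs; epoch++) {
            // Synthetic stand-ins for measured errors: the training error
            // keeps falling while the testing error falls and then rises
            double trainingError = 1.0 / epoch;
            double testingError = 0.5 + Math.pow(epoch - 30, 2) / 1000.0;
            if (testingError > previousTestError) {
                System.out.println("Stopping at epoch " + epoch
                    + ": testing error rose to " + testingError
                    + " while training error is still " + trainingError);
                break;
            }
            previousTestError = testingError;
        }
    }
}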

Now, let's see whether this is the case in our example. Let's add some more data and test it:

// Testing dataset: data points not used during training
Double[][] _testDataSet = {
  {-1.7 , fncTest(-1.7) }
, {-1.0 , fncTest(-1.0) }
, { 0.0 , fncTest(0.0) }
, { 0.8 , fncTest(0.8) }
, { 2.0 , fncTest(2.0) }
};
NeuralDataSet testDataSet = new NeuralDataSet(_testDataSet, inputColumns, outputColumns);
deltaRule.setTestingDataSet(testDataSet);
deltaRule.test();
testDataSet.printNeuralOutput();
[Figure: network output for the testing dataset]

As can be seen, the neural network shows generalization capacity in this case: its outputs on the testing points still follow the data. In spite of the simplicity of this example, we can already see the learning capability of a neural network.
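One way to put a number on that generalization capacity is to compute the mean squared error (MSE) over the testing points. The sketch below does this for the same five test inputs used above; since the book's fncTest and the trained network are not reproduced in this excerpt, both are replaced by labeled stand-ins:

public class TestErrorSketch {
    // Stand-in for the target function fncTest (its real definition
    // is not shown in this excerpt)
    static double fncTest(double x) {
        return x * x;
    }

    // Hypothetical trained network: a deliberately imperfect approximation
    static double networkOutput(double x) {
        return x * x + 0.05 * x;
    }

    public static void main(String[] args) {
        // The same inputs used in the testing dataset above
        double[] testInputs = {-1.7, -1.0, 0.0, 0.8, 2.0};
        double sumSquaredError = 0.0;
        for (double x : testInputs) {
            double error = fncTest(x) - networkOutput(x);
            sumSquaredError += error * error;
        }
        double mse = sumSquaredError / testInputs.length;
        // A testing MSE close to the training MSE suggests the network
        // generalizes rather than memorizes
        System.out.println("Testing MSE: " + mse);
    }
}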
