Early stopping of network training

When training a network, we must specify the number of epochs in advance, without knowing how many will actually be needed. If we specify too few epochs compared to what is actually required, we may have to train the network again with more epochs. On the other hand, if we specify far more epochs than are actually needed, the model may overfit and we may have to retrain the network with fewer epochs. This trial-and-error approach can be very time-consuming for applications where each epoch takes a long time to complete. In such situations, we can make use of callbacks, which can help stop the network training at a suitable time.

To illustrate this problem, let's develop a classification model with the CTG data from Chapter 2, Deep Neural Networks for Multi-Class Classification, using the following code:

# Training network for classification with CTG data (chapter-2)
model <- keras_model_sequential()
model %>%
  layer_dense(units = 25, activation = 'relu', input_shape = c(21)) %>%
  layer_dense(units = 3, activation = 'softmax')
model %>% compile(loss = 'categorical_crossentropy',
                  optimizer = 'adam',
                  metrics = 'accuracy')
history <- model %>% fit(training,
                         trainLabels,
                         epochs = 50,
                         batch_size = 32,
                         validation_split = 0.2)
plot(history)

In the preceding code, we have specified the number of epochs as 50. Once the training process is complete, we can plot the loss and accuracy values for the training and validation data, as follows:

From the preceding plot, we can observe the following: 

  • The loss values for the validation data decrease for the first few epochs and then start to increase.
  • After the first few epochs, the loss values for the training and validation data diverge, moving in opposite directions.
  • If we would like to stop the training process early instead of waiting for all 50 epochs to complete, we can make use of the callback feature available in Keras.

The following code includes the callback feature within the fit function at the time of training the network:

# Training network with callback
model <- keras_model_sequential()
model %>%
  layer_dense(units = 25, activation = 'relu', input_shape = c(21)) %>%
  layer_dense(units = 3, activation = 'softmax')
model %>% compile(loss = 'categorical_crossentropy',
                  optimizer = 'adam',
                  metrics = 'accuracy')
history <- model %>% fit(training,
                         trainLabels,
                         epochs = 50,
                         batch_size = 32,
                         validation_split = 0.2,
                         callbacks = callback_early_stopping(monitor = "val_loss",
                                                             patience = 10))
plot(history)

In the preceding code, early stopping is included through the callbacks argument:

  • The metric that we used for monitoring was the validation loss (val_loss). Another metric that can be tried in this situation is validation accuracy, since we are developing a classification model; a sketch of this variant follows the list.
  • We have specified patience as 10, which means that when the monitored metric shows no improvement for 10 consecutive epochs, the training process will be stopped automatically.
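
As mentioned in the first point, validation accuracy can be monitored instead of validation loss. The following is a minimal sketch of this variant; note that the metric name val_accuracy is an assumption that depends on the Keras/TensorFlow version in use (older versions name it val_acc), and mode = "max" tells the callback that higher values indicate improvement:

# Early stopping based on validation accuracy (sketch; the metric name
# "val_accuracy" is version-dependent -- older Keras versions use "val_acc")
history <- model %>% fit(training,
                         trainLabels,
                         epochs = 50,
                         batch_size = 32,
                         validation_split = 0.2,
                         callbacks = callback_early_stopping(monitor = "val_accuracy",
                                                             mode = "max",
                                                             patience = 10))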

The plots of loss and accuracy are also useful in helping us decide on an appropriate value for patience. The following plot shows the loss and accuracy:

As we can see, this time the training process didn't run for all 50 epochs; it stopped as soon as there were no improvements in the validation loss values for 10 consecutive epochs.
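
One point worth noting is that, by default, early stopping leaves the model with the weights from the last epoch that ran, not from the epoch with the best validation loss. Recent versions of Keras provide a restore_best_weights argument for callback_early_stopping; the following is a minimal sketch of using it, assuming a version of the keras R package that supports this argument:

# Early stopping that restores the weights from the best epoch
# (restore_best_weights requires a recent version of Keras)
history <- model %>% fit(training,
                         trainLabels,
                         epochs = 50,
                         batch_size = 32,
                         validation_split = 0.2,
                         callbacks = callback_early_stopping(monitor = "val_loss",
                                                             patience = 10,
                                                             restore_best_weights = TRUE))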
