Experimenting with VGG16 as a pretrained network

In this experiment, we will use a pretrained network called VGG16. VGG16 is a convolutional neural network that is 16 layers deep and can classify images into 1,000 object categories. The network was trained on more than a million images from the ImageNet database. The code for specifying the model's architecture and then compiling and fitting the model is as follows:

# Load the keras package
library(keras)

# Pretrained VGG16 convolutional base, without its classifier layers
pretrained <- application_vgg16(weights = 'imagenet',
                                include_top = FALSE,
                                input_shape = c(224, 224, 3))

# Model architecture: the VGG16 base plus a new classifier on top
model <- keras_model_sequential() %>%
         pretrained %>%
         layer_flatten() %>%
         layer_dense(units = 256, activation = "relu") %>%
         layer_dense(units = 10, activation = "softmax")
summary(model)

# Freeze the weights of the pretrained base so that only the two new
# dense layers are updated during training
freeze_weights(pretrained)
summary(model)
_______________________________________________________________________
Layer (type)                  Output Shape              Param #
=======================================================================
vgg16 (Model)                 (None, 7, 7, 512)         14714688
_______________________________________________________________________
flatten (Flatten)             (None, 25088)             0
_______________________________________________________________________
dense (Dense)                 (None, 256)               6422784
_______________________________________________________________________
dense_1 (Dense)               (None, 10)                2570
=======================================================================
Total params: 21,140,042
Trainable params: 6,425,354
Non-trainable params: 14,714,688
_______________________________________________________________________

# Compile model
model %>% compile(loss = 'categorical_crossentropy',
                  optimizer = 'adam',
                  metrics = 'accuracy')

# Fit model
model_four <- model %>% fit(trainx,
                            trainy,
                            epochs = 10,
                            batch_size = 10,
                            validation_split = 0.2)

From the preceding summary, we can observe the following:

  • This model has 21,140,042 parameters in total; after the weights of the VGG16 base are frozen, only 6,425,354 of them remain trainable.
  • When compiling the model, we retain the use of the adam optimizer.
  • In addition, we run 10 epochs to train the model. All the other settings are the same ones that we used for the previous models.
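
The parameter counts in the summary can be verified with a little arithmetic: flattening the VGG16 output of shape (7, 7, 512) gives 25,088 values, and each dense layer contributes one weight per input-output pair plus one bias per unit. A quick check in R reproduces the figures shown above:

# Reproduce the trainable parameter counts from the model summary
flat   <- 7 * 7 * 512         # 25,088 values from the frozen VGG16 base
hidden <- flat * 256 + 256    # dense layer: 6,422,784 weights plus biases
output <- 256 * 10 + 10       # output layer: 2,570 weights plus biases
hidden + output               # 6,425,354 trainable parameters in total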

A plot of the accuracy and loss values after training the model is as follows:
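
Such a plot can be generated directly from the history object returned by fit(); a minimal sketch:

# Plot loss and accuracy per epoch for the training and validation data
plot(model_four)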

The preceding loss and accuracy plot for the training and validation data indicates that the model's performance levels off after about four epochs. This contrasts with the previous model, where the loss values for the validation data gradually increased, a sign of overfitting.

The code for obtaining the loss, accuracy, and confusion matrix for the test data is as follows:

# Loss and accuracy
model %>% evaluate(testx, testy)
$loss
[1] 1.673867
$acc
[1] 0.7565

# Confusion matrix
pred <- model %>% predict_classes(testx)
(tab <- table(Predicted = pred, Actual = data$test$y[1:2000,]))
         Actual
Predicted   0   1   2   3   4   5   6   7   8   9
        0 137   2  12   0   6   0   0   1  11   6
        1   9 172   1   0   0   0   0   1   9  21
        2   7   0 123  11  11   3   3   5   3   0
        3   3   0  11 130  10  35   7   7   0   0
        4   7   0  13   5 118   7  10   5   1   0
        5   1   0  11  27   3 125   2   7   0   0
        6   2   5  20  18  21   8 192   3   4   1
        7   6   0   4   6  25   7   2 163   2   1
        8  18   6   0   2   4   0   0   1 182   3
        9   6  13   0   0   0   0   0   0   5 171

# Accuracy for each category
100*diag(tab)/colSums(tab)
       0        1        2        3        4
69.89796 86.86869 63.07692 65.32663 59.59596
       5        6        7        8        9
67.56757 88.88889 84.45596 83.87097 84.23645

From the preceding output, we can make the following observations:

  • The loss and accuracy of the test data are 1.674 and 0.757, respectively.
  • The confusion matrix provides further insights. This model has the best classification accuracy of 88.9% when classifying category 6 (frog).
  • On the other hand, the accuracy when classifying category 4 (deer) images is only about 59.6%.
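
These best and worst categories can also be picked out programmatically from the same table; a small sketch based on the output above:

# Per-category accuracy, and the best and worst performing classes
acc <- 100 * diag(tab) / colSums(tab)
which.max(acc)   # position of category 6 (frog), about 88.9%
which.min(acc)   # position of category 4 (deer), about 59.6%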

In this section, we experimented with three situations:

  • The use of the adam optimizer improved the results slightly and provided a test data accuracy of about 77.2%.
  • In the second experiment, hyperparameter tuning yielded the best results with 512 dense units, a dropout rate of 0.1, and a batch size of 30. This combination of parameters helped us obtain a test data accuracy of about 79.8%.
  • The third experiment, where we used the VGG16 pretrained network, also provided decent results; however, at about 75.7%, its test data accuracy was slightly lower than in the other two experiments.

Another approach when working with smaller datasets is to use data augmentation. In this approach, the existing images are modified (by flipping, rotating, shifting, and so on) to create new samples. Since images in image datasets aren't always centered, such artificially created samples help the network learn useful features that, in turn, improve image classification performance.
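
In the keras package for R, this can be done with image_data_generator(); the following is a minimal sketch, where the augmentation settings and the batch-size and epoch values are illustrative choices rather than values from this experiment:

# Generator that produces randomly modified copies of the training images
datagen <- image_data_generator(rotation_range = 20,       # random rotations
                                width_shift_range = 0.1,   # horizontal shifts
                                height_shift_range = 0.1,  # vertical shifts
                                horizontal_flip = TRUE)    # random flips

# Stream augmented batches from the in-memory training data
train_flow <- flow_images_from_data(trainx, trainy,
                                    generator = datagen,
                                    batch_size = 10)

# Fit the model on the augmented batches instead of the raw arrays
model %>% fit_generator(train_flow,
                        steps_per_epoch = nrow(trainx) / 10,
                        epochs = 10)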
