Changes to the architecture

We modify the CNN architecture by adding more convolutional layers, illustrating how such layers can be stacked. Take a look at the following code:

# Model architecture
model <- keras_model_sequential()
model %>%
  layer_conv_2d(filters = 32, kernel_size = c(3,3),
                activation = 'relu', input_shape = c(28,28,1)) %>%
  layer_conv_2d(filters = 32, kernel_size = c(3,3),
                activation = 'relu') %>%
  layer_max_pooling_2d(pool_size = c(2,2)) %>%
  layer_dropout(rate = 0.25) %>%
  layer_conv_2d(filters = 64, kernel_size = c(3,3),
                activation = 'relu') %>%
  layer_conv_2d(filters = 64, kernel_size = c(3,3),
                activation = 'relu') %>%
  layer_max_pooling_2d(pool_size = c(2,2)) %>%
  layer_dropout(rate = 0.25) %>%
  layer_flatten() %>%
  layer_dense(units = 512, activation = 'relu') %>%
  layer_dropout(rate = 0.5) %>%
  layer_dense(units = 10, activation = 'softmax')

# Compile model
model %>% compile(loss = 'categorical_crossentropy',
                  optimizer = optimizer_adadelta(),
                  metrics = 'accuracy')

# Fit model
model_two <- model %>% fit(trainx,
                           trainy,
                           epochs = 15,
                           batch_size = 128,
                           validation_split = 0.2)
plot(model_two)

In the preceding code, the first two convolutional layers use 32 filters each, and the next pair of convolutional layers uses 64 filters each. After each pair of convolutional layers, as before, we add pooling and dropout layers. Another change is the use of 512 units in the dense layer. The other settings match the earlier network.
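To see how these stacked layers shrink the feature maps, it helps to trace the output shapes before flattening. A quick check (not part of the original workflow) is to print the model summary:

```r
# Each 3x3 convolution (default 'valid' padding) shrinks each spatial
# dimension by 2, and each 2x2 max-pooling halves it:
# 28 -> 26 -> 24 -> pool -> 12 -> 10 -> 8 -> pool -> 4
# so layer_flatten() produces 4 * 4 * 64 = 1024 units feeding the dense layer.
summary(model)
```

The summary also reports the parameter count per layer, which is useful when judging whether the 512-unit dense layer dominates the model size.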

The following screenshot shows accuracy and loss for training and validation data (model_two):

The plot based on model_two shows closer agreement between training and validation performance, for both loss and accuracy, than model_one. In addition, the flattening of the curves toward the fifteenth epoch suggests that increasing the number of epochs is unlikely to improve the classification performance much further.
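If you would rather not pick the number of epochs by eye, a callback can stop training automatically once the validation loss stops improving. The following is a sketch of one way to do this; it is a variant, not the fitting call used above:

```r
# Hypothetical variant: let training stop early when validation loss plateaus
model_two <- model %>% fit(trainx,
                           trainy,
                           epochs = 50,       # upper bound; training may stop sooner
                           batch_size = 128,
                           validation_split = 0.2,
                           callbacks = list(
                             callback_early_stopping(monitor = 'val_loss',
                                                     patience = 3)))
```

Here `patience = 3` means training stops after three consecutive epochs without improvement in validation loss.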

Loss and accuracy values for the training data are obtained as follows:

# Loss and accuracy
model %>% evaluate(trainx, trainy)

$loss
[1] 0.1587473

$acc
[1] 0.94285

The loss and accuracy values based on this model do not show a major improvement over the earlier model: the loss value is slightly higher and the accuracy value slightly lower.

The following confusion matrix summarizes the predicted and actual classes:

# Confusion matrix for training data
pred <- model %>% predict_classes(trainx)
table(Predicted=pred, Actual=mnist$train$y)

OUTPUT
          Actual
Predicted    0    1    2    3    4    5    6    7    8    9
        0 5499    0   58   63    3    0  456    0    4    0
        1    2 5936    1    5    3    0    4    0    1    0
        2   83    0 5669   13  258    0  438    0    7    0
        3   69   52   48 5798  197    0  103    0    6    0
        4    3    3  136   49 5348    0  265    0    5    0
        5    0    0    0    0    0 5879    0    3    0    4
        6  309    6   73   67  181    0 4700    0    2    0
        7    0    0    0    0    0   75    0 5943    1  169
        8   35    3   15    5   10    3   34    0 5974    2
        9    0    0    0    0    0   43    0   54    0 5825

From the confusion matrix, we can make the following observations:

  • It shows that the model's greatest confusion is between item 6 (shirt) and item 0 (t-shirt/top), and that it runs in both directions: 456 shirts are misclassified as t-shirts/tops, and 309 t-shirts/tops are misclassified as shirts.
  • Item 8 (bag) has been classified most accurately, with 5,974 instances out of a total of 6,000 (about 99.6% accuracy).
  • Item 6 (shirt) has been classified with the lowest accuracy of the 10 categories, with 4,700 instances out of 6,000 (about 78.3% accuracy).
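The per-class accuracies quoted above can be read directly from the confusion matrix: the diagonal holds the correctly classified counts, and each column sums to the total number of actual instances of that class. A small sketch, reusing the predictions computed above:

```r
# Per-class accuracy from the confusion matrix:
# diagonal = correct counts, column sums = actual totals per class
tab <- table(Predicted = pred, Actual = mnist$train$y)
round(diag(tab) / colSums(tab), 3)
```

For example, item 8 gives 5974 / 6000, matching the roughly 99.6% accuracy noted above.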

The loss, accuracy, and confusion matrix for the test data are obtained as follows:

# Loss and accuracy for the test data
model %>% evaluate(testx, testy)

$loss
[1] 0.2233179

$acc
[1] 0.9211

# Confusion matrix for test data
pred <- model %>% predict_classes(testx)
table(Predicted=pred, Actual=mnist$test$y)

OUTPUT
          Actual
Predicted   0   1   2   3   4   5   6   7   8   9
        0 875   1  18   8   0   0 104   0   3   0
        1   0 979   0   2   0   0   0   0   0   0
        2  19   0 926   9  50   0  78   0   1   0
        3  10  14   9 936  35   0  19   0   3   0
        4   2   0  30  12 869   0  66   0   0   0
        5   0   0   0   0   0 971   0   2   1   2
        6  78   3  16  29  45   0 720   0   1   0
        7   0   0   0   0   0  18   0 988   1  39
        8  16   3   1   4   1   0  13   0 989   1
        9   0   0   0   0   0  11   0  10   1 958

From the preceding output, we observe that the test loss is lower than that obtained with the earlier model, while the accuracy is slightly lower than the earlier performance. From the confusion matrix, we can make the following observations:

  • It shows that the model's greatest confusion is again between item 6 (shirt) and item 0 (t-shirt/top), with 104 shirts misclassified as t-shirts/tops (and 78 in the reverse direction).
  • Item 8 (bag) has been classified most accurately, with 989 instances out of a total of 1,000 (about 98.9% accuracy).
  • Item 6 (shirt) has been classified with the lowest accuracy of the 10 categories, with 720 instances out of 1,000 (about 72.0% accuracy).

Thus, overall, we observe a similar performance to the one that we observed with the training data.

For the 20 images of fashion items downloaded from the internet, the following screenshot summarizes the performance of the model:

As seen from the preceding plot, this time 17 out of 20 images are correctly classified. Although this is a slightly better performance, it is still a little below the roughly 92% accuracy obtained on the test data. In addition, note that with such a small sample, the accuracy figure can fluctuate significantly.

In this section, we made modifications to the 20 new images and to the CNN model architecture in order to obtain better classification performance.
