Experimenting with an additional hidden layer

In this experiment, we will add an additional hidden layer to the previous model. The code and the resulting model summary are given as follows:

# Model architecture
model <- keras_model_sequential()
model %>%
  layer_dense(units = 8, activation = 'relu', input_shape = c(21)) %>%
  layer_dense(units = 5, activation = 'relu') %>%
  layer_dense(units = 3, activation = 'softmax')

summary(model)

OUTPUT
___________________________________________________________________________
Layer (type)                   Output Shape                Param #
===========================================================================
dense_1 (Dense)                (None, 8)                   176
___________________________________________________________________________
dense_2 (Dense)                (None, 5)                   45
___________________________________________________________________________
dense_3 (Dense)                (None, 3)                   18
===========================================================================
Total params: 239
Trainable params: 239
Non-trainable params: 0
___________________________________________________________________________

As shown in the preceding code and output, we have added a second hidden layer with 5 units, again using relu as the activation function. As a result of this change, the total number of parameters has increased from 203 in the previous model to 239 in this model.
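The parameter counts in the summary follow from the dense-layer formula (inputs + 1) × units, where the +1 accounts for the bias term. A quick sketch verifying the numbers in base R:

```r
# Each dense layer has (inputs + 1) * units parameters; the +1 is the bias
params <- function(inputs, units) (inputs + 1) * units

layer1 <- params(21, 8)   # 21 input features -> 8 units: 176
layer2 <- params(8, 5)    # 8 units -> 5 units: 45
layer3 <- params(5, 3)    # 5 units -> 3 output classes: 18
total  <- layer1 + layer2 + layer3
total                     # 239, matching the summary output
```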

Next, we compile and then fit the model using the following code:

# Compile and fit model
model %>%
  compile(loss = 'categorical_crossentropy',
          optimizer = 'adam',
          metrics = 'accuracy')
model_two <- model %>%
  fit(training,
      trainLabels,
      epochs = 200,
      batch_size = 32,
      validation_split = 0.2)
plot(model_two)

As shown in the preceding code, we have compiled the model with the same settings that we used earlier. We have also kept the settings for the fit function the same as before. The model-training history is stored in model_two. The following diagram provides the plot of accuracy and loss for model_two:

Accuracy and loss for training and validation data

From the preceding diagram, we can make the following observations:

  • The accuracy values based on training and validation data remain relatively constant for the first few epochs.
  • After about 20 epochs, the accuracy for the training data starts to increase and then continues to increase for the remaining epochs. The rate of increase, however, slows down after about 100 epochs.
  • On the other hand, the accuracy based on validation data drops for approximately 50 epochs, then starts to increase, and then becomes more or less constant after about 125 epochs.
  • Similarly, loss values for the training data initially drop significantly, but after about 50 epochs, the rate of decrease slows down.
  • The loss values for the validation data drop during the initial few epochs, then increase, and stabilize after about 25 epochs.

Using class predictions based on the test data, we can also obtain a confusion matrix to assess the performance of this classification model. The following code is used to obtain a confusion matrix:

# Prediction and confusion matrix
# Note: predict_classes() works with older versions of the keras R package;
# in newer versions it has been removed, and class predictions can instead be
# obtained with: pred <- model %>% predict(test) %>% apply(1, which.max) - 1
pred <- model %>%
  predict_classes(test)
table(Predicted = pred, Actual = testtarget)

OUTPUT
##          Actual
## Predicted   0   1   2
##         0 429  38   4
##         1  29  54  33
##         2   2   2  12

From the preceding confusion matrix, we can make the following observations:

  • Comparing the correct classifications for classes 0, 1, and 2 with the previous model, we notice an improvement only for class 1, whereas the correct classifications for classes 0 and 2 have, in fact, decreased.
  • The overall accuracy for this model is 82.1%, which is below the 84.2% that we obtained earlier. So, our attempt to make the model slightly deeper did not improve accuracy in this case.
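The overall accuracy quoted above can be verified directly from the confusion matrix: the diagonal entries are the correct classifications, and dividing their sum by the total number of test cases gives the accuracy. A short sketch in base R:

```r
# Confusion matrix from the output above (rows = Predicted, columns = Actual)
cm <- matrix(c(429, 38,  4,
                29, 54, 33,
                 2,  2, 12),
             nrow = 3, byrow = TRUE,
             dimnames = list(Predicted = 0:2, Actual = 0:2))

correct  <- sum(diag(cm))   # 429 + 54 + 12 = 495 correct classifications
total    <- sum(cm)         # 603 test cases in all
accuracy <- correct / total
round(100 * accuracy, 1)    # 82.1, as reported above
```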