Experimenting with a higher number of units in the hidden layer

Now, let's fine-tune the first model by changing the number of units in the first and only hidden layer using the following code:

# Model architecture
model <- keras_model_sequential()
model %>%
  layer_dense(units = 30, activation = 'relu', input_shape = c(21)) %>%
  layer_dense(units = 3, activation = 'softmax')

summary(model)
OUTPUT
__________________________________________________________________________
Layer (type)                    Output Shape                  Param #
==========================================================================
dense_1 (Dense)                 (None, 30)                    660
__________________________________________________________________________
dense_2 (Dense)                 (None, 3)                     93
==========================================================================
Total params: 753
Trainable params: 753
Non-trainable params: 0
__________________________________________________________________________
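The parameter counts in the summary can be verified by hand: a dense layer with n inputs and m units has m × n weights plus m biases. The following sketch reproduces the counts with plain arithmetic (no Keras required):

```r
# Parameters in a dense layer: units * inputs (weights) + units (biases)
dense_params <- function(inputs, units) units * inputs + units

hidden <- dense_params(21, 30)   # 30 * 21 + 30 = 660
output <- dense_params(30, 3)    # 3 * 30 + 3  = 93
total  <- hidden + output        # 753, matching the summary
```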

# Compile model
model %>%
  compile(loss = 'categorical_crossentropy',
          optimizer = 'adam',
          metrics = 'accuracy')

# Fit model
model_three <- model %>%
  fit(training,
      trainLabels,
      epochs = 200,
      batch_size = 32,
      validation_split = 0.2)
plot(model_three)

As the preceding code and output show, we have increased the number of units in the first and only hidden layer from 8 to 30, which brings the total number of parameters for this model to 753. We compile and fit the model with the same settings that we used earlier, and store the accuracy and loss values obtained while fitting the model in model_three.

The following plot shows the accuracy and loss for the training and validation data based on the new classification model:

Accuracy and loss for training and validation data

We can make the following observations from the preceding plot:

  • There is no evidence of overfitting.
  • After about 75 epochs, we do not see any major improvement in the model performance.
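Since the model stops improving after roughly 75 epochs, one option (a sketch, not part of the original workflow) is to let Keras halt training automatically with an early-stopping callback; the patience value of 20 used here is an assumption you would tune:

```r
# Sketch: stop fitting once validation loss has not improved for 20 epochs
# (assumes the keras library, model, training, and trainLabels from above)
model_three <- model %>%
  fit(training,
      trainLabels,
      epochs = 200,
      batch_size = 32,
      validation_split = 0.2,
      callbacks = list(
        callback_early_stopping(monitor = "val_loss", patience = 20)))
```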

The prediction of classes using the test data and confusion matrix is obtained using the following code:

# Prediction and confusion matrix
pred <- model %>%
  predict_classes(test)
table(Predicted = pred, Actual = testtarget)

OUTPUT
##          Actual
## Predicted   0   1   2
##         0 424  35   5
##         1  28  55   5
##         2   8   4  39

From the preceding confusion matrix, we can make the following observations:

  • We see improvements in the classification of the suspect (1) and pathological (2) categories compared to the first model.
  • The correct classifications for the 0, 1, and 2 categories are 424, 55, and 39 respectively.
  • The overall accuracy using the test data comes to 85.9%, which is better than the first two models.

We can also obtain percentages that show how often this model correctly classifies each class by dividing the number of correct classifications in each column by the total of that column. We find that this classification model correctly classifies normal, suspect, and pathological cases with percentages of about 92.2%, 58.5%, and 79.6% respectively. So the model performance is at its highest when correctly classifying normal patients; however, the model accuracy drops to just 58.5% when correctly classifying patients in the suspect category. From the confusion matrix, we can see that the highest number of samples associated with misclassification is 35. Thus, there are 35 patients who actually belong to the suspect category, but the classification model incorrectly puts these patients in the normal category.
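The overall and per-class percentages above can be reproduced directly from the confusion matrix in base R, with rows as predicted classes and columns as actual classes:

```r
# Confusion matrix copied from the output above (rows = Predicted, cols = Actual)
cm <- matrix(c(424, 28,  8,
                35, 55,  4,
                 5,  5, 39),
             nrow = 3,
             dimnames = list(Predicted = c("0", "1", "2"),
                             Actual    = c("0", "1", "2")))

# Overall accuracy: correct predictions / all predictions
overall <- sum(diag(cm)) / sum(cm) * 100      # ~85.9

# Per-class accuracy: correct count in each column / column total
per_class <- diag(cm) / colSums(cm) * 100     # ~92.2, 58.5, 79.6
```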
