Confusion matrix

To obtain a confusion matrix, let's start by making a prediction for the test data and saving it in pred. We use predict_classes to make this prediction and then use the table function to summarize predicted versus actual values for the test data as a confusion matrix, as shown in the following code:

# Prediction and confusion matrix
pred <- model %>%
  predict_classes(test)
table(Predicted = pred, Actual = testtarget)

OUTPUT
##          Actual
## Predicted   0   1   2
##         0 435  41  11
##         1  24  51  16
##         2   1   2  22

In the preceding confusion matrix, the values 0, 1, and 2 represent the normal, suspect, and pathological categories, respectively. From the confusion matrix, we can make the following observations:

  • There were 435 patients in the test data who were actually normal and the model also predicted them as being normal.
  • Similarly, there were 51 correct predictions for the suspect group and 22 correct predictions for the pathological group.
  • If we add all the numbers on the diagonal of the confusion matrix, which are the correct classifications, we obtain 508 (435 + 51 + 22), or an accuracy level of 84.2% ((508 ÷ 603) x 100); a short sketch for computing this directly from the confusion matrix appears after this list.
  • In the confusion matrix, the off-diagonal numbers indicate the number of patients who are misclassified. The highest number of misclassifications is 41: these patients actually belong to the suspect group, but the model incorrectly classified them as normal.
  • The smallest misclassification count involved a single patient who actually belonged to the normal category but was incorrectly classified as pathological by the model.
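As a quick check, the overall accuracy can be computed directly from the confusion matrix. The following is a minimal sketch that assumes pred and testtarget are the objects created in the preceding code:

# Accuracy from the confusion matrix
cm <- table(Predicted = pred, Actual = testtarget)
accuracy <- sum(diag(cm)) / sum(cm)   # correct predictions / total predictions
round(accuracy * 100, 1)              # approximately 84.2 for the output shown above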

Let's also look at the predictions in terms of probabilities instead of only classes, which is what we did previously. To predict probabilities, we can use the predict_proba function. We can then use the cbind function to look at the first seven rows of the test data for comparison, as shown in the following code:

# Prediction probabilities
prob <- model %>%
  predict_proba(test)
cbind(prob, pred, testtarget)[1:7,]

OUTPUT
                                          pred testtarget
[1,] 0.993281245 0.006415705  0.000302993    0          0
[2,] 0.979825318 0.018759586  0.001415106    0          0
[3,] 0.982333243 0.014519051  0.003147765    0          0
[4,] 0.009040437 0.271216542  0.719743013    2          2
[5,] 0.008850170 0.267527819  0.723622024    2          2
[6,] 0.946622312 0.030137880 0.0232398603    0          1
[7,] 0.986279726 0.012411724 0.0013086179    0          0

In the preceding output, we have the probability values for the three categories based on the model, along with the predicted category represented by pred and the actual category represented by testtarget in the test data. From the preceding output, we can make the following observations:

  • For the first sample, the highest probability of 0.993 is for the normal category of patients, which is why the predicted class is identified as 0. Since this prediction matches the actual value in the test data, we treat it as a correct classification.
  • Similarly, since the fourth sample shows the highest probability of 0.7197 for the third category, the predicted class is labeled as 2, which turns out to be a correct prediction.
  • However, the sixth sample has the highest probability of 0.9466 for the first category, represented by 0, whereas the actual class is 1. In this case, our model misclassifies the sample. A short sketch that verifies this relationship between the probabilities and the predicted classes follows this list.
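The following minimal sketch confirms that each predicted class corresponds to the column with the highest probability; it assumes prob and pred are the objects created in the preceding code:

# Verify that the predicted class is the highest-probability column
max_class <- apply(prob, 1, which.max) - 1   # which.max returns 1 to 3; classes are 0 to 2
all(max_class == pred)                       # expected to return TRUE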

Next, we will explore options for improving the classification performance of the model to obtain better accuracy. Two key strategies that we can follow are to increase the number of hidden layers, thereby building a deeper neural network, and to change the number of units in the hidden layer. We will explore these options in the next section.
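As a rough preview of the first strategy, the following sketch shows the general shape of a deeper network in the keras R interface. The unit counts (50 and 25) and the input shape of 21 features are placeholder assumptions for illustration, not the exact configuration developed in the next section:

# A sketch of a deeper architecture (layer sizes and input shape are assumed)
model_deeper <- keras_model_sequential() %>%
  layer_dense(units = 50, activation = 'relu', input_shape = c(21)) %>%
  layer_dense(units = 25, activation = 'relu') %>%    # additional hidden layer
  layer_dense(units = 3, activation = 'softmax')      # three output categories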
