Accuracy

To understand accuracy, let's consider an example of spam filtering with a classification algorithm. For every example, there are four possible outcomes: a positive example is either correctly classified as positive or mistakenly classified as negative, and a negative example is either correctly classified as negative or mistakenly classified as positive. Counting these outcomes gives the following metrics:

  • True Positives (TP): This is the number of positive examples labeled as positive
  • False Positives (FP): This is the number of negative examples labeled as positive
  • True Negatives (TN): This is the number of negative examples labeled as negative
  • False Negatives (FN): This is the number of positive examples labeled as negative

Let's further understand this with the help of a table, where T denotes the positive class (for example, spam) and N denotes the negative class:

Dataset number   Correct label   Classifier output
1                T               T
2                T               N
3                N               T
4                N               N
5                T               T
6                T               N

From the preceding table, the following can be inferred:

  • True Positive use cases: Dataset numbers 1 and 5 are true positives, therefore TP = 2
  • True Negative use cases: Dataset number 4 is a true negative, therefore TN = 1
  • False Positive use cases: Dataset number 3 is a false positive, therefore FP = 1
  • False Negative use cases: Dataset numbers 2 and 6 are false negatives, therefore FN = 2
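
The counting above can be sketched in a few lines of Python. This is only an illustration: the function name confusion_counts and the list names are assumptions made for this example, and the two lists simply transcribe the table, with "T" as the positive label and "N" as the negative label.

```python
# Labels transcribed from the table: "T" = positive, "N" = negative.
correct_labels = ["T", "T", "N", "N", "T", "T"]
classifier_outputs = ["T", "N", "T", "N", "T", "N"]


def confusion_counts(actual, predicted, positive="T"):
    """Count TP, FP, TN, and FN for a binary classification (hypothetical helper)."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # positive example labeled as positive
        elif p == positive and a != positive:
            fp += 1  # negative example labeled as positive
        elif p != positive and a != positive:
            tn += 1  # negative example labeled as negative
        else:
            fn += 1  # positive example labeled as negative
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}


counts = confusion_counts(correct_labels, classifier_outputs)
print(counts)
```

For the six rows of the table, this should reproduce the counts listed above.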

The accuracy is calculated using the following formula:

accuracy = (TP + TN)/(TP + TN + FP + FN)

Therefore, in our case, accuracy = (2+1)/6 = 0.5 (50%).
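
As a quick check of this arithmetic, the formula can be written as a small Python function. The function name accuracy and its keyword arguments are chosen here for illustration only; the values passed in are the counts from our example.

```python
def accuracy(tp, tn, fp, fn):
    """Return (TP + TN) / (TP + TN + FP + FN)."""
    total = tp + tn + fp + fn
    if total == 0:
        # Accuracy is undefined when there are no examples at all.
        raise ValueError("accuracy is undefined for an empty dataset")
    return (tp + tn) / total


# Counts from the example: TP = 2, TN = 1, FP = 1, FN = 2.
result = accuracy(tp=2, tn=1, fp=1, fn=2)
print(result)  # 0.5
```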

If a classifier, for some reason, produces 0 True Positives and 0 False Positives (that is, it never outputs the positive category), the accuracy can still be high, or can even increase, although such a classifier is useless for finding positive examples. This phenomenon is called the accuracy paradox. When True Positive is less than False Positive, the accuracy increases when the classification always outputs the negative category, because every False Positive becomes a True Negative while only the True Positives are lost. On the other hand, the same thing happens when True Negative is less than False Negative and our classification always outputs the positive category.
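
The effect can be seen on the six examples from the table. In our case TN = 1 is less than FN = 2, so a trivial classifier that always outputs the positive category should score higher than the real one. The following is a minimal sketch; the helper accuracy_from_labels and the variable names are assumptions made for this illustration.

```python
def accuracy_from_labels(actual, predicted):
    """Fraction of examples whose predicted label equals the correct label."""
    matches = sum(a == p for a, p in zip(actual, predicted))
    return matches / len(actual)


# Labels transcribed from the table: "T" = positive, "N" = negative.
actual = ["T", "T", "N", "N", "T", "T"]
classifier = ["T", "N", "T", "N", "T", "N"]

# Two trivial classifiers that ignore their input entirely.
always_positive = ["T"] * len(actual)
always_negative = ["N"] * len(actual)

acc_classifier = accuracy_from_labels(actual, classifier)
acc_always_positive = accuracy_from_labels(actual, always_positive)
acc_always_negative = accuracy_from_labels(actual, always_negative)

print(acc_classifier, acc_always_positive, acc_always_negative)
```

Here the always-positive classifier gets 4 of 6 examples right, compared with 3 of 6 for the original classifier, even though it has learned nothing. The always-negative classifier gets only 2 of 6 right, which is consistent with TP = 2 being greater than FP = 1 in this example.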

In order to overcome this effect, we will define precision and recall.
