Metrics for error identification

Error identification is an important factor in assessing the performance of an NLP system. In a search task, every document falls into one of the following four categories:

  • True Positive (TP): The set of relevant documents that are correctly identified as relevant.
  • True Negative (TN): The set of irrelevant documents that are correctly identified as irrelevant.
  • False Positive (FP): Also referred to as a Type I error, this is the set of irrelevant documents that are incorrectly identified as relevant.
  • False Negative (FN): Also referred to as a Type II error, this is the set of relevant documents that are incorrectly identified as irrelevant.
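
As an illustration, the four categories can be counted with plain set operations. The following is a minimal sketch, not a standard API: the function name confusion_counts and its parameters (all_docs, relevant, retrieved) are hypothetical, and it assumes each document is represented by a hashable ID.

```python
def confusion_counts(all_docs, relevant, retrieved):
    """Count TP, TN, FP, FN for one search result.

    all_docs  -- IDs of every document in the collection (assumed hashable)
    relevant  -- IDs of the documents that are truly relevant
    retrieved -- IDs of the documents the system identified as relevant
    """
    all_docs, relevant, retrieved = set(all_docs), set(relevant), set(retrieved)

    tp = len(relevant & retrieved)             # relevant and identified as relevant
    fp = len(retrieved - relevant)             # irrelevant but identified as relevant (Type I)
    fn = len(relevant - retrieved)             # relevant but identified as irrelevant (Type II)
    tn = len(all_docs - relevant - retrieved)  # irrelevant and identified as irrelevant
    return tp, tn, fp, fn


if __name__ == "__main__":
    docs = range(1, 11)        # a toy collection of 10 documents
    truly_relevant = {1, 2, 3, 4}
    returned = {3, 4, 5}
    print(confusion_counts(docs, truly_relevant, returned))
```

In this toy collection, documents 3 and 4 are true positives, document 5 is a false positive, documents 1 and 2 are false negatives, and the remaining five documents are true negatives.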

On the basis of these terms, we have the following metrics:

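  • Precision (P) = TP/(TP+FP): the fraction of documents identified as relevant that are actually relevant
  • Recall (R) = TP/(TP+FN): the fraction of relevant documents that are identified as relevant
  • F-Measure = 2*P*R/(P+R): the harmonic mean of precision and recall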
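
The three formulas translate directly into code. The following is a minimal sketch: the function name precision_recall_f is hypothetical, and returning 0.0 when a denominator is zero is an assumption made here for convenience, not something the formulas themselves prescribe.

```python
def precision_recall_f(tp, fp, fn):
    """Compute precision, recall and F-measure from raw counts.

    Assumption: a metric whose denominator is zero is reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0   # P = TP/(TP+FP)
    recall = tp / (tp + fn) if (tp + fn) else 0.0      # R = TP/(TP+FN)

    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0  # 2*P*R/(P+R)
    return precision, recall, f_measure


if __name__ == "__main__":
    # Example counts: 2 true positives, 1 false positive, 2 false negatives.
    p, r, f = precision_recall_f(tp=2, fp=1, fn=2)
    print(f"P={p:.3f} R={r:.3f} F={f:.3f}")
```

With these example counts, P = 2/3 and R = 1/2, so the F-measure is 2*(2/3)*(1/2)/((2/3)+(1/2)) = 4/7. Note that TN does not appear in any of the three formulas.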