IR is also one of the applications of Natural Language Processing.
The following aspects can be considered when evaluating an IR system:
Evaluation is usually done by comparing one system with another.
IR systems can be compared on the basis of the document collection, the set of queries, the techniques used, and so on. The metrics commonly used for performance evaluation are Precision, Recall, and F-Measure. Let's learn a bit more about them:
Precision = |relevant ∩ retrieved| ÷ |retrieved| = P( relevant | retrieved )
Recall = |relevant ∩ retrieved| ÷ |relevant| = P( retrieved | relevant )
F-Measure = (2 × Precision × Recall) ÷ (Precision + Recall), i.e. the harmonic mean of Precision and Recall
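The three formulas above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `precision_recall_f`, the document IDs, and the convention of returning 0.0 when a denominator is zero are all assumptions made for the example.

```python
def precision_recall_f(relevant, retrieved):
    """Compute Precision, Recall, and F-Measure for one query.

    `relevant` and `retrieved` are collections of document IDs
    (hypothetical IDs, chosen only for illustration).
    """
    relevant, retrieved = set(relevant), set(retrieved)
    hits = len(relevant & retrieved)  # |relevant ∩ retrieved|

    # Assumption: an empty denominator yields 0.0 instead of raising an error.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0

    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# Example: 4 relevant documents, 3 retrieved, 2 of which are relevant.
relevant = {"d1", "d2", "d3", "d4"}
retrieved = {"d1", "d2", "d5"}

p, r, f = precision_recall_f(relevant, retrieved)
print(f"precision={p:.3f} recall={r:.3f} F={f:.3f}")
# prints: precision=0.667 recall=0.500 F=0.571
```

Here two of the three retrieved documents are relevant (Precision = 2/3), and two of the four relevant documents were retrieved (Recall = 2/4), which gives an F-Measure of 4/7.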