In practice, your model will often return multiple detection windows for the same object. To handle this, we use an algorithm called non-maximum suppression (NMS). It filters these overlapping boxes using two heuristics: the probability that a box contains an object (pc) and the IoU between boxes. Here's how it works:
- Discard all boxes with a low probability of containing an object (pc < 0.6)
- Among the remaining boxes, select the one with the highest probability of containing an object (pc in our label)
- Discard all boxes with a high overlap with the selected box (IoU > 0.5)
- Repeat the previous two steps (select, then discard overlaps) until every detection has been either discarded or selected
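The steps above can be sketched in plain Python. This is a minimal illustration, not the detector's actual implementation: the function names `iou` and `non_max_suppression`, the `(x1, y1, x2, y2)` corner format, and the sample boxes and scores are all assumptions made for the example. The thresholds 0.6 and 0.5 are the ones from the list.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Width and height are clamped at zero when the boxes do not overlap
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, score_threshold=0.6, iou_threshold=0.5):
    """Return the indices of the kept boxes, highest score first."""
    # Discard boxes with a low probability of containing an object
    candidates = [i for i, s in enumerate(scores) if s >= score_threshold]
    # Sort so that the most confident remaining box always comes first
    candidates.sort(key=lambda i: scores[i], reverse=True)

    kept = []
    while candidates:
        # Select the box with the highest score
        best = candidates.pop(0)
        kept.append(best)
        # Discard every remaining box that overlaps it too much
        candidates = [
            i for i in candidates
            if iou(boxes[best], boxes[i]) <= iou_threshold
        ]
    return kept


# Hypothetical detections: boxes 0, 1 and 3 cover the same object
boxes = [
    (10, 10, 50, 50),
    (12, 12, 52, 52),
    (100, 100, 140, 140),
    (11, 9, 49, 51),
]
scores = [0.9, 0.8, 0.7, 0.3]

kept = non_max_suppression(boxes, scores)
print(kept)  # [0, 2]
```

Box 3 is removed by the score threshold, box 1 is removed because it overlaps box 0 with an IoU of about 0.82, and box 2 survives because it does not overlap anything.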
We will apply non-maximum suppression at prediction time in our detector.
TensorFlow already has a function that implements the non-maximum suppression algorithm, called tf.image.non_max_suppression.
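As a rough sketch of how that call might look, the snippet below runs `tf.image.non_max_suppression` on a few made-up boxes. It assumes TensorFlow 2.x with eager execution; the boxes, scores, and `max_output_size=10` are illustrative values, not taken from our detector. Note that the function expects boxes as `[y1, x1, y2, x2]` and returns the indices of the selected boxes rather than the boxes themselves.

```python
import tensorflow as tf

# Hypothetical detections in [y1, x1, y2, x2] format
boxes = tf.constant([
    [10.0, 10.0, 50.0, 50.0],
    [12.0, 12.0, 52.0, 52.0],
    [100.0, 100.0, 140.0, 140.0],
    [11.0, 9.0, 49.0, 51.0],
])
# Probability of containing an object (pc) for each box
scores = tf.constant([0.9, 0.8, 0.7, 0.3])

# Indices of the boxes that survive suppression, highest score first
selected = tf.image.non_max_suppression(
    boxes,
    scores,
    max_output_size=10,   # upper bound on the number of boxes to keep
    iou_threshold=0.5,    # discard boxes overlapping a selected box above this
    score_threshold=0.6,  # discard low-confidence boxes
)

# Gather the surviving boxes from the original tensor
selected_boxes = tf.gather(boxes, selected)
print(selected.numpy())
```

The two thresholds match the values used in the steps described earlier, so the built-in function can stand in for a hand-written loop.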