Now that our algorithm works for single frames, how can we make sure that the image found in one frame will also be found in the very next frame?
In FeatureMatching.__init__
, we created some bookkeeping variables that we said we would use for feature tracking. The main idea is to enforce some coherence while going from one frame to the next. Since we are capturing roughly 10 frames per second, it is reasonable to assume that the changes from one frame to the next will not be too radical. Therefore, we can require that the result we get in any given frame be similar to the result we got in the previous frame. Otherwise, we discard the result and move on to the next frame.
However, we have to be careful not to get stuck with a result that we think is reasonable but is actually an outlier. To solve this problem, we keep track of the number of frames we have spent without finding a suitable result. We use self.num_frames_no_success
; if this number is smaller than a certain threshold, say self.max_frames_no_success
, we do the comparison between the frames. If it is greater than the threshold, we assume that too much time has passed since the last result was obtained, in which case it would be unreasonable to compare the results between the frames.
We can extend the idea of outlier rejection to every step in the computation. The goal then becomes minimizing the workload while maximizing the likelihood that the result we obtain is a good one.
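This bookkeeping can be sketched in isolation. The class below is a minimal illustration of the counter logic, not the chapter's actual FeatureMatching class; the default threshold value is an assumption chosen for the example:

```python
class OutlierTracker:
    """Illustrative sketch of the failure-counter bookkeeping."""

    def __init__(self, max_frames_no_success=5):
        # assumed default; the real threshold lives in FeatureMatching
        self.max_frames_no_success = max_frames_no_success
        self.num_frames_no_success = 0

    def can_compare_to_last_result(self):
        # only trust the previous result if it is recent enough
        return self.num_frames_no_success < self.max_frames_no_success

    def report_failure(self):
        # one more frame has passed without a suitable result
        self.num_frames_no_success += 1

    def report_success(self):
        # a good result resets the counter
        self.num_frames_no_success = 0
```

Once too many frames have failed in a row, `can_compare_to_last_result` returns False, and the frame-to-frame comparison is skipped.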
The resulting procedure for early outlier detection and rejection is embedded in FeatureMatching.match
and looks as follows:
def match(self, frame):
    # create a working copy (grayscale) of the frame
    # and store its shape for convenience
    img_query = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    sh_query = img_query.shape[:2]  # rows, cols
    key_query, desc_query = self._extract_features(img_query)
    good_matches = self._match_features(desc_query)
In order for RANSAC to work in the very next step, we need at least four matches. If fewer matches are found, we admit defeat and return False
right away:
    if len(good_matches) < 4:
        self.num_frames_no_success += 1
        return False, frame
Otherwise, we proceed by recovering the corner points of the pattern in the current frame (dst_corners):

    dst_corners = self._detect_corner_points(key_query, good_matches)
If any of these points lies significantly outside the image (by 20 pixels in our case), it means that either we are not looking at our object of interest, or the object of interest is not entirely in the image. In both cases, we have no interest in proceeding, and we return False
:
    if any(x[0] < -20 or x[1] < -20 or
           x[0] > sh_query[1] + 20 or
           x[1] > sh_query[0] + 20
           for x in dst_corners):
        self.num_frames_no_success += 1
        return False, frame
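The bounds test can also be factored into a small, testable helper. The function below is a hypothetical refactoring, not part of the chapter's code; it accepts any margin, with 20 pixels as the value used above:

```python
def corners_in_frame(corners, shape, margin=20):
    """Return True if all (x, y) corner points lie within the image
    bounds, allowing a tolerance of `margin` pixels on every side.

    `shape` is (rows, cols), as returned by img.shape[:2].
    """
    rows, cols = shape
    return all(-margin <= x <= cols + margin and
               -margin <= y <= rows + margin
               for x, y in corners)
```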
Next, we calculate the area of the quadrilateral outlined by the four corner points, using the shoelace formula:

    area = 0
    for i in range(0, 4):
        next_i = (i + 1) % 4
        area += (dst_corners[i][0] * dst_corners[next_i][1] -
                 dst_corners[i][1] * dst_corners[next_i][0]) / 2.
If the area is either unreasonably small or unreasonably large, we discard the frame and return False
:
    if area < np.prod(sh_query) / 16. or area > np.prod(sh_query) / 2.:
        self.num_frames_no_success += 1
        return False, frame
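The shoelace computation can be verified on its own. The function below is a hypothetical standalone version of the loop above, generalized to any polygon given as ordered (x, y) points; a counter-clockwise ordering yields a positive area:

```python
def quad_area(corners):
    """Signed polygon area via the shoelace formula.

    `corners` is a sequence of (x, y) points in order around
    the polygon; counter-clockwise order gives a positive result.
    """
    area = 0.0
    n = len(corners)
    for i in range(n):
        x0, y0 = corners[i]
        x1, y1 = corners[(i + 1) % n]  # wrap around to the first point
        area += (x0 * y1 - y0 * x1) / 2.0
    return area
```

For instance, the unit square traversed counter-clockwise has area 1.0, which makes it easy to sanity-check the sign convention.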
If the recovered homography matrix (Hinv) deviates too much from the last homography matrix we stored (self.last_hinv), it means that we are probably looking at a different object, in which case we discard the frame and return False. We compare the current homography matrix to the last one by calculating the distance between the two matrices:

    np.linalg.norm(Hinv - self.last_hinv)
However, we only want to consider self.last_hinv if it is fairly recent, that is, obtained within the last self.max_frames_no_success frames. This is why we keep track of self.num_frames_no_success:
    recent = self.num_frames_no_success < self.max_frames_no_success
    similar = np.linalg.norm(Hinv - self.last_hinv) < self.max_error_hinv
    if recent and not similar:
        self.num_frames_no_success += 1
        return False, frame
This will help us keep track of one and the same object of interest over time. If, for any reason, we lose track of the pattern image for more than self.max_frames_no_success
frames, we skip this condition and accept whatever homography matrix was recovered up to that point. This makes sure that we do not get stuck with some self.last_hinv
matrix that is actually an outlier.
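The matrix-distance test itself is just a Frobenius norm of the element-wise difference between two 3x3 homographies. A quick sketch, where the threshold value is made up purely for illustration:

```python
import numpy as np

max_error_hinv = 0.1   # assumed threshold, not the book's actual value
H_last = np.eye(3)     # previous homography (identity, for illustration)
H_new = np.eye(3)
H_new[0, 2] = 0.05     # a small translation along x

# np.linalg.norm on a matrix defaults to the Frobenius norm,
# i.e. the square root of the sum of squared element differences
dist = np.linalg.norm(H_new - H_last)
similar = dist < max_error_hinv
```

Here only one matrix entry changed by 0.05, so the distance is 0.05 and the two homographies count as similar under the assumed threshold.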
Otherwise, we can be fairly certain that we have successfully located the object of interest in the current frame. In such a case, we store the homography matrix and reset the counter:
    self.num_frames_no_success = 0
    self.last_hinv = Hinv
All that is left to do is warp the image and (for the first time) return True
along with the warped image so that the image can be plotted:
    img_out = cv2.warpPerspective(img_query, Hinv, dst_size)
    img_out = cv2.cvtColor(img_out, cv2.COLOR_GRAY2RGB)
    return True, img_out