Before discussing epipolar geometry, let's look at what happens when we capture two images of the same scene from two different viewpoints. Consider the following figure:
Let's see how it happens in real life. Consider the following image:
Now, let's capture the same scene from a different viewpoint:
Our goal is to match the keypoints between these two images in order to extract the scene information. We do this by estimating a matrix that relates corresponding points in the two stereo images. This is called the fundamental matrix.
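Concretely, the fundamental matrix F is a 3x3 matrix such that every pair of corresponding points satisfies the epipolar constraint x_right^T F x_left = 0 in homogeneous pixel coordinates. The following sketch checks this numerically; the intrinsic matrix, rotation, and translation used here are made up purely for illustration:

```python
import numpy as np

# Made-up two-camera setup: intrinsics K, plus the pose (R, t) of the
# right camera relative to the left one
rng = np.random.default_rng(0)
K = np.array([[800., 0., 320.],
              [0., 800., 240.],
              [0., 0., 1.]])
theta = 0.2
R = np.array([[np.cos(theta), 0., np.sin(theta)],
              [0., 1., 0.],
              [-np.sin(theta), 0., np.cos(theta)]])
t = np.array([1., 0.2, 0.1])

# Fundamental matrix from the known calibration: F = K^-T [t]_x R K^-1
t_x = np.array([[0., -t[2], t[1]],
                [t[2], 0., -t[0]],
                [-t[1], t[0], 0.]])
F = np.linalg.inv(K).T @ t_x @ R @ np.linalg.inv(K)

# Project random 3D points into both views (homogeneous pixel coordinates)
pts3d = rng.uniform([-1., -1., 4.], [1., 1., 8.], size=(10, 3))
x_left = (K @ pts3d.T).T                      # left camera at the origin
x_right = (K @ (R @ pts3d.T + t[:, None])).T  # right camera at (R, t)

# Every correspondence satisfies x_right^T F x_left = 0, up to float error
residuals = [abs(xr @ F @ xl) for xl, xr in zip(x_left, x_right)]
```

In the code that follows, we don't know K, R, or t; instead, `cv2.findFundamentalMat()` estimates F directly from the matched keypoints.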
As we saw in the camera figure earlier, we can draw lines to see where they meet. These lines are called epipolar lines. The point at which the epipolar lines converge is called the epipole. If you match the keypoints using SIFT and draw the lines towards the meeting point, the left image will look like this:
Following are the matching feature points in the right image:
These are the epipolar lines. If you take the second image as the reference, they will appear as shown in the next image:
Following are the matching feature points in the first image:
It's important to understand epipolar geometry and how we draw these lines. If the two frames are positioned in 3D, each epipolar line must pass through the corresponding feature point in its frame, and each pair of corresponding epipolar lines lies in a plane that contains both camera origins. This can be used to estimate the pose of the cameras with respect to the 3D environment. We will use this information later on to extract 3D information from the scene. Let's take a look at the code:
import argparse

import cv2
import numpy as np

def build_arg_parser():
    parser = argparse.ArgumentParser(description='Find the fundamental matrix \
using the two input stereo images and draw epipolar lines')
    parser.add_argument("--img-left", dest="img_left", required=True,
            help="Image captured from the left view")
    parser.add_argument("--img-right", dest="img_right", required=True,
            help="Image captured from the right view")
    parser.add_argument("--feature-type", dest="feature_type", required=True,
            help="Feature extractor that will be used; can be either 'sift' or 'surf'")
    return parser

def draw_lines(img_left, img_right, lines, pts_left, pts_right):
    h, w = img_left.shape
    img_left = cv2.cvtColor(img_left, cv2.COLOR_GRAY2BGR)
    img_right = cv2.cvtColor(img_right, cv2.COLOR_GRAY2BGR)

    for line, pt_left, pt_right in zip(lines, pts_left, pts_right):
        # Intersect the epipolar line ax + by + c = 0 with the image
        # borders x=0 and x=w
        x_start, y_start = map(int, [0, -line[2]/line[1]])
        x_end, y_end = map(int, [w, -(line[2] + line[0]*w)/line[1]])
        color = tuple(np.random.randint(0, 255, 3).tolist())  # random BGR color
        cv2.line(img_left, (x_start, y_start), (x_end, y_end), color, 1)
        cv2.circle(img_left, tuple(map(int, pt_left)), 5, color, -1)
        cv2.circle(img_right, tuple(map(int, pt_right)), 5, color, -1)

    return img_left, img_right

def get_descriptors(gray_image, feature_type):
    if feature_type == 'surf':
        # SURF is patented; it requires an opencv-contrib build with the
        # nonfree modules enabled
        feature_extractor = cv2.xfeatures2d.SURF_create()
    elif feature_type == 'sift':
        feature_extractor = cv2.SIFT_create()
    else:
        raise TypeError("Invalid feature type; should be either 'surf' or 'sift'")

    keypoints, descriptors = feature_extractor.detectAndCompute(gray_image, None)
    return keypoints, descriptors

if __name__=='__main__':
    args = build_arg_parser().parse_args()
    img_left = cv2.imread(args.img_left, 0)    # left image
    img_right = cv2.imread(args.img_right, 0)  # right image
    feature_type = args.feature_type

    if feature_type not in ['sift', 'surf']:
        raise TypeError("Invalid feature type; has to be either 'sift' or 'surf'")

    scaling_factor = 1.0
    img_left = cv2.resize(img_left, None, fx=scaling_factor,
            fy=scaling_factor, interpolation=cv2.INTER_AREA)
    img_right = cv2.resize(img_right, None, fx=scaling_factor,
            fy=scaling_factor, interpolation=cv2.INTER_AREA)

    kps_left, des_left = get_descriptors(img_left, feature_type)
    kps_right, des_right = get_descriptors(img_right, feature_type)

    # FLANN parameters
    FLANN_INDEX_KDTREE = 0
    index_params = dict(algorithm=FLANN_INDEX_KDTREE, trees=5)
    search_params = dict(checks=50)

    # Get the matches based on the descriptors
    flann = cv2.FlannBasedMatcher(index_params, search_params)
    matches = flann.knnMatch(des_left, des_right, k=2)

    pts_left_image = []
    pts_right_image = []

    # Ratio test to retain only the good matches
    for m, n in matches:
        if m.distance < 0.7*n.distance:
            pts_left_image.append(kps_left[m.queryIdx].pt)
            pts_right_image.append(kps_right[m.trainIdx].pt)

    pts_left_image = np.float32(pts_left_image)
    pts_right_image = np.float32(pts_right_image)
    F, mask = cv2.findFundamentalMat(pts_left_image, pts_right_image, cv2.FM_LMEDS)

    # Selecting only the inliers
    pts_left_image = pts_left_image[mask.ravel() == 1]
    pts_right_image = pts_right_image[mask.ravel() == 1]

    # Drawing the lines on the left image and the corresponding feature
    # points on the right image
    lines1 = cv2.computeCorrespondEpilines(pts_right_image.reshape(-1, 1, 2), 2, F)
    lines1 = lines1.reshape(-1, 3)
    img_left_lines, img_right_pts = draw_lines(img_left, img_right,
            lines1, pts_left_image, pts_right_image)

    # Drawing the lines on the right image and the corresponding feature
    # points on the left image
    lines2 = cv2.computeCorrespondEpilines(pts_left_image.reshape(-1, 1, 2), 1, F)
    lines2 = lines2.reshape(-1, 3)
    img_right_lines, img_left_pts = draw_lines(img_right, img_left,
            lines2, pts_right_image, pts_left_image)

    cv2.imshow('Epi lines on left image', img_left_lines)
    cv2.imshow('Feature points on right image', img_right_pts)
    cv2.imshow('Epi lines on right image', img_right_lines)
    cv2.imshow('Feature points on left image', img_left_pts)
    cv2.waitKey()
    cv2.destroyAllWindows()
Let's see what happens if we use the SURF feature extractor. The lines in the left image will look like this:
Following are the matching feature points in the right image:
If you take the second image as the reference, you will see something like the following image: