Going forward, we will be using a famous open source dataset called fountain-P11. It depicts a Swiss fountain viewed from various angles. An example of this is shown in the following image:
The dataset consists of 11 high-resolution images and can be downloaded from http://cvlabwww.epfl.ch/data/multiview/denseMVS.html. Had we taken the pictures ourselves, we would have had to go through the entire camera calibration procedure to recover the intrinsic camera matrix and the distortion coefficients. Luckily, these parameters are known for the camera that took the fountain dataset, so we can go ahead and hardcode these values in our code.
Our main function will consist of creating and interacting with an instance of the SceneReconstruction3D class. This code can be found in the chapter4.py file, which imports all the necessary modules and instantiates the class:
import numpy as np

from scene3D import SceneReconstruction3D


def main():
    # camera matrix and distortion coefficients
    # can be recovered with calibrate.py
    # but the examples used here are already undistorted, taken
    # with a camera of known K
    K = np.array([[2759.48/4, 0, 1520.69/4],
                  [0, 2764.16/4, 1006.81/4],
                  [0, 0, 1]])
    d = np.array([0.0, 0.0, 0.0, 0.0, 0.0]).reshape(1, 5)
Here, the K matrix is the intrinsic camera matrix for the camera that took the fountain dataset. According to the photographer, these images are already distortion free, so we set all the distortion coefficients (d) to zero.
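The listing divides the focal lengths and the principal point in K by 4 without comment. One plausible reading, and it is only our assumption, is that K was measured at full resolution and is being adapted to images shrunk to a quarter of their original width and height, since these entries scale linearly with image size. A minimal sketch of that adjustment, using a hypothetical helper called scale_intrinsics that is not part of chapter4.py:

```python
import numpy as np


def scale_intrinsics(K, factor):
    """Return a copy of K adapted to an image resized by `factor`.

    Hypothetical helper for illustration. It scales the first two rows
    of K (focal lengths, skew, principal point) and leaves the last row
    untouched. It ignores the half-pixel shift that some resizing
    conventions introduce.
    """
    S = np.diag([factor, factor, 1.0])
    return S @ K


# Full-resolution values, taken from the numerators in the listing above
K_full = np.array([[2759.48, 0, 1520.69],
                   [0, 2764.16, 1006.81],
                   [0, 0, 1]])

K_quarter = scale_intrinsics(K_full, 0.25)
print(K_quarter)
```

If the images were instead used at full resolution, K_full would be the matrix to pass to the class.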
Next, we load a pair of images to which we would like to apply our structure-from-motion techniques. I downloaded the dataset into a subdirectory called fountain_dense:
    # load a pair of images for which to perform SfM
    scene = SceneReconstruction3D(K, d)
    scene.load_image_pair("fountain_dense/0004.png", "fountain_dense/0005.png")
Now we are ready to perform various computations, such as the following:
    scene.plot_optic_flow()
    scene.draw_epipolar_lines()
    scene.plot_rectified_images()

    # draw 3D point cloud of fountain
    # use "pan axes" button in pyplot to inspect the cloud (rotate
    # and zoom to convince yourself of the result)
    scene.plot_point_cloud()
All of the relevant 3D scene reconstruction code for this chapter can be found as part of the SceneReconstruction3D class in the scene3D module. Upon instantiation, the class stores the intrinsic camera parameters to be used in all subsequent calculations:
import cv2
import numpy as np
import sys

from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt


class SceneReconstruction3D:
    def __init__(self, K, dist):
        self.K = K
        self.K_inv = np.linalg.inv(K)
        self.d = dist
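The constructor also precomputes the inverse of K. This section does not say what K_inv is used for. A common use, which we assume here purely for illustration, is to map pixel coordinates to normalized image coordinates, that is, coordinates with the focal length and principal point factored out. The helper pixel_to_normalized below is hypothetical and not part of the scene3D module:

```python
import numpy as np

K = np.array([[2759.48/4, 0, 1520.69/4],
              [0, 2764.16/4, 1006.81/4],
              [0, 0, 1]])
K_inv = np.linalg.inv(K)


def pixel_to_normalized(K_inv, u, v):
    """Map pixel (u, v) to normalized image coordinates (x, y).

    Hypothetical helper: lifts the pixel to homogeneous coordinates,
    applies K_inv, and divides by the last component.
    """
    x, y, w = K_inv @ np.array([u, v, 1.0])
    return x / w, y / w


# The principal point maps to the origin of the normalized image plane
x0, y0 = pixel_to_normalized(K_inv, K[0, 2], K[1, 2])

# A pixel one focal length to the right of the principal point maps to x = 1
x1, y1 = pixel_to_normalized(K_inv, K[0, 2] + K[0, 0], K[1, 2])
```

Computing the inverse once in the constructor avoids repeating the same matrix inversion in every method that needs it.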
Then, the first step is to load a pair of images on which to operate:
    def load_image_pair(self, img_path1, img_path2, downscale=True):
        # cv2.imread expects an IMREAD_* flag (not a matrix type such as
        # CV_8UC3); IMREAD_COLOR requests a three-channel BGR image
        self.img1 = cv2.imread(img_path1, cv2.IMREAD_COLOR)
        self.img2 = cv2.imread(img_path2, cv2.IMREAD_COLOR)

        # make sure images are valid
        if self.img1 is None:
            sys.exit("Image " + img_path1 + " could not be loaded.")
        if self.img2 is None:
            sys.exit("Image " + img_path2 + " could not be loaded.")
If the loaded images are grayscale, the method will convert them to BGR format, because the other methods expect a three-channel image:
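        # check each image separately, since only one of them may be grayscale
        if len(self.img1.shape) == 2:
            self.img1 = cv2.cvtColor(self.img1, cv2.COLOR_GRAY2BGR)
        if len(self.img2.shape) == 2:
            self.img2 = cv2.cvtColor(self.img2, cv2.COLOR_GRAY2BGR)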
In the case of the fountain sequence, all images are of a relatively high resolution. If the optional downscale flag is set, the method will repeatedly halve both images until their width is no more than twice a target width of 600 pixels:
        # scale down image if necessary
        # to something close to 600px wide
        target_width = 600
        if downscale and self.img1.shape[1] > target_width:
            while self.img1.shape[1] > 2*target_width:
                self.img1 = cv2.pyrDown(self.img1)
                self.img2 = cv2.pyrDown(self.img2)
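To see where this loop stops, we can replay it on a bare width value instead of a real image. The sketch below is a hypothetical helper, not part of the class. It assumes an input width of 3072 pixels for the fountain images and relies on the default behavior of cv2.pyrDown, which produces an output of size (width + 1) // 2:

```python
def simulate_downscale(width, target_width=600):
    """Mimic the halving loop above on a plain integer width.

    Hypothetical helper for illustration.
    Returns (final_width, number_of_halvings).
    """
    halvings = 0
    if width > target_width:
        while width > 2 * target_width:
            width = (width + 1) // 2  # default output size of cv2.pyrDown
            halvings += 1
    return width, halvings


# Assumed full-resolution width of a fountain image
final_width, halvings = simulate_downscale(3072)
print(final_width, halvings)
```

Under that assumption the loop halves the image twice and stops at 768 pixels, which is a total reduction by a factor of 4. Note that the loop can leave an image anywhere between 600 and 1200 pixels wide, so "close to 600" is a loose target.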
Also, we need to compensate for the radial and tangential lens distortions using the distortion coefficients specified earlier (if there are any):
        self.img1 = cv2.undistort(self.img1, self.K, self.d)
        self.img2 = cv2.undistort(self.img2, self.K, self.d)
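To make the role of the five coefficients more concrete, here is a minimal sketch of the radial and tangential distortion model in plain Python. It applies the forward model (ideal point to distorted point) to a single normalized image point, whereas cv2.undistort inverts this mapping for a whole image. The function distort_normalized is a hypothetical helper, and we assume the coefficient order (k1, k2, p1, p2, k3) that OpenCV uses for a five-element vector:

```python
def distort_normalized(x, y, d):
    """Apply radial and tangential distortion to a normalized point.

    Hypothetical helper for illustration.
    d = (k1, k2, p1, p2, k3): k* are radial, p* are tangential terms.
    """
    k1, k2, p1, p2, k3 = d
    r2 = x * x + y * y
    radial = 1 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    x_d = x * radial + 2 * p1 * x * y + p2 * (r2 + 2 * x * x)
    y_d = y * radial + p1 * (r2 + 2 * y * y) + 2 * p2 * x * y
    return x_d, y_d


# All-zero coefficients, as used for the fountain images: nothing moves
same = distort_normalized(0.3, -0.2, (0.0, 0.0, 0.0, 0.0, 0.0))

# A negative k1 (illustrative value) pulls the point toward the center
bent = distort_normalized(0.3, -0.2, (-0.1, 0.0, 0.0, 0.0, 0.0))
```

With every coefficient set to zero the model reduces to the identity, which is why passing d as all zeros to cv2.undistort should leave the already corrected fountain images essentially unchanged.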
Finally, we are ready to move on to the meat of the project: estimating the camera motion and reconstructing the scene!