Chapter 10. Stereo Vision and 3D Reconstruction

In this chapter, we will learn about stereo vision and how to reconstruct a 3D map of a scene. We will discuss epipolar geometry, depth maps, and 3D reconstruction, and we will see how to extract 3D information from stereo images and build a point cloud.

By the end of this chapter, you will know:

  • What stereo correspondence is
  • What epipolar geometry is
  • What a depth map is
  • How to extract 3D information
  • How to build and visualize the 3D map of a given scene

What is stereo correspondence?

When we capture an image, we project the 3D world around us onto a 2D image plane, so the photo contains only 2D information. Since all the objects in the scene are projected onto a flat plane, the depth information is lost. From a single image, we have no way of knowing how far an object is from the camera or how the objects are positioned with respect to each other in 3D space. This is where stereo vision comes into the picture.
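To make the loss of depth concrete, here is a minimal sketch of an ideal pinhole projection. The function name project, the focal length of 800 pixels, and the point coordinates are all assumptions chosen for illustration; they do not come from any particular camera. Two points that lie on the same viewing ray but at different depths land on exactly the same pixel, so a single image cannot tell them apart.

```python
import numpy as np

def project(point_3d, focal_length=800.0):
    """Project a 3D point (X, Y, Z), given in camera coordinates, onto the image plane.

    Assumes an ideal pinhole camera with the principal point at the origin
    and a hypothetical focal length expressed in pixels.
    """
    x, y, z = point_3d
    return np.array([focal_length * x / z, focal_length * y / z])

near = np.array([0.5, 0.25, 2.0])  # a point 2 m in front of the camera
far = near * 3.0                   # same viewing ray, but 6 m away

print(project(near))  # [200. 100.]
print(project(far))   # [200. 100.]
```

Both points map to the same image coordinates, which is why a second viewpoint is needed to recover depth.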

Humans are very good at inferring depth information from the real world. The reason is that we have two eyes positioned a couple of inches apart. Each eye acts as a camera, so we capture two images of the same scene from two slightly different viewpoints, one with the left eye and one with the right. Our brain takes these two images and builds a 3D map of the scene. This is what we want to achieve with stereo vision algorithms: we capture two photos of the same scene from different viewpoints, and then match the corresponding points to obtain the depth map of the scene.
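As a rough illustration of what matching corresponding points means, the following sketch searches along a single image row for the shift that best aligns a small patch from the left view with the right view. It assumes the two views are already aligned so that a point appears on the same row in both, and it uses the sum of absolute differences as the matching cost. The function name match_along_row, its parameters, and the synthetic data are hypothetical; this is a toy example, not the matcher a library such as OpenCV uses.

```python
import numpy as np

def match_along_row(left_row, right_row, x, half_window=2, max_shift=10):
    """Return the leftward shift (in pixels) at which a patch of left_row
    centered at x best matches right_row, using the sum of absolute differences."""
    patch = left_row[x - half_window : x + half_window + 1]
    best_shift, best_cost = 0, np.inf
    for shift in range(max_shift + 1):
        start = x - shift - half_window
        if start < 0:
            break  # the candidate patch would fall outside the row
        candidate = right_row[start : start + 2 * half_window + 1]
        cost = np.abs(patch - candidate).sum()
        if cost < best_cost:
            best_shift, best_cost = shift, cost
    return best_shift

# Synthetic row: the right view is the left view shifted 4 pixels to the left.
rng = np.random.default_rng(0)
left_row = rng.random(64)
right_row = np.roll(left_row, -4)

print(match_along_row(left_row, right_row, x=30))  # 4
```

Repeating this search for every pixel gives one shift per pixel, which is the raw material for a depth map.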

Let's consider the following image:

[Image: the scene captured from the first viewpoint]

Now, if we capture the same scene from a different angle, it will look like this:

[Image: the same scene captured from a different angle]

As you can see, the objects shift by a large amount between the two images. If you compare the pixel coordinates, the initial and final positions differ considerably. Consider the following image:

[Image: the first view, with the distance d1 marked]

If we mark the same distance in the second image, it looks like this:

[Image: the second view, with the distance d2 marked]

The difference between d1 and d2 is large. Now, let's bring the box closer to the camera:

[Image: the scene with the box moved closer to the camera]

Now, let's move the camera by the same amount as we did earlier, and capture the same scene from this angle:

[Image: the new scene captured from the second viewpoint]

As you can see, the objects shift much less this time. If you compare the pixel coordinates, you will see that the values are close to each other. The distance in the first image would be:

[Image: the first view of the new scene, with the distance d3 marked]

If we mark the same distance in the second image, it looks like this:

[Image: the second view of the new scene, with the distance d4 marked]

The difference between d3 and d4 is small. In other words, the absolute difference between d1 and d2 is greater than the absolute difference between d3 and d4. Even though the camera moved by the same amount in both cases, the apparent movement between the initial and final positions is very different. This happens because how far an object appears to shift between two viewpoints depends on its distance from the camera; here, bringing the box closer reduced the apparent movement. This is the concept behind stereo correspondence: we capture two images of a scene and use this relationship to extract depth information.
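The comparison of d1, d2, d3, and d4 can be reproduced with a small numerical sketch. It rests on one possible reading of the figures, which is an assumption: each d is the image distance between the box and a reference object that stays at a fixed depth, and the camera slides sideways between the two shots. The pinhole model, the focal length, the baseline, and all positions below are hypothetical values picked for illustration.

```python
def image_x(world_x, depth, camera_x=0.0, focal_length=800.0):
    """Horizontal pixel coordinate of a point seen by a pinhole camera
    located at camera_x and looking straight along the depth axis."""
    return focal_length * (world_x - camera_x) / depth

def change_in_separation(box_depth, ref_depth=2.0, baseline=0.1):
    """How much the image distance between a reference object and the box
    changes when the camera moves sideways by `baseline` meters."""
    ref_x, box_x = -0.2, 0.3  # hypothetical lateral positions in meters
    d_before = abs(image_x(box_x, box_depth) - image_x(ref_x, ref_depth))
    d_after = abs(image_x(box_x, box_depth, baseline)
                  - image_x(ref_x, ref_depth, baseline))
    return abs(d_before - d_after)

far_case = change_in_separation(box_depth=6.0)   # plays the role of |d1 - d2|
near_case = change_in_separation(box_depth=2.5)  # plays the role of |d3 - d4|

print(far_case, near_case)
print(far_case > near_case)  # True
```

Under these assumptions, the change in separation works out to the focal length times the baseline times the difference of the inverse depths, so it shrinks as the box approaches the depth of the reference object. The camera motion is identical in both cases; only the depths differ, and that is the cue a stereo algorithm uses to recover depth.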
