Rolling shutter compensation

At this point, our video is stable. However, when objects in the scene move quickly, rolling shutter effects become more pronounced.

To fix this, we'll need to do a few things. First, incorporate the rolling shutter speed into our calibration code. Second, when warping images, we need to unwarp the rolling shutter as well.

Calibrating the rolling shutter

To start calibrating the rolling shutter duration, we need to tweak the error function to incorporate another term. Let's start by looking at the calcErrorAcrossVideo method. The part we're interested in is:

def calcErrorAcrossVideo(videoObj, theta, timestamps, focal_length, gyro_delay=None, gyro_drift=None, rolling_shutter=None):
    total_error = 0
    ...
        transform = getAccumulatedRotation(...)
        transformed_corners = cv2.perspectiveTransform(old_corners, transform)
    ...

Also, we'll need to add logic to transform a corner based on its location—a corner in the upper part of the image is transformed differently from a corner in the lower half.

So far, we have had a single transformation matrix and that was usually sufficient. However, now, we need to have multiple transformation matrices, one for each row. We could choose to do this for every row of pixels, however that is a bit excessive. We only need transforms for rows that contain a corner we're tracking.
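To make this concrete, here is a minimal sketch of collecting only the rows that need a transform. The corner list is hypothetical, in the (N, 1, 2) layout that cv2.goodFeaturesToTrack produces:

```python
# Hypothetical tracked corners; each entry is [[x, y]].
old_corners = [[[12.5, 40.2]], [[300.0, 40.9]], [[88.1, 411.0]]]

# Collect the set of pixel rows that actually contain a corner, so we
# only build one transform per such row instead of one per pixel row.
rows_needed = sorted({int(pt[0][1]) for pt in old_corners})
print(rows_needed)  # [40, 411]
```

Two transforms suffice here, even though the frame has hundreds of rows.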

We'll start by replacing the two lines mentioned above. We need to loop over each corner individually and warp it. Let's do this with a simple for loop:

for pt in old_corners:
    x = pt[0][0]
    y = pt[0][1]

    pt_timestamp = int(current_timestamp) + rolling_shutter * (y-frame_height/2) / frame_height

Here, we extract the x and y coordinates of the old corner and estimate the timestamp at which this particular pixel was captured. I'm assuming the rolling shutter runs vertically, from the top of the frame to the bottom.

We use the current estimate of the rolling shutter duration to add or subtract time based on the row the corner belongs to. It should be simple to adapt this for a horizontal rolling shutter as well: instead of using y and frame_height, you would use x and frame_width; the calculation stays the same. For now, we'll assume a vertical rolling shutter.
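As a sanity check, the per-corner timestamp estimate can be run on its own. The frame height, current timestamp, and rolling shutter duration below are made-up values, not ones from our calibration:

```python
frame_height = 720
current_timestamp = 1000.0   # hypothetical, in milliseconds
rolling_shutter = 30.0       # hypothetical full-frame scan time, in milliseconds

def corner_timestamp(y, current, shutter, height):
    # Rows above the vertical center were scanned earlier, rows below later;
    # the offset is at most +/- half the rolling shutter duration.
    return int(current) + shutter * (y - height / 2) / height

print(corner_timestamp(0, current_timestamp, rolling_shutter, frame_height))    # 985.0
print(corner_timestamp(360, current_timestamp, rolling_shutter, frame_height))  # 1000.0
print(corner_timestamp(720, current_timestamp, rolling_shutter, frame_height))  # 1015.0
```

A corner at the vertical center gets the frame's own timestamp; corners at the top and bottom edges are offset by half the shutter duration in either direction.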

Now that we have the estimated timestamp of capture, we can get the rotation matrix for that instant (remember, the gyroscope produces data at a much higher rate than the camera's frame rate).

    transform = getAccumulatedRotation(videoObj.frameWidth, videoObj.frameHeight, theta[0], theta[1], theta[2], timestamps, int(previous_timestamp), int(pt_timestamp), focal_length, gyro_delay, gyro_drift, doSub=True)

This line is almost the same as the original we had; the only difference is that we've replaced current_timestamp with pt_timestamp.

Next, we need to transform this point based on the rolling shutter duration.

    output = transform * np.matrix("%f;%f;1.0" % (x, y))
    tx = (output[0][0] / output[2][0]).tolist()[0][0]
    ty = (output[1][0] / output[2][0]).tolist()[0][0]
    transformed_corners.append( np.array([tx, ty]) )

After transforming, we simply append it to the transformed_corners list (just like we did earlier).
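The homogeneous division above is easy to verify in isolation. Here's a sketch using a hypothetical 3x3 transform (a pure translation) in place of the one returned by getAccumulatedRotation:

```python
import numpy as np

# Hypothetical transform: translate by (+5, -3). A real transform from
# getAccumulatedRotation would be a full 3x3 homography.
transform = np.array([[1.0, 0.0, 5.0],
                      [0.0, 1.0, -3.0],
                      [0.0, 0.0, 1.0]])

x, y = 10.0, 20.0
output = transform @ np.array([x, y, 1.0])

# Divide by the homogeneous coordinate to get back to pixel coordinates.
tx = output[0] / output[2]
ty = output[1] / output[2]
print(tx, ty)  # 15.0 17.0
```

For a pure translation the homogeneous coordinate stays 1, but for a general homography the division is what projects the point back onto the image plane.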

With this, we're done with the calibration part. Now, we move onto warping images.

Warping with grid points

Let's start by writing a function that will do the warping for us. This function takes in these inputs:

  • The original image
  • A bunch of points that should ideally line up in a perfect grid

The size of the point list gives us the number of rows and columns to expect and the function returns a perfectly aligned image.

Let's define the function:

def meshwarp(src, distorted_grid):
    """
    src: The original image
    distorted_grid: The list of points that have been distorted
    """
    size = src.shape

As mentioned earlier, this takes in an image and the list of control points. We store the size of the image for future reference.

    mapsize = (size[0], size[1], 1)
    dst = np.zeros(size, dtype=np.uint8)

The size we stored earlier will most likely have three channels in it. So we create a new variable called mapsize; this stores the size of the image but only one channel. We'll use this later for creating matrices for use by the remap function in OpenCV.

We also create a blank image of the same size as the original. Next, we look at calculating the number of rows and columns in the grid.

    quads_per_row = len(distorted_grid[0]) - 1
    quads_per_col = len(distorted_grid) - 1
    pixels_per_row = size[1] / quads_per_row
    pixels_per_col = size[0] / quads_per_col

We'll use these variables in the loops below.

    pt_src_all = []
    pt_dst_all = []

These lists store all the source (distorted) points and the destination (perfectly aligned) points. We'll have to use distorted_grid to populate pt_src_all. We'll procedurally generate the destination based on the number of rows and columns in the input data.

    for ptlist in distorted_grid:
        pt_src_all.extend(ptlist)

The distorted grid should be a list of lists. Each row is a list that contains its points.

Now, we generate the procedural destination points using the quads_per_* variables we calculated earlier.

    for y in range(quads_per_col+1):
        for x in range(quads_per_row+1):
            pt_dst_all.append( [x*pixels_per_row,
                                y*pixels_per_col])

This generates the ideal grid based on the number of points we passed to the method.
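For instance, with a hypothetical 640x480 image and a 3x3 grid of control points (two quads per row and per column), the generated destination points look like this. The sketch emits them row by row so their order lines up with pt_src_all:

```python
# Sketch with made-up dimensions: a 640x480 image and a 3x3 control grid.
size = (480, 640)                         # (height, width), as from src.shape
quads_per_row = 2
quads_per_col = 2
pixels_per_row = size[1] / quads_per_row  # 320.0 pixels between grid columns
pixels_per_col = size[0] / quads_per_col  # 240.0 pixels between grid rows

pt_dst_all = []
for y in range(quads_per_col + 1):        # row by row, matching pt_src_all
    for x in range(quads_per_row + 1):
        pt_dst_all.append([x * pixels_per_row, y * pixels_per_col])

print(pt_dst_all[:3])  # first row: [[0.0, 0.0], [320.0, 0.0], [640.0, 0.0]]
print(pt_dst_all[-1])  # bottom-right corner: [640.0, 480.0]
```

Nine evenly spaced points cover the image from the top-left corner to the bottom-right.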

We now have all the information required to calculate the interpolation between the source and destination grids. We'll use SciPy to calculate the interpolation for us, then pass the result to OpenCV's remap method, which applies it to an image.

To begin, SciPy needs a representation of the expected output grid, so we specify a dense grid that contains every pixel of the image. This is done with:

gx, gy = np.mgrid[0:size[1], 0:size[0]]

Once we have the base grid defined, we can use SciPy's interpolate module to calculate the mapping for us.

g_out = scipy.interpolate.griddata(np.array(pt_dst_all),
                                   np.array(pt_src_all),
                                   (gx, gy), method='linear')

g_out contains both the x and y coordinates of the remapping; we need to split this into individual components for OpenCV's remap method to work.

mapx = np.append([], [ar[:,0] for ar in g_out]).reshape(mapsize).astype('float32')
mapy = np.append([], [ar[:,1] for ar in g_out]).reshape(mapsize).astype('float32')

These matrices are exactly what remap expects and we can now simply run it with the appropriate parameters.

    dst = cv2.remap(src, mapx, mapy, cv2.INTER_LINEAR)
    return dst

And that completes our method. We can use this in our stabilization code and fix the rolling shutter as well.
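The map-splitting step in the middle of meshwarp can be checked in isolation. Below, g_out is a fabricated 2x2-pixel identity mapping (in a real run it comes from griddata):

```python
import numpy as np

# Fabricated g_out for a 2x2 image: each entry holds the (x, y) source
# coordinate to sample; here it is the identity mapping.
g_out = np.array([[[0.0, 0.0], [0.0, 1.0]],
                  [[1.0, 0.0], [1.0, 1.0]]])
mapsize = (2, 2, 1)

# Split the combined (x, y) output into the two single-channel float32
# maps that cv2.remap expects.
mapx = np.append([], [ar[:, 0] for ar in g_out]).reshape(mapsize).astype('float32')
mapy = np.append([], [ar[:, 1] for ar in g_out]).reshape(mapsize).astype('float32')
print(mapx.shape, mapx.dtype)  # (2, 2, 1) float32
```

With an identity map like this, remap would return the source image unchanged; a distorted grid bends these maps and the image with them.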

Unwarping with calibration

Here, we warp images given a mesh in order to stabilize the video. We split each frame into a 10x10 mesh; warping the mesh warps the image along with it (the mesh vertices act like control points). This approach gives good results with decent performance as well.
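As a sketch, generating the undistorted 10x10 mesh looks like this. The frame size is made up; in the real code it comes from src.shape:

```python
# Hypothetical frame size; in accumulateRotation this comes from src.shape.
width, height = 1280, 720

pts = []
for x in range(10):
    pixel_x = x * (width / 10)
    # Each mesh entry is an [x, y] control point.
    pts.append([[pixel_x, y * (height / 10)] for y in range(10)])

print(len(pts), len(pts[0]))   # 10 10
print(pts[9][9])               # [1152.0, 648.0] -- last control point
```

This regular mesh is the "before" grid; the rolling shutter correction below produces the distorted "after" grid that meshwarp needs.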

The actual unwarp happens in the accumulateRotation method:

def accumulateRotation(src, theta_x, theta_y, theta_z, timestamps, prev, current, f, gyro_delay=None, gyro_drift=None, shutter_duration=None):
    ...
    transform = getAccumulatedRotation(src.shape[1], src.shape[0], theta_x, theta_y, theta_z, timestamps, prev, current, f, gyro_delay, gyro_drift)
    o = cv2.warpPerspective(src, transform, (src.shape[1], src.shape[0]))
    return o

Here, there's a single perspective transform happening. Now, instead, we have to do a different transform for each of the 10x10 control points and use the meshwarp method to fix the rolling shutter. So replace the transform = line with the contents below:

    ...
    pts = []
    pts_transformed = []
    for x in range(10):
        current_row = []
        current_row_transformed = []
        pixel_x = x * (src.shape[1] / 10)
        for y in range(10):
            pixel_y = y * (src.shape[0] / 10)
            current_row.append( [pixel_x, pixel_y] )
        pts.append(current_row)

We have now generated the original grid in the pts list. Now, we need to generate the transformed coordinates:

        ...
        for y in range(10):
            pixel_y = y * (src.shape[0] / 10)
            if shutter_duration:
                y_timestamp = current + shutter_duration*(pixel_y - src.shape[0]/2) / src.shape[0]
            else:
                y_timestamp = current
        ...

If a shutter duration is passed, we generate the timestamp at which this specific pixel was recorded. Now we can transform (pixel_x, pixel_y) based on the shutter rotation and append that to current_row_transformed:

        ...
        transform = getAccumulatedRotation(src.shape[1], src.shape[0], theta_x, theta_y, theta_z, timestamps, prev, y_timestamp, f, gyro_delay, gyro_drift)
        output = cv2.perspectiveTransform(np.array([[[pixel_x, pixel_y]]], dtype=np.float32), transform)
        current_row_transformed.append( [output[0][0][0], output[0][0][1]] )
    pts.append(current_row)
    pts_transformed.append(current_row_transformed)
        ...

This completes the grid for meshwarp. Now all we need to do is generate the warped image. This is simple since we already have the required method:

    o = meshwarp(src, pts_transformed)
    return o

And this completes our transformation. We now have rolling shutter incorporated into our undistortion as well.
