Going real time

One of the main advantages of using the GPU for image computations is speed. This increase in speed lets you run computationally heavy algorithms in real-time applications, such as stereo vision, pedestrian detection, or dense optical flow. In the next matchTemplateGPU example, we show an application that matches a template in a video sequence:

#include <iostream>
#include "opencv2/core/core.hpp"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/features2d/features2d.hpp"
#include "opencv2/gpu/gpu.hpp"
#include "opencv2/nonfree/gpu.hpp"

using namespace std;
using namespace cv;

int main( int argc, char** argv )
{
    if (argc < 2){
        cerr << "Usage: " << argv[0] << " <template_image>" << endl;
        return -1;
    }
    Mat img_template_cpu = imread(argv[1], IMREAD_GRAYSCALE);
    if (img_template_cpu.empty()){
        cerr << "cannot read template image" << endl;
        return -1;
    }
    gpu::GpuMat img_template;
    img_template.upload(img_template_cpu);

    //Detect keypoints and compute descriptors of the template
    gpu::SURF_GPU surf;
    gpu::GpuMat keypoints_template, descriptors_template;

    surf(img_template, gpu::GpuMat(), keypoints_template, descriptors_template);

    //Matcher variables
    gpu::BFMatcher_GPU matcher(NORM_L2);   

    //VideoCapture from the webcam
    gpu::GpuMat img_frame;
    gpu::GpuMat img_frame_gray;
    Mat img_frame_aux;
    VideoCapture cap;
    cap.open(0);
    if (!cap.isOpened()){
        cerr << "cannot open camera" << endl;
        return -1;
    }
    int nFrames = 0;
    uint64 totalTime = 0;
    //main loop
    for(;;){
        int64 start = getTickCount();
        cap >> img_frame_aux;
        if (img_frame_aux.empty())
            break;
        img_frame.upload(img_frame_aux);
        gpu::cvtColor(img_frame, img_frame_gray, CV_BGR2GRAY);

        //Step 1: Detect keypoints and compute descriptors
        gpu::GpuMat keypoints_frame, descriptors_frame;
        surf(img_frame_gray,gpu::GpuMat(),keypoints_frame, descriptors_frame);

        //Step 2: Match descriptors
        vector< vector<DMatch> > matches;
        matcher.knnMatch(descriptors_template, descriptors_frame, matches, 2);

        //Step 3: Filter results
        vector<DMatch> good_matches;
        float ratioT = 0.7f;
        for(int i = 0; i < (int) matches.size(); i++)
        {
            //Check that two neighbors exist before accessing them
            if(matches[i].size() == 2 && matches[i][0].distance < ratioT*(matches[i][1].distance))
            {
                good_matches.push_back(matches[i][0]);
            }
        }
        // Step 4: Download results
        vector<KeyPoint> keypoints1, keypoints2;
        vector<float> descriptors1, descriptors2;
        surf.downloadKeypoints(keypoints_template, keypoints1);
        surf.downloadKeypoints(keypoints_frame, keypoints2);
        surf.downloadDescriptors(descriptors_template, descriptors1);
        surf.downloadDescriptors(descriptors_frame, descriptors2);

        //Draw the results
        Mat img_result_matches;
        drawMatches(img_template_cpu, keypoints1, img_frame_aux, keypoints2, good_matches, img_result_matches);
        imshow("Matching a template", img_result_matches);

        int64 time_elapsed = getTickCount() - start;
        double fps = getTickFrequency() / time_elapsed;
        totalTime += time_elapsed;
        nFrames++;
        cout << "FPS : " << fps <<endl;

        int key = waitKey(30);
        if (key == 27)
            break;
    }
    if (nFrames > 0){
        double meanFps = getTickFrequency() / (totalTime / (double) nFrames);
        cout << "Mean FPS: " << meanFps << endl;
    }

    return 0;
}

The explanation of the code is given as follows. As detailed in Chapter 5, Focusing on the Interesting 2D Features, features can be used to find the correspondence between two images. The template image, which is afterwards searched for within every frame, is processed first using the GPU version of SURF (gpu::SURF_GPU surf;) to detect interest points and extract descriptors. This is accomplished by running surf(img_template, gpu::GpuMat(), keypoints_template, descriptors_template);. The same process is performed for every frame taken from the video sequence. To match the descriptors of both images, a GPU version of the BruteForce matcher is also created with gpu::BFMatcher_GPU matcher(NORM_L2);. An extra step is needed because interest points and descriptors are stored in GPU memory, and they need to be downloaded before we can show them. That is why surf.downloadKeypoints(keypoints_template, keypoints1); and surf.downloadDescriptors(descriptors_template, descriptors1); (along with their frame counterparts) are executed. The following screenshot shows the example running:
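The ratio-test filter in Step 3 deserves care: knnMatch can return rows with fewer than two neighbors, so the size check must run before the second neighbor is accessed. The following is a minimal, OpenCV-free sketch of that filter, where Match is a hypothetical stand-in for cv::DMatch holding only the distance field:

```cpp
#include <vector>

// Hypothetical stand-in for cv::DMatch: only the distance matters here.
struct Match { float distance; };

// Lowe's ratio test: keep the best match only when it is clearly better
// than the second-best candidate returned by knnMatch.
std::vector<Match> ratioFilter(const std::vector<std::vector<Match>>& knn,
                               float ratio)
{
    std::vector<Match> good;
    for (const std::vector<Match>& pair : knn) {
        // Guard first: rows with fewer than two neighbors cannot be tested.
        if (pair.size() == 2 && pair[0].distance < ratio * pair[1].distance)
            good.push_back(pair[0]);
    }
    return good;
}
```

With a ratio of 0.7, a row such as (10, 20) passes (10 < 14), while (15, 16) is rejected and single-neighbor rows are skipped.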


Template matching using a webcam

Performance

The principal motivation for choosing GPU programming is performance. Therefore, this example includes time measurements to compare the speedup obtained with respect to the CPU version. Specifically, a timestamp is taken at the beginning of the main loop of the program by means of the getTickCount() function. At the end of this loop, the same function is used together with getTickFrequency() to calculate the FPS of the current frame. The time elapsed in each frame is accumulated, and at the end of the program, the mean is computed. The previous example runs at an average of 15 FPS, whereas the same example using CPU data types and algorithms achieves a mere 0.5 FPS. Both examples have been tested on the same hardware: a PC equipped with an i5-4570 processor and an NVIDIA GeForce GTX 750 graphics card. Obviously, a 30x speedup is significant, especially when we just need to change a few lines of code.
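The timing arithmetic above can be isolated from OpenCV: per-frame FPS is the tick frequency divided by the elapsed ticks, and the mean FPS uses the accumulated ticks divided by the frame count. A minimal sketch, where frameFps and meanFps are hypothetical helpers mirroring the expressions in the example:

```cpp
#include <cstdint>

// Per-frame FPS, assuming a tick-based clock like cv::getTickCount():
// FPS = ticks-per-second / ticks elapsed in this frame.
double frameFps(std::int64_t startTick, std::int64_t endTick,
                double tickFrequency)
{
    return tickFrequency / static_cast<double>(endTick - startTick);
}

// Mean FPS over the whole run, matching the example's
// getTickFrequency() / (totalTime / nFrames) expression.
double meanFps(std::uint64_t totalTicks, int nFrames, double tickFrequency)
{
    return tickFrequency / (static_cast<double>(totalTicks) / nFrames);
}
```

For instance, with a tick frequency of 15000 ticks per second, a frame that takes 1000 ticks runs at 15 FPS, which is how the 15 FPS figure above is derived frame by frame.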
