Latent SVM

Latent SVM is a detector that uses HOG features and a star-structured, part-based model, consisting of a root filter and a set of part filters, to represent an object category. HOG (histogram of oriented gradients) features are descriptors obtained by counting the occurrences of gradient orientations in localized portions of an image. The detector trains its models from partially labeled data using a variant of the support vector machine (SVM) classifier. The basic idea of an SVM is to construct a hyperplane, or a set of hyperplanes, in a high-dimensional space. The hyperplanes are chosen to have the largest distance to the nearest training data points (the functional margin), because a larger margin generally gives a lower generalization error. Like cascade detectors, Latent SVM slides a window over the image at different positions and scales and applies the classifier at each one to decide whether it contains an object.
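To make the idea of counting gradient orientations concrete, the following is a minimal sketch of an orientation histogram for a single image cell. It uses only the C++ standard library and is not the code OpenCV runs internally. The function name cellOrientationHistogram, the choice of 9 bins over unsigned angles, and the weighting of each vote by gradient magnitude are assumptions made for illustration. A full HOG descriptor also normalizes groups of cells, which is omitted here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Orientation histogram for one cell of a grayscale patch (illustrative only).
// `cell` is a grid of intensities; `bins` splits the range [0, 180) degrees.
std::vector<float> cellOrientationHistogram(
        const std::vector<std::vector<float> >& cell, int bins = 9) {
    assert(bins > 0);
    std::vector<float> hist(bins, 0.0f);
    const double kPi = 3.14159265358979323846;
    const std::size_t rows = cell.size();
    const std::size_t cols = rows ? cell[0].size() : 0;
    // Central differences need a one-pixel border, so skip the edges.
    for (std::size_t y = 1; y + 1 < rows; ++y) {
        for (std::size_t x = 1; x + 1 < cols; ++x) {
            const float gx = cell[y][x + 1] - cell[y][x - 1];
            const float gy = cell[y + 1][x] - cell[y - 1][x];
            const float magnitude = std::sqrt(gx * gx + gy * gy);
            if (magnitude == 0.0f) continue;  // flat region, no orientation
            // Fold the angle into [0, 180) so opposite directions share a bin.
            double angle = std::atan2(gy, gx) * 180.0 / kPi;
            if (angle < 0.0) angle += 180.0;
            if (angle >= 180.0) angle -= 180.0;
            int bin = static_cast<int>(angle / (180.0 / bins));
            if (bin >= bins) bin = bins - 1;
            hist[bin] += magnitude;  // vote weighted by gradient strength
        }
    }
    return hist;
}
```

For a patch whose intensity changes only from left to right, every vote falls into the first bin, while a patch that changes only from top to bottom votes for the bin that contains 90 degrees.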

One of the advantages of the OpenCV Latent SVM implementation is that it allows the detection of multiple object categories by combining several simple pretrained detectors within the same multiobject detector instance.

The following latentDetection example illustrates how to use a Latent SVM detector for localizing objects from a category in an image:

#include "opencv2/core/core.hpp"
#include "opencv2/objdetect/objdetect.hpp"
#include "opencv2/highgui/highgui.hpp"
#include <iostream>

using namespace std;
using namespace cv;

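int main(int argc, char* argv[]){
    // Both the model file and the image file must be given
    if( argc < 3 ){
        cout << "Usage: latentDetection xmlfile imagefile" << endl;
        return -1;
    }
    String model = argv[1];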
    vector<String> models;
    models.push_back( model );
    vector<String> names;
    names.push_back( "category" );
    LatentSvmDetector detector( models , names);
    if( detector.empty() ) {
        cout << "Model cannot be loaded" << endl;
        return -1;
    }

    String img = argv[2];
    Mat image = imread( img );
    if( image.empty() ){
        cout << "Image cannot be loaded" << endl;
        return -1;
    }

    vector<LatentSvmDetector::ObjectDetection> detections;
    // overlapThreshold = 0.1 for non-maximum suppression, one thread
    detector.detect( image, detections, 0.1f, 1);
    // Draw an ellipse inscribed in each detected bounding box
    for( size_t i = 0; i < detections.size(); i++ ) {
        const Rect& r = detections[i].rect;
        Point center( cvRound(r.x + r.width*0.5),
                      cvRound(r.y + r.height*0.5) );
        ellipse( image, center,
                 Size( cvRound(r.width*0.5), cvRound(r.height*0.5) ),
                 0, 0, 360, Scalar( 255, 0, 255 ), 4, 8, 0 );
    }
    imshow( "result", image );
    waitKey(0);
    return 0;
}

The code explanation is as follows:

  • LatentSvmDetector: This class represents a Latent SVM detector composed of one or more pretrained detectors.
  • LatentSvmDetector::LatentSvmDetector(const vector<string>& filenames, const vector<string>& classNames=vector<string>()): This constructor initializes the object instance and loads the detectors stored in the files listed in the vector filenames. The second parameter, the vector classNames, contains the category names. The method bool LatentSvmDetector::load(const vector<string>& filenames, const vector<string>& classNames=vector<string>()) is called implicitly by the constructor.
  • void LatentSvmDetector::detect(const Mat& image, vector<ObjectDetection>& objectDetections, float overlapThreshold = 0.5f, int numThreads = -1): This method examines the image passed in image by applying the single or combined detector to it and stores all detected objects in objectDetections, a vector of ObjectDetection structures. This structure has the following three fields:
    • The bounding box of the detection (rect)
    • The confidence level (score)
    • The category ID (classID)

    The overlapThreshold parameter is the threshold used by the non-maximum suppression algorithm to eliminate overlapping detections. Finally, numThreads is the number of threads used in the parallel version of the algorithm.
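The role of an overlap threshold in non-maximum suppression can be sketched with a generic greedy version of the algorithm. This is an illustration under stated assumptions, not OpenCV's implementation: the Box struct is a hypothetical stand-in for ObjectDetection, the function names are invented for this example, and overlap is measured here as intersection area divided by union area, which may differ from the measure the library uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for LatentSvmDetector::ObjectDetection.
struct Box {
    int x, y, width, height;  // bounding box
    float score;              // confidence level
};

// Intersection area divided by union area of two boxes (assumed measure).
float overlapRatio(const Box& a, const Box& b) {
    const int x1 = std::max(a.x, b.x);
    const int y1 = std::max(a.y, b.y);
    const int x2 = std::min(a.x + a.width, b.x + b.width);
    const int y2 = std::min(a.y + a.height, b.y + b.height);
    const int inter = std::max(0, x2 - x1) * std::max(0, y2 - y1);
    const int uni = a.width * a.height + b.width * b.height - inter;
    return uni > 0 ? static_cast<float>(inter) / uni : 0.0f;
}

static bool higherScore(const Box& a, const Box& b) {
    return a.score > b.score;
}

// Greedy suppression: visit boxes from best to worst score and keep a box
// only if it does not overlap an already kept box by more than the threshold.
std::vector<Box> suppressOverlaps(std::vector<Box> boxes,
                                  float overlapThreshold) {
    assert(overlapThreshold >= 0.0f && overlapThreshold <= 1.0f);
    std::sort(boxes.begin(), boxes.end(), higherScore);
    std::vector<Box> kept;
    for (std::size_t i = 0; i < boxes.size(); ++i) {
        bool suppressed = false;
        for (std::size_t j = 0; j < kept.size() && !suppressed; ++j)
            suppressed = overlapRatio(boxes[i], kept[j]) > overlapThreshold;
        if (!suppressed) kept.push_back(boxes[i]);
    }
    return kept;
}
```

In this sketch, a lower threshold removes more of the boxes that cluster around the same object, while a threshold close to 1 keeps nearly all of them.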

The following screenshot shows a cat detected using the previous code with the files cat.xml and cat.png, and cars detected using car.xml and cars.png. These files are included in the OpenCV extra data, which can be found in the official repository. The program can be run using the following command:

>latentDetection.exe xmlfile imagefile

In the previous command, xmlfile is the Latent SVM detector and imagefile is the image that has to be examined.

Note

OpenCV extra data provides more samples and test files that can be used by users to create and test their own projects while saving time. It can be found at https://github.com/Itseez/opencv_extra.

In addition to the car and cat detectors, OpenCV provides pretrained detectors for the rest of the classes defined in The PASCAL Visual Object Classes Challenge 2007 (http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2007). These detectors are as follows:

  • aeroplane.xml
  • bicycle.xml
  • bird.xml
  • boat.xml
  • bottle.xml
  • bus.xml
  • car.xml
  • cat.xml
  • chair.xml
  • cow.xml
  • diningtable.xml
  • dog.xml
  • horse.xml
  • motorbike.xml
  • person.xml
  • pottedplant.xml
  • sheep.xml
  • sofa.xml
  • train.xml
  • tvmonitor.xml

The detection of a cat and some cars using Latent SVM

Tip

The false positive rate can be adjusted by changing the value of the overlapThreshold parameter.
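A complementary option, not part of the example program above, is to discard detections whose score is too low before drawing them. The following sketch assumes a hypothetical Detection struct that mirrors the rect, score, and classID fields of ObjectDetection, and a hypothetical helper named filterByScore. A suitable minimum score depends on the model and would have to be chosen by experiment.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical mirror of the fields in LatentSvmDetector::ObjectDetection.
struct Detection {
    int x, y, width, height;  // bounding box (rect)
    float score;              // confidence level
    int classID;              // category ID
};

// Keep only the detections whose score is at least minScore.
std::vector<Detection> filterByScore(const std::vector<Detection>& detections,
                                     float minScore) {
    std::vector<Detection> kept;
    for (std::size_t i = 0; i < detections.size(); ++i) {
        if (detections[i].score >= minScore)
            kept.push_back(detections[i]);
    }
    return kept;
}
```

Raising minScore leaves fewer, more confident detections, at the cost of possibly missing some true objects.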
