Latent SVM is a detector that uses HOG features and a star-structured, part-based model consisting of a root filter and a set of part filters to represent an object category. HOGs are feature descriptors obtained by counting the occurrences of gradient orientations in localized portions of an image. The detector uses a variant of the support vector machine (SVM) classifier to train models from partially labeled data. The basic idea of an SVM is to construct a hyperplane or a set of hyperplanes in a high-dimensional space. These hyperplanes are chosen to have the largest distance to the nearest training data point (the functional margin) in order to achieve a low generalization error. Like cascade detectors, Latent SVM uses a sliding window, applied at different positions and scales, to decide whether the window contains an object.
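As a rough illustration of how a HOG descriptor counts gradient orientations in a localized portion of an image, the following sketch builds the orientation histogram of a single cell. It is not OpenCV's implementation: the function name cellOrientationHistogram, the row-major float image, the 9-bin default, the unsigned orientation range, and the magnitude-weighted vote are all assumptions made for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of OpenCV): builds an unsigned gradient
// orientation histogram for one square cell of a grayscale image that is
// stored row-major in a flat vector.
std::vector<float> cellOrientationHistogram(const std::vector<float>& gray,
                                            int width, int height,
                                            int x0, int y0, int cellSize,
                                            int bins = 9)
{
    assert(bins > 0 && width > 0 && height > 0);
    assert(gray.size() == static_cast<std::size_t>(width) * height);

    const float pi = 3.14159265358979f;
    std::vector<float> hist(bins, 0.0f);

    for (int y = y0; y < y0 + cellSize; ++y) {
        for (int x = x0; x < x0 + cellSize; ++x) {
            // Skip border pixels, where central differences are undefined.
            if (x < 1 || y < 1 || x >= width - 1 || y >= height - 1)
                continue;

            // Central differences approximate the image gradient.
            float dx = gray[y * width + (x + 1)] - gray[y * width + (x - 1)];
            float dy = gray[(y + 1) * width + x] - gray[(y - 1) * width + x];
            float magnitude = std::sqrt(dx * dx + dy * dy);
            if (magnitude == 0.0f)
                continue;

            // Fold the orientation into the unsigned range [0, pi).
            float angle = std::atan2(dy, dx);
            if (angle < 0.0f) angle += pi;
            if (angle >= pi) angle -= pi;

            int bin = static_cast<int>(angle / pi * bins);
            if (bin >= bins) bin = bins - 1;

            // Assumption: each pixel votes with its gradient magnitude.
            hist[bin] += magnitude;
        }
    }
    return hist;
}
```

A full descriptor would concatenate and normalize such histograms over a grid of cells; that step is omitted here to keep the sketch short.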
One of the advantages of the OpenCV Latent SVM implementation is that it allows the detection of multiple object categories by combining several simple pretrained detectors within the same multiobject detector instance.
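To combine several pretrained detectors in one instance, the constructor needs one vector of model paths and a matching vector of category names. The sketch below shows one possible way to derive the names from the file names. The helpers classNameFromPath and classNamesFor are hypothetical and are not part of OpenCV; only the closing comment refers to the real LatentSvmDetector class.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: derives a category name from a model path,
// for example "models/cat.xml" gives "cat".
std::string classNameFromPath(const std::string& path)
{
    assert(!path.empty());
    std::size_t slash = path.find_last_of("/\\");
    std::size_t start = (slash == std::string::npos) ? 0 : slash + 1;
    std::size_t dot = path.find_last_of('.');
    // Ignore dots that belong to a directory name, or a missing extension.
    if (dot == std::string::npos || dot < start)
        dot = path.size();
    return path.substr(start, dot - start);
}

// Hypothetical helper: builds the classNames vector that pairs, element by
// element, with a filenames vector.
std::vector<std::string> classNamesFor(const std::vector<std::string>& filenames)
{
    std::vector<std::string> names;
    names.reserve(filenames.size());
    for (std::size_t i = 0; i < filenames.size(); ++i)
        names.push_back(classNameFromPath(filenames[i]));
    return names;
}

// Possible usage with the detector described in this section (needs the
// OpenCV objdetect module, so it is left as a comment here):
//   std::vector<std::string> models;
//   models.push_back("cat.xml");
//   models.push_back("car.xml");
//   cv::LatentSvmDetector detector(models, classNamesFor(models));
```

Keeping both vectors in the same order matters, because each category name is matched to the model at the same index.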
The following latentDetection example illustrates how to use a Latent SVM detector for localizing objects from a category in an image:
#include "opencv2/core/core.hpp"
#include "opencv2/objdetect/objdetect.hpp"
#include "opencv2/highgui/highgui.hpp"
#include <iostream>

using namespace std;
using namespace cv;

int main(int argc, char* argv[]){
    // Both the model file and the image file are required
    if( argc < 3 ){
        cout << "Usage: latentDetection xmlfile imagefile" << endl;
        return -1;
    }

    // Load the pretrained model
    String model = argv[1];
    vector<String> models;
    models.push_back( model );
    vector<String> names;
    names.push_back( "category" );
    LatentSvmDetector detector( models, names );
    if( detector.empty() ) {
        cout << "Model cannot be loaded" << endl;
        return -1;
    }

    // Load the image to examine
    String img = argv[2];
    Mat image = imread( img );
    if( image.empty() ){
        cout << "Image cannot be loaded" << endl;
        return -1;
    }

    // Detect objects with an overlap threshold of 0.1 and a single thread
    vector<LatentSvmDetector::ObjectDetection> detections;
    detector.detect( image, detections, 0.1f, 1 );

    // Draw an ellipse inscribed in the bounding box of each detection
    for( size_t i = 0; i < detections.size(); i++ ) {
        const Rect& r = detections[i].rect;
        Point center( cvRound( r.x + r.width*0.5 ), cvRound( r.y + r.height*0.5 ) );
        Size axes( cvRound( r.width*0.5 ), cvRound( r.height*0.5 ) );
        ellipse( image, center, axes, 0, 0, 360, Scalar( 255, 0, 255 ), 4, 8, 0 );
    }

    imshow( "result", image );
    waitKey(0);
    return 0;
}
The code explanation is as follows:
LatentSvmDetector: This class represents a Latent SVM detector composed of one or more pretrained detectors.
LatentSvmDetector::LatentSvmDetector(const vector<string>& filenames, const vector<string>& classNames=vector<string>()): This constructor initializes the object instance and loads the detectors stored at the paths given in the vector filenames. The second parameter, the vector classNames, contains the category names. The method bool LatentSvmDetector::load(const vector<string>& filenames, const vector<string>& classNames=vector<string>()) is called implicitly by the constructor.
void LatentSvmDetector::detect(const Mat& image, vector<ObjectDetection>& objectDetections, float overlapThreshold = 0.5f, int numThreads = -1): This method examines the image in the variable image by applying the simple or combined detector to it and stores all detected objects in objectDetections, a vector of the ObjectDetection struct. This structure has the following three variables:
rect: The bounding box of the detected object
score: The confidence value of the detection
classID: The ID of the object class, that is, of the detector that produced the detection
The parameter overlapThreshold is the threshold for the non-maximum suppression algorithm that eliminates overlapping detections. Finally, numThreads is the number of threads used in the parallel version of the algorithm.
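To make the role of overlapThreshold more concrete, here is a minimal sketch of a greedy non-maximum suppression pass. It is not the code OpenCV runs internally. The Detection struct is a hypothetical stand-in for ObjectDetection, and two choices are assumptions of this example: overlap is measured as intersection over union, and only detections of the same class suppress each other.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for LatentSvmDetector::ObjectDetection.
struct Detection {
    int x, y, width, height;  // bounding box
    float score;              // confidence value
    int classID;              // index of the detector that fired
};

// Overlap measured as intersection over union (an assumption of this sketch).
float overlapRatio(const Detection& a, const Detection& b)
{
    int x1 = std::max(a.x, b.x);
    int y1 = std::max(a.y, b.y);
    int x2 = std::min(a.x + a.width, b.x + b.width);
    int y2 = std::min(a.y + a.height, b.y + b.height);
    int iw = std::max(0, x2 - x1);
    int ih = std::max(0, y2 - y1);
    float inter = static_cast<float>(iw) * ih;
    float uni = static_cast<float>(a.width) * a.height +
                static_cast<float>(b.width) * b.height - inter;
    return uni > 0.0f ? inter / uni : 0.0f;
}

bool higherScore(const Detection& a, const Detection& b)
{
    return a.score > b.score;
}

// Greedy suppression: visit detections from highest to lowest score and keep
// one only if it does not overlap an already kept detection of the same class
// by more than overlapThreshold.
std::vector<Detection> nonMaximumSuppression(std::vector<Detection> dets,
                                             float overlapThreshold)
{
    assert(overlapThreshold >= 0.0f && overlapThreshold <= 1.0f);
    std::sort(dets.begin(), dets.end(), higherScore);

    std::vector<Detection> kept;
    for (std::size_t i = 0; i < dets.size(); ++i) {
        bool suppressed = false;
        for (std::size_t j = 0; j < kept.size(); ++j) {
            if (kept[j].classID == dets[i].classID &&
                overlapRatio(kept[j], dets[i]) > overlapThreshold) {
                suppressed = true;
                break;
            }
        }
        if (!suppressed)
            kept.push_back(dets[i]);
    }
    return kept;
}
```

In this sketch, a lower threshold removes more boxes, because a smaller overlap is already enough to discard the weaker detection.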
The following screenshot shows a cat detected using the previous code and the files cat.xml and cat.png, and cars detected using car.xml and cars.png. These files are included in the OpenCV extra data that can be found in the official repository. Thus, it is possible to run the program using the following command:
>latentDetection.exe xmlfile imagefile
In the previous command, xmlfile is the Latent SVM detector and imagefile is the image that has to be examined.
OpenCV extra data provides more samples and test files that you can use to create and test your own projects while saving time. It can be found at https://github.com/Itseez/opencv_extra.
In addition to the car and cat detectors, OpenCV provides pretrained detectors for the rest of the classes defined in The PASCAL Visual Object Classes Challenge 2007 (http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2007). These detectors are as follows:
aeroplane.xml
bicycle.xml
bird.xml
boat.xml
bottle.xml
bus.xml
car.xml
cat.xml
chair.xml
cow.xml
diningtable.xml
dog.xml
horse.xml
motorbike.xml
person.xml
pottedplant.xml
sheep.xml
sofa.xml
train.xml
tvmonitor.xml