OpenCV provides several local feature detector implementations through the FeatureDetector abstract class and its Ptr<FeatureDetector> FeatureDetector::create(const string& detectorType) method, or by instantiating the corresponding algorithm class directly. In the first case, the type of detector is specified (see the following diagram, where the detectors used in this chapter are indicated in red). The detectors and the types of local features that they detect are as follows:
- FAST (FastFeatureDetector): This detector finds corners and blobs
- STAR (StarFeatureDetector): This detector finds edges, corners, and blobs
- SIFT (SiftFeatureDetector): This detector finds corners and blobs (part of the nonfree module)
- SURF (SurfFeatureDetector): This detector finds corners and blobs (part of the nonfree module)
- ORB (OrbFeatureDetector): This detector finds corners and blobs
- BRISK (BRISK): This detector finds corners and blobs
- MSER (MserFeatureDetector): This detector finds blobs
- GFTT (GoodFeaturesToTrackDetector): This detector finds edges and corners
- HARRIS (GoodFeaturesToTrackDetector): This detector finds edges and corners (with the Harris detector enabled)
- Dense (DenseFeatureDetector): This detector finds features distributed densely and regularly across the image
- SimpleBlob (SimpleBlobDetector): This detector finds blobs

We should note that some of these detectors, such as SIFT, SURF, ORB, and BRISK, are also descriptors.
Keypoint detection is performed by the void FeatureDetector::detect(const Mat& image, vector<KeyPoint>& keypoints, const Mat& mask)
function, which is another method of the FeatureDetector
class. The first parameter is the input image where the keypoints will be detected. The second parameter corresponds to the vector where the keypoints will be stored. The last parameter is optional and represents an input mask image in which we can specify where to look for keypoints.
Matthieu Labbé has implemented a Qt-based open source application where you can test OpenCV's corner detectors, feature extractors, and matching algorithms in a nice GUI. It is available at https://code.google.com/p/find-object/.
The first interest points were historically corners. In 1977, Moravec defined corners as interest points where there is a large intensity variation in several directions (at 45-degree shifts). Moravec used these interest points to find matching regions in consecutive image frames. Later, in 1988, Harris improved Moravec's algorithm by using the Taylor expansion to approximate the shifted intensity variation. Afterwards, other detectors appeared, such as those based on the difference of Gaussians (DoG) and the determinant of the Hessian (DoH) (for example, SIFT and SURF, respectively), or detectors based on Moravec's algorithm that consider continuous intensity values in a pixel neighborhood, such as FAST and BRISK (scale-space FAST).
Lu, in her personal blog, LittleCheeseCake, explains some of the most popular detectors and descriptors in detail. The blog is available at http://littlecheesecake.me/blog/13804625/feature-detectors-and-descriptors.
The FAST corner detector is based on the Features from Accelerated Segment Test (FAST) algorithm, which was designed to be very efficient and targets real-time applications. The method considers a circle of 16 pixels (the neighborhood) around a candidate corner p. The FAST detector considers p a corner if there is a set of contiguous pixels in the neighborhood that are all brighter than p+T or all darker than p-T, where T is a threshold value that must be properly selected.
OpenCV implements the FAST detector in the FastFeatureDetector class, which is a wrapper around the FAST() function. To use this class, we must include the features2d.hpp header file in our code.
Next, we show a code example where the corners are detected using the FAST method with two different threshold values. The FASTDetector code example is shown as follows:
#include "opencv2/core/core.hpp"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/features2d/features2d.hpp"
#include <iostream>

using namespace std;
using namespace cv;

int main(int argc, char *argv[])
{
    //Load the original image and convert it to gray scale
    Mat in_img = imread("book.png");
    cvtColor(in_img, in_img, COLOR_BGR2GRAY);

    //Create keypoint vectors
    vector<KeyPoint> keypoints1, keypoints2;

    //FAST detectors with threshold values of 80 and 100
    FastFeatureDetector detector1(80);
    FastFeatureDetector detector2(100);

    //Compute the keypoints in in_img with detector1 and detector2
    detector1.detect(in_img, keypoints1);
    detector2.detect(in_img, keypoints2);

    Mat out_img1, out_img2;

    //Draw keypoints1 and keypoints2
    drawKeypoints(in_img, keypoints1, out_img1, Scalar::all(-1), 0);
    drawKeypoints(in_img, keypoints2, out_img2, Scalar::all(-1), 0);

    //Show the keypoints detected by detector1 and detector2
    imshow("out_img1", out_img1);
    imshow("out_img2", out_img2);
    waitKey(0);
    return 0;
}
The explanation of the code is given as follows. In this and the following examples, we usually perform the same three steps: create the feature detector, detect the keypoints in the input image, and draw the detected keypoints.
In our sample, FastFeatureDetector(int threshold=1, bool nonmaxSuppression=true, int type=FastFeatureDetector::TYPE_9_16) is the constructor where the detector parameters, such as the threshold value, non-maximum suppression, and neighborhood type, are defined.
The following three types of neighborhoods can be selected:
FastFeatureDetector::TYPE_9_16
FastFeatureDetector::TYPE_7_12
FastFeatureDetector::TYPE_5_8
These neighborhoods define the number of neighbors (16, 12, or 8) and the total number of contiguous pixels (9, 7, or 5) needed to consider the corner (keypoint) valid. An example of TYPE_9_16
is shown in the next screenshot.
In our code, the threshold values 80
and 100
have been selected, while the rest of the parameters have their default values, nonmaxSuppression=true
and type=FastFeatureDetector::TYPE_9_16
, as shown:
FastFeatureDetector detector1(80);
FastFeatureDetector detector2(100);
Keypoints are detected and saved using the void detect(const Mat& image, vector<KeyPoint>& keypoints, const Mat& mask=Mat()) function. In our case, we create the following two FAST feature detectors:

- detector1 saves its keypoints in the keypoints1 vector
- detector2 saves its keypoints in the keypoints2 vector
The void drawKeypoints(const Mat& image, const vector<KeyPoint>& keypoints, Mat& outImage, const Scalar& color=Scalar::all(-1), int flags=DrawMatchesFlags::DEFAULT) function draws the keypoints in the image. The color parameter allows us to define the color of the keypoints, and with the Scalar::all(-1) option, each keypoint is drawn in a different color.
The keypoints are drawn using the two threshold values on the image. We will notice a small difference in the number of keypoints detected. This is due to the threshold value in each case. The following screenshot shows a corner detected in the sample with a threshold value of 80, which is not detected with a threshold value of 100:
The difference is due to the fact that the FAST feature detectors are created with the default type, that is, TYPE_9_16
. In the example, the p pixel takes a value of 228, so at least nine contiguous pixels must be brighter than p+T or darker than p-T. The following screenshot shows the neighborhood pixel values in this specific keypoint. The condition of nine contiguous pixels is met if we use a threshold value of 80. However, the condition is not met with a threshold value of 100:
The Speeded Up Robust Features (SURF) detector uses a Hessian matrix to find the interest points. For this purpose, SURF divides the image into different scales (levels and octaves) using second-order Gaussian kernels and approximates these kernels with simple box filters. The box-filter responses are interpolated in scale and space in order to give the detector its scale-invariance properties. SURF is a faster approximation of the classic Scale Invariant Feature Transform (SIFT) detector. Both the SURF and SIFT detectors are patented, so OpenCV includes them separately in the nonfree/nonfree.hpp header file.
The following SURFDetector
code shows an example where the keypoints are detected using the SURF detector with a different number of Gaussian pyramid octaves:
//… (omitted for simplicity)
#include "opencv2/nonfree/nonfree.hpp"

int main(int argc, char *argv[])
{
    //Load the image and convert it to gray scale (omitted for simplicity)

    //Create keypoint vectors
    vector<KeyPoint> keypoints1, keypoints2;

    //SURF detector1 and detector2 with 2 and 5 Gaussian pyramid
    //octaves respectively
    SurfFeatureDetector detector1(3500, 2, 2, false, false);
    SurfFeatureDetector detector2(3500, 5, 2, false, false);

    //Compute the keypoints in in_img with detector1 and detector2
    detector1.detect(in_img, keypoints1);
    detector2.detect(in_img, keypoints2);

    Mat out_img1, out_img2;

    //Draw keypoints1 and keypoints2
    drawKeypoints(in_img, keypoints1, out_img1, Scalar::all(-1),
                  DrawMatchesFlags::DRAW_RICH_KEYPOINTS);
    drawKeypoints(in_img, keypoints2, out_img2, Scalar::all(-1),
                  DrawMatchesFlags::DRAW_RICH_KEYPOINTS);

    //Show the two final images (omitted for simplicity)
    return 0;
}
The explanation of the code is given as follows. SurfFeatureDetector(double hessianThreshold, int nOctaves, int nOctaveLayers, bool extended, bool upright) is the main constructor used to create a SURF detector, where we can define the parameter values of the detector: the Hessian threshold, the number of Gaussian pyramid octaves, the number of images within each octave of the pyramid, the number of elements in the descriptor, and whether the orientation of each feature is computed.
A high threshold value extracts fewer keypoints but with more accuracy, while a low threshold value extracts more keypoints but with less accuracy. In this case, we have used a large Hessian threshold (3500) to show a reduced number of keypoints in the image. Also, the number of octaves changes for each detector (2 and 5, respectively); a larger number of octaves selects keypoints with a larger size. The following screenshot shows the result:
Again, we use the drawKeypoints function to draw the detected keypoints, but in this case, as the SURF detector has orientation properties, the flags parameter is set to DrawMatchesFlags::DRAW_RICH_KEYPOINTS. The drawKeypoints function then draws each keypoint with its size and orientation.
Binary Robust Independent Elementary Features (BRIEF) is a descriptor based on binary strings; it does not find interest points by itself. The Oriented FAST and Rotated BRIEF (ORB) detector is a combination of the FAST detector and the BRIEF descriptor, and it is considered an alternative to the patented SIFT and SURF detectors. The ORB detector uses the FAST detector with pyramids to detect interest points and then uses the Harris algorithm to rank the features and retain the best ones. OpenCV also allows us to use the FAST score to rank the features, but this normally produces less stable keypoints. The following ORBDetector code shows a simple and clear example of this difference:
int main(int argc, char *argv[])
{
    //Load the image and convert it to gray scale (omitted for simplicity)

    //Create keypoint vectors
    vector<KeyPoint> keypoints1, keypoints2;

    //ORB detectors with the FAST (detector1) and HARRIS (detector2)
    //scores to rank the features
    OrbFeatureDetector detector1(300, 1.1f, 2, 31, 0, 2, ORB::FAST_SCORE, 31);
    OrbFeatureDetector detector2(300, 1.1f, 2, 31, 0, 2, ORB::HARRIS_SCORE, 31);

    //Compute the keypoints in in_img with detector1 and detector2
    detector1.detect(in_img, keypoints1);
    detector2.detect(in_img, keypoints2);

    Mat out_img1, out_img2;

    //Draw keypoints1 and keypoints2
    drawKeypoints(in_img, keypoints1, out_img1, Scalar::all(-1),
                  DrawMatchesFlags::DEFAULT);
    drawKeypoints(in_img, keypoints2, out_img2, Scalar::all(-1),
                  DrawMatchesFlags::DEFAULT);

    //Show the two final images (omitted for simplicity)
    return 0;
}
The explanation of the code is given as follows. The OrbFeatureDetector(int nfeatures=500, float scaleFactor=1.2f, int nlevels=8, int edgeThreshold=31, int firstLevel=0, int WTA_K=2, int scoreType=ORB::HARRIS_SCORE, int patchSize=31) function is the class constructor, where we can specify the maximum number of features to retain, the scale factor, the number of pyramid levels, and the type of score (HARRIS_SCORE or FAST_SCORE) used to rank the features.
The following lines of the code example show the difference between the HARRIS and FAST algorithms used to rank the features; the result is shown in the preceding screenshot:
OrbFeatureDetector detector1(300, 1.1f, 2, 31, 0, 2, ORB::FAST_SCORE, 31);
OrbFeatureDetector detector2(300, 1.1f, 2, 31, 0, 2, ORB::HARRIS_SCORE, 31);
The Harris corner measure is used more often than FAST to rank features because it rejects edges and provides a more reasonable score. The rest of the functions are the same as in the previous detector examples: keypoint detection and drawing.
The KAZE and AKAZE detectors will be included in the upcoming OpenCV 3.0.
OpenCV 3.0 is not yet available. Again, if you want to test this code and use the KAZE and AKAZE features, you can work with the latest version already available in the OpenCV git repository at http://code.opencv.org/projects/opencv/repository.
The KAZE detector is a method that can detect 2D features in a nonlinear scale space. This approach keeps important image details while removing noise. Additive Operator Splitting (AOS) schemes, which are efficient, stable, and parallelizable, are used to build the nonlinear scale space. The algorithm computes the response of a Hessian matrix at multiple scale levels to detect the keypoints. On the other hand, the Accelerated-KAZE (AKAZE) feature detector uses Fast Explicit Diffusion (FED) to build the nonlinear scale space.
Next, in the KAZEDetector
code, we see an example of the new KAZE and AKAZE feature detectors:
int main(int argc, char *argv[])
{
    //Load the image and convert it to gray scale (omitted for simplicity)

    //Create keypoint vectors
    vector<KeyPoint> keypoints1, keypoints2;

    //Create the KAZE and AKAZE detectors
    KAZE detector1(true, true);
    AKAZE detector2(cv::AKAZE::DESCRIPTOR_KAZE_UPRIGHT, 0, 3);

    //Compute the keypoints in in_img with detector1 and detector2
    detector1.detect(in_img, keypoints1);
    detector2.detect(in_img, keypoints2, cv::Mat());

    Mat out_img1, out_img2;

    //Draw keypoints1 and keypoints2
    drawKeypoints(in_img, keypoints1, out_img1, Scalar::all(-1),
                  DrawMatchesFlags::DRAW_RICH_KEYPOINTS);
    drawKeypoints(in_img, keypoints2, out_img2, Scalar::all(-1),
                  DrawMatchesFlags::DRAW_RICH_KEYPOINTS);

    //Show the two final images (omitted for simplicity)
    return 0;
}
The KAZE::KAZE(bool extended, bool upright) function is the KAZE class constructor, in which two parameters can be selected: extended and upright. The extended parameter selects between 64- and 128-element descriptors, while the upright parameter selects between rotation invariance and no rotation invariance. In this case, we use a true value for both parameters.
On the other hand, the AKAZE::AKAZE(DESCRIPTOR_TYPE descriptor_type, int descriptor_size=0, int descriptor_channels=3) function is the AKAZE class constructor. This function takes the descriptor type, descriptor size, and number of descriptor channels as input arguments. For the descriptor type, the following enumeration is applied:
enum DESCRIPTOR_TYPE
{
    DESCRIPTOR_KAZE_UPRIGHT = 2,
    DESCRIPTOR_KAZE = 3,
    DESCRIPTOR_MLDB_UPRIGHT = 4,
    DESCRIPTOR_MLDB = 5
};
The following screenshot shows the results obtained with this example:
Eugene Khvedchenya's Computer Vision Talks blog contains useful reports that compare different keypoints in terms of robustness and efficiency. See the posts at http://computer-vision-talks.com/articles/2012-08-18-a-battle-of-three-descriptors-surf-freak-and-brisk/ and http://computer-vision-talks.com/articles/2011-07-13-comparison-of-the-opencv-feature-detection-algorithms/.