Cascades are beautiful

Most object detection problems, such as face or person detection and lesion detection in medical imaging, require searching for the object in many image patches. However, examining every image region and computing a feature set for each one is time consuming. Cascade detectors are widely used because they perform this search very efficiently.

Cascade detectors consist of a series of boosting stages. The boosting algorithm selects the best features to build and combine a number of weak tree classifiers, so boosting is not only a classifier but also a feature selection method. Each stage is usually trained to detect nearly 100 percent of the objects correctly while discarding at least 50 percent of the background samples. Consequently, background regions, which make up the vast majority of the evaluated patches, are discarded in the early stages of the cascade and need little processing time. Later stages use more features than earlier ones, so only objects and difficult background patches require a full evaluation.

Discrete AdaBoost (Adaptive Boosting), Real AdaBoost, Gentle AdaBoost, and LogitBoost are all implemented in OpenCV as boosting algorithms for the stages. In addition, Haar-like features, Local Binary Patterns (LBP), and Histograms of Oriented Gradients (HOG) can be combined with any of these boosting algorithms.

All these advantages and available techniques make cascades very useful for building practical detection applications.

Object detection using cascades

OpenCV comes with several pretrained cascade detectors for the most common detection problems. They are located under the OPENCV_SOURCE\data directory. The following is a list of some of them and their corresponding subdirectories:

  • Subdirectory haarcascades:
    • haarcascade_frontalface_default.xml
    • haarcascade_eye.xml
    • haarcascade_mcs_nose.xml
    • haarcascade_mcs_mouth.xml
    • haarcascade_upperbody.xml
    • haarcascade_lowerbody.xml
    • haarcascade_fullbody.xml
  • Subdirectory lbpcascades:
    • lbpcascade_frontalface.xml
    • lbpcascade_profileface.xml
    • lbpcascade_silverware.xml
  • Subdirectory hogcascades:
    • hogcascade_pedestrians.xml

The following pedestrianDetection example serves to illustrate how to use a cascade detector and localize pedestrians in a video file with OpenCV:

#include "opencv2/core/core.hpp"
#include "opencv2/objdetect/objdetect.hpp"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include <iostream>

using namespace std;
using namespace cv;

int main(int argc, char *argv[]){
    if (argc != 3){
        cout<<"Usage: pedestrianDetection <cascade_file> <video_file>"<<endl;
        return -1;
    }

    // Load the trained cascade detector from the given XML file
    CascadeClassifier cascade(argv[1]);
    if (cascade.empty())
        return -1;

    VideoCapture vid(argv[2]);
    if (!vid.isOpened()){
        cout<<"Error. The video cannot be opened."<<endl;
        return -1;
    }

    namedWindow("Pedestrian Detection");
    Mat frame;
    while(1) {
        if (!vid.read(frame))
            break;

        Mat frame_gray;
        if(frame.channels()>1){
            cvtColor( frame, frame_gray, CV_BGR2GRAY );
            equalizeHist( frame_gray, frame_gray );
        }else{
            frame_gray = frame;
        }

        vector<Rect> pedestrians;
        cascade.detectMultiScale( frame_gray, pedestrians, 1.1, 2, 0, Size(30, 30), Size(150, 150) );

        for( size_t i = 0; i < pedestrians.size(); i++ ) {
            Point center( pedestrians[i].x + 
                          pedestrians[i].width*0.5, 
                          pedestrians[i].y + 
                          pedestrians[i].height*0.5 );
            ellipse( frame, center, Size( pedestrians[i].width*0.5,
                     pedestrians[i].height*0.5), 0, 0, 360, 
                     Scalar( 255, 0, 255 ), 4, 8, 0 );
        }

        imshow("Pedestrian Detection", frame);
        if(waitKey(100) >= 0)
            break;
    }
    return 0;
}

The code explanation is as follows:

  • CascadeClassifier: This class provides all the methods needed when working with cascades. An object from this class represents a trained cascade detector.
  • Constructor CascadeClassifier::CascadeClassifier(const string& filename): This constructor initializes the object instance and loads the cascade detector information stored in the file indicated by filename.

    Note

    Note that the method bool CascadeClassifier::load(const string& filename) is called implicitly by this constructor.

  • bool CascadeClassifier::empty(): This method checks whether a cascade detector has been loaded.
  • cvtColor and equalizeHist: These functions perform grayscale conversion and histogram equalization. Since the cascade detector is trained with grayscale images and input images can come in different formats, it is necessary to convert them to the correct color space and equalize their histograms in order to obtain better results. This is done by the following code, which uses the cvtColor and equalizeHist functions:
    Mat frame_gray;
    if(frame.channels()>1){
        cvtColor( frame, frame_gray, CV_BGR2GRAY );
        equalizeHist( frame_gray, frame_gray );
    }else{
        frame_gray = frame;
    }
  • void CascadeClassifier::detectMultiScale(const Mat& image, vector<Rect>& objects, double scaleFactor=1.1, int minNeighbors=3, int flags=0, Size minSize=Size(), Size maxSize=Size()): This method examines the image in the image variable by applying the loaded cascade and inserts all the detected objects in objects. Detections are stored in a vector of rectangles of type Rect. The parameters scaleFactor and minNeighbors indicate, respectively, how much the search window size changes between consecutive image scales and the minimum number of neighboring detections required to accept a positive detection. Detections are bound by the minimum and maximum sizes indicated by minSize and maxSize. Finally, the flags parameter is not used with cascades created with opencv_traincascade.

    Tip

    After obtaining the vector that stores the detected objects, it is easy to show them over the original images by reading the coordinates of each rectangle, represented by objects of the class Rect, and drawing a polygon in the indicated zones.

The following screenshot shows the result of applying the hogcascade_pedestrians.xml pretrained HOG-based pedestrian detector over the frames of the 768x576.avi video, which is stored in the OPENCV_SCR/samples folder.


Pedestrian detection using the OpenCV-trained HOG cascade detector

There are several projects and contributions in the OpenCV community that solve other detection-related problems, which involve not only detecting the object but also distinguishing its state. One example of this type of detector is the smile detector included in OpenCV since version 2.4.4. The code can be found in the file OPENCV_SCR/samples/c/smiledetect.cpp, and the XML file that stores the cascade detector, haarcascade_smile.xml, can be found in OPENCV_SCR/data/haarcascades. This code first detects the frontal face using the pretrained cascade stored in haarcascade_frontalface_alt.xml and then detects the smiling mouth pattern in the bottom part of the detected face region. Finally, the intensity of the smile is calculated from the number of neighbors detected.

Training your own cascade

Although OpenCV provides pretrained cascades, in some cases it is necessary to train a cascade detector to look for a specific object. For these cases, OpenCV comes with tools that help train a cascade, generating all the data needed during the training process as well as the final files containing the detector information. These tools are usually stored in the OPENCV_BUILD\install\x64\mingw\bin directory. Some of the applications are listed as follows:

  • opencv_haartraining: This application is historically the first version of the application for creating cascades.
  • opencv_traincascade: This application is the latest version of the application for creating cascades.
  • opencv_createsamples: This application is used to create the .vec file with the images that contain instances of the object. The file generated is accepted by both the preceding training executables.
  • opencv_performance: This application may be used to evaluate a cascade trained with the opencv_haartraining tool. It uses a set of marked images to obtain information about the evaluation, for example, the false alarm or the detection rates.

Since opencv_haartraining is the older version of the program and it comes with fewer features than opencv_traincascade, only the latter will be described here.

Here, the cascade training process is explained using the MIT CBCL face database. This database contains face and background images of 19 x 19 pixels arranged as shown in the following screenshot:


Image file organization

Note

This section explains the training process on Windows. For Linux and Mac OS X, the process is similar but takes into account the specific aspects of the operating system. More information on training cascade detectors in Linux and Mac OS X can be found at http://opencvuser.blogspot.co.uk/2011/08/creating-haar-cascade-classifier-aka.html and http://kaflurbaleen.blogspot.co.uk/2012/11/how-to-train-your-classifier-on-mac.html respectively.

The training process involves the following steps:

  1. Setting the current directory: In the Command Prompt window, set the current directory to the directory where the training images are stored. For example, if the directory is C:\chapter6\images, use the following command:
    >cd C:\chapter6\images
    
  2. Creating the background images information text file: If the background images are stored in C:\chapter6\images\train\non-face and their format is .pgm, it is possible to create the text file required by OpenCV using the following command:
    >for %i in (C:\chapter6\images\train\non-face\*.pgm) do @echo %i >> train_non-face.txt
    

    The following screenshot shows the contents of the background image information file. This file contains the path of the background images:


    Background images information file

  3. Creating the object images file: This involves the following two steps:
    1. Creating the .dat file with the object coordinates. In this particular database, each object image contains only one instance of the object, located at the center and scaled to occupy the entire image. Therefore, the number of objects per image is 1 and the object coordinates are 0 0 19 19, that is, the top-left point followed by the width and height of the rectangle that contains the object.

      If the object images are stored in C:\chapter6\images\train\face, it is possible to use the following command to generate the file:

      >for %i in (C:\chapter6\images\train\face\*.pgm) do @echo %i 1 0 0 19 19 >> train_face.dat
      

      The content of the .dat file can be seen in the following screenshot:


      Object images file

    2. After creating the .dat file with the object coordinates, it is necessary to create the .vec file needed by OpenCV. This step can be performed with the opencv_createsamples program using the arguments -info (the .dat file), -vec (the .vec output file name), -num (the number of images), -w and -h (the output image width and height), and -maxxangle, -maxyangle, and -maxzangle (the maximum image rotation angles). To see more options, execute opencv_createsamples without arguments. In this case, the command used is:
      >opencv_createsamples -info train_face.dat -vec train_face.vec -num 2429 -w 19 -h 19 -maxxangle 0 -maxyangle 0 -maxzangle 0
      

      Tip

      OpenCV includes a sample .vec file with facial images of size 24 x 24 pixels.

  4. Training the cascade: Finally, use the opencv_traincascade executable and train the cascade detector. The command used in this case is:
    >opencv_traincascade -data C:\chapter6\trainedCascade -vec train_face.vec -bg train_non-face.txt -numPos 242 -numNeg 454 -numStages 10 -w 19 -h 19
    

    The arguments indicate the output directory (-data), the .vec file (-vec), the background information file (-bg), the number of positive and negative images used to train each stage (-numPos and -numNeg), the maximum number of stages (-numStages), and the width and height of the images (-w and -h).

    The output of the training process is:

    PARAMETERS:
    cascadeDirName: C:\chapter6\trainedCascade
    vecFileName: train_face.vec
    bgFileName: train_non-face.txt
    numPos: 242
    numNeg: 454
    numStages: 10
    precalcValBufSize[Mb] : 256
    precalcIdxBufSize[Mb] : 256
    stageType: BOOST
    featureType: HAAR
    sampleWidth: 19
    sampleHeight: 19
    boostType: GAB
    minHitRate: 0.995
    maxFalseAlarmRate: 0.5
    weightTrimRate: 0.95
    maxDepth: 1
    maxWeakCount: 100
    mode: BASIC
    ===== TRAINING 0-stage =====
    <BEGIN
    POS count : consumed   242 : 242
    NEG count : acceptanceRatio    454 : 1
    Precalculation time: 4.524
    +----+---------+---------+
    |  N |    HR   |    FA   |
    +----+---------+---------+
    |   1|        1|        1|
    +----+---------+---------+
    |   2|        1|        1|
    +----+---------+---------+
    |   3| 0.995868| 0.314978|
    +----+---------+---------+
    END>
    Training until now has taken 0 days 0 hours 0 minutes 9 seconds.
    . . . Stages 1, 2, 3, and 4 . . .
    ===== TRAINING 5-stage =====
    <BEGIN
    POS count : consumed   242 : 247
    NEG count : acceptanceRatio    454 : 0.000220059
    Required leaf false alarm rate achieved. Branch training terminated.
    

Finally, the XML files of the cascade are stored in the output directory. These files are cascade.xml, params.xml, and a set of stageX.xml files where X is the stage number.
