In this recipe, we'll learn how to detect faces using the cv::CascadeClassifier
class from OpenCV. In order to do that, we will load an XML file with a trained classifier, use it to detect faces, and then draw a rectangle over the detected face.
Source code for this recipe can be found in the Recipe04_DetectingFaces
folder in the code bundle that accompanies this book. For this recipe, you will need to download the XML file from the OpenCV sources at http://bit.ly/3848_FaceCascade. Alternatively, you can find the file in the resources for this book. You can use the iOS Simulator to work on this recipe.
The following are the basic steps needed to accomplish the task:

1. Load the cascade file from resources into an object of the cv::CascadeClassifier class.
2. Detect faces in the loaded image.
3. Draw rectangles over the detected faces.

Let's implement the described steps:
First, open ViewController.h and add a field of the cv::CascadeClassifier type; this will be our object detector:

```objectivec
@interface ViewController : UIViewController
{
    cv::CascadeClassifier faceDetector;
}
```
All the remaining work is done in the viewDidLoad method. Please add it to your application, then run and check if Lena's face is detected successfully:

```objectivec
- (void)viewDidLoad
{
    [super viewDidLoad];

    // Load cascade classifier from the XML file
    NSString* cascadePath = [[NSBundle mainBundle]
        pathForResource:@"haarcascade_frontalface_alt2"
                 ofType:@"xml"];
    faceDetector.load([cascadePath UTF8String]);

    // Load image with face
    UIImage* image = [UIImage imageNamed:@"lena.png"];
    cv::Mat faceImage;
    UIImageToMat(image, faceImage);

    // Convert to grayscale
    cv::Mat gray;
    cvtColor(faceImage, gray, CV_BGR2GRAY);

    // Detect faces
    std::vector<cv::Rect> faces;
    faceDetector.detectMultiScale(gray, faces, 1.1, 2,
                                  0|CV_HAAR_SCALE_IMAGE, cv::Size(30, 30));

    // Draw all detected faces
    for (unsigned int i = 0; i < faces.size(); i++)
    {
        const cv::Rect& face = faces[i];
        // Get top-left and bottom-right corner points
        cv::Point tl(face.x, face.y);
        cv::Point br = tl + cv::Point(face.width, face.height);
        // Draw rectangle around the face
        cv::Scalar magenta = cv::Scalar(255, 0, 255);
        cv::rectangle(faceImage, tl, br, magenta, 4, 8, 0);
    }

    // Show resulting image
    imageView.image = MatToUIImage(faceImage);
}
```
The first steps of this example are similar to the ones from previous recipes. You should create an Xcode project, add the OpenCV framework, add a UIImageView component to the storyboard, and load an input image from the project resources. We just add some more complex OpenCV functionality to detect faces.
In the next recipes, we will discuss how to detect faces in a live video stream, but right now, let's try to do it for a static image. For this task, we use the cv::CascadeClassifier class. The Haar-based OpenCV face detector was initially proposed by Paul Viola and Michael Jones and later extended by Rainer Lienhart. It is based on Haar features and allows the detection of specific object classes. This method is the de facto standard for face detection tasks. The input XML file contains the parameters of such a classifier, trained to detect frontal faces.
To load the parameters, we need to convert the NSString object to std::string. To do this, we use the UTF8String method, which returns a null-terminated UTF-8 representation of the NSString object.
After that, we can find faces in our image with the help of the detectMultiScale method of the cv::CascadeClassifier class. This method receives the following parameters to configure the detection stage:
- scaleFactor: This specifies how much the image size is decreased at each iteration.
- minNeighbors: This specifies how many neighbors each candidate rectangle should have for it to be retained. Increasing the value of this parameter helps to reduce the number of false positives.
- CV_HAAR_SCALE_IMAGE: This is a flag that tells the algorithm to scale the image rather than the detector. It helps to achieve the best possible performance.
- minSize: This parameter specifies the minimum possible face size.

A detailed description of the function arguments can be found in the OpenCV documentation at http://bit.ly/3848_DetectMultiScale.
This function is parallelized with Grand Central Dispatch, so it will work faster on multi-core devices.
Each detected rectangle is drawn on the resulting image with the cv::rectangle function.
Now you can try to replace lena.png with your family photo or some other image with faces.
Object detection is a wide and deep subject, and we only scratched its surface in this recipe. The following will give you some pointers if you want to know more.
The iOS Core Image framework already contains a class for face detection called CIDetector, so if you only need to detect faces, it can be appropriate. But the cv::CascadeClassifier class has more options; after training, it can be used to detect any textured object (with some assumptions).
OpenCV has several trained classifiers, including frontal and profile human faces, individual facial features, silverware, and some others (more details can be found at http://bit.ly/3848_Cascades). You should check the available classifiers, as they might be useful in your future applications.
If there is no classifier for a particular type of object, you can always train your own, following the instructions found at http://bit.ly/3848_TrainCascade. But please note that training a good detector could be a challenging research task.
The cascade classifier may be too slow for real-time processing, especially on a mobile device, but there are several ways to improve the situation. First of all, note that downscaling the image may not help, because the detectMultiScale method builds a pyramid of scales depending on the minSize and maxSize parameters. But you can tweak these parameters to achieve better performance: start by increasing the value of minSize. Next, you can try to increase the scaleFactor parameter. Try values such as 1.2 or 1.3, but please note that this may negatively affect the quality of detection!
Apart from parameter tuning, you can try more radical methods. First of all, check whether an LBP-based cascade is available for your objects (http://bit.ly/3848_LBPCascades). Local Binary Patterns (LBP) features use integer arithmetic, so they are more efficient, and the detector usually works 2-3 times faster than with classic Haar features (which use floating-point calculations). Finally, you can try skipping frames in the video stream and tracking objects with optical flow between detections.