Tracking faces

The challenge in using OpenCV's Haar cascade classifiers is not just getting a tracking result; it is getting a series of sensible tracking results at a high frame rate. One kind of common sense that we can enforce is that certain tracked objects should have a hierarchical relationship, one being located relative to the other. For example, a nose should be in the middle of a face. By attempting to track both a whole face and parts of a face, we can enable application code to do more detailed manipulations and to check how good a given tracking result is. A face with a nose is a better result than one without. At the same time, we can support some optimizations, such as only looking for faces of a certain size and noses in certain places.

We are going to implement an optimized, hierarchical tracker in a class called FaceTracker, which offers a simple interface. A FaceTracker may be initialized with certain optional configuration arguments that are relevant to the tradeoff between tracking accuracy and performance. At any given time, the latest tracking results of FaceTracker are stored in a property called faces, which is a list of Face instances. Initially, this list is empty. It is refreshed via an update() method that accepts an image for the tracker to analyze. Finally, for debugging purposes, the rectangles of faces may be drawn via a drawDebugRects() method, which accepts an image as a drawing surface. Every frame, a real-time face-tracking application would call update(), read faces, and perhaps call drawDebugRects().
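
Each Face instance is simply a holder for one face's tracking results. A minimal definition, consistent with the fields that FaceTracker populates later in this section, might look like this sketch:

class Face(object):
    """Data on facial features: face, eyes, nose, mouth."""
    
    def __init__(self):
        # Each rectangle defaults to None, meaning that the
        # feature has not (yet) been found.
        self.faceRect = None
        self.leftEyeRect = None
        self.rightEyeRect = None
        self.noseRect = None
        self.mouthRect = None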

Internally, FaceTracker uses an OpenCV class called CascadeClassifier. A CascadeClassifier is initialized with a cascade data file, such as the ones that we found and copied earlier. For our purposes, the important method of CascadeClassifier is detectMultiScale(), which performs tracking that may be robust to variations in scale. The possible arguments to detectMultiScale() are:

  • image: This is an image to be analyzed. It must have 8 bits per channel.
  • scaleFactor: The factor by which the search window's size grows between two successive passes. A higher value improves performance but diminishes robustness with respect to variations in scale.
  • minNeighbors: This value is one less than the minimum number of regions that are required in a match. (A match may merge multiple neighboring regions.)
  • flags: There are several flags but not all combinations are valid. The valid standalone flags and valid combinations include:
    • cv2.cv.CV_HAAR_SCALE_IMAGE: Scales each windowed image region to match the feature data. (The default approach is the opposite: scale the feature data to match the window.) Scaling the image allows for certain optimizations on modern hardware. This flag must not be combined with others.
    • cv2.cv.CV_HAAR_DO_CANNY_PRUNING: Eagerly rejects regions that contain too many or too few edges to match the object type. This flag should not be combined with cv2.cv.CV_HAAR_FIND_BIGGEST_OBJECT.
    • cv2.cv.CV_HAAR_FIND_BIGGEST_OBJECT: Accepts, at most, one match (the biggest).
    • cv2.cv.CV_HAAR_FIND_BIGGEST_OBJECT | cv2.cv.CV_HAAR_DO_ROUGH_SEARCH: Accepts, at most, one match (the biggest) and skips some steps that would refine (shrink) the region of this match. The minNeighbors argument should be greater than 0.
  • minSize: A pair of pixel dimensions representing the minimum object size being sought. A higher value improves performance.
  • maxSize: A pair of pixel dimensions representing the maximum object size being sought. A lower value improves performance.

The return value of detectMultiScale() is a list of matches, each expressed as a rectangle in the format [x, y, w, h].
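
For example, a standalone detection pass with detectMultiScale() might look like the following sketch, where face.png is a hypothetical test image and the cascade file is one of those that we copied earlier:

import cv2

classifier = cv2.CascadeClassifier(
    'cascades/haarcascade_frontalface_alt.xml')
image = cv2.imread('face.png')
grayImage = cv2.cvtColor(image, cv2.cv.CV_BGR2GRAY)
faceRects = classifier.detectMultiScale(
    grayImage, scaleFactor=1.2, minNeighbors=2,
    flags=cv2.cv.CV_HAAR_SCALE_IMAGE, minSize=(40, 40))
for x, y, w, h in faceRects:
    print 'Face at (%d, %d) with size (%d, %d)' % (x, y, w, h)

Here, minSize=(40, 40) means that any face smaller than 40x40 pixels is ignored.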

Similarly, the initializer of FaceTracker accepts scaleFactor, minNeighbors, and flags as arguments. The given values are passed to all detectMultiScale() calls that a FaceTracker makes internally. Also during initialization, a FaceTracker creates CascadeClassifiers using face, eye, nose, and mouth data. Let's add the following imports, along with the implementation of the initializer and the faces property, to trackers.py:

import cv2

import rects
import utils


class FaceTracker(object):
    """A tracker for facial features: face, eyes, nose, mouth."""
    
    def __init__(self, scaleFactor = 1.2, minNeighbors = 2,
                 flags = cv2.cv.CV_HAAR_SCALE_IMAGE):
        
        self.scaleFactor = scaleFactor
        self.minNeighbors = minNeighbors
        self.flags = flags
        
        self._faces = []
        
        self._faceClassifier = cv2.CascadeClassifier(
            'cascades/haarcascade_frontalface_alt.xml')
        self._eyeClassifier = cv2.CascadeClassifier(
            'cascades/haarcascade_eye.xml')
        self._noseClassifier = cv2.CascadeClassifier(
            'cascades/haarcascade_mcs_nose.xml')
        self._mouthClassifier = cv2.CascadeClassifier(
            'cascades/haarcascade_mcs_mouth.xml')
    
    @property
    def faces(self):
        """The tracked facial features."""
        return self._faces

The update() method of FaceTracker first creates an equalized, grayscale variant of the given image. Equalization, as implemented in OpenCV's equalizeHist() function, normalizes an image's brightness and increases its contrast. Equalization as a preprocessing step makes our tracker more robust to variations in lighting, while conversion to grayscale improves performance. Next, we feed the preprocessed image to our face classifier. For each matching rectangle, we search certain subregions for a left and right eye, nose, and mouth. Ultimately, the matching rectangles and subrectangles are stored in Face instances in faces. For each type of tracking, we specify a minimum object size that is proportional to the image size. Our implementation of FaceTracker should continue with the following code for update():

    def update(self, image):
        """Update the tracked facial features."""
        
        self._faces = []
        
        if utils.isGray(image):
            image = cv2.equalizeHist(image)
        else:
            image = cv2.cvtColor(image, cv2.cv.CV_BGR2GRAY)
            cv2.equalizeHist(image, image)
        
        minSize = utils.widthHeightDividedBy(image, 8)
        
        faceRects = self._faceClassifier.detectMultiScale(
            image, self.scaleFactor, self.minNeighbors, self.flags,
            minSize)
        
        if faceRects is not None:
            for faceRect in faceRects:
                
                face = Face()
                face.faceRect = faceRect
                
                x, y, w, h = faceRect
                
                # Seek an eye in the upper-left part of the face.
                searchRect = (x+w/7, y, w*2/7, h/2)
                face.leftEyeRect = self._detectOneObject(
                    self._eyeClassifier, image, searchRect, 64)
                
                # Seek an eye in the upper-right part of the face.
                searchRect = (x+w*4/7, y, w*2/7, h/2)
                face.rightEyeRect = self._detectOneObject(
                    self._eyeClassifier, image, searchRect, 64)
                
                # Seek a nose in the middle part of the face.
                searchRect = (x+w/4, y+h/4, w/2, h/2)
                face.noseRect = self._detectOneObject(
                    self._noseClassifier, image, searchRect, 32)
                
                # Seek a mouth in the lower-middle part of the face.
                searchRect = (x+w/6, y+h*2/3, w*2/3, h/3)
                face.mouthRect = self._detectOneObject(
                    self._mouthClassifier, image, searchRect, 16)
                
                self._faces.append(face)
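
Note that update() relies on utils.isGray() and utils.widthHeightDividedBy(), both implemented earlier in this chapter. For reference, minimal versions of these helpers, assuming NumPy-backed images as used by cv2, might look like this sketch:

def isGray(image):
    """Return True if the image has one channel per pixel."""
    return image.ndim < 3

def widthHeightDividedBy(image, divisor):
    """Return an image's dimensions, each divided by a divisor."""
    h, w = image.shape[:2]
    return (w/divisor, h/divisor)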

Also, update() relies on a private helper method, _detectOneObject(), which is called several times in order to handle the repetitious work of tracking several subparts of the face. As arguments, _detectOneObject() requires a classifier, an image, a rectangle, and a minimum object size. The rectangle is the image subregion that the given classifier should search. For example, the nose classifier should search the middle of the face. Limiting the search area improves performance and helps to eliminate false positives. Internally, _detectOneObject() works by running the classifier on a slice of the image and returning the first match (or None if there are no matches). This approach works whether or not we are using the cv2.cv.CV_HAAR_FIND_BIGGEST_OBJECT flag. Our implementation of FaceTracker should continue with the following code for _detectOneObject():

    def _detectOneObject(self, classifier, image, rect,
                         imageSizeToMinSizeRatio):
        
        x, y, w, h = rect
        
        minSize = utils.widthHeightDividedBy(
            image, imageSizeToMinSizeRatio)
        
        subImage = image[y:y+h, x:x+w]
        
        subRects = classifier.detectMultiScale(
            subImage, self.scaleFactor, self.minNeighbors,
            self.flags, minSize)
        
        if len(subRects) == 0:
            return None
        
        subX, subY, subW, subH = subRects[0]
        return (x+subX, y+subY, subW, subH)

Lastly, FaceTracker should offer basic drawing functionality so that its tracking results can be displayed for debugging purposes. The following method implementation simply defines colors, iterates over the Face instances, and draws the rectangles of each Face onto a given image using our rects.outlineRect() function:

    def drawDebugRects(self, image):
        """Draw rectangles around the tracked facial features."""
        
        if utils.isGray(image):
            faceColor = 255
            leftEyeColor = 255
            rightEyeColor = 255
            noseColor = 255
            mouthColor = 255
        else:
            faceColor = (255, 255, 255) # white
            leftEyeColor = (0, 0, 255) # red
            rightEyeColor = (0, 255, 255) # yellow
            noseColor = (0, 255, 0) # green
            mouthColor = (255, 0, 0) # blue
        
        for face in self.faces:
            rects.outlineRect(image, face.faceRect, faceColor)
            rects.outlineRect(image, face.leftEyeRect, leftEyeColor)
            rects.outlineRect(image, face.rightEyeRect,
                              rightEyeColor)
            rects.outlineRect(image, face.noseRect, noseColor)
            rects.outlineRect(image, face.mouthRect, mouthColor)
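
As with the utils helpers, rects.outlineRect() was implemented earlier in this chapter. A minimal version, consistent with its use here, might look like this sketch. Note that it must tolerate a rectangle of None, since any of a Face's feature rectangles may be missing:

import cv2

def outlineRect(image, rect, color):
    """Draw the outline of a rectangle, unless it is None."""
    if rect is None:
        return
    x, y, w, h = rect
    cv2.rectangle(image, (x, y), (x+w, y+h), color)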

Now, we have a high-level tracker that hides the details of Haar cascade classifiers while allowing application code to supply new images, fetch data about tracking results, and ask for debug drawing.
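
For example, a minimal test script might capture frames from a webcam, update the tracker, and display the results. The window name and the use of camera device 0 in the following sketch are arbitrary choices for illustration:

import cv2

from trackers import FaceTracker

tracker = FaceTracker()
capture = cv2.VideoCapture(0)
while True:
    success, frame = capture.read()
    if not success:
        break
    # Track faces in the current frame and draw the results onto it.
    tracker.update(frame)
    tracker.drawDebugRects(frame)
    cv2.imshow('FaceTracker', frame)
    if cv2.waitKey(1) == 27: # Escape
        break
capture.release()
cv2.destroyAllWindows()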
