Searching for objects in an image

To make your project or installation respond to user actions using camera data, you often need to search for, recognize, and track objects in the live video from the camera. In this section, we will cover only searching for objects.

There are several methods to search a specific object in an image:

  • Template matching is used for searching for objects of a fixed shape. The image is scanned with a reference "ideal" template of the object of interest to find the best match.
  • Contour analysis (also called geometric matching) is used for searching for objects captured by the camera on a simple background (such as parts on a conveyor line). It is based on computing and analyzing the edges of the objects in the image, searching for edge configurations that match the objects of interest.

    This method can work with complex and overlapping objects but is sensitive to a number of conditions; for example, the object should have distinct edges and there should not be too many edges in the whole image.

  • Algorithms based on machine learning are used for searching for complex objects in complex and outdoor scenes. These algorithms use a number of image features (such as sums of pixel values over rectangles or local statistics of edge orientations) together with machine learning algorithms (such as boosting or SVM).

    This is the most advanced and robust technique available today. The most famous algorithms are the Viola-Jones algorithm for detecting frontal human faces and the HOG method (Histogram of Oriented Gradients) for detecting pedestrians, cars, bicycles, animals, and other outdoor objects.
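As a quick illustration of the first of these techniques, template matching can be sketched in a few lines of plain C++. The matchTemplateSSD() function and the row-major image layout here are our own toy constructs, not part of ofxOpenCv or OpenCV: the image is scanned, and the position minimizing the sum of squared differences (SSD) against the template wins.

```cpp
#include <vector>
#include <utility>
#include <climits>
#include <cassert>

// Toy template matching: slide a small grayscale template over a larger
// grayscale image (both row-major vectors) and return the top-left corner
// of the window with the smallest sum of squared differences (SSD).
std::pair<int,int> matchTemplateSSD(const std::vector<int>& img, int iw, int ih,
                                    const std::vector<int>& tpl, int tw, int th) {
    long best = LONG_MAX;
    std::pair<int,int> bestPos(0, 0);
    for (int y = 0; y + th <= ih; ++y) {
        for (int x = 0; x + tw <= iw; ++x) {
            long ssd = 0;
            for (int ty = 0; ty < th; ++ty) {
                for (int tx = 0; tx < tw; ++tx) {
                    long d = img[(y + ty) * iw + (x + tx)] - tpl[ty * tw + tx];
                    ssd += d * d;    // accumulate squared difference
                }
            }
            if (ssd < best) { best = ssd; bestPos = {x, y}; }
        }
    }
    return bestPos;
}
```

Real systems (including OpenCV's own template matching) use normalized correlation measures that tolerate lighting changes, but the scanning idea is the same.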

We will consider a simple example of object searching with contour analysis. Here we will not search for specific objects but will find all the bright objects in the image together with their centers. To achieve this, we will find contours and treat the pixels inside each distinct contour as an object. Finding contours is provided by the ofxCvContourFinder class, which we will discuss now.

Using the ofxCvContourFinder class for finding contours

The ofxCvContourFinder class is used for finding the contours that bound connected white regions in a binary image. These regions are called blobs. The typical usage of ofxCvContourFinder is as follows:

  1. Declare an object of the class: ofxCvContourFinder contourFinder;.
  2. Find the contours using the following line of code:
    contourFinder.findContours( mask, minArea, maxArea, maxNumber, findHoles );

    Here mask is a binary image. The parameters minArea and maxArea set the allowed range for the number of pixels in a blob, so that blobs that are too small or too large are rejected.

    Also, maxNumber is an upper limit on the number of resulting blobs, and the findHoles flag controls the hole-searching mode. If it is false, black holes inside blobs are simply ignored; otherwise, the holes are regarded as blobs too.

    For example, contourFinder.findContours( mask, 10, 10000, 20, false ); searches for at most 20 white blobs containing 10 to 10000 pixels each, with holes ignored.

  3. Use the found contours via the contourFinder.blobs array. The number of blobs is given by contourFinder.blobs.size(). Each contourFinder.blobs[i] blob has the following members:
    • area – The number of pixels in the blob
    • length – The perimeter of the blob's contour
    • boundingRect – The bounding rectangle of the blob
    • centroid – The point of the blob's center of mass
    • hole – A boolean value that is true if the blob is actually a hole inside another blob
    • pts – The point array of the blob's contour
    • nPts – The number of points in the contour

      Tip

      Note that the list of blobs is calculated independently for each frame, so you should not assume that blobs with the same index i correspond to the same physical blob on successive video frames. If you need persistent blob identifiers, you have to implement them in your own algorithm; for example, you can assign IDs to the blobs in the current frame by taking the ID of the nearest blob in the previous frame.

  4. Draw the contours using the contourFinder.draw( x, y, w, h ) function. Note that the function uses its own internal colors and draws each blob's contour line and bounding box.

An example for searching bright objects in video

Let's consider an example that implements the full processing pipeline from the input image image to the list of object centers obj.

Note

This is example 09-OpenCV/04-SearchingObjects.

Use the Project Generator wizard to create an empty project with the linked ofxOpenCv addon (see the Using ofxOpenCv section). Then, copy the fruits.mov movie into the bin/data folder of the project, and copy the sources of the example to the src folder.

Here, we will consider just the part of the code related to searching for objects.

Assume that the scene has a dark background and the objects are brighter than the background. Then, the processing steps inside the update() function will be the following:

  1. Get a frame from the camera or movie and convert it into the ofxCvColorImage image as follows:
    image.setFromPixels( video.getPixelsRef() );
  2. Decimate the image to speed up processing. We use a separate image, imageDecimated, to store the decimated result, and allocate it at the first iteration as follows:
    if ( !imageDecimated.bAllocated ) {
      imageDecimated.allocate( image.width * 0.5,
                               image.height * 0.5 );
    }
    imageDecimated.scaleIntoMe( image, CV_INTER_NN );
  3. Convert the image to a grayscale image ofxCvGrayscaleImage grayImage:
    grayImage = imageDecimated;
  4. Perform smoothing to suppress noise as follows:
    blurred = grayImage;
    blurred.blurGaussian( 9 );
  5. Store the first frame of the movie in the background image. We will assume that this frame is the "true" background image:
    if ( !background.bAllocated ) {
      background = blurred;
    }
  6. Find the difference between the current blurred image and the stored background image. Note that we use the plain difference and not the absolute difference (absDiff()), because we assume that the objects are brighter than the background.
    diff = blurred;
    diff -= background;
  7. Perform thresholding to obtain a binary image whose white regions correspond to the bright objects in the original image.
    mask = diff;
    mask.threshold( 40 );

    Here, the value 40 is the threshold parameter; it should be adjusted to get good results when using videos other than the one in this example.

  8. Find the contours of the objects using the contourFinder object of type ofxCvContourFinder.
    contourFinder.findContours( mask, 10, 10000, 20, false );

    This function searches for at most 20 white blobs containing 10 to 10000 pixels each, with holes ignored.

  9. Collect all the blobs' centers into an array of points obj, namely, vector<ofPoint> obj. To shorten the code, we use a reference blobs to the blob list:
    vector<ofxCvBlob>  &blobs = contourFinder.blobs;
    int n = blobs.size();     //Get number of blobs
    obj.resize( n );          //Resize obj array
    for (int i=0; i<n; i++) {
      obj[i] = blobs[i].centroid;  //Fill obj array
    }
  10. Draw the original image on the screen and mark the objects found.
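The core of steps 6 to 9 can be sketched without openFrameworks at all. In the following plain C++ toy, our own findBrightObjects() helper (with row-major vectors standing in for the ofxOpenCv image classes) subtracts the background, thresholds the difference, and computes the centroid of each 4-connected white region:

```cpp
#include <vector>
#include <queue>
#include <cassert>

// A minimal stand-in for a 2D centroid.
struct Pt { float x, y; };

// Subtract a stored background from the current frame (negative values
// act as zero, mirroring the saturated subtraction of image classes),
// threshold, then return the centroid of each 4-connected white blob.
std::vector<Pt> findBrightObjects(const std::vector<int>& frame,
                                  const std::vector<int>& background,
                                  int w, int h, int thresh) {
    std::vector<int> mask(w * h, 0);
    for (int i = 0; i < w * h; ++i) {
        int d = frame[i] - background[i];   // difference, not absDiff
        mask[i] = (d > thresh) ? 1 : 0;     // thresholding
    }
    std::vector<Pt> centers;
    std::vector<char> seen(w * h, 0);
    for (int i = 0; i < w * h; ++i) {
        if (!mask[i] || seen[i]) continue;
        // Flood-fill one connected white region (a blob) with BFS,
        // accumulating coordinate sums for the centroid.
        long sx = 0, sy = 0, n = 0;
        std::queue<int> q;
        q.push(i); seen[i] = 1;
        while (!q.empty()) {
            int p = q.front(); q.pop();
            int x = p % w, y = p / w;
            sx += x; sy += y; ++n;
            const int dx[] = {1, -1, 0, 0}, dy[] = {0, 0, 1, -1};
            for (int k = 0; k < 4; ++k) {
                int nx = x + dx[k], ny = y + dy[k];
                if (nx < 0 || ny < 0 || nx >= w || ny >= h) continue;
                int np = ny * w + nx;
                if (mask[np] && !seen[np]) { seen[np] = 1; q.push(np); }
            }
        }
        centers.push_back({ (float)sx / n, (float)sy / n });
    }
    return centers;
}
```

In the real example, the flood fill and centroid computation are performed internally by contourFinder.findContours() and the blobs' centroid members.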

Compile and run the project. You will see an animation of four images showing the processing of a video with rolling fruits. The first image is a frame from the video, decimated by 50 percent. The second image is the difference diff between the smoothed and background images. The third image is the thresholded image mask with contours drawn over it. Finally, the last image is the original (decimated) frame with crosses marking the objects found.

An example for searching bright objects in video

Note that the difference image does not contain the bright spot that existed at the bottom-left corner of the original image, because this spot is included in the background image and hence was subtracted.

Press 2 and you will see an example of using the object centers obj for generating images.

An example for searching bright objects in video

This is the original image with some white lines drawn over it. These lines depict the isolines of some function f(x,y), which depends on the obj array. Each object's center obj[i] adds a cone centered at obj[i] to the function. See a detailed description of this function in the generateImg() function in the full example's code. To return to the processing screen, press 1.
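A plausible toy version of such a cone-sum function can be sketched as follows. The exact formula is defined in the example's generateImg() function, so the function f() below, with its radius parameter R, is only an illustrative assumption:

```cpp
#include <vector>
#include <cmath>
#include <algorithm>
#include <cassert>

// A minimal stand-in for a 2D object center.
struct P { float x, y; };

// Toy function built from the object centers: each center contributes a
// cone max(0, R - distance), so f peaks at the centers and decays
// linearly to zero at radius R. Isolines of f are what gets drawn.
float f(float x, float y, const std::vector<P>& obj, float R) {
    float sum = 0;
    for (const P& c : obj) {
        float d = std::sqrt((x - c.x) * (x - c.x) + (y - c.y) * (y - c.y));
        sum += std::max(0.0f, R - d);   // one cone per object center
    }
    return sum;
}
```

Drawing the isolines then amounts to evaluating f over the image grid and marking pixels where f crosses chosen level values.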

In this example, we were interested in finding all the objects irrespective of their shape. Nevertheless, such a simple example can be used in a wide range of projects for detecting objects that appear as bright spots. In particular, it is useful in the following situations:

  • Interactive installation with sticks: You prepare some glowing objects, such as sticks with LEDs on their ends, and give them to users. The users wave the sticks in front of the installation's camera, and the installation responds accordingly.
  • Interactive pool installation: It's based on using an infrared light source (such as an IR projector) and a camera that senses IR light of the corresponding wavelength (for example, a Sony PS3-Eye camera equipped with a special IR filter). The infrared light illuminates the pool table with balls, the infrared camera detects the balls' coordinates, and your installation uses this data to draw visuals back onto the table with a projector. This method works because the IR light and the visible light of the projector do not interfere.
  • Interactive floor installation: Here, the infrared light illuminates the floor, the infrared camera detects objects such as humans walking on the floor, and the projector draws the corresponding game events on the floor.

Note that for accurately detecting whether a foot is standing on the floor, it is often better to use depth cameras; see how to do this in the Creating interactive surface section in Chapter 10, Using Depth Cameras. Though depth cameras are much simpler to adjust and use, they have a limited tracking range, so three or more depth cameras are needed for tracking areas of 10×10 meters, whereas just one or two ordinary cameras can be enough.

Although the ofxOpenCv addon provides a handy interface for basic filtering and geometrical transformations of images, it is a very small part of OpenCV. So now we will learn how to use other OpenCV functions in your projects.
