Segmentation is any process that partitions an image into multiple regions or segments. These typically correspond to meaningful regions or objects, such as a face, a car, a road, the sky, or grass. Segmentation is one of the most important stages in a computer vision system. In OpenCV, there is no specific module for segmentation, though a number of ready-to-use methods are available in other modules (most of them in imgproc). In this chapter, we will cover the most important and frequently used methods available in the library. In some cases, additional processing will have to be added to improve the results or to obtain seeds (rough segments that allow an algorithm to perform a complete segmentation). We will look at the following major segmentation methods: thresholding, contours and connected components, flood filling, watershed segmentation, and the GrabCut algorithm.
Thresholding is one of the simplest yet most useful segmentation operations. You will end up using some sort of thresholding in almost any image-processing application. We consider it a segmentation operation since it partitions an image into two regions, typically an object and its background. In OpenCV, thresholding is performed with the function double threshold(InputArray src, OutputArray dst, double thresh, double maxval, int type).
The first two parameters are the input and output images, respectively. The third parameter is the chosen threshold. The meaning of maxval depends on the type of thresholding we want to perform. The following table shows the operation performed for each type:
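Type | dst(x,y) |
---|---|
THRESH_BINARY | maxval if src(x,y) > thresh, 0 otherwise |
THRESH_BINARY_INV | 0 if src(x,y) > thresh, maxval otherwise |
THRESH_TRUNC | thresh if src(x,y) > thresh, src(x,y) otherwise |
THRESH_TOZERO | src(x,y) if src(x,y) > thresh, 0 otherwise |
THRESH_TOZERO_INV | 0 if src(x,y) > thresh, src(x,y) otherwise |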
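As a rough illustration of the five threshold types, the per-pixel rules can be sketched in plain C++ without OpenCV. The enum ThreshType and the helpers thresholdPixel and thresholdRow are hypothetical names introduced only for this sketch; they are not part of the library, and the code makes no attempt to reproduce how cv::threshold is implemented internally.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for OpenCV's THRESH_* constants (illustration only).
enum class ThreshType { Binary, BinaryInv, Trunc, ToZero, ToZeroInv };

// Apply one thresholding rule to a single 8-bit pixel value.
std::uint8_t thresholdPixel(std::uint8_t v, std::uint8_t thresh,
                            std::uint8_t maxval, ThreshType type)
{
    const std::uint8_t zero = 0;
    const bool above = v > thresh;  // strict comparison: src(x,y) > thresh
    switch (type) {
        case ThreshType::Binary:    return above ? maxval : zero;
        case ThreshType::BinaryInv: return above ? zero : maxval;
        case ThreshType::Trunc:     return above ? thresh : v;
        case ThreshType::ToZero:    return above ? v : zero;
        case ThreshType::ToZeroInv: return above ? zero : v;
    }
    assert(false && "unknown threshold type");
    return v;
}

// Apply the same rule to every pixel of a single-row image.
std::vector<std::uint8_t> thresholdRow(const std::vector<std::uint8_t>& row,
                                       std::uint8_t thresh,
                                       std::uint8_t maxval, ThreshType type)
{
    std::vector<std::uint8_t> out;
    out.reserve(row.size());
    for (std::uint8_t v : row)
        out.push_back(thresholdPixel(v, thresh, maxval, type));
    return out;
}
```

For example, with the row {0, 50, 100, 127, 128, 200, 255}, thresh = 127 and maxval = 255, the Binary rule gives {0, 0, 0, 0, 255, 255, 255}, while the Trunc rule gives {0, 50, 100, 127, 127, 127, 127}.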
While in previous OpenCV books (and the available reference manual) each type of thresholding is illustrated with the help of 1D signal plots, our experience shows that numbers and gray levels allow you to grasp the concept faster. The following table shows the effect of the different threshold types using a single-line image as an example input:
The special value THRESH_OTSU may be combined with the previous values (using the bitwise OR operator). In that case, the thresh argument is ignored and the threshold value is estimated automatically by the function (using Otsu's algorithm). The function returns the estimated threshold value.
Otsu's method obtains the threshold that best separates the background pixels from the foreground pixels (in an interclass/intraclass variance ratio sense). See the full explanation and demos at http://www.labbookpages.co.uk/software/imgProc/otsuThreshold.html.
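To make the idea behind Otsu's method more concrete, here is a minimal sketch in plain C++ that searches all 256 candidate thresholds and keeps the one that maximizes the between-class variance (which is equivalent to minimizing the within-class variance). The function name otsuThreshold is hypothetical, the input is assumed to be a flat list of 8-bit pixels, and ties are resolved in favor of the lowest candidate; the value returned by OpenCV for the same data may differ in such tie cases.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Return the threshold t in [0, 255] that maximizes the between-class
// variance when pixels <= t form one class and pixels > t form the other.
int otsuThreshold(const std::vector<std::uint8_t>& pixels)
{
    assert(!pixels.empty());

    // Gray-level histogram
    std::array<double, 256> hist{};
    for (std::uint8_t v : pixels) hist[v] += 1.0;

    const double total = static_cast<double>(pixels.size());
    double sumAll = 0.0;  // sum of all gray levels
    for (int i = 0; i < 256; ++i) sumAll += i * hist[i];

    double weightLow = 0.0;  // number of pixels in the low class
    double sumLow = 0.0;     // sum of gray levels in the low class
    double bestVariance = -1.0;
    int bestThreshold = 0;

    for (int t = 0; t < 256; ++t) {
        weightLow += hist[t];
        sumLow += t * hist[t];
        const double weightHigh = total - weightLow;
        if (weightLow == 0.0) continue;  // low class still empty
        if (weightHigh == 0.0) break;    // high class empty from here on

        const double meanLow = sumLow / weightLow;
        const double meanHigh = (sumAll - sumLow) / weightHigh;
        const double diff = meanLow - meanHigh;
        // Between-class variance, up to a constant factor of 1/total^2
        const double betweenVar = weightLow * weightHigh * diff * diff;
        if (betweenVar > bestVariance) {
            bestVariance = betweenVar;
            bestThreshold = t;
        }
    }
    return bestThreshold;
}
```

For an image containing only the gray levels 10 and 200 in equal numbers, every threshold from 10 to 199 separates the two groups equally well, and this sketch returns the lowest of them, 10.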
While the function described uses a single threshold for the whole image, adaptive thresholding estimates a different threshold for each pixel. This produces a better result when the input image is less homogeneous (with unevenly illuminated regions, for example). The function to perform adaptive thresholding is as follows:
adaptiveThreshold(InputArray src, OutputArray dst, double maxValue, int adaptiveMethod, int thresholdType, int blockSize, double C)
This function is similar to the previous one. The parameter thresholdType must be either THRESH_BINARY or THRESH_BINARY_INV. The function computes a threshold for each pixel as a weighted average of the pixels in a blockSize x blockSize neighborhood minus a constant (C); blockSize must be an odd value greater than 1. When adaptiveMethod is ADAPTIVE_THRESH_MEAN_C, the threshold computed is the mean of the neighborhood (that is, all the elements are weighted equally). When adaptiveMethod is ADAPTIVE_THRESH_GAUSSIAN_C, the pixels in the neighborhood are weighted according to a Gaussian function.
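The mean-based variant can be sketched in plain C++ as follows. This is a simplified illustration, not OpenCV's implementation: the type alias Image and the function adaptiveMeanThreshold are hypothetical names, pixels outside the image are handled by replicating the nearest edge pixel (an assumption of this sketch), and the mean is kept in floating point, so results may differ slightly from the library because of rounding.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

using Image = std::vector<std::vector<std::uint8_t>>;  // rows of 8-bit pixels

// Binary adaptive thresholding: each pixel is compared against the mean of
// its blockSize x blockSize neighborhood minus the constant C.
Image adaptiveMeanThreshold(const Image& src, std::uint8_t maxValue,
                            int blockSize, double C)
{
    assert(blockSize > 1 && blockSize % 2 == 1);  // odd and greater than 1
    assert(!src.empty() && !src[0].empty());

    const int rows = static_cast<int>(src.size());
    const int cols = static_cast<int>(src[0].size());
    const int radius = blockSize / 2;
    Image dst(rows, std::vector<std::uint8_t>(cols, 0));

    for (int y = 0; y < rows; ++y) {
        for (int x = 0; x < cols; ++x) {
            double sum = 0.0;
            for (int dy = -radius; dy <= radius; ++dy) {
                for (int dx = -radius; dx <= radius; ++dx) {
                    // Replicate edge pixels outside the image (assumption)
                    const int yy = std::min(std::max(y + dy, 0), rows - 1);
                    const int xx = std::min(std::max(x + dx, 0), cols - 1);
                    sum += src[yy][xx];
                }
            }
            const double localThreshold = sum / (blockSize * blockSize) - C;
            if (src[y][x] > localThreshold) dst[y][x] = maxValue;
        }
    }
    return dst;
}
```

Note the role of C: on a perfectly flat region, every pixel equals its neighborhood mean, so with C = 0 nothing exceeds the local threshold, whereas a positive C lowers the threshold and turns the whole flat region on.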
The following thresholding example shows how to perform thresholding operations on an image:
```cpp
#include "opencv2/opencv.hpp"
#include <iostream>

using namespace std;
using namespace cv;

Mat src, dst, adaptDst;
int threshold_value, block_size, C;

// Trackbar callback: global thresholding with the current threshold value
void thresholding( int, void* )
{
    threshold( src, dst, threshold_value, 255, THRESH_BINARY );
    imshow( "Thresholding", dst );
}

// Shared by both adaptive thresholding callbacks
void adaptThreshAndShow()
{
    adaptiveThreshold( src, adaptDst, 255, ADAPTIVE_THRESH_MEAN_C,
                       THRESH_BINARY, block_size, C );
    imshow( "Adaptive Thresholding", adaptDst );
}

// Trackbar callback for block_size
void adaptiveThresholding1( int, void* )
{
    static int prev_block_size=block_size;
    if ((block_size%2)==0) // make sure that block_size is odd
    {
        if (block_size>prev_block_size) block_size++;
        if (block_size<prev_block_size) block_size--;
    }
    if (block_size<=1) block_size=3; // check block_size min value
    prev_block_size=block_size;      // remember the value for the next call
    adaptThreshAndShow();
}

// Trackbar callback for C
void adaptiveThresholding2( int, void* )
{
    adaptThreshAndShow();
}

int main(int argc, char *argv[])
{
    // Read original image in grayscale and clone it to contain results
    src = imread("left12.jpg", IMREAD_GRAYSCALE);
    if (src.empty())
    {
        cerr << "Could not read left12.jpg" << endl;
        return -1;
    }
    dst = src.clone();
    adaptDst = src.clone();

    // Create 3 windows
    namedWindow("Source", WINDOW_AUTOSIZE);
    namedWindow("Thresholding", WINDOW_AUTOSIZE);
    namedWindow("Adaptive Thresholding", WINDOW_AUTOSIZE);
    imshow("Source", src);

    // Create trackbars
    threshold_value = 127;
    block_size = 7;
    C = 10;
    createTrackbar( "threshold", "Thresholding", &threshold_value, 255, thresholding );
    createTrackbar( "block_size", "Adaptive Thresholding", &block_size, 25, adaptiveThresholding1 );
    createTrackbar( "C", "Adaptive Thresholding", &C, 255, adaptiveThresholding2 );

    // Perform operations a first time
    thresholding(threshold_value, 0);
    adaptiveThresholding1(block_size, 0);
    adaptiveThresholding2(C, 0);

    // Position windows on screen
    moveWindow("Source", 0, 0);
    moveWindow("Thresholding", src.cols, 0);
    moveWindow("Adaptive Thresholding", 2*src.cols, 0);

    cout << "Press any key to exit... " << endl;
    waitKey(); // Wait for key press
    return 0;
}
```
The preceding example creates three windows: one with the source image, which is loaded in grayscale, and one each for the results of thresholding and adaptive thresholding. It then creates three trackbars: one associated with the thresholding result window (to handle the threshold value) and two associated with the adaptive thresholding result window (to handle the block size and the value of the constant C). Note that since two callback functions are necessary in this case, and we do not want to repeat code, the call to adaptiveThreshold is wrapped in the function adaptThreshAndShow.
Next, each callback is called once so that the operations are performed with the initial parameter values. Finally, the moveWindow function from highgui is used to reposition the windows on the screen (otherwise they would be displayed on top of each other, and only the third one would be visible). Also, note that the parity check at the start of the function adaptiveThresholding1 is needed to keep an odd value in the parameter block_size. The following screenshot shows the output of the example:
The function inRange(InputArray src, InputArray lowerb, InputArray upperb, OutputArray dst) is also useful for thresholding, as it checks whether each pixel lies between a lower and an upper threshold (both bounds are inclusive). The bounds lowerb and upperb are typically provided as Scalar values with one entry per channel, as in inRange(src, Scalar(bl,gl,rl), Scalar(bh,gh,rh), tgt);.
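The per-pixel test performed by such a range check can be sketched in plain C++ as shown next. The alias Pixel and the function inRangeMask are hypothetical names for this illustration, and a three-channel 8-bit image stored as a flat list of pixels is assumed; the sketch only mirrors the rule that a pixel is selected when every channel lies within its bounds.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Pixel = std::array<std::uint8_t, 3>;  // B, G, R channel order

// Build a mask with 255 where every channel of the pixel lies inside
// [lowerb, upperb] (both bounds inclusive) and 0 elsewhere.
std::vector<std::uint8_t> inRangeMask(const std::vector<Pixel>& src,
                                      const Pixel& lowerb,
                                      const Pixel& upperb)
{
    std::vector<std::uint8_t> mask;
    mask.reserve(src.size());
    for (const Pixel& p : src) {
        bool inside = true;
        for (std::size_t c = 0; c < p.size(); ++c)
            inside = inside && lowerb[c] <= p[c] && p[c] <= upperb[c];
        mask.push_back(inside ? 255 : 0);
    }
    assert(mask.size() == src.size());  // one mask value per input pixel
    return mask;
}
```

Because all channels must pass the test, a pixel that is in range in two channels but out of range in the third is rejected, which is what makes this kind of check convenient for selecting a color range.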