Chapter 7. Functors in OpenCV

Objects That “Do Stuff”

As the OpenCV library has evolved, it has become increasingly common to introduce new objects that encapsulate functionality that is too complicated to be associated with a single function and which, if implemented as a set of functions, would cause the overall function space of the library to become too cluttered.1

As a result, new functionality is often represented by an associated new object type, which can be thought of as a “machine” that does whatever this functionality is. Most of these machines have an overloaded operator(), which officially makes them function objects or functors. If you are not familiar with this programming idiom, the important idea is that unlike “normal” functions, function objects are created and can maintain state information inside them. As a result, they can be set up with whatever data or configuration they need, and they are “asked” to perform services through either common member functions, or by being called as functions themselves (usually via the overloaded operator()2).

Principal Component Analysis (cv::PCA)

Principal component analysis, illustrated in Figure 7-1, is the process of analyzing  a distribution in many dimensions and extracting from that distribution the particular subset of dimensions that carry the most information. The dimensions computed by PCA are not necessarily the basis dimensions in which the distribution was originally specified. Indeed, one of the most important aspects of PCA is the ability to generate a new basis whose axes can be ordered by their importance.3 These basis vectors will turn out to be the eigenvectors of the covariance matrix for the distribution as a whole, and the corresponding eigenvalues will tell us about the extent of the distribution in that dimension.

Figure 7-1. (a) Input data is characterized by a Gaussian approximation; (b) the data is projected into the space implied by the eigenvectors of the covariance of that approximation; (c) the data is projected by the KLT projection to a space defined only by the most “useful” of the eigenvectors; superimposed: a new data point (the white diamond) is projected to the reduced dimension space by cv::PCA::project(); that same point is brought back to the original space (the black diamond) by cv::PCA::backProject()

We are now in a position to explain why PCA is handled by one of these function objects. Given a distribution once, the PCA object can compute and retain this new basis. The big advantage of the new basis is that the basis vectors that correspond to the large eigenvalues carry most of the information about the objects in the distribution. Thus, without losing much accuracy, we can throw away the less informative dimensions. This dimension reduction is called a KLT transform.4 Once you have loaded a sample distribution and the principal components are computed, you might want to use that information to do various things, such as apply the KLT transform to new vectors. When you make the PCA functionality a function object, it can “remember” what it needs to know about the distribution you gave it, and thereafter use that information to provide the “service” of transforming new vectors on demand.

cv::PCA::PCA()

PCA::PCA();
PCA::PCA(
  cv::InputArray data,                    // Data, as rows or cols in 2d array
  cv::InputArray mean,                    // average, if known, 1-by-n or n-by-1
  int            flags,                   // Are vectors rows or cols of 'data'
  int            maxComponents = 0        // Max dimensions to retain
);

The PCA object has a default constructor, cv::PCA(), which simply builds the PCA object and initializes the empty structure. The second form executes the default construction, then immediately proceeds to pass its arguments to PCA::operator()() (discussed next). 

cv::PCA::operator()()

PCA::operator()(
  cv::InputArray data,                    // Data, as rows or cols in 2d array
  cv::InputArray mean,                    // average, if known, 1-by-n or n-by-1
  int            flags,                   // Are vectors rows or cols of 'data'
  int            maxComponents = 0        // Max dimensions to retain
);

The overloaded operator()() for PCA builds the model of the distribution inside of the PCA object. The data argument is an array containing all of the samples that constitute the distribution. Optionally, mean, a second array that contains the mean value in each dimension, can be supplied (mean can either be 1 × D or D × 1). The data can be arranged as an n × D (n rows of samples, each of D dimensions) or D × n array (n columns of samples, each of D dimensions). The flags argument is currently used only to specify the arrangement of the data in data and mean. In particular, flags can be set to either cv::PCA::DATA_AS_ROW or cv::PCA::DATA_AS_COL, to indicate that either data is n × D and mean is 1 × D, or data is D × n and mean is D × 1, respectively. The final argument, maxComponents, specifies the maximum number of components (dimensions) that PCA should retain. By default, all of the components are retained.
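
As a concrete illustration, here is a minimal sketch (the data itself is made up) that builds a PCA object from 100 five-dimensional samples stored as rows, lets the object compute the mean itself by passing an empty array, and keeps only the three most significant components:

cv::Mat data( 100, 5, CV_32F );                  // 100 samples, 5 dimensions each
cv::randu( data, 0.0, 1.0 );                     // stand-in data to analyze

cv::PCA pca;
pca( data, cv::Mat(), cv::PCA::DATA_AS_ROW, 3 ); // compute mean, keep 3 components

// The results are stored inside the object:
//   pca.mean          1 x 5 mean vector
//   pca.eigenvectors  3 x 5, one basis vector per row
//   pca.eigenvalues   3 x 1, one eigenvalue per retained component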

Note

Any subsequent call to cv::PCA::operator()() will overwrite the internal representations of the eigenvectors and eigenvalues, so you can recycle a PCA object whenever you need to (i.e., you don’t have to reallocate a new one for each new distribution you want to handle if you no longer need the information about the previous distribution).

cv::PCA::project()

cv::Mat PCA::project(                     // Return results, as a 2d matrix
  cv::InputArray  vec                     // points to project, rows or cols, 2d
) const;

void PCA::project(
  cv::InputArray  vec,                    // points to project, rows or cols, 2d
  cv::OutputArray result                  // Result of projection, reduced space
) const;

Once you have loaded your reference distribution with cv::PCA::operator()(), you can start asking the PCA object to do useful things for you like compute the KLT projection of some set of vectors onto the basis vectors computed by the principal component analysis. The cv::PCA::project() function has two forms; the first returns a matrix containing the results of the projections, while the second writes the results to a matrix you provide. The advantage of the first form is that you can use it in matrix expressions.

The vec argument contains the input vectors. vec is required to have the same number of dimensions and the same “orientation” as the data array that was passed to PCA when the distribution was first analyzed (i.e., if your data was columns when you called cv::PCA::operator()(), vec should also have the data arranged into columns).

The returned array will have the same number of objects as vec with the same orientation, but the dimensionality of each object will be whatever was passed to maxComponents when the PCA object was first configured with cv::PCA::operator()().
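
Continuing the preceding sketch (the pca object, its row orientation, and the three retained components are carried over from that example), a new five-dimensional sample is projected into the reduced space like this:

cv::Mat sample( 1, 5, CV_32F );              // one new vector, same orientation
cv::randu( sample, 0.0, 1.0 );

cv::Mat projected = pca.project( sample );   // 1 x 3 result, usable in expressions

cv::Mat projected2;                          // or, with the second form,
pca.project( sample, projected2 );           // write into a matrix you supply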

cv::PCA::backProject()

cv::Mat PCA::backProject(             // Return results, as a 2d matrix
  cv::InputArray  vec                 // Result of projection, reduced space
) const;

void PCA::backProject(
  cv::InputArray  vec,                // Result of projection, reduced space
  cv::OutputArray result              // "reconstructed" vectors, full dimension
) const;

The cv::PCA::backProject() function performs the reverse operation of cv::PCA::project(), with the analogous restrictions on the input and output arrays. The vec argument contains the input vectors, which this time are from the projected space. They will have the same number of dimensions as you specified with maxComponents when you configured the PCA object and the same “orientation” as the data array that was passed to PCA when the distribution was first analyzed (i.e., if your data was columns when you called cv::PCA::operator(), vec should also have the data arranged into columns).

The returned array will have the same number of objects as vec with the same orientation, but the dimensionality of each object will be the dimensionality of the original data you gave to the PCA object when the PCA object was first configured with cv::PCA::operator()().
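
A quick round trip, continuing the same running example, makes the relationship between the two functions concrete; because only three of the five components were retained, the reconstruction is close to, but not exactly equal to, the original sample:

cv::Mat reconstructed = pca.backProject( projected );    // back to 1 x 5

double err = cv::norm( sample, reconstructed, cv::NORM_L2 );
cout << "reconstruction error: " << err << endl;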

Note

If you did not retain all of the dimensions when you configured the PCA object in the beginning, the result of back-projecting vectors—which are themselves projections of some vector x from the original data space—will not be exactly equal to x. Of course, the difference should be small, even if the number of components retained was much smaller than the original dimension of x, as this is the point of using PCA in the first place.

Singular Value Decomposition (cv::SVD)

The cv::SVD class is similar to cv::PCA in that it is the same kind of function object. Its purpose, however, is quite different. The singular value decomposition is essentially a tool for working with nonsquare, ill-conditioned, or otherwise poorly behaved matrices, such as those you encounter when solving underdetermined linear systems.

Mathematically, the singular value decomposition (SVD) is the decomposition of an m × n matrix A into the form:

A = U W V^T

where W is a diagonal matrix and U and V are m × m and n × n (unitary) matrices, respectively. Of course, the matrix W is also an m × n matrix, so here “diagonal” means that any element whose row and column numbers are not equal is necessarily 0.

cv::SVD()

SVD::SVD();
SVD::SVD(
  cv::InputArray A,                     // Linear system, array to be decomposed
  int            flags = 0              // what to construct, can A be scratch
);

The SVD object has a default constructor, cv::SVD(), which simply builds the SVD object and initializes the empty structure. The second form basically executes the default construction, then immediately proceeds to pass its arguments to cv::SVD::operator()() (discussed next). 

cv::SVD::operator()()

SVD& SVD::operator()(
  cv::InputArray A,                     // Linear system, array to be decomposed
  int            flags = 0              // what to construct, can A be scratch
);

The operator cv::SVD::operator()() passes to the cv::SVD object the matrix to be decomposed. The matrix A, as described earlier, is decomposed into a matrix U, a matrix V (actually the transpose of V, which we will call Vt), and a set of singular values (which are the diagonal elements of the matrix W).

The flags can be any one of cv::SVD::MODIFY_A, cv::SVD::NO_UV, or cv::SVD::FULL_UV. The latter two are mutually exclusive, but either can be combined with the first. The flag cv::SVD::MODIFY_A indicates that it is OK to modify the matrix A when computing. This speeds up computation a bit and saves some memory. It is more important when the input matrix is already very large. The flag cv::SVD::NO_UV tells cv::SVD to not explicitly compute the matrices U and Vt, while the flag cv::SVD::FULL_UV indicates that not only would you like U and Vt computed, but that you would also like them to be represented as full-size square orthogonal matrices.
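
Here is a minimal sketch (with a small made-up matrix) showing the functor in use; once cv::SVD::operator()() has run, the components are available as the members u, w, and vt of the object:

cv::Mat A( 4, 3, CV_64F );
cv::randu( A, -1.0, 1.0 );

cv::SVD svd;
svd( A );                            // default flags; A is left untouched

// svd.u : 4 x 3,  svd.w : 3 x 1 (the singular values),  svd.vt : 3 x 3
cv::Mat A_again = svd.u * cv::Mat::diag( svd.w ) * svd.vt;    // reconstructs A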

cv::SVD::compute()

void SVD::compute(
  cv::InputArray  A,                // Linear system, array to be decomposed
  cv::OutputArray W,                // Output array 'W', singular values
  cv::OutputArray U,                // Output array 'U', left singular vectors
  cv::OutputArray Vt,               // Output array 'Vt', right singular vectors
  int             flags = 0         // what to construct, and if A can be scratch
);

This function is an alternative to using cv::SVD::operator()() to decompose the matrix A. The primary difference is that the matrices W, U, and Vt are stored in the user-supplied arrays, rather than being kept internally. The flags supported are exactly those supported by cv::SVD::operator()().
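
For example, the same decomposition as in the previous sketch, but with the results written into arrays you supply:

cv::Mat W, U, Vt;
cv::SVD::compute( A, W, U, Vt );     // A from the previous sketch; flags as before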

cv::SVD::solveZ()

void SVD::solveZ(
  cv::InputArray  A,                // Linear system, array to be decomposed
  cv::OutputArray z                 // One possible solution (unit length)
);

Given an underdetermined (singular) linear system, cv::SVD::solveZ() will (attempt to) find a unit-length solution z of A · z = 0 and place the solution in the array z. Because the linear system is singular, however, it may have no solution, or it may have an infinite family of solutions. cv::SVD::solveZ() will find a solution, if one exists. If no solution exists, then the return value will be a vector that minimizes ||A · z||, even if this is not, in fact, zero.
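
A minimal sketch: the matrix below is singular by construction (its second row is twice its first), so a nontrivial unit-length solution exists and cv::SVD::solveZ() will find it:

cv::Mat A = ( cv::Mat_<double>(3,3) <<
    1, 2, 3,
    2, 4, 6,                         // twice the first row, so A is singular
    0, 1, 1 );

cv::Mat z;
cv::SVD::solveZ( A, z );             // z is 3 x 1, unit length, with A*z ~ 0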

cv::SVD::backSubst()

void SVD::backSubst(
  cv::InputArray  b,                // Righthand side of linear system
  cv::OutputArray x                 // Found solution to linear system
);

void SVD::backSubst(
  cv::InputArray  W,                // Output array 'W', singular values
  cv::InputArray  U,                // Output array 'U', left singular vectors
  cv::InputArray  Vt,               // Output array 'Vt', right singular vectors
  cv::InputArray  b,                // Righthand side of linear system
  cv::OutputArray x                 // Found solution to linear system
);

Assuming that the matrix A has been previously passed to the cv::SVD object (and thus decomposed into U, W, and Vt), the first form of cv::SVD::backSubst() attempts to solve the system:

A · x = b

The second form does the same thing, but expects the matrices W, U, and Vt to be passed to it as arguments. The actual method of computing x is to evaluate the following expression:

x = V · diag(W)^–1 · U^T · b

This method produces a pseudosolution for an overdetermined system, which is the best solution in the sense of minimizing the least-squares error.5 Of course, it will also exactly solve a correctly determined linear system.
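
As an illustration (the overdetermined system below is made up), decompose once and then back-substitute; the same decomposition could be reused for any number of additional righthand sides:

cv::Mat A = ( cv::Mat_<double>(3,2) << 1, 1,
                                       1, 2,
                                       1, 3 );   // 3 equations, 2 unknowns
cv::Mat b = ( cv::Mat_<double>(3,1) << 6, 0, 0 );

cv::SVD svd( A );                                // decompose A once
cv::Mat x;
svd.backSubst( b, x );                           // least-squares solution, 2 x 1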

Note

In practice, it is relatively rare that you would want to use cv::SVD::backSubst() directly. This is because you can do precisely the same thing by calling cv::solve() and passing the cv::DECOMP_SVD method flag—which is a lot easier. Only in the less common case in which you need to solve many different systems that share the same lefthand side A, but have different righthand sides b, would you be better off calling cv::SVD::backSubst() directly; the decomposition of A need be computed only once, whereas repeated calls to cv::solve() would redo it every time.

Random Number Generator (cv::RNG)

A random number generator (RNG) object holds the state of a pseudorandom sequence from which random numbers are generated. The benefit of using it is that you can conveniently maintain multiple streams of pseudorandom numbers.

Note

When programming large systems, it is a good practice to use separate random number streams in different modules of the code. This way, removing one module does not change the behavior of the streams in the other modules.

Once created, the random number generator provides the “service” of generating random numbers on demand, drawn from either a uniform or a Gaussian distribution. The generator uses the Multiply with Carry (MWC) algorithm [Goresky03] for uniform distributions and the Ziggurat algorithm [Marsaglia00] for the generation of numbers from a Gaussian distribution.

cv::theRNG()

cv::RNG& theRNG( void );                  // Return a random number generator

The cv::theRNG() function returns the default random number generator for the thread from which it was called. OpenCV automatically creates one instance of cv::RNG for each thread in execution. This is the same random number generator that is implicitly accessed by functions like cv::randu() or cv::randn(). Those functions are convenient if you want a single number or to initialize a single array. However, if you have a loop of your own that needs to generate a lot of random numbers, you are better off grabbing a reference to a random number generator—in this case, the default generator, but you could use your own instead—and using RNG::operator T() to get your random numbers (more on that operator shortly).
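
For example, a minimal sketch of that pattern—grab the thread's default generator once, then call it repeatedly inside your own loop:

cv::RNG& rng = cv::theRNG();

cv::Mat samples( 1, 1000, CV_32F );
for( int i = 0; i < samples.cols; i++ ) {
  samples.at<float>( 0, i ) = (float)rng;        // uniform in [0.0, 1.0)
}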

cv::RNG()

cv::RNG::RNG( void );
cv::RNG::RNG( uint64 state );             // create using the seed 'state'

You can create an RNG object with either the default constructor, or by passing it a 64-bit unsigned integer that it will use as the seed of the random number sequence. If you call the default constructor (or pass 0 to the second variation) the generator will initialize with a standardized value.6

cv::RNG::operator T(), where T is your favorite type

cv::RNG::operator uchar();
cv::RNG::operator schar();
cv::RNG::operator ushort();
cv::RNG::operator short int();
cv::RNG::operator int();
cv::RNG::operator unsigned();
cv::RNG::operator float();
cv::RNG::operator double();

cv::RNG::operator T() is really a set of different methods that return a new random number from cv::RNG of some specific type. Each of these is an overloaded cast operator, so in effect you cast the RNG object to whatever type you want, as shown in Example 7-1. (The style of the cast operation is up to you; this example shows both the int(x) and the (int)x forms.)

Example 7-1. Using the default random number generator to generate a pair of integers and a pair of floating-point numbers
cv::RNG rng = cv::theRNG();
cout << "An integer:      " << (int)rng   << endl;
cout << "Another integer: " << int(rng)   << endl;
cout << "A float:         " << (float)rng << endl;
cout << "Another float:   " << float(rng) << endl;

When integer types are generated, they will be generated (using the MWC algorithm described earlier and thus uniformly) across the entire range of available values. When floating-point types are generated, they will always be in the interval [0.0, 1.0).7

cv::RNG::operator()

unsigned int cv::RNG::operator()();       // Return random value from 0-UINT_MAX
unsigned int cv::RNG::operator()( unsigned int N );  // Return value from 0-(N-1)

When you are generating integer random numbers, the overloaded operator()() allows a convenient way to just grab another one. In essence, calling my_rng() is equivalent to calling (unsigned int)my_rng. The somewhat-more-interesting form of cv::RNG::operator()() takes an integer argument N. This form returns (using the MWC algorithm described earlier and thus uniformly) a random unsigned integer modulo N. Thus, the range of integers returned by my_rng( N ) is then the range of integers from 0 to N-1.
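
For instance, a toy example that simulates a six-sided die:

cv::RNG rng;

unsigned int raw  = rng();           // uniform over the full range 0..UINT_MAX
unsigned int roll = rng( 6 ) + 1;    // rng(6) is uniform over 0..5, so roll is 1..6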

cv::RNG::uniform()

int    cv::RNG::uniform( int a,    int b    );     // Return value from a-(b-1)
float  cv::RNG::uniform( float a,  float b  );     // Return value in range [a,b)
double cv::RNG::uniform( double a, double b );     // Return value in range [a,b)

This function allows you to generate a random number uniformly (using the MWC algorithm) in the interval [a, b).

Note

The C++ compiler does not consider the return value of a function when determining which of multiple similar forms to use, only the arguments. As a result, if you call float x = my_rng.uniform(0,1) you will get 0.f, because 0 and 1 are integers and the only integer in the interval [0, 1) is 0. If you want a floating-point number, you should use something like my_rng.uniform(0.f,1.f), and for a double, use my_rng.uniform(0.,1.). Of course, explicitly casting the arguments also works.
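
A short sketch of the three overloads, including the pitfall described in this note:

cv::RNG rng;

int    i = rng.uniform( 0, 100 );        // integer in 0..99
float  f = rng.uniform( 0.f, 1.f );      // float in [0.0, 1.0)
double d = rng.uniform( 0., 1. );        // double in [0.0, 1.0)

float  oops = rng.uniform( 0, 1 );       // integer overload is chosen: always 0.f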

cv::RNG::gaussian()

double  cv::RNG::gaussian( double sigma ); // Gaussian number, zero mean,
                                           // std-dev='sigma'

This function allows you to generate a random number from a zero-mean Gaussian distribution (using the Ziggurat algorithm) with standard deviation sigma.

cv::RNG::fill()

void  cv::RNG::fill(
  InputOutputArray mat,             // Input array, values will be overwritten
  int              distType,        // Type of distribution (Gaussian or uniform)
  InputArray       a,               // min (uniform) or mean (Gaussian)
  InputArray       b                // max (uniform) or std-deviation (Gaussian)
 );

The cv::RNG::fill() algorithm fills a matrix mat of up to four channels with random numbers drawn from a specific distribution. That distribution is selected by the distType argument, which can be either cv::RNG::UNIFORM or cv::RNG::NORMAL. In the case of the uniform distribution, each element of mat will be filled with a random value generated from the interval [a, b). In the case of the Gaussian (cv::RNG::NORMAL) distribution, each element is generated from a distribution with the mean taken from a and the standard deviation taken from b. It is important to note that the arrays a and b are not of the dimension of mat; instead, they are nc × 1 or 1 × nc, where nc is the number of channels in mat (i.e., there is not a separate distribution for each element of mat; a and b specify one distribution, not one distribution for every element of mat).
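
For example, a minimal sketch that fills a three-channel byte image with uniform noise (one range per channel, expressed as 1 × 3 arrays) and a single-channel float array with zero-mean, unit-variance Gaussian noise:

cv::Mat img( 256, 256, CV_8UC3 );
cv::Mat lo = ( cv::Mat_<int>(1,3) <<   0,   0,   0 );
cv::Mat hi = ( cv::Mat_<int>(1,3) << 256, 256, 256 );
cv::theRNG().fill( img, cv::RNG::UNIFORM, lo, hi );      // each channel in 0..255

cv::Mat noise( 256, 256, CV_32FC1 );
cv::theRNG().fill( noise, cv::RNG::NORMAL, 0.0, 1.0 );   // mean 0.0, std-dev 1.0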

Note

If you have a multichannel array, then you can generate individual entries in “channel space” from a multivariate distribution simply by giving the appropriate mean and standard deviation for each channel in the input arrays a and b. This distribution, however, will have only zero entries in the off-diagonal elements of its covariance matrix. (This is because each element is generated completely independently of the others.) If you need to draw from a more general distribution, the easiest method is to generate the values from a zero-mean distribution with identity covariance using cv::RNG::fill(), and then rotate them back to your original basis using cv::transform().
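
A sketch of that recipe (the 2 × 2 matrix T below is made up for illustration): draw two-channel samples with independent, unit-variance channels, then map each entry through T with cv::transform(); the resulting entries have covariance T · T^T, so T can be chosen as, for example, a Cholesky factor of the covariance you actually want:

cv::Mat T = ( cv::Mat_<float>(2,2) << 2.0f, 0.0f,
                                      0.6f, 0.8f );

cv::Mat samples( 1, 1000, CV_32FC2 );                    // 1,000 two-channel entries
cv::theRNG().fill( samples, cv::RNG::NORMAL, 0.0, 1.0 ); // independent, unit variance

cv::Mat correlated;
cv::transform( samples, correlated, T );                 // entries now have covariance T * T^T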

Summary

In this chapter we introduced the concept of functors, as well as the manner in which they are used in the OpenCV library. We looked at a few such objects that are of general utility and saw how they worked. These included the PCA and SVD objects, as well as the very useful random number generator RNG. Later on, as we delve into the more advanced algorithms the library provides, we will see that this same concept is used in many of the more modern additions to the library.

Exercises

  1. Using the cv::RNG random number generator:

    1. Generate and print three floating-point numbers, each drawn from a uniform distribution from 0.0 to 1.0.

    2. Generate and print three double-precision numbers, each drawn from a Gaussian distribution centered at 0.0 and with a standard deviation of 1.0.

    3. Generate and print three unsigned bytes, each drawn from a uniform distribution from 0 to 255.

  2. Using the fill() method of the cv::RNG random number generator, create an array of:

    1. 20 floating-point numbers with a uniform distribution from 0.0 to 1.0.

    2. 20 floating-point numbers with a Gaussian distribution centered at 0.0 and with a standard deviation of 1.0.

    3. 20 unsigned bytes with a uniform distribution from 0 to 255.

    4. 20 color triplets, each of three bytes with a uniform distribution from 0 to 255.

  3. Using the cv::RNG random number generator, create an array of 100 three-byte objects such that:

    1. The first and second dimensions have a Gaussian distribution, centered at 64 and 192, respectively, each with a variance of 10.

    2. The third dimension has a Gaussian distribution, centered at 128 and with a variance of 2.

    3. Using the cv::PCA object, compute a projection for which maxComponents=2.

    4. Compute the mean in both dimensions of the projection; explain the result.

  4. Beginning with the following matrix:

    1. First compute, by hand, the matrix A^T A. Find the eigenvalues (e1, e2) and eigenvectors (v1, v2) of A^T A. From the eigenvalues, compute the singular values σ1 = √e1 and σ2 = √e2.

    2. Compute the matrices V and U. Recall that u1 = (1/σ1) A v1, u2 = (1/σ2) A v2, and u3 is a vector orthogonal to both u1 and u2. Hint: recall that the cross product of two vectors is always orthogonal to both terms in the cross product.

    3. The matrix Σ is defined (given this particular value of A) to be:

      Using this definition of Σ, and the preceding results for V and U, verify A = U Σ VT by direct multiplication.

    4. Using the cv::SVD object, compute the preceding matrices Σ, V, and U and verify that the results you computed by hand are correct. Do you get exactly what you expected? If not, explain.

1 We encountered one of these objects briefly in the previous chapter with the cv::LineIterator object.

2 Here the word usually means “usually when people program function objects,” but does not turn out to mean “usually for the OpenCV library.” There is a competing convention in the OpenCV library that uses the overloaded operator() to load the configuration, and a named member to provide the fundamental service of the object. This convention is substantially less canonical in general, but quite common in the OpenCV library.

3 You might be thinking to yourself, “Hey, this sounds like machine learning—what is it doing in this chapter?” This is not a bad question. In modern computer vision, machine learning is becoming intrinsic to an ever-growing list of algorithms. For this reason, component capabilities, such as PCA and SVD, are increasingly considered “building blocks.”

4 KLT stands for “Karhunen-Loeve Transform,” so the phrase KLT transform is a bit of a malapropism. It is, however, at least as often said one way as the other.

5 The object diag(W)^–1 is a matrix whose diagonal elements are defined in terms of the diagonal elements λi of W to be 1/λi for λi ≥ ε, and 0 otherwise. This value ε is the singularity threshold, a very small number that is typically proportional to the sum of the diagonal elements of W (i.e., ε ∝ Σi λi).

6 This “standard value” is not zero because, for that value, many random number generators (including the ones used by RNG) will return nothing but zeros thereafter. Currently, this standard value is 2^32 – 1.

7 In case this notation is unfamiliar, in the designation of an interval using square brackets, [ indicates that this limit is inclusive, and in the designation using parentheses, ( indicates that this limit is noninclusive. Thus the notation [0.0,1.0) means an interval from 0.0 to 1.0 inclusive of 0.0 but not inclusive of 1.0.
