5.5. Tensor Voting in ND

The framework can easily be generalized to higher dimensions to tackle perceptual organization problems in ND spaces. At first glance, it might not be apparent how computer vision problems can be formulated in domains with more than three dimensions, but some problems are represented very naturally in such domains. For instance, in Section 5.6, we show how optical flow can be formulated as the organization of tokens in smooth layers in a 4D space, where the four dimensions are the image coordinates and the horizontal and vertical velocity.

Both the representation and voting can be generalized to N dimensions along the lines of the previous section, where the generalization from 2D to 3D was shown. The symmetric second-order tensor in ND is equivalent to an N x N matrix and an ND hyperellipsoid. Each axis of the hyperellipsoid can be viewed as a vector that is locally normal to the underlying structure. An elementary hyperplane is represented by a stick tensor, which has only one nonzero eigenvalue and, therefore, one normal. An elementary curve has one tangent and N - 1 normals and is represented by a second-order tensor with N - 1 nonzero, equal eigenvalues, whose corresponding eigenvectors are the normals of the elementary curve. The polarity vector is an ND vector, and its interpretation is the same as in the previous sections: its magnitude is proportional to the token's likelihood of being a boundary of a perceptual structure, and it is locally orthogonal to the boundary and directed towards the interior of the structure.
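As a concrete illustration of this representation, the following sketch decomposes an ND second-order tensor into its eigensystem and reads off the stick and ball components. It is a minimal example in Python with NumPy; the function name and the 4D example are ours, not part of the framework's published implementation.

```python
import numpy as np

def decompose_tensor(T):
    """Eigendecomposition of a symmetric N x N second-order tensor.

    Returns the eigenvalues in descending order and the corresponding
    eigenvectors as columns.  The gaps between consecutive eigenvalues
    act as saliencies: a large lambda_1 - lambda_2 indicates an
    elementary hyperplane (one normal), while N - 1 large, roughly
    equal leading eigenvalues indicate an elementary curve
    (N - 1 normals, one tangent).
    """
    T = np.asarray(T, dtype=float)
    evals, evecs = np.linalg.eigh(T)      # ascending order
    order = np.argsort(evals)[::-1]       # reorder to descending
    return evals[order], evecs[:, order]

# Example: an elementary hyperplane element in 4D with normal e1.
n = np.array([1.0, 0.0, 0.0, 0.0])
stick = np.outer(n, n)                    # one nonzero eigenvalue
evals, evecs = decompose_tensor(stick)
print(evals)                              # [1., 0., 0., 0.]
```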

As mentioned in Section 5.4.2, first- and second-order voting by a purely stick voter takes place on a plane defined by the voter, the receiver, and the stick tensor of the voter, regardless of the dimensionality of the space. Any cut of a second-order stick voting field, in any dimension, that contains the origin and the stick is identical to the fundamental second-order 2D stick voting field. The same holds for the first-order stick field. The ball voting fields are radially symmetric, since the voter has no preference of orientation. What changes is the number of degrees of freedom of the elementary stick voter at the origin that spans the unit sphere to simulate the effect of the ball voter. The total number of first- and second-order voting fields is 2N.
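To make the plane-based construction concrete, here is a sketch of a second-order stick vote cast in ND. It follows the vote generation commonly published for tensor voting, in which the receiver is joined to the voter by an arc of an osculating circle and the vote is attenuated by a decay of the form exp(-(s^2 + c*kappa^2)/sigma^2); the function name, the value of the constant c, and the 45-degree aperture cutoff are assumptions made for this sketch rather than values quoted from this chapter.

```python
import numpy as np

def stick_vote(voter_pos, normal, receiver_pos, sigma, c=0.1):
    """Second-order stick vote in ND.

    The computation takes place in the 2D plane spanned by the voter's
    normal and the voter-to-receiver vector, exactly as in the 2D case.
    Returns an N x N second-order tensor (zero outside the aperture).
    """
    v = np.asarray(receiver_pos, float) - np.asarray(voter_pos, float)
    l = np.linalg.norm(v)
    n = np.asarray(normal, float)
    n = n / np.linalg.norm(n)
    d = v.shape[0]
    if l == 0.0:
        return np.outer(n, n)

    v_n = float(np.dot(v, n))              # component along the voter's normal
    t = v - v_n * n                        # in-plane tangential component
    t_norm = np.linalg.norm(t)

    theta = np.arctan2(abs(v_n), t_norm)   # angle between v and the tangent
    if theta > np.pi / 4.0:                # outside the 45-degree aperture
        return np.zeros((d, d))

    if theta == 0.0:
        s, kappa = l, 0.0                  # straight continuation
        vote_normal = n
    else:
        s = theta * l / np.sin(theta)      # arc length on the osculating circle
        kappa = 2.0 * np.sin(theta) / l    # curvature of that circle
        t_hat = t / t_norm
        # Rotate the voter's normal by 2*theta within the voting plane.
        vote_normal = (np.cos(2.0 * theta) * n
                       - np.sign(v_n) * np.sin(2.0 * theta) * t_hat)

    decay = np.exp(-(s ** 2 + c * kappa ** 2) / sigma ** 2)
    return decay * np.outer(vote_normal, vote_normal)
```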

Analysis of the results of voting can be performed under the principles summarized in Tables 5.4 and 5.6. An aspect of the framework that is harder to generalize is the extraction of dense structures in ND. The exponential increase in the number of vertices per hypercube (2^N) requires a large number of votes to be cast and collected, and generates a prohibitively large number of possible configurations for an ND variant of the marching cubes algorithm.

Caution must be exercised with respect to certain issues in ND that are not as prominent in two or three dimensions. Since voting is a function of distance and curvature, these measures have to be meaningful. The space has to be Euclidean, or at least the axes have to be scaled so that distances are meaningful and the influence of relevant neighbors is larger than that of irrelevant ones. The data are stored in an approximate nearest neighbor (ANN) k-d tree [1], so the computational complexity of searching for a token's neighbors grows only linearly with the dimensionality of the space. Finally, depending on the number of input tokens and the dimensionality of the space, the precomputation of the voting fields may become impractical. If the data are relatively sparse in a high-dimensional space, it is more efficient to compute the required votes as they are cast, instead of generating complete voting fields and using them as look-up tables: for sparse data in high dimensions, repetition of votes is less probable, and many of the precomputed votes may never be used.
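One possible axis scaling is sketched below. Dividing each axis by its standard deviation, or applying user-supplied weights (for example, to weight the velocity axes differently from the image axes in the 4D optical-flow space of Section 5.6), is our assumption for illustration, not a prescription of the framework.

```python
import numpy as np

def scale_axes(tokens, weights=None):
    """Rescale each coordinate axis so that distances are comparable.

    tokens:  (n, d) array of token positions.
    weights: optional per-axis factors; if omitted, each axis is
             divided by its standard deviation (one possible choice).
    """
    tokens = np.asarray(tokens, dtype=float)
    if weights is None:
        spread = tokens.std(axis=0)
        spread[spread == 0] = 1.0          # leave constant axes untouched
        weights = 1.0 / spread
    return tokens * weights
```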

5.5.1. Computational complexity

In this section, the computational complexity of tensor voting is analyzed. We show that it is a function of the number of tokens and, in the case of dense structure extraction, the size of the extracted structures.

The initialization step consists of sorting the tokens according to their spatial coordinates and storing them in a data structure from which they can be retrieved efficiently. We use the ANN k-d tree as the data structure that holds the tokens. Its construction is of O(dn log n) complexity, where d is the dimension of the space and n the number of tokens. This data structure meets the requirements of tensor voting, since it is optimal when the locations of the data remain constant and multiple queries are generated that seek the data points within a certain distance of a given location. During voting, the tokens that fall within the neighborhood of the voter, as defined by the scale (see equation 5.7 in Section 5.3.4), can be retrieved with O(log n) operations.
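The construction and query pattern can be sketched with SciPy's cKDTree as a stand-in for the ANN library of [1] (cKDTree performs exact rather than approximate queries); the number of tokens, the dimensionality, and the radius below are arbitrary choices for illustration.

```python
import numpy as np
from scipy.spatial import cKDTree   # stand-in for the ANN k-d tree of [1]

tokens = np.random.rand(10000, 4)   # n tokens in a 4D space
tree = cKDTree(tokens)              # construction: O(d n log n)

# The neighborhood is defined by the scale of voting (equation 5.7);
# all tokens within `radius` of a voter are retrieved with a
# logarithmic-cost query.
radius = 0.05
neighbors = tree.query_ball_point(tokens[0], r=radius)
```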

The number of votes cast by a voter is not fixed; it depends on the distribution of the data in space and on the scale of voting. For most practical purposes, however, we can assume that each voter casts votes to only a fraction of the data set, and that casting votes to all tokens is the undesirable consequence of an excessively large scale. On average, the number of votes cast by each voter can be approximated as the product of the data density and the size of the voting field. For each voter and receiver pair, the N components of the voter cast first- and second-order votes, which are computed by look-up operations on the voting fields followed by linear interpolation. The complexity of voting, after the receivers have been localized, is therefore linear in the number of tokens and the dimension of the space.
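Assembling the previous sketches, a voting pass for purely stick voters can be written as the loop below. It reuses the hypothetical stick_vote function from the earlier sketch and computes votes directly rather than interpolating precomputed fields, which matches the stated complexity: one O(log n) neighbor query per voter plus one vote per voter-receiver pair.

```python
import numpy as np
from scipy.spatial import cKDTree
# stick_vote: see the earlier sketch in this section.

def voting_pass(positions, normals, sigma, radius):
    """One second-order voting pass for purely stick voters.

    positions: (n, d) token coordinates; normals: (n, d) stick directions.
    Each voter retrieves its neighbors and casts a stick vote to each,
    so the pass is linear in the number of tokens and votes.
    """
    n, d = positions.shape
    tree = cKDTree(positions)
    accumulated = np.zeros((n, d, d))      # one tensor per receiver
    for i in range(n):
        for j in tree.query_ball_point(positions[i], r=radius):
            if i != j:
                accumulated[j] += stick_vote(positions[i], normals[i],
                                             positions[j], sigma)
    return accumulated
```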

The complexity of dense structure extraction, which is implemented in 2D and 3D only, is a function of grid resolution, since it determines the number of curvels or surfels extracted. The surface or curve saliency of a fixed number of grid points has to be computed for each surfel or curvel. For instance, surface saliency values at the eight vertices of a grid cube have to be computed for a surfel to be extracted at subvoxel accuracy inside the cube. The number of grid vertices at which votes have to be collected is therefore O(m), where m is the number of surfels, and the search for the tokens within the voting neighborhood of each vertex is O(log n). The marching cubes algorithm [37] is also linear, since it essentially consists of look-up operations. The number of voxels that are visited but where no surface is extracted is typically small (zero for a closed surface). Therefore, the complexity of dense surface extraction in 3D is O(m log n), where n is the number of tokens and m the number of surfels. The same holds for curve extraction.
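For the marching cubes step, the sketch below uses skimage.measure.marching_cubes on a 3D surface-saliency volume, where the value at each grid vertex would be lambda_1 - lambda_2 of the tensor collected there. Extracting an isosurface at a fixed saliency level is a simplification introduced here for illustration; the framework extracts the surface as the locus of saliency maxima along the normal direction rather than at a fixed threshold.

```python
import numpy as np
from skimage.measure import marching_cubes   # marching cubes [37]

def extract_surfels(saliency, level):
    """Extract a triangulated surface from a 3D surface-saliency volume.

    saliency: (X, Y, Z) array of lambda_1 - lambda_2 values collected at
    the grid vertices.  Marching cubes visits each cube once and places
    vertices at subvoxel accuracy, so this step is linear in the number
    of surfels produced.
    """
    verts, faces, normals, values = marching_cubes(saliency, level=level)
    return verts, faces
```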
