12 2. BASICS OF FEATURE DESIGN
clustered. Histogram- and dictionary-based descriptors often suffer from the lack of spatial in-
formation, which is most commonly addressed by spatial pyramids, e.g., for BOV [Philbin et al.,
2007]. However, according to the same authors, dictionary-based descriptors also suffer from
quantization effects that substantially degrade their performance.
This problem has been addressed in Fisher vectors (FV) that move from dictionaries
to Gaussian mixture models (GMM), which also estimate sub-bin displacements and
variances [Sánchez et al., 2013]. Similarly to FVs adding displacement estimates for irregularly
placed models, channel representations add displacement estimates to regularly placed
histograms, most apparent in the formulation as P-channels [Felsberg and Granlund, 2006]. The
regular spacing of channels makes a separate variance estimation unnecessary. See Chapter 3 for
the technical definition of channel coding.
2.4 GRID-BASED FEATURE REPRESENTATIONS
Based on the observations made in the previous sections of this chapter, channel
representations belong to the class of grid-based feature representations. The idea behind this approach is
to compute a density estimate of the feature distribution by histogram-like methods. In
machine learning, density estimation by histograms is usually referred to as a nonparametric
approach, in contrast to parameter fitting for a known family of distributions, e.g., the normal
distribution [Bishop, 1995]. Hybrids between those parametric approaches and histograms are
mixtures of parametric models, e.g., Gaussian mixture models (GMMs), and kernel density
estimators (KDEs). The drawback of GMMs is the relatively demanding parameter estimation
by expectation-maximization; the drawback of KDEs is the relatively slow readout of density
values, as the kernel needs to be evaluated for all training samples.
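The readout cost mentioned above can be illustrated with a minimal sketch. It is not taken from the book: the function names `histogram_density` and `kde_density`, the Gaussian kernel, and the fixed bandwidth are assumptions chosen for illustration, and samples are assumed to lie inside `[lo, hi]`.

```python
import math


def histogram_density(samples, lo, hi, n_bins):
    """Build a histogram density estimate; samples are assumed to lie in [lo, hi]."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for s in samples:
        # Clamp to the last bin so that s == hi is still counted.
        idx = min(int((s - lo) / width), n_bins - 1)
        counts[idx] += 1
    norm = len(samples) * width

    def density(x):
        if x < lo or x > hi:
            return 0.0
        idx = min(int((x - lo) / width), n_bins - 1)
        # Readout is a single table lookup, independent of the sample count.
        return counts[idx] / norm

    return density


def kde_density(samples, bandwidth):
    """Build a Gaussian kernel density estimate (kernel choice is an assumption)."""
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(x):
        # Every readout evaluates the kernel once per training sample.
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm

    return density


samples = [0.1, 0.2, 0.25, 0.7, 0.9]
hist = histogram_density(samples, 0.0, 1.0, 4)
kde = kde_density(samples, 0.1)
```

In this sketch, `hist(x)` costs the same regardless of how many samples were used to build it, whereas `kde(x)` loops over all of them on every call.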
Thus, despite their success in machine learning, GMMs and KDEs are too slow to be
used as feature descriptors. Instead, grid-based methods such as histogram of oriented
gradients (HOG, Dalal and Triggs [2005]), the scale invariant feature transform (SIFT, Lowe
[2004]), and distribution fields (DFs, Sevilla-Lara and Learned-Miller [2012]) are successfully
used, e.g., in multi-view geometry (point matching) and visual tracking. They are of central
importance to visual computing and have in common that they combine histograms and
signatures. This is achieved by computing local histograms over the spatio-featural domain, i.e., a
3D domain consisting of 2D spatial coordinates and one orientation (SIFT/HOG) or intensity
(DF) coordinate. Consider the case of DFs: the image is exploded into several layers representing
different ranges of intensity; see Figure 2.3.
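The explosion of an image into intensity layers can be sketched as follows. This is an illustrative reading of the description above, not the reference implementation of DFs: the function name `explode_into_layers`, the equal-width intensity ranges, and the integer intensity range 0 to 255 are assumptions.

```python
def explode_into_layers(image, n_layers=7, max_value=255):
    """Split a grayscale image (list of rows of ints in [0, max_value])
    into n_layers binary layers, one per intensity range."""
    height, width = len(image), len(image[0])
    layers = [[[0.0] * width for _ in range(height)] for _ in range(n_layers)]
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            # Ordinary (hard) bin assignment: each pixel activates exactly one layer.
            k = min(value * n_layers // (max_value + 1), n_layers - 1)
            layers[k][y][x] = 1.0
    return layers


image = [[0, 255],
         [128, 36]]
layers = explode_into_layers(image)
```

Here `layers[0]` corresponds to the darkest intensity range and `layers[6]` to the brightest. The result is a 3D array over two spatial coordinates and one intensity coordinate, i.e., the spatio-featural domain described above.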
Whereas DFs make an ordinary bin assignment and apply post-smoothing, channel
representations apply a soft-assignment, i.e., pre-smoothing, which has been shown to be more
efficient [Felsberg, 2013]. The channel representation was proposed by Nordberg et al. [1994].
It shares similarities with population codes [Pouget et al., 2000, Snippe and Koenderink, 1992]
and, similar to their probabilistic interpretation [Zemel et al., 1998], they approximate a kernel
[Figure 2.3 panels, top to bottom: original image, Layer 7, Layer 6, Layer 5, Layer 4, Layer 3, Layer 2, Layer 1]
Figure 2.3: Illustration of DFs: the image (top) is exploded into several layers (here: 7). In each
of the seven layers, intensity represents activation, where dark is no activation and white is full
activation. Each layer represents a range of intensity values of the original image. The bottom
layer represents dark intensities, i.e., the high activations in the bottom layer are at pixels with low
intensity in the original image. Each new layer above the bottom one represents, respectively,
higher intensities. In the seventh layer, the high intensity pixels of the original image appear
active. Figure from Öfjäll and Felsberg [2017].