Links to Biologically Inspired Models


2. BASICS OF FEATURE DESIGN

clustered. Histogram- and dictionary-based descriptors often suffer from the lack of spatial information, which is most commonly addressed by spatial pyramids, e.g., for BOV [Philbin et al., 2007]. However, according to the same authors, dictionary-based descriptors also suffer from quantization effects that have a major effect on their performance.

This problem has been addressed in Fisher vectors (FVs), which move from dictionaries to Gaussian mixture models (GMMs) that also estimate sub-bin displacements and variances [Sánchez et al., 2013]. Just as FVs add displacement estimates for irregularly placed models, channel representations add displacement estimates to regularly placed histograms, which is most apparent in the formulation as P-channels [Felsberg and Granlund, 2006]. The regular spacing of channels makes a separate variance estimation unnecessary. See Chapter 3 for the technical definition of channel coding.
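As a rough illustration of what adding displacement estimates to a regularly placed histogram can look like, the following sketch stores, for each bin, a count and the mean offset of its samples from the bin center. This is a simplified stand-in and not the P-channel definition from Felsberg and Granlund [2006]; the function names `offset_histogram` and `bin_estimates` are hypothetical, and samples are assumed to lie in the interval [lo, hi].

```python
def offset_histogram(samples, lo, hi, n_bins):
    """Return per-bin counts and per-bin mean displacements from the bin centers.

    Displacements are measured in units of the bin width, so they lie in
    [-0.5, 0.5]. Assumes every sample lies in [lo, hi] (illustrative choice).
    """
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    offset_sums = [0.0] * n_bins
    for x in samples:
        # Hard bin assignment, clamped so that x == hi falls into the last bin.
        k = min(int((x - lo) / width), n_bins - 1)
        center = lo + (k + 0.5) * width
        counts[k] += 1
        offset_sums[k] += (x - center) / width
    mean_offsets = [s / c if c else 0.0 for s, c in zip(offset_sums, counts)]
    return counts, mean_offsets


def bin_estimates(counts, mean_offsets, lo, hi):
    """Reconstruct one value estimate per non-empty bin from count and offset."""
    width = (hi - lo) / len(counts)
    return [lo + (k + 0.5 + d) * width
            for k, (c, d) in enumerate(zip(counts, mean_offsets)) if c]


counts, offsets = offset_histogram([0.12, 0.14, 0.16, 0.81], 0.0, 1.0, 5)
estimates = bin_estimates(counts, offsets, 0.0, 1.0)
```

With plain counts alone, the three samples near 0.14 could only be located at the bin center 0.1; the stored displacement recovers their mean position inside the bin.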

2.4 GRID-BASED FEATURE REPRESENTATIONS

Based on the observations made in the previous sections of this chapter, channel representations belong to the class of grid-based feature representations. The idea behind this approach is to compute a density estimate of the feature distribution by histogram-like methods. In machine learning, density estimation by histograms is usually referred to as a nonparametric approach, in contrast to parameter fitting for a known family of distributions, e.g., the normal distribution [Bishop, 1995]. Hybrids between such parametric approaches and histograms are mixtures of parametric models, e.g., Gaussian mixture models (GMMs), and kernel density estimators (KDEs). The drawback of GMMs is the relatively demanding parameter estimation by expectation-maximization; the drawback of KDEs is the relatively slow readout of density values, as the kernel needs to be evaluated for all training samples.
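The difference in readout cost can be made concrete with a minimal sketch. It assumes one-dimensional samples in [lo, hi] and a Gaussian kernel with a hand-picked bandwidth; the function names are hypothetical and chosen only for this illustration.

```python
import math


def histogram_density(samples, lo, hi, n_bins):
    """Histogram density estimate: counts normalized to integrate to one."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in samples:
        # Clamp so that x == hi falls into the last bin.
        k = min(int((x - lo) / width), n_bins - 1)
        counts[k] += 1
    n = len(samples)
    return [c / (n * width) for c in counts]


def histogram_readout(density, lo, hi, x):
    """Density readout is a single table lookup, independent of sample count."""
    width = (hi - lo) / len(density)
    k = min(int((x - lo) / width), len(density) - 1)
    return density[k]


def kde_readout(samples, x, bandwidth):
    """Density readout evaluates the Gaussian kernel once per training sample."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                      for s in samples)


samples = [0.1, 0.15, 0.2, 0.7]
density = histogram_density(samples, 0.0, 1.0, 4)
p_hist = histogram_readout(density, 0.0, 1.0, 0.12)
p_kde = kde_readout(samples, 0.12, 0.1)
```

The histogram pays its cost once when the bins are filled, whereas every call to `kde_readout` loops over all stored samples.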

Thus, despite their success in machine learning, GMMs and KDEs are too slow to be used as feature descriptors. Instead, grid-based methods such as histograms of oriented gradients (HOG, Dalal and Triggs [2005]), the scale invariant feature transform (SIFT, Lowe [2004]), and distribution fields (DFs, Sevilla-Lara and Learned-Miller [2012]) are used successfully, e.g., in multi-view geometry (point matching) and visual tracking. They are of central importance to visual computing and have in common that they combine histograms and signatures. This is achieved by computing local histograms over the spatio-featural domain, i.e., a 3D domain consisting of 2D spatial coordinates and one orientation (SIFT/HOG) or intensity (DF) coordinate. Consider the case of DFs: the image is exploded into several layers representing different ranges of intensity; see Figure 2.3.
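The exploding step described above can be sketched as follows. This is a minimal illustration and not the implementation of Sevilla-Lara and Learned-Miller [2012]: it assumes integer intensities in the range 0 to `max_value`, represents the image as a nested list, uses equally wide intensity ranges, and omits any smoothing. The function name `explode_into_layers` is hypothetical.

```python
def explode_into_layers(image, n_layers, max_value=255):
    """Explode a 2D intensity image into n_layers binary layers.

    layers[k][r][c] is 1.0 if pixel (r, c) falls into the k-th intensity
    range and 0.0 otherwise; layer 0 holds the darkest intensities.
    """
    height, width = len(image), len(image[0])
    layers = [[[0.0] * width for _ in range(height)] for _ in range(n_layers)]
    for r in range(height):
        for c in range(width):
            # Integer bin index over equally wide intensity ranges.
            k = min(image[r][c] * n_layers // (max_value + 1), n_layers - 1)
            layers[k][r][c] = 1.0
    return layers


image = [[0, 255],
         [128, 36]]
layers = explode_into_layers(image, n_layers=7)
```

Each pixel is active in exactly one layer, so summing the layers over the intensity axis gives an image of ones; the result is a 3D array over two spatial coordinates and one intensity coordinate.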

Whereas DFs make an ordinary bin assignment and apply post-smoothing, channel representations apply a soft assignment, i.e., pre-smoothing, which has been shown to be more efficient [Felsberg, 2013]. The channel representation was proposed by Nordberg et al. [1994]. It shares similarities with population codes [Pouget et al., 2000, Snippe and Koenderink, 1992] and, similar to their probabilistic interpretation [Zemel et al., 1998], they approximate a kernel


[Figure 2.3 panels, top to bottom: Original Image, Layer 7, Layer 6, Layer 5, Layer 4, Layer 3, Layer 2, Layer 1]

Figure 2.3: Illustration of DFs: the image (top) is exploded into several layers (here: 7). In each of the seven layers, intensity represents activation, where dark is no activation and white is full activation. Each layer represents a range of intensity values of the original image. The bottom layer represents dark intensities, i.e., the high activations in the bottom layer are at pixels with low intensity in the original image. Each successive layer above the bottom one represents a higher range of intensities. In the seventh layer, the high-intensity pixels of the original image appear active. Figure from Öfjäll and Felsberg [2017].
