operator itself or some subsequent step might detect the outlier and thus avoid further use of the spurious result.
Therefore, it makes sense to evaluate precision and robustness separately, e.g., as done in the visual object tracking (VOT) benchmark [Kristan, Matas, Leonardis, Vojir, Pflugfelder, Fernandez, Nebehay, Porikli and Čehovin, 2016]. A typical measure for outlier removal in feature extraction is the coherence of the structure tensor. The first eigenvector of the tensor represents the local signal orientation if the difference between the first and second eigenvalues is large. The double-angle vector is an integrated representation of this principle [Granlund and Knutsson, 1995].
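As a concrete illustration, the following minimal sketch computes the structure tensor, its coherence, and the double-angle representation per pixel (assuming a grayscale image given as a NumPy array; the Sobel gradients and the smoothing scale `sigma` are arbitrary choices for illustration, not the filter design of the cited works):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def structure_tensor_features(img, sigma=2.0):
    """Structure tensor, coherence, and double-angle vector of a 2D image."""
    gx = sobel(img, axis=1)  # horizontal gradient
    gy = sobel(img, axis=0)  # vertical gradient
    # Locally averaged outer products of the gradient
    Jxx = gaussian_filter(gx * gx, sigma)
    Jxy = gaussian_filter(gx * gy, sigma)
    Jyy = gaussian_filter(gy * gy, sigma)
    # Closed-form eigenvalues of the 2x2 tensor at each pixel
    trace = Jxx + Jyy
    diff = np.sqrt((Jxx - Jyy) ** 2 + 4 * Jxy ** 2)
    lam1 = 0.5 * (trace + diff)  # first (larger) eigenvalue
    lam2 = 0.5 * (trace - diff)  # second (smaller) eigenvalue
    # Coherence is close to 1 where lam1 >> lam2, i.e., where a single
    # dominant orientation exists, and close to 0 for isotropic structure
    coherence = np.where(trace > 1e-12, (lam1 - lam2) / (trace + 1e-12), 0.0)
    # Double-angle vector: rotates by twice the local orientation angle,
    # mapping orientations theta and theta + pi to the same vector
    double_angle = np.stack([Jxx - Jyy, 2 * Jxy], axis=-1)
    return lam1, lam2, coherence, double_angle
```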
The tensor representation is still limited, as it cannot represent two independent orientations, even if they co-exist. In general, the same local data can be subject to multiple mid-level interpretations, which leads to the concept of metamery, as introduced by Koenderink [1993], where metamery describes the ambivalence of the stimulus (icon) for a given feature descriptor (N-jet).
In order to represent multiple orientations, a mid-level representation with more degrees
of freedom than the structure tensor is required. As explained by Granlund and Knutsson [1995],
the structure tensor can be computed from a minimal set of three filters with a $\cos^2$-shaped angular transfer function, and increasing the number of angular transfer functions directly leads
to the concept of channel representations; see Chapter 3. Before we take this step, we will look
into other terms, in particular invariance and equivariance, that are relevant to the development
of structure tensors.
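As a minimal sketch of this idea (a hypothetical construction with illustrative names, not the filter design of the cited works), the three $\cos^2$-shaped angular transfer functions can be written down directly; their sum is constant over all orientations, so no direction is privileged:

```python
import numpy as np

def cos2_angular_responses(phi, K=3):
    """Angular transfer functions cos^2(phi - k*pi/K), k = 0, ..., K-1.

    K = 3 is the minimal set for the structure tensor; increasing K
    leads toward the channel representations of Chapter 3.
    """
    angles = np.arange(K) * np.pi / K            # filter orientations
    return np.cos(phi[..., None] - angles) ** 2  # shape (..., K)

phi = np.linspace(0.0, np.pi, 181)
resp = cos2_angular_responses(phi)
# The responses sum to the constant K/2 = 1.5 for every orientation
assert np.allclose(resp.sum(axis=-1), 1.5)
```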
2.2 INVARIANCE AND EQUIVARIANCE
The example of the structure tensor for signal orientation estimation also establishes a prototypical example of the invariance-equivariance principle. If the image is rotated, the eigenvalues of the structure tensor remain the same (invariant) and the eigenvectors are rotated by the same angle as the image (equivariant).
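This behavior can be checked numerically on a single 2×2 tensor (a minimal sketch with an arbitrary example tensor): an image rotation by $R$ transforms the tensor as $T \mapsto R T R^\top$, which leaves the eigenvalues unchanged and rotates the eigenvectors by the same angle:

```python
import numpy as np

T = np.array([[3.0, 1.0],
              [1.0, 2.0]])         # example structure tensor
theta = 0.7                        # arbitrary rotation angle
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

T_rot = R @ T @ R.T                # tensor after rotating the image

w, V = np.linalg.eigh(T)
w_rot, V_rot = np.linalg.eigh(T_rot)

# Invariance: the eigenvalues are unchanged by the rotation
assert np.allclose(w, w_rot)
# Equivariance: the eigenvectors are rotated by the same angle
# (compared up to sign, since eigenvectors are only defined up to -1)
for i in range(2):
    v = R @ V[:, i]
    assert np.allclose(v, V_rot[:, i]) or np.allclose(v, -V_rot[:, i])
```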
Unfortunately, in the literature the term invariant is often used without proper definition. Obviously, it is not enough to require that the output is unaffected by a certain operation in the input space, because this requirement has a trivial solution: an operator with constant output [Burkhardt, 1989] that lacks separability [Mallat, 2016].
Also, the often-used example of shift-invariant operators by means of the Fourier magnitude spectrum is problematic, as it maps many completely different inputs to the same equivalence class; see Figure 2.1. The figure illustrates different contours that are all described by the same magnitude Fourier descriptors [Granlund, 1972], although they are obviously not related in shape.
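The ambiguity can be reproduced directly (a minimal sketch with a hypothetical example contour): scrambling the Fourier phases of a closed contour leaves all descriptor magnitudes unchanged but yields an entirely different shape:

```python
import numpy as np

rng = np.random.default_rng(0)

# A closed contour as complex samples z = x + i*y (here: an ellipse)
t = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
z = np.cos(t) + 0.5j * np.sin(t)

Z = np.fft.fft(z)
# Replace the phases with random values; the magnitudes are untouched
Z_scrambled = np.abs(Z) * np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, Z.shape))
z_scrambled = np.fft.ifft(Z_scrambled)

# Identical magnitude Fourier descriptors ...
assert np.allclose(np.abs(Z), np.abs(np.fft.fft(z_scrambled)))
# ... yet the contour itself is completely different
print(np.max(np.abs(z - z_scrambled)))
```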
Figure 2.1: Illustration of a faulty extension of equivalence classes under the invariance formulation: the contour of a pedestrian (left) and two contours with the same magnitudes of Fourier descriptors (center and right). Illustration from Larsson et al. [2011], used with permission.

Actually, the proper definition of invariance is more complicated than simply requiring constant output or global symmetry [Mallat, 2016]. Mathematically, it is more straightforward
to define equivariance [Nordberg and Granlund, 1996], [Granlund and Knutsson, 1995, p. 298]:
\[
f(Ay) = A' f(y), \qquad y \in Y. \tag{2.1}
\]
Here, $f: Y \to X$ is an operator (feature extractor), $A: Y \to Y$ a transformation on the input space, and $A': X \to X$ the corresponding transformation on the output space. Technically, the two transformations are different representations of the same transformation group, and there exists a homomorphism between them.
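A concrete instance of Eq. (2.1) is the pair of circular convolution (as $f$) and cyclic shift (as $A$); this choice is illustrative, not specific to the cited works. Here the output transformation $A'$ is the same cyclic shift:

```python
import numpy as np

def f(y, kernel):
    """Feature extractor: circular convolution with a fixed kernel."""
    return np.real(np.fft.ifft(np.fft.fft(y) * np.fft.fft(kernel, len(y))))

def A(y, s):
    """Transformation on the input space: cyclic shift by s samples."""
    return np.roll(y, s)

rng = np.random.default_rng(1)
y = rng.standard_normal(32)
kernel = np.array([1.0, -2.0, 1.0])
s = 5

# Equivariance (2.1): f(A y) = A' f(y), with A' the same shift here
assert np.allclose(f(A(y, s), kernel), A(f(y, kernel), s))
```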
The equivariance property is also called strong invariance [Ferraro and Caelli, 1994], left invariance [Duits and Franken, 2010], or covariance [Mallat, 2016]. In contrast, $f$ is called invariant under $B$ if
\[
f(By) = f(y), \qquad y \in Y, \tag{2.2}
\]
which might, however, result in too large equivalence classes for elements in $Y$, or in a lack of separability, as pointed out above. In the literature, this is also referred to as weak invariance [Ferraro and Caelli, 1994] or right invariance [Duits and Franken, 2010].
In the deep learning literature, the two terms are often not kept distinct: shift-invariance is used both for convolutional layers (which refers to equivariance as defined above) and for max-pooling layers (which refers to invariance as defined above, Goodfellow et al. [2016]). In practice, the most useful feature representations establish a split of identity, i.e., a factorization of the output into invariant and equivariant parts [Felsberg and Sommer, 2001].
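A simple instance of such a split (a minimal sketch, not the construction of the cited work) is the polar factorization of a complex-valued filter response: the magnitude is invariant under a phase rotation of the response, while the phase is equivariant:

```python
import numpy as np

def split_of_identity(z):
    """Factor a complex feature into invariant and equivariant parts."""
    return np.abs(z), np.angle(z)

z = 2.0 * np.exp(1j * 0.3)   # example filter response
phi = 1.1                    # transformation: phase rotation

mag, ang = split_of_identity(z)
mag_t, ang_t = split_of_identity(z * np.exp(1j * phi))

assert np.isclose(mag, mag_t)                        # invariant part
assert np.isclose((ang_t - ang) % (2 * np.pi), phi)  # equivariant part
```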
In what follows, invariance and equivariance will be kept as separate concepts, but note that equivariant features often behave invariantly under scalar products, i.e., they form isometries.
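For example (a minimal sketch), if the output transformation $A'$ in Eq. (2.1) is a rotation and hence orthogonal, scalar products between feature vectors are preserved:

```python
import numpy as np

rng = np.random.default_rng(2)
f1 = rng.standard_normal(2)   # two example feature vectors
f2 = rng.standard_normal(2)

theta = 0.9
Ap = np.array([[np.cos(theta), -np.sin(theta)],
               [np.sin(theta),  np.cos(theta)]])  # orthogonal A'

# The equivariant map acts as an isometry: scalar products are invariant
assert np.isclose(f1 @ f2, (Ap @ f1) @ (Ap @ f2))
```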