Similarly, we can compute a closed-form solution for the Hellinger divergence of Dirichlet distributions
\[
H_\alpha\bigl(p(P \mid c)\,\big\|\,p(P \mid c')\bigr)
= \frac{1}{\alpha-1}\left(\frac{B(c')^{\alpha-1}\,B(c_\alpha)}{B(c)^{\alpha}} - 1\right),
\qquad (6.26)
\]
but for numerical reasons, the Rényi divergence is preferable, as it can be computed using log-Beta functions $\mathrm{betaln}(\cdot)$
\[
R_\alpha\bigl(p(P \mid c)\,\big\|\,p(P \mid c')\bigr)
= \mathrm{betaln}(c') + \frac{1}{\alpha-1}\,\mathrm{betaln}(c_\alpha) - \frac{\alpha}{\alpha-1}\,\mathrm{betaln}(c).
\qquad (6.27)
\]
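To make the numerical argument concrete, the following Python sketch evaluates (6.26) and (6.27) with SciPy. The helper betaln_vec and the function names are illustrative assumptions: SciPy's betaln is bivariate, so the multivariate log-Beta is assembled from gammaln, and $c_\alpha$ is taken to be the component-wise mixture $\alpha c + (1-\alpha)c'$, an interpretation assumed here and consistent with (6.28) below.

```python
import numpy as np
from scipy.special import gammaln

def betaln_vec(c):
    """Multivariate log-Beta: log B(c) = sum_n log Gamma(c_n) - log Gamma(|c|)."""
    c = np.asarray(c, dtype=float)
    return np.sum(gammaln(c)) - gammaln(np.sum(c))

def renyi_dirichlet(c, c_prime, alpha=0.5):
    """Renyi divergence R_alpha(Dir(c) || Dir(c')) as in (6.27), for alpha != 1."""
    c, c_prime = np.asarray(c, float), np.asarray(c_prime, float)
    c_alpha = alpha * c + (1.0 - alpha) * c_prime   # assumed component-wise mixture
    return (betaln_vec(c_prime)
            + betaln_vec(c_alpha) / (alpha - 1.0)
            - alpha / (alpha - 1.0) * betaln_vec(c))

def hellinger_dirichlet(c, c_prime, alpha=0.5):
    """Hellinger divergence H_alpha as in (6.26); the explicit exponential can
    over-/underflow for large channel vectors, unlike the log-domain form (6.27)."""
    c, c_prime = np.asarray(c, float), np.asarray(c_prime, float)
    c_alpha = alpha * c + (1.0 - alpha) * c_prime
    log_ratio = ((alpha - 1.0) * betaln_vec(c_prime)
                 + betaln_vec(c_alpha)
                 - alpha * betaln_vec(c))
    return (np.exp(log_ratio) - 1.0) / (alpha - 1.0)
```

For large channel vectors, hellinger_dirichlet can overflow or underflow in the exponential, whereas renyi_dirichlet stays entirely in the log domain, which is the numerical advantage referred to above.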
A further reason to prefer the Rényi divergence over the Hellinger divergence is the respective sensitivity to small perturbations. The derivative of the Hellinger divergence with respect to single channel coefficients scales with the divergence itself, which leads to a lack of robustness. In contrast,
\[
\frac{\partial R_\alpha\bigl(p(P \mid c)\,\big\|\,p(P \mid c')\bigr)}{\partial c'_n}
= \psi(c'_n) - \psi(|c'|) + \psi(|c_\alpha|) - \psi(c_{\alpha,n}),
\qquad (6.28)
\]
where $\psi(c_n) = \Gamma'(c_n)/\Gamma(c_n)$ is the digamma function; see, e.g., Van Trees et al. [2013, p. 104].
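As a quick sanity check of (6.28), the gradient can be written with scipy.special.digamma and compared against a finite difference of the betaln-based (6.27). The function name renyi_grad_cprime and the test vectors are made up for illustration; the sketch reuses renyi_dirichlet from the example after (6.27).

```python
import numpy as np
from scipy.special import digamma

def renyi_grad_cprime(c, c_prime, alpha=0.5):
    """Gradient of R_alpha(Dir(c) || Dir(c')) with respect to c', cf. (6.28)."""
    c, c_prime = np.asarray(c, float), np.asarray(c_prime, float)
    c_alpha = alpha * c + (1.0 - alpha) * c_prime
    return (digamma(c_prime) - digamma(c_prime.sum())
            + digamma(c_alpha.sum()) - digamma(c_alpha))

# Finite-difference check on made-up channel vectors
c  = np.array([2.0, 3.0, 1.5])
cp = np.array([1.0, 4.0, 2.0])
n, eps = 1, 1e-6
cp_eps = cp.copy()
cp_eps[n] += eps
numeric  = (renyi_dirichlet(c, cp_eps) - renyi_dirichlet(c, cp)) / eps
analytic = renyi_grad_cprime(c, cp)[n]
print(numeric, analytic)   # the two values should agree to about 1e-6
```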
Thus, the Rényi divergences of the posteriors estimated from $c$ and $c'$ are candidates for suitable distance measures. Unfortunately, robustness against outliers is still limited, and the introduction of an outlier process as in (6.25) is analytically cumbersome.
Also, the fully symmetric setting is less common in practice, but occurs, e.g., in the computation of affinity matrices for spectral clustering. Most cases aim at the comparison of a new measurement with previously acquired ones, for which the posterior predictive (6.25) is more suitable.
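For the fully symmetric setting, one possible way to build an affinity matrix for spectral clustering from channel vectors is sketched below: pairwise Rényi divergences, symmetrized and mapped through an exponential kernel. The kernel choice, the scale sigma, and the function name are assumptions for illustration, not a prescription from the text; renyi_dirichlet is reused from the sketch after (6.27).

```python
import numpy as np

def affinity_matrix(C, alpha=0.5, sigma=1.0):
    """Pairwise affinities between the rows of C (one channel vector per row),
    from symmetrized Renyi divergences passed through an exponential kernel
    (illustrative choice of kernel and scale)."""
    C = np.asarray(C, dtype=float)
    K = len(C)
    A = np.eye(K)                       # zero self-divergence -> affinity 1 on the diagonal
    for i in range(K):
        for j in range(i + 1, K):
            d = 0.5 * (renyi_dirichlet(C[i], C[j], alpha)
                       + renyi_dirichlet(C[j], C[i], alpha))
            A[i, j] = A[j, i] = np.exp(-d / sigma)
    return A
```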
The proposed distances have been discussed assuming one-dimensional distributions, but the results generalize to higher dimensions, both for independent and dependent stochastic variables. Obviously, it is an advantage to have uniform marginals, and in that case, dependent joint distributions correspond to a non-constant copula distribution; see Section 6.4.
6.4 UNIFORMIZATION AND COPULA ESTIMATION
As mentioned in Section 6.1, we often assume a uniform prior for the channel vector. However,
if we compute the marginal distribution from the posterior distribution for a large dataset, the
components of a channel vector might be highly unbalanced. This issue can be addressed by
placing channels in a non-regular way, according to the marginal distribution, i.e., with high
channel density where samples are likely.
This placement is obtained by mapping samples through the cumulative distribution function of the distribution from which the samples are drawn. The cumulative distribution function can be computed from the estimated distribution as obtained from maximum entropy decoding; see Section 5.3.
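A minimal sketch of this uniformization is given below, assuming the estimated density is available as values on a grid; the grid representation, the trapezoidal CDF, and the function name uniformize are illustrative choices, not the maximum entropy decoding procedure of Section 5.3 itself.

```python
import numpy as np

def uniformize(samples, density, grid):
    """Map samples to [0, 1] through the CDF of a density given as values on a grid,
    e.g. a density estimate such as the one obtained by maximum entropy decoding."""
    samples, density, grid = (np.asarray(a, dtype=float) for a in (samples, density, grid))
    # Cumulative distribution function by trapezoidal integration of the density
    cdf = np.concatenate(([0.0],
                          np.cumsum(0.5 * (density[1:] + density[:-1]) * np.diff(grid))))
    cdf /= cdf[-1]                        # enforce CDF(grid[-1]) = 1
    return np.interp(samples, grid, cdf)  # uniformized sample positions
```

Regularly spaced channels applied to the uniformized samples then correspond to non-regularly placed channels in the original domain, with high channel density where samples are likely.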
The subsequent procedure has been proposed by Öfjäll and Felsberg [2017].