6.3 COMPARING USING DIVERGENCES
In the previous section, we considered the posterior predictive to match a new channel vector c' to a previously observed vector c. In symmetric settings, both channel vectors c and c' are drawn from the respective generating data distributions p and p'. From these measurements, we now estimate the divergences of p and p'. The Hellinger distance is a special case of the Hellinger divergence
\[
H_\alpha(p \,\|\, p') = \frac{1}{\alpha - 1} \left( \int p(x)^{\alpha}\, p'(x)^{1-\alpha}\, dx - 1 \right) \tag{6.22}
\]
for ˛ D 1=2. is and other special cases are listed in Table 6.1, in accordance with the classifi-
cation of ˛-divergences, as listed by Felsberg et al. [2013].
Table 6.1: Special cases of Hellinger divergences

    Case       Distance
    α = 1/2    Hellinger distance
    α ↑ 1      Kullback-Leibler divergence
    α ↓ 0      Log-likelihood ratio
    α = 2      Neyman χ² divergence
    α = −1     Pearson χ² divergence
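For illustration, (6.22) can be evaluated directly for discrete distributions. The following Python sketch uses made-up probability vectors p and q and an illustrative helper hellinger_divergence(), none of which appear in the text; it also checks the α ↑ 1 entry of Table 6.1 against the Kullback-Leibler divergence.

    # Illustrative sketch of the Hellinger divergence (6.22) for discrete distributions.
    import numpy as np

    def hellinger_divergence(p, q, alpha):
        """H_alpha(p || q) from (6.22) for probability vectors p, q."""
        p = np.asarray(p, dtype=float)
        q = np.asarray(q, dtype=float)
        return (np.sum(p**alpha * q**(1.0 - alpha)) - 1.0) / (alpha - 1.0)

    p = np.array([0.2, 0.5, 0.3])  # made-up example distributions
    q = np.array([0.3, 0.3, 0.4])

    # alpha = 1/2: the Hellinger-distance case of Table 6.1.
    print(hellinger_divergence(p, q, 0.5))

    # alpha just below 1 approaches the Kullback-Leibler divergence sum(p * log(p/q)).
    print(hellinger_divergence(p, q, 1.0 - 1e-6), np.sum(p * np.log(p / q)))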
Closely related to the Hellinger divergences are the Rényi divergences
\[
R_\alpha(p \,\|\, p') = \frac{1}{\alpha - 1} \log \int p(x)^{\alpha}\, p'(x)^{1-\alpha}\, dx \tag{6.23}
\]
by the equality
\[
H_\alpha(p \,\|\, p') = \frac{1}{\alpha - 1} \Big( \exp\!\big( (\alpha - 1)\, R_\alpha(p \,\|\, p') \big) - 1 \Big). \tag{6.24}
\]
Surprisingly, α ↑ 1 also leads to the Kullback-Leibler divergence in this case.
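The identity (6.24) is straightforward to verify numerically. The short Python sketch below does so for α = 2; the distributions and the helpers hellinger_divergence() and renyi_divergence() are illustrative choices, not part of the preceding derivation.

    # Illustrative numerical check of identity (6.24) relating H_alpha and R_alpha.
    import numpy as np

    def hellinger_divergence(p, q, alpha):
        return (np.sum(p**alpha * q**(1.0 - alpha)) - 1.0) / (alpha - 1.0)

    def renyi_divergence(p, q, alpha):
        return np.log(np.sum(p**alpha * q**(1.0 - alpha))) / (alpha - 1.0)

    p = np.array([0.2, 0.5, 0.3])  # made-up example distributions
    q = np.array([0.3, 0.3, 0.4])
    alpha = 2.0

    lhs = hellinger_divergence(p, q, alpha)
    rhs = (np.exp((alpha - 1.0) * renyi_divergence(p, q, alpha)) - 1.0) / (alpha - 1.0)
    print(np.isclose(lhs, rhs))  # True: (6.24) holds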
Moreover, for exponential families, closed-form solutions of the Rényi divergence exist, in particular for the Dirichlet distribution (6.6). For a uniform prior α = 1 we obtain (note that the classical notation for both the Dirichlet prior and the Rényi divergence uses α; in what follows, α denotes the divergence parameter and the parameter of the Dirichlet distribution is set to 1):
\[
R_\alpha\big( p(P \mid c) \,\|\, p(P \mid c') \big) = \log \frac{B(c')}{B(c)} + \frac{1}{\alpha - 1} \log \frac{B(c_\alpha)}{B(c)}, \tag{6.25}
\]
where c_α = αc + (1 − α)c' and B(·) is the N-dimensional Beta function as defined in the previous section.
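A numerical sketch of (6.25) is given below. It assumes that c and c' already denote valid Dirichlet parameter vectors in the sense of the previous section; the helpers log_beta() and renyi_dirichlet() and the example vectors are illustrative, not part of the text. The N-dimensional Beta function is evaluated through the log-Gamma function to avoid overflow.

    # Illustrative sketch of the closed-form Renyi divergence (6.25) between Dirichlet posteriors.
    import numpy as np
    from scipy.special import gammaln

    def log_beta(c):
        """log of the N-dimensional Beta function: sum_n log Gamma(c_n) - log Gamma(sum_n c_n)."""
        c = np.asarray(c, dtype=float)
        return np.sum(gammaln(c)) - gammaln(np.sum(c))

    def renyi_dirichlet(c, c_prime, alpha):
        """R_alpha(p(P|c) || p(P|c')) following (6.25); requires alpha*c + (1-alpha)*c' > 0."""
        c = np.asarray(c, dtype=float)
        c_prime = np.asarray(c_prime, dtype=float)
        c_alpha = alpha * c + (1.0 - alpha) * c_prime
        return (log_beta(c_prime) - log_beta(c)
                + (log_beta(c_alpha) - log_beta(c)) / (alpha - 1.0))

    c = np.array([3.0, 1.0, 2.0])        # made-up example parameter vectors
    c_prime = np.array([1.0, 4.0, 1.0])
    print(renyi_dirichlet(c, c_prime, 0.5))  # alpha = 1/2, the symmetric Hellinger-type case

For α ∈ (0, 1) the positivity of c_α is automatic, so this range is always well defined; outside it, c_α may contain non-positive entries and the closed form no longer applies.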