6
Manifold means

6.1 Intrinsic and extrinsic means

The structure of shape and size variability is the primary concern in most applications (see Bookstein, 2016). However, the concept of a mean shape or mean size-and-shape also has an underpinning role to such analysis, for example when choosing a tangent space projection. We consider a situation where a population of shapes or size-and-shapes can be modelled by a probability distribution. Early definitions of a mean in a general manifold were given by Fréchet (1948) and Grove and Karcher (1973).

Definition 6.1 If d is a choice of distance in a manifold M then a population mean is given by:

(6.1) numbered Display Equation

where EX[ · ] is the expectation operator with respect to the random quantity X inline M.

If μM is a global minimum then it has been called a ‘Fréchet mean’ in the literature, and the term ‘Karcher mean’ has been used when it is a local minimum, after the work of Grove and Karcher (1973) and Karcher (1977) on Riemannian centres of mass. Although the terminology is not universally applied, we shall use these terms in order to help distinguish global and local minimizers. A detailed history is given by Afsari (2011).

Definition 6.2 A population variance is:

(6.2) numbered Display Equation

and if it is a global minimum then σ2M is called the Fréchet variance.

In the case of a Euclidean space the population mean is simply

numbered Display Equation

and the population variance is:

numbered Display Equation

However, for more general manifolds we have many choices. The first distinction to make is that of an intrinsic mean versus an extrinsic mean, which are defined using either an intrinsic or extrinsic distance, respectively. An intrinsic distance is the length of the shortest geodesic path in the manifold; an extrinsic distance is induced by a Euclidean distance in an embedding of the manifold; and another type of distance is a residual distance in a tangent space (Huckemann 2012).

6.2 Population mean shapes

Definition 6.3 If d = ρ is an intrinsic distance in the shape space (e.g. the Riemannian metric) then the population mean is called the intrinsic mean shape and is denoted by:

(6.3) numbered Display Equation

Intrinsic means are unique provided the distribution is sufficiently concentrated, (Karcher 1977; Kendall 1990b; Le 2001; Afsari 2011). Further discussion is given in Section 13.2.

Definition 6.4 The extrinsic mean shape is given by P(μE) where

(6.4) numbered Display Equation

and j(X) is an embedding into a Euclidean space, P(μ) is a unique projection from the embedding space to M, and ρJ is an extrinsic distance. By ‘unique’ we mean that the projection maps to a unique closest point in M.

There are many choices of Euclidean embedding for an extrinsic mean and Bhattacharya and Patrangenaru (2003, 2005) focus particularly on the Veronese–Whitney embedding for shape in two dimensions and reflection shape in three dimensions, although any embedding can be used to find an extrinsic distance. For a complex pre-shape (k − 1)-vector z, the Veronese–Whitney embedding is j(z) = zz*, and the Euclidean distance using this embedding is the full Procrustes distance for 2D shapes. For a real pre-shape (k − 1) × m matrix Z the Veronese–Whitney embedding is j(Z) = ZZT, which gives rise to an embedding used for computing a MDS mean for reflection shape (see Section 15.3). Kent (1992) provides a general family of equivariant embeddings for 2D shapes which give rise to the extrinsic distances:

where h is a positive integer.

Definition 6.5 The population partial Procrustes mean shape is given by:

(6.6) numbered Display Equation

Huckemann (2012) uses the term ‘Ziezold mean’ for the mean which involves minimizing the expected square Euclidean distance on a pre-shape sphere, after Ziezold (1977). The partial Procrustes mean is a Ziezold mean.

Definition 6.6 The population full Procrustes mean shape is given by:

(6.7) numbered Display Equation

The full Procrustes mean is an extrinsic mean for planar landmark data using the complex Veronese–Whitney embedding on pre-shape space. Huckemann (2012) uses the term ‘residual mean’ for the mean which minimizes the expected square Euclidean distance in the tangent space, and the full Procrustes mean is also a residual mean.

We can also define the population mean in the size-and-shape space, as seen in Section 6.7.

6.3 Sample mean shape

Consider a random sample of data available X1, …, Xn inline M, where M is a manifold.

Definition 6.7 A sample mean shape is:

If the the minimizer is a global minimum then the term sample Fréchet mean shape is used, and if a local minimizer is available then we use the term sample Karcher mean shape.

The minimum of the objective function is a sample variance, as we include the divisor n in (6.8).

Also, we can have intrinsic sample means or extrinsic sample means depending on the choice of distance, in a similar manner as for population mean shapes.

Definition 6.8 The sample intrinsic mean shape is given by:

(6.9) numbered Display Equation

Definition 6.9 The sample partial Procrustes mean shape is given by:

(6.10) numbered Display Equation

Definition 6.10 The sample full Procrustes mean shape is given by:

(6.11) numbered Display Equation

Different types of distance give rise to different means, as in Table 6.1. In some situations it is numerically much more difficult to work with the intrinsic means compared with the extrinsic means, but there are also many situations where the computational differences are not so large. We will demonstrate in some examples that the means are often very similar, although they can differ for example when outliers are present.

Example 6.1 We begin with an illustrative example which highlights the intrinsic and extrinsic sample means in the very simple case of a circle S1 (the shape space of k = 3 points in m = 1 dimension). Consider angles θ1, …, θn on a circle. The intrinsic distance between θ1, θ2 is:

numbered Display Equation

The intrinsic sample mean is:

numbered Display Equation

which must be found numerically. For an extrinsic mean we embed the angles into the complex plane . The extrinsic mean is given by minimizing

numbered Display Equation

with respect to μ and hence . Finally we project back onto the circle to give the extrinsic mean as the argument of the complex mean :

numbered Display Equation

Table 6.1 Some shape space distances. Different types of means are obtained with different choices of distance.

Notation Type
Name Distance for mean of mean
Riemannian ρ Intrinsic
Partial Procrustes 2sin (ρ/2) Extrinsic/Ziezold
Full Procrustes sin ρ Extrinsic/Residual
Kent’s family Extrinsic

Hotz and Huckemann (2015) discuss the intrinsic sample mean on the circle in particular detail, including discussion of uniqueness and the development of asymptotic results.

6.4 Comparing mean shapes

Dryden et al. (2014) consider a generalization of family of distances given by Kent (1992) in Equation (6.5) to the functions

on the shape space Σkm of k points in m dimensions, where and ρ is the Riemannian shape distance. The resulting mean shape obtained by minimizing EX[dh(X, μ)2] has different robustness properties depending on the value of h. For larger h the mean is more resistant to outlier shapes and for smaller h it is less resistant to outlier shapes. The partial Procrustes mean shape is equivalent to using and the full Procrustes mean shape is equivalent to using h = 1. The intrinsic mean is very similar to using , as we shall see in Result 6.1. Note that the intrinsic mean shape is less resistant than the partial Procrustes mean shape. If outliers are present then larger h values are preferred, e.g. h = 1 rather than h = 1/2 or h = 1/3.

Example 6.2 We consider an example which helps to illustrate how similar the types of mean shapes can be in practice, but it also highlights some differences between them. We calculate a variety of sample means for the male gorilla data of Section 1.4.8 which has small shape variability. There are k = 8 landmarks in m = 2 dimensions on n = 29 male gorillas.

We calculate 13 means with different choices of distances, including nine means indicated by values of h using the distances in Equation (6.12). In particular, we label the means as follows: 1, h = 0.001; 2, h = 0.1; 3, ; 4, intrinsic; 5, (partial Procrustes); 6, h = 0.75; 7, h = 1 (full Procrustes); 8, h = 1.5; 9, h = 2; 10, h = 5; 11, MDS; 12, MDS(α = 1); and 13, isotropic maximum likelihood estimate (MLE) (shape of the means – see Section 6.6). Here MDS refers to the mean shape from Equation (14.3) and MDS(α = 1) refers to the mean shape from Equation (14.4). Both MDS means have been reflected if necessary to minimize the Riemannian distance to the intrinsic mean shape.

The Riemannian shape distance matrix is computed between all pairs of 29 skulls and the 13 mean shapes, and classical MDS is carried out (Mardia et al. 1979). The first two principal coordinates are displayed in Figure 6.1a. All the mean shapes are actually very close, as seen in Figure 6.1a. In Figure 6.1a–c the observations (circles) and the means (using their number label) are plotted and the means 1–10 are connected with piecewise straight lines. In Figure 6.1b we zoom in to see that means 10, 11 and 12 are further away from the remainder. Zooming in further in Figure 6.1c we see that means 1–10 are all close to being on a common straight line. Also, mean 13 is very close to mean 7 (but not on the line). Also, means 3 and 4 are indistinguishable even at this fine scale. Huckemann (2012) derived an intriguing result for concentrated distributions that the intrinsic, partial Procrustes (Ziezold mean) and full Procrustes estimators approximately lie on a geodesic, with the intrinsic and partial Procrustes mean closer together than the partial and full Procrustes (ratio of 1:3 approximately). In this example the ratio is 1:3.00171.

Image described by surrounding text and caption.

Figure 6.1 Multidimensional scaling plot of the male gorilla data and 13 means using Riemannian distance to form the dissimilarity matrix. (a) The principal coordinates of the 29 skulls (circles) and the 13 means (numbers). (b) Zoomed in version of (a) near the origin. (c) Another zoomed in version of (a) near the origin. In each of (a)–(c) the mean shapes are indicated by their number label, and the first ten means are joined by straight lines. (d) A plot of the observed ratios and expected approximate ratios of Riemannian distances for different h.

Image described by surrounding text and caption.

Figure 6.2 (a) Multidimensional scaling plot of the digit 3 data and ten means using Riemannian distance to form the dissimilarity matrix. The ten sample mean shapes are: 1, h = 0.001; 2, intrinsic; 3, partial Procrustes; 4, isotropic MLE; 5, full Procrustes; 6, h = 2; 7, h = 3; 8, h = 5; 9, h = 10; and 10, h = 20. (b) A zoomed in view showing the ten means and lines joining them in order of resistance to outlier shapes, with 10 being the most resistant.

In the following result we assume that the shape distribution is concentrated on Σkm and sufficiently regular such that the objective functions for obtaining the means are approximately quadratic in the vicinity of the unique minimum.

Result 6.1 The family of mean shapes using dh in Equation (6.12) indexed by are approximately located on a minimal geodesic passing through the intrinsic mean μI. In addition we have

(6.13) numbered Display Equation

A proof of this result can be obtained from Taylor series expansions of the objective function, and details are given in Dryden et al. (2014).

The above result will hold for either the population or sample mean shapes provided the shape distribution or sample data are concentrated and sufficiently regular. For h = 1 it follows that R = ρ(μI, μ1)/ρ(μI, μ1/2) ≈ 4, which corresponds to Huckemann's (2012) ratio of 1:3. Also, for we have R ≈ 0, and hence the intrinsic mean shape and the extrinsic mean are expected to be extremely close, as seen in our example. As h → 0 the furthest that μh can be from the intrinsic mean is approximately twice the distance as that of the intrinsic mean to the partial Procrustes mean in the opposite direction along the minimal geodesic. Finally note that if a single outlier is present in the data then the less resistant means are pulled more towards the outlier, and hence in this case the outlier determines the particular geodesic on which the means approximately lie.

In Figure 6.1d the value of the empirical ratio obtained from the sample means in the male gorilla data is plotted versus the theoretical approximate ratio R = 6h − 2 for each of the 9 values of h. Note that if we have also negated the empirical ratio. The agreement is extremely close in this example, and indeed it is very close in many other examples, although for less concentrated data there are more differences for larger values of h.

6.5 Calculation of mean shapes in R

To illustrate some commands in R for calculating mean shapes, we calculate a variety of sample means for the digit 3 data of Section 1.4.2 which has moderate variability, and the first observation might be considered an outlier as it is over 0.7 in terms of Riemannian distance from the mean. We calculate 10 means (1, h = 0.001; 2, intrinsic; 3, partial Procrustes; 4, isotropic MLE (see Section 11.3); 5, full Procrustes; 6, h = 2; 7, h = 3; 8, h = 5; 9, h = 10; 10, h = 20), which are successively more resistant to outliers. The full Procrustes mean shape is given in Figure 7.15.

A command used in R to calculate each mean is frechet which involves numerical optimization with the non-linear minimization function nlm.

frechet(digit3.dat,mean="intrinsic")
frechet(digit3.dat,mean="partial.procrustes")
frechet(digit3.dat,mean="mle")
frechet(digit3.dat,mean="full.procrustes")
frechet(digit3.dat,mean=2)

This calculates the mean shapes labelled 2, 3, 4, 5, 6 respectively. Some of the options only work for m = 2 dimensions, for example mle. A more reliable way of computing the Procrustes mean shapes is with the generalized Procrustes algorithm of Chapter 7. So, for example, the full Procrustes mean shape of the digit 3 data is given by:

procGPA(digit3.dat)$mshape

For the digit 3 example a distance matrix is computed between all 30 observations and the 10 mean shapes, and classical MDS is carried out (using cmdscale in R) and displayed in Figure 6.2. All the mean shapes are actually very close. However, on a fine scale we see that the less resistant estimators are pulled towards the outlier, and again the intrinsic, partial Procrustes and full Procrustes estimators lie approximately on a geodesic, with the intrinsic and partial Procrustes mean closer together than the partial and full Procrustes (ratio of 1:3 approximately). Again this provides another demonstration of Result 6.1.

As another example of how similar the mean shapes can be we estimate the intrinsic, partial and full Procrustes mean shapes for the Small group of mouse vertebrae in Section 1.4.1. Given that the estimates are invariant to rotation and scale we standardize them to Bookstein shape variables:

> f1<-frechet(qset2.dat,mean="intrinsic")
> f2<-frechet(qset2.dat,mean="partial.procrustes")
> f3<-frechet(qset2.dat,mean="full.procrustes")
> bookstein.shpv(f1$mshape)
         [,1] [,2]
[1,] -0.50000000 0.0000000
[2,] 0.50000000 0.0000000
[3,] 0.08490820 0.2924684
[4,] 0.01245608 0.5589496
[5,] -0.06869796 0.2982314
[6,] -0.02512807 -0.3044915
> bookstein.shpv(f2$mshape)
         [,1] [,2]
[1,] -0.50000000 0.0000000
[2,] 0.50000000 0.0000000
[3,] 0.08490589 0.2924798
[4,] 0.01245726 0.5589728
[5,] -0.06869559 0.2982405
[6,] -0.02513298 -0.3044881
> bookstein.shpv(f3$mshape)
         [,1] [,2]
[1,] -0.50000000 0.0000000
[2,] 0.50000000 0.0000000
[3,] 0.08489897 0.2925142
[4,] 0.01246093 0.5590424
[5,] -0.06868845 0.2982678
[6,] -0.02514771 -0.3044776

These mean shape estimates are all extremely similar; the same to four decimal places using Bookstein shape variables. The Fréchet variances are all very similar:

> f1$var
[1] 0.09448654
> f2$var
[1] 0.09442313
> f3$var
[1] 0.09423311

Also, the Riemannian distances between the mean shape estimates are very close:

> riemdist(f1$mshape,f2$mshape)
[1] 1.737509e-05
> riemdist(f1$mshape,f3$mshape)
[1] 6.962873e-05
> riemdist(f2$mshape,f3$mshape)
[1] 5.225367e-05

and the ratio of Riemannian distances from the intrinsic to partial Procrustes to full Procrustes means is again approximately 1:3 here (Huckemann 2012).

Although for many practical examples there is very little difference between the estimates, in certain pathological examples they can be very different indeed (Huckemann 2012), and in particular the intrinsic and partial Procrustes means are manifold stable, whereas the full Procrustes mean may not be. Manifold stable means the following: given a distribution on the manifold part of the space, the mean also lies in the manifold part of the space with probability 1, that is there is zero probability of the mean being in the non-manifold strata (Huckemann 2012).

There are many other mathematical statistical aspects of mean shapes and estimation on manifolds that have been explored in the last couple of decades. Two papers by Bhattacharya and Patrangenaru (2003, 2005) lay out the framework for consistency for intrinsic and extrinsic mean estimators, and provide central limit theorems for extrinsic means in particular detail. Huckemann and Hotz (2014) discuss means and asymptotic results for circles and shape spaces. In particular they consider strata of shape spaces and demonstrate that means at singularities must be controlled. More discussion of these theoretical issues will be given in Chapter 13.

6.6 Shape of the means

A different notion of mean shape is where a probability distribution for the original configuration is given with, say, mean E[X] = μ. The population mean shape is then given by the shape of μ, written as [μ]. This type of shape has been called the shape of the means (Kendall et al. 1999, Chapter 9; Le and Kume 2000b). This type of mean shape was considered by Mardia and Dryden (1989a,b), where independent isotropic Gaussian perturbations of planar landmarks about a mean configuration were considered. Further discussion is given in Section 11.1.2. Also, the extension to non-isotropic errors was considered by Dryden (1989) and Dryden and Mardia (1991b) (see Section 11.2), and to higher dimensions by Goodall and Mardia (1992, 1993) (see Section 11.6).

In general the population Fréchet mean and the shape of the means are different, although there are some special cases where they coincide. For example, for independent isotropic Gaussian errors for planar landmark data Kent and Mardia (1997) show that the shape of the means is the same as the population full Procrustes mean.

6.7 Means in size-and-shape space

6.7.1 Fréchet and Karcher means

Definitions of Fréchet and Karcher means for size-and-shape follow in exactly the same manner as in the shape space, except that the size-and-shape distance dS is used in place of a shape distance. From Chapter 5 for the size-and-shape space the Riemannian size-and-shape distance coincides with the Procrustes distance and the residual distance, where optimization is carried out over location and rotation only in calculating the distance. So, for size-and-shape the population intrinsic Fréchet mean is identical to the population extrinsic Procrustes/Ziezold mean which is the same as the extrinsic residual mean.

Definition 6.11 The population Fréchet mean size-and-shape is:

(6.14) numbered Display Equation

which is also known as the population Procrustes mean size-and-shape. Here the optimum is global.

Again, the term Karcher mean size-and-shape is used for a local minimum.

Definition 6.12 The sample Fréchet mean size-and-shape from a random sample X1, …, Xn is:

(6.15) numbered Display Equation

which is also known as the sample Procrustes mean size-and-shape if it is a global optimum.

The term Karcher mean size-and-shape is used for a local minimum. The sample Fréchet or Karcher mean size-and-shape is an intrinsic, extrinsic and residual mean size-and-shape.

6.7.2 Size-and-shape of the means

The size-and-shape of the means is another type of mean and this is different from the Procrustes mean in general. In this situation a model for the configuration X is proposed with mean E[X] = μ, and then the size-and-shape of the means is the size-and-shape of μ written as [μ]S, using the notation of Section 3.3. See Section 11.5 for further details.

6.8 Principal geodesic mean

An alternative measure of mean shape is where a geodesic or curve is first fitted to a dataset, and then a mean is estimated to minimize the sum of square distances constrained to lie within the geodesic path. This approach was considered by Huckemann and Ziezold (2006), Huckemann et al. (2010) and Kenobi et al. (2010). The type of mean can be very different from the Fréchet and other means, for example for data widely dispersed in a ‘girdle’ distribution, where the data are distributed around an approximate geodesic.

A further generalization was given by Jung et al. (2012) who considered a ‘backwards’ approach in fitting successively lower dimensional shape spaces, called principal nested shape spaces. The method starts with fitting the highest dimensional subspace, and then fits a sequence of lower dimensional subspaces, with the final iteration involving the fitting of a principal nested shape space mean. This method contrasts with the more usual ‘forwards’ methods of starting with a mean and then fitting successively more complicated, higher dimensional structures. The principal nested shape space mean can again be very different from the usual means, including for girdle distributed datasets. See Section 13.4.5 for more details, and also see Marron et al. (2010) for further discussion about the advantages of the backwards approach to PCA.

6.9 Riemannian barycentres

Kendall (2015) described the use of Riemannian barycentres, which provide a straightforward calculation for computing a mean shape on Riemannian manifolds. For a sphere we use:

numbered Display Equation

where the expectation is taken in the real embedding space ( for a sphere in three dimensions). Hence the use of the Riemannian barycentre is useful for analysing triangle shape, where the shape space is a sphere (see Section 4.3.4). Sample estimates are extremely straightforward to compute as one simply uses arithmetic averages of the coordinates in the real embedding space. The method enables sophisticated inference to be implemented, for example Kendall (2015) investigated clustering of hurricane tracks and whether tracks were correlated in time. For the complex pre-shape sphere we could use:

numbered Display Equation

where , although it is important to remove orientation (e.g. by Procrustes registration).

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset