Qian Xie and Anuj Srivastava

3 Image registration using phase-amplitude separation

Qian Xie, 117 N. Woodward Ave., United States, 12

Anuj Srivastava, 117 N. Woodward Ave., United States, [email protected]

Abstract: The problem of registration of images is a classical one, with a variety of solutions proposed in the past. Motivated by recent progress in functional data analysis, we approach this problem from a novel viewpoint and develop a notion of phase and amplitude for image data. We define the amplitude of an image as a property that is invariant to diffeomorphic deformations of the background space; these deformations are said to change only the relative phases. Consequently, image registration is viewed as separation of phase and amplitude components in given images, and we derive a formal, metric-based framework for this separation. The main advantage of this approach is that the objective function used here for registration satisfies some important properties: (1) it is invariant under identical deformations of images being compared, and (2) it induces a proper metric on the amplitude space of images. The key is to use a mathematical representation, termed the q-map and used previously in shape analysis, such that the L2-norm in the range space of this map forms the objective function. This is also called the amplitude distance, and is used to align images. This alignment effectively results in a separation of phase and amplitude in image data. We demonstrate this framework using examples from a variety of image types and compare performances with some recent methods.

Keywords: metric-based registration, elastic image deformation, post-registration analysis, mean image, multiple registration

Mathematics Subject Classification 2010: 62H35, 68U10, 65D18

3.1 Introduction

The problem of image registration is one of the most widely studied problems in all of image processing. The basic problem is to find correspondences between pixels or voxels across images such that the registered pixel patterns match as well as possible. If one views pixel values as functions on a certain domain or background space, then matching can loosely be described as the alignment of peaks and valleys of these functions (and/or their derivatives) using deformations of that background space. Since images contain information about imaged objects, a registration of images is essentially about reconciling related information across observations. The larger goal is to compare images in a meaningful way and to use images to answer questions about object identities despite different imaging conditions, viewing angles, and even different sensors/modalities. In medical image analysis, as an example, the focus has been on longitudinal sequences of images to track, detect, and model structural changes that relate to tumor growth, treatment verification, and inter-patient comparisons. Therefore, registration of images is a vital component in statistical analysis and modeling of objects using image data.

The problem of image registration takes different forms according to the purposes of various applications. For instance, to detect aging effects in a time sequence of brain MR images, rigid alignment is often crucial for meaningful comparisons and statistical analysis. In contrast, when building template images from image ensembles, nonrigid alignments are known to improve the resulting template quality by averaging pixels that are similar in a geometric sense. In object recognition tasks, such as recognition of handwritten numbers and letters, it is often necessary to perform nonrigid registration, that is, to deform one image so as to match its pixel patterns with those of the other image as much as possible. The literature on image registration mostly focuses on designing an energy functional involving deformations that can be used to define and compute optimal deformations. The basic problem is to estimate a smooth and realistic deformation field that matches the corresponding points in a template and a source image. When given a set of (more than two) observed images, the goal is to register points across the domains of these images. The registration problem can be subdivided into categories in several ways:

On the basis of the number of images involved: pairwise registration, where two images are matched, and groupwise (or multiple) registration, where more than two images are involved.

On the basis of the modality: unimodal registration, which is performed within a single modality, and multimodal registration, which is performed for images across different imaging modalities.

On the basis of the transformation space considered to deform images: projective registration, affine registration, and rigid/nonrigid registration.

On the basis of the optimization procedure: variational, Bayesian, and so on.

In addition to being called registration, this problem appears in the literature under many names: matching, correspondence, reparameterization, domain warping, deformation, etc. The fundamental problem is essentially the same: which pixel/voxel in one image corresponds to which pixel/voxel in the other image? We will use these names interchangeably hereafter.

3.1.1 Current literature

Although the registration problem has been studied for almost two decades, there continue to be some fundamental limitations in the popular solutions that make them suboptimal, difficult to evaluate, and limited in scope. To explain these limitations, let ℱ be a certain set of ℝⁿ-valued functions on a domain D, made precise later. A pairwise registration between any two images f1, f2 ∈ ℱ is defined as finding a mapping γ, typically a diffeomorphism from D to itself, such that the pixels f1(s) and f2(γ(s)) are optimally matched to each other for all s ∈ D. To develop an algorithm for registration one needs: (1) an objective function formalizing the notion of optimality, and (2) a numerical procedure for finding the optimal γ. Although the numerical techniques for optimization, that is item (2), have become quite mature over the last ten years, the commonly used objective functions themselves have several fundamental shortcomings. It is the choice of objective function that is the focus of this paper. Registration problems are commonly posed as variational problems, with the most common form of the objective function being

    inf_{γ ∈ Γ} ( ∫_D |f1(s) − f2(γ(s))|² ds + λ R(γ) ),    (1.1)

where | · | is the Euclidean norm, R is a regularization penalty on γ, typically involving its first and/or second derivatives, λ is a positive constant, and Γ denotes the space of relevant deformations. Several variations of this functional are also used. We highlight the shortcomings of these methods using a broader discussion about the desired properties of an objective function in the next section.
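As a concrete illustration, a discrete 1-D analogue of this variational objective can be sketched as follows. The roughness penalty R(γ) = ‖γ′ − 1‖², the test functions, and the value of λ are hypothetical choices made for this sketch, not prescriptions from the chapter.

```python
import numpy as np

def classical_objective(f1, f2, gamma, lam=0.1):
    """Discrete 1-D analogue of eq. (1.1): similarity term
    ||f1 - f2 o gamma||^2 plus lam * R(gamma), with the hypothetical
    roughness penalty R(gamma) = ||gamma' - 1||^2."""
    t = np.linspace(0.0, 1.0, len(f1))
    f2_warped = np.interp(gamma, t, f2)          # f2 o gamma via interpolation
    similarity = np.trapz((f1 - f2_warped) ** 2, t)
    roughness = np.trapz((np.gradient(gamma, t) - 1.0) ** 2, t)
    return similarity + lam * roughness

t = np.linspace(0.0, 1.0, 200)
f1 = np.sin(2 * np.pi * t)
f2 = np.sin(2 * np.pi * t ** 1.3)                # a warped copy of f1
gamma_id = t                                     # identity warp
gamma_good = t ** (1 / 1.3)                      # warp aligning f2 to f1
assert classical_objective(f1, f2, gamma_good) < classical_objective(f1, f2, gamma_id)
```

A good warp trades a small roughness penalty for a large drop in the similarity term; the constant λ controls exactly this trade-off.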

3.1.2 Our approach

We propose a novel way to approach the image registration problem in this paper. Focusing on deformations of the image domain by diffeomorphisms, for unimodal images, we introduce the concept of the phase and amplitude of an image. This is motivated by progress in shape analysis of objects [1, 9, 11, 12, 14, 18, 22, 24] and phase-amplitude modes in functional data [16, 23, 25, 31]. We start by defining the amplitude of an image as the property that is unchanged under arbitrary diffeomorphic deformations. A deformation is said to change only the phase of an image. This description forms the basis for a formal development of the notions of phase and amplitude components of an image, and for a proper computational framework for separating these components in given image data. This separation effectively yields a registration of the images involved. These developments rely on a novel mathematical representation of images, termed the q-map, and the L2-norms between such maps. Alternatively, one can view this construction as based on a novel metric imposed on the image space; this metric is defined by pulling back the L2-metric under the q-map.

The main advantage of this viewpoint is a logical and interpretable separation between sources of variability in images. In cases where the images under study are primarily considered deformations of each other, the main variability lies in their phase components, and one can exploit this restriction to improve performance. In other situations, where the images are already well registered, the primary source of variability is amplitude, and the phase representation can be left out of the analysis. Most practical situations lie between these two extremes and, similar to analysis of variance in statistics, one is interested in decomposing the observed variance into these two components. Furthermore, one can impose separate metric structures on these components and study/model them individually. In fact, we can interpret many of the past methods as imposing such metric structures without making them explicit.

The rest of this paper is organized as follows. In Section 3.2, we introduce the concepts of phase and amplitude of images, and describe the accompanying mathematical framework. In Section 3.3, we present a discussion on the theoretical properties of this framework, and its advantages relative to the current literature. We use Section 3.4 to outline a computational approach for separating phase-amplitude in a given dataset, and we present some experimental results in Section 3.5. We close the paper with a short summary and conclusion in Section 3.6.

3.2 Definition of phase-amplitude components

We start by defining the concepts of the phase and amplitude components of an image and then formulate the problem of separating them in a given dataset. This method applies to mathematical objects (or functions) whose range space has a dimension at least as high as that of their domain, that is, f : D → ℝⁿ with n ≥ m = dim(D). D is called the domain or the background space of an image. In the case of two-dimensional (or three-dimensional) images, this means that pixels have at least two (or three) coordinates, as is the case for colored images or multimodal images (with the different imaging modalities acting as the pixel coordinates). Scalar (grayscale) images can be transformed to satisfy this condition, as described later.

Let the image space be ℱ = {f : D → ℝⁿ | f ∈ C(D)} and let Γ = Diff+(D) be the set of orientation-preserving diffeomorphisms that preserve the boundary of D. It can be shown that Γ is a group with composition as the binary operation: for any γ, γ1, γ2 ∈ Γ, we have (1) γ1 ∘ γ2 ∈ Γ, (2) (γ1 ∘ γ2) ∘ γ = γ1 ∘ (γ2 ∘ γ), (3) there exists a γid ∈ Γ such that γid ∘ γ = γ ∘ γid = γ, and (4) there exists an h ∈ Γ such that γ ∘ h = h ∘ γ = γid. Hereafter, we will use ‖f‖ for the L2-norm of any f ∈ ℱ,

    ‖f‖ = ( ∫_D |f(s)|² ds )^{1/2}.

The elements of Γ play the role of deformations. Formally, a deformation is defined as the right action of Γ on ℱ, given by the mapping ℱ × Γ → ℱ, (f, γ) = f ∘ γ. Since Γ is a group and a deformation is a group action, it is possible to concatenate deformations or to undo deformations by applying their inverses. The action by γid implies no deformation. This group action sets up the definition of the amplitude of an image.
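A minimal numerical sketch of this action, assuming a 1-D domain [0, 1] and linear interpolation for composition (implementation choices of this sketch, not of the chapter):

```python
import numpy as np

t = np.linspace(0.0, 1.0, 400)

def warp(f, gamma):
    """Right action (f, gamma) = f o gamma on a discrete grid over [0, 1]."""
    return np.interp(gamma, t, f)

gamma = t + 0.1 * np.sin(np.pi * t) ** 2     # boundary-preserving diffeo of [0, 1]
gamma_inv = np.interp(t, gamma, t)           # numerical inverse of gamma
f = np.exp(-(t - 0.4) ** 2 / 0.01)
f_back = warp(warp(f, gamma), gamma_inv)     # (f o gamma) o gamma^{-1}
assert np.max(np.abs(f_back - f)) < 1e-2     # the inverse undoes the deformation
```

Because Γ is a group, the deformation is undone exactly by the inverse element; the small residual here is purely interpolation error.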

Definition 2.1 (Amplitude of an Image). For any image f ∈ ℱ, the amplitude of f is defined to be the set of all possible deformations of f, or the orbit of f, under Γ. That is, the amplitude of f is given by:

    [f] = {(f, γ) | γ ∈ Γ} = {f ∘ γ | γ ∈ Γ}.

According to this definition, any two images f1, f2 ∈ ℱ are said to have the same amplitude if there exists a γ ∈ Γ such that f1 = f2 ∘ γ. This forms an equivalence relation, and this relation partitions the image space ℱ into disjoint sets of amplitudes. Since deformations do not change the amplitudes of images, they can alter only the remaining information in the image. We will define this remainder of the image content as the phase, and we will make this precise in the ensuing sections. Figure 1 shows a pictorial illustration of this idea, where the curves denote the amplitudes of two images f1 and f2.

Fig. 1: Illustration of amplitudes [f1] and [f2], amplitude distance B, and relative phase A.

For defining phase, it is easier to start with images that have the same amplitude. Later on we will generalize the concept to arbitrary images. Another remark is that, as opposed to the amplitude of an image, the phase is defined only in a relative way. That is, one defines the relative phase of an image with respect to another. It is not an absolute property.

Definition 2.2 (Relative Phases of Images With Same Amplitude). The relative phase of an image f1, with respect to another image f2 ∈ [f1], is defined to be the element γ12 ∈ Γ such that f1 = f2 ∘ γ12.

Note that if γ12 is the relative phase of f1 with respect to an f2 ∈ [f1], then its inverse γ12⁻¹ is the relative phase of f2 with respect to f1. This is termed inverse consistency. In terms of image registration, inverse consistency implies that the registration of f1 to f2 is the same as the registration of f2 to f1. If two images are assumed to have the same amplitude, then the relative phase between them can be obtained by minimizing any reasonable norm between f1 and f2 ∘ γ, with the minimization performed over γ ∈ Γ. The most common choice is, of course, the L2-norm, but that has problems such as the pinching effect and a lack of inverse consistency (see [25]).

Before we proceed further, it is useful to attach some intuitive interpretation to these terms. During a deformation, the locations of the pixel values are altered but the values themselves are not changed; no new pixel values are added or removed. Thus, the notion of amplitude relates to the collection of pixel values present in an image. In contrast, the notion of phase relates closely to the deformation itself. This terminology is motivated by the corresponding notions used in the study of waves in physics: the change in height of a wave function over time is referred to as its amplitude, while a shifting (linear or nonlinear) of the time coordinate associated with that function is said to change its phase. Note that so far we have not made any choice of metric or norm on the space of images; the definitions are independent of any metric structure we impose on ℱ. However, for any further development, including the definition of the relative phase between arbitrary images, one needs to make some additional choices. The main choice comes in the form of a metric for comparing the amplitudes of images. This metric cannot be arbitrary; it should satisfy certain properties so that the resulting framework is consistent and meaningful. We introduce our choice of metric next and then return to the general notion of relative phase.

3.2.1 q-Map and amplitude distance

We start with this question: Suppose we want to compare the amplitudes of any two images, f1 and f2, using a proper metric; what metric can we use? If we try to use the L2-norm ‖f1 − f2‖ to quantify the difference in their amplitudes, we face a problem because ‖f1 − f2‖ ≠ ‖(f1, γ1) − (f2, γ2)‖ in general. In fact, equality fails even if γ1 = γ2. This is problematic because the action of Γ does not change the amplitudes of f1 and f2, and yet the norm changes. Another way to express this behavior is that the action of Γ is not by isometries under the L2-metric. That is, for any two f1, f2 ∈ ℱ and a general γ ∈ Γ, we have ‖f1 − f2‖ ≠ ‖f1 ∘ γ − f2 ∘ γ‖.

What are our other options? One potential metric can be derived as follows. The derivation requires a mathematical representation of images defined by a mapping called the q-map, which has been prompted by recent work in shape analysis of surfaces [13]. Here, we adapt it for analyzing images as follows. Let (x1, . . . , xm) : D → ℝᵐ be coordinates on (a chart of) D and let Jf(s) be the Jacobian matrix of f at s, with (j, i)-th element ∂f^j/∂x_i(s). Define the generalized area multiplication factor of f at s, for arbitrary n ≥ m, as a(s) = |∂f/∂x1(s) ∧ ∂f/∂x2(s) ∧ · · · ∧ ∂f/∂xm(s)|, where ∧ denotes the wedge product. The two special cases are: if m = n = 2, then a(s) = |det Jf(s)|; if m = 2 and n = 3, then a(s) = |∂f/∂x1(s) × ∂f/∂x2(s)|. Given the area multiplication factor, we can define the q-map.

Definition 2.3. For an f ∈ ℱ, define a mapping Q : ℱ → L² such that for any s ∈ D,

    Q(f)(s) = √(a(s)) f(s).

For any f ∈ ℱ, we will refer to q = Q(f) as its q-map; note that q : D → ℝⁿ. Also, we remark that this is a general version of the q-map, for arbitrary n, that extends the work of [13]. If the original image is smooth, then the resulting q-map is square integrable. Intuitively, the q-map sends homogeneous (constant) regions in an image to zero while preserving edge information in a way that is compatible with changes of variables; that is, stronger edges get higher values. If we deform an image f to f ∘ γ, how does its q-map change? This is given by the corresponding action of Γ on L², as follows.
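For the case m = n = 2, the q-map can be computed numerically as below. The grid, the spacing h, and the use of finite differences for the Jacobian are assumptions of this sketch:

```python
import numpy as np

def q_map(f, h):
    """q = sqrt(a) * f for f : D -> R^2 on a 2-D grid with spacing h,
    where a(s) = |det Jf(s)| is the area-multiplication factor (m = n = 2)."""
    f1_y, f1_x = np.gradient(f[..., 0], h)   # rows vary in y, columns in x
    f2_y, f2_x = np.gradient(f[..., 1], h)
    a = np.abs(f1_x * f2_y - f1_y * f2_x)    # |det Jf|
    return np.sqrt(a)[..., None] * f

h = 1.0 / 63
y, x = np.meshgrid(np.linspace(0, 1, 64), np.linspace(0, 1, 64), indexing="ij")
f = np.stack([x ** 2, y], axis=-1)           # Jf = [[2x, 0], [0, 1]], so a = 2x
q = q_map(f, h)
inner = np.s_[1:-1, 1:-1]                    # central differences are exact here
assert np.allclose(q[inner][..., 1], np.sqrt(2 * x[inner]) * y[inner])
```

Constant regions of f have a = 0 and are mapped to zero, while regions with strong local area change (edges) receive larger q-values, matching the intuition above.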

Lemma 2.4. The right action of Γ on the space of q-maps, corresponding to the deformation of the original image, is given by the mapping L² × Γ → L²,

    (q, γ)(s) = √(det Jγ(s)) q(γ(s)),

where Jγ(s) denotes the Jacobian matrix of γ at s.

Note that the mapping Q is equivariant; that is, for an image f and any γ, Q((f, γ)) = (Q(f), γ), where the action on the right side is given by Lemma 2.4 and the action on the left is simply composition. Why is it useful to introduce a new mathematical representation of images? This question is answered by the following important property.

Proposition 2.5. The deformation group Γ acts on the L2-space by isometries under the L2-norm; that is, for all q1, q2 ∈ L² and γ ∈ Γ, ‖(q1, γ) − (q2, γ)‖ = ‖q1 − q2‖.

Proof. Using a simple change of variables, t = γ(s) with dt = det Jγ(s) ds:

    ‖(q1, γ) − (q2, γ)‖² = ∫_D |q1(γ(s)) − q2(γ(s))|² det Jγ(s) ds = ∫_D |q1(t) − q2(t)|² dt = ‖q1 − q2‖².

Setting q2 ≡ 0, for any q1 ∈ L² and γ ∈ Γ we have ‖q1‖ = ‖(q1, γ)‖. Thus, the warping of images is actually a unitary operation under this representation.
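This isometry is easy to verify numerically in a 1-D analogue, where the action of Lemma 2.4 reduces to (q, γ) = √(γ′) (q ∘ γ); the test functions and the warp below are arbitrary choices for this check:

```python
import numpy as np

t = np.linspace(0.0, 1.0, 2000)
q1 = np.sin(2 * np.pi * t)
q2 = t * np.cos(2 * np.pi * t)
gamma = t ** 2 * (3 - 2 * t)                 # smooth boundary-preserving warp
dgamma = np.gradient(gamma, t)

def act(q, gamma, dgamma):
    """(q, gamma) = sqrt(gamma') * (q o gamma), the 1-D analogue of Lemma 2.4."""
    return np.sqrt(np.clip(dgamma, 0.0, None)) * np.interp(gamma, t, q)

lhs = np.trapz((act(q1, gamma, dgamma) - act(q2, gamma, dgamma)) ** 2, t)
rhs = np.trapz((q1 - q2) ** 2, t)
assert abs(lhs - rhs) < 1e-3                 # the action is by isometries
```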

We return to the problem of comparing amplitudes of any two given images and use q-maps to define a new metric.

Definition 2.6 (Amplitude Distance). Define the distance between the amplitude components of any two images f1 and f2 as the minimum L2-norm between the two orbits in terms of the q-maps:

    da([f1], [f2]) = inf_{γ1, γ2 ∈ Γ} ‖(q1, γ1) − (q2, γ2)‖ = inf_{γ ∈ Γ} ‖q1 − (q2, γ)‖.    (2.3)

In other words, find the two elements in their respective orbits such that their q-maps are as close to each other as possible with respect to the L2-distance. This distance is denoted by the letter B in Figure 1. Some properties of this distance are as follows:

  1. da is a proper distance on the space of amplitudes, or orbits under Γ. That is, it satisfies symmetry, da([f1], [f2]) = da([f2], [f1]); non-negativity, da([f1], [f2]) ≥ 0; and the triangle inequality.
  2. For any two f1, f2 ∈ [f], we have da([f1], [f2]) = 0.
  3. For any γ1, γ2 ∈ Γ and f1, f2 ∈ ℱ, we have da([f1], [f2]) = da([f1 ∘ γ1], [f2 ∘ γ2]).

Not only is the amplitude distance well defined, it agrees with the desired goals of this definition. It compares amplitude components of images and is invariant to arbitrary deformations of those images. Beyond comparing the amplitude components, the quantity da is of central importance in registration of images and in generalizing the notion of relative phase to arbitrary pairs of images. This is described in the next section.

3.2.2 Relative phase and image registration

The introduction of the q-map sets up the general concept of relative phases between images. As mentioned earlier, the notion of phase relates to the amount of deformation in an image. However, the specification of a deformation requires two images: one before and one after the deformation. In fact, as shown below, one can talk about the amount of deformation between any two images. We start with the registration problem.

Definition 2.7 (Registration of Images). The registration between any two images f1 and f2, represented by their q-maps q1 and q2, is defined as the solution of the optimization problem:

    γ12 = arg inf_{γ ∈ Γ} ‖q1 − (q2, γ)‖.    (2.4)

This makes γ12 the optimal mapping from the domain of f1 to the domain of f2. With this definition, the pixel f1(s) is said to be registered to the pixel f2(γ12(s)) for any s D. This also leads to the definition of relative phase.

Definition 2.8 (Relative Phase between Images). The relative phase of an image f1, with respect to another image f2, is given by the γ12 defined in equation 2.4.

It is interesting to note that while the notion of amplitude is independent of any metric structure on the image space, the notion of relative phase is closely tied to the choice of the amplitude distance. That is, da([f1], [f2]) = ‖q1 − (q2, γ12)‖. The actual value of the relative phase may change substantially if another metric structure is used instead.

There are some interesting properties associated with this definition of relative phase:

  1. If γ12 is the relative phase of f1 with respect to f2, then its inverse γ12⁻¹ is the relative phase of f2 with respect to f1.
  2. Two images are said to have zero phase between them if their relative phase is given by the identity map γid.
  3. The relative phase of f1 with respect to f2 ∘ γ is given by γ⁻¹ ∘ γ12. Similarly, the relative phase of f1 ∘ γ with respect to f2 is γ12 ∘ γ. As a corollary to this result, the relative phase of f1 ∘ γ with respect to f2 ∘ γ is given by γ⁻¹ ∘ γ12 ∘ γ.
  4. The relative phase between any two images remains unchanged under arbitrary positive scalar multiplications of, and constant additions to, the images. As a special case, the relative phase of c f1 with respect to f2 is the same γ12, for any c > 0.

3.3 Properties of the registration framework

So far we have established a framework, or a set of concepts, for defining and separating the phase-amplitude components of images, which also results in pairwise registration of images. However, since the registration of images is an important goal, we elaborate on the properties of our framework from the perspective of the image registration performance.

We start with the question: What should the properties of an objective function for registering images be? The answer is difficult, since we may desire different results in different contexts. In fact, one can argue that we may never have a perfect objective function that matches an expert's intuition and solution. Still, there is a set of basic properties that is desirable, even essential, in registration; some of these have been discussed previously in [4, 27]. Also, some of them have been achieved in previous papers while others have not. We list these properties next, starting with some notation; note that some of them overlap, in the sense that they, individually or jointly with others, imply some of the others. Let ℒ(f1, (f2, γ)) denote the objective function for matching f1 and f2 by optimizing over γ (here γ is assumed to be applied to f2, resulting in (f2, γ) ∈ ℱ). The bracket (f, γ) denotes the deformation of f by γ (that is, f ∘ γ), as earlier. In our approach, the function ℒ(f1, (f2, γ)) takes the form ‖q1 − (q2, γ)‖. The desired properties of ℒ are:

  1. Symmetry: For any f1, f2 ∈ ℱ, we want ℒ(f1, f2) = ℒ(f2, f1).
  2. Positive Definiteness: For any f1, f2 ∈ ℱ, we want ℒ(f1, f2) ≥ 0, with ℒ(f1, f2) = 0 ⟺ f1 = f2 almost everywhere.
  3. Lack of Bias: If f1, f2 are constant functions, then for any γ ∈ Γ, ℒ(f1, f2) = ℒ(f1, (f2, γ)).
  4. Invariance to Identical Warping: For any f1, f2 ∈ ℱ and γ ∈ Γ, we have ℒ(f1, f2) = ℒ((f1, γ), (f2, γ)).
  5. Triangle Inequality: For any f1, f2, f3 ∈ ℱ, ℒ(f1, f3) ≤ ℒ(f1, f2) + ℒ(f2, f3).

Despite their seemingly different appearances, Properties 1 to 4 are the same as, or closely related to, those introduced previously in [27]. Specifically, Properties 1 and 4 and the group structure of Γ together imply what was there termed Symmetry, and Property 4 and the group structure of Γ imply Invariance under SDiff+ but are actually stronger. Property 5 is introduced to the list so that Properties 1, 2, and 5 altogether imply that ℒ is a proper metric on ℱ.

Property 4 is listed as a standalone property not only because it is fundamental but also because of its practical importance. We have already discussed its role in defining da and in the definition of the relative phase (it was termed isometry earlier). However, it also has tremendous importance from the registration perspective. Consider the two images f1 and f2 shown in the left panel of Figure 2. Even though the two images are different, their corresponding pixels are nicely aligned. The middle panel shows an example of a warping function γ to be applied to both images, and the right panel shows the warped images f1 ∘ γ and f2 ∘ γ. It is easy to see that the correspondence between pixels across the two images remains unchanged. Thus, since ℒ is a measure of registration, or correspondence, between images, we need ℒ(f1, f2) = ℒ((f1, γ), (f2, γ)). However, if we take the commonly used L2-norm as the objective function (the first term in equation 1.1), the function values are not the same, as shown below the images. In summary, an identical warping of any two images keeps their registration unchanged and, hence, in order to properly measure the level of registration, an objective function must be invariant to identical warping. We seek a framework that achieves all of the properties listed above.
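This failure of the plain L2-norm is easy to reproduce numerically; the two aligned test images and the common warp γ(t) = t³ below are arbitrary choices for this 1-D sketch:

```python
import numpy as np

t = np.linspace(0.0, 1.0, 2000)
f1 = np.exp(-(t - 0.3) ** 2 / 0.02)
f2 = np.exp(-(t - 0.35) ** 2 / 0.02)         # well aligned with f1
gamma = t ** 3                               # identical warp applied to both images
w1, w2 = np.interp(gamma, t, f1), np.interp(gamma, t, f2)
before = np.sqrt(np.trapz((f1 - f2) ** 2, t))
after = np.sqrt(np.trapz((w1 - w2) ** 2, t))
assert not np.isclose(before, after, rtol=1e-2)   # plain L2 is not invariant
```

The pixel correspondence is unchanged by the identical warp, yet the L2 objective changes; under the q-map representation the two norms would agree, by Proposition 2.5.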

Fig. 2: Illustration of invariance to identical warping. The values of the L2 objective function are not the same: ‖f1 − f2‖ = 0.1085 and ‖f1 ∘ γ − f2 ∘ γ‖ = 0.1305.

This setup allows us to study another important property, inverse consistency, introduced in [4]. It states that for all f1, f2 ∈ ℱ, if γ* ∈ arg inf_{γ ∈ Γ} ℒ(f1, (f2, γ)), then (γ*)⁻¹ ∈ arg inf_{γ ∈ Γ} ℒ(f2, (f1, γ)). It is natural to demand this property, since it implies that the optimal registration between two images remains the same even if they are treated in the reverse order. Note that a combination of Property 1, Property 4, and the group structure of Γ, along with equation 2.3 in Definition 2.6, implies inverse consistency. The requirement that Γ be a group is important for deriving the other properties. While most papers use the full diffeomorphism group for Γ, some papers work with a subset of deformations, for example spline-based deformations, and that can be problematic, as discussed later.
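A toy 1-D illustration of inverse consistency, replacing the full optimization over Γ with a brute-force search over a small hypothetical family of warps t ↦ t^a (a simplification of this sketch, not the chapter's algorithm):

```python
import numpy as np

t = np.linspace(0.0, 1.0, 600)
family = {a: t ** a for a in np.round(np.linspace(0.5, 2.0, 31), 3)}

def act(q, gamma):
    """1-D q-map action: (q, gamma) = sqrt(gamma') * (q o gamma)."""
    dg = np.clip(np.gradient(gamma, t), 0.0, None)
    return np.sqrt(dg) * np.interp(gamma, t, q)

def register(q_template, q_moving):
    """Exponent a minimising ||q_template - (q_moving, t^a)|| over the family."""
    return min(family,
               key=lambda a: np.trapz((q_template - act(q_moving, family[a])) ** 2, t))

q1 = np.sin(2 * np.pi * t)
q2 = act(q1, t ** 1.25)                      # q2 is a deformed copy of q1
a12 = register(q1, q2)                       # warp applied to q2
a21 = register(q2, q1)                       # warp applied to q1
# the inverse of t^a is t^(1/a), so inverse consistency means a12 * a21 = 1
assert np.isclose(a12 * a21, 1.0, atol=0.1)
```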

While the objective function given in equation 1.1 is the one used most often, several variations have also been applied. For instance, sometimes the first term is replaced by mutual information [5, 36], cross-correlation [33, 34], minimum description length [6, 32], etc., or the second term is replaced by the length of a geodesic in the warping space (as in the LDDMM approach [2, 7, 17, 30]). Some methods conceptualize the average image under the large-deformation diffeomorphism setting as an unbiased atlas ([10, 15]). However, these methods do not use a formal metric for registration. Another idea is to impose regularization differently, for example using Gaussian smoothing of images (diffeomorphic demons [29, 35]). Some methods optimize the objective function over a proper subgroup Γ0 ⊂ Γ (e.g., the set of volume-preserving diffeomorphisms [27]), some over Γ, some over a larger set Γb ⊃ Γ (e.g., one that also includes nondiffeomorphic mappings), and some over even larger deformation spaces that are not groups (e.g., thin-plate splines [3, 8, 26]). Rather than going through individual methods and their properties in detail, we summarize their satisfaction of the desired properties in Table 1. It is interesting to note that not a single past method satisfies Property 4 without drastically restricting the deformation group Γ. Inverse consistency is, similarly, seldom satisfied.

Table 1: Properties of Objective Functions for Registration.

1: The objective function is invariant to the general linear group.

2: The objective function is invariant to the special group of volume- and orientation-preserving diffeomorphisms.

The symbol "–" indicates where it is not proper to evaluate the property in the given context.

We reiterate that Proposition 2.5 is exactly Property 4, invariance to identical warping, listed earlier. In view of this, the L2-norm between q-maps of images becomes a proper measure of registration between images, since it remains the same as long as the registration is unchanged. This leads to a quantity that serves as both the registration objective function and an extrinsic distance between registered images (termed da, or the amplitude distance, earlier).

The objective function given in Definition 2.7 satisfies all the properties listed earlier as those desired for registration. Specifically, it satisfies invariance to identical warping (Property 4). Since our transformation space Γ is a group, the framework also satisfies inverse consistency, as stated earlier. Additionally, the optimal registration is not affected by scalings and translations of image pixels: let g1 = c1 f1 + d1 and g2 = c2 f2 + d2 with c1, c2 > 0 and d1, d2 ∈ ℝⁿ; if γ* = arg inf_{γ ∈ Γ} ℒ(f1, (f2, γ)), then γ* = arg inf_{γ ∈ Γ} ℒ(g1, (g2, γ)) as well. We point out that there are some unresolved mathematical issues concerning the existence of a unique global solution γ*, especially its existence inside Γ rather than on its boundary. We leave this for a future discussion and focus on a numerical approach that estimates γ*.

The proposed objective function in equation 2.4 has only one term (the similarity term); the regularity term appears to be missing. However, the similarity term has a built-in regularity, since it includes the determinant of the Jacobian, det Jγ, in (q, γ). One can view this as a first-order penalty involving the first derivative of γ. Additional regularity can be introduced into the framework, as is done in the LDDMM framework, by adding external penalty terms.

3.4 Gradient method for optimization over Γ

The optimization problem over Γ stated in equation 2.4 forms the crux of our registration framework, and we use a gradient descent method to solve it. Since Γ is a group, we can perform the optimization iteratively; in each iteration, we use the gradient to solve for the incremental warping γ, on top of the previous cumulative warping γo, as follows. We define a cost function with respect to γ as the functional

    E[γ] = ‖q1 − ϕq̃2(γ)‖²,

where ϕq : Γ → [q] is defined as ϕq(γ) = (q, γ), and q̃2 = (q2, γo) with γo the current cumulative deformation. Since E is a real-valued function on Γ, its gradient is a smooth vector field on the domain [0, 1]². Technically, this vector field is an element of Tγid(Γ), the tangent space of Γ at the identity γid. To compute and express this vector field, we will use an orthonormal basis B of Tγid(Γ).

Given an orthonormal basis B, the gradient at γid takes the form

    ∇E(γid) = Σ_{b ∈ B} (∇b E(γid)) b,

where ∇b E(γid) stands for the directional derivative of E at γid in the direction of b. This directional derivative is conveniently available in an analytical form, as follows. Let ϕq,∗ be the differential of ϕq at γid in the direction of b. Then ∇b E(γid) = −2 ⟨q1 − q̃2, ϕq̃2,∗(b)⟩. The inner product here is the standard L2 inner product. There exists an explicit form of ϕq,∗, such that the coordinate functions of ϕq,∗(b) are given by:

    (ϕq,∗(b))^j = (1/2)(∇ · b) q^j + ∇q^j · b,  j = 1, . . . , n,

where ∇ · denotes the divergence operator. The form of ϕq,∗ is the same as that derived for parameterized surfaces in [13].
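A schematic 1-D version of this iteration is sketched below. The three-element sine basis, the finite-difference directional derivatives, and the accept-if-better step rule are simplifications made for this sketch; the chapter's 2-D algorithm uses the analytical derivative above and a Palais basis.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 500)
basis = [np.sin(np.pi * n * t) / (np.pi * n) for n in (1, 2, 3)]   # vanish at 0, 1

def energy(q1, q2, gamma):
    """E(gamma) = ||q1 - sqrt(gamma') q2 o gamma||^2 (1-D analogue)."""
    dg = np.clip(np.gradient(gamma, t), 1e-8, None)
    return np.trapz((q1 - np.sqrt(dg) * np.interp(gamma, t, q2)) ** 2, t)

def gradient_step(q1, q2, gamma, step=0.2, eps=1e-4):
    """One iteration: directional derivatives along the basis give the
    gradient field; the incremental warp is composed with the current one."""
    e0 = energy(q1, q2, gamma)
    coeffs = [(energy(q1, q2, np.interp(t + eps * b, t, gamma)) - e0) / eps
              for b in basis]
    v = -sum(c * b for c, b in zip(coeffs, basis))   # negative gradient field
    candidate = np.interp(t + step * v, t, gamma)    # gamma o (id + step * v)
    return candidate if energy(q1, q2, candidate) < e0 else gamma

q1 = np.sin(2 * np.pi * t)
q2 = np.sin(2 * np.pi * t ** 1.2)
gamma = t.copy()
e_init = energy(q1, q2, gamma)
for _ in range(20):
    gamma = gradient_step(q1, q2, gamma)
assert energy(q1, q2, gamma) <= e_init               # descent never increases E
```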

3.4.1 Basis on Tγid(Γ)

In order to compute the gradient ∇E and to update γo, we need to specify an orthonormal basis for Tγid(Γ). Since Γ = Diff+([0, 1]²), the set of all boundary-preserving diffeomorphisms of [0, 1]², the tangent space is the set of smooth vector fields on [0, 1]² that vanish on the boundary:

    Tγid(Γ) = {b : [0, 1]² → ℝ² | b smooth, b(s) = 0 for all s ∈ ∂([0, 1]²)}.

We start by constructing an orthonormal basis on L²([0, 1], ℝ) and then extend it to the 2D case. It is known that

    {1} ∪ {√2 sin(2πnt), √2 cos(2πnt) | n = 1, 2, . . .}

forms an orthonormal basis for L²([0, 1], ℝ) under the L2-metric. We instead seek an orthonormal basis for L²([0, 1], ℝ) under the Palais metric, due to some nice properties of this Riemannian metric [21], in order to obtain a basis set that vanishes at the boundaries. The 1D Palais metric is defined as ⟨f, g⟩ = f(0)g(0) + ∫₀¹ f′(t)g′(t) dt for f, g ∈ L²([0, 1], ℝ). Under this metric, the set

    { sin(2πnt)/(√2 πn), (cos(2πnt) − 1)/(√2 πn) | n = 1, 2, . . . }

provides an orthonormal set of functions that vanish at t ∈ {0, 1}.
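The orthonormality of such a family under the Palais metric can be checked numerically. The specific normalization used below is our reconstruction and may differ from the chapter's exact constants:

```python
import numpy as np

t = np.linspace(0.0, 1.0, 4000)

def palais_inner(f, g):
    """1-D Palais inner product: <f, g> = f(0) g(0) + int_0^1 f'(t) g'(t) dt."""
    return f[0] * g[0] + np.trapz(np.gradient(f, t) * np.gradient(g, t), t)

def basis_elem(n, kind):
    """Reconstructed Palais-orthonormal elements vanishing at t = 0 and t = 1."""
    if kind == "sin":
        return np.sin(2 * np.pi * n * t) / (np.sqrt(2) * np.pi * n)
    return (np.cos(2 * np.pi * n * t) - 1.0) / (np.sqrt(2) * np.pi * n)

elems = [basis_elem(n, k) for n in (1, 2) for k in ("sin", "cos")]
gram = np.array([[palais_inner(f, g) for g in elems] for f in elems])
assert np.allclose(gram, np.eye(4), atol=1e-3)        # orthonormal under Palais
assert all(abs(f[0]) < 1e-9 and abs(f[-1]) < 1e-6 for f in elems)  # boundary zeros
```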

We will use the Cartesian product of B (with elements b) and B̃ (with elements b̃) to construct an orthonormal basis for vector fields on [0, 1]² under the 2D Palais metric, given by the inner product:

    ⟨b1, b2⟩ = b1(0, 0) · b2(0, 0) + ∫_{[0,1]²} tr(∇b1(s)ᵀ ∇b2(s)) ds.    (4.3)

Enumerating the 1D basis as {θi}, the 2D basis elements are constructed as (θi(x) θj(y), 0) and (0, θi(x) θj(y)). These elements can then be orthonormalized using the Gram-Schmidt process under the metric defined in equation 4.3.

3.4.2 Mean image and group-wise registration

An important problem in image analysis, especially medical image analysis, is to compute a typical image, or an average, of several images from the same class and use it as a template. The individual images can then be registered to this sample mean in a pairwise manner, resulting in a group registration. By registering member images to the group mean, one can analyze their variations from the typical template image. Suppose there is a set of n images {f1, f2, . . . , fn} ⊂ ℱ. Their Karcher mean is defined as the amplitude that minimizes the sum of squared distances to the given images, that is,

    [μ] = arg min_{[f]} Σᵢ₌₁ⁿ da([f], [fi])².

Since this quantity is an amplitude, it corresponds to a whole set of images sharing that amplitude. In order to pick a central element from this set, we use the notion of the center of an orbit, as defined in [25] for real-valued functions on [0, 1]. That work lays out a procedure to find an instance of the mean image fμ such that fμ ∈ [μ]. With this image, we can register groups of images to the mean image rather than to an arbitrarily chosen template, as is often done in current methods.
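A schematic version of this template-estimation loop, in a 1-D analogue where a brute-force warp family stands in for the full gradient-based registration (all specific choices below are illustrative):

```python
import numpy as np

t = np.linspace(0.0, 1.0, 300)
warps = [t ** a for a in np.linspace(0.7, 1.4, 29)]   # hypothetical warp family

def align(f, template):
    """Pick the family warp minimising ||template - f o gamma||; return f o gamma."""
    errs = [np.trapz((template - np.interp(g, t, f)) ** 2, t) for g in warps]
    return np.interp(warps[int(np.argmin(errs))], t, f)

fs = [np.sin(2 * np.pi * t ** a) for a in (0.8, 1.0, 1.25)]   # same amplitude
template = np.mean(fs, axis=0)
for _ in range(5):                        # alternate: align to template, re-average
    aligned = [align(f, template) for f in fs]
    template = np.mean(aligned, axis=0)

spread_after = sum(np.trapz((template - g) ** 2, t) for g in aligned)
spread_before = sum(np.trapz((np.mean(fs, axis=0) - f) ** 2, t) for f in fs)
assert spread_after < spread_before       # alignment tightens the ensemble
```

This alternation mirrors the Karcher-mean computation: registration to the current template followed by cross-sectional averaging, with the center-of-orbit step of [25] selecting a representative from the mean amplitude.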

3.5Experiments

In this section we present various image registration results to validate our method. We provide examples of pairwise registration on both synthetic images and brain MRIs. To obtain better solutions for images with larger deformations, we also show results for landmark-aided registration. We then demonstrate the utility of our method for computing mean images as templates for registering multiple images. Finally, the problem of image classification is considered using the proposed metric, and the results are compared to those from other methods.

Recall that in the case of grayscale images, where n = 1, our method does not apply directly since n < dim(D). Instead, we make use of the image gradient ∇f = (fu, fv) for (u, v) ∈ D and register objects of the form g = (f, fu, fv) ∈ ℝ³. In other words, the vector-valued image g : D → ℝ³ forms the input data for registration. Such image gradients are a type of edge measure and are often used in their own right as robust spatial features for image registration.
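A minimal sketch of this lifting (the function name is ours; np.gradient serves as a simple finite-difference approximation to the partial derivatives):

```python
import numpy as np

def to_vector_valued(f):
    """Lift a grayscale image f(u, v) to g = (f, f_u, f_v), so that the
    number of image channels (3) exceeds dim(D) = 2, as the method requires."""
    fu, fv = np.gradient(f.astype(float))   # partial derivatives along u and v
    return np.stack([f, fu, fv], axis=-1)   # shape (H, W, 3)
```

The resulting three-channel image is what the pairwise registration below operates on when the inputs are grayscale.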

3.5.1Pairwise image registration

We first present some results on synthetic images to demonstrate the use of the registration framework introduced in equation 2.4. Given a pair of synthetic images f1 and f2, we register them twice. First, we take f1 as the template image and estimate the γ12 that optimally deforms f2, that is, γ12 = argmin_{γ∈Γ} d(f1, (f2, γ)). Then the roles are reversed and f2 is used as the template to obtain γ21 = argmin_{γ∈Γ} d(f2, (f1, γ)).

Figure 3 shows a synthetic example to illustrate the nature of the deformations produced by our algorithm. From left to right, we show the images f1 and f2, followed by the optimal deformations for forward and backward registration.

In Figure 4, we use a different set of images and show the original images f1 and f2 before and after warping. The diffeomorphisms γ12 and γ21 used to register the images are also presented. By composing them in different orders, we expect the resulting diffeomorphisms to be close to the identity map. We show the two converged objective functions, ‖(q1, γ21) − q2‖ and ‖q1 − (q2, γ12)‖, associated with the optimal γ21 and γ12, to verify symmetry. The cumulative diffeomorphisms γ21 ∘ γ12 and γ12 ∘ γ21 are also used to demonstrate the inverse consistency of the proposed metric. As mentioned above, the theory predicts that γ12 and γ21 are inverses of each other. To better visualize that the composed diffeomorphisms are close to the identity, we apply them to checkerboard images and observe that γ21 ∘ γ12 and γ12 ∘ γ21 are indeed close to the identity map. Figure 5 shows similar results for registering a pair of binary images.
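The inverse-consistency check can be reproduced numerically for any pair of estimated warps: compose them on a grid of points in [0, 1]² and measure the deviation from the identity map. A small sketch (the helper names are ours; each γ is any map acting on an (N, 2) array of points):

```python
import numpy as np

def compose(gamma1, gamma2, grid):
    """Evaluate gamma1(gamma2(x)) on a grid of points in [0, 1]^2."""
    return gamma1(gamma2(grid))

def deviation_from_identity(gamma, n=20):
    """Maximum displacement of gamma from the identity on an n x n grid."""
    u, v = np.meshgrid(np.linspace(0, 1, n), np.linspace(0, 1, n))
    grid = np.stack([u.ravel(), v.ravel()], axis=1)
    return float(np.max(np.abs(gamma(grid) - grid)))
```

For truly inverse-consistent warps, the deviation of the composition is near zero even when each individual warp moves points substantially, which is what the checkerboard visualization conveys qualitatively.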

Next, we add simulated Gaussian noise to the images. First, the original images are registered, as described previously; then the corresponding noisy image pairs are registered, as shown in Figure 6. Comparing the registration results and the diffeomorphisms, we find that they are quite similar. This underlines the fact that, although the q-map representation involves taking a gradient, the resulting registration is relatively immune to the presence of high-frequency noise.

In Figure 7, we present registration results on 2D brain MR images. In each of the two experiments, we show (1) the images f1 and f2, (2) f1 and f2 ∘ γ21, and (3) f2 and f1 ∘ γ12. In each panel, the two images concerned are shown overlapped on a common canvas, with red and green denoting positive and negative image differences, respectively. We also take brain MRIs from two modalities collected simultaneously (in this case n = 2 and our method applies directly) and register both modalities jointly using a common diffeomorphism. The results are shown in Figure 8. In both cases, the deformations are relatively small and the identity map provides a good initial condition for the gradient method.

Landmark-aided registration

Our framework can be extended to incorporate landmark information during registration while preserving all of the nice mathematical properties of the objective function. Assume that there is a fixed number K of distinct landmark points, P = {p1, p2, . . . , pK}, in the image domain D. They are typically chosen according to the application but remain fixed within the analysis. The landmark-guided registration is

Fig. 3: Illustration of deformations to register two simple shapes.
Fig. 4: Registering synthetic smooth grayscale images. ‖q1 − q2‖ = 0.2312, ‖q1 − (q2, γ12)‖ = 0.0728 and ‖(q1, γ21) − q2‖ = 0.0859.
Fig. 5: Registering synthetic binary grayscale images. ‖q1 − q2‖ = 0.1521, ‖q1 − (q2, γ21)‖ = 0.0343 and ‖(q1, γ12) − q2‖ = 0.0380.
Fig. 6: Registering synthetic noisy images. (a) Registration results for the original pair of images; (b) results for the noisy versions.
Fig. 7: Two examples of brain MR image registration. First column shows overlapped original images f1 and f2; second column shows overlapped images f1 and deformed f2; third column shows f2 and deformed f1.
Fig. 8: Results for registering brain images from two modalities. First two columns contain given images, with the first row from T1 MRI and the second from T2 MRI. The registered images are shown in the third column. The last two columns give the image differences before and after registration.

achieved by defining a subgroup of Γ, denoted by ΓP, which preserves the landmark points:

ΓP = {γ ∈ Γ | γ(pi) = pi, i = 1, 2, . . . , K}.

Given two images f1, f2 with landmark information P, the images can be registered in two steps, applied iteratively:

  1. Register the landmark points P and apply an initial deformation to f1 to form fP1 such that the landmarks are at the same locations in fP1 and f2.
  2. Register fP1 to f2 using equation 2.4, restricting the search to the subgroup ΓP.
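Step 1 can be realized, for example, with a thin-plate-spline interpolant of the landmark displacements. The sketch below (using SciPy's RBFInterpolator) is one possible implementation of this step, not the chapter's specific one:

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

def landmark_warp(src_pts, dst_pts):
    """Step 1 of landmark-guided registration: build a thin-plate-spline
    map taking the source landmarks exactly onto the destination landmarks.
    src_pts, dst_pts: (K, 2) arrays of landmark coordinates."""
    # interpolate the displacement field from the landmark correspondences
    tps = RBFInterpolator(src_pts, dst_pts - src_pts,
                          kernel='thin_plate_spline')
    return lambda pts: pts + tps(pts)   # acts on any (N, 2) array of points
```

Step 2 then refines the result over ΓP, so that these landmark correspondences are never disturbed.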

A similar technique of forming a landmark-constrained basis on S² has been used for closed surfaces, as described in [14]. In the second step, searching over ΓP ensures that the landmark correspondences established in step 1 are preserved: the registration is refined without moving the landmarks. This search is based on a basis BP of Tγid(ΓP), constructed so that its elements, vector fields on D, vanish at the landmark locations {pi}. We remark that ΓP forms a subgroup of Γ and, as a result, the desired properties discussed earlier are still satisfied. This approach may be termed landmark-guided registration, with the landmarks treated as hard constraints.

We show results on MRIs with landmarks in Figure 9. In the first example, presented in the first row of Figure 9, the optimal deformation of f1 found by searching over the full group Γ is displayed at the end as (f1, γ). The deformation of the skull is so large that our original method fails to reach a global minimizer of the objective function. By adopting landmark-guided registration, we first obtain a deformed image fP1 with well-matched landmarks and a correspondingly deformed skull. Then fP1 is further deformed to register the intensity details without moving the landmarks, achieved by optimizing over the restricted subgroup ΓP. The final result matches f2 with no artifacts near the skull. Two more examples are shown in the bottom two rows of Figure 9. Generally, registration initialized with the landmark result outperforms registration initialized with the identity map, especially when the deformations are large.

Fig. 9: Three examples of brain image registration with landmarks. The first two columns show the original images f1 and f2. The third column shows the deformed images fP1 obtained using only landmarks; the fourth column shows the final deformed images (fP1, γ) with fP1 as the initial condition; the last column shows the registered images (f1, γ) obtained without using landmarks.

3.5.2Registering multiple images

We use part of the MNIST database of handwritten digits to illustrate our method for computing the mean image and performing multiple image registration. Images containing the same digit share the same pattern, or the same amplitude. Therefore, within a digit group, the phase variation can be removed from the analysis by groupwise registration.

In Figure 10 we present the mean image of each digit computed without and with registration, respectively. Two examples of registering images within digit groups are shown in Figures 11 and 12.

Fig. 10: Computing mean images on MNIST data. (a) Mean image without registration for each digit group (0–9); (b) corresponding mean images with registration.
Fig. 11: Groupwise registration on images containing digit 3. (a) observed data images in group; (b) corresponding images registered to group mean.
Fig. 12: Groupwise registration on images containing digit 6. (a) observed data images in group; (b) corresponding images registered to group mean.
Fig. 13: Computing mean image of brain MR images. (a) observed images and the cross-sectional mean; (b) mean image with registration and observed images registered to the mean.

We also show an example using brain MRIs in Figure 13. Four brain images without alignment are shown in Figure 13a along with their cross-sectional mean. This mean, computed without registration, appears blurred due to misalignment. In Figure 13b, the images are aligned to the Karcher mean as described in Section 3.4.2. With multiple registration, the blurring in the mean image is clearly reduced.

3.5.3Image classification

The amplitude distance introduced in equation 2.3 is useful in many applications relating to pattern analysis of images, such as clustering and classification. For the task of handwritten digit recognition, amplitude variation is of primary interest: images containing the same digit belong to the same orbit and are thus expected to have zero amplitude distance (or a very small one in practice); conversely, images containing distinct digits have different amplitudes, so their amplitude distance should be large. Distance-based classifiers can then be applied for recognition once the pairwise distances between images are computed. To illustrate this idea, we use a subset of the MNIST database of handwritten digits from 0 to 9, containing ten images for each digit. In addition to the baseline L2-distance, or sum of squared differences (SSD) (without any warping), we compare our method to three other methods: diffeomorphic demons [35], FAIR [20], and NiftyReg [19].

To compute distance matrices between all pairs of images, the digits are registered in a pairwise manner using each of the three methods, and the SSD between the registered images is then computed as the measure of distance. For our method, the amplitude distance defined in equation 2.3 is used. Using the leave-one-out nearest-neighbor (LOO-NN) classifier, the rates of correct classification are listed in Table 2. The baseline L2-distance, without any registration, gives a classification rate of 76%, while our method performs best at 94%, misclassifying three pairs of images. One pair misclassified by all methods is presented in Figure 14.
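The LOO-NN evaluation itself is straightforward given any symmetric pairwise distance matrix; a minimal sketch (the function name is ours):

```python
import numpy as np

def loo_nn_rate(D, labels):
    """Leave-one-out nearest-neighbor classification rate from a
    symmetric pairwise distance matrix D and per-item labels."""
    D = np.asarray(D, dtype=float).copy()
    labels = np.asarray(labels)
    np.fill_diagonal(D, np.inf)        # each item cannot be its own neighbor
    nearest = np.argmin(D, axis=1)     # index of the nearest other item
    return float(np.mean(labels[nearest] == labels))
```

The same routine applies whether D holds SSD values after registration by a competing method or the amplitude distances of equation 2.3.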

Table 2: Classification of MNIST Digits.

Fig. 14: A misclassification between a 7 (I1) and a 9 (I2). The top row shows I1 deformed to match I2, and the bottom row shows the deformed I2.

3.6Conclusion

We have introduced a novel concept involving phase and amplitude components of images, such that the separation of these components corresponds to the classical image registration problem. It is based on a new mathematical representation, termed the q-map, in which the image value at each pixel is scaled by the square root of the local area multiplication factor. An important property exploited here is that identical deformations of two images do not change the L2-norm between their q-maps. The use of shape-type metrics not only helps register images, with many more desirable properties than current methods, but also enables further statistical analysis such as PCA or regression modeling. The biggest advantage comes from the fact that registration and comparison of images are performed in a unified way, under the same metric.

Acknowledgment: This material is based upon work supported in part by the NSF grants DMS 1208959, IIS 1217515, and CCF 1319658.

We thank Sebastian Kurtek and Eric Klassen for helpful discussions, and the producers of the datasets used here for making them available to the public.

Bibliography

[1] N. Balov, A. Srivastava, C. Li, and Z. Ding. Shape analysis of open curves in R^3 with applications to study of fiber tracts in DT-MRI data. In Energy Minimization Methods in Computer Vision and Pattern Recognition, volume 4679 of Lecture Notes in Computer Science, pages 399–413. Springer Berlin Heidelberg, 2007.

[2] M. Beg, M. Miller, A. Trouvé, and L. Younes. Computing large deformation metric mappings via geodesic flows of diffeomorphisms. International Journal of Computer Vision, 61:139–157, February 2005.

[3] F. L. Bookstein. Principal warps: Thin-plate splines and the decomposition of deformations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(6):567–585, June 1989.

[4] G. Christensen and H. Johnson. Consistent image registration. IEEE Transactions on Medical Imaging, 20(7):568–582, July 2001.

[5] A. Collignon, D. Vandermeulen, G. Marchal, and P. Suetens. Multimodality image registration by maximization of mutual information. IEEE Transactions on Medical Imaging, 16(2):187–198, April 1997.

[6] R. Davies, C. Twining, T. Cootes, J. Waterton, and C. Taylor. A minimum description length approach to statistical shape modeling. IEEE Transactions on Medical Imaging, 21(5):525–537, May 2002.

[7] P. Dupuis and U. Grenander. Variational problems on flows of diffeomorphisms for image matching. Quarterly of Applied Mathematics, LVI(3):587–600, 1998.

[8] A. P. Eriksson and K. Astrom. Bijective image registration using thin-plate splines. In International Conference on Pattern Recognition, volume 3, pages 798–801, 2006.

[9] I. H. Jermyn, S. Kurtek, E. Klassen, and A. Srivastava. Elastic shape matching of parameterized surfaces using square root normal fields. In European Conference on Computer Vision, volume 7576 of Lecture Notes in Computer Science, pages 804–817. Springer Berlin Heidelberg, 2012.

[10] S. Joshi, B. Davis, M. Jomier, and G. Gerig. Unbiased diffeomorphic atlas construction for computational anatomy. NeuroImage, 23:151–160, 2004.

[11] S. Joshi, E. Klassen, A. Srivastava, and I. Jermyn. Removing shape-preserving transformations in square-root elastic (SRE) framework for shape analysis of curves. In Energy Minimization Methods in Computer Vision and Pattern Recognition, volume 4679 of Lecture Notes in Computer Science, pages 387–398. Springer Berlin Heidelberg, 2007.

[12] S. Kurtek, E. Klassen, Z. Ding, S. Jacobson, J. Jacobson, M. Avison, and A. Srivastava. Parameterization-invariant shape comparisons of anatomical surfaces. IEEE Transactions on Medical Imaging, 30:849–858, March 2011.

[13] S. Kurtek, E. Klassen, Z. Ding, and A. Srivastava. A novel Riemannian framework for shape analysis of 3D objects. In IEEE Conference on Computer Vision and Pattern Recognition, pages 1625–1632, 2010.

[14] S. Kurtek, A. Srivastava, E. Klassen, and H. Laga. Landmark-guided elastic shape analysis of spherically-parameterized surfaces. Computer Graphics Forum (Proceedings of Eurographics 2013), 32(2):429–438, 2013.

[15] P. Lorenzen, B. Davis, and S. Joshi. Unbiased atlas formation via large deformations metric mapping. In Medical Image Computing and Computer-Assisted Intervention (MICCAI 2005), volume 3750 of Lecture Notes in Computer Science, pages 411–418. Springer Berlin Heidelberg, 2005.

[16] J. S. Marron, J. O. Ramsay, L. M. Sangalli, and A. Srivastava. Statistics of time warpings and phase variations. Electronic Journal of Statistics, 8(2):1697–1702, 2014.

[17] M. Miller, A. Trouvé, and L. Younes. On the metrics and Euler–Lagrange equations of computational anatomy. Annual Review of Biomedical Engineering, 4:375–405, March 2002.

[18] W. Mio, A. Srivastava, and S. Joshi. On shape of plane elastic curves. International Journal of Computer Vision, 73(3):307–324, 2007.

[19] M. Modat, M. J. Cardoso, P. Daga, D. M. Cash, N. C. Fox, and S. Ourselin. Inverse-consistent symmetric free form deformation. In Biomedical Image Registration: 5th International Workshop, volume 7359 of Lecture Notes in Computer Science, pages 79–88. Springer, 2012.

[20] J. Modersitzki. FAIR: Flexible Algorithms for Image Registration. Society for Industrial and Applied Mathematics, 2009.

[21] R. Palais. Morse theory on Hilbert manifolds. Topology, 2:299–340, 1963.

[22] C. Samir, S. Kurtek, A. Srivastava, and M. Canis. Elastic shape analysis of cylindrical surfaces for 3D/2D registration in endometrial tissue characterization. IEEE Transactions on Medical Imaging, 33(5):1035–1043, May 2014.

[23] A. Srivastava, I. Jermyn, and S. Joshi. Riemannian analysis of probability density functions with applications in vision. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2007), pages 1–8, June 2007.

[24] A. Srivastava, E. Klassen, S. H. Joshi, and I. H. Jermyn. Shape analysis of elastic curves in Euclidean spaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(7):1415–1428, July 2011.

[25] A. Srivastava, W. Wu, E. Klassen, and J. S. Marron. Registration of functional data using Fisher-Rao metric. arXiv:1103.3817v2, 2011.

[26] R. Szeliski and J. Coughlan. Spline-based image registration. International Journal of Computer Vision, 22(3):199–218, 1997.

[27] H. Tagare, D. Groisser, and O. Skrinjar. Symmetric non-rigid registration: A geometric theory and some numerical techniques. Journal of Mathematical Imaging and Vision, 34(1):61–88, May 2009.

[28] M. Taquet, B. Macq, and S. Warfield. A generalized correlation coefficient: application to DTI and multi-fiber DTI. In IEEE Workshop on Mathematical Methods in Biomedical Image Analysis (MMBIA 2012), pages 9–14, January 2012.

[29] J. Thirion. Image matching as a diffusion process: an analogy with Maxwell's demons. Medical Image Analysis, 2(3):243–260, 1998.

[30] A. Trouvé. Diffeomorphisms groups and pattern matching in image analysis. International Journal of Computer Vision, 28(3):213–221, July 1998.

[31] J. D. Tucker, W. Wu, and A. Srivastava. Generative models for functional data using phase and amplitude separation. Computational Statistics & Data Analysis, 61:50–66, May 2013.

[32] C. J. Twining, S. Marsland, and C. J. Taylor. Groupwise non-rigid registration: The minimum description length approach. In Proceedings of the British Machine Vision Conference (BMVC), volume 1, pages 417–426, 2004.

[33] P. A. van den Elsen, J. B. A. Maintz, E.-J. D. Pol, and M. A. Viergever. Automatic registration of CT and MR brain images using correlation of geometrical features. IEEE Transactions on Medical Imaging, 14(2):384–396, June 1995.

[34] A. Venot, J. F. Lebruchec, and J. C. Roucayrol. A new class of similarity measures for robust image registration. Computer Vision, Graphics, and Image Processing, 28(3):176–184, November 1984.

[35] T. Vercauteren, X. Pennec, A. Perchant, and N. Ayache. Diffeomorphic demons: Efficient non-parametric image registration. NeuroImage, 45(1, Supplement 1):S61–S72, 2009.

[36] P. Viola and W. M. Wells III. Alignment by maximization of mutual information. In Fifth International Conference on Computer Vision, pages 16–23, June 1995.
