5
MULTIRESOLUTION ANALYSIS
In the previous chapter, we described a procedure for decomposing a signal into its Haar wavelet components of varying frequency (see Theorem 4.12). The Haar wavelet scheme relied on two functions: the Haar scaling function ϕ and the Haar wavelet ψ. Both are simple to describe and lead to an easy decomposition algorithm. The drawback with the Haar decomposition algorithm is that both of these functions are discontinuous (ϕ at x = 0, 1 and ψ at x = 0, 1/2, 1). As a result, the Haar decomposition algorithm provides only crude approximations to a continuously varying signal (as already mentioned, Figure 4.4 does not approximate Figure 4.3 very well). What is needed is a theory similar to what has been described in the previous sections but with continuous versions of our building blocks, ϕ and ψ. In this chapter, we present a framework for creating more general ϕ and ψ. The resulting theory, due to Stéphane Mallat (Mallat, 1989, 1998), is called a multiresolution analysis. In the sections that follow, this theory will be used together with a continuous ϕ and ψ (to be constructed later) that will improve the performance of the signal decomposition algorithm with the Haar wavelets described in a previous section.
5.1 THE MULTIRESOLUTION FRAMEWORK
5.1.1 Definition
Before we define the notion of a multiresolution analysis, we need to provide some background. Recall that the Sampling Theorem (see Theorem 2.23) approximately reconstructs a signal f from samples taken uniformly at intervals of length T. If the signal is band-limited and its Nyquist frequency is less than 1/T, then the reconstruction is perfect; otherwise it’s an approximation. The smaller the value of T, the better we can approximate or resolve the signal; the size of T measures our resolution of the signal f. A typical FFT analysis of the samples taken from f works at one resolution, T.
If the signal has bursts where it varies rapidly, interspersed with periods where it is slowly varying, this “single” resolution analysis does not work well—for all the reasons that we outlined earlier in Section 4.1. To treat such signals, Mallat had the following two ideas. First, replace the space of band-limited functions from the Sampling Theorem with one tailored to the signal. Second, analyze the signal using the scaled versions of the same space, but geared to resolutions T/2, T/22, and so on (hence, the term multiresolution analysis). This framework is ideal not only for analyzing certain signals, but also for actually creating wavelets.
We start with the general definition of a multiresolution analysis.
Definition 5.1 Let Vj, j = …, –2, –1, 0, 1, 2, …, be a sequence of subspaces of functions in L2(R). The collection of spaces {Vj, j ∈ Z} is called a multiresolution analysis with scaling function ϕ if the following conditions hold.
1. (Nested) Vj ⊂ Vj+1.
2. (Density) The closure of ∪j∈Z Vj is all of L2(R).
3. (Separation) ∩Vj = {0}.
4. (Scaling) The function f(x) belongs to Vj if and only if the function f(2–jx) belongs to V0.
5. (Orthonormal basis) The function ϕ belongs to V0 and the set {ϕ (x – k), k ∈ Z} is an orthonormal basis (using the L2 inner product) for V0.
The Vj’s are called approximation spaces. Every f ∈ L2 can be approximated as closely as one likes by a function in a Vj, provided that j is large enough. This is precisely what Property 2 above means. The closure of ∪Vj is defined this way: f belongs to the closure of ∪Vj if and only if for every ε > 0 one can find j such that there is an fj ∈ Vj for which || f – fj || < ε.
There may be several choices of scaling function corresponding to a given system of approximation spaces, and different choices may yield different multiresolution analyses. Although we required the translates of ϕ(x) to be orthonormal, we do not have to. All that is needed is a function φ for which the set {φ(x – k), k ∈ Z} is a basis. We can then use φ to obtain a new scaling function ϕ for which {ϕ(x – k), k ∈ Z} is orthonormal. We discuss this in Section 5.3.
Probably the most useful scaling functions are those that have compact or finite support. As defined in Chapter 4, a function has compact support if it is identically 0 outside of a finite interval. The Haar scaling function is a good example of a compactly supported function. The scaling functions associated with Daubechies’s wavelets are not only compactly supported, but also continuous. Having both properties in a scaling function is especially desirable, because the associated decomposition and reconstruction algorithms are computationally faster and do a better job of analyzing and reconstructing signals.
Example 5.2 The Haar scaling function (Definition 4.1) generates the subspaces Vj consisting of the piecewise constant functions of finite support (i.e., step functions) whose discontinuities are contained in the set of integer multiples of 2–j (see Definition 4.4). We now verify that this collection {Vj, j ∈ Z}, together with the Haar scaling function ϕ, satisfies Definition 5.1 of a multiresolution analysis.
As mentioned just after Definition 4.4, the nested property holds since the set of multiples of 2–j is contained in the set of multiples of 2–j–1 (the former set consists of every other member of the latter set). Intuitively, an approximation of a signal by a function in Vj is capable of capturing details of the signal down to a resolution of 2–j. The nested property indicates that as j gets larger, more information is revealed by an approximation of a signal by a function in Vj.
The density condition for the Haar system is the content of Theorem 4.9. This condition means an approximation of a signal by a function in Vj eventually captures all details of the signal as j gets larger (i.e., as j → ∞). This approximation is illustrated in Figure 4.11.
To discuss the separation condition, first note that j can be negative as well as positive in the definition of Vj. If f belongs to V–j, for j > 0, then f must be a finite linear combination of {ϕ(x/2j – k),k ∈ Z} whose elements are constant on the intervals … [–2j, 0), [0, 2j), …. As j gets larger, these intervals get larger. On the other hand, the support of f (i.e., the set where f is nonzero) must stay finite. So if f belongs to all the V–j as j → ∞, then these constant values of f must be zero.
Finally, the scaling condition for the Haar system follows from Theorem 4.5 and the orthonormal condition follows from Theorem 4.6. Therefore, the Haar system of {Vj} satisfies all the properties of a multiresolution analysis.
Example 5.3 Linear Spline Multiresolution Analysis. Linear splines are continuous, piecewise linear functions; the jagged function in Figure 5.1 is an example of one.
We can construct a multiresolution analysis for linear splines. Let Vj be the space of all finite-energy signals f that are continuous and piecewise linear, with possible corners occurring only at the dyadic points k/2j, k ∈ Z. These approximation spaces satisfy conditions 1–4 in the definition of a multiresolution analysis (see exercise 9). The scaling function φ is the “tent function”
φ(x) = 1 – |x| for –1 ≤ x ≤ 1, φ(x) = 0 otherwise,
and the set {φ(x – k)}k∈z is a nonorthogonal one. Using the methods in Section 5.3, we can construct a new scaling function ϕ that does generate an orthonormal basis for V0.
Example 5.4 Shannon Multiresolution Analysis. For j ∈ Z, let Vj be the space of all finite-energy signals f whose Fourier transform f̂ vanishes outside of the interval [–2jπ, 2jπ]—that is, all f ∈ L2(R) that are band-limited with supp(f̂) ⊆ [–2jπ, 2jπ]. Then the collection {Vj} is a multiresolution analysis with scaling function φ(x) = sinc(x), where
sinc(x) = sin(πx)/(πx).
See exercise 8 for further details.
We turn to a discussion of properties common to every multiresolution analysis. For a given function g : R → R, let
gjk(x) = 2j/2 g(2jx – k), j, k ∈ Z.
The function gjk(x) = 2j/2 g(2j(x – k/2j)) is a translation (by k/2j) and a rescaling (by a factor of 2j/2) of the original function g (see exercise 1). The factor of 2j/2 is present to preserve the L2 norm, that is, ||gjk|| = ||g||.
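The norm computation behind this normalization is short enough to record here (our own verification, using the substitution y = 2jx – k):

```latex
\|g_{jk}\|^{2}
  = \int_{-\infty}^{\infty} 2^{j}\,\bigl|g(2^{j}x-k)\bigr|^{2}\,dx
  = \int_{-\infty}^{\infty} |g(y)|^{2}\,dy
  = \|g\|^{2},
\qquad y = 2^{j}x-k,\quad dy = 2^{j}\,dx .
```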
We use this notation both for the scaling function ϕ and, later on, for the wavelet. Our first result is that {ϕjk}k ∈ Z is an orthonormal basis for Vj.
Theorem 5.5 Suppose {Vj, j ∈ Z} is a multiresolution analysis with scaling function ϕ. Then for any j ∈ Z, the set of functions
{ϕjk(x) = 2j/2ϕ(2jx – k), k ∈ Z}
is an orthonormal basis for Vj.
Proof. To prove that {ϕjk, k ∈ Z} spans Vj, we must show that any f(x) ∈ Vj can be written as a linear combination of {ϕ(2jx – k), k ∈ Z}. Using the scaling property (Definition 5.1, condition 4), the function f(2–jx) belongs to V0, and therefore f(2–jx) is a linear combination of {ϕ(x – k), k ∈ Z}. Replacing x by 2jx, we see that f(x) is a linear combination of {ϕ(2jx – k), k ∈ Z}. To show that {ϕjk, k ∈ Z} is orthonormal, we must show
⟨ϕjk, ϕjl⟩ = δkl,
or
2j ∫ ϕ(2jx – k) ϕ(2jx – l) dx = δkl.
To establish this equation, make the change of variables y = 2jx (and so dy = 2j dx). We obtain
2j ∫ ϕ(2jx – k) ϕ(2jx – l) dx = ∫ ϕ(y – k) ϕ(y – l) dy,
which equals δkl in view of the orthonormal basis property given in Definition 5.1, condition 5.
5.1.2 The Scaling Relation
We are now ready to state and prove the central equation in multiresolution analysis, the scaling relation.
Theorem 5.6 Suppose {Vj, j ∈ Z} is a multiresolution analysis with scaling function ϕ. Then the following scaling relation holds:
ϕ(x) = ∑k pk ϕ(2x – k), where pk = 2 ∫ ϕ(x) ϕ(2x – k) dx. (5.1)
Moreover, we also have
ϕ(2j–1x – l) = ∑k pk–2l ϕ(2jx – k) (5.2)
or, equivalently,
ϕj–1,l = 2–1/2 ∑k pk–2l ϕjk. (5.3)
Remark. The equation above, which relates ϕ(x) and the translates of ϕ(2x), is called the scaling relation or two-scale relation. When the support of ϕ is compact, only a finite number of the pk are nonzero, because when |k| is large enough, the support of ϕ(2x – k) will be outside of the support for ϕ(x). Therefore, the sum occurring in the above theorem is finite. Usually, ϕ is real-valued, in which case the pk are real.
Proof. To prove this theorem, note that ϕ(x) = ∑k ak ϕ1k(x) must hold for some choice of coefficients ak, because ϕ(x) belongs to V0 ⊂ V1, which is the linear span of {ϕ1k, k ∈ Z}. Since {ϕ1k, k ∈ Z} is an orthonormal basis of V1, the ak can be determined by using Theorem 0.21:
ak = ⟨ϕ, ϕ1k⟩ = 21/2 ∫ ϕ(x) ϕ(2x – k) dx.
Therefore
ϕ(x) = ∑k ak 21/2 ϕ(2x – k).
Let pk = 21/2 ak = 2 ∫ ϕ(x) ϕ(2x – k) dx, and we have
ϕ(x) = ∑k pk ϕ(2x – k),
as claimed. To get the second equation, replace x by 2j–1x – l in this relation and then adjust the index of summation in the resulting series. The third follows from the second by multiplying by 2(j–1)/2 and then simplifying.
Example 5.7 The values of the pk for the Haar system are
p0 = 1, p1 = 1
[see Eq. (4.12)], and all other pk are zero.
Example 5.8 As mentioned earlier, the Haar scaling function and Haar wavelet are discontinuous. There is a more intricate construction, due to Daubechies, that we will present in Chapter 6, which leads to a continuous scaling function and wavelet (see Figures 6.2 and 6.3 for their graphs). The associated values of the pk will be computed in Section 6.1 to be
p0 = (1 + √3)/4, p1 = (3 + √3)/4, p2 = (3 – √3)/4, p3 = (1 – √3)/4, (5.4)
with all other pk equal to zero.
The following result contains identities for the pk, which will be important later.
Theorem 5.9 Suppose {Vj, j ∈ Z} is a multiresolution analysis with scaling function ϕ. Then, provided the scaling relation can be integrated termwise, the following equalities hold:
∑k pk pk–2l = 2δl0 for all l ∈ Z,
∑k pk² = 2,
∑k pk = 2,
∑k p2k = 1 and ∑k p2k+1 = 1.
Proof. The first equation follows from the two-scale relation, Theorem 5.6, and the fact that {ϕ(x – k), k ∈ Z} is orthonormal. We leave the details to exercise 4a. By setting l = 0 in the first equation, we get the second equation. The proof of the third uses Theorem 5.6 as follows:
∫ ϕ(x) dx = ∑k pk ∫ ϕ(2x – k) dx.
By letting t = 2x – k (so dx = dt/2), we obtain
∫ ϕ(x) dx = (1/2) ∑k pk ∫ ϕ(t) dt.
Now ∫ ϕ(t) dt cannot equal zero (see exercise 6), because otherwise we could never approximate functions f ∈ L2(R) with ∫ f(t) dt ≠ 0 by functions in Vj. Therefore, the factor of ∫ ϕ(t) dt on the right can be canceled with ∫ ϕ(x) dx on the left. The third equation now follows.
To prove the fourth equation, we take the first equation, replace l by –l, and then sum over l. We obtain
∑l ∑k pk pk+2l = 2.
Now divide the sum over k into even and odd terms:
∑l [∑k p2k p2k+2l + ∑k p2k+1 p2k+1+2l] = 2.
For the two inner l sums on the right, replace l by l – k. These inner sums become ∑l p2l and ∑l p2l+1, respectively. Therefore
(∑k p2k)(∑l p2l) + (∑k p2k+1)(∑l p2l+1) = 2.
Let E = ∑l p2l and O = ∑l p2l+1. This last equation can be restated as |E|² + |O|² = 2. From the third property, we have ∑k pk = 2. Dividing this sum into even and odd indices gives E + O = 2. These two equations for E and O have only one solution: E = O = 1, as illustrated in Figure 5.2 for the case when all the pk are real numbers. For the general case, see exercise 7.
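The four identities of Theorem 5.9 are easy to check numerically for a concrete set of coefficients. The sketch below (our own check, not one of the text's exercises) verifies them for the Haar coefficients and for the four-coefficient Daubechies choice pk = (1 ± √3)/4, (3 ± √3)/4 computed in Section 6.1.

```python
import math

def check_identities(p):
    """p: dict mapping index k to p_k (finitely many nonzero entries)."""
    ks = sorted(p)
    # 1. sum_k p_k p_{k-2l} = 2 delta_{l0}
    for l in range(-3, 4):
        s = sum(p[k] * p.get(k - 2 * l, 0.0) for k in ks)
        assert abs(s - (2.0 if l == 0 else 0.0)) < 1e-12
    # 2. sum_k p_k^2 = 2  (the l = 0 case of identity 1)
    assert abs(sum(v * v for v in p.values()) - 2.0) < 1e-12
    # 3. sum_k p_k = 2
    assert abs(sum(p.values()) - 2.0) < 1e-12
    # 4. the even-index and odd-index sums are each 1 (E = O = 1)
    assert abs(sum(v for k, v in p.items() if k % 2 == 0) - 1.0) < 1e-12
    assert abs(sum(v for k, v in p.items() if k % 2 != 0) - 1.0) < 1e-12
    return True

haar = {0: 1.0, 1: 1.0}
r3 = math.sqrt(3.0)
daub = {0: (1 + r3) / 4, 1: (3 + r3) / 4, 2: (3 - r3) / 4, 3: (1 - r3) / 4}
ok = check_identities(haar) and check_identities(daub)
```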
5.1.3 The Associated Wavelet and Wavelet Spaces
Recall that Vj is a subset of Vj+1. In order to carry out the decomposition algorithm in the general case, we need to decompose Vj+1 into an orthogonal direct sum of Vj and its orthogonal complement, which we will denote by Wj (as we did in the case of the Haar system). In addition, we need to construct a function ψ whose translates generate the space Wj (also as we did in the Haar system). Once ϕ is specified, the scaling relation can be used to construct the associated function ψ that generates Wj. We do this now.
Theorem 5.10 Suppose {Vj, j ∈ Z} is a multiresolution analysis with scaling function
ϕ(x) = ∑k pk ϕ(2x – k)
(the pk are the coefficients in Theorem 5.6). Let Wj be the span of {ψ(2jx – k), k ∈ Z}, where
ψ(x) = ∑k (–1)^k p1–k ϕ(2x – k).
Then Wj ⊂ Vj+1 is the orthogonal complement of Vj in Vj+1. Furthermore, {ψjk(x) := 2j/2ψ(2jx – k), k ∈ Z} is an orthonormal basis for Wj.
For the Haar scaling function, the coefficients p0 and p1 are both 1. The above theorem then states that the Haar wavelet is ψ(x) = ϕ(2x) – ϕ(2x – 1), which agrees with the definition of ψ in Definition 4.7.
Proof of Theorem 5.10. Once we establish this theorem for the case j = 0, the case for j > 0 will follow by using the scaling condition (Definition 5.1, condition 4).
To establish the result for the case j = 0, we must verify the following three statements
1. The set {ψ0k(x) = ψ(x – k), k ∈ Z} is orthonormal.
2. ψ0k(x) = ψ(x – k) is orthogonal to V0 for all k ∈ Z. Equivalently, W0 is contained in the orthogonal complement of V0 in V1.
3. Any function in V1 that is orthogonal to V0 can be written as a linear combination of ψ (x – k). Equivalently, W0 contains the orthogonal complement of V0 in V1.
We establish the first two statements, and we postpone the proof of the third (which is more technical) until Appendix A.2.2. The second and third together imply that W0 is the orthogonal complement of V0 in V1.
To establish the first statement, we have from exercise 4b that
⟨ψ0m, ψ0n⟩ = (1/2) ∑k p1–k+2m p1–k+2n.
Making the change of index k′ = 1 – k + 2m in this series and using the first part of Theorem 5.9, we have
⟨ψ0m, ψ0n⟩ = (1/2) ∑k′ pk′ pk′–2(m–n) = δmn,
so the set is orthonormal.
To establish the second statement, it is sufficient to show that ψ(x – m) is orthogonal to ϕ(x – l) for each l, m ∈ Z, because V0 is spanned by {ϕ(x – l), l ∈ Z}. From exercise 4c, we have that
⟨ψ0m, ϕ0l⟩ = (1/2) ∑k (–1)^k p1–k+2m pk–2l.
The sum on the right is zero. To see this, let us first examine the case l = m = 0, in order to see the pattern of cancellation. In this case, the sum is
∑k (–1)^k p1–k pk = ⋯ + p3 p–2 – p2 p–1 + p1 p0 – p0 p1 + p–1 p2 – p–2 p3 + ⋯.
The inner pair of terms cancels, then the outer pair cancels, and so on. For the general case, note that the term with k = l + m – j (j ≥ 0) cancels with the term with k = l + m + j + 1 (again with j ≥ 0).
This completes the proof of the second statement. As noted earlier, the third statement will be proved in the Appendix A.2.2.
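The cancellation argument can also be confirmed numerically. In the quick check below (our own illustration, using the four-term Daubechies coefficients), the sum ∑k (–1)^k p1–k+2m pk–2l vanishes for every l and m.

```python
import math

r3 = math.sqrt(3.0)
p = {0: (1 + r3) / 4, 1: (3 + r3) / 4, 2: (3 - r3) / 4, 3: (1 - r3) / 4}

def cross_sum(l, m):
    # sum_k (-1)^k p_{1-k+2m} p_{k-2l}; the pairwise cancellation
    # described in the text makes this zero for all l and m
    return sum((-1) ** k * p.get(1 - k + 2 * m, 0.0) * p.get(k - 2 * l, 0.0)
               for k in range(-20, 21))

worst = max(abs(cross_sum(l, m)) for l in range(-4, 5) for m in range(-4, 5))
# worst is zero up to floating-point round-off
```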
For future reference, we note that ψjl(x) = 2j/2ψ(2jx – l) has the expansion
ψjl = 2–1/2 ∑k (–1)^k p1–k+2l ϕj+1,k. (5.6)
This follows from the definition of ψ given in Theorem 5.10: substitute 2jx – l for x, multiply both sides by 2j/2, and adjust the summation index.
From Theorem 5.10, the set {ψj–1,k}k∈Z is an orthonormal basis for the space Wj–1, which is the orthogonal complement of Vj–1 in Vj (so Vj = Wj–1 ⊕ Vj–1). By successive orthogonal decompositions, we obtain
Vj = Wj–1 ⊕ Wj–2 ⊕ ⋯ ⊕ W0 ⊕ V0.
Since we have defined Vj for j < 0, we can keep going:
Vj = Wj–1 ⊕ Wj–2 ⊕ ⋯ ⊕ W–m ⊕ V–m for any m > 0.
The Vj are nested and the union of all the Vj is the space L2(R). Therefore, we can let j → ∞ and obtain the following theorem.
Theorem 5.11 Let {Vj, j ∈ Z} be a multiresolution analysis with scaling function ϕ. Let Wj be the orthogonal complement of Vj in Vj+1. Then
L2(R) = ⋯ ⊕ W–2 ⊕ W–1 ⊕ W0 ⊕ W1 ⊕ W2 ⊕ ⋯.
In particular, each f ∈ L2(R) can be uniquely expressed as a sum f = ∑k∈Z wk with wk ∈ Wk, where the wk’s are mutually orthogonal. Equivalently, the set of all wavelets, {ψjk}j,k∈Z, is an orthonormal basis for L2(R).
The infinite sum appearing in this theorem should be thought of as an approximation by finite sums. In other words, each f ∈ L2(R) can be approximated arbitrarily closely in the L2 norm by finite sums of the form w–j + w1–j + ⋯ + wj–1 + wj for suitably large j.
For large j, the Wj components of a signal represent its high-frequency components, because Wj is generated by translates of the function ψ(2jx), which vibrate at high frequency. For example, compare the picture in Figure 5.3, which is the graph of a function ψ(x), to that in Figure 5.4, which is the graph of ψ(22x) (the same function ψ is used to generate both graphs).
5.1.4 Decomposition and Reconstruction Formulas: A Tale of Two Bases
Suppose that we are dealing with a signal f that is already in one of the approximation spaces, say Vj. There are two primary orthonormal bases that can be used for representing f. The first is the natural scaling-function basis for Vj, {ϕjk}k∈Z, as defined in Theorem 5.5. In terms of this basis, we have
f = ∑k ⟨f, ϕjk⟩ ϕjk. (5.7)
Since we have the orthogonal direct sum decomposition Vj = Vj–1 ⊕ Wj–1, we can also use the concatenation of the bases for Vj–1 and Wj–1; that is, we use {ϕj–1,k}k∈Z ∪ {ψj–1,k}k∈Z, where ψjk is defined in Theorem 5.10. Relative to this orthonormal basis (see Theorem 0.21), f has the form
f = ∑k ⟨f, ϕj–1,k⟩ ϕj–1,k + ∑k ⟨f, ψj–1,k⟩ ψj–1,k. (5.8)
The decomposition formula starts with the coefficients relative to the first basis [in Eq. (5.7)] and uses them to calculate those relative to the second [in Eq. (5.8)]. The reconstruction formula does the reverse. Here is the decomposition formula (for real-valued pk):
⟨f, ϕj–1,l⟩ = 2–1/2 ∑k pk–2l ⟨f, ϕjk⟩,
⟨f, ψj–1,l⟩ = 2–1/2 ∑k (–1)^k p1–k+2l ⟨f, ϕjk⟩. (5.9)
To obtain the first part, we use the expansion (5.7) for f and the scaling relation in Eq. (5.3). The second part also requires (5.7), but uses Eq. (5.6) for the wavelets, with j replaced by j – 1.
One important application of the decomposition formula is to express ϕjk in terms of the second basis. From the decomposition formula (5.9) together with the orthonormality of the ϕjk’s, we have ⟨ϕjk, ϕj–1,l⟩ = 2–1/2 pk–2l and ⟨ϕjk, ψj–1,l⟩ = 2–1/2 (–1)^k p1–k+2l. Being careful about which index we are summing over (l in this case), we obtain the expansion
ϕjk = 2–1/2 ∑l [pk–2l ϕj–1,l + (–1)^k p1–k+2l ψj–1,l],
which is literally the “inverse” of the scaling relation. By taking the L2 inner product of the preceding equation with f, we obtain the reconstruction formula:
⟨f, ϕjk⟩ = 2–1/2 ∑l [pk–2l ⟨f, ϕj–1,l⟩ + (–1)^k p1–k+2l ⟨f, ψj–1,l⟩].
The preceding formulas all involve orthonormal bases. For various reasons it might be more convenient to use the orthogonal (but not normalized) versions—{ϕ(2jx – k)}k∈Z rather than {ϕjk(x) = 2j/2ϕ(2jx – k)}k∈Z, and {ψ(2jx – k)}k∈Z rather than {ψjk(x) = 2j/2ψ(2jx – k)}k∈Z. For instance, the expansion for f ∈ Vj in Eq. (5.7) can be rewritten this way:
f = ∑k a^j_k ϕ(2jx – k), where a^j_k = 2j/2⟨f, ϕjk⟩.
Similarly, Eq. (5.8) can be rewritten as
f = ∑k a^{j–1}_k ϕ(2j–1x – k) + ∑k b^{j–1}_k ψ(2j–1x – k),
where b^j_k = 2j/2⟨f, ψjk⟩. The a^j_k are called the approximation coefficients and the b^j_k are called the detail coefficients. The decomposition and reconstruction formulas can be restated in terms of them:
a^{j–1}_l = (1/2) ∑k pk–2l a^j_k,  b^{j–1}_l = (1/2) ∑k (–1)^k p1–k+2l a^j_k, (5.12)
a^j_k = ∑l [pk–2l a^{j–1}_l + (–1)^k p1–k+2l b^{j–1}_l]. (5.13)
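The passage between the two bases is an exact change of basis, and it is instructive to see this numerically. The sketch below (our own illustration, assuming real pk; it uses the coefficient formulas of Eqs. (5.12) and (5.13) with the Daubechies pk) decomposes a finitely supported coefficient sequence and reconstructs it, recovering the original to round-off.

```python
import math

r3 = math.sqrt(3.0)
p = {0: (1 + r3) / 4, 1: (3 + r3) / 4, 2: (3 - r3) / 4, 3: (1 - r3) / 4}

def decompose(a):
    """One level of Eq. (5.12): a^j -> (a^{j-1}, b^{j-1})."""
    ks = sorted(a)
    ls = range(min(ks) - 2, max(ks) // 2 + 2)  # wide enough to catch all terms
    lo = {l: 0.5 * sum(p.get(k - 2 * l, 0.0) * a[k] for k in ks) for l in ls}
    hi = {l: 0.5 * sum((-1) ** k * p.get(1 - k + 2 * l, 0.0) * a[k]
                       for k in ks) for l in ls}
    return lo, hi

def reconstruct(lo, hi):
    """Eq. (5.13): a^j_k = sum_l p_{k-2l} a^{j-1}_l + (-1)^k p_{1-k+2l} b^{j-1}_l."""
    ks = range(2 * min(lo) - 4, 2 * max(lo) + 8)
    return {k: sum(p.get(k - 2 * l, 0.0) * lo[l]
                   + (-1) ** k * p.get(1 - k + 2 * l, 0.0) * hi[l]
                   for l in lo)
            for k in ks}

sig = {k: math.sin(k / 5.0) for k in range(32)}   # a hypothetical a^j sequence
lo, hi = decompose(sig)
back = reconstruct(lo, hi)
err = max(abs(back[k] - sig[k]) for k in sig)     # perfect reconstruction
```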
Thus ends our “Tale of Two Bases.” We use the basis {ϕjk}k∈Z to view the signal as a whole, and we use the basis {ϕj–1,k}k∈Z ∪ {ψj–1,k}k∈Z to view the smoothed and oscillatory parts of the signal.
5.1.5 Summary
Let us summarize the main results concerning a multiresolution analysis {Vj}j∈Z with scaling function ϕ.
Scaling Function
ϕ(x) = ∑k pk ϕ(2x – k), where pk = 2 ∫ ϕ(x) ϕ(2x – k) dx.
Wavelet Spaces
Wj is the orthogonal complement of Vj in Vj+1, so Vj+1 = Vj ⊕ Wj and L2(R) = ⊕j∈Z Wj.
Wavelet
ψ(x) = ∑k (–1)^k p1–k ϕ(2x – k).
Orthonormal Bases
{ϕjk(x) = 2j/2ϕ(2jx – k)}k∈Z for Vj and {ψjk(x) = 2j/2ψ(2jx – k)}k∈Z for Wj.
Decomposition Formulas
a^{j–1}_l = (1/2) ∑k pk–2l a^j_k,  b^{j–1}_l = (1/2) ∑k (–1)^k p1–k+2l a^j_k.
Reconstruction Formulas
a^j_k = ∑l [pk–2l a^{j–1}_l + (–1)^k p1–k+2l b^{j–1}_l].
5.2 IMPLEMENTING DECOMPOSITION AND RECONSTRUCTION
In this section, we describe decomposition and reconstruction algorithms associated with a multiresolution analysis. These algorithms are analogous to those presented in the section on Haar wavelets. In later sections, the algorithms developed here will be used in conjunction with a multiresolution analysis involving a continuous scaling function and a continuous wavelet to provide accurate decomposition and reconstruction of signals.
5.2.1 The Decomposition Algorithm
In order to do signal processing, such as filtering or data compression, an efficient algorithm is needed to decompose a signal into parts that contain information about the signal’s oscillatory behavior. If we use a multiresolution analysis for this purpose, we need to develop an algorithm for breaking up a signal into its wavelet-space (Wj) components, because these components have the information that we want.
There are three major steps in decomposing a signal f : initialization, iteration, and termination.
Initialization. This step involves two parts. First, we have to decide on the approximation space Vj that best fits the information available on f. The sampling rate and our choice of multiresolution analysis determine which Vj to use. Second, we need to choose fj ∈ Vj to best fit f itself.
The best approximation to f from Vj, in the sense of energy, is Pjf, the orthogonal projection of f onto Vj (see Definition 0.19); since {ϕjk(x) = 2j/2ϕ(2jx – k)}k∈Z is orthonormal, Theorem 0.21 implies
Pjf = ∑k ⟨f, ϕjk⟩ ϕjk.
The information from the sampled signal is usually not enough to determine the coefficients exactly, and so we have to approximate them using the quadrature rule given in the next theorem.
Theorem 5.12 Let {Vj, j ∈ Z} be a given multiresolution analysis with a scaling function ϕ that is compactly supported. If f ∈ L2(R) is continuous, then, for j sufficiently large, we have
a^j_k := 2j ∫ f(x) ϕ(2jx – k) dx ≈ m f(k/2j), (5.14)
where
m = ∫ ϕ(t) dt.
Proof. Since ϕ has compact support, it is nonzero only on an interval of the form {t; |t| ≤ M}. (In many cases, M is not large; in the case of the Haar function, M = 1, and for the first few Daubechies wavelets, M is on the order of 5.) Thus, the interval of integration for a^j_k in Eq. (5.14) is {x; |2jx – k| ≤ M}. By changing variables from x to t = 2jx – k in the integral for a^j_k, we obtain
a^j_k = ∫[–M, M] f(2–jt + 2–jk) ϕ(t) dt.
When j is large, 2–jt + 2–jk ≈ 2–jk for t ∈ [–M, M]. Therefore, f(2–jt + 2–jk) ≈ f(2–jk), because f is uniformly continuous on any finite interval (this is illustrated in Figure 5.5). In particular, we can approximate the integral on the right by replacing f(2–jt + 2–jk) by f(2–jk) to obtain
a^j_k ≈ f(2–jk) ∫[–M, M] ϕ(t) dt.
Recalling that ϕ is 0 outside of the interval [–M, M], we have
∫[–M, M] ϕ(t) dt = ∫ ϕ(t) dt = m,
so a^j_k ≈ m f(k/2j).
The accuracy of this approximation increases with increasing j. In exercise 12, an estimate is given of how large j must be in order to get an approximation to within a given tolerance. The accuracy may depend on the index k as well as j. Since in practice only finitely many coefficients are dealt with, care should be taken to pick j large enough so that all of the coefficients are accurately calculated. In addition, the proof of the theorem can be modified to apply to the case of a piecewise continuous signal (see exercise 15) and to the case where ϕ is well-localized but not compactly supported. As mentioned in the proof of Theorem 5.9, m = ∫ ϕ ≠ 0 (exercise 6). Often, ϕ is constructed so that m = 1 (e.g., Haar and Daubechies), in which case a^j_k ≈ f(k/2j). Using the quadrature formula a^j_k ≈ m f(k/2j), we are also approximating the projection Pjf by the expansion
Pjf(x) ≈ ∑k m f(k/2j) ϕ(2jx – k).
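For the Haar scaling function the coefficient a^j_k is exactly 2j times the integral of f over [k/2j, (k+1)/2j], so the quadrature error of Theorem 5.12 can be observed directly. In this sketch (our own illustration; the test signal is a hypothetical choice), the integral is computed by a midpoint rule and compared with f(k/2j):

```python
import math

def haar_coeff(f, j, k, n=400):
    # midpoint rule for a^j_k = 2^j * integral of f over [k/2^j, (k+1)/2^j]
    a, b = k / 2 ** j, (k + 1) / 2 ** j
    h = (b - a) / n
    return 2 ** j * h * sum(f(a + (i + 0.5) * h) for i in range(n))

f = math.sin
worst = {}
for j in (2, 4, 6, 8):
    worst[j] = max(abs(haar_coeff(f, j, k) - f(k / 2 ** j))
                   for k in range(2 ** j))
# worst[j] shrinks roughly like 2^{-j}, as the proof of Theorem 5.12 predicts
```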
Note the similarity to the expansions associated with the Sampling Theorem.
The initialization step discussed here is often simply assumed and used without comment. Strang and Nguyen (1996) point this out and call it the “wavelet crime!”
Iteration. Wavelets shine here! After the initialization step, we have f ≈ fj ∈ Vj. From Theorem 5.11, we can start with fj and decompose it into a sum of a lower-level approximation part, fj–1 ∈ Vj–1, and a wavelet part, wj–1 ∈ Wj–1; that is, fj = fj–1 + wj–1. We then repeat the process with fj–1, then with fj–2, and so on. This is illustrated in the diagram below, where we have stopped the decomposition at level 0.
To carry out the decomposition, we work with the approximation and wavelet coefficients, the a’s and b’s that we discussed at the end of Section 5.1.4. As we did for the Haar wavelets, we want to translate the decomposition equation (5.12) into the language of discrete filters and simple operators acting on coefficients.
Recall that the convolution of two sequences x = (..., x–1, x0, x1, ...) and y = (..., y–1, y0, y1, ...) is defined (see Definition 3.9) by
(x ∗ y)l = ∑k xl–k yk.
Let h and ℓ be the sequences
hk = (1/2)(–1)^k pk+1, k ∈ Z,
ℓk = (1/2) p–k, k ∈ Z.
Define the two discrete filters (convolution operators) H and L via H(x) = h ∗ x and L(x) = ℓ ∗ x. Take x = aj and note that L(aj)2l = (1/2) ∑k pk–2l a^j_k. Comparing this with Eq. (5.12), we see that a^{j–1}_l = L(aj)2l. Similarly, b^{j–1}_l = H(aj)2l. In discussing the Haar decomposition algorithm, we encountered this operation of discarding the odd coefficients in a sequence and called it down sampling. A down-sampled signal is a sampling of the signal at every other node and thus corresponds to sampling a signal on a grid that is twice as coarse as the original one. We define the down-sampling operator D below.
Definition 5.13 If x = (..., x–2, x–1, x0, x1, x2, ...), then its down sample is the sequence
Dx = (..., x–2, x0, x2, ...),
or (Dx)l = x2l for all l ∈ Z.
We can now formulate the iterative step in the algorithm using discrete filters (convolution operators). The two filters that we use, h and ℓ, are called the decomposition high-pass and decomposition low-pass filters, respectively:
a^{j–1} = D(ℓ ∗ a^j),  b^{j–1} = D(h ∗ a^j). (5.17)
In Figure 5.6, we illustrate the iterative step in the decomposition algorithm. We represent the down-sampling operator pictorially by 2↓. It is important to note that the discrete filters and down-sampling operator do not depend on the level j. Thus storage is minimal and, because convolution is inexpensive computationally, the whole iterative process is fast and efficient.
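The iterative step can be written out concretely. The sketch below is our own illustration (sequences are represented as Python dicts from index to value, and the Daubechies pk are used): the filter sequences hk = (–1)^k pk+1/2 and ℓk = p–k/2 are chosen so that convolving and then down-sampling reproduces Eq. (5.12).

```python
import math

r3 = math.sqrt(3.0)
p = {0: (1 + r3) / 4, 1: (3 + r3) / 4, 2: (3 - r3) / 4, 3: (1 - r3) / 4}
h = {k: (-1) ** k * p[k + 1] / 2 for k in (-1, 0, 1, 2)}   # high-pass
ell = {k: p[-k] / 2 for k in (0, -1, -2, -3)}              # low-pass

def conv(x, y):
    # (x * y)_n = sum_k x_{n-k} y_k, for finitely supported sequences
    out = {}
    for i, xi in x.items():
        for j, yj in y.items():
            out[i + j] = out.get(i + j, 0.0) + xi * yj
    return out

def down(x):
    # (Dx)_l = x_{2l}: keep only the even-indexed entries
    return {k // 2: v for k, v in x.items() if k % 2 == 0}

aj = {k: math.cos(k / 7.0) for k in range(16)}  # hypothetical a^j sequence
a_next = down(conv(ell, aj))   # approximation coefficients a^{j-1}
b_next = down(conv(h, aj))     # detail coefficients b^{j-1}
```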
Termination. There are several criteria for finishing the decomposition. The simplest is that we decompose until we exhaust the finite number of samples we have taken. On the other hand, this may not be necessary. Singularity detection may only require one or two levels. In general, the choice of stopping point greatly depends on what we wish to accomplish.
The end result of the entire decomposition procedure—stopped at j′ = 0, say—is a set of coefficients that includes the approximation coefficients a^0_l for level 0 and the detail (wavelet) coefficients b^{j′}_l for j′ = 0, ..., j – 1.
Example 5.14 We consider the signal f, defined on the unit interval 0 ≤ x ≤ 1, given in Figure 5.7 (the same one used in Example 4.13). As before, we discretize this signal over 28 nodes (so a^8_l = f(l/28)). We now use the decomposition algorithm given in Eq. (5.17) with the Daubechies wavelets, whose p values are given in Eq. (5.4). Plots of the components in V8, V7, V6, and V4 are given in Figures 5.8, 5.9, 5.10, and 5.11. Compare these with the corresponding figures for the Haar wavelets (Figures 4.14, 4.15, 4.16, and 4.17). Since the Daubechies wavelets are continuous, they offer better approximations to the original signal than do the Haar wavelets.
5.2.2 The Reconstruction Algorithm
As mentioned in the section on Haar wavelets, once the signal has been decomposed, some of its Wj′ components may be modified. If the goal is to filter out noise, then the Wj′ components of f corresponding to the unwanted frequencies can be thrown out, and the resulting signal will have significantly less noise. If the goal is data compression, the Wj′ components that are small can be thrown out without appreciably changing the signal. Only the significant Wj′ components (the larger b^{j′}_l) need to be stored or transmitted, and significant data compression can be achieved. In either case, since the Wj′ components have been modified, we need a reconstruction algorithm to reassemble the compressed or filtered signal in terms of the basis elements ϕ(2jx – l) of Vj. The idea is to reconstruct fj ≈ f by using fj′ = fj′–1 + wj′–1, starting at j′ = 1. The scheme is illustrated in the diagram below.
We again break up the algorithm into three major steps: initialization, iteration, and termination.
Initialization. What we have available is a set of possibly modified coefficients—starting at level 0, say—that includes the approximation coefficients a^0_l for level 0 and the detail (wavelet) coefficients b^{j′}_l for j′ = 0, ..., j – 1. These coefficients appear in the expansions
f0 = ∑l a^0_l ϕ(x – l), (5.18)
wj′ = ∑l b^{j′}_l ψ(2j′x – l). (5.19)
Iteration. We will again formulate this step in terms of discrete filters. Let h̃ and ℓ̃ be the sequences
h̃k = (–1)^k p1–k, k ∈ Z, (5.20)
ℓ̃k = pk, k ∈ Z. (5.21)
Define the two discrete filters (convolution operators) H̃ and L̃ via H̃(x) = h̃ ∗ x and L̃(x) = ℓ̃ ∗ x. The reconstruction formula Eq. (5.13) gives a^j_k as a sum of two terms, one involving the a^{j–1}_l and one involving the b^{j–1}_l. Using the filter sequences we have just introduced, we can rewrite Eq. (5.13) as
a^j_k = ∑l ℓ̃k–2l a^{j–1}_l + ∑l h̃k–2l b^{j–1}_l. (5.22)
This is almost a sum of two convolutions; the only difference is that the index for a convolution is k – l instead of k – 2l. In other words, Eq. (5.22) is a convolution with the odd terms missing (i.e., the terms with index k – (2l + 1) are absent). They can be inserted back by simply giving them zero coefficients. We illustrate this procedure by explicitly writing out the first few terms:
In order to put this sum into convolution form, we alter the input sequence by interspersing zeros between its components, making a new sequence that has zeros at all odd entries. Each of the original nonzero terms is given a new, even index by doubling its old index. For example, the term with old index –1 now has index –2. This procedure is called up sampling and is precisely formulated as follows:
Definition 5.15 Let x = (..., x–2, x–1, x0, x1, x2, ...) be a sequence. We define the up-sampling operator U via
Ux = (..., x–2, 0, x–1, 0, x0, 0, x1, 0, x2, ...),
or
(Ux)2l = xl, (Ux)2l+1 = 0, for all l ∈ Z.
The iterative step (5.22) in the reconstruction algorithm can now be simply formulated in terms of discrete filters (convolution operators):
a^j = L̃(U(a^{j–1})) + H̃(U(b^{j–1})) = ℓ̃ ∗ U(a^{j–1}) + h̃ ∗ U(b^{j–1}). (5.23)
We remark that and are called the reconstruction high-pass and low-pass filters, respectively. As was the case in the decomposition algorithm, neither of the filters depends on the level. This makes the iteration in the reconstruction step quick and efficient. In Figure 5.12, we illustrate the iterative step in the reconstruction algorithm. We represent the up-sampling operator pictorially by 2↑.
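A matching sketch of the reconstruction step (again our own illustration, with sequences as dicts) up-samples and then convolves with the reconstruction filters ℓ̃k = pk and h̃k = (–1)^k p1–k, as in Eq. (5.23). Haar coefficients p0 = p1 = 1 are used so the arithmetic can be followed by hand.

```python
p = {0: 1.0, 1: 1.0}
lt = {0: p[0], 1: p[1]}      # low-pass:  lt_k = p_k
ht = {0: p[1], 1: -p[0]}     # high-pass: ht_k = (-1)^k p_{1-k}

def conv(x, y):
    out = {}
    for i, xi in x.items():
        for j, yj in y.items():
            out[i + j] = out.get(i + j, 0.0) + xi * yj
    return out

def up(x):
    # (Ux)_{2l} = x_l; the odd entries are zero (represented here by absence)
    return {2 * k: v for k, v in x.items()}

a0 = {0: 2.0, 1: 4.0}    # approximation (average) coefficients
b0 = {0: 1.0, 1: -1.0}   # detail (difference) coefficients
la = conv(lt, up(a0))
hb = conv(ht, up(b0))
a1 = {k: la.get(k, 0.0) + hb.get(k, 0.0) for k in sorted(set(la) | set(hb))}
# With the Haar filters this recovers a^1_{2l} = a^0_l + b^0_l and
# a^1_{2l+1} = a^0_l - b^0_l, i.e. a1 == {0: 3.0, 1: 1.0, 2: 3.0, 3: 5.0}
```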
In the case of the Haar wavelets, p0 = p1 = 1, and this algorithm reduces to the Haar reconstruction algorithm given in Theorem 4.14.
Termination. The decomposition and reconstruction algorithms use the scaling coefficients pk, but not the actual formulas for ϕ(x) or ψ(x). To plot the reconstructed signal fj(x) = ∑l a^j_l ϕ(2jx – l), we can approximate the value of fj at x = k/2j by a^j_k, in view of Theorem 5.12 (provided that m = ∫ ϕ = 1). So the formulas for ϕ and ψ do not even enter into the plot of the reconstructed signal. This is fortunate, since the computation of ϕ and ψ for Daubechies’s example is rather complicated (see Sections 5.3.4 and 6.4). However, ϕ and ψ play important background roles, since their orthogonality properties are crucial to the success of the decomposition and reconstruction algorithms.
5.2.3 Processing a Signal
The steps in processing a signal are essentially the same as the ones discussed in connection with the Haar system in Chapter 4.
1. Sample. This is really a preprocessing step. If the signal is continuous, then it must be sampled at a rate sufficient to capture its essential details. The sampling rate varies, depending on a variety of factors. For instance, to sample with full fidelity a performance of Beethoven’s Ninth Symphony, one has to capture frequencies up to the audible limit of human hearing, roughly 20 kHz. A good rule of thumb frequently used in sampling signals is to use a rate that is double the highest frequency present—the Nyquist rate—or 40 kHz in this case.
2. Decompose. Once the signal has been sampled, we use Theorem 5.12 to approximate the highest-level approximation coefficients. We then iterate Eq. (5.12) or (5.17) until an appropriate level is reached. The outputs of this step are all the levels of wavelet coefficients (details) and the lowest-level approximation coefficients. This set of coefficients will be used in the next step.
3. Process. At this stage, the signal can be compressed by discarding insignificant coefficients, or it can be filtered or denoised in other ways. The output is a modified set of coefficients, which may be stored or immediately reconstructed to recover the processed version of the signal. In some cases, such as singularity detection, the signal is of no further interest and is discarded.
4. Reconstruct. The reconstruction algorithm (5.22) or (5.23) is used here with the modified coefficients from the previous step. The output of that algorithm is the set of top-level approximation coefficients. Theorem 5.12 is then invoked to conclude that the processed signal is approximately equal to the top-level reconstructed coefficients.
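The four steps above can be strung together in a few lines. The sketch below is our own illustration with the Haar filters (p0 = p1 = 1, for which one decomposition step takes pairwise averages and differences); the signal and threshold are hypothetical choices, not the book's Example 5.16. It samples a signal, decomposes it to level 0, zeroes the small detail coefficients, and reconstructs.

```python
import math

def decompose_step(a):
    # Haar version of Eq. (5.12): averages (approximation) and differences (detail)
    lo = [(a[2 * i] + a[2 * i + 1]) / 2 for i in range(len(a) // 2)]
    hi = [(a[2 * i] - a[2 * i + 1]) / 2 for i in range(len(a) // 2)]
    return lo, hi

def reconstruct_step(lo, hi):
    # Haar version of Eq. (5.13): a_{2l} = lo + hi, a_{2l+1} = lo - hi
    a = []
    for l, h in zip(lo, hi):
        a.extend([l + h, l - h])
    return a

# 1. Sample: 256 samples of a smooth wave plus one isolated spike.
orig = [math.sin(2 * math.pi * k / 256) + (1.0 if k == 100 else 0.0)
        for k in range(256)]
# 2. Decompose all the way down to level 0.
a, details = list(orig), []
for _ in range(8):
    a, b = decompose_step(a)
    details.append(b)
# 3. Process: discard (zero out) the small detail coefficients.
details = [[x if abs(x) > 0.01 else 0.0 for x in b] for b in details]
# 4. Reconstruct.
for b in reversed(details):
    a = reconstruct_step(a, b)
# a now holds the 256 samples of the compressed signal, close to the original
```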
Example 5.16 We will apply the decomposition and reconstruction algorithms with Daubechies wavelets to the signal f over the unit interval given in Figure 5.13 (this is the same signal used in Example 4.15).
As before, we discretize this signal over 2^8 nodes (x = k/2^8 for 0 ≤ k ≤ 2^8 − 1) and then decompose it with Eq. (5.17), where ϕ and ψ are the Daubechies scaling function and wavelet (which we will construct shortly). We then compress and reconstruct the signal using Eq. (5.22), and we re-plot the new approximation coefficients as the approximate values of the reconstructed signal at x = k/2^8. Figures 5.14 and 5.15 illustrate 80% and 90% compression using Daubechies wavelets (obtained by setting the smallest 80% and 90% of the coefficients equal to zero, respectively). Compare these compressed signals with Figures 4.20 and 4.21, which illustrate 80% and 90% compression with the Haar wavelets. The relative errors for 80% and 90% compression are 0.0553 and 0.1794, respectively. For comparison, the relative errors for the same compression rates with the Haar wavelets are 0.0895 and 0.1838, respectively (see Example 4.15). Using the FFT, for the same compression rates, the relative errors are 0.1228 and 0.1885, respectively.
Do not get the impression from this example that wavelets are always better than the FFT for data compression. The signal in this example has two isolated spikes, which wavelets are designed to handle better than the FFT. Signals without spikes or rapid changes, or signals of an almost periodic nature, can be handled quite well by the FFT (perhaps even better than by wavelets).
5.3 FOURIER TRANSFORM CRITERIA
The examples of multiresolution analyses discussed in the previous sections (Haar, linear splines, and Shannon) are all constructed by starting with a given set of sampling spaces. All of these have limitations. For example, as mentioned earlier, the decomposition algorithm based on the Haar scaling function does not provide an accurate approximation to most smooth signals, because the Haar scaling function and associated wavelet are discontinuous. The Shannon multiresolution analysis is smooth enough, but the high- and low-pass filters used in decomposition and reconstruction have infinite-length impulse responses. Moreover, these responses are sequences that decay slowly for large index n. The linear splines are better in this regard; even so, the impulse responses involved are still infinite. In Chapter 6, we construct a multiresolution analysis that has a continuous scaling function and finite impulse responses for the high- and low-pass filters used in its decomposition and reconstruction algorithms. Instead of starting with the sampling spaces, we construct the scaling function directly using Fourier transform methods. In this section, we state and derive the properties that must be satisfied by a scaling function, and then we translate these properties into language involving the Fourier transform.
5.3.1 The Scaling Function
Recall that a multiresolution analysis is, by definition, a collection of subspaces {Vj : j ∈ Z} of L2(R), where each Vj is the span of {ϕ(2^j x − k) : k ∈ Z}, and where ϕ must be constructed so that the collection {Vj : j ∈ Z} satisfies the properties listed in Definition 5.1. The following theorem provides an equivalent formulation of these properties in terms of the function ϕ.
Theorem 5.17 Suppose ϕ is a continuous function with compact support satisfying the orthonormality condition; that is, ∫ℝ ϕ(x − k)ϕ(x − l) dx = δkl. Let Vj be the span of {ϕ(2^j x − k) : k ∈ Z}. Then the following hold.
- The spaces Vj satisfy the separation condition (i.e., ∩Vj = {0}).
- Suppose that the following additional conditions are satisfied by ϕ:
1. Normalization: ∫ℝ ϕ(x) dx = 1;
2. Scaling: ϕ (x) = ∑k pkϕ (2x – k) for some finite number of constants pk.
Then the associated Vj satisfy the density condition: ∪j∈Z Vj is dense in L2(R); in other words, any function in L2(R) can be approximated by functions in Vj for large enough j.
In particular, if the function ϕ is continuous with compact support and satisfies the normalization, scaling, and orthonormality conditions listed above, then the collection of spaces {Vj : j ∈ Z} forms a multiresolution analysis.
Proof. Here, we give a nontechnical explanation of the central ideas of the first part of the theorem. Rigorous proofs of the first and second parts are contained in the Appendix (see A.2.1).
By definition, a signal, f, belonging to V−j is a linear combination of the translates ϕ(x/2^j − k), k ∈ Z. As j > 0 gets large, the support of ϕ(x/2^j − k) spreads out. For example, if Figure 5.16 represents the graph of ϕ, then the graph of ϕ(x/2^2) is given in Figure 5.17.
On the other hand, as a member of L2(R), f has finite energy; that is, ∫ℝ |f(t)|^2 dt is finite. As the graphs in Figures 5.16 and 5.17 show, the energy of a signal increases (∫ℝ |f(t)|^2 dt → ∞) as the support of the signal grows. Therefore, a nonzero signal that belongs to all the V−j, as j → ∞, must have infinite energy. We conclude that the only signal belonging to all the Vj must be identically zero.
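The energy growth used in this argument can be checked numerically: replacing x by x/2^j stretches the support by a factor of 2^j and multiplies the energy of a fixed-height profile by the same factor. A small sketch, using the Haar scaling function as the test profile (the grid step h is an arbitrary choice):

```python
# Numerical check that the energy of phi(x / 2^j) grows like 2^j,
# where phi is the Haar scaling function (1 on [0, 1), 0 elsewhere).
def energy(f, a, b, h=1e-3):
    """Riemann-sum approximation of the integral of |f|^2 over [a, b]."""
    n = round((b - a) / h)
    return sum(abs(f(a + i * h)) ** 2 for i in range(n)) * h

def phi(x):
    return 1.0 if 0 <= x < 1 else 0.0

for j in range(4):
    e = energy(lambda x, j=j: phi(x / 2**j), 0, 2**j)
    print(j, round(e, 3))  # the energies grow like 2^j: roughly 1, 2, 4, 8
```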
5.3.2 Orthogonality via the Fourier Transform
How is a scaling function ϕ constructed? To construct one, we must first find coefficients pk that satisfy the conditions of Theorem 5.9. A direct construction of the pk is difficult. Instead, we use the Fourier transform, which more clearly motivates some of the steps in the construction of the pk and hence of ϕ. At first, we assume a continuous scaling function exists and see what properties must be satisfied by the associated scaling coefficients pk. Then we reverse the process and show that if the pk satisfy these properties, then an associated scaling function can be constructed. In the next section, a set of coefficients pk will be constructed that satisfies these properties, and this will lead to the construction of a particular continuous scaling function (due to Daubechies).
Recall that the Fourier transform of a function f is given by

f̂(ξ) = (1/√(2π)) ∫ℝ f(x) e^{−iξx} dx.

Since ϕ̂(0) = (1/√(2π)) ∫ℝ ϕ(x) dx, the normalization condition (∫ℝ ϕ(x) dx = 1) becomes

ϕ̂(0) = 1/√(2π).
The translations of the orthonormality conditions on ϕ and ψ into the language of the Fourier transform are given in the next theorem.
Theorem 5.18 A function ϕ satisfies the orthonormality condition if and only if

∑k∈Z |ϕ̂(ξ + 2πk)|^2 = 1/(2π) for all ξ ∈ R.

In addition, a function ψ(x) is orthogonal to ϕ(x − l) for all l ∈ Z if and only if

∑k∈Z ψ̂(ξ + 2πk) ϕ̂(ξ + 2πk)* = 0 for all ξ ∈ R

(where * denotes complex conjugation).
Proof. We will prove the first part. The proof of the second part is similar and is left to the exercises (see exercise 16). The orthonormality condition can be stated as

∫ℝ ϕ(x − k) ϕ(x − l)* dx = δkl.

By replacing x − k by x in the above integral and relabeling n = l − k, the orthonormality condition can be restated as

∫ℝ ϕ(x) ϕ(x − n)* dx = δn0.   (5.24)
Recall that Plancherel's identity for the Fourier transform (Theorem 2.12) states

∫ℝ f(x) g(x)* dx = ∫ℝ f̂(ξ) ĝ(ξ)* dξ.

We apply this identity with f(x) = ϕ(x) and g(x) = ϕ(x − n). By the sixth part of Theorem 2.6, with a = n and λ = ξ, we obtain ĝ(ξ) = e^{−inξ} ϕ̂(ξ). So the orthonormality condition (5.24) can be restated as

∫ℝ ϕ̂(ξ) (e^{−inξ} ϕ̂(ξ))* dξ = δn0,

or

∫ℝ |ϕ̂(ξ)|^2 e^{inξ} dξ = δn0.
By dividing up the real line into the intervals Ij = [2πj, 2π(j + 1)] for j ∈ Z, this equation can be written as

∑j∈Z ∫_{2πj}^{2π(j+1)} |ϕ̂(ξ)|^2 e^{inξ} dξ = δn0.

Now replace ξ by ξ + 2πj. The limits of integration change to 0 and 2π:

∑j∈Z ∫_0^{2π} |ϕ̂(ξ + 2πj)|^2 e^{in(ξ + 2πj)} dξ = δn0.

Since e^{2πij} = 1 for j ∈ Z, this equation becomes

∑j∈Z ∫_0^{2π} |ϕ̂(ξ + 2πj)|^2 e^{inξ} dξ = δn0.   (5.25)
Let

F(ξ) = 2π ∑j∈Z |ϕ̂(ξ + 2πj)|^2.

The orthonormality condition (5.25) becomes

(1/2π) ∫_0^{2π} F(ξ) e^{inξ} dξ = δn0.   (5.26)
The function F is 2π-periodic because

F(ξ + 2π) = 2π ∑j∈Z |ϕ̂(ξ + 2π(j + 1))|^2 = 2π ∑j∈Z |ϕ̂(ξ + 2πj)|^2 = F(ξ).
Since F is periodic, it has a Fourier series ∑n αn e^{inξ}, where the Fourier coefficients are given by αn = (1/2π) ∫_0^{2π} F(ξ) e^{−inξ} dξ. Thus, the orthonormality condition (5.26) is equivalent to α−n = δn0, which in turn is equivalent to the statement F(ξ) = 1, that is, to ∑j∈Z |ϕ̂(ξ + 2πj)|^2 = 1/(2π). This completes the proof.
An Interesting Identity. There is an interesting and useful formula that we can derive using this theorem. Recall that the Haar scaling function ϕ, which generates the orthonormal set {ϕ(x − k)}k∈Z, is identically 1 on the unit interval and 0 everywhere else, and so its Fourier transform is

ϕ̂(ξ) = (1/√(2π)) (1 − e^{−iξ})/(iξ).   (5.27)
With a little algebra, we obtain

|ϕ̂(ξ)|^2 = (1/2π) sin^2(ξ/2)/(ξ/2)^2.
Since {ϕ(x − k), k ∈ Z} is an orthonormal set of functions, Theorem 5.18 implies

∑k∈Z sin^2(ξ/2 + πk)/(ξ/2 + πk)^2 = 1.
Since sin^2 x is π-periodic, all the numerators are equal to sin^2(ξ/2). Dividing by this common factor and writing x = ξ/2, we obtain

∑k∈Z 1/(x + πk)^2 = 1/sin^2 x.   (5.28)
This formula is well known, and it is usually derived via techniques from complex analysis. We will apply it in exercise 10 to get a second scaling function for the linear spline multiresolution analysis, whose translates are orthonormal (but not compactly supported).
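The identity ∑k 1/(x + πk)^2 = 1/sin^2 x can be checked numerically by truncating the sum; the truncation point N below is an arbitrary choice (the tail decays like 2/(π^2 N)):

```python
import math

def partial_sum(x, N=100_000):
    """Truncated version of the sum over k in Z of 1 / (x + pi k)^2."""
    return sum(1.0 / (x + math.pi * k) ** 2 for k in range(-N, N + 1))

x = 0.7  # any x that is not an integer multiple of pi
lhs = partial_sum(x)
rhs = 1.0 / math.sin(x) ** 2
print(abs(lhs - rhs))  # small; the truncation error decays like 2/(pi^2 N)
```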
5.3.3 The Scaling Equation via the Fourier Transform
The scaling condition, ϕ (x) = ∑k pkϕ (2x – k) can also be recast in terms of the Fourier transform.
Theorem 5.19 The scaling condition ϕ(x) = ∑k pk ϕ(2x − k) is equivalent to

ϕ̂(ξ) = P(e^{−iξ/2}) ϕ̂(ξ/2),

where the polynomial P is given by

P(z) = (1/2) ∑k pk z^k.
Proof. We take the Fourier transform of both sides of the equation ϕ(x) = ∑k pk ϕ(2x − k) and then use Theorem 2.6 to obtain

ϕ̂(ξ) = (1/2) ∑k pk e^{−ikξ/2} ϕ̂(ξ/2) = P(e^{−iξ/2}) ϕ̂(ξ/2),

where P(z) = (1/2) ∑k pk z^k, as claimed.
Suppose ϕ satisfies the scaling condition. Using Theorem 5.19 in an iterative fashion, we obtain

ϕ̂(ξ) = P(e^{−iξ/2}) ϕ̂(ξ/2) = P(e^{−iξ/2}) P(e^{−iξ/4}) ϕ̂(ξ/4),

and so

ϕ̂(ξ) = P(e^{−iξ/2}) P(e^{−iξ/4}) P(e^{−iξ/8}) ϕ̂(ξ/8).

Continuing in this manner, we obtain

ϕ̂(ξ) = P(e^{−iξ/2}) P(e^{−iξ/4}) · · · P(e^{−iξ/2^n}) ϕ̂(ξ/2^n).

For a given scaling function ϕ, the preceding equation holds for each value of n. In the limit, as n → ∞, this equation becomes

ϕ̂(ξ) = (∏_{j=1}^{∞} P(e^{−iξ/2^j})) ϕ̂(0).

If ϕ satisfies the normalization condition ∫ℝ ϕ(x) dx = 1, then ϕ̂(0) = 1/√(2π), and so

ϕ̂(ξ) = (1/√(2π)) ∏_{j=1}^{∞} P(e^{−iξ/2^j}).   (5.29)
This provides a formula for the Fourier transform of the scaling function ϕ in terms of the scaling polynomial P. This formula is of limited practical use for the construction of ϕ since infinite products are difficult to compute. In addition, the inverse Fourier transform would have to be used to recover ϕ from its Fourier transform. Nevertheless, it is useful theoretically, as we will see later.
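For the Haar case, where P(z) = (1 + z)/2, the infinite product can be compared against the known closed form of Eq. (5.27). A short numerical check (the truncation level n and the test point ξ are arbitrary choices):

```python
import cmath
import math

def P(z):
    """Haar scaling polynomial P(z) = (1 + z) / 2."""
    return (1 + z) / 2

def product_formula(xi, n=40):
    """Truncated product (1/sqrt(2 pi)) * prod_{j=1}^{n} P(e^{-i xi / 2^j})."""
    prod = complex(1.0)
    for j in range(1, n + 1):
        prod *= P(cmath.exp(-1j * xi / 2**j))
    return prod / math.sqrt(2 * math.pi)

def haar_hat(xi):
    """Closed-form Fourier transform of the Haar scaling function, Eq. (5.27)."""
    return (1 - cmath.exp(-1j * xi)) / (1j * xi) / math.sqrt(2 * math.pi)

xi = 1.7
print(abs(product_formula(xi) - haar_hat(xi)))  # essentially zero
```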
Given a multiresolution analysis generated by a function ϕ satisfying the scaling condition ϕ(x) = ∑k pk ϕ(2x − k), the associated wavelet, ψ, satisfies the following equation (see Theorem 5.10):

ψ(x) = ∑k qk ϕ(2x − k), where qk = (−1)^k p̄_{1−k}.

Let

Q(z) = (1/2) ∑k qk z^k.

For |z| = 1, Q(z) = (1/2) ∑k (−1)^k p̄_{1−k} z^k = −z P(−z)* (see exercise 13). The same arguments given in the proof of Theorem 5.19 show that

ψ̂(ξ) = Q(e^{−iξ/2}) ϕ̂(ξ/2).   (5.30)
The previous two theorems can be combined to give the following necessary condition on the polynomial P(z) for the existence of a multiresolution analysis.
Theorem 5.20 Suppose the function ϕ satisfies the orthonormality condition, ∫ℝ ϕ(x − k)ϕ(x − l) dx = δkl, and the scaling condition, ϕ(x) = ∑k pk ϕ(2x − k). Then the polynomial P(z) = (1/2) ∑k pk z^k satisfies the equation

|P(z)|^2 + |P(−z)|^2 = 1 for |z| = 1,

or equivalently,

|P(e^{−iξ/2})|^2 + |P(−e^{−iξ/2})|^2 = 1 for all ξ ∈ R.
Proof. In view of Theorem 5.18, if ϕ satisfies the orthonormality condition, then

∑k∈Z |ϕ̂(ξ + 2πk)|^2 = 1/(2π).   (5.31)

If ϕ satisfies the scaling condition, then (Theorem 5.19)

ϕ̂(ξ) = P(e^{−iξ/2}) ϕ̂(ξ/2).   (5.32)
Dividing the sum in (5.31) into even indices (k = 2l) and odd indices (k = 2l + 1) and using (5.32), we have

1/(2π) = ∑l∈Z |P(e^{−i(ξ/2 + 2lπ)})|^2 |ϕ̂(ξ/2 + 2lπ)|^2 + ∑l∈Z |P(e^{−i(ξ/2 + (2l+1)π)})|^2 |ϕ̂(ξ/2 + (2l + 1)π)|^2
       = |P(e^{−iξ/2})|^2 ∑l∈Z |ϕ̂(ξ/2 + 2lπ)|^2 + |P(−e^{−iξ/2})|^2 ∑l∈Z |ϕ̂(ξ/2 + π + 2lπ)|^2,

where the last equation uses e^{−2lπi} = 1 and e^{−πi} = −1. The two sums appearing on the right are both 1/(2π) (using Theorem 5.18 with ξ replaced by ξ/2 and by ξ/2 + π). Therefore, the previous equations reduce to

|P(e^{−iξ/2})|^2 + |P(−e^{−iξ/2})|^2 = 1.
Since this equation holds for all ξ ∈ R, we conclude that |P(z)|2 + |P(–z)|2 = 1 for all complex numbers z with |z| = 1, as desired.
The following theorem is the analog of the above theorem for the function ψ and its associated scaling polynomial Q. Its proof is similar to the proof of the previous theorem and will be left to the reader (see exercise 14).
Theorem 5.21 Suppose the function ϕ satisfies the orthonormality condition, ∫ℝ ϕ(x − k)ϕ(x − l) dx = δkl, and the scaling condition, ϕ(x) = ∑k pk ϕ(2x − k). Suppose ψ(x) = ∑k qk ϕ(2x − k). Let Q(z) = (1/2) ∑k qk z^k. Then the following two statements are equivalent.

1. The set {ϕ(x − k), ψ(x − l); k, l ∈ Z} is orthonormal; that is, ∫ℝ ψ(x − k)ψ(x − l) dx = δkl and ∫ℝ ϕ(x − k)ψ(x − l) dx = 0 for all k, l ∈ Z.
2. |Q(z)|^2 + |Q(−z)|^2 = 1 and P(z) Q(z)* + P(−z) Q(−z)* = 0 for all |z| = 1 (where * denotes complex conjugation).
Another way to state Theorems 5.20 and 5.21 is that the matrix

M(z) = ( P(z)   P(−z) )
       ( Q(z)   Q(−z) )

associated to an orthonormal scaling function and orthonormal wavelet must be unitary for each |z| = 1 (that is, MM* = I).
Example 5.22 Here, we examine the special case of the Haar scaling function, which is one on the unit interval and zero everywhere else. For this example, p0 = p1 = 1 (all other pk = 0), and so

P(z) = (1 + z)/2.
We will check the statements of Theorems 5.19 and 5.20 for this example. Using the formula for the Fourier transform of the Haar scaling function given in Eq. (5.27), we obtain

P(e^{−iξ/2}) ϕ̂(ξ/2) = ((1 + e^{−iξ/2})/2) · (1/√(2π)) (1 − e^{−iξ/2})/(iξ/2) = (1/√(2π)) (1 − e^{−iξ})/(iξ) = ϕ̂(ξ).

So ϕ̂(ξ) = P(e^{−iξ/2}) ϕ̂(ξ/2), as stated in Theorem 5.19.
Also note that for z = e^{it} on the unit circle,

|P(z)|^2 + |P(−z)|^2 = |1 + z|^2/4 + |1 − z|^2/4 = (2 + 2 cos t)/4 + (2 − 2 cos t)/4 = 1,

in agreement with Theorem 5.20.
In a similar manner, the Fourier transform of the Haar wavelet is

ψ̂(ξ) = (1/√(2π)) (1 − e^{−iξ/2})^2/(iξ).
Its scaling polynomial is Q(z) = (1 − z)/2, and so

Q(e^{−iξ/2}) ϕ̂(ξ/2) = ((1 − e^{−iξ/2})/2) · (1/√(2π)) (1 − e^{−iξ/2})/(iξ/2) = (1/√(2π)) (1 − e^{−iξ/2})^2/(iξ) = ψ̂(ξ),

in agreement with Eq. (5.30).
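The condition |P(z)|^2 + |P(−z)|^2 = 1 of Theorem 5.20 is easy to test numerically for a given set of scaling coefficients. The sketch below checks it for the Haar coefficients and for a four-coefficient Daubechies choice; the Daubechies values are the standard ones, stated here as an assumption (they are constructed in the next section), and the test points t are arbitrary:

```python
import cmath
import math

s3 = math.sqrt(3.0)
haar_p = [1.0, 1.0]  # p0 = p1 = 1
daub_p = [(1 + s3) / 4, (3 + s3) / 4, (3 - s3) / 4, (1 - s3) / 4]  # assumed standard values

def P(coeffs, z):
    """Scaling polynomial P(z) = (1/2) sum_k p_k z^k."""
    return sum(pk * z**k for k, pk in enumerate(coeffs)) / 2

for coeffs in (haar_p, daub_p):
    for t in (0.0, 0.3, 1.1, 2.5):
        z = cmath.exp(1j * t)  # a point on the unit circle
        val = abs(P(coeffs, z)) ** 2 + abs(P(coeffs, -z)) ** 2
        assert abs(val - 1.0) < 1e-12
```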
5.3.4 Iterative Procedure for Constructing the Scaling Function
The previous theorem states that if ϕ exists, its scaling polynomial P must satisfy the equation |P(z)|2 + |P(– z)|2 = 1 for |z| = 1. So one strategy for constructing a scaling function ϕ is to construct a polynomial P that satisfies this equation and then construct a function ϕ so that it satisfies the scaling equation ϕ (x) = ∑k pkϕ (2x – k).
Let us assume that P has been constructed to satisfy Theorem 5.20 and that P(1) = 1. An example of such a P will be given in the next section. The strategy for constructing the scaling function ϕ associated with P is given by the following iterative process. Let the Haar scaling function be denoted by

ϕ0(x) = 1 for 0 ≤ x < 1, and ϕ0(x) = 0 otherwise.

The Haar scaling function already satisfies the orthonormality property. Then define

ϕ1(x) = ∑k pk ϕ0(2x − k).   (5.33)
In general, define ϕn in terms of ϕn−1 by

ϕn(x) = ∑k pk ϕn−1(2x − k).   (5.34)
In the next theorem, we will show that the ϕn converge, as n → ∞, to a function denoted by ϕ. By taking limits of the above equation as n → ∞, we obtain

ϕ(x) = ∑k pk ϕ(2x − k),

and so ϕ satisfies the desired scaling condition. Since ϕ0 satisfies the orthonormality condition, there is some hope that ϕ1, ϕ2, ϕ3, ..., and eventually ϕ, will also satisfy the orthonormality condition. This procedure turns out to work, under certain additional assumptions on P.
Theorem 5.23 Suppose P(z) = (1/2) ∑k pk z^k is a polynomial that satisfies the following conditions:

1. P(1) = 1;
2. |P(z)|^2 + |P(−z)|^2 = 1 for |z| = 1;
3. |P(e^{it})| > 0 for |t| ≤ π/2.

Let ϕ0(x) be the Haar scaling function and let ϕn(x) = ∑k pk ϕn−1(2x − k) for n ≥ 1. Then the sequence ϕn converges pointwise and in L2 to a function ϕ, which satisfies the normalization condition, ∫ℝ ϕ(x) dx = 1, the orthonormality condition

∫ℝ ϕ(x − k) ϕ(x − l) dx = δkl,
and the scaling equation, ϕ (x) = ∑k pkϕ (2x – k).
Proof. A formal proof of the convergence part of the theorem is technical and is contained in Appendix A.2.3. Once convergence has been established, ϕ automatically satisfies the scaling condition, as already mentioned. Here, we give an explanation of why ϕ satisfies the normalization and orthonormality conditions. We first start with ϕ1, which by Eq. (5.33) is

ϕ1(x) = ∑k pk ϕ0(2x − k).

This is equivalent to the following Fourier transform equation,

ϕ̂1(ξ) = P(e^{−iξ/2}) ϕ̂0(ξ/2),   (5.35)

by the same argument as in the proof of Theorem 5.19. Since ϕ̂0(0) = 1/√(2π) and P(1) = 1, clearly ϕ̂1(0) = 1/√(2π), and so ϕ1 satisfies the normalization condition.
By (5.35), we have

∑k∈Z |ϕ̂1(ξ + 2πk)|^2 = ∑k∈Z |P(e^{−i(ξ/2 + πk)})|^2 |ϕ̂0(ξ/2 + πk)|^2.

Dividing the sum on the right into even and odd indices and using the same argument as in the proof of Theorem 5.20, we obtain

∑k∈Z |ϕ̂1(ξ + 2πk)|^2 = |P(e^{−iξ/2})|^2 ∑l∈Z |ϕ̂0(ξ/2 + 2πl)|^2 + |P(−e^{−iξ/2})|^2 ∑l∈Z |ϕ̂0(ξ/2 + π + 2πl)|^2.

Since ϕ0 satisfies the orthonormality condition, both sums on the right side equal 1/(2π) (by Theorem 5.18). Therefore

∑k∈Z |ϕ̂1(ξ + 2πk)|^2 = (|P(e^{−iξ/2})|^2 + |P(−e^{−iξ/2})|^2) · (1/2π) = 1/(2π).
By Theorem 5.18, ϕ1 satisfies the orthonormality condition. The same argument shows that if ϕn–1 satisfies the normalization and orthonormality conditions, then so does ϕn. By induction, all the ϕn satisfy these conditions. After taking limits, we conclude that ϕ satisfies these conditions, as well.
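The inductive step can be illustrated numerically. With ϕ0 the Haar function, ϕ1 is piecewise constant with value pk on [k/2, (k + 1)/2), so the orthonormality of its integer translates can be checked by a direct Riemann sum. The sketch below uses the standard four Daubechies coefficients (an assumption here; they are constructed in the next section), and the grid step h is an arbitrary choice:

```python
import math

s3 = math.sqrt(3.0)
p = [(1 + s3) / 4, (3 + s3) / 4, (3 - s3) / 4, (1 - s3) / 4]  # assumed standard values

def phi1(x):
    """phi_1(x) = sum_k p_k phi_0(2x - k), with phi_0 the Haar scaling function."""
    return sum(pk for k, pk in enumerate(p) if k <= 2 * x < k + 1)

def inner(m, h=1e-3):
    """Riemann-sum approximation of the integral of phi1(x) * phi1(x - m)."""
    n = round(4.0 / h)
    return sum(phi1(i * h) * phi1(i * h - m) for i in range(n)) * h

print(round(inner(0), 6), round(inner(1), 6))  # close to 1 and 0, respectively
```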
Example 5.24 In the case of Haar, p0 = p1 = 1 and all other pk = 0. In this case, P(z) = (1 + z)/2. Clearly, P(1) = 1, and in Example 5.22 we showed that |P(z)|^2 + |P(−z)|^2 = 1 for |z| = 1. Also, |P(e^{it})| = |(1 + e^{it})/2| > 0 for all |t| < π. So all the conditions of Theorem 5.23 are satisfied. The iterative algorithm in this theorem starts with the Haar scaling function for ϕ0. For the Haar scaling coefficients, p0 = p1 = 1, all the ϕk equal ϕ0. For example, from Eq. (5.33) we obtain

ϕ1(x) = ϕ0(2x) + ϕ0(2x − 1) = ϕ0(x).
Likewise, ϕ2 = ϕ1 = ϕ0, and so forth. This is not surprising since the iterative algorithm uses the scaling equation to produce ϕk from ϕk–1. If the Haar scaling equation is used, with the Haar scaling function as ϕ0, then the iterative algorithm will keep reproducing ϕ0.
Example 5.25 Daubechies' Example. Let P(z) = (1/2) ∑k pk z^k, where

p0 = (1 + √3)/4, p1 = (3 + √3)/4, p2 = (3 − √3)/4, p3 = (1 − √3)/4,

and all other pk = 0.
As established in exercise 11, P(z) satisfies the conditions in Theorem 5.23. Therefore, the iterative scheme stated in Theorem 5.23 converges to a scaling function ϕ whose construction is due to Daubechies. Plots of the first four iterations, ϕ0, ϕ1, ϕ2, and ϕ4, of this process are given in Figures 5.18–5.21.
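The iteration of Theorem 5.23 is easy to run directly from the recursion ϕn(x) = ∑k pk ϕn−1(2x − k). The recursion below is transparent rather than efficient, and the coefficient values are the standard Daubechies choice, stated as an assumption. One useful sanity check is the partition-of-unity property ∑k ϕn(x − k) = 1, which every iterate inherits from ϕ0 because p0 + p2 = p1 + p3 = 1 for these coefficients:

```python
import math

s3 = math.sqrt(3.0)
p = [(1 + s3) / 4, (3 + s3) / 4, (3 - s3) / 4, (1 - s3) / 4]  # assumed standard values

def phi(n, x):
    """n-th iterate of Theorem 5.23: phi_0 is Haar, phi_n(x) = sum_k p_k phi_{n-1}(2x - k)."""
    if n == 0:
        return 1.0 if 0 <= x < 1 else 0.0
    return sum(pk * phi(n - 1, 2 * x - k) for k, pk in enumerate(p))

# Sanity check: each iterate satisfies sum_k phi_n(x - k) = 1.
x = 0.3
total = sum(phi(4, x - k) for k in range(-4, 5))
print(total)  # very close to 1.0
```

Sampling phi(n, x) on a fine grid for n = 0, 1, 2, 4 reproduces the successive plots described in Figures 5.18 through 5.21.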
In the next section, we explain the motivation for this choice of p0, ..., p3 and other similar choices of pk for the Daubechies wavelets.
1. For each function g below, plot g(x), g(x + 1), 2^{1/2} g(2x – 3), and – 1).
2. The support of a complex-valued function f is the closure of the set of all points for which f ≠ 0. We denote the support of f by supp(f). We say that f is compactly supported or has compact support if supp(f) is contained in a bounded set. Find the support of the following functions. Which are compactly supported?
3. Parseval's equation came up in connection with Fourier series, in Theorem 1.40. It holds under much more general circumstances, however. Let V be a complex inner product space with an orthonormal basis e1, e2, .... Show that if f ∈ V and g ∈ V have the expansions

f = ∑k αk ek and g = ∑k βk ek,

then

⟨f, g⟩ = ∑k αk βk*

(where * denotes complex conjugation).
This is called Parseval's equation. The indexing k = 1, 2, ... is unimportant; any index set (including double indexing) can be used, just as long as the index set is countable.
4. Use Parseval’s equation from exercise 3 to establish the following equations.
(a) Equation 1 in Theorem 5.9.
(b) Show that
where ψ is defined in Eq. (5.5). Hint: Change summation indices in Eq. (5.5) and use the fact that {2^{1/2} ϕ(2x − k)}k∈Z is orthonormal.
(c) Show that
where ψ is defined in Eq. (5.5). Hint: Change summation indices in Eq. (5.5) and use the fact that {2^{1/2} ϕ(2x − k)}k∈Z is orthonormal.
5. We defined the support of a function in exercise 2. Show that if the scaling function ϕ in a multiresolution analysis has compact support, then there are only a finite number of nonzero coefficients in the scaling relation from Theorem 5.6.
6. Suppose that {Vj : j ∈ Z} is a multiresolution analysis with scaling function ϕ, and that ϕ is continuous and compactly supported.
(a) Find uj, the orthogonal projection onto Vj, of the step function
(b) If , show that for all j sufficiently large, ||u – uj|| ≥ .
(c) Explain why the result above implies that
7. In the case where the pk's are complex, the quantities E and O, which are defined in connection with the proof of Eq. 4 in Theorem 5.9, may be complex. Show that even if we allow for this, the equations |E|^2 + |O|^2 = 2 and E + O = 2 have E = O = 1 as their only solution.
8. (Shannon Multiresolution Analysis). For j ∈ Z, let Vj be the space of all finite energy signals f for which the Fourier transform equals 0 outside of the interval [−2^j π, 2^j π]; that is, all f ∈ L2(R) that are band-limited with supp f̂ ⊆ [−2^j π, 2^j π].
(a) Show that {Vj}j∈Z satisfies properties 1–4 in the definition of a multiresolution analysis, Definition 5.1.
(b) Show that ϕ(x):= sinc(x), where sinc is defined in Eq. (5.1), satisfies property 5 of Definition 5.1, and so it is a scaling function for the Vj’s. (Hint: Use the Sampling Theorem 2.23 to show that {ϕ(x – k)} is an orthonormal basis for V0.)
(c) Show that ϕ satisfies the scaling relation,
(d) Find an expansion for the wavelet ψ associated with ϕ.
(e) Find the high- and low-pass decomposition filters.
(f) Find the high- and low-pass reconstruction filters.
9. (Linear Spline Multiresolution Analysis) For j ∈ Z, let Vj be the space of all finite energy signals f that are continuous and piecewise linear, with possible corners occurring only at the dyadic points k/2^j, k ∈ Z.
(a) Show that {Vj}j∈Z satisfies properties 1–4 in the definition of a multiresolution analysis, Definition 5.1.
(b) Let φ(x) be the "tent function,"

φ(x) = 1 − |x| for |x| ≤ 1, and φ(x) = 0 otherwise.
Show that {φ(x − k)}k∈Z is a (nonorthogonal) basis for V0. Find the scaling relation for φ.
10. Let φ(x) be the tent function from exercise 9.
(a) Show that

φ̂(ξ) = (1/√(2π)) (sin(ξ/2)/(ξ/2))^2.

(b) Show that

∑k∈Z |φ̂(ξ + 2πk)|^2 = (1/2π) (1 − (2/3) sin^2(ξ/2)).

(Hint: Differentiate both sides of Eq. (5.28) twice and simplify.)
(c) Verify that the function g defined via its Fourier transform,

ĝ(ξ) = φ̂(ξ) / (2π ∑k∈Z |φ̂(ξ + 2πk)|^2)^{1/2},

is such that {g(x − k)}k∈Z is an orthonormal set. (Hint: Use (b) to show that ∑k∈Z |ĝ(ξ + 2πk)|^2 = 1/(2π) and then apply Theorem 5.18.)
11. Let P(z) = (1/2) ∑k pk z^k, where

p0 = (1 + √3)/4, p1 = (3 + √3)/4, p2 = (3 − √3)/4, p3 = (1 − √3)/4,

and all other pk = 0. Explicitly show that P(z) satisfies the conditions in Theorem 5.23. Use a computer algebra system, such as Maple, to establish the equation |P(z)|^2 + |P(−z)|^2 = 1 for |z| = 1. Illustrate by a graph of y = |P(e^{it})| that |P(e^{it})| > 0 for |t| ≤ π/2.
12. Suppose f is a continuously differentiable function with |f′(x)| ≤ M for 0 ≤ x ≤ 1. Use the following outline to uniformly approximate f to within some specified tolerance, ε, by a step function in Vn, the space generated by ϕ(2^n x − k), k ∈ Z, where ϕ is the Haar scaling function.
(a) For 1 ≤ j ≤ 2^n, let aj = f(j/2^n) and form the step function

fn(x) = ∑_{j=1}^{2^n} aj ϕ(2^n x − j + 1).
(b) Show that |f(x) − fn(x)| ≤ ε, provided that n is greater than log2(M/ε).
13. Let P(z) = (1/2) ∑k pk z^k and define

Q(z) = (1/2) ∑k (−1)^k p̄_{1−k} z^k.

For |z| = 1, show that Q(z) = −z P(−z)* (where * denotes complex conjugation). This exercise establishes a formula for the scaling polynomial, Q, for the wavelet, ψ, in terms of the scaling polynomial, P, for the scaling function.
14. Prove Theorem 5.21 using the ideas in the proof of Theorem 5.20.
15. Prove Theorem 5.12 in the case where the signal f is only piecewise continuous by dividing up the support of f into subintervals where f is continuous. Care must be used for the nodes l/2^j that lie near the discontinuities, for then the approximation f(x) ≈ f(l/2^j) does not necessarily hold. However, these troublesome nodes are fixed in number, and the support of ϕ(2^j x − l) becomes small as j becomes large. So the L2 contribution of the terms αl ϕ(2^j x − l) corresponding to the troublesome nodes can be shown to be small as j gets large.
16. Prove the second part of Theorem 5.18 using the proof of the first part as a guide.
17. In exercise 5, we showed that if ϕ has compact support, then only a finite number of pk’s are nonzero. Use the iterative construction of ϕ in Eq. (5.34) to show that if only a finite number of pk’s are nonzero, then ϕ has compact support.
18. (Filter Length and Support) Show that if the pk’s in the scaling relation are all zero for k > N and k < 0, and if the iterates ϕn converge to ϕ, then ϕ has compact support. Find the support of ϕ in terms of N.
1 Continuous signals are frequently called analog signals. The term comes from the output of an analog computer being continuous!