This chapter addresses discrete signal transformations in both the time and frequency domains. The relationship between the continuous and discrete Fourier transforms is discussed in Section 2.1. Section 2.2 reviews several key properties of the discrete Fourier transform. The effect of window functions on the discrete Fourier transform is described in Section 2.3, and fast discrete Fourier transform techniques are covered in Section 2.4. The discrete cosine transform is reviewed in Section 2.5. Finally, the relationship between continuous and discrete signals in both time and frequency domains is illustrated graphically with an example in Section 2.6.
It was shown in Chapter 1 that a periodic signal gp(t) can be expressed as
where T0 is the period of g(t), and
It was also shown that gp(t) can be represented in terms of Fourier series coefficients as
where ω0 = 2π/T0.
Now, let t = n ΔT and τ = N1 ΔT, where ΔT = 1/fs, and fs is the sampling frequency that satisfies the Nyquist requirement. Assume that all frequencies are normalized with respect to fs, or fs = 1. Then t = nΔT = n, T0 = NΔT = N, τ = N1ΔT = N1, and ω0 = 2π/(NΔT) = 2π/N. The analog or continuous signal gp(t) therefore becomes the discrete, or digital, signal gp(n), which can be expressed as
and
A graphical representation of a periodic signal or sequence gp(n) is shown in Fig. 2.1.
A digitized and periodic signal gp(n), with period N, can therefore be expressed as
Then
Similarly, the variable n can also be represented as n = n0 ± mN, and the exponential term shown above also holds true for n0. For simplicity, we will drop the subscript 0 and use k and n directly. Equation (2.1a) then becomes
Here G(k) is the N-term sequence of the periodic sequence Gp(k).
Let g(n) be the N-term sequence of gp(n) in the region n = 0 to N − 1. By taking the summation of gp(n) e−jr(2π/N)n for n = −∞ to ∞, we see that Eq. (2.1b) becomes
The second summation of Eq. (2.2a) satisfies
Equation (2.2a) then becomes
Equations (2.1b) and (2.2b) constitute a discrete Fourier transform pair. Equation (2.2b) serves as the discrete Fourier transform (DFT) from {g(n)} to G(r) and Eq. (2.1b) serves as the inverse discrete Fourier transform (IDFT) from {G(r)} to g(n). Notice that g(n) = gp(n) for n = 0 to N −1 and G(r) = Gp(r) for r = 0 to N −1.
To comply with the popular notation of analog domain Fourier transform, Eqs. (2.1b) and (2.2b) are then modified to become
Here n = 0 to N−1 and k = 0 to N−1. Equations (2.2c) and (2.2d) serve as the discrete Fourier transform pair.
By defining WN = e−j(2π/N), Eqs. (2.2c) and (2.2d) can be modified to become another form of DFT:
The definition of WN satisfies the following two basic properties:
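The DFT pair above can be sketched directly in NumPy; the function names and the placement of the 1/N factor on the inverse are illustrative conventions, not taken from the text:

```python
import numpy as np

def dft(g):
    # Forward DFT: G(k) = sum_n g(n) * WN^(kn), with WN = exp(-j*2*pi/N)
    N = len(g)
    k = np.arange(N)
    W = np.exp(-2j * np.pi * np.outer(k, k) / N)
    return W @ np.asarray(g, dtype=complex)

def idft(G):
    # Inverse DFT: g(n) = (1/N) * sum_k G(k) * WN^(-kn)
    N = len(G)
    k = np.arange(N)
    W = np.exp(2j * np.pi * np.outer(k, k) / N)
    return (W @ np.asarray(G, dtype=complex)) / N
```

Both directions use the same WN kernel matrix, differing only in the sign of the exponent and the 1/N factor.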
The discrete Fourier transform serves as a very important tool in the digital signal processing field. Some important and useful properties of the DFT will now be described.
As shown in Eq. (1.18), a delay in the time domain will cause a phase shift in the frequency domain. Similarly, for a periodic sequence, a shift in discrete time will cause a phase shift in the discrete frequency domain. This phase shift characteristic can be derived as shown below.
Let g(n) be an N-sample periodic sequence with corresponding DFT G(k). Then g(n − m) will have the following DFT, from Eq. (2.2c):
This implies that g(n − m) and G(k)e−j(2π/N)km form a DFT transform pair, and the exponential term represents a phase shift in the frequency domain.
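The shift property is easy to verify numerically; the following NumPy sketch (illustrative data, not from the text) checks that a circular shift by m multiplies each DFT bin by e−j(2π/N)km:

```python
import numpy as np

N, m = 8, 3
rng = np.random.default_rng(0)
g = rng.standard_normal(N)

# DFT of the circularly shifted sequence g(n - m) ...
G_shifted = np.fft.fft(np.roll(g, m))
# ... equals G(k) multiplied by the phase term exp(-j*2*pi*k*m/N)
k = np.arange(N)
phase = np.exp(-2j * np.pi * k * m / N)
assert np.allclose(G_shifted, np.fft.fft(g) * phase)
```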
If the values of the periodic sequence g(n) are real numbers, then the real parts of G(k) for k = 1 to (N/2) − 1 are identical to those of G(k) for k > N/2. This can be seen from Eq. (2.2d):
Furthermore, if g(n) is real and symmetric around its center, that is, if g(n) = g(N−n), then G(k) can easily be proved to be real and G(N−k) = G(k).
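Both symmetry properties can be checked numerically; the sample sequences below are arbitrary illustrations:

```python
import numpy as np

# Conjugate symmetry of the DFT of a real sequence: G(N - k) = conj(G(k)),
# so the real parts of G(k) and G(N - k) match
g = np.array([4.0, 2.0, 1.0, 3.0, 5.0, 0.0, 2.0, 1.0])
G = np.fft.fft(g)
N = len(g)
k = np.arange(1, N // 2)
assert np.allclose(G[N - k].real, G[k].real)
assert np.allclose(G[N - k].imag, -G[k].imag)

# A real sequence symmetric about its center, h(n) = h(N - n), has a real DFT
h = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 4.0, 3.0, 2.0])
H = np.fft.fft(h)
assert np.allclose(H.imag, 0.0)
```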
Letting x(n) and h(n) be two nonperiodic N-sample sequences, the linear convolution of x(n) and h(n) is defined as
where h(m) and x(n − m) are zero outside the duration of N samples. Figure 2.2 displays the linear convolution of x(n) and h(n) with y(n) as the output and N = 8. As can be seen in the figure, the output y(n) has a duration of 2N − 1 samples and is free of spectrum aliasing. The symbol ∗ in Fig. 2.2 represents convolution.
Now, consider the two new sequences x(n) and h(n), which are periodic with period N = 8. These two sequences are identical to the above two nonperiodic sequences for n = 0, 1, ..., N−1. The DFTs of these two new sequences are
The circular convolution of x(n) and h(n) with output y(n) is defined as
where y(n) is the IDFT of Y(k) = H(k)X(k). The function y(n) is also a periodic sequence with period N.
Figure 2.3 displays the circular convolution of two sequences x(n) and h(n) with n = 0, 1, ..., 7. By comparing the values of y(n) with those shown in Fig. 2.2, one can see that the N-term circular convolution output y(n) is distorted by the overlap with a neighboring sequence when performing multiplication and summation. This type of distortion is called “the edge effect of circular convolution.”
The edge effect of circular convolution is a built-in characteristic of the DFT. To ensure that the circular convolution of any two sequences with period N will have a nonaliased output with 2N − 1 samples, one can extend both x(n) and h(n) to 2N samples by padding N zeros after each N-term sequence. For practical applications, L = 2N is chosen to be a power of 2. The same N-point data sequences x(n) and h(n) used in Figs. 2.2 and 2.3 are again used in Fig. 2.4 to demonstrate the process of linear convolution based on a circular convolution scheme. As expected, the output y(n) is identical to that of the linear convolution shown in Fig. 2.2.
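The zero-padding scheme of Fig. 2.4 can be sketched as follows (NumPy's convolve supplies the reference linear convolution; the data values are illustrative):

```python
import numpy as np

N = 8
x = np.array([1.0, 2.0, 3.0, 4.0, 4.0, 3.0, 2.0, 1.0])   # illustrative data
h = np.array([1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0])

# Pad both sequences to L = 2N and perform circular convolution via the DFT
L = 2 * N
y = np.fft.ifft(np.fft.fft(x, L) * np.fft.fft(h, L)).real

# The first 2N - 1 samples match the linear convolution; no edge effect remains
assert np.allclose(y[:2 * N - 1], np.convolve(x, h))
```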
For most applications, the two sequences x(n) and h(n) may not have the same number of non-zero-valued samples; in some cases, one sequence may be much longer than the other. In that case, the above-mentioned zero-padded 2N-term scheme with 2N = L for both x(n) and h(n) may not be a practical way to compute the convolution. To solve this problem, two techniques have been developed. These methods partition the longer sequence into several smaller sequences, compute partial results first, and then add them together to form the complete result. They are discussed next.
Let x(n) be the input time sequence with N1 samples and h(n) be the impulse response with N2 samples. Assume that N1 ≫ N2 and that N1 can be divided equally into M blocks, each block having N0 samples; that is, N1 = M N0. The samples of the ith data block, xi(n), are defined to be equal to x(n) for n = (i − 1)N0, ..., iN0 − 1 and zero elsewhere, where i = 1, 2, ..., M. The output of linear convolution with input data xi(n) and impulse response h(n) equals yi(n), which has length L = N0 + N2 − 1. The linear convolution can be implemented by circular convolution if L is chosen to be a power of 2.
Figure 2.5 demonstrates how the overlap-and-add method performs the convolution. As can be seen, the ith block of x(n), or xi(n) with i = 1, 2, ..., M, convolves with h(n) to generate the ith data block yi(n). The section of yi(n) that extends beyond the range n = (i − 1)N0, ..., iN0 − 1 coincides with part of the next block yi+1(n). This extra portion, N2 − 1 samples in length, is called the overlap part; it is added to yi+1(n) to form the convolution sum for the (i + 1)th section of the output y(n).
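A minimal sketch of the overlap-and-add method follows (np.convolve stands in for the per-block circular convolution; the block size N0 and the data are illustrative):

```python
import numpy as np

def overlap_add(x, h, N0=16):
    # Split x into length-N0 blocks, convolve each with h, and add the
    # overlapping tail (N2 - 1 samples) of each block into the next one
    y = np.zeros(len(x) + len(h) - 1)
    for i in range(0, len(x), N0):
        yi = np.convolve(x[i:i + N0], h)   # length N0 + N2 - 1 per block
        y[i:i + len(yi)] += yi
    return y

rng = np.random.default_rng(1)
x = rng.standard_normal(100)
h = rng.standard_normal(7)
assert np.allclose(overlap_add(x, h), np.convolve(x, h))
```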
Unlike the overlap-and-add method, the overlap-and-save method requires that the input blocks overlap with each other to process the convolution. Consider the input time sequence x(n) and the impulse response h(n), which have N1 and N2 samples, respectively. Assume that N1 ≫ N2 and that the input data can be divided into M sections of data blocks xi(n), each of size N0 + N2 − 1 samples. The end portion of each input data block xi(n) overlaps the next block xi+1(n) by N2 − 1 samples. Samples of the ith data block xi(n) are defined to be equal to x(n) over n = (i − 1)N0, ..., (i − 1)N0 + N0 + N2 − 2, where i = 1, 2, ..., M. The data blocks xi(n) are then circularly convolved with the impulse response h(n). Because of the overlap redundancy at the input, the circular artifacts in the output (the last N2 − 1 samples of each block) can simply be discarded. Figure 2.6 illustrates the overlap-and-save method.
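The following sketch implements a common variant of overlap-and-save in which the N2 − 1 overlap samples are prepended to each block and the first N2 − 1 output samples of each block are discarded; this is equivalent bookkeeping to the description above, and all names are illustrative:

```python
import numpy as np

def overlap_save(x, h, N0=16):
    # Each block carries N2 - 1 samples of overlap from the previous block;
    # the corrupted first N2 - 1 circular-convolution outputs are discarded
    N2 = len(h)
    L = N0 + N2 - 1
    H = np.fft.fft(h, L)
    x_pad = np.concatenate([np.zeros(N2 - 1), x, np.zeros(N0)])
    out = []
    for i in range(0, len(x), N0):
        block = x_pad[i:i + L]
        yi = np.fft.ifft(np.fft.fft(block, L) * H).real
        out.append(yi[N2 - 1:])        # keep only the N0 valid samples
    return np.concatenate(out)[:len(x) + N2 - 1]

rng = np.random.default_rng(2)
x = rng.standard_normal(100)
h = rng.standard_normal(7)
assert np.allclose(overlap_save(x, h), np.convolve(x, h))
```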
One of the fundamental characteristics of discrete signals is that zero padding in one domain results in an increased sampling rate in the other domain. The N-sample Hanning window will be used to illustrate the zero-padding technique. Mathematically, the Hanning window is expressed as follows:
The discrete time sequence w(n) with N = 32 is shown in Fig. 2.7a. By taking a 32-point DFT on w(n), the power spectrum of Hanning window, WdB(m), is shown in Fig. 2.7b. For comparison, a 128-point Hanning window w(n) and its corresponding power spectrum WdB(m) are also displayed in Figs. 2.7c and 2.7d, respectively. Note that, for convenience, the zero-frequency components have been moved to the center of the spectra.
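The window and its dB power spectrum, with the zero-frequency component moved to the center as in Fig. 2.7, can be generated as follows (one common form of the Hanning window is assumed; some references divide by N − 1 instead of N):

```python
import numpy as np

N = 32
n = np.arange(N)
w = 0.5 * (1.0 - np.cos(2.0 * np.pi * n / N))   # Hanning window, one common form

# Power spectrum in dB, normalized to a 0 dB peak, with zero frequency
# moved to the center of the spectrum (cf. Fig. 2.7b)
W = np.fft.fftshift(np.fft.fft(w))
mag = np.abs(W)
W_dB = 20.0 * np.log10(np.maximum(mag, 1e-12) / mag.max())
```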
Comparing the two sets of power spectra described above, it is clear that the higher sampling rate in the time domain (increasing from 32 samples to 128 samples) hardly improved the frequency spectrum. The number of frequency samples inside the signal spectrum above −60 dB is about the same in both cases. The signal's frequency bandwidth stays the same in both cases, although it appears wider in the 32-point plot than in the 128-point plot; this is because the sampling frequency for the 128-point case is 4 times higher than that of the 32-point case.
The zero-padding technique can be used to increase a signal's resolution in either the time or the frequency domain. The 32-point w(n) is again shown in Fig. 2.8a, together with the real and imaginary parts of its 32-point DFT, X(m), shown in Figs. 2.8b and 2.8c, respectively. For convenience, the zero-frequency components are moved to the center of the spectra in Figs. 2.8b and 2.8c.
Now, by padding (appending) 96 zero-valued samples to the end of w(n), a new 128-sample w′(n) sequence is obtained (Fig. 2.9a). By computing the DFT of w′(n), one obtains a more detailed power spectrum as shown in Fig. 2.9b. This new 128-point spectrum provides a frequency structure that is more detailed than that of Fig. 2.7d. The number of frequency components inside the signal spectrum over −60 dB is about 4 times that of Figs. 2.7b or 2.7d. Figures 2.9c and 2.9d display the real and imaginary parts of the frequency spectrum corresponding to the zero-padded Hanning window.
Again, the power spectrum was displayed with the zero-frequency component moved to the center of the spectrum. The real and imaginary parts of the frequency spectrum are also rearranged accordingly.
The preceding example demonstrates that zero padding in the time domain results in an increased sample rate in the frequency domain. Here, the zero-padding technique increased the frequency-domain sample rate (or resolution) by a factor of 4 (= 128/32).
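This behavior is easy to confirm numerically: the zero-padded 128-point DFT passes exactly through the original 32 frequency samples (NumPy sketch; the Hanning form is one common convention):

```python
import numpy as np

N = 32
n = np.arange(N)
w = 0.5 * (1.0 - np.cos(2.0 * np.pi * n / N))   # 32-point Hanning window

W32 = np.fft.fft(w)          # 32 frequency samples
W128 = np.fft.fft(w, 128)    # zero-padded to 128: a 4x denser frequency grid

# The denser spectrum interpolates the original: every 4th sample matches
assert np.allclose(W128[::4], W32)
```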
It is interesting to see that doubling the sampling rate in the time domain did not improve the signal's frequency resolution when compared to that of the zero-padding method for an identical new sample rate. This is due to the fact that doubling the sampling rate in the time domain from N samples to 2N samples also increases the number of frequency samples in the frequency domain from N to 2N. The bandwidth of the signal remained the same, but the sampling frequency was increased twofold. In the frequency domain the signal spectrum was still represented by N samples only; the extra N samples were wasted to represent the stretched frequency region.
The zero-padding technique used in the frequency domain is similar to that used in the time domain; for illustration, we will use a 64-point interpolation rather than the 128-point interpolation used in the time domain. The same 32-point w(n) is first Fourier-transformed to become W(m). Then, 32 zero-valued complex samples are inserted in the middle of W(m) to form a new 64-point complex spectrum W′(m). The real and imaginary parts of W′(m) are shown in Figs. 2.10a and 2.10b, respectively. Notice that the complex data shown in Figs. 2.8b and 2.8c are used to insert zeros as shown in Figs. 2.10a and 2.10b. However, Figs. 2.8b and 2.8c were arranged with zero frequency in the center, while Figs. 2.10a and 2.10b are zero-padded directly on the DFT outputs W(m), with zero frequency at bin 1. In general, the zeros are inserted after the first N/2 spectrum samples in order to maintain spectrum symmetry. A 64-point IDFT is then applied to the zero-padded W′(m) to generate the interpolated time sequence w′(n), as shown in Figs. 2.10c and 2.10d. Since |W(m)| is an even function and the phase of W(m) is an odd function, the IDFT outputs are all real numbers, with all imaginary parts equal to zero. The 64 samples shown in Fig. 2.10c represent the new interpolated sequence w′(n). It is obvious that zero padding in the frequency domain results in an increased resolution in the time domain.
The filter-based time-domain interpolation technique, discussed in Chapter 1, inserts zero-valued samples between each of the original time samples; the zero-inserted sequence is then lowpass-filtered to attenuate the spectral sidelobes caused by the inserted zeros. The accuracy of this technique depends on the quality of the filter: the greater the stopband attenuation of the lowpass filter, the more accurate the interpolation becomes. The zero-padding technique does not require lowpass filtering, yet the sidelobe rejection is ideal in the sense that all sidelobes have zero values due to the zero padding in the frequency domain. Therefore, this frequency-domain zero-padding technique is called “exact interpolation” for periodic time-domain sequences. This technique fails if the sampling rate of w(n) violates the Nyquist requirement.
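A sketch of the exact-interpolation procedure follows. Splitting the Nyquist bin between the two spectrum halves is one common way to maintain the symmetry mentioned above; the function name and data are illustrative:

```python
import numpy as np

def zero_pad_interp(x, factor=2):
    # Exact interpolation of a periodic sequence: insert zeros in the middle
    # of the spectrum, then take the IDFT.  Assumes len(x) is even and x real.
    N = len(x)
    X = np.fft.fft(x)
    M = factor * N
    Xp = np.zeros(M, dtype=complex)
    Xp[:N // 2] = X[:N // 2]
    Xp[-(N // 2):] = X[-(N // 2):]
    # split the Nyquist bin between the halves to keep the spectrum symmetric
    Xp[N // 2] = 0.5 * X[N // 2]
    Xp[M - N // 2] = 0.5 * X[N // 2]
    return factor * np.fft.ifft(Xp).real

# The even-indexed samples of the interpolated sequence reproduce the original
x = np.cos(2 * np.pi * 3 * np.arange(32) / 32)
x2 = zero_pad_interp(x, 2)
assert np.allclose(x2[::2], x)
```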
Let the N-sample discrete time sequence x(n) be a sinusoidal function with frequency equal to f0, and let fs be the sampling frequency. If the sampling frequency fs is not an integer multiple of f0, the frequency spectrum of x(n) will not show up as a sharp pulse at frequency f0. Instead, the spectrum spreads over several frequency bins around f0. One practical way to reduce this type of energy-spreading problem is to use a windowing function, which imposes a prescribed profile on the time signal prior to the Fourier transformation. Let x(n) and w(n) be the original time sequence and the windowing function, respectively; the new sequence is then given as
When the windowing function is applied to x(n), the resultant frequency spectrum will suffer from the loss of energy at both ends of the time sequence x(n). An averaging scheme used to compensate for this loss of energy is discussed next.
Assume that an N-sample data block with an appropriate window function is used to compute the DFT. To generate the ith block of averaged DFT, the (i − 1)th, ith, and (i + 1)th data DFTs are added together and divided by 3. That is, using i = 2 as an example, one obtains
Figure 2.11 illustrates the generation of DFT based on the averaging scheme.
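The averaging scheme can be sketched as follows (the block layout and function name are illustrative; the windowing of each block is assumed to have been applied already):

```python
import numpy as np

def averaged_dft(blocks, i):
    # Average the (i-1)th, ith, and (i+1)th block DFTs
    return (np.fft.fft(blocks[i - 1]) +
            np.fft.fft(blocks[i]) +
            np.fft.fft(blocks[i + 1])) / 3.0

blocks = np.arange(32, dtype=float).reshape(4, 8)   # four 8-sample blocks
out = averaged_dft(blocks, 2)
# By linearity of the DFT, this equals the DFT of the averaged blocks
assert np.allclose(out, np.fft.fft(blocks[1:4].mean(axis=0)))
```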
There are many windowing functions available for different types of applications. The following are four popular ones, and their mathematical expressions are shown below:
When a long signal sequence is partitioned into smaller sections for processing, the partition process is equivalent to multiplying the long sequence by the rectangular windowing function, which in effect applies no shaping at all. The Hanning window function has been discussed extensively in previous sections. The two remaining windowing functions, namely, the Hamming window and the Blackman window, are shown in Fig. 2.12, where the time-domain waveforms and frequency spectra are displayed.
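For reference, the four windows can be generated as follows (these are common textbook forms; some references use N − 1 in the denominator of the cosine argument):

```python
import numpy as np

N = 64
n = np.arange(N)
rect     = np.ones(N)                                    # rectangular
hanning  = 0.5 - 0.5 * np.cos(2 * np.pi * n / N)         # Hanning
hamming  = 0.54 - 0.46 * np.cos(2 * np.pi * n / N)       # Hamming
blackman = (0.42 - 0.5 * np.cos(2 * np.pi * n / N)
                 + 0.08 * np.cos(4 * np.pi * n / N))     # Blackman
```

Note that the Hamming window does not reach zero at its endpoints (it starts at 0.08), while the Hanning and Blackman windows do.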
The fast Fourier transform (FFT) is a time-saving technique for computing the discrete Fourier transform (DFT). Various schemes are used to realize the FFT. The radix-2 scheme is the most popular one and is discussed next.
Consider the computation of an N = 2^L-point DFT by the divide-and-conquer approach. Let f1(n) and f2(n) be the two N/2-point data sequences corresponding to the even-numbered and odd-numbered samples of the N-point data sequence x(n); that is,
for n = 0, 1, ..., N/2− 1.
Since f1(n) and f2(n) are obtained by decimating x(n) by a factor of 2, the resulting FFT algorithm is called the “decimation-in-time” algorithm.
The N-point DFT can be expressed in terms of the decimated sequences as follows:
where k = 0, 1, 2, ..., N − 1. With WN^2 = WN/2 and letting f1(m) = x(2m), f2(m) = x(2m + 1), this equation becomes
where F1(k) and F2(k) are the N/2-point DFTs of the sequences f1(m) and f2(m), respectively.
Since F1(k) and F2(k) are periodic with period N/2, we have F1(k + N/2) = F1(k) and F2(k + N/2) = F2(k). In addition, WN^(k+N/2) = −WN^k; hence, Eq. (2.8) can be expressed as
The procedure used in this computation can be repeated through decimation of the N/2-point sequences f1(n) and f2(n). The entire process involves L = log2 N stages of decimation. The computation of an N-point DFT via the decimation-in-time FFT requires (N/2)log2 N complex multiplications and N log2 N complex additions.
Figure 2.13 depicts the computation of an N = 8-point DFT. The computation is performed in three stages, beginning with the computation of four 2-point DFTs, then two 4-point DFTs, and finally one 8-point DFT.
Figure 2.14 illustrates the implementation of FFT through the decimation-in-time algorithm for the case of N = 8. The 8-point DFT is split into two 4-point DFTs and then four 2-point DFTs. The well-known butterfly pattern is clearly visible.
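A recursive sketch of the decimation-in-time algorithm follows (illustrative, with the twiddle factors WN^k computed explicitly):

```python
import numpy as np

def fft_dit(x):
    # Radix-2 decimation-in-time FFT; len(x) must be a power of 2
    x = np.asarray(x, dtype=complex)
    N = len(x)
    if N == 1:
        return x
    F1 = fft_dit(x[0::2])            # N/2-point DFT of even-numbered samples
    F2 = fft_dit(x[1::2])            # N/2-point DFT of odd-numbered samples
    W = np.exp(-2j * np.pi * np.arange(N // 2) / N)    # twiddle factors WN^k
    return np.concatenate([F1 + W * F2, F1 - W * F2])  # butterflies

rng = np.random.default_rng(3)
x = rng.standard_normal(8)
assert np.allclose(fft_dit(x), np.fft.fft(x))
```

The final line implements the butterfly of Eq. (2.9): X(k) = F1(k) + WN^k F2(k) and X(k + N/2) = F1(k) − WN^k F2(k).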
Another important radix-2 FFT scheme, called the “decimation-in-frequency” algorithm, is obtained by also using the divide-and-conquer approach. This method splits the DFT formula into two summations, one involving the sum over the first N/2 data points and the second sum involving the last N/2 data points:
Since WN^(N/2) = e^(−jπ) = −1, it follows that WN^(kN/2) = e^(−jπk) = (−1)^k. Therefore
By splitting (or decimating) X(k) into even- and odd-numbered samples, one obtains
Here we have used the fact that WN^2 = WN/2.
The computational procedure shown above can be repeated through decimation of the N/2-point DFT data sequences X(2k) and X(2k + 1), k = 0, 1, ..., N/2 − 1. The entire process involves L = log2 N stages of decimation, where each stage involves N/2 butterflies. The first stage of butterflies for an 8-point DFT is shown in Fig. 2.15.
Consequently, the computation of N-point DFT via the decimation-in-frequency FFT requires (N/2)log2 N complex multiplications and Nlog2 N complex additions, the same as in the decimation-in-time algorithm. An 8-point decimation-in-frequency example is shown in Fig. 2.16.
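A recursive sketch of the decimation-in-frequency algorithm follows (illustrative names):

```python
import numpy as np

def fft_dif(x):
    # Radix-2 decimation-in-frequency FFT; len(x) must be a power of 2
    x = np.asarray(x, dtype=complex)
    N = len(x)
    if N == 1:
        return x
    W = np.exp(-2j * np.pi * np.arange(N // 2) / N)  # twiddle factors WN^n
    a = x[:N // 2] + x[N // 2:]            # feeds even-numbered outputs X(2k)
    b = (x[:N // 2] - x[N // 2:]) * W      # feeds odd-numbered outputs X(2k+1)
    X = np.empty(N, dtype=complex)
    X[0::2] = fft_dif(a)
    X[1::2] = fft_dif(b)
    return X

rng = np.random.default_rng(4)
x = rng.standard_normal(8)
assert np.allclose(fft_dif(x), np.fft.fft(x))
```

Here the butterflies come first and the N/2-point DFTs afterwards, which is the mirror image of the decimation-in-time flow.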
The rapid growth of digital imaging applications, including desktop publishing, multimedia, teleconferencing, high-definition television (HDTV), and radar imaging, has increased the need for effective and standardized image compression techniques. Among the emerging standards are JPEG (Joint Photographic Experts Group) for compression of still images, MPEG (Moving Picture Experts Group) for compression of motion video, and CCITT [International Telegraph and Telephone Consultative Committee (now ITU-T)] H.261 (also known as Px64) for compression of videotelephony and videoteleconferencing. All three standards employ a basic technique known as the discrete cosine transform (DCT). We will start our discussion with the regular DCT and follow with the two-dimensional (2D) DCT.
The DCT of a periodic and finite-length sequence x(n) is defined as
The DCT is closely related to the discrete Fourier transform (DFT), but has better energy compaction properties. This implies that the majority of the signal energy in time sequence can be represented by just a few of the DCT-transformed coefficients. The energy compaction properties of the DCT also make it useful in applications of data communication.
The inverse discrete cosine transform (IDCT) computes the inverse DCT of a finite length sequence. It reconstructs a signal from a complete or partial set of DCT coefficients. The IDCT is defined as
Comparing Eqs. (2.12) and (2.13), one can see that the forward and backward (inverse) DCT have identical transformation kernels. These basis vectors are sampled cosines and can be expressed as
The attractiveness of the DCT is twofold: (1) it is nearly optimal for signals with high positive adjacent-sample correlation, and (2) it can be computed via the DFT using an FFT algorithm. The computation of the DCT through the FFT consists of extending the N samples to form a 2N-point data block with even symmetry, taking a 2N-point DFT, and retaining N terms of it. The following procedure is used to compute the DCT through the FFT:
Step 1. Define a 2N-sample data block {x′(n)} as
Step 2. Compute the 2N-point DFT of x′(n) as
By comparing y′(k) with y(k) of Eq. (2.12), it is clear that
Figure 2.17 illustrates the end effects of the DFT and DCT. Figure 2.17a shows the original N-sample data block, while Fig. 2.17b shows the end effect of the N-sample-based DFT. Figure 2.17c shows how the 2N data sequence x′(n) is formed from the N data sequence x(n); it also shows the end effect of the DCT. One can see that the 2N-extension DCT has a smaller end effect (caused by the discontinuity at the border of one block and its repetition) than DFT operations based on the N-sample data block.
Since the N-sample DCT can be computed through the FFT by extending the input samples to 2N, the computation requirement for the N-sample DCT is then 2N log2 N multiply–add operations. Notice that the computation requirements for the regular N-sample FFT are (N/2) log2 N complex multiplications and N log2 N complex additions.
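A sketch of the DCT-via-FFT procedure follows. An unnormalized DCT kernel, y(k) = Σ x(n) cos[π(2n + 1)k/(2N)], is assumed here; the book's Eq. (2.12) may include an additional α(k) scaling:

```python
import numpy as np

def dct_via_fft(x):
    # Mirror x into an even-symmetric 2N-point block, take a 2N-point FFT,
    # and undo the half-sample phase offset, keeping the first N real terms
    x = np.asarray(x, dtype=float)
    N = len(x)
    xp = np.concatenate([x, x[::-1]])      # x'(n): even-symmetric extension
    Yp = np.fft.fft(xp)                    # 2N-point DFT
    k = np.arange(N)
    return 0.5 * (np.exp(-1j * np.pi * k / (2 * N)) * Yp[:N]).real

# Cross-check against the direct cosine sum
x = np.array([1.0, 3.0, 2.0, 5.0])
N = len(x)
n = np.arange(N)
k = np.arange(N)[:, None]
direct = (x * np.cos(np.pi * (2 * n + 1) * k / (2 * N))).sum(axis=1)
assert np.allclose(dct_via_fft(x), direct)
```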
The one-dimensional DCT is useful in processing one-dimensional signals such as speech waveforms. For two-dimensional (2D) signals such as images, a 2D version of the DCT is needed. The 2D DCT can be expressed as follows:
Forward 2D DCT:
Inverse 2D DCT:
where
Here x(m, n) is an N × N data array. The parameters k, l, m, and n are all integers ranging from 0 to N − 1. Since the transformation kernels are separable, the 2D DCT can be performed in two steps, each involving a 1D DCT. As in the 1D case, 2D DCT image coding, with its even symmetry, has fewer edge-effect problems than 2D DFT image coding.
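The two-step separable computation can be sketched as follows (an orthonormal α(k) scaling is assumed, which may differ from the book's convention):

```python
import numpy as np

def dct_1d(x):
    # Orthonormal 1D DCT-II along the last axis: alpha(k) = sqrt(1/N) for
    # k = 0 and sqrt(2/N) otherwise (a common convention)
    N = x.shape[-1]
    n = np.arange(N)
    C = np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * N)) * np.sqrt(2.0 / N)
    C[0, :] = np.sqrt(1.0 / N)
    return x @ C.T

def dct_2d(x):
    # Separable 2D DCT: 1D DCT on the rows, then on the columns
    return dct_1d(dct_1d(x).T).T

rng = np.random.default_rng(5)
x = rng.standard_normal((8, 8))
X = dct_2d(x)
# The DC coefficient equals the scaled mean of the 8 x 8 block
assert np.allclose(X[0, 0], x.sum() / 8)
```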
In the following discussion, a signal is assumed to be of finite length in the time domain and band-limited in the frequency domain. In general, the frequency spectrum of a finite-length time signal will extend to infinity. The assumption here is that if the frequency components of a finite-length time sequence exceed a certain range, they are considered small enough to be neglected. The relationship between a continuous signal and a discrete signal in both time and frequency domains, based on Fourier transform, leads to the definition of the discrete Fourier transform, described next.
Let an analog sinusoidal function s(t) with single frequency f1, namely, s(t) = cos(2πf1t), be the band-limited time-domain signal. The time-limited, or gated, signal of s(t) can then be represented as s(t)g(t), where g(t) is a gate function with time duration Tp and is defined as
The band-limited signal s(t) is shown on the left side of Fig. 2.18a, and its frequency spectrum S(f) is shown on the right side of Fig. 2.18a. For simplicity, the time duration Tp of g(t) is chosen to be Tp = 4/f1. The gate function g(t) and the gated signal s(t)g(t), together with their corresponding frequency spectra, are shown in Figs. 2.18b and 2.18c, respectively. An impulse train i1(t) with period Δt = 1/(2f1) is shown on the left side of Fig. 2.18d, together with its frequency spectrum, shown on the right side of Fig. 2.18d. The gated signal s(t)g(t) is then multiplied by an impulse train i1(t) to become a finite-length discrete signal as shown on the left side of Fig. 2.18e, together with its spectrum shown on the right side of Fig. 2.18e. An impulse train I2(f) with period Δf = 1/Tp is shown on the right side of Fig. 2.18f, together with its time-domain signal shown on the left side of Fig. 2.18f. The signal spectrum shown on the right side of Fig. 2.18e becomes discrete by multiplying it by I2(f) and is shown on the right side of Fig. 2.18g, together with its time-domain signal shown on the left side of Fig. 2.18g.
Letting Tp = N Δt, with N = 8 in this example, one can express the gated and digitized signal as
Notice that this discrete and finite-time signal has a continuous frequency spectrum and is repeated at the sampling frequency fs = 1/ Δt, where Δt = 1/(2f1). The pulse train in the frequency domain, displayed on the right side of Fig. 2.18f, has Δf = 1/Tp = f1/4. The digitized and periodic spectrum is obtained by multiplying the analog (or continuous) and periodic spectrum by the pulse train in the frequency domain. The corresponding digitized and periodic time-domain signal is shown on the left side of Fig. 2.18g.
The preceding discussion, which starts with an analog or continuous time signal and goes through various gating and multiplication operations with impulse trains in either the time or the frequency domain, finally arrives at two discrete and periodic signals, one in each domain. These two discrete and periodic signals, shown on the left and right sides of Fig. 2.18g, form the discrete Fourier transform pair!
From the preceding discussion and the graphical representation of the time–frequency and analog–discrete relations for a time-gated and frequency band-limited signal, the following observations can be made. (Assume that the signal x(t) is bandlimited with maximum frequency fmax, and that the sampling frequency satisfies fs ≥ 2fmax.)
Hardware implementation of real-time resampling and interpolation on a discrete time sequence based on FIR filtering was addressed in Section 1.6 (of Chapter 1). A second method, zero-padding in the frequency domain followed by the IDFT, was described in Section 2.2. This section describes another method for fractional interpolation, which is based on the DFT and provides some advantages over the other methods.
The interpolation on an N-sample data sequence x(n) to generate a new N-sample sequence x′(n) = x(n + Δ), where n = 0, 1, ..., N − 1 and 0 < Δ < 1, is shown in Fig. 2.19. The continuous signal x(t) appears as the envelope of the discrete signal x(n) and is shown as a dashed curve. The discrete signal x(n) is shown as a vertical solid line with a dot on top. The new interpolated signal x′(n) = x(n + Δ) is the signal Δ away from x(n) and is shown as a dashed vertical line.
To compute x′(n) from x(n), we will use Eqs. (2.2c) and (2.2d), which are repeated here:
Letting x′(n) = x(n + Δ), Eq. (2.2d) becomes
Defining X′(k) = X(k)e^(j(2π/N)kΔ), then
Therefore
This implies that the interpolated signal x(n+Δ) can be obtained by
This DFT-based resampling scheme can be applied to any fractional interpolation on evenly spaced data; it fails on unevenly spaced data.
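The scheme can be sketched as follows. Using signed bin frequencies, rather than k = 0, ..., N − 1 directly, keeps the output real for real input; this is an implementation detail not spelled out in the text:

```python
import numpy as np

def frac_delay(x, delta):
    # Resample x(n) at x(n + delta), 0 < delta < 1, by rotating the phase
    # of each DFT bin; signed bin frequencies keep the output real
    N = len(x)
    k = np.fft.fftfreq(N) * N          # 0, 1, ..., N/2-1, -N/2, ..., -1
    X = np.fft.fft(x)
    return np.fft.ifft(X * np.exp(2j * np.pi * k * delta / N)).real

# A sampled sinusoid well within the Nyquist limit is shifted exactly
N, delta = 32, 0.3
n = np.arange(N)
x = np.cos(2 * np.pi * 3 * n / N)
assert np.allclose(frac_delay(x, delta), np.cos(2 * np.pi * 3 * (n + delta) / N))
```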