1
MATHEMATICAL BACKGROUND AND ANALYSIS TECHNIQUES

1.1 INTRODUCTION

This introductory chapter focuses on various mathematical techniques and solutions to practical problems encountered in many of the following chapters. The discussions are divided into three distinct topics: deterministic signal analysis involving linear systems and channels; statistical analysis involving probabilities, random variables, and random processes; and miscellaneous topics involving windowing functions, mathematical solutions to commonly encountered problems, and tables of commonly used mathematical functions. It is intended that this introductory material provide the foundation for modeling and finding practical design solutions to communication system performance specifications. Although this chapter contains a wealth of information regarding a variety of topics, the contents may be viewed as reference material for specific topics as they are encountered in the subsequent chapters.

This introductory section describes the commonly used waveform modulations characterized as amplitude modulation (AM), phase modulation (PM), and frequency modulation (FM) waveforms. These modulations result in the transmission of the carrier‐ and data‐modulated subcarriers that are accompanied by negative frequency images. These techniques are compared to the more efficient suppressed carrier modulation that possesses attributes of the AM, PM, and FM modulations. This introduction concludes with a discussion of real and analytic signals, the Hilbert transform, and demodulator heterodyning, or frequency mixing, to baseband.

Sections 1.2–1.4, on deterministic signal analysis, introduce the Fourier transform in the context of a uniformly weighted pulse f(t) and its spectrum F(f) and the duality between ideal time and frequency sampling that forms the basis of Shannon’s sampling theorem [1]. These sections also discuss the discrete Fourier transform (DFT), the fast Fourier transform (FFT), the pipeline implementation of the FFT, and applications involving waveform detection, interpolation, and power spectrum estimation. The concept of paired echoes is discussed and used to analyze the signal distortion resulting from a deterministic band‐limited channel with amplitude and phase distortion. These sections conclude with the subject of autocorrelation and cross‐correlation of real and complex deterministic functions; the corresponding covariance functions are also examined.

Sections 1.5–1.10, on statistical analysis, introduce the concept of random variables and various probability density functions (pdf) and cumulative distribution functions (cdf) for continuous and discrete random variables. Stochastic processes are then defined and the properties of ergodic and stationary random processes are examined. The characteristic function is defined and examples, based on the summation of several underlying random variables, exhibit the trend in the limiting behavior of the pdf and cdf functions toward the normal distribution, thereby demonstrating the central limit theorem. Statistical analysis using distribution‐free or nonparametric techniques is introduced with a focus on order statistics. The random process involving narrowband white Gaussian noise is characterized in terms of the noise spectral density at the input and output of an optimum detection filter. This is followed by the derivation of the matched filter, and the equivalence between the matched filter and a correlation detector is also established. The next subject discussed involves the likelihood ratio and log‐likelihood ratio as they pertain to optimum signal detection. These topics are generalized and expanded in Chapter 3 and form the basis for the optimum detection of the modulated waveforms discussed in Chapters 4–9. Section 1.9 introduces the subject of parameter estimation, which is revisited in Chapters 11 and 12 in the context of waveform acquisition and adaptive systems. The final topic in this section involves a discussion of modem configurations and the important topic of automatic repeat request (ARQ) to improve the reliability of message reception.

Sections 1.11–1.14, on miscellaneous topics, include a characterization of several window functions that are used to improve the performance of the FFT, decimation filtering, and signal parameter estimation. Section 1.12 provides an introductory discussion of matrix and vector operations. In Section 1.13 several mathematical procedures and formulas are discussed that are useful in system analysis and simulation programming. These formulas involve prime factorization of an integer and determination of the greatest common factor (GCF) and least common multiple (LCM) of two integers, Newton’s approximation method for finding the roots of a transcendental function, and the definition of the standard deviation of a sampled population. This chapter concludes with a list of frequently used mathematical formulas involving infinite and finite summations, the binomial expansion theorem, trigonometric identities, differentiation and integration rules, inequalities, and other miscellaneous relationships.

Many of the examples and case studies in the following chapters involve systems operating in a specific frequency band that is dictated by a number of factors, including the system objectives and requirements, the communication range equation, the channel characteristics, and the resulting link budget. The system objectives and requirements often dictate the frequency band that, in turn, identifies the channel characteristics. Table 1.1 identifies the frequency band designations with the corresponding range of frequencies. The designations low frequency (LF), medium frequency (MF), and high frequency (HF) refer to low, medium, and high frequencies, and the prefixes E, V, U, and S correspond to extremely, very, ultra, and super.

TABLE 1.1 Frequency Band Designations

Designation Frequency Letter Designation Frequency (GHz)
ELF 3–30 Hz L 1–2
SLF 30–300 Hz S 2–4
ULF 0.3–3 kHz C 4–8
VLF 3–30 kHz X 8–12
LF 30–300 kHz Ku 12–18
MF 0.3–3 MHz K 18–27
HF 3–30 MHz Ka 27–40
VHF 30–300 MHz V 40–75
UHF 0.3–3 GHz W 75–110
SHF 3–30 GHz mm (millimeter) 110–300
EHF 30–300 GHz

1.1.1 Waveform Modulation Descriptions

This section characterizes signal waveforms comprised of baseband information modulated on an arbitrary carrier frequency, denoted as fc Hz. The baseband information is characterized as having a lowpass bandwidth of B Hz and, in typical applications, fc >> B. In many communication system applications, the carrier frequency facilitates the transmission between the transmitter and receiver terminals and can be removed without affecting the information. When the carrier frequency is removed from the received signal, the signal processing sampling requirements are dependent only on the bandwidth B.

The signal modulations described in Sections 1.1.1.1 through 1.1.1.4 are amplitude, phase, frequency, and suppressed carrier modulations. The amplitude, phase, and frequency modulations are often applied to the transmission of analog information; however, they are also used in various applications involving digital data transmission. For example, these modulations, to varying degrees, are the underlying waveforms used in the U.S. Air Force Satellite Control Network (AFSCN) involving satellite uplink and downlink control, status, and ranging.

In describing the demodulator processing of the received waveforms, the information, following removal of the carrier frequency, is associated with in‐phase and quadphase (I/Q) baseband channels or rails. Although these I/Q channels are described as containing quadrature real signals, they are characterized as complex signals with real and imaginary parts. This complex signal description is referred to as complex envelope or analytic signal representations and is discussed in Section 1.1.1.5. Suppressed carrier modulation and the analytic signal representation emphasize quadrature data modulation that leads to a discussion of the Hilbert transform in Section 1.1.1.6. Section 1.1.1.7 discusses conventional heterodyning of the received signal to baseband followed by data demodulation.

1.1.1.1 Amplitude Modulation

Conventional amplitude modulation (AM) is characterized as

where A is the peak carrier voltage, images is the modulation index, m(t) is the information modulation function, ωm is the modulation angular frequency, and ωc is the AM carrier angular frequency. Upon multiplying (1.1) through by sin(ωct) and applying elementary trigonometric identities, the AM‐modulated signal is expressed as

Therefore, s(t) represents the conventional double sideband (DSB) AM waveform with the upper and lower sidebands at images equally spaced about the carrier at ωc. With the information modulation function m(t) normalized to unit power, the power in each sideband is mIPS/4 where PS is the power in the carrier frequency fc.
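Because the numbered equations are not reproduced here, the following single‐tone illustration of the AM sideband structure is given as an assumption about the specific form, not necessarily the text's (1.1) and (1.2):

    s(t) = A\,[1 + m_I\cos(\omega_m t)]\,\sin(\omega_c t)
         = A\sin(\omega_c t) + \frac{A m_I}{2}\Bigl[\sin\bigl((\omega_c+\omega_m)t\bigr) + \sin\bigl((\omega_c-\omega_m)t\bigr)\Bigr]

Under this single‐tone assumption the carrier power is PS = A²/2 and each sideband carries the power (AmI/2)²/2 = mI²PS/4, consistent with the sideband power statement above.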

1.1.1.2 Phase Modulation

Conventional phase modulation (PM) is characterized as

where A is the peak carrier voltage, ωc is the carrier angular frequency, and φ(t) is an arbitrary phase modulation function containing the information. The commonly used phase function is expressed as

where ϕ is the peak phase deviation. Substituting (1.4) into (1.3), the phase‐modulated signal is expressed as

and, upon applying elementary trigonometric identities, (1.5) yields

The trigonometric functions involving sinusoidal arguments can be expanded in terms of Bessel functions [2] and (1.6) simplifies to

Equation (1.7) is characterized by the carrier frequency with peak amplitude AJ0(ϕ) and upper and lower sideband pairs at images with peak amplitudes AJn(ϕ). For small arguments the Bessel functions reduce to the approximations images with images and (1.7) reduces to

Under these small argument approximations, the similarities between (1.8) and (1.2) are apparent.
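For reference, the Bessel‐series identity underlying (1.7) and its small‐argument limit can be written as follows; this is the standard Jacobi–Anger expansion stated in generic notation, not a reproduction of the omitted equations:

    A\sin\bigl(\omega_c t + \phi\sin(\omega_m t)\bigr) = A\sum_{n=-\infty}^{\infty} J_n(\phi)\,\sin\bigl((\omega_c + n\omega_m)t\bigr)
                                                  \approx A\sin(\omega_c t) + \frac{A\phi}{2}\bigl[\sin((\omega_c+\omega_m)t) - \sin((\omega_c-\omega_m)t)\bigr], \qquad \phi \ll 1

which exhibits the carrier term of amplitude AJ0(ϕ) and the sideband pairs of amplitude AJn(ϕ) at ωc ± nωm described above.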

1.1.1.3 Frequency Modulation

The frequency‐modulated (FM) waveform is described as

where A is the peak carrier voltage, ωc is the carrier angular frequency, Δf is the peak frequency deviation of the modulation frequency fm, and ωm is the modulation angular frequency. The ratio Δf/fm is the frequency modulation index. Noting the similarities between (1.9) and (1.5), the expression for the frequency‐modulated waveform is expressed, in terms of the Bessel functions, as

(1.10) images

with the corresponding small argument approximation for the Bessel function expressed as

The similarities between (1.11), (1.8), and (1.2) are apparent.

1.1.1.4 Suppressed Carrier Modulation

A commonly used form of modulation is suppressed carrier modulation expressed as

In this case, the information modulation function m(t) does not have a direct current (DC) spectral component involving δ(ω), so, when the carrier is mixed to baseband, there is no residual carrier component at ωc in the received baseband signal. Because the carrier is suppressed it is not available at the receiver/demodulator to provide a coherent reference, so special considerations must be given to the carrier recovery and subsequent data demodulation. Suppressed carrier‐modulated waveforms are efficient, in that, all of the transmitted power is devoted to the information. Suppressed carrier modulation and the various methods of carrier recovery are the central focus of the digital communication waveforms discussed in the following chapters.

1.1.1.5 Real and Analytic Signals

The earlier modulation waveforms are described mathematically as real waveforms that can be transmitted over real or physical channels. The general description of the suppressed carrier waveform, described in (1.12), can be expressed in terms of in‐phase and quadrature modulation functions mc(t) and ms(t) as

The quadrature modulation functions are expressed as

and

With PM the data {dc, ds} may be contained in a phase function φd(t), m(t) is a unit energy symbol shaping function that provides for spectral control relative to the commonly used rect(t/T) function, and A represents the peak carrier voltage on each rail. With quadrature modulations, unique symbol shaping functions, mc(t) and ms(t), may be applied to each rail; for example, unbalanced quadrature modulations involve different data rates on each quadrature rail. With quadrature amplitude modulation (QAM) the data is described in terms of the multilevel quadrature amplitudes {αc, αs} that are used in place of {dc, ds} in (1.14) and (1.15).

Equation (1.13) can also be expressed in terms of the real part of a complex function as

where

The function images is referred to as the complex envelope or analytic representation of the baseband signal and plays a fundamental role in the data demodulation, in that, it contains all of the information necessary to optimally recover the transmitted information. Equation (1.17) applies to receivers that use linear frequency translation to baseband. Linear frequency translation is typical of heterodyne receivers using intermediate frequency (IF) stages. This is a significant result because the system performance can be evaluated using the analytic signal without regard to the carrier frequency [3]; this is particularly important in computer performance simulations.

Evaluation of the real part of the signal expressed in (1.16) is performed using the complex identity No. 4 in Section 1.14.6 with the result

A note of caution is in order, in that, the received signal power based on the analytic signal is twice that of the power in the carrier. This results because the analytic signal does not account for the factor of 1/2 when mixing or heterodyning with a locally generated carrier frequency and is directly related to the factor of 1/2 in (1.18). The signal descriptions expressed in (1.12) through (1.18) are used to describe the narrowband signal characteristics used throughout much of this book.

1.1.1.6 Hilbert Transform and Analytic Signals

The Hilbert transform of the real signal s(t) is defined as

The second expression in (1.19) represents the convolution of s(t) with a filter having the impulse response images, where h(t) is the impulse response of the Hilbert filter with frequency response H(ω) characterized as

The Hilbert transform of s(t), when combined with s(t) to form the analytic signal, results in a spectrum that is zero for all negative frequencies, with the positive frequencies representing a complex spectrum associated with the real and imaginary parts of an analytic function. Applying (1.20) to the signal spectrum images results in the spectrum of the Hilbert transformed signal

Applying (1.21) to the spectrum S(ω) of (1.12) or (1.13), the bandwidth B of m(t) must satisfy the condition B << fc. In this case, the inverse Fourier transform of the spectrum images yields the Hilbert filter output images given by

where TH[s(t)] represents the Hilbert transform of s(t).

The function images expressed by (1.22) is orthogonal to s(t) and, if the carrier frequency were removed following the Hilbert transform, the result would be identical to the imaginary part of the analytic signal expressed by (1.17). The processing is depicted in Figure 1.1.


FIGURE 1.1 Hilbert transform of carrier‐modulated signal s(t) images.
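A minimal numerical sketch of this relationship is given below, assuming NumPy and SciPy are available; scipy.signal.hilbert returns the analytic signal s(t) + jTH[s(t)], and the frequencies and bandwidth used are illustrative only:

    import numpy as np
    from scipy.signal import hilbert

    fs, fc, B = 1.0e4, 1.0e3, 50.0               # sample rate, carrier, message bandwidth (B/fc << 1)
    t = np.arange(0.0, 0.5, 1.0 / fs)
    m = np.cos(2.0 * np.pi * B * t)              # narrowband message
    s = m * np.cos(2.0 * np.pi * fc * t)         # carrier-modulated real signal

    z = hilbert(s)                               # analytic signal: s(t) + j*TH[s(t)]
    s_hat = np.imag(z)                           # Hilbert transform of s(t)

    # The Hilbert transform is (nearly) orthogonal to s(t); removing the carrier
    # from the analytic signal yields the complex envelope.
    print(np.dot(s, s_hat) / np.dot(s, s))       # ~0
    envelope = z * np.exp(-2j * np.pi * fc * t)  # complex envelope at baseband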

1.1.1.7 Conventional and Complex Heterodyning

Conventional heterodyning is depicted in Figure 1.2. The zonal filters are ideal low‐pass filters with frequency response given by

(1.23) images

FIGURE 1.2 Heterodyning of carrier‐modulated signal s(t) images.

These filters remove the 2ωc term that results from the mixing operation and, for s(t) as expressed by (1.13), the quadrature outputs are given by

(1.24) images

and

(1.25) images

With ideal phase tracking the phase term ϕ(t) is zero, resulting in the quadrature modulation functions mc(t) and ms(t) in the respective low‐pass channels.
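A sketch of this quadrature heterodyne operation, assuming NumPy/SciPy, with an FIR low‐pass filter standing in for the ideal zonal filter and illustrative message tones on the two rails; the factor of 2 compensates the 1/2 produced by the mixing products:

    import numpy as np
    from scipy.signal import firwin, filtfilt

    fs, fc = 1.0e5, 1.0e4                          # sample rate and carrier (Hz), illustrative values
    t = np.arange(0.0, 0.02, 1.0 / fs)
    mc = np.cos(2.0 * np.pi * 200.0 * t)           # in-phase message
    ms = np.sin(2.0 * np.pi * 300.0 * t)           # quadrature message
    s = mc * np.cos(2.0 * np.pi * fc * t) - ms * np.sin(2.0 * np.pi * fc * t)

    lp = firwin(201, 2000.0, fs=fs)                # stand-in for the ideal zonal (low-pass) filter
    sc = filtfilt(lp, [1.0], 2.0 * s * np.cos(2.0 * np.pi * fc * t))    # ~ mc(t)
    ss = filtfilt(lp, [1.0], -2.0 * s * np.sin(2.0 * np.pi * fc * t))   # ~ ms(t)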

1.2 THE FOURIER TRANSFORM AND FOURIER SERIES

The Fourier transform is so ubiquitous in the technical literature [4–6], and its applications are so widely used, that it seems unnecessary to dwell at any length on the subject. However, a brief description is in order to aid in the understanding of the parameters used in the applications discussed in the following chapters.

The Fourier transform F(f) of f(t) is defined over the interval images and, if f(t) is absolutely integrable, that is, if

then F(f) exists; furthermore, the inverse Fourier transform of F(f) results in f(t). In most applications1 of practical interest, f(t) satisfies (1.26), leading to the Fourier transform pair images defined as

In general, f(t) is real and the Fourier transform F(f) is complex and Parseval’s theorem relates the signal energy in the time and frequency domains as

(1.28) images

The Fourier series representation of a periodic function is closely related to the Fourier transform; however, it is based on orthogonal expansions of sinusoidal functions at discrete frequencies. For example, if the function of interest is periodic, such that, f(t) = f(t – iTo) with period To and is finite and single valued over the period, then f(t) can be represented by the Fourier series

where ωo = 2π/To and Cn is the n‐th Fourier coefficient given by

Equation (1.29) is an interesting relationship, in that, f(t) can be described over the time interval To by an infinite set of frequency‐domain coefficients Cn; however, because f(t) is contiguously replicated over all time, that is, it is periodic, the spectrum of f(t) is completely defined by the coefficients Cn. Unlike the Fourier transform, the spectrum of (1.29) is not continuous in frequency but is zero except at discrete frequencies occurring at multiples of ωo. This is seen by taking the Fourier transform of (1.29) and, using (1.27), the result is expressed as

where images is the Fourier Transform2 of images. Equation (1.31) is applied in Chapter 2 in the discussion of sampling theory and in Chapter 11 in the context of signal acquisition.

Alternate forms of (1.29) that emphasize the series expansion in terms of harmonics of trigonometric functions are given in (1.32) and (1.33) when f(t) is a real‐valued function. This is important because when f(t) is real the complex coefficients Cn and C−n form a complex conjugate pair such that images which simplifies the evaluation of f(t). For example, using the complex notations images and images, the function f(t) is evaluated as

this simplifies to

where images and images.

An important consideration in spectrum analysis is the determination of signal spectrums involving random data sequences, referred to as stochastic processes [8]. A stochastic process does not have a unique spectrum; however, the power spectral density (PSD) is defined as the Fourier transform of the autocorrelation function. Oppenheim and Schafer [9] discuss methods of estimating the PSD of a real finite‐length (N) sampled sequence by averaging periodograms, defined as

(1.34) images

where F(ω) is the Fourier transform of the sampled sequence. This method is attributed to Bartlett [10] and is used in the evaluation of the PSD in the following chapters. For a fixed length (L) of random data, the number of periodograms (K) that can be averaged is K = L/N. As K increases the variance of the spectral estimate approaches zero and as N increases the resolution of the spectrum increases, so there is a trade‐off between the selection of K and N. To resolve narrowband spectral features that occur, for example, with nonlinear frequency shift keying (FSK)‐modulated waveforms, it is important to use large values of N. Fortunately, many of the spectrum analyses presented in the following chapters are not constrained by L, so K and N are chosen to provide a low estimation variance and high spectral resolution. Windowing3 the periodograms will also reduce the estimation bias at the expense of decreasing the spectral resolution.

1.2.1 The Transform Pair rect(t/T) ⇔ Tsinc(fT)

The transform relationship rect(t/T) ⇔ Tsinc(fT) occurs so often that it deserves special consideration. For example, consider the following function:

where ωc, τ, and ϕ represent arbitrary angular frequency, delay, and phase parameters. The signal s(t) is depicted in Figure 1.3.


FIGURE 1.3 Pulse‐modulated carrier.

The Fourier transform of s(t) is evaluated as

(1.36) images

Expressing the cosine function in terms of complex exponential functions and performing some simplifications results in the expression

Evaluation of the integrals in (1.37) appears so often that it is useful to generalize the solutions as follows:

Consider the integral

The general solution involves multiplying the last equality in (1.38) by the factors images and images, having a product of one, where images is the average of the integration limits. Distributing the second factor over the numerator of (1.38) and then simplifying yields the result

Applying (1.39) to (1.37) and simplifying gives the desired result

When images, the positive and negative frequency spectrums do not influence one another and, in this case, the positive frequency spectrum is defined as

(1.41) images

On the other hand, when the carrier frequency and phase are zero, (1.40) simplifies to the baseband spectrum, evaluated as

Using (1.42), the baseband Fourier transform pair, corresponding to (1.35) with fc = 0, is established as

(1.43) images

and, with τ = 0,

(1.44) images

1.2.2 The sinc(x) Function

The sinc(x) function is defined as

and is depicted in Figure 1.4. When x is expressed as the normalized frequency variable x = fT then (1.45), when scaled by T, is the frequency spectrum of the unit amplitude pulse rect(t/T) of duration T seconds such that |t| ≤ T/2. This function is symmetrical in x and the maximum value of the first sidelobe occurs at x = 1.431 with a level of 10log(sinc2(x)) = −13.26 dB; the peak sidelobe levels decrease in proportion to 1/|x|. The noise bandwidth of a filter function H(f) is defined as

where fo is the filter frequency corresponding to the maximum response. When a receiver filter is described as H(f) = sinc(fT) the receiver low‐pass noise bandwidth is evaluated as Bn = 1/T where T is the duration of the filter impulse response.


FIGURE 1.4 The sinc(x) function.
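A quick numerical check of the Bn = 1/T statement, assuming NumPy; the two-sided integral of |sinc(fT)|² over frequency is approximated by a discrete sum:

    import numpy as np

    T = 1.0e-3                                     # filter impulse-response duration (s)
    f = np.linspace(-200.0 / T, 200.0 / T, 2000001)
    df = f[1] - f[0]
    H = np.sinc(f * T)                             # np.sinc(x) = sin(pi*x)/(pi*x)
    Bn = np.sum(np.abs(H) ** 2) * df / np.max(np.abs(H)) ** 2
    print(Bn * T)                                  # ~1.0, that is, Bn ~ 1/T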

It is sometimes useful to evaluate the area of the sinc(x) function and, while there is no closed‐form solution, the area can be evaluated in terms of the sine‐integral Si(x)4 as

where the sine‐integral is defined as the integral of sin(λ)/λ. Equation (1.47) is shown in Figure 1.5. The limit of Si(πz) as |z| → ∞ is (π/2)sgn(z), so the corresponding limit of (1.47) is 0.5sgn(z).


FIGURE 1.5 Integral of sinc(x).

A useful parameter, often used as a benchmark for comparing spectral efficiencies, is the area under sinc2(x) as a function of x. The area is evaluated in terms of the sine‐integral as

Equation (1.48) is plotted in Figure 1.6 as a percent of the total area and it is seen that 99% spectral containment requires in excess of 18 spectral sidelobes, that is, x = fT = 18. In the following chapters, spectrally efficient waveforms are examined with 99% containment within 2 or 3 sidelobes, so the sinc(x) function does not represent a spectrally efficient waveform modulation.


FIGURE 1.6 Integral of sinc2(x) function.

1.2.3 The Fourier Transform Pair images

The evaluation of this Fourier transform pair is fundamental to Nyquist sampling theory and is demonstrated in Section 2.3 in the evaluation of discrete‐time sampling. In this case, the function f(t) is an infinite repetition of equally spaced delta functions δ(t) with intervals T seconds as expressed by

The challenge is to show that the Fourier transform of (1.49) is equal to an infinite repetition of equally spaced and weighted frequency domain delta functions expressed as

with weighting ωo and frequency intervals images. Direct application of the Fourier transform to (1.49) leads to the spectrum images but this does not demonstrate the equality in (1.50). Similarly, evaluation of the inverse Fourier transform of (1.50) results in the time‐domain expression

So, by showing that images, the transform pair between (1.49) and (1.50) will be established. Consider gN(t) to be a finite summation of terms in (1.51) given by

The second equality in (1.52) can be shown using the finite series identity No. 12, Section 1.14.1. Equation (1.52) is referred to by Papoulis [7] as the Fourier‐series kernel and appears in a number of applications involving the Fourier transform.

The function gN(t) is plotted in Figure 1.7 for N = 8. The abscissa is time normalized by the pulse repetition interval images such that, images, and there are a total of images peaks of which three are shown in the figure. Furthermore, there are eight time sidelobes between t/T = 0 and 0.5 with the first nulls from the peak value at t/T = 0 occurring at images; the peak values are images in this example.


FIGURE 1.7 The Fourier‐series kernel gN(t) (N = 8).

The maximum values of images, occurring at images, are determined by applying L’Hospital’s rule to (1.52), which is rewritten as

The approximation in (1.53) is obtained by noting that as N increases the rate of the sinusoidal variations in the numerator term increases with a frequency of images Hz while the rate of sinusoidal variation in the denominator remains unchanged. Therefore, in the vicinity of images, images and (1.53) reduces to a sin(x)/x function with images and a peak amplitude (2N + 1). The proof of the transform pair is completed by showing that f(t) = g(t). Referring to (1.51) g(t) is expressed as

(1.54) images

From (1.53) as N approaches infinity the sin(x)/x sidelobe nulls converge to t/T = n, the peak values become infinite, and the corresponding area over the interval |t/T| = n ± 1/2 approaches unity. Therefore, g(t) resembles a periodic series of delta functions resulting in the equality

(1.55) images

thus completing the proof that (1.49) and (1.50) correspond to a Fourier transform pair. Papoulis (Reference 7, pp. 50–52) provides a more elegant proof that the limiting form of gN(t) is indeed an infinite sequence of delta functions.

1.2.4 The Discrete Fourier Transform

The DFT pair relating the discrete‐time function f(mΔt) ≡ f(m) and discrete‐frequency function F(nΔf) ≡ F(n) is denoted as images where

With the DFT the number of time and frequency samples can be chosen independently. This is advantageous when preparing presentation material or examining fine spectral or temporal details, as might be useful when debugging simulation programs, by the independent selection of the integers m and n.
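A direct‐evaluation sketch of this flexibility, assuming NumPy; the time samples and the frequency evaluation grid are chosen independently (the names and parameter values are illustrative):

    import numpy as np

    def dft(f_m, dt, n_freq, df):
        """Direct DFT of samples f_m (spacing dt) evaluated at n_freq frequencies spaced df apart."""
        m = np.arange(len(f_m))
        F = np.empty(n_freq, dtype=complex)
        for n in range(n_freq):
            F[n] = np.sum(f_m * np.exp(-2j * np.pi * (n * df) * (m * dt)))
        return F

    # Example: a 16-sample rectangular pulse examined on a fine 256-point frequency grid
    F = dft(np.ones(16), dt=1.0, n_freq=256, df=1.0 / 256.0)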

1.2.5 The Fast Fourier Transform

As discussed in the preceding section, the DFT pair, relating the discrete‐time function f(mΔt) ≡ f(m) and the discrete‐frequency function F(nΔf) ≡ F(n), is denoted as images where f(m) and F(n) are characterized by the expressions for the DFT. The FFT [11–17] is a special case corresponding to m and n being equal to N as described in the remainder of this section. In these relationships N is the number of time samples and is defined as the power of a fixed radix‐r FFT or as the powers of a mixed radix‐rj FFT.6 The fixed radix‐2 FFT, with r = 2 and N = 2i, results in the most processing‐efficient implementation.

Defining the time window of the FFT as Tw results in an implicit periodicity of f(t) such that f(t) = f(t ± kTw) and Δt = Tw/N. The sampling frequency is defined as images and, based on Shannon’s sampling theorem, the periodicity does not pose a practical problem as long as the signal bandwidth is completely contained in the interval |B| ≤ fs/2 = N/(2Tw). Since the FFT results in an equal number of time and frequency domain samples, that is, Δf = fs/N and Δt = Tw/N, it follows that ΔfΔt = fs Tw/N2 = 1/N. Normalizing the expression of the time function, f(m), in (1.56), that is, multiplying the inverse DFT (IDFT) by Δt requires dividing the expression for F(n) by Δt. Upon substituting these results into (1.56), the FFT transform pairs become

The time and frequency domain sampling characteristics of the FFT are shown in Figure 1.8. This depiction focuses on a communication system example, in that, the time samples over the FFT window interval Tw are subdivided into Nsym symbol intervals of duration T seconds with Ns samples/symbol.


FIGURE 1.8 FFT time and frequency domain sampling.

Typically the bandwidth of the modulated waveform is taken to be the reciprocal of the symbol duration, that is, 1/T Hz; however, the receiver bandwidth required for low symbol distortion is typically several times greater than 1/T depending upon the type of modulation. Referring to Figure 1.8 the sampling frequency is fs = 1/Δt, the sampling interval is Δt = T/Ns, the size of the FFT is Nfft = NsNsym, and the frequency sampling increment is Δf = fs/Nfft. Upon using these relationships, the frequency resolution, or frequency samples per symbol bandwidth B = 1/T, is found to be

and the number of spectral sidelobes7 or symbol bandwidths over the sampling frequency range is

Therefore, to increase the resolution of the sampled signal spectrum, the number of symbols must be increased and this is comparable to increasing Tw. On the other hand, to increase the number of signal sidelobes contained in the frequency spectrum the number of samples per symbol must be increased and this is comparable to decreasing Δt. Both of these conditions require increasing the size (N) of the FFT. However, for a given size, the FFT does not allow independent selection of the frequency and time resolution as determined, respectively, by (1.58) and (1.59). This can be accomplished by using the DFT as discussed in Section 1.2.4. Since the spectrum samples in the range 0 ≤ f < fs/2 represent the positive frequency signal spectrum and those over the range fs/2 ≤ f < fs represent the negative frequency signal spectrum, the range of signal sidelobes of interest is ±fs/(2B) = ±Ns/2. As a practical matter, if the signal carrier frequency is not zero then the sampling frequency must be increased to maintain the signal sidelobes aliasing criterion. The sampling frequency selection is discussed in Chapter 11 in the context of signal acquisition when the received signal frequency is estimated based on locally known conditions.
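A numeric illustration of these bookkeeping relations with illustrative parameter values; the two printed ratios correspond to the quantities defined in (1.58) and (1.59):

    Ns, Nsym, T = 8, 64, 1.0e-3         # samples/symbol, symbols in the window, symbol duration (s)
    dt = T / Ns                         # sampling interval
    fs = 1.0 / dt                       # sampling frequency
    Nfft = Ns * Nsym                    # FFT size
    df = fs / Nfft                      # frequency sample spacing
    B = 1.0 / T                         # nominal symbol bandwidth
    print(B / df)                       # frequency samples per symbol bandwidth = Nsym = 64
    print(fs / B)                       # symbol bandwidths across the sampling range = Ns = 8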

The following implementation of the FFT is based on the Cooley and Tukey [18] decimation‐in‐time algorithm as described by Brigham and Morrow [19] and Brigham [20]. Although (1.57) characterizes the FFT transform pairs, the real innovation leading to the fast transformation is realized by the efficient algorithms used to execute the transformation. Considering the radix‐2 FFT with N = 2n, this involves defining the constant

and recognizing that

Equation (1.61) can be expressed in matrix form, using N = 4 for simplicity, as

Recognizing that W0 = 1 and the exponent nm is modulo(N), upon factoring the matrix in (1.62) into the product of two submatrices (in general the product of log2N submatrices) leads to the implementation involving the minimum number of computations expressed as

The simplifications result in the outputs F(2) and F(1) being scrambled and the unscrambling to the natural‐number ordering simply involves reversing the binary number equivalents, that is, with F′(1) = F(2) and F′(2) = F(1); therefore, the unscrambling is accomplished as F(1) = F(01) = F′(2) = F′(10) and F(2) = F(10) = F′(1) = F′(01). The radix‐2, N = 4‐point FFT, described by (1.63), is implemented as shown in the diagram of Figure 1.9.


FIGURE 1.9 Radix‐2, N = 4‐point FFT implementation tree diagram.

The inverse FFT (IFFT) is implemented by changing the sign of the exponent of W in (1.60), interchanging the roles of F(n) and f(m), as described earlier, and replacing Δt by Δf. Recognizing that ΔtΔf = 1/N, it is a common practice not to weight the FFT but to weight the IFFT by 1/N as indicated in (1.57). The number of complex multiplications is determined from (1.63) by recognizing that W2 = −W0 and not counting previous products like W0f(2) from row 1 and W2f(2) = −W0f(2) from row 3 in the first matrix multiplication on the rhs of (1.63). For the commonly used radix‐2 FFT, the number of complex multiplications is (N/2)log2(N) and the number of complex additions is Nlog2(N). By comparison, the number of complex multiplications and additions in the direct Fourier transform are N2 and N(N − 1), respectively. These computational advantages are enormous for even modest transform sizes.
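A compact recursive sketch of the radix‐2 decimation‐in‐time idea, assuming NumPy; this is an illustrative formulation (not the matrix‐factored form of (1.63)) and requires N to be a power of two:

    import numpy as np

    def fft_radix2(x):
        """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
        x = np.asarray(x, dtype=complex)
        N = len(x)
        if N == 1:
            return x
        even = fft_radix2(x[0::2])                          # DFT of even-indexed samples
        odd = fft_radix2(x[1::2])                           # DFT of odd-indexed samples
        W = np.exp(-2j * np.pi * np.arange(N // 2) / N)     # twiddle factors
        return np.concatenate([even + W * odd, even - W * odd])

    x = np.random.randn(16) + 1j * np.random.randn(16)
    print(np.allclose(fft_radix2(x), np.fft.fft(x)))        # True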

1.2.5.1 The Pipeline FFT

The FFT algorithm discussed in the preceding section involves decimation‐in‐time processing and requires collecting an entire block of time‐sampled data prior to performing the Fourier transform. In contrast, the pipeline FFT [21] processes the sampled data sequentially and outputs a complete Fourier transform of the stored data at each sample. The implementation of a radix‐2, N = 8‐point pipeline FFT is shown in Figure 1.10. The pipeline FFT inherently scrambles the outputs F′(n) and the unscrambled outputs are not shown in the figure; the unscrambling is accomplished by simply reversing the order of the binary representation of the output locations, n, as described in the preceding section.


FIGURE 1.10 Radix‐2, N = 8‐point pipeline FFT implementation tree diagram.

In general, the number of complex multiplications for a complete transform is (N/2)(N − 1). In Chapter 11 the pipeline FFT is applied in the acquisition of a waveform where a complete N‐point FFT output is not required at every sample. For example, if the complete N‐point FFT is only required at sample intervals of NsTs, the number of complex multiplications can be significantly reduced (see Problem 10). The pipeline FFT can be used to interpolate between the fundamental frequency cells by appending zeros to the data samples and appropriately increasing the size of the FFT; it can also be used with data samples requiring mixed radix processing. The pipeline FFT is applicable to radar and sonar signal detection processing [21] using a variety of spectral shaping windows; however, the intrinsic rect(t/T) FFT window is nearly matched for the detection of orthogonally spaced M‐ary FSK modulated frequency tones.

1.2.6 The FFT as a Detection Filter

The pipeline Fourier transform is made up of a cascade of transversal filter building blocks shown in Figure 1.10. The transfer function of this building block is

(1.64) images

The overall transfer function from the input to a particular output is evaluated as

where k = log2(N) and ki = 2i − 1, i = 1, …, k. The complex weights are given by

(1.66) images

where

(1.67) images

Substitution of these weights into (1.65) results in

(1.68) images

where

(1.69) images

This transfer function is expressed in terms of magnitude and phase functions in ω by substituting s = jω with the result

(1.70) images

where

(1.71) images

Therefore, the FFT forms N filters, images each having a maximum response images that occurs at the frequencies images. As N increases these transfer functions result in the response

The magnitude of (1.72) is the sinc(x) function associated with the uniformly weighted envelope modulation function and, therefore, the FFT filter functions as a matched detection filter for these forms of modulations. Examples of these modulated waveforms are binary phase shift keying (BPSK), quadrature phase shift keying (QPSK), offset quadrature phase shift keying (OQPSK), and M‐ary FSK.

The FFT detection filter loss relative to the ideal matched filter is examined as N increases. The input signal is expressed as

(1.73) images

and the corresponding signal spectrum for positive frequencies with ωc ≫ 2π/T is

The matched filter for the optimum detection of s(t) in additive white noise with spectral density No is defined as

(1.75) images

where K is an arbitrary scale factor and To is an arbitrary delay influencing the causality of the filter. By letting images, images, To = (N − 1)Ts/2, and images it is seen that the FFT approaches a matched filter as N increases.

The question of how closely the FFT approximates a matched filter detector is examined in terms of the loss in signal‐to‐noise ratio. The filter loss is expressed in dB as

where (SNRo)opt = 2E/No is the signal‐to‐noise ratio out of the matched filter and E is the signal energy. The signal‐to‐noise ratio out of the FFT filter is expressed in terms of the peak signal output of the detection filter and the output noise power as

where Bn is the detection filter noise bandwidth. For convenience the zero‐frequency FFT filter output is considered, that is, for images, and letting the signal phase ϕ = 0, the response of interest is

(1.78) images

and, from (1.74),

(1.79) images

To evaluate SNRo at the output of the FFT filter, go(t)max and Bn are computed as

(1.80) images

and

(1.81) images

Substituting these results into (1.77) and using (1.76), the parameter ρ is evaluated as

Equation (1.82) is evaluated numerically for several values of N and the results are tabulated in Table 1.2. These results indicate, for example, that detecting an 8‐ary FSK‐modulated waveform with orthogonal tone spacing using an N = 8‐point FFT results in a performance loss of 0.116 dB relative to an ideal matched filter.

TABLE 1.2 N‐ary FSK Waveform Detection Loss Using an N‐Point FFT Detection Filter

N ρ (dB)
2 0.452
3 0.236
8 0.116
16 0.053

1.2.7 Interpolation Using the FFT

When an FFT is performed on a uniformly weighted set of N data samples a set of N sinc(fTw) orthogonal filters is generated where Tw = NTs is the sampled data window and Ts is the sampling interval. The N filters span the frequency range fs = 1/Ts and provide N frequency estimates that are separated by Δf = fs/N Hz. Frequency interpolation is achieved if the FFT window is padded by adding nN zero‐samples, thereby increasing the window by nNTs seconds. In this case, a set of (n + 1)N sinc(fTw) filters spanning the frequency fs is generated that provides n‐point interpolation between each of the original N filters.

The FFT can also be used to interpolate between time samples. For example, consider a sampled time function characterized by N samples over the interval Tw = NTs where Ts is the sampling interval. The corresponding N‐point FFT has N filters separated by Δf = fs/N where fs = 1/Ts. If nN zero‐frequency samples are inserted between frequency samples N/2 and N/2 + 1 and the IFFT is taken on the resulting (n + 1)N samples, the resulting time function contains n interpolation samples between each of the original N time samples. These interpolation methods increase the size of the FFT or IFFT and thereby the computational complexity.
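A sketch of both interpolation procedures, assuming NumPy; zero samples are appended in time for frequency interpolation, and inserted at the middle of the spectrum for time interpolation (the test sequence and factor n are illustrative):

    import numpy as np

    N, n = 32, 3                                          # original size and interpolation factor
    x = np.cos(2.0 * np.pi * 5.3 * np.arange(N) / N)      # real test sequence

    # Frequency interpolation: append n*N zero samples to the time record.
    X_interp = np.fft.fft(np.concatenate([x, np.zeros(n * N)]))   # (n+1)*N frequency points

    # Time interpolation: insert n*N zeros between spectrum samples N/2 and N/2 + 1, then IFFT.
    X = np.fft.fft(x)
    X_pad = np.concatenate([X[: N // 2], np.zeros(n * N), X[N // 2 :]])
    x_interp = (n + 1) * np.fft.ifft(X_pad)               # n interpolated points between original samples
    print(np.allclose(x_interp.real[:: n + 1], x))        # original samples are preserved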

1.2.8 Spectral Estimation Using the FFT

Many applications involve the characterization of the PSD of a finite sequence of random data. A random data sequence represents a stochastic process, for which, the PSD is defined as the Fourier transform of the autocorrelation function of the sequence. If the random process is such that the statistical averages formed among independent stochastic processes are equal to the time averages of the sequences, then the Fourier transform will converge in some sense8 to the true PSD, S2(ω); however, this typically requires very long sequences that are seldom available. Furthermore, the classical approach, using the Fourier transform of the autocorrelation function, is processing intensive and time consuming, requiring long data sequences to yield an accurate representation of the PSD. A much simpler approach, analyzed by Oppenheim and Schafer [22], is to recognize that the Fourier transform of a relatively short data sequence x(n) of N samples is

(1.83) images

and, defining the Fourier transform of the autocorrelation function Cxx(m) of x(n) as the periodogram

(1.84) images

However, the periodogram is not a consistent estimate9 of the true PSD, having a large variance about the true values resulting in wild fluctuations. Oppenheim and Schafer then show that Bartlett’s procedure [10, 23] of averaging periodograms of independent data sequences results in a consistent estimate and, if K periodograms are averaged, the resulting variance is decreased by a factor of K. In this case, the PSD estimate is evaluated as

(1.85) images

Oppenheim and Schafer also discuss the application of windows to the periodograms and Welch [17] describes a procedure involving the averaging of modified periodograms.
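A minimal sketch of this averaging procedure, assuming NumPy; K nonoverlapping length‐N segments are transformed and their periodograms averaged (Welch's modified method would additionally window and overlap the segments):

    import numpy as np

    def bartlett_psd(x, N):
        """Average the periodograms of nonoverlapping length-N segments of x (Bartlett's method)."""
        K = len(x) // N                                 # number of segments averaged
        segs = np.reshape(x[: K * N], (K, N))
        P = np.abs(np.fft.fft(segs, axis=1)) ** 2 / N   # periodogram of each segment
        return P.mean(axis=0)                           # variance reduced by roughly 1/K

    rng = np.random.default_rng(0)
    S = bartlett_psd(rng.standard_normal(8192), N=256)  # ~flat estimate for white noise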

1.2.9 Fourier Transform Properties

The following Fourier transform properties are based on the transform pairs images and images where x(t) and y(t) may be real or complex.

1.2.9.1 Linearity

(1.86) images

1.2.9.2 Translation

(1.87) images

and

(1.88) images

1.2.9.3 Conjugation

(1.89) images

and

(1.90) images

1.2.9.4 Differentiation

With images and images then

(1.91) images

and

(1.92) images

1.2.9.5 Integration

Defining images and images then

(1.93) images

and

(1.94) images

1.2.10 Fourier Transform Relationships

The following Fourier transform relationships are based on the transform pairs images and images where x(t) and y(t) may be real or complex.

1.2.10.1 Convolution

Defining the Fourier transforms images and images then

(1.95) images

and

(1.96) images

1.2.10.2 Integral of Product (Parseval’s Theorem)

(1.97) images

Letting y(t) = x(t) results in Parseval’s Theorem that equates the signal energy in the time and frequency domains as

(1.98) images

1.2.11 Summary of Some Fourier Transform Pairs

Some often used transform relationships are listed in Table 1.3.

TABLE 1.3 Fourier Transforms for f(t) ⇔ F(f)

Waveform f(t) Spectrum F(f)
1 δ(f)
f(tτ) F(f)exp(−j2πfτ)
images images
δ(t) images
δ(tτ) exp(−j2πfτ)
f(at) (1/a)F(f/a)
images images
cos(2πfot) images
sin(2π fot) images
images images
images (j2πf)nF(f)
images images
images images
images images
images images
f(t) = x(t) y(t) X(f)*Y(f) = images
f(t) = x(t)*y(t) F(f) = X(f) Y(f)
images images
images U(f)exp(−j2πfτ)
images a images
images images
images Tsinc(fT)b
sinc(2t/T) images

*Denotes convolution.

aThe signum function sgn(x) is also denoted as signum(x).

bWoodward [24].

1.3 PULSE DISTORTION WITH IDEAL FILTER MODELS

In this section the distortion is examined for an isolated baseband pulse after passing through an ideal filter with uniquely prescribed amplitude and phase responses. In radar applications the isolated pulse response leads to a loss in range resolution; however, in communication applications, where the pulse is representative of a contiguous sequence of information‐modulated symbols, the pulse distortion leads to intersymbol interference (ISI) that degrades the information exchange. The following two examples use the baseband pulse, or symbol, as characterized in the time and frequency domains by the familiar functions

(1.99) images

1.3.1 Ideal Amplitude and Zero Phase Filter

In this example, the filter is characterized in the frequency domain as having a constant unit amplitude over the bandwidth |f| ≤ B with zero amplitude otherwise and a zero phase function. Using the previous notation the filter is characterized in the frequency and time domains as

(1.100) images

The frequency characteristics of the signal and filter are shown in Figure 1.11.


FIGURE 1.11 Ideal signal and filter spectrums.

The easiest way to evaluate the filter response to a pulse input signal is by convolving the functions as

The rect(•) function determines the integration limits with the upper and lower limits evaluated for τ when the argument equals ±½, respectively. This evaluation leads to the integration

Equation (1.102) is evaluated in terms of the sine integral [25]

(1.103) images

resulting in the filter output g(t) expressed as

Defining the normalized variable y = t/T and the parameter ρ = BT, Equation (1.104) is expressed as

Equation (1.105) is plotted in Figure 1.12 for several values of the time‐bandwidth (BT) parameter. Range resolution is proportional to bandwidth and the increased rise time or smearing of the pulse edges with decreasing bandwidth is evident. The ISI that degrades the performance of a communication system results from the symbol energy that spills into adjacent symbols due to the filtering.


FIGURE 1.12 Ideal band‐limited pulse response (constant‐amplitude, zero‐phase filter).

This analysis considers only the pulse distortion caused by the ideal band‐limited, constant‐amplitude filter response and, as will be seen in the following section, filter amplitude ripple and nonlinear phase functions result in additional signal distortion. If the filter were to exhibit a linear phase function ϕ(f) = −2πfTo where To represents a constant time delay, then, referring to Table 1.3, the output is simply delayed by To without any additional distortion. If To is sufficiently large, the filter can be viewed as a causal filter, that is, no output is produced before the input signal is applied.
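A numerical sketch of the band‐limited pulse response in terms of the sine integral, assuming NumPy and SciPy (scipy.special.sici returns Si(x) and Ci(x)); the closed form used is the convolution of rect(t/T) with the filter impulse response 2Bsinc(2Bt):

    import numpy as np
    from scipy.special import sici

    def bandlimited_pulse(t, T, B):
        """Response of the constant-amplitude, zero-phase filter of bandwidth B to rect(t/T):
        g(t) = (1/pi)*[Si(2*pi*B*(t + T/2)) - Si(2*pi*B*(t - T/2))]."""
        si_hi, _ = sici(2.0 * np.pi * B * (t + T / 2.0))
        si_lo, _ = sici(2.0 * np.pi * B * (t - T / 2.0))
        return (si_hi - si_lo) / np.pi

    T = 1.0
    t = np.linspace(-2.0 * T, 2.0 * T, 801)
    responses = {BT: bandlimited_pulse(t, T, B=BT / T) for BT in (0.5, 1.0, 2.0, 4.0)}  # cf. Figure 1.12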

1.3.2 Nonideal Amplitude and Phase Filters: Paired Echo Analysis

In this section the pulse distortion caused by a filter with prescribed amplitude and phase functions is examined using the analysis technique of paired echoes [26]. A practical application of paired echo analysis occurred when a modem production line was stopped at considerable expense due to noncompliance with the bit‐error test by a few tenths of a decibel. The required confidence level of the bit‐error performance under various IF filter conditions precluded the use of Monte Carlo simulations; however, much to the pleasure of management, the paired echo analysis was successfully applied to identify the cause of the subtle filter distortion losses.

Consider a filter with amplitude and phase functions expressed as

where the amplitude and phase fluctuations with frequency are expressed as

and

The parameters a and τa represent the amplitude and period of the amplitude ripple and b and τb represent the amplitude and period of the phase ripple. Using these functions in (1.106) and separating the constant delay term involving To, results in the filter function

Equation (1.109) is simplified by using the trigonometric identity

and the Bessel function identity [27]

In arriving at the last expression in (1.111), the following identities were used

(1.112) images

Upon substituting (1.110) and (1.111) into (1.109), and performing the multiplications to obtain additive terms representing unique delays results in the filter frequency response

Upon performing the inverse Fourier transform of each term in (1.113), the filter impulse response, h(t), becomes a summation of weighted and delayed sinc(x) functions of the form 2BKsinc(2B(tTd)) where K and Td are the amplitude and delay associated with each of the terms. Performing the convolution indicated by the first equality in (1.101), that is, for an arbitrary signal s(t), the ideally filtered response g(t) is expressed as

(1.114) images

When g(t) is passed through the filter H(f) with amplitude and phase described, respectively, by (1.107) and (1.108), the distorted output go(t) is evaluated as

(1.115) images

If the input signal is described by the rect(t/T) function, then g(t) is the response expressed by (1.104) and depicted in Figure 1.12. The distortion terms appear as paired echoes of the filtered input signal and Figure 1.13 shows the relative delay and amplitude of each echo of the filtered output g(t). For b << 1 the approximations J0(b) = 1.0 and J1(b) = b/2 apply and when a = b = 0 the filter response is simply the delayed but undistorted replica of the input signal, that is, go(t) = g(tTo). More complex filter amplitude and phase distortion functions can be synthesized by applying Fourier series expansions that yield paired echoes that can be viewed as noisy interference terms that degrade the system performance; however, the analysis soon becomes unwieldy so computer simulation of the echo amplitudes and delays must be undertaken.


FIGURE 1.13 Location of amplitude and phase distortion paired echoes relative to delay To.

1.3.3 Example of Delay Distortion Loss Using Paired Echoes

The evaluation of the signal‐to‐interference ratio resulting from the delay distortion of a filter is examined using paired echo analysis. The objective is to examine the distortion resulting from a specification of the filter’s peak phase error and group delay within the filter bandwidth. The filter phase response is characterized as

(1.116) images

where To is the filter delay resulting from the linear phase term, ϕo is the peak phase deviation from linearity over the filter bandwidth, and τ is the period of the sinusoidal phase distortion function. The linear phase term introduces the filter delay To that does not result in signal distortion; however, the sinusoidal phase term does cause signal distortion. In this example, the phase deviation over the filter bandwidth is specified parametrically as ϕo(deg) = 3 and 7°. The parameter τ is chosen to satisfy the peak delay distortion defined as

where ϕo is in radians. The peak delay, evaluated for f = 0, is specified as Td = 34 and 100 ns and, using (1.117), the period of the sinusoidal phase function, τ = Td/ϕo, is tabulated in Table 1.4 for the corresponding peak phase errors and peak delay specification. Practical maximum limits of the group delay normalized by the symbol rate, Rs, are also specified.

TABLE 1.4 Values of τ for the Phase and Delay Specifications

ϕo(deg) Td(ns) τ(ns) Tg/Rsa
3 34 649 ±0.15
7 100 818

aNormalized group delay over filter bandwidth.

Considering an ideal unit gain filter with amplitude response of A(ω) = 1, the filter transfer function is expressed as

Upon taking the inverse Fourier transform of (1.118), the filter impulse response is evaluated as

(1.119) images

The parameter τ determines the delay spread of all the interfering terms; however, for small arguments the interference is dominated by the J1(ϕo) term and the signal‐to‐interference ratio is defined as

(1.120) images

For ϕo(deg) = 3 and 7°, the respective signal‐to‐interference ratios are 32 and 24.3 dB and under these conditions, a 10 dB filter input signal‐to‐noise ratio results in the output signal‐to‐noise ratio degraded by 0.02 and 0.17 dB, respectively.
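A numerical check of these figures, assuming SciPy's Bessel functions and assuming the dominant‐echo signal‐to‐interference ratio takes the form [J0(ϕo)/J1(ϕo)]², which reproduces the quoted values (the elided (1.120) may be stated differently):

    import numpy as np
    from scipy.special import j0, j1

    for phi_deg, Td_ns in ((3.0, 34.0), (7.0, 100.0)):
        phi = np.deg2rad(phi_deg)                       # peak phase deviation (rad)
        tau_ns = Td_ns / phi                            # Table 1.4: ~649 and ~818 ns
        si_db = 20.0 * np.log10(j0(phi) / j1(phi))      # ~32 and ~24.3 dB
        print(phi_deg, round(tau_ns), round(si_db, 1))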

1.4 CORRELATION PROCESSING

Signal correlation is an important aspect of signal processing that is used to characterize various channel temporal and spectral properties, for example, multipath delay and frequency dispersion profiles. The correlation can be performed as a time‐averaged autocorrelation or a time‐averaged cross‐correlation between two different signals. Frequency‐domain autocorrelation and cross‐correlation are performed using frequency offsets rather than time delays. The Doppler and multipath profiles are characteristics of the channel that are typically based on correlations involving statistical expectations as opposed to time‐averaged correlations that are applied to deterministic signal waveforms and linear time‐invariant channels. The following discussion focuses on the correlation of deterministic waveforms and linear time‐invariant channels.

The autocorrelation of the complex signal images is defined as11

(1.121) images

The autocorrelation function implicitly contains the mean value of the signal and the autocovariance is evaluated, by removing the mean value, as

(1.122) images

where images is the complex mean of the signal images. The cross‐correlation of the complex signals images and (t) is defined as

(1.123) images

Similarly, the corresponding cross‐covariance is evaluated as

(1.124) images

The properties of various correlation functions applied to complex and real valued functions are summarized in Table 1.5. The properties of correlation functions are also discussed in Section 1.5.9 in the context of stochastic processes.

TABLE 1.5 Properties of Correlation Functions

Property Comments
images
images
images x : real
images x : real
images
images
images x,y : real
images images
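A small time‐averaged correlation sketch, assuming NumPy and assuming the convention R(τ) = ⟨x(t + τ)x*(t)⟩ for the finite‐record estimates (the elided definitions may use the conjugate convention):

    import numpy as np

    def autocorr(x, lag):
        """Time-averaged autocorrelation estimate R(lag) = <x(t+lag) x*(t)> of a complex sequence."""
        if lag < 0:
            return np.conj(autocorr(x, -lag))
        return np.mean(x[lag:] * np.conj(x[: len(x) - lag]))

    def autocov(x, lag):
        """Autocovariance estimate: remove the complex mean before correlating."""
        return autocorr(x - np.mean(x), lag)

    rng = np.random.default_rng(1)
    x = rng.standard_normal(4096) + 1j * rng.standard_normal(4096)
    print(autocorr(x, 0).real)                           # ~2.0: power of the complex noise sequence
    print(abs(autocorr(x, 5)))                           # ~0: white noise decorrelates for lag != 0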

1.5 RANDOM VARIABLES AND PROBABILITY

This section contains a brief introduction to random variables and probability [6, 8, 28–30]. A random variable is described in the context of Figure 1.14 in which an event χ in the space S is mapped to the real number x characterized as X(χ) = x or f(x) : xaxxb. The function X(χ) is defined as a random variable which assigns the real number x or f(x) to each event χS.12 The limits [xa, xb] of the mapping are dependent upon the physical nature or definition of the event space. The second depiction shown in Figure 1.14 comprises disjoint, or nonintersecting, subspaces, such that, for i ≠ j the intersection Si ∩ Sj = Ø is the null space. Each subspace possesses a unique mapping x|Sj conditioned on the subspace Sj : j = 1, …, J. The union of subspaces is denoted as Si ∪ Sj. This is an important distinction since each subspace can be analyzed in a manner similar to the mapping of χS. The three basic forms of the random variable X are continuous, discrete, and a mixture of continuous and discrete random variables as distinguished in the following sections.


FIGURE 1.14 Mapping of random variable X(χ) on the real line x.

1.5.1 Probability and Cumulative Distribution and Probability Density Functions

The mathematical description [6, 8, 24, 28, 30–32] of the random variable X resulting from the mapping X(χ) given the random event χS is based on the statistical properties of the random event characterized by the probability P({X ≤ x}) where {X ≤ x} denotes the set of all events χ in S for which X(χ) ≤ x. For continuous random variables P(X = x) = 0. The probability function P(Xi ∈ Si) satisfies the following axioms:

  1. A1. P(X(χ)∈S) ≥ 0
  2. A2. P({X(χ)∈S}) = 1
  3. A3. If Si ∩ Sj = Ø ∀ i ≠ j then images

Axiom A3 applies for infinite event spaces by letting J = ∞. Several corollaries resulting from these axioms are as follows:

  1. C1. P(χc) = 1 − P(χ) where χc is the complement of χ such that χc ∩ χ = Ø
  2. C2. P(χ) ≤ 1
  3. C3. P(χi ∪ χj) = P(χi) + P(χj) − P(χi ∩ χj)
  4. C4. P(Ø) = 0

The cumulative distribution function (cdf) of the variable X is defined in terms of the value of x on the real line as

where FX(x) has the following properties:

  1. P1. 0 ≤ FX(x) ≤ 1
  2. P2. In the limit as x approaches ∞, FX(x) = 1
  3. P3. In the limit as x approaches −∞, FX(x) = 0
  4. P4. FX(x) is a nondecreasing function of x
  5. P5. In the limit as ε approaches 0, FX(xi) = FX(xi + ε)
  6. P6. The probability in the interval xi < x ≤ xj is: P(xi < x ≤ xj) = FX(xj) − FX(xi)
  7. P7. In the limit as ε approaches 0, the probability of the event xi is P(xi − ε < x ≤ xi) = FX(xi) − FX(xi − ε).

Property P5 is referred to as being continuous from the right and is particularly important with discrete random variables, in that, FX(xi) includes a discrete random variable at xi. Property P7, for a continuous random variable, states that P(xi) = 0; however, for a discrete random variable, P(xi) = pX(xi) where pX(xi) is the probability mass function (pmf) defined in Section 1.5.1.2.

The probability density function13 (pdf) of X is defined as

The pdf is frequently used to characterize a random variable because, compared to the cdf, it is easier to describe and visualize the characteristics of the random variable.

1.5.1.1 Continuous Random Variables

A random variable is continuous if the cdf is continuous so that FX(x) can be expressed by the integral of the pdf. The mapping in Figure 1.14 results in the continuous real variable x. From (1.125) and (1.126) it follows that

(1.127) images

A frequently encountered and simple example of a continuous random variable is characterized by the uniformly distributed pdf shown in Figure 1.15 with the corresponding cdf and probability function.

3 Graphs of x vs. fX(x) for pdf (left), x vs. FX(x) for cdf (middle), and X vs. P(X≤ x) for probability (right) with rectangular-shaped plot, diagonal lines, and dashed lines.

FIGURE 1.15 Uniformly distributed continuous random variable.

From property P7, the probability of X = xi is evaluated as

However, for continuous random variables, the limit in (1.128) is equal to FX(xi) so P(X = xi) = 0; this event is handled as described in Section 1.5.2.

1.5.1.2 Discrete Random Variables

The probability mass function [8, 28, 29] (pmf) of the discrete random variable X is defined in terms of the discrete probabilities on the real line as

(1.129) images

The corresponding cdf is expressed as

where u(x − xi) is the unit‐step function occurring at x = xi and is defined as

Using (1.126), and recognizing that the derivative of u(x − xi) is the delta function δ(x − xi), the pdf of the discrete random variable is expressed as

(1.132) images

The pmf pX(xi) results in a weighted delta function and, from (1.130), (1.131), and property P2, the summation must satisfy the condition images.

The pdf, cdf, and the corresponding probability for the discrete random variable corresponding to binary data {0,1} with pmf functions pX(0) = 1/3 and pX(1) = 2/3 are shown in Figure 1.16. The importance of property P5 is evident in Figure 1.16, in that, the delta function at x = 1 is included in the cdf resulting in P(X ≤ 1) = 1. Regarding property P7, the limit in (1.128) approaches X = xi from the left, corresponding to the base of the discontinuity, so that P(X = xi) = pX(xi).
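As a numerical illustration of the discrete cdf built from (1.130) and (1.131), a minimal Python sketch (the pmf values are those of Figure 1.16; the helper name cdf is illustrative) sums the weighted unit steps and shows the right continuity of property P5:

```python
import numpy as np

# pmf of the binary random variable in Figure 1.16
support = np.array([0.0, 1.0])
pmf = np.array([1/3, 2/3])

def cdf(x, support=support, pmf=pmf):
    """F_X(x) = sum of the pmf weights for all x_i <= x (continuous from the right)."""
    return pmf[support <= x].sum()

print(cdf(-0.5))   # 0.0
print(cdf(0.0))    # 1/3: the step at x = 0 is included (property P5)
print(cdf(1.0))    # 1.0: the step at x = 1 is included, so P(X <= 1) = 1
```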

3 Graphs of x vs. fX(xi) for pdf with upward arrows labeled 1/3 and 2/3 (left), x vs. FX(x) for cdf with upward arrow labeled 1/3 (middle), and x vs. P(X≤ x) for  probability  with diagonal lines in a box labeled 1/3 (right).

FIGURE 1.16 Discrete binary random variables.

1.5.1.3 Mixed Random Variables

Mixed random variables are composed of continuous and discrete random variables and the following example is a combination of the continuous and discrete random variables in the examples of Sections 1.5.1.1 and 1.5.1.2. The major consideration in this case is the determination of the event pmf for the continuous (C) and discrete (D) random variables to satisfy property P2. Considering equal pmfs, such that, pX(S = C) = pX(S = D) = 1/2, the pdf, cdf, and probability are depicted in Figure 1.17.

3 Graphs of x vs. fX(x) for pdf with upwards arrows labeled 1/6 and 1/3 (left), x vs. FX(x) for cdf with upward arrows labeled 1/6 and 2/3 (middle), and X vs. P(X≤ x) for probability with diagonal lines in a box (right).

FIGURE 1.17 Mixed random variables.

1.5.2 Definitions and Fundamental Relationships for Continuous Random Variables

For the continuous random variables X, such that the events X(χj) ∈ Si, the joint cdf is determined by integrating the joint pdf expressed as

(1.133) images

and, provided that images is continuous and exists, it follows that

(1.134) images

The probability function is then evaluated by integrating xi over the appropriate regions xi1 < ri ≤ xi2: i = 1, …, N with the result

(1.135) images

1.5.2.1 Marginal pdf of Continuous Random Variables

The marginal pdf is determined by integrating over the entire region of all the random variables except for the desired marginal pdf. For example, the marginal pdf for x1 is evaluated as (see Problem 17)

The random variables Xi are independent iff the joint cdf can be expressed as the product of the individual cdfs, that is

(1.137) images

In addition, if the Xi are jointly continuous, the random variables are independent if the joint pdf can be expressed as the product of each pdf as

(1.138) images

Therefore, the joint pdf of independent random variables is the same as the product of each marginal pdf computed sequentially as in (1.136).

The joint cdf of two continuous random variables is defined as

with the following properties,

(1.140) images

and the joint pdf is defined as

(1.141) images

with the following properties,

(1.142) images

1.5.2.2 Conditional pdf and cdf of Continuous Random Variables

The conditional pdf is expressed as

and the conditional cdf is evaluated as

(1.144) images

A basic rule for removing random variables from the left and right side of the conditional symbol ( | ) is given by Papoulis [33]. To remove random variables from the left side simply integrate each variable xj from −∞ to ∞: j ≤ i. To remove random variables from the right side, for example, xj and xk: i + 1 ≤ j, k ≤ n, multiply by the conditional pdfs of xj and xk with respect to the remaining variables and integrate xj and xk from −∞ to ∞. For example, referring to (1.143) and considering fX1(x1|x2,x3,x4), eliminating the random variables x3 and x4 from the right side is evaluated as

(1.145) images

The conditional probability of Y ∈ S1 given X(χ) = x is expressed as

Since P(X = x) = 0 for the continuous random variable X, (1.146) is undefined; however, if X and Y are jointly continuous with continuous joint cdfs, as defined in (1.139), then the conditional cdf of Y, given X, is defined as

and differentiating (1.147) with respect to y results in

(1.148) images

If fX(x) ≠ 0, the conditional cdf of y, given X = x, is expressed as [34]

and the corresponding conditional pdf is evaluated by differentiating (1.149) with respect to y and is expressed as

If X and Y are independent random variables then images and (1.147) and (1.150) become images and images.

Upon rearranging (1.150), the joint pdf of X and Y is expressed as

(1.151) images

Considering the probability space S1 = SY|X ∪ SX, such that images Ø, the probability P(Y ∈ SX) is determined by the total probability law defined as

(1.152) images

In this case, the subspace SX can be examined as if it were a total probability space obeying the axioms, corollaries, and properties stated earlier.

1.5.2.3 Expectations of Continuous Random Variables

In general, the k‐th moment of the random variable X is defined as the expectation

and the k‐th central moments are defined as the expectation

(1.154) images

The mean value mx of X is defined as the expectation

The second central moment of X is evaluated as

(1.156) images

where Var[x] is the variance of x. An efficient approach in evaluating the k‐th moments of a random variable, without performing the integration in (1.153) or (1.155), is based on the moment theorem as expressed by the moment generating function (1.241) in Section 1.5.6.

The expectation of the function g(x) is evaluated as

and the expectation of the function g(X,Y) of two continuous random variables is

(1.158) images

The expectation is distributive over summation so that

(1.159) images

and

The following relationships between X and Y apply under the indicated conditions:

From (1.160) and (1.161) it is seen that if X and Y are uncorrelated random variables they are also orthogonal random variables if the mean of either X or Y is zero. The following example demonstrates that if two jointly Gaussian distributed random variables are uncorrelated they are also independent.

The conditional expectation of X given Y is defined as

However, if Y is a random variable the function g2(Y) = E(X|Y) is also a random variable and, using (1.157), the expectation (1.162) becomes

(1.163) images

Papoulis [35] establishes the basic theorem for the conditional expectation of the function g(X,Y) conditioned on X = x, expressed as the random variable E[g(X,Y)|X = x]. The theorem is:

(1.164) images

with the corollary relationship

Papoulis refers to (1.165) as a powerful formula.

The Bivariate Distribution—An Example of Conditional Distributions

Consider that x1 and x2 are Gaussian random variables with means m1, m2 and variances σ1², σ2², respectively, with the joint pdf expressed as [36]

where ρ is the correlation coefficient, such that, |ρ| ≤ 1, expressed as

Using (1.150), the distribution of x1 conditioned on x2 is expressed as

If x1 and x2 are uncorrelated random variables then E[x1x2] = E[x1]E[x2] and, from (1.167), the correlation coefficient is zero and (1.168) reduces to the Gaussian distribution of x1 with images. Therefore, two uncorrelated jointly Gaussian distributed random variables are independent and, when either mean is zero, also orthogonal.

Referring to (1.165), the first and second conditional moments of the second equality in (1.168) are evaluated using E[g1(X1)g2(X2)] and images, respectively, with images and images. In the evaluation, the conditional mean of the Gaussian distribution is established from (1.168) by observation as

and the desired result is evaluated as

(1.170) images

where images and images. The evaluation of images is left as an exercise in Problem 12. The evaluation of (1.169) could have been performed using the integration in (1.155); however, it is significantly easier and less prone to error to simply associate the required parameters with the known form of the conditional Gaussian distribution as indicated in (1.168).

With zero‐mean random variables X1 and X2, that is, when m1 = m2 = 0, the second equality in (1.168) results in (see Papoulis [37])

(1.171) images

and

The time correlated zero‐mean, equal‐variance Gaussian random variables denoted as xi and xi−1 taken at ti = ti−1 + Δt are characterized, using the last equality in (1.168), as

Equation (1.173) is used to model Gaussian fading channels with the fade duration dependent on Δt and ρ and the fade depth dependent on σ1.
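As a hedged illustration of this fading model, the conditional density in (1.173) can be sampled recursively to generate a time-correlated Gaussian sequence; the sketch below assumes zero-mean, equal-variance samples with one-lag correlation ρ, so that xi given xi−1 is Gaussian with mean ρxi−1 and variance σ²(1 − ρ²). The function name and parameter values are illustrative:

```python
import numpy as np

def correlated_gaussian_sequence(n, rho, sigma, rng=None):
    """Zero-mean Gaussian samples x_i with corr(x_i, x_{i-1}) = rho and variance sigma^2.
    Each sample is drawn from the conditional density of the form of (1.173):
    x_i | x_{i-1} ~ N(rho * x_{i-1}, sigma^2 * (1 - rho^2))."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.empty(n)
    x[0] = rng.normal(0.0, sigma)
    cond_std = sigma * np.sqrt(1.0 - rho**2)
    for i in range(1, n):
        x[i] = rng.normal(rho * x[i - 1], cond_std)
    return x

x = correlated_gaussian_sequence(100_000, rho=0.95, sigma=1.0)
print(np.corrcoef(x[:-1], x[1:])[0, 1])  # close to 0.95
```

Longer fade durations correspond to ρ near one (small Δt), while the fade depth scales with σ, consistent with the dependence noted above.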

1.5.3 Definitions and Fundamental Relationships for Discrete Random Variables

In the following relationships, xi, yi, x, and y are considered to be discrete random variables corresponding to the event probabilities PX(xi), PY(yi), PX(x), and PY(y) with the corresponding pmfs pZ(z) = PZ(Z = z) : Z = {X,Y}, z = {xi, yi, x, y} corresponding to the amplitude of the discrete delta functions. In general, the characterization of discrete random variables is similar to that of continuous random variables with the integrations replaced by summations and the pdf replaced with the pmf.

1.5.3.1 Statistical Independence

If X(χi) = xi with χi ∈ S and the events χi are independent ∀ i, then the joint probabilities are expressed as the product

(1.174) images

or, in terms of the pmf, pX(xi) = P(X = xi)

If S = S1 ∪ S2 such that X(χi) = xi with χi ∈ S1, Y(χj) = yj with χj ∈ S2, and the individual pmfs satisfy (1.175), then

Therefore, if the joint pmfs are independent, X and Y are also independent and, from the last equality in (1.176), S1 and S2 are also independent. Consequently, {X,Y} are independent iff the pmfs of X and Y can be expressed in the product form as in (1.175).

The expectation of x is evaluated as

(1.177) images

For the discrete sampled function g(X,Y), the expectation value is evaluated as

(1.178) images

where the pmf is expressed as images.

1.5.3.2 Conditional Probability

The conditional probability of X given Y = yj is expressed as

and, in terms of the conditional pmfs, (1.179) becomes

The pmf behaves like the pdf of continuous random variables, in that, if the event X(χi) = xi with χi ∈ S1, the probability of X ∈ S1 given Y = yj is evaluated as

(1.181) images

If X and Y are independent (1.180) becomes

(1.182) images

1.5.3.3 Bayes Rule

Bayes rule is expressed, in terms of the conditional probability, as

(1.183) images

and, in terms of probabilities and pmfs, Bayes rule is expressed as

(1.184) images

The probability state transition diagram is shown in Figure 1.18 for N‐dimensional input and output states xi and yi, respectively. The outputs are completely defined by the conditional, or transition, probabilities P(yj|xi) and the input a priori probabilities P(xi). Upon choosing the state yj, that is, given yj, the a posteriori probability P(xi|yj) is the conditional probability that the input state was xi. Wozencraft and Jacobs (Reference 30, p. 34) point out that, “The effect of the transmission [decision] is to alter the probability of each possible input from its a priori to its a posteriori value.”
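A minimal numerical sketch of (1.183) and (1.184) for the state transition diagram of Figure 1.18 is shown below; the priors P(xi) and transition probabilities P(yj|xi) are hypothetical values chosen only for illustration:

```python
import numpy as np

# Hypothetical a priori probabilities P(x_i) and transition probabilities P(y_j | x_i)
prior = np.array([0.5, 0.3, 0.2])            # P(x_i)
trans = np.array([[0.8, 0.1, 0.1],           # rows index x_i, columns index y_j
                  [0.2, 0.7, 0.1],
                  [0.1, 0.2, 0.7]])

# P(y_j) by the total probability law, then P(x_i | y_j) by Bayes rule
p_y = prior @ trans
posterior = (trans * prior[:, None]) / p_y[None, :]

print(posterior[:, 0])          # a posteriori probabilities of the inputs given y_0
print(posterior.sum(axis=0))    # each column sums to one
```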

Probability state transition diagram illustrating arrows with circles at each point labeled P(x1), P(xi), P(xN), P(yj), etc.

FIGURE 1.18 Probability state transition diagram.

The conditional expectation of X given Y = y is

(1.185) images

where the pmf pX(xi|y) = P(X = xi|y).

1.5.4 Functions of Random Variables

Applications involving random variables that are functions of random variables, that is, z = g(x1, …, xM), require that the density function fZ(z) be determined given images: n = 1, …, M. In the following subsections, the transformation from images to fZ(z) is discussed for the relatively easy case involving functions of one random variable, that is, M = 1. More complicated cases are also discussed involving functions of two random variables and M random variables of the form images. The following descriptions involve continuous random variables and cases involving discrete and mixed random variables are discussed in References 6, 8, 29.

1.5.4.1 Functions of One Random Variable

In the following description, the mapping of the random variable X = x is continuous and FX(x) is differentiable at x as in (1.126), with finite values of fX(x). The transformation from X to Z can be based on the functional relationships z = g(x) or x = h(z) with the requirements that images corresponding to unit areas under each transformation. These transformations correspond, respectively, to

and

Equations (1.186) and (1.187) require the inverse relationship

The function z = h(x) typically has a finite number of solutions xn, corresponding to the roots z = h(x1), h(x2),…, h(xN) of the transformation and, under these conditions, the solution to fZ(z) given fX(xn) is determined using the fundamental theorem [38, 39],

where h(zn) corresponds to the transformation of xn expressed in terms of zn and images.

As an example, consider a sinusoidal signal z, with constant amplitude a and random phase φ uniformly distributed between ±π, expressed as

Referring to Figure 1.19, and noting that images, the problem is to determine the pdf fZ(z) using the two roots of images and images. Using (1.190), images is evaluated as

(1.191) images

and

(1.192) images
Graph of random variable x = asin(φ) (fФ(φ) = 1/(2π)) displaying wave and rectangular plot.

FIGURE 1.19 Random variable x = asin(φ) (fΦ(φ) = 1/(2π)).

Therefore, evaluating (1.189) with fΦ(φ) = 1/(2π) results in
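The resulting density, given as (1.193) and plotted later in Figure 1.24 for A = 1, can be checked numerically; the sketch below assumes the standard closed form of this result, fZ(z) = 1/(π√(a² − z²)) for |z| < a, which is consistent with the two-root evaluation above, and compares it with a histogram of z = a sin(φ):

```python
import numpy as np

rng = np.random.default_rng(1)
a = 1.0
phi = rng.uniform(-np.pi, np.pi, 1_000_000)   # uniformly distributed phase
z = a * np.sin(phi)

# Compare an empirical histogram with f_Z(z) = 1 / (pi * sqrt(a^2 - z^2)), |z| < a
edges = np.linspace(-0.99 * a, 0.99 * a, 41)
hist, edges = np.histogram(z, bins=edges, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
analytic = 1.0 / (np.pi * np.sqrt(a**2 - centers**2))
print(np.max(np.abs(hist - analytic)))  # small, except near the endpoints |z| -> a
```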

1.5.4.2 Functions of Two or More Random Variables

The concepts involving a function of one random variable can also be applied when the random variable Z is a function of several random variables; for example, the dependence on two random variables, such that, z = g(x,y) is discussed at length by Papoulis (Reference 8, Chapters 6 and 7) where the subjects involving marginal distributions, joint density functions, probability masses, conditional distributions and densities, and independence are introduced. According to (1.126), the probability density function fZ(z) is determined from the distribution function FZ(z) as

(1.194) images

and the joint pdf of X and Y is characterized for continuous distributions as

where the joint cdf is given by

(1.196) images

Based on the conditions for the equality of the probabilities, that is,

images

the pdfs are equated as

Differentiating (1.197) with respect to z yields the desired result expressed as

As an example application consider the random variable Z = X + Y; Papoulis states that, “This is the most important example of a function involving two random variables.” Upon letting y = z – x and using (1.198) the density function of Z is evaluated as

and, when X and Y are independent, (1.199) is simply the convolution of fX(x) with fY(y). Several examples involving the use of (1.199) are given in Section 1.5.6.1.

Using the joint probability density function of two continuous random variables x and y, as expressed in (1.195), the marginal pdfs fX(x) and fY(y) are obtained by integrating over y and x, respectively, resulting in

(1.200) images

and

(1.201) images

These results can also be generalized to apply to the joint density function of any number of continuous random variables by integrating over each of the undesired variables.

1.5.5 Probability Density Functions

The following two subsections examine the probability density function [40] of the magnitude and phase of a sinusoidal signal with additive noise and the probability density function of the product of two zero‐mean equal‐variance Gaussian distributions. In these cases, the random variables of interest involve functions of two random variables. In Section 1.5.6, the characteristic function is defined and examined for several probability distribution functions demonstrating the central limit theorem with increasing summation of random variables. In Section 1.5.7, many of the probability distributions used in the following chapters are summarized and compared.

1.5.5.1 Distributions of Sinusoidal Signal Magnitude and Phase in Narrowband Additive White Gaussian Noise

This example involves the evaluation of the pdf of the magnitude and phase at the output of a narrowband filter when the input is a sinusoidal signal with uniformly distributed phase and zero‐mean additive white Gaussian noise [41] (AWGN). In this case, the output of the narrowband filter is a narrowband random process. The evaluation involves three random variables: the input signal phase φ and the two independent‐identically distributed (iid) zero‐mean quadrature noise random variables with variance images. The signal plus noise out of the filter is expressed as

where the third equality in (1.202) emphasizes the in‐phase and quadrature functions of the signal and noise terms and, when sampled at t = iTs, represent the random variables xc, nc, xs, and ns. The functional relationships are images and images with nc and ns representing zero‐mean quadrature Gaussian random variables. The signal phase, φ, is uniformly distributed between 0 and 2π. Under these conditions, the quadrature signal and noise components xc and xs are independent random variables14 and the pdfs of xc and xs are expressed as

and

The pdf of the phase is

Using (1.203), (1.204), and (1.205) the joint pdf is expressed as

The evaluation of the joint pdf of the magnitude and phase of the sampled sine‐wave plus noise involves the transformation of variables from (xc,xs) to (r,θ) as depicted in Figure 1.20. The magnitude is described as

and the in‐phase and quadrature components, xc and xs, are described in terms of the angle θ as

Graph illustrating the relationship between transformation variables with diagonal arrow and vertical (xc) and horizontal (xs) dashed line intersecting at the tip of the arrow originating.

FIGURE 1.20 Relationship between transformation variables.

Expressing the phase angle in (1.208) as a function of xc and xs leads to the expressions

and

The Jacobian of the transformation is defined as [6, 8, 28, 29]

and, using the Jacobian, the transformation from (xc,xs) to (r,θ) is expressed as

To evaluate the Jacobian for this transformation, the functions gij(xc,xs) are defined in terms of (1.207), (1.209), and (1.210) as follows:

(1.213) images
(1.214) images

and

(1.215) images

Upon evaluating the partial derivatives in (1.211), the Jacobian is found to be15

and, using (1.208), the functions h1(r,θ) and h2(r,θ) are expressed as

and

Substituting (1.216), (1.217), and (1.218) into (1.212) and applying the independence of xc, xs, and φ, as in (1.206), the pdf of the transformed variables r and θ is expressed as

where r ≥ 0, otherwise the pdf is zero, and θ and φ are uniformly distributed over the range 0 ≤ θ, φ ≤ 2π. The pdf for the magnitude r is determined by computing the marginal distribution MR(r) by integrating over the ranges of θ and φ. Defining ψ = θ − φ, the marginal is evaluated as

(1.220) images

Davenport and Root [42] point out that the integrand of the bracketed integral is periodic in the uniformly distributed phase ψ and can be integrated over the interval 0 to 2π. With this integration range, the bracketed integral is identified as the zero‐order modified Bessel function expressed as [43]

Therefore, upon using (1.221) and performing the integration over φ, the marginal distribution function MR(r) simplifies, at least in notation, to

Equation (1.222) is the Rice distribution or, as referred to throughout this book, the Ricean distribution that, as developed in the foregoing analysis, characterizes the baseband magnitude distribution of a CW signal with narrowband additive white Gaussian noise. The Ricean distribution also characterizes the magnitude distribution of a received signal from a channel with multipath interference; this channel is referred to as a Ricean fading channel. The Ricean distribution becomes the Rayleigh distribution as A → 0 and the Gaussian distribution as A → ∞; the proof of these two limits is the subject of Problems 19 and 20. The Rayleigh distribution characterizes the amplitude distribution of narrowband noise or, in the case of multipath interference, the composite signal magnitude of many random scatter returns without a dominant specular return or signal component. The multipath interference is the subject of Chapter 18. Defining the signal‐to‐noise ratio as γ = A²/(2σ²), (1.222) is expressed as

(1.223) images
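A minimal sketch of the Ricean magnitude pdf follows, written in the standard form fR(r) = (r/σ²)exp(−(r² + A²)/(2σ²))I0(Ar/σ²), which is assumed here to be the form of (1.222); the numerical integration simply checks that the density has unit area and exhibits the Rayleigh (A → 0) and near-Gaussian (large A) limits noted above:

```python
import numpy as np

def ricean_pdf(r, A, sigma):
    """Ricean magnitude pdf of a CW signal of amplitude A in narrowband Gaussian
    noise with per-quadrature variance sigma^2; valid for r >= 0."""
    r = np.asarray(r, dtype=float)
    return (r / sigma**2) * np.exp(-(r**2 + A**2) / (2 * sigma**2)) * np.i0(A * r / sigma**2)

r = np.linspace(0.0, 8.0, 2001)
print(np.trapz(ricean_pdf(r, A=0.0, sigma=1.0), r))  # A -> 0: Rayleigh limit, area ~ 1
print(np.trapz(ricean_pdf(r, A=4.0, sigma=1.0), r))  # large A: near-Gaussian about r = A, area ~ 1
```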

The pdf of the phase function is evaluated by computing the marginal distribution MΘΦ(θ,φ) by integrating over the range of the magnitude r. By completing the square of the exponent in the last equality in (1.219) the integration is performed as

Davenport and Root [44] provide an approximate solution to (1.224), under the condition Acos(θ – φ) >> σ. The approximation is expressed as

(1.225) images

where γ is the signal‐to‐noise ratio defined earlier. An alternate solution, without the earlier restriction, is expressed by Hancock [45], with ψ = θ − φ, as

where P(z) is the probability integral defined in Section 3.5. Hancock’s phase function is used in Section 4.2.1 to characterize the performance of phase‐modulated waveforms.

As γ → 0 in (1.226) the function fΨ(ψ) → 1/2π resulting in the uniform phase pdf. However, for γ greater than about 3, the probability integral is approximated as [26]

Using (1.227), the phase pdf is approximated as

With |ψ| ≅ 0 such that sin²(ψ) ≅ ψ² and defining images (1.228) is approximated as

Equation (1.229) describes a zero‐mean Gaussian phase pdf with the phase variance images. Hancock’s phase function, expressed in (1.226), is plotted in Figure 4.3 for various signal‐to‐noise ratios.
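A Monte Carlo check of this high signal-to-noise ratio approximation is sketched below; it assumes the phase variance in (1.229) is 1/(2γ), and the amplitude, noise level, and sample count are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
A, sigma = 1.0, 0.1                        # gamma = A^2 / (2 sigma^2) = 50 (17 dB)
gamma = A**2 / (2 * sigma**2)

n_c = rng.normal(0.0, sigma, 1_000_000)    # in-phase noise samples
n_s = rng.normal(0.0, sigma, 1_000_000)    # quadrature noise samples
psi = np.angle(A + n_c + 1j * n_s)         # phase error of the signal plus quadrature noise

print(np.var(psi))            # measured phase-error variance
print(1.0 / (2.0 * gamma))    # high-SNR Gaussian approximation
```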

1.5.5.2 Distribution of the Product of Two Independent Gaussian Random Variables

In this section the pdf of the product z = xy of two zero‐mean equal‐variance iid Gaussian random variables X and Y is determined. The solution involves defining an auxiliary random variable w = h(x) = x with z = g(x,y) = xy and evaluating fZ,W(w,z) characterized as

where JX,Y(x,y) is the Jacobian of the transformation evaluated as

Using (1.231) and the joint Gaussian pdf of X and Y, expressed by (1.230), with x = w and y = z/w, the marginal pdf of z is evaluated as

However, since X and Y are independent

and, upon substituting x = w and y = z/w into (1.233), (1.232) is expressed as

where the second equality recognizes that the first equality is symmetrical in w. Letting λ = w²/(2σ²), (1.234) is expressed as

The solution to the integral in (1.235) appears in the table of integrals by Gradshteyn and Ryzhik (Reference 46, p. 340, pair No. 12) as

(1.236) images

where Kv(u) is the modified Bessel function of the second kind of order v. With v = 0 and u = z/σ², (1.235) is evaluated as

The magnitude of z in (1.237) is used because of the even symmetry of fZ(z) with respect to z. The symmetry of fZ(z) results in a zero‐mean value so the variance is evaluated as

The solution to the integral in (1.238) is found in Gradshteyn and Ryzhik (Reference 46, p. 684, Integral No. 16) and the variance of z is evaluated as

where the second equality in (1.239) results from the value of the Gamma function images. In Example 4 of Section 1.5.6.1, the pdf of the summation of N iid random variables with pdfs expressed by (1.237) is examined.
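A simple Monte Carlo check of the zero mean and of the variance implied by (1.239) is sketched below; it assumes that variance equals σ⁴, i.e., E[x²]E[y²] for independent zero-mean factors, and the value of σ is illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
sigma = 2.0
x = rng.normal(0.0, sigma, 2_000_000)
y = rng.normal(0.0, sigma, 2_000_000)
z = x * y                                  # product of two iid zero-mean Gaussians

# Independence gives E[z] = 0 and Var[z] = E[x^2] * E[y^2] = sigma^4
print(np.mean(z))              # ~0
print(np.var(z), sigma**4)     # ~16 versus 16
```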

1.5.6 The Characteristic Function

The characteristic function of the random variable X is defined as the average value of ejvx and is expressed as

With v = −ω and x = t (1.240) is similar to the Fourier transform of a time‐domain function. The characteristic function is also referred to as the moment‐generating function, in that, the nth moment of the random variable X, defined as the expected value E[xn], is evaluated (see Problem 26) as

The Fourier transform relationship between time domain convolution and frequency domain multiplication also applies to the convolution of random variables and the multiplication of the corresponding characteristic functions. Therefore, based on the discussion in Section 1.5.6.1, the summation of N independently distributed (id) random variables corresponds to the product of their individual characteristic functions, that is,

This is a very useful result, in that, the distribution of the summation of N independent random variables is obtained as the inverse transform [47] of (1.242) expressed as

(1.243) images

Campbell and Foster [47] provide an extensive listing of Fourier transform pairs defined as

(1.244) images

and, by defining v = −2πf, the Fourier transform pairs apply to the transform pairs between fX(x) and CX(v) as expressed in (1.240).

1.5.6.1 Summation of Independently Distributed Random Variables

If two random variables X and Y are independent then the probability density fZ(z) of their sum Z = X + Y is determined from the convolution of fX(x) with fY(y) so that16

For multiple summations of a random variable, the convolution is repeated for each random variable in the summation.

Example 1

Consider the summation of N zero‐mean uniformly distributed random variables Xi expressed as

with

(1.247) images

For N = 2 the convolution involves two ranges of the variable z as shown in Figure 1.21 and the integrations are evaluated as

and

Three graphs illustrating convolution of two zero-mean uniform distributions with rectangular-shaped plots for range 1: –2a≤ z ≤0 and range 2: 0≤ z ≤2a.

FIGURE 1.21 Convolution of two zero‐mean uniform distributions.

Upon evaluation of (1.248) and (1.249) and recognizing the symmetry about z = 0, the density function is expressed as

Repeating the application of the convolution for N = 3 and 4 (see Problem 24) results in the probability density functions shown in Figure 1.22 with the corresponding cdf results shown in Figure 1.23. As images the probability density and characteristic functions will approach those of the Gaussian distributed random variable (see Problem 23).
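The successive convolutions of Figures 1.22 and 1.23 can be reproduced numerically; the sketch below (with an illustrative grid spacing dz and a = 1) convolves the uniform pdf N − 1 times and compares the N = 4 result with a Gaussian pdf of variance Na²/3, the second moment listed in Table 1.6:

```python
import numpy as np

a, dz, N = 1.0, 1e-3, 4
z1 = np.arange(-a, a + dz, dz)
f1 = np.full(z1.size, 1.0 / (2 * a))          # uniform pdf on [-a, a]

f = f1.copy()
for _ in range(N - 1):                        # N - 1 successive convolutions
    f = np.convolve(f, f1) * dz

z = np.arange(f.size) * dz - N * a            # support of the N-fold sum is [-Na, Na]
gauss = np.exp(-z**2 / (2 * N * a**2 / 3)) / np.sqrt(2 * np.pi * N * a**2 / 3)
print(np.trapz(f, z))                         # ~1: the result is a valid pdf
print(np.max(np.abs(f - gauss)))              # small for N = 4 (central limit trend)
```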

Graph of z vs. fz(z) illustrating pdf for sum of N = 2 (solid line), 3 (dashed line), and 4 (solid thick line) independent zero-mean uniform distributions (a = 1).

FIGURE 1.22 pdf for sum of N = 2, 3 and 4 independent zero‐mean uniform distributions (a = 1).

Graph of z vs. Fz(z) illustrating cdf for sum of N = 2 (solid line), 3 (dashed line), and 4 (solid thick line) independent zero-mean uniform distributions (a = 1).

FIGURE 1.23 cdf for sum of N = 2, 3 and 4 independent zero‐mean uniform distributions (a = 1).

The moments of the random variable X are evaluated using the characteristic function

In regions where the characteristic function converges, the moments E[xn] completely define the characteristic function and the pdf of the random variable X, so, upon expanding (1.251) as the power series

the moments are easily evaluated using (1.241). The moments for the random variable Z, formed as in (1.246), are determined using (1.242) and, with Xi : i = 1, …, N iid random variables, the characteristic function for Z is approximated as

(1.253) images

The first and second moments for N = 1, …, 4 are listed in Table 1.6. These results are also obtained by evaluating fZ(z) using (1.250) and then evaluating the moments (see Problem 25) as

TABLE 1.6 Moments of fZ(zN) for images, Xi iid Zero‐Mean Uniform Distributions

N E[z] E[z²]
1 0 a²/3
2 0 2a²/3
3 0 a²
4 0 4a²/3

However, it is much easier to use the characteristic function.

Example 2

As another example, consider the summation of N random variables Xi characterized as the sinusoidal function

with constant amplitudes Ai and zero‐mean uniformly distributed phase, expressed as

(1.256) images

The resulting pdf of the random variable Xi for ϕ = π, is evaluated in (1.193) as

and is plotted in Figure 1.24.

Graph of x vs. fx(x) illustrating pdf of x = Asin(φ) with zero-mean uniformly distributed phase, A = 1 and ф = π depicting a u-curved plot.

FIGURE 1.24 pdf of x = Asin(φ) with zero‐mean uniformly distributed phase, A = 1 and ϕ = π.

The pdf of the random variable Z, expressed as in (1.246),17 is evaluated by successive convolutions as in (1.245) and the results for N = 2, 3, and 4 are plotted in Figure 1.25 with the corresponding cdf functions shown in Figure 1.26. The results in Figures 1.25 and 1.26 for N > 1 are obtained by numerical evaluations of the convolutions using incremental values of Δz = 2.5 × 10−5; this is a reasonable compromise between simulation time and fidelity in dealing with the infinite value at |x| = 1.0.

Graph of z vs. fz(z) illustrating pdf of N = 2, 3, and 4 successive convolutions of fX(x) depicting a solid and dashed curves.

FIGURE 1.25 pdf of N = 2, 3, and 4 successive convolutions of fX(x).

Graph of z vs. Fz(z) illustrating cdf of N = 2(solid line), 3 (dashed line), and 4 (solid thick line) successive convolutions of fX(x).

FIGURE 1.26 cdf of N = 2, 3, and 4 successive convolutions of fX(x).

In this case, the mean and variance of the random variable X are evaluated using the characteristic function of (1.257) found in (Reference 47, p. 123, Transform Pair 914.5); the result is

where Io(·) is the modified Bessel function of order zero. Expanding (1.258) for Av < 1 as a power series (Reference 46, p. 375, Ascending Series 9.6.10) results in18

(1.259) images

and the moments are easily evaluated using (1.241). The first and second moments are listed as the theoretical values in Table 1.7. The moments for the random variable Z, formed as in (1.246) with Xi iid random variables for all i as expressed by the pdf in (1.255), are determined using the characteristic function expressed as

(1.260) images

TABLE 1.7 Moments of fZ(zN) for images, Xi iid Random Variables Expressed by (1.255)

N Theoretical Numericala (A = 1)
E[z] E[z²] E[z] E[z²]
1 0 A²/2 0 0.4999
2 0 A² 0 1.0044
3 0 3A²/2 0 1.5055
4 0 2A² 0 2.0146

aNumerical values are sampled with Δz = 2.5 × 10−5.

The corresponding first two moments of the random variable Z for N = 2, 3, and 4 are also listed in Table 1.7. The numerical results listed in Table 1.7 are based on computer evaluations of the various convolutions resulting in the pdfs shown in Figures 1.24 and 1.25.

A major observation in these two examples is that the probability distribution of the random variable Z approaches a Gaussian distribution as N increases (see Problem 27). This is evidence of the central limit theorem which states that (see Davenport and Root, Reference 6, p. 81) the sample mean of the sum of N arbitrarily distributed statistically independent samples becomes normally distributed as N increases. This is referred to as the equal‐components case of the central limit theorem. However, as pointed out by Papoulis (Reference 8, p. 266), a consequence of the central limit theorem is that the distribution fZ(z) of the sum of N statistically independent distributions having arbitrary pdfs tends to a normal distribution as N increases. This is a stronger statement and suggests that the probability P(z) = FZ(Z < z) can be considered a Gaussian distribution for all z as is frequently assumed to be the case in practice. Davenport and Root also point out that, even though N is seemingly large, the tails of the resulting distribution may result in a poor approximation to the Gaussian distribution.

Upon computing the mean and variance using the power series expansion of CZ(v) expressed by (1.252) with av << 1, the approximate expression for the corresponding Gaussian distribution is easily obtained. After summing N uniformly distributed amplitudes the expression for the pdf is

Similarly, for the summation of N sinusoids with Av << 1, the pdf in Example 2 is expressed as

It is interesting to note that the second moments are 2 for all values of N including those for which the pdf does not have the slightest resemblance to the Gaussian pdf. In these cases, the important difference is that the corresponding probabilities P(x) = FX(X < x) are entirely different from those of the Gaussian distribution with the possible exception of the median value. Finally, it is noted that the limiting behavior for λv << 1 and N → ∞ applies to the summation of independently distributed random variables that may, or may not, be identically distributed.

Example 3

This example involves the summation of random chips {±1} in a direct‐sequence spread‐spectrum (DSSS) waveform. In this case, the chips occur with equal probabilities according to the pdf expressed as

(1.263) images

Using (1.240), the characteristic function is evaluated as

(1.264) images

The DSSS waveform uses N chips per bit and the demodulator correlator sums the N chips to form the correlation output images with the corresponding characteristic function given by

(1.265) images

To evaluate the first and second moments of y only the first two terms in the expansion of cosᴺ(v) are required and, upon using (1.241), these moments are evaluated as

(1.266) images

and

(1.267) images

The variance of y is defined as the second central moment images and with zero‐mean the variance is images.
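A short Monte Carlo sketch of this DSSS correlator output follows; it assumes the moments in (1.266) and (1.267) correspond to a zero mean and a variance of N, and the chip and trial counts are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
N, trials = 128, 50_000
chips = rng.choice([-1.0, 1.0], size=(trials, N))   # equiprobable spreading chips
y = chips.sum(axis=1)                               # correlator output per bit

print(np.mean(y))        # ~0: zero-mean correlator output
print(np.var(y), N)      # ~N: variance of the sum of N equiprobable +/-1 chips
```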

Example 4

The pdf fZ(z) of the product of two zero‐mean, equal‐variance, iid Gaussian random variables, z = xy, is expressed in (1.237) as a function of the zero‐order modified Bessel function Ko(|z|/σ²) where the magnitude of z provides for the range: −∞ ≤ z ≤ ∞. In this example, the pdf images is evaluated where images : i = 1, …, N. The evaluation is based on the N‐th power of the characteristic function CZ(v) and, from the work of Campbell and Foster (Reference 47, p. 60, pair No. 558), the characteristic function is evaluated as19

The characteristic function of images is the N‐th power of (1.268) expressed as

(1.269) images

and, using the transform pair of Campbell and Foster (Reference 47, p. 61, pair No. 569.0), the pdf of images is evaluated as

(1.270) images

As in the case for fZ(z), the pdf images applies for −∞ ≤ images ≤ ∞ and is symmetrical with respect to images resulting in a zero‐mean value with the variance expressed as

The solution to the integral in (1.271) is found in Gradshteyn and Ryzhik (Reference 46, p. 684, Integral No. 16) and the variance is evaluated using

Substituting the solution to the integral in (1.272) into (1.271), with u = (N + 3)/2, v = (N − 1)/2, and a = 1/σ², the solution for the variance simplifies to

(1.273) images

In the earlier evaluation, the integer argument Gamma function is related to the factorial as images! and images. This result could also be evaluated using the moment generating function of (1.241); however, it is sometimes easier to evaluate the moments using the integral solution as in (1.272). With a sufficiently large value of N the pdf images is approximated as the Gaussian pdf expressed as

(1.274) images

The probability density functions discussed earlier and others encountered in the following chapters are summarized in Table 1.8 with the corresponding mean values, variances, and characteristic functions.

TABLE 1.8 Probability Distributions and Characteristic Functions

Name fX(x) E[x] Var[x] CX(v) Conditions
Uniform images images images images images
Bernoulli images p p(1 − p) (1 − p) + pejv Discrete binary variable
x = ki : i = 1, 2
Binomial images np np(1 – p) images Discrete variable
images
Poisson images α α images Discrete variable
images
Exponential images 1/α 1/α2 images images
Gaussian (normal) images m σ2 images images
Chi‐square (N = 2)
Exponential (α = 1/2)
images 2 4 images x ≥ 0
Chi‐squared (N‐degrees) images N 2N images N‐degrees of freedom x ≥ 0
Rayleigh images images as γ→∞ images as γ→∞ images x > 0
Ricean images a a a x > 0
images
Gamma images α/β α/β2 images x > 0
β > 0; λ > 0
Lognormal images images images b y is lognormal
y = ex ≥ 0
x = N(m,σ)
Nakagami‐m images c c c x ≥ 0
images

Notes: γ = A²/(2σ²) is the signal‐to‐noise ratio.

aAs γ→0, fX(x) → Rayleigh with E[x] = images, Var[x] = images; as γ→∞, fX(x) → Gaussian with E[x] = A, Var[x] = σ².

bApproximated using a series expansion of ejvy.

cRefer to special cases in Section 1.5.7.2.

1.5.7 Relationships between Distributions

In the following two subsections, the relationship between various probability density functions is examined by straightforward parameter transformations, allowing parameters to approach limits, or simply altering various parameter values. The most notable relationship is based on the central limit theorem in which a distribution approaches the Gaussian distribution as an increasing number of the operative random variables are summed.

1.5.7.1 Relationship between Chi‐Square, Gaussian, Rayleigh, and Ricean Distributions

A random variable has a chi‐square (χ²) distribution with N degrees of freedom if it has the same distribution as the sum of the squares of N independent, normally distributed random variables, each with zero‐mean and unit variance.20

Consider the zero‐mean Gaussian or normal distributed random variable x with variance images and pdf expressed as

The pdf of a new random variable y = x², obtained by simply squaring x, is determined by considering the positive and negative regions of images as shown in Figure 1.27.

Graph illustrating transformation of the random x to y = x2 depicting a u-curved plot.

FIGURE 1.27 Transformation of the random x to y = x2.

The pdf of y is determined using the incremental intervals dy = 2x dx at images such that

The characteristic function of (1.276) is evaluated as

(1.277) images

Consider now the random variable z resulting from the summation of N independent random variables yi such that

(1.278) images

The characteristic function of z is simply the N‐th power of CY(v) so that

Equation (1.279) transforms to the pdf of z, resulting in

In conforming to the earlier definition, the chi‐square distribution is expressed by letting images in (1.280) or, more formally, using the transformation images; therefore, the pdf of the chi‐square random variable χ with N degrees of freedom is

and the corresponding characteristic, or moment generating, function is

(1.282) images

Equation (1.281) is occasionally referred to as the central χ² distribution because it is based on noise only, that is, the underlying zero‐mean Gaussian random variables xi with distribution given by (1.275) do not contain a signal component.21

Special Case for N = 2

Under this special case images (1.280) reduces to the exponential distribution

(1.283) images

So the resulting chi‐square χ² distribution is obtained from (1.281) with N = 2. This is an important case because x1 and x2 can be thought of as orthogonal components in the complex description of a baseband data sample. Urkowitz [48] shows that the energy of a wide‐sense stationary narrowband white noise Gaussian random process with bandwidth –W to W Hz and measured over a finite interval of T seconds is approximated by N = 2WT terms or degrees of freedom. The frequency W is the noise bandwidth of the narrowband baseband filter and the approximation error in the energy measurement decreases with increasing 2WT. The factor of two can be thought of as the computation of complex orthogonal baseband functions images so N = 2 degrees of freedom correspond to WT = 1. For example, the rect(t/T) function observed over the interval T seconds has a noise bandwidth of W = 1/T Hz corresponding to WT = 1 resulting in 2 degrees of freedom.

Upon letting images, the random variable w is described in terms of the Rayleigh distribution

(1.284) images

So the Rayleigh distribution is derived from the magnitude of the quadrature zero‐mean Gaussian distributed random variables, x = N(0,σ).22
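A minimal numerical sketch of this result follows: the magnitude of two iid zero-mean Gaussian quadrature components is compared with the Rayleigh pdf (w/σ²)exp(−w²/(2σ²)), the form assumed here for (1.284); the sample size and σ are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)
sigma = 1.0
x1 = rng.normal(0.0, sigma, 1_000_000)
x2 = rng.normal(0.0, sigma, 1_000_000)
w = np.sqrt(x1**2 + x2**2)                 # magnitude of the two quadrature components

hist, edges = np.histogram(w, bins=100, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
rayleigh = (centers / sigma**2) * np.exp(-centers**2 / (2 * sigma**2))
print(np.max(np.abs(hist - rayleigh)))     # small: w is Rayleigh distributed
```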

1.5.7.2 Relationship between Nakagami‐m, Gaussian, Rayleigh, and Ricean Distributions

The Nakagami‐m distribution [49] was initially derived from experimental data to characterize HF fading; however, subsequent experimental observations demonstrate its application to rapid fading at carrier frequencies from 200 MHz to 4 GHz. It is considered to be a generalized distribution from which other distributions can be derived, for example, m = 1 results in the Rayleigh power distribution, m = ½ results in the one‐sided zero‐mean Gaussian distribution, and as images the m‐distribution approaches the Gaussian distribution with a unit mean value. In the region 1 ≤ m ≤ ∞, the Nakagami‐m distribution behaves much like the Ricean distribution; however, the normalized distributions are subtly different when plotted for various signal‐to‐noise ratios less than about 10 dB. The Ricean distribution, referred to as the n‐distribution by Nakagami, is derived from concepts involving narrowband filtering of a continuous wave (CW) signal with additive Gaussian noise, whereas the Nakagami‐m distribution is derived from experimental data involving multipath communication links.

1.5.8 Order Statistics

Communication systems analysis and performance evaluations often involve a large number of random samples taken from an underlying continuous or discrete probability distribution function. The various parameters, used to characterize the system performance, result in limiting distributions with associated means, variances, and confidence levels as dictated, for example, by an underlying distribution. Order statistics [31, 50, 51], on the other hand, involves a distribution‐free or nonparametric analysis that requires only that the probability distribution functions be continuous and not necessarily related to the underlying distribution from which the samples are taken. However, the randomly drawn samples are considered to be statistically independent.

Consider that the n random samples {X1, X2, …, Xn} are taken from the continuous pdf fX(x) over the range images. Now consider reordering the random variables Xi : i = 1, …, n to form the random variables {Y1, Y2, …, Y n} arranged in ascending order of magnitude, such that, images where images is uniformly distributed over the interval ba. The joint pdf of the ordered samples [52] is expressed as

for images and n! is the number of mutually disjoint sets of x1, x2, …, xn. For example, for n = 4 the set x1, x2, x3, x4 results in n! = 24 mutually disjoint sets determined as shown in Table 1.9. The first six mutually disjoint sets are determined by cyclically left shifting the indicated subsets of the original set x1, x2, x3, x4; a cyclic left shift of a subset is obtained by shifting each element of the subset to the left and replacing the leftmost element in the former position of the rightmost element. Following the first six sets shown in the table, the original set is cyclically shifted three more times, each shift leading to six mutually disjoint sets by shifting subsets, resulting in a total of 24 mutually disjoint sets.

TABLE 1.9 Example of Mutually Disjoint Sets (n = 4, 24 Mutually Disjoint Sets)

No. Mutually Disjoint Sets Shiftinga
1 x1, x2, x3, x4 Original set
2 x1, x2, x4, x3 Shift subset x3, x4
3 x1, x3, x4, x2 Shift subset x2, x3, x4
4 x1, x3, x2, x4 Shift subset x4, x2
5 x1, x4, x2, x3 Shift subset x3, x4, x2
6 x1, x4, x3, x2 Shift subset x2, x3
7 x2, x3, x4, x1 Shift original set
8 x2, x3, x1, x4 Shift subset x4, x1
9 x2, x4, x1, x3 Shift subset x3, x4, x1

aShift denotes a cyclic left shift of a previous set or subset.

The ordered sample Yi is referred to as the i‐th order statistic of the sample set. The marginal pdf of the n‐th order statistic Yn, that is, the maximum of {X1, X2, …, Xn}, is evaluated using (1.285) by performing the integrations in the ascending order i = 1, 2, …, n − 1 as follows23:

The solution (see Problem 15) to (1.286) is

where Fn−1(yn) is the cdf evaluated as

(1.288) images

Using the marginal pdf of Yn given by (1.287), the probability of selecting the maximum value Yn is determined as

These results are distribution free, in that, the pdf has not been defined; however, from a practical point of view (1.289) can be evaluated for any continuous pdf.
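The marginal pdf (1.287) of the maximum is easily checked numerically for a specific continuous distribution; the sketch below uses a standard normal fX(x) purely as an illustrative choice and compares n F^(n−1)(y) fX(y) with a histogram of simulated maxima:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
n, trials = 5, 500_000
y_max = rng.standard_normal((trials, n)).max(axis=1)    # the n-th order statistic per trial

hist, edges = np.histogram(y_max, bins=200, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
pdf_max = n * stats.norm.cdf(centers)**(n - 1) * stats.norm.pdf(centers)  # form of (1.287)
print(np.max(np.abs(hist - pdf_max)))    # small: the histogram matches n F^(n-1) f
```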

The distributions from which the xi are taken need not be identical24; for example, the samples x1 through xj can be taken from a distribution involving signal plus noise (or clutter) and those from xj+1 through xn corresponding to noise (or clutter) only. Using this example the distribution in (1.287) is expressed as

(1.290) images

where Fsn(y) is the distribution corresponding to signal plus noise and Fn(y) is the noise‐only distribution.

Example distributions used to evaluate the performance of communication and radar systems are Gaussian, Ricean, lognormal, and Weibull distributions. Table 1.10 lists the false‐detection probabilities, for the indicated signal‐to‐noise ratios γdB, associated with the detection of j = 1 signal‐plus‐noise event and k = n − j = 1, 2, 4, and 8 noise‐only events.

TABLE 1.10 Order Statistics False‐Detection Probability for Gaussian Distributed Random Variables

Ordered S + N and N Statistics (j,k) False‐Detection Probability (Pfd)
γdB = 10 γdB = 15 γdB = 20
1,1 2.440e−2 6.645e−5 1.408e−12
1,2 4.226e−2 1.308e−4 2.815e−12
1,4 6.980e−2 2.547e−4 5.629e−12
1,8 1.089e−1 4.874e−4 1.126e−11

1.5.9 Properties of Correlation Functions

Correlation processing is used in nearly every aspect of demodulator signal processing, from energy detection and waveform acquisition to waveform tracking, parameter estimation, and information recovery. With this wide range of applications, the theoretical analyst, algorithm developer, software coder, and hardware developer must be thoroughly familiar with the properties and implementation of waveform correlators. An equally important processing function is that of convolution or linear filtering. The equivalence between matched filtering and correlation is established in Section 1.7.2 and involves a time delay in the correlation response; with this understanding, the properties of correlation can be applied to convolution or filtering. The correlation response can be exploited to determine the signal signature regarding the location of a signal in time and frequency, the duration and bandwidth of the signal, the shape of the modulated signal waveform, and the estimate of the information contained in the modulated waveform.

The correlation function25 is evaluated for the complex functions images as the integral

and

where the asterisk denotes complex conjugation.

Autocorrelation processing examines the correlation characteristics of a single random process with the maximum magnitude corresponding to the zero‐lag condition images that is equal to the maximum energy over the correlation interval. The correlation response images is indicative of the shape of images and the duration, τd, of the principal correlation response is indicative of the correlation time. For deterministic signals, the correlation time (τo) is usually characterized in terms of the one‐sided width of the principal correlation lobe; however, for stochastic processes the correlation interval is defined when |images| decreases monotonically from images to a defined level; for example, when the normalized correlation response first reaches the level images. The normalized correlation response is referred to as the correlation coefficient as defined in (1.295) or (1.296). The parameters related to the correlation of the function images have equivalent Fourier transform frequency‐domain definitions. In the case of stochastic processes, the Fourier transform of images is defined as the PSD of the process.

Expanding (1.292) in terms of the real and imaginary parts with images and images results in

(1.293) images

This evaluation requires four real multiplies and integrations for each lag, whereas, if images and y(t) were real functions, only one multiplication and integration would be required for each lag. With discrete‐time sampling, the integrations are replaced by summations over the finite sample values images and n where t = nTs: n = 0, …, N − 1 and Ts is the sampling interval; in this case, the computational complexity is proportional to N². The computational complexity can be significantly reduced by performing the correlation in the frequency domain using the FFT [53], in which case, for a radix‐2 FFT with N = 2ᵏ, the computational complexity is proportional to Nlog2(N). Brigham [54] provides detailed descriptions of the implementation and advantages of FFT correlation and convolution processing. The correlation results throughout the following chapters use the direct and FFT approaches without distinction.
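A minimal sketch of direct versus FFT-based correlation follows; the complex sequences, the lag of 37 samples, and the zero padding to 2N (used to avoid circular wrap-around) are illustrative choices, not values from the text:

```python
import numpy as np

rng = np.random.default_rng(7)
N = 1024
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)
# y is x delayed (circularly) by 37 samples plus a small amount of noise
y = np.roll(x, 37) + 0.1 * (rng.standard_normal(N) + 1j * rng.standard_normal(N))

# Direct cross-correlation (complexity ~ N^2) versus FFT correlation (~ N log2 N)
direct = np.correlate(y, x, mode="full")
X = np.fft.fft(x, 2 * N)
Y = np.fft.fft(y, 2 * N)
fft_corr = np.fft.ifft(Y * np.conj(X))

print(np.argmax(np.abs(direct)) - (N - 1))   # lag estimate, ~37
print(np.argmax(np.abs(fft_corr[:N])))       # same lag from the FFT correlation
```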

Referring to (1.291) the zero‐lag correlation is expressed as

where Ex is the total energy in the received signal. Using (1.294), the normalized correlation is defined in terms of the normalized autocorrelation coefficient as

with |ρx(τ)| ≤ 1. From (1.292), the normalized cross‐correlation coefficient is defined as

with |ρxy(τ)| ≤ 1.

The correlation may also be defined in terms of the long‐term average over the interval T as

(1.297) images

However, most practical waveforms are limited to a finite duration Tc = NTs and, in these cases, images is zero outside of the range Tc. Therefore, dividing the zero‐lag correlation by Tc results in the second‐order moment images where images is the DC or mean signal power. Removing the mean signal level prior to performing the correlation results in the autocovariance with images. Table 1.11 summarizes several properties of correlation functions.

TABLE 1.11 Correlation Function Properties of Deterministic and Stochastic Processes

Property Comments
images Autocorrelation
images x(t) is real
images Autocovariance
images images
images images
images images
images images
images Cross‐covariance
images images
images images

Consider, for example, that images is a received signal plus AWGN; the correlation images is performed in the demodulator using the known reference signal images. The dynamic range of the demodulator detection processing is minimized by the normalization in (1.296) and the optimum signal detection corresponds to ρxy(0). On the other hand, if the optimum timing is not known, near‐optimum detection can be achieved by choosing the maximum correlation output over the uncertainty range of the correlation lag about τ = 0. During initial signal acquisition, the constant false‐alarm rate (CFAR) threshold, described in Section 11.2.2.1, is an effective algorithm for signal presence detection and coarse synchronization.

1.6 RANDOM PROCESSES

Many of the signal descriptions and processing algorithms in the following chapters deal exclusively with the signal and neglect the additive noise under the reasoning that the noise detracts from the fundamental signal processing requirements and complicates the notation which has the same effect. On the other hand, understanding the impact of the noise on the system performance is paramount to the waveform selection and adherence to the system performance specifications. To this end, the performance evaluation is characterized by detailed analysis of the signal‐plus‐noise conditions and confirmed by computer simulations.

The following descriptions of noise and signal plus noise are provided to illustrate the assumptions and analysis associated with the inclusion of the most basic noise source—AWGN. The reference to narrowband Gaussian noise simply means that the carrier frequency fc is much greater than the signal modulation Nyquist bandwidth BN so that the 2fc heterodyning or homodyne mixing terms are completely eliminated through filtering. In such cases, the white noise in the baseband demodulator bandwidth is denoted by the single‐sided noise density No watts/Hz, where single‐sided refers to positive frequencies.

1.6.1 Stochastic Processes

The subject of stochastic processes is discussed in considerable detail by Papoulis [55] and Davenport and Root [56] and the following definitions are often stated or implied in the applications discussed throughout the following chapters. A stochastic process is defined as a random variable that is a function of time and the random events χ in S as depicted in Figure 1.14. In this context the random variable is characterized as x(t,χ). For a fixed value of t = ti, x(ti,χ) is a random variable and, for a fixed χ = χi, x(t,χi) denotes the real random process x(t) such that x(ti) is a random variable with pdf fX(x:ti); in general, the pdf of x(t) is defined as fX(x:t).

1.6.1.1 Stationarity

There are several ways to define the stationarity of a stochastic process, for example, stationarity of finite order, asymptotic stationary, and periodic stationarity; however, the following two are the most frequently encountered.

Strict‐Sense Stationary Process

The stochastic process x(t) is strict‐sense stationary, or simply stationary, if the statistics are unaltered by a shift in the time axis. Furthermore, two random processes are jointly stationary if the joint statistics are unaltered by an equal time shift of each process, that is, the probability density function f(x ; t) is the same for all time shifts τ. This is characterized as

(1.298) images
Wide‐Sense Stationary Process

The stochastic process x(t) is wide‐sense stationary (WSS) if its expected value is constant and its autocorrelation function is a function of the time shift τ = t2 − t1 ∀ t1 and t2. WSS stationarity is characterized as

(1.299) images

and

(1.300) images

Because wide‐sense stationarity depends on only the first and second moments it is also referred to as weak stationarity. A function of two random processes is wide‐sense stationary if each process is wide‐sense stationary and their cross‐correlation function is dependent only on the time shift, that is,

(1.301) images

1.6.1.2 Ergodic Random Process

The random process x(t), defined earlier, is an ergodic random process if the statistics of x(t) are completely defined by the statistics of x(t,χ). Denoting the random process x(ti,χ) as an ensemble of x(t,χ), then ergodicity ensures that the statistics of x(ti) are identical to those of the ensemble; in short, the time statistics are identical to the ensemble statistics.26 Ergodicity of the mean, of the stochastic process x(t,χ), exists under the condition

where the time average is defined as

and the ensemble average is defined as

Since the mean value of a random process must be a constant, the ergodicity‐of‐the‐mean theorem states that the equality condition in (1.302) is satisfied when images where η is a constant. This is a nontrivial task to prove; however, following the discussion by Papoulis [57], the ergodicity‐of‐the‐mean theorem states that

The iff condition in (1.305) is formally expressed in terms of the autocovariance function for which the limit T → ∞ is expressed as the variance images. However, from (1.305), the expectation E[x(t)] = η resulting in images. Therefore, the limit T → ∞ of the autocovariance function converges in probability with the conclusion that images proving ergodicity of the mean.27 Demonstration of ergodicity of the autocorrelation function is considerably more involved, requiring the fourth‐order moments.

1.6.2 Narrowband Gaussian Noise

Consider the noise described by the narrowband process [58] with bandwidth B << fc expressed as

where N(t) and θ(t) represent, respectively, the envelope and phase of the noise and ωc = 2πfc is the angular carrier frequency. Upon expanding the trigonometric functions, (1.306) can also be expressed as

where

(1.308) images

and

(1.309) images

The noise terms nc(t) and ns(t) are uncorrelated with spectrum S(f) and bandwidth B, such that S(f) = 0 for |f − fc| > B/2. This is the general characterization of a narrowband noise process; however, in the following analysis, nc(t) and ns(t) are also considered to be statistically independent, stationary zero‐mean white noise Gaussian processes with one‐sided spectral density No watts/Hz.

Because of the stationarity, the noise autocorrelation is dependent only on the correlation lag τ and is evaluated as

Upon evaluating the product in (1.310) and distributing the expectation, it is found that the conditions for stationarity require28 Rss(τ) = Rcc(τ) and Rcs(τ) = −Rsc(τ) so that (1.310) reduces to

The noise power is evaluated using (1.311) with τ = 0 with the result Rnn(0) = Rcc(0) = images. This evaluation can be carried further using the Wiener–Khinchin theorem29 which states that the power spectral density of a WSS random process is the Fourier transform of the autocorrelation function, that is,

From (1.312) the inverse Fourier transform is

and, substituting the condition that the single‐sided noise spectral density is defined as No watts/Hz, (1.313) becomes

In (1.314) the single‐sided noise density is divided by two because of the two‐sided integration, that is, the integration includes negative frequencies. In this case, the noise power, defined for τ = 0, is infinite; however, when an ideal band‐limited filter with bandwidth B is considered, the noise power in the filter centered at fc is computed as

In this case the one‐sided noise density No is used instead of No/2 because the one‐sided integration is over positive frequencies.

If a linear filter with impulse response h(t) is used, the frequency response is given by

(1.316) H(f) = ∫ h(t)e^(−j2πft) dt

The corresponding unit gain normalizing factor is |H(0)|. With the stationary noise process n(t) applied to the input of the filter, the output is determined using the convolution integral and the result is as follows:

Using (1.317) it can be shown (see Problem 33) that the normalized spectrum of the output noise is expressed, in terms of the input noise PSD Sn(f), as

where |H(0)| is the normalizing gain of the filter. Using (1.318), with Sn(f) = No/2 corresponding to white noise, the output noise power is evaluated as

where the second integral in (1.319) is recognized as the definition of the noise bandwidth of the bandpass filter with lowpass bandwidth Bn.
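As a numerical illustration of the noise bandwidth appearing in (1.319), the following Python sketch integrates |H(f)|^2 over positive frequencies and normalizes by the peak gain; the single-pole lowpass response and its 3-dB frequency fc are assumed for illustration, with the known result Bn = (π/2)fc.

```python
import numpy as np

def noise_bandwidth(H, f):
    """One-sided noise bandwidth of a lowpass response H(f), f >= 0,
    normalized by the peak (dc) power gain."""
    return np.trapz(np.abs(H)**2, f) / np.max(np.abs(H))**2

# Single-pole RC lowpass with 3-dB frequency fc; theory gives Bn = (pi/2)*fc.
fc = 1.0e3                                  # 3-dB frequency (Hz), assumed value
f  = np.linspace(0.0, 200*fc, 400001)       # positive-frequency grid
H  = 1.0/(1.0 + 1j*f/fc)                    # filter frequency response
print(noise_bandwidth(H, f), np.pi/2*fc)    # numerical result is close to (pi/2)*fc
```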

1.7 THE MATCHED FILTER

The problem in the detection of weak signals in noise is one of deciding whether the detection filter output is due to signal plus noise or to noise only. The matched filter [59, 60] provides the optimum signal detection in AWGN based on the maximum instantaneous signal‐to‐noise ratio when sampled at the optimum time.30 The matched filter, for an AWGN channel, is characterized as having an impulse response equal to the delayed time‐reversed replica of the received signal. To maximize the signal detection probability the matched filter output must be sampled at To as defined in the following analysis. The matched filter can be implemented at a convenient receiver IF or in the demodulator using quadrature baseband matched filters.

Considering the received signal, sr(t), the matched filter impulse response depicted in Figure 1.28 is expressed as


FIGURE 1.28 Example received signal and corresponding matched filter.

The gain G is selected for convenience; however, it must be a constant value. The delay To is required to result in a causal impulse response, that is, h(t) = 0 for t < 0 for h(t) to be realizable; consequently, sr(t) must be zero for t > To. Usually the selection of To is not an issue since many symbol modulation functions are time limited or can be truncated without a significant impact on the transmitted signal spectrum; however, the matched filter delay results in a throughput delay. To the extent that the impulse response only approximates (1.320), a detection loss will be encountered.

The criterion of the matched filter is to provide the maximum signal‐to‐noise ratio in the AWGN channel when sampled at the optimum time To. The following matched filter analysis follows that of Skolnik [61]. The signal‐to‐noise ratio of interest is

where images is evaluated as

and N is the noise power evaluated as

In these expressions, the filter spectrum H(f) is normalized, such that H(0) = 1, and the last equality in (1.323) results because the channel noise is white with one‐sided constant power density of No watts/Hz. Substituting (1.322) and (1.323) into (1.321) results in the expression for the signal‐to‐noise ratio

The maximum signal‐to‐noise ratio is evaluated by applying Schwarz’s inequality (see Section 1.14.5, Equation 5) to the numerator of (1.324). Upon substituting images and images into the Schwarz inequality, (1.324) is expressed as

The equality condition of the signal‐to‐noise ratio in (1.325) applies when f(f) = cg(f), where c > 0 is a conveniently selected constant resulting in the matched filter frequency response expressed as

where G = 1/c is an arbitrarily selected constant gain greater than zero. Upon applying Parseval’s theorem and recognizing that the numerator of the second equality in (1.325) is the signal energy, E, the optimally sampled matched filter output signal‐to‐noise ratio is simply expressed as

Therefore, for the AWGN channel, the optimally sampled matched filter output results in a signal‐to‐noise ratio that is a function of the signal energy and the noise density and is independent of the shape of the signal waveform. The factor of two in (1.327) results from the analytic or baseband signal description in the derivation of the matched filter. Typically, the received signal is modulated onto a carrier frequency with an average power equal to one‐half the peak carrier power. In this case, the signal‐to‐noise ratio at the output of the matched filter is one‐half of that in (1.327) resulting in

(1.328) (S/N)max = E/No

Referring to (1.326), the inverse Fourier transform of the complex conjugate of the signal spectrum results in the filter impulse response corresponding to the time reverse of the signal. In addition, the inverse Fourier transform of the exponential function in (1.326) results in a signal time delay of To seconds, so the resulting filter impulse response, h(t), corresponds to the example depicted in Figure 1.28. Consequently, the matched filter impulse response can be expressed in the time domain by (1.320) or in the frequency domain by (1.326).
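The following discrete-time Python sketch illustrates (1.320) and (1.327) under assumed values of the pulse shape, sample rate, and noise density: the filter is the time-reversed replica of the received pulse, and the peak output signal-to-noise ratio evaluates to 2E/No.

```python
import numpy as np

# Discrete-time sketch of the matched filter h(t) = G*sr(To - t) with G = 1 and To = T,
# verifying the peak output signal-to-noise ratio 2E/No of (1.327). Values are assumed.
fs, T, No = 1.0e4, 1.0e-2, 1.0e-3      # sample rate (Hz), pulse duration (s), noise density (W/Hz)
t  = np.arange(0, T, 1/fs)
sr = np.sin(np.pi*t/T)**2              # an arbitrary received pulse shape
h  = sr[::-1]                          # matched filter impulse response (time reversal)

E       = np.sum(sr**2)/fs             # signal energy
peak    = np.sum(sr*h[::-1])/fs        # filter output sampled at t = To, equals E
var_out = (No/2)*np.sum(h**2)/fs       # output noise power for white noise of density No/2
print(peak**2/var_out, 2*E/No)         # both evaluate to 2E/No
```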

The detection loss associated with a filter that is not matched to the received signal is evaluated as

(1.329) images

where images and images are the output signal and mean noise power density at the output of the unmatched filter. Typically the matched filter is based on the transmitted waveform; however, the received signal into the matched filter may be distorted by the channel or receiver filtering31 resulting in a detection loss. The matched filter implementation may also result in design compromises that result in a detection loss.

1.7.1 Example Application of Matched Filtering

In this example, a BPSK‐modulated received signal is considered with binary source data bits bi = {0,1} expressed as the bipolar data di = (1 − 2bi) = {1,−1} over the data intervals images of the bit duration T. The received signal plus noise is expressed as

The signal is described as

(1.331) images

The noise is zero‐mean additive white Gaussian noise with one‐sided spectral density No described as

The receiver matched filter impulse response and its Fourier transform are given by

In (1.333) the signal spectrum is defined as S(f), and the squared magnitude of the matched filter output at the optimum sampling point is

(1.334) images

where the gain G = |H(0)| is normalized to one, resulting in a unit‐gain matched filter response H(f).

Referring to the additive noise described by (1.332) and Section 1.6.2, the noise power at the output of the matched filter is expressed as

(1.335) images

where B/2 is the baseband bandwidth of the matched filter.

The received signal, as expressed in (1.330), can be rewritten in terms of the optimally sampled matched filter output as

where ni are iid zero‐mean, unit variance, white Gaussian noise samples. Upon dividing (1.336) by images the sampled matched filter output is expressed as

The sampled values l(r(iTo)) and l′(ri) are referred to as sufficient statistics, in that they contain all of the information in r(t), expressed in (1.330), needed to make a maximum‐likelihood estimate images of the source bit di. The normalized form in (1.337) is used as the turbo decoder input discussed in Section 8.12. In Section 1.8 the sufficient statistic is seen to be a direct consequence of the log‐likelihood ratio.
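A minimal Python sketch of the matched filter sufficient statistic for this BPSK example is given below; a rect(t/T) symbol, unit amplitude, and the Eb/No value shown are assumed for illustration, and the integrate-and-dump output l(r(iT)) is then sign-detected as in Section 1.8.

```python
import numpy as np
rng = np.random.default_rng(0)

# Sketch of the BPSK matched filter sufficient statistic, assuming a rect(t/T) symbol.
fs, T, EbNo_dB = 8, 1.0, 4.0                      # samples/bit, bit duration, Eb/No (dB)
bits = rng.integers(0, 2, 20)
d    = 1 - 2*bits                                 # bipolar data di
Eb   = T                                          # energy of a unit-amplitude rect(t/T)
No   = Eb/10**(EbNo_dB/10)
s    = np.repeat(d, fs)                           # transmitted samples, fs per bit
n    = rng.normal(0, np.sqrt(No/2*fs/T), s.size)  # white noise of two-sided density No/2
r    = s + n

# Integrate-and-dump matched filter sampled once per bit (t = iT)
l    = r.reshape(-1, fs).sum(axis=1)*(T/fs)       # l(r(iT)) = di*Eb + noise
d_ml = np.where(l > 0, 1, -1)                     # ML decisions of Section 1.8
print(np.count_nonzero(d_ml != d), "bit errors")
```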

1.7.2 Equivalence between Matched Filtering and Correlation

Consider the receiver input as the sum of the transmitted signal plus noise expressed as

(1.338) images

The cross‐correlation of r(t) with a replica of the received signal is computed as

(1.339) images

Defining the matched filter impulse response as h(t), the matched filter output response to the input r(t) is

However, referring to the preceding matched filter discussion, the matched filter impulse response is equal to the delayed time‐reversed replica of the signal, such that

As mentioned previously, the delay To ensures that the filter response is causal and, therefore, realizable. To substitute (1.341) into (1.340) first let images so that images and substitute this result in (1.340) to get

(1.342) images

Thus, the convolution response is equal to the cross‐correlation response delayed by To. If the input noise is zero, so that r(t) = s(t), the same conclusion can be drawn regarding the autocorrelation response.
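The equivalence is easily checked numerically. In the following Python sketch (the signal shape and noise level are assumed), convolving the received sequence with the time-reversed signal produces the same sequence as the full cross-correlation, with the peak appearing at the index corresponding to the delay To.

```python
import numpy as np
rng = np.random.default_rng(2)

# Discrete-time check of Section 1.7.2: filtering r(t) with h(t) = s(To - t) reproduces
# the cross-correlation of r(t) with s(t), delayed by To. Signal and noise are assumed.
s = np.hanning(64)                        # arbitrary finite-duration signal s(t)
r = s + 0.1*rng.standard_normal(s.size)   # received signal plus noise

y_mf   = np.convolve(r, s[::-1])          # matched filter output (h = time-reversed s)
y_corr = np.correlate(r, s, mode="full")  # cross-correlation of r and s

print(np.allclose(y_mf, y_corr))          # identical sequences
print(np.argmax(y_mf))                    # peak at index N - 1, i.e., delayed by To
```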

1.8 THE LIKELIHOOD AND LOG‐LIKELIHOOD RATIOS

Bayes criterion is based on two events, referred to as hypotheses H1 and H0, that are dependent upon the a priori probabilities P1 and P0 and the respective associated costs (C01,C11) and (C10,C00). Letting m correspond to the decision and n correspond to the hypothesis, the range of the cost is 0 ≤ Cmn ≤ 1 with Cmn + Cnn = 1 for m ≠ n. The cost of a correct decision is Cnn and that of an incorrect decision is Cmn, m ≠ n. For communication links the cost of an incorrect decision is typically higher than that of a correct decision so that Cmn > Cnn for m ≠ n. For example, when Cmn = 1 (m ≠ n) and Cnn = 0 the decision threshold minimizes the probability of error, which is the goal of communication demodulators. In summary,

(1.343) images

and the a priori probabilities are typically known and equal.

In the following example, the hypotheses correspond to selecting di = {1,−1}, such that, under the two hypotheses

(1.344) images

with the observations

corresponding to the optimally sampled outputs of the matched filter. In terms of the a priori probabilities, the transition probabilities, and the cost functions, the hypothesis H1 with di = 1 is chosen if the following inequality holds,

otherwise, choose H0 with di = −1. The decisions are made explicit under the following rearrangement of (1.346)

The left and right sides of (1.347) are defined as the likelihood ratio (LR) Λ(r) and the decision threshold η or, alternately, as the log‐likelihood ratio (LLR) lnΛ(r) with the threshold lnη, so (1.347) is also expressed as

(1.348) images

1.8.1 Example of Likelihood and Log‐Likelihood Ratio Detection

Consider the two hypotheses H1 and H0 mentioned earlier with di = {1,−1} and the observation ri in (1.345) with the additive noise ni characterized as iid zero‐mean white Gaussian noise, denoted as N(0,σn). The transition probabilities are expressed in terms of the Gaussian noise pdf as

(1.349) images

Upon forming the likelihood ratio and recognizing that images = ±di, the likelihood ratio decision simplifies to

(1.350) images

and the log‐likelihood ratio decision simplifies to

Recognizing that l(ri) is a sufficient statistic, (1.351) is rewritten as

(1.352) images

When C10 = C01 = 1, C00 = C11 = 0, and P0 = P1 the LLR simplifies to

(1.353) images

Therefore, the data estimate is images when l(ri) > 0, otherwise, images. Recall that the observations ri : t = iTo are made at the output of the matched filter. These concepts involving the LR and LLR surface again in Section 3.2, and the notion of the natural logarithm of the transition probabilities is discussed in the following section involving parameter estimation.
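A short Python sketch of the LLR decision rule follows; the observation mean m, the noise standard deviation, the costs, and the priors are assumed values, and with equal priors and unity error costs the threshold reduces to lnη = 0 as in (1.353).

```python
import numpy as np

# Sketch of the LLR decision of (1.351)-(1.353) for observations ri = m*di + ni with
# ni ~ N(0, sigma^2); m, sigma, the costs, and the priors below are assumed values.
def llr_decide(r, m, sigma, P1=0.5, P0=0.5, C10=1, C01=1, C00=0, C11=0):
    llr = 2*m*r/sigma**2                            # ln Lambda(r) for Gaussian noise
    eta = np.log(P0*(C10 - C00)/(P1*(C01 - C11)))   # ln(eta) decision threshold
    return np.where(llr > eta, 1, -1)

rng = np.random.default_rng(3)
d  = rng.choice([-1, 1], 1000)
m, sigma = 1.0, 0.7
r  = m*d + sigma*rng.standard_normal(d.size)
d_hat = llr_decide(r, m, sigma)
print(np.mean(d_hat != d), "error rate")            # roughly Q(m/sigma) for these values
```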

1.9 PARAMETER ESTIMATION

The subject of optimum signal detection in noise was examined in the preceding section in terms of a pulsed‐modulated carrier and it resurfaces throughout the following chapters in the context of a number of different waveform modulations. However, signal detection is principally based on the signal energy without regard to specific signal parameters, although frequency and range delay must be estimated to some degree to declare signal presence and subsequently detection. Signal detection uses concepts involving direct probabilities, whereas the subject of parameter estimation uses concepts involving inverse probabilities as discussed by Feller [32], Slepian [62], Woodward and Davies [63], and others. The distinction between these concepts is that direct probability is based on the probability of an event happening, whereas inverse probability formulates the best estimate of an event that has already occurred. With this distinction, it is evident that parameter estimation involves inverse probabilities. The major characteristic of inverse probability is the use of a priori information associated with the available knowledge of each source event. At the receiver the a posteriori probability is expressed in terms of the inverse probability using Bayes rule that associates the transition probability and a priori knowledge of the source events.

The subject of this section is signal parameter estimation and, although the major parameter of interest in communications is the estimation of the source information, the estimation of parameters like frequency, delay, and signal and noise powers aids in the estimation of the source information. For example, estimation of the received signal and noise powers forms the basis for estimating the receiver signal‐to‐noise ratio that is used in network management to improve and maintain communication reliability. Furthermore, characterizing the theoretical limits of the parameter estimates provides a benchmark or target for the accuracy of the parameter estimation during the system design.

The following discussion of statistical parameter estimation is largely based on the work of Cramér [64], Rao [65], Van Trees [66], and Cook and Bernfeld [67]. The received signal is expressed in terms of the transmitted signal with M unknown parameters a1, a2, …, aM and additive noise, as

(1.354) r(t) = s(t; a1, a2, …, aM) + n(t)

Considering that N discrete samples of the received signal and additive noise are used to estimate the parameters, the joint probability density function (pdf) of the samples is

where the noise samples ni = ri − si are substituted into the joint pdf of the noise. The noise samples are statistically independent and the statistical characteristics of the noise are assumed to be known. Therefore, based on the received signal‐plus‐noise samples ri, the receiver must determine the estimates â1, â2, …, âM of the M unknown parameters. The probability density function pr(r1,… |a1,…) in (1.355) is called the likelihood function.

Van Trees discusses several estimation criteria32 and the following focuses on the optimum estimates for the mean‐square (MS) error33 and maximum a posteriori probability (MAP) criterion that are defined, respectively, for a single parameter a as

and

The estimate âms(r) is optimum in the sense that it results in the minimum MS error over all si and a. The MAP estimate âmap(r) is the solution to (1.357) and is optimum in the sense that it locates the maximum of the a posteriori probability density function; however, the solution must be checked to determine if it corresponds to the global maximum in the event of a multimodal distribution.

By applying Bayes rule to (1.357), the MAP estimate is expressed in terms of the a priori pdf, pa(a), and the likelihood function, images, as

When the a priori probabilities are unknown, that is, as the a priori knowledge approaches zero, (1.358) becomes the maximum‐likelihood equation and âml(r) is the maximum‐likelihood estimate, evaluated as the solution to

To make use of these estimates it is necessary to determine the bias and the variance of the estimate. The mean value of the estimate is computed as

(1.360) images

The bias of the estimate is defined as images. If, as indicated, the bias is a function of a, the estimate has an unknown bias; however, if the bias is B, independent of a, the estimate has a known bias that can be removed from the observation measurements r. In general, for any known biased estimate â(r) of the real random variable a, the variance is defined as

(1.361) images

Although the bias and variance are often difficult to determine, the Cramér–Rao inequality provides a lower bound on the variance of the estimate. For a biased estimate of the random parameter a with a priori pdf pa(a), the variance is lower bounded by the Cramér–Rao inequality [64, 66]

or, the equivalent result,

(1.363) images

When the estimate is unbiased, that is, when images, the estimation variance of the random variable a simplifies to

or, the equivalent result,

The Cramér–Rao bound in these relationships is formulated in terms of the Schwarz inequality and the equality condition applies when

where k is a constant. Therefore, (1.366) guarantees that the equality condition for the variance applies in (1.362) through (1.365); in this case, the MAP estimate is defined as an efficient estimate. Furthermore, an unbiased estimate, excluding the trivial case k = 0, requires that images leading to (1.358).

When the a priori knowledge pa(a) is constant, that is, the parameter a is nonrandom, or unknown, then (1.359) also requires that images or images. Under the maximum‐likelihood (ML) criterion, Schwarz’s equality condition applies when

In this case, the constant k(a) may be a function of a; this condition applies only when the parameter a is a constant, which corresponds to the ML estimate.

Van Trees lists three principles based on the foregoing results:

  1. The mean‐square (MS) error estimate is always the mean of the a posteriori density, that is, the conditional mean.
  2. The MAP estimate corresponds to the value of a for which the a posteriori density is maximum.
  3. For a large class of cost functions, the optimum estimate is the conditional mean whenever the a posteriori density is a unimodal function which is symmetric about the conditional mean. The Gaussian pdf is a commonly encountered example.

By way of review, the estimates are evaluated using the a posteriori pdf; however, if the parameter is a random variable, the a posteriori pdf must be expressed in terms of the transition distribution and the a priori pdf of the random parameter using Bayes rule. If the estimate is unbiased, that is, if images = 0, evaluation of the Cramér–Rao bound simplifies to (1.364); it is sometimes necessary to use the equivalent expression in (1.365). The Cramér–Rao equality condition is established if the left‐hand side of (1.366) can be expressed in terms of the right‐hand side where k is a constant parameter resulting from Schwarz’s condition for equality.

If the a priori knowledge is unknown then the maximum‐likelihood equation given in (1.359) is used to determine the maximum‐likelihood estimate. In this case, the Cramér–Rao bound is established by omitting the dependence on pa(a) in (1.362) through (1.365), and the equality condition is established if the left‐hand side of (1.367) can be expressed in terms of the right‐hand side where, in this case, the constant k(a) is a function of the parameter a. With either the MAP or ML estimates, if the bias is zero and the equality condition applies, the estimate is referred to as an efficient estimate.

Van Trees shows that for the MS estimate to be an efficient estimate, the a posteriori probability density images must be Gaussian for all r and, for efficient MAP estimates, images. However, it may be easier to solve the MAP equation than to determine the conditional mean as required by the MS estimation procedure.

1.9.1 Example of MS and MAP Parameter Estimation

As an example application of the parameter estimation procedures discussed earlier, consider the Poisson distribution that is used to predict population growth, telephone call originations, gamma ray emissions from radioactive materials, and is central in the development of queueing theory [68]. For this example, the Poisson distribution is characterized as

In the application of (1.368) to queueing theory, a = λt is the average number of people entering a queueing line in the time interval 0 to t and λ is the arrival rate. The a posteriori distribution images is the probability of a conditioned on exactly n arrivals occurring in the time interval. A fundamental relationship in the Poisson distribution is that the time interval between people entering the queueing line is exponentially distributed and is characterized by the a priori distribution

(1.369) images

The a posteriori pdf in (1.368) is expressed in terms of the a priori and transition pdfs as

where the constant k is a normalizing constant that includes 1/pn(n). Integrating the second equality in (1.370) with respect to a over the range 0 to ∞ and setting the result equal to one, the value of k is found to be k = 2^(n+1) and (1.370) becomes

Using (1.356) and (1.371) the MS estimate is evaluated as

(1.372) images

Also, using (1.357) and (1.371) the MAP estimate is evaluated as

and solving the second equality in (1.373) for a results in images. As is typical in many cases, the MS and MAP estimation procedures result in the same estimate. It is left as an exercise (see Problem 38) to determine the bias of the estimates, compute the Cramér–Rao bound, and using (1.366), determine if the estimates are efficient, that is, if the Cramér–Rao equality condition applies.
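The posterior-based estimates can also be evaluated numerically. The following Python sketch forms the a posteriori density on a grid using the Poisson transition pdf and an assumed exponential a priori pdf pa(a) = exp(−a), and then reads off the MS estimate as the posterior mean and the MAP estimate as the posterior mode; the value of n and the grid are illustrative only.

```python
import numpy as np

# Numerical sketch of the MS and MAP estimates of (1.372) and (1.373): the posterior
# p(a|n) is proportional to p(n|a)*p_a(a), formed here with the Poisson transition pdf
# and an assumed exponential prior p_a(a) = exp(-a).
n = 5
a = np.linspace(1e-6, 40, 200001)
post = a**n*np.exp(-a)*np.exp(-a)               # unnormalized posterior a^n exp(-2a)
post /= np.trapz(post, a)                       # normalize so it integrates to one

a_ms  = np.trapz(a*post, a)                     # MS estimate: posterior (conditional) mean
a_map = a[np.argmax(post)]                      # MAP estimate: posterior mode
print(a_ms, a_map)                              # (n+1)/2 and n/2 for this assumed prior
```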

1.9.2 Constant‐Parameter Estimation in Gaussian Noise

To simplify the description of the estimation processing, the analysis in this section considers the single constant‐parameter case with zero‐mean narrowband additive Gaussian noise. Under these conditions, the estimation is based on the solution to the maximum‐likelihood equation with the joint pdf of the received signal and noise written as

where a is the constant parameter to be estimated and ri = si + ni represents the received signal samples. The sampling rate satisfies the Nyquist sampling frequency and the second equality in (1.374) recognizes that the noise samples are independent. The following analysis is based on the work of Woodward [24] and Skolnik [61] and uses the maximum‐likelihood estimate of (1.359) with the Cramér–Rao bound expressed by (1.365).

Using (1.374) with zero‐mean AWGN, the minimum Cramér–Rao bound on the variance of the estimate is expressed as

In arriving at the third equality in (1.375), the factor k is independent of the parameter a and it is recognized that images where B = 1/Te is the bandwidth corresponding to the estimation time. The integral is formed by letting Δt → 0 as the number of samples N → ∞ over the estimation interval Te. Upon taking the logarithm of the product k·exp(·) and performing the partial derivatives on the integrand, (1.375) simplifies to

The last equality in (1.376) is the basis for determining the variance and is obtained by moving the expectation inside of the integral and recognizing that images. The following example outlines the procedure for evaluating the variance of the estimate using the ML approach.

1.9.2.1 Example of ML Estimate Variance Evaluation

Consider the signal s(t) expressed as

(1.377) images

where A is the peak carrier voltage, ωo is the IF angular frequency, images is the angular frequency rate, and ϕ is a constant phase angle; the signal power is defined as Ps = A^2/2.

The variance of the frequency estimate is determined by squaring the partial derivative of s(t) with respect to ωo and integrating over the estimation interval Te as indicated in (1.376). Under these conditions the analysis of the Cramér–Rao lower bound is performed as follows.

Upon neglecting the term involving 2ωo and performing the integration, (1.378) becomes

(1.379) images

where γe = PsTe/No is the signal‐to‐noise ratio in the estimation bandwidth of 1/Te. In terms of the carrier frequency fo in hertz, the standard deviation of the estimate is

(1.380) images

In a similar manner, the standard deviation of the frequency rate and phase are evaluated as

(1.381) images

and

(1.382) images

The evaluation of the standard deviation of the signal amplitude (A) estimation error is left as an exercise (see Problem 39).
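The Cramér–Rao evaluation of (1.376) can also be carried out numerically when the closed-form integration is inconvenient. The following Python sketch, with assumed values of A, fo, ϕ, Te, and No, integrates the squared partial derivative of s(t) = A cos(ωot + ϕ) with respect to ωo and compares the resulting bound with the approximation obtained by neglecting the 2ωo term.

```python
import numpy as np

# Numerical sketch of the Cramer-Rao bound of (1.376) for the frequency of
# s(t) = A*cos(wo*t + phi) observed over Te in white noise of one-sided density No.
# All parameter values are assumed for illustration.
A, fo, phi = 1.0, 1.0e3, 0.3
Te, No     = 1.0e-2, 1.0e-4
t  = np.linspace(0.0, Te, 200001)
wo = 2*np.pi*fo

ds_dwo  = -A*t*np.sin(wo*t + phi)              # partial derivative of s(t) w.r.t. wo
inv_var = (2.0/No)*np.trapz(ds_dwo**2, t)      # 1/sigma^2 from (1.376)
sigma_w = 1.0/np.sqrt(inv_var)                 # standard deviation in rad/s
print(sigma_w/(2*np.pi))                       # frequency estimate std dev in Hz
print(np.sqrt(3*No/(A**2*Te**3))/(2*np.pi))    # approximation neglecting the 2*wo term
```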

1.9.3 Received Signal Delay and Frequency Estimation Errors

Accurate estimation of the signal delay and frequency is essential in aiding the signal acquisition processing by minimizing the overall time and frequency search ranges. The delay estimation accuracy (σTd) is inversely related to the signal bandwidth (B) and the signal frequency estimation accuracy (σfd) is inversely related to the time duration (T) of the signal. Neglecting the signal‐to‐noise dependence of each measurement, these inverse dependencies are evident in that the product σTdσfd ∝ 1/TB where TB is the time‐bandwidth product of the waveform. For typical modulated waveforms T and B are inversely related so that simultaneously accurate time and frequency estimates are not attainable. However, the use of spread‐spectrum (SS) waveform modulation provides for arbitrarily large BT products with simultaneously accurate estimates of Td and fd. The analysis of delay and frequency estimation errors in the following sections is based on the work of Skolnik [61] and can be applied to conventional or SS‐modulated waveforms. In Section 1.9.3.3 delay and frequency estimation is examined using a DSSS‐modulated waveform.

1.9.3.1 Delay Estimation Error Based on Effective Bandwidth

The signal delay measurement accuracy using the effective signal bandwidth was introduced by Gabor [69], is discussed by Woodward [24], and is defined as the standard deviation of the delay measurement expressed as34

where γe = Ps/(NoW) = E/No is the signal‐to‐noise ratio35 measured in the two‐sided bandwidth W, No is the one‐sided noise density, Ps is the signal power, and β is the effective bandwidth of the signal. β2 is the normalized second moment of the waveform spectrum |S(f)|2, defined as

The denominator in (1.384) is the signal energy and the integration limits extend over the frequency range |f| ≤ W/2 corresponding to the nonzero signal spectrum. The one‐way range error corresponding to (1.383) is images meters where c is the free‐space velocity of light in meters/second.
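The effective bandwidth of (1.384) is readily evaluated numerically. The following Python sketch computes β for a rect(t/T) symbol spectrum truncated to an ideal two-sided bandwidth W; the symbol duration and BT product are assumed values, and this is the filtered-BPSK case discussed further in Section 1.9.3.5.

```python
import numpy as np

# Numerical sketch of the effective bandwidth of (1.384) for a rect(t/T) symbol passed
# through an ideal lowpass filter of two-sided bandwidth W (assumed values):
# beta^2 = Int (2*pi*f)^2 |S(f)|^2 df / Int |S(f)|^2 df over |f| <= W/2.
T, BT = 1.0e-3, 4.0                      # symbol duration and time-bandwidth product
W = BT/T                                 # two-sided filter bandwidth (Hz)
f = np.linspace(-W/2, W/2, 400001)
S = T*np.sinc(f*T)                       # spectrum of rect(t/T), band limited by the filter

beta2 = np.trapz((2*np.pi*f)**2*np.abs(S)**2, f)/np.trapz(np.abs(S)**2, f)
print(np.sqrt(beta2)*T)                  # normalized effective bandwidth beta*T
```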

For the rectangular symbol modulation function rect(t/T), band limited to W Hz with β2 proportional to W/T and large time‐bandwidth products WT/2, (1.383) is evaluated as

1.9.3.2 Frequency Estimation Error Based on Effective Signal Duration

In a manner similar to the analysis of the delay estimation error in the preceding section, Manasse [70] shows that the minimum root‐mean‐square (rms) error in the frequency estimate is given by36

where γe = E/No is the signal‐to‐noise ratio measured in the two‐sided bandwidth W, No is the one‐sided noise density, Ps is the signal power, and α is the effective time duration of the received signal. The parameter α2 is the normalized second moment of the waveform s(t) and is defined as

The Doppler frequency results from the velocity (v) and the carrier frequency (fc) and is expressed as fd = (v/c)fc. Frequency errors resulting from hardware oscillators are usually treated separately and combined as the root‐sum‐square (RSS) of the respective standard deviations.

For the band‐limited rect(t/T) symbol modulation used in the preceding section, the normalized second moment is evaluated as α2 ≅ (πT)2/3 and (1.386) is expressed as

Comparison of (1.385) and (1.388) demonstrates the inverse relationship between the estimation accuracy of the range‐delay and frequency errors for conventional (unspread) modulations. For example, for a given time‐bandwidth (WT) product and signal‐to‐noise ratio (γe), the delay estimate error decreases with decreasing symbol duration; however, the frequency estimate error increases. The issue revolves around the signal‐to‐noise ratio in the estimation bandwidth. For example, with conventional waveform modulations, WT = 2BT = 2 so BT = 1 and the bandwidth changes inversely with the symbol duration. Consequently, by decreasing the symbol duration, the bandwidth increases, resulting in a degradation of the signal‐to‐noise ratio γs, measured in the symbol bandwidth, of B/B′ where images. Therefore, in the previous example, to maintain a constant signal‐to‐noise ratio γe the estimation interval must be appropriately adjusted. As mentioned previously, the solution to simultaneously obtaining accurate estimates of range delay and frequency while maintaining a constant γs involves the use of SS‐modulated signals with an inherently large WT product as discussed in the following section.

1.9.3.3 Improved Frequency and Time Estimation Errors Using the DSSS Waveform

The DSSS waveform uses a pseudo‐noise (PN) sequence of chips with an instantaneous bandwidth (W) over the estimation interval (T) as shown in Figure 1.29.37 The resulting large WT product signal provides for arbitrarily low time and frequency estimation errors. This is accomplished by the respective selection of a high bandwidth (short duration) chip interval (τ) and the low bandwidth (long duration) estimation interval T. The estimation interval can be increased to improve the frequency estimate; conversely, the chip interval can be decreased to improve the range‐delay estimate; however, to maintain the accuracy of the other, the number of chips per estimation interval (N) must be increased. These relationships are described in terms of the pulse compression ratio, defined as ρ = T/τ = N. In Figure 1.29 the chips are depicted as appropriately delayed A·dn·rect((t − nτ)/τ − 0.5): n = 0, …, N − 1 functions and, because of the equivalence between the correlator and matched filter, the peak correlator output is a triangular function with a peak value38 of AN. When sampled at t = Nτ, the correlator output results in the maximum signal‐to‐noise ratio measured in the bandwidth of 1/T Hz.


FIGURE 1.29 Time–frequency estimation using DSSS waveform.

Based on the fundamental principles for jointly achieving accurate time and frequency estimates as stated earlier, the triangular shape of the wide bandwidth correlator output is related to the accuracy of the time estimate and the low bandwidth output sampled at intervals of T = Nτ determines the accuracy of the frequency estimate. Therefore, evaluation of the time and frequency estimation accuracies of the DSSS waveform involves evaluating, respectively, the effective bandwidth (β) of the triangular function and the effective time duration (α) of the rect(t/T − 0.5) function.

Delay Estimation Error of the DSSS Waveform

The delay estimation error is based on detecting the changes in the leading and trailing edges of wideband signals. This does not require that the signal have a short duration but that the bandwidth be sufficiently wide to preserve the rapid rise and fall times of the correlator response. On the other hand, received signals with additive noise must be detected and the parameters estimated under the optimum signal‐to‐noise conditions as provided by matched filtering or correlation. In this regard, the correlator output in Figure 1.29 is examined in the context of the signal delay estimate error.

The triangular function, corresponding to the correlator output, is an isosceles triangle with base and height equal to 2τ and AN, respectively, and is described as

(1.389) images

where, for convenience, ξ = t − Nτ such that the time axis is shifted so that the isosceles triangle is symmetrical about ξ = 0. The effective bandwidth of Rs(ξ) is evaluated (see Problem 41) as

(1.390) images

and the corresponding standard deviation of the delay estimate is

Frequency Estimation Error of the DSSS Waveform

The frequency estimation error is based on the interval T of the PN sequence under the conditions corresponding to the local PN reference being exactly synchronized with and multiplied by the received signal; in other words, with zero frequency and phase errors, the integrand of the correlation integral is constant over the interval T. However, with a frequency error of fε Hz the correlator response is computed as

(1.392) images

The principal frequency error in the main lobe of the sinc(fεT) function corresponds to |fε| ≤ 1/T which defines the fundamental resolution accuracy of the frequency estimate. However, the effective duration of the correlator of length T = Nτ is evaluated (see Problem 42) as

(1.393) images

and the corresponding standard deviation of the frequency estimate is

Considering the SS pulse compression ratio, or processing gain, ρ = T/τ, the correlator output signal‐to‐noise ratio (γe) in (1.391) and (1.394) is measured in the bandwidth of 1/T. The product of the estimation accuracies of the SS waveform is

(1.395) images

Therefore, the time and frequency estimation errors can be made arbitrarily small, even in low signal‐to‐noise ratio environments, by designing a SS waveform with a sufficiently high pulse compression ratio.
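The two mechanisms described above can be seen in a short simulation. The following Python sketch, with an assumed number of chips, chip duration, and oversampling factor, shows the narrow correlation peak of the PN waveform (the delay-estimation mechanism) and the sinc(fεT) roll-off of the coherently integrated output with frequency error (the frequency-estimation mechanism).

```python
import numpy as np
rng = np.random.default_rng(4)

# Sketch of the DSSS correlator of Figure 1.29: N chips of duration tau give a narrow
# correlation peak of width 2*tau, and a frequency error f_eps rolls the coherently
# integrated output off as sinc(f_eps*T) with T = N*tau. All values are assumed.
N, tau, ns = 127, 1.0e-6, 8              # chips, chip duration (s), samples per chip
d  = rng.choice([-1.0, 1.0], N)          # PN chip sequence
x  = np.repeat(d, ns)                    # sampled PN waveform, duration T = N*tau
corr = np.correlate(x, x, mode="full")/x.size
print(corr.max(), corr[x.size - 1 - ns]) # peak of 1 at zero lag, near zero one chip away

T = N*tau
t = np.arange(x.size)*tau/ns
for f_eps in (0.0, 0.5/T, 1.0/T):        # frequency error relative to 1/T
    out = np.abs(np.sum(x*x*np.exp(2j*np.pi*f_eps*t)))/x.size
    print(f_eps*T, out)                  # follows |sinc(f_eps*T)|: 1, ~0.64, ~0
```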

1.9.3.4 Effective Bandwidth of SRRC and SRC Waveforms

In view of the increasing demands on bandwidth, the spectral containment of the spectral raised‐cosine (SRC) waveform meets the corresponding need for spectrum conservation. Although the spectral root‐raised‐cosine (SRRC) waveform has a slightly wider bandwidth than the SRC waveform, it is preferred because of the improved matched filter detection39 and, in the context of range delay estimation, provides for a somewhat better range delay estimate. The spectrum of the SRC waveform is characterized, in the context of a spectral windowing function, in Section 1.11.4.1 and the spectrum of the SRRC is characterized in Section 4.3.2 in the context of the optimum transmitted waveform for root‐raised‐cosine (RRC) waveform modulation. The following analysis compares the effective bandwidths of the SRC and SRRC waveforms with the understanding that the SRC delay estimate is based on the matched filter output samples taken symmetrically about the optimum matched filter sample at t = To.

The dependence of the effective bandwidth of the SRRC and SRC waveforms on the excess bandwidth parameter α is expressed as

and

Equations (1.396) and (1.397) are plotted in Figure 1.30, which demonstrates the advantage of the wider bandwidth SRRC waveform in providing short rise‐time symbols with the associated improvement in range‐delay detection. In this regard, the rect(t/T) modulated received symbol, as characterized by the BPSK‐modulated waveform, has zero rise‐time and results in perfect range‐delay detection in a noise‐free channel and receiver; however, infinite bandwidth is required to achieve this performance. The dashed curve in Figure 1.30 shows the normalized effective bandwidth of the rect(t/T) modulated symbol after passing through an ideal (1/B)rect(f/B) filter with one‐sided bandwidth B/2 Hz; this is referred to as filtered BPSK and is discussed in the following section.


FIGURE 1.30 Normalized effective bandwidths for SRRC and SRC waveforms.

The noise bandwidth of the SRRC frequency function is significant, in that it corresponds to the demodulator matched filter response used in the detection of the SRRC‐modulated waveform. On the other hand, the interest in the noise bandwidth of the SRC is more academic in nature because of its application as a windowing function. In either event, the noise bandwidths of the SRRC and SRC frequency functions are examined in Problem 44.

1.9.3.5 Effective Bandwidth of the Ideally Filtered rect(t/T) Waveform

In this case the effective bandwidth of the ideal symbol modulation, characterized by rect(t/T), is evaluated after passing through an ideal filter with frequency response (1/B)rect(f/B) where B/2 is the one‐sided or low‐pass bandwidth of the filter. The filter response is examined in Section 1.3 and the normalized effective bandwidth of the filtered symbol is characterized by Skolnik [71] as

(1.398) images

This result is plotted in Figure 1.31 as a function of BT where T is the symbol duration. From this plot it is evident that as BT approaches infinity the standard deviation of the range‐delay estimate approaches zero resulting in an exact estimate of the true range delay. A practical application is to define a finite bandwidth which is sufficiently wide so as not to degrade the symbol detection through intersymbol interference.


FIGURE 1.31 Normalized effective bandwidth for filtered rect(t/T) waveform.

Defining the excess bandwidth factor for the ideal filter as images, where Rs = 1/T is the rect(t/T) symbol rate, in terms of the excess bandwidth factor α of the raised‐cosine (RC) waveform, images. The corresponding range of BT is 1/2 ≤ BT ≤ 1; this range of the filtered rect(t/T) effective bandwidth from Figure 1.31 is plotted as the dashed curve in Figure 1.30. The range of BT results in significant intersymbol interference and received symbol energy loss even under the ideal conditions of symbol time and frequency correction. However, under the same conditions, if the SRRC waveform and matched filter responses are sufficiently long, the intersymbol interference and symbol energy loss will be negligible. At the maximum SRRC normalized effective bandwidth of βT = 2.27, the filter time‐bandwidth product corresponds to BT = 2.64 or a 32% increase with the filter bandwidth spanning the main signal spectral lobe and 16% of the adjacent sidelobes. In other words, the one‐sided filter bandwidth spans 1.32 lobes and, referring to Appendix A, this results in a performance loss of about 1.25 dB for BPSK waveform modulation; for a loss of less than 0.3 dB the BT product should be greater than 5 with a resulting effective bandwidth of βT = 3.23 corresponding to a (3.23/2.27 − 1)100 = 42% improvement relative to the best SRRC range‐delay estimation error; however, the required bandwidth is 150% wider. The bandwidth and range‐delay estimation accuracy are design trade‐offs in the waveform selection.

1.10 MODEM CONFIGURATIONS AND AUTOMATIC REPEAT REQUEST

The three basic modulator and demodulator configurations are simplex, half‐duplex, and full‐duplex. The definition of simplex communications involves communication in one direction between a modulator/transmitter and a remote receiver/demodulator. Examples of simplex communications include broadcasting from radio and television stations or from various types of monitoring devices. Half‐duplex communications is a broader definition including two‐way communications but only in one direction at a time. In these cases, transceivers and modems are required at each location. A common application of half‐duplex operation is the push‐to‐talk handheld radio. Full‐duplex communications provide the capability to communicate in both directions simultaneously. In these cases the bidirectional communications may use identical transceivers and modems operating at the same symbol rate; however, as is often the case, the communication link in one direction may be designated as the reverse channel and operated at a lower symbol rate. In either event, the forward and reverse channels must operate at different, noninterfering, frequencies.

The transfer of data is often performed using information frames or packets, each containing a cyclic redundancy check (CRC) code for error checking. If an error is detected the receiving terminal requests that the frame be retransmitted, otherwise an acknowledgment may be returned indicating that the frame was received without error. These protocols are referred to as automatic repeat request [72] (ARQ). The ARQ protocol requires either a half‐duplex or full‐duplex communication capability. The two commonly used variations of the ARQ protocol are generally referred to as idle‐repeat request (RQ) and continuous‐RQ.40 However, more complex variations involving point‐to‐point and multipoint protocols are also defined.41

The remainder of this section analyzes the idle‐RQ protocol which is the simplest ARQ system to implement and evaluate, in that when a data frame is transmitted a timer is initiated and a new frame is transmitted only after acknowledgment (ACK) that the current frame was received without errors and/or the timer has not exceeded a maximum timeout Tmax. However, the current frame is retransmitted if the timer exceeds Tmax, a negative acknowledgment (NAK) is received, indicating the receipt of an incorrect frame, or the ACK or NAK code is received in error. The timer limit is based on the information bits per frame, the bits in the ACK and NAK codes, the data rates, and the expected two‐way link propagation delay through the media. The idle‐RQ implementation also has the advantage of requiring less data storage compared to the continuous‐RQ protocol and the Go‐back‐N protocol [73]. However, the performance cost of these advantages is that the end‐to‐end transmission efficiency is lower and more sensitive to the link propagation delay. The end‐to‐end transmission efficiency is defined as

where images is the average bit rate over the forward channel and Rbf is the uninterrupted forward channel bit rate.

The idle‐RQ is modeled as shown in Figure 1.32 with the delays and other parameters defined in Table 1.12.


FIGURE 1.32 Model of idle‐RQ implementation.

TABLE 1.12 Idle‐ARQ Parameter Definitions

Delay Value Description
Tdf (Nb + Ncrc)/Rbf Forward message duration
Tsf Nsf/Rbf Forward synchronization duration
Tp range/c Propagation delay between terminals
Tdr (Nbr+Nchk)/Rbr Reverse message duration
Tsr Nsr/Rbr Reverse synchronization duration
Tcs 0 Secondary terminal computational delay (ms)a
Tcp 0 Primary terminal computational delay (ms)a
Tmax >Tmin Idle‐RQ maximum idle time (ms)
NB Variable Message bits: (Nb = info) + (Ncrc = CRC)
Nsf 30 Forward synchronization bits
Rbf 100 Forward bit rate (kbps)
range Parameter One‐way: 18.5, 200, 600, 35,800 (km)
c 3 × 108 Free‐space velocity (m/s)
Nbrt 30 ARQ bits:(Nbr=ACK) + (Nchk = parity)
Nsr 10 Reverse synchronization bits
Rbr = Rbf Reverse bit rate (kbps)
Pbef Parameter Forward channel bit‐error probability: 10−4, 10−5
Pber = Pbef Reverse channel bit‐error probability

aThe CRC and parity check codes provide instantaneous error decisions.

Using the parameters described earlier, the average time associated with the transmission and acknowledgment is described as

where images is the average number of frame repetitions based on the specified bit‐error probability and the number of frame information and CRC bits NB. The computation of images is based on iid additive white Gaussian channel noise over all of the NB bits. This provides for the probability of a correct message to be expressed in terms of the discrete binomial distribution42 given the bit‐error probability Pbef; the result is expressed as

Therefore, using (1.401), the average number of transmissions required to obtain an error‐free frame with NB bits is evaluated as

The idle‐RQ transmission efficiency, as defined in (1.399), is expressed as

The number of forward channel bits per frame is defined as43

(1.404) NB = Nb + Ncrc

With this definition, the transmission efficiency is expressed explicitly in terms of NB, by substituting (1.400) and (1.402) into (1.403) and, after some simplifications, the efficiency is expressed as

where KB is defined as

(1.406) images

The idle‐RQ efficiency expressed in (1.405) is plotted in Figure 1.33 for several one‐way communication link ranges; the solid curves correspond to Pbef = 10−5 and the dashed curves correspond to Pbef = 10−4. The impact of the link propagation delay (Tp) is significant and results in long idle times for the ACK/NAK response. The performance is also dependent on the link bit‐error probabilities, the bit rate, and the number of bits per frame.


FIGURE 1.33 Idle‐RQ efficiency as function of bits/frame (Rbf = 100 kbps, Pbef = 10−5 solid, 10−4 dashed curve).
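A hedged numerical sketch of the idle-RQ efficiency is given below in Python; since (1.400) and (1.405) are not reproduced here, the frame cycle is assumed to consist of the forward synchronization and message, the two-way propagation delay, and the ACK/NAK reply, with the average number of transmissions taken from (1.401) and (1.402). The parameter values follow Table 1.12, with the CRC length and the ACK/parity split assumed.

```python
import numpy as np

# Hedged sketch of the idle-RQ efficiency with the Table 1.12 parameters: the average
# number of transmissions is 1/(1 - Pbef)^NB from (1.401)-(1.402), and the frame cycle
# is assumed to be the forward frame, the round-trip propagation, and the ACK/NAK reply
# (computation delays are zero). Ncrc, Nbr, and Nchk are assumed values.
def idle_rq_efficiency(Nb, Ncrc=16, Nsf=30, Rbf=100e3, Nbr=20, Nchk=10, Nsr=10,
                       Rbr=100e3, range_km=600.0, Pbef=1e-5):
    NB   = Nb + Ncrc                              # forward message bits per frame
    Tdf  = (Nb + Ncrc)/Rbf                        # forward message duration
    Tsf  = Nsf/Rbf                                # forward synchronization duration
    Tdr  = (Nbr + Nchk)/Rbr                       # reverse (ACK/NAK) message duration
    Tsr  = Nsr/Rbr
    Tp   = range_km*1e3/3e8                       # one-way propagation delay
    Nbar = 1.0/(1.0 - Pbef)**NB                   # average transmissions per frame
    Tcycle = Tsf + Tdf + 2*Tp + Tsr + Tdr         # one idle-RQ cycle
    return Nb/(Rbf*Nbar*Tcycle)                   # delivered info rate / channel bit rate

for Nb in (1e3, 1e4, 1e5):
    print(int(Nb), idle_rq_efficiency(Nb))
```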

The optimum number of forward channel bits per frame, NB, corresponding to the maximum efficiency, ηtrans(max), is evaluated by differentiating (1.405) with respect to NB, setting the result equal to zero, and solving for NB(opt). Following this procedure, the optimum NB is evaluated as the solution to the quadratic equation

where images. The solution to (1.407) is

(1.408) images

Using the example parameters, NB(opt) is listed in Table 1.13 for the conditions shown in Figure 1.33.

TABLE 1.13 Optimum NBa Corresponding to the Maximum Efficiency Conditions in Figure 1.33

Pbef Range (km)
18.5 200 600 35,800
1e−3 204 338 478 960
1e−4 697 1,232 1,889 7,589
1e−5 2,261 4,077 6,416 38,380
1e−6 7,208 13,079 20,757 143,125
1e−7 22,850 41,546 66,112 477,137

aNB in bits.

Optimizing the message frame size using NB(opt) has limited practical appeal because the resulting maximum efficiency may be unacceptable or because of the broad range of NB over which the efficiency around ηtrans(max) is virtually unchanged; this latter point is seen in Figure 1.33 corresponding to Pbef = 10−5. Selecting NB from a range that satisfies an acceptable minimum transmission efficiency is a preferable criterion, and operating at low bit‐error probabilities offers a wider range of selections.

1.11 WINDOWS

Windows have been characterized and documented by a number of researchers [74, 75], and this section focuses on the windows that are used in the simulation codes in the following chapters to enhance various performance objectives. Windows are applied in radar and communication systems to achieve a variety of objectives including antenna sidelobe reduction, improvements in range resolution and target discrimination, spectral control for adjacent channel interference (ACI) reduction [76], ISI control, design of linear phase filters, and improvements in parameter estimation algorithms. Windows can be applied as time or frequency functions to achieve complementary results depending on the application.

Windows are described in terms of the discrete‐time samples44 w(n) where n is indexed over the finite window length of N samples. When the window is applied as a discrete‐frequency sampled window the notation W(n) is used. The discrete‐time sampled rectangular window, defined, for example, as w(n) = 1 for |n| ≤ N/2 and zero otherwise, is typically used as the reference window by which the performance measures of other windows are compared. The spectrum of the time‐domain rectangular window is described in terms of the sinc(x) function as S(f) = sinc(fTw). The rectangular window is also referred to as a uniformly weighted or simply as an unweighted window.

Several window parameters [75] that are useful in selecting a window for a particular application are the gain, the noise bandwidth, and the scalloping loss. The window voltage gain is defined as

(1.409) Gv = (1/N) Σn w(n)

The noise bandwidth of the window follows directly from the definition of the noise bandwidth defined by Equation (1.46). In terms of the discrete‐time sampling and application of Parseval’s theorem, the normalized noise bandwidth is expressed as

(1.410) BnTw = N Σn w^2(n) / [Σn w(n)]^2

In terms of Hertz, the bandwidth is given by

(1.411) Bn = N Σn w^2(n) / (Tw [Σn w(n)]^2)

where 1/Tw Hz is the fundamental frequency resolution of the window with duration Tw seconds. The theoretical normalized noise bandwidth of the rectangular window is images, so the noise bandwidth is Bn = 1/Tw Hz. Harris [75] defines the scalloping loss (SL) of a time‐domain window as the frequency‐domain loss, relative to the maximum level, midway between two adjacent DFT outputs. The scalloping loss is expressed as

(1.412) images

The characteristics of the various windows considered in the following sections are summarized in Tables 1.14 and 1.15. The maximum sidelobes correspond to those adjacent to the central, or main, lobe and apply to the time (or frequency) domain depending on whether the window is applied, respectively, in the frequency (or time) domain. The scalloping loss results from the frequency‐ or time‐domain ripple resulting from contiguous repetitions of the window functions. Harris has compiled an extensive table of windows and their performance characteristics.
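The window figures of merit are easily evaluated from the sampled window. The following Python sketch computes the voltage gain, normalized noise bandwidth, and scalloping loss for three windows; the specific sample-index conventions used for the Bartlett and cosine-squared windows are assumed here and therefore differ slightly from the tabulated values at small N.

```python
import numpy as np

# Sketch of the window figures of merit of (1.409)-(1.412): voltage gain, normalized
# noise bandwidth, and scalloping loss for an N-sample window. The window definitions
# below follow common forms and are assumed, not taken from the text.
def window_metrics(w):
    N  = w.size
    n  = np.arange(N)
    Gv = np.sum(w)/N                                        # voltage gain
    Bn = N*np.sum(w**2)/np.sum(w)**2                        # normalized noise bandwidth
    SL = np.abs(np.sum(w*np.exp(-1j*np.pi*n/N)))/np.sum(w)  # response midway between bins
    return Gv, Bn, -20*np.log10(SL)                         # scalloping loss in dB

N = 200
n = np.arange(N)
for name, w in (("rectangular", np.ones(N)),
                ("Bartlett",    1 - np.abs(2*n/(N - 1) - 1)),
                ("cosine k=2",  np.sin(np.pi*(n + 0.5)/N)**2)):
    print(name, window_metrics(w))
```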

TABLE 1.14 Summary of Window Performance Results

Window Max. Sidelobe (dB) At ±fTw Scalloping Loss (dB) with Zero‐Padding of 0, 1, 2, and 3 Window Lengths (Tw)
Rectangular −13.26 1.5 3.92 0.91 0.4 0.18
Bartlett −26.4 3.0 1.81 0.44 0.20 0.09
Blackman −58.2 3.6 1.09 0.27 0.12 0.05
Blackman–Harris −92.0 4.52 0.82 0.20 0.09 0.04
Hamming −42.6 4.5 1.74 0.43 0.19 0.09
Cosine k = 1 −23.0 1.89 2.08 0.51 0.22 0.10
k = 2 −31.5 2.4 1.41 0.35 0.15 0.07
k = 3 −39.3 2.83 1.06 0.26 0.12 0.05
k = 4 −46.7 3.33 0.85 0.21 0.09 0.04

Using N = 200 samples/window.

TABLE 1.15 Summary of Window Performance Results

Bartlett Blackman Blackman–Harris Hamming
Samples (N) BnTw Gv BnTw Gv BnTw Gv BnTw Gv
1000 1.3357 0.4995 1.728 0.4196 2.006 0.3584 1.364 0.5395
500 1.335 0.499 1.730 0.419 2.008 0.358 1.365 0.539
200 1.340 0.498 1.735 0.418 2.014 0.357 1.368 0.538
100 1.347 0.495 1.744 0.416 2.025 0.355 1.373 0.535
50 1.361 0.490 1.762 0.412 2.045 0.352 1.383 0.531
25 1.394 0.480 1.799 0.403 2.088 0.344 1.403 0.522
12 1.467 0.455 1.884 0.385 2.186 0.329 1.450 0.502
Cosine k = 1 Cosine k = 2 Cosine k = 3 Cosine k = 4
1000 1.235 0.636 1.502 0.4995 1.737 0.424 1.946 0.3746
500 1.236 0.635 1.503 0.499 1.738 0.4236 1.948 0.3742
200 1.240 0.633 1.507 0.498 1.744 0.422 1.954 0.373
100 1.246 0.630 1.515 0.495 1.752 0.420 1.964 0.371
50 1.260 0.624 1.531 0.490 1.770 0.416 1.984 0.368
25 1.290 0.610 1.562 0.480 1.807 0.407 2.026 0.360
12 1.364 0.580 1.636 0.458 1.892 0.389 2.121 0.344

Normalized noise bandwidth (BnTw) and voltage gain (Gv).

The duration of the window and the manner in which it is sampled are dependent upon the application. In the following descriptions, the sampling is applied to windows that are symmetrical and asymmetrical about t = 0 as described by (1.413) using the rect(x) window. The rect(x) window is a uniformly weighted window45 that is used to describe the window delay46 and duration Tw for all arbitrarily weighted windows.

The parameter Td introduces a delay and m is an integer corresponding to a contiguous sequence of windows; in the following analysis Td and m are zero. Considering N to be an integer number of samples over the window duration with the sampling interval of Ts = Tw/N seconds, the windows are characterized, with a maximum value of unity, in terms of the sample index n. In the following examples, the Bartlett or triangular window47 is used and the respective odd and even values of N are 9 and 8. For odd integers N, the asymmetrical triangle window is expressed as

and the symmetrical triangle window is expressed as

(1.415) images

For the even integers, the asymmetrical triangle window is expressed as

(1.416) images

and the symmetrical triangle window is expressed as

Equations (1.414) through (1.417) are plotted in Figure 1.34 with the circles indicating the window sampling instants. The distinction between the symmetrical and asymmetrical sampling is evident and must be applied commensurately to the sampled data. In this regard, the windows can be applied, for example, to the transmitted data‐modulated symbols for spectrum control, to the received data symbols for detection, or to the FFT window for spectrum evaluation. With minimum shift keying (MSK) modulation, discussed in Section 4.2.3.4, a cosine window is applied to each quadrature rail, and the rails are delayed, or offset, by one‐half symbol period, so the notion of symmetrical and asymmetrical windows applies.48 The examples using an odd number of window samples, shown in Figure 1.34a and b, include the first and last window samples that are zero and increase symmetrically to the peak value. Although these cases are visually appealing, the sampled windows in Figure 1.34c and d use an even number of samples per window and result in the same performance commensurate with the Nyquist sampling rate. Furthermore, using an even number of samples is suitable for analysis using the efficient FFT.


FIGURE 1.34 Examples of triangular window sampling.

The case for using an even number of samples can be made based on the down‐sampled output of a high sample rate analog‐to‐digital converter. For example, relative to a received data‐modulated symbol, the down‐sampled, or rate‐reduced, output is the average symbol amplitude of the down‐sampled interval from nTs to (n + 1)Ts, so the window weighting should be applied to the point mid‐way between the two samples as is often done using rectangular integration. This is accommodated by the even number of samples over the window interval shown in Figure 1.34c and d, with the understanding that the data sample at n = n′ + 0.5 results from the linear interpolation of the data between sample n′ and n′ + 1. In addition to applying the correctly aligned window with interpolated data samples, this sampling arrangement also removes the delay estimation bias, thereby improving the symbol tracking performance. Of course, the roles of the odd and even sampling can be reversed; however, it is convenient to use an even number of samples per symbol into the detection matched filter for symbol tracking and the detection of quadrature symbol offset modulations. However, as long as the Nyquist sampling criterion is satisfied the symbol information can be extracted in either case. In the following sections, the spectrums of the various windows are evaluated using both symmetrically and asymmetrically sampled windows with an odd number of samples. The reason for this choice is simply based on esthetics or eye appeal which is particularly noticeable when only a small number of samples is used.

1.11.1 Rectangular Window

In the time domain, the rectangular window is a uniformly weighted function described by the rect(t/Tw) function with amplitude equal to unity over the range |t/Tw| ≤ 1/2 and zero otherwise. Expressing the time in terms of the discrete samples t = nTs, where Ts is the sampling interval, results in the sample range |n| ≤ (N − 1)/2 for the symmetrical window and 0 ≤ nN − 1 for the asymmetrical window, where N = Tw/Ts is the total number of samples per window.

The spectrum of the rectangular window is described in terms of the sinc(fTw) function and, upon letting f = nδf and δf = Δf/L, where Δf = 1/Tw is the fundamental frequency resolution of the window, the spectrum is expressed as

(1.418) S(nδf) = sinc(nδfTw) = sinc(n/L)

Defining the frequency increment in this way allows for L samples per spectral sidelobe. The Fourier transform relationship between the time and frequency domains is discussed in Section 1.2. Special applications involving the rect(x) window in the time and frequency domains are discussed in Sections 1.11.4, 4.4.1, 4.4.4, and 4.4.5.

1.11.2 Bartlett (Triangular) Window

The sampling of the Bartlett or triangular window is discussed earlier under a variety of conditions that depend largely on the application and signal processing capabilities. The Bartlett window is shown in Figure 1.35 for N = 21 samples per window. Considering the frequency dependence of the spectral attenuation, expressed as sinc2(fTw/2), the spectral folding about the Nyquist band fN = fs/2 = N/2Tw is negligible for N = 201 and the peak spectral sidelobes are virtually identical to the theoretical Bartlett spectrum over the range of frequencies shown in Figure 1.36b, although the sidelobe level at fTw = 29 is −65.8 dB and the theoretical value is −66.5 dB. Furthermore, upon close examination, the peak values of the sidelobes are progressively shifted to the right with increasing fTw. These observations are more evident for the case involving N = 21 where the first sidelobe level is −26.8 dB (1 dB higher than theory) and the sidelobe at fTw = 9 is significantly skewed to the right with a level 6 dB higher than the theoretical value. The sidelobe skewing is a direct result of the odd symmetry of the folded spectrum about 2kfTw : k = 1, 2, … which does not occur when N is even. However, with even values of N, the sidelobes are still altered by the folded spectral sidelobes. This phenomenon is a direct result of the sampling and the implicit periodicity of the window when using the discrete Fourier transform. These observations do not alter the utility of windows for spectral control; however, they may influence the spectral detail in applications involving spectral analysis.


FIGURE 1.35 Bartlett window with N = 21 samples.

FIGURE 1.36 Bartlett (triangular) window spectrums: (a) N = 21 samples; (b) N = 201 samples.

1.11.3 Cosine Window

The sidelobes of the window spectrum can be reduced by providing additional shaping or tapering of the window function at the expense of a wider main spectral lobe that results in a higher noise bandwidth. For example, the rectangular window has an abrupt change in amplitude leading to a first sidelobe level of −13 dB, a spectral roll‐off proportional to 1/f2 (6‐dB/octave), and a noise bandwidth of 1/Tw Hz. On the other hand, the Bartlett window has an abrupt change in the slope of the amplitude leading to a first sidelobe level of −26.9 dB, a spectral roll‐off of 1/f4 (12‐dB/octave), and a noise bandwidth of 1.336/Tw. The cosine window raised to the k‐th power forces the amplitude and its lower‐order derivatives to zero at the edges of the window, resulting in even greater spectral sidelobe roll‐off with increasing k.

The cosine window is characterized as

where δ1 = 0 when images and δ1 = −π/2 when n = 0, …, N − 1; the latter corresponds to a sin^k(ϕn) window function. The phase function in (1.419) is expressed as

(1.420) images

where δ0 = 0 when N is odd and δ0 = 0.5 when N is even. The cosine window is shown in Figure 1.37 for n > 0 and k = 1, …, 4. In the following subsections, the cosine windows for various values of k are described for the conditions δ0 = δ1 = 0.
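Since (1.419) and (1.420) are not reproduced above, the following sketch generates the cosine-window family under the stated conditions δ0 = δ1 = 0, assuming the phase function ϕn = πn/(N − 1) over the symmetric index |n| ≤ (N − 1)/2; this form is an assumption consistent with the surrounding description, and it gives edge samples of zero and a center sample of unity for every k.

```python
import numpy as np

# Cosine-window family w_k(n) = cos^k(phi_n), assuming phi_n = pi*n/(N-1)
# over the symmetric index |n| <= (N-1)/2 with delta0 = delta1 = 0.
N = 21
n = np.arange(N) - (N - 1) / 2.0               # symmetric sample index
phi = np.pi * n / (N - 1)
windows = {k: np.cos(phi) ** k for k in (1, 2, 3, 4)}
for k, w in windows.items():
    # edge sample (essentially zero) and center (peak) sample for each k
    print(k, np.round(w[0], 12), np.round(w[(N - 1) // 2], 12))
```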


FIGURE 1.37 Cosine windows (δ0 = 0, δ1 = −π/2).

The gain for the finite sampled cosine window with k = 1 and N samples per window is given by

and the normalized noise bandwidth is given by

Gv and images are recorded in Table 1.15 for various values of N. Equations (1.421) and (1.422) approach their theoretical limits as N → ∞.
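Equations (1.421) and (1.422) are not reproduced above; the sketch below uses the standard finite-sample definitions of the window (coherent) gain, Gv = (1/N) Σ w(n), and the normalized noise bandwidth, N Σ w²(n)/(Σ w(n))², which are presumed to be the content of those equations. For the k = 1 cosine window the printed values trend toward the limits 2/π ≈ 0.637 and π²/8 ≈ 1.234 as N increases, consistent with the limiting behavior noted in the text.

```python
import numpy as np

def coherent_gain(w):
    """Window (coherent) gain: average of the window samples."""
    return np.sum(w) / len(w)

def noise_bandwidth(w):
    """Equivalent noise bandwidth normalized to 1/Tw Hz."""
    return len(w) * np.sum(w**2) / np.sum(w)**2

for N in (16, 64, 256, 4096):
    n = np.arange(N)
    w = np.sin(np.pi * n / (N - 1))            # k = 1 cosine (sine) window
    print(N, round(coherent_gain(w), 4), round(noise_bandwidth(w), 4))
```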

1.11.3.1 Cosine Window (k = 1)

With k = 1 the cosine window is expressed as

(1.423) images

This window is used as the symbol weighting function for MSK modulation and, in terms of the window duration Tw, the theoretical spectrum is given by

(1.424) images

The spectrum shown in Figure 1.38a is based on computer simulation with N = 200 samples per window.

FIGURE 1.38 Cosine window spectrums: (a) k = 1; (b) k = 2, raised cosine (Hanning); (c) k = 3 and 4.

1.11.3.2 Cosine‐Squared (Hanning) Window (k = 2)

The cosine‐squared window with k = 2 is referred to as a Hanning window [74]. Applying trigonometric identities, this window is expressed as

(1.425) images

and the spectrum is shown in Figure 1.38b for N = 200 samples per window.

1.11.3.3 Cosine Window (k = 3 and 4)

Applying trigonometric identities, the cosine windows for k = 3 and 4 are expressed as

(1.426) images

and

(1.427) images

The spectrums for these two cases are shown in Figure 1.38c.

1.11.4 Temporal Raised‐Cosine (TRC) Window

The temporal raised‐cosine (TRC) window applies a cosine shaping function symmetrically about each end of the rectangular window function, rect(nΔt/Tw), as shown in Figure 1.39. In this case, the integer n indexes the N samples over the interval Tw such that N = Tw/Δt.


FIGURE 1.39 Temporal raised‐cosine window.

The TRC window is expressed as

(1.428) images

where n′ = t/Tw = nΔt/Tw is the normalized sample index and α is a design parameter limited to 0 ≤ α ≤ 1. The TRC window is applied to the phase function of PSK‐modulated waveforms in Sections 4.2.8 and 4.4.3.9 to improve the waveform spectral containment while maintaining a constant signal amplitude; in this application α is referred to as the excess phase factor.
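Equation (1.428) is not reproduced above, so the following sketch implements one common form of a temporally raised‐cosine taper in which a cosine roll‐off of half‐width α/2 (in normalized time n′ = t/Tw) is centered on each edge |n′| = 1/2 of the rectangular window; the exact expression in (1.428) may differ in detail and the function name is illustrative.

```python
import numpy as np

def trc_window(n_prime, alpha):
    """Tapered-cosine (TRC-style) window versus normalized time n' = t/Tw,
    assuming a raised-cosine roll-off of half-width alpha/2 centered on each
    edge |n'| = 1/2 of the rectangular window (a sketch of (1.428))."""
    x = np.abs(np.asarray(n_prime, dtype=float))
    w = np.zeros_like(x)
    flat = x <= 0.5 * (1.0 - alpha)
    roll = (x > 0.5 * (1.0 - alpha)) & (x <= 0.5 * (1.0 + alpha))
    w[flat] = 1.0
    w[roll] = 0.5 * (1.0 + np.cos(np.pi * (x[roll] - 0.5 * (1.0 - alpha)) / alpha))
    return w

t = np.linspace(-1.0, 1.0, 401)                 # n' = t/Tw
print(trc_window(t, 0.5)[::100])                # [0. 0.5 1. 0.5 0.]
```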

Letting m = fTw, the spectrum of the TRC amplitude window is evaluated as

(1.429) images

1.11.4.1 Spectral Raised‐Cosine Window

The raised‐cosine window when applied in the frequency domain is referred to as the spectral RC (SRC) window and the parameter α is referred to as the excess bandwidth. With m = fTw, the SRC frequency response is expressed as

(1.430) images

and upon letting n = t/Tw, the SRC impulse response is evaluated as

This response has indeterminate solutions of the form 0/0 at n = 0 and 1/(2α) that are evaluated as images and

(1.432) images

Using these results, (1.431) is plotted in Figure 1.40 for several values of the excess bandwidth factor. The SRRC window, associated with spectral shaping of a modulated symbol, is discussed in detail in Section 4.3.2. In this application the symbol duration is T = Tw and the transmitted symbol and demodulator‐matched filter spectrums have a square‐root RC, or simply root RC, frequency response of images. The matched filter output results in the SRC impulse response shown in Figure 1.40 with symbol spacing corresponding to integer values of n that results in zero ISI ∀n ≠ 0. With an AWGN channel and optimum matched filter sampling the demodulator performance corresponds to maximum‐likelihood detection.
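A minimal numerical check of the zero-ISI property follows, assuming the standard raised-cosine impulse-response form for (1.431); the 0/0 points noted above are replaced by their limiting values, and the function name is illustrative.

```python
import numpy as np

def src_impulse(n, alpha):
    """Raised-cosine (SRC) impulse response versus normalized time n = t/Tw,
    assuming h(n) = sinc(n) cos(pi*alpha*n) / (1 - (2*alpha*n)^2) with the
    indeterminate points n = +/-1/(2*alpha) evaluated by their limits."""
    n = np.asarray(n, dtype=float)
    h = np.empty_like(n)
    den = 1.0 - (2.0 * alpha * n) ** 2
    sing = np.isclose(den, 0.0)                 # n = +/- 1/(2*alpha)
    h[~sing] = np.sinc(n[~sing]) * np.cos(np.pi * alpha * n[~sing]) / den[~sing]
    h[sing] = (np.pi / 4.0) * np.sinc(1.0 / (2.0 * alpha))
    return h

n = np.arange(-4, 5)                            # symbol-spaced sampling instants
print(src_impulse(n, 0.4))                      # h(0) = 1 and h(n) ~ 0 elsewhere: zero ISI
```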

FIGURE 1.40 Impulse response of SRC window (α = 0.2, 0.3, and 0.4).

1.11.5 Blackman Window

The Blackman window uses three terms to provide more tapering of the window at the expense of a narrower time response and a wider noise bandwidth. The Blackman window is expressed as

where the phase function in (1.433) is expressed as

(1.434) images

This definition of the phase function results in a symmetrical window for all values of N with indexing over images as shown in Figure 1.41a for N = 21. The spectrum of the Blackman window corresponding to N = 200 samples per window is shown in Figure 1.41b.


FIGURE 1.41 Blackman window and spectrum.

1.11.6 Blackman–Harris Window

The Blackman–Harris window is expressed as

(1.435) images

Harris [75] performed a gradient search on the coefficients ci to minimize spectral sidelobe level and the results are given in Table 1.16 for a three‐ and four‐term window.

TABLE 1.16 Blackman–Harris Window Coefficientsa

Coefficient   3‐Term (67 dB)   4‐Term (92 dB)
c0            0.42323          0.35875
c1            0.49755          0.48829
c2            0.07922          0.14128
c3            0.0              0.01168

aHarris [75]. Reproduced by permission of the IEEE.

The Blackman–Harris window, as formulated by Harris, divides the phase function by N, which results in an asymmetrical window. To provide a symmetrical window for all N, the phase function ϕn is divided by N − 1 and expressed as

(1.436) images

The spectrum of the 4‐Term 92 dB Blackman–Harris window is shown in Figure 1.42 for N = 200 samples per symbol and L = 40 samples in the bandwidth 1/Tw Hz.


FIGURE 1.42 Blackman–Harris window spectrum (N = 200, 92 dB, 4‐Term).

The coefficients for the Blackman–Harris function, as given in Table 1.16, do not result in window values of zero for the first and last samples, leaving a window pedestal of 6 × 10−5; this is a direct result of the alternating sum of the coefficients at the window edges not being zero. The pedestal plays a critical role in the control of the sidelobes and noise bandwidth, and windows with more pronounced pedestals are examined in the following sections.
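A quick check of the pedestal value, assuming the usual alternating-sign cosine-series form of the window so that the edge sample equals c0 − c1 + c2 − c3:

```python
# Edge (pedestal) value of the 4-term 92-dB Blackman-Harris window of Table 1.16,
# assuming the alternating-sign cosine-series form so that w_edge = c0 - c1 + c2 - c3.
c = [0.35875, 0.48829, 0.14128, 0.01168]
pedestal = c[0] - c[1] + c[2] - c[3]
print(pedestal)                                 # 6.0e-05
```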

1.11.7 Hamming Window

The Hamming window is expressed as

(1.437) images

where the phase function ϕn is expressed as

(1.438) images

The window is shown in Figure 1.43a with indexing over the range 0 ≤ n ≤ N − 1 and the spectrum is shown in Figure 1.43b.

FIGURE 1.43 Hamming window (N = 21) and spectrum (N = 201).

1.11.8 Kaiser (Kaiser–Bessel) Window

The Kaiser window, also referred to as the Kaiser–Bessel window, is expressed as

(1.439) images

where Io(x) is the modified Bessel function of order zero and β is the time‐bandwidth product equal to TwB where Tw is the window duration and B is the corresponding baseband bandwidth. The Kaiser window is shown in Figure 1.44a, using N = 51 samples per window, as a symmetrical window for β = 2, 3, and 4; the range −N/2 ≤ n ≤ N/2 corresponds to −Tw/2 ≤ t ≤ Tw/2. The corresponding spectrum of the Kaiser window is shown in Figure 1.44b.
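Since (1.439) is not reproduced above, the sketch below follows the common Kaiser–Bessel form in which the argument of Io(·) is π times the time-bandwidth product β = TwB (Harris's convention); the exact argument scaling used in (1.439) is an assumption, and the function name is illustrative.

```python
import numpy as np

def kaiser_window(N, beta_tb):
    """Kaiser (Kaiser-Bessel) window sketch: the Bessel argument is assumed to
    be pi times the time-bandwidth product beta_tb = Tw*B (Harris's form)."""
    n = np.arange(N) - (N - 1) / 2.0                       # symmetric index
    x = np.pi * beta_tb * np.sqrt(1.0 - (2.0 * n / (N - 1)) ** 2)
    return np.i0(x) / np.i0(np.pi * beta_tb)               # modified Bessel of order zero

for beta in (2, 3, 4):
    w = kaiser_window(51, beta)
    print(beta, round(w[0], 6), round(w[25], 6))           # edge and peak (center) values
```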


FIGURE 1.44 Kaiser windows and spectrums (β = 2, 3, and 4).

1.12 MATRICES, VECTORS, AND RELATED OPERATIONS

A matrix is a convenient way to describe complicated linear systems and is used extensively in the analysis of control systems. For example, a large linear system is generally described in terms of the inputs xj, outputs yi, and the system coefficients aij as

and using matrix and vector notations (1.440) is described as

(1.441) images

where the n inputs xj and the m outputs yi are described by the respective n × 1 and m × 1 vectors and the coefficients aij are described by the m by n matrix. The notation of an m by n matrix refers, respectively, to the number of rows and columns of the matrix. The linear system may include time‐varying inputs, outputs, and coefficients as found in control systems and time‐varying parameter estimation and tracking applications. The following descriptions of matrices and vectors are introductory and targeted to the description of adaptive systems in Chapter 12. More in‐depth discussions on the subject and applications are given by Derusso, Roy, and Close [77], Haykin [78], Sage and White [79], and Gelb [80].
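A small numerical instance of the linear system described by (1.440) and (1.441), with illustrative values:

```python
import numpy as np

# y = A x with an m-by-n coefficient matrix A, an n-by-1 input vector x,
# and an m-by-1 output vector y (illustrative values).
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0]])                 # m = 2 rows, n = 3 columns
x = np.array([1.0, -1.0, 2.0])                  # n inputs x_j
y = A @ x                                       # m outputs y_i = sum_j a_ij x_j
print(y)                                        # [-1.  5.]
```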

1.12.1 Definitions and Types of Matrices

  • An m × n matrix A contains m rows and n columns of elements aij and is denoted as
    (1.442)images
  • A diagonal matrix (Λ) is an m × m square matrix with aij = 0 : i ≠ j; the elements on the principal diagonal are identified as aii with Λ = diag(aii) ∀ i.
  • The unit matrix (I) is similar to the diagonal matrix with aij = 0 : i ≠ j and aii = 1.
  • A null matrix or zero matrix (O) is a matrix for which all of the elements are zero.
  • A symmetric matrix is a square matrix of real elements with A = AT, that is, aji = aij ∀ i, j.
  • A skew matrix is a square matrix of real elements with A = −AT, that is, aij = −aji ∀ i, j.
  • A nonsingular matrix A has an inverse A−1.
  • A complex matrix is denoted as à and has complex elements images where acij and asij denote the respective real and imaginary values of the complex elements. Conjugation of the complex matrix is denoted as Ã* with elements images.

The transposition of the matrix interchanges the rows and columns and is denoted with the superscript T. For example, the transpose of the matrix A is the n × m matrix denoted as

(1.443) images

The Hermitian matrix is a complex matrix with the elements below the principal diagonal equal to the complex conjugates of those above the principal diagonal, satisfying the condition (A*)T = A+ = AH = A, where the superscripts + and H denote the complex conjugate transposition. The superscript + generally denotes complex transposition and H is used to emphasize the Hermitian complex transposition.

The order of an (m × n) matrix is denoted as m‐by‐n and the order of the square (m × m) matrix is denoted as m; the rank, in contrast, is the number of linearly independent rows or columns of the matrix.

1.12.2 The Determinant and Matrix Inverse

The determinant [81] of an m × m square matrix A is denoted as |A| and defined in terms of the cofactors, Aij, of A as

The cofactors are defined as

where Mij is the determinant of the minor matrix of the element aij. The (m − 1) × (m − 1) minor matrix of aij is the matrix formed by the remaining m′ = m − 1 rows and columns of A excluding the row and column containing aij. For example, considering the 4 × 4 square matrix A, the 3 × 3 minor matrix A′ of the element a32 and the determinant M32 are expressed in (1.446) as

Therefore, using (1.446), the cofactor A32 is evaluated as A32 = (−1)5M32. Following the evaluation of the remaining three cofactors Ai2 : i = 1, 2, 4 or the cofactors A3j : j = 1, 3, 4, the determinant |A| is evaluated using, respectively, the first or second equality in (1.444).

For square matrices of order ≤ 3, the determinant can be computed in terms of the elements aij as expressed, for example, in the 3 × 3 matrix described in (1.447). In this example, the determinant is formed by summing the n = 3 products of the elements parallel to the principal diagonal elements (a11,a22,a33) and subtracting the n = 3 products of the elements parallel to the diagonal elements (a13,a22,a31). The three positive products are indicated by solid arrows pointing to the diagonal elements parallel to the principal diagonal and the three negative products are indicated by dashed arrows pointing to the diagonal elements parallel to the complement of the principal diagonal. In both cases, the elements below the diagonals are wrapped around to form the three element products. In general, this procedure does not give the correct results for orders > 3.

Determinants are defined only for square matrices and if |A| = 0 the matrix A is referred to as a singular matrix, otherwise, A is nonsingular. The inverse of a nonsingular n × n matrix A, with elements aij, is expressed as

(1.448) images

where the matrix [Aji] is defined as the adjoint of the matrix A. Considering the n × n matrix of cofactors [Aij] of the matrix A, with the cofactor of each element aij evaluated using (1.445), the adjoint matrix is the transpose of the matrix [Aij] so that

(1.449) images

Premultiplication of A−1 by A results in the unit matrix, that is, AA−1 = I.
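The cofactor expansion of (1.444) and (1.445) and the adjoint-based inverse of (1.448) and (1.449) can be sketched directly as follows; practical code would normally use LU factorization instead, and the function names are illustrative.

```python
import numpy as np

def minor(A, i, j):
    """Minor matrix of element a_ij: delete row i and column j."""
    return np.delete(np.delete(A, i, axis=0), j, axis=1)

def det(A):
    """Determinant by cofactor expansion along the first row."""
    m = A.shape[0]
    if m == 1:
        return A[0, 0]
    return sum((-1) ** j * A[0, j] * det(minor(A, 0, j)) for j in range(m))

def inverse(A):
    """Inverse as the adjoint (transposed cofactor matrix) divided by |A|."""
    m = A.shape[0]
    cof = np.array([[(-1) ** (i + j) * det(minor(A, i, j)) for j in range(m)]
                    for i in range(m)])
    return cof.T / det(A)

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
print(det(A), np.allclose(A @ inverse(A), np.eye(3)))      # 8.0 True
```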

1.12.3 Definition and Types of Vectors

A vector is an m × 1 column matrix with elements ai and is denoted as

(1.450) images

A complex vector is denoted as ã with complex elements images where ac and as denote the respective real and imaginary values of the complex elements. The transposition of the vector a is the 1 × m row vector aT. Conjugation and conjugate transposition are denoted by the respective superscripts * and +.

1.12.4 Matrix and Vector Operations

  • Addition of two matrices of the same order (m × n) is C = A + B: the elements of C are cij = aij + bij.
  • Multiplication of two matrices must conform to the following rule: [m × n][n × m′] = [m × m′], where the inner dimensions must be identical.
  • y = Ax : Matrix postmultiplication by a vector results in an [m × n][n × 1] = [m × 1] vector.
  • The element multiplication is images
  • y = xTA : Matrix premultiplication by a vector results in a [1 × n][n × m] = [1 × m] vector.
  • The element multiplication is images
  • C = AB : Matrix multiplication results in a images matrix.
  • The element multiplication is images
  • In general, multiplication is not commutative: AB ≠ BA.
  • A(BC) = (AB)C : Associative.
  • A(B + C) = AB + AC : Distributive.
  • Multiplication by a diagonal matrix D
  • C = AD : Postmultiplication of a real (m × m) matrix A by D results in the matrix C with the columns of A scaled by the diagonal elements of D; the elements of C are cij = aijdjj : i,j = 1, …, m.
  • C = DA : Premultiplication of a real (m × m) matrix A by D results in the matrix C with the rows of A scaled by the diagonal elements of D; the elements of C are cij = diiaij : i,j = 1, …, m. In general AD ≠ DA unless, for example, D = dI or A is itself diagonal.
  • Multiplication by the unit matrix I
  • IA = A : Multiplication of an (m × m) matrix A by I does not alter the matrix A.
  • IA = AI : Multiplication by I is commutative.
  • Transposition of AB
  • C = (AB)T = BTAT; note the reversal of order (verified numerically in the sketch following this list).
  • Scalar, or inner, product of two (m × 1) complex vectors images and
  • images are complex values and images
  • Scalar product of two (m × 1) real vectors x and y
  • images is a real scalar value and images
  • Orthogonality of two vectors is defined if their scalar product is zero.
  • Length of a complex vector images is defined as the magnitude, denoted as ||images||, of the inner product
    (1.451)images
  • Outer, or dyadic, product of an (m × 1) complex vector images and a (1 × n) complex vector forms an (m × n) complex matrix images with elements images. The autocorrelation matrix of a complex vector is computed as the outer product.
  • Differentiation of Complex Matrices: Differentiation of a complex matrix Ã(t) results in the complex matrix images with elements evaluated as images. The differentiation of sums and products of matrices is evaluated as
    (1.452)images
    (1.453)images
  • Differentiation of a complex vector follows by considering the vector as a single‐column matrix. Therefore, differentiation of real matrices and vectors is identical with asij = 0.
  • Differentiation of quadratic transformation with the matrix images expressed as
    (1.454)images

    where the elements of the (m × m) complex matrix Ã(t) and the complex (m × 1) vector images are functions of t. The derivative of images is evaluated as

    (1.455)images
  • For a real matrix A(t) and vector x(t), the matrix Q(t) is symmetric and the derivative of the quadratic function Q(t) simplifies to
    (1.456)images
  • Bilinear transformation with the real (m × m) matrix B(t), expressed in terms of the (m × 1) real vectors x(t) and y(t), is evaluated as
  • Differentiation of B(t) with respect to t is evaluated as
    (1.458)images
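The rules listed above can be spot-checked numerically; the following sketch, with illustrative values, verifies the product-transposition rule, the row and column scaling by a diagonal matrix, and the Hermitian structure of the outer-product (autocorrelation) matrix.

```python
import numpy as np

# Numerical spot checks of the matrix and vector rules listed above.
rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))
D = np.diag([1.0, 2.0, 3.0])
print(np.allclose((A @ B).T, B.T @ A.T))                   # (AB)^T = B^T A^T
print(np.allclose(D @ A, np.diag(D)[:, None] * A))         # DA scales the rows of A
print(np.allclose(A @ np.diag([1.0, 2.0, 3.0, 4.0]),
                  A * np.array([1.0, 2.0, 3.0, 4.0])))     # AD scales the columns of A
x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
R = np.outer(x, x.conj())                                  # outer (dyadic) product x x^H
print(np.allclose(R, R.conj().T))                          # the result is Hermitian
```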

1.12.5 The Quadratic Transformation

The quadratic matrix Q is similar to the bilinear matrix, expressed in (1.457), with the vector y = x. The following description is based on the (n × n) correlation matrix49

(1.459) images

where ũ is an (n,1) complex vector of arbitrary data and ũH is the complex conjugate transpose50 of ũ. The n columns of the matrix Q represent the (n,1) characteristic vectors of the correlation matrix R that is transformed to the diagonal matrix Λ as

The diagonal elements of Λ are the characteristic values51 of the matrix R.

The derivation of (1.460) follows the work of Haykin [82] and is based on the linear transformation of the (n,1) complex vector images to the vector images using the transformation

where λ is a constant. Equation (1.461) can be expressed as images which has nontrivial solutions with images ≠ 0, iff the determinant of RλI is zero, that is,

Equation (1.462) is the characteristic equation of the matrix R and has n solutions corresponding to the λi characteristic values or roots of (1.462). In general, the characteristic values are distinct; however, for the correlation matrix R, with the complex vector ũ based on samples of a discrete‐time weakly stationary stochastic process, the mean‐square value of the scalar images is evaluated as

Equation (1.463) corresponds to a nonnegative definite quadratic form and Haykin points out that the equality condition rarely applies in practice so that (1.463) is almost always positive definite.52 Consequently, in these cases, the characteristic values are real with λi > 0 ∀ i and, for the weakly stationary stochastic process, the characteristic values are equal with λi = σ2i.

Referring to (1.461), the n solutions to the characteristic equation are determined using

where images are the characteristic vectors corresponding to the characteristic values λi. Based on the correlation matrix R, it can be shown [83] that the characteristic values are all real and nonnegative and the characteristic vectors are linearly independent and orthogonal to each other. A corollary to the proof of independence is that when a characteristic vector is multiplied by a scalar constant the characteristic vectors remain independent and orthogonal. This allows for the independent scaling of all characteristic vectors so that

(1.465) images

resulting in an orthonormal set of characteristic vectors satisfying (1.464). Recognizing that the columns of the matrix Q are the characteristic vectors images and Λ is a diagonal matrix of characteristic values images, (1.464) is expressed as

and the orthonormal matrix Q satisfies the relationship QHQ = I; therefore, QH = Q−1 and premultiplying both sides of (1.466) by QH results in (1.460) as stated. Also, by postmultiplying both sides of (1.466) by QH, the correlation matrix is expressed as

(1.467) images

Haykin states that there is no best way to compute the characteristic values and suggests that the use of the characteristic equation should be avoided except for the simple case involving the 2 × 2 matrix. However, a number of authors [77, 78, 84–89] describe computationally efficient methods for computing the characteristic values.
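A brief numerical sketch of (1.459), (1.460), and (1.467): the correlation matrix is estimated as the average of the outer products ũũH from synthetic complex data, and its characteristic (eigen) decomposition is computed with a library routine rather than the characteristic equation, in keeping with the remark above; the data and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 4, 10000
U = rng.standard_normal((trials, n)) + 1j * rng.standard_normal((trials, n))
R = (U.T @ U.conj()) / trials              # sample average of the outer products u u^H
lam, Q = np.linalg.eigh(R)                 # Hermitian R: real, nonnegative characteristic values
print(np.allclose(Q.conj().T @ R @ Q, np.diag(lam)))   # Q^H R Q = Lambda, as in (1.460)
print(np.allclose(Q @ np.diag(lam) @ Q.conj().T, R))   # R = Q Lambda Q^H, as in (1.467)
print(lam)                                 # all positive, near E[|u_i|^2] = 2 for this data
```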

Consider the quadratic form images discussed in Section 12.3, where images is an (m × 1) complex vector with elements images and R = images is an (m × m) Hermitian matrix with elements images: i ≠ j corresponding to the autocorrelation matrix of the (m × 1) complex vector images. The following derivatives with respect to the complex vector images are evaluated as:

and

where images is a (m × 1) complex vector. The solutions to the derivative of the quadratic form with respect to the vector w are used in Chapter 12 in the analysis of adaptive systems. The proofs of (1.468), (1.469), and (1.470) are given by Haykin [78].

1.13 OFTEN USED MATHEMATICAL PROCEDURES

This section outlines the processing and formulas encountered in several of the chapters throughout this book.

1.13.1 Prime Factorization and Determination of Greatest Common Factor and Least Common Multiple

Prime factors are used in various coding applications where polynomials are used as code generators. The greatest common factor (GCF) and least common multiple (LCM) are often used where signal processing sample‐rate changes are required to improve performance or reduce processing complexity. They are also used in the implementation of mixed radix fast Fourier transforms. An integer p is prime if p ≠ ±1 and its only divisors are ±p and ±1. The procedures for determining the prime factors of a number and the GCF and LCM of two or more numbers are illustrated by way of the following examples. The algorithms are easy to generalize and implement in a computer program.

1.13.1.1 Prime Factors of Two Numbers

The following examples, presented in Table 1.17, demonstrate the procedure for determining the prime factors of the numbers 120 and 200. The results are used in the following two examples to determine the GCF and LCM. The prime factors of a number are determined starting with repeated division of the number by 2 and continuing with division by successively larger prime divisors.

TABLE 1.17 Example of Prime Factorization

Using 120 Using 200
120 ÷ 2 = 60 200 ÷ 2 = 100
60 ÷ 2 = 30 100 ÷ 2 = 50
30 ÷ 2 = 15 50 ÷ 2 = 25
15 ÷ 3 = 5 25 ÷ 5 = 5
5 ÷ 5 = 1 5 ÷ 5 = 1
The prime factors are:
2 × 2 × 2 × 3 × 5 2 × 2 × 2 × 5 × 5
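The trial-division procedure of Table 1.17 can be sketched as follows; the function name is illustrative.

```python
def prime_factors(n):
    """Prime factorization by repeated trial division, starting with 2 and
    continuing with successively larger divisors (as in Table 1.17)."""
    factors, d = [], 2
    while n > 1:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    return factors

print(prime_factors(120))   # [2, 2, 2, 3, 5]
print(prime_factors(200))   # [2, 2, 2, 5, 5]
```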

1.13.1.2 Determination of the Greatest Common Factor

This example outlines the procedure for determining the GCF of the two numbers 120 and 200 using the prime factors listed in Table 1.17; the inclusion of more than two numbers is straightforward. The GCF is also referred to as the greatest common divisor (GCD). The GCF (or GCD) of 120 and 200 is determined as the product of the prime factors taken the minimum number of times that they occur in any one of the prime factorizations, that is,

images

1.13.1.3 Determination of the Least Common Multiple

This example outlines the procedure for determining the LCM of 120 and 200. In this case, the LCM is determined as the product of the prime factors taken the maximum number of times that they occur in any one of the prime factorizations, that is,

images
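The GCF and LCM rules of the two preceding examples can be sketched as follows; the factorization routine repeats the trial-division sketch above so that the block is self-contained, and the function names are illustrative.

```python
from collections import Counter

def prime_factors(n):
    factors, d = [], 2
    while n > 1:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    return factors

def gcf_lcm(a, b):
    """GCF (GCD) and LCM built from the prime factorizations by taking each
    prime the minimum (GCF) or maximum (LCM) number of times it occurs."""
    fa, fb = Counter(prime_factors(a)), Counter(prime_factors(b))
    gcf = lcm = 1
    for p in set(fa) | set(fb):
        gcf *= p ** min(fa[p], fb[p])
        lcm *= p ** max(fa[p], fb[p])
    return gcf, lcm

print(gcf_lcm(120, 200))    # (40, 600)
```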

1.13.2 Newton’s Method

A transcendental equation involves trigonometric, exponential, logarithmic, and other functions that do not lend themselves to solutions by algebraic means. Newton's method [90] of solving transcendental equations is used extensively in arriving at solutions to problems characterized by nonalgebraic equations. The method provides a rapid and accurate way to solve equations of the form f(x) = h(x) by finding the root of the auxiliary function g(x) = f(x) − h(x) = 0. The solution starts with an initial estimate images and performs iterative updates to the estimate, described as

(1.471) images

where g′(xi) = ∂g(xi)/∂xi. The evaluation is terminated when images where ε is an acceptable error in the solution and the corresponding xi+1 is the desired value of x satisfying f(x) ≅ h(x).
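A minimal sketch of the iteration (1.471), applied to the transcendental equation cos(x) = x; the function and parameter names are illustrative.

```python
import math

def newton(g, dg, x0, eps=1e-10, max_iter=50):
    """Newton iteration x_{i+1} = x_i - g(x_i)/g'(x_i), terminated when the
    update falls below the acceptable error eps, as in (1.471)."""
    x = x0
    for _ in range(max_iter):
        step = g(x) / dg(x)
        x -= step
        if abs(step) < eps:
            break
    return x

# Solve cos(x) = x via the auxiliary function g(x) = cos(x) - x.
root = newton(lambda x: math.cos(x) - x, lambda x: -math.sin(x) - 1.0, 1.0)
print(root)                 # approximately 0.739085
```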

1.13.3 Standard Deviation of Sampled Population

When a finite number of samples n comprises the entire population, the standard deviation is computed as

(1.472) images

where the summation is over the entire population. However, when the samples n are a subset of the entire population, the standard deviation is computed as

(1.473) images

where the summations are over the sample size n of the subset.
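A brief numerical illustration of (1.472) and (1.473), assuming (1.473) is the usual divide-by-(n − 1) form for a subset of the population; numpy exposes the two divisors through the ddof argument.

```python
import numpy as np

# Population standard deviation divides by n; the subset (sample) estimate
# divides by n - 1 to account for the use of the sample mean.
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
sigma_population = np.std(x, ddof=0)       # entire population: divide by n
sigma_sample = np.std(x, ddof=1)           # subset of a population: divide by n - 1
print(sigma_population, sigma_sample)      # 2.0 and about 2.138
```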

1.13.4 Solution to the Indeterminate Form 0/0

The frequently encountered functional form

(1.474) images

evaluated at x = a often results in the indeterminate form 0/0 resulting from f(a) = g(a) = 0. Applying L'Hospital's rule of repeated differentiation of f(x) and g(x) and evaluating the result as

(1.475) images

often leads to a solution for n = 1 or 2. Solutions to other indeterminate forms involving ∞/∞, 0 ⋅ ∞, ∞ − ∞, 00, ∞0, 0∞, and 1∞ may also be found using similar techniques [90, 91].
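A short symbolic sketch of the repeated-differentiation procedure in (1.475), using sympy (an assumed, commonly available library) on the 0/0 form (1 − cos x)/x2 at x = 0, which resolves after n = 2 differentiations to 1/2:

```python
import sympy as sp

# Repeated differentiation (L'Hospital's rule): differentiate numerator and
# denominator until the ratio at x = a is no longer of the form 0/0.
x = sp.symbols('x')
num, den, a = 1 - sp.cos(x), x**2, 0        # 0/0 at x = 0
while num.subs(x, a) == 0 and den.subs(x, a) == 0:
    num, den = sp.diff(num, x), sp.diff(den, x)
print(num.subs(x, a) / den.subs(x, a))      # 1/2, agreeing with sp.limit((1 - sp.cos(x))/x**2, x, 0)
```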

1.14 OFTEN USED MATHEMATICAL RELATIONSHIPS

In this section a number of mathematical relationships are listed as found in various mathematical handbooks. The references frequently referred to are as follows: Burington [92], Korn and Korn [93], Abramowitz and Stegun [94], and Gradshteyn and Ryzhik [46].

1.14.1 Finite and Infinite Sums

  1. images
  2. images
  3. images
  4. images
  5. images
  6. images
  7. images
  8. images
  9. images
  10. images
  11. images
  12. images
  13. images
  14. images

1.14.2 Binomial Theorem and Coefficients

  1. images
  2. images
  3. images
  4. images
  5. images
  6. images
  7. images

TABLE 1.18 Brief List of Binomial Coefficients

n \ m 0 1 2 3 4 5 6 7 8 9 10
1 1 1
2 1 2 1
3 1 3 3 1
4 1 4 6 4 1
5 1 5 10 10 5 1
6 1 6 15 20 15 6 1
7 1 7 21 35 35 21 7 1
8 1 8 28 56 70 56 28 8 1
9 1 9 36 84 126 126 84 36 9 1
10 1 10 45 120 210 252 210 120 45 10 1

1.14.3 Trigonometric Identities

  1. images
  2. images
  3. images
  4. images
  5. images
  6. images
  7. images
  8. images
  9. images
  10. images
  11. images
  12. images
  13. images
  14. images
  15. images
  16. images
  17. images
  18. images
  19. images
  20. images

1.14.4 Differentiation and Integration Rules

The notations u and v are functions of x

  1. images
  2. images
  3. images
  4. images ; images
  5. images ; images
  6. images ; images
  7. images ; images
  8. images ; images
  9. images ; images
  10. images ; images
  11. images ; images
  12. images : integration by parts with u = f(x) and dv = g(x)dx
  13. images
  14. images
  15. images
  16. images
  17. images
  18. images
  19. images
  20. images
  21. images
  22. images

1.14.5 Inequalities

  1. images
  2. images
  3. images bn > 0; equality holds iff bn = an
  4. images Equality holds for c = constant > 0 iff an= c bn
  5. images Equality holds for c = constant > 0 iff f(x) = c g(x).

1.14.6 Relationships between Complex Numbers

For images and images

  1. images
  2. images ; images
  3. images
  4. images
  5. images
  6. images
  7. images
  8. images

1.14.7 Miscellaneous Relationships [94]

  1. images
  2. images
  3. images
  4. images
  5. images [95]
  6. images [95]
  7. images
  8. images
  9. images
  10. images
  11. images
  12. images 53

    Example: for b = 3: a = …−4 −3 −2 −1 0 1 2 3 4 …

    r = …−1 0 –2 −1 0 1 2 0 1 …

  13. images *

    Example: for b = 3: a = …−4 −3 −2 −1 0 1 2 3 4 …

    r = … 2 0 1 2 0 1 2 0 1 …

  14. images *
  15. The solutions to the quadratic equation images are images
  16. Completing the square of: images
  17. Completing the square of: images

ACRONYMS

ACI
Adjacent channel interference
ACK
Acknowledgment (protocol)
AFSCN
U.S. Air Force Satellite Control Network
AM
Amplitude modulation
ARQ
Automatic repeat request
AWGN
Additive white Gaussian noise
BPSK
Binary phase shift keying
BT
Time bandwidth (product, low pass)
CFAR
Constant false‐alarm rate
CRC
Cyclic redundancy check (code)
DC
Direct current
DFT
Discrete Fourier transform
DSB
Double sideband
DSSS
Direct‐sequence spread‐spectrum (waveform)
EHF
Extremely high frequency
ELF
Extremely low frequency
FFT
Fast Fourier transform
FM
Frequency modulation
FSK
Frequency shift keying
GCD
Greatest common divisor
GCF
Greatest common factor
HF
High frequency
I/Q
In‐phase and quadrature (channels or rails)
IDFT
Inverse discrete Fourier transform
IF
Intermediate frequency
IFFT
Inverse Fast Fourier transform
ISI
Intersymbol interference
LCM
Least common multiple
LF
Low frequency
LLR
Log‐likelihood ratio
LR
Likelihood ratio
MAP
Maximum a posteriori
MF
Medium frequency
ML
Maximum likelihood
MMSE
Minimum mean‐square error
MS
Mean square
MSK
Minimum shift keying
NAK
Negative acknowledgment (protocol)
OQPSK
Offset quadrature phase shift keying
PM
Phase modulation
PN
Pseudo‐noise (sequence)
PSD
Power spectral density
PSK
Phase shift keying
QAM
Quadrature amplitude modulation
QPSK
Quadrature phase shift keying
RC
Raised‐cosine
RQ
Repeat request
RRC
Root‐raised‐cosine (temporal)
RSS
Root‐sum‐square
SHF
Super high frequency
SLF
Super low frequency
SRC
Spectral raised‐cosine
SRRC
Spectral root‐raised‐cosine
SS
Spread‐spectrum
TRC
Temporal raised‐cosine
UHF
Ultra‐high frequency
ULF
Ultra‐low frequency
VLF
Very low frequency
WSS
Wide‐sense stationary
WT
Time bandwidth (product, bandpass)

PROBLEMS

  1. Show that the amplitude‐modulated waveform given by (1.2), when heterodyned by a receiver local oscillator that is phase locked to the received carrier angular frequency ωc, recovers the modulation function images, except for a factor of 1/2.

    Hint: Mix (1.2) with sin(ωct+ϕ) and show that ϕ must be zero.

  2. Show that the real signal given by (1.13) is a form of suppressed carrier modulation. Under what conditions of M(t) and ϕ(t) + ψ(t) does (1.13) reduce to the form of the suppressed carrier modulation given by (1.12)? What can be said about the information capacity between the suppressed carrier modulations given by (1.12) and (1.13)?
  3. Compute the Hilbert transform of images and images.
  4. Given that the bandwidth of the modulation function A(t) satisfies the condition B << fc, compute the Hilbert transform of images.
  5. Show that the Fourier coefficients Cn and C−n, expressed in (1.30), form complex conjugate pairs when f(t) is real.
  6. Show that the real‐valued function f(t) can be expressed in terms of the Fourier series real coefficient Co = αo and the complex coefficients Cn = αn + jβn : 1 ≤ n ≤ ∞ as
    images
    where images and images. Note: This solution is based on Problem 5.
  7. Show that the finite summation images is equal to the second equality in (1.52).

    Hint: Expand the summation and combine the exponential terms to yield a series involving cos(nωot) terms and then evaluate the closed form of the corresponding trigonometric series as identified in Section 1.14.1, Identity No. 12.

  8. With ωo = 2π/T show that the integral images is equal to unity as N → ∞.
  9. Referring to Figure 1.7 and using ωo = 2π/T, show that the maximum value of (1.52) is (2N + 1)/T and that the closest zero or null removed from a maximum occurs at t = nT ± T/(2N + 1): |n| = 0,1,….
  10. Consider a radix‐2, N‐point, pipeline FFT with the output sampled at intervals of T = NsTs seconds, where Ns is the number of samples per symbol. If the sequential input samples are simply passed through the FFT delay elements with the complex multiplications and additions only performed at the output sampling instants: (A) determine the percentage of complex multiplies relative to the pipeline FFT sampled every Ts seconds. Examine the result as a function of increasing Ns with 1 ≤ Ns ≤ 32 and N ≥ Ns; (B) determine the minimum number of complex multiplications when 100 % zero padding is used for frequency estimation and tracking and discuss the pipeline FFT sampling requirements.
  11. Compute the second moment, E[X2], of the Gaussian random variable X with mean m and variance σ2.
  12. Referring to (1.165) compute images for the conditional Gaussian pdf, expressed by the second equality of (1.168), with images and images. Express the result in terms of the expectations as images and express C1, C2, and C3 in terms of the parameters images.

    Using images, images, and images evaluate images with m1 = m2 = 0.

  13. Repeat Problem 12 using (1.172) and show that
    images
    when x1 and x2 are zero‐mean Gaussian random variables.
  14. In the transformation from fX(x) to fZ(z), discussed in Section 1.5.4.1, show that the inverse relationship in (1.188) applies for the function z = ax2.

    Part B: Express fZ(z) using (1.186) or (1.187).

    Part C: Express fZ(z) when the pdf of fX(x) is Gaussian with mean value m and variance σ2. Plot or sketch your expression fZ(z) as a function of z.

    Note: eλ + eλ = 2cosh(λ).

    Part D: Express fZ(z) when m = 0 in Part C and plot or sketch as a function of z.

  15. Given the statistically independent ordered random variables {X1, X2, …, Xn} such that a ≤ X1 < X2 < ⋯ < Xn ≤ b and characterized by the uniformly distributed pdf images: ∀i, images with the corresponding cdf expressed as images. Show that images with images.

    Hint: Start with images with images and note that images.

  16. For the transformation in Section 1.5.5, evaluate the Jacobian in (1.211) using the phase angle expressed as θ = tan−1(xs/xc).

    Hint: use g11(xc,xs) = g12(xc,xs) = images and g21(xc,xs) = g22(xc,xs) = tan−1(xs/xc).

  17. Given the joint pdf fX,Y(x,y), expressed in (1.166), compute the marginal pdf MX(x).

    Hint: Complete the square using: images.

  18. Given the pdf fX,(x), perform the following:
    1. Compute the pdf fY,(y) under the condition y = |x|. Note that fY,(y) = 0 for y < 0.
    2. Determine and sketch fY,(y) when fX,(x) is described by the normal distribution N(mx,σx)
    3. Repeat Part B with mx = 0
  19. Show that the limiting form of the Ricean distribution, expressed by (1.222), corresponds to the Rayleigh distribution as A → 0. Refer to Table 1.8.

    Hint: Use the ascending series expression images with images.

  20. Show that the limiting form of the Ricean distribution, expressed by (1.222), corresponds to the Gaussian distribution as A → ∞. Refer to Table 1.8.

    Hint: Use the asymptotic expansion of Io(z) for large arguments expressed as Io(z) ~ images with images.

    Recognize that as rA the condition r = A results in the Gaussian distribution.

  21. Determine the marginal pdf of Y1 = min{X1, X2, …, Xn} given the joint pdf gY(y1, y2, …, yn) of the uniformly distributed ordered samples images corresponding to images.

    Hint: Show that the cdfs in the descending order images are expressed as

    images

    with images and images.

    Also show that images where images.

  22. Show that the Nakagami‐m distribution is the same as the Rayleigh power distribution.

    Hint: Use the transformation images in the Rayleigh distribution.

  23. Derive the expression for the characteristic function CX(v) for the Gaussian distribution fX(x) with mean value xo and variance σ2.
  24. Set up the integrations identifying the integration limits and ranges of the variable z for the evaluation of fZ(z) where the random variable Z is the summation of three (3) zero‐mean uniformly distributed random variables X between –a and a.

    Hint: There are three unique ranges on z. The evaluation of the integrations is optional; however, the application of Mathsoft’s Mathcad® symbolic formula evaluation is an error‐free time saver.

  25. Using fZ(z) evaluated in Problem 24 for N = 3, compute the first and second moments of the random variable Z using (1.254) and compare the results with those in Table 24.

    Hint: It is much easier and less prone to mistakes to use Mathsoft’s Mathcad symbolic formula evaluation.

  26. Show that the moments of the random variable X are determined from the characteristic function as expressed in (1.240).

    Hint: Take the first derivative of CX(v) with respect to v and evaluate the result for v = 0; and observe that the resulting integral is E[x]. Repeat this procedure for additional derivatives of CX(v) and show that (1.240) follows.

  27. Plot the cdf of a zero‐mean Gaussian distribution with variances corresponding to the second moments in Table 1.6 for N = 3 and 4 and compare the results with the corresponding cdf’s in Figure 1.23; comment on the quality of the match in light of the central limit theorem. Repeat this exercise using the theoretical second moments from Table 1.7 for N = 3 and 4 and compare with the corresponding cdf’s in Figure 1.26.
  28. Show that equations (1.261) and (1.262) apply for λv <<1 as N increases in the respective summation of N iid distributions in Examples 1 and 2 of Section 1.5.6.1.
  29. The narrowband noise process n(t), given by (1.307), is expressed in terms of the baseband analytic noise ñ(t) as
    images
    Using this relationship, express the correlation function images in terms of the individual correlation functions images, Rññ(τ), images and images. What are the required conditions on these correlation functions to satisfy the stationarity property of the narrowband process n(t)?
  30. Express the individual correlation functions in Problem 29 in terms of the correlation functions Rcc(τ), Rcs(τ), Rsc(τ), and Rss(τ), where the baseband analytic noise is given by images. Use these results and the conditions for stationarity found in Problem 29 to express Rnn(τ) in terms of the Rcc(τ) and Rsc(τ).
  31. Referring to (1.315), which applies to the noise power out of a bandpass filter centered at the positive frequency fc: when the bandpass filter output is mixed to baseband, express the noise power out of the baseband filter in terms of the bandwidth B and the one‐sided noise spectral density No.
  32. Given the noise input, expressed by (1.307), to a linear filter with impulse response h(t), show that the respective input and output of the correlation responses Rnn(τ) and images are related by the convolutions images. Using this result with Fourier transform pairs h(τ) ↔ H(f) and h*(−τ) ↔ H*(f), show the relationship between the input and output noise spectrums.

    Hint: Using the convolution integral images show that n′(t) has zero‐mean. Then from the correlation

    images

    and show that images and, as the final step, form the correlation

    images

    and show that images.

  33. Derive the expression for the matched filter output signal‐to‐noise ratio when the additive noise is not white noise, that is, the noise power spectral density into the matched filter is images.
  34. Under the condition stated in Section 1.7.1 show that (1.332) is a wide‐sense stationary random process.
  35. Given the random process x(ti) = a where ti is a discrete‐time sample and a is a discrete random variable such that a = 1 with probability p and a = −1 with probability q = 1 − p. Using (1.303) and (1.304) determine if x(ti) is ergodic.
  36. Show that the random process images is wss if fc is constant and x(t) is a wss random process independent of the random variable ϕ uniformly distributed over the interval 0 to 2π. Also, express the PSD Sy(ω) in terms of the autocorrelation Rx(τ) and the PSD Sx(ω).
  37. The risk for the mean‐square estimate is defined as
    images
    Show that images and results in the optimum estimate given by (1.356).
  38. Determine if the MS and MAP estimates in the example of Section 1.9.1 are unbiased estimates. If not, what is the bias of the estimate? Also, evaluate the Cramér–Rao bound for the estimates and, using (1.366), determine if the estimates are efficient.
  39. Given that the received baseband signal amplitude is A volts, using the ML estimate, determine the following: Part 1, the variance images of the estimation error of A given the baseband samples images: i = 1, …, N where ni are iid Gaussian random variables characterized as N(0,σn); Part 2, show that the estimate âml(r) is efficient; Part 3, show the condition for which the estimate âml(r) is unbiased.
  40. Repeat Problem 39 under the following condition: the baseband signal amplitude is Gaussian distributed with a priori pdf pa(A) characterized by N(A,σa).
  41. Using (1.384) determine the effective bandwidth (β) for the isosceles triangle shaped pulse with base equal to 2τ and peak amplitude of AN volts.

    Hints: The solution to the integral images is encountered with m = 2 and the double factorial [96] is defined as (2m + 1)!! = 1 × 3 × 5 … (2m + 1). The denominator in the expression for α2 is the signal energy E.

  42. Determine the normalized effective bandwidth (βT) and the corresponding standard deviation (σTd) for the SRC and SRRC waveforms with 100% excess bandwidth, that is, α = 1.
  43. Determine the normalized effective bandwidth (βT) and the corresponding standard deviation (σTd) of the delay estimate for the SRC and SRRC waveforms with zero excess bandwidth, that is, α = 0.
  44. Determine the noise bandwidth for the SRRC and SRC frequency functions.

    Note: the noise bandwidth is defined by (1.46).

  45. Using (1.387) determine the normalized effective time duration images for the rectangular pulse Arect(t/T − 0.5).

REFERENCES

  1. C.E. Shannon, “Communication in the Presence of Noise,” Proceedings of the IEEE, Vol. 86, Issue 2, pp. 447–457, February 1998.
  2. M. Abramowitz, I.A. Stegun, Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, National Bureau of Standards, Applied Mathematical Series 55, Washington, DC, U.S. Government Printing Office, p. 361, June 1964.
  3. R.A. Manske, “Computer Simulation of Narrowband Systems,” IEEE Transactions on Computers, Vol. C‐17, No. 4, pp. 301–308, April 1968.
  4. E.C. Titchmarsh, The Theory of Functions, 2nd ed., Oxford University Press, New York, 1939.
  5. E.C. Titchmarsh, Introduction to the Theory of Fourier Integrals, 2nd ed., Oxford University Press, New York, 1948.
  6. W.B. Davenport, Jr., W.L. Root, An Introduction to the Theory of Random Signals and Noise, McGraw‐Hill Book Company, New York, 1958.
  7. A. Papoulis, The Fourier Integral and Its Applications, pp. 9, 42–47, McGraw‐Hill Book Company, New York, 1987.
  8. A. Papoulis, Probability, Random Variables and Stochastic Processes, McGraw‐Hill Book Company, New York, 1965.
  9. A.V. Oppenheim, R.W. Schafer, Digital Signal Processing, Chapter 11, “Power Spectrum Estimation,” Prentice‐Hall, Inc., Englewood Cliffs, NJ, 1975.
  10. M.S. Bartlett, An Introduction to Stochastic Processes with Special Reference to Methods and Applications, Cambridge University Press, New York, 1953.
  11. B.P. Bogert, Guest Editor, “The Fast Fourier Transform and Its Application to Digital Filtering and Spectral Analysis,” Special Issue of the IEEE Transactions on Audio and Electroacoustics, Vol. AU‐15, No. 2, June 1967.
  12. J.W. Cooley, P.A.W. Lewis, P.D. Welch, “The Finite Fourier Transform,” IEEE Transactions on Audio and Electroacoustics, Vol. AU‐17, No. 2, pp. 77–85, June 1969.
  13. R.C. Singleton, “A Short Bibliography on the Fast Fourier Transform,” IEEE Transactions on Audio and Electroacoustics, Vol. AU‐17, No. 2, pp. 166–169, June 1969.
  14. G.D. Bergland, “A Guided Tour of the Fast Fourier Transform,” IEEE Spectrum, Vol. 6, pp. 41–52, July 1969.
  15. T.H. Glisson, C.I. Black, A.P. Sage, “The Digital Computation of Discrete Spectra Using the Fast Fourier Transform,” IEEE Transactions on Audio and Electroacoustics, Vol. AU‐18, No. 3, pp. 271–287, September 1970.
  16. J.D. Markel, “FFT Pruning,” IEEE Transactions on Audio and Electroacoustics, Vol. AU‐19, No. 4, pp. 305–311, September 1971.
  17. P.D. Welch, “The Use of the Fast Fourier Transform for the Estimation of Power Spectra: A Method Based on Time Averaging over Short, Modified Periodograms,” IEEE Transactions on Audio and Electroacoustics, Vol. AU‐15, pp. 70–73, June 1967.
  18. J.W. Cooley, J.W. Tukey, “An Algorithm for Machine Calculation of Complex Fourier Series,” Mathematics of Computation, Vol. 19, No. 90, pp. 297–301, April 1965.
  19. E.O. Brigham, R.E. Morrow, “The Fast Fourier Transform,” IEEE Spectrum, Vol. 4, No. 2, pp. 63–70, December 1967.
  20. E.O. Brigham, The Fast Fourier Transform and Its Applications, Prentice‐Hall, Inc., Englewood Cliffs, NJ, 1988.
  21. H.L. Groginsky, G.A. Works, “A Pipeline Fast Fourier Transform,” IEEE Transactions on Computers, Vol. C‐19, No. 11, pp. 1015–1019, November 1970.
  22. A.V. Oppenheim, R.W. Schafer, Digital Signal Processing, pp. 542–554, Prentice‐Hall, Inc., Englewood Cliffs, NJ, 1975.
  23. A.V. Oppenheim, R.W. Schafer, Digital Signal Processing, pp. 548–549, Prentice‐Hall, Inc., Englewood Cliffs, NJ, 1975.
  24. P.M. Woodward, Probability and Information Theory, with Applications to Radar, Pergamon Press, London, 1960.
  25. M. Abramowitz, I.A. Stegun, Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, National Bureau of Standards, Applied Mathematical Series 55, Washington, DC, U.S. Government Printing Office, p. 231, June 1964.
  26. P.F. Panter, Modulation, Noise, and Spectral Analysis Applied to Information Transmission, McGraw‐Hill Book Company, New York, 1965.
  27. S. Goldman, Transformation Calculus and Electrical Transients, Chapter 9, “Bessel Functions,” Prentice‐Hall, Inc., Englewood Cliffs, NJ, 1950.
  28. H. Cramér, Mathematical Methods of Statistics, Princeton University Press, Princeton, NJ, 1974.
  29. A. Leon‐Garcia, Probability and Random Processes for Electrical Engineering, Second Edition, Addison‐Wesley Publishing Company, Inc., New York, May 1994.
  30. J.M. Wozencraft, I.M. Jacobs, Principles of Communication Engineering, John Wiley & Sons, Inc., New York, 1967.
  31. H.D. Brunk, An Introduction to Mathematical Statistics, Blaisdell Publishing Company, Waltham, MA, 1965.
  32. W. Feller, An Introduction to Probability Theory and Its Applications, John Wiley & Sons, Inc., New York, 1957.
  33. A. Papoulis, Probability, Random Variables, and Stochastic Processes, p. 236, McGraw‐Hill Book Co., New York, 1965.
  34. A. Papoulis, Probability, Random Variables, and Stochastic Processes, p. 176, McGraw‐Hill Book Co., New York, 1965.
  35. A. Papoulis, Probability, Random Variables, and Stochastic Processes, pp. 207–209, McGraw‐Hill Book Co., New York, 1965.
  36. W.B. Davenport, Jr., W.L. Root, An Introduction to the Theory of Random Signals and Noise, p. 149, McGraw‐Hill Book Co., New York, 1958.
  37. A. Papoulis, Probability, Random Variables, and Stochastic Processes, p. 221, McGraw‐Hill Book Co., New York, 1965.
  38. W.B. Davenport, Jr., W.L. Root, An Introduction to the Theory of Random Signals and Noise, pp. 33–35, McGraw‐Hill Book Co., New York, 1958.
  39. A. Papoulis, Probability, Random Variables, and Stochastic Processes, pp. 126–127, McGraw‐Hill Book Co., New York, 1965.
  40. N.A.J. Hastings, J.B. Peacock, Statistical Distributions: A Handbook for Students and Practitioners, A Halsted Press Book, John Wiley & Sons, Inc., New York, 1975.
  41. W.B. Davenport, Jr., W.L. Root, An Introduction to the Theory of Random Signals and Noise, p. 165, McGraw‐Hill, New York, 1958.
  42. W.B. Davenport, Jr., W.L. Root, An Introduction to the Theory of Random Signals and Noise, p. 166, McGraw‐Hill, New York, 1958.
  43. M. Abramowitz, I.A. Stegun, Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, National Bureau of Standards, Applied Mathematics Series No. 55, U.S. Government Printing Office, Washington, DC, p. 376, Integral 9.6.16, June 1964.
  44. W.B. Davenport, Jr., W.L. Root, An Introduction to the Theory of Random Signals and Noise, pp. 165–167, McGraw‐Hill Book Co., New York, 1958.
  45. J.C. Hancock, An Introduction to the Principles of Communication Theory, McGraw‐Hill Book Co., New York, 1961.
  46. I.S. Gradshteyn, I.M. Ryzhik, Table of Integrals, Series, and Products, Corrected and Enlarged Edition, Academic Press, Inc., New York, 1980.
  47. G.A. Campbell, R.M. Foster, Fourier Integrals for Practical Applications, Fourth Printing, D. Van Nostrand Company, Inc., New York, 1948.
  48. H. Urkowitz, “Energy Detection of Unknown Deterministic Signals,” Proceedings of the IEEE, Vol. 55, No. 4, pp. 523–531, April 1967.
  49. M. Nakagami, “The m‐Distribution—A General Formula for Intensity Distribution of Rapid Fading,” W.C. Hoffman, Editor, Statistical Methods in Radio Wave Propagation, pp. 3–36, Pergamon Press, New York, 1960.
  50. M.G. Kendall, A. Stuart, The Advanced Theory of Statistics, Vols. I and II, Hafner Publishing Company, New York, 1958, 1961, and The Advanced Theory of Statistics, Vol. III, Hafner Press, New York, 1982.
  51. H.A. David, H.N. Nagaraja, Order Statistics, 3rd Edition, John Wiley & Sons, Hoboken, NJ, 2003.
  52. B.W. Lindgren, Statistical Theory, The Macmillan Company, New York, 1962.
  53. R.C. Borgioli, “Fast Fourier Transform Correlation versus Direct Discrete Time Correlation,” Proceedings of the IEEE, Vol. 56, No. 9, pp. 1602–1604, September 1968.
  54. E.O. Brigham, Editor, The Fast Fourier Transform and Its Applications, Chapter 10, “FFT Convolution and Correlation,” Prentice‐Hall, Englewood Cliffs, NJ, 1988.
  55. A. Papoulis, Editor, Probability, Random Variables, and Stochastic Processes, Chapter 9, “Stochastic Processes: General Concepts,” McGraw‐Hill Book Co., New York, 1965.
  56. W.B. Davenport, Jr., W.L. Root, An Introduction to the Theory of Random Signals and Noise, pp. 38–42, 66–71, McGraw‐Hill Book Co., New York, 1958.
  57. A. Papoulis, Probability, Random Variables, and Stochastic Processes, pp. 323–332, McGraw‐Hill Book Co., New York, 1965.
  58. A. Papoulis, Editor, Probability, Random Variables, and Stochastic Processes, Chapter 10, “Stochastic Processes: Correlation and Power Spectrum of Stationary Processes,” McGraw‐Hill Book Co., New York, 1965.
  59. D.O. North, “Analysis of the Factors Which Determine Signal/Noise Discrimination in Radar,” Radio Corporation of America (RCA), Technical Report PTR‐6‐C, June 1943; reprinted in Proceedings of the IRE, Vol. 51, pp. 1016–1028, July 1963.
  60. G.L. Turin, “An Introduction to the Matched Filter,” IRE Transactions on Information Theory, Vol. 6, No. 3, pp. 311–329, June 1960.
  61. M.I. Skolnik, Introduction to Radar Systems, McGraw‐Hill Book Company, Inc., New York, 1962.
  62. D. Slepian, “Estimation of Signal Parameters in the Presence of Noise,” IRE Transactions on Information Theory, Vol. PGIT‐3, No. 4, pp. 68–89, March 1954.
  63. P.M. Woodward, I.L. Davies, “Information Theory and Inverse Probability in Telecommunication,” Proceedings of the IEE, Vol. 99, Part III, pp. 37–44, March 1952.
  64. H. Cramér, Mathematical Methods of Statistics, Chapter 32, “Classification of Estimates,” Princeton University Press, Princeton, NJ, 1974.
  65. C.R. Rao, “Information and Accuracy Attainable in the Estimation of Statistical Parameters,” Bulletin of the Calcutta Mathematical Society, Vol. 37, pp. 81–91, 1945.
  66. H.L. Van Trees, Detection, Estimation, and Modulation Theory: Part I, John Wiley & Sons, New York, 1968.
  67. C.E. Cook, M. Bernfeld, Radar Signals: An Introduction to Theory and Application, Academic Press, New York, 1967.
  68. L. Kleinrock, Queueing Systems, Volume I: Theory, John Wiley & Sons, New York, 1975.
  69. D. Gabor, “The Theory of Communication,” Journal of the IEE, Vol. 93, Part III, pp. 429–441, 1946.
  70. R. Manasse, “Range and Velocity Accuracy from Radar Measurements,” MIT Lincoln Laboratory, Lexington, MA, February 1955. (This unpublished internal report is not generally available.)
  71. M.I. Skolnik, Introduction to Radar Systems, pp. 467–469, McGraw‐Hill Book Co., Inc., New York, 1962.
  72. F. Halsall, Data Communications, Computer Networks and Open Systems, Fourth Edition, Addison‐Wesley Publishing Company, Harlow, UK, 1996.
  73. C. Fujiwara, K. Yamashita, M. Kasahara, T. Namekawa, “General Analyses Go‐Back‐N ARQ System,” Electronics and Communications in Japan, Vol. J59‐A, No. 4, pp. 24–31, 1975.
  74. A.V. Oppenheim, R.W. Schafer, Digital Signal Processing, Chapter 5, “Digital Filter Design Techniques,” Prentice‐Hall, Inc., Englewood Cliffs, NJ, 1975.
  75. F.J. Harris, “On the Use of Windows for Harmonic Analysis with the Discrete Fourier Transform,” Proceedings of the IEEE, Vol. 66, No. 1, pp. 51–83, January 1978.
  76. L.S. Metzger, D.M. Boroson, J.J. Uhran, Jr., I. Kalet, “Receiver Windows for FDM MFSK Signals,” IEEE Transactions on Communications, Vol. 27, No. 10, pp. 1519–1527, October 1979.
  77. P.M. Derusso, R.J. Roy, C.M. Close, State Variables for Engineers, John Wiley & Sons, Inc., New York, 1965.
  78. S. Haykin, Adaptive Filter Theory, Prentice‐Hall, Englewood Cliffs, NJ, 1986.
  79. A.P. Sage, C.C. White, III, Optimum System Control, 2nd Edition, Prentice‐Hall, Inc., Englewood Cliffs, NJ, 1977.
  80. A. Gelb, Editor, Applied Optimal Estimation, The M.I.T. Press, Massachusetts Institute of Technology, Cambridge, MA, 1974.
  81. C.R. Wylie, Jr., Advanced Engineering Mathematics, McGraw‐Hill Book Company, Inc., New York, 1960.
  82. S. Haykin, Adaptive Filter Theory, Chapter 2, “Stationary Discrete‐Time Stochastic Processes,” Prentice‐Hall, Englewood Cliffs, NJ, 1986.
  83. S. Haykin, Adaptive Filter Theory, pp. 54–56, Prentice‐Hall, Englewood Cliffs, NJ, 1986.
  84. G.W. Stewart, Introduction to Matrix Computations, Academic Press, New York, 1973.
  85. J.S. Frame, “Matrix Functions and Applications, Part I – Matrix Operations and Generalized Inverses,” IEEE Spectrum, Vol. 1, No. 3, pp. 209–220, March 1964.
  86. J.S. Frame, “Matrix Functions and Applications, Part II – Functions of a Matrix,” IEEE Spectrum, Vol. 1, No. 4, pp. 102–110, April 1964.
  87. J.S. Frame, H.E. Koenig, “Matrix Functions and Applications, Part III – Applications of Matrices to Systems Analysis,” IEEE Spectrum, Vol. 1, No. 5, pp. 100–109, May 1964.
  88. J.S. Frame, “Matrix Functions and Applications, Part IV – Matrix Functions and Constituent Matrices,” IEEE Spectrum, Vol. 1, No. 6, pp. 123–131, June 1964.
  89. J.S. Frame, “Matrix Functions and Applications, Part V – Similarity Reductions by Rational or Orthogonal Matrices,” IEEE Spectrum, Vol. 1, No. 7, pp. 103–109, July 1964.
  90. L.L. Smail, Calculus, Appleton‐Century‐Crofts, Inc., New York, 1949.
  91. O.W. Eshbach, Handbook of Engineering Fundamentals, John Wiley & Sons, Inc., New York, 1952.
  92. R.S. Burington, Handbook of Mathematical Tables and Formulas, 3rd Edition, Handbook Publishers, Inc., Sandusky, OH, 1957.
  93. G.A. Korn, T.M. Korn, Mathematical Handbook for Scientists and Engineers, 2nd Edition, McGraw‐Hill Book Co., New York, 1961.
  94. M. Abramowitz, I.A. Stegun, Editors, Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, National Bureau of Standards Applied Mathematical Series 55, U.S. Government Printing Office, Washington, DC, 1964.
  95. I.S. Gradshteyn, I.M. Ryzhik, Table of Integrals, Series, and Products, Corrected and Enlarged Edition, p. xliii, Academic Press, Inc., New York, 1980.
  96. I.S. Gradshteyn, I.M. Ryzhik, Table of Integrals, Series, and Products, Corrected and Enlarged Edition, p. 446, Integral No. 10, Academic Press, Inc., New York, 1980.
