In this section, we discuss some nonlinearity tests available in the literature that have decent power against the nonlinear models considered in Section 4.1. The tests discussed include both parametric and nonparametric statistics. The Ljung–Box statistics of squared residuals, the bispectral test, and the Brock, Dechert, and Scheinkman (BDS) test are nonparametric methods. The RESET test (Ramsey, 1969), the F tests of Tsay (1986, 1989), and other Lagrange multiplier and likelihood ratio tests depend on specific parametric functions. Because nonlinearity may occur in many ways, there exists no single test that dominates the others in detecting nonlinearity.
4.2.1 Nonparametric Tests
Under the null hypothesis of linearity, residuals of a properly specified linear model should be independent. Any violation of independence in the residuals indicates inadequacy of the entertained model, including the linearity assumption. This is the basic idea behind various nonlinearity tests. In particular, some of the nonlinearity tests are designed to check for possible violation in quadratic forms of the underlying time series.
Q-Statistic of Squared Residuals
McLeod and Li (1983) apply the Ljung–Box statistics to the squared residuals of an ARMA(p, q) model to check for model inadequacy. The test statistic is

$$Q(m) = T(T+2)\sum_{i=1}^{m}\frac{\hat{\rho}_i^2(a_t^2)}{T-i},$$

where T is the sample size, m is a properly chosen number of autocorrelations used in the test, $a_t$ denotes the residual series, and $\hat{\rho}_i(a_t^2)$ is the lag-i sample ACF of $a_t^2$. If the entertained linear model is adequate, Q(m) is asymptotically a chi-squared random variable with m − p − q degrees of freedom. As mentioned in Chapter 3, the prior Q-statistic is useful in detecting conditional heteroscedasticity of $a_t$ and is asymptotically equivalent to the Lagrange multiplier test statistic of Engle (1982) for ARCH models; see Section 3.4.3. The null hypothesis of the test is $H_0\colon \beta_1 = \cdots = \beta_m = 0$, where $\beta_i$ is the coefficient of $a_{t-i}^2$ in the linear regression

$$a_t^2 = \beta_0 + \beta_1 a_{t-1}^2 + \cdots + \beta_m a_{t-m}^2 + e_t$$
for t = m + 1, … , T. Because the statistic is computed from residuals (not directly from the observed returns), the number of degrees of freedom is m − p − q.
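To make the statistic concrete, the following is a minimal Python sketch of the McLeod–Li procedure. The function name and the demeaning of the squared residuals before computing the sample ACF are choices of this sketch, not prescriptions from the text:

```python
import numpy as np

def mcleod_li_q(resid, m, p=0, q=0):
    """Ljung-Box Q(m) statistic applied to the squared residuals a_t^2.

    resid : residual series from a fitted ARMA(p, q) model
    m     : number of lags of the squared-residual ACF used in the test
    Returns (Q(m), degrees of freedom m - p - q).
    """
    a2 = np.asarray(resid, dtype=float) ** 2
    T = a2.size
    a2c = a2 - a2.mean()          # demean a_t^2 before computing its ACF
    denom = np.sum(a2c ** 2)
    stat = 0.0
    for i in range(1, m + 1):
        rho_i = np.sum(a2c[i:] * a2c[:-i]) / denom  # lag-i sample ACF of a_t^2
        stat += rho_i ** 2 / (T - i)
    return T * (T + 2) * stat, m - p - q
```

The returned Q(m) is then compared with the upper tail of a chi-squared distribution with the stated degrees of freedom.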
Bispectral Test
This test can be used to test for linearity and Gaussianity. It depends on the result that a properly normalized bispectrum of a linear time series is constant over all frequencies and that the constant is zero under normality. The bispectrum of a time series is the Fourier transform of its third-order moments. For a stationary time series xt in Eq. (4.1), the third-order moment is defined as

$$c(u, v) = \alpha \sum_{k=-\infty}^{\infty} \psi_k \psi_{k+u} \psi_{k+v}, \tag{4.37}$$

where u and v are integers, $\alpha = E(a_t^3)$, ψ0 = 1, and ψk = 0 for k < 0. Taking Fourier transforms of Eq. (4.37), we have

$$b_3(w_1, w_2) = \frac{\alpha}{4\pi^2}\,\Gamma[-(w_1 + w_2)]\,\Gamma(w_1)\,\Gamma(w_2), \tag{4.38}$$

where $\Gamma(w) = \sum_{u=0}^{\infty} \psi_u \exp(-iwu)$ with $i = \sqrt{-1}$, and $w_i$ are frequencies. Yet the spectral density function of xt is given by

$$p(w) = \frac{\sigma_a^2}{2\pi}\,|\Gamma(w)|^2,$$

where w denotes the frequency. Consequently, the function

$$b(w_1, w_2) = \frac{|b_3(w_1, w_2)|^2}{p(w_1)\,p(w_2)\,p(w_1 + w_2)} = \text{constant for all } (w_1, w_2). \tag{4.39}$$

The bispectrum test makes use of the property in Eq. (4.39). Basically, it estimates the function b(w1, w2) in Eq. (4.39) over a suitably chosen grid of points and applies a test statistic similar to Hotelling's T² statistic to check the constancy of b(w1, w2). For a linear Gaussian series, $\alpha = E(a_t^3) = 0$, so that the bispectrum is zero for all frequencies (w1, w2). For further details of the bispectral test, see Priestley (1988), Subba Rao and Gabr (1984), and Hinich (1982). Limited experience shows that the test has decent power when the sample size is large.
BDS Statistic
Brock, Dechert, and Scheinkman (1987) propose a test statistic, commonly referred to as the BDS test, to test the iid assumption of a time series. The statistic is, therefore, different from the other test statistics discussed because the latter mainly focus on either the second- or third-order properties of xt. The basic idea of the BDS test is to make use of a “correlation integral” popular in chaotic time series analysis. Given a k-dimensional time series Xt and observations $\{X_t\}_{t=1}^{T_k}$, define the correlation integral as

$$C_k(\delta) = \lim_{T_k \to \infty} \frac{2}{T_k (T_k - 1)} \sum_{i < j} I_\delta(X_i, X_j), \tag{4.40}$$
where Iδ(u, v) is an indicator variable that equals one if ||u − v|| < δ, and zero otherwise, where || · || is the supnorm. The correlation integral measures the fraction of data pairs of {Xt} that are within a distance of δ from each other. Consider next a scalar time series xt. Construct k-dimensional vectors $X_t^k = (x_t, x_{t+1}, \ldots, x_{t+k-1})'$, which are called k histories. The idea of the BDS test is as follows. Treat a k history as a point in the k-dimensional space. If $\{x_t\}_{t=1}^{T}$ are indeed iid random variables, then the k histories should show no pattern in the k-dimensional space. Consequently, the correlation integrals should satisfy the relation Ck(δ) = [C1(δ)]k. Any departure from the prior relation suggests that xt are not iid. As a simple, but informative, example, consider a sequence of iid random variables from the uniform distribution over [0, 1]. Let [a, b] be a subinterval of [0, 1] and consider the “2-history” (xt, xt+1), which represents a point in the two-dimensional space. Under the iid assumption, the expected number of 2-histories in the subspace [a, b] × [a, b] should equal the square of the expected number of xt in [a, b]. This idea can be formally examined by using sample counterparts of correlation integrals. Define
$$C_\ell(\delta, T) = \frac{2}{T_\ell (T_\ell - 1)} \sum_{t < s} I_\delta(X_t^\ell, X_s^\ell), \qquad \ell = 1, k,$$

where $T_\ell = T - \ell + 1$, $X_t^\ell = x_t$ if ℓ = 1, and $X_t^\ell = (x_t, \ldots, x_{t+\ell-1})'$ if ℓ = k. Under the null hypothesis that {xt} are iid with a nondegenerated distribution function F( · ), Brock, Dechert, and Scheinkman (1987) show that

$$C_k(\delta, T) \to [C_1(\delta)]^k$$

for any fixed k and δ. Furthermore, the statistic $\sqrt{T}\{C_k(\delta, T) - [C_1(\delta, T)]^k\}$ is asymptotically distributed as normal with mean zero and variance

$$\sigma_k^2(\delta) = 4\left(N^k + 2\sum_{j=1}^{k-1} N^{k-j} C^{2j} + (k-1)^2 C^{2k} - k^2 N C^{2k-2}\right),$$

where C = ∫[F(z + δ) − F(z − δ)] dF(z) and N = ∫[F(z + δ) − F(z − δ)]² dF(z). Note that C1(δ, T) is a consistent estimate of C, and N can be consistently estimated by

$$N(\delta, T) = \frac{6}{T_k (T_k - 1)(T_k - 2)} \sum_{t < s < u} I_\delta(x_t, x_s)\, I_\delta(x_s, x_u).$$
The BDS test statistic is then defined as

$$D_k(\delta, T) = \frac{\sqrt{T}\,\{C_k(\delta, T) - [C_1(\delta, T)]^k\}}{\sigma_k(\delta, T)}, \tag{4.41}$$

where σk(δ, T) is obtained from σk(δ) when C and N are replaced by C1(δ, T) and N(δ, T), respectively. This test statistic has a standard normal limiting distribution. For further discussion and examples of applying the BDS test, see Hsieh (1989) and Brock, Hsieh, and LeBaron (1991). In application, one should remove linear dependence, if any, from the data before applying the BDS test. The test may be sensitive to the choices of δ and k, especially when k is large.
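The construction can be illustrated with a short Python sketch that builds the statistic directly from the sample correlation integrals. The default δ (the sample standard deviation of the series) and the simple plug-in estimate of N used here are assumptions of this sketch, not prescriptions from the text:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def bds(x, k=2, delta=None):
    """BDS statistic D_k(delta, T) for a scalar series x.

    delta defaults to the sample standard deviation (an assumed choice).
    """
    x = np.asarray(x, dtype=float)
    T = x.size
    if delta is None:
        delta = x.std(ddof=1)

    def c_integral(ell):
        # sample correlation integral C_ell(delta, T) over ell-histories
        H = sliding_window_view(x, ell)                 # (T-ell+1, ell)
        Tl = H.shape[0]
        dist = np.max(np.abs(H[:, None, :] - H[None, :, :]), axis=2)  # supnorm
        iu = np.triu_indices(Tl, 1)
        return 2.0 * np.count_nonzero(dist[iu] < delta) / (Tl * (Tl - 1))

    c1, ck = c_integral(1), c_integral(k)

    # Plug-in estimates of C and N for the asymptotic variance
    C = c1
    p_t = np.mean(np.abs(x[:, None] - x[None, :]) < delta, axis=1)
    N = np.mean(p_t ** 2)   # crude estimate of int [F(z+d)-F(z-d)]^2 dF(z)
    var = 4.0 * (N ** k
                 + 2.0 * sum(N ** (k - j) * C ** (2 * j) for j in range(1, k))
                 + (k - 1) ** 2 * C ** (2 * k)
                 - k ** 2 * N * C ** (2 * k - 2))
    return np.sqrt(T) * (ck - c1 ** k) / np.sqrt(var)
```

Under the iid null the statistic is compared with standard normal critical values; as noted above, linear dependence should be removed from the data first.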
4.2.2 Parametric Tests
Turning to parametric tests, we consider the RESET test of Ramsey (1969) and its generalizations. We also discuss some test statistics for detecting threshold nonlinearity. To simplify the notation, we use vectors and matrices in the discussion. If necessary, readers may consult Appendix B of Chapter 8 for a brief review of vectors and matrices.
The RESET Test
Ramsey (1969) proposes a specification test for linear least-squares regression analysis. The test is referred to as a RESET test and is readily applicable to linear AR models. Consider the linear AR(p) model

$$x_t = X_{t-1}'\phi + a_t, \tag{4.42}$$

where $X_{t-1} = (1, x_{t-1}, \ldots, x_{t-p})'$ and $\phi = (\phi_0, \phi_1, \ldots, \phi_p)'$. The first step of the RESET test is to obtain the least-squares estimate $\hat{\phi}$ of Eq. (4.42) and compute the fit $\hat{x}_t = X_{t-1}'\hat{\phi}$, the residual $\hat{a}_t = x_t - \hat{x}_t$, and the sum of squared residuals $SSR_0 = \sum_{t=p+1}^{T} \hat{a}_t^2$, where T is the sample size. In the second step, consider the linear regression

$$\hat{a}_t = X_{t-1}'\alpha_1 + M_{t-1}'\alpha_2 + v_t, \tag{4.43}$$
where $M_{t-1} = (\hat{x}_t^2, \ldots, \hat{x}_t^{s+1})'$ for some s ≥ 1, and compute the least-squares residuals $\hat{v}_t$ and the sum of squared residuals $SSR_1 = \sum_{t=p+1}^{T} \hat{v}_t^2$ of the regression. The basic idea of the RESET test is that if the linear AR(p) model in Eq. (4.42) is adequate, then α1 and α2 of Eq. (4.43) should be zero. This can be tested by the usual F statistic of Eq. (4.43) given by
$$F = \frac{(SSR_0 - SSR_1)/g}{SSR_1/(T - p - g)} \quad \text{with } g = s + p + 1, \tag{4.44}$$

which, under the linearity and normality assumption, has an F distribution with degrees of freedom g and T − p − g.
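The two steps can be sketched in a few lines of Python. This is a minimal implementation under stated assumptions: the function name is hypothetical, and the numerator degrees of freedom are taken as the number of tested coefficients, g = s + p + 1:

```python
import numpy as np

def reset_test(x, p, s=1):
    """RESET test for a linear AR(p) model fitted to the series x.

    Step 1: fit the AR(p) model, keep fits, residuals, and SSR_0.
    Step 2: regress the residuals on (X_{t-1}, xhat_t^2, ..., xhat_t^{s+1}).
    Returns (F statistic, (g, T - p - g)).
    """
    x = np.asarray(x, dtype=float)
    T = x.size
    # Design matrix with rows X_{t-1}' = (1, x_{t-1}, ..., x_{t-p})
    X = np.column_stack([np.ones(T - p)] +
                        [x[p - i:T - i] for i in range(1, p + 1)])
    y = x[p:]
    phi, *_ = np.linalg.lstsq(X, y, rcond=None)
    fit = X @ phi
    resid = y - fit
    ssr0 = resid @ resid
    # Augmented regression with powers of the fitted values
    M = np.column_stack([fit ** k for k in range(2, s + 2)])
    Z = np.hstack([X, M])
    alpha, *_ = np.linalg.lstsq(Z, resid, rcond=None)
    v = resid - Z @ alpha
    ssr1 = v @ v
    g = s + p + 1   # assumed numerator degrees of freedom of this sketch
    F = ((ssr0 - ssr1) / g) / (ssr1 / (T - p - g))
    return F, (g, T - p - g)
```

Because the augmented regression nests the restricted one, SSR_0 ≥ SSR_1 and the F ratio is always nonnegative.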
Remark
Because $\hat{x}_t^k$ for k = 2, … , s + 1 tend to be highly correlated with $X_{t-1}$ and among themselves, principal components of $M_{t-1}$ that are not collinear with $X_{t-1}$ are often used in fitting Eq. (4.43). Principal component analysis is a statistical tool for dimension reduction; see Chapter 8 for more information.
Keenan (1985) proposes a nonlinearity test for time series that uses only $\hat{x}_t^2$ and modifies the second step of the RESET test to avoid multicollinearity between $\hat{x}_t^2$ and $X_{t-1}$. Specifically, the linear regression (4.43) is divided into two steps. In step 2(a), one removes linear dependence of $\hat{x}_t^2$ on $X_{t-1}$ by fitting the regression

$$\hat{x}_t^2 = X_{t-1}'\beta + u_t$$

and obtaining the residual $\hat{u}_t = \hat{x}_t^2 - X_{t-1}'\hat{\beta}$. In step 2(b), consider the linear regression

$$\hat{a}_t = \hat{u}_t \alpha + v_t$$

and obtain the sum of squared residuals $SSR_1 = \sum_{t=p+1}^{T} (\hat{a}_t - \hat{u}_t \hat{\alpha})^2 = \sum_{t=p+1}^{T} \hat{v}_t^2$ to test the null hypothesis α = 0.
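Keenan's two-step modification can be sketched as follows. This is a hedged illustration: the function name is hypothetical, and the denominator degrees of freedom T − 2p − 2 are an assumption of this sketch rather than taken from the text:

```python
import numpy as np

def keenan_test(x, p):
    """Keenan's (1985) nonlinearity test for an AR(p) model fitted to x.

    Step 2(a): remove linear dependence of xhat_t^2 on X_{t-1}.
    Step 2(b): regress the AR residuals on that residual and test alpha = 0.
    """
    x = np.asarray(x, dtype=float)
    T = x.size
    X = np.column_stack([np.ones(T - p)] +
                        [x[p - i:T - i] for i in range(1, p + 1)])
    y = x[p:]
    phi, *_ = np.linalg.lstsq(X, y, rcond=None)
    fit = X @ phi
    a_hat = y - fit
    # Step 2(a): orthogonalize the squared fits against X_{t-1}
    beta, *_ = np.linalg.lstsq(X, fit ** 2, rcond=None)
    u_hat = fit ** 2 - X @ beta
    # Step 2(b): one-regressor least squares and the implied F ratio
    alpha = (u_hat @ a_hat) / (u_hat @ u_hat)
    v_hat = a_hat - alpha * u_hat
    ssr0, ssr1 = a_hat @ a_hat, v_hat @ v_hat
    return (ssr0 - ssr1) / (ssr1 / (T - 2 * p - 2))  # assumed df of sketch
```

The statistic is compared with an F distribution with 1 numerator degree of freedom; a large value signals nonlinearity.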
The F Test
To improve the power of Keenan's test and the RESET test, Tsay (1986) uses a different choice of the regressor Mt-1. Specifically, he suggests using Mt-1 = vech(Xt-1 Xt-1′), where vech(A) denotes the half-stacking vector of the matrix A using elements on and below the diagonal only; see Appendix B of Chapter 8 for more information about the operator. For example, if p = 2, then $M_{t-1} = (x_{t-1}^2, x_{t-1}x_{t-2}, x_{t-2}^2)'$. The dimension of Mt-1 is p(p + 1)/2 for an AR(p) model. In practice, the test is simply the usual partial F statistic for testing α = 0 in the linear least-squares regression

$$x_t = X_{t-1}'\phi + M_{t-1}'\alpha + e_t,$$

where et denotes the error term. Under the assumption that xt is a linear AR(p) process, the partial F statistic follows an F distribution with degrees of freedom g and T − p − g − 1, where g = p(p + 1)/2. We refer to this F test as the Ori-F test. Luukkonen, Saikkonen, and Teräsvirta (1988) further extend the test by augmenting Mt-1 with cubic terms $x_{t-i}^3$ for i = 1, … , p.
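A minimal Python sketch of the Ori-F test follows; the function name is hypothetical, and the vech terms are enumerated explicitly rather than via a matrix half-stacking routine:

```python
import numpy as np

def ori_f_test(x, p):
    """Tsay's (1986) Ori-F test: partial F for alpha = 0 in the regression
    of x_t on X_{t-1} and the g = p(p+1)/2 cross-products of the lags."""
    x = np.asarray(x, dtype=float)
    T = x.size
    lags = np.column_stack([x[p - i:T - i] for i in range(1, p + 1)])
    X = np.column_stack([np.ones(T - p), lags])
    y = x[p:]
    # M_{t-1} = vech(X_{t-1} X_{t-1}') restricted to the lag cross-products;
    # for p = 2 this gives (x_{t-1}^2, x_{t-1} x_{t-2}, x_{t-2}^2)
    idx = [(i, j) for i in range(p) for j in range(i, p)]
    M = np.column_stack([lags[:, i] * lags[:, j] for (i, j) in idx])
    # Restricted and unrestricted least-squares fits
    b0, *_ = np.linalg.lstsq(X, y, rcond=None)
    ssr0 = np.sum((y - X @ b0) ** 2)
    Z = np.hstack([X, M])
    b1, *_ = np.linalg.lstsq(Z, y, rcond=None)
    ssr1 = np.sum((y - Z @ b1) ** 2)
    g = p * (p + 1) // 2
    F = ((ssr0 - ssr1) / g) / (ssr1 / (T - p - g - 1))
    return F, (g, T - p - g - 1)
```

The statistic is referred to an F distribution with g and T − p − g − 1 degrees of freedom, as stated above.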
Threshold Test
When the alternative model under study is a SETAR model, one can derive specific test statistics to increase the power of the test. One of the specific tests is the likelihood ratio statistic. This test, however, encounters the difficulty of undefined parameters under the null hypothesis of linearity because the threshold is undefined for a linear AR process. Another specific test seeks to transform testing threshold nonlinearity into detecting model changes. It is then interesting to discuss the differences between these two specific tests for threshold nonlinearity.
To simplify the discussion, let us consider the simple case in which the alternative model is a 2-regime SETAR model with threshold variable xt−d. The null hypothesis is H0: xt follows the linear AR(p) model

$$x_t = \phi_0 + \sum_{i=1}^{p} \phi_i x_{t-i} + a_t, \tag{4.45}$$
whereas the alternative hypothesis is Ha: xt follows the SETAR model

$$x_t = \begin{cases} \phi_0^{(1)} + \sum_{i=1}^{p} \phi_i^{(1)} x_{t-i} + a_{1t}, & \text{if } x_{t-d} < r_1, \\[4pt] \phi_0^{(2)} + \sum_{i=1}^{p} \phi_i^{(2)} x_{t-i} + a_{2t}, & \text{if } x_{t-d} \ge r_1, \end{cases} \tag{4.46}$$
where r1 is the threshold. For a given realization $\{x_t\}_{t=1}^{T}$ and assuming normality, let $l_0(\hat{\phi}, \hat{\sigma}_a^2)$ be the log-likelihood function evaluated at the maximum-likelihood estimates of $\phi = (\phi_0, \ldots, \phi_p)'$ and $\sigma_a^2$. This is easy to compute. The likelihood function under the alternative is also easy to compute if the threshold r1 is given. Let $l_1(r_1; \hat{\phi}_1, \hat{\sigma}_1^2, \hat{\phi}_2, \hat{\sigma}_2^2)$ be the log-likelihood function evaluated at the maximum-likelihood estimates of $\phi_j = (\phi_0^{(j)}, \ldots, \phi_p^{(j)})'$ and $\sigma_j^2$ for j = 1, 2, conditioned on knowing the threshold r1. The log-likelihood ratio l(r1) defined as

$$l(r_1) = l_1(r_1; \hat{\phi}_1, \hat{\sigma}_1^2, \hat{\phi}_2, \hat{\sigma}_2^2) - l_0(\hat{\phi}, \hat{\sigma}_a^2)$$
is then a function of the threshold r1, which is unknown. Yet under the null hypothesis, there is no threshold and r1 is not defined. The parameter r1 is referred to as a nuisance parameter under the null hypothesis. Consequently, the asymptotic distribution of the likelihood ratio is very different from that of the conventional likelihood ratio statistics. See Chan (1991) for further details and critical values of the test. A common approach is to use $l_{\max} = \sup_{v \le r_1 \le u} l(r_1)$ as the test statistic, where v and u are prespecified lower and upper bounds of the threshold. Davies (1987) and Andrews and Ploberger (1994) provide further discussion on hypothesis testing involving nuisance parameters under the null hypothesis. Simulation is often used to obtain empirical critical values of the test statistic lmax, which depends on the choices of v and u. The average of l(r1) over r1 ∈ [v, u] is also considered by Andrews and Ploberger as a test statistic.
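The profile likelihood ratio can be sketched as follows. In this hedged illustration, the function name is hypothetical, the candidate thresholds are taken as empirical quantiles of x_{t−d} between trim and 1 − trim (playing the role of v and u), and a 50-point grid is an arbitrary choice:

```python
import numpy as np

def setar_lmax(x, p, d, trim=0.1, ngrid=50):
    """l_max = sup over candidate thresholds r1 of l(r1) for a 2-regime
    SETAR alternative, under Gaussian likelihoods. Requires d <= p."""
    x = np.asarray(x, dtype=float)
    T = x.size
    X = np.column_stack([np.ones(T - p)] +
                        [x[p - i:T - i] for i in range(1, p + 1)])
    y = x[p:]
    z = x[p - d:T - d]   # threshold variable x_{t-d}, aligned with y

    def gauss_ll(yy, XX):
        # maximized Gaussian log-likelihood, up to an additive constant
        b, *_ = np.linalg.lstsq(XX, yy, rcond=None)
        s2 = np.sum((yy - XX @ b) ** 2) / yy.size
        return -0.5 * yy.size * np.log(s2)

    l0 = gauss_ll(y, X)   # one-regime (null) fit
    cands = np.quantile(z, np.linspace(trim, 1 - trim, ngrid))
    l_max = -np.inf
    for r1 in cands:
        lo, hi = z < r1, z >= r1
        if lo.sum() <= p + 1 or hi.sum() <= p + 1:
            continue   # skip regimes too small to fit
        l1 = gauss_ll(y[lo], X[lo]) + gauss_ll(y[hi], X[hi])
        l_max = max(l_max, l1 - l0)
    return l_max
```

Because the split model nests the null, l(r1) is nonnegative for every candidate, and empirical critical values for l_max would be obtained by simulation as described above.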
Tsay (1989) makes use of arranged autoregression and recursive estimation to derive an alternative test for threshold nonlinearity. The arranged autoregression seeks to transform the SETAR model under the alternative hypothesis Ha into a model change problem with the threshold r1 serving as the change point. To see this, note that the SETAR model in Eq. (4.46) says that xt follows essentially two linear models depending on whether xt−d < r1 or xt−d ≥ r1. For a realization of T data points, xt−d can assume values {x1, … , xT−d}. Let x(1) ≤ x(2) ≤ ⋯ ≤ x(T−d) be the ordered statistics of {x1, … , xT−d} (i.e., arranging the observations in increasing order). The SETAR model can then be written as

$$x_{(j)+d} = \beta_0 + \sum_{i=1}^{p} \beta_i x_{(j)+d-i} + a_{(j)+d}, \qquad j = 1, \ldots, T - d, \tag{4.47}$$
where $\beta_i = \phi_i^{(1)}$ if x(j) < r1 and $\beta_i = \phi_i^{(2)}$ if x(j) ≥ r1. Consequently, the threshold r1 is a change point for the linear regression in Eq. (4.47), and we refer to Eq. (4.47) as an arranged autoregression (in increasing order of the threshold xt−d). Note that the arranged autoregression in (4.47) does not alter the dynamic dependence of xt on xt−i for i = 1, … , p because x(j)+d still depends on x(j)+d−i for i = 1, … , p. What is done is simply to present the SETAR model in the threshold space instead of in the time space. That is, the equation with a smaller xt−d appears before that with a larger xt−d. The threshold test of Tsay (1989) is obtained as follows.
Step 1. Fit the arranged autoregression in Eq. (4.47) recursively. Begin with the first m cases, obtain the least-squares estimates, and, as each subsequent case enters the regression, compute its one-step-ahead predictive residual

$$\hat{a}_{(m+1)+d} = x_{(m+1)+d} - \hat{x}_{(m+1)+d}$$

and its standard error. Let ê(m+1)+d be the standardized predictive residual.

Step 2. Run the linear regression

$$\hat{e}_{(m+j)+d} = \alpha_0 + \sum_{i=1}^{p} \alpha_i x_{(m+j)+d-i} + v_{(m+j)+d}, \qquad j = 1, \ldots, T - d - m, \tag{4.48}$$

and compute the usual F statistic for testing αi = 0 in Eq. (4.48) for i = 0, … , p. Under the null hypothesis that xt follows a linear AR(p) model, the F ratio has a limiting F distribution with degrees of freedom p + 1 and T − d − m − p.
We refer to the earlier F test as a TAR-F test. The idea behind the test is that under the null hypothesis there is no model change in the arranged autoregression in Eq. (4.47) so that the standardized predictive residuals should be close to iid with mean zero and variance 1. In this case, they should have no correlations with the regressors x(m+j)+d−i. For further details including formulas for a recursive least-squares method and some simulation study on performance of the TAR-F test, see Tsay (1989). The TAR-F test avoids the problem of nuisance parameters encountered by the likelihood ratio test. It does not require knowing the threshold r1. It simply tests that the predictive residuals have no correlations with regressors if the null hypothesis holds. Therefore, the test does not depend on knowing the number of regimes in the alternative model. Yet the TAR-F test is not as powerful as the likelihood ratio test if the true model is indeed a 2-regime SETAR model with a known innovational distribution.
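The two steps above can be sketched in Python. This illustration refits by ordinary least squares at each step instead of using the recursive least-squares updates of Tsay (1989), and the default starting size m and the function name are assumptions of this sketch:

```python
import numpy as np

def tar_f_test(x, p, d, m=None):
    """TAR-F test via arranged autoregression: sort cases by x_{t-d},
    compute standardized one-step-ahead predictive residuals, then regress
    them on the lagged regressors and form the usual F ratio."""
    x = np.asarray(x, dtype=float)
    T = x.size
    h = max(p, d)
    X = np.column_stack([np.ones(T - h)] +
                        [x[h - i:T - i] for i in range(1, p + 1)])
    y = x[h:]
    z = x[h - d:T - d]                    # threshold variable x_{t-d}
    order = np.argsort(z, kind="stable")  # arranged-autoregression ordering
    Xa, ya = X[order], y[order]
    if m is None:
        m = int(np.sqrt(T)) + p + 1       # assumed default starting size
    e, R = [], []
    for j in range(m, len(ya)):
        b, *_ = np.linalg.lstsq(Xa[:j], ya[:j], rcond=None)
        res = ya[:j] - Xa[:j] @ b
        s2 = res @ res / (j - p - 1)
        xj = Xa[j]
        # predictive variance: s2 * (1 + x'(X'X)^{-1} x)
        XtX_inv = np.linalg.inv(Xa[:j].T @ Xa[:j])
        v = s2 * (1.0 + xj @ XtX_inv @ xj)
        e.append((ya[j] - xj @ b) / np.sqrt(v))
        R.append(xj)
    e, R = np.array(e), np.array(R)
    # F test for alpha = 0 in e = R alpha + v (R already holds the constant)
    a, *_ = np.linalg.lstsq(R, e, rcond=None)
    ssr1 = np.sum((e - R @ a) ** 2)
    ssr0 = e @ e                          # restricted model: all alpha = 0
    g = p + 1
    return ((ssr0 - ssr1) / g) / (ssr1 / (e.size - g))
```

Under the linear null, the standardized predictive residuals behave like iid noise uncorrelated with the regressors, so the F ratio should be unremarkable; a large value signals a model change in the threshold space.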
4.2.3 Applications
In this subsection, we apply some of the nonlinearity tests discussed previously to five time series. For a real financial time series, an AR model is used to remove any serial correlation in the data, and the tests apply to the residual series of the model. The five series employed are as follows:
1. r1t: A simulated series of iid N(0, 1) with 500 observations.
2. r2t: A simulated iid series drawn from a Student-t distribution with 6 degrees of freedom. The sample size is 500.
3. a3t: The residual series of monthly log returns of CRSP equal-weighted index from 1926 to 1997 with 864 observations. The linear AR model used is
4. a4t: The residual series of monthly log returns of CRSP value-weighted index from 1926 to 1997 with 864 observations. The linear AR model used is
5. a5t: The residual series of monthly log returns of IBM stock from 1926 to 1997 with 864 observations. The linear AR model used is
Table 4.2 shows the results of the nonlinearity test. For the simulated series and IBM returns, the F tests are based on an AR(6) model. For the index returns, the AR order is the same as the model given earlier. For the BDS test, two choices of δ were used with k = 2, … , 5. Also given in the table are the Ljung–Box statistics that confirm no serial correlation in the residual series before applying nonlinearity tests. Compared with their asymptotic critical values, the BDS test and F tests are insignificant at the 5% level for the simulated series. However, the BDS tests are highly significant for the real financial time series. The F tests also show significant results for the index returns, but they fail to suggest nonlinearity in the IBM log returns. In summary, the tests confirm that the simulated series are linear and suggest that the stock returns are nonlinear.
aThe sample size of simulated series is 500 and that of stock returns is 864. The BDS test uses k = 2, … , 5.