Chapter 5
Time Series Analysis

Time series analysis can be applied both to univariate time series and to vector time series. We are interested in estimating the dependence between consecutive observations. In a vector time series there is both cross-sectional dependence and time series dependence: the components of the vector depend on each other at any given point of time, and the future values of the vectors depend on the past values of the vectors.

Model free time series analysis can estimate the joint distribution of

5.1 equation

for some c05-math-002. The estimation could be done using nonparametric multivariate density estimation. A different model free approach models at the first step the distribution of c05-math-003 parametrically, using density c05-math-004. At the second step, the parameter c05-math-005 is taken to be time dependent. This leads to a semiparametric time series analysis, because we combine a cross-sectional parametric model with time varying estimation of the parameter. Time localized maximum likelihood or time localized least squares can be used to estimate the parameter. Of particular interest is the estimation of a univariate excess distribution c05-math-006 with a time varying c05-math-007, because this leads to time varying quantile estimation.

Prediction is one of the most important applications of time series analysis. In prediction it is useful to use regression models

where c05-math-009 and c05-math-010 is noise. For the estimation of c05-math-011 we can use nonparametric regression. We study prediction with models (5.2) in Chapter 6.

Autoregressive moving average (ARMA) models are classical parametric models for time series analysis. It is of interest to find formulas for the conditional expectation in ARMA models, because these formulas can be used to construct predictors. The formulas for the conditional expectation in ARMA models also give insight into different types of predictors: AR models lead to state space prediction, and MA models lead to time space prediction.

Prediction of future returns of a financial asset is difficult, but prediction of future absolute returns and future squared returns is feasible. Generalized autoregressive conditional heteroskedasticity (GARCH) models are applied in the prediction of squared returns. Prediction of future squared returns is called volatility prediction. Prediction of volatility is applied in Chapter 7.

We concentrate on time series analysis in discrete time, but we also define some continuous time stochastic processes, like the geometric Brownian motion, because it is a standard model in option pricing.

Section 5.1 discusses strict stationarity, covariance stationarity, and the autocovariance function. Section 5.2 studies model free time series analysis. Section 5.3 studies parametric time series models, in particular, ARMA and GARCH processes. Section 5.4 considers models for vector time series. Section 5.5 summarizes stylized facts of financial time series.

5.1 Stationarity and Autocorrelation

A time series (stochastic process) is a sequence of random variables, indexed by time. We define time series models for doubly infinite sequences

equation

A time series model can also be defined for a one-sided infinite sequence c05-math-012, where c05-math-013 or c05-math-014. A realization of a time series is a finite sequence c05-math-015 of observed values. We use the term “time series” both to denote the underlying stochastic process and a realization of the stochastic process. Besides a sequence of real valued random variables, we can consider a vector time series, which is a sequence c05-math-016 of random vectors c05-math-017.1

5.1.1 Strict Stationarity

Time series c05-math-021 is called strictly stationary if c05-math-022 and c05-math-023 are identically distributed for all c05-math-024. This means that for a strictly stationary time series all finite dimensional marginal distributions are invariant under shifts in time.

Figure 5.1(a) shows a time series of S&P 500 daily prices, using data described in Section 2.4.1. The time series has an exponential trend and is not stationary. The exponential trend can be removed by taking logarithms, as shown in panel (b), but after that we have a time series with a linear trend. The linear trend can be removed by taking differences, as shown in panel (c), which leads to the time series of logarithmic returns, which already seems to be a stationary time series. Figure 2.1(b) shows that the gross returns seem to be stationary.


Figure 5.1 Removing a trend: Differences of logarithms. (a) S&P 500 prices; (b) logarithms of S&P 500 prices; (c) differences of the logarithmic prices.

Figure 5.2(a) shows a time series of differences of S&P 500 prices, which is not a stationary time series. Panels (b) and (c) show short time series of price differences, which seem to be approximately stationary. Thus, we could also define the concept of approximate stationarity.


Figure 5.2 Removing a trend: Differencing. (a) Differences of S&P 500 prices over 65 years; (b) differences over 4 years; (c) differences over 100 days.

Figure 5.3 studies a time series of squares of logarithmic returns, computed from the daily S&P 500 data, which is described in Section 2.4.1. The squared logarithmic returns are often modeled as a stationary GARCH(c05-math-025) time series. However, we can also model the squared logarithmic returns with a signal plus noise model

equation

where c05-math-026, c05-math-027 is a deterministic trend, and c05-math-028 is stationary white noise. We can estimate the trend c05-math-029 with a moving average c05-math-030. Moving averages are defined in Section 6.1.1. Panel (a) shows time series c05-math-031 (black circles) and c05-math-032 (red line). Panel (b) shows c05-math-033. Panel (b) suggests that subtracting the moving average could lead to stationarity. We use the one-sided exponential moving average in (6.3) with smoothing parameter c05-math-034.


Figure 5.3 Removing a trend: Subtracting a moving average. (a) A times series of squared returns and a moving average of squared returns (red); (b) squared returns minus the moving average of squared returns.

5.1.1.1 Random Walk

Random walk is a discrete time stochastic process c05-math-035 defined by

$Y_t = Y_{t-1} + \epsilon_t, \qquad t = 1, 2, \ldots,$

where c05-math-036 is a random variable or a fixed value, and c05-math-037 is distributed as IID(c05-math-038). We have that

$Y_t = Y_0 + \sum_{i=1}^{t} \epsilon_i.$

If c05-math-039 is a constant, then c05-math-040 and c05-math-041. Thus, a random walk is not strictly stationary (and not covariance stationary). We obtain a Gaussian random walk if c05-math-042 is Gaussian white noise. If c05-math-043, then a Gaussian random walk satisfies c05-math-044.

Figure 5.4(a) shows the time series of S&P 500 prices over a period of 100 days. Panel (b) shows a simulated Gaussian random walk of length 100, when the initial value is 0. A random walk leads to a time series that has a stochastic trend. A stochastic trend is difficult to distinguish from a deterministic trend. A time series of stock prices resembles a random walk. Also a time series of a dividend price ratio in Figure 6.7(a) resembles a random walk.
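A simulation like the one in panel (b) of Figure 5.4 can be generated by cumulating Gaussian white noise. The following sketch uses Python with NumPy; the language and the seed are our choices, since the book does not fix an implementation.

import numpy as np

rng = np.random.default_rng(seed=1)
eps = rng.standard_normal(100)                   # Gaussian white noise
walk = np.concatenate(([0.0], np.cumsum(eps)))   # Y_0 = 0 and Y_t = Y_{t-1} + eps_t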


Figure 5.4 Stochastic trend. (a) Prices of S&P 500 over 100 days; (b) simulated random walk of length 100, when the initial value is 0.

Geometric random walk is a discrete time stochastic process defined by

$S_t = S_{t-1} Z_t, \qquad t = 1, 2, \ldots,$

where c05-math-045 are i.i.d. and c05-math-046 is independent of c05-math-047.

5.1.2 Covariance Stationarity and Autocorrelation

We define autocovariance and autocorrelation first for scalar time series and then for vector time series.

5.1.2.1 Autocovariance and Autocorrelation for Scalar Time Series

We say that a time series c05-math-048 is covariance stationary if c05-math-049 is a constant, not depending on c05-math-050, and c05-math-051 depends only on c05-math-052 but not on c05-math-053. A covariance stationary time series is also called second-order stationary.

If c05-math-054 for all c05-math-055, then strict stationarity implies covariance stationarity. There exist time series that are strictly stationary but for which the covariance is not defined.2 Covariance stationarity does not imply strict stationarity. For a Gaussian time series, strict stationarity and covariance stationarity are equivalent. By a Gaussian time series we mean a time series all of whose finite dimensional marginal distributions are Gaussian.

For a covariance stationary time series the autocovariance function is defined by

$\gamma(k) = \mathrm{Cov}\left(Y_t, Y_{t+k}\right), \qquad k = 0, 1, 2, \ldots,$

where c05-math-063. The covariance stationarity implies that c05-math-064 depends only on c05-math-065 and not on c05-math-066. The autocorrelation function is defined as

$\rho(k) = \frac{\gamma(k)}{\gamma(0)}, \qquad k = 0, 1, 2, \ldots,$

where c05-math-067.

The sample autocovariance with lag c05-math-068, based on the observations c05-math-069, is defined as

$\hat{\gamma}(k) = \frac{1}{T} \sum_{t=1}^{T-k} \left(Y_t - \bar{Y}\right)\left(Y_{t+k} - \bar{Y}\right),$

where c05-math-070.3 The sample autocorrelation with lag c05-math-074 is defined as

5.3 $\hat{\rho}(k) = \frac{\hat{\gamma}(k)}{\hat{\gamma}(0)}.$
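As a minimal sketch of these estimators (in Python with NumPy; the function name and the normalization by the sample size are our choices), the sample autocorrelations for lags 1, ..., max_lag can be computed as follows.

import numpy as np

def sample_acf(y, max_lag):
    # Sample autocovariances, normalized by the sample size T,
    # divided by the sample variance gamma_hat(0).
    y = np.asarray(y, dtype=float)
    T = len(y)
    d = y - y.mean()
    gamma0 = np.sum(d * d) / T
    return np.array([np.sum(d[:T - k] * d[k:]) / (T * gamma0)
                     for k in range(1, max_lag + 1)])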

Figure 5.5 shows sample autocorrelation functions for the daily S&P 500 index data, described in Section 2.4.1. Panel (a) shows the sample autocorrelation function c05-math-076 for the return time series c05-math-077 and panel (b) shows the sample autocorrelation function for the time series of the absolute returns c05-math-078. The lags are on the range c05-math-079.


Figure 5.5 S&P 500 autocorrelation. (a) The sample autocorrelation function c05-math-080 of S&P 500 returns for c05-math-081; (b) the sample autocorrelation function for absolute returns. The red lines indicate the 95% confidence band for the null hypothesis of i.i.d process.

If c05-math-082 are i.i.d. with mean zero, then

$\sqrt{T}\, \hat{\rho}(k) \to N(0, 1) \quad \text{in distribution}$

as c05-math-083; see Brockwell and Davis (1991). Thus, if c05-math-084 are i.i.d. with mean zero, then about c05-math-085 of the observed values c05-math-086 should be inside the band

$\left(-\frac{z_{1-\alpha/2}}{\sqrt{T}},\ \frac{z_{1-\alpha/2}}{\sqrt{T}}\right),$

where c05-math-087 is the c05-math-088-quantile for the standard normal distribution. Figure 5.5 has the red lines at the heights c05-math-089, where we have chosen c05-math-090, so that c05-math-091.

The Box–Ljung test can be used to test whether the autocorrelations of a stationary time series c05-math-092 are zero. The null hypothesis is that c05-math-093 for c05-math-094, where c05-math-095. Let the observed time series be c05-math-096. The test statistic is

$Q = T(T+2) \sum_{k=1}^{h} \frac{\hat{\rho}(k)^2}{T-k},$

where c05-math-097 is defined in (5.3). The test rejects the null hypothesis of zero autocorrelations if

$Q > \chi^2_{1-\alpha}(h),$

where c05-math-098 is the c05-math-099-quantile of the c05-math-100-distribution with degrees of freedom c05-math-101. We can compute the observed c05-math-102-values

$p = 1 - F_{\chi^2(h)}(Q),$

for c05-math-103, where c05-math-104 is the distribution function of the c05-math-105-distribution with degrees of freedom c05-math-106. Small observed c05-math-107-values indicate that the observations are not compatible with the null hypothesis.
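The statistic and its observed p-value can be computed directly; the following sketch uses the chi-squared distribution from SciPy and the sample_acf helper of the earlier sketch (both are our assumptions, not the book's code).

import numpy as np
from scipy.stats import chi2

def box_ljung(y, h):
    # Q = T(T+2) * sum_{k=1}^h rho_hat(k)^2 / (T - k); p-value from chi2(h).
    T = len(y)
    rho = sample_acf(y, h)
    Q = T * (T + 2) * np.sum(rho**2 / (T - np.arange(1, h + 1)))
    return Q, chi2.sf(Q, df=h)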

5.1.2.2 Autocovariance for Vector Time Series

Let c05-math-108 be a vector time series with two components. Vector time series c05-math-109 is covariance stationary when the components c05-math-110 and c05-math-111 are covariance stationary and

for all c05-math-113. Thus, vector time series c05-math-114 is covariance stationary when c05-math-115 is a vector of constants, not depending on c05-math-116, and the covariance

equation

depends only on c05-math-117 but not on c05-math-118 for c05-math-119.

For a covariance stationary time series the autocovariance function is defined by

5.5 equation

For a scalar covariance stationary time series c05-math-121 we have

$\gamma(-k) = \gamma(k).$

However, the autocovariance function of a vector time series satisfies4

5.6 $\Gamma(-k) = \Gamma(k)^{T}.$

5.2 Model Free Estimation

Univariate and multivariate descriptive statistics and graphical tools can be applied to get insight into a distribution of a time series. We can apply c05-math-123-variate descriptive statistics and graphical tools to the c05-math-124-dimensional marginal distributions of a time series. This is discussed in Section 5.2.1.

Univariate and multivariate density estimators and regression estimators can be applied to time series data. We can apply c05-math-125-variate estimators to the c05-math-126-dimensional marginal distributions of a time series. This is discussed in Section 5.2.2, by assuming that the time series is a Markov process of order c05-math-127.

Section 5.2.3 considers modeling time series with a combination of parametric and nonparametric methods. First, a static parametric model is posed for the observations, and then the time dynamics are introduced with time space or state space smoothing. The approach includes both the local likelihood method, covered in Section 5.2.3.1, and the local least squares method, covered in Section 5.2.3.2. We apply local likelihood and local least squares to estimate a time varying tail index in Section 5.2.3.3.

5.2.1 Descriptive Statistics for Time Series

Univariate statistics, as defined in Section 3.1, can be used to describe time series data c05-math-128. Using univariate statistics, like sample mean and sample variance, is reasonable if c05-math-129 are identically distributed.

Multivariate statistics, as defined in Section 4.1, can be used to describe vector time series data c05-math-130. Again, the use of multivariate statistics like sample correlation is reasonable if c05-math-131 are identically distributed.

Multivariate statistics can be used also for univariate time series data c05-math-132 if we create a vector time series from the initial univariate time series. We can create a two-dimensional vector time series by defining

5.7 $X_t = (Y_t, Y_{t+k}), \qquad t = 1, \ldots, T-k,$

for some c05-math-134. Now we can compute a sample correlation coefficient, for example, from data c05-math-135. This is reasonable if c05-math-136 are identically distributed. The requirement that c05-math-137 in (5.7) are identically distributed follows from strict stationarity of c05-math-138.

5.2.2 Markov Models

We have defined strict stationarity in Section 5.1.1. A strictly stationary time series c05-math-139 can be defined by giving all finite dimensional marginal distributions. That is, to define the distribution of a strictly stationary time series we need to define the distributions

equation

for all c05-math-140. If the time series is IID(0,c05-math-141), then we need only to define the distribution of c05-math-142. We say that the time series is a Markov process, if

equation

To define a Markov process we need to define the distribution of c05-math-143 and c05-math-144. More generally, we say that the time series is a Markov process of order c05-math-145, if

equation

To define a Markov process of order c05-math-146 we need to define the distributions of c05-math-147, c05-math-148, …, c05-math-149.

To estimate nonparametrically the distribution of a Markov process of order c05-math-150, we can estimate the distributions of c05-math-151, c05-math-152, …, c05-math-153 nonparametrically.

5.2.3 Time Varying Parameter

Let c05-math-154 be a time series. Let c05-math-155 be a density function, where c05-math-156 is a parameter, and c05-math-157. We could ignore the time series properties and assume that c05-math-158 are independent and identically distributed with density c05-math-159.

However, we can assume that parameter c05-math-160 changes in time. Then the observations are not identically distributed, but c05-math-161 has density c05-math-162. In practice, we do not specify any dynamics for c05-math-163, but construct estimates c05-math-164 using nonparametric smoothing.

Note that even if we assumed independent and identically distributed observations, with time series data the parameter estimate changes in time, because at time c05-math-165 the estimate c05-math-166 is constructed using data c05-math-167. This is called sequential estimation.

5.2.3.1 Local Likelihood

If c05-math-168 are independent with density c05-math-169, then the density of c05-math-170 is

equation

The maximum likelihood estimator of c05-math-171 is the value c05-math-172 maximizing

equation

over c05-math-173. We can find a time varying estimator c05-math-174 using either time space or state space localization. The local likelihood approach has been studied in Spokoiny (2010). The localization is discussed more in Sections 6.1.1 and 6.1.2.

Time Space Localization

Let

5.8 $w_i(t) = K\!\left(\frac{t-i}{h}\right), \qquad i = 1, \ldots, t,$

where c05-math-176 is the smoothing parameter and c05-math-177 is a kernel function. For example, we can take c05-math-178. Let c05-math-179 be the value maximizing

5.9 $\sum_{i=1}^{t} w_i(t) \log f(Y_i; \theta)$

over c05-math-181.

For example, let us consider the model

equation

where c05-math-182 are i.i.d. Denote c05-math-183 and c05-math-184, where c05-math-185 is the density of the standard normal distribution. Let c05-math-186. Now

equation

Then c05-math-187, where

5.10 $\hat{\theta}_t = \frac{\sum_{i=1}^{t} w_i(t)\, Y_i}{\sum_{i=1}^{t} w_i(t)}.$
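In this Gaussian example the time localized maximum likelihood estimate is a kernel-weighted average of the past observations, which can be computed directly. A minimal sketch, assuming a one-sided exponential kernel (the kernel choice and the names are ours):

import numpy as np

def local_mean(y, h):
    # Weighted average of Y_1, ..., Y_t with weights w_i(t) = exp(-(t - i)/h).
    y = np.asarray(y, dtype=float)
    est = np.empty(len(y))
    for t in range(len(y)):
        w = np.exp(-(t - np.arange(t + 1)) / h)
        est[t] = np.sum(w * y[:t + 1]) / np.sum(w)
    return est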

State Space Localization

Let us observe the state variables c05-math-189 in addition to observing time series c05-math-190. Let

where c05-math-192 is the smoothing parameter and c05-math-193 is a kernel function. We can take c05-math-194 the density of the standard normal distribution. Let c05-math-195 be the value maximizing (5.9) over c05-math-196.

For example, let us consider the model

equation

where c05-math-197 are i.i.d. The model can be written as

equation

Denote c05-math-198 and c05-math-199, where c05-math-200 is the density of the standard normal distribution. Then c05-math-201, as defined in (5.10).

5.2.3.2 Local Least Squares

Let us consider a linear model with time changing parameters. Let us observe the explanatory variables c05-math-202 in addition to observing time series c05-math-203. Consider the model

equation

where c05-math-204, c05-math-205 are time dependent constants, c05-math-206 is the vector of explanatory variables, and c05-math-207 is an error term.

We define the estimates of the time varying regression coefficients as the values c05-math-208 and c05-math-209 minimizing

equation

where c05-math-210 is the time space localized weight defined in (5.8). When, in addition, we observe the state variables c05-math-211, we can use the state space localized weight c05-math-212, defined in (5.11).
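The localized criterion is an ordinary weighted least squares problem at each time point. A sketch with Gaussian time space weights (the two-sided weighting and the names are our assumptions):

import numpy as np

def local_ls(y, X, t0, h):
    # Minimize sum_t w_t(t0) * (y_t - a - b'x_t)^2 by weighted least squares.
    y = np.asarray(y, dtype=float)
    T = len(y)
    w = np.exp(-0.5 * ((t0 - np.arange(T)) / h)**2)   # Gaussian kernel weights
    Z = np.column_stack([np.ones(T), X])              # intercept and regressors
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(Z * sw[:, None], y * sw, rcond=None)
    return coef                                       # (a_hat(t0), b_hat(t0), ...)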

5.2.3.3 Time Varying Estimators for the Excess Distribution

We discussed tail modeling in Section 3.4. The idea in tail modeling is to fit a parametric model only to the data in the left tail or to the data in the right tail. We can add time space or state space localization to the tail modeling. As before, c05-math-213 is a time series.

Local Likelihood in Tail Estimation

Let family c05-math-214, c05-math-215, model the excess distribution of a return distribution, where c05-math-216. This means that if c05-math-217 is the density of the return distribution then we assume for the left tail that

equation

for some c05-math-218, where c05-math-219, and c05-math-220 is the c05-math-221th quantile of the return density: c05-math-222. For the right tail the corresponding assumption is

equation

for some c05-math-223, where c05-math-224, and c05-math-225 is the c05-math-226th quantile of the return density: c05-math-227.

The local maximum likelihood estimator for the parameter of the left tail is obtained from (3.63) as

5.12 equation

where c05-math-229 is the empirical quantile computed from c05-math-230, c05-math-231, and

equation

The time space localized weights c05-math-232 are modified from (5.8) as

where c05-math-234 is the smoothing parameter and c05-math-235 is a kernel function. If the state variables c05-math-236 are available, then we can use the state space localized weights, modified from (5.11) as

where c05-math-238 is the smoothing parameter and c05-math-239 is a kernel function.

The local maximum likelihood estimator for the parameter of the right tail is obtained from (3.64) as

equation

where c05-math-240 is the empirical quantile, c05-math-241, and

equation

The weights are obtained from (5.13) and (5.14) by replacing c05-math-242 with c05-math-243.

For example, let us assume that the excess distribution is the Pareto distribution, defined in (3.74) as

equation

where c05-math-244 is the shape parameter. The maximum likelihood estimator has the closed form expression (3.75). The local maximum likelihood estimators are

5.15 equation

and

5.16 equation

where c05-math-247, and c05-math-248. For the left tail we assume that c05-math-249, and for the right tail we assume that c05-math-250. These are time varying versions of Hill's estimator.
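A sketch of the time varying Hill estimate for the right tail at time t, assuming an exponential kernel and an empirical quantile threshold computed from the data up to time t (these details and the names are our choices):

import numpy as np

def tv_hill_right(y, t, h, p=0.95):
    # Time-localized Hill estimate from the observations Y_1, ..., Y_t
    # exceeding the empirical p-quantile u; requires u > 0.
    y = np.asarray(y[:t + 1], dtype=float)
    u = np.quantile(y, p)
    w = np.exp(-(t - np.arange(t + 1)) / h)   # time space localized weights
    tail = y > u
    return np.sum(w[tail]) / np.sum(w[tail] * np.log(y[tail] / u))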

Figure 5.6 studies time varying Hill's estimates for the S&P 500 daily data, described in Section 2.4.1. Panel (a) shows the estimates for the left tail index and panel (b) shows the estimates for the right tail index. Sequentially calculated Hill's estimates are shown in black, time localized Hill's estimates with c05-math-251 are shown in blue, and the case with c05-math-252 is shown in yellow. The exponential kernel function is used. The estimation is started after there are 4 years of data. The tails are defined by the empirical quantile c05-math-253 with c05-math-254 and c05-math-255.


Figure 5.6 Time varying Hill's estimator. (a) Left tail index; (b) right tail index. The black curves show sequentially calculated Hill's estimates, the blue curves show the time localized estimates with c05-math-256 and the yellow curves have c05-math-257.

Time Varying Regression Estimator for Tail Index

Let c05-math-258 be the observed time series at time c05-math-259. The regression estimator for the parameter c05-math-260 of the Pareto distribution is given in (3.77). Let

equation

The local regression estimator of the parameter of the left tail is

equation

where c05-math-261 is the empirical quantile and we assume c05-math-262. The weights c05-math-263 are obtained from (5.13) and (5.14) by replacing index c05-math-264 with the index c05-math-265, so that the weights correspond to the ordering c05-math-266.

The local regression estimator of the parameter of the right tail is

equation

where c05-math-267 are the observations in reverse order,

equation

c05-math-268 is the empirical quantile, c05-math-269, and we assume c05-math-270.

Figure 5.7 studies time varying regression estimates for the tail index using the S&P 500 daily data, described in Section 2.4.1. Panel (a) shows the estimates for the left tail index and panel (b) shows the estimates for the right tail index. Sequentially calculated regression estimates are shown in black, time localized estimates with c05-math-271 are shown in blue, and the case with c05-math-272 is shown in yellow. The standard Gaussian kernel function is used. The estimation is started after there are 4 years of data. The tails are defined by the empirical quantile c05-math-273 with c05-math-274 and c05-math-275.


Figure 5.7 Time varying regression estimator. Time series of estimates of the tail index are shown. (a) Left tail index; (b) right tail index. The black curves show sequentially calculated regression estimates, the blue curves show the time localized estimates with c05-math-276 and the yellow curves have c05-math-277.

5.3 Univariate Time Series Models

We first discuss ARMA (autoregressive moving average) processes and then conditional heteroskedasticity models. Conditional heteroskedasticity models include ARCH (autoregressive conditional heteroskedasticity) and GARCH (generalized autoregressive conditional heteroskedasticity) models. The ARMA, ARCH, and GARCH processes are discrete time stochastic processes. We also discuss continuous time stochastic processes, because geometric Brownian motion and related continuous time stochastic processes are widely used in option pricing.

Brockwell and Davis (1991) give a detailed presentation of linear time series analysis; Fan and Yao (2005) give a short introduction to ARMA models and a more detailed discussion of nonlinear models. Shiryaev (1999) presents results of time series analysis that are useful in finance.

5.3.1 Prediction and Conditional Expectation

Our presentation of discrete time series analysis is directed towards giving prediction formulas: these prediction formulas are used in Chapter 7 to provide benchmarks for the evaluation of the methods of volatility prediction. Chapter 6 studies nonparametric prediction.

Let c05-math-278 be a time series with c05-math-279. We take the conditional expectation

5.17 equation

to be the best prediction of c05-math-281, given the observations c05-math-282, where c05-math-283 is the prediction step. Using the conditional expectation as the best predictor can be justified by the fact that the conditional expectation minimizes the mean squared error. In fact, the function c05-math-284 minimizing

5.18 equation

is the conditional expectation: c05-math-286.5

Besides predicting the value c05-math-298, we also consider predicting the squared value c05-math-299.

In the following text, we give expressions for c05-math-300 in the ARMA models and for c05-math-301 in the ARCH and GARCH models. These expressions depend on the unknown parameters of the models. In order to apply the expressions we need to estimate the unknown parameters and insert the estimates into the expressions.

The conditional expectation whose condition is the infinite past is a function c05-math-302 of the infinite past. Since we have available only a finite number of observations, we have to truncate these functions to obtain a function c05-math-303. It would be more useful to obtain formulas for

equation

and

equation

However, these formulas are more difficult to derive than the formulas where the condition of the conditional expectation is the infinite past.

5.3.2 ARMA Processes

ARMA processes are defined in terms of an innovation process. After defining innovation processes, we define MA (moving average) processes and AR (autoregressive) processes. ARMA processes are obtained by combining autoregressive and moving average processes.

5.3.2.1 Innovation Processes

Innovation processes are used to build more complex processes, like ARMA and GARCH processes. We define two innovation processes: a white noise process and an i.i.d. process.

We say that c05-math-304 is a white noise process and write c05-math-305 if

  1. c05-math-306,
  2. c05-math-307,
  3. c05-math-308 for c05-math-309,

where c05-math-310 is a constant. A white noise is a Gaussian white noise if c05-math-311.

We say that c05-math-312 is an i.i.d. process and write c05-math-313 if

  1. c05-math-314,
  2. c05-math-315,
  3. c05-math-316 and c05-math-317 are independent for c05-math-318.

An i.i.d. process is also a white noise process. A Gaussian white noise is also an i.i.d. process.

5.3.2.2 Moving Average Processes

We first define a moving average process of finite order, then give prediction formulas, and finally define a moving average process of infinite order.

MA(c05-math-319) Process

We use MA(c05-math-320) as a shorthand notation for a moving average process of order c05-math-321. A moving average process c05-math-322 of order c05-math-323 is a process satisfying

$Y_t = \epsilon_t + \theta_1 \epsilon_{t-1} + \cdots + \theta_q \epsilon_{t-q},$

where c05-math-324, c05-math-325 is a white noise process, and c05-math-326.

Figure 5.8 illustrates the definition of MA(c05-math-327) processes. In panel (a) c05-math-328 and in panel (b) c05-math-329. When c05-math-330, then c05-math-331 and c05-math-332 have one common white noise term, but c05-math-333 and c05-math-334 have no common white noise terms. When c05-math-335, then c05-math-336 and c05-math-337 have two common white noise terms, c05-math-338 and c05-math-339 have one common white noise term, and c05-math-340 and c05-math-341 have no common white noise terms.


Figure 5.8 The definition of a MA(c05-math-342) process. (a) MA(c05-math-343) process; (b) MA(c05-math-344) process.

We have that

and

where c05-math-347. Thus, an MA(c05-math-348) process is such that a correlation exists between c05-math-349 and c05-math-350 only if c05-math-351. Equations (5.19) and (5.20) show that an MA(c05-math-352) process is covariance stationary.

If we are given a covariance function c05-math-353 such that c05-math-354 for c05-math-355, we can construct an MA(c05-math-356) process with this covariance function by solving c05-math-357 and c05-math-358 from the c05-math-359 equations

equation
Prediction of MA Processes

The conditional expectation c05-math-360 is the best prediction of c05-math-361 for c05-math-362, given the infinite past c05-math-363, in the sense of the mean squared prediction error, as we mentioned in Section 5.3.1. We denote c05-math-364. The best linear prediction in the sense of the mean squared error is given in (6.19). We can use that formula once the covariance function of the MA(c05-math-365) process has been estimated.

A recursive prediction formula for the MA(c05-math-366) process can be derived as follows. We have that

equation

because c05-math-367 for c05-math-368. The noise terms c05-math-369 are not observed, but we can write

equation

This leads to a formula for c05-math-370 in terms of the infinite past c05-math-371. For example, for the MA(1) process c05-math-372 we have

5.21 $E\left(Y_{T+1} \mid Y_T, Y_{T-1}, \ldots\right) = \theta_1 \sum_{j=0}^{\infty} (-\theta_1)^j\, Y_{T-j}.$

The prediction formula for prediction step c05-math-374 is a version of the exponential moving average, which is defined in (6.7).

We can obtain a recursive prediction for practical use in the following way. Define c05-math-375, when c05-math-376 and

equation

when c05-math-377. Finally we define the c05-math-378-step prediction as

equation

For example, for the MA(1) process c05-math-379 we get the truncated formulas

equation

In the implementation, the parameters c05-math-380 have to be replaced by their estimates.
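The truncated recursion is straightforward to implement. A sketch for the MA(1) case, allowing a nonzero mean mu as a generalization of the zero-mean formulas above:

def ma1_predict(y, theta, mu=0.0):
    # Truncated innovations: eps_hat_0 = 0 and
    # eps_hat_t = y_t - mu - theta * eps_hat_{t-1}.
    eps_hat = 0.0
    for yt in y:
        eps_hat = yt - mu - theta * eps_hat
    # One-step prediction; for prediction steps >= 2 the MA(1) prediction is mu.
    return mu + theta * eps_hat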

MA(c05-math-381) Process

A moving average process of infinite order is defined as

$Y_t = \sum_{j=0}^{\infty} \theta_j\, \epsilon_{t-j}.$

The series converges in mean square if6

$\sum_{j=0}^{\infty} \theta_j^2 < \infty.$

We have that

and

Equations (5.22) and (5.23) imply that an MA(c05-math-394) process is covariance stationary. The MA(c05-math-395) representation can be used to study the properties of AR processes. For example, if we can write an AR process as an MA(c05-math-396) process, this shows that the AR process is covariance stationary.

5.3.2.3 Autoregressive Processes

An autoregressive process c05-math-397 of order c05-math-398 is a process satisfying

5.24 $Y_t = \phi_1 Y_{t-1} + \cdots + \phi_p Y_{t-p} + \epsilon_t,$

where c05-math-400, c05-math-401 is a white noise process, and c05-math-402. We assume that c05-math-403 is uncorrelated with c05-math-404. We use AR(c05-math-405) as a shorthand notation for an autoregressive process of order c05-math-406.

The autocovariance function of an AR(c05-math-407) process can be computed recursively. Multiply (5.24) by c05-math-408 from both sides and take expectations to get

5.25 $\gamma(k) = \phi_1 \gamma(k-1) + \cdots + \phi_p \gamma(k-p), \qquad k \geq 1,$

where c05-math-410. The first values c05-math-411 can be solved from the c05-math-412 equations. After that, the values c05-math-413 for c05-math-414 can be computed recursively from (5.25).

Prediction of AR Processes

Let us consider the prediction of c05-math-415 for c05-math-416 when the process is an AR(c05-math-417) process. The best prediction of c05-math-418, given the observations c05-math-419, is denoted by

equation

where we denote c05-math-420. We start with the one-step prediction. The best prediction of c05-math-421, given the observations c05-math-422, is

5.26 $\hat{Y}_{T+1} = \phi_1 Y_T + \cdots + \phi_p Y_{T-p+1},$

because c05-math-424. For the two-step prediction the best predictor is

equation

The general prediction formula is

equation

The best prediction is calculated recursively, using the value of c05-math-425 in (5.26), and the fact that c05-math-426 for c05-math-427.
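The recursion can be implemented by letting the predicted values stand in for the unobserved future values. A sketch for a zero-mean AR(p) model (the names are ours):

def ar_predict(y, phi, h):
    # h-step prediction in Y_t = phi_1 Y_{t-1} + ... + phi_p Y_{t-p} + eps_t.
    p = len(phi)
    hist = list(y[-p:])             # the last p observations
    for _ in range(h):
        pred = sum(phi[j] * hist[-1 - j] for j in range(p))
        hist.append(pred)           # predictions replace unknown future values
    return hist[-1]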

For example, for the AR(1) process c05-math-428 we have

5.27 $E\left(Y_{T+h} \mid \mathcal{F}_T\right) = \phi_1^{h}\, Y_T.$

5.3.2.4 ARMA Processes

We define an autoregressive moving average process c05-math-430, of order c05-math-431, c05-math-432, as a process satisfying

equation

where c05-math-433 and c05-math-434 is a MA(c05-math-435) process. We use ARMA(c05-math-436) as a shorthand notation for an autoregressive moving average process of order (c05-math-437).

Stationarity, Causality, and Invertibility of ARMA Processes

Let c05-math-438 be an ARMA(c05-math-439) process with

equation

Denote

equation

where c05-math-440, and c05-math-441 is the set of complex numbers. If c05-math-442 for all c05-math-443 such that c05-math-444, then there exists a unique stationary solution

equation

where the coefficients c05-math-445 are obtained from the equation

equation

where c05-math-446 for some c05-math-447; see Brockwell and Davis (1991, Theorem 3.1.3).

The condition for the covariance stationarity does not guarantee that the ARMA(c05-math-448) process would be suitable for modeling. Let us consider the AR(1) model

equation

where c05-math-449. The AR(1) model is covariance stationary if and only if c05-math-450. This can be seen in the following way. Let us consider first the case c05-math-451. We can write recursively

equation

where c05-math-452. Since c05-math-453, we get the MA(c05-math-454) representation7

equation

which implies that c05-math-461 is covariance stationary. Let us then consider the case c05-math-462. Since c05-math-463, we can write recursively

equation

where c05-math-464. Since c05-math-465, we get the MA(c05-math-466) representation8

equation

which implies that c05-math-473 is covariance stationary. The latter case c05-math-474 is not suitable for modeling because c05-math-475 is a function of future innovations c05-math-476 with c05-math-477.

We define causality of the process to exclude examples like the AR(1) model with c05-math-478. An ARMA(c05-math-479) process is called causal if there exist constants c05-math-480 such that c05-math-481 and

equation

Let the polynomials c05-math-482 and c05-math-483 have no common zeroes. Then c05-math-484 is causal if and only if c05-math-485 for all c05-math-486 such that c05-math-487.9 This has been proved in Brockwell and Davis (1991, Theorem 3.1.1). The coefficients c05-math-491 are determined by

equation

Thus, under the conditions that c05-math-492 and c05-math-493 have no common zeroes and c05-math-494 for all c05-math-495 such that c05-math-496, we have expressed an ARMA(c05-math-497) process as an infinite order moving average process. In particular, an ARMA(c05-math-498) process is covariance stationary under these conditions.

An ARMA(c05-math-499) process is called invertible if there exist constants c05-math-500 such that c05-math-501 and

equation

Let the polynomials c05-math-502 and c05-math-503 have no common zeroes. Then c05-math-504 is invertible if and only if c05-math-505 for all c05-math-506 such that c05-math-507. This has been proved in Brockwell and Davis (1991, Theorem 3.1.2). The coefficients c05-math-508 are determined by

equation
Prediction of ARMA Processes

The prediction formulas for ARMA processes given the infinite past can be found in Hamilton (1994, p. 77). For the ARMA(1,1) process c05-math-509 we have

5.28 $E\left(Y_{T+h} \mid \mathcal{F}_T\right) = \phi_1^{h-1}\left(\phi_1 + \theta_1\right) \sum_{j=0}^{\infty} (-\theta_1)^j\, Y_{T-j},$

where c05-math-511; see Shiryaev (1999, p. 151). Note that the prediction formula (5.21) of the MA(1) process and the prediction formula (5.27) of the AR(1) process follow from (5.28).

5.3.3 Conditional Heteroskedasticity Models

Time series c05-math-512 satisfies the conditional heteroskedasticity assumption if

5.29 $Y_t = \sigma_t\, \epsilon_t,$

where c05-math-514 is an IID(c05-math-515) process and c05-math-516 is the volatility process. The volatility process is a predictable random process, that is, c05-math-517 is measurable with respect to the sigma-field c05-math-518 generated by the variables c05-math-519. We also assume that c05-math-520 is independent of c05-math-521. Then,

5.30 $E\left(Y_t^2 \mid \mathcal{F}_{t-1}\right) = \sigma_t^2.$

Thus, c05-math-523 is the best prediction of c05-math-524 in the mean squared error sense. Also, for c05-math-525,

5.31 $E\left(Y_{t+h}^2 \mid \mathcal{F}_t\right) = E\left(\sigma_{t+h}^2 \mid \mathcal{F}_t\right).$

Thus, the best prediction of c05-math-527 gives the best prediction of c05-math-528, in the mean squared error sense.

ARCH and GARCH processes are examples of conditional heteroskedasticity models.

5.3.3.1 ARCH Processes

Process c05-math-529 is an ARCH(c05-math-530) process (autoregressive conditional heteroskedasticity process of order c05-math-531), if c05-math-532 where c05-math-533 is an IID(c05-math-534) process and

5.32 $\sigma_t^2 = \alpha_0 + \alpha_1 Y_{t-1}^2 + \cdots + \alpha_q Y_{t-q}^2,$

where c05-math-536 and c05-math-537. As a special case, the ARCH(1) process is defined as

$\sigma_t^2 = \alpha_0 + \alpha_1 Y_{t-1}^2.$

The ARCH model was introduced in Engle (1982) for modeling UK inflation rates. The ARCH(c05-math-539) process is strictly stationary if c05-math-540; see Fan and Yao (2005, Theorem 4.3) and Giraitis et al. (2000).

Let us consider the prediction of c05-math-541 for c05-math-542 when the process is an ARCH(c05-math-543) process. The best prediction of c05-math-544, given the observations c05-math-545, is denoted by

equation

We start with the one-step prediction. The best prediction of c05-math-546, given the observations c05-math-547, using the inference in (5.30), is

5.33 $E\left(Y_{T+1}^2 \mid \mathcal{F}_T\right) = \sigma_{T+1}^2 = \alpha_0 + \alpha_1 Y_T^2 + \cdots + \alpha_q Y_{T+1-q}^2,$

because c05-math-549. For the two-step prediction we use (5.30) to obtain the best predictor

equation

The general prediction formula is

where we denote c05-math-551. The best prediction is calculated recursively, using the value of c05-math-552 in (5.33), and the fact that c05-math-553 for c05-math-554.

The best c05-math-555-step prediction in the ARCH(1) model is

5.35 $E\left(Y_{T+h}^2 \mid \mathcal{F}_T\right) = \sigma^2 + \alpha_1^{h}\left(Y_T^2 - \sigma^2\right),$

where we assumed condition c05-math-557, which guarantees stationarity, and we denote c05-math-558.10
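The formula says that the prediction reverts geometrically toward the unconditional variance; a one-line sketch (parameter names are ours):

def arch1_predict_sq(y_last_sq, alpha0, alpha1, h):
    # E(Y_{T+h}^2 | F_T) = sigma^2 + alpha1^h * (Y_T^2 - sigma^2),
    # where sigma^2 = alpha0 / (1 - alpha1) is the unconditional variance.
    var = alpha0 / (1.0 - alpha1)
    return var + alpha1**h * (y_last_sq - var)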

5.3.3.2 GARCH Processes

Process c05-math-564 is a GARCH(c05-math-565) process (generalized autoregressive conditional heteroskedasticity process of order c05-math-566 and c05-math-567), if

5.37 equation

where c05-math-569 is an IID(c05-math-570) process and

$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i Y_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2,$

where c05-math-571, c05-math-572, and c05-math-573. As a special case we get the GARCH(1,1) model, where

5.38 $\sigma_t^2 = \alpha_0 + \alpha_1 Y_{t-1}^2 + \beta_1 \sigma_{t-1}^2.$

The GARCH model was introduced in Bollerslev (1986). The GARCH(c05-math-576) process is strictly stationary if

5.39 $E \log\left(\alpha_1 \epsilon_t^2 + \beta_1\right) < 0;$

see Fan and Yao (2005, Theorem 4.4) and Bougerol and Picard (1992).

The best one-step prediction of the squared value is obtained from (5.30) as

$E\left(Y_{T+1}^2 \mid \mathcal{F}_T\right) = \sigma_{T+1}^2.$

In the GARCH(c05-math-578) model the best c05-math-579-step prediction of the squared value, in the mean squared error sense, is

5.40 $E\left(Y_{T+h}^2 \mid \mathcal{F}_T\right) = \sigma^2 + \left(\alpha_1 + \beta_1\right)^{h-1}\left(\sigma_{T+1}^2 - \sigma^2\right),$

where we assumed condition c05-math-581, which guarantees strict stationarity, and we denote the unconditional variance by

5.41 $\sigma^2 = \frac{\alpha_0}{1 - \alpha_1 - \beta_1}.$

Let us show (5.40) for c05-math-583. Let us denote c05-math-584. We have

equation

and c05-math-585. Thus,

equation

Thus, using (5.31),

equation

We have shown (5.40), since c05-math-586, by (5.31). We can also write the best prediction of c05-math-587 in the GARCH(c05-math-588) model as

where c05-math-590.
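A sketch of the h-step prediction (5.40), given sigma^2_{T+1} and the parameters (the names are our choices):

def garch11_predict_sq(sigma2_next, alpha0, alpha1, beta1, h):
    # The forecast decays toward the unconditional variance at rate alpha1 + beta1.
    var = alpha0 / (1.0 - alpha1 - beta1)
    return var + (alpha1 + beta1)**(h - 1) * (sigma2_next - var)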

The prediction formulas (5.40) and (5.42) are written in terms of c05-math-591. We have the following formula for c05-math-592 in a strictly stationary GARCH(c05-math-593) model:

5.43 $\sigma_{T+1}^2 = \frac{\alpha_0}{1 - \beta_1} + \alpha_1 \sum_{i=0}^{\infty} \beta_1^{i}\, Y_{T-i}^2,$

where we assume c05-math-595 to ensure strict stationarity. More generally, for the GARCH(c05-math-596) model we have

where c05-math-598 are obtained from the equation

equation

for c05-math-599; see Fan and Yao (2005, Theorem 4.4).

5.3.3.3 ARCH(c05-math-600) Model

GARCH(c05-math-601) can be considered a special case of the ARCH(c05-math-602) model, since (5.43) can be written as

$\sigma_t^2 = a_0 + \sum_{j=1}^{\infty} a_j\, Y_{t-j}^2,$

where c05-math-603 and c05-math-604. We can obtain a more general ARCH(c05-math-605) model by defining

where c05-math-607, c05-math-608, and c05-math-609 is called a news impact curve. More generally, following Linton (2009), the news impact curve can be defined as the relationship between c05-math-610 and c05-math-611 holding past values c05-math-612 constant at some level c05-math-613. In the GARCH(c05-math-614) model the news impact curve is

equation

The ARCH(c05-math-615) model in (5.45) has been studied in Linton and Mammen (2005), where it was noted that the estimated news impact curve is asymmetric for S&P 500 return data. The asymmetric news impact curve can be addressed by asymmetric GARCH processes.

5.3.3.4 Asymmetric GARCH Processes

Time series of asset returns show a leverage effect: markets become more active after a price drop, and large negative returns are followed by a larger increase in volatility than large positive returns. In fact, past price changes and future volatilities are negatively correlated. This implies a negative skew in the distribution of the price changes.

The leverage effect is taken into account in the model

equation

where c05-math-617 is the skewness parameter. The model was applied in Heston and Nandi (2000) to price options.11 When c05-math-622, then under (5.46)

equation

When c05-math-623, then negative values of c05-math-624 lead to a larger increase in volatility than positive values of the same size of c05-math-625. Now the unconditional variance is

5.49 equation
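A sketch of simulation from an asymmetric GARCH(1,1) of the Heston–Nandi type; since the exact parameterization of (5.46) is not reproduced above, the recursion below is a standard form of the model and should be checked against (5.46) before use.

import numpy as np

def simulate_asym_garch(n, omega, alpha, beta, gamma, seed=0):
    # sigma_t^2 = omega + beta * sigma_{t-1}^2
    #             + alpha * (z_{t-1} - gamma * sigma_{t-1})^2,
    # Y_t = sigma_t * z_t with z_t i.i.d. N(0,1); gamma > 0 makes negative
    # shocks raise volatility more than positive shocks of the same size.
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    s2 = (omega + alpha) / (1.0 - beta - alpha * gamma**2)  # unconditional level
    y = np.empty(n)
    for t in range(n):
        s = np.sqrt(s2)
        y[t] = s * z[t]
        s2 = omega + beta * s2 + alpha * (z[t] - gamma * s)**2
    return y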

5.3.3.5 The Moment Generating Function

We need the moment generating function in order to compute the option prices when the stock follows an asymmetric GARCH(c05-math-627) process. We follow Heston and Nandi (2000). Let

equation

where c05-math-628, c05-math-629, c05-math-630 are i.i.d. c05-math-631, and

For example, when the logarithmic returns follow the asymmetric GARCH(c05-math-633) process, then

equation

so that c05-math-634. We want to find the moment generating function

equation

where c05-math-635 and c05-math-636 is the conditional expectation at time c05-math-637.

We have that

Also,

because the moment generating function of c05-math-640 is c05-math-641, and c05-math-642.

For c05-math-643 we have

5.53 equation

where c05-math-645 and c05-math-646 are defined by the recursive formulas

equation

The cases c05-math-647 and c05-math-648 were proved in (5.51) and (5.52). Let c05-math-649. Let us make the induction assumption that the formulas hold at time c05-math-650. Now,

equation

Insert values

equation

to get

equation

When c05-math-653, then

equation

Equating terms in (5.54) and (5.55) gives the result.12

Figure 5.9 shows moment generating functions c05-math-655. In panel (a) the current stock price is c05-math-656, and in panel (b) c05-math-657. The one period moment generating function (c05-math-658) is shown in black, the two period (c05-math-659) in red, and the three period (c05-math-660) in blue. The parameters c05-math-661, c05-math-662, c05-math-663, and c05-math-664 are estimated from the daily S&P 500 data of Section 2.4.1, using model (5.46).


Figure 5.9 Moment generating functions under GARCH. We show functions c05-math-665, where (a) c05-math-666 and (b) c05-math-667. The case c05-math-668 is shown in black, c05-math-669 in red, and c05-math-670 in blue.

Note that under the usual GARCH(c05-math-671) model

equation

functions c05-math-672 and c05-math-673 are defined by the recursive formulas

equation

This means that c05-math-674 and c05-math-675 depend on the unobserved sequence c05-math-676, unlike in the case of model (5.50).

5.3.3.6 Parameter Estimation

We discuss first estimation of the ARCH processes, and then extend the discussion to the GARCH processes.

Parameter Estimation for ARCH Processes

Estimation of the parameters of the ARCH(c05-math-677) model can be done using the method of maximum likelihood, if we make an assumption about the distribution of the innovation c05-math-678. When we have observed c05-math-679, the likelihood function is

equation

Let us ignore the term c05-math-680 and define the conditional likelihood

equation

Let us denote the density of c05-math-681 by c05-math-682. Then the conditional density of c05-math-683, given c05-math-684, is

equation

where

equation

The parameters are estimated by maximizing the conditional likelihood, and we get

equation

where the logarithm of the conditional likelihood is

5.56 equation

If we assume that c05-math-686 has the standard normal distribution c05-math-687 then c05-math-688 and

Parameter Estimation for GARCH Processes

In the GARCH(c05-math-690) model we can use, similarly to (5.57),

where c05-math-692. Unlike in the ARCHc05-math-693 model, c05-math-694 is a sum of infinitely many terms, and we need to truncate the infinite sum in order to be able to calculate the conditional likelihood. The value c05-math-695 can be chosen as the sample variance using c05-math-696, and c05-math-697 for c05-math-698 can be computed using the recursive formula. Then c05-math-699 is a function of c05-math-700 and of the parameters.
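A sketch of the conditional log-likelihood of the GARCH(1,1) model under standard normal innovations, with the initial variance set to the sample variance; the resulting function can be minimized numerically, for example with scipy.optimize.minimize (the names are ours).

import numpy as np

def garch11_negloglik(params, y):
    # params = (alpha0, alpha1, beta1); recursion
    # sigma_t^2 = alpha0 + alpha1 * y_{t-1}^2 + beta1 * sigma_{t-1}^2.
    alpha0, alpha1, beta1 = params
    s2 = np.var(y)                  # initial value of the variance recursion
    nll = 0.0
    for yt in y:
        nll += 0.5 * (np.log(2.0 * np.pi * s2) + yt**2 / s2)
        s2 = alpha0 + alpha1 * yt**2 + beta1 * s2
    return nll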

5.3.3.7 Fitting the GARCH(c05-math-701) Model

We fit the GARCH(c05-math-702) model for the S&P 500 index and for the individual stocks of the S&P 500.

S&P 500 Daily Data

Figure 5.10 shows tail plots of the residuals c05-math-703, where c05-math-704 is the estimated volatility in the GARCH(c05-math-705) model. Panel (a) shows the left tail plot and panel (b) the right tail plot. The black points show the residuals, the red curves show the standard normal distribution function, and the blue curves show the Student distributions with degrees of freedom 3, 6, and 12. Figure 3.2 shows the corresponding plots for the S&P 500 returns. We see that the standard normal distribution fits the central area of the distribution of the residuals well, but the tails may be better fitted with a Student distribution.


Figure 5.10 GARCH(1,1) residuals: Tail plots. (a) Left tail plot; (b) right tail plot. The red curves show the standard normal distribution function, and the blue curves show the Student distributions with degrees of freedom c05-math-706, c05-math-707, and c05-math-708.

S&P 500 Components Data

We compute GARCH estimates for daily S&P 500 components data, described in Section 2.4.5. Estimates are computed both for the GARCH(c05-math-709) model and for the Heston–Nandi modification of the GARCH(c05-math-710) model, defined in (5.46).13 Both models have parameters c05-math-712, c05-math-713, and c05-math-714. The Heston–Nandi model has the additional skewness parameter c05-math-715.

Figure 5.11(a) shows a scatter plot of c05-math-716, where c05-math-717 are estimates of c05-math-718 in the GARCH(c05-math-719) model and c05-math-720 are estimates of c05-math-721 in the Heston–Nandi model. The red points show the estimates for daily S&P 500 data, described in Section 2.4.1. Panel (b) shows a scatter plot of c05-math-722. We see that the estimates of c05-math-723 are of the order c05-math-724.


Figure 5.11 GARCH(1,1) estimates versus Heston–Nandi estimates: c05-math-725 and c05-math-726. (a) A scatter plot of c05-math-727; (b) a scatter plot of c05-math-728, where c05-math-729 and c05-math-730 are estimates in the GARCH(c05-math-731) model, and c05-math-732 and c05-math-733 are estimates in the Heston–Nandi model.

Figure 5.12(a) shows a scatter plot of c05-math-734, where c05-math-735 are estimates of c05-math-736 in the GARCH(c05-math-737) model, and c05-math-738 are estimates of c05-math-739 in the Heston–Nandi model. We leave out outliers with small estimates for c05-math-740. Panel (b) shows a histogram of estimates c05-math-741 of c05-math-742 in the Heston–Nandi model. The red points and the lines show the estimates for daily S&P 500 data, described in Section 2.4.1. We see that estimates of c05-math-743 are close to 1, and they are more linearly related in the two models than the estimates of c05-math-744 and c05-math-745. Also, we see that the estimates of the skewness parameter c05-math-746 are positive for almost all S&P 500 components, with the median value about 2.5. This indicates that high negative returns increase volatility more than the positive returns.


Figure 5.12 GARCH(1,1) estimates versus Heston–Nandi estimates: c05-math-747 and c05-math-748. (a) A scatter plot of c05-math-749, where c05-math-750 are estimates in the GARCH(c05-math-751) model, and c05-math-752 are estimates in the Heston–Nandi model. Panel (b) shows a histogram of estimates c05-math-753 of c05-math-754 in the Heston–Nandi model.

5.3.4 Continuous Time Processes

The geometric Brownian motion is used to model stock prices in the Black–Scholes model. We do not go into details about continuous time models, but it is useful to review some basic facts about them; in particular, the geometric Brownian motion appears as the limit of a discrete time binomial model.

5.3.4.1 The Brownian Motion

Stochastic process c05-math-755, c05-math-756, is called the standard Brownian motion, or the standard Wiener process, if it has the following properties:

  1. $B_0 = 0$ with probability one,
  2. $B_t - B_s \sim N(0, t-s)$ for $0 \leq s \leq t$,
  3. $B_{t_2} - B_{t_1}$ is independent of $B_{t_4} - B_{t_3}$ for $t_1 \leq t_2 \leq t_3 \leq t_4$.

The Brownian motion leads to the process

$X_t = \mu t + \sigma B_t,$

where c05-math-762 is the drift and c05-math-763 is the volatility. We can use the notation of stochastic differential equations:

$dX_t = \mu\, dt + \sigma\, dB_t.$

5.3.4.2 Diffusion Processes and Itô's Lemma

The diffusion Markov process is defined as

5.59 $X_t = X_0 + \int_0^t \mu(s, X_s)\, ds + \int_0^t \sigma(s, X_s)\, dB_s,$

where c05-math-765, c05-math-766 is a random variable, and

equation

with probability one; see Shiryaev (1999, p. 237). A definition of the stochastic integrals with respect to the Brownian motion can be found in Shiryaev (1999, p. 252).14 The definition of the process can be written with the shorthand notation of the stochastic differential equations:

5.60 $dX_t = \mu(t, X_t)\, dt + \sigma(t, X_t)\, dB_t.$

For example, a mean reverting model is defined as

$dX_t = \kappa\left(\theta - X_t\right) dt + \sigma\, dB_t.$

Let c05-math-775 be a diffusion process as in (5.60), and let c05-math-776, where c05-math-777 is continuously differentiable with respect to the first argument and twice continuously differentiable with respect to the second argument. Furthermore, we assume that c05-math-778. Then c05-math-779 is a diffusion Markov process with

5.61 equation

where

equation

and c05-math-781, c05-math-782, and c05-math-783 are related by c05-math-784. The expression for c05-math-785 follows from Itô's lemma; see Shiryaev (1999, p. 263).15

5.3.4.3 The Geometric Brownian Motion

The geometric Brownian motion is the stochastic process

5.62 $S_t = S_0 \exp\left\{\left(\mu - \sigma^2/2\right) t + \sigma B_t\right\},$

where c05-math-794 is the standard Brownian motion, c05-math-795, and c05-math-796. The stochastic differential equation of the geometric Brownian motion is

5.63 $dS_t = \mu S_t\, dt + \sigma S_t\, dB_t.$

The fact that the solution of the stochastic differential equation in (5.63) is given in (5.62) follows from Itô's formula. Indeed, we consider diffusion process c05-math-798, c05-math-799, c05-math-800, and c05-math-801. Then Itô's formula implies that c05-math-802 is a diffusion process with c05-math-803 and c05-math-804.
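The explicit solution makes exact simulation on a time grid straightforward. A sketch, assuming the Black–Scholes parameterization of the drift (a convention, to be checked against (5.62)):

import numpy as np

def simulate_gbm(s0, mu, sigma, n_steps, dt, seed=0):
    # Exact discretization:
    # S_{t+dt} = S_t * exp((mu - sigma^2/2) * dt + sigma * sqrt(dt) * Z).
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_steps)
    log_incr = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z
    return s0 * np.exp(np.concatenate(([0.0], np.cumsum(log_incr))))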

5.3.4.4 Girsanov's Theorem

Let c05-math-805 be a filtered probability space and let c05-math-806 be a Brownian motion. Let c05-math-807 be a stochastic process with c05-math-808, for c05-math-809. We construct a process c05-math-810 by setting

equation

If c05-math-811, then c05-math-812. We can define a probability measure c05-math-813 on c05-math-814 by

equation

where c05-math-815. Let c05-math-816 be the restriction of c05-math-817 to c05-math-818. Measure c05-math-819 is equivalent to c05-math-820. Girsanov's theorem states that

5.64 equation

defines a Brownian motion c05-math-822; see Shiryaev (1999, p. 269). A proof can be found in Shiryaev (1999, Chapter VII, Section 3b).

5.4 Multivariate Time Series Models

The multivariate GARCH model is defined for vector time series c05-math-823 that has c05-math-824 components. It is assumed that c05-math-825 is strictly stationary and

where c05-math-827 is the square root of a positive definite covariance matrix c05-math-828, c05-math-829 is measurable with respect to the sigma-algebra generated by c05-math-830, and c05-math-831 is a c05-math-832-dimensional i.i.d. process with c05-math-833 and c05-math-834, where c05-math-835 is the c05-math-836 identity matrix.

The square root of c05-math-837 can be defined by writing the eigenvalue decomposition c05-math-838, where c05-math-839 is the diagonal matrix of the eigenvalues of c05-math-840 and c05-math-841 is the orthogonal matrix whose columns are the eigenvectors of c05-math-842. Then we define c05-math-843, where c05-math-844 is the diagonal matrix obtained from c05-math-845 by taking square root of each element. We can define c05-math-846 also as a Cholesky factor of c05-math-847.
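Both constructions are one-liners in practice. A sketch of the symmetric, eigenvalue-based square root, with the Cholesky factor as the alternative:

import numpy as np

def sqrt_psd(sigma):
    # Symmetric square root: if sigma = C diag(lambda) C', return C diag(sqrt(lambda)) C'.
    eigval, eigvec = np.linalg.eigh(sigma)
    return eigvec @ np.diag(np.sqrt(eigval)) @ eigvec.T

# Alternative: a lower triangular square root L with sigma = L L'.
# L = np.linalg.cholesky(sigma)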

Multivariate GARCH (MGARCH) processes are reviewed in McNeil et al. (2005, Section 4.6), Bauwens et al. (2006), and Silvennoinen and Teräsvirta (2009). Below we write the models only for the case c05-math-848, so that c05-math-849. The multivariate GARCH models are denoted by MGARCH(c05-math-850). We restrict ourselves to the first-order models with c05-math-851. The multivariate GARCH models are based on (5.65) but differ in the definition of the recursive formula for c05-math-852.

5.4.1 MGARCH Models

First we define the VEC model and two restrictions of it: the diagonal VEC model and the Baba–Engle–Kraft–Kroner (BEKK) model. Then we define the constant correlation model and the dynamic conditional correlation model.

Let us denote c05-math-853, c05-math-854, and c05-math-855. The VEC model and the diagonal VEC model were introduced in Bollerslev et al. (1988). The VEC model assumes that

equation

This model has 21 parameters c05-math-856. Since the model has a large number of parameters, it is useful to consider models with fewer parameters. The diagonal VEC model has only nine parameters and assumes that

Thus, in the diagonal VEC model the components of c05-math-860 follow univariate GARCH models. The BEKK model was introduced in Engle and Kroner (1995). The model has 11 parameters and can be written more easily in matrix notation as

equation

where c05-math-861 is a symmetric c05-math-862 matrix and c05-math-863 and c05-math-864 are c05-math-865 matrices. The BEKK model is obtained from the VEC model by restricting the parameters. We can express the parameters c05-math-866 of the VEC model in terms of the parameters of the BEKK model as follows:

equation

where we denote the elements of c05-math-867 by c05-math-868 and the elements of c05-math-869 by c05-math-870.

The recursive formula for c05-math-871 can be written by using the correlation matrix c05-math-872. Let c05-math-873 be the diagonal matrix of the standard deviations of c05-math-874. The correlation matrix c05-math-875, corresponding to c05-math-876, is such that c05-math-877.

The constant correlation MGARCH model, introduced in Bollerslev (1990), is such that the components of c05-math-878 follow univariate GARCH models, and the correlation matrix is constant. That is, c05-math-879 and c05-math-880, where c05-math-881 is the constant correlation matrix. The constant correlation GARCH model assumes the univariate GARCH models for the components, as in (5.66) and (5.67), and

equation

The dynamic conditional correlation MGARCH model, introduced in Engle (2002), is such that the components of c05-math-882 follow univariate GARCH models and

where c05-math-884. Engle (2002) suggests estimating

equation

where c05-math-885 is the sample covariance with c05-math-886 and c05-math-887. We do not typically have c05-math-888, and thus the conditional correlation is estimated from

equation

where c05-math-889 and c05-math-890.

5.4.2 Covariance in MGARCH Models

The recursive equation (5.68) in the stationary diagonal VEC model implies that

equation

This follows similarly to the case of the GARCH(c05-math-891) model (see (5.43) and (5.44)). The recursive equation (5.69) in the stationary dynamic conditional correlation GARCH model similarly implies that

equation

where c05-math-892.

Given the observations c05-math-893, we estimate the parameters, similarly to GARCH(c05-math-894) estimation in (5.58), by maximizing the conditional modified likelihood,

equation

where c05-math-895, c05-math-896 is the density of the standard normal bivariate distribution c05-math-897, and c05-math-898 is the truncated covariance, with elements c05-math-899, c05-math-900, c05-math-901, where

equation

and c05-math-902, c05-math-903 are defined similarly.

Given the data c05-math-904, the MGARCHc05-math-905 estimator for the conditional covariance is

5.70 equation

where the parameter estimators c05-math-907, c05-math-908, and c05-math-909 are calculated with the maximum likelihood method.

5.5 Time Series Stylized Facts

Time series models of financial time series should be able to capture stylized facts. We describe the stylized facts mainly using the daily S&P 500 index data, described in Section 2.4.1. Stylized facts of financial time series are studied by Cont (2001) and Bouchaud (2002).

  1. Returns are uncorrelated.

    Figure 5.5(a) shows the sample autocorrelation function for the S&P 500 returns. Sample autocorrelations are small, although they are not completely inside the 95% confidence band.

    When the time scale is shorter than tens of minutes, there can be considerable correlation; see Cont (2001) and Bouchaud (2002).

  2. Absolute returns are correlated.

    Figure 5.5(b) shows the sample autocorrelation function for the absolute S&P 500 returns. The sample autocorrelation goes inside the 95% confidence band after the lag of 500 days, but does not stay inside the band.

    The decay of the autocorrelation of absolute returns follows roughly a power law with an exponent in the range c05-math-910; see Cont (2001).

    Since absolute returns are correlated, the time series of returns does not consist of independent observations, even though the returns are uncorrelated. The autocorrelation can also be seen in scatter plots. Figure 5.13 shows scatter plots of absolute returns. Panel (a) shows the scatter plot of points c05-math-911 c05-math-912. Panel (b) shows the scatter plot of points c05-math-913 c05-math-914.

  3. Volatility is clustered.

    There are localized outbursts of volatility. The bursts of high volatility last for some time, and then the volatility returns to more normal levels.

    Figure 5.14 shows simulated GARCHc05-math-915 returns and real S&P 500 returns. Panel (a) shows a time series of returns that are simulated from the GARCHc05-math-916 model with parameters being equal to the estimates from S&P 500 daily data. The first return is simulated from the distribution c05-math-917. Panel (b) shows the time series of logarithmic S&P 500 returns. S&P 500 data is described in Section 2.4.1. Figure 3.29 shows the corresponding simulated i.i.d. Gaussian returns.

    The decay of volatility correlation is slow. The volatility correlation can be defined as the autocorrelation of squared returns, and the autocorrelation of the squared returns shows similar behavior as the autocorrelation of the absolute returns. Volatility displays a positive autocorrelation over several days; see Cont (2001) and Bouchaud (2002).

  4. Extreme returns appear in clusters.

    Figure 5.15 shows the 10 largest and the 10 smallest returns of S&P 500. The largest returns are shown in green and the smallest returns are shown in red. We can see that the biggest losses and the biggest gains occur at nearly the same dates.

  5. Leverage effect.

    Markets become more active after a price drop; past price changes and future volatilities are negatively correlated. This implies a negative skew to the distribution of the price changes. The leverage effect has been taken into account in the VGARCH model in Engle and Ng (1993) and in the VGARCH related option pricing in Heston and Nandi (2000). We study asymmetric GARCH models in Section 5.3.3.

    Figures 5.11 and 5.12 study parameter fitting in the basic GARCH(c05-math-918) model and in an asymmetric GARCH(c05-math-919). Figure 5.12(b) shows that the skewness parameter tends to be positive for S&P 500 components.

  6. Conditional heavy tails.

    Even after correcting the returns for volatility clustering, the residual time series still has heavy tails. The residuals may be calculated, for example, via GARCH-type models.

    Figure 5.10 shows the tails of the residuals when GARCH(c05-math-920) is fitted to S&P 500 daily data.

  7. The kurtosis has slow decay.

    This means that the autocorrelation of the fourth power of the returns has slow decay; see Bouchaud (2002).

  8. Volatility and volume are correlated.

    Volatility and the volume of the activity have long-ranged correlations; see Cont (2001) and Bouchaud (2002).


Figure 5.13 S&P 500 scatter plots of absolute returns. (a) Scatter plot of points c05-math-921; (b) scatter plot of points c05-math-922.


Figure 5.14 Simulated GARCHc05-math-923 returns and S&P 500 returns. (a) A time series of simulated returns from a GARCHc05-math-924 model; (b) the time series of S&P 500 returns.


Figure 5.15 S&P 500 returns. The 10 smallest returns are shown in red and the 10 largest returns are shown in green.

Notes

equation

and (5.4) implies (5.6).

equation

because c05-math-290. Thus, c05-math-291 is minimized with respect to c05-math-292 by choosing c05-math-293. Note that the conditional expectation defined as c05-math-294 is a real-valued function of c05-math-295, but c05-math-296 is a real-valued random variable, which can be defined as c05-math-297.

equation

Thus, the best c05-math-561-step prediction of c05-math-562 in the ARCH(1) model is given in (5.34), where c05-math-563 and we used (5.31) and (5.36).

5.47 equation

which is for c05-math-619 equal to the GARCH(c05-math-620) model. Engle and Ng (1993) have defined the VGARCH model

5.48 equation

Menn and Rachev (2009) propose the GARMAX model, which can also cope with the leverage effect.

equation

Thus,

equation

because

equation

Finally,

equation

equation

the stochastic integral is defined as

equation

where c05-math-767 are random variables, c05-math-768, and we denote c05-math-769. The stochastic integral can be defined for "square integrable" random functions c05-math-770 as the "limit" of integrals c05-math-771 of simple functions c05-math-772, "approximating" function c05-math-773.

equation

where c05-math-787 and c05-math-788 are the first and the second derivatives. Taylor expansion gives

equation

where c05-math-789. If the changes have zero mean, c05-math-790. Thus, in the stochastic case the second-order term is not of a smaller order than the first-order term, whereas in the deterministic case the second-order term is of a smaller order than the first-order term. Itô's lemma holds for the class of Itô processes. An Itô process is defined as

equation

Itô processes are more general than diffusion processes, because in diffusion processes the dependence on c05-math-791 is through c05-math-792; see Shiryaev (1999, p. 257).