Time series analysis can be used to analyze both a univariate time series and a vector time series. We are interested in estimating the dependence between consecutive observations. In a vector time series there is both cross-sectional dependence and time series dependence, which means that the components of the vector depend on each other at any given point of time, and the future values of the vectors depend on the past values of the vectors.
Model free time series analysis can estimate the joint distribution of
for some . The estimation could be done using nonparametric multivariate density estimation. A different model free approach models at the first step the distribution of parametrically, using density . At the second step, parameter is taken to be time dependent. This leads to a semiparametric time series analysis, because we combine a cross-sectional parametric model with a time varying estimation of the parameter. Time localized maximum likelihood or time localized least squares can be used to estimate the parameter. Of particular interest is to estimate a univariate excess distribution with a time varying , because this leads to time varying quantile estimation.
Prediction is one of the most important applications of time series analysis. In prediction it is useful to use regression models
where and is noise. For the estimation of we can use nonparametric regression. We study prediction with models (5.2) in Chapter 6.
Autoregressive moving average (ARMA) models are classical parametric models for time series analysis. It is of interest to find formulas for the conditional expectation in ARMA models, because these formulas can be used to construct predictors. The formulas for the conditional expectation in ARMA models give insight into different types of predictors: AR models lead to state space prediction, and MA models lead to time space prediction.
Prediction of future returns of a financial asset is difficult, but prediction of future absolute returns and future squared returns is feasible. Generalized autoregressive conditional heteroskedasticity (GARCH) models are applied in the prediction of squared returns. Prediction of future squared returns is called volatility prediction. Prediction of volatility is applied in Chapter 7.
We concentrate on time series analysis in discrete time, but we define also some continuous time stochastic processes, like the geometric Brownian motion, because it is a standard model in option pricing.
Section 5.1 discusses strict stationarity, covariance stationarity, and autocovariance function. Section 5.2 studies model free time series analysis. Section 5.3 studies parametric time series models, in particular, ARMA and GARCH processes. Section 5.4 considers models for vector time series. Section 5.5 summarizes stylized facts of financial time series.
A time series (stochastic process) is a sequence of random variables, indexed by time. We define time series models for double infinite sequences
A time series model can also be defined for a one-sided infinite sequence , where or . A realization of a time series is a finite sequence of observed values. We use the term “time series” both to denote the underlying stochastic process and a realization of the stochastic process. Besides a sequence of real valued random variables, we can consider a vector time series, which is a sequence of random vectors .1
A time series is called strictly stationary if and are identically distributed for all . This means that for a strictly stationary time series all finite dimensional marginal distributions are equal.
Figure 5.1(a) shows a time series of S&P 500 daily prices, using data described in Section 2.4.1. The time series has an exponential trend and is not stationary. The exponential trend can be removed by taking logarithms, as shown in panel (b), but after that we have a time series with a linear trend. The linear trend can be removed by taking differences, as shown in panel (c), which leads to the time series of logarithmic returns, which already seems to be a stationary time series. Figure 2.1(b) shows that the gross returns seem to be stationary.
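As a minimal sketch of this transformation chain (with a hypothetical price vector standing in for the S&P 500 data of Section 2.4.1), the following Python lines compute log prices, logarithmic returns, and gross returns.

```python
import numpy as np

# hypothetical daily closing prices standing in for the S&P 500 data
prices = np.array([100.0, 101.5, 100.8, 102.3, 103.0])

log_prices = np.log(prices)         # taking logarithms removes the exponential trend
log_returns = np.diff(log_prices)   # differencing removes the remaining linear trend
gross_returns = prices[1:] / prices[:-1]

# logarithmic returns are the logarithms of the gross returns
print(np.allclose(log_returns, np.log(gross_returns)))   # True
```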
Figure 5.2(a) shows a time series of differences of S&P 500 prices, which is not a stationary time series. Panels (b) and (c) show short time series of price differences, which seem to be approximately stationary. Thus, we could also define the concept of approximate stationarity.
Figure 5.3 studies a time series of squares of logarithmic returns, computed from the daily S&P 500 data, which is described in Section 2.4.1. The squared logarithmic returns are often modeled as a stationary GARCH() time series. However, we can also model the squared logarithmic returns with a signal plus noise model
where , is a deterministic trend, and is stationary white noise. We can estimate the trend with a moving average . Moving averages are defined in Section 6.1.1. Panel (a) shows time series (black circles) and (red line). Panel (b) shows . Panel (b) suggests that subtracting the moving average could lead to stationarity. We use the one-sided exponential moving average in (6.3) with smoothing parameter .
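The detrending step can be sketched as follows; the one-sided exponentially weighted average below is an illustrative stand-in for the moving average of (6.3), and the smoothing parameter value is a placeholder rather than the one used in the figure.

```python
import numpy as np

def one_sided_ewma(x, h):
    """One-sided exponentially weighted moving average of x_1, ..., x_t;
    h plays the role of the smoothing parameter (illustrative parameterization)."""
    lam = 1.0 - 1.0 / h
    avg = np.empty(len(x))
    s, w = 0.0, 0.0
    for t, xt in enumerate(x):
        s = lam * s + (1.0 - lam) * xt
        w = lam * w + (1.0 - lam)      # accumulated weight, corrects the start-up bias
        avg[t] = s / w
    return avg

y = np.random.default_rng(0).standard_normal(1000) ** 2   # stand-in for squared returns
f_hat = one_sided_ewma(y, h=20)                           # estimated trend
detrended = y - f_hat                                      # candidate stationary series
```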
A random walk is a discrete time stochastic process defined by
where is a random variable or a fixed value, and is distributed as IID(). We have that
If is a constant, then and . Thus, a random walk is not strictly stationary (and not covariance stationary). We obtain a Gaussian random walk if is Gaussian white noise. If , then a Gaussian random walk satisfies .
Figure 5.4(a) shows the time series of S&P 500 prices over a period of 100 days. Panel (b) shows a simulated Gaussian random walk of length 100, when the initial value is 0. A random walk leads to a time series that has a stochastic trend. A stochastic trend is difficult to distinguish from a deterministic trend. A time series of stock prices resembles a random walk. Also a time series of a dividend price ratio in Figure 6.7(a) resembles a random walk.
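A Gaussian random walk such as the one in panel (b) can be simulated directly from the definition; the sketch below assumes the initial value 0 and unit innovation variance.

```python
import numpy as np

rng = np.random.default_rng(1)
eps = rng.standard_normal(100)     # Gaussian white noise innovations
random_walk = np.cumsum(eps)       # Y_t = Y_0 + eps_1 + ... + eps_t with Y_0 = 0
```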
A geometric random walk is a discrete time stochastic process defined by
where are i.i.d. and is independent of .
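A simulated path can again be produced directly from the definition; the sketch below uses i.i.d. Gaussian logarithmic increments with hypothetical drift and volatility values.

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, s0, n = 0.0005, 0.01, 100.0, 250      # hypothetical parameter values
increments = mu + sigma * rng.standard_normal(n) # i.i.d. logarithmic returns
prices = s0 * np.exp(np.cumsum(increments))      # geometric random walk path
```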
We define autocovariance and autocorrelation first for scalar time series and then for vector time series.
We say that a time series is covariance stationary if is a constant, not depending on , and depends only on but not on . A covariance stationary time series is also called second-order stationary.
If for all , then strict stationarity implies covariance stationarity. There exist time series that are strictly stationary but for which the covariance is not defined.2 Covariance stationarity does not imply strict stationarity. For a Gaussian time series, strict stationarity and covariance stationarity are equivalent. By a Gaussian time series, we mean a time series all of whose finite dimensional marginal distributions are Gaussian.
For a covariance stationary time series the autocovariance function is defined by
where . The covariance stationarity implies that depends only on and not on . The autocorrelation function is defined as
where .
The sample autocovariance with lag , based on the observations , is defined as
where .3 The sample autocorrelation with lag is defined as
Figure 5.5 shows sample autocorrelation functions for the daily S&P 500 index data, described in Section 2.4.1. Panel (a) shows the sample autocorrelation function for the return time series and panel (b) shows the sample autocorrelation function for the time series of the absolute returns . The lags are on the range .
If are i.i.d. with mean zero, then
as ; see Brockwell and Davis (1991). Thus, if are i.i.d. with mean zero, then about of the observed values should be inside the band
where is the -quantile for the standard normal distribution. Figure 5.5 has the red lines at the heights , where we have chosen , so that .
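A minimal sketch of the sample autocorrelation function and the band is given below; it assumes the approximately 95% band (z ≈ 1.96) referred to in connection with Figure 5.5, and the function name is illustrative.

```python
import numpy as np

def sample_acf(y, max_lag):
    """Sample autocorrelations for lags 1, ..., max_lag."""
    y = np.asarray(y, dtype=float)
    n, ybar = len(y), np.mean(y)
    denom = np.sum((y - ybar) ** 2)
    return np.array([np.sum((y[k:] - ybar) * (y[:n - k] - ybar)) / denom
                     for k in range(1, max_lag + 1)])

rng = np.random.default_rng(3)
y = rng.standard_normal(2000)                  # i.i.d. data under the null hypothesis
rho = sample_acf(y, 50)
band = 1.96 / np.sqrt(len(y))                  # approximately 95% band for i.i.d. data
print(np.mean(np.abs(rho) <= band))            # close to 0.95
```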
The Box–Ljung test can be used to test whether the autocorrelations are zero for a stationary time series . The null hypothesis is that for , where . Suppose we have observed the time series . The test statistic is
where is defined in (5.3). The test rejects the null hypothesis of zero autocorrelations if
where is the -quantile of the -distribution with degrees of freedom . We can compute the observed -values
for , where is the distribution function of the -distribution with degrees of freedom . Small observed -values indicate that the observations are not compatible with the null hypothesis.
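The Box–Ljung statistic and its observed p-value can be computed as in the following sketch; the chi-square distribution function is taken from SciPy, and the function name is illustrative.

```python
import numpy as np
from scipy import stats

def box_ljung(y, k):
    """Box-Ljung statistic over lags 1, ..., k and the observed p-value."""
    y = np.asarray(y, dtype=float)
    n, ybar = len(y), np.mean(y)
    denom = np.sum((y - ybar) ** 2)
    rho = np.array([np.sum((y[j:] - ybar) * (y[:n - j] - ybar)) / denom
                    for j in range(1, k + 1)])
    q = n * (n + 2) * np.sum(rho ** 2 / (n - np.arange(1, k + 1)))
    return q, 1.0 - stats.chi2.cdf(q, df=k)   # small p-value rejects zero autocorrelation

q_stat, p_value = box_ljung(np.random.default_rng(0).standard_normal(500), k=10)
```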
Let be a vector time series with two components. Vector time series is covariance stationary when the components and are covariance stationary and
for all . Thus, vector time series is covariance stationary when is a vector of constants, not depending on , and the covariance
depends only on but not on for .
For a covariance stationary time series the autocovariance function is defined by
For a scalar covariance stationary time series we have
However, the autocovariance function of a vector time series satisfies4
Univariate and multivariate descriptive statistics and graphical tools can be applied to get insight into a distribution of a time series. We can apply -variate descriptive statistics and graphical tools to the -dimensional marginal distributions of a time series. This is discussed in Section 5.2.1.
Univariate and multivariate density estimators and regression estimators can be applied to time series data. We can apply -variate estimators to the -dimensional marginal distributions of a time series. This is discussed in Section 5.2.2, by assuming that the time series is a Markov process of order .
Section 5.2.3 considers modeling time series with a combination of parametric and nonparametric methods. First a static parametric model is posed on the observations and then the time dynamics is introduced with time space or state space smoothing. The approach includes both local likelihood, covered in Section 5.2.3.1, and local least squares method, covered in Section 5.2.3.2. We apply local likelihood and local least squares to estimate time varying tail index in Section 5.2.3.3.
Univariate statistics, as defined in Section 3.1, can be used to describe time series data . Using univariate statistics, like sample mean and sample variance, is reasonable if are identically distributed.
Multivariate statistics, as defined in Section 4.1, can be used to describe vector time series data . Again, the use of multivariate statistics like sample correlation is reasonable if are identically distributed.
Multivariate statistics can be used also for univariate time series data if we create a vector time series from the initial univariate time series. We can create a two-dimensional vector time series by defining
for some . Now we can compute a sample correlation coefficient, for example, from data . This is reasonable if are identically distributed. The requirement that in (5.7) are identically distributed follows from strict stationarity of .
We have defined strict stationarity in Section 5.1.1. A strictly stationary time series can be defined by giving all finite dimensional marginal distributions. That is, to define the distribution of a strictly stationary time series we need to define the distributions
for all . If the time series is IID(0,), then we need only to define the distribution of . We say that the time series is a Markov process, if
To define a Markov process we need to define the distribution of and . More generally, we say that the time series is a Markov process of order , if
To define a Markov process of order we need to define the distributions of , , …, .
To estimate nonparametrically the distribution of a Markov process of order , we can estimate the distributions of , , …, nonparametrically.
Let be a time series. Let be a density function, where is a parameter, and . We could ignore the time series properties and assume that are independent and identically distributed with density .
However, we can assume that parameter changes in time. Then the observations are not identically distributed, but has density . In practice, we do not specify any dynamics for , but construct estimates using nonparametric smoothing.
Note that even if we assume independent and identically distributed observations, with time series data the parameter estimate changes in time, because at time the estimate is constructed using the data . This is called sequential estimation.
If are independent with density , then the density of is
The maximum likelihood estimator of is the value maximizing
over . We can find a time varying estimator using either time space or state space localization. The local likelihood approach has been studied in Spokoiny (2010). The localization is discussed more in Sections 6.1.1 and 6.1.2.
Let
where is the smoothing parameter and is a kernel function. For example, we can take . Let be the value maximizing
over .
For example, let us consider the model
where are i.i.d. Denote and , where is the density of the standard normal distribution. Let . Now
Then , where
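In this Gaussian example the time-localized maximum likelihood estimate of the mean reduces to a kernel-weighted average of the past observations. The sketch below uses a one-sided exponential kernel and a placeholder bandwidth as illustrative choices; the kernel used in the text may differ.

```python
import numpy as np

def local_mean(y, t, h):
    """Time-localized ML estimate of the mean at time t in the Gaussian model:
    a kernel-weighted average of Y_1, ..., Y_t (one-sided exponential kernel,
    bandwidth h; both are illustrative choices)."""
    y = np.asarray(y, dtype=float)[:t]
    lags = np.arange(t - 1, -1, -1)        # distance of each observation time from t
    w = np.exp(-lags / h)
    return np.sum(w * y) / np.sum(w)
```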
Let us observe the state variables in addition to observing time series . Let
where is the smoothing parameter and is a kernel function. We can take the density of the standard normal distribution. Let be the value maximizing (5.9) over .
For example, let us consider the model
where are i.i.d. The model can be written as
Denote and , where is the density of the standard normal distribution. Then , as defined in (5.10).
Let us consider a linear model with time changing parameters. Let us observe the explanatory variables in addition to observing time series . Consider the model
where , are time dependent constants, is the vector of explanatory variables, and is an error term.
We define the estimates of the time varying regression coefficients as the values and minimizing
where is the time space localized weight defined in (5.8). When we additionally observe the state variables , we can use the state space localized weight , defined in (5.11).
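A sketch of the time-localized least squares estimate is given below; the Gaussian kernel weights are an illustrative choice for the weights of (5.8), and the function signature is hypothetical.

```python
import numpy as np

def local_least_squares(y, x, t, h):
    """Kernel-weighted regression of Y_i on x_i, i = 1, ..., t, giving the
    time varying intercept and slope at time t."""
    y = np.asarray(y, dtype=float)[:t]
    x = np.asarray(x, dtype=float)[:t]
    times = np.arange(1, t + 1)
    w = np.exp(-0.5 * ((t - times) / h) ** 2)   # Gaussian kernel weights (illustrative)
    X = np.column_stack([np.ones(t), x])        # intercept column and explanatory variable
    W = np.diag(w)
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ y)   # (alpha_hat_t, beta_hat_t)
```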
We discussed tail modeling in Section 3.4. The idea in tail modeling is to fit a parametric model only to the data in the left tail or to the data in the right tail. We can add time space or state space localization to the tail modeling. As before, is a time series.
Let family , , model the excess distribution of a return distribution, where . This means that if is the density of the return distribution then we assume for the left tail that
for some , where , and is the th quantile of the return density: . For the right tail the corresponding assumption is
for some , where , and is the th quantile of the return density: .
The local maximum likelihood estimator for the parameter of the left tail is obtained from (3.63) as
where is the empirical quantile computed from , , and
The time space localized weights are modified from (5.8) as
where is the smoothing parameter and is a kernel function. If the state variables are available, then we can use the state space localized weights, modified from (5.11) as
where is the smoothing parameter and is a kernel function.
The local maximum likelihood estimator for the parameter of the right tail is obtained from (3.64) as
where is the empirical quantile, , and
The weights are obtained from (5.13) and (5.14) by replacing with .
For example, let us assume that the excess distribution is the Pareto distribution, as defined in (3.74) as
where is the shape parameter. The maximum likelihood estimator has the closed form expression (3.75). The local maximum likelihood estimators are
and
where , and . For the left tail we assume that , and for the right tail we assume that . These are the time varying Hill's estimators.
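A sketch of the time varying Hill estimate for the right tail is given below: a kernel-weighted version of the usual Hill estimator over the exceedances of the threshold. The one-sided exponential weights and the function name are illustrative choices; for the left tail the same computation is applied to the negated returns.

```python
import numpy as np

def local_hill_right(y, t, u, h):
    """Time-localized Hill estimate of the right tail index at time t,
    using the observations Y_1, ..., Y_t exceeding the threshold u > 0."""
    y = np.asarray(y, dtype=float)[:t]
    exceed = y > u
    if not np.any(exceed):
        return np.nan
    lags = np.arange(t - 1, -1, -1)                 # distance of observation times from t
    w = np.exp(-lags / h)[exceed]
    w = w / np.sum(w)
    inv_alpha = np.sum(w * np.log(y[exceed] / u))   # weighted mean of log(Y_i / u)
    return 1.0 / inv_alpha
```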
Figure 5.6 studies time varying Hill's estimates for the S&P 500 daily data, described in Section 2.4.1. Panel (a) shows the estimates for the left tail index and panel (b) shows the estimates for the right tail index. Sequentially calculated Hill's estimates are shown in black, time localized Hill's estimates with are shown in blue, and the case with is shown in yellow. The exponential kernel function is used. The estimation is started after there are 4 years of data. The tails are defined by the empirical quantile with and .
Let be the observed time series at time . The regression estimator for the parameter of the Pareto distribution is given in (3.77). Let
The local regression estimator of the parameter of the left tail is
where is the empirical quantile and we assume . The weights are obtained from (5.13) and (5.14) by replacing index with the index , so that the weights correspond to the ordering .
The local regression estimator of the parameter of the right tail is
where are the observations in reverse order,
is the empirical quantile, , and we assume .
Figure 5.7 studies time varying regression estimates for the tail index using the S&P 500 daily data, described in Section 2.4.1. Panel (a) shows the estimates for the left tail index and panel (b) shows the estimates for the right tail index. Sequentially calculated regression estimates are shown in black, time localized estimates with are shown in blue, and the case with is shown in yellow. The standard Gaussian kernel function is used. The estimation is started after there are 4 years of data. The tails are defined by the empirical quantile with and .
We discuss first ARMA (autoregressive moving average) processes and after that we discuss conditional heteroskedasticity models. Conditional heteroskedasticity models include ARCH (autoregressive conditional heteroskedasticity) and GARCH (generalized autoregressive conditional heteroskedasticity) models. The ARMA, ARCH, and GARCH processes are discrete time stochastic processes. We discuss also continuous time stochastic processes, because geometric Brownian motion and related continuous time stochastic processes are widely used in option pricing.
Brockwell and Davis (1991) give a detailed presentation of linear time series analysis; Fan and Yao (2005) give a short introduction to ARMA models and a more detailed discussion of nonlinear models. Shiryaev (1999) presents results of time series analysis that are useful in finance.
Our presentation of discrete time series analysis is directed towards giving prediction formulas: these prediction formulas are used in Chapter 7 to provide benchmarks for the evaluation of the methods of volatility prediction. Chapter 6 studies nonparametric prediction.
Let be a time series with . We take the conditional expectation
to be the best prediction of , given the observations , where is the prediction step. Using the conditional expectation as the best predictor can be justified by the fact that the conditional expectation minimizes the mean squared error. In fact, the function minimizing
is the conditional expectation: .5
Besides predicting the value , we consider also predicting the squared value .
In the following text, we give expressions for in the ARMA models and for in the ARCH and GARCH models. These expressions depend on the unknown parameters of the models. In order to apply the expressions we need to estimate the unknown parameters and insert the estimates into the expressions.
The conditional expectation whose condition is the infinite past is a function of the infinite past. Since we have available only a finite number of observations, we have to truncate these functions to obtain a function . It would be more useful to obtain formulas for
and
However, these formulas are more difficult to derive than the formulas where the condition of the conditional expectation is the infinite past.
ARMA processes are defined in terms of an innovation process. After defining innovation processes, we define MA (moving average) processes and AR (autoregressive) processes. ARMA processes are obtained by combining autoregressive and moving average processes.
Innovation processes are used to build more complex processes, like ARMA and GARCH processes. We define two innovation processes: a white noise process and an i.i.d. process.
We say that is a white noise process and write if
where is a constant. A white noise is a Gaussian white noise if .
We say that is an i.i.d. process and write if
An i.i.d. process is also a white noise process. A Gaussian white noise is also an i.i.d. process.
We define first a moving average process of a finite order, then we give prediction formulas, and finally define a moving average process of infinite order.
We use MA() as a shorthand notation for a moving average process of order . A moving average process of order is a process satisfying
where , is a white noise process, and .
Figure 5.8 illustrates the definition of MA() processes. In panel (a) and in panel (b) . When , then and have one common white noise term, but and do not have common white noise terms. When , then and have two common white noise terms, and have one common white noise term, and and do not have common white noise terms.
We have that
and
where . Thus, an MA() process is such that a correlation exists between and only if . Equations (5.19) and (5.20) show that an MA() process is covariance stationary.
If we are given a covariance function , which is such that for , we can construct an MA() process with this covariance function by solving and from the equations
The conditional expectation is the best prediction of for , given the infinite past , in the sense of the mean squared prediction error, as we mentioned in Section 5.3.1. We denote . The best linear prediction in the sense of the mean squared error is given in (6.19). We can use that formula when the covariance function of the MA() process is first estimated.
A recursive prediction formula for the MA() process can be derived as follows. We have that
because for . The noise terms are not observed, but we can write
This leads to a formula for in terms of the infinite past . For example, for the MA(1) process we have
The prediction formula for prediction step is a version of exponential moving average, which is defined in (6.7).
We can obtain a recursive prediction for practical use in the following way. Define , when and
when . Finally we define the -step prediction as
For example, for the MA(1) process we get the truncated formulas
In the implementation the parameters have to be replaced by their estimates.
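For the MA(1) case, the truncated recursion can be sketched as follows; mu and theta stand for the estimated mean and moving average coefficient, and the function name is illustrative.

```python
def ma1_one_step_forecast(y, theta, mu=0.0):
    """Truncated one-step prediction for an MA(1) process: reconstruct the
    innovations recursively with the initial value 0 and predict the next
    observation as mu + theta * (last reconstructed innovation).
    For prediction steps larger than one the prediction is simply mu."""
    eps_hat = 0.0
    for yt in y:
        eps_hat = (yt - mu) - theta * eps_hat   # truncated innovation recursion
    return mu + theta * eps_hat
```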
A moving average process of infinite order is defined as
The series converges in mean square if6
We have that
and
Equations (5.22) and (5.23) imply that an MA() process is covariance stationary. An MA() process can be used to study the properties of AR processes. For example, if we can write an AR process as an MA() process, this shows that the AR process is covariance stationary.
An autoregressive process of order is a process satisfying
where , is a white noise process, and . We assume that is uncorrelated with . We use AR() as a shorthand notation for an autoregressive process of order .
The autocovariance function of an AR() process can be computed recursively. Multiply (5.24) by from both sides and take expectations to get
where . The first values can be solved from the equations. After that, the values for can be computed recursively from (5.25).
Let us consider the prediction of for when the process is an AR() process. The best prediction of , given the observations , is denoted by
where we denote . We start with the one-step prediction. The best prediction of , given the observations , is
because . For the two-step prediction the best predictor is
The general prediction formula is
The best prediction is calculated recursively, using the value of in (5.26), and the fact that for .
For example, for the AR(1) process we have
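The recursive AR prediction can be sketched as below; future values are replaced by their own predictions as in (5.26), and mu and phi stand for the estimated mean and autoregressive coefficients.

```python
import numpy as np

def ar_forecast(y, phi, mu, steps):
    """Recursive h-step predictions for an AR(p) process: each forecast uses
    the last p values, with unknown future values replaced by earlier forecasts."""
    phi = np.asarray(phi, dtype=float)
    p = len(phi)
    history = list(np.asarray(y, dtype=float)[-p:])   # last p observations, oldest first
    forecasts = []
    for _ in range(steps):
        recent = np.array(history[::-1]) - mu         # most recent value first
        y_hat = mu + float(phi @ recent)
        forecasts.append(y_hat)
        history = (history + [y_hat])[-p:]
    return np.array(forecasts)
```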
We define an autoregressive moving average process , of order , , as a process satisfying
where and is a MA() process. We use ARMA() as a shorthand notation for an autoregressive moving average process of order ().
Let be an ARMA() process with
Denote
where , and is the set of complex numbers. If for all such that , then there exists the unique stationary solution
where the coefficients are obtained from the equation
where for some ; see Brockwell and Davis (1991, Theorem 3.1.3).
The condition for covariance stationarity does not guarantee that the ARMA() process is suitable for modeling. Let us consider the AR(1) model
where . The AR(1) model is covariance stationary if and only if . This can be seen in the following way. Let us consider first the case . We can write recursively
where . Since , we get the MA() representation7
which implies that is covariance stationary. Let us then consider the case . Since , we can write recursively
where . Since , we get the MA() representation8
which implies that is covariance stationary. The latter case is not suitable for modeling because is a function of future innovations with .
We define causality of the process to exclude examples like the AR(1) model with . An ARMA() process is called causal if there exist constants such that and
Let the polynomials and have no common zeroes. Then is causal if and only if for all such that .9 This has been proved in Brockwell and Davis (1991, Theorem 3.1.1). The coefficients are determined by
Thus, under the conditions that and have no common zeroes and for all such that , we have expressed an ARMA() process as an infinite order moving average process. Thus, an ARMA() process is covariance stationary under these conditions.
An ARMA() process is called invertible if there exist constants such that and
Let the polynomials and have no common zeroes. Then is invertible if and only if for all such that . This has been proved in Brockwell and Davis (1991, Theorem 3.1.2). The coefficients are determined by
The prediction formulas for ARMA processes given the infinite past can be found in Hamilton (1994, p. 77). For the ARMA(1,1) process we have
where ; see Shiryaev (1999, p. 151). Note that the prediction formula (5.21) of the MA(1) process and the prediction formula (5.27) of the AR(1) process follow from (5.28).
Time series satisfies the conditional heteroskedasticity assumption if
where is an IID() process and is the volatility process. The volatility process is a predictable random process, that is, is measurable with respect to the sigma-field generated by the variables . We also assume that is independent of . Then,
Thus, is the best prediction of in the mean squared error sense. Also, for ,
Thus, the best prediction of gives the best prediction of , in the mean squared error sense.
ARCH and GARCH processes are examples of conditional heteroskedasticity models.
Process is an ARCH() process (autoregressive conditional heteroskedasticity process of order ), if where is an IID() process and
where and . As a special case, the ARCH process is defined as
The ARCH model was introduced in Engle (1982) for modeling UK inflation rates. The ARCH() process is strictly stationary if ; see Fan and Yao (2005, Theorem 4.3) and Giraitis et al. (2000).
Let us consider the prediction of for when the process is an ARCH() process. The best prediction of , given the observations , is denoted by
We start with the one-step prediction. The best prediction of , given the observations , using the inference in (5.30), is
because . For the two-step prediction we use (5.30) to obtain the best predictor
The general prediction formula is
where we denote . The best prediction is calculated recursively, using the value of in (5.33), and the fact that for .
The best -step prediction in the ARCH(1) model is
where we assumed condition , which guarantees stationarity, and we denote .10
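In the ARCH(1) case the prediction therefore has a closed form: the predicted squared value reverts to the unconditional variance at a geometric rate. A sketch, with a0 and a1 denoting the ARCH(1) parameters:

```python
def arch1_squared_forecast(y_t_sq, a0, a1, h):
    """h-step prediction of the squared value in the ARCH(1) model,
    assuming a1 < 1 so that the unconditional variance a0 / (1 - a1) exists."""
    sigma2 = a0 / (1.0 - a1)                      # unconditional variance
    return sigma2 + a1 ** h * (y_t_sq - sigma2)   # geometric reversion to sigma2
```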
Process is a GARCH() process (generalized autoregressive conditional heteroskedasticity process of order and ), if
where is an IID() process and
where , , and . As a special case we get the GARCH model, where
The GARCH model was introduced in Bollerslev (1986). The GARCH() process is strictly stationary if
see Fan and Yao (2005, Theorem 4.4) and Bougerol and Picard (1992).
The best one-step prediction of the squared value is obtained from (5.30) as
In the GARCH() model the best -step prediction of the squared value, in the mean squared error sense, is
where we assumed condition , which guarantees strict stationarity, and we denote the unconditional variance by
Let us show (5.40) for . Let us denote . We have
and . Thus,
Thus, using (5.31),
We have shown (5.40), since , by (5.31). We can also write the best prediction of in the GARCH() model as
where .
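For the GARCH(1,1) model the prediction can be sketched as follows; a0, a1, and b1 denote the GARCH(1,1) parameters, and sigma2_t is the current conditional variance.

```python
def garch11_squared_forecast(y_t_sq, sigma2_t, a0, a1, b1, h):
    """h-step prediction of the squared value in the GARCH(1,1) model,
    assuming a1 + b1 < 1 so that the unconditional variance exists."""
    sigma2 = a0 / (1.0 - a1 - b1)                      # unconditional variance
    sigma2_next = a0 + a1 * y_t_sq + b1 * sigma2_t     # one-step conditional variance
    return sigma2 + (a1 + b1) ** (h - 1) * (sigma2_next - sigma2)
```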
The prediction formulas (5.40) and (5.42) are written in terms of . We have the following formula for in a strictly stationary GARCH() model:
where we assume to ensure strict stationarity. More generally, for the GARCH() model we have
where are obtained from the equation
for ; see Fan and Yao (2005, Theorem 4.4).
GARCH() can be considered a special case of the ARCH() model, since (5.43) can be written as
where and . We can obtain a more general ARCH () model by defining
where , , and is called a news impact curve. More generally, following Linton (2009), the news impact curve can be defined as the relationship between and holding past values constant at some level . In the GARCH() model the news impact curve is
The ARCH() model in (5.45) has been studied in Linton and Mammen (2005), where it was noted that the estimated news impact curve is asymmetric for S&P 500 return data. The asymmetric news impact curve can be addressed by asymmetric GARCH processes.
Time series of asset returns show a leverage effect. Markets become more active after a price drop: large negative returns are followed by a larger increase in volatility than in the case of large positive returns. In fact, past price changes and future volatilities are negatively correlated. This implies a negative skew to the distribution of the price changes.
The leverage effect is taken into account in the model
where is the skewness parameter. The model was applied in Heston and Nandi (2000) to price options.11 When , then under (5.46)
When , then negative values of lead to a larger increase in volatility than positive values of the same size. Now the unconditional variance is
We need the moment generating function in order to compute the option prices when the stock follows an asymmetric GARCH() process. We follow Heston and Nandi (2000). Let
where , , are i.i.d. , and
For example, when the logarithmic returns follow the asymmetric GARCH() process, then
so that . We want to find the moment generating function
where and is the conditional expectation at time .
We have that
Also,
because the moment generating function of is , and .
For we have
where and are defined by the recursive formulas
The cases and were proved in (5.51) and (5.52). Let . Let us make the induction assumption that the formulas hold at time . Now,
Insert values
to get
When , then
Equating terms in (5.54) and (5.55) gives the result.12
Figure 5.9 shows moment generating functions . In panel (a) the current stock price is , and in panel (b) . The one period moment generating function () is shown in black, the two period () in red, and the three period () in blue. The parameters , , , and are estimated from the daily S&P 500 data of Section 2.4.1, using model (5.46).
Note that under the usual GARCH() model
functions and are defined by the recursive formulas
This means that and depend on the unobserved sequence , unlike in the case of model (5.50).
We discuss first estimation of the ARCH processes, and then extend the discussion to the GARCH processes.
Estimation of the parameters of the ARCH() model can be done using the method of maximum likelihood, if we make an assumption about the distribution of the innovation . When we have observed , the likelihood function is
Let us ignore the term and define the conditional likelihood
Let us denote the density of by . Then the conditional density of , given , is
where
The parameters are estimated by maximizing the conditional likelihood, and we get
where the logarithm of the conditional likelihood is
If we assume that has the standard normal distribution then and
In the GARCH() model we can use, similarly to (5.57),
where . Unlike in the ARCH model, is a sum of infinitely many terms, and we need to truncate the infinite sum in order to be able to calculate the conditional likelihood. The value can be chosen as the sample variance using , and for can be computed using the recursive formula. Then is a function of and of the parameters.
We fit the GARCH() model to the S&P 500 index and to individual stocks of the S&P 500.
Figure 5.10 shows tail plots of the residuals , where is the estimated volatility in the GARCH() model. Panel (a) shows the left tail plot and panel (b) the right tail plot. The black points show the residuals, the red curves show the standard normal distribution function, and the blue curves show the Student distributions with degrees of freedom 3, 6, and 12. Figure 3.2 shows the corresponding plots for the S&P 500 returns. We see that the standard normal distribution fits the central area of the distribution of the residuals well, but the tails may be better fitted with a Student distribution.
We compute GARCH estimates for daily S&P 500 components data, described in Section 2.4.5. Estimates are computed both for the GARCH() model and for the Heston–Nandi modification of the GARCH() model, defined in (5.46).13 Both models have parameters , , and . The Heston–Nandi model has the additional skewness parameter .
Figure 5.11(a) shows a scatter plot of , where are estimates of in the GARCH() model and are estimates of in the Heston–Nandi model. The red points show the estimates for daily S&P 500 data, described in Section 2.4.1. Panel (b) shows a scatter plot of . We see that the estimates of are of the order .
Figure 5.12(a) shows a scatter plot of , where are estimates of in the GARCH() model, and are estimates of in the Heston–Nandi model. We leave out outliers with small estimates for . Panel (b) shows a histogram of estimates of in the Heston–Nandi model. The red points and the lines show the estimates for daily S&P 500 data, described in Section 2.4.1. We see that estimates of are close to 1, and they are more linearly related in the two models than the estimates of and . Also, we see that the estimates of the skewness parameter are positive for almost all S&P 500 components, with the median value about 2.5. This indicates that large negative returns increase volatility more than positive returns.
The geometric Brownian motion is used to model stock prices in the Black–Scholes model. We do not go into details about continuous time models, but it is useful to review some basic facts about them. In particular, the geometric Brownian motion appears as the limit of a discrete time binomial model.
Stochastic process , , is called the standard Brownian motion, or the standard Wiener process, if it has the following properties:
The Brownian motion leads to the process
where is drift and is volatility. We can use the notation of stochastic differential equations:
The diffusion Markov process is defined as
where , is a random variable, and
with probability one; see Shiryaev (1999, p. 237). A definition of the stochastic integrals with respect to the Brownian motion can be found in Shiryaev (1999, p. 252).14 The definition of the process can be written with the shorthand notation of the stochastic differential equations:
For example, a mean reverting model is defined as
Let be a diffusion process as in (5.60), and let , where is continuously differentiable with respect to the first argument and twice continuously differentiable with respect to the second argument. Furthermore, we assume that . Then is a diffusion Markov process with
where
and , , and are related by . The expression for follows from Itô's lemma; see Shiryaev (1999, p. 263).15
The geometric Brownian motion is the stochastic process
where is the standard Brownian motion, , and . The stochastic differential equation of the geometric Brownian motion is
The fact that the solution of the stochastic differential equation in (5.63) is given in (5.62) follows from Itô's formula. Indeed, we consider diffusion process , , , and . Then Itô's formula implies that is a diffusion process with and .
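A geometric Brownian motion path can be simulated exactly on a grid by sampling the Brownian increments; the sketch below uses the explicit solution of the stochastic differential equation (5.63), with hypothetical parameter values.

```python
import numpy as np

def simulate_gbm(s0, mu, sigma, T, n_steps, seed=0):
    """Exact simulation of geometric Brownian motion on a grid of n_steps points:
    S_t = S_0 * exp((mu - sigma**2 / 2) * t + sigma * B_t)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    dB = np.sqrt(dt) * rng.standard_normal(n_steps)   # Brownian increments
    t = np.linspace(dt, T, n_steps)
    B = np.cumsum(dB)
    return s0 * np.exp((mu - 0.5 * sigma ** 2) * t + sigma * B)

path = simulate_gbm(s0=100.0, mu=0.05, sigma=0.2, T=1.0, n_steps=250)
```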
Let be a filtered probability space and let be a Brownian motion. Let be a stochastic process with , for . We construct a process by setting
If , then . We can define a probability measure on by
where . Let be the restriction of to . Measure is equivalent to . Girsanov's theorem states that
defines a Brownian motion ; see Shiryaev (1999, p. 269). A proof can be found in Shiryaev (1999, Chapter VII, Section 3b).
The multivariate GARCH model is defined for a vector time series that has components. It is assumed that is strictly stationary and
where is the square root of a positive definite covariance matrix , is measurable with respect to the sigma-algebra generated by , and is a -dimensional i.i.d. process with and , where is the identity matrix.
The square root of can be defined by writing the eigenvalue decomposition , where is the diagonal matrix of the eigenvalues of and is the orthogonal matrix whose columns are the eigenvectors of . Then we define , where is the diagonal matrix obtained from by taking the square root of each element. We can also define as a Cholesky factor of .
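The symmetric square root via the eigenvalue decomposition can be computed as in the following sketch; the Cholesky factor mentioned above would be obtained with numpy.linalg.cholesky instead.

```python
import numpy as np

def matrix_sqrt(sigma):
    """Symmetric square root of a positive definite matrix from its
    eigenvalue decomposition sigma = O diag(lambda) O'."""
    eigvals, eigvecs = np.linalg.eigh(sigma)
    return eigvecs @ np.diag(np.sqrt(eigvals)) @ eigvecs.T

sigma = np.array([[1.0, 0.3], [0.3, 2.0]])
root = matrix_sqrt(sigma)
print(np.allclose(root @ root, sigma))   # True
```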
Multivariate GARCH (MGARCH) processes are reviewed in McNeil et al. (2005, Section 4.6), Bauwens et al. (2006), and Silvennoinen and Teräsvirta (2009). Below we write the models only for the case , so that . The multivariate GARCH models are denoted with MGARCH(). We restrict ourselves to the first-order models with . The multivariate GARCH models are based on (5.65) but differ in the definition of the recursive formula for .
First we define the VEC model and two restrictions of it: the diagonal VEC model and the Baba–Engle–Kraft–Kroner (BEKK) model. Then we define the constant correlation model and the dynamic conditional correlation model.
Let us denote , , and . The VEC model and the diagonal VEC model were introduced in Bollerslev et al. (1988). The VEC model assumes that
This model has 21 parameters . Since the model has a large number of parameters, it is useful to consider models with fewer parameters. The diagonal VEC model has only nine parameters and assumes that
Thus, in the diagonal VEC model the components of follow univariate GARCH models. The BEKK model was introduced in Engle and Kroner (1995). The model has 11 parameters and it can be written more easily in matrix notation as
where is a symmetric matrix and and are matrices. The BEKK model is obtained from the VEC model by restricting the parameters. We can express the parameters of the VEC model in terms of the parameters of the BEKK model as follows:
where we denote the elements of by and the elements of by .
The recursive formula for can be written by using the correlation matrix . Let be the diagonal matrix of the standard deviations of . The correlation matrix , corresponding to , is such that .
The constant correlation MGARCH model, introduced in Bollerslev (1990), is such that the components of follow univariate GARCH models, and the correlation matrix is constant. That is, and , where is the constant correlation matrix. The constant correlation GARCH model assumes the univariate GARCH models for the components, as in (5.66) and (5.67), and
The dynamic conditional correlation MGARCH model, introduced in Engle (2002), is such that the components of follow univariate GARCH models and
where Engle (2002) suggests estimating
where is the sample covariance with and . We do not typically have , and thus the conditional correlation is estimated from
where and .
The recursive equation (5.68) in the stationary diagonal VEC model implies that
This follows similarly to the case of the GARCH model (see (5.43) and (5.44)). The recursive equation (5.69) in the stationary dynamic conditional correlation GARCH model implies similarly that
where .
Given the observations , we estimate the parameters, similarly to GARCH() estimation in (5.58), by maximizing the conditional modified likelihood,
where , is the density of the standard normal bivariate distribution , and is the truncated covariance, with elements , , , where
and , are defined similarly.
Given the data , the MGARCH estimator for the conditional covariance is
where the parameter estimators , , and are calculated with the maximum likelihood method.
Models of financial time series should be able to capture the stylized facts. We describe the stylized facts mainly using the daily S&P 500 index data, described in Section 2.4.1. Stylized facts of financial time series are studied by Cont (2001) and Bouchaud (2002).
Figure 5.5(a) shows the sample autocorrelation function for the S&P 500 returns. Sample autocorrelations are small, although they are not completely inside the 95% confidence band.
When the time scale is shorter than tens of minutes, there can be considerable correlation; see Cont (2001) and Bouchaud (2002).
Figure 5.5(b) shows the sample autocorrelation function for the absolute S&P 500 returns. The sample autocorrelation goes inside the 95% confidence band after the lag of 500 days, but does not stay inside the band.
The decay of the autocorrelation of absolute returns follows roughly a power law with an exponent in the range ; see Cont (2001).
Since absolute returns are correlated, we can claim that the time series of returns does not consist of independent observations, although they are uncorrelated. The autocorrelation can also be seen in scatter plots. Figure 5.13 shows scatter plots of absolute returns. Panel (a) shows the scatter plot of points . Panel (b) shows the scatter plot of points .
There are localized outbursts of volatility. The bursts of high volatility last for some time, and then the volatility returns to more normal levels.
Figure 5.14 shows simulated GARCH returns and real S&P 500 returns. Panel (a) shows a time series of returns that are simulated from the GARCH model with parameters being equal to the estimates from S&P 500 daily data. The first return is simulated from the distribution . Panel (b) shows the time series of logarithmic S&P 500 returns. S&P 500 data is described in Section 2.4.1. Figure 3.29 shows the corresponding simulated i.i.d. Gaussian returns.
The decay of volatility correlation is slow. The volatility correlation can be defined as the autocorrelation of squared returns, and the autocorrelation of the squared returns shows similar behavior as the autocorrelation of the absolute returns. Volatility displays a positive autocorrelation over several days; see Cont (2001) and Bouchaud (2002).
Figure 5.15 shows the 10 largest and the 10 smallest returns of S&P 500. The largest returns are shown in blue and the smallest returns are shown in red. We can see that the biggest losses and the biggest gains occur at the same dates.
Markets become more active after a price drop; past price changes and future volatilities are negatively correlated. This implies a negative skew to the distribution of the price changes. The leverage effect has been taken into account in the VGARCH model in Engle and Ng (1993) and in the VGARCH related option pricing in Heston and Nandi (2000). We study asymmetric GARCH models in Section 5.3.3.
Figures 5.11 and 5.12 study parameter fitting in the basic GARCH() model and in an asymmetric GARCH(). Figure 5.12(b) shows that the skewness parameter tends to be positive for S&P 500 components.
Even after correcting the returns for volatility clustering, the residual time series still has heavy tails. The residuals may be calculated, for example, via GARCH-type models.
Figure 5.10 shows the tails of the residuals when GARCH() is fitted to S&P 500 daily data.
This means that the autocorrelation of the fourth power of the returns has slow decay; see Bouchaud (2002).
Volatility and the volume of the activity have long-ranged correlations; see Cont (2001) and Bouchaud (2002).
because . Thus, is minimized with respect to by choosing . Note that the conditional expectation defined as is a real-valued function of , but is a real-valued random variable, which can be defined as .
Thus,
Thus, the best -step prediction of in ARCH(1) model is given in (5.34), where and we used (5.31) and (5.36).
which is for equal to the GARCH() model. Engle and Ng (1993) have defined the VGARCH model
Menn and Rachev (2009) propose the GARMAX model that can also cope with the leverage effect.
Thus,
because
Finally,
the stochastic integral is defined as
where are random variables, , and we denote . The stochastic integral can be defined for “square integrable” random functions as the “limit” of integrals of simple functions , “approximating” function .
where and are the first and the second derivatives. Taylor expansion gives
where . If the changes have zero mean, . Thus, in the stochastic case the second-order term is not of a smaller order than the first-order term, whereas in the deterministic case the second-order term is of a smaller order than the first-order term. Itô's lemma holds for the class of Itô processes. An Itô process is defined as
Itô processes are more general than diffusion processes, because in diffusion processes dependence on is through ; see Shiryaev (1999, p. 257).