Chapter 10: Models with Unobserved Components

10.1 Formulation of the Basic Model

10.2 ARIMA Representation

10.3 Extensions of the Model

10.4 Estimation of Unobserved Components Models

10.5 State Space Models in SAS

10.1 Formulation of the Basic Model

This chapter briefly describes the theory of unobserved components models for time series in order to provide a background for the applications of the many facilities of PROC UCM discussed in the chapters that follow. Unobserved components models are a rich and flexible class of models that explicitly allow for the time-varying structures that often appear in observed data series; the underlying data-generating processes can in no way be assumed to be constant. See, for example, Harvey (1989). Note that these decompositions are formulated so that, at the end of the estimation, they provide a fully specified statistical model. This is in contrast to the ideas behind simple forecasting methods such as exponential smoothing and to the decompositions underlying seasonal adjustment, which are mainly introduced as informal justifications for the practical algorithms.

In its simplest form, a time series $y_t$ is decomposed into the sum of a level term and a remainder term by

$$y_t = \mu_t + \varepsilon_t, \qquad \varepsilon_t \sim N(0, \sigma_\varepsilon^2).$$

The level component $\mu_t$ is assumed to be generated by

$$\mu_t = \mu_{t-1} + \eta_t, \qquad \eta_t \sim N(0, \sigma_\eta^2),$$

beginning with an initial value $\mu_0$ at time index $t = 0$, just before the first observation. Both series of remainder terms, $\varepsilon_t$ and $\eta_t$, are assumed to be series of independent errors, so-called white noise series, and they are further assumed to be mutually independent. In this expression, the remainder terms $\varepsilon_t$, also called the residual component, express a stochastic component that is unexplained by the model. These errors are often interpreted as measurement errors that disturb the observation of $\mu_t$, which is considered the true state. A small value of the variance $\sigma_\varepsilon^2$ indicates that the postulated level series $\mu_t$ fits the observed series closely.
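To make the model concrete, the following DATA step simulates a short series from it. This is only an illustrative sketch, not from the text; the data set name and the variance values $\sigma_\eta^2 = 0.1$ and $\sigma_\varepsilon^2 = 1$ are arbitrary assumptions.

data level_sim;
   call streaminit(4321);                   /* fix the seed for reproducibility */
   mu = 10;                                 /* initial level mu_0 */
   do t = 1 to 100;
      mu = mu + sqrt(0.1)*rand('normal');   /* level equation: mu_t = mu_t-1 + eta_t */
      y  = mu + rand('normal');             /* observation equation: y_t = mu_t + eps_t */
      output;
   end;
run;

Rerunning the step with a larger value in place of 0.1 produces a visibly more volatile level series, in line with the role of the variances discussed next.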

The level process $\mu_t$ varies according to the variance $\sigma_\eta^2$: the level forms a smooth series for small values of this variance but is volatile for large values of $\sigma_\eta^2$. In this way, a volatile observed series can be modeled as a smooth level series with a small value of $\sigma_\eta^2$ and a large value of $\sigma_\varepsilon^2$, the variance of the error term. In the extreme case $\sigma_\eta^2 = 0$, the level is constant, and $\sigma_\varepsilon^2$ equals the variance of the observed series $y_t$. However, it is also possible to model such a series as a volatile level series that is observed precisely, with a small value for the variance of the error term.

This model can be seen as a variation of the idea underlying simple exponential smoothing, because the estimating algorithms are very closely related: the level component can be estimated by the exponentially smoothed series $\hat{\mu}_t = \alpha y_t + (1 - \alpha)\hat{\mu}_{t-1}$. (See Chapter 5.) The difference in the approaches is that in the theory of unobserved components, the level series is actually assumed to be a part of a statistical model specified for the series at hand, whereas in exponential smoothing, the smoothed component is simply a series of numbers calculated from the observed data series. This formulation is shown in Section 10.2 to be equivalent to an ARIMA(0,1,1) model, in the same way that exponential smoothing mimics ARIMA model building, as demonstrated in Chapter 7.

This basic model, which includes just one unobserved component, can be generalized to models that include many more components, expressing, for example, trend and seasonality. The irregular component need not be modeled as simple independent error terms as in the formulation above; it can also have time series properties such as autocorrelation. Furthermore, the model can be extended by regression terms, even with time-varying regression coefficients, in order to explicitly take advantage of known explanatory variables. These generalizations are presented in Section 10.3, and they appear in the various applications in succeeding chapters.

10.2 ARIMA Representation

In this short technical section, the basic model is demonstrated to be equivalent to an ARIMA(0,1,1) model for the time series $y_t$. Consider again the basic model:

$$y_t = \mu_t + \varepsilon_t, \qquad \varepsilon_t \sim N(0, \sigma_\varepsilon^2),$$

$$\mu_t = \mu_{t-1} + \eta_t, \qquad \eta_t \sim N(0, \sigma_\eta^2).$$

By combining these two equations, we see that

$$y_t - y_{t-1} = \mu_t + \varepsilon_t - (\mu_{t-1} + \varepsilon_{t-1}) = \eta_t + \varepsilon_t - \varepsilon_{t-1}.$$

If we now define $w_t = \eta_t + \varepsilon_t - \varepsilon_{t-1}$, we find that

$$\operatorname{var}(w_t) = 2\sigma_\varepsilon^2 + \sigma_\eta^2$$

and

$$\operatorname{cov}(w_t, w_{t-1}) = -\sigma_\varepsilon^2.$$

The first-order autocorrelation for the series of differences $y_t - y_{t-1}$ is then seen to equal $-\sigma_\varepsilon^2/(2\sigma_\varepsilon^2 + \sigma_\eta^2)$, and all higher-order autocorrelations are 0. This means that the series $w_t$ has the autocorrelation function of a moving average process

$$w_t = \zeta_t - \theta_1 \zeta_{t-1}.$$

Here $\zeta_t$ denotes a white noise series, but the variance of $\zeta_t$ and the value of the moving average parameter $\theta_1$ are more complicated expressions involving the variances $\sigma_\varepsilon^2$ and $\sigma_\eta^2$, as sketched below.
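The correspondence can be made explicit by matching moments. The following short derivation is standard (it is not given in the text) and uses the signal-to-noise ratio $q = \sigma_\eta^2/\sigma_\varepsilon^2$:

$$\operatorname{var}(w_t) = (1 + \theta_1^2)\sigma_\zeta^2 = 2\sigma_\varepsilon^2 + \sigma_\eta^2, \qquad \operatorname{cov}(w_t, w_{t-1}) = -\theta_1 \sigma_\zeta^2 = -\sigma_\varepsilon^2.$$

Dividing the second equation by the first gives

$$\frac{\theta_1}{1 + \theta_1^2} = \frac{1}{2 + q} \quad\Longrightarrow\quad \theta_1 = \frac{(q + 2) - \sqrt{q^2 + 4q}}{2},$$

where the root with $|\theta_1| \le 1$ is chosen so that the moving average is invertible. For $q = 0$, a constant level, $\theta_1 = 1$; for large $q$, a very volatile level, $\theta_1$ approaches 0 and the differenced series is nearly white noise.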

10.3 Extensions of the Model

The level series $\mu_t$ can be extended to include a trend by the following definition:

$$\mu_t = \mu_{t-1} + \beta_{t-1} + \eta_t, \qquad \eta_t \sim N(0, \sigma_\eta^2),$$

$$\beta_t = \beta_{t-1} + \xi_t, \qquad \xi_t \sim N(0, \sigma_\xi^2).$$

Positive values of $\beta_t$ correspond to an upward drift in the data series, but the actual trend is time dependent because the slope $\beta_t$ is allowed to vary by the formula above. The residual series $\eta_t$, $\varepsilon_t$, and $\xi_t$ are all assumed to be mutually independent white noise series, meaning that they each consist of identically distributed, independent stochastic terms. Their variances provide an idea of the stability of the components. For example, the value $\sigma_\xi^2 = 0$ gives a model with a constant trend, but larger values of $\sigma_\xi^2$ allow the trend to fluctuate.
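As a sketch of how such a trend model is specified in PROC UCM (the data set and variable names are invented for illustration; see Section 10.5 and the following chapters for real applications):

proc ucm data=sales;
   id date interval=month;    /* the date variable defines the monthly time axis */
   model y;                   /* the observed series y_t */
   irregular;                 /* the error term epsilon_t */
   level;                     /* the random walk level mu_t */
   slope;                     /* the time-varying slope beta_t */
   estimate;
run;

If the estimated variance $\sigma_\xi^2$ turns out to be insignificant, the slope can be fixed as constant with the statement slope variance=0 noest;.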

By a similar definition, a time-varying seasonal component can be introduced. For a monthly observed series, the level model is extended by a seasonal component $S_t$ as

$$y_t = \mu_t + S_t + \varepsilon_t,$$

$$S_t = -(S_{t-1} + S_{t-2} + \cdots + S_{t-11}) + \omega_t, \qquad \omega_t \sim N(0, \sigma_\omega^2).$$

The inclusion of a seasonal component is very similar to the decomposition underlying the seasonal adjustment procedures in Section 8.2. The difference is mainly a change in perspective: in this context, the unobserved components all have the form of specified stochastic processes, whereas in the seasonal adjustment algorithms, they are mainly postulated components. The definition above, by which the seasonal components sum to nearly 0 over a full year apart from the noise term, ensures that the seasonal component has no influence on the estimated level.
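In PROC UCM, this seasonal component is requested by a single additional statement in a model like the sketch above; for a monthly series, the statement could be (again only an illustrative sketch):

   season length=12 type=dummy;   /* stochastic seasonal S_t obeying the sum condition above */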

The model can also be extended by trigonometric cycles. In economic history, so-called business cycles are often seen in long economic time series as fairly regular oscillations. The basic form of a cyclic component is trigonometric:

$$C_t = a\cos(\lambda t) + b\sin(\lambda t) = \gamma\cos(\lambda t - \varphi).$$

This gives the regular form of the oscillation, with amplitude $\gamma = \sqrt{a^2 + b^2}$. The frequency $\lambda$ determines the speed of the oscillation. A more intuitive number is perhaps the wavelength, which is the length of the time interval between two successive peaks; this wavelength equals $2\pi/\lambda$. The parameter $\varphi$ in the second formulation of the function is a phase shift. The oscillation can also be damped by multiplying the amplitude by a damping factor in each time period, where a damping factor equal to 1 corresponds to a stable cycle. It is, of course, possible to extend such a cyclic component by stochastic error terms that allow the damping factor and the phase shift $\varphi$ to evolve over time. The frequency $\lambda$ is, however, assumed to be fixed because it typically represents a constant from the physical world, such as a yearly cycle. The specific formula for this stochastic version of the cycle is a bit mathematical and is omitted here.

The parameters of this formulation can also be estimated for a particular data series. In this way, phenomena such as sunspot activity can be modeled. Cycles depending on more than one frequency $\lambda$ are also allowed, so that several cyclical patterns can be included in the same model. The mathematical theory of Fourier series shows that this feature allows every periodic function to be parameterized by cyclic components. Time series with a long seasonal length can therefore often be fitted well using only a few cycles and only a few parameters. An example is weekly time series data, which include a seasonal structure of length approximately 52; this structure can be fitted by only a few trigonometric components instead of by 52 weekly dummy variables. See Section 13.4 for an example using observed data.
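In PROC UCM terms, cycles and trigonometric seasonal components are both available as statements. The following lines are an illustrative sketch only; the period value and the choice of three harmonics are arbitrary assumptions:

   cycle;                                    /* a stochastic cycle; frequency, damping, and variance are estimated */
   cycle period=60;                          /* a second cycle, starting the estimation from a guessed period */
   season length=52 type=trig keeph=1 2 3;   /* weekly pattern built from the first three harmonics */

With TYPE=TRIG and the KEEPH= option, only the listed harmonics are kept, so the 52-week pattern is described by a handful of parameters instead of 52 dummy variables.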

In addition to these unobserved components, the original series can be modeled by independent variables in the form of linear regression terms. In this way, an expected effect of external factors, such as the number of trading days each month, can be included as an explicit part of the model, and the variation of the residual series $\varepsilon_t$, $\xi_t$, and so on can be reduced even further. This follows the usual statistical principle of including all known effects in order to reduce the stochastic variation.

In the case of only one independent variable $x_t$, the stochastic variation can be reduced by simply adding a regression term to the model for $y_t$, defining

$$y_t = \mu_t + \gamma x_t + \varepsilon_t$$

and still including a level component and possibly also other components in the model.

As a further refinement, the regression coefficient $\gamma$ is allowed to vary like all other components:

$$y_t = \mu_t + \gamma_t x_t + \varepsilon_t,$$

$$\gamma_t = \gamma_{t-1} + \upsilon_t, \qquad \upsilon_t \sim N(0, \sigma_\upsilon^2).$$

This introduces yet another residual series, $\upsilon_t$, which is assumed to be independent, identically distributed, and also independent of all the previously introduced residual series in the model. Again, the value $\sigma_\upsilon^2 = 0$ of the variance corresponds to a situation with a constant regression coefficient, while positive values of $\sigma_\upsilon^2$ allow for time-varying regression coefficients.
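The two regression variants correspond to different PROC UCM statements. The following sketch uses an invented regressor x; only the relevant statements differ:

/* Constant coefficient gamma: the regressor enters the MODEL statement */
proc ucm data=sales;
   id date interval=month;
   model y = x;
   irregular;
   level;
   estimate;
run;

/* Time-varying coefficient gamma_t: the regressor enters a RANDOMREG statement */
proc ucm data=sales;
   id date interval=month;
   model y;
   irregular;
   level;
   randomreg x;
   estimate;
run;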

Yet another possible extension is to allow the independent variables to act non-linearly by introducing spline functions so that any smooth relationship between the dependent and the independent variables can be fitted.
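In PROC UCM, such a spline effect is requested with the SPLINEREG statement. A minimal sketch, leaving options such as knot placement at their defaults, replaces the RANDOMREG line above with:

   splinereg x;   /* smooth, possibly nonlinear effect of the regressor x */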

All models can be used for forecasting by assuming that the final state is also valid for future observations; the level, trend, seasonal components, and so on are considered static over the forecast horizon. For example, a model with a time-varying level and a time-varying slope is predicted $h$ time periods ahead by

$$\hat{y}_{T+h} = \hat{\mu}_T + h\hat{\beta}_T,$$

where $\hat{\mu}_T$ and $\hat{\beta}_T$ denote the estimated level and slope at the last observation $T$.
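In PROC UCM, forecasts of this kind are requested with the FORECAST statement. A sketch asking for 12 periods ahead (the horizon and confidence level are arbitrary choices):

   forecast lead=12 alpha=0.05;   /* point forecasts and 95% forecast intervals */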

Many of these models with unobserved components can be represented as Box-Jenkins ARIMA models, but this is not the case for all variations. The class of unobserved components models offers fewer modeling possibilities than traditional Box-Jenkins analysis, but unobserved components models are often more intuitive to identify and formulate than ARIMA models. Because they are easily estimated in SAS, they provide many users with a preferable framework for time series analysis. The purpose of the modeling is, of course, to gain insight into the structure of the observed time series: What is the actual level when corrected for trading days, external factors, and irrelevant irregular effects? What is the seasonal part of the series, and how do external factors affect it? This insight can be seen as a first, explorative step in a more formal parametric model-building process.

10.4 Estimation of Unobserved Components Models

All unobserved components models are variations of the so-called state space models often used in technical applications of statistics. They are defined by the idea that the unobserved level component is the true value, which is unfortunately disturbed by an added noise component, resulting in the observed value. Simple intuition then says that the various components are true values that can be observed only with some added measurement noise.

The idea of unobserved components is prominent in the technical sciences, for example, in signal theory. For data series in these disciplines, the many error terms are easily understood as measurement errors or noise; a cyclic component can be a radio signal that is observed only with various forms of noise added. For technical applications, it suffices to derive quick estimates of the components, the true states of nature, by removing these noise components. Proper statistical analyses are of no interest because the problem is not scientific but only practical. This is why estimation in this class of models tends to differ from estimation in other types of statistical models.

These types of models are special cases of state space models; see, for example, Durbin and Koopman (2001). The models are usually estimated by versions of the famous Kalman filter, an algorithm adapted by many authors in order to achieve numerically fast and effective estimation methods for models much more complicated than the univariate models introduced in Chapter 7. (See de Jong and Chu-Chun-Lin [2003].)

The Kalman filter is used as a recursion to calculate the conditional distribution of the component $\mu_t$ at time $t$ based on knowledge of the observed values $y_1, \ldots, y_t$. These values are known at the time of observation $t$, but future observations are not yet available, so they are not included in the algorithm. This is the basic situation for forecasting: only past values of the time series are used in the calculations. The Kalman filter is primarily a tool for predicting present and future values of the state based on knowledge of past values of the observed series. It is called a filter because the algorithm at every point in time takes the most recent observation as input and produces updated estimates of the states of all the unobserved components as output.

With the Kalman filter, the unobserved components and the corresponding variances are estimated as conditional expectations and conditional variances in multivariate Gaussian distributions by successive recursions. First, proper initial values are chosen, such as the first observation or the average of all observations, according to the nature of each particular component. In the recursion, the joint conditional distribution of the next observation $y_t$ and the component values (for example, the level $\mu_t$ at time $t$) is established, where the condition is on all information available up to time $t - 1$. From this distribution, the conditional expectation and variance of the component, now also conditioned on the newly observed value $y_t$ of the series, are derived as estimates of the unobserved component (for example, $\mu_t$) and its variance. In this way, the components for the whole span of observations are estimated by successive conditioning as the observations come in. The algorithm reflects the real situation because all available information is used for estimation at each point in time.
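For the basic level model of Section 10.1, the recursion takes a particularly simple scalar form. The following is a standard textbook formulation, stated here as a sketch and not spelled out in the text, writing $a_t$ for the estimate of $\mu_t$ given $y_1, \ldots, y_t$ and $P_t$ for its variance:

$$v_t = y_t - a_{t-1}, \qquad F_t = P_{t-1} + \sigma_\eta^2 + \sigma_\varepsilon^2,$$

$$a_t = a_{t-1} + \frac{P_{t-1} + \sigma_\eta^2}{F_t}\, v_t, \qquad P_t = \bigl(P_{t-1} + \sigma_\eta^2\bigr)\Bigl(1 - \frac{P_{t-1} + \sigma_\eta^2}{F_t}\Bigr).$$

Here $v_t$ is the one-step prediction error and $F_t$ is its variance; these two series are also the ingredients of the likelihood function discussed below.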

But the idea of Kalman filtering can be extended even further. In other situations, the focus might be to arrive at the best possible estimate of the model components. For example, the level component $\mu_t$ at time $t$ could be estimated at a later point in time when the "future" values $y_{t+1}, \ldots, y_T$ are also known, because these future values of the observed series also contain information about the component. In statistical terms, the focus then changes from the conditional distribution of $\mu_t$ given all previously observed values $y_1, \ldots, y_t$ up to time index $t$ to the conditional distribution of $\mu_t$ given the whole set of observations $y_1, \ldots, y_T$. In principle, this derivation is obtained by running the Kalman filter algorithm backward, successively updating the conditional expectations with all the extra information available. This extended estimation is called smoothing, as contrasted with the filtering of the original Kalman filter algorithm.
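In PROC UCM, both kinds of component estimates are available; the PLOT= option of the component statements requests plots of the filtered and smoothed estimates, for example (an illustrative sketch):

   level plot=(filter smooth);   /* plots of both the filtered and the smoothed level estimates */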

A likelihood function for the parameters, especially the variances $\sigma_\varepsilon^2$ and $\sigma_\eta^2$ but also regression coefficients and the frequencies of cyclic components, can be established. This likelihood function is formed as a product of the successive normal densities for the conditional distributions of $y_t$ given all previous observations $y_1, \ldots, y_{t-1}$, because the recursive procedure outputs the expectation and the variance of this conditional distribution. The likelihood function depends on parameters such as $\sigma_\varepsilon^2$, $\sigma_\eta^2$, and $\lambda$, and these parameters can then be estimated by numerical maximization.
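In terms of the one-step prediction errors $v_t$ and their variances $F_t$ produced by the filter, the log-likelihood takes the usual prediction error decomposition form (standard, stated here for completeness):

$$\log L = -\frac{1}{2}\sum_{t=1}^{T}\Bigl(\log 2\pi F_t + \frac{v_t^2}{F_t}\Bigr),$$

which is then maximized numerically over the unknown parameters such as $\sigma_\varepsilon^2$, $\sigma_\eta^2$, and $\lambda$.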

10.5 State Space Models in SAS

In SAS, state space models are easily applied. They are estimated by PROC UCM, which uses a very simple and intuitive syntax for specifying the components of the proposed models and produces a rich variety of graphics. The following chapters show examples of the application of PROC UCM to time series with different structures so that at least some of the many facilities of the procedure can be demonstrated.
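As a first taste of the syntax, the following sketch combines the components of Section 10.3 in one run; the data set, variables, and option values are all invented for illustration:

proc ucm data=sales plots=all;
   id date interval=month;        /* monthly time axis */
   model y = x;                   /* observed series with one regressor */
   irregular;                     /* epsilon_t */
   level;                         /* mu_t */
   slope;                         /* beta_t */
   season length=12 type=trig;    /* seasonal component */
   estimate;                      /* maximum likelihood estimation */
   forecast lead=12;              /* 12 months of forecasts */
run;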

SAS also includes another procedure, PROC STATESPACE, which handles general state space models. PROC STATESPACE is more general, and its syntax is closer to the original mathematical formulation of the models in Section 10.1. Because it requires a careful understanding of the matrices and so on in the model formulation, this procedure is not as easy to apply as PROC UCM. The advantage of PROC STATESPACE as a more general procedure than the more dedicated PROC UCM is, of course, that it allows for more general model building.
