CHAPTER 11

A Multiple-Equations Approach to Model-Based Forecasting

This chapter discusses the multiple-equations forecasting approach, which involves more than one dependent variable along with several right-hand-side variables. Within the multiple-equations forecasting framework, we discuss vector autoregressions (VARs) and Bayesian vector autoregressions (BVARs). The VAR/BVAR approaches can be utilized for short-term as well as long-term forecasting. Short-term forecasting is the focus of this chapter; we look at long-term forecasting in Chapter 12.

A specific forecasting approach in short-term forecasting is known as real-time forecasting of macroeconomic and financial variables. A real-time forecast implies forecasting before the actual release of a variable.1 Silvia and Iqbal (2012) developed an accurate real-time, short-term, one-month-ahead macroeconomic forecasting framework that accounts for the real-time challenge of data availability and provides a forecast that is, on average, more accurate than the Bloomberg real-time consensus forecast.2 They compared their real-time forecasts with the Bloomberg real-time consensus for 20 major macroeconomic variables, including nonfarm payrolls, the unemployment rate, the Institute for Supply Management (ISM) manufacturing index, the consumer price index (CPI), industrial production, and housing starts. In this chapter, we follow the approach developed by Silvia and Iqbal (2012).

This chapter sheds light on four important areas of real-time macroeconomic short-term forecasting.

  1. Why do we care about short-term forecasting?
  2. Why is an individual forecast approach that is better than consensus valuable?
  3. Why is the timing of the release of the target (dependent) variable and predictor (independent) variable important to forecasting methods?
  4. Why are traditional forecast evaluation methods not enough?

First, new macroeconomic data, especially when the values differ from what the consensus anticipated, alter asset prices in the equity, bond, and foreign exchange (FX) markets. Therefore, accurate forecasts of key macroeconomic variables, prior to their release announcements, provide a firm and its clients with more opportunities to profit, or at least to minimize losses (see the next section for details). The second question, about individual versus consensus forecasts, follows from a vast literature studying macroeconomic data releases and the financial market response to those announcements.3 Yet most of this literature focuses only on the difference between the actual release and the consensus forecast. It does not focus on what determines individual forecast accuracy and how that accuracy may change over time relative to the consensus. As Rigobon and Sack (2008) and Bartolini et al. (2008) concluded, the effect of macroeconomic news announcements on the financial markets (e.g., equity and bond markets) is most significant when the actual release is different from market expectations.

Moreover, Bartolini et al. (2008) suggested that a stronger-than-expected news announcement may increase interest rates, strengthen the dollar, and raise equity prices. Because the release of macroeconomic data affects the market response, the importance of an individual forecast approach that is better than consensus is increased. The first part of this chapter establishes the importance of short-term macroeconomic forecasting and of an individual forecast approach using the BVAR. In other words, why do we care about short-term forecasting and an individual forecast approach? Why should we seek value in forecasts outside the consensus forecast?

The next part of the chapter discusses how to produce accurate as well as better-than-consensus forecasts of key macroeconomic variables. In addition, we compare Silvia and Iqbal's (2012) real-time forecasts with the Bloomberg real-time consensus. We also address a few issues very important to short-term macroeconomic forecasting. For example, the release timing of the target (dependent) variable as well as the predictors is very important and often overlooked in conventional forecasting evaluations. Because data is released at different points in time and at different frequencies, release timing is an essential (yet the most neglected) issue in short-term forecasting and must be considered in the model specification.

For instance, nonfarm payrolls and unemployment data have a one-month time lag (e.g., on September 3, 2010, employment data was released for August 2010). In contrast, data on construction spending has a two-month lag. Additionally, employment data is released at the beginning of the month (most often on the first Friday of every month), before many other macroeconomic data points. Due to the lagged nature of the data and the early release of employment data relative to other variables, only a limited number of predictors are available to model and predict nonfarm payrolls. Personal income and spending data is released at the end of the month (typically during the last week of the month). By the end of the month, we have likely already received updated data on a number of macroeconomic variables that can be used as predictors to model personal income and spending. Thus, forecasting personal income and spending may be considered easier and safer than predicting employment. In conclusion, the release timing of a target variable and its predictors is very important, and we discuss the topic in greater detail later in the chapter.

The final aspect of short-term forecasting is forecast evaluation. The commonly used forecast evaluation criteria are R² and the root mean square error (RMSE), both in sample and out of sample. RMSE is the most frequently used method, but we believe that, in the financial sector, directional accuracy must be considered along with RMSE, as market positions often are taken on a directional basis rather than on a specific numerical bet.

THE IMPORTANCE OF REAL-TIME SHORT-TERM FORECASTING

More accurate forecasts of macroeconomic variables prior to their release announcements provide a firm and its clients with more opportunities to profit, or at least to minimize losses. This section discusses the importance of short-term macroeconomic forecasting in precisely those opportunities. Why short-term forecasting? Because scheduled macroeconomic variable announcements, especially when different from the consensus, affect asset price movements and volatility in the equity, bond, and FX markets. Consequently, accurate forecasts of key macroeconomic variables prior to their release announcements will enhance profit opportunities for a firm and its clients. There is a vast literature studying macroeconomic data releases and the response of financial markets to those announcements.

The empirical evidence on the relationship between financial market volatility and macroeconomic news announcements dates back to the 1980s. For instance, Schwert (1981), Pearce and Roley (1985), and Jain (1988) provided evidence of a financial market response to macroeconomic news announcements.4 More recently, Andersen et al. (2007) and Bartolini et al. (2008) concluded that macroeconomic release announcements generate a strong financial market response.5 Furthermore, Huang (2007) suggested that the release of the U.S. employment situation report (nonfarm payrolls and the unemployment rate) is the most influential news for financial markets.6 Faust et al. (2007) found that release announcements of inflation data move exchange rates and interest rates.7 Gilbert et al. (2007) suggested that release announcements of the U.S. Leading Economic Index (LEI) affect aggregate stock returns, return volatility, and trading volume.8

In addition, studies have found that data announcements for U.S. macroeconomic variables affect not only U.S. financial markets but foreign financial markets as well. Andersen et al. (2007) characterized the response of the U.S., German, and British equity, bond, and FX markets to real-time U.S. macroeconomic news announcements, while Nikkinen and Sahlström (2001) suggested that Finnish and U.S. financial markets respond to the announcements of U.S. employment, CPI, and producer price index (PPI) data.9

In summary, macroeconomic news announcements do move equity, bond, and FX markets. Therefore, better forecasting can create strategic opportunities for firms and their clients.

THE INDIVIDUAL FORECAST VERSUS CONSENSUS FORECAST: IS THERE AN ADVANTAGE?

As demonstrated in the previous section, academic literature suggests a connection between macroeconomic data announcements and financial market volatility. However, over the past decade and a half, the effect of macroeconomic news on asset prices may have changed. During the past decade, market consensus estimates have been better publicized, with more estimates being publicly available for every major macroeconomic variable. When the actual release is markedly different from the market consensus, financial markets respond more significantly to the economic news (see Bartolini et al. [2008] for more details). The most widely used market consensus is provided by Bloomberg L.P.10 The Bloomberg consensus is based on the median response from financial market participants.11 Therefore, it is a reliable measure of market expectations for the upcoming release of a macroeconomic variable.

Rigobon and Sack (2008) and Bartolini et al. (2008) suggested that key macroeconomic variable release announcements do affect asset prices; however, the effect is more significant when the actual news is different from market expectations. Both studies used the median response of the survey taken by Bloomberg12 as a proxy for market expectations prior to the release announcements and found that if the actual release is different (stronger or weaker than expected) from the market consensus, the market response to the news is more significant. In addition, Bartolini et al. (2008) suggested that a stronger-than-expected news announcement may increase interest rates, strengthen the dollar, and raise equity prices. Boyd et al. (2005) showed that stock prices respond differently to changes in the unemployment rate during recessions and expansions because the dividend and discount rate effects have different weights at different points of the business cycle.13 Therefore, based on empirical evidence, we recognize the effect of macroeconomic variable release announcements.14

There are two implications for real-time short-term macroeconomic forecasting.

  1. Reliable forecasts of macroeconomic variables prior to their release announcements provide more opportunities to a firm to generate profits (or reduce losses for itself and its clients).
  2. The importance of an individual forecast approach that is better than the market consensus is increased, given that the markets move significantly when the actual release is different from the market consensus.

Key macroeconomic variable release announcements do affect financial markets; however, the effect is much more significant when the actual release is different from the market consensus. Thus, the importance of an individual forecaster and his or her better-than-consensus findings is increased. An individual forecast approach can provide more opportunities for a firm and its clients to make money (or reduce losses). Moreover, because using the market consensus as a forecast will not add value to a firm, individual analysts have an incentive to develop their own, more accurate forecasts.

THE ECONOMETRICS OF REAL-TIME SHORT-TERM FORECASTING: THE BVAR APPROACH

This section provides the econometric methodology and forecast evaluation for real-time short-term forecasting. As mentioned earlier, Silvia and Iqbal (2012) employed the BVAR approach, an extension of the VAR model, to generate real-time forecasts. After the seminal work of Sims (1980), VAR became a major tool for macroeconomic forecasting.15 Despite its success, however, there is a technical problem with VAR: The approach can utilize only a small subset of available information due to the degree-of-freedom problem, also known as the curse of dimensionality. To address this problem, Litterman (1980, 1986) presented the BVAR approach.16 The BVAR approach is more flexible because it allows more variables as inputs than the VAR approach, permitting the inclusion of more information about the relationship than the traditional VAR model. Litterman (1986) showed that his approach is as accurate, on average, as those used by the best-known commercial forecasting services (DRI, Chase, and Wharton Econometrics at that time). Theoretically, recent literature shows significant development in BVAR modeling (see Sims and Zha [1998] for more details).17 Empirically, however, improvement on Litterman's original methodology does not seem particularly significant (for more details, see Robertson and Tallman [1999]).18

The performance of Litterman's method is at least partially determined by the choice of several parameters—popularly referred to as the Minnesota prior. Litterman was able to implement only a small number of the possible parameter combinations due to limited and expensive computer power at the time of his research. With the programming flexibility and speed available with today's advanced econometric software, an analyst can run Litterman's regression using many parameter combinations.

The Bayesian Vector Autoregression Model

As the BVAR model is an extension of the VAR model, we start our discussion with the VAR approach. In addition, we highlight issues related to the Sims VAR approach and the benefits of Litterman's BVAR model. Let Yt = (Y1t, Y2t, Y3t, ..., Ynt) be an n-dimensional vector of time series. The VAR(p) representation of the time series can be presented as shown in equation 11.1:

Yt = α + β1Yt−1 + β2Yt−2 + ... + βpYt−p + εt    (11.1)

where

α = (α1, α2, ..., αn) is an n-dimensional vector of constants

β1, β2, ..., βp are n × n autoregressive matrices

εt is an n-dimensional white noise process with covariance matrix E(εtεt′) = Ψ

The traditional VAR model has some limitations. The first issue is known as overparameterization; that is, an analyst has to estimate too many parameters, and some of them may be statistically insignificant. In general, a VAR(p) with n variables and a constant in each equation contains n(np + 1) coefficients; for example, a VAR model with five variables, four lags, and a constant in each equation will contain a total of 105 ((1 + 5 × 4) × 5 = 105) coefficients. The second problem is that overparameterization causes multicollinearity as well as a reduction in the degrees of freedom, which may produce a very good in-sample fit but potentially large out-of-sample forecast errors. This is sometimes referred to as overfitting the model.
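
To make the dimensionality problem concrete, the following minimal sketch estimates an unrestricted five-variable VAR(4) with PROC VARMAX; the dataset name (macro5), the series names (y1–y5), and the presence of a monthly date variable are assumptions for illustration, not part of Silvia and Iqbal's application.

/* Unrestricted VAR(4) in five variables: (1 + 5 x 4) x 5 = 105 coefficients.
   The dataset and variable names are illustrative. */
proc varmax data=macro5;
   model y1 y2 y3 y4 y5 / p=4;
   id date interval=month;
run;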

Litterman (1980, 1986) described an approach to overcome these problems: the Bayesian VAR, which uses a prior, popularly referred to as the Minnesota prior, to address the overparameterization issue (see Litterman [1986] for more details). Litterman's prior is based on three assumptions.

  1. Each equation is centered on a random walk with drift. This essentially shrinks the diagonal elements of β1 toward one and the other coefficients (β2, β3, ..., βp) toward zero.
  2. More recent lags provide more useful information (have greater predictive power) than more distant ones.
  3. A variable's own lags explain more than the lags of the other variables in the model.

The Litterman prior is imposed by specifying the moments (mean and variance) of the prior distribution of the coefficients, as shown in equation 11.2, where (βk)ij denotes the coefficient on lag k of variable j in equation i.

E[(βk)ij] = δi if i = j and k = 1, and 0 otherwise

Var[(βk)ij] = λ²/k² if i = j, and (λ²/k²) ϑ (σ²i/σ²j) if i ≠ j    (11.2)

The coefficients β1, β2, ..., βp are assumed to be independent and normally distributed. The covariance matrix of the residuals is assumed to be diagonal, fixed, and known, that is, Ψ = Σ, where Σ = diag(σ²1, ..., σ²n), and the prior on the intercept is diffuse. The random walk prior (δi = 1 for all i) has an intuitive implication: all variables are treated as highly persistent. However, the researcher may believe that some of the variables in the model are mean reverting or at least not characterized by a random walk. This does not pose a problem for this framework, because a white noise prior can be set for some or all of the variables in the VAR model by imposing δi = 0 where appropriate. The hyperparameter λ controls the overall tightness of the prior distribution around δi. This hyperparameter governs the importance of prior beliefs relative to the information contained in the data; λ = 0 imposes the prior exactly so that the data does not inform the parameter estimates, and λ = ∞ removes the influence of the prior altogether so that the parameter estimates are equivalent to ordinary least squares (OLS) estimates. The factor 1/k² is the rate at which the prior variance decreases with the lag length of the VAR, and σ²i/σ²j accounts for the different scale and variability of the data. The coefficient ϑ ∈ (0, 1) governs the extent to which the lags of other variables are less important than a variable's own lags.19
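
As a simple illustration of how this prior tightens with lag length, consider the hyperparameter values used later in this chapter for the payroll model (λ = 0.4, ϑ = 0.3). Under the parameterization in equation 11.2, the prior standard deviation on a variable's own first lag is 0.4, on its own second lag 0.4/2 = 0.2, and on its own sixth lag roughly 0.067, so distant own lags are pulled strongly toward zero; coefficients on other variables' lags are shrunk further still through the factor ϑ, scaled by the relative residual variances.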

Litterman's method is a good solution to many of the problems associated with the traditional VAR model. Another issue, however, is the presence of a unit root in any series of the model. What happens to the BVAR's estimates and forecasts in a nonstationary framework, possibly with cointegration relationships among the components of the BVAR model? There are two popular answers to this question. One group of economists, notably Lütkepohl (1991) and Phillips (1991), suggested that when the BVAR analysis unfolds in the context of a nonstationary process and there is potential for cointegration relationships, the estimates will be biased.20

In contrast, another group of economists favors using the BVAR model in the level form of the series. For example, Sims, Stock, and Watson (1990) showed that even if existing cointegration restrictions are not taken into account and the model is estimated in levels, the estimation is consistent.21 Sims (1991) argued that these critiques were poorly grounded: owing to the superconvergence property of the estimators in the presence of a cointegration relationship, these relationships tend to manifest themselves clearly, irrespective of the type of prior information used.22 Alvarez and Ballabriga (1994) furnished evidence on this matter and performed a Monte Carlo simulation with a cointegrated process that allows consideration of the power of different estimation methods for capturing the long-run relationship.23 The results obtained support Sims's proposition, rather than those of the critics, provided that the prior distribution has been selected in keeping with a goodness-of-fit criterion.

In addition to the Alvarez-Ballabriga evidence in support of Sims's views, the nonstationarity issue also depends on what the target variable is, especially in forecasting. For instance, in the marketplace, and particularly in the financial sector, investors pay more attention either to the percentage change (e.g., the month-to-month and/or year-to-year percentage change in CPI or retail sales) or to the net change (e.g., the net monthly change in nonfarm payrolls) than to the level form of many variables, since the level form is not the headline number that is reported.24 Since major macroeconomic variables, such as employment and the CPI, are reported either in percentage form (growth rate) or as a net change, the nonstationarity issue may not be a problem in many cases. As a result, nonstationarity and/or potential cointegration are less likely to affect the forecasting framework, for two reasons: (a) the Sims and Alvarez-Ballabriga findings, and (b) the fact that most forecasts target the growth rate or net change of a variable instead of its level form.

Forecast Evaluation: Real-Time Measures

This chapter presents an accurate real-time short-term (one-month-forward) macroeconomic forecasting framework. Furthermore, we are not just looking for accuracy per se but also for a real-time forecasting approach that is more accurate, on average, than the Bloomberg real-time consensus forecast for major macroeconomic variables. The BVAR's real-time forecasts are compared with the Bloomberg real-time consensus. For comparison purposes, we use two criteria: the real-time out-of-sample RMSE and the real-time average directional accuracy, each computed for the BVAR approach's real-time forecasts and for the Bloomberg real-time consensus forecasts for each of the 20 macroeconomic variables. Equation 11.3 is employed to calculate the real-time out-of-sample RMSE:

RMSE = √[(1/N) Σ (Ŷt+1 − Yt+1)²]    (11.3)

where

Ŷt+1 = real-time one-month-ahead forecast, prior to the variable release

Yt+1 = actual release announcement of a macroeconomic variable

N = number of real-time out-of-sample forecasts

The magnitude of this statistic is used to compare the real-time out-of-sample performance of the BVAR approach and the consensus for all 20 variables. A model with a smaller RMSE is the better model among its competitors for a particular variable. For example, if the BVAR's CPI model has a smaller real-time out-of-sample RMSE than the Bloomberg consensus, the BVAR approach is the better model.
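
As a minimal sketch of equation 11.3, the following SAS steps compute the real-time out-of-sample RMSE from a dataset of paired forecasts and actual releases; the dataset name (realtime_eval) and variable names (forecast, actual) are placeholders, not Silvia and Iqbal's.

/* Equation 11.3: real-time out-of-sample RMSE. The dataset realtime_eval is
   assumed to hold one row per release, with the real-time forecast (forecast)
   and the actual release (actual). */
data sq_errors;
   set realtime_eval;
   sq_error = (forecast - actual)**2;   /* squared forecast error */
run;

proc means data=sq_errors noprint;
   var sq_error;
   output out=rmse_out mean=mse;        /* mean squared error */
run;

data rmse_out;
   set rmse_out;
   rmse = sqrt(mse);                    /* root mean square error */
run;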

The real-time out-of-sample RMSE is a very good measure of forecast evaluation. However, in practice, and in the financial sector specifically, the direction of the release announcement is also very important, since most hedged positions are based on a directional change rather than the magnitude of the change. As Bartolini et al. (2008) suggested, stronger-than-expected news announcements may increase interest rates, strengthen the dollar, and raise equity prices. Moreover, Boyd et al. (2005) showed that stock prices respond differently to changes in the unemployment rate during recessions and expansions because the dividend and discount rate effects have different weights at different points of the business cycle. Therefore, it is very important to predict the direction of the variable in real time to make profits and/or minimize losses.

For instance, suppose the average forecast error, the RMSE, for the net change in the nonfarm payrolls model is 75K (in forecasting, that may be a very reasonable number, given the volatility of the series), and the real-time one-month-forward forecast is +25K (a net gain of 25,000 jobs). If the actual release comes in at −45K (a net loss of 45,000 jobs), then the actual payrolls number stayed within the forecast +/− RMSE range (25 +/− 75, or +100K to −50K). One would think the model is good; in fact, the model is still useful in terms of the RMSE. However, it may not produce positive opportunities for the firm. Let us assume that the market consensus was a positive number, say +35K (a net gain of 35,000 jobs), and the actual number comes in as a negative number, −45K. The response of the financial sector to the announcement would be significant, as the news suggests weaker-than-expected job growth. With payrolls data coming in weaker than expected, equity prices may plunge, since a negative payrolls number is generally associated with a weak labor market and thus a downshift in economic growth and corporate profit expectations.

A forecasting approach based solely on the RMSE may not provide a firm with the most opportunity to generate financial gains. The first requirement of a more accurate forecasting approach is that the forecast value be close to the actual value, that is, have the minimum average forecast error. Second, the direction of the forecast should also be accurate, on average. Since many macroeconomic variables are reported either as a percentage change or as a net change, the sign (positive or negative) of the release announcement is also important. In the example (forecast +25K versus actual −45K), the actual data stayed within the forecast +/− RMSE range, but the direction of the forecast was not correct: a net gain of 25K jobs was forecast, but an actual net loss of 45K jobs occurred. A better model evaluation must consider directional accuracy along with the minimum average forecast error. Equation 11.4 is used to calculate the directional accuracy:

The direction of the forecast is correct if Ŷt+1 × Yt+1 > 0    (11.4)

where

Ŷt+1 = a forecast

Yt+1 = actual release of a time series

It is important to note that a few variables are reported in level terms or as an index, such as the ISM Manufacturing and Non-Manufacturing surveys, the unemployment rate, and consumer confidence. We use equation 11.5 to calculate the directional accuracy of the forecast for those variables:

The direction of the forecast is correct if (Yt+1 − Yt) × (Ŷt+1 − Yt) > 0    (11.5)

where

Yt = prior-month value of the time series

If the difference between the actual release and the prior-month value (Yt+1 − Yt) has the same sign as the difference between the forecast and the prior-month value (Ŷt+1 − Yt), then the direction of the forecast is correct.

For average directional accuracy, this equation is used:

Average directional accuracy = X/N

where

X = number of forecasts that have the right direction

N = total number of forecasts

We convert the ratio into a percentage by multiplying by 100. If the forecast shares an identical sign with the actual release, then the direction is correct. For instance, continuing the previous example, if the forecast was −25K and the actual release came in at −45K, then the model was accurate in terms of both RMSE and direction.
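
A minimal SAS sketch of the directional accuracy calculations in equations 11.4 and 11.5 follows. The dataset (realtime_eval) and variable names (forecast, actual, prior_value, is_level) are placeholders, with is_level flagging variables reported in level or index form.

/* Directional accuracy: equation 11.4 for change-type variables and
   equation 11.5 for level/index variables. Names are illustrative. */
data direction;
   set realtime_eval;
   if is_level = 0 then
      correct = (forecast * actual > 0);                            /* same sign */
   else
      correct = ((actual - prior_value) * (forecast - prior_value) > 0);
run;

proc means data=direction noprint;
   var correct;
   output out=dir_out mean=share_correct;   /* X/N */
run;

data dir_out;
   set dir_out;
   pct_correct = 100 * share_correct;       /* average directional accuracy */
run;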

In conclusion, we suggest that a better forecasting approach must have a smaller real-time out-of-sample RMSE as well as a higher directional accuracy, on average, than the consensus forecast.

A SAS Application of the BVAR Approach: A Case Study of the Employment Forecast

In this section we discuss the SAS code for forecasting a macroeconomic or financial variable using the BVAR approach. We have selected the nonfarm payrolls data series as a case study. As mentioned previously, we follow Silvia and Iqbal's (2012) proposed forecasting model for the nonfarm payrolls series. Their suggested model performed better on in-sample and out-of-sample forecasting measures than its competitors and also performed better than the Bloomberg consensus in a real-time forecasting comparison (see the next section of this chapter for more details). SAS code S11.1 is utilized to generate a one-month-ahead forecast for nonfarm payrolls.25 The procedure PROC VARMAX offers the BVAR option. In the second line of the code, after the SAS keyword MODEL, we list the variables of interest. Employment is our target variable (the month-to-month change in nonfarm payrolls), and the other variables of the model are: help_wanted (help wanted index), Lay_off (number of job losers), ISM_M (ISM manufacturing employment index), ISM_NM (ISM nonmanufacturing employment index), and claims (number of unemployment insurance claims filed).

/* SAS code S11.1: one-month-ahead BVAR forecast of nonfarm payrolls.
   The input dataset name (emp_data) is illustrative. */
proc varmax data=emp_data;
   model employment help_wanted lay_off ism_m ism_nm claims
         / p=6 prior=(lambda=0.4 theta=0.3);
   id date interval=month;
   output lead=1 out=forecast_payroll;
run;

The next step is to provide the number of lags (p=6); Silvia and Iqbal (2012) suggested using lag order 6 (i.e., up to six lags of each variable). The lambda=.4 and theta=.3 options are the parameters of Litterman's prior, as discussed earlier in this chapter. The third line of code instructs SAS to use the date variable as the ID and sets the frequency of the dataset to monthly. In the next line, the OUTPUT statement specifies a one-month-ahead forecast (lead=1), and the forecast, along with the actual values and the 95 percent confidence interval, is saved in the newly created data file forecast_payroll.


FIGURE 11.1 Nonfarm Payrolls: Actual, Forecast, and 95 Percent Confidence Interval

In Figure 11.1, we plotted the actual employment data series along with the forecast and the forecast interval. The actual and forecast values generally stay within the 95 percent forecast interval.26 Overall, the model successfully predicted the turning points of the nonfarm payrolls series and produced consistent results.
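
A plot similar to Figure 11.1 can be produced from the OUT= dataset created above; a minimal sketch follows, assuming the forecast, lower, and upper confidence limit columns in forecast_payroll are named FOR1, LCI1, and UCI1 (these column names are an assumption and should be verified against the actual PROC VARMAX output).

/* Plot actual payrolls, the BVAR forecast, and the 95 percent interval.
   The FOR1/LCI1/UCI1 column names are assumptions; check forecast_payroll. */
proc sgplot data=forecast_payroll;
   band x=date lower=lci1 upper=uci1 / legendlabel="95% interval";
   series x=date y=employment / legendlabel="Actual";
   series x=date y=for1 / legendlabel="Forecast";
run;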

FORECASTING IN REAL TIME: ISSUES RELATED TO THE DATA AND THE MODEL SELECTION

In addition to econometric modeling, there are several important issues that need to be considered in real-time (short-term) forecasting. For example, the functional form of the dependent variable and the release timing of the dependent variable and its predictors are vital to an accurate real-time forecasting approach. This section discusses the functional form of the variables and why that form is important. Furthermore, we present a procedure to select the predictors of a model.

The Functional Form of the Variables

This section discusses issues related to dependent/target variables and their predictors. The dependent variables forecasted here include many macroeconomic and financial variables; Silvia and Iqbal (2012) selected 20 major macroeconomic variables for their study (a complete list of the variables is available in Appendix 11A). Most of these variables are reported either as a percentage change or as a net change but not in level form. For instance, measures of inflation (CPI, the Personal Consumption Expenditures deflator, and PPI), industrial production, retail sales, and durable goods orders are reported as month-to-month (MoM) and year-to-year (YoY) percentage changes, while the nonfarm payrolls data is reported as a net monthly change. A few variables are reported as a level/index value, such as the ISM Manufacturing/Nonmanufacturing indices and consumer confidence.

There are a few important points that we want to stress here. First, in the financial sector, the level form of many variables may not be relevant; instead, a specific form of a particular variable is meaningful to the market (e.g., a MoM percentage change in retail sales is more important than the level of the retail sales since the MoM percentage change suggests momentum/growth in the economy). Therefore, it is necessary to determine what form of the variable is meaningful to markets and what form the analyst is going to forecast.

Second, once an analyst determines the functional form of the target variable, he or she seeks the best predictors for that variable. A final point is that the functional form of the predictors should be consistent with the functional form of the dependent variable (e.g., if the dependent variable is a MoM percentage change, it would be better to use a MoM percentage change of the predictors). Here we present a practical example of a retail sales model; see Silvia and Iqbal (2012) for more details. One of the predictors of retail sales (MoM percentage change) is the ICSC Chain Store Index (chain-store sales). Chain-store sales are reported as a YoY percentage change. The retail sales model (we predict the MoM percentage change in sales) with chain-store sales as a YoY percentage change has a simulated out-of-sample RMSE of $24,201 million. If, however, the model uses the MoM percentage change of chain-store sales, then the simulated out-of-sample RMSE drops significantly to $3,042 million. One major reason for the two significantly different RMSEs from the same set of variables and the same sample period is that the YoY percentage change represents a change that occurs over a different time period from the MoM change. Moreover, our objective is to forecast the MoM percentage change in retail sales, and that suggests we should use the MoM percentage change in chain-store sales as a predictor, along with other predictors. This conclusion is true for many other variables as well.27 Consequently, the analysis suggests that during the model selection process, a forecaster should include the functional form of the predictor that is consistent with the functional form of the dependent variable.

It has always been difficult for researchers to filter through masses of data and find the most useful and best predictors. A small number of variables is essential, however, as including too many variables in a traditional econometric modeling framework creates overfitting and/or degree-of-freedom issues: the curse of dimensionality problem.28 However, due to advances in computer and database capabilities, combined with econometric/statistical software like SAS, a researcher can analyze each variable from a large dataset and select a reasonable number of variables based on statistical criteria. This chapter suggests a stepwise procedure that selects a handful of predictors, mostly 5 to 7 variables, from a dataset of more than 300 variables.

Silvia and Iqbal (2012) propose a three-step procedure to select predictors for each of the 20 models from a database of over 300 variables. Four transformations of these 300 variables are utilized: (1) the percentage change (MoM/YoY percentage change); (2) the lag of the variable; (3) the first-difference form; and (4) the lag of the first-difference form. In total, over 1,000 series are created as potential predictors. The objective here is to include as many predictors as possible in the first step. In contrast to typical econometric modeling, where a modeler already has a model specification guided by economic theory, here the assumption is that an analyst does not know much about the model specification. The analyst must rely on data variation and statistical principles (essentially, a data-mining technique) to indicate the choice of model specifications instead of utilizing previously employed models. The key advantage, among others, is that the data-mining technique gives each variable at least a chance to enter the final model and allows us to explore the forecasting usefulness of all predictors to a greater extent.
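
A minimal sketch of how these four transformations can be generated for one candidate predictor is shown below; the dataset name (candidates) and series name (x) are placeholders.

/* Build the four transformations of a candidate predictor x:
   MoM percentage change, lag, first difference, and lag of the first
   difference. Dataset and variable names are illustrative. */
data transformed;
   set candidates;
   x_pct       = 100 * (x / lag(x) - 1);   /* (1) MoM percentage change */
   x_lag1      = lag(x);                   /* (2) lag of the variable */
   x_diff      = dif(x);                   /* (3) first difference */
   x_diff_lag1 = lag(x_diff);              /* (4) lag of the first difference */
run;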

The Selection of the Best Model Specification

To select predictors for all short-term forecasting models, Silvia and Iqbal (2012) utilize this three-step procedure.

  1. We start by regressing the dependent variable on each of these 1,000 series and retain those with significant predictive power. With a much smaller dataset, we then find the best model specifications with one predictor, two predictors, and so on, up to six predictors. The R² is used as the selection criterion in choosing these specifications. Ten variables are selected from this step.
  2. We run a Granger causality test between the dependent variable and each of these 1,000 series and select the top 10 variables based on the chi-square test statistic. We have now narrowed our choice list to 20 variables, 10 from the regressions and 10 from the Granger causality tests.29 These 20 variables, so far, came from an in-sample statistical procedure.
  3. We use a simulated out-of-sample RMSE as the statistical measure to find the final model specification. Silvia and Iqbal set up a 6-variable BVAR framework, which provides an opportunity for each of these 20 variables to audition as a predictor.30 For most variables, we assume that the data is available, let us say, until 1999:M12, and we forecast one month ahead. We then move one month forward, using data until 2000:M1, and again forecast one month ahead. We repeat this process until the dataset reaches 2008:M11. At the end, we have 108 simulated out-of-sample one-month-ahead forecasted data points, which we use to calculate the RMSE. Six variables are thus selected based on the lowest RMSE value (a sketch of this recursive exercise appears after this list).
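
The expanding-window exercise in step 3 can be sketched in SAS as follows. This is a simplified illustration rather than Silvia and Iqbal's production code: the dataset name (monthly_data), the variable names (employment, x1–x5, date), and the assumption that date is stored as the first day of each month are all illustrative.

/* Expanding-window simulated out-of-sample forecasts: estimate through
   1999:M12, forecast one month ahead, roll the sample forward, and repeat
   108 times. Dataset and variable names are illustrative. */
%macro recursive_oos;
   %do i = 0 %to 107;
      data estimation;                                /* expanding window */
         set monthly_data;
         if date <= intnx('month', '01dec1999'd, &i); /* date = first of month */
      run;

      proc varmax data=estimation;
         model employment x1 x2 x3 x4 x5 / p=6 prior=(lambda=0.4 theta=0.3);
         id date interval=month;
         output lead=1 out=fc;
      run;

      data fc_last;                                   /* keep the forecast row */
         set fc end=eof;
         if eof;
      run;

      proc append base=all_forecasts data=fc_last force;
      run;
   %end;
%mend recursive_oos;

%recursive_oos;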

With the help of SAS, we can increase the predictive power of the final model specification. As mentioned earlier, the BVAR method uses a prior, referred to as the Minnesota prior, and the efficacy of the BVAR model depends, to some extent, on the prior and the selection of the lag order. A more flexible procedure is applied to select the prior and the lag order; it involves the above-mentioned recursive method to calculate the out-of-sample RMSE, but this time we do not fix the lag order or the values of Litterman's prior. We fix a maximum lag order of 9, since the data series for many variables do not have a long history. Because the parameters of Litterman's prior range between zero and one, with the flexibility and speed of the SAS system we can search for a better combination of the lags and the prior. For a six-variable model, for example, we choose a lag parameter, P, that ranges from 1 to 9 and values of Litterman's prior parameters ϑ and λ that each range from 0.1 to 0.9 in 0.1 increments. Altogether there are 729 (9 × 9 × 9 = 729) models, each consisting of a unique combination of P, ϑ, and λ, and 729 corresponding RMSEs. We select the combination of P, ϑ, and λ with the minimum RMSE as the final model specification, as sketched below. This model has the best overall simulated out-of-sample forecast performance based on the RMSE across multiple equations.
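
The grid search can be organized as three nested macro loops around a routine like the recursive exercise sketched earlier; %oos_rmse below is a hypothetical wrapper, assumed to run that exercise for one (P, ϑ, λ) combination and record its simulated out-of-sample RMSE.

/* Grid search over the lag order and the two prior hyperparameters.
   %oos_rmse is a hypothetical macro wrapping the recursive exercise above. */
%macro grid_search;
   %do p = 1 %to 9;
      %do t = 1 %to 9;
         %do l = 1 %to 9;
            %let theta  = %sysevalf(&t / 10);
            %let lambda = %sysevalf(&l / 10);
            %oos_rmse(p=&p, theta=&theta, lambda=&lambda);   /* record RMSE */
         %end;
      %end;
   %end;
%mend grid_search;

%grid_search;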

Timing of the Release: A Dependent Variable and Predictors

This section highlights a very important, and often the most neglected, issue: the release timing of the dependent variable and the predictors, which must be considered in any practical model specification process for market forecasts. Many macroeconomic variables are released with a time lag, such as a one-month or two-month lag. For instance, the ISM manufacturing index is released on the first business day of the month for the previous month, with a one-month lag (on September 1, 2010, the ISM manufacturing index was released for August 2010). Construction spending is released with a two-month lag, usually during the first week of the month (on September 1, 2010, construction spending data was released for July 2010). During model specification, a researcher must consider these issues, as conventional forecasting procedures do not appear to recognize these data availability constraints.

In real-time short-term macroeconomic forecasting, the most recent information about the predictors and the dependent variable is crucial to a successful forecast. Consider the nonfarm payrolls model, which uses the ISM manufacturing and nonmanufacturing employment indices as predictors, along with other variables. If the model uses the lag form of the indices, instead of the current form, then the model has a simulated real-time one-month-ahead RMSE of 124K. With the current form of the employment indices, along with the same set of variables and the same sample period, the model's RMSE drops significantly to 75K. This improvement in real-time forecasting from using the most recent values of the predictors holds for many other models. As a result, having the most recent month's information regarding the predictors is an integral part of a good forecasting approach, and data availability is essential to having a realistic and useful model.
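
When a predictor's current-month value has not yet been released at forecast time, its prior-month value can be used instead; a minimal sketch of building such lagged predictors follows, with illustrative dataset and variable names.

/* Create lagged predictors for cases in which the current-month value
   is not yet available. Names are illustrative. */
data model_input;
   set monthly_data;
   ism_m_lag1  = lag(ism_m);    /* prior-month ISM manufacturing employment index */
   ism_nm_lag1 = lag(ism_nm);   /* prior-month ISM nonmanufacturing employment index */
run;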

Another vital issue is the release timing of the dependent variable. For instance, the ISM manufacturing index, nonfarm payrolls, CPI, retail sales, industrial production, and personal income and spending data are all released with a one-month lag, but the precise release timing differs across these variables. The ISM manufacturing index is released on the first business day of every month; nonfarm payrolls is released on the first Friday of every month; CPI, retail sales, and industrial production usually are released toward the middle of every month; and personal income and spending data is released at the end of every month. Thus, although all these variables are released with a one-month lag, the precise release timing is different. Consequently, a forecaster cannot use current-month information on CPI, retail sales, and industrial production to predict nonfarm payrolls, the unemployment rate, or the ISM manufacturing index, because the former are released after the latter. However, the researcher can use current-month information on the ISM manufacturing index and payrolls to predict the CPI, retail sales, and/or industrial production. Similarly, current-month personal income and spending data cannot be used to predict CPI, retail sales, or industrial production. This issue is very important because, in practice, when we need the most recent values of the predictors to generate a forecast, not all independent variables' recent values are available to include. Very often, at the end of the final step of a model specification, we end up with a handful of potential predictors (based on the simulated out-of-sample RMSE) but have to drop some because they are released after the dependent variable. Moreover, if we use the lag form of those potential predictors (instead of the current values), the model's accuracy, based on the simulated out-of-sample RMSE, deteriorates significantly.

Nonfarm payrolls data, for instance, is released on the first Friday of the month, and only a few independent predictors are available for the model. The ISM manufacturing and nonmanufacturing employment indices are very good predictors in the employment model and normally are released before the employment data. Therefore, the current month's information for these predictors can be used to forecast nonfarm payrolls. However, using the lag form of these predictors, instead of the current values, makes a significant difference, especially in terms of the simulated out-of-sample RMSE: as noted above, the lag form yields a simulated real-time one-month-ahead RMSE of 124K, while the current form, with the same set of variables and the same sample period, reduces the RMSE to 75K. This is a major change in the RMSE, and this conclusion is true for many other models. The release timing of the dependent variable and the predictors is very important and needs to be considered during the model specification process, especially in real-time short-term macroeconomic forecasting.

CASE STUDY: WFC VERSUS BLOOMBERG

In this section we compare the real-time forecasts of Silvia and Iqbal (2012) using the BVAR approach with those of the Bloomberg real-time consensus for 20 key macroeconomic variables over the period 2009:M01 to 2010:M08, that is, 20 observations for each variable and 400 (20 × 20) in total. According to Silvia and Iqbal, every Friday morning they submit their BVAR forecasts to Bloomberg and other media (and to their clients) for those variables that will be released during the next week. Therefore, traders have enough time to make their investment decisions based on the economic projections; often they have almost a week to make their decisions (e.g., employment data is released on a Friday, so the forecast is usually submitted one week before the actual release). The results appear in Tables 11.1 to 11.5.

Table 11.1 provides detailed information about the nonfarm payrolls forecasts.31 Column 1 shows the date when the forecast was submitted to the media (Bloomberg and others). Column 2 indicates the real-time forecasts using the BVAR approach, and column 3 provides Bloomberg's real-time consensus for nonfarm payrolls. Column 4 contains the prior month's value, and revisions to the prior month are shown in column 5. Column 6 depicts the actual release of nonfarm payrolls. The last two columns show the real-time out-of-sample forecast errors for the BVAR forecasts and the Bloomberg consensus.

TABLE 11.1 Net Change in the Nonfarm Payrolls: BVAR VS. Bloomberg Consensus


It is worth mentioning that the nonfarm payrolls data is notorious for revisions, sometimes huge ones; in our analysis, 17 of the 20 months of nonfarm payrolls data were revised. Moreover, sometimes the data is revised from a negative number (a net job loss) to a positive one (a net job gain) and vice versa (e.g., in January 2010, the estimate for December 2009 nonfarm payrolls was revised to +4K from −11K). That makes the real-time one-month-ahead forecasting process for nonfarm payrolls harder. As mentioned earlier, we have set the real-time out-of-sample RMSE along with the real-time out-of-sample average directional accuracy as the forecast evaluation criteria. Therefore, a model with a smaller RMSE and a higher average directional accuracy is the more accurate one.

As can be seen from Table 11.1, for the net change in nonfarm payrolls, the BVAR approach has a real-time out-of-sample RMSE of about 65K, which is smaller than that of Bloomberg's real-time consensus, about 73K. Therefore, based on the RMSE, the BVAR model is more accurate than Bloomberg. The RMSE as a measure of forecast evaluation is necessary but not sufficient for adopting a forecast procedure; the average directional accuracy is also important. Markets move more when the actual announcement is different from market expectations (stronger than expected or vice versa) because the direction of the released variable suggests momentum or loss of momentum in the economy. For instance, from Table 11.1, the BVAR's real-time forecast for January 2010 (released on February 5, 2010) was −68K (a net loss of 68,000 jobs), the Bloomberg real-time consensus was +20K, and the prior-month value was −85K. The actual release came in at −20K. The range for the BVAR forecast (forecast +/− RMSE) was between −3K and −133K (−68K +/− 65K), and between +93K and −53K (20K +/− 73K) for Bloomberg. Since the actual release stayed within one RMSE of both forecasts, both the BVAR model and the Bloomberg consensus may be considered useful.

The implication of the actual release, however, was very different from that of the Bloomberg consensus. Let us analyze the situation in real time. In February 2010, the U.S. economy was still in recession, as the National Bureau of Economic Research (NBER) had not yet declared the end of the 2007 to 2009 recession.32 Moreover, the monthly net change in nonfarm payrolls had shown continuous net job losses since January 2008.33 A market consensus of +20K indicated that people were expecting a marginal improvement in the labor market. The BVAR model, however, still forecasted a net job loss of 68K. Investment decisions would differ based on these two outlooks for the labor market, since the labor market is a key indicator of the U.S. economy. The actual release of a negative number (−20K) was consistent with the BVAR forecast and with the signal that the economy remained weak. Furthermore, as Rigobon and Sack (2008) and Bartolini et al. (2008) suggest, markets move significantly when the actual release is different from market expectations. Therefore, had a trader acted on an accurate forecast of the direction of the actual release, it was a perfect opportunity to make money, given that the actual release was weaker than expected. That is just one example of the importance of directional accuracy. From Table 11.2, the BVAR nonfarm payrolls model has an average directional accuracy of 79 percent (79 percent of the time the direction of the forecast was right), which is higher than that of the Bloomberg consensus (67 percent). Therefore, the nonfarm payrolls model using the BVAR approach is more accurate, on average, than Bloomberg, in terms of both the real-time out-of-sample RMSE and the average directional accuracy.

TABLE 11.2 Summary of the Results: Net Change in Nonfarm Payrolls


In Table 11.3, we provide the real-time out-of-sample RMSE and directional accuracy of each of the 20 macroeconomic variables for the BVAR and the Bloomberg consensus. The results indicate that, for 18 of the 20 variables, the BVAR has smaller RMSEs than the Bloomberg consensus, which reflects the median forecast likely derived from a number of forecasting approaches. Only for 2 variables (durable goods orders and the trade balance) does the BVAR have a slightly higher RMSE than Bloomberg; for these variables, however, the BVAR has a higher directional accuracy than Bloomberg. In terms of real-time average directional accuracy, 19 of the 20 models have a higher accuracy than Bloomberg. Only one model (existing home sales) has accuracy equal to Bloomberg; nevertheless, this model has a smaller RMSE than Bloomberg. From Table 11.4, for 17 of the 20 variables the BVAR has a smaller RMSE as well as a higher (or, in one case, equal) average directional accuracy than Bloomberg. In addition, none of the BVAR models has a lower average directional accuracy than Bloomberg. As a result, the BVAR real-time forecasts are more accurate than those of the Bloomberg consensus, on average.

TABLE 11.3 Summary of the Results for 20 Variables


TABLE 11.4 Summary of the Results


TABLE 11.5 Summary of the Results


A summary of the results is in Table 11.5. It can be seen that 7 of the 20 BVAR models have a real-time out-of-sample average directional accuracy of 90 percent or more, compared to only 2 for Bloomberg. Moreover, 50 percent of the models (10 of the 20) have a directional accuracy of 80 percent or more, compared to 25 percent (5 of the 20) for Bloomberg. If we set, for example, 70 percent directional accuracy as a benchmark for a best model, then 19 of the 20 models (95 percent) passed that test, compared to 55 percent (11 variables) for Bloomberg. Overall, the average directional accuracy for the BVAR models is 82 percent; for Bloomberg, it is 71 percent. In conclusion, the BVAR's forecasts are clearly more accurate than those of the Bloomberg consensus, on average. It is worth mentioning that over the past year, we have been cited by Bloomberg among the top five forecasters for major macroeconomic variables such as nonfarm payrolls, the unemployment rate, and housing starts.

There is another essential question: Do markets respond differently to macroeconomic variable release announcements at different stages of the business cycle? The answer to this question, based on empirical evidence and personal experience, is yes. For instance, Andersen et al. (2007) suggested that the equity market reacts differently to macroeconomic news announcements depending on the stage of the business cycle. Moreover, Boyd et al. (2005) concluded that stock prices respond differently to changes in the unemployment rate during recessions and expansions because the dividend and discount rate effects have different weights at different points of the business cycle.34 The vital points are: (1) the financial markets react differently to news announcements depending on the stage of the business cycle, whether it is a recession or an expansion, and (2) some variables may have a significant effect on the markets at certain stages of the business cycle, such as during a recession (or the early phase of a recovery), rather than during a normal or steady stage of the economic expansion. For example, during the 2007 to 2010 recession and early phase of the recovery, the data release announcements regarding the housing sector received more financial market attention, and thereby generated more significant market volatility and reaction, than in earlier cycles.35 The housing sector boom-bust was a major cause of the 2007 to 2009 recession, and a solid economic recovery without a housing sector recovery was considered unlikely. Therefore, the housing-related data announcements were very important to the financial market at least during the 2007 to 2010 time period and continue to be so at the time of this writing. As a result, accurate forecasts of housing-related data are more important during this cycle than in earlier cycles.

Housing-related data announcements were very important to the markets. From Table 11.3, the BVAR's forecasts for those variables were better than those of the Bloomberg consensus, on average. For instance, the BVAR forecasts for housing starts, a key representative of the housing sector, had a real-time out-of-sample RMSE of 44K (44,000 units) compared to 57K for the Bloomberg consensus. The housing starts model also has a higher real-time average directional accuracy (75 percent) than Bloomberg (42 percent); see the table for details. Another important variable is new home sales: the BVAR model has a lower RMSE (35K) and higher directional accuracy (62 percent) than Bloomberg (an RMSE of 40K and 46 percent directional accuracy). In addition, forecasts for existing home sales were more accurate than those of the Bloomberg consensus, on average (see the table for details). In conclusion, the essential point is that a few macroeconomic variables can become vital to the markets because of their relation to the causes of a recession/recovery (e.g., the housing sector became a key to the 2007 to 2009 recession and recovery).

A crucial issue is that many macroeconomic variables are notorious for revisions, which makes short-term forecasting accuracy difficult. For example, in this analysis, housing starts, retail sales, and durable goods orders were revised for all 20 months; employment was revised for 17 months; and industrial production was revised for 18 of the 20 months. This is a warning for researchers and forecasters, who often use revised, not real-time, data in short-term forecasting exercises, a practice that reduces forecast accuracy.

Summing up, first, the BVAR forecasts are more accurate than those of the Bloomberg consensus, on average. Second, some variables may get more financial sector attention due to their relation to the stages of the business cycle. This implies that an analyst should be aware of the changing importance of these variables, and it is better to attempt to forecast all major macroeconomic variables accurately to build up some experience with real-time forecasts for each series. Finally, many macroeconomic variables are notorious for revisions, a fact that must be considered when using revised data to mimic real-time responses.36

SUMMARY

This chapter provides a real-time short-term macroeconomic forecasting approach. Furthermore, we shed light on four important areas of macroeconomic forecasting:

  1. The macroeconomic variable release impacts financial market volatility and direction; moreover, the impact is most significant when the actual release is different from the market expectation.
  2. The economic value of a forecast methodology that is better than consensus is increased, given that the markets move when the actual release values are significantly different from the market consensus. An individual forecast approach that is better than consensus will provide more opportunities to make a profit or reduce losses.
  3. In short-term forecasting (one month ahead), the actual release timing of the target variable (dependent variable) as well as the predictors is very important and needs to be considered in model specification.
  4. Traditional forecast evaluation methods, such as R², adjusted R², RMSE, and so on, are necessary, but we recommend an additional step: directional accuracy.

Using the Silvia and Iqbal forecasting approach, we compared the BVAR's real-time forecasts with the Bloomberg real-time consensus and concluded that the BVAR forecasts are more accurate than those of the Bloomberg consensus, on average, for key macroeconomic variables.

APPENDIX 11A: LIST OF VARIABLES

Silvia and Iqbal (2012) included 20 variables in their forecasting comparison of the BVAR model and the Bloomberg consensus. The first column provides the name of a variable. The second column shows the specific form of the forecasted variable. For example, a month-over-month (MoM) percentage change of the business inventories is forecasted. All variables are monthly series.

TABLE 11.A Forecast Evaluation 2010


This chapter draws heavily on John Silvia and Azhar Iqbal (2012), “A Comparison of Consensus and BVAR Macroeconomic Forecasts,” Business Economics 47, no. 4: 250–261.

1For a detailed discussion about real-time data, see Dean Croushore and Tom Stark (2001), “A Real-Time Data Set for Macroeconomists,” Journal of Econometrics 105 (November): 111–130.

2John Silvia and Azhar Iqbal (2012), “A Comparison of Consensus and BVAR Macroeconomic Forecasts,” Business Economics 47, no. 4: 250–261.

3See the next section, “The Importance of Real-Time Short-Term Forecasting,” for details.

4G. W. Schwert (1981), “The Adjustment of Stock Prices to Information About Inflation,” Journal of Finance 36: 15–29; D. Pearce and V. Roley (1985), “Stock Prices and Economic News,” Journal of Business 58: 49–67; P. C. Jain (1988), “Response of Hourly Stock Prices and Trading Volume to Economic News,” Journal of Business 61: 219–231.

5Torben Andersen, Tim Bollerslev, Francis Diebold, and Clara Vega (2007), “Real-Time Price Discovery in Global Stock, Bond, and Foreign Exchange Markets,” Journal of International Economics 73, no. 2 (November): 251–277.

6Xin Huang (2007), “Macroeconomic News Announcements, Financial Market Volatility and Jumps,” working paper, Duke University, Durham, NC.

7Jon Faust, John Rogers, Shing-Yi Wang, and Jonathan Wright (2007), “The High-Frequency Response of Exchange Rates and Interest Rates to Macroeconomic Announcements,” Journal of Monetary Economics 54, no. 4 (May): 1051–1068.

8Thomas Gilbert, Shimon Kogan, and Lars Lochstoer (2007), “Investor Inattention and the Market Impact of Summary Statistics,” available at: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1108050.

9J. Nikkinen and P. Sahlström (2001), “Impact of Scheduled U.S. Macroeconomic News on Stock Market Uncertainty: A Multivariate Perspective,” Multinational Finance Journal 5, no. 2: 129–148.

10There are some other surveys too (e.g., the Blue Chip survey and the Wall Street Journal survey), but these surveys are for longer-term forecasts (up to eight quarters ahead for major macroeconomic variables). The focus of this chapter is short-term forecasting, and the Bloomberg consensus is the best option for that purpose.

11The survey polls a group of economists; the number varies with the degree of interest in the indicator at issue. Surveys of highly watched indicators, such as nonfarm payrolls and the unemployment rate, often have more than 70 respondents. The lag between the participants' responses and the date of the indicator release also varies, from a few days to two weeks.

12Rigobon and Sack (2008) used the Money Market Services survey's consensus as proxy for market consensus before September 2004 and Bloomberg consensus after that.

13J. Boyd, Jian Hu, and Ravi Jagannathan (2005), “The Stock Market's Reaction to Unemployment News: Why Bad News Is Usually Good for Stocks,” Journal of Finance 60, no. 2 (April): 649–672.

14Silvia and Iqbal (2012) noted that in their own experience sitting on a trading floor, markets do respond to the releases of macroeconomic data, sometimes sharply. A number of releases, such as nonfarm payrolls and inflation measures, can produce large effects.

15C. A. Sims (1980), “Macroeconomics and Reality,” Econometrica 48, no. 1: 1–48.

16R. Litterman (1980), “Techniques for Forecasting with Vector Autoregressions,” PhD dissertation, University of Minnesota; R. Litterman (1986), “Forecasting with Bayesian Vector Autoregressions—5 Years of Experience,” Journal of Business and Economic Statistics 4: 25–38.

17C. A. Sims and T. Zha (1998), “Bayesian Methods for Dynamic Multivariate Models,” International Economic Review 39, no. 4: 949–968.

18J. C. Robertson and Ellis W. Tallman (1999), “Vector Autoregressions: Forecasting and Reality,” Federal Reserve Bank of Atlanta Economic Review 84, no. 1: 4–18.

19Sims and Zha (1998) modified Litterman's original prior by imposing a normal prior distribution on the coefficients and an inverse Wishart prior distribution on the covariance matrix of the residuals Ψ.

20H. Lütkepohl (1991), Introduction to Multiple Time Series Analysis (New York: Springer). P.C.B. Phillips (1991), “Bayesian Routes and Unit Roots: de Rebus Prioribus Semper Est Disputandum,” Journal of Applied Econometrics 6: 435–473.

21C. A. Sims, J. Stock, and M. Watson (1990), “Inference in Linear Time Series Models with Some Unit Roots,” Econometrica 58: 113–144.

22C. A. Sims (1991), “Comment on ‘Empirical Analysis of Macroeconomic Time Series: VAR and Structural Models,’ by Clements and Mizon,” European Economic Review 35: 922–932.

23L. J. Álvarez and F. C. Ballabriga (1994), “BVAR Models in the Context of Cointegration: A Monte Carlo Experiment,” Documento de Trabajo, no. 9405, Banco de España, Servicio de Estudios.

24The level form of a few variables is also important in some selected circumstances—for example, the ISM Manufacturing Index. However, these types of variables are very few.

25It is important to note that SAS code S11.1 can be utilized to forecast any macroeconomic or financial time series. An analyst just needs to replace the listed variables with the desired variables to generate forecast for the variable of interest.

26Except in May 2010, where the actual value (521K) is much larger than the upper limit (297K). This miss is a result of the 2010 Census. The U.S. Census Bureau hired hundreds of thousands of temporary workers that the model was unable to incorporate. Since we were aware of the census and could expect this census effect, we incorporated add-factors to the model's forecasted values, boosting the forecast number. For more details about add-factors, see Chapter 13 of this book.

27The conclusion of this example is that we should at least include a functional form of the predictors that is consistent with the target variable in the model specification process. If the same functional form is not the best option, then we can use the form that is consistent with the forecasting objective.

28See Litterman (1986) for more details.

29In the second step, we included all variables and selected the top 10 variables other than those already selected in the first step. That way, we increased our choice list to 20 variables.

30Most of their models have five to eight predictors, and usually they start with a six-variable framework and then include/exclude variable(s) based on simulated out-of-sample RMSE.

31As an example, we present detailed results for nonfarm payrolls. For the remaining 19 variables, we provide a summary of the results in Tables 11.3 through 11.5.

32The NBER, however, declared on September 20, 2010, that June 2009 was the end of the recession of 2007 to 2009.

33The actual release of nonfarm payrolls indicates a negative number (a net job loss) during the time period from January 2008 to February 2010. However, the November 2009 data was revised to a positive number (a net job gain), the first positive number since January 2008.

34J. Boyd, Jian Hu, and Ravi Jagannathan (2005), “The Stock Market's Reaction to Unemployment News: Why Bad News Is Usually Good for Stocks,” Journal of Finance 60, no. 2 (April): 649–672.

35It is worth mentioning that it is our anecdotal observation; we leave an empirical proof for future research.

36See Croushore and Stark (2001) for more details.
