Time series as a regression problem

Another machine learning approach is to treat time series problems as regression problems.

In the previous section, we looked at non-seasonal time series with trend and error components, and at seasonal time series with an additional seasonal component. Trend and seasonality are non-stationary aspects of a series, so the time series has to be made stationary before we apply any regression method.

Statistical stationarity
A stationary time series is one whose statistical properties, such as mean, variance, autocorrelation, and so on, are all constant over time.
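
As a quick, optional check that is not part of the original example, you can apply an augmented Dickey-Fuller test from the tseries package (assuming it is installed) to get a formal read on whether a series looks stationary:

# Augmented Dickey-Fuller test; the null hypothesis is that the series
# is non-stationary (has a unit root), so a small p-value suggests stationarity.
library(tseries)
adf.test(king.ts)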

Let us do a simple regression exercise with our time series data. We are going to predict a king's age at death from the previous king's age at death. It is a contrived example, but it shows clearly how a time series can be recast as a regression problem for forecasting.

Let us smooth the data to make it stationary:

> library(TTR)   # SMA() comes from the TTR package
> smooth.king <- SMA(king.ts, n=5)
> smooth.king
Time Series:
Start = 1
End = 42
Frequency = 1
[1] NA NA NA NA 55.2 51.6 53.0 52.6 56.2 53.6 58.2 55.0 51.4 44.6 45.8 41.0 36.8 34.4 38.2 39.6
[21] 34.6 40.6 47.4 48.6 47.0 55.6 64.2 61.4 63.2 63.8 58.6 51.2 53.6 55.4 61.4 68.2 72.6 75.4 73.6 71.4
[41] 73.4 70.4

We used the SMA function from the TTR package to compute a simple moving average over a window of five observations, which is why the first four values are NA.
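
If you are curious about what SMA is computing, the following base R sketch (not part of the original workflow) reproduces the same trailing five-point average with stats::filter and should match the values above:

# Trailing 5-point moving average; sides = 1 means only the current and
# previous observations enter each average, so the first four values are NA.
manual.sma <- stats::filter(king.ts, rep(1/5, 5), sides = 1)
manual.sma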

Let us now create our x values:

> library(zoo)
> library(quantmod)
>
> data <- as.zoo(smooth.king)
> x1 <- Lag(data,1)
> new.data <- na.omit(data.frame(Lag.1 = x1, y = data))
> head(new.data)
Lag.1 y
6 55.2 51.6
7 51.6 53.0
8 53.0 52.6
9 52.6 56.2
10 56.2 53.6
11 53.6 58.2
>

We use the Lag function from the quantmod package to shift the time series. Our predictor is the previous king's age at death, so we take a lag of 1. The data frame new.data then holds our x (Lag.1) and y columns, with the leading NA rows dropped by na.omit.
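
As a sketch of an alternative (assuming the same smooth.king object as above), the lagged data frame can also be built in base R with embed, which pairs each value with its predecessor and should produce rows equivalent to new.data:

# Drop the leading NAs, then let embed() line up each value (column 1)
# with the previous value (column 2).
m <- embed(as.numeric(na.omit(smooth.king)), 2)
alt.data <- data.frame(Lag.1 = m[, 2], y = m[, 1])
head(alt.data)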

Let us now build our linear regression model:

> model <- lm(y ~ Lag.1, new.data)
> model

Call:
lm(formula = y ~ Lag.1, data = new.data)

Coefficients:
(Intercept) Lag.1
2.8651 0.9548

>

Using the lm function, we have built our linear regression model; the intercept and the Lag.1 coefficient shown above define the fitted line.
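
As a minimal sketch that is not part of the original text, the fitted model can be used for a one-step-ahead forecast by feeding the last smoothed value in as the new Lag.1 value:

# One-step-ahead forecast: the most recent smoothed value becomes Lag.1.
last.value <- tail(as.numeric(na.omit(smooth.king)), 1)
predict(model, newdata = data.frame(Lag.1 = last.value))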

Let us quickly plot our model:

plot(model)

Several diagnostic plots are generated. Let us look at the normal Q-Q plot first.

This plot shows whether the residuals are normally distributed. Here the points follow the reference line closely, which suggests the residuals are approximately normal.
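
If you want to draw only this plot, or back up the visual impression with a numeric check, the following sketch (not part of the original text) uses plot.lm's which argument and a Shapiro-Wilk test from base R:

# which = 2 selects the normal Q-Q plot; shapiro.test() tests the
# null hypothesis that the residuals are normally distributed.
plot(model, which = 2)
shapiro.test(residuals(model))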

Let us now look at the residuals versus fitted values plot.

This plot shows whether the residuals exhibit non-linear patterns. The residuals are spread fairly evenly around the red horizontal line, without any distinct pattern, which is a good indication that there is no unmodelled non-linear relationship.
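
Similarly, the residuals versus fitted values plot can be drawn on its own; this is just a convenience, not part of the original text:

# which = 1 selects the residuals versus fitted values plot.
plot(model, which = 1)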

We will stop this section here. Hopefully, it has given you an idea of how to rearrange time series data to fit a regression framework. Some of the techniques shown here, such as lagging, will be used again later when we build our deep learning model.

For more about time series regression, refer to the book Forecasting: Principles and Practice at https://www.otexts.org/fpp/4/8. It introduces some basic time series definitions. Curious readers can also refer to A Little Book of R for Time Series at https://a-little-book-of-r-for-time-series.readthedocs.io/en/latest/src/timeseries.html.
