How to do it...

Let's move on to the data processing part:

  1. First, we check the datatype of the Sales column in our data:
class(data$Sales)

Note that the Sales column in the data is of the factor datatype. We need to convert it to a numeric datatype in order to use it in our analysis:

data$Sales <- as.numeric(as.character(data$Sales))
class(data$Sales)

Now, the class of the Sales column has been changed to numeric.

  2. To implement a time series forecast, we need to make the data stationary. We can do this using the diff() function, which computes lagged differences of a series. We set the differences argument to 1 to apply first-order differencing (diff() uses a lag of 1 by default); a quick stationarity check is sketched after the output below:
data_differenced = diff(data$Sales, differences = 1)
head(data_differenced)

The following screenshot shows a piece of the data after differencing:

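If we want to confirm that the differenced series is stationary before modeling, a quick option is the Augmented Dickey-Fuller test. Here is a minimal sketch, assuming the tseries package is installed (this check is an addition and not part of the recipe's required steps):

library(tseries)

# Augmented Dickey-Fuller test; a small p-value indicates that the
# differenced series can be treated as stationary
adf.test(data_differenced, alternative = "stationary")
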
  3. Next, we create a supervised dataset so that we can apply the GRU. We transform the data_differenced series by creating a lagged copy of the series with an order of 1; that is, the value at time t-1 becomes the input and the value at time t becomes the output:
data_lagged = c(rep(NA, 1), data_differenced[1:(length(data_differenced)-1)])
data_preprocessed = as.data.frame(cbind(data_lagged,data_differenced))
colnames(data_preprocessed) <- c( paste0('x-', 1), 'x')
data_preprocessed[is.na(data_preprocessed)] <- 0
head(data_preprocessed)

Here is how our supervised dataset looks:

  4. Now, we need to split the data into training and testing sets. In time series problems, we can't do random sampling on the data since the order of the data matters. Thus, we split the data by taking the first 70% of the series as training data and the remaining 30% as test data:
N = nrow(data_preprocessed)
n = round(N *0.7, digits = 0)
train = data_preprocessed[1:n, ]
test = data_preprocessed[(n+1):N,]
print("Training data snapshot :")
head(train)
print("Testing data snapshot :")
head(test)

The following screenshot shows a few records from the training dataset:

The following screenshot shows a few records from the test dataset:

  5. Next, we normalize the data to the range of the activation function that we are going to use. Since we have chosen tanh as our activation function, which outputs values between -1 and +1, we scale the data using min-max normalization. Note that the scaling statistics are computed from the training data only:
scaling_data = function(train, test, feature_range = c(0, 1)) {
  x = train
  fr_min = feature_range[1]
  fr_max = feature_range[2]
  std_train = ((x - min(x)) / (max(x) - min(x)))
  std_test = ((test - min(x)) / (max(x) - min(x)))

  scaled_train = std_train * (fr_max - fr_min) + fr_min
  scaled_test = std_test * (fr_max - fr_min) + fr_min

  return(list(scaled_train = as.vector(scaled_train),
              scaled_test = as.vector(scaled_test),
              scaler = c(min = min(x), max = max(x))))
}

Scaled = scaling_data(train, test, c(-1, 1))
y_train = Scaled$scaled_train[, 2]
x_train = Scaled$scaled_train[, 1]

y_test = Scaled$scaled_test[, 2]
x_test = Scaled$scaled_test[, 1]

Then, we write a function to revert the predicted values back to the original scale. We will use this function when making predictions for the test data; a small usage example follows the function definition:

## inverse-transform
invert_scaling = function(scaled, scaler, feature_range = c(0, 1)) {
  min = scaler[1]
  max = scaler[2]
  t = length(scaled)
  mins = feature_range[1]
  maxs = feature_range[2]
  inverted_dfs = numeric(t)

  for (i in 1:t) {
    X = (scaled[i] - mins) / (maxs - mins)
    rawValues = X * (max - min) + min
    inverted_dfs[i] <- rawValues
  }
  return(inverted_dfs)
}
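
As a quick illustration of how invert_scaling() works (this example is an addition, not part of the recipe), suppose some values were scaled to the range [-1, 1] from a series whose minimum was 0 and maximum was 100:

# Hypothetical scaler: the original series ranged from 0 to 100
scaled_example = c(-1, 0, 1)
invert_scaling(scaled_example, c(min = 0, max = 100), c(-1, 1))
# Returns 0, 50, and 100, that is, the values on the original scale
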
  6. Now, we define the model and configure the layers. We reshape our data into a 3D format so that it can be fed into the model:
# Reshaping the input to 3-dimensional
dim(x_train) <- c(length(x_train), 1, 1)

# specify required arguments
batch_size = 1
units = 1

model <- keras_model_sequential()
model %>%
  layer_gru(units,
            batch_input_shape = c(batch_size, dim(x_train)[2], dim(x_train)[3]),
            stateful = TRUE) %>%
  layer_dense(units = 1)

Let's have a look at the summary of the model:

summary(model)

The following screenshot shows a description of the model:

Next, we compile the model:

model %>% compile(
  loss = 'mean_squared_error',
  optimizer = optimizer_adam(lr = 0.01, decay = 1e-6),
  metrics = c('accuracy')
)
  7. Now, in each epoch, we fit the model on the training data and then reset the states. We train the model for 50 epochs; an optional variant that records the per-epoch training loss is sketched after this code block:
for (i in 1:50) {
  model %>% fit(x_train, y_train, epochs = 1, batch_size = batch_size,
                verbose = 1, shuffle = FALSE)
  model %>% reset_states()
}
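
If we also want to monitor convergence, we can keep the loss that fit() reports for each epoch. The following is an optional variant of the training loop and is not part of the original recipe; running it instead of the loop above trains the same model while also collecting a loss curve:

losses <- numeric(50)
for (i in 1:50) {
  history <- model %>% fit(x_train, y_train, epochs = 1, batch_size = batch_size,
                           verbose = 1, shuffle = FALSE)
  # fit() returns a training history; with epochs = 1 the loss vector has one entry
  losses[i] <- history$metrics$loss[1]
  model %>% reset_states()
}
plot(losses, type = "l", xlab = "Epoch", ylab = "Training loss")
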
  8. Finally, we predict the values for the test dataset and use the invert_scaling() function to scale the predictions back to the original scale:
scaler = Scaled$scaler
predictions = vector()

for (i in 1:length(x_test)) {
  X = x_test[i]
  dim(X) = c(1, 1, 1)
  yhat = model %>% predict(X, batch_size = batch_size)
  # invert scaling
  yhat = invert_scaling(yhat, scaler, c(-1, 1))
  # invert differencing
  yhat = yhat + data$Sales[(n + i)]
  # store
  predictions[i] <- yhat
}

Let's look at the predictions for the test data:

predictions

The following screenshot shows the values that were predicted for the test dataset:

Comparing the predicted values with the actual sales figures for the test period suggests that the model captures the overall pattern of the series reasonably well; a quick way to quantify this is sketched below.
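
To back this up with a number, we can compare the predictions against the actual sales figures for the test period. The following is a minimal sketch (an addition to the recipe); it assumes, as in the inverse-differencing step above, that the prediction for test row i corresponds to data$Sales[n + i + 1]:

# Actual sales values for the test period
actuals = data$Sales[(n + 2):(n + 1 + length(predictions))]

# Root mean squared error of the forecasts on the original scale
rmse = sqrt(mean((predictions - actuals)^2))
rmse

# Visual comparison of actual vs. predicted sales
plot(actuals, type = "l", xlab = "Test observation", ylab = "Sales")
lines(predictions, col = "red")
legend("topleft", legend = c("Actual", "Predicted"), col = c("black", "red"), lty = 1)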
