How to do it...

Let's move on to the data processing part:

  1. First, we check the datatype of the Sales column in our data:
class(data$Sales)

Note that the Sales column in the data is of the factor datatype. We need to convert it to a numeric datatype in order to use it in our analysis:

data$Sales <- as.numeric(as.character(data$Sales))
class(data$Sales)

Now, the class of the Sales column has been changed to numeric.

  2. To implement a time series forecast, we need to make the data stationary. We can do this using the diff() function, which computes lagged differences of a series. We set the differences argument to 1 to apply first-order differencing (diff() uses a lag of 1 by default); a quick stationarity check is sketched after the output below:
data_differenced = diff(data$Sales, differences = 1)
head(data_differenced)

The following screenshot shows a piece of the data after differencing:

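If we want to confirm that the differenced series is stationary before modeling, a quick option is the Augmented Dickey-Fuller test. Here is a minimal sketch, assuming the tseries package is installed (this check is an addition and not part of the recipe's required steps):

library(tseries)

# Augmented Dickey-Fuller test; a small p-value indicates that the
# differenced series can be treated as stationary
adf.test(data_differenced, alternative = "stationary")
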
  3. Next, we create a supervised dataset so that we can apply the GRU. We transform the data_differenced series by creating a lagged copy of the series with an order of 1; that is, the value at time t-1 becomes the input and the value at time t becomes the output:
data_lagged = c(rep(NA, 1), data_differenced[1:(length(data_differenced)-1)])
data_preprocessed = as.data.frame(cbind(data_lagged,data_differenced))
colnames(data_preprocessed) <- c( paste0('x-', 1), 'x')
data_preprocessed[is.na(data_preprocessed)] <- 0
head(data_preprocessed)

Here is how our supervised dataset looks:

  4. Now, we need to split the data into training and testing sets. In time series problems, we can't do random sampling on the data since the order of the data matters. Thus, we split the data by taking the first 70% of the series as training data and the remaining 30% as test data:
N = nrow(data_preprocessed)
n = round(N *0.7, digits = 0)
train = data_preprocessed[1:n, ]
test = data_preprocessed[(n+1):N,]
print("Training data snapshot :")
head(train)
print("Testing data snapshot :")
head(test)

The following screenshot shows a few records from the training dataset:

The following screenshot shows a few records from the test dataset:

  5. Next, we normalize the data to the range of the activation function that we are going to use. Since we have chosen tanh as our activation function, which outputs values between -1 and +1, we scale the data using min-max normalization. Note that the scaling statistics are computed from the training data only:
scaling_data = function(train, test, feature_range = c(0, 1)) {
  x = train
  fr_min = feature_range[1]
  fr_max = feature_range[2]
  std_train = ((x - min(x)) / (max(x) - min(x)))
  std_test = ((test - min(x)) / (max(x) - min(x)))

  scaled_train = std_train * (fr_max - fr_min) + fr_min
  scaled_test = std_test * (fr_max - fr_min) + fr_min

  return(list(scaled_train = as.vector(scaled_train),
              scaled_test = as.vector(scaled_test),
              scaler = c(min = min(x), max = max(x))))
}

Scaled = scaling_data(train, test, c(-1, 1))
y_train = Scaled$scaled_train[, 2]
x_train = Scaled$scaled_train[, 1]

y_test = Scaled$scaled_test[, 2]
x_test = Scaled$scaled_test[, 1]

Then, we write a function to revert the predicted values back to the original scale. We will use this function when making predictions for the test data; a small usage example follows the function definition:

## inverse-transform
invert_scaling = function(scaled, scaler, feature_range = c(0, 1)) {
  min = scaler[1]
  max = scaler[2]
  t = length(scaled)
  mins = feature_range[1]
  maxs = feature_range[2]
  inverted_dfs = numeric(t)

  for (i in 1:t) {
    X = (scaled[i] - mins) / (maxs - mins)
    rawValues = X * (max - min) + min
    inverted_dfs[i] <- rawValues
  }
  return(inverted_dfs)
}
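
As a quick illustration of how invert_scaling() works (this example is an addition, not part of the recipe), suppose some values were scaled to the range [-1, 1] from a series whose minimum was 0 and maximum was 100:

# Hypothetical scaler: the original series ranged from 0 to 100
scaled_example = c(-1, 0, 1)
invert_scaling(scaled_example, c(min = 0, max = 100), c(-1, 1))
# Returns 0, 50, and 100, that is, the values on the original scale
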
  6. Now, we define the model and configure the layers. We reshape our data into a 3D format so that it can be fed into the model:
# Reshaping the input to 3-dimensional
dim(x_train) <- c(length(x_train), 1, 1)

# specify required arguments
batch_size = 1
units = 1

model <- keras_model_sequential()
model %>%
  layer_gru(units,
            batch_input_shape = c(batch_size, dim(x_train)[2], dim(x_train)[3]),
            stateful = TRUE) %>%
  layer_dense(units = 1)

Let's have a look at the summary of the model:

summary(model)

The following screenshot shows a description of the model:

Next, we compile the model:

model %>% compile(
  loss = 'mean_squared_error',
  optimizer = optimizer_adam(lr = 0.01, decay = 1e-6),
  metrics = c('accuracy')
)
  7. Now, in each epoch, we fit the model on the training data and then reset the states. We train the model for 50 epochs; an optional variant that records the per-epoch training loss is sketched after this code block:
for (i in 1:50) {
  model %>% fit(x_train, y_train, epochs = 1, batch_size = batch_size,
                verbose = 1, shuffle = FALSE)
  model %>% reset_states()
}
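
If we also want to monitor convergence, we can keep the loss that fit() reports for each epoch. The following is an optional variant of the training loop and is not part of the original recipe; running it instead of the loop above trains the same model while also collecting a loss curve:

losses <- numeric(50)
for (i in 1:50) {
  history <- model %>% fit(x_train, y_train, epochs = 1, batch_size = batch_size,
                           verbose = 1, shuffle = FALSE)
  # fit() returns a training history; with epochs = 1 the loss vector has one entry
  losses[i] <- history$metrics$loss[1]
  model %>% reset_states()
}
plot(losses, type = "l", xlab = "Epoch", ylab = "Training loss")
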
  8. Finally, we predict the values for the test dataset and use the invert_scaling() function to scale the predictions back to the original scale:
scaler = Scaled$scaler
predictions = vector()

for (i in 1:length(x_test)) {
  X = x_test[i]
  dim(X) = c(1, 1, 1)
  yhat = model %>% predict(X, batch_size = batch_size)
  # invert scaling
  yhat = invert_scaling(yhat, scaler, c(-1, 1))
  # invert differencing
  yhat = yhat + data$Sales[(n + i)]
  # store
  predictions[i] <- yhat
}

Let's look at the predictions for the test data:

predictions

The following screenshot shows the values that were predicted for the test dataset:

Comparing the predicted values with the actual sales figures for the test period suggests that the model captures the overall pattern of the series reasonably well; a quick way to quantify this is sketched below.
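
To back this up with a number, we can compare the predictions against the actual sales figures for the test period. The following is a minimal sketch (an addition to the recipe); it assumes, as in the inverse-differencing step above, that the prediction for test row i corresponds to data$Sales[n + i + 1]:

# Actual sales values for the test period
actuals = data$Sales[(n + 2):(n + 1 + length(predictions))]

# Root mean squared error of the forecasts on the original scale
rmse = sqrt(mean((predictions - actuals)^2))
rmse

# Visual comparison of actual vs. predicted sales
plot(actuals, type = "l", xlab = "Test observation", ylab = "Sales")
lines(predictions, col = "red")
legend("topleft", legend = c("Actual", "Predicted"), col = c("black", "red"), lty = 1)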
