The rnn package in R

To implement RNNs in an R environment, we can use the rnn package, available through CRAN and widely used for this purpose. A brief description of the rnn package, extracted from the official documentation, is shown in the following table:

rnn: Recurrent Neural Network

Description:
Implementation of an RNN in R.

Details:
  • Package: rnn
  • Type: Package
  • Version: 0.8.0
  • Date: 2016-09-11
  • License: GPL-3

Authors:
  • Bastiaan Quast
  • Dimitri Fichou

The main functions used from the rnn package are shown in this table:

predict_rnn

Predicts the output of an RNN model:

predict_rnn(model, X, hidden = FALSE, real_output = T, ...)

run.rnn_demo

Launches the rnn_demo app:

run.rnn_demo(port = NULL)

trainr

Trains an RNN model. The resulting model is used by the predictr function.

predictr

Predicts the output of an RNN model:

predictr(model, X, hidden = FALSE, real_output = T, ...)

As always, to be able to use a library, we must first install it and then load it into our script.

Remember, to install a library that is not present in the initial distribution of R, you must use the install.packages() function. This is the main function for installing packages: it takes a vector of package names and a destination library, downloads the packages from the repositories, and installs them. This function should be used only once, not every time you run the code.

So let's install and load the library:

install.packages("rnn")
library("rnn")

When we load the library (library("rnn")), we may receive the following error:

> library("rnn")
Error: package or namespace load failed for ‘rnn’ in get(Info[i, 1], envir = env):
cannot open file 'C:/Users/Giuseppe/Documents/R/win-library/3.4/digest/R/digest.rdb': No such file or directory

Do not worry, it's nothing serious! R is simply saying that, in order to run the rnn library, you also need to install the digest library. Remember this; if a similar problem occurs in the future, you will know how to solve it. Just add the following command:

install.packages("digest")

Now we can launch the demo:

run.rnn_demo()

When we run run.rnn_demo() after installing the rnn package, we can access a web page through 127.0.0.1:5876, which allows us to run a demo of an RNN with preset values and also visually see how the parameters influence an RNN, as shown in the following figure:
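The demo is a small web app; if we want it to come up on a fixed port (such as the 5876 seen in the address above), the port argument from the signature shown earlier can be used. This is a minimal sketch, assuming the argument is forwarded to the underlying app:

run.rnn_demo(port = 5876)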

At this point, we can set the parameters of our network, choosing the appropriate values to insert into the labeled boxes. The following parameters must be set correctly:

  • time dimension
  • training sample dimension
  • testing sample dimension
  • number of hidden layers
  • number of units in hidden layer 1
  • number of units in hidden layer 2
  • learningrate
  • batchsize
  • numepochs
  • momentum
  • learningrate_decay
After doing this, we just have to click on the train button, and the network will be built and trained.
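For reference, most of these demo fields correspond directly to arguments of the trainr() function described later in this section; the first three shape the data rather than the network. The following is a minimal sketch of an equivalent call, using dummy data and an assumed mapping of the demo fields to trainr() arguments:

library(rnn)
set.seed(1)
# Dummy data just to make the call runnable: 100 samples, 8 time steps, 1 variable
X <- array(runif(100 * 8), dim = c(100, 8, 1))
Y <- array(round(runif(100 * 8)), dim = c(100, 8, 1))
model <- trainr(Y = Y, X = X,
                learningrate = 0.1,        # demo field: learningrate
                learningrate_decay = 1,    # demo field: learningrate_decay
                momentum = 0,              # demo field: momentum
                hidden_dim = c(10, 10),    # units in hidden layers 1 and 2
                batch_size = 1,            # demo field: batchsize
                numepochs = 10)            # demo field: numepochs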

The following figure shows the results of the simulation:

The trainr and predictr functions are the most important functions in the rnn package. The trainr() function trains a model on the X and Y arrays; the resulting model can then be used for prediction with the predictr() function:

trainr(Y, X,
       learningrate,
       learningrate_decay = 1,
       momentum = 0,
       hidden_dim = c(10),
       network_type = "rnn",
       numepochs = 1,
       sigmoid = c("logistic", "Gompertz", "tanh"),
       use_bias = F,
       batch_size = 1,
       seq_to_seq_unsync = F,
       update_rule = "sgd",
       epoch_function = c(epoch_print, epoch_annealing),
       loss_function = loss_L1, ...)

predictr(model,
         X,
         hidden = FALSE,
         real_output = T,
         ...)   # further arguments are passed to the sigmoid function

The trainr() function takes the following parameters. The output is a model that can be used for prediction:

Y

Array of output values:

  • dim 1: Samples (must be equal to dim 1 of X)
  • dim 2: Time (must be equal to dim 2 of X)
  • dim 3: Variables (could be one or more; if it is a matrix, it will be coerced to an array)

X

Array of input values:

  • dim 1: Samples
  • dim 2: Time
  • dim 3: Variables (could be one or more; if it is a matrix, it will be coerced to an array)

learningrate

The learning rate to be applied for weight iteration.

learningrate_decay

Coefficient to apply to the learning rate at each epoch, via the epoch_annealing function.

momentum

The coefficient of the last weight iteration to keep, for faster learning.

hidden_dim

The dimensions of the hidden layers.

network_type

The type of network, which can be rnn, gru, or lstm.

numepochs

The number of iterations, that is, the number of times the whole dataset is presented to the network.

sigmoid

Method to be passed to the sigmoid function.

batch_size

Number of samples used at each weight iteration. Only one is supported for the moment.

epoch_function

Vector of functions to be applied at each epoch loop. Use it to interact with the objects inside the model list, or to print and plot at each epoch. Each function should return the model.

loss_function

Loss function, applied in each sample loop.

...

Arguments to be passed to methods, to be used in user-defined functions.
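To make epoch_function more concrete, here is a minimal sketch of a user-defined epoch function; log_epoch is a hypothetical name, not part of the package, and we only assume what the documentation states: each epoch function receives the model list and must return it:

# A hypothetical epoch function: print the mean error accumulated so far
log_epoch <- function(model) {
  cat("mean error so far:", mean(model$error, na.rm = TRUE), "\n")
  model   # epoch functions must return the model
}
# It would then be passed in the epoch_function vector, for example:
# model <- trainr(Y, X, learningrate = 0.1,
#                 epoch_function = c(log_epoch, epoch_annealing))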

Now let's look at a simple example. This example is included in the official documentation of the CRAN rnn package to demonstrate the trainr and predictr functions and to assess the accuracy of the predictions.

We have X1 and X2, containing random integers in the range 0-127. Y is initialized as X1+X2. After converting X1, X2, and Y to binary values, we use trainr to train Y based on X (an array of X1 and X2).

Using the model, we predict B based on another pair of samples, A1 and A2. The prediction errors are plotted as a histogram:

library("rnn")

#Create a set of random numbers in X1 and X2
X1=sample(0:127, 7000, replace=TRUE)
X2=sample(0:127, 7000, replace=TRUE)

#Create training response numbers
Y=X1 + X2

# Convert to binary
X1=int2bin(X1)
X2=int2bin(X2)
Y=int2bin(Y)

# Create 3d array: dim 1: samples; dim 2: time; dim 3: variables.
X=array( c(X1,X2), dim=c(dim(X1),2) )

# Train the model
model <- trainr(Y=Y[,dim(Y)[2]:1],
X=X[,dim(X)[2]:1,],
learningrate = 0.1,
hidden_dim = 10,
batch_size = 100,
numepochs = 100)

plot(colMeans(model$error),type='l',xlab='epoch',ylab='errors')

# Create test inputs
A1=int2bin(sample(0:127, 7000, replace=TRUE))
A2=int2bin(sample(0:127, 7000, replace=TRUE))

# Create 3d array: dim 1: samples; dim 2: time; dim 3: variables
A=array( c(A1,A2), dim=c(dim(A1),2) )

# Now, let us run prediction for new A
B=predictr(model,
A[,dim(A)[2]:1,] )
B=B[,dim(B)[2]:1]

# Convert back to integers
A1=bin2int(A1)
A2=bin2int(A2)
B=bin2int(B)

# Plot the differences as histogram
hist( B-(A1+A2) )

As usual, we will analyze the code line by line, explaining in detail all the features used to obtain the results:

library("rnn")

The first line of the code loads the library needed to run the analysis. Let's move on to the following commands:

X1=sample(0:127, 7000, replace=TRUE)
X2=sample(0:127, 7000, replace=TRUE)

These lines create the network inputs; these two vectors will feed the network we are about to build. We use the sample() function, which takes a sample of the specified size from the elements of the given vector, either with or without replacement. The two vectors contain 7,000 random integer values between 0 and 127.
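As a quick illustration of sample() (not part of the original script; set.seed just makes the draw repeatable):

set.seed(1)
sample(0:127, 5, replace = TRUE)   # five integers drawn uniformly from 0:127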

Y = X1 + X2

This command creates training response numbers; this is our target, or what we want to predict with the help of the network.

X1=int2bin(X1)
X2=int2bin(X2)
Y=int2bin(Y)

These three lines of code convert the integers into binary sequences. We need to transform the numbers into binary form before adding them bit by bit. Since 127+127=254 still fits into eight bits, each value becomes a sequence of eight values, each being 0 or 1. To understand the transformation, we analyze a preview of one of these variables:

> head(X1,n=10)
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
 [1,]    1    1    1    0    0    1    0    0
 [2,]    0    0    0    1    0    0    0    0
 [3,]    1    0    0    0    1    0    1    0
 [4,]    0    0    0    0    0    0    1    0
 [5,]    0    1    0    0    0    0    0    0
 [6,]    0    0    0    1    1    1    0    0
 [7,]    1    0    1    1    0    1    1    0
 [8,]    1    1    0    0    0    1    0    0
 [9,]    1    0    1    0    0    0    0    0
[10,]    0    0    0    1    0    0    0    0
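Since bin2int() is the inverse of int2bin() (the script uses it later to convert the predictions back), we can check that the encoding is lossless with a quick round trip on a few sample values (an addition to the original script):

v <- c(0, 1, 127)
m <- int2bin(v)   # a 3 x 8 binary matrix
bin2int(m)        # returns 0 1 127 again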

Let's go back to analyze the code:

X=array( c(X1,X2), dim=c(dim(X1),2) )

This code creates the 3D array required by the trainr() function. In this array, we have the following, as the dimension check after this list confirms:

  • dim 1: Samples (must be equal to dim 1 of the inputs)
  • dim 2: Time (must be equal to dim 2 of the inputs)
  • dim 3: Variables (could be one or more; if it is a matrix, it will be coerced to an array)
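A quick check of the shapes (an addition to the original script):

dim(X1)   # 7000 8    -> samples x time (8 bits)
dim(X)    # 7000 8 2  -> samples x time x variables (X1 and X2)

Let's now train the model: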
model <- trainr(Y=Y[,dim(Y)[2]:1],
X=X[,dim(X)[2]:1,],
learningrate = 0.1,
hidden_dim = 10,
batch_size = 100,
numepochs = 100)

The trainr() function trains an RNN in native R. It takes a few minutes, as training is based on X and Y. Note the column reversal (dim(Y)[2]:1 and dim(X)[2]:1): the bit columns of Y and X are reversed so that the network reads the binary digits of each number in a consistent temporal order, and the prediction output is reversed back in the same way later. The following code shows the results of the last few training epochs displayed on the R prompt:

Trained epoch: 90 - Learning rate: 0.1
Epoch error: 3.42915263914405
Trained epoch: 91 - Learning rate: 0.1
Epoch error: 3.44100549476955
Trained epoch: 92 - Learning rate: 0.1
Epoch error: 3.43627697030863
Trained epoch: 93 - Learning rate: 0.1
Epoch error: 3.43541472188254
Trained epoch: 94 - Learning rate: 0.1
Epoch error: 3.43753094787383
Trained epoch: 95 - Learning rate: 0.1
Epoch error: 3.43622412149714
Trained epoch: 96 - Learning rate: 0.1
Epoch error: 3.43604894997742
Trained epoch: 97 - Learning rate: 0.1
Epoch error: 3.4407798878595
Trained epoch: 98 - Learning rate: 0.1
Epoch error: 3.4472752590403
Trained epoch: 99 - Learning rate: 0.1
Epoch error: 3.43720125450988
Trained epoch: 100 - Learning rate: 0.1
Epoch error: 3.43542353819336

We can see the evolution of the algorithm by plotting the error it makes against subsequent epochs:

plot(colMeans(model$error),type='l',xlab='epoch',ylab='errors')

This graph shows the epoch versus error:

Now the model is ready and we can use it to test the network. But first, we need to create some test data:

A1=int2bin(sample(0:127, 7000, replace=TRUE))
A2=int2bin(sample(0:127, 7000, replace=TRUE))
A=array( c(A1,A2), dim=c(dim(A1),2) )

Now, let us run the prediction for new data:

B=predictr(model, A[,dim(A)[2]:1,] ) 
B=B[,dim(B)[2]:1]

Convert back to integers:

A1=bin2int(A1)
A2=bin2int(A2)
B=bin2int(B)

Finally, plot the differences as a histogram:

hist( B-(A1+A2) )

The histogram of errors is shown as follows:

As can be seen here, the most frequent bin is near zero, indicating that in most cases the predictions coincide with the actual values. All the other bins correspond to errors. We can therefore say that the network simulates the system with good performance.
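To put a number on this, we can also compute the fraction of exactly correct predictions (an addition to the original script; at this point B, A1, and A2 are integer vectors of the same length):

mean(B == A1 + A2)   # fraction of test sums predicted exactly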
