Comparing LSTM, GRU, Feedforward, and RNN operations

To help you see the differences in both the creation and the results of all the network objects we have been dealing with, I created the sample code that follows. This sample will allow you to compare training times for all four of the network types we have here. As stated previously, the GRU is the easiest to train and will therefore complete faster (in fewer iterations) than the other networks. When executing the code, you will see that the GRU typically achieves the optimal error rate in under 10,000 iterations, while a conventional RNN or LSTM can take 50,000 or more iterations to converge properly. If you want to measure this yourself, see the timing sketch after the sample output below.

Here is what our sample code looks like:

static void Main(string[] args)
{
    // Console here is the Colorful.Console alias; its WriteLine(string, Color)
    // overload is what produces the colored output below (standard
    // System.Console has no such overload).
    Console.WriteLine("Running GRU sample", Color.Yellow);
    Console.ReadKey();
    ExampleGRU.Run();
    Console.ReadKey();

    Console.WriteLine("Running LSTM sample", Color.Yellow);
    Console.ReadKey();
    ExampleLSTM.Run();
    Console.ReadKey();

    Console.WriteLine("Running RNN sample", Color.Yellow);
    Console.ReadKey();
    ExampleRNN.Run();
    Console.ReadKey();

    Console.WriteLine("Running Feed Forward sample", Color.Yellow);
    Console.ReadKey();
    ExampleFeedForward.Run();
    Console.ReadKey();
}

And here is the output from running the sample:

[Output 1 and Output 2: screenshots of the console output from the four sample runs]
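
If you want to put numbers on the training-speed differences yourself, you can wrap each sample in a System.Diagnostics.Stopwatch. The following is a minimal sketch, not part of the original sample; the TimeSample helper is hypothetical, and it assumes each ExampleXxx.Run() method trains to completion before returning:

static void TimeSample(string name, Action sample)
{
    // Wall-clock timing only; the trainer itself reports the
    // iteration counts discussed above as each run progresses.
    var sw = System.Diagnostics.Stopwatch.StartNew();
    sample();
    sw.Stop();
    Console.WriteLine($"{name} finished in {sw.Elapsed.TotalSeconds:F1} s");
}

You would then call, for example, TimeSample("GRU", ExampleGRU.Run) and TimeSample("LSTM", ExampleLSTM.Run) and compare the reported times.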

Now, let's look at how we create the GRU network and run the program. In the following code segment, we use our XOR dataset generator to generate random data for us. Our network has 2 inputs, 1 hidden layer with 3 neurons, and 1 output. The learning rate is set to 0.001, and the standard deviation used to initialize the network parameters is set to 0.08.
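
As a quick reminder, XOR outputs 1 only when exactly one of its two inputs is 1, which is why the network needs a hidden layer at all. Conceptually, the generator produces training pairs like the following; this is an illustrative sketch only, and the real XorDataSetGenerator wraps such patterns in the library's own DataSet types:

// The four XOR truth-table patterns, laid out here as
// { input1, input2, expected output } for illustration.
double[][] xorPatterns =
{
    new double[] { 0, 0, 0 },
    new double[] { 0, 1, 1 },
    new double[] { 1, 0, 1 },
    new double[] { 1, 1, 0 },
};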

We call our NetworkBuilder object, which is responsible for creating all of our network variants, and pass it the parameters described above. Once the network object is created, we hand it to our trainer and train the network. When training is complete, we test the network to ensure the results are satisfactory. When we create our Graph object for testing, we make sure to pass false to the constructor to tell it that we do not need backpropagation:

public class ExampleGRU
{
    public static void Run()
    {
        Random rng = new Random();
        DataSet data = new XorDataSetGenerator();

        // Network topology: 2 inputs -> 1 hidden layer of 3 neurons -> 1 output
        int inputDimension = 2;
        int hiddenDimension = 3;
        int outputDimension = 1;
        int hiddenLayers = 1;
        double learningRate = 0.001;
        double initParamsStdDev = 0.08; // std dev used to initialize the weights

        INetwork nn = NetworkBuilder.MakeGru(inputDimension,
            hiddenDimension, hiddenLayers, outputDimension, new SigmoidUnit(),
            initParamsStdDev, rng);

        int reportEveryNthEpoch = 10;
        int trainingEpochs = 10000; // GRUs typically need less training

        Trainer.train<NeuralNetwork>(trainingEpochs, learningRate, nn, data,
            reportEveryNthEpoch, rng);
        Console.WriteLine("Training Completed.", Color.Green);

        // Graph(false) disables backpropagation, since we only need
        // forward activation while testing.
        Console.WriteLine("Test: 1,1", Color.Yellow);
        Matrix input = new Matrix(new double[] { 1, 1 });
        Matrix output = nn.Activate(input, new Graph(false));
        Console.WriteLine("Test: 1,1. Output:" + output.W[0], Color.Yellow);

        Matrix input1 = new Matrix(new double[] { 0, 1 });
        Matrix output1 = nn.Activate(input1, new Graph(false));
        Console.WriteLine("Test: 0,1. Output:" + output1.W[0], Color.Yellow);

        Console.WriteLine("Complete", Color.Yellow);
    }
}
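
The LSTM, RNN, and feedforward samples follow the same pattern, and only the builder call changes. The sketch below mirrors the MakeGru call; the factory-method names MakeLstm and MakeRnn are assumptions based on that call, so verify them against your version of NetworkBuilder:

// Hypothetical builder calls for the other variants, using the same
// parameters as the GRU example; check the exact signatures in NetworkBuilder.
INetwork lstm = NetworkBuilder.MakeLstm(inputDimension,
    hiddenDimension, hiddenLayers, outputDimension, new SigmoidUnit(),
    initParamsStdDev, rng);
INetwork rnn = NetworkBuilder.MakeRnn(inputDimension,
    hiddenDimension, hiddenLayers, outputDimension, new SigmoidUnit(),
    initParamsStdDev, rng);

Because every builder returns an INetwork, the same Trainer.train call and the same Graph-based test code can be reused unchanged for each variant.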