How it works...

In step 1, after calling the graphBuilder() method, we defined a graph vertex input as follows:

builder.addInputs("trainFeatures");

By calling graphBuilder(), we construct a graph builder that produces a computation graph configuration.
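
For context, here is a minimal sketch of that pattern; the seed value and the import list are illustrative assumptions, not details taken from the recipe:

import org.deeplearning4j.nn.conf.ComputationGraphConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;

// graphBuilder() turns a NeuralNetConfiguration builder into a
// ComputationGraphConfiguration.GraphBuilder, which lets us wire
// named vertices together instead of stacking layers linearly.
ComputationGraphConfiguration.GraphBuilder builder =
    new NeuralNetConfiguration.Builder()
        .seed(12345)        // illustrative seed, not from the recipe
        .graphBuilder();

// Register the named graph vertex input that layers will refer to.
builder.addInputs("trainFeatures");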

Once the LSTM layers are added to the ComputationGraph configuration in step 3, they act as the input layers of the graph. We pass the previously mentioned graph vertex input (trainFeatures) to our LSTM layer, as follows:

builder.addLayer("L1", new LSTM.Builder()
.nIn(INPUTS)
.nOut(LSTM_LAYER_SIZE)
.forgetGateBiasInit(1)
.activation(Activation.TANH)
.build(),"trainFeatures");

The last argument, trainFeatures, refers to the graph vertex input. Here, we're specifying that the L1 layer receives its input from that vertex, making it the input layer of the graph.

The main purpose of the LSTM neural network is to capture long-term dependencies in the data. The derivative of the tanh function stays away from zero over a relatively wide input range before it saturates, which helps gradients survive backpropagation through many time steps. Hence, we use Activation.TANH as the activation function for the LSTM layer.
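
Concretely, the derivative is:

tanh'(x) = 1 - tanh²(x)

This equals 1 at x = 0 and decays to zero only as |x| grows large, whereas the sigmoid's derivative never exceeds 0.25, so gradients backpropagated through many time steps vanish more slowly with tanh.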

The forgetGateBiasInit() method sets the forget gate bias initialization. Values in the range of 1 to 5 can potentially help with the learning of long-term dependencies.

We use the Builder pattern to define the LSTM layers along with the required attributes, such as nIn and nOut. These are the input and output neuron counts, as we saw in Chapter 3, Building Deep Neural Networks for Binary Classification, and Chapter 4, Building Convolutional Neural Networks. We add LSTM layers using the addLayer() method.
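
To complete the picture, the following is a sketch of how such a graph is typically finished and instantiated; the output layer name, loss function, and NUM_CLASSES constant are assumptions for illustration rather than details from the recipe:

import org.deeplearning4j.nn.conf.layers.RnnOutputLayer;
import org.deeplearning4j.nn.graph.ComputationGraph;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.lossfunctions.LossFunctions;

// Output layer fed by the LSTM layer "L1"; its nIn must match the LSTM's nOut.
builder.addLayer("predictSequence", new RnnOutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
    .activation(Activation.SOFTMAX)
    .nIn(LSTM_LAYER_SIZE)
    .nOut(NUM_CLASSES)      // hypothetical class count
    .build(), "L1");

// Declare which vertex produces the graph's output, then build and initialize.
builder.setOutputs("predictSequence");
ComputationGraph model = new ComputationGraph(builder.build());
model.init();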
