Bidirectional RNN

In a bidirectional RNN, we have two different layers of hidden units. Both of these layers connect from the input layer to the output layer. In one layer, the hidden states are shared from left to right, and in the other layer, they are shared from right to left.

But what does this mean? To put it simply, one hidden layer moves forward through time from the start of the sequence, while the other hidden layer moves backward through time from the end of the sequence.

As shown in the following diagram, we have two hidden layers: a forward hidden layer and a backward hidden layer, which are described as follows:

  • In the forward hidden layer, hidden state values are shared from past time steps, that is, $\overrightarrow{h}_1$ is shared to $\overrightarrow{h}_2$, $\overrightarrow{h}_2$ is shared to $\overrightarrow{h}_3$, and so on
  • In the backward hidden layer, hidden state values are shared from future time steps, that is, $\overleftarrow{h}_3$ is shared to $\overleftarrow{h}_2$, $\overleftarrow{h}_2$ is shared to $\overleftarrow{h}_1$, and so on

The forward hidden layer and the backward hidden layer are represented as shown in the following diagram:

What is the use of bidirectional RNNs? In certain cases, reading the input sequence from both sides is very useful. So, a bidirectional RNN consists of two RNNs, one reading the sentence forward and the other reading the sentence backward.

For instance, consider the following sentence:

Archie lived for 13 years in _____. So he is good at speaking Chinese.

If we use a standard unidirectional RNN to predict the blank in the preceding sentence, the prediction would be ambiguous. As we know, an RNN can make predictions based only on the set of words it has seen so far. In the preceding sentence, to predict the blank, the RNN has seen only the words Archie, lived, for, 13, years, and in, but these words alone do not provide much context and do not give any clarity to predict the correct word. It just says Archie lived for 13 years in. With this information alone, we cannot predict the next word correctly.

But if we read the words following the blank as well, which are So, he, is, good, at, speaking, and Chinese, then we can say that Archie lived for 13 years in China, since it is given that he is good at speaking Chinese. So, in this circumstance, if we use a bidirectional RNN to predict the blank, it will predict correctly, since it reads the sentence in both forward and backward directions before making predictions.

Bidirectional RNNs have been used in various applications, such as part-of-speech (POS) tagging, in which it is vital to know the word before and after the target word, language translation, predicting protein structure, dependency parsing, and more. However, a bidirectional RNN is not suitable for online settings where we don't know the future.

The forward propagation steps in bidirectional RNNs are given as follows:

  • Forward hidden layer: $\overrightarrow{h}_t = \tanh\left(\overrightarrow{U} x_t + \overrightarrow{W} \overrightarrow{h}_{t-1}\right)$

  • Backward hidden layer: $\overleftarrow{h}_t = \tanh\left(\overleftarrow{U} x_t + \overleftarrow{W} \overleftarrow{h}_{t+1}\right)$

  • Output: $\hat{y}_t = \mathrm{softmax}\left(V\,[\overrightarrow{h}_t ; \overleftarrow{h}_t]\right)$
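
To make these equations concrete, the following NumPy sketch computes the forward hidden states, the backward hidden states, and the combined output for a short sequence. The dimensions and weight names (U_f, W_f, U_b, W_b, V) are illustrative assumptions, not values taken from the text:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Illustrative dimensions (assumptions, not from the text)
input_dim, hidden_dim, output_dim, T = 4, 3, 2, 5
rng = np.random.default_rng(0)

x = rng.standard_normal((T, input_dim))  # input sequence of length T

# Separate weights for the forward and backward hidden layers
U_f = rng.standard_normal((hidden_dim, input_dim))
W_f = rng.standard_normal((hidden_dim, hidden_dim))
U_b = rng.standard_normal((hidden_dim, input_dim))
W_b = rng.standard_normal((hidden_dim, hidden_dim))
V = rng.standard_normal((output_dim, 2 * hidden_dim))

# Forward hidden layer: scan the sequence from left to right
h_fwd = np.zeros((T, hidden_dim))
prev = np.zeros(hidden_dim)
for t in range(T):
    prev = np.tanh(U_f @ x[t] + W_f @ prev)
    h_fwd[t] = prev

# Backward hidden layer: scan the sequence from right to left
h_bwd = np.zeros((T, hidden_dim))
nxt = np.zeros(hidden_dim)
for t in reversed(range(T)):
    nxt = np.tanh(U_b @ x[t] + W_b @ nxt)
    h_bwd[t] = nxt

# Output at each time step uses both the forward and backward hidden states
y_hat = np.array([softmax(V @ np.concatenate([h_fwd[t], h_bwd[t]]))
                  for t in range(T)])
print(y_hat.shape)  # (5, 2)
```

Note that the output at time step t depends on the entire sequence: the forward state summarizes everything up to t, and the backward state summarizes everything after t.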

Implementing a bidirectional RNN is simple with TensorFlow. Assuming we use the LSTM cell in the bidirectional RNN, we can do the following:

  1. Import rnn from TensorFlow contrib as shown:

from tensorflow.contrib import rnn

  2. Define the forward and backward hidden layers:

forward_hidden_layer = rnn.BasicLSTMCell(num_hidden, forget_bias=1.0)

backward_hidden_layer = rnn.BasicLSTMCell(num_hidden, forget_bias=1.0)

  3. Define the bidirectional RNN with rnn.static_bidirectional_rnn:

outputs, forward_states, backward_states = rnn.static_bidirectional_rnn(forward_hidden_layer, backward_hidden_layer, input)
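
Putting these steps together, a minimal end-to-end sketch looks like the following. It assumes TensorFlow 1.x (where tensorflow.contrib is available); the sequence shape, num_hidden, and the final dense layer are illustrative assumptions rather than values from the text:

```python
import tensorflow as tf
from tensorflow.contrib import rnn

# Illustrative hyperparameters (assumptions, not from the text)
time_steps, input_dim, num_hidden, num_classes = 28, 28, 128, 10

# Placeholder for a batch of sequences: [batch_size, time_steps, input_dim]
x = tf.placeholder(tf.float32, [None, time_steps, input_dim])

# static_bidirectional_rnn expects a list of [batch_size, input_dim] tensors,
# one per time step
inputs = tf.unstack(x, time_steps, axis=1)

# Forward and backward LSTM cells
forward_hidden_layer = rnn.BasicLSTMCell(num_hidden, forget_bias=1.0)
backward_hidden_layer = rnn.BasicLSTMCell(num_hidden, forget_bias=1.0)

# Each element of outputs concatenates the forward and backward hidden states,
# so its size is 2 * num_hidden
outputs, forward_states, backward_states = rnn.static_bidirectional_rnn(
    forward_hidden_layer, backward_hidden_layer, inputs, dtype=tf.float32)

# Example: use the output of the last time step for classification
logits = tf.layers.dense(outputs[-1], num_classes)
```

Because each output already contains both directions' hidden states, any layer stacked on top of it sees context from the whole sequence, which is exactly what the blank-filling example above requires.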