Gated Recurrent Units

Gated Recurrent Units (GRUs) are related to LSTMs: both use gating mechanisms to mitigate the vanishing gradient problem and to retain long-term information. A GRU has two gates: a reset gate, r, and an update gate, z. The reset gate determines how to combine the new input with the previous hidden state, ht-1, and the update gate defines how much of the previous state information to keep. If we set the reset gate to all ones and the update gate to all zeros, we arrive back at a simple RNN model.
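
To make the gating concrete, here is a minimal NumPy sketch of a single GRU step. The parameter names (W_z, U_z, b_z, and so on) and the dictionary layout are illustrative assumptions, not code from this chapter:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x_t, h_prev, params):
    # One GRU time step; params maps "z", "r", "h" to (W, U, b) triples
    W_z, U_z, b_z = params["z"]   # update gate weights
    W_r, U_r, b_r = params["r"]   # reset gate weights
    W_h, U_h, b_h = params["h"]   # candidate state weights

    z = sigmoid(W_z @ x_t + U_z @ h_prev + b_z)               # how much of the old state to keep
    r = sigmoid(W_r @ x_t + U_r @ h_prev + b_r)               # how to mix the input with the old state
    h_tilde = np.tanh(W_h @ x_t + U_h @ (r * h_prev) + b_h)   # candidate hidden state
    return z * h_prev + (1.0 - z) * h_tilde                   # new hidden state h_t

# Tiny usage example with random parameters
n_in, n_hid = 4, 3
rng = np.random.default_rng(0)
params = {k: (rng.normal(size=(n_hid, n_in)),
              rng.normal(size=(n_hid, n_hid)),
              np.zeros(n_hid)) for k in ("z", "r", "h")}
h = np.zeros(n_hid)
for x in rng.normal(size=(5, n_in)):   # unroll over 5 time steps
    h = gru_cell(x, h, params)
print(h)

If r is forced to all ones and z to all zeros, the update collapses to tanh(W_h x_t + U_h h_prev + b_h), which is exactly the simple RNN step mentioned above.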

GRUs are relatively new, and their performance is on a par with that of LSTMs, but their simpler structure and fewer parameters make them computationally more efficient (a rough parameter count is sketched after the following list). Here are a few structural differences between LSTMs and GRUs:

  • A GRU has two gates, while an LSTM has three gates. GRUs don't have the output gate that is present in LSTMs.
  • GRUs don't have an additional internal memory cell, Ct, separate from the hidden state.
  • In GRUs, a second non-linearity (tanh) is not applied when computing the output.
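
As a back-of-the-envelope check on the fewer-parameters claim, the following sketch counts the weights per recurrent layer under the standard formulation (one input matrix, one recurrent matrix, and one bias vector per gate or candidate); exact counts vary slightly between implementations:

# each gate/candidate block has an input matrix, a recurrent matrix, and a bias
def rnn_params(n_in, n_hid, n_blocks):
    return n_blocks * (n_in * n_hid + n_hid * n_hid + n_hid)

n_in, n_hid = 64, 128
print("GRU :", rnn_params(n_in, n_hid, 3))   # update, reset, candidate -> 74112
print("LSTM:", rnn_params(n_in, n_hid, 4))   # input, forget, output, candidate -> 98816

For the same hidden size, the GRU layer therefore carries roughly three quarters of the LSTM's parameters.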

If enough data is available, using LSTMs is advisable, as their greater expressive power may lead to better results.
