Stacked LSTMs

"Building a deep RNN by stacking multiple recurrent hidden states on top of each other. This approach potentially allows the hidden state at each level to operate at different timescale" How to Construct Deep Recurrent Neural Networks (link: https://arxiv.org/abs/1312.6026), 2013
"RNNs are inherently deep in time, since their hidden state is a function of all previous hidden states. The question that inspired this paper was whether RNNs could also benefit from depth in space; that is from stacking multiple recurrent hidden layers on top of each other, just as feedforward layers are stacked in conventional deep networks". Speech Recognition With Deep RNNs (link: https://arxiv.org/abs/1303.5778), 2013

Stacked LSTMs have become a common choice for challenging sequence prediction problems. A stacked LSTM architecture is simply an LSTM model composed of multiple LSTM layers, where each LSTM layer (except the last) passes a sequence of outputs, rather than a single output value, to the LSTM layer above it.

Specifically, each layer emits one output per input time step, rather than a single output for the entire input sequence:

Figure 12: Stacked LSTMs

In this example, we will use this kind of stacked LSTM architecture, since the additional depth generally increases the model's capacity to learn complex sequence patterns.
