The LSTM has a structure similar to that of an RNN; however, the basic cell is very different: a traditional RNN cell uses a single multi-layer perceptron (MLP), whereas a single LSTM cell contains four layers that interact with each other. Three of these layers are gates:
- forget gate
- input gate
- output gate
The forget gate in an LSTM decides which information to throw away from the cell state. It depends on the previous hidden state, h_{t-1}, and on X_t, the input at time t.
In the earlier figure, C_t represents the cell state at time t, X_t is the input, and h_{t-1} is the previous hidden state. The forget gate layer can be formulated as:

f_t = σ(W_f · [h_{t-1}, X_t] + b_f)
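The forget gate computation, f_t = σ(W_f · [h_{t-1}, X_t] + b_f), can be sketched in NumPy as follows; the parameter names W_f and b_f and the toy values are illustrative, not taken from a specific library:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative sizes: hidden state of size 2, input of size 1.
h_prev = np.array([0.1, -0.2])   # h_{t-1}
x_t = np.array([0.5])            # X_t
W_f = np.full((2, 3), 0.1)       # weights for the concatenated [h_{t-1}, X_t]
b_f = np.zeros(2)

# f_t: each entry lies strictly between 0 and 1, acting as a per-unit
# "keep fraction" for the old cell state.
f_t = sigmoid(W_f @ np.concatenate([h_prev, x_t]) + b_f)
```

Because the sigmoid squashes its input into (0, 1), each component of f_t says how much of the corresponding component of the old cell state to keep (1) or throw away (0).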
The input gate decides which values to update and computes the candidate values for the memory cell, which are then used to update the cell state, as shown in the following figure:
- The input gate activation i_t at time t is computed as:

  i_t = σ(W_i · [h_{t-1}, X_t] + b_i)

- The candidate values for the cell state are computed as:

  C̃_t = tanh(W_C · [h_{t-1}, X_t] + b_C)
The candidate values C̃_t, together with the outputs of the forget and input gates, are then used to update the cell state at time t:

C_t = f_t ⊙ C_{t-1} + i_t ⊙ C̃_t
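The cell-state update C_t = f_t ⊙ C_{t-1} + i_t ⊙ C̃_t is an element-wise blend of the old state and the candidate values. A minimal numeric sketch, with illustrative gate values:

```python
import numpy as np

# Illustrative gate outputs and states for a cell of size 2.
f_t = np.array([0.9, 0.1])      # forget gate: keep 90% / 10% of old state
i_t = np.array([0.5, 0.5])      # input gate: admit half of the candidates
c_hat = np.array([1.0, -1.0])   # candidate values C~_t
c_prev = np.array([2.0, 2.0])   # previous cell state C_{t-1}

# Element-wise update: C_t = f_t * C_{t-1} + i_t * C~_t
c_t = f_t * c_prev + i_t * c_hat   # → [2.3, -0.3]
```

The first unit keeps most of its old state and adds new information; the second mostly discards its old state and is pulled toward the (negative) candidate value.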
The output gate, as shown in the following figure, computes the output from the LSTM cell based on the input X_t, the previous hidden state h_{t-1}, and the current cell state C_t:

o_t = σ(W_o · [h_{t-1}, X_t] + b_o)
The output of the cell, which is also the new hidden state, can then be computed from the output gate as follows:

h_t = o_t ⊙ tanh(C_t)
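Putting the four layers together, one forward step of an LSTM cell can be sketched in NumPy as follows. This is a minimal illustration of the equations above, not a specific library's implementation; the parameter names (W_f, b_f, and so on) are assumptions for the example:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell_step(x_t, h_prev, c_prev, params):
    """One forward step of an LSTM cell.

    Concatenates the previous hidden state and the current input,
    then applies the forget gate, input gate, candidate layer, and
    output gate described in the text.
    """
    z = np.concatenate([h_prev, x_t])                   # [h_{t-1}, X_t]
    f_t = sigmoid(params["W_f"] @ z + params["b_f"])    # forget gate
    i_t = sigmoid(params["W_i"] @ z + params["b_i"])    # input gate
    c_hat = np.tanh(params["W_c"] @ z + params["b_c"])  # candidate values
    c_t = f_t * c_prev + i_t * c_hat                    # new cell state
    o_t = sigmoid(params["W_o"] @ z + params["b_o"])    # output gate
    h_t = o_t * np.tanh(c_t)                            # new hidden state
    return h_t, c_t

# Illustrative usage with random weights: input size 3, hidden size 4.
rng = np.random.default_rng(0)
n_in, n_h = 3, 4
params = {k: rng.standard_normal((n_h, n_h + n_in))
          for k in ("W_f", "W_i", "W_c", "W_o")}
params.update({k: np.zeros(n_h) for k in ("b_f", "b_i", "b_c", "b_o")})

h_t, c_t = lstm_cell_step(rng.standard_normal(n_in),
                          np.zeros(n_h), np.zeros(n_h), params)
```

Note that because h_t = o_t ⊙ tanh(C_t), with o_t in (0, 1) and tanh bounded by 1, every component of the hidden state stays strictly inside (-1, 1); the cell state C_t itself is unbounded.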