LSTM variants

There are many variants of the LSTM network. Some of these variants include:

  • Gated recurrent neural network
  • LSTM4
  • LSTM4a
  • LSTM5
  • LSTM5a
  • LSTM6

Figure: Training and test accuracy, σ = relu, η = 1e-4

One of these variants, a somewhat larger departure from the standard LSTM, is the gated recurrent unit (GRU), also referred to as the gated recurrent neural network (GRNN). It combines the forget and input gates of the LSTM into a single gate called the update gate, and it also merges the cell state with the hidden state. This makes it simpler than the standard LSTM, and it has been growing in popularity.
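To make the update gate concrete, here is a minimal NumPy sketch of a single GRU step, following one commonly used formulation. The function names (gru_step, init_gru_params), the parameter names (Wz, Uz, bz, and so on), and the sizes are illustrative assumptions, not taken from any particular library. Some references swap the roles of z and 1 - z; the idea is the same either way.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, params):
    """One GRU time step (a sketch; parameter names are hypothetical)."""
    # Update gate: plays the combined role of the LSTM's forget and input gates.
    z = sigmoid(params["Wz"] @ x + params["Uz"] @ h_prev + params["bz"])
    # Reset gate: controls how much of the old state feeds the candidate.
    r = sigmoid(params["Wr"] @ x + params["Ur"] @ h_prev + params["br"])
    # Candidate state built from the input and the reset-scaled old state.
    h_cand = np.tanh(params["Wh"] @ x + params["Uh"] @ (r * h_prev) + params["bh"])
    # Keep (1 - z) of the old state and admit z of the candidate.
    return (1.0 - z) * h_prev + z * h_cand

def init_gru_params(input_size, hidden_size, seed=0):
    """Small random weights and zero biases for the three GRU blocks."""
    rng = np.random.default_rng(seed)
    params = {}
    for gate in "zrh":
        params["W" + gate] = rng.normal(0.0, 0.1, (hidden_size, input_size))
        params["U" + gate] = rng.normal(0.0, 0.1, (hidden_size, hidden_size))
        params["b" + gate] = np.zeros(hidden_size)
    return params

# Run the cell over a short random sequence of 5 inputs of size 3.
params = init_gru_params(input_size=3, hidden_size=4)
h = np.zeros(4)
for x in np.random.default_rng(1).normal(size=(5, 3)):
    h = gru_step(x, h, params)
print(h.shape)  # (4,)
```

Because a single value z sets both how much old state is kept and how much new information is admitted, the GRU needs fewer parameters than an LSTM with the same hidden size.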

Here is what an LSTM looks like:

LSTM

As you can see, the LSTM has several memory gates that a plain RNN does not have. These gates allow it to retain both long- and short-term memory. So, if we want to understand text and need context from earlier in the sequence (or, with a bidirectional LSTM, from later in it), the LSTM is made for just such a scenario. Let’s talk about the different gates for a moment. As we mentioned, there are three of them: the forget gate, the input gate, and the output gate. Let’s use the following passage to explain how each of them works.

Bob lives in New York City. John talks to people on the phone all day, and commutes on the train.

Forget Gate: When we reach the period after the word City, the forget gate recognizes that the context may be about to change. As a result, the subject Bob is forgotten and the slot that held the subject is now empty. When the next sentence turns to John, John becomes the subject. Clearing out the old subject in this way is the job of the forget gate.

Input Gate: The important facts are that Bob lives in New York City, and that John talks to people all day and commutes on the train. The detail that he talks to them over the phone is less important and can be ignored. Adding new information to the cell state is the job of the input gate.

Output Gate: Suppose we have the sentences “Bob was a great man. We salute ____.” The second sentence has an empty space with many possible fillers. What we do know is that we are going to salute whatever fills this space, and that salute is a verb whose object is a noun. Therefore, we can safely assume that the empty space will be filled with a noun, and a good candidate is Bob. Selecting which information in the current cell state is useful and presenting it as the output is the job of the output gate.
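The three gates described above can be sketched as a single LSTM time step in NumPy. This is a minimal illustration of the standard formulation, not a production implementation: the function name lstm_step, the stacked weight layout (forget, input, output, candidate), and the sizes are assumptions made for this example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step (a sketch; the weight layout is hypothetical).

    W has shape (4 * hidden, hidden + input) and b has shape (4 * hidden,).
    The four row blocks are: forget, input, output, candidate.
    """
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x]) + b
    f = sigmoid(z[:hidden])                # forget gate: what to erase from the cell
    i = sigmoid(z[hidden:2 * hidden])      # input gate: what new information to store
    o = sigmoid(z[2 * hidden:3 * hidden])  # output gate: what part of the cell to expose
    g = np.tanh(z[3 * hidden:])            # candidate values that could be stored
    c = f * c_prev + i * g                 # erase old content, then add new content
    h = o * np.tanh(c)                     # output a filtered view of the cell state
    return h, c, (f, i, o)

# Run the cell over a short random sequence of 5 inputs of size 3.
input_size, hidden_size = 3, 4
rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.1, (4 * hidden_size, hidden_size + input_size))
b = np.zeros(4 * hidden_size)
h, c = np.zeros(hidden_size), np.zeros(hidden_size)
for x in rng.normal(size=(5, input_size)):
    h, c, gates = lstm_step(x, h, c, W, b)
print(h.shape, c.shape)  # (4,) (4,)
```

In terms of the example passage, a forget gate value near 0 for the "subject" slot erases Bob, an input gate value near 1 writes John into it, and the output gate decides whether that slot is exposed when the next word has to be predicted.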
