The problem of long-term dependencies

Another challenging problem faced by researchers is the long-term dependencies found in text. For example, if someone feeds in a sequence like I used to live in France and I learned how to speak..., the next obvious word in the sequence is French.

Vanilla RNNs can handle this kind of situation because it involves only short-term dependencies, as shown in Figure 6:

Figure 6: Showing short-term dependencies in the text (source: http://colah.github.io/posts/2015-08-Understanding-LSTMs/)

In another example, someone might start the sequence with I used to live in France..., then go on to describe the beauty of living there, and finally end the sequence with I learned to speak French. For the model to predict the language learned at the end of the sequence, it needs to retain information about the early words live and France. The model won't be able to handle this kind of situation if it doesn't manage to keep track of long-term dependencies in the text:

Figure 7: The challenge of long-term dependencies in text (source: http://colah.github.io/posts/2015-08-Understanding-LSTMs/)

To handle vanishing gradients and long-term dependencies in text, researchers introduced a variation of the vanilla RNN architecture called Long Short-Term Memory (LSTM) networks.
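The following is a minimal sketch, not from the original text, showing how an LSTM layer can stand in for a vanilla RNN layer in a next-word prediction model using Keras; the vocabulary size, embedding dimension, and number of hidden units are illustrative assumptions:

import tensorflow as tf

vocab_size = 10000   # assumed vocabulary size
embed_dim = 64       # assumed embedding dimension
hidden_units = 128   # assumed number of recurrent units

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embed_dim),
    # tf.keras.layers.SimpleRNN(hidden_units),  # vanilla RNN: struggles with long-term dependencies
    tf.keras.layers.LSTM(hidden_units),         # LSTM: designed to carry information over long spans
    tf.keras.layers.Dense(vocab_size, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

Swapping SimpleRNN for LSTM keeps the rest of the model unchanged; the difference lies in the recurrent cell, which we cover in the next section.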
