Understanding the skip-thoughts algorithm

Skip-thoughts is a popular unsupervised learning algorithm for learning sentence embeddings. We can see skip-thoughts as an analogue of the skip-gram model. We learned that in the skip-gram model, we try to predict the context words given a target word, whereas in skip-thoughts, we try to predict the context sentences given a target sentence. In other words, skip-gram is used for learning word-level vectors, and skip-thoughts is used for learning sentence-level vectors.

The skip-thoughts algorithm is very simple: it consists of an encoder-decoder architecture. The role of the encoder is to map a sentence to a vector, and the role of the decoders is to generate the surrounding sentences, that is, the previous and next sentences of the given input sentence. As shown in the following diagram, the skip-thoughts model consists of one encoder and two decoders, called the previous decoder and the next decoder:

The workings of the encoder and decoders are discussed next:

  • Encoder: The encoder takes the words of a sentence sequentially and generates the sentence embedding. Let's say we have a list of sentences, $s_1, s_2, \dots, s_n$. $w_i^t$ denotes the $t^{th}$ word in the sentence $s_i$, and $x_i^t$ denotes its word embedding. So, the final hidden state of the encoder, $h_i$, is interpreted as the representation of the sentence $s_i$.
  • Decoder: There are two decoders, called the previous decoder and the next decoder. As the names suggest, the previous decoder is used to generate the previous sentence, and the next decoder is used to generate the next sentence. Let's say we have a sentence $s_i$ and its embedding is $h_i$. Both decoders take the embedding $h_i$ as input; the previous decoder tries to generate the previous sentence, $s_{i-1}$, and the next decoder tries to generate the next sentence, $s_{i+1}$, as shown in the sketch after this list.
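To make this architecture concrete, here is a minimal PyTorch sketch of a skip-thoughts model. The use of GRUs follows the original skip-thoughts design, but the layer sizes and all names (`SkipThoughts`, `vocab_size`, `embed_dim`, and so on) are illustrative assumptions, not a reference implementation:

```python
import torch
import torch.nn as nn

class SkipThoughts(nn.Module):
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Encoder: reads the words of sentence s_i and produces h_i
        self.encoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        # Two decoders: one reconstructs s_{i-1}, the other s_{i+1}
        self.prev_decoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.next_decoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)  # hidden states -> word logits

    def encode(self, sentence):
        # sentence: (batch, seq_len) word IDs -> h: (1, batch, hidden_dim)
        _, h = self.encoder(self.embed(sentence))
        return h  # the final hidden state is the sentence embedding h_i

    def forward(self, sentence, prev_sentence, next_sentence):
        h = self.encode(sentence)
        # Both decoders are conditioned on h_i as their initial hidden state and
        # predict the words of the previous/next sentence (teacher forcing: the
        # decoder input is the target sentence without its last word).
        prev_out, _ = self.prev_decoder(self.embed(prev_sentence[:, :-1]), h)
        next_out, _ = self.next_decoder(self.embed(next_sentence[:, :-1]), h)
        return self.out(prev_out), self.out(next_out)
```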

So, we train our model by minimizing the reconstruction error of both the previous and next decoders: when the decoders reconstruct/generate the correct previous and next sentences, it means that we have a meaningful sentence embedding $h_i$. The reconstruction error is backpropagated to the encoder, so that the encoder can optimize the embedding and send a better representation to the decoders. Once we have trained our model, we use the encoder alone to generate the embedding for a new sentence. A sketch of this training step follows.
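Continuing the sketch above, this hedged example shows the training objective: the sum of the two decoders' cross-entropy (reconstruction) losses is backpropagated through the encoder, so the embedding improves as the decoders improve. The hyperparameters and helper names are again assumptions for illustration:

```python
import torch
import torch.nn.functional as F

model = SkipThoughts(vocab_size=10000)  # class defined in the sketch above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def train_step(sentence, prev_sentence, next_sentence):
    prev_logits, next_logits = model(sentence, prev_sentence, next_sentence)
    # The targets are the decoder inputs shifted left by one word
    loss = (
        F.cross_entropy(prev_logits.reshape(-1, prev_logits.size(-1)),
                        prev_sentence[:, 1:].reshape(-1))
        + F.cross_entropy(next_logits.reshape(-1, next_logits.size(-1)),
                          next_sentence[:, 1:].reshape(-1))
    )
    optimizer.zero_grad()
    loss.backward()  # the error flows back through both decoders into the encoder
    optimizer.step()
    return loss.item()

# After training, only the encoder is needed to embed a new sentence
with torch.no_grad():
    new_sentence = torch.randint(0, 10000, (1, 12))  # dummy word IDs
    embedding = model.encode(new_sentence).squeeze(0)  # shape: (1, hidden_dim)
```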
