So, last time, when we created our word2vec model, we saved it to a binary file. Now it's time to use that model as part of our CNN model. We do this by initializing the weights W of the embedding layer to the pretrained vectors.
Since our previous word2vec model was trained on a very small corpus, let's instead use word vectors pre-trained on a huge corpus. A good option is FastText, which provides embeddings trained on publicly available documents for 294 languages (https://github.com/facebookresearch/fastText/blob/master/pretrained-vectors.md).
- Download the English embeddings (https://s3-us-west-1.amazonaws.com/fasttext-vectors/wiki.en.zip).
- Extract the vocabulary and the embedding vectors into separate files.
- Load them in train.py.
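The extraction step above can be sketched as follows. This is a minimal, hypothetical helper (not part of the official FastText tooling) that parses the `.vec` text format, whose first line holds the vocabulary size and dimensionality, followed by one word and its vector per line:

```python
import numpy as np

def load_fasttext_vec(lines):
    """Parse FastText's .vec text format into a vocab list and a
    (vocab_size, dim) matrix of float32 vectors."""
    header = lines[0].split()
    vocab_size, dim = int(header[0]), int(header[1])
    vocab, vectors = [], []
    for line in lines[1:]:
        parts = line.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        if len(values) != dim:  # skip malformed rows
            continue
        vocab.append(word)
        vectors.append(np.asarray(values, dtype=np.float32))
    return vocab, np.stack(vectors)

# Hypothetical usage on the extracted wiki.en.vec file:
# with open("wiki.en.vec", encoding="utf-8") as f:
#     vocab, vectors = load_fasttext_vec(f.readlines())
# The vocab and the vectors can then be written to separate files,
# e.g. with np.save for the matrix and a plain text file for the words.
```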
That's it. By introducing this step, we can now feed the embedding layer with pretrained word vectors. This extra information gives the CNN a much richer starting representation than random initialization, which improves its learning.
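To make the initialization concrete, here is one common way to build the weight matrix W for the embedding layer. The helper name, the uniform fallback range for out-of-vocabulary words, and the Keras snippet in the trailing comment are illustrative assumptions, not taken from the original code:

```python
import numpy as np

def build_embedding_matrix(model_vocab, pretrained, dim, seed=0):
    """Build the embedding weight matrix W: rows for words found in
    the pretrained vectors are copied over; unseen words fall back
    to a small random vector (a common heuristic)."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-0.25, 0.25, size=(len(model_vocab), dim)).astype(np.float32)
    for i, word in enumerate(model_vocab):
        if word in pretrained:
            W[i] = pretrained[word]
    return W

# W can then initialize the embedding layer in train.py,
# e.g. (hypothetical Keras sketch):
# embedding = tf.keras.layers.Embedding(
#     input_dim=W.shape[0], output_dim=W.shape[1],
#     embeddings_initializer=tf.keras.initializers.Constant(W))
```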