Using Word2Vec for sentence classification using CNNs

Neural networks operate on numerical inputs, so we cannot feed raw text into them directly. Word2Vec bridges this gap by converting text into dense vectors, which lets us use text data with neural networks. We will use the pretrained Google News vector model as our word representation and train a CNN on top of it. By the end of this recipe, we will have an IMDB review classifier that labels reviews as positive or negative. As reported in the paper at https://arxiv.org/abs/1408.5882, combining a pretrained Word2Vec model with a CNN gives better results than learning the word vectors from scratch.
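As a hint of where we are heading, the following minimal sketch loads the pretrained vectors and looks up a single word. It assumes the DL4J WordVectorSerializer API and a local copy of GoogleNews-vectors-negative300.bin.gz; the file path is a placeholder, and the recipe's actual loading code may differ.

```java
import java.io.File;

import org.deeplearning4j.models.embeddings.loader.WordVectorSerializer;
import org.deeplearning4j.models.embeddings.wordvectors.WordVectors;
import org.nd4j.linalg.api.ndarray.INDArray;

public class GoogleNewsVectorsDemo {
    public static void main(String[] args) throws Exception {
        // Placeholder path to the pretrained Google News model -- adjust for your machine
        File googleNewsModel = new File("/data/GoogleNews-vectors-negative300.bin.gz");

        // loadStaticModel reads the vectors in a read-only, memory-friendly form,
        // which matters because the vocabulary holds roughly 3 million words
        WordVectors wordVectors = WordVectorSerializer.loadStaticModel(googleNewsModel);

        // Every in-vocabulary token maps to a 300-dimensional vector; a sentence becomes
        // a (tokens x 300) matrix that the CNN treats like a single-channel image
        if (wordVectors.hasWord("movie")) {
            INDArray vector = wordVectors.getWordVectorMatrix("movie");
            System.out.println("Dimensions for 'movie': " + vector.length()); // prints 300
        }
    }
}
```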

We will employ a custom CNN architecture together with the pretrained word vector model, as suggested by Yoon Kim in his 2014 publication, https://arxiv.org/abs/1408.5882. The architecture is slightly more advanced than a standard CNN: several convolution layers with different filter widths run in parallel over the sentence matrix, and their outputs are merged before pooling, as sketched below. We will also be working with two large datasets (the Google News vectors and the IMDB reviews), so the application needs a fair amount of RAM, and it is worth benchmarking memory and runtime up front to keep the training duration reasonable and to avoid OutOfMemory errors.
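To make "slightly more advanced" concrete, the paper uses convolution branches with window widths of 3, 4, and 5 words, 100 feature maps each, merged and max-pooled before the classifier. The sketch below shows what such a graph could look like using DL4J's ComputationGraph API; the choice of library, the sentence length, and the learning rate are illustrative assumptions here, not the recipe's final settings.

```java
import org.deeplearning4j.nn.conf.ComputationGraphConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.graph.MergeVertex;
import org.deeplearning4j.nn.conf.inputs.InputType;
import org.deeplearning4j.nn.conf.layers.ConvolutionLayer;
import org.deeplearning4j.nn.conf.layers.GlobalPoolingLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.conf.layers.PoolingType;
import org.deeplearning4j.nn.graph.ComputationGraph;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.learning.config.Adam;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class KimStyleCnnSketch {
    public static void main(String[] args) {
        int vectorSize = 300;         // Google News vectors are 300-dimensional
        int maxSentenceLength = 256;  // assumed truncation/padding length for reviews
        int featureMaps = 100;        // feature maps per filter width, as in the paper

        ComputationGraphConfiguration config = new NeuralNetConfiguration.Builder()
                .updater(new Adam(1e-3))
                .graphBuilder()
                .addInputs("input")
                // Three parallel convolution branches with word-window widths 3, 4, and 5;
                // each filter spans the full 300-dimensional word vector
                .addLayer("cnn3", new ConvolutionLayer.Builder()
                        .kernelSize(3, vectorSize).stride(1, vectorSize)
                        .nOut(featureMaps).activation(Activation.RELU).build(), "input")
                .addLayer("cnn4", new ConvolutionLayer.Builder()
                        .kernelSize(4, vectorSize).stride(1, vectorSize)
                        .nOut(featureMaps).activation(Activation.RELU).build(), "input")
                .addLayer("cnn5", new ConvolutionLayer.Builder()
                        .kernelSize(5, vectorSize).stride(1, vectorSize)
                        .nOut(featureMaps).activation(Activation.RELU).build(), "input")
                // Concatenate the branches, then max-pool over the sentence dimension
                .addVertex("merge", new MergeVertex(), "cnn3", "cnn4", "cnn5")
                .addLayer("globalPool", new GlobalPoolingLayer.Builder()
                        .poolingType(PoolingType.MAX).build(), "merge")
                // Two classes: positive and negative review
                .addLayer("out", new OutputLayer.Builder()
                        .lossFunction(LossFunctions.LossFunction.MCXENT)
                        .activation(Activation.SOFTMAX).nOut(2).build(), "globalPool")
                .setOutputs("out")
                .setInputTypes(InputType.convolutional(maxSentenceLength, vectorSize, 1))
                .build();

        ComputationGraph model = new ComputationGraph(config);
        model.init();
        System.out.println("Parameters: " + model.numParams());
    }
}
```

Because each filter spans the full vector width, the convolution slides over whole words rather than individual vector components, and the global max pooling keeps only the strongest n-gram response from each feature map before classification.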

In this recipe, we will perform sentence classification using both Word2Vec and a CNN.
