Word2vec

Vector representations of words provide a continuous representation of meaning, in which semantically related words are mapped to nearby points in a high-dimensional space. This approach builds on the distributional hypothesis: words that appear in similar contexts tend to share similar meanings. Word2vec is one such model; it directly predicts a word from its neighbors (or the neighbors from the word), learning small but dense vectors called embeddings. Word2vec is also a computationally efficient, unsupervised model that learns these embeddings from raw text alone. To learn the dense vectors, Word2vec comes in two flavors, proposed by Mikolov et al.: the continuous bag-of-words (CBOW) model and the skip-gram model.

Word2vec is a shallow, three-layer neural network: the first and last layers are the input and output layers, while the intermediate hidden layer builds the latent representation that transforms input words into their dense vector representations.
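
A minimal NumPy sketch may make this architecture concrete. The vocabulary size, embedding dimension, and random initialization here are illustrative assumptions, not values from the original model:

import numpy as np

# Illustrative sizes; real models use vocabularies of 10^5 or more words.
vocab_size, embedding_dim = 10, 4

rng = np.random.default_rng(0)
W_in = rng.normal(scale=0.1, size=(vocab_size, embedding_dim))   # input -> hidden
W_out = rng.normal(scale=0.1, size=(embedding_dim, vocab_size))  # hidden -> output

def forward(word_index):
    """One forward pass: one-hot input -> dense hidden layer -> softmax over the vocabulary."""
    hidden = W_in[word_index]              # equivalent to one_hot_vector @ W_in
    scores = hidden @ W_out                # a raw score for every word in the vocabulary
    probs = np.exp(scores - scores.max())  # numerically stable softmax
    return probs / probs.sum()

probs = forward(word_index=3)
print(probs.shape)  # (10,) -- a probability for each candidate word
# After training, the rows of W_in are the learned word embeddings.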

The Word2vec representation of words makes it possible to explore intuitive mathematical relationships between word vectors. For instance, word representations allow us to evaluate expressions such as the following:

king - man ≈ queen - woman

Mathematically, this expression states that, in the latent space, the difference between the vectors for king and man is approximately equal to the difference between the vectors for queen and woman. Intuitively, subtracting man from king and adding woman yields queen. Such a relationship can only be learned when the contexts of the words are understood, which is possible when the positional relationships between words are exploited: the word king occurs alongside man in much the same way that queen occurs alongside woman.
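
A quick sketch of this arithmetic using gensim's pretrained vectors follows; the model name is one distributed through gensim's downloader, and the download is large (roughly 1.6 GB), so treat this as illustrative:

import gensim.downloader as api

vectors = api.load("word2vec-google-news-300")  # returns a KeyedVectors object

# most_similar adds the 'positive' vectors and subtracts the 'negative' ones,
# then returns the nearest words by cosine similarity.
result = vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1)
print(result)  # [('queen', ...)] -- queen is closest to king - man + woman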

Transformation of word vectors

The preceding figure shows the transformation of the word vector for woman into queen, and how this is analogous to the transformation of man into king. Word2vec learns such relationships using a simple, three-layer neural network that either predicts the surrounding words given an input word, or predicts a target word given its surrounding words. The first approach, predicting the surrounding words from the input word, is the skip-gram model; the second, predicting the target word from the surrounding words, is the CBOW model.
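
Both variants can be trained with gensim's Word2Vec class; a minimal sketch follows, where the toy corpus and hyperparameters are illustrative (parameter names follow gensim 4.x):

from gensim.models import Word2Vec

corpus = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["the", "man", "saw", "the", "king"],
    ["the", "woman", "saw", "the", "queen"],
]

# sg=0 selects CBOW (predict the target word from its context);
# sg=1 selects skip-gram (predict the context words from the target).
cbow = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=0)
skipgram = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=1)

print(cbow.wv["king"].shape)                 # (50,) -- the learned embedding for 'king'
print(skipgram.wv.most_similar("king", topn=2))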
