Visualization of word embeddings

We will load the trained word vectors into TensorBoard's embedding projector and visualize them by projecting them down to two dimensions. For such a projection, we can use methods such as t-SNE or PCA, both of which are available in the projector.
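To get the vectors into the projector, we save them as a checkpoint and point the projector plugin at it. The following is a minimal sketch assuming TensorFlow 2.x and the standalone tensorboard package; the file names vectors.npy and vocab.txt are placeholders for wherever your trained vectors and vocabulary are stored:

```python
import os

import numpy as np
import tensorflow as tf
from tensorboard.plugins import projector

log_dir = "logs/embeddings"  # hypothetical output directory
os.makedirs(log_dir, exist_ok=True)

embeddings = np.load("vectors.npy")            # placeholder: trained word vectors
words = open("vocab.txt").read().splitlines()  # placeholder: vocabulary, one word per line

# Write the vocabulary as metadata so the projector can label each point.
with open(os.path.join(log_dir, "metadata.tsv"), "w") as f:
    for word in words:
        f.write(word + "\n")

# Save the embedding matrix as a checkpoint variable.
weights = tf.Variable(embeddings)
checkpoint = tf.train.Checkpoint(embedding=weights)
checkpoint.save(os.path.join(log_dir, "embedding.ckpt"))

# Point the projector plugin at the checkpoint and the metadata file.
config = projector.ProjectorConfig()
emb = config.embeddings.add()
emb.tensor_name = "embedding/.ATTRIBUTES/VARIABLE_VALUE"
emb.metadata_path = "metadata.tsv"
projector.visualize_embeddings(log_dir, config)
```

Running tensorboard --logdir logs/embeddings and opening the Projector tab then brings up the view described here. The following screenshot shows how a PCA projection appears in TensorBoard: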

The preceding visualization illustrates how TensorBoard shows embeddings on its projector. However, the PCA projection is not particularly informative: the points form a diffuse cloud with little visible structure. Hence, we will switch the visualization mode to t-SNE, another dimensionality reduction technique that is well suited to visualizing high-dimensional data.
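The same t-SNE view can also be reproduced outside TensorBoard. Here is a minimal sketch using scikit-learn and matplotlib, assuming the embeddings matrix and words list from the previous snippet; the subset size and perplexity below are illustrative choices, not values from the original setup:

```python
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# t-SNE is slow on large vocabularies, so project only a subset of the vectors.
subset = embeddings[:2000]
coords = TSNE(n_components=2, perplexity=30, init="pca",
              random_state=0).fit_transform(subset)

plt.figure(figsize=(10, 10))
plt.scatter(coords[:, 0], coords[:, 1], s=3)
for i in range(200):  # label only a few points to keep the plot readable
    plt.annotate(words[i], (coords[i, 0], coords[i, 1]))
plt.show()
```

The following screenshot shows the t-SNE projection in TensorBoard: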

In the t-SNE projection, the word embeddings show clear patterns: clusters of words appear in different areas of the projection. To understand the topics these clusters have discovered, TensorBoard allows you to selectively zoom in on an area and view the underlying data points. This operation looks as follows:

When we inspect an isolated cluster on TensorBoard, it becomes apparent that the vectors capture general semantic information about words and their relationships with one another. Interestingly, the words in the selected cluster are all single-character letters of the English alphabet.

Next, we will search for a particular word and check its closest neighbors:

Clusterboard for the word Germany

In this case, we searched for the word germany and found that its closest neighbors are russia, italy, britain, and so on, all names of countries. It is very interesting to note that none of this information was provided to the model, as labels or in any other form.
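The same nearest-neighbor lookup can be performed programmatically with cosine similarity. Here is a minimal sketch, again assuming the words list and embeddings matrix from the earlier snippet; closest_words is a hypothetical helper written for this illustration, not part of any library:

```python
import numpy as np

def closest_words(query, words, embeddings, k=5):
    # Normalize the rows so a dot product equals cosine similarity.
    norms = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = norms @ norms[words.index(query)]
    # Sort by descending similarity, skipping the query word itself.
    best = np.argsort(-sims)[1:k + 1]
    return [(words[i], float(sims[i])) for i in best]

print(closest_words("germany", words, embeddings))
```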

Another example of the model discovering semantic relationships between words is shown in the following screenshot:

Clusterboard for the word book

This example shows that the closest words to the word book are story, novel, and album. The model discovered this semantic relationship on its own, an example of how useful Word2vec can be even when no prior information is provided.
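If you built the hypothetical closest_words helper sketched earlier, the same check can be run from code with closest_words("book", words, embeddings), without opening TensorBoard.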
