How to visualize LDA results using pyLDAvis

Topic visualization facilitates the evaluation of topic quality using human judgment. pyLDAvis is a Python port of LDAvis, which was developed in R and D3.js. We will introduce the key concepts; each LDA implementation notebook contains examples.

pyLDAvis displays the global relationships between topics while also facilitating their semantic evaluation by inspecting the terms most closely associated with each topic and, inversely, the topics associated with each term. It also addresses the challenge that terms that are frequent in a corpus tend to dominate the multinomial distribution over words that defines a topic. LDAvis introduces the relevance r of the term w to topic t to produce a flexible ranking of key terms using a weight parameter 0 <= λ <= 1.

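With $\phi_{wt}$ as the model's probability estimate of observing the term w for topic t, and $p_w$ as the marginal probability of w in the corpus:

$$r(w, t \mid \lambda) = \lambda \log(\phi_{wt}) + (1 - \lambda) \log\frac{\phi_{wt}}{p_w}$$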

The first term measures the degree of association of term w with topic t, and the second term measures the lift or saliency, that is, how much more likely the term is for the topic than in the corpus.
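To make the trade-off between the two terms concrete, the following minimal sketch computes the relevance for three made-up terms in a single topic. Here `phi_wt` stands for the model's probability of term w under topic t and `p_w` for the marginal probability of w in the corpus. The term names and probabilities are hypothetical and chosen only for illustration; the function name `relevance` is not part of pyLDAvis.

```python
import math


def relevance(phi_wt, p_w, lam):
    """Weighted sum of log-probability and log-lift for one term and topic."""
    return lam * math.log(phi_wt) + (1 - lam) * math.log(phi_wt / p_w)


# Hypothetical values for a single topic t: term -> (phi_wt, p_w).
terms = {
    "market":   (0.050, 0.0400),  # frequent everywhere, weak lift
    "earnings": (0.030, 0.0050),  # moderately frequent, strong lift
    "ebitda":   (0.004, 0.0002),  # rare, very strong lift
}


def rank_terms(lam):
    """Return the terms sorted from most to least relevant for a given weight."""
    return sorted(terms, key=lambda w: relevance(*terms[w], lam), reverse=True)


for lam in (1.0, 0.6, 0.0):
    print(lam, rank_terms(lam))
```

With a weight of 1 the ranking follows the topic's term probabilities alone, so the most common term comes first. With a weight of 0 the ranking follows lift alone and the rarest, most topic-specific term comes first. Intermediate weights blend the two.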


The tool allows the user to interactively change λ to adjust the relevance, which updates the ranking of terms. User studies have found that λ=0.6 produces the most plausible results.
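As a rough sketch of how the interactive view might be produced, the example below passes LDA-shaped arrays to the generic `pyLDAvis.prepare` entry point and writes the result to an HTML file, where the relevance slider is available. The arrays are randomly generated stand-ins rather than the output of a fitted model, and the helper names, sizes, and output file name are assumptions made for illustration. In practice the inputs would come from your own LDA implementation.

```python
import numpy as np


def make_toy_lda_outputs(n_topics=4, n_terms=40, n_docs=25, seed=0):
    """Fabricate LDA-shaped arrays (illustrative only, not a fitted model)."""
    rng = np.random.default_rng(seed)
    # Each row is a topic's distribution over the vocabulary.
    topic_term = rng.dirichlet(np.full(n_terms, 0.5), size=n_topics)
    # Each row is a document's distribution over topics.
    doc_topic = rng.dirichlet(np.full(n_topics, 0.5), size=n_docs)
    doc_lengths = rng.integers(50, 200, size=n_docs)
    vocab = [f"term_{i}" for i in range(n_terms)]
    # Expected corpus-wide term counts implied by the two distributions.
    term_frequency = doc_lengths @ doc_topic @ topic_term
    return topic_term, doc_topic, doc_lengths, vocab, term_frequency


def build_visualization(out_path="lda_vis.html"):
    """Prepare the visualization and save it as a standalone HTML page."""
    import pyLDAvis  # imported lazily so the helper above runs without it

    topic_term, doc_topic, doc_lengths, vocab, term_frequency = make_toy_lda_outputs()
    vis = pyLDAvis.prepare(
        topic_term_dists=topic_term,
        doc_topic_dists=doc_topic,
        doc_lengths=doc_lengths,
        vocab=vocab,
        term_frequency=term_frequency,
        R=10,  # number of terms shown in the bar chart
    )
    pyLDAvis.save_html(vis, out_path)
    return vis
```

Calling `build_visualization()` requires pyLDAvis to be installed. Opening the saved file in a browser shows the topic map alongside the ranked terms for the selected topic.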
