How to visualize LDA results using pyLDAvis

Topic visualization facilitates the evaluation of topic quality using human judgment. pyLDAvis is a Python port of LDAvis, which was developed in R and D3.js. We will introduce the key concepts; each LDA implementation notebook contains examples.

pyLDAvis displays the global relationships between topics while also facilitating their semantic evaluation by inspecting the terms most closely associated with each topic and, inversely, the topics associated with each term. It also addresses the challenge that terms that are frequent in a corpus tend to dominate the multinomial distribution over words that defines a topic. LDAvis introduces the relevance r of the term w to topic t to produce a flexible ranking of key terms using a weight parameter 0 ≤ λ ≤ 1.

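With \phi_{wt} as the model's probability estimate of observing the term w for topic t, and p_w as the marginal probability of w in the corpus:

$$r(w, t \mid \lambda) = \lambda \log(\phi_{wt}) + (1 - \lambda) \log\left(\frac{\phi_{wt}}{p_w}\right)$$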

The first term, weighted by λ, measures the degree of association of term w with topic t, and the second term, weighted by 1 − λ, measures the lift, that is, how much more likely the term is for the topic than in the corpus.

[Figure: Topic 14]

The tool allows the user to interactively change λ to adjust the relevance, which updates the ranking of terms. User studies have found that λ = 0.6 produces the most plausible results.
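
As a minimal sketch of how the weight changes the ranking, the following pure-Python example computes the relevance as λ times the log of the topic-term probability plus (1 − λ) times the log of the lift, and ranks three terms for several values of λ. The probabilities are made up for illustration, and the helper names relevance and rank_terms are hypothetical; they are not part of pyLDAvis.

```python
import math


def relevance(phi_wt, p_w, lam):
    """Relevance of term w to topic t for a weight lam between 0 and 1.

    phi_wt: probability of the term under the topic (assumed > 0)
    p_w:    marginal probability of the term in the corpus (assumed > 0)
    """
    return lam * math.log(phi_wt) + (1 - lam) * math.log(phi_wt / p_w)


def rank_terms(topic_term_probs, corpus_probs, lam):
    """Return the terms sorted from most to least relevant for the topic."""
    scores = {
        term: relevance(phi, corpus_probs[term], lam)
        for term, phi in topic_term_probs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy values: "the" is frequent everywhere, while "dividend"
# is rare in the corpus but comparatively likely under this topic.
topic_term_probs = {"the": 0.10, "market": 0.05, "dividend": 0.02}
corpus_probs = {"the": 0.20, "market": 0.01, "dividend": 0.002}

for lam in (1.0, 0.6, 0.0):
    print(lam, rank_terms(topic_term_probs, corpus_probs, lam))
# 1.0 ['the', 'market', 'dividend']   (topic probability only)
# 0.6 ['market', 'dividend', 'the']   (blend of both terms)
# 0.0 ['dividend', 'market', 'the']   (lift only)
```

With λ = 1, the corpus-wide frequent term leads the ranking; with λ = 0, the rarest term does. Intermediate values trade off the two effects, which is what moving the slider in the tool does.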
