Topic modeling with Spark MLlib and Stanford NLP

In this subsection, we present a semi-automated technique for topic modeling (TM) using Spark. Leaving the other options at their default values, we train LDA on the dataset available on GitHub at https://github.com/minghui/Twitter-LDA/tree/master/data/Data4Model/test. However, we will use better-known text datasets in the model reuse and deployment phase later in this chapter.
