Classifying text

Our goal here is to build a classifier that predicts presidential party affiliation, either Democrat or Republican, since 1900. We will turn the word counts per year into features: first we create a document-term matrix (DTM), then we weight its entries by term frequency-inverse document frequency (tf-idf), and finally we use the weighted features in our model. As you can imagine, this produces thousands of features, so we will prepare the data differently from the prior sections and use the text2vec package for both feature creation and modeling.
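
To make the feature pipeline concrete, here is a minimal sketch of the DTM and tf-idf steps with text2vec. The three-document corpus and the year labels in docs and ids are hypothetical stand-ins for the real data, and the default TfIdf settings are an assumption rather than the configuration used later. The modeling step is left out.

```r
library(text2vec)

# Hypothetical toy corpus: one short "document" per year (not the real data)
docs <- c("the economy and jobs",
          "freedom and the economy",
          "jobs jobs jobs")
ids <- c("1901", "1902", "1903")

# Iterator over lowercased, word-tokenized documents
it <- itoken(docs,
             preprocessor = tolower,
             tokenizer = word_tokenizer,
             ids = ids,
             progressbar = FALSE)

# Vocabulary of unique terms, then a vectorizer that maps terms to columns
vocab <- create_vocabulary(it)
vectorizer <- vocab_vectorizer(vocab)

# Document-term matrix: one row per document, one column per term
dtm <- create_dtm(it, vectorizer)

# Reweight the raw counts by tf-idf (default settings, assumed here)
tfidf <- TfIdf$new()
dtm_tfidf <- fit_transform(dtm, tfidf)

dim(dtm_tfidf)
```

The result is a sparse matrix with the same shape as the DTM, so with a real corpus the number of columns grows with the vocabulary, which is where the thousands of features come from.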
