Before getting started with the examples, we will set up the system with NLTK and the other Python libraries it depends on. NLTK can be installed with the pip installer, along with an optional installation of numpy, as follows:
sudo pip install -U nltk
sudo pip install -U numpy
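To confirm that the installation succeeded, a quick sanity check is to import both libraries and print their versions:

```python
# Verify that NLTK and numpy were installed correctly
import nltk
import numpy

print("NLTK version:", nltk.__version__)
print("NumPy version:", numpy.__version__)
```

If either import raises an `ImportError`, re-run the corresponding pip command above.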
The NLTK corpora and various modules can be installed by using the built-in NLTK downloader from the Python interactive shell or a Jupyter Notebook, as follows:
import nltk
nltk.download()
The preceding command will open the NLTK Downloader window. Select the packages or collections that are required:
As shown in the preceding screenshot, specific collections, text corpora, NLTK models, or packages can be selected and installed. Navigate to stopwords and install it for future use. The following is a list of modules that are required for this chapter's examples:
| No | Package Name | Description |
| --- | --- | --- |
| 1 | brown | Brown text corpus |
| 2 | gutenberg | Gutenberg text corpus |
| 3 | maxent_ne_chunker | Maximum entropy model for named entity chunking |
| 4 | movie_reviews | Movie review sentiment polarity data |
| 5 | product_reviews_1 | Basic product reviews corpus |
| 6 | punkt | Sentence and word tokenizer models |
| 7 | treebank | Penn Treebank dataset sample |
| 8 | twitter_samples | Twitter messages sample |
| 9 | universal_tagset | Universal POS tag mapping |
| 10 | webtext | Web text corpus |
| 11 | wordnet | WordNet corpus |
| 12 | words | Word list |
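If you prefer to skip the downloader GUI, `nltk.download()` also accepts a package identifier, so the table's packages can be fetched from a script. The sketch below collects the identifiers listed above (plus stopwords, as recommended earlier); the names match NLTK's data index at the time of writing, but verify them against your NLTK version:

```python
# Package identifiers from the table above, plus 'stopwords'
CHAPTER_PACKAGES = [
    "brown", "gutenberg", "maxent_ne_chunker", "movie_reviews",
    "product_reviews_1", "punkt", "treebank", "twitter_samples",
    "universal_tagset", "webtext", "wordnet", "words", "stopwords",
]


def download_chapter_packages():
    """Fetch every corpus and model needed for this chapter's examples."""
    # Imported lazily so the package list can be inspected even on a
    # machine where NLTK is not yet installed
    import nltk

    for pkg in CHAPTER_PACKAGES:
        nltk.download(pkg)  # skipped if the package is already up to date
```

Calling `download_chapter_packages()` once places the data in the default NLTK data directory, after which the chapter's examples can load the corpora without further setup.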