Installing and using NLTK

Installing NLTK takes a single command: pip install nltk.

To check whether your installation was successful, open a Python interpreter and type:

>>> import nltk
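
If the import returns without an error, the installation worked. As a slightly more informative check, the following minimal sketch prints the installed release; it assumes only that the package exposes its version string as nltk.__version__.

```python
import nltk

# nltk.__version__ holds the installed release as a string
print("NLTK version:", nltk.__version__)
```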

You will find a very nice tutorial on NLTK in the book Python 3 Text Processing with NLTK 3 Cookbook by Jacob Perkins, published by Packt Publishing.
To experiment with a stemmer, you can visit the web page http://text-processing.com/demo/stem/.

NLTK comes with different stemmers. This is necessary because every language has a different set of rules for stemming. For English, we can use SnowballStemmer:

>>> import nltk.stem
>>> s = nltk.stem.SnowballStemmer('english')
>>> s.stem("graphics")
'graphic'
>>> s.stem("imaging")
'imag'
>>> s.stem("image")
'imag'
>>> s.stem("imagination")
'imagin'
>>> s.stem("imagine")
'imagin'

Stemming does not necessarily produce valid English words.
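
Because the stemmer is selected by language name, the same class can be reused for other languages. The sketch below lists the names SnowballStemmer accepts and builds a German stemmer as an illustration; the German sample word is our own choice and does not come from the text above.

```python
from nltk.stem import SnowballStemmer

# Class attribute listing the language names the Snowball stemmer accepts
print(SnowballStemmer.languages)

# Build a stemmer for another language by passing its name,
# exactly as 'english' was passed above
german = SnowballStemmer("german")

# Illustrative word (an assumption, not taken from the text)
print(german.stem("Autobahnen"))
```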

The stemmer also works with verbs:

>>> s.stem("buys")
'buy'
>>> s.stem("buying")
'buy'

It works most of the time, but not always. The irregular form "bought" is left unchanged:

>>> s.stem("bought")
'bought' 
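
To see the effect on a whole word list at once, the words can be grouped by their stem. This is only a sketch: group_by_stem is a hypothetical helper written for this illustration, not part of NLTK, and the word list simply reuses the examples from this section.

```python
from collections import defaultdict

from nltk.stem import SnowballStemmer


def group_by_stem(words, language="english"):
    """Map each stem to the list of input words that reduce to it."""
    stemmer = SnowballStemmer(language)
    groups = defaultdict(list)
    for word in words:
        groups[stemmer.stem(word)].append(word)
    return dict(groups)


# The words used in the interpreter sessions of this section
words = ["graphics", "imaging", "image", "imagination", "imagine",
         "buys", "buying", "bought"]

groups = group_by_stem(words)
for stem, members in groups.items():
    print(stem, members)
```

Words that share a stem end up in the same list, while "bought" stays in a group of its own because the stemmer does not relate it to "buy".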