In this chapter, we will cover the following recipes:
Stemming words
Lemmatizing words with WordNet
Replacing words matching regular expressions
Removing repeating characters
Spelling correction with Enchant
Replacing synonyms
Replacing negations with antonyms
Introduction
In this chapter, we will go over various word replacement and correction techniques. The recipes range from linguistic compression (reducing words to a shorter or more canonical form, as in stemming, lemmatization, and synonym replacement) to spelling correction and text normalization. All of these methods are useful for preprocessing text before search indexing, document classification, and text analysis.
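To make the idea of text normalization concrete before the recipes begin, here is a minimal sketch that uses only Python's standard `re` module. The pattern list and the function names (`expand_contractions`, `squeeze_repeats`, `normalize`) are assumptions made for illustration; they are not the implementations developed in the recipes.

```python
import re

# Hypothetical replacement patterns, chosen only for this illustration.
# Specific cases come first so the generic "n't" rule does not mangle them.
CONTRACTIONS = [
    (re.compile(r"won't"), "will not"),
    (re.compile(r"can't"), "cannot"),
    (re.compile(r"(\w+)n't"), r"\1 not"),
]

# A word character followed by two or more copies of itself (3+ in a row).
REPEAT = re.compile(r"(\w)\1{2,}")


def expand_contractions(text):
    """Apply each (pattern, replacement) pair in order."""
    for pattern, replacement in CONTRACTIONS:
        text = pattern.sub(replacement, text)
    return text


def squeeze_repeats(text):
    """Collapse runs of three or more identical characters to one."""
    return REPEAT.sub(r"\1", text)


def normalize(text):
    """Lowercase, expand contractions, then squeeze repeated characters."""
    return squeeze_repeats(expand_contractions(text.lower()))


print(normalize("I can't belieeeve it"))
```

This naive approach has clear limits: collapsing every long run to a single character turns "goood" into "god", so a practical version would need some way to check candidate words against a dictionary.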