Normalization

Normalization converts the variant forms of words in a sentence into a standard form. A word may appear as sing, singer, sang, or singing, but these variants fit more or less the same context and can be standardized.

There are different ways to normalize sentences:

  • Stemming: A basic rule-based process that strips suffixes (-ing, -ly, -es, -s) from a word.
  • Lemmatization: A more sophisticated procedure that identifies the root form of a word. It takes the word's semantics and syntax into account, so an irregular form such as sang can be mapped to sing.
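
The difference between the two approaches can be illustrated with a minimal sketch. It is a toy written for this section, not the algorithm of any particular library: the function names naive_stem and naive_lemmatize, the min_stem_len parameter, and the hand-written LEMMA_TABLE are all hypothetical, and the suffix list simply mirrors the examples above. A real lemmatizer would consult a full dictionary and the word's part of speech rather than a small lookup table.

```python
# Toy suffix-stripping stemmer; the suffix list mirrors the examples in the text.
SUFFIXES = ("ing", "ly", "es", "s")


def naive_stem(word, min_stem_len=3):
    """Strip the first matching suffix, keeping at least min_stem_len characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


# Hypothetical hand-written lookup standing in for a real lemmatizer's dictionary.
LEMMA_TABLE = {
    "sang": "sing",
    "sung": "sing",
    "singing": "sing",
    "sings": "sing",
}


def naive_lemmatize(word):
    """Return the root form from the lookup table, or the word itself if unknown."""
    word = word.lower()
    return LEMMA_TABLE.get(word, word)


if __name__ == "__main__":
    for w in ["sing", "sings", "singing", "sang"]:
        print(w, naive_stem(w), naive_lemmatize(w))
```

In this sketch the stemmer reduces sings and singing to sing but leaves sang untouched, because no suffix rule applies to it. The lookup-based lemmatizer maps all of these forms to sing, which is the kind of irregular case that rule-based stripping alone cannot handle.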