Paragraph

A paragraph is the largest unit of text handled by an NLP task. Paragraph level boundaries by itself may not be much use unless broken down into sentences. Though sometimes the paragraph may be considered as context boundaries. Tokenizers that can split a document into paragraphs are available in some of the Python libraries. We will look at such tokenizers in later chapters.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset