Introduction

In the current internet era, the volume of data is growing rapidly, and we produce it in many forms every day. Text is one of the most important sources of data for uncovering knowledge. In the medical field, biomedical text mining is among the most important applications of text data, used to study diseases, genes, protein relations, and drug discovery. In financial markets, microblogs such as Twitter play an important role in revealing people's sentiment about the market. In addition, a large number of scientific research articles are published in journals every day, and these are another rich source of text data. By mining scientific research articles, one can identify current research trends, important keywords, and links between entities.

Before performing any text mining, you need to know how to retrieve text data from various sources. This chapter walks you through retrieving text data from several sources and applying some pre-processing.
