Text data

At the start of the chapter, we looked at textual content analysis. Currently, Watson Analytics does not offer a lot of help with textual data. Let's see what you can do with Watson Analytics and textual data.

Data metrics

Within Watson Analytics Refine, you can click on Data Metrics to learn more about the textual data within your file. In our example (shown in the next screenshot), we see that Watson Analytics scores the Comments field as Low Quality and reports a missing values percentage of 56 percent:

Data metrics

Of course, the field is scored as low quality because 56 percent of the records in the file have no comments, which is a legitimate situation.
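To make the missing values percentage concrete, here is a minimal sketch of how such a figure could be computed outside of Watson Analytics. It is not how Watson Analytics itself calculates the score; the function name missing_percentage and the sample rows are hypothetical, and only the Comments and GPA Average field names come from the example above.

```python
def missing_percentage(records, field):
    """Return the percentage of records whose `field` is absent or blank."""
    if not records:
        return 0.0
    # Treat None, empty strings, and whitespace-only strings as missing.
    missing = sum(1 for r in records if not (r.get(field) or "").strip())
    return 100.0 * missing / len(records)


# Hypothetical sample rows; the contents of the real file are not shown in the text.
sample = [
    {"GPA Average": "3.1", "Comments": "demonstrates leadership"},
    {"GPA Average": "3.6", "Comments": ""},
    {"GPA Average": "2.9", "Comments": None},
    {"GPA Average": "3.4", "Comments": "  "},
]

print(missing_percentage(sample, "Comments"))  # 75.0 for this sample
```

A high percentage here only says that the field is sparsely populated, not that the comments that do exist are unreliable.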

Search and Filter

One approach to using Watson Analytics on textual data is to look for correlations between certain words or phrases found within the data. For example, it might be interesting to see whether the presence of the word leadership in the Comments field has any effect on the GPA Average for the university. We can start by formulating a question: How do the values of GPA Average compare by Comments?

When Watson Analytics visualizes the answer, we can click on applied filter in the lower-left corner of the page:

Search and Filter

Then, we can use search to find any comments that contain the word we are interested in (leadership), select them, and set them as our filter:

Search and Filter

Finally, Watson Analytics shows us our filtered visualization:

Search and Filter

The comments in our file that contain the word leadership are "demonstrates leadership" and "outstanding leadership abilities" and, according to Watson Analytics, it seems that these students do not have very exciting average GPAs.
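The search step can also be reproduced outside of Watson Analytics. The following is a rough sketch, assuming the data has been loaded as a list of dictionaries (for example, with Python's csv.DictReader); the function name matching_comments and the sample rows are hypothetical.

```python
def matching_comments(records, keyword, text_field="Comments"):
    """Return the distinct comment values that contain `keyword`, ignoring case."""
    needle = keyword.lower()
    found = {
        (r.get(text_field) or "").strip()
        for r in records
        if needle in (r.get(text_field) or "").lower()
    }
    return sorted(found)


# Hypothetical sample rows standing in for the real file.
rows = [
    {"Comments": "demonstrates leadership"},
    {"Comments": "outstanding leadership abilities"},
    {"Comments": "demonstrates leadership"},
    {"Comments": "strong writer"},
    {"Comments": ""},
]

print(matching_comments(rows, "leadership"))
```

The returned values correspond to the entries you would select in the search dialog before setting them as the filter.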

The net result is that when it comes to textual data, you may need to supplement your analysis with preprocessing outside of Watson Analytics.
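As one illustration of such preprocessing, the sketch below flags each record by whether its comment contains a keyword and then compares the mean GPA of the two groups. It assumes the rows are dictionaries with Comments and GPA Average fields and that every row has a numeric GPA; the function name mean_gpa_by_keyword and the sample values are hypothetical and are not taken from the book's data file.

```python
from statistics import mean


def mean_gpa_by_keyword(records, keyword,
                        text_field="Comments", gpa_field="GPA Average"):
    """Group records by whether `text_field` contains `keyword` (case-insensitive)
    and return the mean GPA for each group (None if a group is empty)."""
    needle = keyword.lower()
    groups = {True: [], False: []}
    for r in records:
        text = (r.get(text_field) or "").lower()
        # Assumes the GPA value is present and parses as a number.
        groups[needle in text].append(float(r[gpa_field]))
    return {flag: (mean(vals) if vals else None) for flag, vals in groups.items()}


# Hypothetical sample rows, for illustration only.
students = [
    {"GPA Average": "3.0", "Comments": "demonstrates leadership"},
    {"GPA Average": "3.5", "Comments": "Outstanding leadership abilities"},
    {"GPA Average": "3.75", "Comments": "strong writer"},
    {"GPA Average": "3.25", "Comments": ""},
]

result = mean_gpa_by_keyword(students, "leadership")
print(result[True], result[False])  # 3.25 and 3.5 for this sample
```

Writing the flag out as its own column (for example, Has Leadership with values Yes and No) before uploading the file would give Watson Analytics a simple categorical field to work with instead of free text.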
