Visualization

So far, we have seen how to select portions of a DataFrame and how to get summary statistics from our data. Let's now move on to inspecting the data visually. But first, why bother with visual inspection at all? An example makes the case.

Here are the summary statistics shared by four distinct series of x and y values:

Series of x and y              Values
Mean of x                      9
Mean of y                      7.5
Sample variance of x           11
Sample variance of y           4.1
Correlation between x and y    0.816
Regression line                y = 3.00 + 0.500x

Because the four series share identical summary statistics, you might assume that they would look similar when plotted. You would, of course, be wrong. Very wrong. The four series are part of Anscombe's quartet, and they were deliberately created to illustrate the importance of visual data inspection. Each series is plotted as follows:

Clearly, we would not treat these datasets as identical after having visualized them. So, now that we understand the importance of visualization, let's take a look at a pair of useful Python libraries for this.
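
As a quick check before moving on, the shared statistics in the table can be reproduced with Python's standard library alone. The following is a minimal sketch, not the book's own code: the data values are the commonly published Anscombe's quartet figures, typed in by hand because the text above does not list them, and the names quartet, summarize, and summaries are made up for this illustration.

```python
from statistics import mean, variance

# Anscombe's quartet as commonly published. These values are an assumption
# of this sketch: they are entered by hand and do not appear in the text.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                  6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                   6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
            5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(x, y):
    """Return the summary statistics from the table for one (x, y) series."""
    mean_x, mean_y = mean(x), mean(y)
    var_x, var_y = variance(x), variance(y)  # sample variance (n - 1)
    # Sample covariance, computed by hand to avoid version-specific helpers.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)
    slope = cov / var_x                      # least-squares slope
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "var_x": var_x,
        "var_y": var_y,
        "corr": cov / (var_x * var_y) ** 0.5,  # Pearson correlation
        "intercept": mean_y - slope * mean_x,
        "slope": slope,
    }


summaries = {name: summarize(x, y) for name, (x, y) in quartet.items()}

for name, s in summaries.items():
    print(
        f"{name:>3}: mean_x={s['mean_x']:.2f} mean_y={s['mean_y']:.2f} "
        f"var_x={s['var_x']:.2f} var_y={s['var_y']:.2f} "
        f"corr={s['corr']:.3f} "
        f"line: y = {s['intercept']:.2f} + {s['slope']:.3f}x"
    )
```

Each printed row should agree with the table to the precision shown there, even though the four series look nothing alike when plotted. That gap between matching numbers and mismatched shapes is what visual inspection is meant to catch.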
