Visualizing Multivariate Data

When we have a large dataset that contains many variables, the plot types covered in Chapter 7, Visualizing Online Data, may no longer be effective. We may try to cram as many variables as possible into a single plot, but the overcrowded, cluttered details quickly exceed the limits of human visual perception.

In this chapter, we introduce multivariate data visualization techniques that help us better understand the distribution of data and the relationships between variables. Here is the outline of this chapter:

  • Getting End-of-Day (EOD) stock data from Quandl
  • Two-dimensional faceted plots:
    • Factor plot in Seaborn
    • Faceted grid in Seaborn
    • Pair plot in Seaborn
  • Other two-dimensional multivariate plots:
    • Heatmap in Seaborn
    • Candlestick plot in matplotlib.finance:
      • Visualizing various stock market indicators
    • Building a comprehensive stock chart
  • Three-dimensional plots:
    • Scatter plot
    • Bar chart
    • Caveats of using Matplotlib 3D

First, we will discuss faceted plots, a divide-and-conquer approach to visualizing multivariate data. The essence of this approach is to slice the input data into facets so that only a handful of attributes are represented in each visualization panel. This reduces visual clutter by letting us inspect the variables in smaller subsets. Sometimes, however, it is difficult to find a suitable way to represent multivariate data in a 2D graph, so we will introduce the 3D plotting functions in Matplotlib as well.
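To make the slicing idea concrete, here is a minimal sketch that facets a small dataset by hand using plain Matplotlib. It is only an illustration: the records, the ticker names, and the helper functions split_into_facets and plot_facets are hypothetical and are not part of the chapter's code. Seaborn's faceting functions, covered later in this chapter, automate the same steps.

```python
from collections import defaultdict

# Hypothetical sample records (ticker, day, closing price); not real market data.
RECORDS = [
    ("AAA", 1, 10.0), ("AAA", 2, 10.5), ("AAA", 3, 10.2),
    ("BBB", 1, 20.0), ("BBB", 2, 19.5), ("BBB", 3, 21.0),
    ("CCC", 1, 5.0), ("CCC", 2, 5.5), ("CCC", 3, 6.0),
]


def split_into_facets(records):
    """Group (ticker, day, close) records by ticker; each group is one facet."""
    facets = defaultdict(list)
    for ticker, day, close in records:
        facets[ticker].append((day, close))
    return dict(facets)


def plot_facets(facets):
    """Draw one panel per facet, sharing the y-axis so panels stay comparable."""
    # Imported here so the grouping helper above works without Matplotlib.
    import matplotlib.pyplot as plt

    # squeeze=False keeps `axes` two-dimensional even for a single facet.
    fig, axes = plt.subplots(1, len(facets), sharey=True, squeeze=False)
    for ax, (ticker, points) in zip(axes[0], sorted(facets.items())):
        days, closes = zip(*points)
        ax.plot(days, closes, marker="o")
        ax.set_title(ticker)
        ax.set_xlabel("day")
    axes[0][0].set_ylabel("close")
    return fig
```

Calling plot_facets(split_into_facets(RECORDS)) returns a figure with three side-by-side panels, one per ticker. Each panel shows only two attributes (day and closing price), while the third attribute (ticker) is encoded by which panel the data lands in.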

The data used in this chapter was collected from Quandl's End-of-Day (EOD) stock database. Let's get the data from Quandl first.
