Chapter 3. Data Analysis with pandas

In this chapter, we will explore another data analysis library called pandas. The goal of this chapter is to give you some basic knowledge and concrete examples for getting started with pandas.

An overview of the pandas package

pandas is a Python package that provides fast, flexible, and expressive data structures, as well as computing functions for data analysis. The following are some of its prominent features:

  • Data structures with labeled axes. These keep programs clean and clear and help avoid common errors caused by misaligned data.
  • Flexible handling of missing data.
  • Intelligent label-based slicing, fancy indexing, and subset creation of large datasets.
  • Powerful arithmetic operations and statistical computations along a chosen axis, selected by its label.
  • Robust input and output support for loading data from, and saving data to, files, databases, or HDF5 format.
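
To make a few of these features concrete, here is a minimal sketch of label alignment, missing-data handling, label-based slicing, and a per-axis statistic. The sample values and the names s1, s2, total, filled, middle, and col_means are made up for illustration; they do not come from a real dataset.

```python
import numpy as np
import pandas as pd

# Two Series with partially overlapping labels (hypothetical sample data)
s1 = pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"])
s2 = pd.Series([10.0, 20.0, 30.0], index=["b", "c", "d"])

# Arithmetic aligns on labels, not positions; labels present in only
# one operand produce NaN in the result
total = s1 + s2

# Missing values can be filled (or dropped with dropna())
filled = total.fillna(0.0)

# Label-based slicing with .loc includes both endpoints
middle = total.loc["b":"c"]

# Statistics along a named axis: one mean per column, NaN values skipped
df = pd.DataFrame({"x": s1, "y": s2})
col_means = df.mean(axis="index")

print(total)
print(col_means)
```

Because the addition is matched on labels, only "b" and "c" receive numeric results; "a" and "d" are marked as missing rather than being silently paired with the wrong value.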

For installation, we recommend the easy route: install pandas as part of Anaconda, a cross-platform distribution for data analysis and scientific computing. You can refer to http://docs.continuum.io/anaconda/ to download and install the distribution.

After installation, we can use pandas like any other Python package. First, we import the following packages at the beginning of the program:

>>> import pandas as pd
>>> import numpy as np
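
As a quick check that both imports work, the following sketch builds a small DataFrame from a NumPy array. The array contents and the column names "a" and "b" are arbitrary placeholders, and the printed version string depends on your installation.

```python
import numpy as np
import pandas as pd

# Build a 3x2 DataFrame from a NumPy array (placeholder values)
data = np.arange(6).reshape(3, 2)
df = pd.DataFrame(data, columns=["a", "b"])

print(pd.__version__)  # varies with the installed pandas release
print(df.shape)        # number of rows and columns
print(df)
```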