Accessing Data

In almost any real-world data analysis, you need to load data from outside your program. Because pandas is built on Python, you can use any means available in Python to retrieve that data. This gives you access to a very wide range of sources, including text files, Excel spreadsheets, websites and web services, databases, and cloud services.

However, when you load data with standard Python functions, you must then convert the resulting Python objects into pandas Series or DataFrame objects, which adds complexity to your code. To help manage this complexity, pandas provides a number of functions that load data from various sources directly into pandas objects. We will examine many of these in this chapter.
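As a rough illustration of the difference, the following sketch loads the same small data set twice: once with Python's standard csv module followed by a manual conversion, and once with pandas directly. The sample text and the column names (date, price) are made up for this example, and an in-memory buffer stands in for a file on disk.

```python
import csv
import io

import pandas as pd

# Hypothetical sample data standing in for a CSV file on disk.
CSV_TEXT = "date,price\n2024-01-02,10.5\n2024-01-03,11.0\n"

# Manual route: parse with the standard library, then convert the
# resulting list of dictionaries into a DataFrame.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
manual_df = pd.DataFrame(rows)
# The csv module returns every field as a string, so numeric columns
# have to be converted explicitly.
manual_df["price"] = manual_df["price"].astype(float)

# Direct route: pandas parses the text and infers column types in one call.
direct_df = pd.read_csv(io.StringIO(CSV_TEXT))

print(manual_df.dtypes)
print(direct_df.dtypes)
```

Both routes end with a numeric price column, but the manual one needs an explicit type conversion that pd.read_csv performs through type inference.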

Specifically, in this chapter, we will cover:

  • Reading a CSV file into a DataFrame
  • Specifying the index column when reading a CSV file
  • Data type inference and specification
  • Specifying column names
  • Selecting specific columns to load
  • Saving data to a CSV file
  • Working with general field-delimited data
  • Handling variants of formats in field-delimited data
  • Reading and writing data in Excel format
  • Reading and writing JSON files
  • Reading HTML data from the web
  • Reading and writing HDF5 format files
  • Reading and writing from/to SQL databases
  • Reading stock data from Yahoo! and Google Finance
  • Reading options data from Google Finance
  • Reading economic data from the Federal Reserve Bank of St. Louis (FRED)
  • Accessing Kenneth French's data
  • Accessing World Bank data