Creating a Pandas DataFrame from a JSON file

Along with CSV, JSON is another commonly found format for datasets, especially when extracting data from web APIs.

How to do it…

  1. To create a Pandas DataFrame from a JSON file, first import the Python libraries that you need:
    import pandas as pd
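  2. Next, define a variable that holds the path to the JSON file. The example uses a relative path, so the file must be in the current working directory; otherwise, enter the full path to the file: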
    customer_json_file = 'customer_data.json'
  3. Next, create a DataFrame from the JSON file using the read_json() function provided by Pandas. Note that the dates in our JSON file are stored in ISO format, so we tell read_json() to convert dates. convert_dates=True is also the default; we pass it here to make the intent explicit:
    customers_json = pd.read_json(customer_json_file,
       convert_dates=True)
  4. Finally, use the head() method to see the first five rows of data:
    customers_json.head()
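
The steps above can be tried end to end with the following minimal sketch. The recipe's actual customer_data.json is not shown, so the records, column names, and temporary directory used here are hypothetical stand-ins rather than the book's data:

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample records standing in for the recipe's customer_data.json.
# "created_at" is an assumed column name holding ISO-formatted date strings.
records = [
    {"customer_id": 1, "name": "Ada", "created_at": "2021-01-15"},
    {"customer_id": 2, "name": "Grace", "created_at": "2021-02-20"},
    {"customer_id": 3, "name": "Linus", "created_at": "2021-03-05"},
]

# Write the sample file to a temporary directory so the sketch is self-contained.
tmp_dir = Path(tempfile.mkdtemp())
customer_json_file = tmp_dir / "customer_data.json"
customer_json_file.write_text(json.dumps(records))

# Same call as in step 3 of the recipe.
customers_json = pd.read_json(customer_json_file, convert_dates=True)

# Inspect the first rows and the column types that Pandas inferred.
print(customers_json.head())
print(customers_json.dtypes)
```

Printing dtypes alongside head() is a quick way to check whether the date column was actually parsed as a datetime or left as plain strings.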

How it works…

After importing Pandas and defining a variable for the path to our JSON file, we use the read_json() function provided by Pandas to create a DataFrame from our JSON file.

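read_json() takes a number of arguments, but here we keep it simple and use two: the file path variable and convert_dates. convert_dates accepts either a Boolean or a list of column names. When set to True (the default), Pandas attempts to parse columns whose names look date-like, such as those ending in _at or _time. When given a list, it attempts to parse the named columns as dates.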
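
As a rough illustration of how convert_dates behaves, the sketch below reads the same JSON text three ways. The data and the column names signup and created_at are made up for this example; signup was chosen because its name is not one that Pandas treats as date-like by default:

```python
from io import StringIO

import pandas as pd

# Hypothetical records: "created_at" has a date-like name, "signup" does not.
raw = (
    '[{"customer_id": 1, "signup": "2021-01-15", "created_at": "2021-01-16"},'
    ' {"customer_id": 2, "signup": "2021-02-20", "created_at": "2021-02-21"}]'
)

# Default behaviour (convert_dates=True): only date-like column names are parsed.
default_df = pd.read_json(StringIO(raw), orient="records")

# Passing a list asks Pandas to parse the named columns as dates.
listed_df = pd.read_json(StringIO(raw), orient="records", convert_dates=["signup"])

# Passing False turns date parsing off entirely.
no_dates_df = pd.read_json(StringIO(raw), orient="records", convert_dates=False)

# Compare the inferred column types across the three reads.
for label, frame in [("default", default_df), ("list", listed_df), ("off", no_dates_df)]:
    print(label)
    print(frame.dtypes)
```

If a date column in your own file is not being parsed, checking whether its name matches one of the date-like patterns, or listing it explicitly in convert_dates, is a reasonable first step.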

See the official Pandas documentation for all possible arguments.
