Importing online financial data in the JSON format

In this chapter, we will also draw upon financial data from Quandl's API to create insightful visualizations. If you are not familiar with Quandl, it is a financial and economic data warehouse that stores millions of datasets from hundreds of publishers. The best thing about Quandl is that these datasets are delivered via the unified API, without worrying about the procedures to parse the data correctly. Anonymous users can get up to 50 API calls per day, and you get up to 500 free API calls if you are a registered user. Readers can sign up for a free API key at https://www.quandl.com/?modal=register.

At Quandl, every dataset is identified by a unique ID, as defined by the Quandl Code on each search result webpage. For example, the Quandl code GOOG/NASDAQ_SWTX defines the historical NASDAQ index data published by Google Finance. Every dataset is available in three formats--CSV, JSON, and XML.

Although an official Python client library is available from Quandl, we are not going to use that for the sake of demonstrating the general procedures of importing JSON data. According to Quandl's documentation, we can fetch JSON formatted data tables through the following API call:
GET https://www.quandl.com/api/v3/datasets/{Quandl code}/data.json

Let's try to get the Big Mac index data from Quandl.

from urllib.request import urlopen
import json
import time
import pandas as pd


def get_bigmac_codes():
"""Get a Pandas DataFrame of all codes in the Big Mac index dataset

The first column contains the code, while the second header
contains the description of the code.

for example,
ECONOMIST/BIGMAC_ARG,Big Mac Index - Argentina
ECONOMIST/BIGMAC_AUS,Big Mac Index - Australia
ECONOMIST/BIGMAC_BRA,Big Mac Index - Brazil

Returns:
codes: Pandas DataFrame of Quandl dataset codes"""

codes_url = "https://www.quandl.com/api/v3/databases/ECONOMIST/codes"
codes = pd.read_csv(codes_url, header=None, names=['Code', 'Description'],
compression='zip', encoding='latin_1')

return codes


def get_quandl_dataset(api_key, code):
"""Obtain and parse a quandl dataset in Pandas DataFrame format

Quandl returns dataset in JSON format, where data is stored as a
list of lists in response['dataset']['data'], and column headers
stored in response['dataset']['column_names'].

for example, {'dataset': {...,
'column_names': ['Date',
'local_price',
'dollar_ex',
'dollar_price',
'dollar_ppp',
'dollar_valuation',
'dollar_adj_valuation',
'euro_adj_valuation',
'sterling_adj_valuation',
'yen_adj_valuation',
'yuan_adj_valuation'],
'data': [['2017-01-31',
55.0,
15.8575,
3.4683903515687,
10.869565217391,
-31.454736135007,
6.2671477203176,
8.2697553162259,
29.626894343348,
32.714616745128,
13.625825886047],
['2016-07-31',
50.0,
14.935,
3.3478406427854,
9.9206349206349,
-33.574590420925,
2.0726096168216,
0.40224795003514,
17.56448458418,
19.76377270142,
11.643103380531]
],
'database_code': 'ECONOMIST',
'dataset_code': 'BIGMAC_ARG',
... }}

A custom column--country is added to denote the 3-letter country code.

Args:
api_key: Quandl API key
code: Quandl dataset code

Returns:
df: Pandas DataFrame of a Quandl dataset

"""
base_url = "https://www.quandl.com/api/v3/datasets/"
url_suffix = ".json?api_key="

# Fetch the JSON response
u = urlopen(base_url + code + url_suffix + api_key)
response = json.loads(u.read().decode('utf-8'))

# Format the response as Pandas Dataframe
df = pd.DataFrame(response['dataset']['data'], columns=response['dataset']['column_names'])

# Label the country code
df['country'] = code[-3:]

return df


quandl_dfs = []
codes = get_bigmac_codes()

# Replace this with your own API key
api_key = "INSERT YOUR KEY HERE"

for code in codes.Code:
# Get the DataFrame of a Quandl dataset
df = get_quandl_dataset(api_key, code)

# Store in a list
quandl_dfs.append(df)

# Prevents exceeding the API speed limit
time.sleep(2)


# Concatenate the list of dataframes into a single one
bigmac_df = pd.concat(quandl_dfs)
bigmac_df.head()

The expected output is as follows:

- Date local_price dollar_ex dollar_price dollar_ppp dollar_valuation dollar_adj_valuation euro_adj_valuation sterling_adj_valuation yen_adj_valuation yuan_adj_valuation country
0 2017-01-31 55.0 15.85750 3.468390 10.869565 -31.454736 6.26715 8.26976 29.6269 32.7146 13.6258 ARG
1 2016-07-31 50.0 14.93500 3.347841 9.920635 -33.574590 2.07261 0.402248 17.5645 19.7638 11.6431 ARG
2 2016-01-31 33.0 13.80925 2.389703 6.693712 -51.527332 -24.8619 -18.714 -18.7209 0.40859 -17.029 ARG
3 2015-07-31 28.0 9.13500 3.065134 5.845511 -36.009727 -4.7585 -0.357918 -6.01091 30.8609 5.02868 ARG
4 2015-01-31 28.0 8.61000 3.252033 5.845511 -32.107881 0.540242 -0.804495 -2.49468 34.3905 6.01183 ARG

 

The Big Mac index was invented by The Economist in 1986 as a lighthearted guide to check whether currencies are at their correct level. It is based on the theory of purchasing power parity (PPP) and is considered an informal measure of currency exchange rates at PPP. It measures their value against a similar basket of goods and services, in this case, a Big Mac. Differing prices at market exchange rates would imply that one currency is undervalued or overvalued.

The code for parsing JSON from the Quandl API is a bit more complicated, and thus extra explanations might help you to understand it. The first function, get_bigmac_codes(), parses the list of all available dataset codes in the Quandl Economist database as a pandas DataFrame. Meanwhile, the second function, get_quandl_dataset(api_key, code), converts the JSON response of a Quandl dataset API query to a pandas DataFrame. All datasets obtained are concatenated using pandas.concat().

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset