We should bear in mind that the Big Mac index is not directly comparable between countries. Normally, we would expect commodities in poor countries to be cheaper than those in rich ones. To represent a fairer picture of the index, it would be better to show the relationship between Big Mac pricing and Gross Domestic Product (GDP) per capita.
We are going to acquire GDP per capita from Quandl's World Bank World Development Indicators (WWDI) dataset. Based on the previous code example of acquiring JSON data from Quandl, can you try to adapt it to download the GDP per capita dataset?
For those who are impatient, here is the full code:
import urllib
import json
import pandas as pd
import time
from urllib.request import urlopen
def get_gdp_dataset(api_key, country_code):
"""Obtain and parse a quandl GDP dataset in Pandas DataFrame format
Quandl returns dataset in JSON format, where data is stored as a
list of lists in response['dataset']['data'], and column headers
stored in response['dataset']['column_names'].
Args:
api_key: Quandl API key
country_code: Three letter code to represent country
Returns:
df: Pandas DataFrame of a Quandl dataset
"""
base_url = "https://www.quandl.com/api/v3/datasets/"
url_suffix = ".json?api_key="
# Compose the Quandl API dataset code to get GDP per capita
# (constant 2000 US$) dataset
gdp_code = "WWDI/" + country_code + "_NY_GDP_PCAP_KD"
# Parse the JSON response from Quandl API
# Some countries might be missing, so we need error handling code
try:
u = urlopen(base_url + gdp_code + url_suffix + api_key)
except urllib.error.URLError as e:
print(gdp_code,e)
return None
response = json.loads(u.read().decode('utf-8'))
# Format the response as Pandas Dataframe
df = pd.DataFrame(response['dataset']['data'], columns=response['dataset']['column_names'])
# Add a new country code column
df['country'] = country_code
return df
api_key = "INSERT YOUR KEY HERE"
quandl_dfs = []
# Loop through all unique country code values in the BigMac index DataFrame
for country_code in bigmac_df.country.unique():
# Fetch the GDP dataset for the corresponding country
df = get_gdp_dataset(api_key, country_code)
# Skip if the response is empty
if df is None:
continue
# Store in a list DataFrames
quandl_dfs.append(df)
# Prevents exceeding the API speed limit
time.sleep(2)
# Concatenate the list of DataFrames into a single one
gdp_df = pd.concat(quandl_dfs)
gdp_df.head()
The expected output:
WWDI/EUR_NY_GDP_PCAP_KD HTTP Error 404: Not Found
WWDI/SIN_NY_GDP_PCAP_KD HTTP Error 404: Not Found
WWDI/ROC_NY_GDP_PCAP_KD HTTP Error 404: Not Found
WWDI/UAE_NY_GDP_PCAP_KD HTTP Error 404: Not Found
Date | Value | country | |
0 | 2015-12-31 | 10501.660269 | ARG |
1 | 2014-12-31 | 10334.780146 | ARG |
2 | 2013-12-31 | 10711.229530 | ARG |
3 | 2012-12-31 | 10558.265365 | ARG |
4 | 2011-12-31 | 10780.342508 | ARG |
We can see that the GDP per capita dataset is not available for four geographical locations, but we can ignore that for now.
Next, we will merge the two DataFrames that contain Big Mac Index and GDP per capita respectively using pandas.merge(). The most recent record in WWDI's GDP per capita dataset was collected at the end of 2015, so let's pair that up with the corresponding Big Mac index dataset in the same year.
For those who are familiar with the SQL language, pandas.merge() supports four modes, namely left, right, inner, and outer joins. Since we are interested in rows that have matching countries in both DataFrames only, we are going to choose inner join:
merged_df = pd.merge(bigmac_df[(bigmac_df.Date == "2015-01-31")], gdp_df[(gdp_df.Date == "2015-12-31")], how='inner', on='country')
merged_df.head()
Date_x | local_price | dollar_ex | dollar_price | dollar_ppp | dollar_valuation | dollar_adj_valuation | euro_adj_valuation | sterling_adj_valuation | yen_adj_valuation | yuan_adj_valuation | country | Date_y | Value | |
0 | 2015-01-31 | 28.00 | 8.610000 | 3.252033 | 5.845511 | -32.107881 | 0.540242 | -0.804495 | -2.49468 | 34.3905 | 6.01183 | ARG | 2015-12-31 | 10501.660269 |
1 | 2015-01-31 | 5.30 | 1.227220 | 4.318705 | 1.106472 | -9.839144 | -17.8995 | -18.9976 | -20.3778 | 9.74234 | -13.4315 | AUS | 2015-12-31 | 54688.445933 |
2 | 2015-01-31 | 13.50 | 2.592750 | 5.206827 | 2.818372 | 8.702019 | 68.4555 | 66.2024 | 63.3705 | 125.172 | 77.6231 | BRA | 2015-12-31 | 11211.891104 |
3 | 2015-01-31 | 2.89 | 0.661594 | 4.368235 | 0.603340 | -8.805115 | 3.11257 | 1.73343 | 0 | 37.8289 | 8.72415 | GBR | 2015-12-31 | 41182.619517 |
4 | 2015-01-31 | 5.70 | 1.228550 | 4.639616 | 1.189979 | -3.139545 | -2.34134 | -3.64753 | -5.28928 | 30.5387 | 2.97343 | CAN | 2015-12-31 | 50108.065004 |