Since we are going to discuss stock data extensively, note that we do not guarantee the accuracy, completeness, or validity of the content presented; nor are we responsible for any errors or omissions that may have occurred. The data, visualizations, and analyses are provided on an “as is” basis for educational purposes only, without any representations, warranties, or conditions of any kind. Therefore, the publisher and the authors do not accept liability for your use of the content. It should be noted that past stock performance may not predict future performance. Readers should also be aware of the risks involved in stock investments and should not take any investment decisions based on the content in this chapter. In addition, readers are advised to conduct their own independent research into individual stocks before making an investment decision.
We are going to adapt the Quandl JSON API code in Chapter 7, Visualizing Online Data to get EOD stock data from Quandl. The historical stock data from January 1, 2017 to June 30, 2017 for six stock codes will be obtained: Apple Inc.(EOD/AAPL), The Procter & Gamble Company (EOD/PG), Johnson & Johnson (EOD/JNJ), Exxon Mobil Corporation (EOD/XOM), International Business Machines Corporation (EOD/IBM), and Microsoft Corporation (EOD/MSFT). Again, we will use the default urllib and json modules to handle Quandl API calls, followed by converting the data into a Pandas DataFrame:
from urllib.request import urlopen
import json
import pandas as pd
def get_quandl_dataset(api_key, code, start_date, end_date):
"""Obtain and parse a quandl dataset in Pandas DataFrame format
Quandl returns dataset in JSON format, where data is stored as a
list of lists in response['dataset']['data'], and column headers
stored in response['dataset']['column_names'].
Args:
api_key: Quandl API key
code: Quandl dataset code
Returns:
df: Pandas DataFrame of a Quandl dataset
"""
base_url = "https://www.quandl.com/api/v3/datasets/"
url_suffix = ".json?api_key="
date = "&start_date={}&end_date={}".format(start_date, end_date)
# Fetch the JSON response
u = urlopen(base_url + code + url_suffix + api_key + date)
response = json.loads(u.read().decode('utf-8'))
# Format the response as Pandas Dataframe
df = pd.DataFrame(response['dataset']['data'], columns=response['dataset']
['column_names'])
return df
# Input your own API key here
api_key = "INSERT YOUR KEY HERE"
# Quandl code for six US companies
codes = ["EOD/AAPL", "EOD/PG", "EOD/JNJ", "EOD/XOM", "EOD/IBM", "EOD/MSFT"]
start_date = "2017-01-01"
end_date = "2017-06-30"
dfs = []
# Get the DataFrame that contains the EOD data for each company
for code in codes:
df = get_quandl_dataset(api_key, code, start_date, end_date)
df["Company"] = code[4:]
dfs.append(df)
# Concatenate all dataframes into a single one
stock_df = pd.concat(dfs)
# Sort by ascending order of Company then Date
stock_df = stock_df.sort_values(["Company","Date"])
stock_df.head()
The dataframe contains Opening, High, Low, and Closing (OHLC) prices for each stock. Extra information is also available; for example, the dividend column reflects the cash dividend value on that day. The split column shows the ratio of new shares to old shares if a split occurred on that day. The adjusted prices account for price fluctuations due to distributions or corporate actions by assuming that all these actions were reinvested into the current stock. For more information about these columns, consult the documentation pages on Quandl.