The recarray class is a subclass of ndarray. Record arrays can hold records like those in a database, with a different data type for each field. For instance, we can store records about employees, containing numerical data such as salary and strings such as the employee name.
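The employee example can be sketched as follows. This is a minimal illustration, not part of the recipe: the field names, employee names, and salaries are made up, and the array is built by viewing an ordinary structured array as a recarray.

```python
import numpy as np

# Hypothetical employee records; names and salaries are invented for illustration.
employees = np.array(
    [("Alice", 52000.0), ("Bob", 48500.0)],
    dtype=[("name", "U16"), ("salary", float)],
).view(np.recarray)

# A recarray is still an ndarray, so the usual array machinery applies.
print(isinstance(employees, np.ndarray))

# Each field can hold a different data type and is addressable by name.
print(employees["name"])
print(employees["salary"].mean())
```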
Modern economic theory tells us that investing boils down to optimizing risk and reward. Risk is represented by the standard deviation of log returns (for more information on arithmetic and logarithmic returns, visit http://en.wikipedia.org/wiki/Rate_of_return#Arithmetic_and_logarithmic_return). Reward, on the other hand, is represented by the average of log returns. We can come up with a relative score, where a high score means low risk and high reward. We will calculate the scores for several stocks and store them, together with the stock symbol, in a table format using a NumPy recarray object.
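To make the risk and reward measures concrete, here is a small sketch that computes them from a short series of closing prices. The prices are made up for illustration; the recipe itself downloads real quotes.

```python
import numpy as np

# Made-up closing prices, used only to illustrate the calculation.
close = np.array([100.0, 101.0, 99.5, 102.0, 103.5])

# Log returns: differences of the logarithm of consecutive prices.
logrets = np.diff(np.log(close))

reward = logrets.mean()   # average log return
risk = logrets.std()      # standard deviation of log returns
stdscore = 1 / risk       # lower risk gives a higher score

print(reward, risk, stdscore)
```

Because log returns are additive, their sum equals the logarithm of the last price divided by the first, which is a quick sanity check on the calculation.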
We will start by creating the record array.
Create a record array in which each record holds a symbol, a standard deviation score, a mean score, and an overall score:
weights = numpy.recarray((len(tickers),), dtype=[('symbol', numpy.str_, 16), ('stdscore', float), ('mean', float), ('score', float)])
To keep things simple, we will initialize the scores in a loop based on the log returns:
for i in range(len(tickers)):
    close = get_close(tickers[i])
    # Log returns are the differences of the log of consecutive closing prices
    logrets = numpy.diff(numpy.log(close))
    weights[i]['symbol'] = tickers[i]
    weights[i]['mean'] = logrets.mean()
    weights[i]['stdscore'] = 1/logrets.std()
    weights[i]['score'] = 0
As you can see, we can access elements using the field names we defined in the previous step.
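The different ways of reaching a field can be sketched as follows, using placeholder numbers instead of downloaded quotes (the values are made up). One detail worth noting: attribute access first looks for regular ndarray attributes, so a field named mean is shadowed by the ndarray mean() method and has to be read with the bracket syntax.

```python
import numpy as np

tickers = ['MRK', 'T', 'VZ']
weights = np.recarray((len(tickers),),
                      dtype=[('symbol', 'U16'), ('stdscore', float),
                             ('mean', float), ('score', float)])

# Placeholder numbers, made up for illustration (not real market data).
weights['symbol'] = tickers
weights['stdscore'] = [50.0, 60.0, 70.0]
weights['mean'] = [0.001, 0.002, 0.003]
weights['score'] = 0

# A single record: fields by name or as attributes.
first = weights[0]
print(first['symbol'], first.stdscore)

# A whole column: attribute access works when the name is not already taken.
print(weights.stdscore)

# 'mean' clashes with ndarray.mean, so weights.mean is the method, not the field.
print(callable(weights.mean))
print(weights['mean'])
```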
We now have some numbers, but they are not easy to compare with each other. Normalize the scores, so that we can combine them later. Here, normalizing means making sure that the scores add up to one:
for key in ['mean', 'stdscore']:
    wsum = weights[key].sum()
    weights[key] = weights[key]/wsum
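The effect of this normalization can be checked on a tiny example. The numbers below are made up and chosen so that the result is easy to verify by hand; only the two intermediate score fields are included.

```python
import numpy as np

# Made-up raw scores for three hypothetical records.
scores = np.array(
    [(2.0, 10.0), (3.0, 30.0), (5.0, 60.0)],
    dtype=[('mean', float), ('stdscore', float)],
).view(np.recarray)

# Divide each column by its own sum so that the column adds up to one.
for key in ['mean', 'stdscore']:
    scores[key] = scores[key] / scores[key].sum()

print(scores['mean'])      # 2, 3, 5 out of a total of 10
print(scores['stdscore'])  # 10, 30, 60 out of a total of 100
```

After this step both columns are on the same scale, which is what makes averaging them meaningful.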
The overall score will just be the average of the intermediate scores. Sort the records on the overall score to produce a ranking:
weights['score'] = (weights['stdscore'] + weights['mean'])/2
# Sort whole records by the overall score, so each symbol stays with its scores
weights.sort(order='score')
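When ranking records, it matters what exactly gets sorted. The sketch below, with made-up symbols and scores, uses the order argument of sort() to reorder whole records by one field. Sorting a single field view on its own would reorder only that column and leave the other fields where they were.

```python
import numpy as np

# Made-up scores, deliberately out of order.
ranking = np.array(
    [('VZ', 0.41), ('MRK', 0.24), ('T', 0.36)],
    dtype=[('symbol', 'U16'), ('score', float)],
).view(np.recarray)

# In-place sort of complete records, keyed on the 'score' field (ascending).
ranking.sort(order='score')

for record in ranking:
    print(record['symbol'], record['score'])
```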
The following is the complete code for this example:
import numpy
# matplotlib.finance was removed in Matplotlib 2.2, so this import needs an older release
from matplotlib.finance import quotes_historical_yahoo
from datetime import date

# DJIA stock with div yield > 4 %
tickers = ['MRK', 'T', 'VZ']

def get_close(ticker):
    today = date.today()
    start = (today.year - 1, today.month, today.day)
    quotes = quotes_historical_yahoo(ticker, start, today)
    return numpy.array([q[4] for q in quotes])

weights = numpy.recarray((len(tickers),), dtype=[('symbol', numpy.str_, 16), ('stdscore', float), ('mean', float), ('score', float)])

for i in range(len(tickers)):
    close = get_close(tickers[i])
    logrets = numpy.diff(numpy.log(close))
    weights[i]['symbol'] = tickers[i]
    weights[i]['mean'] = logrets.mean()
    weights[i]['stdscore'] = 1/logrets.std()
    weights[i]['score'] = 0

# Normalize each intermediate score so that it sums to one
for key in ['mean', 'stdscore']:
    wsum = weights[key].sum()
    weights[key] = weights[key]/wsum

weights['score'] = (weights['stdscore'] + weights['mean'])/2
# Sort whole records by the overall score, so each symbol stays with its scores
weights.sort(order='score')

for record in weights:
    print("%s,mean=%.4f,stdscore=%.4f,score=%.4f" % (record['symbol'], record['mean'], record['stdscore'], record['score']))
This program produces the following output:
MRK,mean=0.1862,stdscore=0.2886,score=0.2374
T,mean=0.3570,stdscore=0.3556,score=0.3563
VZ,mean=0.4569,stdscore=0.3557,score=0.4063
We computed scores for several stocks and stored them in a NumPy recarray object. This array enables us to mix data of different data types, in this case, stock symbols and numerical scores. Record arrays allow us to access fields as array members, for example, arr.field. This tutorial covered the creation of a record array. More record array related functions can be found in the numpy.rec module.
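As one example of these helper functions, numpy.rec.fromarrays builds a record array from a list of column arrays. The sketch below reuses the symbols and overall scores from the output above; the field names passed to names are our own choice.

```python
import numpy as np

# One list per column: symbols and overall scores taken from the output above.
symbols = ['MRK', 'T', 'VZ']
scores = [0.2374, 0.3563, 0.4063]

# Build the record array column by column; field types are inferred from the data.
table = np.rec.fromarrays([symbols, scores], names='symbol,score')

print(table.dtype.names)
print(table.symbol)
print(table.score)
```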