For the purpose of this recipe, imagine that we are operating a hedge fund. Let it sink in; you are part of the one percent now!
Power laws occur in a lot of places; see http://en.wikipedia.org/wiki/Power_law for more information. The Pareto principle (http://en.wikipedia.org/wiki/Pareto_principle), for instance, is a power law that describes how unevenly wealth is distributed. This principle tells us that if we group people by their wealth, the sizes of the groups will differ by orders of magnitude. To put it simply, there are not a lot of rich people, and there are even fewer billionaires; hence the one percent.
Assume that the log returns of the closing stock prices follow a power law. This is a big assumption, of course, but power law assumptions seem to pop up all over the place.
We don't want to trade too often, because every trade incurs transaction costs. Let's say that we would prefer to buy and sell about once a month, based on a significant correction (in other words, a big drop). The issue is to determine an appropriate signal, given that we want to initiate a transaction on roughly one out of every 20 trading days.
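To make the "one day out of 20" target concrete, here is a minimal sketch that derives a threshold from an empirical quantile. This is not the power-law approach the recipe goes on to take; it is only one simple way to see what such a signal looks like. The log returns are synthetic, and the names `threshold` and `signal_days` are made up for illustration.

```python
import numpy as np

# Synthetic daily log returns stand in for real data (an assumption).
rng = np.random.default_rng(42)
logreturns = rng.normal(loc=0.0, scale=0.01, size=250)

# One day in 20 is 5 percent, so take the 5th percentile of the returns.
threshold = np.percentile(logreturns, 5)

# Indices of the days whose drop is at least as large as the threshold.
signal_days = np.flatnonzero(logreturns <= threshold)
fraction = signal_days.size / logreturns.size

print("threshold:", threshold)
print("fraction of days that trigger:", fraction)
```

With about 250 trading days in a year, a 5 percent trigger rate corresponds to roughly one transaction per month.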
First, let's get historical end-of-day data for the past year from Yahoo Finance. After that, extract the close prices for this period. These steps are described in the previous recipe.
Now calculate the log returns for the close prices. For more information on log returns, refer to http://en.wikipedia.org/wiki/Rate_of_return.
First, we will take the log of the close prices, and then compute the first difference of these values with the NumPy diff function. Let's select the positive values from the log returns. Why the positive values? It doesn't really matter; I like being positive:
logreturns = numpy.diff(numpy.log(close))
pos = logreturns[logreturns > 0]
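As a quick check of what these two lines do, the following sketch runs them on a handful of hypothetical prices (the numbers are invented, not market data). It also shows why log returns are convenient: they add up to the log of the overall price ratio.

```python
import numpy as np

# Hypothetical close prices, chosen only for illustration.
close = np.array([100.0, 102.0, 101.0, 105.0, 104.0])

# The log return for day i is log(close[i] / close[i - 1]).
logreturns = np.diff(np.log(close))

# Boolean indexing keeps only the days on which the price went up.
pos = logreturns[logreturns > 0]

# Log returns are additive: their sum equals log(last / first).
total = logreturns.sum()

print(logreturns)
print(pos)
print(total, np.log(close[-1] / close[0]))
```

Note that diff returns one element fewer than its input, so five prices give four returns.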
We need to get the frequencies of the returns with the histogram function. It returns the counts and an array of the bin edges. At the end, we need to take the log of the frequencies in order to get a nice linear relation:
counts, rets = numpy.histogram(pos)
rets = rets[:-1] + (rets[1] - rets[0])/2
freqs = 1.0/counts
freqs = numpy.log(freqs)
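The sketch below walks through the same step on synthetic positive returns. Two details are worth seeing: histogram returns one more edge than there are bins, and an empty bin would make 1.0/counts divide by zero. The filtering of empty bins is an addition for illustration and is not part of the recipe.

```python
import numpy as np

# Made-up positive returns; real data would come from the previous step.
rng = np.random.default_rng(0)
pos = rng.exponential(scale=0.01, size=200)

# With 10 bins, histogram returns 10 counts and 11 bin edges.
counts, edges = np.histogram(pos, bins=10)

# Bin centers: the midpoint of each pair of neighboring edges.
centers = (edges[:-1] + edges[1:]) / 2

# Drop empty bins so that 1.0 / counts never divides by zero.
nonempty = counts > 0
centers = centers[nonempty]
log_inv_counts = np.log(1.0 / counts[nonempty])

print(counts)
print(centers)
print(log_inv_counts)
```

For equally wide bins, the midpoint formula gives the same centers as adding half a bin width to the left edges, which is what the recipe does.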
Use the polyfit function to do a linear fit:
p = numpy.polyfit(rets, freqs, 1)
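To see how the result of polyfit is laid out, here is a small sketch that fits points lying exactly on a known line. The data is invented; the point is only that the coefficients come back with the highest power first, so p[0] is the slope and p[1] is the intercept.

```python
import numpy as np

# Points on the line y = -3x + 2, so the fit should recover these values.
x = np.linspace(0.0, 1.0, 11)
y = -3.0 * x + 2.0

# Degree 1 requests a straight line; p holds [slope, intercept].
p = np.polyfit(x, y, 1)

# polyval evaluates the fitted polynomial, same as p[0] * x + p[1].
fitted = np.polyval(p, x)

print(p)
```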
Finally, we will plot the data and linear fit with Matplotlib:
matplotlib.pyplot.plot(rets, freqs, 'o')
matplotlib.pyplot.plot(rets, p[0] * rets + p[1])
matplotlib.pyplot.show()
We get a nice plot of the linear fit, returns, and frequencies:
The following is the complete code:
from matplotlib.finance import quotes_historical_yahoo
from datetime import date
import numpy
import sys
import matplotlib.pyplot

#1. Get close prices.
today = date.today()
start = (today.year - 1, today.month, today.day)
quotes = quotes_historical_yahoo(sys.argv[1], start, today)
close = numpy.array([q[4] for q in quotes])

#2. Get positive log returns.
logreturns = numpy.diff(numpy.log(close))
pos = logreturns[logreturns > 0]

#3. Get frequencies of returns.
counts, rets = numpy.histogram(pos)
rets = rets[:-1] + (rets[1] - rets[0])/2
freqs = 1.0/counts
freqs = numpy.log(freqs)

#4. Fit the frequencies and returns to a line.
p = numpy.polyfit(rets, freqs, 1)

#5. Plot the results.
matplotlib.pyplot.plot(rets, freqs, 'o')
matplotlib.pyplot.plot(rets, p[0] * rets + p[1])
matplotlib.pyplot.show()
The histogram function calculates the histogram of a data set. It returns the histogram values and bin edges. The polyfit function fits data to a polynomial of given order. In this case, we chose a linear fit. We "discovered" a power law; you have to be careful making such claims, but the evidence looks promising.
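One reason for the caution: a power law y = c * x**k becomes a straight line only when both axes are on a log scale. The recipe takes the log of the frequencies but not of the returns, and a straight line on that kind of plot is also what an exponential relation would produce. The following sketch, on made-up data that follows an exact power law, shows what a log-log fit would look like; the constants 2.0 and -1.5 are arbitrary.

```python
import numpy as np

# Made-up data that follows y = 2.0 * x**-1.5 exactly.
x = np.linspace(0.005, 0.05, 10)
y = 2.0 * x ** -1.5

# Taking logs of both sides gives log(y) = k * log(x) + log(c),
# so a linear fit in log-log space recovers the exponent k.
slope, intercept = np.polyfit(np.log(x), np.log(y), 1)

print("exponent k:", slope)
print("constant c:", np.exp(intercept))
```

On real returns the points will scatter around the line, so the fitted exponent is an estimate rather than a confirmation that a power law holds.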