Performing a normality test with scikits-statsmodels

The scikits-statsmodels package provides many statistical tests. This recipe shows one of them: the Anderson-Darling test for normality (http://en.wikipedia.org/wiki/Anderson%E2%80%93Darling_test).

How to do it...

We will download price data as in the previous recipe, but this time for a single stock. Again, we will calculate the log returns of the close price of this stock, and use them as input for the normality test function.
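The log return calculation can be tried on its own before any data is downloaded. The following is a minimal sketch, assuming a small array of made-up closing prices (the values are hypothetical and are not real quotes):

```python
import numpy

# Hypothetical closing prices, for illustration only (not real quotes).
close = numpy.array([100.0, 101.0, 99.5, 102.0, 103.5])

# log(p[t]) - log(p[t-1]) equals log(p[t] / p[t-1]), the log return.
log_returns = numpy.diff(numpy.log(close))

# The result has one element fewer than the price series.
print(log_returns.shape)
```

An array like log_returns is what gets passed to the normality test function.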

The normality test function, normal_ad, returns a tuple. The first element is the Anderson-Darling test statistic and the second element is a p-value between zero and one. The complete code for this recipe is as follows:

import datetime
import sys

import numpy
from matplotlib import finance
from statsmodels.stats.adnorm import normal_ad

# 1. Download price data for the symbol given on the command line

# 2011 to 2012
start = datetime.datetime(2011, 1, 1)
end = datetime.datetime(2012, 1, 1)

print("Retrieving data for %s" % sys.argv[1])
quotes = finance.quotes_historical_yahoo(sys.argv[1], start, end, asobject=True)

# 2. Extract the close prices as a float array
close = numpy.array(quotes.close).astype(float)
print(close.shape)

# 3. Run the Anderson-Darling test on the log returns
print(normal_ad(numpy.diff(numpy.log(close))))

The following shows the output of the script for AAPL, with a p-value of about 0.137:

Retrieving data for AAPL
(252,)
(0.57103805516803163, 0.13725944999430437)
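The two numbers in the tuple are related: the p-value is derived from the test statistic and the sample size. As a rough check, the sketch below applies a widely used piecewise approximation (often attributed to D'Agostino and Stephens) to the printed statistic. That the library uses this same approximation is an assumption here, and the helper name ad_pvalue is hypothetical. The sample size is 251 because 252 close prices yield 251 log returns.

```python
import math


def ad_pvalue(a2, n):
    """Approximate p-value for an Anderson-Darling normality statistic.

    Assumes the mean and variance were estimated from the sample.
    """
    # Small-sample adjustment of the raw statistic.
    a2_star = a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)
    if a2_star < 0.2:
        return 1.0 - math.exp(-13.436 + 101.14 * a2_star - 223.73 * a2_star ** 2)
    if a2_star < 0.34:
        return 1.0 - math.exp(-8.318 + 42.796 * a2_star - 59.938 * a2_star ** 2)
    if a2_star < 0.6:
        return math.exp(0.9177 - 4.279 * a2_star - 1.38 * a2_star ** 2)
    return math.exp(1.2937 - 5.709 * a2_star + 0.0186 * a2_star ** 2)


statistic = 0.57103805516803163  # first element of the printed tuple
n_returns = 252 - 1              # 252 close prices give 251 log returns
p_value = ad_pvalue(statistic, n_returns)
print(round(p_value, 4))         # 0.1373, in line with the printed p-value
```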

How it works...

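This recipe demonstrated the Anderson-Darling statistical test for normality, as found in scikits-statsmodels. We used the log returns of the stock's close prices as input; stock price data is generally not expected to follow a normal distribution. For this data, we got a p-value of about 0.137. The null hypothesis of the test is that the data come from a normal distribution, and a small p-value (conventionally below 0.05) counts as evidence against it. Since 0.137 is above that threshold, the test does not reject normality for this sample. This is weaker than confirming that the returns are normally distributed; it only means that this sample does not provide enough evidence to say otherwise.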
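One common way to read the p-value is to compare it with a significance level chosen in advance. The following is a minimal sketch, assuming a significance level of 0.05; both that level and the helper name reject_normality are illustrative choices and are not part of the library:

```python
ALPHA = 0.05  # assumed significance level, chosen before looking at the data


def reject_normality(p_value, alpha=ALPHA):
    """Return True when the test rejects the null hypothesis of normality."""
    return p_value < alpha


p_value = 0.13725944999430437  # second element of the tuple printed above
print(reject_normality(p_value))  # False: normality is not rejected at 5%
```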
