Time for action – predicting price with a linear model

Keeping an open mind, let's assume that we can express a stock price as a linear combination of previous values, that is, a sum of those values multiplied by certain coefficients we need to determine. In linear algebra terms, this boils down to finding a least squares solution. This recipe goes as follows.

  1. First, form a vector b containing the last N price values, newest first.
    b = c[-N:]
    b = b[::-1]
    print "b", b

    The result is as follows:

    b [ 351.99  346.67  352.47  355.76  355.36]
  2. Second, pre-initialize A as an N-by-N matrix of zeros.
    A = np.zeros((N, N), float)
    print "Zeros N by N", A

    The result is as follows:

    Zeros N by N [[ 0.  0.  0.  0.  0.]
     [ 0.  0.  0.  0.  0.]
     [ 0.  0.  0.  0.  0.]
     [ 0.  0.  0.  0.  0.]
     [ 0.  0.  0.  0.  0.]]
  3. Third, fill the matrix A with the N preceding price values for each value in b.
    for i in range(N):
       A[i, ] = c[-N - 1 - i: - 1 - i]
    print "A", A

    Now, A looks like this:

    A [[ 360.    355.36  355.76  352.47  346.67]
     [ 359.56  360.    355.36  355.76  352.47]
     [ 352.12  359.56  360.    355.36  355.76]
     [ 349.31  352.12  359.56  360.    355.36]
     [ 353.21  349.31  352.12  359.56  360.  ]]
  4. The objective is to determine the coefficients that satisfy our linear model, by solving the least squares problem. Employ the lstsq function of the NumPy linalg package to do that.
    (x, residuals, rank, s) = np.linalg.lstsq(A, b)
    
    print x, residuals, rank, s

    The result is as follows:

    [ 0.78111069 -1.44411737  1.63563225 -0.89905126  0.92009049] [] 5 [  1.77736601e+03   1.49622969e+01   8.75528492e+00   5.15099261e+00   1.75199608e+00]

    The returned tuple contains the coefficients x that we were after, an array of residuals (empty here), the rank of matrix A, and the singular values of A.

  5. Once we have the coefficients of our linear model, we can predict the next price value. Compute the dot product (with the NumPy dot function) of the coefficients and the last known N prices.
    print np.dot(b, x)

    The dot product is the linear combination of the coefficients x and the prices b. As a result, we get the following:

    357.939161015

    I looked it up; the actual close price of the next day was 353.56. So, our estimate with N = 5 was off by about 4.38, or roughly 1.2 percent.
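
Taken together, the five steps can be sketched as a single function and checked on a sequence whose next value is known. This is only an illustrative sketch: the name predict_next and the test sequence are made up for this example, and it is written for Python 3, so it uses print() and passes rcond=None to lstsq. One assumption differs from the steps above: the sketch feeds the latest prices to np.dot oldest first, in the same order as each row of A, rather than reusing the reversed b.

```python
import numpy as np


def predict_next(prices, n):
    """Fit n coefficients on the last 2 * n prices and predict the next price.

    Hypothetical helper for illustration; it is not part of NumPy.
    """
    prices = np.asarray(prices, dtype=float)
    if len(prices) < 2 * n:
        raise ValueError("need at least 2 * n prices")

    # Targets: the last n prices, newest first (b in the steps above).
    b = prices[-n:][::-1]

    # Row i holds the n prices that precede b[i], oldest first.
    A = np.zeros((n, n), float)
    for i in range(n):
        A[i] = prices[-n - 1 - i: -1 - i]

    # Least squares solution of A x = b.
    x, residuals, rank, s = np.linalg.lstsq(A, b, rcond=None)

    # Assumption: the latest n prices are passed oldest first,
    # which matches the column order of A.
    latest = prices[-n:]
    return np.dot(latest, x)


# Pell numbers satisfy p[t] = p[t-2] + 2 * p[t-1] exactly,
# so a two-coefficient linear model can reproduce them.
series = [0, 1, 2, 5, 12, 29, 70, 169]
print(predict_next(series, 2))  # approximately 408, the next Pell number
```

Real prices do not follow an exact recurrence, so the estimate for market data will not be this clean; the synthetic sequence only confirms that the matrix construction and the fit are wired together consistently.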

What just happened?

We predicted tomorrow's stock price today. If this works in practice, we could retire early! See, this book was a good investment after all! We designed a linear model for the predictions, which reduced the financial problem to a linear algebra one. NumPy's linalg package has a practical lstsq function that helped us with the task at hand: estimating the coefficients of a linear model. After obtaining a solution, we passed the coefficients and the prices to the NumPy dot function, which gave us an estimate through linear regression (see linearmodel.py).

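import numpy as np
import sys

# Number of prices to use, passed as the first command-line argument
N = int(sys.argv[1])

# Load the price column (index 6) from data.csv
c = np.loadtxt('data.csv', delimiter=',', usecols=(6,), unpack=True)

# Last N prices, newest first
b = c[-N:]
b = b[::-1]
print "b", b

A = np.zeros((N, N), float)
print "Zeros N by N", A

# Row i holds the N prices that precede b[i]
for i in range(N):
   A[i, ] = c[-N - 1 - i: - 1 - i]

print "A", A

# Solve the least squares problem A x = b for the coefficients x
(x, residuals, rank, s) = np.linalg.lstsq(A, b)

print x, residuals, rank, s

# Estimate the next price
print np.dot(b, x)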
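
The listing prints an empty residuals array, which is expected: lstsq only reports the sum of squared residuals when the system has more rows than columns and full rank. The following sketch, on two small made-up systems, contrasts the square case used above with an overdetermined one. It is written for Python 3 and passes rcond=None; the matrices are arbitrary illustrations, not price data.

```python
import numpy as np

# Square, full-rank system: 2 equations, 2 unknowns (like A with N rows and N columns).
A_square = np.array([[1.0, 0.0],
                     [0.0, 2.0]])
b_square = np.array([3.0, 4.0])
x, residuals, rank, s = np.linalg.lstsq(A_square, b_square, rcond=None)
print(x, residuals, rank, s)  # residuals is an empty array for a square system

# Overdetermined, full-rank system: 3 equations, 2 unknowns.
A_tall = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]])
b_tall = np.array([1.0, 1.0, 3.0])
x_tall, residuals_tall, rank_tall, s_tall = np.linalg.lstsq(A_tall, b_tall, rcond=None)
print(x_tall, residuals_tall, rank_tall, s_tall)  # residuals_tall has one entry
```

In the second system no exact solution exists, so the single entry in residuals_tall measures how far the best fit remains from b_tall.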