The NumPy polyfit function can fit a set of data points to a polynomial even if the underlying function is not continuous:
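# Load column 6 of each CSV file and fit a polynomial to the difference.
# The degree is read from the first command-line argument (needs import sys).
bhp = np.loadtxt('BHP.csv', delimiter=',', usecols=(6,), unpack=True)
vale = np.loadtxt('VALE.csv', delimiter=',', usecols=(6,), unpack=True)
t = np.arange(len(bhp))
poly = np.polyfit(t, bhp - vale, int(sys.argv[1]))
print("Polynomial fit", poly)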
The polynomial fit (in this example, a cubic polynomial was chosen):
Polynomial fit [ 1.11655581e-03 -5.28581762e-02 5.80684638e-01 5.79791202e+01]
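To make the coefficient order concrete, here is a minimal sketch that fits synthetic data instead of the BHP and VALE prices. The cubic in true_coeffs and the sample points are made up for illustration; the point is only that polyfit returns the coefficients with the highest power first.

```python
import numpy as np

# Hypothetical data: samples of a known cubic, not the BHP/VALE prices.
true_coeffs = np.array([0.5, -2.0, 1.0, 3.0])  # 0.5*x**3 - 2*x**2 + x + 3
x = np.arange(10)
y = np.polyval(true_coeffs, x)

# Fit a degree-3 polynomial; coefficients come back highest power first.
fitted = np.polyfit(x, y, 3)
print("Fitted coefficients", fitted)
```

Because the synthetic data contains no noise, the fitted coefficients should agree with true_coeffs up to floating-point error.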
Extrapolate to the next value with the polyval function and the coefficients we got from the fit:
print("Next value", np.polyval(poly, t[-1] + 1))
The next value we predict will be:
Next value 57.9743076081
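As a rough check of what polyval computes, the following sketch evaluates the printed cubic by hand and compares the result with polyval. The coefficients are copied from the output above as printed (so they are rounded), and the evaluation point x is an arbitrary choice, not a value from the data.

```python
import numpy as np

# Coefficients copied from the printed fit (rounded as printed).
poly = np.array([1.11655581e-03, -5.28581762e-02,
                 5.80684638e-01, 5.79791202e+01])

x = 10.0  # arbitrary illustrative point, not taken from the data

# Explicit evaluation: poly[0]*x**3 + poly[1]*x**2 + poly[2]*x + poly[3]
powers = range(len(poly) - 1, -1, -1)
manual = sum(c * x ** p for c, p in zip(poly, powers))

print(np.isclose(np.polyval(poly, x), manual))
```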
Find the roots of the fitted polynomial with the roots function:
print("Roots", np.roots(poly))
The roots of the polynomial are as follows:
Roots [ 35.48624287+30.62717062j 35.48624287-30.62717062j -23.63210575 +0.j ]
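One way to sanity-check the roots is to substitute them back into the polynomial, as in this sketch. It reuses the printed (rounded) coefficients, so the numbers may differ slightly from the output above. Note that two of the three roots are complex, so only the real one corresponds to a point where the fitted curve actually reaches zero.

```python
import numpy as np

# Coefficients copied from the printed fit (rounded as printed).
poly = np.array([1.11655581e-03, -5.28581762e-02,
                 5.80684638e-01, 5.79791202e+01])

roots = np.roots(poly)

# Each root should evaluate to (numerically) zero.
residuals = np.polyval(poly, roots)
print("Largest residual", np.abs(residuals).max())

# Keep only the roots whose imaginary part is negligible.
real_roots = roots[np.isclose(roots.imag, 0)].real
print("Real roots", real_roots)
```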
Differentiate the fitted polynomial with the polyder function:
der = np.polyder(poly)
print("Derivative", der)
The coefficients of the derivative polynomial are as follows:
Derivative [ 0.00334967 -0.10571635 0.58068464]
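The derivative coefficients follow from the power rule: each coefficient is multiplied by its power, and the constant term drops out. The sketch below reproduces that by hand using the printed (rounded) coefficients and compares the result with polyder.

```python
import numpy as np

# Coefficients copied from the printed fit (rounded as printed).
poly = np.array([1.11655581e-03, -5.28581762e-02,
                 5.80684638e-01, 5.79791202e+01])

degree = len(poly) - 1
powers = np.arange(degree, 0, -1)  # [3, 2, 1] for a cubic

# Power rule: d/dx (c * x**p) = c * p * x**(p - 1); the constant vanishes.
manual_der = poly[:-1] * powers

print(np.allclose(np.polyder(poly), manual_der))
```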
The extrema of the fit lie where this derivative is zero, so find the roots of the derivative:
print("Extremas", np.roots(der))
The extrema that we get are:
Extremas [ 24.47820054 7.08205278]
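The roots of the derivative do not say which extremum is a maximum and which is a minimum. A common way to tell them apart is the sign of the second derivative; the following sketch applies that test to the printed (rounded) coefficients. The variable names second, extrema, and kinds are introduced here for illustration only.

```python
import numpy as np

# Coefficients copied from the printed fit (rounded as printed).
poly = np.array([1.11655581e-03, -5.28581762e-02,
                 5.80684638e-01, 5.79791202e+01])

der = np.polyder(poly)      # first derivative
second = np.polyder(der)    # second derivative

# Drop a negligible imaginary part, if any, and sort for readability.
extrema = np.sort(np.real_if_close(np.roots(der)))

kinds = []
for x0 in extrema:
    # Negative curvature means a local maximum, positive a local minimum.
    kind = "maximum" if np.polyval(second, x0) < 0 else "minimum"
    kinds.append(kind)
    print(x0, kind)
```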
Let's double-check; compute the values of the fit with polyval:
vals = np.polyval(poly, t)
Now find the positions of the maximum and minimum values with argmax and argmin:
print(np.argmax(vals))
print(np.argmin(vals))
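This gives us the following results:
7
24
These are not quite the same as the roots of the derivative (about 7.08 and 24.48). If we backtrack to the first step, we can see why: t was defined with the arange function, so the fit was only evaluated at integer positions, and argmax and argmin return the integer positions with the largest and smallest fitted values.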
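The gap between the integer positions from argmax and argmin and the roots of the derivative can be illustrated with a sketch that evaluates the printed cubic on a coarse and on a fine grid. The number of samples (n_samples = 30) is an assumption made for this example; the text does not state the length of the data. The fine grid t_fine is likewise hypothetical.

```python
import numpy as np

# Coefficients copied from the printed fit (rounded as printed).
poly = np.array([1.11655581e-03, -5.28581762e-02,
                 5.80684638e-01, 5.79791202e+01])

n_samples = 30  # assumption: the data length is not given in the text

# Integer grid, as produced by arange in the tutorial.
t = np.arange(n_samples)
vals = np.polyval(poly, t)
coarse_max, coarse_min = np.argmax(vals), np.argmin(vals)
print(coarse_max, coarse_min)

# Finer grid over the same range, with a step of 0.01.
t_fine = np.linspace(0, n_samples - 1, 2901)
vals_fine = np.polyval(poly, t_fine)
fine_max = t_fine[np.argmax(vals_fine)]
fine_min = t_fine[np.argmin(vals_fine)]
print(fine_max, fine_min)
```

On the finer grid the positions move toward the roots of the derivative, which suggests the difference comes from the sampling step rather than from an error in the fit.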
Plot the data and the fit:
plot(t, bhp - vale)
plot(t, vals)
show()
It results in this plot:
Obviously, the smooth line is the fit and the jagged line is the underlying data. It's not that good a fit, so you might want to try a higher-order polynomial.
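To get a feel for how the degree affects the fit, the sketch below fits polynomials of degree 1 to 5 and prints the root-mean-square error of each. The series y is a seeded random walk used as a hypothetical stand-in for the price difference, so the numbers say nothing about the BHP and VALE data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the price difference: a random walk.
y = np.cumsum(rng.normal(size=30))
t = np.arange(len(y))

rms_errors = []
for degree in range(1, 6):
    coeffs = np.polyfit(t, y, degree)
    residuals = y - np.polyval(coeffs, t)
    rms_errors.append(np.sqrt(np.mean(residuals ** 2)))
    print(degree, rms_errors[-1])
```

The error on the fitted points cannot increase as the degree grows, because each higher-degree polynomial includes the lower-degree ones as a special case. A smaller error here does not mean better predictions, though; high-degree polynomials tend to extrapolate poorly.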
We fit data to a polynomial with the polyfit function. We learned about the polyval function that computes the values of a polynomial, the roots function that returns the roots of the polynomial, and the polyder function that gives back the derivative of a polynomial (see polynomials.py):
import numpy as np
import sys
from matplotlib.pyplot import plot
from matplotlib.pyplot import show

# Load column 6 of each CSV file.
bhp = np.loadtxt('BHP.csv', delimiter=',', usecols=(6,), unpack=True)
vale = np.loadtxt('VALE.csv', delimiter=',', usecols=(6,), unpack=True)

# Fit the difference to a polynomial; the degree is the first command-line argument.
t = np.arange(len(bhp))
poly = np.polyfit(t, bhp - vale, int(sys.argv[1]))
print("Polynomial fit", poly)

# Extrapolate one step past the last data point.
print("Next value", np.polyval(poly, t[-1] + 1))

print("Roots", np.roots(poly))

# The extrema are the roots of the derivative.
der = np.polyder(poly)
print("Derivative", der)
print("Extremas", np.roots(der))

# Double-check the extrema against the fitted values.
vals = np.polyval(poly, t)
print(np.argmax(vals))
print(np.argmin(vals))

plot(t, bhp - vale)
plot(t, vals)
show()
There are a number of things you could do to improve the fit. Try a different degree; a cubic polynomial was chosen in this tutorial. Consider smoothing the data before fitting it. One way you could smooth is with a moving average. Examples of simple and exponential moving average calculations can be found in the previous chapter.
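The smoothing suggestion could look roughly like the following sketch, which applies a simple moving average with np.convolve before fitting. The series y is again a seeded random walk standing in for the real data, and the window length of 5 is an arbitrary illustrative choice; the previous chapter's code may differ in its details.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical noisy series standing in for the price difference.
y = np.cumsum(rng.normal(size=30))

window = 5  # illustrative window length
weights = np.ones(window) / window

# 'valid' keeps only positions where the window fully overlaps the data,
# so the result has len(y) - window + 1 samples.
smoothed = np.convolve(y, weights, mode="valid")

# Associate each average with the last sample it covers.
t_smooth = np.arange(window - 1, len(y))

coeffs = np.polyfit(t_smooth, smoothed, 3)
print("Fit on smoothed data", coeffs)
```

Assigning each average to the last index of its window makes it a trailing average; centering the window instead would shift t_smooth but not change the smoothed values.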