Time for action – sorting lexically

The NumPy lexsort() function returns an array of indices that sorts the input lexically, that is, by several keys at once. We need to give the function an array or tuple of sort keys; the last key in the sequence is the primary sort key:

  1. Let's go back to Chapter 3, Getting Familiar with Commonly Used Functions. In that chapter, we used stock price data of AAPL. We will load the close prices and the (always complex) dates. Because the dates are stored as strings, create a converter function just for them:
    def datestr2num(s):
       return datetime.datetime.strptime(s, "%d-%m-%Y").toordinal()
    dates, closes=np.loadtxt('AAPL.csv', delimiter=',', usecols=(1, 6), converters={1:datestr2num}, unpack=True)
  2. Sort the data lexically with the lexsort() function. The data is already sorted by date, but sort it by close as well. Because closes is the last key, it is the primary sort key, and dates only breaks ties:
    indices = np.lexsort((dates, closes))
    print("Indices", indices)
    print(["%s %s" % (datetime.date.fromordinal(int(dates[i])),
      closes[i]) for i in indices])

    The code prints the following:

    Indices [ 0 16  1 17 18  4  3  2  5 28 19 21 15  6 29 22 27 20  9  7 25 26 10  8 14 11 23 12 24 13]
    ['2011-01-28 336.1', '2011-02-22 338.61', '2011-01-31 339.32', '2011-02-23 342.62', '2011-02-24 342.88', '2011-02-03 343.44', '2011-02-02 344.32', '2011-02-01 345.03', '2011-02-04 346.5', '2011-03-10 346.67', '2011-02-25 348.16', '2011-03-01 349.31', '2011-02-18 350.56', '2011-02-07 351.88', '2011-03-11 351.99', '2011-03-02 352.12', '2011-03-09 352.47', '2011-02-28 353.21', '2011-02-10 354.54', '2011-02-08 355.2', '2011-03-07 355.36', '2011-03-08 355.76', '2011-02-11 356.85', '2011-02-09 358.16', '2011-02-17 358.3', '2011-02-14 359.18', '2011-03-03 359.56', '2011-02-15 359.9', '2011-03-04 360.0', '2011-02-16 363.13']
    

What just happened?

We sorted the close prices of AAPL lexically using the NumPy lexsort() function. The function returned the indices that put the array in sorted order (see lex.py):

from __future__ import print_function
import numpy as np
import datetime

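def datestr2num(s):
    # Convert a "dd-mm-yyyy" string to a proleptic Gregorian ordinal
    return datetime.datetime.strptime(s, "%d-%m-%Y").toordinal()

# Column 1 holds the dates, column 6 the close prices
dates, closes = np.loadtxt('AAPL.csv', delimiter=',', usecols=(1, 6),
                           converters={1: datestr2num}, unpack=True)
# The last key (closes) is the primary sort key; dates break ties
indices = np.lexsort((dates, closes))

print("Indices", indices)
print(["%s %s" % (datetime.date.fromordinal(int(dates[i])), closes[i]) for i in indices])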
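
The order of the keys passed to lexsort() is easy to get backwards, so a small sketch may help. The arrays below are made-up toy values (they are not taken from AAPL.csv), and the variable names are only illustrative. The example suggests that the last key in the tuple decides the order, while earlier keys are used only to break ties:

```python
import numpy as np

# Hypothetical toy data, chosen so that two prices tie
days = np.array([3, 1, 2, 4])
prices = np.array([10.0, 20.0, 10.0, 5.0])

# Last key is primary: sort by price, break ties by day
by_price_then_day = np.lexsort((days, prices))

# Swap the keys: sort by day, break ties by price
by_day_then_price = np.lexsort((prices, days))

print("By price, then day:", by_price_then_day)
print("By day, then price:", by_day_then_price)
```

In the first call, the two elements with price 10.0 are ordered by their day values; in the second, the days are all distinct, so the prices never come into play.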

Have a go hero – trying a different sort order

We sorted using the dates and the close price sort order. Try a different order. Generate random numbers using the random module we learned about in the previous chapter and sort those using lexsort().
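
One possible starting point for this exercise is sketched below. It is only an outline under a few assumptions: it uses np.random.default_rng(), which requires a reasonably recent NumPy, and the seed, value ranges, and names primary and secondary are arbitrary choices made for illustration:

```python
import numpy as np

# Assumed seed so that the run is repeatable
rng = np.random.default_rng(42)

# Small value range for the primary key so that ties are likely
primary = rng.integers(0, 5, size=10)
secondary = rng.integers(0, 100, size=10)

# The last key (primary) dominates; secondary breaks ties
order = np.lexsort((secondary, primary))

for i in order:
    print(primary[i], secondary[i])
```

Swapping the two keys in the tuple passed to lexsort() gives the other sort order.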
