Time for action – sorting lexically

The NumPy lexsort function returns an array of indices that put the input in lexical order. We give the function an array or tuple of sort keys; the last key in the sequence is the primary sort key. Perform the following steps:

  1. Now for something completely different: let's go back to Chapter 3, Get to Terms with Commonly Used Functions, where we used stock price data of AAPL. That data is by now pretty old. We will load the close prices and the dates. The dates are stored as strings, so we need a converter function just for them.
    def datestr2num(s):
        return datetime.datetime.strptime(s, "%d-%m-%Y").toordinal()
    
    dates, closes = np.loadtxt('AAPL.csv', delimiter=',',
      usecols=(1, 6), converters={1: datestr2num}, unpack=True)
  2. Sort the data lexically with the lexsort function. The data is already sorted by date. Because lexsort treats the last key as the primary key, the following call sorts by close price and uses the date only to break ties.
    indices = np.lexsort((dates, closes))
    
    # loadtxt returns floats, so cast to int before calling fromordinal
    print("Indices %s" % indices)
    print(["%s %s" % (datetime.date.fromordinal(int(dates[i])),
      closes[i]) for i in indices])

    The code prints the following:

    ['2011-01-28 336.1', '2011-02-22 338.61', '2011-01-31 339.32', '2011-02-23 342.62', '2011-02-24 342.88', '2011-02-03 343.44', '2011-02-02 344.32', '2011-02-01 345.03', '2011-02-04 346.5', '2011-03-10 346.67', '2011-02-25 348.16', '2011-03-01 349.31', '2011-02-18 350.56', '2011-02-07 351.88', '2011-03-11 351.99', '2011-03-02 352.12', '2011-03-09 352.47', '2011-02-28 353.21', '2011-02-10 354.54', '2011-02-08 355.2', '2011-03-07 355.36', '2011-03-08 355.76', '2011-02-11 356.85', '2011-02-09 358.16', '2011-02-17 358.3', '2011-02-14 359.18', '2011-03-03 359.56', '2011-02-15 359.9', '2011-03-04 360.0', '2011-02-16 363.13']
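
The key order used in step 2 can be checked on a small example. The following is a minimal sketch with made-up values (the `days` and `prices` arrays are hypothetical and not taken from AAPL.csv); it only illustrates that the last key passed to lexsort decides the primary order, while earlier keys break ties.

```python
import numpy as np

# Hypothetical toy data: indices 0 and 2 share the same price.
days = np.array([3, 1, 2, 4])
prices = np.array([10.0, 12.5, 10.0, 11.0])

# The LAST key in the tuple is the primary sort key.
by_price_then_day = np.lexsort((days, prices))
by_day_then_price = np.lexsort((prices, days))

print("price first: %s" % by_price_then_day)  # [2 0 3 1]
print("day first:   %s" % by_day_then_price)  # [1 2 0 3]
```

With price as the primary key, the two rows priced at 10.0 come first and are ordered by day, which mirrors how the AAPL closes are ordered with the date as tie-breaker.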
    

What just happened?

We sorted the close prices of AAPL lexically using the NumPy lexsort function. The function returned the indices that put the array in sorted order (see lex.py).

import numpy as np
import datetime

def datestr2num(s):
    return datetime.datetime.strptime(s, "%d-%m-%Y").toordinal()

dates, closes = np.loadtxt('AAPL.csv', delimiter=',', usecols=(1, 6),
                           converters={1: datestr2num}, unpack=True)
indices = np.lexsort((dates, closes))

print("Indices %s" % indices)
print(["%s %s" % (datetime.date.fromordinal(int(dates[i])), closes[i]) for i in indices])
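
The listing converts each date string to an ordinal number and later back to a date. As a rough sketch of that round trip in isolation, the snippet below runs the same converter on a single hard-coded date string. The `np.float64` step is an assumption that stands in for loadtxt, which returns floating-point values by default, and it shows why the `int()` cast is needed before calling `fromordinal`.

```python
import datetime
import numpy as np

def datestr2num(s):
    # Parse a dd-mm-yyyy string and return its proleptic Gregorian ordinal.
    return datetime.datetime.strptime(s, "%d-%m-%Y").toordinal()

ordinal = datestr2num("28-01-2011")

# Assumption: mimic loadtxt, which stores the converted value as a float.
as_float = np.float64(ordinal)

# fromordinal expects an integer, so cast the float back first.
restored = datetime.date.fromordinal(int(as_float))

print("%s -> %s" % (as_float, restored))
```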

Have a go hero – trying a different sort order

We sorted with the dates and the close prices as keys, in that order. Try a different order. Generate random numbers using the random module we learned about in the previous chapter and sort those using lexsort.
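
One possible starting point for this exercise is sketched below. The array names, sizes, value ranges, and the fixed seed are all arbitrary choices made for illustration; the idea is simply to build two random key arrays and let lexsort order them, with the last key as the primary one.

```python
import numpy as np

np.random.seed(42)  # arbitrary seed so repeated runs give the same numbers

# Hypothetical keys: a few repeated integers and a random float tie-breaker.
primary = np.random.randint(0, 5, size=10)
secondary = np.random.random(10)

# Sort by primary first; secondary breaks ties among equal primary values.
indices = np.lexsort((secondary, primary))

for i in indices:
    print("%d %.3f" % (primary[i], secondary[i]))
```

Swapping the two keys in the tuple passed to lexsort changes which array drives the order, which is a quick way to try the different sort order suggested above.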
