Sorting with a key

Till now, we saw all the examples where a list or sequence was sorted by elements. Let's now proceed to see if we can sort it using keys. In the previous example, elements were the keys. In the real world, there are more complicated records where a record contains multiple columns and we would like to sort with one or more columns). We will work out our examples through a list of tuples and the same applies to the other sequence objects.

Getting ready

In our example, a single tuple represents a person's record, which includes his name, ID, and age. Let's write a sorting in order to sort by the various fields.

How to do it…

Let's define a record-like structure using a list and tuples. We will use this data to demonstrate the data sorting with a key:

#1.The first step is to create a list of tuples, which we will use to test our sorting.

employee_records = [ ('joe',1,53),('beck',2,26), 
                     ('ele',6,32),('neo',3,45),  
                    ('christ',5,33),('trinity',4,29), 
                    ]

# 2.Let us now sort it by employee name
print sorted(employee_records,key=lambda emp : emp[0])
"""
It prints as follows
[('beck', 2, 26), ('christ', 5, 33), ('ele', 6, 32), ('joe', 1, 53), ('neo', 3, 45), ('trinity', 4, 29)]
"""
# 3.Let us now sort it by employee id
print sorted(employee_records,key=lambda emp : emp[1])
"""
It prints as follows
[('joe', 1, 53), ('beck', 2, 26), ('neo', 3, 45), ('trinity', 4, 29), ('christ', 5, 33), ('ele', 6, 32)]
"""
# 4.Finally we sort it with employee age
print sorted(employee_records,key=lambda emp : emp[2])
"""
Its prints as follows
[('beck', 2, 26), ('trinity', 4, 29), ('ele', 6, 32), ('christ', 5, 33), ('neo', 3, 45), ('joe', 1, 53)]
"""

How it works…

In our example, each record has three fields: name, identification, and age. We used the lambda function to pass a key by which we need to sort the given records. In step 2, we passed the name as the key to sort. Similarly, in steps 2 and 3, we passed the ID and age as the keys. We can see the outputs in various steps; the outputs are sorted by the particular key that we want them to.

There's more…

Due to the importance of sorting by a key, Python provides a convenient function to access keys instead of writing lambdas. The operator module has the itemgetter, attrgetter, and methodcaller functions. The sorting example that we saw can be written as follows using itemgetter:

from operator import itemgetter
employee_records = [ ('joe',1,53),('beck',2,26), 
                     ('ele',6,32),('neo',3,45),  
                     ('christ',5,33),('trinity',4,29), 
                     ]
print sorted(employee_records,key=itemgetter(0))
"""
[('beck', 2, 26), ('christ', 5, 33), ('ele', 6, 32), ('joe', 1, 53), ('neo', 3, 45), ('trinity', 4, 29)]
"""
print sorted(employee_records,key=itemgetter(1))
"""
[('joe', 1, 53), ('beck', 2, 26), ('neo', 3, 45), ('trinity', 4, 29), ('christ', 5, 33), ('ele', 6, 32)]
"""
print sorted(employee_records,key=itemgetter(2))
"""
[('beck', 2, 26), ('trinity', 4, 29), ('ele', 6, 32), ('christ', 5, 33), ('neo', 3, 45), ('joe', 1, 53)]
"""

Note that we have not used the lambda functions, rather used itemgetter to specify the key by which we need to sort. More than one field can be given as an input to itemgetter when we need multiple level sorting; for example, let's say that we need to sort by name and then by age, our code would be as follows:

>>> sorted(employee_records,key=itemgetter(0,1))
[('beck', 2, 26), ('christ', 5, 33), ('ele', 6, 32), ('joe', 1, 53), ('neo', 3, 45), ('trinity', 4, 29)]

The attrgetter and methodcaller comes in handy when the elements of our iterable are class objects. Look at the following example:

# Let us now enclose the employee records as class objects,
class employee(object):
    def __init__(self,name,id,age):
        self.name = name
        self.id = id
        self.age = age
    def pretty_print(self):
       print self.name,self.id,self.age

# Now let us populate a list with these class objects.
employee_records = []
emp1 = employee('joe',1,53)
emp2 = employee('beck',2,26)
emp3 = employee('ele',6,32)

employee_records.append(emp1)
employee_records.append(emp2)
employee_records.append(emp3)

# Print the records
for emp in employee_records:
    emp.pretty_print()


from operator import attrgetter
employee_records_sorted = sorted(employee_records,key=attrgetter('age'))
# Now let us print the sorted list,
for emp in employee_records_sorted:
    emp.pretty_print()

The constructor initializes the class with three variables: name, age, and ID. We also have the pretty_print method to print the values of the class object.

Next, let's populate a list with these class objects:

employee_records = []
emp1 = employee('joe',1,53)
emp2 = employee('beck',2,26)
emp3 = employee('ele',6,32)

employee_records.append(emp1)
employee_records.append(emp2)
employee_records.append(emp3)

Now, we have a list of employee objects. There are three variables in each object: name, ID, and age. Let's print the list to see the order:

joe 1 53
beck 2 26
ele 6 32

As you can see, the order of the insertion has been preserved. Now, let's use attrgetter to sort the list with the age field:

employee_records_sorted = sorted(employee_records,key=attrgetter('age'))

Let's print the sorted list.

The output is as follows:

beck 2 26
ele 6 32
joe 1 53

You can see that the records are now sorted by age.

The methodcaller can be used to sort when we want to use a method in our class to decide the sorting. For demonstration purposes, let's add a random method, which divides the age by the ID:

class employee(object):
    def __init__(self,name,id,age):
        self.name = name
        self.id = id
        self.age = age

    def pretty_print(self):
       print self.name,self.id,self.age

    def random_method(self):
       return self.age / self.id 

# Populate data
employee_records = []
emp1 = employee('joe',1,53)
emp2 = employee('beck',2,26)
emp3 = employee('ele',6,32)

employee_records.append(emp1)
employee_records.append(emp2)
employee_records.append(emp3)

from operator import methodcaller
employee_records_sorted = sorted(employee_records,key=methodcaller('random_method'))
for emp in employee_records_sorted:
    emp.pretty_print() 

We can now sort the list by calling this method:

sorted(employee_records,key=methodcaller('random_method'))

Let's now print the list in a sorted order and see the output:

ele 6 32
beck 2 26
joe 1 53
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset