Profiling a program means measuring how long it takes to run and how much memory it uses. These measures—time and space—are fundamental to the theoretical study of algorithms. They are also important from a pragmatic point of view. Fast programs are more useful than slow ones, and programs that need more memory than what your computer has aren’t particularly useful at all.
This section introduces one way to time how long code takes to run. You’ll see how to run the three functions we developed to find the two lowest values in a list on 1,400 monthly readings of air pressure in Darwin, Australia, from 1882 to 1998.[7]
Module time contains functions related to time. One of these functions is perf_counter, which returns a time in seconds. We can call it before and after the code we want to time and take the difference to find out how many seconds elapsed. We multiply by 1000 in order to convert from seconds to milliseconds:
| import time |
| |
| t1 = time.perf_counter() |
| |
| # Code to time goes here |
| |
| t2 = time.perf_counter() |
| print('The code took {:.2f}ms'.format((t2 - t1) * 1000.)) |
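For instance, here is the same pattern applied to a concrete piece of code. The list of 1,400 random pressure-like values is made up for illustration; any code can go between the two calls to perf_counter:

```python
import random
import time

# Arbitrary sample data: 1,400 random floats (for illustration only)
values = [random.uniform(980.0, 1040.0) for _ in range(1400)]

t1 = time.perf_counter()
sorted_values = sorted(values)  # the code being timed
t2 = time.perf_counter()

print('The code took {:.2f}ms'.format((t2 - t1) * 1000.0))
```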
We’ll want to time all three of our find_two_smallest functions. Rather than copying and pasting the timing code three times, we’ll write a function that takes another function as a parameter as well as the list to search in. We use type annotation typing.Callable for this parameter:
| Callable[[parameter types], return type] |
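For example, a sketch of a function whose parameter must itself be a function taking a list of floats could be annotated like this (the name apply_to_list is made up for illustration):

```python
from typing import Any, Callable, List

def apply_to_list(func: Callable[[List[float]], Any],
                  lst: List[float]) -> Any:
    """Call func on lst and return whatever func returns."""
    return func(lst)

# Any function that accepts a list of floats can be passed in
apply_to_list(max, [1.0, 3.0, 2.0])
```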
Since we’re not interested in what this function parameter returns, we use typing.Any as the return type. This timing function will return how many milliseconds it takes to execute a call on the function. After the timing function is the main program that reads the file of sea level pressures and then calls the timing function with each of the find_two_smallest functions:
| import time |
| import find_remove_find5 |
| import sort_then_find3 |
| import walk_through7 |
| |
| from typing import Callable, List, Any |
| |
| def time_find_two_smallest(find_func: Callable[[List[float]], Any], |
| lst: List[float]) -> float: |
| """Return how many milliseconds find_func(lst) took to execute. |
| """ |
| |
| t1 = time.perf_counter() |
| find_func(lst) |
| t2 = time.perf_counter() |
| return (t2 - t1) * 1000.0 |
| |
| if __name__ == '__main__': |
| # Gather the sea level pressures |
| sea_levels = [] |
| sea_levels_file = open('sea_levels.txt', 'r') |
| for line in sea_levels_file: |
| sea_levels.append(float(line)) |
| sea_levels_file.close() |
| |
| # Time each of the approaches |
| find_remove_find_time = time_find_two_smallest( |
| find_remove_find5.find_two_smallest, sea_levels) |
| |
| sort_get_minimums_time = time_find_two_smallest( |
| sort_then_find3.find_two_smallest, sea_levels) |
| |
| walk_through_time = time_find_two_smallest( |
| walk_through7.find_two_smallest, sea_levels) |
| |
| print('"Find, remove, find" took {:.2f}ms.'.format(find_remove_find_time)) |
| print('"Sort, get minimums" took {:.2f}ms.'.format( |
| sort_get_minimums_time)) |
| print('"Walk through the list" took {:.2f}ms.'.format(walk_through_time)) |
The execution times were as follows:
| Algorithm | Running time (ms) |
|---|---|
| Find, remove, find | 0.09 |
| Sort, get minimums | 0.30 |
| Walk through the list | 0.28 |
Notice how small these times are. No human can perceive a difference of less than a millisecond, so if this code never has to process lists with more than 1,400 values, we would be justified in choosing an implementation based on simplicity or clarity rather than on speed.
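Single runs this short are also sensitive to measurement noise from whatever else the computer is doing. One common refinement, not used in the listing above, is to repeat the call many times and average using the standard-library timeit module. The sketch below uses made-up stand-in data and a stand-in find_two_smallest, since the real modules aren't available here:

```python
import random
import timeit

# Stand-in data; in the program above this would be sea_levels
sea_levels = [random.uniform(980.0, 1040.0) for _ in range(1400)]

def find_two_smallest(lst):
    """Stand-in for one of the three implementations being compared:
    return the indices of the two smallest values in lst."""
    return tuple(sorted(range(len(lst)), key=lambda i: lst[i])[:2])

# Run the call 1,000 times; timeit.timeit returns the total seconds
total = timeit.timeit(lambda: find_two_smallest(sea_levels), number=1000)
print('Average: {:.4f}ms per call'.format(total / 1000 * 1000.0))
```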
But what if we wanted to process millions of values? Find-remove-find outperforms the other two algorithms on 1,400 values, but how much does that tell us about how each will perform on data sets that are a thousand times larger? That will be covered in Chapter 13, Searching and Sorting.