The analysis of selection and insertion sort begs the question, how can list.sort be so much more efficient? The answer is the same as it was for binary search: by taking advantage of the fact that some values are already sorted.
Consider the following function:
| import bisect |
| |
| def bin_sort(values: list) -> list: |
| """Return a sorted version of the values. (This does not mutate values.) |
| >>> L = [3, 4, 7, -1, 2, 5] |
| >>> bin_sort(L) |
| [-1, 2, 3, 4, 5, 7] |
| """ |
| |
| result = [] |
| for v in values: |
| bisect.insort_left(result, v) |
| |
| return result |
This code uses bisect.insort_left to figure out where to put each value from the original list into a new list that is kept in sorted order. As we have already seen, doing this takes time proportional to log2 N, where N is the length of the list. Since N values have to be inserted, the overall running time ought to be N log2 N.
As shown in the following table, this grows much more slowly with the length of the list than N2.
N |
N2 |
N log2 N |
---|---|---|
10 |
100 |
33 |
100 |
10,000 |
664 |
1000 |
1,000,000 |
9965 |
Unfortunately, there’s a flaw in this analysis. It’s correct to say that bisect.insort_left needs only log2 N time to figure out where to insert a value, but actually inserting it takes time as well. To create an empty slot in the list, we have to move all the values above that slot up one place. On average, this means copying half of the list’s values, so the cost of insertion is proportional to N. Since there are N values to insert, our total time is N(N + log2 N). For large values of N, this is once again roughly proportional to N2.