Merge sort algorithms are based on the divide and conquer rule. Given a list of unsorted elements, we split the list into two approximate halves. We continue to divide the list into two halves recursively.
After a while, the sublists created as a result of the recursive call will contain only one element. At that point, we begin to merge the solutions in the conquer or merge step:
def merge_sort(unsorted_list):
if len(unsorted_list) == 1:
return unsorted_list
mid_point = int((len(unsorted_list))//2)
first_half = unsorted_list[:mid_point]
second_half = unsorted_list[mid_point:]
half_a = merge_sort(first_half)
half_b = merge_sort(second_half)
return merge(half_a, half_b)
Our implementation starts by accepting the list of unsorted elements into the merge_sort function. The if statement is used to establish the base case, where, if there is only one element in the unsorted_list, we simply return that list again. If there is more than one element in the list, we find the approximate middle using mid_point = int((len(unsorted_list)) // 2).
Using this mid_point, we divide the list into two sublists, namely, first_half and second_half:
first_half = unsorted_list[:mid_point]
second_half = unsorted_list[mid_point:]
A recursive call is made by passing the two sublists to the merge_sort function again:
half_a = merge_sort(first_half)
half_b = merge_sort(second_half)
Now for the merge step. When half_a and half_b have been passed their values, we call the merge function, which will merge or combine the two solutions stored in half_a and half_b, which are lists:
def merge(first_sublist, second_sublist):
i = j = 0
merged_list = []
while i < len(first_sublist) and j < len(second_sublist):
if first_sublist[i] < second_sublist[j]:
merged_list.append(first_sublist[i])
i += 1
else:
merged_list.append(second_sublist[j])
j += 1
while i < len(first_sublist):
merged_list.append(first_sublist[i])
i += 1
while j < len(second_sublist):
merged_list.append(second_sublist[j])
j += 1
return merged_list
The merge function takes the two lists we want to merge together, first_sublist and second_sublist. The i and j variables are initialized to 0, and are used as pointers to tell us where we are in the two lists with respect to the merging process.
The final merged_list will contain the merged list:
while i < len(first_sublist) and j < len(second_sublist):
if first_sublist[i] < second_sublist[j]:
merged_list.append(first_sublist[i])
i += 1
else:
merged_list.append(second_sublist[j])
j += 1
The while loop starts comparing the elements in first_sublist and second_sublist. The if statement selects the smaller of the two, first_sublist[i] or second_sublist[j], and appends it to merged_list. The i or j index is incremented to reflect where we are with the merging step. The while loop stops when either sublist is empty.
There may be elements left behind in either first_sublist or second_sublist. The last two while loops make sure that those elements are added to merged_list before it is returned.
The last call to merge(half_a, half_b) will return the sorted list.
Let's give the algorithm a dry run by merging the two sublists [4, 6, 8] and [5, 7, 11, 40]:
Step |
first_sublist |
second_sublist |
merged_list |
Step 0 |
[4 6 8] |
[5 7 11 40] |
[] |
Step 1 |
[ 6 8] |
[5 7 11 40] |
[4] |
Step 2 |
[ 6 8] |
[ 7 11 40] |
[4 5] |
Step 3 |
[ 8] |
[ 7 11 40] |
[4 5 6] |
Step 4 |
[ 8] |
[ 11 40] |
[4 5 6 7] |
Step 5 |
[ ] |
[ 11 40] |
[4 5 6 7 8] |
Step 6 |
[] |
[ ] |
[4 5 6 7 8 11 40] |
Note that the text in bold represents the current item referenced in the loops first_sublist (which uses the i index) and second_sublist (which uses the j index).
At this point in the execution, the third while loop in the merge function kicks in to move 11 and 40 into merged_list. The returned merged_list will contain the fully sorted list.
Note that while the merge algorithm takes O(n) time, the merge sort algorithm has a running time complexity of O(log n) T(n) = O(n)*O(log n) = O(n log n).