Heap sort is an improvised form of selection sort, wherein the algorithm initially splits the input vector into sorted and unsorted vectors, and then iteratively shrinks the unsorted vector by extracting the largest element and placing it in the sorted vector. It is based on the heap data structure which provides a non-quadratic asymptote even for the worst-case scenarios. Heaps are tree-based data structures with the following properties:
Figure 5.10: Illustration of complete and in-complete binary tree
Figure 5.11: Illustration of max-heap and min-heap
The heap sort algorithm possesses some structural advantages which enhance its performance efficiency. It adopts the concept of a complete binary tree wherein the tree is balanced. It requires less in-memory, as the values of the input vector are directly stored in the form of a binary tree. The values need not be explicitly inserted into each of the nodes within the tree. Hence, it is also suitable for large size vectors. The asymptotic performance is also non-quadratic in the best, average, and worst-case scenarios. The functional form of the system runtime is nlog n.
The heap sort algorithm is quite easy to implement. The input vector array is first converted into a max-heap (max_heap
function). Then, the maximum value from the heap is extracted iteratively, and placed at the end of the array, ensuring that the order of the heap remains intact. Consider a vector of length n wherein all the elements are positioned from 1 to n. The first extracted maximum element will be placed in the nth position, the second extracted maximum element will be placed in the (n-1)th position, and so on. The extraction continues till the heap becomes empty.
The following is the R code which implements the recursive form of the heap sort algorithm:
Heap_Sort <- function(V)
{
heapsize <- length(V) ## Initialize with total vector size
for (i in floor(length(V)/2):1)
V <- max_heap(V, i,heapsize) ## Build initial max-heap
for (i in length(V):2) {
temp <- V[i] ## replace ith with 1st element (maximum)
V[i] <- V[1]
V[1] <- temp
heapsize <- heapsize -1 ##Reduce size of input vector
V <- max_heap(V, 1,heapsize) ##Re-build max-heap with reduced input vecto0072
}
return(V)
}
## Following function recursively builds max-heap
max_heap <- function(V, i,heapsize) {
left <- 2*i
right <- 2*i+1
if (left<=heapsize && V[left]>V[i]){ ## build left sub-tree
largest <- left} else{
largest <- i
}
if (right<=heapsize && V[right]>V[largest])
largest <- right ## build right sub-tree
if (largest != i) {
temp2 <- V[largest] ##replace largest with ith element
V[largest] <- V[i]
V[i] <- temp2
V <- max_heap(V, largest,heapsize) ## Recursive run
}
return(V)
}
Figure 5.12 illustrates the step-by-step implementation of the heap sort algorithm. The first step shows the original vector V with 11 elements, which need to be sorted in ascending order. The second step shows the initial max-heap with the largest element in the first node. The third step shows the extraction of the largest element (here, 88). The extracted element is then placed in the last position of the array. The max-heap tree is again built with the new largest element as its first node. The fourth step shows the extraction of the corresponding largest element (here, 65). The extracted element is then placed in the second last position of the array. The max-heap tree is rebuilt with a new largest element as its first node:
Figure 5.12: Step-by-step illustration of heap sort
These steps continue till all the elements from the max-heap tree are extracted and placed in the relevant positions. The final sorted vector is as follows:
Figure 5.13: Final vector positions of heap sort
Now let's analyze the runtime performance of the algorithm assuming a vector of length n. The max-heap recursive function requires θ(n) runtime, and n extractions of the largest element require θ(log n) runtime. Thus, the total runtime of the heap sort algorithm for the best, average, and worst-case scenarios is θ(nlog n).