Shell sort

Shell sort (also called diminishing increment sort) is a non-intuitive (real-life) and a non-adjacent element comparison (and swap) type of sorting algorithm. It is a derivative of insertion sorting; however, it performs way better in worst-case scenarios. It is based on a methodology adopted by many other algorithms to be covered later: the entire vector (parent) is initially split into multiple subvectors (child), then sorting is performed on each subvector, and later all the subvectors are recombined into their parent vector.

Shell sort, in general, splits each vector into virtual subvectors. These subvectors are disjointed such that each element in a subvector is a fixed number of positions apart. Each subvector is sorted using insertion sort. The process of selecting a subvector and sorting continues till the entire vector is sorted. Let us understand the process in detail using an example and illustration (Figure 5.5):

Shell sort

Figure 5.5: Illustration of shell sort

Consider a numeric vector V of even length (16 elements) which needs to be sorted in ascending order. Also, let us assume that the subvector split is a multiple of two. Then, the shell sort arrange vector using iterative process as discussed below:

Iteration 1: Split the entire vector V into eight subvectors of two elements each such that each element within a subvector is eight positions apart, and the first element of all subvectors are in sequence, as shown in (i1). Then, perform insertion sorting on each subvector separately.

Iteration 2: Now increase the length of the subvectors by decreasing the splits. Next, split the entire vector V into four subvectors of four elements each such that each element within a subvector is four positions apart, and the first element of all subvectors are in sequence, as shown in (i2).

Similarly, perform iterations till the length of the subvector equals the entire vector, and finally, culminate the sorting with a normal insertion sort of all the elements.

The following R code performs shell sorting on both even and odd length vectors:

Shell_Sort <- function(V,n) { 
  if(n==0) stop("No elements to sort") 
  increment=round(n/2)  ## as.integer 
  while(increment>0) { 
    for(i in (increment+1):n) { 
      temp <- V[i] 
      j=i 
      while(j >= (increment+1)  && V[j-increment] > temp) { 
        V[j] <- V[j-increment] 
        j <- j-increment 
      } 
      V[j] <- temp 
    } 
    if(increment==2) { 
      increment <- 1} else{ 
        increment <- round(increment/2.2) 
      } 
    } 
  return(V) 
}

Shell sort is an improvement over the insertion sort, as the sorting is performed initially on subvectors before being performed on the entire vector. All the intermediate iterations nearly sort the entire vector prior to the final iteration. Now, the cost of iterating a nearly sorted vector is relatively much cheaper than performing insertion sorting on the raw input vector.

Another way of further improving shell sort performance is by increasing the length of the subvectors in the initial iteration. For example, in the preceding example, we started the iterations from two elements in each subvector, which can be increased to three. The advantages of increasing the length of the initial subvector are as follows:

  • The entire vector would be more nearly sorted for the final iteration
  • The number of iterations would reduce

In R, shell short implementation uses gap as 4k+3.2k-1+1 (with prefix of 1 and k≥1) which is a variant from Sedgewick (1986), which has a worst-case scenario of θ(n4/3). The 1 in the prefix is added to ensure sorting yields correct results. Thus, shell sort performs much better than lone insertion sort asymptotically. Shell sort also demonstrates how special properties of other sorting algorithms can be exploited to enhance their existing performance.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset