Sorting Techniques

This section explains the theory behind some of the most popular sorting techniques used today. Each subsection highlights a specific sorting technique, telling you about its specific characteristics and giving an example of how you could implement the technique. You can use or modify these examples to suit your needs. At the end of the chapter an overview is given of all the sorting techniques discussed, so you can quickly see which technique is useful for which situation.

In order to keep the theoretical narrative of this section as close to the point as possible, arrays of integers are used as examples of bases to be sorted. This way, the overhead of other implementation issues is minimized. Throughout the text, references are made to the array as the base of data to be sorted; the value of each element to be sorted is also its sorting key, as all elements are integers. Sorting techniques can, of course, be used to sort more complex elements, using one of their fields as a key. Examples of this are given in the sources of this chapter.

Insertion Sort

Insertion sort is a very basic sorting algorithm which is easy to understand and implement. Insertion sort looks at each element of the array and checks whether its value is smaller than that of the previous element. When this is the case, the element has to be moved to some place earlier in the array. In order to find the right place for the element, insertion sort moves back through the array until it encounters an element which is smaller. The element to be moved is placed after this smaller element. All elements between the old and the new place of the moved element are moved one place up. The following example demonstrates this:

Unsorted array: 30 20 40 10

First check:
30    40 10        20 < 30, so search backwards for a place to insert 20
   30 40 10        30 is moved forward and the front of the array is reached
20 30 40 10        20 placed at the front of the array

Second check:
20 30 40 10        40 > 30, nothing to do.

Third check:
20 30 40           10 < 40, so search backwards for a place to insert 10
   20 30 40        40, 30 and 20 are moved forward
10 20 30 40        10 is placed at the front of the array

Note that the first array element can initially be considered sorted. Only when the second element is compared to the first can a decision about the place of either be made. Listing 10.5 shows an example of an implementation of insertion sort.

Code Listing 10.5. Insertion Sort
void InsertionSort(long *data, long n)
{
    long i, j, item;

    // i divides the array into a sorted region, x<i
    // and an unsorted region, x >= i.

    for(i=1;i<n;i++)
    {
        // Select the item at the beginning of the as yet unsorted section.
        item = data[i];

        // If this element is greater than item, move it up one.
        for(j=i; (j > 0) && (data[j-1] > item); j--)
            data[j] = data[j-1];

        // Stopped when data[j-1] <= item, so put item at position j.
        data[j] = item;
    }
}

You can see in the sorting example that insertion sort is an O(n^2) algorithm; in a worst-case scenario (an array sorted in reverse order) each element causes all the elements before it to be moved once, which adds up to roughly n*(n-1)/2 moves, and that is still O(n^2). For insertion sort, the average case is also O(n^2).

The pros of insertion sort are

  • Simple implementation

  • High efficiency for small amounts of data (virtually no overhead)

  • Small runtime footprint (in-place)

  • Stable algorithm (original order of identical elements is preserved)

The con of insertion sort is

  • Low efficiency for normal to large amounts of data

Because of its small runtime footprint and high efficiency for small data sets, insertion sort is ideal for augmenting other sorting algorithms. After a size check, an overhead-heavy sorting algorithm can decide to invoke insertion sort for a data set which is too small to justify overhead. By doing so the last drop of performance can be squeezed out of a sorting algorithm.
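
What follows is a minimal sketch of this idea (it is not one of the book's listings): a hypothetical hybrid routine that falls back to the InsertionSort() of Listing 10.5 for small inputs. The cut-off value of 16 is purely illustrative; the best threshold depends on your data and platform and should be found by measuring.

void HybridSort(long *data, long n)
{
    const long kSmallInput = 16;    // illustrative cut-off, tune by profiling

    if (n <= kSmallInput)
    {
        InsertionSort(data, n);     // virtually no overhead for small n
        return;
    }

    // ... otherwise run the overhead-heavy algorithm of your choice
    //     (for instance quick sort or heap sort) on the full data set.
}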

Bubble Sort

Bubble sort is another fairly straightforward sorting algorithm. Bubble sort compares the first two elements of the array, switches their places when they are in the wrong order, and then moves up one position in the array to compare the following two elements. It does this repeatedly until it has reached the end of the array, and then it goes back to the beginning of the array and starts over. This whole process is repeated until bubble sort can pass over the entire array without switching any elements. With bubble sort the elements bubble step by step to the place they are supposed to be. The following example demonstrates bubble sort:

Unsorted array: 30 10 40 20

First iteration:
10 30 40 20       10 < 30, so switch these elements.
10 30 40 20       40 > 30, so leave these elements.
10 30 20 40       20 < 40, so switch these elements

Second iteration:
10 30 20 40       30 > 10, so leave these elements
10 20 30 40       20 < 30, so switch these elements
10 20 30 40       30 < 40, so leave these elements.

Listing 10.6 shows an implementation of bubble sort.

Code Listing 10.6. Bubble Sort
void BubbleSort(unsigned long data[], unsigned long n)
{
    // Sort array data[] of n-items
    unsigned long i, j;
    bool changes = true;

    // Make max n passes through the array or stop when done.
    for(i=0;(i<n) && (changes == true);i++)
    {
        changes = false;
        for(j=1;j<(n-i);j++)
        {
            // If items are out of order --> exchange them.
            if( data[j-1]>data[j] )
            {
               unsigned long dummy = data[j-1];  // SWAP TWO ITEMS
               data[j-1] = data[j];
               data[j] = dummy;
               changes = true;
            }
        }
    }
}

This implementation consists of two loops. The inner loop runs through the data array and switches elements when needed. The outer loop controls the number of times the inner loop traverses the array. Note the double stop condition; in a worst-case scenario the inner loop is executed as many times as there are elements in the array. The outer loop can be preempted, however, when an iteration over the data array proves that the array is sorted—that is, no more elements were switched.

The pros of bubble sort are

  • Simple implementation

  • High efficiency of memory usage (in place)

  • Stable algorithm

The con of bubble sort is

  • Low efficiency—O(n^2)

Without further optimizations, bubble sort is slower than insertion sort. It is presented in this chapter because of its easy implementation and its great popularity.

Shell Sort

Shell sort, invented by Donald L. Shell, is still fairly easy to implement. It is, however, much faster than the previous sorting techniques when larger bases of data are sorted. Shell sort is similar to insertion sort with the difference that it moves data over larger, and variable, spacings. This is good because, in general, elements of an unsorted array are found more than one place removed from where they should be. As you saw in the section "Insertion Sort," insertion sort uses a spacing of 1. This means that two successive elements are compared and, if needed, switched in place. With shell sort you could choose a spacing of 2, for instance. In that case, elements 1 and 3 are compared, instead of elements 1 and 2. After this, elements 4 and 2 are compared, instead of elements 2 and 3. Shell sort is usually used with a different spacing for each time the algorithm iterates the array. The spacing is initially chosen as large and becomes smaller with each iteration until the final iteration through the array is done with a spacing of 1 (this is a normal insertion sort). The idea behind this is that fewer iterations are needed to completely sort the array. The following example demonstrates a shell sort iteration with a spacing of 2, followed by an iteration with a spacing of 1:

Unsorted array: 40 70 20 30 60

First iteration with spacing = 2.
40 70 20 30 60          40 > 20, so switch
20 70 40 30 60          70 > 30, so switch
20 30 40 70 60          40 < 60, no switch

Second iteration with spacing = 1.
20 30 40 60 70          20, 30, 40 are ok, 70 > 60, so these are switched.

Choosing the spacings for your shell sort algorithm is crucial to its performance. Luckily, a lot of research has already been done on different spacing strategies. The following spacing guidelines were proposed by Donald E. Knuth in his book The Art of Computer Programming:

Starting with a spacing of 1, a following spacing is calculated as follows: three times the previous spacing plus one.

This implies a fixed spacing range which is as follows: 1, (3*1+1) = 4, (3*4+1) = 13, (3*13+1) = 40, (3*40+1) = 121, and so on. Knuth further proposes that you choose as a starting spacing the spacing which is found two positions below the first spacing in this list that is higher than the number of elements you want to sort. That may sound complicated, but it's not that bad. The following example demonstrates:

Spacings list:           1, 4, 13, 40, 121, ..
Number of elements(n):   100

First spacing > n:       121
Starting spacing:        13

Here, 121 is the first spacing in the spacings list which is higher than the number of elements to sort (n = 100). So, as a starting spacing you choose the spacing which is found two positions below 121 in the spacings list. This is a spacing of 13.

A shell sort of 100 elements will use the following order of spacings: 13, 4, 1.

Similarly, a shell sort of 122 elements will use the following order of spacings: 40, 13, 4, 1.

Listing 10.7 shows an implementation of shell sort, using Knuth's spacing advice.

Code Listing 10.7. Shell Sort
void ShellSort(unsigned long data[], long lb, long ub)
{
    long n, i, j, t, h;

    // Calculate largest stepsize.
    n = ub - lb + 1;
    h = 1;
    if (n < 14)
        h = 1;
    else
    {
        while (h < n)
            h = 3*h + 1;
        h /= 3;
        h /= 3;
    }

    while (h > 0)
    {
        // sort-by-insertion with stepsize of h.
        for (i = lb + h; i <= ub; i++)
        {
            t = data[i];
            for (j = i-h; j >= lb && (data[j] > t); j -= h)
                data[j+h] = data[j];
            data[j+h] = t;
        }

        // calculate next stepsize.
        h /= 3;
    }
}

The implementation starts with an if..else construct which determines the initial spacing, based on the concepts you have just seen. After this, a while loop handles the number of array iterations. Note the calculation of the new spacing at the end of the while loop: h /= 3. The for loop inside the while loop takes care of a single array iteration, sorting the array in a way similar to insertion sort.
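
Because the bounds lb and ub are both inclusive (the listing computes n = ub - lb + 1), sorting a complete five-element array, for instance the array from the example above, looks like this:

unsigned long values[] = { 40, 70, 20, 30, 60 };   // the unsorted array from the example

ShellSort(values, 0, 4);   // lb = 0, ub = 4 (inclusive), so all five elements are sorted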

The average-case sorting time for shell sort with Knuth spacing is O(n^1.25). The worst-case sorting time is O(n^1.5). This may seem like a small difference, but it is not: for 100 elements the ratio is already 316 to 1000 (100^1.25 is about 316, while 100^1.5 = 1000).

The pros of shell sort are

  • Simple to implement and still quite fast.

  • In-place (efficiency).

The con of shell sort is

  • There are still faster sorting methods

An important characteristic of shell sort is that its worst-case performance is better than that of some more advanced sorting methods (like quick sort, as you will see in a later section). For this reason, shell sort is popular for life-critical systems, where an unexpected O(n^2) delay could be dangerous.

There are sorting methods that have a better worst-case performance than shell sort; refer to Table 10.2 for an overview on sorting method characteristics.

Heap Sort

As the name suggests, heap sort makes use of a structure called a heap. Before we define exactly what a heap is, here is a quick recap of tree terminology: A tree is a data structure in which elements are related to each other in parent and child relationships. Parents are called nodes. Each parent can have several children, but each child has exactly one parent. There is one node that has no parent of its own; it is called the root. A child that is not a parent itself is called a leaf. Figure 10.1 shows a simple tree structure.

Figure 10.1. A tree structure.


In Figure 10.1, the leaf F has a parent called C which in turn is a child of the root node A.

Each node in a tree is, in fact, the root of a subtree. This is why nodes are sometimes referred to as subtrees. The tree in the previous example has two subtrees; the subtree with root B and the subtree with root C.

A tree is called binary and balanced when it adheres to the following constraints: The children are distributed evenly over the tree, with each node having exactly two children. The only possible exception to this is the last node of the tree; this node can have a single child when there are not enough children to go around. Only the lowest nodes in the tree can have leaves as children. The tree in Figure 10.1 is thus a balanced binary tree. A balanced binary tree is said to have the heap property when each node has a value that is larger than, or equal to, that of its children. Figure 10.2 shows some examples of different kinds of trees.

Figure 10.2. Different types of trees.


At the beginning of this section on sorting, we promised that all examples would sort arrays. In order to store a heap in an array, a mapping between the two data types has to be made. This mapping will of course also dictate how heap elements can be found in the array. The following two sections address these issues. Note that throughout this section the lowercase letter n is used to indicate the number of elements in a given data structure.

Mapping a Heap on an Array

You can map a heap onto an array by simply reading it like a piece of text—from left to right, and top to bottom. Figure 10.3 shows this.

Figure 10.3. Heap-to-array mapping.


The top part of Figure 10.3 shows a heap structure which contains the letters A through O. The lower part of Figure 10.3 shows the array that will represent the heap. In order to get from the heap to the array, you read the heap and place its elements in the array in the order of reading: The first line in the heap picture consists of a single letter, the letter A. It is placed at the beginning of the array. The second line in the heap picture contains the letters B and C. They are placed in the following two slots of the array and so on. For extra clarity, the eventual array indexes accompany the elements in both pictures, so A will be stored in array[0], and O will be stored in array[14].

Note

Technically the tree in the previous example is not a heap because the integer value of A is lower than that of B and C instead of higher. This setup is chosen, however, because mapping from the tree to the array is easier to understand this way.


Array Indexing

Navigating the heap is made easy by the mapping chosen earlier in the chapter.

Here are some important access points:

Total number of nodes in the heap is n / 2, and in the example, 15 / 2 = 7. Note that the number is rounded downwards, 7.5 becoming 7.

The last node in the array is (n / 2) - 1, and in the example, (15 / 2) - 1 = 6; array[6] = G.

For any node array[x], the children of that node are found at array[2x+1] and array[2x+2]. In the example, array[6] = G; its children are array[13] = N and array[14] = O.
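
These relations are easy to capture in a few helper functions. The following is a small sketch (not part of the book's listings) that expresses them in code; the parent relation is simply the inverse of the child relations:

inline int FirstChild(int x)  { return 2 * x + 1; }    // left child of node x
inline int SecondChild(int x) { return 2 * x + 2; }    // right child of node x
inline int Parent(int x)      { return (x - 1) / 2; }  // parent of element x
inline int LastNode(int n)    { return (n / 2) - 1; }  // last node of an n-element heap

// For the 15-element example heap:
//   LastNode(15)   == 6    -> array[6]  = G
//   FirstChild(6)  == 13   -> array[13] = N
//   SecondChild(6) == 14   -> array[14] = O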

To use a heap in a sorting algorithm, two different kinds of functions need to be performed. First of all, the initial tree needs to be transformed into a heap. This means moving around elements of the array until each node of the tree has a value higher than that of its (two) leaves. Secondly, the heap needs to be transformed into a sorted array. The following two sections address these issues.

Making an Initial Heap

Listing 10.8 shows the first half of what will become a HeapSort function. It processes an array of integers, called data. It addresses this array as if it represents a balanced binary tree which may or may not yet have the heap property. The code snippet reorders the array elements of data so that each node has a value that is larger than that of its children, effectively creating a heap.

Code Listing 10.8. Heap Sort, Part I
int i, j, j2, k;
int tmp;

// Outerloop: considers all nodes, starting at the last one and
//  working back to the root.
for(k = (n>>1) - 1; k >= 0; k--)
{
    // k points to the node under evaluation.
    // j2+1 and j2+2 point at the children of node j, which starts out at k.
    tmp = data[k];

    // Innerloop: for each changed node, recursively check the
    //  child node which was responsible for the change.
    for(j = k; (j<<1) <= n-2; j = i)
    {
        // find the largest child of node k.
        j2 = j <<1;
        if (j2+2 > n-1) // only one child
            i = j2+1;
        else
        {
            if (data[j2+1] < data[j2+2])
                 i = j2+2;
            else
                 i = j2+1;
        }

        // i now points to the child with the highest value.
        if (tmp < data[i])
             data[j] = data[i]; // promote child
        else
             break;
    }
    data[j] = tmp;
}

How does Listing 10.8 create a heap? Basically, it looks at each node of the tree and makes it a heap; this means switching the largest child of a node with the node itself whenever the node is not the largest value of the three. The algorithm starts at the last node of the tree and works its way back to the root. This means, for the example heap, the following nodes are initially evaluated: G, F, E, D. These are the nodes in the last line of the heap—that is, nodes with leaves as children. The next node to be evaluated is node C. The children of C are, of course, nodes themselves. When a change is made to node C, one of the subtrees of C needs to be reevaluated. For instance, initially C has the children F and G. G is the highest of the three so elements G and C are switched. Subtree F is unaffected by this but the subtree that was G (and now is C) needs to be reevaluated. This needs to be done recursively, of course. This is exactly what the inner loop of the previous code snippet does; for each node, it checks for changes and recursively checks each subnode responsible for the change. The outer loop simply backtracks through the array from the last node up to the root. When the algorithm finally arrives at the root node, the array represents a heap. Now all that remains is sorting that heap.

Sorting a Heap

Listing 10.9 shows the second half of what will become a HeapSort function. It sorts an array of integers, called data. It addresses this array as if it represents a heap.

Code Listing 10.9. Heap Sort, Part II
// remove roots continuously
for(k=n-1; k>0; k--)
{
    // k points to the (sorted) back of the array.
    //  switch the root with the element at the back.
    tmp = data[k];
    data[k] = data[0];

    // make the array into a heap again.
    for(j = 0; (j<<1) <= k-2; j = i)
    {
       j2 = j<<1;
       if ( j2+2 > k-1 )    // only one child
           i = j2+1;
       else
       {
           if ( data[j2+1] < data[j2+2] )
             i = j2+2;
           else
             i = j2+1;
       }

       // i now points to the child with the highest value.
       if (tmp < data[i])
          data[j] = data[i];     // promote child
       else
          break;
    }
    data[j]  = tmp;
}

How does Listing 10.9 sort a heap? It takes the root node of the heap, which is the largest number in the array, and switches it with the last element in the array; this is the final leaf child of the final node. The ex-root element is now exactly where it needs to be. However, the new heap needs to be reevaluated. This is done in the same way as in the inner loop of the code snippet that made the initial heap. When the reevaluation of the heap has been done, the root element is again taken and switched with the final leaf of the final node. The sorted part of the array thus grows from the back of the array to the front.

Pros and Cons of Heap Sort

Heap sort is an O(n log2 n) algorithm (for best and worst case!), which means it is the fastest algorithm we have seen so far. Furthermore, its overhead is quite small, which means it can also be used for small amounts of data.

The pros of heap sort are

  • High efficiency for average and worst case

  • Small runtime footprint (in-place)

The cons of heap sort are

  • Unstable algorithm

  • More complex algorithm than the previous sorting methods discussed

In the companion file 10source03.cpp, a complete HeapSort function can be found which is usable as a template. All that is missing from the snippets in this section to create a complete HeapSort() function is the function header:

void HeapSort(int *data, int n)
{
    // add Part I: Listing 10.8.

    // add Part II:  Listing 10.9.
}

Quick Sort

Quick sort, invented by C. A. R. Hoare, is an O(n log2 n) algorithm in the average case. It is less complex to implement than heap sort, which is why it can be slightly faster in practice; however, its worst-case performance is O(n^2). Precautions can be taken to make worst-case performance as unlikely as possible. Because of its high speed, quick sort has been added to the standard C library. Its name in the library is qsort(). When calling qsort() you need to provide your own compare function, which qsort() will use to determine the larger of two array elements as it sorts through your data. This makes it possible to use the standard implementation for a wide variety of types. The compare function must, of course, conform to the rules specified for arguments and the return value, or qsort() will not work. Listing 10.10 shows an example of the use of qsort() with a self-made compare function that allows qsort() to sort arrays of longs.

Code Listing 10.10. Using the Standard QSort
#include <stdlib.h>    // for qsort()

int compare( const void *arg1, const void *arg2 )
{
/* Return value:
    < 0 elem1 less than elem2
    0 elem1 == elem2
    > 0 elem1 greater than elem2
*/
   return (*(long*) arg1 -  *(long*)arg2);
}

int main()
{
    long longArray[] = { 300, 100, 5000, 200 };           // example data
    size_t n = sizeof(longArray) / sizeof(longArray[0]);  // number of elements

    qsort((void *)longArray, n, sizeof(long), compare);
    return 0;
}

In this example, the first qsort argument is longArray, which is the array of longs that is to be sorted. The second argument is n, which is the number of elements in the array. The third argument is the size of each array element, which is calculated with the expression sizeof(long). The final argument is a pointer to the compare function. Note that the compare function receives two pointers to array elements. Because you, as a user of the qsort function, know which type of array elements you are sorting, you can cast to that type in the compare routine. The valid range of return values is included as a comment in the example.
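
The same mechanism works for more complex elements; you simply write a compare function that looks at the field you want to use as the sorting key. The following is a minimal sketch under the assumption of a hypothetical record type (the struct, its fields, and the array name are illustrative, not part of the book's sources):

struct Person
{
    long id;          // the field used as the sorting key
    char name[32];
};

int ComparePersons( const void *arg1, const void *arg2 )
{
    const Person *p1 = (const Person *)arg1;
    const Person *p2 = (const Person *)arg2;

    // Return <0, 0 or >0, exactly as qsort expects.
    if (p1->id < p2->id) return -1;
    if (p1->id > p2->id) return 1;
    return 0;
}

// Example call, assuming an array Person people[numberOfPersons]:
//     qsort(people, numberOfPersons, sizeof(Person), ComparePersons);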

When you want to use a quick sort for only a specific type of data structure, it can be wise to implement your own version of the algorithm. At the very least, you will be able to eliminate the call to the compare function by inserting the compare code directly into the quick sort routine itself. You may be able to do more, depending on your data set. Here is the theory behind quick sort.

The quick sort algorithm sorts a data array in two phases: the partition phase and the sort phase. In the partition phase, the data array is divided into two parts around what is called a pivot element: all the elements which are smaller than the pivot are put together in one part of the array, and all the elements which are larger than the pivot are placed on the other side of the pivot. Here is an example:

Initial array:  1 7 3 9 6 4 8 2 5
Pivot:          array[4] = 6

Pivoting:
1 7 3 9 6 4 8 2 5      7>6 and 5<6, switch
1 5 3 9 6 4 8 2 7      9>6 and 2<6, switch
1 5 3 2 6 4 8 9 7      4<6, switch
1 5 3 2 4 6 8 9 7

Starting from the two outermost boundaries of the array (the first and the last elements), elements are compared with the pivot value. Values which are larger than the pivot and are on its left side are switched with elements which are smaller than the pivot on the right side. Note that the number of mismatches on the left side does not have to coincide with the number of mismatches on the right side. Array evaluation simply continues from both sides until they meet in the middle.

How far this first partition step actually helps you along the way depends on the value of the pivot element which is chosen. The maximum gains are had when the pivot element just happens to be the median of all values in the array. In the preceding example, the element in the middle of the array was chosen but any other element would also have worked as it is the value of the pivot that counts and not its position in the array. There is a danger in choosing the first or the last element in an array as the pivot element, however. When you use quick sort on an array which is already sorted, taking the first or last element as a pivot creates the largest overhead.
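
A common precaution, shown here as a sketch, is the so-called median-of-three rule: take the first, middle, and last elements and use the one with the median value as the pivot. This is not how Listing 10.11 chooses its pivot (it simply takes the middle element), but it makes the worst case on already sorted input far less likely at the cost of two extra comparisons:

// Return the index of the element whose value is the median of
// data[0], data[(n-1)/2] and data[n-1]. Sketch only; not from the book's sources.
template <class T> inline long MedianOfThreeIndex(const T *data, long n)
{
    long a = 0, b = (n - 1) / 2, c = n - 1;

    if (data[a] < data[b])
    {
        if (data[b] < data[c]) return b;        // a < b < c
        else if (data[a] < data[c]) return c;   // a < c <= b
        else return a;                          // c <= a < b
    }
    else
    {
        if (data[a] < data[c]) return a;        // b <= a < c
        else if (data[b] < data[c]) return c;   // b < c <= a
        else return b;                          // c <= b <= a
    }
}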

After the partition phase has been completed, the sort phase commences. The sort phase consists of two recursive calls to the quick sort function itself. The first call will execute another quick sort on the left half of the partitioned array, and the second call will execute another quick sort on the right half of the array. This way, each half of the array is pivoted once more around a new pivot, after which two more recursive calls can be made. Recursion stops when a quick sort invocation receives an array with size 1. It becomes clear now why choosing the first (or last) element of a sorted array as a pivot creates an unfortunate situation; each recursive call will split the original array into a single element and the rest of the array. This means there will be as many recursive calls as there are elements in the array, resulting in a worst case scenario of O(n^2). Here is a functional description of the quick sort algorithm:

int Partition (int data[], int n)
{
   // Select a pivot element.

   // re-order data[0] till data[n-1] until the following is true:

   //  data[0] till data[pivot-1] <= data[pivot] and
   //  data[pivot+1] till data[n-1] >= data[pivot]

   return pivot; // position of pivot element AFTER re-ordering.
}

void QuickSort(int data[], int n)
{
   if (n > 1)
   {
      int pivot = Partition(data, n);

      if (n < 3) return;

      QuickSort(data, pivot);
      QuickSort(data + pivot + 1, n - pivot - 1);
   }
}

Note that the actual sorting can be sped up by selecting a good pivot value. However, remember that the more complicated you make the method of determining the pivot, the slower the partitioning phase will be, unless, of course, you have specific information on the array you want to sort. Note also the two if statements. The first if statement stops execution for arrays with one element or fewer; there is just no sorting to do with fewer than two elements. The second if statement stops on an array of two elements or fewer. This is possible because after partitioning an array of two elements, no more sorting needs to be done; the two elements are already in the correct order.

There is another optimization in this example. Many quick sort implementations used today take three parameters: the data array and an upper and lower bound variable to identify which part of the array is to be sorted. By passing an updated array beginning, and the total number of elements that are left in the part of the array to sort, one parameter has been eliminated. This is interesting because calls to quick sort are made recursively, and the less information each recursion places on the stack, the smaller the runtime footprint will be. This brings us to another optimization, which is slightly more complicated. The two recursive calls that quick sort makes to itself can be translated into one recursive call and one iteration. Functionally, the body of the quick sort function would then look like this:

void QuickSort(int data[], int n)
{
   while(n > 1)
   {
      int pivot = Partition(data, n);

      if (n < 3) return;

      // call quick sort on the second part.
      QuickSort(data + pivot + 1, n - pivot - 1);

      // All that remains to be sorted is first part,
      //  adapt n and restart.
      n = pivot; // n is upperbound + 1!
   }
}

Of course the Partition functionality does not have to be placed in a separate routine. By placing it in the quick sort itself, another function call is eliminated. Also, you can tweak which part of the partitioned array you select for the recursive call and which part for the iteration. Listing 10.11 shows a quick sort function which takes all the mentioned optimizations into account.

Code Listing 10.11. Fast Quick Sort
template <class T> inline void FQuickSort(T *data, long n)
{
   T pivot, tempElem;
   long left, right, tempIndex;

   while(n > 1)
   {

     right = n-1;
     left = 1;

     // switch the pivot with the first array element.
     tempIndex = right / 2;
     pivot = data[tempIndex];
     data[tempIndex] = data[0];

     // Partition the array.
     while (1)
     {
        while (pivot > data[left] && left < right) left++;
        while (pivot <= data[right] && right >= left) right--;
        if (left >= right) break;

        // Switch two array elements.
        tempElem = data[left];
        data[left] = data[right];
        data[right] = tempElem;
        right--; left++;
     }
     // switch pivot back to where it belongs.
     data[0] = data[right];
     data[right] = pivot;

     // Sort phase.
     if (n < 3) return;

     tempIndex = n - right - 1;
     if (right > tempIndex)
     {
        // first part is largest.
        FQuickSort(data + right + 1, n - right - 1);
        n = right; // n is upperbound + 1!
     }
     else
     {
        // second part is largest.
        FQuickSort(data, right);
        data += right + 1;
        n = tempIndex;
     }
  }
}

The quick sort implementation of Listing 10.11 is presented as a template so it can be called for arrays of different types. Calling examples of this are given after a brief walkthrough.

There are only two new items introduced in the FQuickSort function. First of all, the pivot is chosen as the middle element in the array. To make the partition routine less complex, the pivot is switched with the first array element before partitioning. Second, the largest of the partitioned array parts is sent into iteration, and the smallest into recursion.

As promised, here is an example of the usage of this quick sort implementation:

int   integerArray[] = { 100, 1001, 24, 317, 314, 2000, 2009, 7009 };
short shortArray[]   = { 10, 20, 11, 317, 314, 510, 12, 709 };

FQuickSort(integerArray, 8);
FQuickSort(shortArray, 8);

The advantage of quick sort is

  • High efficiency for average case

The cons of quick sort are

  • Very low efficiency for worst case

  • Large runtime footprint

  • Unstable algorithm

Quick sort averages O(n log2 n) with a worst case of O(n^2). Note that although quick sort does the sorting of its data in place, it is not, in fact, an in-place algorithm, as its stack usage can run up pretty high on large data sets.

As promised, the accompanying sources on the Web site contain sorting examples which sort more than just arrays of integers, and comparisons are made between sorting with a quick sort, a radix sort, and the standard qsort:

10source04.cpp Sorts arrays of pointers to strings
10source05.cpp Sorts records by using operator overloading

At the end of the following section, more will be said about these sources.

Radix Sort

Radix sort takes a whole different approach to sorting. Radix sort no longer compares sortable items with each other; the essence of radix sort is that it sorts according to the value of subelements. For instance, when the elements to sort are of type integer, radix sort will iterate the array four times, once for each byte of an integer. During the first iteration, it sorts the elements of the array according to the value of the least significant byte, and it ends up sorting elements according to the value of the most significant byte during the fourth iteration. Here is a numeric example:

Original   Iteration Iteration   Iteration Iteration
array:         I         II         III        IV

1005         2204       2204       1005       1005
2204         1005       1005       2037       1508
2037         2037       1508       2204       2037
1508         1508       2037       1508       2204

The first column of the preceding example shows a random array. The second column shows what the array looks like after the first iteration. During this first iteration, the numbers are ordered according to the value of the least significant digit. Radix sort thus focuses on the following values during the first iteration: 5, 4, 7, 8. It places them in the order 4, 5, 7, 8. The second iteration looks only at the following digit, recognizing the values 0, 0, 3, 0. The third iteration recognizes the values 2, 0, 5, 0 and the fourth iteration recognizes the values 1, 2, 2, 1. As this example demonstrates, the number of iterations that radix sort needs to make in order to sort an array of elements depends not on the number of elements to be sorted, but on the size of a single element. This means that radix sort is an O(m*n) algorithm—where m is the number of subelements. Sorting a data type of two bytes can be done in two iterations.

A radix sort implementation needs enough space for a second array. This is because each radix sort iteration copies all the elements from one array to the other, placing the elements at a different position in the second array. However, a following iteration will copy all the elements back to the first array, so only two arrays are ever needed. But how does radix sort know where in the second array to place the copied elements? In order to do this, it needs to divide the new array into sections. There should be a section for every possible value of the subelement which it focuses on.

In Figure 10.4, the array elements consist of two subelements: the first and second digit. Two iterations are needed to sort this kind of element. The first iteration will focus only on the least significant digit. Before this iteration starts, the occurrences of each value of the least significant digit are counted. The results are two occurrences of the value 0, two occurrences of the value 1, and a single occurrence of the value 6. With this information the destination array can be divided into subsections. An index is created for each subsection, pointing to the first available space in that subsection. Now the source array is iterated over and each source value is copied to the appropriate subsection in the destination array. Note that when a value is copied to a subsection, the index of that subsection must be increased by 1. This way an index will always point to the next available space.

Figure 10.4. Radix sort for two subelements.


The second iteration will focus only on the most significant digit. It finds the following occurrences of that digit: a single occurrence of the value 2, two occurrences of value 3, and another two occurrences of value 4. Three subsection indexes are set accordingly and the elements are copied back to the original array (which has become the destination for this iteration).

Note that radix sort must always work from the least significant digit or subelement to the most significant one, and that each iteration must traverse the source array from front to back; this ensures that the ordering created by a previous iteration is maintained. For example, in the first iteration of the preceding example, the values { 46, 41} are placed in the order { 41, 46} in relation to each other. In the second iteration, these values come together in the same subsection. If this second iteration had started at the back of the destination array, these values would have been switched again, because the value 46 would have been found and copied before the value 41. This would cause an unsorted output.

This is also where endian-ness starts to play a role. On Intel machines, which use little-endian byte order, the number 0x01020304 is placed in memory as follows: 0x04030201. On big-endian machines, such as 68xx architectures and MIPS, this number is placed in memory as 0x01020304. Simply put, this means that finding the least and most significant bytes may have to be done differently on different machines. With the cpu_info() function found in the booktools, you can determine the endian-ness of a target machine. Note that endian-ness only plays a role for base types. For strings, for instance, the endian-ness of a target machine is of no consequence because strings are placed in memory byte for byte. Listing 10.12 shows an implementation of radix sort.

Code Listing 10.12. Radix Sort, Part I
#include <string.h>    // for memset()

template <class T>
inline void radix(int byte, unsigned long n, T *source, T *dest)
{
    unsigned long count[256], i, c, s=0;
    unsigned char *q;
    int size = sizeof(T);

    // Create array of sub section indexes.
    memset(&count[0], 0x0, 256*sizeof(unsigned long));

    // Count occurrence of every byte value in T
    q = (unsigned char *)source;
    for (i = 0; i < n; i++)
    {
       count[q[byte]]++;
       q+=size;
    }

    // Create indexes from counters.
    for (i = 0; i < 256; i++)
    {
         c = count[i];
         count[i] = s;
         s += c;
    }

    // Copy elements from the source array to the destination array.
    q = (unsigned char *)source;
    for (i = 0; i < n; i++)
    {
         dest[count[q[byte]]++] = *(T*)q;
         q+=size;
    }
}

This implementation is presented as a template so it can be called for arrays of different types. Calling examples of this are given after a brief walkthrough.

Note that because computers work more easily with bytes than with digits 0–9, each iteration will look at a single byte of an array element. There will, of course, be at most 256 indexes, one for each value of a byte. The array count is used to store the number of occurrences of each subelement value. The memset command sets the initial values of this array to 0. After the occurrences of each byte value have been counted, indexes must be created. The value of a subsection index is the sum of all preceding counts. The final loop in this implementation copies array elements from the source to the destination array. You will notice that this implementation performs only a single sorting pass; it receives the position of the subelement to focus on through the argument byte. Thus the function radix() must be called once for each subelement in an array. The other arguments that this function receives are n, denoting the number of elements in the array; source, which is the source array; and dest, which is the destination array. Listing 10.13 shows a function that will call radix() for every subelement of an array.

Code Listing 10.13. Radix Sort, Part II
#include <stdlib.h>    // for malloc() and free()
#include <string.h>    // for memcpy()

#define L_ENDIAN // or don't define L_ENDIAN.

template <class T> void RadixSort (T *data, unsigned long n)
{

    int i, elemsize = sizeof(T);
    T *temp = (T*)malloc (n * elemsize);

    if (temp != NULL)
    {
#ifdef L_ENDIAN
        for (i=0; i < elemsize-1; i+=2)
        {
           radix(i, n, data, temp);
           radix(i+1, n, temp, data);
        }
        if (elemsize & 1)    // odd elemsize  -> one more to go?
        {
            radix(elemsize-1, n, data, temp);
            memcpy(data, temp, n * elemsize);
        }
#else
        for (i = elemsize-1; i > 0; i-=2)
        {
           radix(i, n, data, temp);
           radix(i-1, n, temp, data);
        }
        if (elemsize & 1)    // odd elemsize  -> one more to go?
        {
            radix(0, n, data, temp);
            memcpy(data, temp, n * elemsize);
        }

#endif
       free (temp);
    }
}
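
Listing 10.13 selects the byte order at compile time through the L_ENDIAN define. If you prefer to detect the byte order at run time, a generic check can be written in a few lines; the following sketch is such a check, it is not the cpu_info() function from the booktools:

// Store a known multi-byte value and look at which byte ends up first in memory.
inline bool IsLittleEndian()
{
    unsigned long value = 0x01020304UL;
    unsigned char *firstByte = (unsigned char *)&value;

    // On a little-endian machine the least significant byte (0x04) comes first.
    return (*firstByte == 0x04);
}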

As promised, here is an example of the usage of this radix sort implementation:

int   integerArray[] = { 100, 1001, 24, 317, 314, 2000, 2009, 7009 };
short shortArray[]   = { 10, 20, 11, 317, 314, 510, 12, 709 };

RadixSort(integerArray, 8);
RadixSort(shortArray, 8);

As was mentioned before, radix sort is an O(m*n) algorithm. The number of elements in an array determines how long an iteration will take, but the number of subelements determines how many iterations are needed.

The pros of radix sort are

  • High efficiency for average-size data elements

  • Fairly easy to implement

The cons of radix sort are

  • Low efficiency for large data elements

  • Large runtime footprint

  • Unstable algorithm

For data types whose size is a multiple of two bytes, radix sort could be tweaked to look at a word per iteration instead of a byte, as the sketch below illustrates.
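
The following sketch shows what such a tweak could look like; it mirrors the radix() function of Listing 10.12 but treats a 16-bit word as the digit, so it needs a 65,536-entry index table. The function name, the word parameter, and the assumption that the 16-bit words can be read directly from the element bytes (alignment, little-endian order) are illustrative rather than taken from the book's sources.

template <class T>
inline void radix16(int word, unsigned long n, T *source, T *dest)
{
    // One index per possible 16-bit value; static to keep it off the stack.
    static unsigned long count[65536];
    unsigned long i, c, s = 0;
    unsigned char *q;
    int size = sizeof(T);

    memset(count, 0, sizeof(count));

    // Count the occurrences of every 16-bit value at word offset 'word'.
    q = (unsigned char *)source;
    for (i = 0; i < n; i++)
    {
        count[*(unsigned short *)(q + 2*word)]++;
        q += size;
    }

    // Turn the counters into subsection indexes.
    for (i = 0; i < 65536; i++)
    {
        c = count[i];
        count[i] = s;
        s += c;
    }

    // Copy the elements into their subsections in the destination array.
    q = (unsigned char *)source;
    for (i = 0; i < n; i++)
    {
        dest[count[*(unsigned short *)(q + 2*word)]++] = *(T *)q;
        q += size;
    }
}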

Because sorting time for radix sort is a function of the key length—O(m*n)—and for quick sort it is a function of the number of elements—O(n log2 n)—it would seem that radix sort would be faster than quick sort when m < log2 n. However, this is stated too simply, as both methods have a different kind of overhead. The real point at which implementations of both sorting techniques overtake each other can be determined by doing speed tests with different element lengths. The following two files that can be found on the Web site do this for different string lengths:

10source04.cpp Sorts an array of pointers to strings
10source05.cpp Sorts records using operator overloading

You should note that for a sufficiently large number of records and a sufficiently small key length, radix sort is faster than quick sort.

Merge Sort

Merge sort is not really a sorting algorithm; rather, it is an algorithm that merges different sorted bases together. Its obvious use is that of merging two previously separate databases of the same type together. It can, however, also be used to sort a database that does not fit into memory in its entirety. This means that merge sort can help overcome serious footprint issues.

In order to sort a database that does not fit into memory, you first have to divide it up into smaller chunks that do fit into memory. Let's say you keep your (un)sorted data on some kind of external disk. Each chunk needs to be loaded from the disk, sorted (using whatever algorithm you deem best), and stored back on the disk. Now you have a collection of sorted chunks. This is where merge sort comes into the picture. Merge sort takes two or more of these sorted chunks and merges them into a single chunk, which is also sorted. Merge sort keeps merging chunks of sorted data until a single, sorted output is created.

A single merge has to reserve memory on the disk for its output file. This output file will be as large as the sum of the sizes of the selected chunks that will be merged. The space that is available on the disk thus determines which, and how many, chunks to select for a single merge step. The merging itself consists of taking the first elements of each of the selected chunks and placing the smallest of those in the output file. This is done repeatedly until all elements of the selected chunks have been placed in the output file.
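
The merge step itself is straightforward. The following is a minimal in-memory sketch of it (the function name and parameters are illustrative); a real implementation would read the chunks from disk files and write the output file incrementally, but the element-by-element logic is the same:

void MergeChunks(const long *chunkA, long nA,
                 const long *chunkB, long nB,
                 long *output)
{
    long ia = 0, ib = 0, io = 0;

    // Repeatedly place the smaller of the two front elements in the output.
    while (ia < nA && ib < nB)
    {
        if (chunkA[ia] <= chunkB[ib])
            output[io++] = chunkA[ia++];
        else
            output[io++] = chunkB[ib++];
    }

    // One chunk is exhausted; copy the remainder of the other.
    while (ia < nA) output[io++] = chunkA[ia++];
    while (ib < nB) output[io++] = chunkB[ib++];
}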

At this point, the selected chunks can be deleted, creating space on the disk. This space can be used for the following merge step. Figure 10.5 shows a possible two-step merging of selected chunks.

Figure 10.5. Two-step merge sort.


In Figure 10.5, there are four sorted chunks that need to be merged into a single database (or output file): A, B, C, and D. The first merge creates intermediate output files containing all elements of A+B and C+D. The second merge creates a single output file containing all elements of A+B+C+D.
