Arrays

One of the most straightforward data structures is the array. Using one-dimensional arrays for storing static data sets or data sets with a predictable size causes few problems. And it is footprint-efficient too; no extra memory is needed for storing pointers to other elements of the data set and fragmentation does not occur during the addition and deletion of elements. Using arrays for dynamic data sets is a different matter, however, as you have already seen in Chapter 9, "Efficient Memory Management." This section, therefore, looks at how efficient arrays actually are when it comes to inserting, deleting, searching, traversing, and sorting data sets.

The file 11Source01.cpp that can be found on the Web site compares the performance of arrays against that of other data structures. There is a section at the end of this chapter that presents the timing results of these tests.

Inserting

Adding elements to arrays can be tricky at times. Basically, there are two situations in which to add elements to an array: when there is still enough space left in the array and when there is not enough space left in the array. The following sections take a closer look at performance and footprint implications of these two situations.

Inserting Elements into an Array That Is Not Full

When an array is not full yet, all elements found after the place where the new element is to be inserted must be moved up one position. In the worst case, the new element must be placed at the front of the array (array[0]) and all other elements need to be moved. This results in an O(n) operation. In the best case, the new element must be placed at the back of the array resulting in an O(1) operation. This proves that adding sorted elements to an array is wildly efficient. Furthermore, when elements are read from a sorted storage—a database file, for instance—the whole file can be read as a single block and placed into memory. Chapter 12, "Optimizing IO," which deals with efficient IO usage, will come back to this.

Inserting Elements into an Array That Is Full

When you need to add an element to an array that is full, you are often out of luck. It is often not possible for you to reserve more memory at the back of an existing block or array. This means that to increase the number of elements that will fit in the array, a new array probably needs to be allocated. All the elements from the old array need to be copied to this new array, and the new element itself needs to be added to the new array also—at least an O(n) operation. This is a lot of work and uses quite a lot of memory, because for a short time both the old and the new array exist simultaneously. When this is not possible because of footprint constraints, it means that parts of the original array need to be transferred to some external storage and back again in order to successfully create the new array. It seems unlikely that you will want to do this for each element that needs to be added to a full array so, in practice, a full array will be extended with a number of empty spaces. An example of this was given in Chapter 10, "Blocks of Data." It is worth thinking about whether or not you can predict the number of elements that will be—or are likely to be—added.

Deleting

Deleting elements from an array is very similar to inserting elements, as far as performance and footprint issues are concerned. Keeping a sorted array without any holes in it involves moving all the elements after the deleted element down a space in the array. In the worst case, this is again an O(n) operation; in the best case, it is an O(1) operation. Resizing the array to take up less memory as it shrinks may again involve a costly copy action. An idea here may be to mark items as deleted and postpone actual removal of elements until a good number of elements can be removed in one go.

Searching

A great advantage of arrays is that they are random access. No extra pointers or indexes, or even traversing, is necessary when you know the number of the element you want (a = array[5];). This is an O(1) operation. When you need to find out if a particular element is part of an array, you have a worst-case searching time of O(n) when the array is not sorted. This is because you simply must look at the elements one at a time; when you are out of luck, the element you are looking for is either not in the array or it just happens to be the last element of the array that you check. For sorted arrays, searching is a much happier business. You could opt to use a binary search to quickly locate an element. With a binary search, you recursively check the value of the search element against a value that lies between two boundaries inside the array. In a single check, the search value is checked against the value of the element in the middle of the array. When the search value is greater than the value in the middle, a binary search is started on the second half of the array. When the search value is smaller than the value in the middle, a binary search is started on the first half of the array. Here is an example:

Search element: 453.
Array:          001 003 005 007 011 099 207 305 407 453 599 670 700 803 999
n:              15

Step 1: Compare 453 against the value of the middle element (which is array[n/2] = 305). 453 > 305, so do another binary search on the second half of the array. For this new binary search, the array to focus on is a subset of the original: the lower boundary (lb) = 8, the upper boundary (ub) is 15, and n' = 14 – 8 + 1 = 7.

Step 2: Compare 453 against the value of the middle element (array[((ub-lb+1)/2)+lb] = 670). 453 < 670, so do another binary search on the first half of this new array. For this new binary search, the array to focus on is a subset of the previous: the lower boundary = 8, the upper boundary is 10, and n" = 10 – 8 + 1 = 3.

Step 3: Compare 453 against the value of the middle element (array[((ub-lb+1)/2)+lb] = 453). 453 = 453—game, set, and match.

Listing 11.1 shows an implementation of a binary search algorithm.

Code Listing 11.1. Binary Search Algorithm
long binsearch(int item, int data[], long lb, long ub)
{
   while(1)
   {
      long n = ub-lb;  // n = number of elements - 1

      // When there are 1 or 2 elements left to check.
      if (n <= 1)
         if (item == data[lb])
            return lb;
         else if (item == data[ub])
            return ub;
         else
            return -1;

      // Where there are more than 2 elements left to check.
      long middle = ((n+1)/2) + lb;
      int mItem = data[middle];

      if (item == mItem)
         return middle;

      if(item < mItem)
         ub = middle-1;
      else
         lb = middle+1;
   }
}

This binsearch function is utilized in the file 11Source01.cpp that can be found on the Web site.

binsearch() takes as input parameters the item to search for (item), a pointer to the first element in the array (data), the index of the first element (lb), and the index of the last element (ub). Usage is as follows:

int array[] = { 1,2,5} ;
long      n = 3;
int    item = 2

long a = binsearch(item, array, 0, n-1);

Finding an array element using a binary sort is an O(log2 n) operation. It is very similar to searching a balanced binary tree, as you will see later on.

Traversing

Traversing an array is pretty straightforward. An incrementing counter as array index will do the job nicely (array[i++]). Furthermore, arrays can be traversed in both directions (forward and backward) without special functionality. It does not matter on which array element you start and traversing can easily wrap from the end of an array back to the front: (i++; i %= n;).

Sorting

Arrays can be sorted very efficiently, as you learned in Chapter 10.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset