Handling Any Number of Prizes

One of the primary advantages of the method we've used so far to find the heaviest pumpkin(s) is that we didn't have to save the weights of all the pumpkins as we went along. If we don't mind saving all the weights, then we can solve the “three prize” problem in a different way. Let's assume for the purpose of simplicity that there are only five weights to be saved, in which case the solution looks like this:

1.
Read in all of the weights.

2.
Make a list consisting of the three highest weights in descending order.

3.
Award the first, second, and third prizes, in that order, to the three entries in the list of highest weights.

Now let's break those down into substeps that can be more easily translated into C++:

1.
Read in all of the weights.

a. Read first number

b. Read next number

c. If we haven't read five weights yet, go back to 1b

2.
Make a list consisting of the three highest weights in descending order.

a. Find the largest number in the original list of weights

b. Copy it to the sorted list

c. If we haven't found the three highest numbers, go back to 2a

Oops. That's not going to work, since we'll get the same number each time.[2] To prevent that from happening, we have to mark off each number as we select it. Here's the revised version of step 2:

[2] I realize I'm breaking a cardinal rule of textbooks: Never admit that the solution to a problem is anything but obvious, so the student feels like an idiot if it isn't actually obvious. In reality, even a simple program is difficult to get right, and indicating the sort of thought processes that go into analyzing a programming problem might help demystify this difficult task.

2.
Make a list consisting of the three highest weights in descending order.

a. Find the largest number in the original list of weights

b. Copy it to the sorted list

c. Mark it off in the original list of weights, so we don't select it again

d. If we haven't found the three highest numbers, go back to 2a

3.
Award the first, second, and third prizes, in that order, to the three entries in the list of highest weights.

a. Display first number in the list

b. Display next number in the list

c. If we haven't displayed them all, go back to 3b

Unlike our previous approach, this can easily be generalized to handle any number of prizes. However, we have to address two problems before we can use this approach: First, how do we keep track of the weights? And second, how do we select out the highest three weights? Both of these problems are much easier to solve if we don't have a separate variable for each weight.

The vector Data Type

We often need to keep track of a number of closely related variables in C++ programs, so the standard library provides a type called vector, defined in the header file <vector>. A vector is a variable containing a number of “sub-variables” that can be addressed by position in the vector; each of these sub-variables is called an element. A vector has a name, just like a regular variable, but the elements do not. Instead, each element has a number, corresponding to its position in the vector. For example, we might want to create a vector of short values called Weight, with five elements. To do this, we would write this line: vector<short> Weight(5);. Then, to refer to one of the elements of the vector Weight, say element 3, we could use the expression Weight[3].

The standard library is the result of an immense amount of work by very talented programmers over the course of quite a few years. Whenever possible, in code that you intend to be used over a period of time, you should use the standard library, as you are very unlikely to be able to create better code yourself. And even if you could improve on it somewhat, you would probably spend your time more productively by writing the parts of your program that aren't already taken care of by the standard library.

The Vec Data Type

So you may be surprised to learn that we're not going to use the vector data type, but one called Vec, defined in the header file "Vec.h". I have created this type, which uses the standard library vector to do most of the actual work, following the example of Bjarne Stroustrup in The C++ Programming Language.

Why are Bjarne and I creating our own Vec types rather than using one from the standard library? Because the standard library is designed for professional programmers, not for people who are just learning the language. For this reason, it is oriented more toward efficiency than toward safety. To make the learning process easier and more enjoyable for novices, the Vec type always checks whether an element number (such as 3 in the above example) is valid before trying to use it, which the standard vector does not. We'll see how this helps us later on.

We haven't heard from Susan for awhile, but the following exchange should make up for that.

Susan: OK, why do we need another header (#include "Vec.h")?

Steve: Each header contains definitions for a specific purpose. For example, <iostream> contains definitions that allow us to get information in (I) and out (O) of the computer. On the other hand, “Vec.h" contains definitions that allow us to use Vecs.

Susan: So then using a Vec is just another way of writing this same program, only making it a little more efficient?

Steve: In this case, the new program can do more than the old program could: the new program can easily be changed to handle virtually any number of prizes, whereas the old program couldn't.

Susan: So there is more than one way to write a program that does basically the same thing?

Steve: As many ways as there are to write a book about the same topic.

Susan: I find this to be very odd. I mean, on one hand the code seems to be so unrelentingly exact; on the other, it can be done in as many ways as there are artists to paint the same flower. That must be where the creativity comes in. Then I would expect that the programs should behave in different manners, yet accomplish the same goal.

Steve: It's possible for two programs to produce similar (or even exactly the same) results from the user's perspective and yet work very differently internally. For example, the "vectorized" version of the weighing program, if we had it display only the top two weights, would produce exactly the same final results as the final "non-vectorized" version, even though the method of finding the top two weights was quite different.

Now we can refer to the individual elements of the Vec called Weight by using their numbers, enclosed in square brackets ([ ]); the number in the brackets is called the index. Here are some examples:

Weight[1] = 123;
Weight[2] = 456;
Weight[3] = Weight[1] + Weight[2];
Weight[i+1] = Weight[i] + 5;

As these examples indicate, an element of a Vec can be used anywhere a “regular” variable can be used.[3] But an element of a Vec has an attribute that makes it much more valuable than a "regular" variable for our purposes here: we can vary which element we are referring to in a given statement, by varying an index. Take a look at the last sample line, in which two elements of the Vec Weight are used: the first one is element i+1 and the other is element i.[4] As this indicates, we don't have to use a constant value for the element number, but can calculate it while the program is executing. In this case, if i is 0, the two elements referred to are element 1 and element 0, while if i is 5, the two elements are elements 6 and 5, respectively.

[3] What I'm calling a "regular" variable here is technically known as a scalar variable; that is, one with only one value at any given time.

[4] By the way, if you're wondering how to pronounce Weight[i], it's “weight sub i”. “Sub” is short for subscript, which is an old term for “index”.

The ability to refer to an element of a Vec by number rather than by name allows us to write statements that can refer to any element in a Vec, depending on the value of the index variable in the statements. To see how this works in practice, let's look at Figure 4.6, which solves our three-prize problem.[5]

[5] Note that the #include statement for Vec.h in Figure 4.6 uses "" rather than <> around the file name. The use of "" tells the compiler to search for Vec.h in the current directory first, and then the "normal" places that header files supplied with the compiler are located. This is necessary because Vec.h, in fact, is in the current directory. If we had written #include <Vec.h>, the compiler would look only in the "normal" places, and therefore would not find Vec.h. We will use "" for our header files, and <> for ones supplied with the compiler.

Figure 4.6. Using a Vec (codevect1.cpp)
#include <iostream>
#include "vec.h"
using namespace std;

int main()
{
  Vec<short> Weight(5);
  Vec<short> SortedWeight(3);
  short HighestWeight;
  short HighestIndex;
  short i;
  short k;

  cout << "I'm going to ask you to type in five weights, in pounds." << endl;

  for (i = 0; i < 5; i ++)
   {
   cout << "Please type in weight #" << i+1 << ": ";
   cin >> Weight[i];
   }

  for (i = 0; i < 3; i ++)
    {
    HighestWeight = 0;
    for (k = 0; k < 5; k ++)
       {
       if (Weight[k] > HighestWeight)
           {
           HighestWeight = Weight[k];
           HighestIndex = k;
           }
       }
    SortedWeight[i] = HighestWeight;
    Weight[HighestIndex] = 0;
    }

  cout << "The highest weight was: " << SortedWeight[0] << endl;
  cout << "The second highest weight was: " << SortedWeight[1] << endl;
  cout << "The third highest weight was: " << SortedWeight[2] << endl;

  return 0;
}

If you wish, you can try out this program. First, you have to compile it by following the compilation instructions on the CD. Then type vect1 to run the program under DOS. It will ask you for five weights. After you've entered five weights, the program will sort them and display the top three.

You can also run it under the debugger, by following the usual instructions for that method. When you are asked for a weight, type one in and hit ENTER just as when executing normally. After you've entered 5 weights, the program will start the sorting process.

This program uses several new features of C++ which need some explanation. First, of course, there is the line that defines the Vec Weight:

Vec<short> Weight(5);

As you might have guessed, this means that we want a Vec of five elements, each of which is a short. As we have already seen, this means that there are five distinct index values each of which refers to one element. However, what isn't so obvious is what those five distinct index values actually are. You might expect them to be 1, 2, 3, 4 and 5; actually, they are 0, 1, 2, 3, and 4.

This method of referring to elements in a Vec is called zero-based indexing, as contrasted with the seemingly more natural one-based indexing where we start counting at 1. Although zero-based indexing might seem odd, assembly language programmers find it perfectly natural because the calculation of the address of an element is simpler with such indexing; the formula is “address of an element = (address of first element) + (element number) * (size of element)”.

This bit of history is relevant because C, the predecessor of C++, was originally intended to replace assembly language so that programs could be moved from one machine architecture to another with as little difficulty as possible. One reason for some of the eccentricities of C++ is that it has to be able to replace C as a “portable assembly language” that doesn't depend on any specific machine architecture. This explains, for example, the great concern of the inventor of C++ for run-time efficiency, as he wished to allow programmers to avoid the use of C or assembly language for efficiency.[6] Since C++ was intended to replace C completely, it has to be as efficient as possible; otherwise, programmers might switch back from C++ to C whenever they were concerned about the speed and size of their programs.

[6] Run-time efficiency means the amount of time a program takes to run, as well as how much memory it uses. These issues are very significant when writing a program to be sold to or used by others, as an inefficient program may be unacceptable to the users.

A Historical Example of One-based Indexing

While we're on the topic of zero-based indexing, I'd like to mention a case where the inventors of a commonly used facility should have used zero-based indexing, but didn't. We're still suffering from the annoyances of this one.

Long ago, there was no standard calendar with year numbers progressing from one to the next when January 1st came around. Instead, years were numbered relative to the reign of the current monarch; for example, the Bible might refer to “the third year of Herod's reign”. This was fine in antiquity, when most people really didn't care what year it was. There weren't very many retirement plans or 50th wedding anniversaries to celebrate anyway. However, eventually historians realized that it was a major nuisance to try to calculate the age of someone who was born in the fourth year of someone's reign and died in the tenth year of someone else's. According to Grolier's Multimedia Encyclopedia:

'About AD 525, a monk named Dionysius Exiguus suggested that years be counted from the birth of Christ, which was designated AD (anno Domini, “the year of the Lord”) 1. This proposal came to be adopted throughout Christendom during the next 500 years. The year before AD 1 is designated 1 BC (before Christ).'

The encyclopedia doesn't state when the use of the term BC started, but the fact that its expansion is English rather than Latin is a suspicious sign indicating that this development was considerably later. In any event, this numbering system made matters considerably easier. Now, you could tell that someone who was born in AD 600 and died in AD 650 was approximately 50 years old at death.

Unfortunately, however, there was still a small problem. Zero hadn't yet made it to Europe from Asia when the new plan was adopted, so the new calendar numbered the years starting with 1, rather than 0; that is, the year after 1 BC was 1 AD. While this may seem reasonable, it accounts for a number of oddities of our current calendar:

  1. Date ranges spanning AD and BC are hard to calculate, since you can't just treat BC as negative. For example, if someone were born in 1 BC and died in 1 AD, how old was that person? You might think that this could be calculated as 1 - (-1), or 2; however, the last day of 1 BC immediately preceded the first day of 1 AD, so the person might have been only a few days old.

  2. The 20th century consists of the years 1901 to 2000; the year numbers of all but the last year of that century actually start with the digits 19 rather than 20.

  3. Similarly, the third millennium started on January 1, 2001, not 2000.

The reason for the second and third of these oddities is that since the first century started in 1 AD, the second century had to start in 101 AD; if it started in 100 AD, the first century would have consisted of only 99 years (1-99), rather than 100.

If only they had known about the zero! Then the zeroth century would have started at the beginning of 0 AD and ended on the last day of 99 AD. The first century would have started at 100 AD, and so on; coming up to current time, we would be living through the first years of the 20th century, which would be defined as all of those years whose year numbers started with 20. The second millennium would have started on January 1, 2000, as everyone would expect.[7]

[7] For much more on the history of zero, see Charles Seife's Zero: The Biography of a Dangerous Idea (ISBN 0-670-88457-X).

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset