Introduction to Polymorphism

To select the correct function to be called based on the actual type of an object at run time, we have to use polymorphism. Polymorphic behavior of our StockItem and DatedStockItem classes means that we can (for example) mix StockItem and DatedStockItem objects in a Vec and have the right Reorder function executed for each object in the Vec.

Susan wanted to know the motivation for using polymorphism here:

Susan: Why would you want to handle several different types of data as though they were the same type?

Steve: Because the objects of these two classes perform the same operation, although in a slightly different way, which is why they can have the same interface. In our example, a DatedStockItem acts just like a StockItem except that it has an additional data field and produces different reordering information. Ideally, we would be able to mix these two types in the application program without having to worry about which class each object belongs to except when creating an individual item (at which time we have to know whether the item has an expiration date).

Susan: Yes, but I don't understand why we need to do this in the first place. Why don't we just have two Vecs, one for the StockItem objects and one for the DatedStockItem objects?

Steve: Yes, it would be possible to do that. But it would make the program more complicated and wouldn't allow for adding further derived classes in a simple way. Imagine how messy the program would be if we had 10 derived classes instead of just one!

Susan also had a question about the relationship between base and derived classes:

Susan: What do the base and derived classes share besides an interface?

Steve: The derived class contains all of the member variables of the base class and can access those member variables or call any of the member functions of the base class, if they are public or protected. Of course, the derived class can also add whatever new member functions and member variables it needs.

However, there is a serious complication in using polymorphism: we have to refer to the objects via a pointer rather than directly.[1] While C++ does have a “native” means of doing this, it exposes us to all the dangers of pointers, both those that you're already acquainted with and others that we'll get to later in this chapter.

[1] We could also use a reference, as we'll see in the implementation of the << and >> operators. However, that still wouldn't provide the flexibility of using real objects. In particular, you can't create a Vec of references.

Susan wanted more details on why pointers are dangerous; here's the first installment of our discussion of this point.

Susan: You keep saying that pointers are dangerous; what do they do that is so dangerous?

Steve: It's not what they do but what their users do: mostly, create memory leaks and dangling pointers (which point to memory that has already been freed).

Susan: So pointers are dangerous because it is just too easy to make mistakes when you use them?

Steve: Yes. In theory, pointers are fine, which is probably why they're so popular in computer science courses. In practice, however, they are very error-prone.

The ideal solution to this problem is to confine pointers to the interior of classes we design so that we can keep track of them ourselves and let the application programmer worry about getting the job done. As it happens, this is possible; thus, we can obtain the benefits of polymorphism without exposing the application programmer (as opposed to the class designers; i.e., us) to the hazards of pointers. We'll see how to do that later in this chapter.

But before investigating that more sophisticated method of providing polymorphism, we need to understand the workings of the native polymorphism mechanism in C++. As we saw in Chapter 9, the address of a derived class object can be assigned to a pointer declared to be a pointer to a base class of that derived class. While this does not by itself solve the problem of calling the correct function in these circumstances, there is a way to get the behavior we want. If we define a special kind of function called a virtual function and refer to it through a pointer (or a reference) to an object, the version of that function to be executed will be determined by the actual type of the object to which the pointer (or reference) refers, rather than by the declared type of the pointer (or reference). This implies that if we declare a function to be virtual, then when a function with that signature is called via a base class pointer, the actual function to be called is selected at run time, rather than at compile time as happens with non-virtual functions. Clearly, if the actual run-time type of the object determines which version of the function is called, the compiler can't select the function at compile time.

Because the determination of the function to be called is delayed until run time, the compiler has to add code to each virtual function call to make that determination. This code uses a construct called a vtable to keep track of the locations of all the virtual functions for a given type of object so that the compiler-generated code can find the right function when the call is about to be executed.

As you might imagine, Susan had some questions about this notion of virtual function calls. Here's the beginning of that discussion:

Susan: I don't understand how the function to be executed is selected.

Steve: The mechanism depends on whether it is a virtual function. If not, the linker can figure out the exact address of the function when it is linking the program, because the type of the pointer (which is known at compile time) is used to determine which function will be called. On the other hand, with a virtual function declaration, the function to be executed depends on the actual type of the object pointed to rather than the type of the pointer to the object; since that information can't be known at compile time, the linker can't make the determination of which function to call. Therefore, in such cases, the compiler sticks code in the executable program that figures it out at run time by consulting the vtable for the particular type of object the base class pointer refers to.

The virtual Keyword

But exactly how does this help us with our Reorder function? Let's see how a virtual function affects the behavior of our final example program from Chapter 9 (nvirtual.cpp, Figure 9.39). Figure 10.1 shows the same interface as before, except that StockItem::Reorder is declared to be virtual.[2] Because the current test program (virtual.cpp) and implementation file (itemb.cpp) are almost identical to the final test program (nvirtual.cpp) and implementation file (itema.cpp) in Chapter 9, differing only in that the new ones #include "itemb.h" rather than "itema.h", I haven't reproduced the new versions of those files.

[2] You will notice that the virtual declaration for Reorder is repeated in DatedStockItem — this is optional. Even if you don't write virtual again in the derived class declaration of Reorder, it's still a virtual function in that class; the rule is “once virtual, always virtual”. Even so, I think it's clearer to explicitly state that the derived class function is virtual, so that's how I will show it in this book.

If you printed out the corresponding files from the previous chapter, you might just want to mark them up to indicate these changes. Otherwise, I strongly recommend that you print out the files that contain this interface and its implementation, as well as the test program, for reference as you go through this section of the chapter; those files are itemb.h, itemb.cpp, and virtual.cpp, respectively.

Figure 10.1. Dangerous polymorphism: Interfaces of StockItem and DatedStockItem with virtual Reorder function (codeitemb.h)
// itemb.h

#include <iostream>
#include <string>

class StockItem
{
public:
  StockItem(std::string Name, short InStock, short MinimumStock);
  virtual void Reorder(std::ostream& os);

protected:
  std::string m_Name;
  short m_InStock;
  short m_MinimumStock;
};

class DatedStockItem: public StockItem // deriving a new class
{
public:
  DatedStockItem(std::string Name, short InStock, short MinimumStock,
    std::string Expires);

  virtual void Reorder(std::ostream& os);

protected:
  static std::string Today();

protected:
  std::string m_Expires;
};

Figure 10.2 shows the output of the new test program.

Figure 10.2. virtual function call example output (codevirtual.out)
StockItem::Reorder says:
Reorder 68 units of soup

DatedStockItem::Reorder says:
Return 10 units of milk
StockItem::Reorder says:
Reorder 15 units of milk

StockItem::Reorder says:
Reorder 70 units of beans

DatedStockItem::Reorder says:
Return 22 units of ham
StockItem::Reorder says:
Reorder 30 units of ham

DatedStockItem::Reorder says:
Return 90 units of steak
StockItem::Reorder says:
Reorder 95 units of steak

Notice that the output of this program is exactly the same as the output of the previous test program (Figure 9.39), except for the last entry. With the non-virtual Reorder function in the previous program, we got the following output:

StockItem::Reorder says:
Reorder 5 units of steak

whereas with our virtual Reorder function, we get this output:

DatedStockItem::Reorder says:
Return 90 units of steak

StockItem::Reorder says:
Reorder 95 units of steak

According to our rules, the correct answer is 95 units of steak because the stock has expired, so the program that uses the virtual Reorder function works correctly while the previous one didn't. Why is this? Because when we call a virtual function through a base class pointer, the function executed is the one defined in the class of the actual object to which the pointer points, not the one defined in the class of the pointer.

To see how this works, let's start by looking at the way in which the layout of an object with virtual functions differs from that of a “normal” object. First, Figure 10.3 shows a possible memory representation of a simplified StockItem without virtual functions.

Figure 10.3. A simplified StockItem object without virtual functions


One of the interesting points about this figure is that there is no connection at run time between the StockItem object and its functions. Such a connection is unnecessary because the compiler can tell exactly which function will be called whenever a function is referenced for this object, whether directly or through a pointer, and therefore can provide the linker with enough information to generate a call directly to the appropriate function.

The situation is different if we have virtual functions. In that case, the compiler can't determine exactly which function will be called for an object pointed to by a StockItem* because the actual object may be a descendant of StockItem rather than an actual StockItem. If so, we want the function defined in the derived class (e.g., DatedStockItem) to be called even though the pointer is declared to point to an object of the base class (e.g., StockItem).

Since the actual type of the object for which we want to call the function isn't available at compile time, another way must be found to determine which function should be called. The most logical place to store this information is in the object itself, because after all, we need to know where the object is in order to call the function for it. In fact, an object of a class for which any virtual functions are declared does have an extra data item in it for exactly this purpose. So whenever a call to a virtual function is compiled, the compiler translates that call into instructions that use the information in the object to determine at run time which version of the virtual function will be called.

Here's the next installment of my discussion with Susan on the topic of virtual functions:

Susan: So, is a virtual function polymorphism?

Steve: Not quite. You need virtual functions to implement polymorphism in C++, but they're not the same thing.

Susan: Where in the definition of Reorder does it say it's virtual? The implementation file is the same as it was before.

Steve: It's in the declaration of Reorder in the interface of the StockItem class in the itemb.h header file, which reads virtual void Reorder(std::ostream& os); in that file. I've also repeated it in the derived class function declaration even though that's not strictly necessary. After a function is declared as virtual in a base class, we don't have to say it's virtual in the derived class or classes; the rule is “once virtual, always virtual”.

If every object needed to contain the addresses of all its virtual functions, objects might be a lot larger than they would otherwise have to be. However, this is not necessary because all objects of the same class have the same virtual functions. Therefore, the addresses of all of the virtual functions for a given class are stored in a virtual function address table, or vtable for short, and every object of that class contains the address of the vtable for that class.

Given this description of the vtable, if we make the Reorder function virtual, a StockItem object will look something like Figure 10.4, and a DatedStockItem will resemble Figure 10.5.[3]

[3] Please note that the layout of this figure and other similar figures has been simplified by the omission of the details of the m_Name field, which actually contains a pointer to the data of the string value of that field.

Figure 10.4. Dangerous polymorphism: A simplified StockItem object with a virtual function


Figure 10.5. Dangerous polymorphism: A simplified DatedStockItem object with a virtual function


Susan had some more questions about vtables:

Susan: Are vtables customized for each class?

Steve: Yes.

Susan: Where do they come from, how are they created, and how do they do what they do?

Steve: The linker creates them based on instructions from the compiler after the compiler examines the class definition. All they do is store the addresses of the virtual functions for that class so that the compiler can generate code that will select the correct function for the object being referred to at run time.

Susan: How is this different from derivation?

Steve: It's part of making derivation work correctly when we want to use pointers to the base class, and mix base and derived class objects in our program.

Susan: I don't get this vtable stuff. Does it just point the Reorder function in the proper direction at run time?

Steve: Not exactly. It allows the program to pick the correct Reorder function at run time.

Susan: This stuff is beyond “UGH!”. It is just outrageous.

Steve: It wasn't that easy for me either. Acquiring a full understanding of virtual functions is one of the major milestones in learning C++, even for programmers with substantial experience in other languages.

Now that we have declared Reorder as a virtual function, let's see how this affects the operation of the function call examples we saw in Chapter 9 (Figures 9.36 through 9.38). First, Figure 10.6 shows how a virtual (i.e., dynamically determined) function call works when Reorder is called for a StockItem object through a StockItem pointer such as SIPtr.

Figure 10.6. Dangerous polymorphism: Calling a virtual Reorder function through a StockItem pointer to a StockItem object


The net result of the call illustrated in Figure 10.6 is the same as that illustrated in Figure 9.36: StockItem::Reorder is called, which is correct in this situation. Next, Figure 10.7 shows a virtual call for a DatedStockItem object through a DatedStockItem pointer.

Figure 10.7. Dangerous polymorphism: Calling a virtual Reorder function through a DatedStockItem pointer to a DatedStockItem object


Again, the net result of the call illustrated in Figure 10.7 is the same as that illustrated in Figure 9.37: DatedStockItem::Reorder is called. This is correct in this situation. Finally, Figure 10.8 shows a virtual call for a DatedStockItem object through a StockItem pointer.

Figure 10.8. Dangerous polymorphism: Calling a virtual Reorder function through a StockItem pointer to a DatedStockItem object


Figure 10.8 is where the virtual function pays off. The correct function, DatedStockItem::Reorder, is called even though the type of the pointer through which it is called is StockItem*. This is in contrast to the result of that same call with the non-virtual function, illustrated in Figure 9.38. In that case, StockItem::Reorder rather than DatedStockItem::Reorder was called.

Susan had a question about those last few example programs:

Susan: I didn't see where you ever deleted the memory for those pointers. Wouldn't that cause a memory leak?

Steve: Oops, you're right. That's a good example of how easy it is to misuse dynamic memory allocation!

What happens if we add another virtual function, Write, for instance, to the StockItem class after the Reorder function? The new virtual function will be added to the vtables for both the StockItem and DatedStockItem classes. Then the situation for a StockItem object might look like Figure 10.9, and the situation for a DatedStockItem might look like Figure 10.10.

Figure 10.9. Dangerous polymorphism: A simplified StockItem object with two virtual functions


Figure 10.10. Dangerous polymorphism: A simplified DatedStockItem with two virtual functions


As you can see, the new function has been added to both vtables, so a call to Write through a base class pointer will call the correct function.

To translate this virtual function mechanism into what I hope is understandable English, we can express the call to the virtual function Write in the line SIPtr->Write(cout); as follows:

  1. Get the vtable address from the object whose address is in SIPtr.

  2. Since we are calling Write through a StockItem*, and Write is the second defined virtual function in the StockItem class, retrieve the address of the Write function from the second function address slot in the vtable for the actual object that the StockItem* points to.

  3. Execute the function at that address.

By following this sequence, you can see that while both versions of Write are referred to via the same relative position in both the StockItem and the DatedStockItem vtables, the particular version of Write that is executed depends on which vtable the object refers to. Since all objects of the same class have the same member functions, all StockItem objects point to the same StockItem vtable and all DatedStockItem objects point to the same DatedStockItem vtable.

Susan had some questions about adding a new virtual function:

Susan: What do you mean by “added to both vtables”? Do StockItem and DatedStockItem each have their own?

Steve: Yes.

Susan: How does the vtable get the address for the new StockItems?

Steve: It's the other way around. Each StockItem, when it's created by the constructor, has its vtable address filled in by the compiler automatically.

Problems with Using Pointers for Polymorphism

Unfortunately, it's not quite as simple to make polymorphism work for us as this might suggest. As is so often the case, the culprit is the use of pointers. To see how pointers cause trouble with polymorphism, let's start by adding the standard I/O functions, operator << and operator >>, to our simplified interface for the StockItem and DatedStockItem classes. Figure 10.11 shows a test program illustrating how we can use these new functions, Figure 10.12 shows the output of the test program, and Figure 10.13 shows the new version of the interface. I strongly recommend that you print out that header file and the test program for reference as you leaf through this section of the chapter; the latter file is polyioa.cpp.

Figure 10.11. Dangerous polymorphism: Using operator << with a StockItem* (codepolyioa.cpp)
#include <iostream>
#include "Vec.h"
#include "itemc.h"
using namespace std;

int main()
{
   Vec <StockItem*> x(2);

   x[0] = new StockItem("3-ounce cups",71,78);

   x[1] = new DatedStockItem("milk",76,87,"19970719");

   cout << "A StockItem: " << endl;
   cout << x[0] << endl;
   cout << "A DatedStockItem: " << endl;
   cout << x[1] << endl;

   delete x[0];
   delete x[1];

   return 0;
}

Figure 10.12. Result of using operator << with a StockItem* (codepolyioa.out)
A StockItem:
0
3-ounce cups
71
78

A DatedStockItem:
19970719
milk
76
87

Figure 10.13. Dangerous polymorphism: StockItem interface with operator << and operator >> (codeitemc.h)
class StockItem
{
friend std::ostream& operator << (std::ostream& os, StockItem* Item);
friend std::istream& operator >> (std::istream& is, StockItem*& Item);

public:
  StockItem(std::string Name, short InStock, short MinimumStock);
  virtual ~StockItem();

  virtual void Reorder(std::ostream& os);
  virtual void Write(std::ostream& os);

protected:
  std::string m_Name;
  short m_InStock;
  short m_MinimumStock;
};

class DatedStockItem: public StockItem
{
public:
  DatedStockItem(std::string Name, short InStock,
    short MinimumStock, std::string Expires);

  virtual void Reorder(std::ostream& os);
  virtual void Write(std::ostream& os);

protected:
  static std::string Today();

protected:
  std::string m_Expires;
};

Susan had some questions about the StockItem::~StockItem destructor declared in this latest version of the interface.

Susan: Why do we need a destructor for StockItem now, when we didn't need one before?

Steve: The reason we haven't needed a destructor for the StockItem class until now is that the compiler-generated destructor works fine as long as two conditions are present. First, the member variables of the class must all be of concrete data types (which they are here). Second, the class must have no virtual functions, which of course isn't true for StockItem anymore. We've discussed the reason for the first condition: if we have member variables that are not of concrete data types (e.g., pointers), they won't clean up after themselves properly. We'll find out exactly why the second condition is important as soon as we get through looking at the output of the sample program.

Susan: Okay, I'm sure I can wait. But why is the destructor virtual?

Steve: We'll cover that at the same time.

The first item of note in the test program in Figure 10.11 is that we can create a Vec of StockItem*s to hold the addresses of any mixture of StockItems and DatedStockItems, because we can assign the addresses of variables of either of those types to a base class pointer (i.e., a StockItem*). Once we have the Vec of StockItem*s, we use operator new to acquire the memory for whichever type of object we're creating. This allows us to access these objects via pointers rather than directly and thus to use polymorphism. Once we finish using the objects, we have to make sure they are properly disposed of by calling operator delete at the end of the program; otherwise, a memory leak results.

The calls to delete in Figure 10.11 also hold the key to Susan's question about why we needed to write a destructor for this new version of the StockItem class. You see, when we call operator delete for an object of a class type, delete calls the destructor for that object to do whatever cleanup is necessary at the end of the object's lifespan. For this reason, it is very important that the correct destructor is called. If a base class destructor were called instead of a derived class destructor, the cleanup of the fields defined in the derived class wouldn't occur. However, when we delete a derived class object through a base class pointer, as we are doing in the current example program, the compiler can't tell at compile time which destructor it should call when the program executes. What do we do when we need to delay the determination of the exact version of a function until run time? We use a virtual function. Therefore, whenever we want to call delete on an object through a base class pointer, we need to make the destructor for that object virtual.[4]

[4] As with other virtual functions, if a base class destructor is virtual, the destructors in all classes derived from that class will also automatically be virtual, so we don't have to make them virtual explicitly.

But that still doesn't explain exactly why we need a virtual destructor whenever we have any other virtual functions. The reason for that rule is that there isn't much point in referring to an object through a base class pointer if it doesn't have any virtual functions, because the correct function will never be called in that case! Therefore, although the strict rule is “the destructor must be virtual if there are any calls to delete through a base class pointer”, that amounts to the same thing as “the destructor must be virtual if there are any other virtual functions in the class”, and the latter rule is easier to remember and follow.

Now let's take a look at the new implementation of the StockItem class, which is shown in Figure 10.14. This code is in itemc.cpp if you want to print it out for reference.

Figure 10.14. Dangerous polymorphism: StockItem implementation with operator << and operator >> (codeitemc.cpp)
#include <iostream>
#include <iomanip>
#include <sstream>
#include <string>
#include "itemc.h"
#include <dos.h>
using namespace std;

StockItem::StockItem(string Name, short InStock,
short MinimumStock)
: m_Name(Name), m_InStock(InStock),
  m_MinimumStock(MinimumStock)
{
}

StockItem::~StockItem()
{
}

void StockItem::Reorder(ostream& os)
{
 short ReorderAmount;

 if(m_InStock < m_MinimumStock)
   {
   ReorderAmount = m_MinimumStock-m_InStock;
   os << "Reorder " << ReorderAmount << " units of " << m_Name;
   }
}

ostream& operator << (ostream& os, StockItem* Item)
{
  Item->Write(os);
  return os;
}

void StockItem::Write(ostream& os)
{
  os << 0 << endl;
  os << m_Name << endl;
  os << m_InStock << endl;
  os << m_MinimumStock << endl;
}

istream& operator >> (istream& is, StockItem*& Item)
{
  string Expires;
  short InStock;
  short MinimumStock;
  string Name;

  getline(is,Expires);
  getline(is,Name);
  is >> InStock;
  is >> MinimumStock;
  is.ignore();

  // An expiration field of "0" marks an undated item (see StockItem::Write).
  if (Expires == "0")
    Item = new StockItem(Name,InStock,MinimumStock);
  else
    Item = new DatedStockItem(Name,InStock,
      MinimumStock,Expires);

  return is;
}

void DatedStockItem::Reorder(ostream& os)
{
  if (m_Expires < Today())
      {
      os << "DatedStockItem::Reorder says:" << endl;
      os << "Return " << m_InStock <<  " units of ";
      os << m_Name << endl;
      m_InStock = 0;
      }

  StockItem::Reorder(os);
}

string DatedStockItem::Today()
{
  struct date d;
  unsigned short year;
  unsigned short day;
  unsigned short month;
  string TodaysDate;
  stringstream FormatStream;

  getdate(&d);
  year = d.da_year;
  day = d.da_day;
  month = d.da_mon;

  FormatStream << setfill('0') << setw(4) << year <<
   setw(2) << month << setw(2) << day;
  FormatStream >> TodaysDate;

  return TodaysDate;
}

DatedStockItem::DatedStockItem(string Name, short InStock,
short MinimumStock, string Expires)
: StockItem(Name, InStock,MinimumStock),
  m_Expires(Expires)
{
}

void DatedStockItem::Write(ostream& os)
{
  os << m_Expires << endl;
  os << m_Name << endl;
  os << m_InStock << endl;
  os << m_MinimumStock << endl;
}

Susan had some questions about the test program and how it relates to the implementation file for the StockItem classes.

Susan: Why do you need the same headers in the test program as you do in the implementation file?

Steve: Because otherwise the compiler doesn't know how to allocate memory for a StockItem or what functions it can perform.

Susan: I didn't know that the use of headers also allocates memory.

Steve: It doesn't. However, the compiler needs the headers to figure out how large every object is so it can allocate storage for each object.

Susan: How does it figure that out?

Steve: It adds up the sizes of all the components in the object you're defining. For example, if you've defined a StockItem to contain three shorts and two strings, then a StockItem object will have to be big enough to contain three shorts plus the size of two strings, with possibly some additional space for other stuff the compiler knows about, such as a vtable pointer.

Susan: Why do you have to allocate storage anyway? I mean, why can't you just tell the compiler how much memory you have left and let it use as much as it wants until the memory is used up? Then you know you're done. <g>

Steve: It does use as much memory as it needs, but it has to know how much of the memory it needs to set aside for each object that you create.

Susan: So, if you have a string class in the implementation of a program and you intend to use it in the interface, then it has to be included in both because they both get compiled separately?

Steve: Sort of. An interface (i.e., a header file) doesn't get compiled separately; it's #included wherever it's needed.

Susan: Yes, but why does it need to be in both places; why isn't one place good enough?

Steve: Because each .cpp file is compiled separately; when the compiler is handling any particular .cpp file, it doesn't know about any header file that isn't mentioned in that file. Therefore, we have to mention every header in every .cpp file that uses objects defined in that header.

Susan: So they are compiled separately. How are they ever connected, and if they do become connected, why is it necessary to write them twice?

Steve: They are connected only by the linker. You don't need to write them twice.

Susan: If they are included in the implementations, aren't they included in the test programs automatically?

Steve: No, because the test program is compiled separately from the implementations. In fact, the writer of the test program may not even have the source code for the implementations, such as when you buy certain libraries that come without source code.

Susan: And if they are needed, then why aren't the other header files needed in the test programs or any other programs for that matter?

Steve: You only have to include those header files that the compiler needs to figure out the size and functions of any object you use.

Susan: Yes, but then if they're necessary for the implementation, then they should be needed for the test programs, I would think. I still don't get it.

Steve: Each header file is needed only in source files that refer to the objects whose classes are defined in that header file. For example, if you aren't using strings in your program, then you don't have to #include <string>.

Susan: Well, if you're writing an implementation for a program, then I think that every source file that uses the class needs to include all the header files, no?

Steve: Yes, except that sometimes you have objects that are used only inside the implementation of a class, as we'll see later in this chapter.

Let's start our analysis of the new versions of the I/O functions with the declaration of operator <<, which is friend ostream& operator << (ostream& os, StockItem* Item);. The second argument to this function is a StockItem* rather than a StockItem because we have to refer to our StockItem and DatedStockItem objects through a base class pointer (i.e., a StockItem*) to get the benefits of polymorphism. Although operator << isn't a virtual function (since it's not a member function at all), we will see that it still makes use of polymorphism.

Susan wanted to know why we needed new I/O functions again:

Susan: Why are you explaining >> and << again? Why won't the old ones do?

Steve: Because the old ones can use StockItems directly, whereas the new ones have to operate on StockItem*s instead. Whenever you change the types of arguments to a function, you have to change the function also. This particular change is part of what is wrong with the standard method of using virtual functions to achieve polymorphism.

Susan: Why aren't you showing how polymorphism is done with real data instead of these >> and << things again?

Steve: This is how polymorphism is done with real data, if we expose the pointers to the application program.

Susan: Do you have to write everything that you see in your classes? Do you even have to define your periods?

Steve: No, as a matter of fact, the “.” operator is (unfortunately) one of the few operators that can't be redefined.

The next point worthy of discussion is that we can use the same operator << to display either a StockItem or a DatedStockItem even though the display functions for those two types are actually different. Let's look at the implementation of this version of operator <<, shown in Figure 10.15.

Figure 10.15. Dangerous polymorphism: The implementation of operator << with a StockItem* (from code\itemc.cpp)
ostream& operator << (ostream& os, StockItem* Item)
{
  Item->Write(os);
  return os;
}

Susan had a question about the argument list for this function:

Susan: Why do we need the argument os?

Steve: Because that's where we want the output to go.

This implementation looks pretty simple, as it merely calls a function called Write to do the actual work. In fact, this code looks too simple: How does it decide whether to display a StockItem or a DatedStockItem?

Using a virtual Function for I/O

This is an application of polymorphism: operator << doesn't have to decide whether to call the version of Write in the StockItem class or the one in the DatedStockItem class because that decision is made automatically at run time. Write is a virtual function declared in the StockItem class; therefore, the exact version of Write called through a StockItem* is determined by the run-time type of the object that the StockItem* actually points to.

To complete the explanation of how operator << works, we'll need to examine Write. Let's look at its implementation for the simplified versions of our StockItem (Figure 10.16) and DatedStockItem (Figure 10.17) classes.

Figure 10.16. Dangerous polymorphism: StockItem::Write (from code\itemc.cpp)
void StockItem::Write(ostream& os)
{
  os << 0 << endl;
  os << m_Name << endl;
  os << m_InStock << endl;
  os << m_MinimumStock << endl;
}

Figure 10.17. Dangerous polymorphism: DatedStockItem::Write (from code\itemc.cpp)
void DatedStockItem::Write(ostream& os)
{
  os << m_Expires << endl;
  os << m_Name << endl;
  os << m_InStock << endl;
  os << m_MinimumStock << endl;
}

The only thing that might not be obvious about these functions is why StockItem::Write writes the “0” out as its first action. We know that there's no date for a StockItem, so why not just write out the data that it does have? The reason is that if we want to read the data back in, we need some way to distinguish between a StockItem and a DatedStockItem. Since “0” is not a valid date, we can use it as an indicator meaning “the following data belongs to a StockItem, not to a DatedStockItem”. In other words, when we read data from the inventory file to create our StockItem and DatedStockItem objects, any set of data that starts with a “0” will produce a StockItem while any set that starts with a valid date will produce a DatedStockItem.
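To see the virtual dispatch and the “0” marker working together, here is a minimal, self-contained sketch. It assumes cut-down versions of StockItem and DatedStockItem that model only the Write function; the constructors shown and the helper WriteToString are hypothetical and exist only for this illustration, and the sketch uses '\n' where the figures use endl.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Cut-down stand-in for the chapter's StockItem; only Write is modeled.
class StockItem
{
public:
  StockItem(const std::string& Name, short InStock, short MinimumStock)
  : m_Name(Name), m_InStock(InStock), m_MinimumStock(MinimumStock) {}
  virtual ~StockItem() {}

  virtual void Write(std::ostream& os)
  {
    os << 0 << '\n';               // "0" marks a record with no date
    os << m_Name << '\n';
    os << m_InStock << '\n';
    os << m_MinimumStock << '\n';
  }

protected:
  std::string m_Name;
  short m_InStock;
  short m_MinimumStock;
};

// Cut-down stand-in for DatedStockItem.
class DatedStockItem : public StockItem
{
public:
  DatedStockItem(const std::string& Name, short InStock,
                 short MinimumStock, const std::string& Expires)
  : StockItem(Name, InStock, MinimumStock), m_Expires(Expires) {}

  void Write(std::ostream& os) override
  {
    os << m_Expires << '\n';       // a real date takes the place of "0"
    os << m_Name << '\n';
    os << m_InStock << '\n';
    os << m_MinimumStock << '\n';
  }

protected:
  std::string m_Expires;
};

// Same shape as Figure 10.15: the virtual call picks the right Write.
std::ostream& operator << (std::ostream& os, StockItem* Item)
{
  assert(Item != nullptr);
  Item->Write(os);
  return os;
}

// Hypothetical helper: captures what operator << writes for one item.
std::string WriteToString(StockItem* Item)
{
  std::ostringstream os;
  os << Item;
  return os.str();
}
```

Calling WriteToString with a StockItem* that points to a DatedStockItem yields a record whose first line is the date, while one that points to a plain StockItem yields a record whose first line is “0”, even though operator << itself never checks which kind of object it was given.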

If this still isn't perfectly clear, don't worry. The next section, which covers operator >>, should clear it up.

References to Pointers

First, let's examine the header of the operator >> function:

istream& operator >> (istream& is, StockItem*& Item)

Most of this should be familiar by now, but there is one oddity: the declaration of the second argument to this function is StockItem*&. What does that mean?

It's a reference to a pointer. Now, before you decide to throw in the towel, recall that we use a reference argument when we need to modify a variable in the calling function. In this case, that variable is a StockItem* (a pointer to a StockItem or one of its derived classes), and we are going to have to change it by assigning the address of a newly created StockItem or DatedStockItem to it. Hence, our argument has to be a reference to the variable in the calling function; since that variable is a StockItem*, our argument has to be declared as a reference to a StockItem*, which we write as StockItem*&.
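The difference between passing a pointer by value and passing a reference to a pointer can be seen in a small sketch that has nothing to do with StockItem. The function names PointAtByValue and PointAtByReference, and the Demonstrate function that exercises them, are made up for this illustration.

```cpp
#include <cassert>

// The pointer itself is passed by value: the function changes
// only its own copy, so the caller's pointer is unaffected.
void PointAtByValue(int* p, int& target)
{
  p = &target;
}

// The pointer is passed by reference (int*&): the assignment
// changes the pointer variable in the calling function.
void PointAtByReference(int*& p, int& target)
{
  p = &target;
}

// Exercises both functions and checks where the caller's pointer ends up.
void Demonstrate()
{
  int a = 1;
  int b = 2;
  int* p = &a;

  PointAtByValue(p, b);
  assert(p == &a);       // still points to a

  PointAtByReference(p, b);
  assert(p == &b);       // now points to b
}
```

Our operator >> is in the same position as PointAtByReference: it has to change the caller's StockItem*, so it needs a StockItem*& argument.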

Having cleared up that point, let's look at how we would use this new function (Figure 10.18). In case you want to print out the file containing this code, it is polyiob.cpp.

Susan had a question about the argument to the ifstream constructor:

Susan: What is polyiob.in?

Steve: It's the data file we're going to read the data from.

Figure 10.18. Dangerous polymorphism: Using operator >> and operator << with a StockItem* (code\polyiob.cpp)
#include <iostream>
#include <fstream>
#include "Vec.h"
#include "itemc.h"
using namespace std;

int main()
{
   StockItem* x;
   StockItem* y;

   ifstream ShopInfo("polyiob.in");

   ShopInfo >> x;

   ShopInfo >> y;

   cout << "A StockItem: " << endl;
   cout << x;

   cout << endl;

   cout << "A DatedStockItem: " << endl;
   cout << y;

   delete x;
   delete y;

   return 0;
}

Before we continue to analyze this program, look at Figure 10.19, which shows the output it produces.

Figure 10.19. Dangerous polymorphism: The results of using operator >> and operator << with a StockItem* (code\polyiob.out)
A StockItem:
0
3-ounce cups
71
78

A DatedStockItem:
19970719
milk
76
87

Now let's get back to the code. If you are really alert, you may have noticed something odd here. How can we assign a value to a variable such as x or y without allocating any memory for it? For that matter, how can we call operator delete for a pointer variable that hasn't had memory assigned to it? In fact, these aren't errors but consequences of the way we have to implement operator >> with the tools we have so far. To see why this is so, let's take a look at that implementation, in Figure 10.20.

The function starts out reasonably enough by declaring variables to hold the expiration date (Expires), number in stock (InStock), minimum number desired in stock (MinimumStock), and name of the item (Name). Then we read values for these variables from the istream supplied as the left-hand argument in the operator >> call, which in the case of our example program is ShopInfo. Next, we examine the variable Expires, which was the first variable to be read in from the istream. If the value of Expires is “0”, meaning “not a date”, we use operator new to allocate memory for a new StockItem and call the normal constructor for that class to initialize it. If the Expires value isn't “0”, we assume it's a date and use operator new in the same way to create a new DatedStockItem. In either case, the address of the new object is assigned to Item, which changes the pointer in the calling function. Finally, we return the istream so it can be used in further operator >> calls.

The fact that we have to create a different type of object in these two cases is the key to why we have to allocate the memory in the operator >> function rather than in the calling program. The actual type of the object isn't known until we read the data from the file, so we can't allocate memory for the object until that time. This isn't necessarily a bad thing in itself; the trouble is that we can't free the memory automatically because the calling program owns the StockItem pointers and has to call delete to free the memory allocated to those pointers when the objects are no longer needed.

Figure 10.20. Dangerous polymorphism: The implementation of operator >> (from code\itemc.cpp)
istream& operator >> (istream& is, StockItem*& Item)
{
   string Expires;
   short InStock;
   short MinimumStock;
   string Name;

   getline(is,Expires);
   getline(is,Name);
   is >> InStock;
   is >> MinimumStock;
   is.ignore();

   if (Expires == "0")
       Item = new StockItem(Name,InStock,MinimumStock);
   else
       Item = new DatedStockItem(Name,InStock,
       MinimumStock,Expires);

   return is;
}
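If you'd like to experiment with this logic without a data file, the following sketch reads records from an in-memory string instead. It assumes cut-down versions of the two classes; the Name accessor and the helpers IsDated and ReadOneItem are hypothetical. It also adds one thing that Figure 10.20 does not have: if the read fails, Item is set to a null pointer rather than pointing to an object built from incomplete data.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Cut-down stand-ins for the chapter's classes.
class StockItem
{
public:
  StockItem(const std::string& Name, short InStock, short MinimumStock)
  : m_Name(Name), m_InStock(InStock), m_MinimumStock(MinimumStock) {}
  virtual ~StockItem() {}
  std::string Name() const { return m_Name; }   // hypothetical accessor

protected:
  std::string m_Name;
  short m_InStock;
  short m_MinimumStock;
};

class DatedStockItem : public StockItem
{
public:
  DatedStockItem(const std::string& Name, short InStock,
                 short MinimumStock, const std::string& Expires)
  : StockItem(Name, InStock, MinimumStock), m_Expires(Expires) {}

protected:
  std::string m_Expires;
};

// Same logic as Figure 10.20, plus a check for a failed read.
std::istream& operator >> (std::istream& is, StockItem*& Item)
{
  std::string Expires;
  std::string Name;
  short InStock = 0;
  short MinimumStock = 0;

  std::getline(is, Expires);
  std::getline(is, Name);
  is >> InStock;
  is >> MinimumStock;

  if (!is)                 // addition for this sketch: no object on failure
  {
    Item = nullptr;
    return is;
  }
  is.ignore();             // skip the newline after the last number

  if (Expires == "0")
    Item = new StockItem(Name, InStock, MinimumStock);
  else
    Item = new DatedStockItem(Name, InStock, MinimumStock, Expires);

  return is;
}

// Hypothetical helper: reports whether Item points to a DatedStockItem.
bool IsDated(StockItem* Item)
{
  assert(Item != nullptr);
  return dynamic_cast<DatedStockItem*>(Item) != nullptr;
}

// Hypothetical helper: reads one record from text. The caller owns
// the result and must delete it, which is exactly the danger at issue.
StockItem* ReadOneItem(const std::string& Text)
{
  std::istringstream is(Text);
  StockItem* Item = nullptr;
  is >> Item;
  return Item;
}
```

Given the text "0\n3-ounce cups\n71\n78\n", ReadOneItem returns a pointer for which IsDated is false; given "19970719\nmilk\n76\n87\n", it returns one for which IsDated is true. Either way, the caller still has to remember to delete the result.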

While it is legal (and unfortunately not unusual) to write programs in which memory is allocated and freed in this way, it isn't a good idea. The likelihood of error in any large program that uses this method of memory management is approximately 100%. Besides the problem of forgetting to free memory or using memory that has already been freed, we also have the problem that copying pointers leaves two pointers pointing to the same data, which makes it even more likely that the data will either be freed prematurely or not freed at all when it is no longer in use.
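The pointer-copying problem can be made concrete with a small sketch. The Tracked class and the function LiveAfterPointerCopy are invented for this illustration; Tracked merely counts how many of its objects currently exist.

```cpp
#include <cassert>

// Hypothetical class that counts how many of its objects are alive.
struct Tracked
{
  static int LiveCount;
  Tracked() { ++LiveCount; }
  ~Tracked() { --LiveCount; }
};

int Tracked::LiveCount = 0;

// Copies a pointer and returns how many objects existed afterward.
int LiveAfterPointerCopy()
{
  Tracked* a = new Tracked;
  Tracked* b = a;                  // copies the address, not the object

  int live = Tracked::LiveCount;   // one object, two pointers to it
  assert(live == 1);

  delete a;                        // frees the one and only object
  a = nullptr;
  // At this point b still holds the old address. Writing "delete b;"
  // here would free the same memory twice, which is undefined behavior.
  // Leaving out "delete a;" instead would leak the object.
  b = nullptr;                     // we have to remember this by hand

  return live;
}
```

Nothing in the language ties a and b together, so keeping track of which pointer is responsible for the delete is entirely up to the programmer.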

Susan had some questions about the dangers of pointers:

Susan: So the programmer forgets to free memory?

Steve: Yes.

Susan: Can't you just write the code to free the memory once?

Steve: Yes, unless you ever change the program.

Susan: Or can these bad things happen on their own even if the program is written properly?

Steve: Yes, that can happen under certain circumstances, but luckily we won't run into any of those circumstances in this book.

We'll begin to solve these problems right after some exercises.
