Objectives of This Chapter

By the end of this chapter, you should

  1. Understand how to implement all the concrete data type functions for a class that uses pointers, namely our homegrown string class.

  2. Understand in detail the operation and structure of a string class that is useful in some real programming situations.

  3. Understand how to write appropriate input and output functions (operator >> and operator <<) for the objects of our string class.

  4. Understand how to use some additional C library functions such as memcmp and memset.

  5. Understand the (dreaded) C data type, the array, and some of the reasons why it is hazardous to use.

  6. Understand the friend declaration, which allows access to private members by selected nonmember functions.

Why We Need a Reference Argument for operator =

Now we're finally ready to examine exactly why the code for our operator = needs a reference argument rather than a value argument. I've drawn two diagrams that illustrate the difference between a value argument and a reference argument. First, Figure 8.1 illustrates what happens when we call a function with a value argument of type string using the compiler-generated copy constructor.[1]

[1] If this diagram looks familiar, it's the same as the one illustrating the problem with the compiler-generated operator =, Figure 7.11, except for labels.

Figure 8.1. Call by value ("normal argument") using the compiler-generated copy constructor


In other words, with a value argument, the called routine makes a copy of the argument on its stack. This won't work properly with a string argument; instead, it will destroy the value of the caller's variable upon return to the calling function. Why is this?

Premature Destruction

The problem occurs when the destructor is called at the end of a function's execution to dispose of the copy of the input argument made at entry. Since the copy points to the same data as the caller's original variable, the destruction of the copy causes the memory allocated to the caller's variable to be freed prematurely.

This is due to the way in which a variable is copied in C++ by the compiler-generated copy constructor. This constructor, like the compiler-generated operator =, makes a copy of all of the parts of the variable (a so-called memberwise copy). In the case of our string variable, this results in copying only the length m_Length and the pointer m_Data, and not the data that m_Data points to. That is, both the original and the copy refer to the same data, as indicated by Figure 8.1. If we were to implement our operator = with a string argument rather than a string& argument, then the following sequence of events would take place during execution of the statement s = n;:

  1. A default copy like the one illustrated by Figure 8.1 would be made of the input argument n, causing the variable Str in the operator = code to point to the same data as the caller's variable n.

  2. The Str variable would be used in the operator = code.

  3. The Str variable would be destroyed at the end of the operator = function. During this process, the destructor would free the memory that Str.m_Data points to by calling delete [].

Since Str.m_Data holds the same address as the caller's variable n.m_Data, the latter now points to memory that has been freed and may be overwritten or assigned to some other use at any time. This is a bug in the program caused by the string destructor being called for a temporary copy of a string that shares data with a caller's variable. When we use a reference argument, however, the variable in the called function is nothing more (and nothing less) than another name for the caller's variable. No copy is made on entry to the operator = code; therefore, the destructor is not called on exit. This allows the caller's variable n to remain unmolested after operator = terminates.
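Here is a minimal sketch of the memberwise copy just described. It assumes a stripped-down stand-in for our string class: the names ShallowString, MakeShallow, CopySharesData, and Release are hypothetical, and the destructor is deliberately left out so that the sketch is safe to run. All it shows is that a value argument ends up holding the same address as the caller's variable.

```cpp
#include <cstring>

// Hypothetical stand-in for the string class: the same two member
// variables, but no user-written copy constructor, operator =, or
// destructor, so the compiler-generated memberwise copy is used.
struct ShallowString
{
    short m_Length;
    char* m_Data;
};

// Builds a ShallowString holding a freshly allocated copy of the C string p.
ShallowString MakeShallow(const char* p)
{
    ShallowString s;
    s.m_Length = static_cast<short>(std::strlen(p) + 1);
    s.m_Data = new char[s.m_Length];
    std::memcpy(s.m_Data, p, s.m_Length);
    return s;
}

// Value argument: Str is a memberwise copy of the caller's variable.
// Returns true if the copy's pointer equals the caller's pointer.
bool CopySharesData(ShallowString Str, const char* CallerData)
{
    return Str.m_Data == CallerData;
}

// Frees the data exactly once, which is the job a destructor would do.
void Release(ShallowString& s)
{
    delete [] s.m_Data;
    s.m_Data = 0;
    s.m_Length = 0;
}
```

If ShallowString had the destructor from our real string class, the copy Str would call delete [] on that shared address when CopySharesData returned, and the caller's m_Data would be left pointing to freed memory.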

That may sound good, but Susan wanted some more explanation.

Susan: I don't get why a value argument makes a copy and a reference argument doesn't. Help.

Steve: The reason is that a value argument is actually a new auto variable, just like a regular auto variable, except that it is initialized to the value of the caller's actual argument. Therefore, it has to be destroyed when the called function ends. On the other hand, a reference argument just renames the caller's variable; since the compiler hasn't created a new auto variable when the called routine starts, it doesn't need to call the destructor to destroy that variable at the end of the routine.
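One way to watch this difference is to count copies and destructor calls. The following is only a sketch: the class Counted and the functions TakeByValue and TakeByReference are hypothetical names invented for this illustration and are not part of our string class.

```cpp
// Hypothetical class that only counts how often it is copied and destroyed.
class Counted
{
public:
    Counted() {}
    Counted(const Counted&) { ++s_Copies; }
    ~Counted() { ++s_Destroyed; }

    static int s_Copies;      // copy constructor calls so far
    static int s_Destroyed;   // destructor calls so far
};

int Counted::s_Copies = 0;
int Counted::s_Destroyed = 0;

// Value argument: c is a new auto variable initialized from the caller's
// argument, so it is copied on entry and destroyed after the call.
void TakeByValue(Counted c) { (void)c; }

// Reference argument: c is just another name for the caller's variable,
// so nothing is copied and nothing is destroyed.
void TakeByReference(const Counted& c) { (void)c; }
```

Given a variable x of type Counted, calling TakeByReference(x) leaves both counters unchanged, whereas calling TakeByValue(x) adds one copy and one destructor call.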

Figure 8.2 helped her out a bit by illustrating the same call as in Figure 8.1, using a reference argument instead of a value argument.

Figure 8.2. Call by reference


Finally, we've finished examining the intricacies that result from the apparently simple statement s = n; in our test program (Figure 8.3).

Figure 8.3. Our first test program for the string class (code\strtst1.cpp)
#include "string1.h"

int main()
{
   string s;
   string n("Test");
   string x;

   s = n;
   n = "My name is Susan";

   x = n;
   return 0;
}

Now let's take a look at the next statement in that test program, n = "My name is Susan";. The type of the C string literal expression "My name is Susan" is char*; that is, the compiler stores the character data somewhere and provides a pointer to it. In other words, this line is attempting to assign a char* to a string. Although the compiler has no built-in knowledge of how to do this, we don't have to write any more code to handle this situation because the code we've already written is sufficient. That's because if we supply a value of type char* where a string is needed, the constructor string::string(char*) is automatically invoked. Such automatic conversion is another of the features of C++ that makes user defined types more like native types.[2]

[2] There are situations, however, where this usually helpful feature is undesirable; for this reason, C++ provides a way of preventing the compiler from supplying such conversions automatically. We'll see how to do that in Chapter 12.

The sequence of events during compilation of the line n = "My name is Susan"; is something like this:

  1. The compiler sees a string on the left of an =, which it interprets as a call to some version of string::operator =.

  2. It looks at the expression on the right of the = and sees that its type is not string, but char*.

  3. Have we defined a function with the signature string::operator = (char*)? If so, use it.

  4. In this case, we have not defined such an operator. Therefore, the compiler checks to see whether we have defined a constructor with the signature string::string(char*) for the string class.

  5. Yes, there is such a constructor. Therefore, the compiler interprets the statement as n.operator = (string("My name is Susan"));. If there were no such constructor, that line would be flagged as an error.
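The steps above can be sketched with a miniature class. This is an illustration under stated assumptions rather than our actual string class: MiniString, Text, s_Conversions, AssignImplicit, and AssignExplicit are hypothetical names, and the constructor takes const char* where the text says char* so that current compilers will accept a string literal.

```cpp
#include <cstring>

// Hypothetical miniature of the string class.
class MiniString
{
public:
    MiniString() : m_Length(1), m_Data(new char[1]) { m_Data[0] = 0; }

    // Converting constructor: makes a MiniString from a C string.
    MiniString(const char* p)
        : m_Length(static_cast<short>(std::strlen(p) + 1)),
          m_Data(new char[m_Length])
    {
        std::memcpy(m_Data, p, m_Length);
        ++s_Conversions;
    }

    // Copy constructor: copies the data, not just the pointer.
    MiniString(const MiniString& Str)
        : m_Length(Str.m_Length), m_Data(new char[Str.m_Length])
    {
        std::memcpy(m_Data, Str.m_Data, m_Length);
    }

    MiniString& operator = (const MiniString& Str)
    {
        if (this != &Str)
        {
            char* NewData = new char[Str.m_Length];
            std::memcpy(NewData, Str.m_Data, Str.m_Length);
            delete [] m_Data;
            m_Data = NewData;
            m_Length = Str.m_Length;
        }
        return *this;
    }

    ~MiniString() { delete [] m_Data; }

    const char* Text() const { return m_Data; }

    static int s_Conversions;   // calls to the converting constructor

private:
    short m_Length;   // declared first so it is initialized before m_Data
    char* m_Data;
};

int MiniString::s_Conversions = 0;

// The statement as we wrote it in the test program.
void AssignImplicit(MiniString& n) { n = "My name is Susan"; }

// The same statement as the compiler interprets it.
void AssignExplicit(MiniString& n)
{
    n.operator = (MiniString("My name is Susan"));
}
```

Both functions run the converting constructor once and leave n with the same value, which is what we would expect if the compiler treats the first form as shorthand for the second.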

So the actual interpretation of n = "My name is Susan"; is n.operator = (string("My name is Susan"));. What exactly does this do?

Figure 8.4 is a picture intended to illustrate the compiler's "thoughts" in this situation; that is, when we assign a C string with the value "My name is Susan" to a string called n via the constructor string::string(char*).[3]

[3] Rather than showing each byte address of the characters in the strings and C strings as I've done in previous diagrams, I'm just showing the address of the first character in each group, so that the figure will fit on one page.

Figure 8.4. Assigning a char* value to a string via string::string(char*)


The Compiler Generates a Temporary Variable

Let's go over Figure 8.4, step by step. The first thing that the compiler does is to call the constructor string::string(char*) to create a temporary (jargon for temporary variable) of type string, having the value "My name is Susan". This temporary is then used as the argument to the function string::operator = (const string& Str) (see Figure 7.15).

Since the argument is a reference, no copy is made of the temporary; the variable Str in the operator = code actually refers to the (unnamed) temporary variable. When the operator = code is finished executing, the string n has been set to the same value as the temporary (i.e., "My name is Susan"). Upon return from the operator = code, the temporary is automatically destroyed by a destructor call inserted by the compiler.
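To see this order of events for yourself, you could log each step as it happens. The sketch below assumes a hypothetical class Traced and a hypothetical global g_Events; to keep it short, the class merely points at the literal rather than copying it, which our real string class does not do.

```cpp
#include <string>
#include <vector>

// Records the order in which things happen.
std::vector<std::string> g_Events;

// Hypothetical class that logs its own construction, assignment,
// and destruction.
class Traced
{
public:
    Traced() : m_Text(""), m_FromCString(false) {}

    // Converting constructor, used by the compiler to build the temporary.
    Traced(const char* p) : m_Text(p), m_FromCString(true)
    {
        g_Events.push_back("construct from C string");
    }

    Traced& operator = (const Traced& Str)
    {
        g_Events.push_back("operator =");
        m_Text = Str.m_Text;   // copy the value but not the m_FromCString mark
        return *this;
    }

    // Only objects built from a C string report their destruction.
    ~Traced()
    {
        if (m_FromCString)
            g_Events.push_back("destroy converted object");
    }

    const char* Text() const { return m_Text; }

private:
    const char* m_Text;    // points at a literal; no ownership in this sketch
    bool m_FromCString;
};

void Assign(Traced& n)
{
    n = "My name is Susan";
    g_Events.push_back("next statement");
}
```

After a call to Assign, g_Events holds the four entries in the order construct, operator =, destroy, next statement: the temporary is gone before the statement following the assignment begins.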

This sequence of events also holds the key to understanding why the argument of string::operator = must be a const string& (that is, a constant reference to a string) rather than just a string& (that is, a reference to a string) if we want to allow automatic conversion from a C string to a string. You see, if we declared the function string::operator = to have a string& argument rather than a const string& argument, then it would be possible for that function to change the value of the argument. However, any attempt to change the caller's argument wouldn't work properly if, as in the current example, the argument turned out to be a temporary string constructed from the original argument (the char* value "My name is Susan"); clearly, changing the temporary string would have no effect on the original argument. Therefore, if the argument to string::operator = were a string&, the compiler would object to the line n = "My name is Susan"; on the grounds that we might be trying to alter an argument that was a temporary value. Standard C++ makes this an error, because a non-const reference cannot refer to a temporary, although some older compilers issue only a warning. The reason we don't get this complaint is that the compiler knows that we aren't going to try to modify the value of an argument that has the specification const string&; therefore, constructing a temporary value and passing it to string::operator = is guaranteed to have the behavior that we want.[4]
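We can ask the compiler directly whether each form of operator = accepts a C string. The sketch below assumes a compiler that supports C++11 or later (for std::is_assignable and static_assert) and that follows the standard rule that a non-const reference cannot refer to a temporary; the two classes and the two constants are hypothetical names made up for this comparison.

```cpp
#include <type_traits>

// Hypothetical class whose operator = takes a const reference, as ours does.
struct ConstRefAssign
{
    ConstRefAssign(const char*) {}
    ConstRefAssign& operator = (const ConstRefAssign&) { return *this; }
};

// The same class, except that operator = takes a non-const reference.
struct PlainRefAssign
{
    PlainRefAssign(const char*) {}
    PlainRefAssign& operator = (PlainRefAssign&) { return *this; }
};

// Can a C string be assigned to an object of each class?
const bool ConstRefAcceptsCString =
    std::is_assignable<ConstRefAssign&, const char*>::value;
const bool PlainRefAcceptsCString =
    std::is_assignable<PlainRefAssign&, const char*>::value;

// The temporary made from the C string can be passed to a const reference...
static_assert(ConstRefAcceptsCString,
              "a const reference argument accepts the temporary");

// ...but not to a non-const reference.
static_assert(!PlainRefAcceptsCString,
              "a non-const reference argument rejects the temporary");
```

Writing the assignment itself for a PlainRefAssign variable, rather than asking about it with std::is_assignable, would simply fail to compile on such a compiler.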

[4] By the way, the compiler insists that a function isn't going to modify an argument with the const specifier; if we wrote a function with a const argument and then tried to modify such an argument, it wouldn't compile.

This example is anything but intuitively obvious and, as you might imagine, it led to an extended discussion with Susan.

Susan: So no copy of the argument is made, but the temporary is a copy of the variable to that argument?

Steve: The temporary is an unnamed string created from the C string literal that was passed to operator = by the statement n = "My name is Susan";.

Susan: Okay. But tell me this: Is the use of a temporary the result of specifying a reference argument? If so, then why don't you discuss this when you first discuss reference arguments?

Steve: It's not exactly because we're using a reference argument. When a function is called with the "wrong" type of argument but a constructor is available to make the "right" type of argument from the "wrong" one that was supplied, then the compiler will supply the conversion automatically. In the case of calling operator = with a char* argument rather than a string, there is a constructor that can make a string from a char*. Therefore, the compiler will use that constructor to make a temporary string out of the supplied char* and use that temporary string as the actual argument to the function operator =. However, if the argument type were specified as a string& rather than a const string&, then the compiler would warn us that we might be trying to change the temporary string that it had constructed. Since we have a const string& argument, the compiler knows that we won't try to change that temporary string, so it doesn't need to warn us about this possibility.

Susan: Well, I never looked at it that way, I just felt that if there is a constructor for the argument then it is an OK argument.

Steve: As long as the actual argument matches the type that the constructor expects, there is no problem.

Susan: So, if the argument type were a string& and we changed the temporary argument, what would happen? I don't see the problem with changing something that was temporary; I see that it would be a problem for the original argument but not the temporary.

Steve: The reason why generating a temporary is acceptable in this situation is that the argument is a const reference. If we didn't add the const in front of the argument specifier, then the compiler would warn us about our possibly trying to modify the temporary. Since we have a const reference, the compiler knows that we won't try to modify the argument and thus it's safe for the compiler to generate the temporary value.

Susan: OK, then the temporary is created any time you call a reference argument? I thought that the whole point of a temporary was so you could modify it and not the original argument and the purpose of the const was to ensure that would be the case.

Steve: The point is precisely that nothing would happen to the original argument if we changed the copy. Since one of the reasons that reference arguments are available is to allow changing of the caller's argument, the compiler warns us if we appear to be interested in doing that (a non-const reference argument) in a situation where such a change would have no effect because the actual argument is a temporary.

Susan: So, if we have a non-const string& argument specification with an actual argument of type char* then a temporary is made that can be changed (without affecting the original argument). If the argument is specified as a const string& and the actual argument is of type char* then a temporary is made that cannot be changed.

Steve: You've correctly covered the cases where a temporary is necessary, but haven't mentioned the other cases. Here is the whole truth and nothing but the truth:

  1. If we specify the argument type as string& and a temporary has to be created because the actual argument is a char* rather than a string, then the compiler will warn us that changes to that temporary would not affect the original argument.

  2. If we specify the argument type as const string& and a temporary has to be created because the actual argument is a char*, then the compiler won't warn us that our (hypothetical) change would be ineffective, because it knows that we aren't going to make such a change.

  3. However, if the actual argument is a string, then no temporary needs to be made in either of these cases (string& or const string&). Therefore, the argument that we see in the function is actually the real argument, not a temporary, and the compiler won't warn us about trying to change a (nonexistent) temporary.

Susan: OK, this clears up another confusion I believe, because I was getting confused with the notion of creating a temporary that is basically a copy but I remember that you said that a reference argument doesn't make a copy; it just renames the original argument. So that would be the case in 3 here, but the temporary is called into action only when you have a situation such as in 1 or 2, where a reference to a string is specified as the argument type in the function declaration, while the actual argument is a char*.

Steve: Right.
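The distinction Susan has just summarized can be checked by comparing addresses. This is only a sketch: the class Name and the function IsSameObject are hypothetical, and the constructor takes const char* where the text says char* so that a string literal is accepted by current compilers.

```cpp
// Hypothetical class with a converting constructor that counts its calls.
class Name
{
public:
    Name(const char*) { ++s_Conversions; }

    static int s_Conversions;   // calls to the converting constructor
};

int Name::s_Conversions = 0;

// Returns true if the argument seen inside the function is the very
// object whose address the caller supplied in Original.
bool IsSameObject(const Name& Str, const Name* Original)
{
    return &Str == Original;
}
```

If n is a Name, then IsSameObject(n, &n) is true and no constructor runs, because Str is just another name for n. A call such as IsSameObject("Susan", &n) is false and adds one to s_Conversions, because Str then refers to a temporary that the compiler built from the C string.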
