Concrete Data Types

While different classes vary considerably in the facilities that they provide, there are significant benefits to a class whose objects behave like those of native types. As I've just mentioned, such a class is called a concrete data type. To make a class a concrete data type, we must define certain member functions that allow creation, copying, and deletion to behave as with a native variable.

Susan wanted to see a chart illustrating the correspondence between what the compiler does for a native type and what we have to do to make a type a concrete data type. Of course, I complied with her request (see Figure 6.2).

Figure 6.2. Comparison between native and user-defined types


Because the member functions listed in that chart are so fundamental to the proper operation of a class, the compiler will generate a version of each of them for us if we don't write them ourselves, just as the corresponding behavior is automatically supplied for the native types. As we will see in Chapter 7, the compiler-generated functions are generally too simplistic to be used in a complex class. In such a case we need to create our own versions of these functions, and I'll illustrate how to do that at the appropriate time. However, with a simple class such as the one we're creating here, the compiler-generated versions of the assignment operator, copy constructor, and destructor are perfectly adequate, so we won't be creating our own versions of these functions for StockItem.

Susan was a bit confused about the distinction between the compiler-generated versions of these essential functions and the compiler's built-in knowledge of the native types:

Susan: Aren't the compiler-generated versions the same thing as the native versions?

Steve: No, they're analogous but not the same. The compiler-generated functions are created only for objects, not for native types. The behavior of the native types is implemented directly in the compiler, not by means of functions.

Susan: I'm confused. Maybe it would help if you explained what you mean by "implemented directly in the compiler". Are you just saying that objects are implemented only by functions, whereas the native types are implemented by the built-in facilities of the compiler?

Steve: You're not confused, you're correct.

Susan: OK, here we go again. About the assignment operator, what is this "version"? I thought you said earlier that if you don't write your own assignment operator it will use the native operator. So I don't get this.

Steve: There is no native assignment operator for any class type; instead, the compiler will generate an assignment operator for a class if we don't do it ourselves.

Susan: Then how can the compiler create an assignment operator if it doesn't know what it is doing?

Steve: All the compiler-generated assignment operator does is to copy all of the members of the right-hand variable to the left-hand variable. This is good enough with the StockItem class. We'll see in Chapter 7 why this isn't always acceptable.

Susan: Isn't a simple copy all that the native assignment operator does?

Steve: The only native assignment operators that exist are for native types. Once we define our own types, the compiler has to generate assignment operators for us if we don't do it ourselves; otherwise, it would be impossible to copy the value of one variable of a class type to another without writing an assignment operator explicitly.

Susan: OK, this is what confused me, I just thought that the native functions would be used as a default if we didn't define our own in the class type, even though they would not work well.

Steve: There aren't any native functions that work on user-defined types. That's why the compiler has to generate them when necessary. But I think we have a semantic problem here, not a substantive one.

Susan: Why doesn't it default to the native assignment operator if it doesn't have any other information to direct it to make a class type operator? This is most distressing to me.

Steve: There isn't any native assignment operator for a StockItem. How could there be? The compiler has never heard of a StockItem until we define that class.

Susan: So it would be a third type of assignment operator. At this point, I am aware of the native type, the user-defined type and a compiler-generated type.

Steve: Right. The native type is built into the compiler, the user-defined type is defined by the user, and the compiler-generated type is created by the compiler for user-defined types where the user didn't define his own.

Susan: Then the native and the compiler-generated assignment operator are the same? If so, why did you agree with me that there must be three different types of assignment operators? In that case there would really only be two.

Steve: No, there is a difference. Here is the rundown:

1. (Native assignment) The knowledge of how to assign values of every native type is built into the compiler; whenever such an assignment is needed, the compiler emits prewritten code that copies the value from the source variable to the destination variable.

2. (Compiler-generated assignment) The knowledge of how to create a default assignment operator for any class type is built into the compiler; if we don't define an assignment operator for a given class, the compiler generates code for an assignment operator that merely copies all of the members of the source variable to the destination variable. Note that this is slightly different from 1, where the compiler copies canned instructions directly into the object file whenever the assignment is done; here, it generates an assignment operator for the specific class in question and then uses that operator whenever an assignment is done.

3. (User-defined assignment) This does exactly what we define it to do.

Susan: Did you ever discuss the source variable and the destination variable? I don't recall that concept in past discussions. I like this. All I remember is when you said that = means to set the variable on the left to the value on the right. Does this mean that the variable on the left is the destination variable and the value on the right is the source variable?

Steve: Yes, if the value on the right is a variable; it could also be an expression such as "x + 2".

Susan: But how could it be a variable if it is a known value?

Steve: It's not its value that is known, but its name. Its value can vary at run time, depending on how the program has executed up till this point.

Susan: So the main difference is that in 1 the instructions are already there to be used. In 2 the instructions for the assignment operator have to be generated before they can be used.

Steve: That's a good explanation.
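
To make the rundown above concrete, here is a minimal sketch of the three kinds of assignment. It isn't taken from the StockItem code; the names Pair, Logged, NativeAssign, GeneratedAssign, and UserAssign are all hypothetical, made up for this illustration.

```cpp
#include <cassert>

// Hypothetical class with no assignment operator of its own:
// the compiler generates one that copies each member in turn.
struct Pair
{
  short m_First;
  short m_Second;
};

// Hypothetical class with a user-defined assignment operator,
// which does exactly what we tell it to do (copy and count).
class Logged
{
public:
  Logged(short Value) : m_Value(Value), m_Assignments(0) {}

  Logged& operator=(const Logged& Source)
  {
    m_Value = Source.m_Value;  // copy the data ourselves
    ++m_Assignments;           // extra work the compiler wouldn't add
    return *this;
  }

  short Value() const { return m_Value; }
  int Assignments() const { return m_Assignments; }

private:
  short m_Value;
  int m_Assignments;
};

// 1. Native assignment: the knowledge is built into the compiler.
short NativeAssign(short Source)
{
  short Destination = 0;
  Destination = Source;
  return Destination;
}

// 2. Compiler-generated assignment: a member-by-member copy.
Pair GeneratedAssign(const Pair& Source)
{
  Pair Destination = {0, 0};
  Destination = Source;
  return Destination;
}

// 3. User-defined assignment: runs the operator= we wrote for Logged.
void UserAssign(Logged& Destination, const Logged& Source)
{
  Destination = Source;
}
```

In all three cases the statement doing the work looks the same, Destination = Source;. What differs is where the code behind that statement comes from.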

After my explanation of the advantages of a concrete data type, Susan became completely convinced, so much so that she wondered why we would ever want anything else.

Susan: On your definition for concrete data types. . . this is fine, but what I am thinking is that if something wasn't a concrete data type, then it wouldn't work, that is unless it was native. So what would a workable alternative to a concrete data type be?

Steve: Usually, we do want our objects to be concrete data types. However, there are times when, say, we don't want to copy a given object. For example, in the case of an object representing a window on the screen, copying such an object might cause another window to be displayed, which is probably not what we would want to happen.

Susan: OK, so what would you call an object that isn't of a concrete data type?

Steve: There's no special name for an object that isn't of a concrete data type.

Susan: So things that are not of a concrete data type have no names?

Steve: No, they have names; I was just saying that there's no term like non-concrete data type, meaning one that doesn't act like a native variable. There is a term abstract data type, but that means something else.

Susan: See, this is where I am still not clear. Again, if something is not a concrete data type, then what is it?

Steve: What is the term for a person who is not a programmer? There isn't any special term for such a person. Similarly, there's no special term for a class that doesn't act like a native variable type. If something isn't a concrete data type, then you can't treat it like a native variable. Either you can't copy it, or you can't assign to it, or you can't construct it by default, or it doesn't go away automatically at the end of the function where it is defined (or some combination of these). The lack of any of those features prevents a class from being a concrete data type.

Susan: Of what use would it be to have a class of a non-concrete data type? To me, it just sounds like an error.

Steve: Sometimes it does make sense. For example, you might want to create a class that has no default constructor; to create an element of such a class, you would have to supply one or more arguments. This is useful in preventing the use of an object that doesn't have any meaningful content; however, the lack of a default constructor does restrict the applicability of such a class, so it's best to provide such a constructor if possible.
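
Here is a rough sketch of the two cases Steve mentions: a class that can't be copied and a class that has no default constructor. The Window and NamedThing classes are hypothetical, and the = delete syntax and the checks from the type_traits header assume a compiler that supports C++11 or later.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical window class: copying is disabled, so it is not
// a concrete data type. (= delete requires C++11 or later.)
class Window
{
public:
  Window() : m_Title() {}
  Window(const Window&) = delete;
  Window& operator=(const Window&) = delete;

  std::string Title() const { return m_Title; }

private:
  std::string m_Title;
};

// Hypothetical class with no default constructor: every object
// must be given a name when it is created.
class NamedThing
{
public:
  explicit NamedThing(std::string Name) : m_Name(Name) {}

  std::string Name() const { return m_Name; }

private:
  std::string m_Name;
};

// The compiler can report which native-like behaviors each class has.
static_assert(!std::is_copy_constructible<Window>::value,
  "Window can't be copied");
static_assert(!std::is_copy_assignable<Window>::value,
  "Window can't be assigned");
static_assert(!std::is_default_constructible<NamedThing>::value,
  "NamedThing needs an argument");
static_assert(std::is_copy_constructible<NamedThing>::value,
  "NamedThing can still be copied");
```

With these definitions, a statement such as NamedThing x; is rejected by the compiler, whereas NamedThing x("widget"); is accepted.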

Before we can implement the member functions for our StockItem class, we have to define what a StockItem is in more detail than my previous sketch.[8] Let's start with the initial version of the interface specification for that class (Figure 6.3), which includes the specification of the default constructor, the Display function, and another constructor that is specific to the StockItem class.

[8] By the way, in using a reasonably functional class such as StockItem to illustrate these concepts, I'm violating a venerable tradition in C++ tutorials. Normally, example classes represent zoo animals, or shapes, or something equally useful in common programming situations.

I strongly recommend that you print out the files that contain this interface and its implementation (as well as the test program) for reference as you are going through this part of the chapter. Those files are item1.h, item1.cpp, and itemtst1.cpp, respectively.

Figure 6.3. The initial interface of the StockItem class (code\item1.h)
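#include <string>  // needed for std::string

class StockItem
{
public:
  // Default constructor: used when no initial value is specified.
  StockItem();

  // Constructor that takes a value for each member variable.
  StockItem(std::string Name, short InStock, short Price,
            std::string Distributor, std::string UPC);

  void Display();

private:
  // Member variables: part of the implementation, hidden from users.
  short m_InStock;
  short m_Price;
  std::string m_Name;
  std::string m_Distributor;
  std::string m_UPC;
};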

Your first reaction is probably something like "What a bunch of malarkey!" Let's take it a little at a time, and you'll see that this seeming gibberish actually has a rhyme and reason to it. First we have the line class StockItem. This tells the compiler that what follows is the definition of a class interface, which as we have already seen is a description of the operations that can be performed on objects of a given user-defined type; in this case, the type is StockItem. So that the compiler knows where this description begins and ends, it is enclosed in {}, just like any other block of information that is to be treated as one item.[9]

[9] However, there is one oddity about the declaration of a class when compared with other blocks: you need the “;” at the end of the block, after the closing “}”, which isn't necessary for most other blocks. This is a leftover from C, as are many of the quirks of C++.

After the opening {, the next line says public:. This is a new type of declaration called an access specifier, which tells the compiler the "security classification" of the item(s) following it, up to the next access specifier. This particular access specifier, public, means that any part of the program, regardless of whether it is defined in this class, can use the items starting immediately after the public declaration and continuing until there is another access specifier. In the current case, all of the items following the public specifier are operations that we wish to perform on StockItem objects. Since they are public, we can use them anywhere in our programs. You may be wondering why everything isn't public; why should we prevent ourselves (or users of our classes) from using everything in the classes? It's not just hardheartedness; it's actually a way of improving the reliability and flexibility of our software, as I'll explain later.

As you might imagine, this notion of access specifiers didn't get past Susan without a serious discussion. Here's the play-by-play account.

Susan: So, is public a word that is used often or is it just something you made up for this example?

Steve: It's a keyword of the C++ language, which has intrinsic meaning to the compiler. In this context, it means "any function, inside or outside this class, can access the following stuff, up to the next access specifier (if any)". Because it is a keyword, you can't have a variable named public, just as you can't have one named if.

Susan: These access specifiers: What are they, anyway? Are they always used in classes?

Steve: Yes.

Susan: Why aren't they needed for native variables?

Steve: Because you can't affect the implementation of native types; their internals are all predefined in the compiler.

Susan: What does internals mean? Do you mean stuff that is done by the compiler rather than stuff that can be done by the programmer?

Steve: Yes, in the case of native data types. In the case of class types, internals means the details of implementation of the type rather than what it does for the user.

Susan: You know, I understand what you are saying about internals; that is, I know what the words mean, but I just can't picture what you are doing when you say implementation. I don't see what is actually happening at this point.

Steve: The implementation of a class is the code that is responsible for actually doing the things that the interface says the objects of the class can do. All of the code in the item1.cpp file is part of the implementation of StockItem. In addition, the private member variables in the header file are logically part of the implementation, since the user of the class can't access them directly.

Susan: Why is a class function called a member function? I like class function better; it is more intuitive.

Steve: Sorry, I didn't make up the terminology. However, I think member function is actually more descriptive, because these functions are members (parts) of the objects of the class.

Susan: So on these variables, that m_ stuff; do you just do that to differentiate them from a native variable? If so, why would there be a confusion, since you have already told the compiler you are defining a class? Therefore, all that is in that class should already be understood to be in the class rather than the native language. I don't like to look at that m_ stuff; it's too cryptic.

Steve: It's true that the compiler can tell whether a variable is a member variable or a global variable. However, it can still be useful to give a different name to a member variable so that the programmer can tell which is which. Remember, a member variable looks like a global variable in a class implementation, because you don't declare it as you would an argument or a local variable.
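
As a small illustration of both points in this discussion, here is a sketch of a class with one public section and one private section, in which the m_ prefix separates a member variable from an argument with a similar name. The Thermostat class and its limits of 5 and 30 are invented for this example and have nothing to do with StockItem.

```cpp
#include <cassert>

// Hypothetical class, not part of the StockItem project.
class Thermostat
{
public:
  // Anything after "public:" can be used by any part of the program.
  Thermostat() : m_Setting(20) {}

  void Set(short Setting)
  {
    // The m_ prefix makes it obvious which name is the member
    // variable and which is the argument.
    if (Setting >= 5 && Setting <= 30)
      m_Setting = Setting;
  }

  short Setting() const { return m_Setting; }

private:
  // Anything after "private:" can be used only by the member
  // functions of this class.
  short m_Setting;
};
```

Code outside the class can call Set and Setting, but a statement such as t.m_Setting = 99; written outside the class would be rejected by the compiler, because m_Setting is private.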

Now we're up to the line that says StockItem();. This is the declaration for a function called a constructor, which tells the compiler what to do when we define a variable of a user-defined type. This particular constructor is the default constructor for the StockItem class. It's called the "default" constructor because it is used when no initial value is specified by the user; the empty parentheses after the name of the function indicate the lack of arguments to the function. The name of the function is the clue that it's a constructor. The name of a constructor is always the same as the name of the class for which it's a constructor, to make it easier for the compiler to identify constructors among all of the possible functions in a class.

This idea of having variables and functions "inside" objects wasn't intuitively obvious to Susan:

Susan: Now, where you talk about mixing a string and a short in the same function, can this not be done in the native language?

Steve: It's not in the same function but in the same variable. We are creating a user-defined variable that can be used just like a native variable.

Susan: OK, so you have a class StockItem. And it has a function called StockItem. But a StockItem is a variable, so in this respect a function can be inside a variable?

Steve: Correct. A StockItem is a variable that is composed of a number of functions and other variables.

Susan: OK, I think I am seeing the big picture now. But you know that this seems like such a departure from what I thought was going on before, where we used native types in functions rather than the other way around. Like when I wrote my little program, it would have shorts in it but they would be in the function main. So this is a complete turnabout from the way I used to think about them; this is hard.

Steve: Yes, that is a difficult transition to make. Interestingly enough, experience isn't necessarily an advantage here; you haven't had as much trouble with it as some professional programmers who have a lot more experience in writing functions as "stand-alone" things with no intrinsic ties to data structures. However, it is one of the essentials in object-oriented programming; most functions live "inside" objects and do the bidding of those objects, rather than being wild and free.

Why do we need to write our own default constructor? Well, although we have already specified the member variables used by the class so that the compiler can assign storage as with any other static or auto variable, that isn't enough information for the compiler to know how to initialize the objects of the class correctly.[10] Unlike the situation with a native variable, the compiler can't set a newly created StockItem to a reasonable value, since it doesn't understand what the member variables of a StockItem are used for. That is, it can't do the initialization without help from us. In the code for our default constructor, we will initialize the member variables to legitimate values so that we don't have to worry about having an uninitialized StockItem lying around as we did with a short in a previous example. Figure 6.4 shows what the code for our first default constructor looks like.

[10] In case it isn't obvious how the compiler can figure out the size of the object, consider that the class definition specifies all of the variables that are used to implement the objects of the class. When we define a new class, the types of all of the member variables of the class must already be defined. Therefore, the compiler can calculate the size of our class variables based on the sizes of those member variables. By the way, the size of our object isn't necessarily the sum of the sizes of its member variables; the compiler often has to add some other information to the objects of a class besides the member variables. We'll see one of the reasons for this later in this book.

Figure 6.4. The default constructor for the StockItem class (from code\item1.cpp)
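// Default constructor: the list names the members in the same order
// in which they are declared in the class definition.
StockItem::StockItem()
: m_InStock(0), m_Price(0), m_Name(), m_Distributor(), m_UPC()
{
}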

Let's use this example of a StockItem class to illuminate the distinction between interface and implementation. As I've already mentioned, the implementation of a class is the code that is responsible for actually doing the things promised by the interface of that class. The interface was laid out in Figure 6.3. With the exception of the different versions of the test program that illustrates the use of the StockItem class, all of the code that we will examine in this chapter is part of the implementation. This includes the constructors and the Display member function.

So you can keep track of where this fits into the "big picture", the code in Figure 6.4 is the implementation of the function StockItem::StockItem() (i.e., the default constructor for the class StockItem), whose interface was defined in Figure 6.3. Now, how does it work? Actually, this function isn't all that different from a "regular" function, but there are some important differences. First of all, the name looks sort of funny: Why is StockItem repeated?

The answer is that, unlike "regular" (technically, global) functions, a member function always belongs to a particular class. That is, such a function has special access to the data and other functions in the class, and vice versa. To mark its membership, its name consists of the name of the class (in this case, StockItem), followed by the class membership operator :: (more commonly known as the scope resolution operator), followed by the name of the function (which in this case, is also StockItem); as we have already seen, the name of a constructor is always the same as the name of its class. Figure 6.5 shows how each component of the function declaration contributes to the whole.

Figure 6.5. Declaring the default constructor for the StockItem class


If you've really been paying attention, there's one thing that you may have noticed about this declaration as compared with the original declaration of this function in the class interface definition for StockItem (Figure 6.3). In that figure, we declared this same function as StockItem();, without the additional StockItem:: on the front.[11] Why didn't we need to use the StockItem:: class membership notation in the class interface definition? Because inside the declaration of a class, we don't have to specify what class the member functions belong to; by definition, they belong to the class we're defining. Thus, StockItem() in the class interface declaration means “the member function StockItem, having no arguments”; i.e., the default constructor for the StockItem class.

[11] By the way, spaces between components of the name aren't significant; that is, we can leave them out as in Figure 6.4, or include them as in Figure 6.5.

Susan didn't have any trouble with this point, which was quite a relief to me:

Susan: Oh, so you don't have to write StockItem::StockItem in the interface definition because it is implied by the class StockItem declaration?

Steve: Right.
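
The same rule applies to every member function, not just to constructors. Here is a short sketch using a hypothetical Counter class (not part of the StockItem code) that shows the unqualified names inside the class interface and the qualified names outside it.

```cpp
#include <cassert>

// Hypothetical class used only to show the naming rule.
class Counter
{
public:
  Counter();          // inside the class: no Counter:: needed
  void Increment();
  int Count() const;

private:
  int m_Count;
};

// Outside the class, each member function's name is qualified
// with Counter:: so the compiler knows which class it belongs to.
Counter::Counter()
: m_Count(0)
{
}

void Counter::Increment()
{
  ++m_Count;
}

int Counter::Count() const
{
  return m_Count;
}
```

If we left off the Counter:: in front of Increment, the compiler would treat it as a global function that happens to have the same name, and the reference to m_Count inside it would be an error.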

Now let's look at the part of the constructor that initializes the member variables of the StockItem class, the member initialization list. The start of a member initialization list is signified by a : after the closing “)” of the constructor declaration, and the expressions in the list are separated by commas. A member initialization list can be used only with constructors, not any other type of functions.

The member initialization list of the default StockItem constructor is: : m_InStock(0), m_Price(0), m_Name(), m_Distributor(), m_UPC(). What does this mean exactly? Well, as its name indicates, it is a list of member initialization expressions, each of which initializes one member variable. In the case of a member variable of a native type such as short, a member initialization expression is equivalent to creating the variable with the initial value specified in the parentheses. In the case of a member variable of a class type, a member initialization expression is equivalent to creating the variable by calling the constructor that matches the type(s) of argument(s) specified in the parentheses, or the default constructor if there are no arguments specified. So the expression m_InStock(0) is equivalent to the creation and simultaneous initialization of a local variable by the statement short m_InStock = 0;. Similarly, the expression m_Name() is equivalent to the creation and simultaneous initialization of a local variable by the statement std::string m_Name;. Such a statement, of course, would initialize the string m_Name to the default value for a string, which happens to be the empty C string literal "".

Using a member initialization list is the best way to set up member variables in a constructor, for two reasons. First, it's more efficient than using assignment statements to set the values of member variables. For example, suppose that we were to write this constructor as shown in Figure 6.6.

Figure 6.6. Another way to write the default StockItem constructor
StockItem::StockItem()
{
 m_InStock = 0;
 m_Price = 0;
 m_Name = "";
 m_Distributor = "";
 m_UPC = "";
}

If we wrote the constructor that way, before we got to the opening “{” of the constructor, all of the member variables that had constructors (here, the strings) would be initialized to their default values. After the “{”, they would be set to the values we specified in the code for the constructor. It's true that we could solve this problem in this specific example by simply not initializing the strings at all, as that would mean that they would be initialized to their default values anyway; but that solution wouldn't apply in other constructors such as the one in Figure 6.7, where the member variables have specified values rather than default ones.
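
To see the extra work being done, we can count it. The following sketch uses a hypothetical Tracer class as the member variable, with counters that record how it was set up; the classes UsesList and UsesAssignment are also invented for this illustration.

```cpp
#include <cassert>

// Hypothetical member type that counts how it is set up.
class Tracer
{
public:
  Tracer() : m_Value(0) { ++s_DefaultConstructions; }
  Tracer(int Value) : m_Value(Value) { ++s_ValueConstructions; }

  Tracer& operator=(int Value)
  {
    m_Value = Value;
    ++s_Assignments;
    return *this;
  }

  int Value() const { return m_Value; }

  static int s_DefaultConstructions;
  static int s_ValueConstructions;
  static int s_Assignments;

private:
  int m_Value;
};

int Tracer::s_DefaultConstructions = 0;
int Tracer::s_ValueConstructions = 0;
int Tracer::s_Assignments = 0;

// Member initialization list: the member is created with its
// final value in one step.
class UsesList
{
public:
  UsesList() : m_Member(5) {}
  int Value() const { return m_Member.Value(); }

private:
  Tracer m_Member;
};

// Assignment in the body: the member is default-constructed before
// the opening brace, then assigned a value afterward (two steps).
class UsesAssignment
{
public:
  UsesAssignment() { m_Member = 5; }
  int Value() const { return m_Member.Value(); }

private:
  Tracer m_Member;
};
```

Creating a UsesList object causes one construction with a value and nothing else. Creating a UsesAssignment object causes one default construction followed by one assignment, even though both objects end up holding the same value.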

The second reason that we should use a member initialization list to initialize our member variables is that some member “variables” aren't variables at all but constants. We'll see how to define consts, as they are called in C++, in a later chapter. For now, it's enough to know that you can't assign a value to a const, but you can (and indeed have to) initialize it; therefore, when dealing with member consts, a member initialization list isn't just a good idea, it's the law.
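
As a preview only, here is a minimal sketch of a class with a const member. The Shelf class is hypothetical, and the details of const are left for the later chapter; the point is simply where the const member gets its value.

```cpp
#include <cassert>

// Hypothetical class with a constant member.
class Shelf
{
public:
  Shelf(short Capacity)
  : m_Capacity(Capacity),  // a const member can be given a value only here
    m_Used(0)
  {
    // m_Capacity = Capacity;  // would not compile: can't assign to a const
  }

  // Uses one slot if any are left; reports whether it succeeded.
  bool Add()
  {
    if (m_Used >= m_Capacity)
      return false;
    ++m_Used;
    return true;
  }

private:
  const short m_Capacity;
  short m_Used;
};
```

A Shelf created with a capacity of 2 accepts two calls to Add and refuses the third, and nothing in the class can change that capacity afterward.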

There is one fine point that isn't obvious from looking at the code for this constructor: The expressions in a member initialization list are executed in the order in which the member variables being initialized are declared in the class definition, which is not necessarily the order in which the expressions appear in the list. In our example, since m_InStock appears before m_Name in the class definition, the member initialization expression for m_InStock will be executed before the expression initializing m_Name. This doesn't matter right now, but it will be important in Chapter 7, where we will be using initialization expressions whose order of execution is important.
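
Here is a small sketch that makes the order visible. The Ordered class, the Note function, and the g_Log variable are all hypothetical; the global variable is there only to keep the example short. The list deliberately names the members in the "wrong" order, which many compilers will warn about (for example, GCC's -Wreorder warning).

```cpp
#include <cassert>
#include <string>

// Records the order in which the initialization expressions run.
// (A global is used only to keep this sketch short.)
std::string g_Log;

// Hypothetical helper: notes that it was called, then passes the value on.
short Note(char Tag, short Value)
{
  g_Log += Tag;
  return Value;
}

class Ordered
{
public:
  // The list names m_Second first, but m_First is declared first
  // in the class, so m_First is initialized first.
  Ordered() : m_Second(Note('2', 2)), m_First(Note('1', 1)) {}

  short First() const { return m_First; }
  short Second() const { return m_Second; }

private:
  short m_First;
  short m_Second;
};
```

After an Ordered object is created, g_Log contains "12" rather than "21", because the order of declaration in the class determines the order of initialization, no matter how the list is written.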

You may have noticed that the body of the function (the part inside the {}) shown in Figure 6.4 is empty, because all of the work has already been done by the member initialization list. This is fairly common when writing constructors, but not universal; as we'll see in Chapter 7, sometimes a constructor has to do something other than initialize member variables, in which case we need some code inside the {}.

Susan objected to my cavalier use of the empty C string literal "":

Susan: Excuse me, but what kind of value is " " ? Do you know how annoying it is to keep working with nothing?

Steve: It's not " ", but "". The former has a space between the quotes and the latter does not; the former is a one-character C string literal consisting of one space, while the latter is a zero-character C string literal.

Susan: OK, so "" is an empty C string literal, but could you please explain how this works?

Steve: The "" means that we have a C string literal with no data in it. The compiler generates a C string literal consisting of just the terminating null byte.

Susan: What good does that do? I don't get it.

Steve: Well, a string has to have some value for its char* to point to; if we don't have any real data, then using an empty C string literal for that purpose is analogous to setting a numeric value to 0.

Susan: OK, so this is only setting the strings in the default constructor to a value that the compiler can understand so you don't get an error message, although there is no real data. We're trying to fool the compiler, right?

Steve: Close, but not quite. Basically, we want to make sure that we know the state of the strings in a default StockItem. We don't want to have trouble with uninitialized variables; remember how much trouble they can cause?

Susan: Yes, I remember. So this is just the way to initialize a string when you don't know what real value it will end up having?

Steve: Yes, that's how we're using it here.
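
The difference between "" and " " is easy to check for ourselves. This sketch assumes the standard library's std::string, as used in Figure 6.3; the function names LengthOf and DefaultIsEmpty are hypothetical.

```cpp
#include <cassert>
#include <string>

// Returns the number of characters in a string, so that we can
// compare "" (no characters) with " " (one space).
std::string::size_type LengthOf(const std::string& Text)
{
  return Text.size();
}

// Reports whether a string created with no initial value has the
// same value as one created from the empty C string literal "".
bool DefaultIsEmpty()
{
  std::string Unspecified;
  std::string Empty("");
  return Unspecified == Empty;
}
```

LengthOf("") returns 0 and LengthOf(" ") returns 1, while DefaultIsEmpty() returns true. So a string that we initialize with "" is in a known state, even though it holds no data.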

Now let's get back to the member variables of StockItem. One important characteristic of any variable is its scope, so we should pay attention to the scope of these variables. In Chapter 5, we saw two scopes in which a variable could be defined: local (i.e., available only within the block where it was defined) and global (i.e., available anywhere in the program). Well, these variables aren't arguments (which have local scope) since they don't appear in the function's header. On the other hand, they aren't defined in the function; therefore, they aren't local variables. Surely they can't be global variables, after I showed you how treacherous those can be.
