The char and string Types

This should give you some idea of how numeric variables and values work. But what about nonnumeric ones?

This brings us to the subject of two new variable types and the values they can contain. These are the char (short for “character”) and its relative, the string. What are these good for, and how do they work?[22]

[22] In case you were wondering, the most common pronunciation of char has an a like the a in “married”, while the ch sounds like “k”.

A variable of type char corresponds to 1 byte of storage. Since a byte has 8 bits, it can hold any of 256 (2^8) values; the exact values depend on whether it is signed or unsigned, as with the short variables we have seen before.[23] Going strictly according to this description, you might get the idea that a char is just a “really short” numeric variable. A char indeed can be used for this purpose in cases where no more than 256 different numeric values are to be represented. In fact, this explains why you might want a signed char. Such a variable can be used to hold numbers from –128 to +127; an unsigned char, on the other hand, has a range from 0 to 255. This facility isn't used very much any more, but in the early days of C, memory was very expensive and scarce, so it was sometimes worth the effort to use 1-byte variables to hold small values.

[23] Again, the C++ language does not require that a byte have exactly eight bits, only that it have at least eight bits. On the other hand, by definition a char variable occupies exactly one byte of storage, however big a byte may be.

However, the main purpose of a char is to represent an individual letter, digit, punctuation mark, “special character” (e.g., $, @, #, %, and so on) or one of the other “printable” and displayable units from which words, sentences, and other textual data such as this paragraph are composed.[24]

[24] As we will see shortly, not all characters have visible representations; some of these “nonprintable” characters are useful in controlling how our printed or displayed information looks.

These 256 different possibilities are plenty to represent any character in English, as well as a number of other European languages. But the written forms of “ideographic” languages, such as Chinese, consist of far more than 256 characters, so 1 byte isn't big enough to hold a character in these languages. While they have been supported to some extent by schemes that switch among a number of sets of 256 characters each, such clumsy approaches to the problem made programs much more complicated and error-prone. As the international market for software is increasing rapidly, it has become more important to have a convenient method of handling large character sets; as a result, a standard method of representing the characters of such languages has been developed. It's called the “Unicode standard”. It was originally designed around 2 bytes per character; because that turned out not to be enough for every character, Unicode also defines encodings that use up to 32 bits per character.

Since one char isn't good for much by itself, we often use groups of them, called strings, to make them easier to handle. Just as with numeric values, these variables can be set to literal values which represent themselves. Figure 3.16 is an example of how to specify and use each of these types we've just encountered. This is the first complete program we've seen, so there are a couple of new constructs that I'll have to explain to you.

By the way, in case the program in Figure 3.16 doesn't seem very useful, that's because it isn't; it's just an example of the syntax of defining and using variables and literal values. However, we'll use these constructs to do useful work later, so going over them now isn't a waste of time.

Figure 3.16. Some real characters and strings (codeasic00.cpp)
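#include <string>      // needed so that we can use the string type
using namespace std;

int main()
{
   char c1;            // two variables that each hold one character
   char c2;
   string s1;          // two variables that each hold a group of characters
   string s2;

   c1 = 'A';           // a char literal is written in single quotes
   c2 = c1;            // copy the value of c1 into c2

   s1 = "This is a test ";   // a string literal is written in double quotes
   s2 = "and so is this.";

   return 0;
}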

Why do we need the line #include <string>? Because we have to tell the compiler that we want to manipulate strings; the code that allows us to do that isn't automatically included with our programs unless we ask for it. For the moment, it's enough to know that including <string> is necessary to tell the compiler that we want to use strings; we'll get into some details of this mechanism later.
