The Binary Number System

Once upon a time, the Acme company had a factory that made golf carts. One day, Bob, the President of Acme, decided to add an odometer to the carts so that the purchaser of the cart could estimate when to recharge the battery. To save money, Bob decided to buy the little numbered wheels for the odometers and have his employees put the odometers together. The minimum order was for a thousand odometer wheels, which was more than he needed for his initial run of 50 odometers. When he got the wheels, however, he noticed that they were defective. Instead of the numbers 0 – 9, each wheel had only two numbers, 0 and 1. Of course, he was quite irritated by this error and attempted to contact the company from which he had purchased the wheels, but it had closed down for a month for summer vacation. What was he to do until it reopened?

While he was fretting about this problem, the employee who had been assigned the task of putting the odometers together from the wheels came up with a possible solution. This employee, Jim, came into Bob's office and said, “Bob, I have an idea. Since we have lots of orders for these odometer-equipped carts, maybe we can make an odometer with these funny wheels and tell the customers how to read the numbers on the odometer.”

Bob was taken aback by this idea. “What do you mean, Jim? How can anyone read those screwy odometers?”

Jim had given this some thought. “Let's take a look at what one of these odometers, say with five wheels, can display. Obviously, it would start out reading 00000, just like a normal odometer. Then when one mile has elapsed, the rightmost wheel turns to 1, so the whole display is 00001; again, this is just like a normal odometer.”

“Now we come to the tricky part. The rightmost wheel goes back to 0, not having any more numbers to display, and pushes the 'tens' wheel to 1; the whole number now reads 00010. Obviously, one more mile makes it 00011, which gives the situation shown in Figure 2.3.”

Figure 2.3. The first few numbers


Jim continued, “What's next? This time, the rightmost wheel turns over again to 0, triggering the second wheel to its next position. However, this time, the second wheel is already at its highest value, 1; therefore, it also turns over to 0 and increments the third wheel. It's not hard to follow this for a few more miles, as shown in Figure 2.4.”

Figure 2.4. The next few numbers


Bob said, “I get it. It's almost as though we were counting normally, except that you skip all the numbers that have anything but 0s or 1s in them.”

“That's right, Bob. So I suppose we could make up a list of the 'real' numbers and give it to the customers to use until we can replace these odometers with normal ones. Perhaps they'll be willing to work with us on this problem.”

“Okay, Jim, if you think they'll buy it. Let's get a few of the customers we know best and ask them if they'll try it; we won't charge them for the odometers until we have the real ones, but maybe they'll stick with us until then. Perhaps any odometer would be better than no odometer at all.”

Jim went to work, making some odometers out of the defective wheels; however, he soon figured out that he had to use more than five wheels because that allowed only numbers from 0 to 31. How did he know this?

Each wheel has two numbers, 0 and 1. So with one wheel, we have a total of two combinations. Two wheels can have either a 0 or a 1 for the first number, and for each position of the first wheel there are two possibilities for the second; 2*2 = 4, so there are a total of four combinations for two wheels.[24] With three wheels, the same analysis holds: two numbers for the first wheel * two for the second wheel * two for the third wheel, or eight possibilities in all; actually, they are the same eight possibilities we saw in Figures 2.3 and 2.4.

[24] Note that the “*” symbol is used here to represent multiplication. We can't use the typical “x” symbol used in mathematics for this purpose, because letters are used for specific purposes in the C++ language. We'll get into that in detail later.

A pattern is beginning to develop. For each added wheel, we get twice as many possible combinations. To see how this continues, take a look at Figure 2.5[25], which shows the count of combinations vs. the number of wheels for all wheel counts up to 16 (i.e., quantities that can be expressed by 16-bit numbers).

[25] If you think that last number in Figure 2.5 looks familiar, you're right. It's the number of different values that, as I mentioned on page 30, can be stored in a type of numeric variable called a short. This is no coincidence; read on for the detailed explanation.

Jim decided that 14 wheels would do the job since the lifespan of the golf cart probably wouldn't exceed 16383 miles, and so he made up the odometers. The selected customers turned out to be agreeable and soon found that having even a weird odometer was better than none, especially since they didn't have to pay for it. However, one customer did have a complaint. The numbers on the wheels didn't seem to make sense when translated with the chart supplied by Acme.

The customer estimated that he had driven the cart about 9 miles, but the odometer displayed

11111111110111

which, according to his translation chart, was 16375 miles. What could have gone wrong?

Jim decided to have the cart brought in for a checkup and what he discovered was that the odometer cable had been hooked up backwards. That is, instead of turning the wheels forward, they were going backwards. That was part of the solution, but why was the value 16375? Just like a car odometer, in which 99999 (or 999999, if you have a six-wheel odometer) is followed by 0, going backwards from 0 reverses that progression. Similarly, the number 11111111111111 on the funny odometers would be followed by 00000000000000, since the “carry” off the leftmost digit is lost.

Figure 2.5. How many combinations?


Therefore, if you start out at 0 and go backward 1 mile, you'll get

11111111111111

The next mile will turn the last digit back to 0, producing

11111111111110

What happens next? The last wheel turns back to 1 and triggers the second wheel to switch as well, with the result

1111111111101

The next few “backward” numbers look like this:

11111111111100

11111111111011

11111111111010

11111111111001

11111111111000

11111111110111

and so on. If you look at the right-hand end of these numbers, you'll see that the progression is just the opposite of the “forward” numbers.

As for the customer's actual mileage, the last one of these is the number the customer saw on his backward odometer. Apparently, he was right about the distance driven, since this is the ninth “backward” number. So Jim fixed the backward odometer cable and reset the value to the correct number, 00000000001001, or 9 miles.

Eventually, Acme got the right odometer wheels with 0 – 9 on them, replaced the peculiar ones and everyone lived happily ever after.

THE END

Signed Variables

Since the wheels that made up the funny odometers contain only two digits, 0 and 1, the odometers use the binary system for counting. Now it should be obvious why we will see numbers like 65536 and 32768 in our discussions of the number of possible different values that a variable can hold; variables are stored in RAM as collections of bytes, each of which contains 8 bits.[26] As the list of combinations indicates, 8 bits (1 byte) provide 256 different combinations, while 16 bits (2 bytes) can represent 65536 different possible values.

[26] I should mention here that the C++ standard does not require the size of a byte to be 8 bits. But on the most common machines and compilers, it is.

But what about the “backward” numbers with a lot of 1s on the left? As the fable suggests, they correspond to negative numbers. That is, if moving 2 miles forward from 0 registers as 00000000000010, and moving 2 miles backward from 0 registers as 11111111111110, then the latter number is in some sense equivalent to – 2 miles. This in fact is the way that negative integers are stored in the computer; integer variables that can store either positive or negative values are called signed variables. If we don't specify whether we want to be able to store either positive or negative values in a given variable, the C++ language assumes that we want that ability and provides it for us by default.

However, adding the ability to represent negative numbers has a drawback: you can't represent as many positive numbers. This should be fairly obvious, since if we interpret some of the possible patterns as negative, they can't also be used for positive values. Of course, we don't have to worry about negative numbers when counting, for example, how many employees there are working for our company; in such cases, we can specify that we want to use unsigned variables, which will always be interpreted as positive (or 0) values. An example is an unsigned short variable, which uses 16 bits (that is, 2 bytes) to hold any number from 0 to 65535, which totals 65536 different values.[27] This capacity can be calculated as follows: since each byte is 8 bits, 2 bytes contain a total of 16 bits, and 216 is 65536.

[27] As previously noted, the size of a short varies from one compiler and machine to another. However, on the most common current compilers, including the one on the CD in the back of the book, it is 16 bits.

It's important to understand that the difference between a short (that is, a signed short) and an unsigned short is not size, but which 65536 values each can hold. An unsigned short can hold any whole number from 0 to 65535, whereas a short can hold any value from – 32768 to +32767.[28]

[28] You might also think that this would apply to numbers such as the total count of books sold in a particular category since publication date. But apparently that number can be negative, at least according to my royalty statements.

I hope this is clear to you, but in case it isn't, let's see how Susan and I worked over this point.

Susan: I really don't think I understand what a short is besides being 2 bytes of RAM, and I don't really know what signed and unsigned mean.

Steve: A short is indeed 2 bytes (that is, 16 bits) of RAM, at least with our current compiler. This means that it can hold any of 216 (65536) different values. This is a very nice range of values for holding the number of pounds that a pumpkin weighs (for example). You'll see some more uses for this type of variable later.

The difference between a (signed) short and an unsigned short is exactly which 65536 values each can hold. An unsigned short can hold any whole number from 0 to 65535, whereas a (signed) short can hold any value from – 32768 to +32767. The difference between these is solely in the interpretation that we (and the compiler) give to the values. In other words, it's not possible to tell whether a given 2 bytes of RAM represent a short or an unsigned short by looking at the contents of those bytes; you have to know how the variable was defined in the program.

Susan: OK, let's start over. A short is 2 bytes of RAM. A short is a variable. A short is a numeric variable. It can be signed (why is that a default?), meaning its value can be – 32768 to +32767, or unsigned, meaning its value can be 0 – 65535. How's that?

Steve: That's fine. Since you've asked, the reason signed is the default is because that's the way it was in C, and changing it in C++ would “break” C programs that depend on this default. Bjarne Stroustrup, the inventor of C++, has a rule that C++ must be as close to C as possible but no closer. In this case, there's no real reason to change the default, so it wasn't changed.

Susan: Oh, why is it that every time you say something is fairly obvious, my mind just shuts down? When you say “if we interpret some of the possible patterns as negative, they can't also be used for positive values.” Huh? Then if that is the case, would not the reverse also be true? I can see how this explains the values of the signed and unsigned short, but really, I don't think I have grasped this concept.

Steve: What I was trying to explain is that you have to choose one of the following two possibilities:

  1. (signed) short range: – 32768 to +32767

  2. unsigned short range: 0 to 65535

In other words, you have to decide whether you want a given variable to represent

  1. Any of 65536 values from – 32768 to +32767, including 0.

  2. Any of 65536 values from 0 to 65535

If you want a variable with a range like that in selection 1, use a (signed) short; if you prefer the range in selection 2, use an unsigned short. For example, for the number of lines in a text file, you could use an unsigned short, since the maximum number of lines could be limited to less than 65000 lines and couldn't ever be negative. On the other hand, to represent the number of copies of a book that have been sold in the last month, including the possibility of returns exceeding sales, a signed short would be better, since the value could be either positive or negative.[29]

[29] If neither of these does what you want, don't despair. Other types of numeric variables have different ranges, as we'll see starting in Chapter 9.

Susan: In other words, if you are going to be using variables that might have a negative value, then use a signed short, and if you want strictly positive numbers, including 0 as “positive”, then use an unsigned short. Right?

Steve: Exactly!

Susan: Well, then, how do you write a short to indicate that it is signed or unsigned?

Steve: When you define it, you have to specify that it is unsigned if you want it to be unsigned; the default is signed. In other words, if we define a variable x as short x;, it will be signed, whereas if we want a variable called x that is an unsigned short, we have to say unsigned short x;.

Susan: So does it make any difference if your variable is going to overlap the signed and unsigned short ranges? For example, if you are using numbers from 10000 to 30000, would it matter which short you used? It falls under the definition of both.

Steve: You can use whichever you wish in that case.

A (Very) Short Course in Hexadecimal Notation

You may have noticed that it's tedious and error prone to represent numbers in binary; a long string of 0s and 1s is hard to remember or to copy. For this reason, the pure binary system is hardly ever used to specify numbers in computing. However, we have already seen that binary is much more “natural” for computers than the more familiar decimal system. Is there a number system that we humans can use a little more easily than binary, while retaining the advantages of binary for describing internal events in the computer?

As it happens, there is. It's called hexadecimal, which means “base 16”. As a rule, the term hexadecimal is abbreviated to hex. Since there are 16 possible combinations of 4 bits (2*2*2*2), hexadecimal notation allows 4 bits of a binary number to be represented by one hex digit. Unfortunately, however, there are only 10 “normal” digits, 0 – 9. To represent a number in any base, you need as many different digit values as the base, so that any number less than the base can be represented by one digit. For example, in base 2, you need only two digits, 0 and 1. In base 8 (octal), you need eight digits, 0 – 7.[30] So far, so good. But what about base 16? To use this base, we need 16 digits. Since only 10 numeric digits are available, hex notation needs a source for the other six digits. Because letters of the alphabet are available and familiar, the first six letters, a – f, were adopted for this service.[31],[32]

[30] In the early days of computing, base 8 was sometimes used instead of base 16, especially on machines that used 12-bit and 36-bit registers; however, it has fallen into disuse because almost all current machines have 32-bit registers. 32, of course, is divisible by 4 but not by 3. Base 8 is unlikely to make a comeback because the upcoming generation of new CPUs will have 64-bit registers, which are again evenly divisible into hex digits but not octal ones.

[31] Either upper or lower case letters are acceptable to most programs (and programmers). I'll use lower case because such letters are easier to distinguish than upper case ones; besides, I find them less irritating to look at.

[32] See On Beyond Zebra (ISBN 0-394-80084-2) for another possible approach to this problem.

The notion of a base-16 numbering system can really throw someone who learned normal decimal arithmetic solely by rote, without understanding the concepts on which it is based. Susan and I spent quite a bit of time on it, starting with this discussion:

Susan: I don't get this at all! What is the deal with the letters in the hex system? I guess it would be okay if 16 wasn't represented by 10!

Steve: Well, there are only 10 “normal” digits, 0 – 9. To represent a number in any base, you need as many “digits” as the base, so that any number less than the base can be represented by one “digit”. This is no problem with a base less than ten, such as octal, but what about base 16? To use this base we need 16 digits, 0 – 9 and a – f. One way to remember this is to imagine that the “hex” in “hexadecimal” stands for the six letters a – f and the “decimal” stands for the 10 digits 0 – 9.

Susan: OK, so a hex digit represents 16 bits? So then is hex equal to 2 bytes? According to the preceding a hex digit is 4 bits.

Steve: Yes, a hex digit represents 4 bits. Let's try a new approach. First, let me define a new term, a hexit. That's short for “hexadecimal digit”, just like “bit” is short for “binary digit”.

  1. How many one-digit decimal numbers exist?

  2. How many two-digit decimal numbers exist?

  3. How many three-digit decimal numbers exist?

  4. How many four-digit decimal numbers exist?

  5. How many one-bit binary numbers exist?

  6. How many two-bit binary numbers exist?

  7. How many three-bit binary numbers exist?

  8. How many four-bit binary numbers exist?

  9. How many one-hexit hexadecimal numbers exist?

  10. How many two-hexit hexadecimal numbers exist?

  11. How many three-hexit hexadecimal numbers exist?

  12. How many four-hexit hexadecimal numbers exist?

The answers are:

  1. 10

  2. 100

  3. 1000

  4. 10000

  5. 2

  6. 4

  7. 8

  8. 16

  9. 16

  10. 256

  11. 4096

  12. 65536

What do all these answers have in common? Let's look at the answers a little differently, in powers of 10, 2, and 16, respectively:

  1. 10 = 101

  2. 100 = 102

  3. 1000 = 103

  4. 10000 = 104

  5. 2 = 21

  6. 4 = 22

  7. 8 = 22

  8. 16 = 22

  9. 16 = 162

  10. 256 = 162

  11. 4096 = 162

  12. 65536 = 162

That is, a number that has one digit can represent “base” different values, where “base” is two, ten, or sixteen (in our examples). Every time we increase the size of the number by one more digit, we can represent “base” times as many possible different values, or in other words, we multiply the range of values that the number can represent by the base. Thus, a two-digit number can represent any of “base*base” values, a three-digit number can represent any of “base*base*base” values, and so on. That's the way positional number systems such as decimal, binary, and hex work. If you need a bigger number, you just add more digits.

Okay, so what does this have to do with hex? If you look at the above table, you'll see that 24 (16) is equal to 161. That means that 4 bits are exactly equivalent to one hexit in their ability to represent different numbers: exactly 16 possible numbers can be represented by four bits, and exactly 16 possible numbers can be represented by one hexit.

This means that you can write one hexit wherever you would otherwise have to use four bits, as illustrated in Figure 2.6.

Figure 2.6. Binary to hex conversion table

So an 8-bit number, such as:

0101 1011

can be translated directly into a hex value, like this:

5b

For this reason, binary is almost never used. Instead, we use hex as a shortcut to eliminate the necessity of reading, writing, and remembering long strings of bits.

Susan: A hex digit or hexit is like a four-wheel odometer in binary. Since each wheel is capable of only one of two values, being either (1) or (0), then the total number of possible values is 16. Thus your 2*2*2*2 = 16. I think I've got this down.

Steve: You certainly do!

Susan: If it has 4 bits and you have 2 of them then won't there be eight “wheels” and so forth? So 2 hex would hold XXXXXXXX places and 3 hex would hold XXXXXXXXXXXX places.

Steve: Correct. A one-hexit number is analogous to a one-digit decimal number. A one-hexit number contains 4 bits and therefore can represent any of 16 values. A two-hexit number contains 8 bits and therefore can represent any of 256 values.

Now that we've seen how each hex digit corresponds exactly to a group of four binary digits, here's an exercise you can use to improve your understanding of this topic: Invent a random string of four binary digits and see where it is in Figure 2.6. I guarantee it'll be there somewhere! Then look at the “hex” column and see what “digit” it corresponds to. There's nothing really mysterious about hex; since we have run out of digits after 9, we have to use letters to represent the numbers 'ten', 'eleven', 'twelve', 'thirteen', 'fourteen', and 'fifteen'.

Another reason to use hex rather than decimal is that byte values expressed as hex digits can be combined directly to produce larger values, which is not true with decimal digits. In case this isn't obvious, let's go over it in more detail. Since each hex digit (0 – f) represents exactly 4 bits, two of them (00 – ff) represent 8 bits, or one byte. Similarly, 4 hex digits (0000 – ffff) represent 16 bits, or a short value; the first two digits represent the first byte of the 2-byte value, and the last two digits, the second byte. This can be extended to any number of bytes. On the other hand, representing 4 bits requires two decimal digits, as the values range from 00 – 15, whereas it takes three digits (000 – 255) to represent one byte. A 2-byte value requires five decimal digits, since the value can be from 00000 to 65535. As you can see, there's no simple relationship between the decimal digits representing each byte and the decimal representation of a 2-byte value.

Figure 2.7 is a table showing the correspondence between some decimal, hex, and binary numbers, with the values of each digit position in each number base indicated, and the calculation of the total of all of the bit values in the binary representation.

Figure 2.7. Different representations of the same numbers
  Decimal      Hexadecimal     Binary           Sum of binary
Place Values  Place Values  Place Values         digit values
 10  1          16 1        16 8 4 2 1

     0           0 0         0 0 0 0 0   =    0 + 0 + 0 + 0 + 0
     1           0 1         0 0 0 0 1   =    0 + 0 + 0 + 0 + 1
     2           0 2         0 0 0 1 0   =    0 + 0 + 0 + 2 + 0
     3           0 3         0 0 0 1 1   =    0 + 0 + 0 + 2 + 1
     4           0 4         0 0 1 0 0   =    0 + 0 + 4 + 0 + 0
     5           0 5         0 0 1 0 1   =    0 + 0 + 4 + 0 + 1
     6           0 6         0 0 1 1 0   =    0 + 0 + 4 + 2 + 0
     7           0 7         0 0 1 1 1   =    0 + 0 + 4 + 2 + 1
     8           0 8         0 1 0 0 0   =    0 + 8 + 0 + 0 + 0
     9           0 9         0 1 0 0 1   =    0 + 8 + 0 + 0 + 1
   1 0           0 a         0 1 0 1 0   =    0 + 8 + 0 + 2 + 0
   1 1           0 b         0 1 0 1 1   =    0 + 8 + 0 + 2 + 1
   1 2           0 c         0 1 1 0 0   =    0 + 8 + 4 + 0 + 0
   1 3           0 d         0 1 1 0 1   =    0 + 8 + 4 + 0 + 1
   1 4           0 e         0 1 1 1 0   =    0 + 8 + 4 + 2 + 0
   1 5           0 f         0 1 1 1 1   =    0 + 8 + 4 + 2 + 1
   1 6           1 0         1 0 0 0 0   =   16 + 0 + 0 + 0 + 0
   1 7           1 1         1 0 0 0 1   =   16 + 0 + 0 + 0 + 1
   1 8           1 2         1 0 0 1 0   =   16 + 0 + 0 + 2 + 0
   1 9           1 3         1 0 0 1 1   =   16 + 0 + 0 + 2 + 1

Susan had some more thoughts on the hexadecimal number system. Let's listen in:

Susan: I think you need to spend a little more time reviewing the hex system, like an entire chapter.<G> Well, I am getting the impression that we are going to be working with hex, so I am trying to concentrate my understanding on that instead of binary. I think this all moves a little too fast for me. I don't know what your other reviewers are saying but I just feel like I get a definition of an abstract concept, and the next thing I know I am supposed to be doing something with it, like make it work. Ha! I personally need to digest new concepts, I really need to think them over a bit, to take them in and absorb them. I just can't start working with it right away.

As usual, I've complied with her request; the results are immediately ahead.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset