Chapter 7. (Downplaying) Pointers in Modern C++

Objectives

In this chapter, you’ll:

Learn what pointers are, and declare and initialize them.

Use the address (&) and indirection (*) pointer operators.

Compare the capabilities of pointers and references.

Use pointers to pass arguments to functions by reference.

Use pointer-based arrays and strings mostly in legacy code.

Use const with pointers and the data they point to.

Use operator sizeof to determine the number of bytes that store a value of a particular type.

Understand pointer expressions and pointer arithmetic that you’ll see in legacy code.

Use C++11’s nullptr to represent pointers to nothing.

Use C++11’s begin and end library functions with pointer-based arrays.

Learn various C++ Core Guidelines for avoiding pointers and pointer-based arrays to create safer, more robust programs.

Use C++20’s to_array function to convert built-in arrays and initializer lists to std::arrays.

Continue our objects-natural approach by using C++20’s class template span to create objects that are views into built-in arrays, std::arrays and std::vectors.

7.1 Introduction

This chapter discusses pointers, built-in pointer-based arrays and pointer-based strings (also called C-strings), each of which C++ inherited from the C programming language.

Downplaying Pointers in Modern C++

Pointers are powerful but challenging to work with and error-prone. So, Modern C++ (C++20, C++17, C++14 and C++11) has added features that eliminate the need for most pointers. New software-development projects generally should prefer:

• using references to using pointers,

• using std::array¹ and std::vector objects (Chapter 6) to using built-in pointer-based arrays, and

• using std::string objects (Chapters 2 and 8) to pointer-based C-strings.

1. We pronounce “std::” as “standard,” so throughout this chapter we say “a std::array” rather than “an std::array,” which assumes “std::” is pronounced as its individual letters s, t and d.

Sometimes Pointers Are Still Required

You’ll encounter pointers, pointer-based arrays and pointer-based strings frequently in the massive installed base of legacy C++ code. Pointers are required to:

• create and manipulate dynamic data structures, like linked lists, queues, stacks and trees that can grow and shrink at execution time—though most programmers will use the C++ standard library’s existing dynamic containers like vector and the containers we discuss in Chapter 17,

• process command-line arguments, which a program receives as a pointer-based array of pointer-based C-strings, and

• pass arguments by reference if there’s a possibility of a nullptr² (i.e., a pointer to nothing; Section 7.2.2), because a reference must refer to an actual object.

2. C++ Core Guidelines. Accessed June 14, 2020. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rf-ptr-ref.

Pointer-Related C++ Core Guidelines

We mention C++ Core Guidelines that encourage you to make your code safer and more robust by recommending you use techniques that avoid pointers, pointer-based arrays and pointer-based strings. For example, several guidelines recommend implementing pass-by-reference using references, rather than pointers.³

3. C++ Core Guidelines. Accessed June 14, 2020. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#S-functions.

C++20 Features for Avoiding Pointers

For programs that still require pointer-based arrays (e.g., command-line arguments), C++20 adds two new features that help make your programs safer and more robust:

• Function to_array converts a pointer-based array to a std::array, so you can take advantage of the features we demonstrated in Chapter 6.

• spans offer a safer way to pass built-in arrays to functions. They’re iterable, so you can use them with range-based for statements to conveniently process elements without risking out-of-bounds array accesses. Also, because spans are iterable, you can use them with standard library container-processing algorithms, such as accumulate and sort. We’ll cover spans in this chapter’s objects-natural case study, where you’ll see that they also work with std::array and std::vector.

The key takeaway from reading this chapter is to avoid using pointers, pointer-based arrays and pointer-based strings whenever possible. If you must use them, take advantage of to_array and spans.

Other Concepts Presented in This Chapter

We declare and initialize pointers and demonstrate the pointer operators & and *. In Chapter 5, we performed pass-by-reference with references. Here, we show that pointers also enable pass-by-reference. We demonstrate built-in, pointer-based arrays and their intimate relationship with pointers.

We show how to use const with pointers and the data they point to, and we introduce the sizeof operator to determine the number of bytes that store values of particular fundamental types and pointers. We demonstrate pointer expressions and pointer arithmetic.

C-strings were used widely in older C++ software. This chapter briefly introduces C-strings. You’ll see how to process command-line arguments—a simple task for which C++ still requires you to use both pointer-based C-strings and pointer-based arrays.

7.2 Pointer Variable Declarations and Initialization

Pointer variables contain memory addresses as their values. Usually, a variable directly contains a specific value. A pointer contains the memory address of a variable that, in turn, contains a specific value. In this sense, a variable name directly references a value, and a pointer indirectly references a value as shown in the following diagram:

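[Diagram: a variable name directly referencing a value, and a pointer indirectly referencing the same value.]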

Referencing a value through a pointer as in this diagram is called indirection.

7.2.1 Declaring Pointers

The following declaration declares the variable countPtr to be of type int* (i.e., a pointer to an int value) and is read (right-to-left), “countPtr is a pointer to an int”:

int* countPtr;

This * is not an operator; rather, it indicates that the variable to its right is a pointer. We like to include the letters Ptr in each pointer variable name to make it clear that the variable is a pointer and must be handled accordingly.

7.2.2 Initializing Pointers

Initialize each pointer to nullptr (from C++11) or a memory address. A pointer with the value nullptr “points to nothing” and is known as a null pointer. From this point forward, when we refer to a “null pointer,” we mean a pointer with the value nullptr. Initialize all pointers to prevent pointing to unknown or uninitialized areas of memory.
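The initialization advice above can be sketched as follows. This is a minimal illustration, not code from the chapter's figures; the names isSet and showInitialization are hypothetical helpers introduced only for this example.

```cpp
#include <iostream>

// Returns true if ptr currently points to an object.
// (isSet is a hypothetical helper name used only in this sketch.)
bool isSet(const int* ptr) {
   return ptr != nullptr;
}

// Declares one null pointer and one pointer that holds a variable's address.
void showInitialization() {
   int count{7};
   int* nullPtr{nullptr}; // points to nothing
   int* countPtr{&count}; // holds the address of count

   std::cout << std::boolalpha
      << "nullPtr set? " << isSet(nullPtr)
      << "\ncountPtr set? " << isSet(countPtr) << '\n';
}
```

Calling showInitialization() from main should report false for nullPtr and true for countPtr. Because both pointers are initialized, neither can refer to an unknown area of memory.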

7.2.3 Null Pointers Prior to C++11

In earlier C++ versions, the value specified for a null pointer was 0 or NULL. NULL is defined in several standard library headers to represent the value 0. Initializing a pointer to NULL is equivalent to initializing it to 0, but prior to C++11, 0 was used by convention. The value 0 is the only integer value that can be assigned directly to a pointer variable without first casting the integer to a pointer type (generally via a reinterpret_cast; Section 9.8).

7.3 Pointer Operators

The unary operators & and * create pointer values and “dereference” pointers, respectively. We show how to use these operators in the following sections.

7.3.1 Address (&) Operator

The address operator (&) is a unary operator that obtains the memory address of its operand. For example, assuming the declarations

int y{5}; // declare variable y
int* yPtr{nullptr}; // declare pointer variable yPtr

the following statement assigns the address of the variable y to pointer variable yPtr:

yPtr = &y; // assign address of y to yPtr

Variable yPtr is said to “point to” y. That is, yPtr indirectly references the variable y’s value (5).

The & in the preceding statement is not a reference variable declaration, where & is always preceded by a type name. When declaring a reference, the & is part of the type. In an expression like &y, the & is the address operator.

The following diagram shows a memory representation after the previous assignment:

[Diagram: an arrow from the box for pointer yPtr to the box for variable y, which contains 5.]

The “pointing relationship” is indicated by drawing an arrow from the box that represents the pointer yPtr in memory to the box that represents the variable y in memory.

The following diagram shows another pointer memory representation with int variable y stored at memory location 600000 and pointer variable yPtr at location 500000:

[Diagram: yPtr at location 500000 contains the address 600000; y at location 600000 contains 5.]

The address operator’s operand must be an lvalue—the address operator cannot be applied to literals or to expressions that result in temporary values (like the results of calculations).

7.3.2 Indirection (*) Operator

Applying the unary * operator to a pointer results in an lvalue representing the object to which its pointer operand points. This operator is commonly referred to as the indirection operator or dereferencing operator. If yPtr points to y and y contains 5 (as in the preceding diagrams), the statement

cout << *yPtr << endl;

displays y’s value (5), as would the statement

cout << y << endl;

Using * in this manner is called dereferencing a pointer. A dereferenced pointer also can be used as an lvalue in an assignment. The following assigns 9 to y:

*yPtr = 9;

In this statement, *yPtr is an alias for y. The dereferenced pointer may also be used to receive an input value as in

cin >> *yPtr;

which places the input value in y.
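As a compact illustration of the two operators working together, the following sketch writes to a variable through a pointer and then reads it back. The function names storeThrough and readThrough are hypothetical, and both functions assume they receive a non-null pointer.

```cpp
// Stores newValue in the int that targetPtr points to.
// Assumes targetPtr is not null (hypothetical helper for this sketch).
void storeThrough(int* targetPtr, int newValue) {
   *targetPtr = newValue; // *targetPtr is an lvalue that names the caller's variable
}

// Reads the int that sourcePtr points to; assumes sourcePtr is not null.
int readThrough(const int* sourcePtr) {
   return *sourcePtr;
}
```

Given int y{5};, the call storeThrough(&y, 9) changes y to 9, and readThrough(&y) then returns 9. The & in each call is the address operator applied to an lvalue, as required.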

Undefined Behaviors

Dereferencing an uninitialized pointer results in undefined behavior that could cause a fatal execution-time error. This also could lead to accidentally modifying important data, allowing the program to run to completion, possibly with incorrect results. This is a potential security flaw that an attacker might be able to exploit to access data, overwrite data or even execute malicious code.⁴,⁵,⁶ Dereferencing a null pointer results in undefined behavior and typically causes a fatal execution-time error. In industrial-strength code, ensure that a pointer is not nullptr before dereferencing it.⁷

4. “Undefined Behavior.” Wikipedia. Wikimedia Foundation, May 30, 2020. https://en.wikipedia.org/wiki/Undefined_behavior.

5. “Common Weakness Enumeration.” CWE. Accessed June 14, 2020. https://cwe.mitre.org/data/definitions/824.html.

6. “Dangling Pointer.” Wikipedia. Wikimedia Foundation, June 8, 2020. https://en.wikipedia.org/wiki/Dangling_pointer.

7. The C++ Core Guidelines recommend using the gsl::not_null class template from the Guidelines Support Library (GSL) to declare pointers that should not have the value nullptr. Throughout this book, we adhere to the C++ Core Guidelines as appropriate. At the time of this writing, the Guidelines Support Library’s gsl::not_null implementation did not produce helpful error messages in our compilers, so we chose not to use gsl::not_null in our code.
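One way to follow the advice about checking for nullptr before dereferencing is sketched below. The function valueOr and its fallback parameter are assumptions made for illustration; the chapter does not define them, and other policies (such as reporting an error) may suit a given program better.

```cpp
// Returns the pointed-to value, or fallback when valuePtr is null.
// (valueOr is a hypothetical helper; it never dereferences a null pointer.)
int valueOr(const int* valuePtr, int fallback) {
   if (valuePtr == nullptr) {
      return fallback; // nothing to dereference
   }

   return *valuePtr; // safe: valuePtr was checked above
}
```

Note that this check guards only against null pointers. It cannot detect an uninitialized or dangling pointer, which is why initializing every pointer matters.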

7.3.3 Using the Address (&) and Indirection (*) Operators

Figure 7.1 demonstrates the & and * pointer operators, which have the third-highest level of precedence (see Appendix A for the complete operator-precedence chart). Memory locations are output by << in this example as hexadecimal (i.e., base-16) integers. (See Appendix D, Number Systems, for more information on hexadecimal integers.) The output shows that variable a’s address (line 10) and aPtr’s value (line 11) are identical, confirming that a’s address was indeed assigned to aPtr (line 8). The outputs from lines 12–13 confirm that *aPtr has the same value as a. The memory addresses output by this program with cout and << are compiler- and platform-dependent, and typically change with each program execution, so you’ll likely see different addresses.

 1    // fig07_01.cpp
 2    // Pointer operators & and *.
 3    #include <iostream>
 4    using namespace std;
 5
 6    int main() {
 7       constexpr int a{7}; // initialize a with 7
 8       const int* aPtr = &a; // initialize aPtr with address of int variable a
 9
10       cout << "The address of a is " << &a
11          << "\nThe value of aPtr is " << aPtr;
12       cout << "\n\nThe value of a is " << a
13          << "\nThe value of *aPtr is " << *aPtr << endl;
14    }

The address of a is 002DFD80
The value of aPtr is 002DFD80

The value of a is 7
The value of *aPtr is 7

Fig. 7.1 Pointer operators & and *.

7.4 Pass-by-Reference with Pointers

There are three ways in C++ to pass arguments to a function:

• pass-by-value

• pass-by-reference with a reference argument

• pass-by-reference with a pointer argument.

Chapter 5 showed the first two. Here, we explain pass-by-reference with a pointer. Pointers, like references, can be used to modify variables in the caller or to pass large data objects by reference to avoid the overhead of copying objects. You accomplish pass-by-reference via pointers and the indirection operator (*). When calling a function that receives a pointer, pass a variable’s address by applying the address operator (&) to the variable’s name.

An Example of Pass-By-Value

Figures 7.2 and 7.3 present two functions that each cube an integer. Figure 7.2 passes variable number by value (line 12) to function cubeByValue (lines 17–19), which cubes its argument and passes the result back to main using a return statement (line 18). We stored the new value in number (line 12), though that is not required. For instance, the calling function might want to examine the function call’s result before modifying variable number.

 1    // fig07_02.cpp
 2    // Pass-by-value used to cube a variable’s value.
 3    #include <iostream>
 4    using namespace std;
 5
 6    int cubeByValue(int n); // prototype
 7
 8    int main() {
 9       int number{5};
10
11       cout << "The original value of number is " << number;
12       number = cubeByValue(number); // pass number by value to cubeByValue
13       cout << "\nThe new value of number is " << number << endl;
14    }
15
16    // calculate and return cube of integer argument
17    int cubeByValue(int n) {
18       return n * n * n; // cube local variable n and return result
19    }

The original value of number is 5
The new value of number is 125

Fig. 7.2 Pass-by-value used to cube a variable’s value.

 1    // fig07_03.cpp
 2    // Pass-by-reference with a pointer argument used to cube a
 3    // variable’s value.
 4    #include <iostream>
 5    using namespace std;
 6
 7    void cubeByReference(int* nPtr); // prototype
 8
 9    int main() {
10       int number{5};
11
12       cout << "The original value of number is " << number;
13       cubeByReference(&number); // pass number address to cubeByReference
14       cout << "\nThe new value of number is " << number << endl;
15    }
16
17    // calculate cube of *nPtr; modifies variable number in main
18    void cubeByReference(int* nPtr) {
19       *nPtr = *nPtr * *nPtr * *nPtr; // cube *nPtr
20    }

The original value of number is 5
The new value of number is 125

Fig. 7.3 Pass-by-reference with a pointer argument used to cube a variable’s value.

An Example of Pass-By-Reference with Pointers

Figure 7.3 passes the variable number to function cubeByReference using pass-by-reference with a pointer argument (line 13); that is, the address of number is passed to the function. Function cubeByReference (lines 18–20) specifies parameter nPtr (a pointer to int) to receive its argument. The function uses the dereferenced pointer (*nPtr, an alias for number in main) to cube the value to which nPtr points (line 19). This directly changes the value of number in main (line 10). Line 19 can be made clearer with redundant parentheses:

*nPtr = (*nPtr) * (*nPtr) * (*nPtr); // cube *nPtr

A function receiving an address as an argument must define a pointer parameter to receive the address. For example, function cubeByReference’s header (line 18) specifies that the function receives a pointer to an int as an argument, stores the address in nPtr and does not return a value.

Insight: Pass-By-Reference with a Pointer Actually Passes the Pointer By Value

Passing a variable by reference with a pointer does not actually pass anything by reference. Rather, a pointer to that variable is passed by value. That pointer value is copied into the function’s corresponding pointer parameter. The called function can then access the caller’s variable by dereferencing the pointer, thus accomplishing pass-by-reference.
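The point that the pointer itself is passed by value can be checked with a small sketch. The function retargetAndWrite is a hypothetical name introduced for this illustration; it assumes both arguments are non-null.

```cpp
// ptr is a copy of the caller's pointer, so reassigning it affects only this copy.
// (retargetAndWrite is a hypothetical function used only in this sketch.)
void retargetAndWrite(int* ptr, int* other) {
   ptr = other; // changes the local copy; the caller's pointer is unaffected
   *ptr = 99;   // writes to the object that other points to
}
```

For example, with int a{1}, b{2}; and int* aPtr{&a};, the call retargetAndWrite(aPtr, &b) leaves aPtr still pointing to a and leaves a equal to 1, while b becomes 99. Only dereferencing the parameter reaches back into the caller.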

Graphical Analysis of Pass-By-Value and Pass-By-Reference

Figures 7.4 and 7.5 graphically analyze the execution of Fig. 7.2 and Fig. 7.3, respectively. The rectangle above a given expression or variable contains the value being produced by a step in the diagram. Each diagram’s right column shows functions cubeByValue (Fig. 7.2) and cubeByReference (Fig. 7.3) only when they’re executing.

Fig. 7.4 Pass-by-value analysis of the program of Fig. 7.2.

Fig. 7.5 Pass-by-reference analysis of the program of Fig. 7.3.

7.5 Built-In Arrays

Here we present built-in arrays, which, like std::arrays, are fixed-size data structures. We include this presentation mostly because you’ll see built-in arrays in legacy C++ code. New applications should use std::array and std::vector to create safer, more robust applications.

In particular, std::array and std::vector objects always know their own size, even when passed to other functions, which is not the case for built-in arrays. If you work on applications containing built-in arrays, you can use C++20’s to_array function to convert them to std::arrays (Section 7.6), or you can process them more safely using C++20’s spans (Section 7.10). There are some cases in which built-in arrays are required, such as receiving command-line arguments, which we demonstrate in Section 7.11.

7.5.1 Declaring and Accessing a Built-In Array

As with std::array, you must specify a built-in array’s element type and number of elements, but the syntax is different. For example, to reserve five elements for a built-in array of ints named c, use

int c[5]; // c is a built-in array of 5 integers

You use the subscript ([]) operator to access a built-in array’s elements. Recall from Chapter 6 that the subscript ([]) operator does not provide bounds checking for std::arrays—this is also true for built-in arrays. Of course, you can use std::array’s at member function to do bounds checking.

7.5.2 Initializing Built-In Arrays

You can initialize the elements of a built-in array using an initializer list. For example,

int n[5]{50, 20, 30, 10, 40};

creates and initializes a built-in array of five ints. If you provide fewer initializers than the number of elements, the remaining elements are value initialized—fundamental numeric types are set to 0, bools are set to false, pointers are set to nullptr and, as we’ll see in Chapter 10, objects receive the default initialization specified by their class definitions. If you provide too many initializers, a compilation error occurs.

The compiler can size a built-in array by counting an initializer list’s elements. For example, the following creates a five-element array:

int n[]{50, 20, 30, 10, 40};
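The two initialization rules above can be verified with a short sketch. The function names are hypothetical, and the element count is computed with the common sizeof(array) / sizeof(array[0]) idiom, which is an addition here rather than something this section introduces.

```cpp
#include <cstddef>

// Sums an array that has fewer initializers than elements.
// (sumPartiallyInitialized is a hypothetical name for this sketch.)
int sumPartiallyInitialized() {
   int n[5]{50, 20}; // the remaining three elements are set to 0
   int total{0};

   for (int value : n) {
      total += value;
   }

   return total;
}

// Reports the number of elements the compiler deduced from the initializer list.
std::size_t deducedLength() {
   int n[]{50, 20, 30, 10, 40}; // compiler counts five initializers
   return sizeof(n) / sizeof(n[0]);
}
```

Here sumPartiallyInitialized() returns 70 because the three unlisted elements contribute 0, and deducedLength() returns 5.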

7.5.3 Passing Built-In Arrays to Functions

The value of a built-in array’s name is implicitly convertible to a const or non-const pointer to the built-in array’s first element—this is known as decaying to a pointer. So the array name n above is equivalent to &n[0], which is a pointer to the element containing 50. You don’t need to take the address (&) of a built-in array to pass it to a function—you simply pass its name. As you saw in Section 7.4, a function that receives a pointer to a variable in the caller can modify that variable in the caller. For built-in arrays, the called function can modify all the elements in the caller—unless the parameter is declared const. Applying const to a built-in array parameter to prevent the argument array in the caller from being modified in the called function is another example of the principle of least privilege.

7.5.4 Declaring Built-In Array Parameters

You can declare a built-in array parameter in a function header, as follows:

int sumElements(const int values[], size_t numberOfElements)

which indicates that the function’s first argument should be a one-dimensional built-in array of ints that should not be modified by the function. Unlike std::arrays and std::vectors, built-in arrays don’t know their own size, so a function that processes a built-in array should also receive the built-in array’s size.

The preceding header also can be written as

int sumElements(const int* values, size_t numberOfElements)

The compiler does not differentiate between a function that receives a pointer and a function that receives a built-in array. In fact, the compiler converts const int values[] to const int* values under the hood. This means the function must “know” when it’s receiving a built-in array vs. a single variable that’s being passed by reference.
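The equivalence of the two headers can be seen in a sketch like the following. The function body is an assumption written for illustration, since only the header appears above. The two prototypes declare the same function, so they do not conflict.

```cpp
#include <cstddef>

// Both prototypes declare the same function; the array form is adjusted to the pointer form.
int sumElements(const int values[], std::size_t numberOfElements);
int sumElements(const int* values, std::size_t numberOfElements);

// Illustrative body: totals the first numberOfElements ints starting at values.
int sumElements(const int* values, std::size_t numberOfElements) {
   int total{0};

   for (std::size_t i{0}; i < numberOfElements; ++i) {
      total += values[i]; // subscripting works on the pointer
   }

   return total;
}
```

Because the parameter is really a pointer, the call sumElements(n, 5) for a five-element array n and the call sumElements(&single, 1) for a lone int named single both compile. The function cannot tell them apart, which is why the size argument must be accurate.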

The C++ Core Guidelines specifically say not to pass built-in arrays to functions;⁸ rather, you should pass C++20 spans because they maintain a pointer to the array’s first element and the array’s size. In Section 7.10, we’ll demonstrate spans and you’ll see that passing a span is superior to passing a built-in array and its size to a function.

8. C++ Core Guidelines. Accessed June 14, 2020. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Ri-array.

7.5.5 C++11: Standard Library Functions begin and end

In Section 6.12, we sorted a std::array of strings called colors as follows:

sort(begin(colors), end(colors)); // sort contents of colors

Functions begin and end specified that the entire std::array should be sorted. Function sort (and many other C++ Standard Library functions) also can be applied to built-in arrays. For example, to sort the built-in array n (Section 7.5.2), you can write

sort(begin(n), end(n)); // sort contents of built-in array n

For a built-in array, begin and end work only in the scope that originally defines the array, which is where the compiler knows the array’s size. Again, you should pass built-in arrays to other functions using C++20 spans, which we demonstrate in Section 7.10.
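The scope restriction on begin and end can be sketched as follows. Both function names are hypothetical. The second function shows the pre-C++20 fallback of passing the size explicitly, which the chapter recommends replacing with spans.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>

// Sorts an array in the scope that defines it, where begin and end know its size.
// (sortLocalArray is a hypothetical name for this sketch.)
bool sortLocalArray() {
   int n[]{50, 20, 30, 10, 40};
   std::sort(std::begin(n), std::end(n));
   return std::is_sorted(std::begin(n), std::end(n));
}

// Here values is only a pointer, so begin(values) and end(values) would not compile.
// The caller must supply the size instead.
void sortThroughPointer(int* values, std::size_t size) {
   std::sort(values, values + size);
}
```

In sortThroughPointer, the correctness of the sort depends entirely on the caller passing the right size. Nothing in the function can check it.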

7.5.6 Built-In Array Limitations

Built-in arrays have several limitations:

• They cannot be compared using the relational and equality operators—you must use a loop to compare two built-in arrays element by element. If you had two int arrays named array1 and array2, the condition array1 == array2 would always be false, even if the arrays’ contents are identical. Remember, array names decay to const pointers to the arrays’ first elements. And, of course, for separate arrays, those elements reside at different memory locations.

• They cannot be assigned to one another—an array name is effectively a const pointer, so it can’t be changed by assignment.

• They don’t know their own size—a function that processes a built-in array typically receives both the built-in array’s name and its size as arguments.

• They don’t provide automatic bounds checking—you must ensure that array-access expressions use subscripts within the built-in array’s bounds.
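The first two limitations are usually worked around with element-by-element operations. The sketch below uses the standard library algorithms std::equal and std::copy in place of hand-written loops. That substitution is a choice made for this example, and the function names are hypothetical.

```cpp
#include <algorithm>
#include <iterator>

// Compares two built-in arrays by contents rather than by address.
bool sameContents() {
   int array1[]{1, 2, 3};
   int array2[]{1, 2, 3};

   // array1 == array2 would compare the addresses of the first elements
   return std::equal(std::begin(array1), std::end(array1), std::begin(array2));
}

// Copies one built-in array into another of the same size.
int copyElements() {
   int source[]{1, 2, 3};
   int destination[3]{};

   // destination = source; would not compile
   std::copy(std::begin(source), std::end(source), std::begin(destination));
   return destination[2];
}
```

Here sameContents() returns true and copyElements() returns 3. With std::array, the operators == and = handle both tasks directly.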

7.6 C++20: Using to_array to Convert a Built-in Array to a std::array

In industry, you’ll encounter C++ legacy code that uses built-in arrays. The C++ Core Guidelines say you should prefer std::arrays and std::vectors to built-in arrays because they’re safer, and they do not become pointers when you pass them to functions.⁹ C++20’s new to_array function¹⁰ (header <array>) makes it convenient to create a std::array from a built-in array or an initializer list. Figure 7.6 demonstrates to_array. We use a generic lambda expression (lines 9–13) to display each std::array’s contents. Again, specifying a lambda parameter’s type as auto enables the compiler to infer the parameter’s type, based on the context in which the lambda appears. In this program, the generic lambda automatically determines the type of the std::array over which it iterates.

9. C++ Core Guidelines. Accessed June 14, 2020. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rsl-arrays.

10. “to_array From LFTS with Updates.” to_array from LFTS with updates—HackMD. Accessed June 14, 2020. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0325r4.html.

 1    // fig07_06.cpp
 2    // C++20: Creating std::arrays with to_array.
 3    #include <iostream>
 4    #include <array>
 5    using namespace std;
 6
 7    int main() {
 8       // lambda to display a collection of items
 9       const auto display = [](const auto& items) {
10          for (const auto& item : items) {
11             cout << item << " ";
12          }
13       };
14

Fig. 7.6 C++20: Creating std::arrays with to_array.

Using to_array to create a std::array from a Built-In Array

Line 18 creates a three-element std::array of ints by copying the contents of built-in array values1. We use auto to infer the std::array variable’s type and size. If we declare the array’s type and size explicitly and it does not match to_array’s return value, a compilation error occurs. We assign the result to the variable array1. Lines 20 and 21 display the std::array’s size and contents to confirm that it was created correctly.

15       const int values1[3]{10, 20, 30};
16
17       // creating a std::array from a built-in array
18       const auto array1 = to_array(values1);
19
20       cout << "array1.size() = " << array1.size() << "\narray1: ";
21       display(array1); // use lambda to display contents
22

array1.size() = 3
array1: 10 20 30

Using to_array to create a std::array from an Initializer List

Line 24 shows that to_array can create a std::array from an initializer list. Lines 25 and 26 display the array’s size and contents to confirm that it was created correctly.

23       // creating a std::array from an initializer list
24       const auto array2 = to_array({1, 2, 3, 4});
25       cout << "\n\narray2.size() = " << array2.size() << "\narray2: ";
26       display(array2); // use lambda to display contents
27
28       cout << endl;
29

array2.size() = 4
array2: 1 2 3 4

7.7 Using const with Pointers and the Data Pointed To

This section discusses how to combine const with pointer declarations to enforce the principle of least privilege. Chapter 5 explained that pass-by-value copies an argument’s value into a function’s parameter. If the copy is modified in the called function, the original value in the caller does not change. In some instances, even the copy of the argument’s value should not be altered in the called function.

If a value does not (or should not) change in the body of a function to which it’s passed, declare the parameter const. Before using a function, check its function prototype to determine the parameters that it can and cannot modify.

There are four ways to pass a pointer to a function:

• a nonconstant pointer to nonconstant data,

• a nonconstant pointer to constant data (Fig. 7.7),

• a constant pointer to nonconstant data (Fig. 7.8) and

• a constant pointer to constant data (Fig. 7.9).

Each combination provides a different level of access privilege.
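The four combinations can be placed side by side in one sketch. The function demoConstCombinations and the pointer names p1 through p4 are hypothetical. Statements that the compiler would reject are left as comments so that the code compiles.

```cpp
// Declares one pointer of each kind and performs only the operations each allows.
int demoConstCombinations() {
   int x{5};
   int y{10};

   int* p1{&x};             // nonconstant pointer to nonconstant data
   *p1 = 6;                 // allowed: modify the data
   p1 = &y;                 // allowed: point elsewhere

   const int* p2{&x};       // nonconstant pointer to constant data
   p2 = &y;                 // allowed: point elsewhere
   // *p2 = 7;              // error: data is read-only through p2

   int* const p3{&x};       // constant pointer to nonconstant data
   *p3 = 8;                 // allowed: modify the data
   // p3 = &y;              // error: p3 cannot point elsewhere

   const int* const p4{&x}; // constant pointer to constant data
   // *p4 = 9;              // error: data is read-only through p4
   // p4 = &y;              // error: p4 cannot point elsewhere

   return *p1 + *p2 + *p3 + *p4; // y + y + x + x
}
```

At the return statement, x is 8 and y is 10, so the function returns 36. Note that const in const int* restricts only access through that pointer. The variable x itself is not const and is still modified through p1 and p3.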

7.7.1 Using a Nonconstant Pointer to Nonconstant Data

The highest privileges are granted by a nonconstant pointer to nonconstant data:

• the data can be modified through the dereferenced pointer, and

• the pointer can be modified to point to other data.

Such a pointer’s declaration (e.g., int* countPtr) does not include const.

7.7.2 Using a Nonconstant Pointer to Constant Data

A nonconstant pointer to constant data is

• a pointer that can be modified to point to any data of the appropriate type, but

• the data to which it points cannot be modified through that pointer.

The declaration for such a pointer places const to the left of the pointer’s type, as in¹¹

const int* countPtr;

11. Some programmers prefer to write this as int const* countPtr;. They’d read this declaration from right to left as “countPtr is a pointer to a constant integer.”

The declaration is read from right to left as “countPtr is a pointer to an integer constant” or, more precisely, “countPtr is a nonconstant pointer to an integer constant.”

Figure 7.7 demonstrates the GNU C++ compilation error produced when you try to modify data via a nonconstant pointer to constant data.

 1    // fig07_07.cpp
 2    // Attempting to modify data through a
 3    // nonconstant pointer to constant data.
 4
 5    int main() {
 6       int y{0};
 7       const int* yPtr{&y};
 8       *yPtr = 100; // error: cannot modify a const object
 9    }

GNU C++ compiler error message:
fig07_07.cpp: In function 'int main()':
fig07_07.cpp:8:10: error: assignment of read-only location '* yPtr'
    8 |    *yPtr = 100; // error: cannot modify a const object
      |    ~~~~~~^~~~~

Fig. 7.7 Attempting to modify data through a nonconstant pointer to const data.

Use pass-by-value to pass fundamental-type arguments (e.g., ints, doubles, etc.) unless the called function must directly modify the value in the caller. This is another example of the principle of least privilege. If large objects do not need to be modified by a called function, pass them using references to constant data or using pointers to constant data, though references are preferred. This gives the performance benefits of pass-by-reference and avoids the copy overhead of pass-by-value. Passing large objects using references to constant data or pointers to constant data also offers the security of pass-by-value.

7.7.3 Using a Constant Pointer to Nonconstant Data

A constant pointer to nonconstant data is a pointer that

• always points to the same memory location, but

• can be used to modify the data at that location.

Pointers that are declared const must be initialized when they’re declared, but if the pointer is a function parameter, it’s initialized with the pointer that’s passed to the function. Each successive call to the function reinitializes that function parameter.

Figure 7.8 attempts to modify a constant pointer. Line 9 declares pointer ptr to be of type int* const. The declaration is read from right to left as “ptr is a constant pointer to a nonconstant integer.” The pointer is initialized with the address of integer variable x. Line 12 attempts to assign the address of y to ptr, but the compiler generates an error message. No error occurs when line 11 assigns the value 7 to *ptr. The nonconstant value to which ptr points can be modified using the dereferenced ptr, even though ptr itself has been declared const.

 1    // fig07_08.cpp
 2    // Attempting to modify a constant pointer to nonconstant data.
 3
 4    int main() {
 5       int x, y;
 6
 7       // ptr is a constant pointer to an integer that can be modified   
 8       // through ptr, but ptr always points to the same memory location.
 9       int* const ptr{&x}; // const pointer must be initialized          
10
11       *ptr = 7; // allowed: *ptr is not const
12       ptr = &y; // error: ptr is const; cannot assign to it a new address
13    }

Microsoft Visual C++ compiler error message:
error C3892: 'ptr': you cannot assign to a variable that is const

Fig. 7.8 Attempting to modify a constant pointer to nonconstant data.

7.7.4 Using a Constant Pointer to Constant Data

The minimum access privileges are granted by a constant pointer to constant data:

• such a pointer always points to the same memory location, and

• the data at that location cannot be modified via the pointer.

Figure 7.9 declares pointer variable ptr to be of type const int* const (line 12). This declaration is read from right to left as “ptr is a constant pointer to an integer constant.” The figure shows the Apple Clang compiler’s error messages for attempting to modify the data to which ptr points (line 16) and attempting to modify the address stored in the pointer variable (line 17). In line 14, no errors occur, because neither the pointer nor the data it points to is being modified.

 1    // fig07_09.cpp
 2    // Attempting to modify a constant pointer to constant data.
 3    #include <iostream>
 4    using namespace std;
 5
 6    int main() {
 7       int x{5}, y;
 8
 9       // ptr is a constant pointer to a constant integer.
10       // ptr always points to the same location; the integer
11       // at that location cannot be modified.
12       const int* const ptr{&x};
13         
14       cout << *ptr << endl;
15
16       *ptr = 7; // error: *ptr is const; cannot assign new value 
17       ptr = &y; // error: ptr is const; cannot assign new address
18    }

Apple Clang compiler error messages:
fig07_09.cpp:16:9: error: read-only variable is not assignable
   *ptr = 7; // error: *ptr is const; cannot assign new value
   ~~~~ ^
fig07_09.cpp:17:8: error: cannot assign to variable 'ptr' with const-qualified type 'const int *const'
   ptr = &y; // error: ptr is const; cannot assign new address
   ~~~ ^

Fig. 7.9 Attempting to modify a constant pointer to constant data.

7.8 sizeof Operator

The compile-time unary operator sizeof determines the size in bytes of a built-in array or of any other data type, variable or constant during program compilation. When applied to a built-in array’s name, as in line 12 of Fig. 7.10, sizeof returns the total number of bytes in the built-in array as a value of type size_t.12 The computer we used to compile this program stores double variables in 8 bytes of memory. numbers is declared to have 20 elements (line 10), so it uses 160 bytes in memory. Applying sizeof to a pointer parameter (line 20) in a function that receives a built-in array returns the size of the pointer in bytes (4 on the system we used), not the built-in array’s size.

12. This is a mechanical example to demonstrate how sizeof works. If you use static code-analysis tools, such as the C++ Core Guidelines checker in Microsoft Visual Studio, you’ll receive warnings because you should not pass built-in arrays to functions.

 1    // fig07_10.cpp
 2    // Sizeof operator when used on a built-in array's name
 3    // returns the number of bytes in the built-in array.
 4    #include <iostream>
 5    using namespace std;
 6
 7    size_t getSize(double* ptr); // prototype
 8
 9    int main() {
10      double numbers[20]; // 20 doubles; occupies 160 bytes on our system
11
12      cout << "The number of bytes in the array is " << sizeof(numbers);
13
14      cout << "\nThe number of bytes returned by getSize is "
15         << getSize(numbers) << endl;
16    }
17
18    // return size of ptr
19    size_t getSize(double* ptr) {
20       return sizeof(ptr);
21    }

The number of bytes in the array is 160
The number of bytes returned by getSize is 4

Fig. 7.10 sizeof operator when applied to a built-in array’s name returns the number of bytes in the built-in array.

Determining the Sizes of the Fundamental Types, a Built-In Array and a Pointer

Figure 7.11 uses sizeof to calculate the number of bytes used to store various standard data types. The output was produced using the Apple Clang compiler in Xcode. Type sizes are platform dependent. When we run this program on our Windows system, for example, long is 4 bytes and long long is 8 bytes, whereas on our Mac, they’re both 8 bytes. In this example,13 lines 7–15 implicitly initialize each variable to 0 using a C++11 empty initializer list, {}.

13. Line 16 uses const rather than constexpr to prevent a type mismatch compilation error. The name of the built-in array of ints (line 15) decays to a const int*, so we must declare ptr with that type.

 1    // fig07_11.cpp
 2    // sizeof operator used to determine standard data type sizes.
 3    #include <iostream>
 4    using namespace std;
 5
 6    int main() {
 7       constexpr char c{}; // variable of type char
 8       constexpr short s{}; // variable of type short
 9       constexpr int i{}; // variable of type int
10       constexpr long l{}; // variable of type long
11       constexpr long long ll{}; // variable of type long long
12       constexpr float f{}; // variable of type float
13       constexpr double d{}; // variable of type double
14       constexpr long double ld{}; // variable of type long double
15       constexpr int array[20]{}; // built-in array of int
16       const int* const ptr{array}; // constant pointer to const int
17
18       cout << "sizeof c = " << sizeof c
19          << "\tsizeof(char) = " << sizeof(char)
20          << "\nsizeof s = " << sizeof s
21          << "\tsizeof(short) = " << sizeof(short)
22          << "\nsizeof i = " << sizeof i
23          << "\tsizeof(int) = " << sizeof(int)
24          << "\nsizeof l = " << sizeof l
25          << "\tsizeof(long) = " << sizeof(long)
26          << "\nsizeof ll = " << sizeof ll
27          << "\tsizeof(long long) = " << sizeof(long long)
28          << "\nsizeof f = " << sizeof f
29          << "\tsizeof(float) = " << sizeof(float)
30          << "\nsizeof d = " << sizeof d
31          << "\tsizeof(double) = " << sizeof(double)
32          << "\nsizeof ld = " << sizeof ld
33          << "\tsizeof(long double) = " << sizeof(long double)
34          << "\nsizeof array = " << sizeof array
35          << "\nsizeof ptr = " << sizeof ptr << endl;
36    }

sizeof c = 1    sizeof(char) = 1
sizeof s = 2    sizeof(short) = 2
sizeof i = 4    sizeof(int) = 4
sizeof l = 8    sizeof(long) = 8
sizeof ll = 8   sizeof(long long) = 8
sizeof f = 4    sizeof(float) = 4
sizeof d = 8    sizeof(double) = 8
sizeof ld = 16  sizeof(long double) = 16
sizeof array = 80
sizeof ptr = 8

Fig. 7.11 sizeof operator used to determine standard data type sizes.

The number of bytes used to store a particular data type may vary among systems and compilers. When writing programs that depend on data type sizes, always use sizeof to determine the number of bytes used to store the data types.

Operator sizeof can be applied to any expression or type name. When applied to a variable name (which is not a built-in array’s name) or other expression, sizeof returns the number of bytes used to store the corresponding type. Parentheses are required only if sizeof’s operand is a type name (e.g., int); they are optional when the operand is an expression. Remember that sizeof is a compile-time operator, so its operand is not evaluated at runtime.

7.9 Pointer Expressions and Pointer Arithmetic

C++ enables pointer arithmetic—arithmetic operations that may be performed on pointers. This section describes the operators that have pointer operands and how these operators are used with pointers.

Pointer arithmetic is appropriate only for pointers that point to built-in array elements. You’re likely to encounter pointer arithmetic in legacy code. However, the C++ Core Guidelines indicate that a pointer should refer only to a single object (not an array),14 and that you should not use pointer arithmetic because it’s highly error-prone.15 If you need to process built-in arrays, use C++20 spans instead (Section 7.10).

14. C++ Core Guidelines. Accessed June 14, 2020. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Res-ptr.

15. C++ Core Guidelines. Accessed June 14, 2020. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#SS-bounds.

Valid pointer arithmetic operations are:

• incrementing (++) or decrementing (--),

• adding an integer to a pointer (+ or +=) or subtracting an integer from a pointer (- or -=), and

• subtracting one pointer from another of the same type.

Subtracting pointers is appropriate only for two pointers that point to elements of the same built-in array.

Most computers today have four-byte (32-bit) or eight-byte (64-bit) integers, though some of the billions of resource-constrained Internet of Things (IoT) devices are built using 8-bit or 16-bit hardware. Integer sizes typically are based on the hardware architecture, so such hardware might use one- or two-byte integers, respectively. The results of pointer arithmetic depend on the size of the memory objects a pointer points to, so pointer arithmetic is machine-dependent.

Assume that int v[5] has been declared and that its first element is at memory location 3000. Assume that pointer vPtr has been initialized to point to v[0] (i.e., the value of vPtr is 3000). The following diagram illustrates this situation for a machine with four-byte integers:

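[Diagram: vPtr points to v[0] at location 3000; with four-byte integers, v[0] through v[4] occupy locations 3000, 3004, 3008, 3012 and 3016.]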

Variable vPtr can be initialized to point to v with either of the following statements (because a built-in array’s name implicitly converts to the address of its zeroth element):

int* vPtr{v};
int* vPtr{&v[0]};

7.9.1 Adding Integers to and Subtracting Integers from Pointers

In conventional arithmetic, the addition 3000 + 2 yields the value 3002. This is normally not the case with pointer arithmetic. Adding an integer to or subtracting an integer from a pointer increments or decrements the pointer by that integer times the size of the type to which the pointer refers. The number of bytes depends on the memory object’s data type. For example, the statement

vPtr += 2;

would produce 3008 (from the calculation 3000 + 2 * 4), assuming that an int is stored in four bytes of memory. In the built-in array v, vPtr would now point to v[2] as in the diagram below:

[Diagram: vPtr now points to v[2] at location 3008.]

If vPtr had been incremented to 3016, which points to v[4], the statement

vPtr -= 4;

would set vPtr back to 3000—the beginning of the built-in array. If a pointer is being incremented or decremented by one, the increment (++) and decrement (--) operators can be used. Each of the statements

++vPtr;
vPtr++;

increments the pointer to point to the built-in array’s next element. Each of the statements

--vPtr;
vPtr--;

decrements the pointer to point to the built-in array’s previous element.

There’s no bounds checking on pointer arithmetic, so the C++ Core Guidelines recommend using std::spans instead, which we demonstrate in Section 7.10. You must ensure that every pointer arithmetic operation that adds an integer to or subtracts an integer from a pointer results in a pointer that references an element within the built-in array’s bounds. As you’ll see, a std::span stores the number of elements it views, so iterating over it stays within bounds, which helps you avoid errors.

7.9.2 Subtracting One Pointer from Another

Pointer variables pointing to the same built-in array may be subtracted from one another. For example, if vPtr contains the address 3000 and v2Ptr contains the address 3008, the statement

x = v2Ptr - vPtr;

would assign to x the number of built-in array elements from vPtr to v2Ptr—in this case, 2. Pointer arithmetic is meaningful only on a pointer that points to a built-in array. We cannot assume that two variables of the same type are stored contiguously in memory unless they’re adjacent elements of a built-in array. Subtracting or comparing two pointers that do not refer to elements of the same built-in array is a logic error.

7.9.3 Pointer Assignment

A pointer can be assigned to another pointer if both pointers are of the same type.16 The exception to this rule is the pointer to void (i.e., void*), which is a pointer capable of representing any pointer type. Any pointer to a fundamental type or class type can be assigned to a pointer of type void* without casting. However, a pointer of type void* cannot be assigned directly to a pointer of another type—the pointer of type void* must first be cast to the proper pointer type (generally via a reinterpret_cast; discussed in Section 9.8).

16. Of course, const pointers cannot be modified.

7.9.4 Cannot Dereference a void*

A void* pointer cannot be dereferenced. For example, the compiler “knows” that an int* points to four bytes of memory on a machine with four-byte integers. Dereferencing an int* creates an lvalue that is an alias for the int’s four bytes in memory. A void*, however, simply contains a memory address for an unknown data type. You cannot dereference a void* because the compiler does not know the type of the data to which the pointer refers and thus not the number of bytes.

The allowed operations on void* pointers are:

• comparing void* pointers with other pointers,

• casting void* pointers to other pointer types and

• assigning addresses to void* pointers.

All other operations on void* pointers are compilation errors.

7.9.5 Comparing Pointers

Pointers can be compared using equality and relational operators. Relational comparisons (<, <=, > and >=) are meaningless unless the pointers point to elements of the same built-in array. Pointer comparisons compare the addresses stored in the pointers. Comparing two pointers pointing to the same built-in array could show, for example, that one pointer points to a higher-numbered element than the other. A common use of pointer comparison is determining whether a pointer has the value nullptr (i.e., a pointer to nothing).

7.10 Objects Natural Case Study: C++20 spans—Views of Contiguous Container Elements

We now continue our objects-natural approach by taking C++20 span objects for a spin. A span (header <span>) enables programs to view contiguous elements of a container, such as a built-in array, a std::array or a std::vector. A span is a “view” into a container: it “sees” the container’s contents but does not have its own copy of the container’s data.

Earlier, we discussed how C++ built-in arrays decay to pointers when passed to functions. In particular, the function’s parameter loses the size information that was provided when you declared the array. You saw this in our sizeof demonstration in Fig. 7.10. The C++ Core Guidelines recommend passing built-in arrays to functions as spans,17 which represent both a pointer to the array’s first element and the array’s size. Figure 7.12 demonstrates some key span capabilities.

17. C++ Core Guidelines. Accessed June 14, 2020. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rr-ap.

 1    // fig07_12.cpp
 2    // C++20 spans: Creating views into containers.
 3    #include <array>
 4    #include <iostream>
 5    #include <numeric>
 6    #include <span>
 7    #include <vector>
 8    using namespace std;
 9

Fig. 7.12 C++20 spans: Creating views into containers.

Function displayArray

Passing a built-in array to a function typically requires both the array’s name and the array’s size. The parameter items (line 12), though declared with [], is simply a pointer to an int; the pointer does not “know” how many elements the function’s argument contains. There are various problems with this approach. For instance, the code that calls displayArray could pass the wrong value for size. In this case, the function might not process all of items’ elements, or the function might access an element outside items’ bounds, which is a logic error and a potential security issue. In addition, we previously discussed the disadvantages of external iteration, as used in lines 13–15. The C++ Core Guidelines checker in Visual Studio issues several warnings about displayArray and passing built-in arrays to functions. We include function displayArray in this example only for comparison with passing spans in function displaySpan, which is the recommended approach.

10   // items parameter is treated as a const int* so we also need the size to
11   // know how to iterate over items with counter-controlled iteration
12   void displayArray(const int items[], size_t size) {
13      for (size_t i{0}; i < size; ++i) {
14         cout << items[i] << " ";
15      }
16   }
17

Function displaySpan

The C++ Core Guidelines indicate that a pointer should point only to one object, not an array,18 and that functions like displayArray, which receive a pointer and a size, are error-prone.19 To fix these issues, you should pass arrays to functions using spans, as in displaySpan (lines 20–24), which receives a span containing const ints because the function does not need to modify the data to display it.

18. C++ Core Guidelines. Accessed June 14, 2020. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Res-ptr.

19. C++ Core Guidelines. Accessed June 14, 2020. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Ri-array.

18  // span parameter contains both the location of the first item
19  // and the number of elements, so we can iterate using range-based for
20  void displaySpan(span<const int> items) {
21     for (const auto& item : items) { // spans are iterable
22        cout << item << " ";
23     }
24  }
25

A span encapsulates both a pointer and a count of the number of contiguous elements. When you pass a built-in array to displaySpan, C++ implicitly creates a span containing a pointer to the array’s first element and the array’s size, which the compiler can determine from the array’s declaration. This span is a view of the data in the original array that you pass as an argument. The C++ Core Guidelines indicate that you can pass a span by value because it’s just as efficient as passing the pointer and size separately,20 as we did in displayArray.

20. C++ Core Guidelines. Accessed June 14, 2020. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rf-range.

A span has many capabilities similar to arrays and vectors, such as iteration via the range-based for statement. Because a span is created based on the array’s original size as determined by the compiler, the range-based for guarantees that we cannot access an element outside the bounds of the array that the span views, thus fixing the various problems associated with displayArray and helping prevent security issues like buffer overflows.

Function times2

A span is a view into an existing container, so changing the span’s elements changes the container’s original data. Function times2 multiplies every item in its span<int> by 2. Note that we use a non-const reference to modify each element that the span views.

26  // spans can be used to modify elements in the original data structure
27  void times2(span<int> items) {
28     for (int& item : items) {
29        item *= 2;
30     }
31  }
32

Passing an Array to a Function to Display the Contents

Lines 34–36 create the int built-in array values1, the std::array values2 and the std::vector values3. Each has five elements and stores its elements contiguously in memory. Line 41 calls displayArray to display values1’s contents. The displayArray function’s first parameter is a pointer to an int, so we cannot use a std::array’s or std::vector’s name to pass these objects to displayArray.

33 int main() {
34    int values1[5]{1, 2, 3, 4, 5};
35    array<int, 5> values2{6, 7, 8, 9, 10};
36    vector<int> values3{11, 12, 13, 14, 15};
37
38    // must specify size because the compiler treats displayArray's items
39    // parameter as a pointer to the first element of the argument
40    cout << "values1 via displayArray: ";
41    displayArray(values1, 5);
42

values1 via displayArray: 1 2 3 4 5

Implicitly Creating spans and Passing Them to Functions

Line 46 calls displaySpan with values1 as an argument. The function’s parameter was declared as

span<const int>

so C++ creates a span containing a const int* that points to the array’s first element and the array’s size, which the compiler gets from the declaration of values1 (line 34). Because spans can view any contiguous sequence of elements, you may also pass a std::array or std::vector of int to displaySpan, and C++ will create an appropriate span representing a pointer to the container’s first element and the container’s size. This makes function displaySpan more flexible than displayArray, which could receive only the built-in array in this example.

43    // compiler knows values' size and automatically creates a span
44    // representing &values1[0] and the array's length
45    cout << "\nvalues1 via displaySpan: ";
46    displaySpan(values1);
47
48    // compiler also can create spans from std::arrays and std::vectors
49    cout << "\nvalues2 via displaySpan: ";
50    displaySpan(values2);
51    cout << "\nvalues3 via displaySpan: ";
52    displaySpan(values3);
53

values1 via displayArray: 1 2 3 4 5
values1 via displaySpan: 1 2 3 4 5
values2 via displaySpan: 6 7 8 9 10
values3 via displaySpan: 11 12 13 14 15

Changing a span’s Elements Modifies the Original Data

As we mentioned, function times2 multiplies its span’s elements by 2. Line 55 calls times2 with values1 as an argument. The function’s parameter was declared as

span<int>

so C++ creates a span containing an int* that points to the array’s first element and the array’s size, which the compiler gets from the declaration of values1 (line 34). To prove that times2 modified the original array’s data, line 57 displays values1’s updated values. Like displaySpan, times2 can be called with this program’s std::array or std::vector as well.

54     // changing a span's contents modifies the original data
55     times2(values1);
56     cout << "\n\nvalues1 after times2 modifies its span argument: ";
57     displaySpan(values1);
58

values1 after times2 modifies its span argument: 2 4 6 8 10

Manually Creating a span and Interacting with It

You can explicitly create spans and interact with them. Line 60 creates a span<int> that views the data in values1. Lines 61–62 demonstrate the span’s front and back member functions, which return the first and last element of the view, and thus, the first and last element of the built-in array values1, respectively.

59     // spans have various array-and-vector-like capabilities
60     span<int> mySpan{values1};
61     cout << "\n\nmySpan's first element: " << mySpan.front()
62        << "\nmySpan's last element: " << mySpan.back();
63

mySpan's first element: 2
mySpan's last element: 10

An essential philosophy of the C++ Core Guidelines is to “prefer compile-time checking to runtime checking.”21 This enables the compiler to find and report errors at compile time, rather than you having to write code to help prevent runtime errors. In line 60, the compiler determines the span’s size (5) from the values1 declaration in line 34. You can state the span’s size, as in

span<int, 5> mySpan{values1};

In this case, the compiler ensures that the span’s declared size matches values1’s size; otherwise, a compilation error occurs.

21. C++ Core Guidelines. Accessed June 14, 2020. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rp-compile-time.

Using a span with the Standard Library’s accumulate Algorithm

As you’ve seen in this example, spans are iterable. This means you also can use the begin and end functions with spans to pass them to C++ standard library algorithms, such as accumulate (line 66) or sort. We cover standard library algorithms in depth in Chapter 18.

64    // spans can be used with standard library algorithms
65    cout << "\n\nSum of mySpan's elements: "
66       << accumulate(begin(mySpan), end(mySpan), 0);
67

Sum of mySpan's elements: 30

Creating Subviews

Sometimes, you might want to process subsets of the data a span views. A span’s first, last and subspan member functions create subviews, which are themselves views. Lines 70 and 72 use first and last to get spans representing the first three and last three elements of values1, respectively. Line 74 uses subspan to get a span that views the 3 elements starting from index 1. In each case, we pass the subview’s span to displaySpan to confirm what the span represents.

68    // spans can be used to create subviews of a container
69    cout << "\n\nFirst three elements of mySpan: ";
70    displaySpan(mySpan.first(3));
71    cout << "\nLast three elements of mySpan: ";
72    displaySpan(mySpan.last(3));
73    cout << "\nMiddle three elements of mySpan: ";
74    displaySpan(mySpan.subspan(1, 3));
75

First three elements of mySpan: 2 4 6
Last three elements of mySpan: 6 8 10
Middle three elements of mySpan: 4 6 8

Changing a Subview’s Elements Modifies the Original Data

A subview of non-const data can modify that data. Line 77 passes to function times2 a span that views the 3 elements starting from index 1 of values1. Line 79 displays the updated values1 elements to confirm the results.

76    // changing a subview's contents modifies the original data
77    times2(mySpan.subspan(1, 3));
78    cout << "\n\nvalues1 after modifying middle three elements via span: ";
79    displaySpan(values1);
80

values1 after modifying middle three elements via span: 2 8 12 16 10

Accessing a View’s Elements Via the [] Operator

As with built-in arrays, std::arrays and std::vectors, you can access and modify span elements via the [] operator. Line 82 displays the element at index 2. Line 85 attempts to access an element that does not exist. On the Microsoft Visual C++ compiler, this results in an exception that displays the message,22 "Expression: span index out of range".

22. At the time of this writing, the draft C++20 standard document makes no mention of the [] operator throwing an exception. Neither GNU C++ nor Apple Clang C++ throws an exception on line 85. They simply display whatever is in that memory location.

81     // access a span element via []
82     cout << "\n\nThe element at index 2 is: " << mySpan[2];
83
84     // attempt to access an element outside the bounds
85     cout << "\n\nThe element at index 10 is: " << mySpan[10] << endl;
86 }

The element at index 2 is: 12

7.11 A Brief Intro to Pointer-Based Strings

We’ve already used the C++ Standard Library string class to represent strings as full-fledged objects. Chapter 8 presents class std::string in detail. This section introduces C-style, pointer-based strings (as defined by the C programming language). Here, we’ll refer to these as C-strings or strings and use std::string when referring to the C++ standard library’s string class.

std::string is preferred because it eliminates many of the security problems and bugs that can be caused by manipulating C-strings. However, there are some cases in which C-strings are required, such as reading in command-line arguments. Also, if you work with legacy C and C++ programs, you’re likely to encounter pointer-based strings. We cover C-strings in detail in Appendix F.

Characters and Character Constants

Characters are the fundamental building blocks of C++ source programs. Every program is composed of characters that, when grouped meaningfully, are interpreted by the compiler as instructions and data used to accomplish a task. A program may contain character constants, each of which is an integer value represented as a character in single quotes. The value of a character constant is the integer value of the character in the machine’s character set. For example, 'z' represents the integer value of z (122 in the ASCII character set; see Appendix B), and '\n' represents the integer value of newline (10 in the ASCII character set).

Pointer-Based Strings

A C-string (also called a pointer-based string) is a built-in array of characters ending with a null character ('\0'), which marks where the string terminates in memory. A C-string is accessed via a pointer to its first character (no matter how long the string is). The result of sizeof for a string literal (which is a C-string) is the length of the string, including the terminating null character.

String Literals as Initializers

A string literal may be used as an initializer in the declaration of either a built-in array of chars or a variable of type const char*. The declarations

char color[]{"blue"};
const char* colorPtr{"blue"};

each initialize a variable to the string "blue". The first declaration creates a five-element built-in array color containing the characters 'b', 'l', 'u', 'e' and '\0'. The second declaration creates pointer variable colorPtr that points to the letter b in the string "blue" (which ends in '\0') somewhere in memory. The first declaration above also may be implemented using an initializer list of individual characters, as in:

char color[]{'b', 'l', 'u', 'e', '\0'};

String literals exist for the duration of the program. They may be shared if the same string literal is referenced from multiple locations in a program. String literals are immutable—they cannot be modified.

Problems with C-Strings

Not allocating sufficient space in a built-in array of chars to store the null character that terminates a string is a logic error. Creating or using a C-string that does not contain a terminating null character can lead to logic errors.

When storing a string of characters in a built-in array of chars, be sure that the built-in array is large enough to hold the largest string that will be stored. C++ allows strings of any length. If a string is longer than the built-in array of chars in which it’s to be stored, characters beyond the end of the built-in array will overwrite subsequent memory locations. This could lead to logic errors, program crashes or security breaches.

Displaying C-Strings

A built-in array of chars representing a null-terminated string can be output with cout and <<. The statement

cout << sentence;

displays the built-in array sentence. cout does not care how large the built-in array of chars is. The characters are output until a terminating null character is encountered; the null character is not displayed. cin and cout assume that built-in arrays of chars should be processed as strings terminated by null characters. cin and cout do not provide similar input and output processing capabilities for other built-in array types.

7.11.1 Command-Line Arguments

There are cases in which built-in arrays and C-strings must be used, such as processing a program’s command-line arguments, which are often passed to applications to specify configuration options, file names to process and more.

You supply command-line arguments to a program by placing them after the program’s name when executing it from the command line. Such arguments typically pass options to a program. For example, on a Windows system, the command

dir /p

uses the /p argument to list the contents of the current directory, pausing after each screen of information. Similarly, on Linux or macOS, the following command uses the -la argument to list the contents of the current directory with details about each file and directory:

ls -la

Command-line arguments are passed into a C++ program as C-strings, and the application name is treated as the first command-line argument. To use the arguments as std::strings or other data types (int, double, etc.), you must convert the arguments to those types. Figure 7.13 displays the number of command-line arguments passed to the program, then displays each argument on a separate line of output.

 1    // fig07_13.cpp
 2    // Reading in command-line arguments.
 3    #include <iostream>
 4    using namespace std;
 5
 6    int main(int argc, char* argv[]) {
 7       cout << "There were " << argc << " command-line arguments:\n";
 8       for (int i{0}; i < argc; ++i) {
 9          cout << argv[i] << endl;
10       }
11    }

fig07_13 Amanda Green 97
There were 4 command-line arguments:
fig07_13
Amanda
Green
97

Fig. 7.13 Reading in command-line arguments.

To receive command-line arguments, declare main with two parameters (line 6), which by convention are named argc and argv, respectively. The first is an int representing the number of arguments. The second is a char* built-in array. The first element of the array is a C-string for the application name. The remaining elements are C-strings for the other command-line arguments.

The command

fig07_13 Amanda Green 97

passes "Amanda", "Green" and "97" to the application fig07_13 (on macOS and Linux you’d run this program with "./fig07_13"). Command-line arguments are separated by white space, not commas. When this command executes, fig07_13’s main function receives the argument count 4 and a four-element array of C-strings:

argv[0] contains the application’s name "fig07_13" (or "./fig07_13" on macOS or Linux), and

argv[1] through argv[3] contain "Amanda", "Green" and "97", respectively.

You determine how to use these arguments in your program.

7.11.2 Revisiting C++20’s to_array Function

Section 7.6 demonstrated converting built-in arrays to std::arrays with to_array. Figure 7.14 shows another purpose of to_array. We use the same lambda expression (lines 9–13) as in Fig. 7.6 to display the std::array contents after each to_array call.

 1    // fig07_14.cpp
 2    // C++20: Creating std::arrays from string literals with to_array.
 3    #include <iostream>
 4    #include <array>
 5    using namespace std;
 6
 7    int main() {
 8    // lambda to display a collection of items
 9    const auto display = [](const auto& items) {
10       for (const auto& item : items) {
11          cout << item << " ";
12       }
13    };
14

Fig. 7.14 C++20: Creating std::arrays from string literals with to_array.

Initializing a std::array from a String Literal Creates a One-Element array

Function to_array fixes an issue with initializing a std::array from a string literal. Rather than creating a std::array of the individual characters in the string literal, line 17 creates a one-element array containing a const char* pointing to the C-string "abc".

15     // initializing an array with a string literal
16     // creates a one-element array<const char*>
17     const auto array1 = array{"abc"};
18     cout << "\n\narray1.size() = " << array1.size() << "\narray1: ";
19     display(array1); // use lambda to display contents
20
array1.size() = 1
array1: abc

Passing a String Literal to to_array Creates a std::array of char

On the other hand, passing a string literal to to_array (line 22) creates a std::array of chars containing elements for each character and the terminating null character. Line 23 confirms that the array’s size is 6. Line 24 confirms the array’s contents. The null character does not have a visual representation, so it does not appear in the output.

21     // creating std::array of characters from a string literal
22     const auto array2 = to_array("C++20");
23     cout << "\n\narray2.size() = " << array2.size() << "\narray2: ";
24     display(array2); // use lambda to display contents
25
26     cout << endl;
27 }
array2.size() = 6
array2: C + + 2 0

7.12 Looking Ahead to Other Pointer Topics

In later chapters, we’ll introduce additional pointer topics:

• In Chapter 13, Object-Oriented Programming: Polymorphism, we’ll use pointers with class objects to show that the “runtime polymorphic processing” associated with object-oriented programming can be performed with references or pointers—you should favor references.

• In Chapter 14, Operator Overloading, we introduce dynamic memory management with pointers, which allows you at execution time to create and destroy objects as needed. Improperly managing this process is a source of subtle errors, such as “memory leaks.” We’ll show how “smart pointers” can automatically manage memory and other resources that should be returned to the operating system when they’re no longer needed.

• In Chapter 18, Standard Library Algorithms, we show that a function’s name is also a pointer to its implementation in memory, and that functions can be passed into other functions via function pointers—exactly as lambda expressions are.

7.13 Wrap-Up

This chapter discussed pointers, built-in pointer-based arrays and pointer-based strings (C-strings). We pointed out Modern C++ guidelines that recommend avoiding most pointers—preferring references to pointers, std::array23 and std::vector objects to built-in arrays, and std::string objects to C-strings.

23. We pronounce “std::” as “standard,” so throughout this chapter we say “a std::array” rather than “an std::array,” which assumes “std::” is pronounced as its individual letters s, t and d.

We declared and initialized pointers and demonstrated the pointer operators & and *. We showed that pointers enable pass-by-reference, but you should generally prefer references for that purpose. We used built-in, pointer-based arrays and showed their intimate relationship with pointers.

We discussed various combinations of const with pointers and the data they point to and used the sizeof operator to determine the number of bytes that store values of particular fundamental types and pointers. We demonstrated pointer expressions and pointer arithmetic.

We briefly discussed C-strings then showed how to process command-line arguments—a simple task for which C++ still requires you to use both pointer-based C-strings and pointer-based arrays.

As a reminder, the key takeaway from reading this chapter is that you should avoid using pointers, pointer-based arrays and pointer-based strings whenever possible. For programs that still use pointer-based arrays, you can use C++20’s to_array function to convert built-in arrays to std::arrays and C++20’s spans as a safer way to process built-in pointer-based arrays. In the next chapter, we discuss typical string-manipulation operations provided by std::string and introduce file-processing capabilities.
