11
C++ Quirks, Oddities, and Incidentals

REFERENCES

Professional C++ code, including much of the code in this book, uses references extensively. It is helpful to step back and think about what exactly references are, and how they behave.

A reference in C++ is an alias for another variable. All modifications to the reference change the value of the variable to which it refers. You can think of references as implicit pointers that save you the trouble of taking the address of variables and dereferencing the pointer. Alternatively, you can think of references as just another name for the original variable. You can create stand-alone reference variables, use reference data members in classes, accept references as parameters to functions and methods, and return references from functions and methods.

Reference Variables

Reference variables must be initialized as soon as they are created, like this:

int x = 3;
int& xRef = x;

After this initialization, xRef is another name for x. Any use of xRef uses the current value of x. Any assignment to xRef changes the value of x. For example, the following code sets x to 10 through xRef:

xRef = 10;

You cannot declare a reference variable outside of a class without initializing it:

int& emptyRef; // DOES NOT COMPILE!

You cannot create a reference to an unnamed value, such as an integer literal, unless the reference is to a const value. In the following example, unnamedRef1 does not compile because it is a non-const reference to a constant. That would mean you could change the value of the constant, 5, which doesn’t make sense. unnamedRef2 works because it’s a const reference, so you cannot for example write “unnamedRef2 = 7”.

int& unnamedRef1 = 5;       // DOES NOT COMPILE
const int& unnamedRef2 = 5; // Works as expected

The same holds for temporary objects. You cannot have a non-const reference to a temporary object, but a const reference is fine. For example, suppose you have the following function returning an std::string object:

std::string getString() { return "Hello world!"; }

You can have a const reference to the result of calling getString(), and that const reference keeps the std::string object alive until the reference goes out of scope:

std::string& string1 = getString();       // DOES NOT COMPILE
const std::string& string2 = getString(); // Works as expected
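
As a minimal sketch of this lifetime extension (the helper name lengthViaConstRef() is hypothetical, introduced only for illustration), the temporary returned by getString() remains usable for as long as the const reference is in scope:

```cpp
#include <cstddef>
#include <string>

std::string getString() { return "Hello world!"; }

// Hypothetical helper: binds a const reference to the temporary returned
// by getString() and uses it after the initializing expression has ended.
std::size_t lengthViaConstRef()
{
    const std::string& str = getString(); // Temporary's lifetime is extended
    return str.size();                    // Safe: str is still valid here
}
```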

Modifying References

A reference always refers to the same variable to which it is initialized; references cannot be changed once they are created. This rule leads to some confusing syntax. If you “assign” a variable to a reference when the reference is declared, the reference refers to that variable. However, if you assign a variable to a reference after that, the variable to which the reference refers is changed to the value of the variable being assigned. The reference is not updated to refer to that variable. Here is a code example:

int x = 3, y = 4;
int& xRef = x;
xRef = y; // Changes value of x to 4. Doesn't make xRef refer to y.

You might try to circumvent this restriction by taking the address of y when you assign it:

xRef = &y; // DOES NOT COMPILE!

This code does not compile. The address of y is a pointer, but xRef is declared as a reference to an int, not a reference to a pointer.

Some programmers go even further in their attempts to circumvent the intended semantics of references. What if you assign a reference to a reference? Won’t that make the first reference refer to the variable to which the second reference refers? You might be tempted to try this code:

int x = 3, z = 5;
int& xRef = x;
int& zRef = z;
zRef = xRef; // Assigns values, not references

The final line does not change zRef. Instead, it sets the value of z to 3, because xRef refers to x, which is 3.

References to Pointers and Pointers to References

You can create references to any type, including pointer types. Here is an example of a reference to a pointer to int:

int* intP;
int*& ptrRef = intP;
ptrRef = new int;
*ptrRef = 5;

The syntax is a little strange: you might not be accustomed to seeing * and & right next to each other. However, the semantics are straightforward: ptrRef is a reference to intP, which is a pointer to int. Modifying ptrRef changes intP. References to pointers are rare, but can occasionally be useful, as discussed in the “Reference Parameters” section later in this chapter.

Note that taking the address of a reference gives the same result as taking the address of the variable to which the reference refers. Here is an example:

int x = 3;
int& xRef = x;
int* xPtr = &xRef; // Address of a reference is pointer to value
*xPtr = 100;

This code sets xPtr to point to x by taking the address of a reference to x. Assigning 100 to *xPtr changes the value of x to 100. Writing a comparison “xPtr == xRef” will not compile because of a type mismatch; xPtr is a pointer to an int while xRef is a reference to an int. The comparisons “xPtr == &xRef” and “xPtr == &x” both compile without errors and are both true.

Finally, note that you cannot declare a reference to a reference, or a pointer to a reference. For example, neither “int& &” nor “int&*” is allowed.

Reference Data Members

As Chapter 9 explains, data members of classes can be references. A reference cannot exist without referring to some other variable. Thus, you must initialize reference data members in the constructor initializer, not in the body of the constructor. The following is a quick example:

class MyClass
{
    public:
        MyClass(int& ref) : mRef(ref) {}
    private:
        int& mRef;
};

Consult Chapter 9 for details.
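
To make the aliasing visible, here is a small sketch using a hypothetical Incrementer class and a hypothetical helper function (neither comes from Chapter 9). The reference data member is bound in the constructor initializer, and a change made through it shows up in the caller's variable:

```cpp
// Hypothetical class for illustration: the reference data member mRef is
// bound in the constructor initializer and aliases the caller's variable.
class Incrementer
{
    public:
        explicit Incrementer(int& ref) : mRef(ref) {}
        void increment() { ++mRef; } // Modifies the referred-to variable
    private:
        int& mRef;
};

// Hypothetical helper: returns the value of the local variable after it
// has been modified through the reference data member.
int incrementThroughMember()
{
    int value = 41;
    Incrementer inc(value);
    inc.increment();
    return value; // value itself was changed through mRef
}
```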

Reference Parameters

C++ programmers do not often use stand-alone reference variables or reference data members. The most common use of references is for parameters to functions and methods. Recall that the default parameter-passing semantics are pass-by-value: functions receive copies of their arguments. When those parameters are modified, the original arguments remain unchanged. References allow you to specify pass-by-reference semantics for arguments passed to the function. When you use reference parameters, the function receives references to the function arguments. If those references are modified, the changes are reflected in the original argument variables. For example, here is a simple swap function to swap the values of two ints:

void swap(int& first, int& second)
{
    int temp = first;
    first = second;
    second = temp;
}

You can call it like this:

int x = 5, y = 6;
swap(x, y);

When swap() is called with the arguments x and y, the first parameter is initialized to refer to x, and the second parameter is initialized to refer to y. When swap() modifies first and second, x and y are actually changed.

Just as you can’t initialize normal reference variables with constants, you can’t pass constants as arguments to functions that employ pass-by-non-const-reference:

swap(3, 4); // DOES NOT COMPILE

References from Pointers

A common quandary arises when you have a pointer to something that you need to pass to a function or method that takes a reference. You can “convert” a pointer to a reference in this case by dereferencing the pointer. This action gives you the value to which the pointer points, which the compiler then uses to initialize the reference parameter. For example, you can call swap() like this:

int x = 5, y = 6;
int *xp = &x, *yp = &y;
swap(*xp, *yp);

Pass-by-Reference versus Pass-by-Value

Pass-by-reference is required when you want to modify the parameter and see those changes reflected in the variable passed to the function or method. However, you should not limit your use of pass-by-reference to only those cases. Pass-by-reference avoids copying the arguments to the function, providing two additional benefits in some cases:

  1. Efficiency. Large objects and structs could take a long time to copy. Pass-by-reference passes only a reference to the object or struct into the function.
  2. Correctness. Not all objects allow pass-by-value. Even those that do allow it might not support deep copying correctly. As Chapter 9 explains, objects with dynamically allocated memory must provide a custom copy constructor and copy assignment operator in order to support deep copying.

If you want to leverage these benefits, but do not want to allow the original objects to be modified, you should mark the parameters const, giving you pass-by-const-reference. This topic is covered in detail later in this chapter.

These benefits to pass-by-reference imply that you should use pass-by-value only for simple built-in types like int and double for which you don’t need to modify the arguments. Use pass-by-const-reference or pass-by-reference in all other cases.
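
The guideline can be sketched as follows. The function names totalLength() and doubled() are hypothetical and exist only to contrast the two parameter-passing styles:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Pass-by-const-reference: no copy of the vector is made, and the
// function cannot modify the caller's data.
std::size_t totalLength(const std::vector<std::string>& words)
{
    std::size_t total = 0;
    for (const std::string& word : words) {
        total += word.size();
    }
    return total;
}

// Pass-by-value is fine for a small built-in type such as int.
int doubled(int value)
{
    return value * 2;
}
```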

Reference Return Values

You can also return a reference from a function or method. The main reason to do so is efficiency. Instead of returning a whole object, return a reference to the object to avoid copying it unnecessarily. Of course, you can only use this technique if the object in question continues to exist following the function termination.

Note that if the type you want to return from your function supports move semantics, discussed in Chapter 9, then returning it by value is almost as efficient as returning a reference.

A second reason to return a reference is if you want to be able to assign to the return value directly as an lvalue (the left-hand side of an assignment statement). Several overloaded operators commonly return references. Chapter 9 shows some examples, and you can read about more applications of this technique in Chapter 15.
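
Both reasons can be seen in a short sketch. The Scores class and the helper function below are hypothetical, not taken from Chapter 9 or Chapter 15. The at() method returns a reference to an element that the object owns, so the element outlives the call and the call can appear on the left-hand side of an assignment:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical class for illustration.
class Scores
{
    public:
        explicit Scores(std::size_t count) : mValues(count, 0) {}

        // Returns a reference to an element owned by the object, so the
        // referred-to int outlives the call and can be assigned to.
        int& at(std::size_t index) { return mValues[index]; }

        // const overload: read-only access, still without copying.
        const int& at(std::size_t index) const { return mValues[index]; }
    private:
        std::vector<int> mValues;
};

// Hypothetical helper: assigns through the returned reference.
int assignThroughReturnedReference()
{
    Scores scores(3);
    scores.at(1) = 90;   // The call is used as an lvalue
    return scores.at(1);
}
```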

Rvalue References

An rvalue is anything that is not an lvalue, such as a constant value, or a temporary object or value. Typically, an rvalue is on the right-hand side of an assignment operator. Rvalue references are discussed in detail in Chapter 9, but here is a quick reminder:

// lvalue reference parameter
void handleMessage(std::string& message)
{
    cout << "handleMessage with lvalue reference: " << message << endl;
}

With only this version of handleMessage(), you cannot call it as follows:

handleMessage("Hello World"); // A literal is not an lvalue.

std::string a = "Hello ";
std::string b = "World";
handleMessage(a + b);         // A temporary is not an lvalue.

To allow these kinds of calls, you need a version that accepts an rvalue reference:

// rvalue reference parameter
void handleMessage(std::string&& message)
{
    cout << "handleMessage with rvalue reference: " << message << endl;
}

See Chapter 9 for more details.
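
When both overloads are present, the compiler selects one based on the value category of the argument. The following sketch uses a hypothetical overload pair, whichOverload(), that returns a label instead of printing, so the selection is easy to check:

```cpp
#include <string>

// Hypothetical overload pair that reports which version was selected.
std::string whichOverload(std::string& /*message*/)  { return "lvalue"; }
std::string whichOverload(std::string&& /*message*/) { return "rvalue"; }
```

With std::string variables a and b, whichOverload(a) selects the lvalue reference version, while whichOverload(a + b) and whichOverload("Hello World") both select the rvalue reference version, because their arguments are temporaries.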

Deciding between References and Pointers

References in C++ could be considered redundant: everything you can do with references, you can accomplish with pointers. For example, you could write the earlier shown swap() function like this:

void swap(int* first, int* second)
{
    int temp = *first;
    *first = *second;
    *second = temp;
}

However, this code is more cluttered than the version with references. References make your programs cleaner and easier to understand. They are also safer than pointers: it’s impossible to have a null reference, and you don’t explicitly dereference references, so you can’t encounter any of the dereferencing errors associated with pointers. These arguments, saying that references are safer, are only valid in the absence of any pointers. For example, take the following function that accepts a reference to an int:

void refcall(int& t) { ++t; }

You could declare a pointer and initialize it to point to some random place in memory. Then you could dereference this pointer and pass it as the reference argument to refcall(), as in the following code. This code compiles fine, but its behavior at run time is undefined. It could, for example, cause a crash.

int* ptr = (int*)8;
refcall(*ptr);

Most of the time, you can use references instead of pointers. References to objects even support polymorphism in the same way as pointers to objects. However, there are some use-cases in which you need to use a pointer. One example is when you need to change the location to which it points. Recall that you cannot change the variable to which a reference refers. For example, when you dynamically allocate memory, you need to store the resulting address in a pointer rather than a reference. A second use-case in which you need to use a pointer is when the pointer is optional, that is, when it can be nullptr. Yet another use-case is if you want to store polymorphic types in a container.
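
The point about polymorphism can be illustrated with a small sketch. The Shape and Circle classes and the describe() function are hypothetical names chosen for this example:

```cpp
#include <string>

// Hypothetical class hierarchy for illustration.
class Shape
{
    public:
        virtual ~Shape() = default;
        virtual std::string name() const { return "Shape"; }
};

class Circle : public Shape
{
    public:
        std::string name() const override { return "Circle"; }
};

// The parameter is a reference to the base class, yet the call to name()
// is dispatched on the dynamic type of the object that was passed in.
std::string describe(const Shape& shape)
{
    return shape.name();
}
```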

A way to distinguish between appropriate use of pointers and references in parameters and return types is to consider who owns the memory. If the code receiving the variable becomes the owner and thus becomes responsible for releasing the memory associated with an object, it must receive a pointer to the object. Better yet, it should receive a smart pointer, which is the recommended way to transfer ownership. If the code receiving the variable should not free the memory, it should receive a reference.
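
The following sketch shows one way this ownership rule might look in code, assuming C++14 or later for std::make_unique. The Document and Archive types and both functions are hypothetical:

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical type for illustration.
struct Document
{
    std::string title;
};

// Takes a reference: the caller keeps ownership, this function only reads.
std::size_t titleLength(const Document& doc)
{
    return doc.title.size();
}

// Hypothetical owner: accepting a std::unique_ptr by value makes the
// transfer of ownership explicit in the signature.
class Archive
{
    public:
        void store(std::unique_ptr<Document> doc) { mStored = std::move(doc); }
        bool hasDocument() const { return mStored != nullptr; }
    private:
        std::unique_ptr<Document> mStored;
};

// Returns true if ownership moved from the local pointer to the archive.
bool transferOwnership()
{
    auto doc = std::make_unique<Document>();
    doc->title = "Notes";
    Archive archive;
    archive.store(std::move(doc)); // doc is null after the move
    return doc == nullptr && archive.hasDocument();
}
```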

Consider a function that splits an array of ints into two arrays: one of even numbers and one of odd numbers. The function doesn’t know how many numbers in the source array will be even or odd, so it should dynamically allocate the memory for the destination arrays after examining the source array. It should also return the sizes of the two new arrays. Altogether, there are four items to return: pointers to the two new arrays and the sizes of the two new arrays. Obviously, you must use pass-by-reference. The canonical C way to write the function looks like this:

void separateOddsAndEvens(const int arr[], size_t size, int** odds,
    size_t* numOdds, int** evens, size_t* numEvens)
{
    // Count the number of odds and evens
    *numOdds = *numEvens = 0;
    for (size_t i = 0; i < size; ++i) {
        if (arr[i] % 2 != 0) { // != 0 also handles negative odd numbers
            ++(*numOdds);
        } else {
            ++(*numEvens);
        }
    }

    // Allocate two new arrays of the appropriate size.
    *odds = new int[*numOdds];
    *evens = new int[*numEvens];

    // Copy the odds and evens to the new arrays
    size_t oddsPos = 0, evensPos = 0;
    for (size_t i = 0; i < size; ++i) {
        if (arr[i] % 2 != 0) {
            (*odds)[oddsPos++] = arr[i];
        } else {
            (*evens)[evensPos++] = arr[i];
        }
    }
}

The final four parameters to the function are the “reference” parameters. In order to change the values to which they refer, separateOddsAndEvens() must dereference them, leading to some ugly syntax in the function body. Additionally, when you want to call separateOddsAndEvens(), you must pass the addresses of two pointers so that the function can change the actual pointers, and the addresses of two size_ts so that the function can change the actual size_ts. Note also that the caller is responsible for deleting the two arrays created by separateOddsAndEvens()!

int unSplit[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
int* oddNums = nullptr;
int* evenNums = nullptr;
size_t numOdds = 0, numEvens = 0;

separateOddsAndEvens(unSplit, std::size(unSplit),
    &oddNums, &numOdds, &evenNums, &numEvens);

// Use the arrays...

delete[] oddNums; oddNums = nullptr;
delete[] evenNums; evenNums = nullptr;

If this syntax annoys you (which it should), you can write the same function by using references to obtain true pass-by-reference semantics:

void separateOddsAndEvens(const int arr[], size_t size, int*& odds,
    size_t& numOdds, int*& evens, size_t& numEvens)
{
    numOdds = numEvens = 0;
    for (size_t i = 0; i < size; ++i) {
        if (arr[i] % 2 != 0) { // != 0 also handles negative odd numbers
            ++numOdds;
        } else {
            ++numEvens;
        }
    }

    odds = new int[numOdds];
    evens = new int[numEvens];

    size_t oddsPos = 0, evensPos = 0;
    for (size_t i = 0; i < size; ++i) {
        if (arr[i] % 2 != 0) {
            odds[oddsPos++] = arr[i];
        } else {
            evens[evensPos++] = arr[i];
        }
    }
}

In this case, the odds and evens parameters are references to int*s. separateOddsAndEvens() can modify the int*s that are used as arguments to the function (through the reference), without any explicit dereferencing. The same logic applies to numOdds and numEvens, which are references to size_ts. With this version of the function, you no longer need to pass the addresses of the pointers or size_ts. The reference parameters handle it for you automatically:

separateOddsAndEvens(unSplit, std::size(unSplit),
    oddNums, numOdds, evenNums, numEvens);

Even though using reference parameters is already much cleaner than using pointers, it is recommended that you avoid dynamically allocated arrays as much as possible. For example, by using the Standard Library vector container, the previous separateOddsAndEvens() function can be rewritten to be much safer, more elegant, and much more readable, because all memory allocation and deallocation happens automatically:

void separateOddsAndEvens(const vector<int>& arr,
    vector<int>& odds, vector<int>& evens)
{
    for (int i : arr) {
        if (i % 2 != 0) { // != 0 also handles negative odd numbers
            odds.push_back(i);
        } else {
            evens.push_back(i);
        }
    }
}

This version can be used as follows:

vector<int> vecUnSplit = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
vector<int> odds, evens;
separateOddsAndEvens(vecUnSplit, odds, evens);

Note that you don’t need to deallocate the odds and evens containers; the vector class takes care of this. This version is much easier to use than the versions using pointers or references. The Standard Library vector container is discussed in detail in Chapter 17.

The version using vectors is already much better than the versions using pointers or references, but it’s usually recommended to avoid output parameters as much as possible. If a function needs to return something, you should just return it, instead of using output parameters. Especially since C++11 introduced move semantics, returning something by value from a function is efficient. And now that C++17 has introduced structured bindings (see Chapter 1), it is really convenient to return multiple values from a function.

So, for the separateOddsAndEvens() function, instead of accepting two output vectors, it should simply return a pair of vectors. The std::pair utility class, defined in <utility>, is discussed in detail in Chapter 17, but its use is rather straightforward. Basically, a pair can store two values of two different or equal types. It’s a class template, and it requires two types between the angle brackets to specify the type of both values. A pair can be created using std::make_pair(). Here is the separateOddsAndEvens() function returning a pair of vectors:

pair<vector<int>, vector<int>> separateOddsAndEvens(const vector<int>& arr)
{
    vector<int> odds, evens;
    for (int i : arr) {
        if (i % 2 != 0) { // != 0 also handles negative odd numbers
            odds.push_back(i);
        } else {
            evens.push_back(i);
        }
    }
    return make_pair(std::move(odds), std::move(evens)); // Move, don't copy
}

By using a structured binding, the code to call separateOddsAndEvens() becomes very compact, yet very easy to read and understand:

vector<int> vecUnSplit = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
auto [odds, evens] = separateOddsAndEvens(vecUnSplit);

KEYWORD CONFUSION

Two keywords in C++ appear to cause more confusion than any others: const and static. Both of these keywords have several different meanings, and each of their uses presents subtleties that are important to understand.

The const Keyword

The keyword const is short for “constant” and specifies that something remains unchanged. The compiler enforces this requirement by marking any attempt to change it as an error. Furthermore, when optimizations are enabled, the compiler can take advantage of this knowledge to produce better code. The keyword has two related roles. It can mark variables or parameters, and it can mark methods. This section provides a definitive discussion of these two meanings.

const Variables and Parameters

You can use const to “protect” variables by specifying that they cannot be modified. One important use is as a replacement for #define to define constants. This use of const is its most straightforward application. For example, you could declare the constant PI like this:

const double PI = 3.141592653589793238462;

You can mark any variable const, including global variables and class data members.

You can also use const to specify that parameters to functions or methods should remain unchanged. For example, the following function accepts a const parameter. In the body of the function, you cannot modify the param integer. If you do try to modify it, the compiler will generate an error.

void func(const int param)
{
     // Not allowed to change param...
}

The following subsections discuss two special kinds of const variables or parameters in more detail: const pointers and const references.

const Pointers

When a variable contains one or more levels of indirection via a pointer, applying const becomes trickier. Consider the following lines of code:

int* ip;
ip = new int[10];
ip[4] = 5;

Suppose that you decide to apply const to ip. Set aside your doubts about the usefulness of doing so for a moment, and consider what it means. Do you want to prevent the ip variable itself from being changed, or do you want to prevent the values to which it points from being changed? That is, do you want to prevent the second line or the third line?

In order to prevent the pointed-to values from being modified (as in the third line), you can add the keyword const to the declaration of ip like this:

const int* ip;
ip = new int[10];
ip[4] = 5; // DOES NOT COMPILE!

Now you cannot change the values to which ip points.

An alternative but semantically equivalent way to write this is as follows:

int const* ip;
ip = new int[10];
ip[4] = 5; // DOES NOT COMPILE!

Putting the const before or after the int makes no difference in its functionality.

If you instead want to mark ip itself const (not the values to which it points), you need to write this:

int* const ip = nullptr;
ip = new int[10]; // DOES NOT COMPILE!
ip[4] = 5;        // Compiles, but dereferences a null pointer at run time

Now that ip itself cannot be changed, the compiler requires you to initialize it when you declare it, either with nullptr as in the preceding code or with newly allocated memory as follows:

int* const ip = new int[10];
ip[4] = 5;

You can also mark both the pointer and the values to which it points const like this:

int const* const ip = nullptr;

Here is an alternative but equivalent syntax:

const int* const ip = nullptr;

Although this syntax might seem confusing, there is actually a very simple rule: the const keyword applies to whatever is directly to its left. Consider this line again:

int const* const ip = nullptr;

From left to right, the first const is directly to the right of the word int. Thus, it applies to the int to which ip points. Therefore, it specifies that you cannot change the values to which ip points. The second const is directly to the right of the *. Thus, it applies to the pointer to the int, which is the ip variable. Therefore, it specifies that you cannot change ip (the pointer) itself.

The reason this rule becomes confusing is an exception. That is, the first const can go before the variable like this:

const int* const ip = nullptr;

This “exceptional” syntax is used much more commonly than the other syntax.

You can extend this rule to any number of levels of indirection, as in this example:

const int * const * const * const ip = nullptr;
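
The placement rule can be checked with the standard type traits from <type_traits>. The static_asserts below are a sketch added for illustration; they compile only if each stated property holds:

```cpp
#include <type_traits>

// "const int*" and "int const*" name the same type: pointer to const int.
static_assert(std::is_same<const int*, int const*>::value,
    "const before or after int is equivalent");

// For "const int*", the pointee is const but the pointer itself is not.
static_assert(std::is_const<std::remove_pointer<const int*>::type>::value,
    "pointee is const");
static_assert(!std::is_const<const int*>::value,
    "pointer itself is not const");

// For "int* const", the pointer itself is const but the pointee is not.
static_assert(std::is_const<int* const>::value,
    "pointer itself is const");
static_assert(!std::is_const<std::remove_pointer<int* const>::type>::value,
    "pointee is not const");
```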

const References

const applied to references is usually simpler than const applied to pointers for two reasons. First, references are const by default, in that you can’t change what they refer to. So, there is no need to mark them const explicitly. Second, you can’t create a reference to a reference, so there is usually only one level of indirection with references. The only way to get multiple levels of indirection is to create a reference to a pointer.

Thus, when C++ programmers refer to a “const reference,” they mean something like this:

int z;
const int& zRef = z;
zRef = 4; // DOES NOT COMPILE

By applying const to the int&, you prevent assignment to zRef, as shown. Similar to pointers, const int& zRef is equivalent to int const& zRef. Note, however, that marking zRef const has no effect on z. You can still modify the value of z by changing it directly instead of through the reference.

const references are used most commonly as parameters, where they are quite useful. If you want to pass something by reference for efficiency, but don’t want it to be modifiable, make it a const reference, as in this example:

void doSomething(const BigClass& arg)
{
   // Implementation here
}

const Methods

Chapter 9 explains that you can mark a class method const, which prevents the method from modifying any non-mutable data members of the class. Consult Chapter 9 for an example.
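
As a quick reminder of what that looks like, here is a minimal sketch with a hypothetical Temperature class (not the example from Chapter 9). The const method can update the mutable member but not the ordinary one:

```cpp
// Hypothetical class for illustration.
class Temperature
{
    public:
        explicit Temperature(double celsius) : mCelsius(celsius) {}

        // const method: may be called on const objects and cannot modify
        // non-mutable data members.
        double getCelsius() const
        {
            ++mReadCount;       // Allowed only because mReadCount is mutable
            // mCelsius = 0.0;  // DOES NOT COMPILE inside a const method
            return mCelsius;
        }

        // Non-const method: cannot be called on a const object.
        void setCelsius(double celsius) { mCelsius = celsius; }

        int getReadCount() const { return mReadCount; }
    private:
        double mCelsius;
        mutable int mReadCount = 0;
};
```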

The constexpr Keyword

C++ always had the notion of constant expressions, and in some circumstances constant expressions are required. For example, when defining an array, the size of the array needs to be a constant expression. Because of this restriction, the following piece of code is not valid in C++:

const int getArraySize() { return 32; }

int main()
{
    int myArray[getArraySize()];    // Invalid in C++
    return 0;
}

Using the constexpr keyword, the getArraySize() function can be redefined so that calls to it can appear in constant expressions. Constant expressions are evaluated at compile time!

constexpr int getArraySize() { return 32; }

int main()
{
    int myArray[getArraySize()];    // OK
    return 0;
}

You can even do something like this:

int myArray[getArraySize() + 1];    // OK

Declaring a function as constexpr imposes quite a lot of restrictions on what the function can do because the compiler has to be able to evaluate the function at compile time, and the function is not allowed to have any side effects. Here are a couple of restrictions, although this is not an exhaustive list:

  • The function body shall not contain any goto statements, try catch blocks, uninitialized variables, or variable definitions that are not literal types, and shall not throw any exceptions. It is allowed to call other constexpr functions.
  • The return type of the function shall be a literal type.
  • If the constexpr function is a member of a class, the function cannot be virtual.
  • All the function parameters shall be literal types.
  • A constexpr function cannot be called until it’s defined in the translation unit because the compiler needs to know the complete definition.
  • dynamic_cast() and reinterpret_cast() are not allowed.
  • new and delete expressions are not allowed.
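
Within these restrictions, a constexpr function can still do real work. The sketch below uses a hypothetical factorial() function and assumes C++14 or later, which permits loops and local variables in constexpr functions:

```cpp
// A constexpr function with a loop and a local variable; this form
// assumes C++14 or later.
constexpr unsigned long long factorial(unsigned int n)
{
    unsigned long long result = 1;
    for (unsigned int i = 2; i <= n; ++i) {
        result *= i;
    }
    return result;
}

// Evaluated by the compiler: a wrong value would be a compile-time error.
static_assert(factorial(5) == 120, "factorial(5) should be 120");

// The result can be used where a constant expression is required.
int table[factorial(3)] = {}; // Array of 6 ints
```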

By defining a constexpr constructor, you can create constant expression variables of user-defined types. A constexpr constructor also has a lot of restrictions. Here are some of them:

  • The class cannot have any virtual base classes.
  • All the constructor parameters shall be literal types.
  • The constructor body cannot be a function-try-block (see Chapter 14).
  • The constructor body either shall be explicitly defaulted, or shall satisfy the same requirements as the body of a constexpr function.
  • All data members shall be initialized with constant expressions.

For example, the following Rect class defines a constexpr constructor satisfying the previous requirements. It also defines a constexpr getArea() method that is performing some calculation.

class Rect
{
    public:
        constexpr Rect(size_t width, size_t height)
            : mWidth(width), mHeight(height) {}

        constexpr size_t getArea() const { return mWidth * mHeight; }
    private:
        size_t mWidth, mHeight;
};

Using this class to declare a constexpr object is straightforward:

constexpr Rect r(8, 2);
int myArray[r.getArea()];    // OK

The static Keyword

There are several uses of the keyword static in C++, all seemingly unrelated. Part of the motivation for “overloading” the keyword was attempting to avoid having to introduce new keywords into the language.

static Data Members and Methods

You can declare static data members and methods of classes. static data members, unlike non-static data members, are not part of each object. Instead, there is only one copy of the data member, which exists outside any objects of that class.

static methods are similarly at the class level instead of the object level. A static method does not execute in the context of a specific object.

Chapter 9 provides examples of both static data members and methods.
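
For a quick reminder, the following sketch uses a hypothetical Session class and helper function (not the examples from Chapter 9). The static data member is shared by all objects, and the static method is called on the class rather than on an object:

```cpp
// Hypothetical class for illustration.
class Session
{
    public:
        Session()  { ++sOpenCount; }
        ~Session() { --sOpenCount; }

        // static method: called on the class, has no this pointer.
        static int getOpenCount() { return sOpenCount; }
    private:
        // One copy shared by all Session objects.
        static int sOpenCount;
};

// Definition of the static data member (normally placed in one source file).
int Session::sOpenCount = 0;

// Hypothetical helper: reports the count while two objects are alive.
int countWhileTwoAreOpen()
{
    Session first;
    Session second;
    return Session::getOpenCount();
}
```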

static Linkage

Before covering the use of the static keyword for linkage, you need to understand the concept of linkage in C++. C++ source files are each compiled independently, and the resulting object files are linked together. Each name in a C++ source file, including functions and global variables, has a linkage that is either external or internal. External linkage means that the name is available from other source files. Internal linkage (also called static linkage) means that it is not. By default, functions and global variables have external linkage. However, you can specify internal (or static) linkage by prefixing the declaration with the keyword static. For example, suppose you have two source files: FirstFile.cpp and AnotherFile.cpp. Here is FirstFile.cpp:

void f();

int main()
{
    f();
    return 0;
}

Note that this file provides a prototype for f(), but doesn’t show the definition. Here is AnotherFile.cpp:

#include <iostream>

void f();

void f()
{
    std::cout << "f\n";
}

This file provides both a prototype and a definition for f(). Note that it is legal to write prototypes for the same function in two different files. That’s precisely what the preprocessor does for you if you put the prototype in a header file that you #include in each of the source files. The reason to use header files is that it’s easier to maintain (and keep synchronized) one copy of the prototype. However, for this example, I don’t use a header file.

Each of these source files compiles without error, and the program links fine: because f() has external linkage, main() can call it from a different file.

However, suppose you apply static to the f() prototype in AnotherFile.cpp. Note that you don’t need to repeat the static keyword in front of the definition of f(). As long as it precedes the first instance of the function name, there is no need to repeat it:

#include <iostream>

static void f();

void f()
{
    std::cout << "f\n";
}

Now each of the source files compiles without error, but the linker step fails because f() has internal (static) linkage, making it unavailable from FirstFile.cpp. Some compilers issue a warning when static functions are defined but not used in that source file (implying that they shouldn’t be static, because they’re probably used elsewhere).

An alternative to using static for internal linkage is to employ anonymous namespaces. Instead of marking a variable or function static, wrap it in an unnamed namespace like this:

#include <iostream>

namespace {
    void f();

    void f()
    {
        std::cout << "f\n";
    }
}

Entities in an anonymous namespace can be accessed anywhere following their declaration in the same source file, but cannot be accessed from other source files. These semantics are the same as those obtained with the static keyword.

The extern Keyword

A related keyword, extern, seems like it should be the opposite of static, specifying external linkage for the names it precedes. It can be used that way in certain cases. For example, const variables at namespace scope have internal linkage by default. You can use extern to give them external linkage. However, extern has some complications. When you specify a name as extern, the compiler treats it as a declaration, not a definition. For variables, this means the compiler doesn’t allocate space for the variable. You must provide a separate definition line for the variable without the extern keyword. For example, here is the content of AnotherFile.cpp:

extern int x;
int x = 3;

Alternatively, you can initialize x in the extern line, which then serves as the declaration and definition:

extern int x = 3;

The extern in this case is not very useful, because x has external linkage by default anyway. The real use of extern is when you want to use x from another source file, FirstFile.cpp:

#include <iostream>

extern int x;

int main()
{
    std::cout << x << std::endl;
}

Here, FirstFile.cpp uses an extern declaration so that it can use x. The compiler needs a declaration of x in order to use it in main(). If you declared x without the extern keyword, the compiler would think it’s a definition and would allocate space for x, causing the linkage step to fail (because there are now two x variables in the global scope). With extern, you can make variables globally accessible from multiple source files.

static Variables in Functions

The final use of the static keyword in C++ is to create local variables that retain their values between exits and entrances to their scope. A static variable inside a function is like a global variable that is only accessible from that function. One common use of static variables is to “remember” whether a particular initialization has been performed for a certain function. For example, code that employs this technique might look something like this:

void performTask()
{
    static bool initialized = false;
    if (!initialized) {
        cout << "initializing" << endl;
        // Perform initialization.
        initialized = true;
    }
    // Perform the desired task.
}

However, static variables are confusing, and there are usually better ways to structure your code so that you can avoid them. In this case, you might want to write a class in which the constructor performs the required initialization.

Sometimes, however, they are quite useful. One example is implementing the Meyers’ singleton design pattern, as explained in Chapter 29.
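Returning to the performTask() example, the class-based alternative mentioned above might look roughly like the following sketch. The TaskPerformer class and its member names are hypothetical and are used only for illustration; the point is that the constructor takes over the work that the static flag was guarding.

```cpp
#include <iostream>

// Hypothetical class: the constructor does the one-time setup that
// performTask() previously guarded with a static bool.
class TaskPerformer
{
    public:
        TaskPerformer()
        {
            std::cout << "initializing" << std::endl;
            // Perform initialization.
            mInitialized = true;
        }

        bool isInitialized() const { return mInitialized; }

        void performTask()
        {
            // Perform the desired task; initialization has already happened.
            ++mTaskCount;
        }

        int getTaskCount() const { return mTaskCount; }

    private:
        bool mInitialized = false;
        int mTaskCount = 0;
};
```

With this structure, every TaskPerformer object is initialized exactly once when it is constructed, and performTask() no longer needs to check a flag on each call.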

Order of Initialization of Nonlocal Variables

Before leaving the topic of static data members and global variables, consider the order of initialization of these variables. All global variables and static class data members in a program are initialized before main() begins. The variables in a given source file are initialized in the order they appear in the source file. For example, in the following file, Demo::x is guaranteed to be initialized before y:

class Demo
{
    public:
        static int x;
};
int Demo::x = 3;
int y = 4;

However, C++ provides no specifications or guarantees about the initialization ordering of nonlocal variables in different source files. If you have a global variable x in one source file and a global variable y in another, you have no way of knowing which will be initialized first. Normally, this lack of specification isn’t cause for concern. However, it can be problematic if one global or static variable depends on another. Recall that initialization of objects implies running their constructors. The constructor of one global object might access another global object, assuming that it is already constructed. If these two global objects are declared in two different source files, you cannot count on one being constructed before the other, and you cannot control the order of initialization. This order might not be the same for different compilers or even different versions of the same compiler, and the order might even change when you simply add another file to your project.
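One widely used workaround, which goes beyond what the preceding paragraph covers, is to wrap such an object in a function and rely on the fact that a static local variable is initialized the first time control passes through its declaration. The following is only a sketch under that assumption; getApplicationName() and gNameLength are hypothetical names.

```cpp
#include <cstddef>
#include <string>

// Hypothetical accessor: the static local is constructed the first time
// the function is called, so callers never see an unconstructed object.
const std::string& getApplicationName()
{
    static const std::string name = "Demo";
    return name;
}

// Hypothetical global whose initializer depends on the name. Because it
// goes through the accessor, it does not matter whether this definition
// lives in the same source file as the accessor or in a different one.
const std::size_t gNameLength = getApplicationName().size();
```

The dependency is now expressed as a function call instead of a direct reference to another nonlocal variable, so the order in which the source files are initialized no longer matters for this particular pair.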

Order of Destruction of Nonlocal Variables

Nonlocal variables are destroyed in the reverse order they were initialized. Nonlocal variables in different source files are initialized in an undefined order, which means that the order of destruction is also undefined.

TYPES AND CASTS

The basic types in C++ are reviewed in Chapter 1, while Chapter 8 shows you how to write your own types with classes. This section explores some of the trickier aspects of types: type aliases, type aliases for function pointers, type aliases for pointers to methods and data members, typedefs, and casts.

Type Aliases

A type alias provides a new name for an existing type declaration. You can think of a type alias as syntax for introducing a synonym for an existing type declaration without creating a new type. The following gives a new name, IntPtr, to the int* type declaration:

using IntPtr = int*;

You can use the new type name and the definition it aliases interchangeably. For example, the following two lines are valid:

int* p1;
IntPtr p2;

Variables created with the new type name are completely compatible with those created with the original type declaration. So, it is perfectly valid, given these definitions, to write the following, because they are not just “compatible” types, they are the same type:

p1 = p2;
p2 = p1;

The most common use for type aliases is to provide manageable names when the real type declarations become too unwieldy. This situation commonly arises with templates. For example, Chapter 1 introduces the std::vector from the Standard Library. To declare a vector of strings, you need to declare it as std::vector<std::string>. It’s a templated class, and thus requires you to specify the template parameters any time you want to refer to the type of this vector. Templates are discussed in detail in Chapter 12. For declaring variables, specifying function parameters, and so on, you would have to write std::vector<std::string>:

void processVector(const std::vector<std::string>& vec) { /* omitted */ }

int main()
{
    std::vector<std::string> myVector;
    processVector(myVector);
    return 0;
}

With a type alias, you can create a shorter, more meaningful name:

using StringVector = std::vector<std::string>;

void processVector(const StringVector& vec) { /* omitted */ }

int main()
{
    StringVector myVector;
    processVector(myVector);
    return 0;
}

Type aliases can include the scope qualifiers. The preceding example shows this by including the scope std for StringVector.

The Standard Library uses type aliases extensively to provide shorter names for types. For example, std::string is actually a type alias that looks like this:

using string = basic_string<char>;

Type Aliases for Function Pointers

You don’t normally think about the location of functions in memory, but each function actually lives at a particular address. In C++, you can use functions as data. In other words, you can take the address of a function and use it like you use a variable.

Function pointers are typed according to the parameter types and return type of compatible functions. One way to work with function pointers is to use a type alias. A type alias allows you to assign a type name to the family of functions that have the given characteristics. For example, the following line defines a type called MatchFunction that represents a pointer to any function that has two int parameters and returns a bool:

using MatchFunction = bool(*)(int, int);

Now that this new type exists, you can write a function that takes a MatchFunction as a parameter. For example, the following function accepts two int arrays and their size, as well as a MatchFunction. It iterates through the arrays in parallel and calls the MatchFunction on corresponding elements of both arrays, printing a message if the call returns true. Notice that even though the MatchFunction is passed in as a variable, it can be called just like a regular function:

void findMatches(int values1[], int values2[], size_t numValues,
                 MatchFunction matcher)
{
    for (size_t i = 0; i < numValues; i++) {
        if (matcher(values1[i], values2[i])) {
            cout << "Match found at position " << i <<
                " (" << values1[i] << ", " << values2[i] << ")" << endl;
        }
    }
}

Note that this implementation requires that both arrays have at least numValues elements. To call the findMatches() function, all you need is any function that adheres to the defined MatchFunction type; that is, any function that takes two ints and returns a bool. For example, consider the following function, which returns true if the two parameters are equal:

bool intEqual(int item1, int item2)
{
    return item1 == item2;
}

Because the intEqual() function matches the MatchFunction type, it can be passed as the final argument to findMatches(), as follows:

int arr1[] = { 2, 5, 6, 9, 10, 1, 1 };
int arr2[] = { 4, 4, 2, 9, 0, 3, 4 };
size_t arrSize = std::size(arr1); // Pre-C++17: sizeof(arr1)/sizeof(arr1[0]);
cout << "Calling findMatches() using intEqual():" << endl;
findMatches(arr1, arr2, arrSize, &intEqual);

The intEqual() function is passed into the findMatches() function by taking its address. Technically, the & character is optional—if you omit it and only put the function name, the compiler will know that you mean to take its address. The output is as follows:

Calling findMatches() using intEqual():
Match found at position 3 (9, 9)

The benefit of function pointers lies in the fact that findMatches() is a generic function that compares parallel values in two arrays. As it is used here, it compares based on equality. However, because it takes a function pointer, it could compare based on other criteria. For example, the following function also adheres to the definition of MatchFunction:

bool bothOdd(int item1, int item2)
{
    return item1 % 2 != 0 && item2 % 2 != 0; // != 0 also handles negative odd values
}

The following code calls findMatches() using bothOdd:

cout << "Calling findMatches() using bothOdd():" << endl;
findMatches(arr1, arr2, arrSize, &bothOdd);

The output is as follows:

Calling findMatches() using bothOdd():
Match found at position 3 (9, 9)
Match found at position 5 (1, 3)

By using function pointers, a single function, findMatches(), is customized to different uses based on a parameter, matcher.
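To make the customization point easier to experiment with, here is a small sketch of a variant that returns the matching positions instead of printing them. The collectMatches() and firstGreater() names are hypothetical; MatchFunction is the same alias as before, repeated so that the snippet stands on its own.

```cpp
#include <cstddef>
#include <vector>

using MatchFunction = bool(*)(int, int);

// Hypothetical variant of findMatches() that returns the matching
// positions instead of printing them.
std::vector<std::size_t> collectMatches(const int values1[], const int values2[],
                                        std::size_t numValues, MatchFunction matcher)
{
    std::vector<std::size_t> positions;
    for (std::size_t i = 0; i < numValues; i++) {
        if (matcher(values1[i], values2[i])) {
            positions.push_back(i);
        }
    }
    return positions;
}

// A third criterion: true when the first value is strictly greater.
bool firstGreater(int item1, int item2)
{
    return item1 > item2;
}
```

Called with the arr1 and arr2 arrays from the earlier example and &firstGreater, this returns positions 1, 2, and 4. A lambda expression that captures nothing, such as [](int a, int b) { return a == b; }, converts implicitly to a MatchFunction, so it can be passed as the matcher as well.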

While function pointers in C++ are uncommon, you may need to obtain function pointers in certain cases. Perhaps the most common example of this is when obtaining a pointer to a function in a dynamic link library. The following example obtains a pointer to a function in a Microsoft Windows Dynamic Link Library (DLL). Details of Windows DLLs are outside the scope of this book on platform-independent C++, but it is so important to Windows programmers that it is worth discussing, and it is a good example to explain the details of function pointers in general.

Consider a DLL, hardware.dll, that has a function called Connect(). You would like to load this library only if you need to call Connect(). Loading the library at run-time is done with the Windows LoadLibrary() kernel function:

HMODULE lib = ::LoadLibrary("hardware.dll");

The result of this call is what is called a “library handle” and will be NULL if there is an error. Before you can load the function from the library, you need to know the prototype for the function. Suppose the following is the prototype for Connect(), which returns an integer and accepts three parameters: a Boolean, an integer, and a C-style string.

int __stdcall Connect(bool b, int n, const char* p);

The __stdcall is a Microsoft-specific directive to specify how parameters are passed to the function and how they are cleaned up.

You can now use a type alias to define a name (ConnectFunction) for a pointer to a function with the preceding prototype:

using ConnectFunction = int(__stdcall*)(bool, int, const char*);

Having successfully loaded the library and defined a name for the function pointer, you can get a pointer to the function in the library as follows:

ConnectFunction connect = (ConnectFunction)::GetProcAddress(lib, "Connect");

If this fails, connect will be nullptr. If it succeeds, you can call the loaded function:

connect(true, 3, "Hello world");

A C programmer might think that you need to dereference the function pointer before calling it as follows:

(*connect)(true, 3, "Hello world");

This was true decades ago, but now, every C and C++ compiler is smart enough to know how to automatically dereference a function pointer before calling it.

Type Aliases for Pointers to Methods and Data Members

You can create and use pointers to both variables and functions. Now, consider pointers to class data members and methods. It’s perfectly legitimate in C++ to take the addresses of class data members and methods in order to obtain pointers to them. However, you can’t access a non-static data member or call a non-static method without an object. The whole point of class data members and methods is that they exist on a per-object basis. Thus, when you want to call the method or access the data member via the pointer, you must dereference the pointer in the context of an object. Here is an example using the Employee class introduced in Chapter 1:

Employee employee;
int (Employee::*methodPtr) () const = &Employee::getSalary;
cout << (employee.*methodPtr)() << endl;

Don’t panic at the syntax. The second line declares a variable called methodPtr of type pointer to a non-static const method of Employee that takes no arguments and returns an int. At the same time, it initializes this variable to point to the getSalary() method of the Employee class. This syntax is quite similar to declaring a simple function pointer, except for the addition of Employee:: before the *methodPtr. Note also that the & is required in this case.

The third line calls the getSalary() method (via the methodPtr pointer) on the employee object. Note the use of parentheses surrounding employee.*methodPtr. They are needed because the function call operator () has higher precedence than .*.

The second line can be made easier to read with a type alias:

Employee employee;
using PtrToGet = int (Employee::*) () const;
PtrToGet methodPtr = &Employee::getSalary;
cout << (employee.*methodPtr)() << endl;

Using auto, it can be simplified even further:

Employee employee;
auto methodPtr = &Employee::getSalary;
cout << (employee.*methodPtr)() << endl;

Pointers to methods and data members usually won’t come up in your programs. However, it’s important to keep in mind that you can’t dereference a pointer to a non-static method or data member without an object. Every so often, you may want to try something like passing a pointer to a non-static method to a function such as qsort() that requires a function pointer, which simply won’t work.
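The examples above only show pointers to methods. For completeness, here is a minimal sketch of a pointer to a data member. It uses a hypothetical SimpleEmployee struct with public data members, not the Employee class from Chapter 1, and a hypothetical readField() helper.

```cpp
#include <string>

// Hypothetical stand-in for a class with public data members.
struct SimpleEmployee
{
    std::string name;
    int salary = 0;
    int bonus = 0;
};

// Reads whichever int data member the pointer selects, on the given object.
int readField(const SimpleEmployee& employee, int SimpleEmployee::* field)
{
    return employee.*field;
}
```

A pointer such as &SimpleEmployee::salary does not refer to any particular object. It only identifies which member to access, and the object is supplied at the point of use with .* (or ->* when you have a pointer to the object).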

typedefs

Type aliases were introduced in C++11. Before C++11, you had to use typedefs to accomplish something similar but in a more convoluted way.

Just like a type alias, a typedef provides a new name for an existing type declaration. For example, take the following type alias:

using IntPtr = int*;

Without type aliases, you had to use a typedef which looked as follows:

typedef int* IntPtr;

As you can see, it’s much less readable! The order is reversed, which causes a lot of confusion, even for professional C++ developers. Other than being more convoluted, a typedef behaves almost the same as a type alias. For example, the typedef can be used as follows:

IntPtr p;

Before type aliases were introduced, you also had to use typedefs for function pointers, which is even more convoluted. For example, take the following type alias:

using FunctionType = int (*)(char, double);

Defining the same FunctionType with a typedef looks as follows:

typedef int (*FunctionType)(char, double);

This is more convoluted because the name FunctionType is somewhere in the middle of it.

Type aliases and typedefs are not entirely equivalent. Compared to typedefs, type aliases are more powerful when used with templates, but that is covered in Chapter 12 because it requires more details about templates.

Casts

C++ provides four specific casts: const_cast(), static_cast(), reinterpret_cast(), and dynamic_cast().

The old C-style casts with () still work in C++, and are still used extensively in existing code bases. C-style casts cover all four C++ casts, so they are more error-prone because it’s not always obvious what you are trying to achieve, and you might end up with unexpected results. I strongly recommend you only use the C++ style casts in new code because they are safer and stand out better syntactically in your code.

This section describes the purposes of each C++ cast and specifies when you would use each of them.

const_cast()

const_cast() is the most straightforward of the different casts available. You can use it to add const-ness to a variable, or cast away const-ness of a variable. It is the only cast of the four that is allowed to cast away const-ness. Theoretically, of course, there should be no need for a const cast. If a variable is const, it should stay const. In practice, however, you sometimes find yourself in a situation where a function is specified to take a const variable, which it must then pass to a function that takes a non-const variable. The “correct” solution would be to make const consistent in the program, but that is not always an option, especially if you are using third-party libraries. Thus, you sometimes need to cast away the const-ness of a variable, but you should only do this when you are sure the function you are calling will not modify the object; otherwise, there is no other option than to restructure your program. Here is an example:

extern void ThirdPartyLibraryMethod(char* str);

void f(const char* str)
{
    ThirdPartyLibraryMethod(const_cast<char*>(str));
}

Starting with C++17, there is a helper method called std::as_const(), defined in <utility>, that returns a const reference version of its reference parameter. Basically, as_const(obj) is equivalent to const_cast<const T&>(obj), where T is the type of obj. As you can see, using as_const() is shorter than using const_cast(). Here is an example:

std::string str = "C++";
const std::string& constStr = std::as_const(str);

Watch out when using as_const() in combination with auto. Remember from Chapter 1 that auto strips away reference and const qualifiers! So, the following result variable has type std::string, not const std::string&:

auto result = std::as_const(str);
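The following sketch makes the deduced types visible with static_assert, and shows two ways to keep the const reference when that is what you want. The function name checkAsConstDeduction() is hypothetical, and the code assumes a C++17 compiler.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Returns true if the run-time checks below hold; the deduced types are
// verified at compile time by the static_asserts.
inline bool checkAsConstDeduction()
{
    std::string str = "C++";

    auto copy = std::as_const(str);             // deduces std::string (a copy)
    const auto& constRef = std::as_const(str);  // const std::string&
    decltype(auto) same = std::as_const(str);   // const std::string&

    static_assert(std::is_same_v<decltype(copy), std::string>);
    static_assert(std::is_same_v<decltype(constRef), const std::string&>);
    static_assert(std::is_same_v<decltype(same), const std::string&>);

    copy += "17";  // modifies only the copy, not str
    return str == "C++" && constRef == "C++" && &same == &str;
}
```

Writing const auto& or decltype(auto) instead of plain auto avoids the silent copy.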

static_cast()

You can use static_cast() to perform explicit conversions that are supported directly by the language. For example, if you write an arithmetic expression in which you need to convert an int to a double in order to avoid integer division, use a static_cast(). In this example, it’s enough to only use static_cast() with i, because that makes one of the two operands a double, making sure C++ performs floating point division.

int i = 3;
int j = 4;
double result = static_cast<double>(i) / j;

You can also use static_cast() to perform explicit conversions that are allowed because of user-defined constructors or conversion routines. For example, if class A has a constructor that takes an object of class B, you can convert a B object to an A object with static_cast(). In most situations where you want this behavior, however, the compiler performs the conversion automatically.
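As a small sketch of that second use, the hypothetical classes A and B below are set up so that an A can be constructed from a B. The constructor is marked explicit here, which is one situation in which the compiler does not perform the conversion automatically and a static_cast() is needed.

```cpp
// Hypothetical classes: an A can be built from a B.
class B
{
    public:
        explicit B(int value) : mValue(value) {}
        int getValue() const { return mValue; }
    private:
        int mValue;
};

class A
{
    public:
        // Marked explicit, so the compiler will not convert silently.
        explicit A(const B& b) : mDoubled(b.getValue() * 2) {}
        int getDoubled() const { return mDoubled; }
    private:
        int mDoubled;
};

// Hypothetical function that accepts only an A.
int useA(const A& a) { return a.getDoubled(); }
```

Given B b(21), the call useA(b) does not compile because of the explicit constructor, while useA(static_cast<A>(b)) compiles and returns 42.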

Another use for static_cast() is to perform downcasts in an inheritance hierarchy, as in this example:

class Base
{
    public:
        virtual ~Base() = default;
};

class Derived : public Base
{
    public:
        virtual ~Derived() = default;
};

int main()
{
    Base* b;
    Derived* d = new Derived();
    b = d; // Don't need a cast to go up the inheritance hierarchy
    d = static_cast<Derived*>(b); // Need a cast to go down the hierarchy

    Base base;
    Derived derived;
    Base& br = derived;
    Derived& dr = static_cast<Derived&>(br);
    return 0;
}

These casts work with both pointers and references. They do not work with objects themselves.

Note that these casts using static_cast() do not perform run-time type checking. They allow you to convert any Base pointer to a Derived pointer, or Base reference to a Derived reference, even if the Base really isn’t a Derived at run time. For example, the following code compiles and executes, but using the pointer d can result in potentially catastrophic failure, including memory overwrites outside the bounds of the object.

Base* b = new Base();
Derived* d = static_cast<Derived*>(b);

To perform the cast safely with run-time type checking, use dynamic_cast(), which is explained a little later in this chapter.

static_cast() is not all-powerful. You can’t static_cast() pointers of one type to pointers of another unrelated type. You can’t directly static_cast() objects of one type to objects of another type if there is no converting constructor available. You can’t static_cast() a const type to a non-const type. You can’t static_cast() pointers to ints. Basically, you can’t do anything that doesn’t make sense according to the type rules of C++.

reinterpret_cast()

reinterpret_cast() is a bit more powerful, and concomitantly less safe, than static_cast(). You can use it to perform some casts that are not technically allowed by the C++ type rules, but which might make sense to the programmer in some circumstances. For example, you can cast a reference to one type to a reference to another type, even if the types are unrelated. Similarly, you can cast a pointer type to any other pointer type, even if they are unrelated by an inheritance hierarchy. A common related case is converting a pointer to a void*. That conversion can be done implicitly, so no explicit cast is required. However, casting a void* back to a correctly-typed pointer requires an explicit cast; the example below uses reinterpret_cast(), although static_cast() is also permitted for this particular conversion. A void* pointer is just a pointer to some location in memory. No type information is associated with a void* pointer. Here are some examples:

class X {};
class Y {};

int main()
{
    X x;
    Y y;
    X* xp = &x;
    Y* yp = &y;
    // Need reinterpret cast for pointer conversion from unrelated classes
    // static_cast doesn't work.
    xp = reinterpret_cast<X*>(yp);
    // No cast required for conversion from pointer to void*
    void* p = xp;
    // Explicit cast needed for conversion from void* (static_cast also works)
    xp = reinterpret_cast<X*>(p);
    // Need reinterpret cast for reference conversion from unrelated classes
    // static_cast doesn't work.
    X& xr = x;
    Y& yr = reinterpret_cast<Y&>(x);
    return 0;
}

One use-case for reinterpret_cast() is with binary I/O of trivially copyable types.2 For example, you can write the individual bytes of such types to a file. When you read the file back into memory, you can use reinterpret_cast() to correctly interpret the bytes read from the file.

However, in general, you should be very careful with reinterpret_cast() because it allows you to do conversions without performing any type checking.

dynamic_cast()

dynamic_cast() provides a run-time check on casts within an inheritance hierarchy. You can use it to cast pointers or references. dynamic_cast() checks the run-time type information of the underlying object at run time. If the cast doesn’t make sense, dynamic_cast() returns a null pointer (for the pointer version), or throws an std::bad_cast exception (for the reference version).

For example, suppose you have the following class hierarchy:

class Base
{
    public:
        virtual ~Base() = default;
};

class Derived : public Base
{
    public:
        virtual ~Derived() = default;
};

The following example shows a correct use of dynamic_cast():

Base* b;
Derived* d = new Derived();
b = d;
d = dynamic_cast<Derived*>(b);

The following dynamic_cast() on a reference will cause an exception to be thrown:

Base base;
Derived derived;
Base& br = base;
try {
    Derived& dr = dynamic_cast<Derived&>(br);
} catch (const bad_cast&) {
    cout << "Bad cast!" << endl;
}

Note that you can perform the same casts down the inheritance hierarchy with a static_cast() or reinterpret_cast(). The difference with dynamic_cast() is that it performs run-time (dynamic) type checking, while static_cast() and reinterpret_cast() perform the casting even if they are erroneous.
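The examples above show a successful pointer cast and a failing reference cast. The remaining case, a failing pointer cast that yields a null pointer, might be handled as in the following sketch. The getAnswer() method and the tryGetAnswer() function are hypothetical additions to the Base and Derived hierarchy, which is repeated here so that the snippet stands on its own.

```cpp
class Base
{
    public:
        virtual ~Base() = default;
};

class Derived : public Base
{
    public:
        virtual ~Derived() = default;
        int getAnswer() const { return 42; }
};

// Returns the Derived-specific value, or -1 when b does not
// actually point to a Derived (or is a null pointer).
int tryGetAnswer(Base* b)
{
    if (Derived* d = dynamic_cast<Derived*>(b)) {
        return d->getAnswer();
    }
    return -1;
}
```

Declaring d inside the if condition keeps the possibly null pointer confined to the branch in which it is known to be valid.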

As Chapter 10 discusses, the run-time type information is stored in the vtable of an object. Therefore, in order to use dynamic_cast(), your classes must have at least one virtual method. If your classes don’t have a vtable, trying to use dynamic_cast() will result in a compilation error. Microsoft VC++, for example, gives the following error:

error C2683: 'dynamic_cast' : 'MyClass' is not a polymorphic type.

Summary of Casts

The following table summarizes the casts you should use for different situations.

SITUATION | CAST
Remove const-ness | const_cast()
Explicit cast supported by the language (for example, int to double, int to bool) | static_cast()
Explicit cast supported by user-defined constructors or conversions | static_cast()
Object of one class to object of another (unrelated) class | Can’t be done
Pointer-to-object of one class to pointer-to-object of another class in the same inheritance hierarchy | dynamic_cast() recommended, or static_cast()
Reference-to-object of one class to reference-to-object of another class in the same inheritance hierarchy | dynamic_cast() recommended, or static_cast()
Pointer-to-type to unrelated pointer-to-type | reinterpret_cast()
Reference-to-type to unrelated reference-to-type | reinterpret_cast()
Pointer-to-function to pointer-to-function | reinterpret_cast()

SCOPE RESOLUTION

As a C++ programmer, you need to familiarize yourself with the concept of a scope. Every name in your program, including variable, function, and class names, is in a certain scope. You create scopes with namespaces, function definitions, blocks delimited by curly braces, and class definitions. Variables that are initialized in the initialization statement of for loops are scoped to that for loop and are not visible outside that for loop. Similarly, C++17 introduced initializers for if and switch statements; see Chapter 1. Variables initialized in such initializers are scoped to the if or switch statement and are not visible outside that statement. When you try to access a variable, function, or class, the name is first looked up in the nearest enclosing scope, then the next scope, and so forth, up to the global scope. Any name not in a namespace, function, block delimited by curly braces, or class is assumed to be in the global scope. If it is not found in the global scope, at that point the compiler generates an undefined symbol error.

Sometimes names in scopes hide identical names in other scopes. Other times, the scope you want is not part of the default scope resolution from that particular line in the program. If you don’t want the default scope resolution for a name, you can qualify the name with a specific scope using the scope resolution operator ::. For example, to access a static method of a class, one way is to prefix the method name with the name of the class (its scope) and the scope resolution operator. A second way is to access the static method through an object of that class. The following example demonstrates these options. The example defines a class Demo with a static get() method, a get() function that is globally scoped, and a get() function that is in the NS namespace.

class Demo
{
    public:
        static int get() { return 5; }
};

int get() { return 10; }

namespace NS
{
    int get() { return 20; }
}

The global scope is unnamed, but you can access it specifically by using the scope resolution operator by itself (with no name prefix). The different get() functions can be called as follows. In this example, the code itself is in the main() function, which is always in the global scope:

int main()
{
    auto pd = std::make_unique<Demo>();
    Demo d;
    std::cout << pd->get() << std::endl;    // prints 5
    std::cout << d.get() << std::endl;      // prints 5
    std::cout << NS::get() << std::endl;    // prints 20
    std::cout << Demo::get() << std::endl;  // prints 5
    std::cout << ::get() << std::endl;      // prints 10
    std::cout << get() << std::endl;        // prints 10
    return 0;
}

Note that if the namespace called NS is given as an unnamed namespace, then the following line will give an error about ambiguous name resolution, because you would have a get() defined in the global scope, and another get() defined in the unnamed namespace.

std::cout << get() << std::endl;

The same error occurs if you add the following using clause right before the main() function:

using namespace NS;
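Name hiding, mentioned at the start of this section, can be sketched in the same way. In the following hypothetical example, three variables share the name value, and the scope resolution operator selects the ones that the local variable would otherwise hide.

```cpp
int value = 10;  // global scope

namespace NS
{
    int value = 20;  // hides ::value inside NS

    int sumOfAllValues()
    {
        int value = 30;  // hides both NS::value and ::value
        // Unqualified: the local. NS::value: the namespace-level one.
        // ::value: the global one.
        return value + NS::value + ::value;
    }
}
```

Here NS::sumOfAllValues() returns 60. Without the qualifications, every use of value inside the function would refer to the local variable.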

ATTRIBUTES

Attributes are a mechanism to add optional and/or vendor-specific information into source code. Before attributes were standardized in C++, vendors decided how to specify such information. Examples are __attribute__, __declspec, and so on. Since C++11, there is standardized support for attributes by using the double square brackets syntax [[attribute]].

The C++ standard defines only six standard attributes. One of them, [[carries_dependency]], is a rather exotic attribute and is not discussed further. The others are discussed in the following sections.

[[noreturn]]

[[noreturn]] means that a function never returns control to the call site. Typically, the function either causes some kind of termination (process termination or thread termination), or throws an exception. With this attribute, the compiler can avoid giving certain warnings or errors because it now knows more about the intent of the function. Here is an example:

[[noreturn]] void forceProgramTermination()
{
    std::exit(1);
}

bool isDongleAvailable()
{
    bool isAvailable = false;
    // Check whether a licensing dongle is available...
    return isAvailable;
}

bool isFeatureLicensed(int featureId)
{
    if (!isDongleAvailable()) {
        // No licensing dongle found, abort program execution!
        forceProgramTermination();
    } else {
        bool isLicensed = false;
        // Dongle available, perform license check of the given feature...
        return isLicensed;
    }
}

int main()
{
    bool isLicensed = isFeatureLicensed(42);
}

This code snippet compiles fine without any warnings or errors. However, if you remove the [[noreturn]] attribute, the compiler generates the following warning (output from Visual C++):

warning C4715: 'isFeatureLicensed': not all control paths return a value

[[deprecated]]

[[deprecated]] can be used to mark something as deprecated, which means you can still use it, but its use is discouraged. This attribute accepts an optional argument that can be used to explain the reason of the deprecation, as in this example:

[[deprecated("Unsafe method, please use xyz")]] void func();

If you use this deprecated function, you’ll get a compilation error or warning. For example, GCC gives the following warning:

warning: 'void func()' is deprecated: Unsafe method, please use xyz

[[fallthrough]]

Starting with C++17, you can tell the compiler that a fallthrough in a switch statement is intentional using the [[fallthrough]] attribute. If you don’t specify this attribute for intentional fallthroughs, the compiler might give you a warning. You don’t need to specify the attribute for empty cases. For example:

switch (backgroundColor) {
    case Color::DarkBlue:
        doSomethingForDarkBlue();
        [[fallthrough]];
    case Color::Black:
        // Code is executed for both a dark blue or black background color
        doSomethingForBlackOrDarkBlue();
        break;
    case Color::Red:
    case Color::Green:
        // Code to execute for a red or green background color
        break;
}

[[nodiscard]]

The [[nodiscard]] attribute can be used on a function returning a value to let the compiler issue a warning when that function is used without doing something with the returned value. Here is an example:

[[nodiscard]] int func()
{
    return 42;
}

int main()
{
    func();
    return 0;
}

The compiler issues a warning similar to the following:

warning C4834: discarding return value of function with 'nodiscard' attribute

This feature can, for example, be used for functions that return error codes. By adding the [[nodiscard]] attribute to such functions, the compiler flags any call that ignores the returned error code.
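As a sketch of the error-code use case, the following hypothetical setName() function reports failure through an enumeration. The ErrorCode type and the function itself are illustrative assumptions.

```cpp
#include <string>

enum class ErrorCode { Ok, EmptyName };

// Hypothetical function: callers are expected to inspect the result.
[[nodiscard]] ErrorCode setName(std::string& target, const std::string& name)
{
    if (name.empty()) {
        return ErrorCode::EmptyName;  // target is left untouched
    }
    target = name;
    return ErrorCode::Ok;
}
```

A call such as setName(target, "") written as a statement of its own typically triggers the warning, while comparing the result against ErrorCode::Ok does not. If you really do want to discard the result, casting the call to void, as in static_cast<void>(setName(target, "x")), is the usual way to say so explicitly.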

[[maybe_unused]]

The [[maybe_unused]] attribute can be used to suppress the compiler from issuing a warning when something is unused, as in this example:

int func(int param1, int param2)
{
    return 42;
}

If your compiler warning level is set high enough, this function definition might result in two compiler warnings. For example, Microsoft VC++ gives these warnings:

warning C4100: 'param2': unreferenced formal parameter
warning C4100: 'param1': unreferenced formal parameter

By using the [[maybe_unused]] attribute, you can suppress such warnings:

int func(int param1, [[maybe_unused]] int param2)
{
    return 42;
}

In this case, the second parameter is marked with the attribute suppressing its warning. The compiler now only issues a warning for param1:

warning C4100: 'param1': unreferenced formal parameter
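The attribute is not limited to parameters. One common use, sketched here under the assumption that release builds define NDEBUG, is a local variable that is only read inside an assert(). The function name halve() is made up for this example.

```cpp
#include <cassert>

// Hypothetical function: returns half of an even number.
int halve(int value)
{
    // With NDEBUG defined, assert() expands to nothing, so isEven would
    // otherwise be reported as unused in release builds.
    [[maybe_unused]] const bool isEven = (value % 2 == 0);
    assert(isEven);
    return value / 2;
}
```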

Vendor-Specific Attributes

Most attributes will be vendor-specific extensions. Vendors are advised not to use attributes to change the meaning of the program, but to use them to help the compiler to optimize code or detect errors in code. Because attributes of different vendors could clash, each vendor should qualify its attributes with its own namespace. Here is an example:

[[clang::noduplicate]]

USER-DEFINED LITERALS

C++ has a number of standard literals that you can use in your code. Here are some examples:

  • 'a': character
  • "character array": zero-terminated array of characters, C-style string
  • 3.14f: float floating point value
  • 0xabc: hexadecimal value

However, C++ also allows you to define your own literals. User-defined literals should start with an underscore. The first character following the underscore must be a lowercase letter. Some examples are: _i, _s, _km, _miles, and so on. User-defined literals are implemented by writing literal operators. A literal operator can work in raw or cooked mode. In raw mode, your literal operator receives a sequence of characters, while in cooked mode your literal operator receives a specific interpreted type. For example, take the C++ literal 123. A raw literal operator receives this as a sequence of characters '1', '2', '3'. A cooked literal operator receives this as the integer 123. As another example, take the C++ literal 0x23. A raw operator receives the characters '0', 'x', '2', '3', while a cooked operator receives the integer 35. One last example, take the C++ literal 3.14. A raw operator receives this as '3', '.', '1', '4', while a cooked operator receives the floating point value 3.14.

A cooked-mode literal operator should have either of the following:

  • one parameter of type unsigned long long, long double, char, wchar_t, char16_t, or char32_t to process numeric or character values, or
  • two parameters where the first is a character array and the second is the length of the character array, to process strings (for example, const char* str, size_t len).

As an example, the following implements a cooked literal operator for the user-defined literal _i to define a complex number literal:

std::complex<long double> operator"" _i(long double d)
{
    return std::complex<long double>(0, d);
}

This _i literal can be used as follows:

std::complex<long double> c1 = 9.634_i;
auto c2 = 1.23_i;       // c2 has as type std::complex<long double>

A second example implements a cooked literal operator for a user-defined literal _s to define std::string literals:

std::string operator"" _s(const char* str, size_t len)
{
    return std::string(str, len);
}

This literal can be used as follows:

std::string str1 = "Hello World"_s;
auto str2 = "Hello World"_s;   // str2 has as type std::string

Without the _s literal, the auto type deduction would be const char*:

auto str3 = "Hello World";     // str3 has as type const char*

A raw-mode literal operator requires one parameter of type const char*, a zero-terminated C-style string. The following example defines the literal _i, but using a raw literal operator:

std::complex<long double> operator"" _i(const char* p)
{
    // Implementation omitted; it requires parsing the C-style
    // string and converting it to a complex number.
}

Using this raw-mode literal operator is exactly the same as using the cooked version.
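As a rough illustration of raw mode, the following sketch shows one possible body based on std::strtold() from <cstdlib>. The suffixes _ri and _len are hypothetical names, chosen so that they do not clash with the cooked _i operator shown earlier. Real code would also need to validate its input.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical raw-mode operator: p points to the characters of the
// literal exactly as written in the source code, for example "9.634".
std::complex<long double> operator""_ri(const char* p)
{
    // No error handling: strtold() returns 0 if nothing can be parsed.
    return std::complex<long double>(0, std::strtold(p, nullptr));
}

// Hypothetical raw-mode operator that returns the number of characters
// it receives, to show that the literal arrives uninterpreted.
std::size_t operator""_len(const char* p)
{
    return std::strlen(p);
}

// Hypothetical helper demonstrating both operators.
void rawLiteralDemo()
{
    assert((2.5_ri).imag() == 2.5L);
    assert(123_len == 3);    // '1', '2', '3'
    assert(0x23_len == 4);   // '0', 'x', '2', '3'
    assert(3.14_len == 4);   // '3', '.', '1', '4'
}
```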

Standard User-Defined Literals

C++ defines the following standard user-defined literals. Note that these standard user-defined literals do not start with an underscore.

  • “s” for creating std::strings
    For example: auto myString = "Hello World"s;
    Requires a using namespace std::string_literals;
  • “sv” for creating std::string_views
    For example: auto myStringView = "Hello World"sv;
    Requires a using namespace std::string_view_literals;
  • “h”, “min”, “s”, “ms”, “us”, “ns”, for creating std::chrono::duration time intervals, discussed in Chapter 20
    For example: auto myDuration = 42min;
    Requires a using namespace std::chrono_literals;
  • “i”, “il”, “if” for creating complex numbers, complex<double>, complex<long double>, and complex<float>, respectively
    For example: auto myComplexNumber = 1.3i;
    Requires a using namespace std::complex_literals;

A using namespace std; also makes these standard user-defined literals available.
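The following sketch puts several of these literals together. The helper functions are hypothetical and exist only to make the resulting types and values easy to check. Note that the s suffix means a duration on a number but a std::string on a string literal, so both namespaces can be in scope at the same time.

```cpp
#include <cassert>
#include <chrono>
#include <complex>
#include <string>
#include <string_view>
#include <type_traits>

using namespace std::string_literals;
using namespace std::string_view_literals;
using namespace std::chrono_literals;
using namespace std::complex_literals;

// The types produced by the string and duration literals.
static_assert(std::is_same_v<decltype("abc"s), std::string>);
static_assert(std::is_same_v<decltype("abc"sv), std::string_view>);
static_assert(std::is_same_v<decltype(42min), std::chrono::minutes>);

// Hypothetical helper: adds durations expressed in different units.
std::chrono::seconds totalDuration()
{
    return 1h + 30min + 15s;
}

// Hypothetical helper: "Hello"s is a std::string, so + concatenates.
std::string makeGreeting()
{
    return "Hello"s + " World";
}

// Hypothetical helper: 2.5i is a complex<double> with real part 0.
std::complex<double> makeImaginary()
{
    return 2.5i;
}

// Hypothetical usage.
void standardLiteralsDemo()
{
    assert(totalDuration().count() == 5415);  // 3600 + 1800 + 15
    assert(makeGreeting() == "Hello World");
    assert(makeImaginary().imag() == 2.5);
}
```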

HEADER FILES

Header files are a mechanism for providing an abstract interface to a subsystem or piece of code. One of the trickier parts of using headers is avoiding multiple includes of the same header file and circular references.

For example, suppose A.h includes Logger.h, defining a Logger class, and B.h also includes Logger.h. If you have a source file called App.cpp, which includes both A.h and B.h, you end up with duplicate definitions of the Logger class because the Logger.h header is included through A.h and B.h.

This problem of duplicate definitions can be avoided with a mechanism known as include guards. The following code snippet shows the Logger.h header with include guards. At the beginning of each header file, the #ifndef directive checks to see if a certain key has not been defined. If the key has been defined, the compiler skips to the matching #endif, which is usually placed at the end of the file. If the key has not been defined, the file proceeds to define the key so that a subsequent include of the same file will be skipped.

#ifndef LOGGER_H
#define LOGGER_H

class Logger
{
    // ...
};

#endif // LOGGER_H

Although it is not part of the C++ standard, nearly all compilers these days support the #pragma once directive, which replaces include guards. For example:

#pragma once

class Logger
{
    // ...
};

Another tool for avoiding problems with header files is the forward declaration. If you need to refer to a class but you cannot include its header file (for example, because it relies heavily on the class you are writing), you can tell the compiler that such a class exists without providing a formal definition through the #include mechanism. Of course, you cannot actually use the class in the code because the compiler knows nothing about it, except that the named class will exist after everything is linked together. However, you can still make use of pointers or references to forward-declared classes in your code. You can also declare functions that return such forward-declared classes by value, or that have such forward-declared classes as pass-by-value function parameters. Of course, both the code defining the function and any code calling the function will need to include the right header files that properly define the forward-declared classes.

For example, assume that the Logger class uses another class called Preferences, that keeps track of user settings. The Preferences class may in turn use the Logger class, so you have a circular dependency which cannot be resolved with include guards. You need to make use of forward declarations in such cases. In the following code, the Logger.h header file uses a forward declaration for the Preferences class, and subsequently refers to the Preferences class without including its header file.

#pragma once

#include <string_view>

class Preferences;  // forward declaration

class Logger
{
    public:
        static void setPreferences(const Preferences& prefs);
        static void logError(std::string_view error);
};

It’s recommended to use forward declarations as much as possible in your header files instead of including other headers. This can reduce your compilation and recompilation times, because it breaks dependencies of your header file on other headers. Of course, your implementation file needs to include the correct headers for types that you’ve forward-declared; otherwise, it won’t compile.
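To see how the pieces could fit together, here is a single-file sketch of the Logger and Preferences pair; comments mark where each part would live in a real project. The Preferences class, the getLastMessage() method, and the static data members are assumptions added for this illustration. Only setPreferences() and logError() come from the header shown earlier.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <utility>

// ----- Logger.h: a forward declaration is enough here -----
class Preferences;

class Logger
{
    public:
        static void setPreferences(const Preferences& prefs);
        static void logError(std::string_view error);
        static const std::string& getLastMessage() { return sLastMessage; }
    private:
        static inline std::string sPrefix = "ERROR";
        static inline std::string sLastMessage;
};

// ----- Preferences.h (hypothetical): uses Logger in turn -----
class Preferences
{
    public:
        explicit Preferences(std::string logPrefix)
            : mLogPrefix(std::move(logPrefix))
        {
            if (mLogPrefix.empty()) {
                Logger::logError("Empty log prefix");
            }
        }
        const std::string& getLogPrefix() const { return mLogPrefix; }
    private:
        std::string mLogPrefix;
};

// ----- Logger.cpp: includes both Logger.h and Preferences.h -----
void Logger::setPreferences(const Preferences& prefs)
{
    // Calling a method requires the full definition of Preferences.
    sPrefix = prefs.getLogPrefix();
}

void Logger::logError(std::string_view error)
{
    // Stored instead of printed to keep the sketch easy to check.
    sLastMessage = sPrefix + ": " + std::string(error);
}

// Hypothetical usage.
void forwardDeclarationDemo()
{
    Logger::setPreferences(Preferences("WARN"));
    Logger::logError("disk full");
    assert(Logger::getLastMessage() == "WARN: disk full");
}
```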

To query whether a certain header file exists, C++17 adds the __has_include("filename") and __has_include(<filename>) preprocessor expressions. They evaluate to 1 if the header file is found, and to 0 if it is not. For example, before the <optional> header file was fully approved for C++17, a preliminary version existed in <experimental/optional>. You could use __has_include() to check which of the two header files is available on your system:

#if __has_include(<optional>)
    #include <optional>
#elif __has_include(<experimental/optional>)
    #include <experimental/optional>
#endif

C UTILITIES

There are a few obscure C features that are also available in C++ and which can occasionally be useful. This section examines two of these features: variable-length argument lists and preprocessor macros.

Variable-Length Argument Lists

This section explains the old C-style variable-length argument lists. You need to know how these work because you might find them in legacy code. However, in new code you should use variadic templates for type-safe variable-length argument lists, which are described in Chapter 22.

Consider the C function printf() from <cstdio>. You can call it with any number of arguments:

printf("int %d\n", 5);
printf("String %s and int %d\n", "hello", 5);
printf("Many ints: %d, %d, %d, %d, %d\n", 1, 2, 3, 4, 5);

C/C++ provides the syntax and some utility macros for writing your own functions with a variable number of arguments. These functions usually look a lot like printf(). Although you shouldn’t need this feature very often, occasionally you will run into situations in which it’s quite useful. For example, suppose you want to write a quick-and-dirty debug function that prints strings to stderr if a debug flag is set, but does nothing if the debug flag is not set. Just like printf(), this function should be able to print strings with an arbitrary number of arguments and arbitrary types of arguments. A simple implementation looks like this:

#include <cstdio>
#include <cstdarg>

bool debug = false;

void debugOut(const char* str, ...)
{
    va_list ap;
    if (debug) {
        va_start(ap, str);
        vfprintf(stderr, str, ap);
        va_end(ap);
    }
}

First, note that the prototype for debugOut() contains one typed and named parameter str, followed by ... (an ellipsis), which stands for any number and type of arguments. In order to access these arguments, you must use macros defined in <cstdarg>. You declare a variable of type va_list, and initialize it with a call to va_start(). The second parameter to va_start() must be the rightmost named variable in the parameter list. All functions with variable-length argument lists require at least one named parameter. The debugOut() function simply passes this list to vfprintf() (a standard function in <cstdio>). After the call to vfprintf() returns, debugOut() calls va_end() to terminate the access of the variable argument list. You must always call va_end() after calling va_start() to ensure that the function ends with the stack in a consistent state.

You can use the function in the following way:

debug = true;
debugOut("int %d\n", 5);
debugOut("String %s and int %d\n", "hello", 5);
debugOut("Many ints: %d, %d, %d, %d, %d\n", 1, 2, 3, 4, 5);

Accessing the Arguments

If you want to access the actual arguments yourself, you can use va_arg() to do so. It accepts a va_list as its first argument, and the type of the argument to interpret as its second. Unfortunately, there is no way to know what the end of the argument list is unless you provide an explicit way of doing so. For example, you can make the first parameter a count of the number of parameters. Or, in the case where you have a set of pointers, you may require the last pointer to be nullptr. There are many ways, but they are all burdensome to the programmer.

The following example demonstrates the technique where the caller specifies in the first named parameter how many arguments are provided. The function accepts any number of ints and prints them out:

void printInts(size_t num, ...)
{
    int temp;
    va_list ap;
    va_start(ap, num);
    for (size_t i = 0; i < num; ++i) {
        temp = va_arg(ap, int);
        cout << temp << " ";
    }
    va_end(ap);
    cout << endl;
}

You can call printInts() as follows. Note that the first parameter specifies how many integers will follow:

printInts(5, 5, 4, 3, 2, 1);
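The other convention mentioned earlier, a terminating nullptr, can be sketched in the same way. The function joinStrings() is a hypothetical example that concatenates C-style strings until it reaches the sentinel.

```cpp
#include <cassert>
#include <cstdarg>
#include <string>

// Hypothetical function: concatenates C-style strings until it reaches
// a nullptr sentinel. The first string is a named parameter because
// va_start() needs one.
std::string joinStrings(const char* first, ...)
{
    std::string result;
    va_list ap;
    va_start(ap, first);
    for (const char* s = first; s != nullptr; s = va_arg(ap, const char*)) {
        result += s;
    }
    va_end(ap);
    return result;
}

// Hypothetical usage.
void joinStringsDemo()
{
    // The cast guarantees that the sentinel is passed as a const char*.
    std::string s = joinStrings("a", "b", "c",
                                static_cast<const char*>(nullptr));
    assert(s == "abc");
}
```

If the caller forgets the sentinel, the loop keeps reading past the last argument, which is undefined behavior. The compiler cannot detect this mistake.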

Why You Shouldn’t Use C-Style Variable-Length Argument Lists

Accessing C-style variable-length argument lists is not very safe. There are several risks, as you can see from the printInts() function:

  • You don’t know the number of parameters. In the case of printInts(), you must trust the caller to pass the right number of arguments as the first argument. In the case of debugOut(), you must trust the caller to pass the same number of arguments after the character array as there are formatting codes in the character array.
  • You don’t know the types of the arguments. va_arg() takes a type, which it uses to interpret the value in its current spot. However, you can tell va_arg() to interpret the value as any type. There is no way for it to verify the correct type.
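For comparison, here is a minimal sketch of a type-safe counterpart to printInts(), written as a variadic template with C++17 fold expressions. The name printIntsSafe() is hypothetical, and the function writes to a caller-supplied stream so that its output is easy to check.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <type_traits>

// Hypothetical type-safe counterpart of printInts(). The compiler knows
// the number and the types of the arguments, so no count is needed.
template <typename... Ints>
void printIntsSafe(std::ostream& out, Ints... values)
{
    // Reject anything that is not exactly an int at compile time.
    static_assert((std::is_same_v<Ints, int> && ...),
                  "printIntsSafe() only accepts ints");
    // Fold over the comma operator: prints each value followed by a space.
    ((out << values << " "), ...);
    out << '\n';
}

// Hypothetical usage, checked against a string stream.
void printIntsSafeDemo()
{
    std::ostringstream oss;
    printIntsSafe(oss, 5, 4, 3, 2, 1);
    assert(oss.str() == "5 4 3 2 1 \n");
}
```

A call such as printIntsSafe(oss, 1, "two") fails to compile, whereas the C-style version would silently misinterpret the second argument.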

Preprocessor Macros

You can use the C++ preprocessor to write macros, which are like little functions. Here is an example:

#define SQUARE(x) ((x) * (x)) // No semicolon after the macro definition!

int main()
{
    cout << SQUARE(5) << endl;
    return 0;
}

Macros are a remnant from C that are quite similar to inline functions, except that they are not type-checked, and the preprocessor dumbly replaces any calls to them with their expansions. The preprocessor does not apply true function-call semantics. This behavior can cause unexpected results. For example, consider what would happen if you called the SQUARE macro with 2 + 3 instead of 5, like this:

cout << SQUARE(2 + 3) << endl;

You expect SQUARE to calculate 25, which it does. However, what if you left off some parentheses on the macro definition, so that it looks like this?

#define SQUARE(x) (x * x)

Now, the call to SQUARE(2 + 3) generates 11, not 25! Remember that the macro is dumbly expanded without regard to function-call semantics. This means that any x in the macro body is replaced by 2 + 3, leading to this expansion:

cout << (2 + 3 * 2 + 3) << endl;

Following proper order of operations, this line performs the multiplication first, followed by the additions, generating 11 instead of 25!

Macros can also have a performance impact. Suppose you call the SQUARE macro as follows:

cout << SQUARE(veryExpensiveFunctionCallToComputeNumber()) << endl;

The preprocessor replaces this with the following:

cout << ((veryExpensiveFunctionCallToComputeNumber()) * 
         (veryExpensiveFunctionCallToComputeNumber())) << endl;

Now you are calling the expensive function twice—another reason to avoid macros.

Macros also cause problems for debugging because the code you write is not the code that the compiler sees, or that shows up in your debugger (because of the search-and-replace behavior of the preprocessor). For these reasons, you should avoid macros entirely in favor of inline functions. The details are shown here only because quite a bit of C++ code out there still employs macros. You need to understand them in order to read and maintain that code.
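As a sketch of the recommended alternative, the SQUARE macro can be replaced with an inline function. The call counter and the body of veryExpensiveFunctionCallToComputeNumber() are assumptions added here to make the single evaluation visible.

```cpp
#include <cassert>

// Inline function replacing the SQUARE macro: the argument is evaluated
// exactly once, and its type is checked like any other function call.
inline int square(int x)
{
    return x * x;
}

// Hypothetical stand-in for an expensive function; it counts its calls.
int gCallCount = 0;

int veryExpensiveFunctionCallToComputeNumber()
{
    ++gCallCount;
    return 5;
}

// Hypothetical usage.
void squareDemo()
{
    assert(square(2 + 3) == 25);  // No precedence surprises
    gCallCount = 0;
    assert(square(veryExpensiveFunctionCallToComputeNumber()) == 25);
    assert(gCallCount == 1);      // Called once, not twice
}
```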

SUMMARY

This chapter explained some of the aspects of C++ that generate confusion. By reading this chapter, you learned a plethora of syntax details about C++. Some of the information, such as the details of references, const, scope resolution, the specifics of the C++-style casts, and the techniques for header files, you should use often in your programs. Other information, such as the uses of static and extern, how to write C-style variable-length argument lists, and how to write preprocessor macros, is important to understand, but not information that you should put into use in your programs on a day-to-day basis.

The next chapter starts a discussion on templates allowing you to write generic code.
