Containers in the Standard Library are generic data structures that are useful for storing collections of data. You should rarely need to use a standard C-style array, write a linked list, or design a stack when you use the Standard Library. The containers are implemented as class templates, so you can instantiate them for any type that meets certain basic conditions outlined in the next section. Most of the Standard Library containers, except for array
and bitset
, are flexible in size and automatically grow or shrink to accommodate more or fewer elements. This is a huge benefit compared to the old, standard C-style arrays, which had a fixed size. Because of the fixed-size nature of standard C-style arrays, they are more vulnerable to overruns, which in the simplest cases merely cause the program to crash because data has been corrupted, but in the worst cases allow certain kinds of security attacks. By using Standard Library containers, you ensure that your programs will be less vulnerable to these kinds of problems.
The Standard Library provides 16 containers, divided into four categories.
|
|
Additionally, C++ string
s and streams can also be used as Standard Library containers to a certain degree, and bitset
can be used to store a fixed number of bits.
Everything in the Standard Library is in the std
namespace. The examples in this book usually use the blanket using namespace std;
statement in source files (never use this in header files!), but you can be more selective in your own programs about which symbols from std
to use.
Standard Library containers use value semantics on elements. That is, they store a copy of elements that they are given, assign to elements with the assignment operator, and destroy elements with the destructor. Thus, when you write classes that you intend to use with the Standard Library, you need to make sure they are copyable. When requesting an element from the container, a reference to the stored copy is returned.
If you prefer reference semantics, you can store pointers to elements instead of the elements themselves. When the containers copy a pointer, the result still refers to the same element. An alternative is to store std::reference_wrapper
s in the container. A reference_wrapper
can be created using std::ref()
or std::cref()
, and basically exist to make references copyable. The reference_wrapper
class template, and the ref()
and cref()
function templates are defined in the <functional>
header. An example of this is given in the section “Storing references in a vector.”
It is possible to store move-only types, that is non-copyable types, in a container, but when doing so, some operations on the container might not compile. An example of a move-only type is std::unique_ptr
.
One of the template type parameters for Standard Library containers is a so-called allocator. The container can use this allocator to allocate and deallocate memory for elements. The allocator type parameter has a default value, so you can almost always just ignore it.
Some containers, such as a map
, also accept a comparator as one of the template type parameters. This comparator is used to order elements. It has a default value, so you don’t always have to specify it.
The specific requirements on elements in containers using the default allocator and comparator are shown in the following table.
METHOD | DESCRIPTION | NOTES |
Copy Constructor | Creates a new element that is “equal” to the old one, but that can safely be destructed without affecting the old one. | Used every time you insert an element, except when using an emplace method (discussed later). |
Move Constructor | Creates a new element by moving all content from a source element to the new element. | Used when the source element is an rvalue, and will be destroyed after the construction of the new element; also used when a vector grows in size. The move constructor should be noexcept , otherwise it won’t be used! |
Assignment Operator | Replaces the contents of an element with a copy of the source element. | Used every time you modify an element. |
Move Assignment Operator | Replaces the contents of an element by moving all content from a source element. | Used when the source element is an rvalue, and will be destroyed after the assignment operation. The move assignment operator should be noexcept , otherwise it won’t be used! |
Destructor | Cleans up an element. | Used every time you remove an element, or when a vector grows in size and the elements are not noexcept movable. |
Default Constructor | Constructs an element without any arguments. | Required only for certain operations, such as the vector::resize() method with one argument, and the map::operator[] access. |
operator== |
Compares two elements for equality. | Required for keys in unordered associative containers, and for certain operations, such as operator== on two containers. |
operator< |
Determines if one element is less than another. | Required for keys in ordered associative containers, and for certain operations, such as operator< on two containers. |
Chapter 9 shows you how to write these methods, and discusses move semantics. For move semantics to work properly with Standard Library containers, the move constructor and the move assignment operator must be marked as noexcept
!
The Standard Library containers provide limited error checking. Clients are expected to ensure that their uses are valid. However, some container methods and functions throw exceptions in certain conditions, such as out-of-bounds indexing. Of course, it is impossible to list exhaustively the exceptions that can be thrown from these methods because they perform operations on user-specified types with unknown exception characteristics. This chapter mentions exceptions where appropriate. Consult a Standard Library Reference for a list of possible exceptions thrown from each method.
The Standard Library uses the iterator pattern to provide a generic abstraction for accessing the elements of a container. Each container provides a container-specific iterator, which is a glorified smart pointer that knows how to iterate over the elements of that specific container. The iterators for all the different containers adhere to a specific interface defined by the C++ standard. Thus, even though the containers provide different functionality, the iterators present a common interface to code that wishes to work with elements of the containers.
You can think of an iterator as a pointer to a specific element of the container. Like pointers to elements in an array, iterators can move to the next element with operator++
. Similarly, you can usually use operator*
and operator->
on the iterator to access the actual element or field of the element. Some iterators allow comparison with operator==
and operator!=
, and support operator--
for moving to previous elements.
All iterators must be copy constructible, copy assignable, and destructible. Lvalues of iterators must be swappable. Different containers provide iterators with slightly different additional capabilities. The standard defines five categories of iterators, as summarized in the following table.
ITERATOR CATEGORY | OPERATIONS REQUIRED | COMMENTS |
Input (also known as Read) | operator++ operator* operator-> copy constructor operator= operator== operator!= |
Provides read-only access, forward only (no operator-- to move backward).Iterators can be assigned, copied, and compared for equality. |
Output (also known as Write) | operator++ operator* copy constructor operator= |
Provides write-only access, forward only. Iterators can be assigned, but cannot be compared for equality. Specific to output iterators is that you can do *iter = value .Note the absence of operator-> .Provides both prefix and postfix operator++ .
|
Forward | Capabilities of input iterators, plus default constructor |
Provides read access, forward only. Iterators can be assigned, copied, and compared for equality. |
Bidirectional |
Capabilities of forward iterators, plusoperator-- |
Provides everything a forward iterator provides. Iterators can also move backward to a previous element. Provides both prefix and postfix operator-- .
|
Random Access |
Bidirectional capability, plus the following:operator+ operator- operator+= operator-= operator< operator> operator<= operator>= operator[] |
Equivalent to raw pointers: support pointer arithmetic, array index syntax, and all forms of comparison. |
Additionally, iterators that satisfy the requirements for output iterators are called mutable iterators, otherwise they are called constant iterators.
You can use std::distance()
to compute the distance between two iterators of a container.
Iterators are implemented similarly to smart pointer classes in that they overload the specific desired operators. Consult Chapter 15 for details on operator overloading.
The basic iterator operations are similar to those supported by raw pointers, so a raw pointer can be a legitimate iterator for certain containers. In fact, the vector
iterator could technically be implemented as a simple raw pointer. However, as a client of the containers, you need not worry about the implementation details; you can simply use the iterator abstraction.
This chapter shows you the basics of using the iterators for each container. Chapter 18 delves into more detail about iterators and the Standard Library algorithms that use them.
Every container class in the Standard Library that supports iterators provides public type aliases for its iterator types, called iterator
and const_iterator
. For example, a const
iterator for a vector
of int
s has as type std::vector<int>::const_iterator
. Containers that allow you to iterate over their elements in reverse order also provide public type aliases called reverse_iterator
and const_reverse_iterator
. This way, clients can use the container iterators without worrying about the actual types.
The containers also provide a method begin()
that returns an iterator referring to the first element in the container. The end()
method returns an iterator to the “past-the-end” value of the sequence of elements. That is, end()
returns an iterator that is equal to the result of applying operator++
to an iterator referring to the last element in the sequence. Together, begin()
and end()
provide a half-open range that includes the first element but not the last. The reason for this apparent complication is to support empty ranges (containers without any elements), in which case begin()
is equal to end()
. The half-open range bounded by iterators begin()
and end()
is often written mathematically like this: [begin, end).
Similarly, there are
cbegin()
and cend()
methods that return const
iterators.rbegin()
and rend()
methods that return reverse iterators.crbegin()
and crend()
methods that return const
reverse iterators.Examples of using iterators are given throughout the remainder of this chapter, and in subsequent chapters.
vector
, deque
, list
, forward_list
, and array
are called sequential containers. The best way to learn about sequential containers is to jump in with an example of the vector
container, which should be your default container. The next section describes the vector
container in detail, followed by briefer descriptions of deque
, list
, forward_list
, and array
. Once you become familiar with the sequential containers, it’s trivial to switch between them.
The Standard Library vector
container is similar to a standard C-style array: the elements are stored in contiguous memory, each in its own “slot.” You can index into a vector
, as well as add new elements to the back or insert them anywhere else. Inserting and deleting elements into and from a vector
generally takes linear time, though these operations actually run in amortized constant time at the end of a vector
, as explained in the section “The vector
Memory Allocation Scheme,” later in this chapter. Random access of individual elements has a constant complexity.
vector
is defined in the <vector>
header file as a class template with two type parameters: the element type to store and an allocator type.
template <class T, class Allocator = allocator<T>> class vector;
The Allocator
parameter specifies the type for a memory allocator object that the client can set in order to use custom memory allocation. This template parameter has a default value.
The simplest way to use a vector
is as a fixed-length array. vector
provides a constructor that allows you to specify the number of elements, and provides an overloaded operator[]
in order to access and modify those elements. The C++ standard states that the result of operator[]
is undefined when used to access an element outside the vector
bounds. This means that any compiler can decide how to behave in that case. For example, the default behavior of Microsoft Visual C++ is to give a run-time error message when your program is compiled in debug mode, and to disable any bounds checking in release mode for performance reasons. You can change these default behaviors.
In addition to using operator[]
, you can access vector
elements via at()
, front()
, and back()
. The at()
method is identical to operator[]
, except that it performs bounds checking, and throws an out_of_range
exception if the index is out of bounds. front()
and back()
return references to the first and last elements of a vector
, respectively. Calling front()
or back()
on an empty container causes undefined behavior.
Here is a small example program to “normalize” test scores so that the highest score is set to 100, and all other scores are adjusted accordingly. The program creates a vector
of ten double
s, reads in ten values from the user, divides each value by the max score (times 100), and prints out the new values. For the sake of brevity, the program forsakes error checking.
vector<double> doubleVector(10); // Create a vector of 10 doubles.
// Initialize max to smallest number
double max = -numeric_limits<double>::infinity();
for (size_t i = 0; i < doubleVector.size(); i++) {
cout << "Enter score " << i + 1 << ": ";
cin >> doubleVector[i];
if (doubleVector[i] > max) {
max = doubleVector[i];
}
}
max /= 100.0;
for (auto& element : doubleVector) {
element /= max;
cout << element << " ";
}
As you can see from this example, you can use a vector
just as you would use a standard C-style array. Note that the first for
loop uses the size()
method to determine the number of elements in the container. This example also demonstrates the use of a range-based for
loop with a vector
. Here, the range-based for
loop uses auto&
and not auto
because a reference is required so that the element can be modified in each iteration.
The real power of a vector
lies in its ability to grow dynamically. For example, consider the test score normalization program from the previous section with the additional requirement that it should handle any number of test scores. Here is the new version:
vector<double> doubleVector;
// Create a vector with zero elements.
// Initialize max to smallest number
double max = -numeric_limits<double>::infinity();
for (size_t i = 1; true; i++) {
double temp;
cout << "Enter score " << i << " (-1 to stop): ";
cin >> temp;
if (temp == -1) {
break;
}
doubleVector.push_back(temp);
if (temp > max) {
max = temp;
}
}
max /= 100.0;
for (auto& element : doubleVector) {
element /= max;
cout << element << " ";
}
This version of the program uses the default constructor to create a vector
with zero elements. As each score is read, it’s added to the vector
with the push_back()
method, which takes care of allocating space for the new element. The range-based for
loop doesn’t require any changes.
Now that you’ve had a taste of vector
s, it’s time to delve into their details.
The default constructor creates a vector
with zero elements.
vector<int> intVector; // Creates a vector of ints with zero elements
You can specify a number of elements and, optionally, a value for those elements, like this:
vector<int> intVector(10, 100);
// Creates vector of 10 ints with value 100
If you omit the default value, the new objects are zero-initialized. Zero-initialization constructs objects with the default constructor, and initializes primitive integer types (such as char
, int
, and so on) to zero, primitive floating-point types to 0.0, and pointer types to nullptr
.
You can create vector
s of built-in classes like this:
vector<string> stringVector(10, "hello");
User-defined classes can also be used as vector
elements:
class Element
{
public:
Element() {}
virtual ~Element() = default;
};
…
vector<Element> elementVector;
A vector
can be constructed with an initializer_list
containing the initial elements:
vector<int> intVector({ 1, 2, 3, 4, 5, 6 });
initializer_list
s can also be used for so-called uniform initialization, as discussed in Chapter 1. Uniform initialization works on most Standard Library containers. Here is an example:
vector<int> intVector1 = { 1, 2, 3, 4, 5, 6 };
vector<int> intVector2{ 1, 2, 3, 4, 5, 6 };
You can allocate vector
s on the heap as well.
auto elementVector = make_unique<vector<Element>>(10);
A vector
stores copies of the objects, and its destructor calls the destructor for each of the objects. The copy constructor and assignment operator of the vector
class perform deep copies of all the elements in the vector
. Thus, for efficiency, you should pass vector
s by reference or const
reference to functions and methods. Consult Chapter 12 for details on writing functions that take template instantiations as parameters.
In addition to normal copying and assignment, vector
s provide an assign()
method that removes all the current elements and adds any number of new elements. This method is useful if you want to reuse a vector
. Here is a trivial example. intVector
is created with ten elements having the default value zero. Then assign()
is used to remove all ten elements and replace them with five elements with value 100.
vector<int> intVector(10);
// Other code …
intVector.assign(5, 100);
assign()
can also accept an initializer_list
as follows. intVector
now has four elements with the given values.
intVector.assign({ 1, 2, 3, 4 });
vector
s also provide a swap()
method that allows you to swap the contents of two vector
s in constant time. Here is a simple example:
vector<int> vectorOne(10);
vector<int> vectorTwo(5, 100);
vectorOne.swap(vectorTwo);
// vectorOne now has 5 elements with the value 100.
// vectorTwo now has 10 elements with the value 0.
The Standard Library provides the usual six overloaded comparison operators for vector
s: ==
, !=
, <
, >
, <=
, >=
. Two vector
s are equal if they have the same number of elements and all the corresponding elements in the two vector
s are equal to each other. Two vector
s are compared lexicographically, that is, one vector
is “less than” another if all elements 0
through i–1
in the first vector
are equal to elements 0
through i-1
in the second vector
, but element i
in the first is less than element i
in the second, where i
must be in the range 0…n
and n
must be less than the size()
of the smallest of the two vector
s.
Here is an example of a simple program that compares vector
s of int
s:
vector<int> vectorOne(10);
vector<int> vectorTwo(10);
if (vectorOne == vectorTwo) {
cout << "equal!" << endl;
} else {
cout << "not equal!" << endl;
}
vectorOne[3] = 50;
if (vectorOne < vectorTwo) {
cout << "vectorOne is less than vectorTwo" << endl;
} else {
cout << "vectorOne is not less than vectorTwo" << endl;
}
The output of the program is as follows:
equal!
vectorOne is not less than vectorTwo
The section on “Iterators” at the beginning of this chapter explains the concepts of container iterators. The discussion can get a bit abstract, so it’s helpful to jump in and look at a code example. Here is the last for
loop of the test score normalization program from earlier using iterators instead of a range-based for
loop:
for (vector<double>::iterator iter = begin(doubleVector);
iter != end(doubleVector); ++iter) {
*iter /= max;
cout << *iter << " ";
}
First, take a look at the for
loop initialization statement:
vector<double>::iterator iter = begin(doubleVector);
Recall that every container defines a type named iterator
to represent iterators for that type of container. begin()
returns an iterator of that type referring to the first element in the container. Thus, the initialization statement obtains in the variable iter
an iterator referring to the first element of doubleVector
. Next, look at the for
loop comparison:
iter != end(doubleVector);
This statement simply checks if the iterator is past the end of the sequence of elements in the vector
. When it reaches that point, the loop terminates. The increment statement, ++iter
, increments the iterator to refer to the next element in the vector
.
The for
loop body contains these two lines:
*iter /= max;
cout << *iter << " ";
As you can see, your code can both access and modify the elements over which it iterates. The first line uses *
to dereference iter
to obtain the element to which it refers, and assigns to that element. The second line dereferences iter
again, but this time only to stream the element to cout
.
The preceding for
loop using iterators can be simplified by using the auto
keyword:
for (auto iter = begin(doubleVector);
iter != end(doubleVector); ++iter) {
*iter /= max;
cout << *iter << " ";
}
With auto
, the compiler automatically deduces the type of the variable iter
based on the right-hand side of the initializer, which in this case is the result of the call to begin()
.
If the elements of your container are objects, you can use the ->
operator on iterators to call methods or access data members of those objects. For example, the following program creates a vector
of ten string
s, then iterates over all of them appending a new string
to each one:
vector<string> stringVector(10, "hello");
for (auto it = begin(stringVector); it != end(stringVector); ++it) {
it->append(" there");
}
Often, using a range-based for
loop results in more elegant code, as in this example:
vector<string> stringVector(10, "hello");
for (auto& str : stringVector) {
str.append(" there");
}
The normal iterator
is read/write. However, if you call begin()
or end()
on a const
object, or you call cbegin()
or cend()
, you receive a const_iterator
. The const_iterator
is read-only; you cannot modify the element it refers to. An iterator
can always be converted to a const_iterator
, so it’s always safe to write something like this:
vector<type>::const_iterator it = begin(myVector);
However, a const_iterator
cannot be converted to an iterator
. If myVector
is const
, the following line doesn’t compile:
vector<type>::iterator it = begin(myVector);
When using the auto
keyword, using const_iterator
s looks a bit different. Suppose you write the following code:
vector<string> stringVector(10, "hello");
for (auto iter = begin(stringVector); iter != end(stringVector); ++iter) {
cout << *iter << endl;
}
Because of the auto
keyword, the compiler deduces the type of the iter
variable automatically and makes it a normal iterator
because stringVector
is not const
. If you want a read-only const_iterator
in combination with using auto
, then you need to use cbegin()
and cend()
instead of begin()
and end()
as follows:
vector<string> stringVector(10, "hello");
for (auto iter = cbegin(stringVector); iter != cend(stringVector); ++iter) {
cout << *iter << endl;
}
Now the compiler uses const_iterator
as type for the variable iter
because that’s what cbegin()
returns.
A range-based for
loop can also be forced to use const
iterators as follows:
vector<string> stringVector(10, "hello");
for (const auto& element : stringVector) {
cout << element << endl;
}
Generally, iterators are about as safe as pointers—that is, extremely unsafe. For example, you can write code like this:
vector<int> intVector;
auto iter = end(intVector);
*iter = 10; // BUG! iter doesn't refer to a valid element.
Recall that the iterator returned by end()
is one element past the end of a vector
, not an iterator referring to the last element! Trying to dereference it results in undefined behavior. Iterators are not required to perform any verification.
Another problem can occur if you use mismatched iterators. For example, the following for
loop initializes an iterator from vectorTwo
, and tries to compare it to the end iterator of vectorOne
. Needless to say, this loop will not do what you intended, and may never terminate. Dereferencing the iterator in the loop will likely produce undefined results.
vector<int> vectorOne(10);
vector<int> vectorTwo(10);
// Fill in the vectors.
// BUG! Possible infinite loop
for (auto iter = begin(
vectorTwo
); iter != end(vectorOne
); ++iter) {// Loop body
}
The vector
iterator is random access, which means that you can move it backward or forward, or jump around. For example, the following code eventually changes the fifth element (index 4
) to the value 4
:
vector<int> intVector(10);
auto it = begin(intVector);
it += 5;
--it;
*it = 4;
Given that you can write a for
loop that uses a simple index variable and the size()
method to iterate over the elements of the vector
, why should you bother using iterators? That’s a valid question, for which there are three main answers:
vector
s, but applies to list
s, map
s, and set
s.As mentioned in the beginning of this chapter, it is possible to store references in a container, such as a vector
. To do this, you store std::reference_wrapper
s in the container. The std::ref()
and cref()
function templates are used to create non-const
and const
reference_wrapper
instances. You need to include the <functional>
header file. Here is an example:
string str1 = "Hello";
string str2 = "World";
// Create a vector of references to strings.
vector<reference_wrapper<string>> vec{ ref(str1) };
vec.push_back(ref(str2)); // push_back() works as well.
// Modify the string referred to by the second reference in the vector.
vec[1].get() += "!";
// The end result is that str2 is actually modified.
cout << str1 << " " << str2 << endl;
As you have already read, you can append an element to a vector
with the push_back()
method. The vector
provides a parallel remove method called pop_back()
.
You can also insert elements at any point in the vector
with the insert()
method, which adds one or more elements to a position specified by an iterator, shifting all subsequent elements down to make room for the new ones. There are five different overloaded forms of insert()
that do the following:
vector
using move semantics.vector
where the list of elements is given as an initializer_list
.You can remove elements from any point in a vector
with erase()
, and you can remove all elements with clear()
. There are two forms of erase()
: one accepting a single iterator to remove a single element, and one accepting two iterators specifying a range of elements to remove.
If you want to remove a number of elements that satisfy a certain condition, one solution would be to write a loop iterating over all the elements and erasing every element that matches the condition. However, this solution has quadratic complexity, which is very bad for performance. In this case, the quadratic complexity can be avoided by using the remove-erase-idiom, which has a linear complexity. The remove-erase-idiom is discussed in Chapter 18.
Here is a small program that demonstrates the methods for adding and removing elements. It uses a helper function template printVector()
to print the contents of a vector
to cout
as follows. See Chapter 12 for details on writing function templates.
template<typename T>
void printVector(const vector<T>& v)
{
for (auto& element : v) { cout << element << " "; }
cout << endl;
}
The example includes demonstrations of the two-argument version of erase()
and the following versions of insert()
:
insert(const_iterator pos, const T& x):
the value x
is inserted at position pos.
insert(const_iterator pos, size_type n, const T& x):
the value x
is inserted n
times at position pos.
insert(const_iterator pos, InputIterator first, InputIterator last):
the elements in the range [first, last)
are inserted at position pos.
The code for the example is as follows:
vector<int> vectorOne = { 1, 2, 3, 5 };
vector<int> vectorTwo;
// Oops, we forgot to add 4. Insert it in the correct place
vectorOne.insert(cbegin(vectorOne) + 3, 4);
// Add elements 6 through 10 to vectorTwo
for (int i = 6; i <= 10; i++) {
vectorTwo.push_back(i);
}
printVector(vectorOne);
printVector(vectorTwo);
// Add all the elements from vectorTwo to the end of vectorOne
vectorOne.insert(cend(vectorOne), cbegin(vectorTwo), cend(vectorTwo));
printVector(vectorOne);
// Now erase the numbers 2 through 5 in vectorOne
vectorOne.erase(cbegin(vectorOne) + 1, cbegin(vectorOne) + 5);
printVector(vectorOne);
// Clear vectorTwo entirely
vectorTwo.clear();
// And add 10 copies of the value 100
vectorTwo.insert(cbegin(vectorTwo), 10, 100);
// Decide we only want 9 elements
vectorTwo.pop_back();
printVector(vectorTwo);
The output of the program is as follows:
1 2 3 4 5
6 7 8 9 10
1 2 3 4 5 6 7 8 9 10
1 6 7 8 9 10
100 100 100 100 100 100 100 100 100
Recall that iterator pairs represent half-open ranges, and insert()
adds elements before the element referred to by a given iterator position. Thus, you can insert the entire contents of vectorTwo
into the end of vectorOne
, like this:
vectorOne.insert(cend(vectorOne), cbegin(vectorTwo), cend(vectorTwo));
All Standard Library containers implement move semantics by including a move constructor and move assignment operator. See Chapter 9 for details on move semantics. A big benefit of this is that you can easily return a Standard Library container from a function by value without performance penalty. Take a look at the following function:
Without move semantics, assigning the result of Similarly, push operations can also make use of move semantics to improve performance in certain situations. For example, suppose you have a
You can add an element to this
However, because The
Now you are explicitly saying that
The preceding call to C++ supports emplace operations on most Standard Library containers, including
The emplace methods take a variable number of arguments as a variadic template. Variadic templates are discussed in Chapter 22, but those details are not required to understand how to use
Since C++17, the There is also an Inserting or erasing elements in a Also keep in mind that an internal Move Semantics
vector<int> createVectorOfSize(size_t size)
{
vector<int> vec(size);
int contents = 0;
for (auto& i : vec) {
i = contents++;
}
return vec;
}
…
vector<int> myVector;
myVector = createVectorOfSize(123);
createVectorOfSize()
to myVector
calls the copy assignment operator. With the move semantics support in the Standard Library containers, copying of the vector
is avoided. Instead, the assignment to myVector
triggers a call to the move assignment operator.vector
of string
s:vector<string> vec;
vector
as follows:string myElement(5, 'a'); // Constructs the string "aaaaa"
vec.push_back(myElement);
myElement
is not a temporary object, push_back()
makes a copy of myElement
and puts it in the vector
.vector
class also defines a push_back(T&& val)
, which is the move equivalent of push_back(const T& val)
. So, copying can be avoided if you call the push_back()
method as follows:vec.push_back(move(myElement));
myElement
should be moved into the vector
. Note that after this call, myElement
is in a valid but otherwise indeterminate state. You should not use myElement
anymore, unless you first bring it back to a determinate state, for example by calling clear()
on it! You can also call push_back()
as follows:vec.push_back(string(5, 'a'));
push_back()
triggers a call to the move version because the call to the string
constructor results in a temporary object. The push_back()
method moves this temporary string
object into the vector
, avoiding any copying.Emplace Operations
vector
. Emplace means “to put into place.” An example is emplace_back()
on a vector
object, which does not copy or move anything. Instead, it makes space in the container and constructs the object in place, as in this example:vec.emplace_back(5, 'a');
emplace_back()
. The difference in performance between emplace_back()
and push_back()
using move semantics depends on how your specific compiler implements these operations. In most situations, you can pick the one based on the syntax that you prefer.vec.push_back(string(5, 'a'));
// Or
vec.emplace_back(5, 'a');
emplace_back()
method returns a reference to the inserted element. Before C++17, the return type of emplace_back()
was void
.emplace()
method that constructs an object in place at a specific position in the vector
, and returns an iterator to the inserted element.Algorithmic Complexity and Iterator Invalidation
vector
causes all subsequent elements to shift up or down to make room for, or fill in the holes left by, the affected elements. Thus, these operations take linear complexity. Furthermore, all iterators referring to the insertion or removal point or subsequent positions are invalid following the action. The iterators are not “magically” moved to keep up with the elements that are shifted up or down in the vector
—that’s up to you.vector
reallocation can cause invalidation of all iterators referring to elements in the vector
, not just those referring to elements past the point of insertion or deletion. See the next section for details.
A vector
allocates memory automatically to store the elements that you insert. Recall that the vector
requirements dictate that the elements must be in contiguous memory, like in standard C-style arrays. Because it’s impossible to request to add memory to the end of a current chunk of memory, every time a vector
allocates more memory, it must allocate a new, larger chunk in a separate memory location and copy/move all the elements to the new chunk. This process is time-consuming, so vector
implementations attempt to avoid it by allocating more space than needed when they have to perform a reallocation. That way, they can avoid reallocating memory every time you insert an element.
One obvious question at this point is why you, as a client of vector
, care how it manages its memory internally. You might think that the principle of abstraction should allow you to disregard the internals of the vector
memory allocation scheme. Unfortunately, there are two reasons why you need to understand how it works:
vector
allocation scheme can guarantee that an element insertion runs in amortized constant time: most of the time the operation is constant, but once in a while (if it requires a reallocation), it’s linear. If you are worried about efficiency, you can control when a vector
performs reallocations.vector
.Thus, the vector
interface allows you to query and control the vector
reallocations. If you don’t control the reallocations explicitly, you should assume that all insertions cause a reallocation and thus invalidate all iterators.
vector
provides two methods for obtaining information about its size: size()
and capacity()
. The size()
method returns the number of elements in a vector
, while capacity()
returns the number of elements that it can hold without a reallocation. Thus, the number of elements that you can insert without causing a reallocation is capacity()
–
size()
.
C++17 introduces non-member std::size()
and std::empty()
global functions. These are similar to the non-member functions that are available to get iterators (std::begin()
, std::end()
, and so on). The non-member size()
and empty()
functions can be used with all containers. They can also be used with statically allocated C-style arrays not accessed through pointers, and with initializer_list
s. Here is an example of using them with a vector
:
vector<int> vec{ 1,2,3 };
cout << size(vec) << endl;
cout << empty(vec) << endl;
If you don’t care about efficiency or iterator invalidations, there is never a need to control the vector
memory allocation explicitly. However, if you want to make your program as efficient as possible, or you want to guarantee that iterators will not be invalidated, you can force a vector
to preallocate enough space to hold all of its elements. Of course, you need to know how many elements it will hold, which is sometimes impossible to predict.
One way to preallocate space is to call reserve()
, which allocates enough memory to hold the specified number of elements. The next section shows an example of the reserve()
method in action.
Another way to preallocate space is to specify, in the constructor, or with the resize()
or assign()
method, how many elements you want a vector
to store. This method actually creates a vector
of that size (and probably of that capacity).
A vector
stores its data contiguously in memory. You can get a pointer to this block of memory with the data()
method.
C++17 introduces a non-member std::data()
global function that can be used to get a pointer to the data. It works for the array
and vector
containers, string
s, statically allocated C-style arrays not accessed through pointers, and initializer_list
s. Here is an example for a vector
:
vector<int> vec{ 1,2,3 };
int* data1 = vec.data();
int* data2 = data(vec);
A common problem in computer science is distributing requests among a finite list of resources. For example, a simple operating system could keep a list of processes and assign a time slice (such as 100ms) to each process to let the process perform some of its work. After the time slice is finished, the OS suspends the process and the next process in the list is given a time slice to perform some of its work. One of the simplest algorithmic solutions to this problem is round-robin scheduling. When the time slice of the last process is finished, the scheduler starts over again with the first process. For example, in the case of three processes, the first time slice would go to the first process, the second slice to the second process, the third slice to the third process, and the fourth slice back to the first process. The cycle would continue in this way indefinitely.
Suppose that you decide to write a generic round-robin scheduling class that can be used with any type of resource. The class should support adding and removing resources, and should support cycling through the resources in order to obtain the next one. You could use a vector
directly, but it’s often helpful to write a wrapper class that provides more directly the functionality you need for your specific application. The following example shows a RoundRobin
class template with comments explaining the code. First, here is the class definition:
// Class template RoundRobin
// Provides simple round-robin semantics for a list of elements.
template <typename T>
class RoundRobin
{
public:
// Client can give a hint as to the number of expected elements for
// increased efficiency.
RoundRobin(size_t numExpected = 0);
virtual ~RoundRobin() = default;
// Prevent assignment and pass-by-value
RoundRobin(const RoundRobin& src) = delete;
RoundRobin& operator=(const RoundRobin& rhs) = delete;
// Explicitly default a move constructor and move assignment operator
RoundRobin(RoundRobin&& src) = default;
RoundRobin& operator=(RoundRobin&& rhs) = default;
// Appends element to the end of the list. May be called
// between calls to getNext().
void add(const T& element);
// Removes the first (and only the first) element
// in the list that is equal (with operator==) to element.
// May be called between calls to getNext().
void remove(const T& element);
// Returns the next element in the list, starting with the first,
// and cycling back to the first when the end of the list is
// reached, taking into account elements that are added or removed.
T& getNext();
private:
std::vector<T> mElements;
typename std::vector<T>::iterator mCurrentElement;
};
As you can see, the public interface is straightforward: only three methods plus the constructor and destructor. The resources are stored in the vector
called mElements
. The iterator mCurrentElement
always refers to the element that will be returned with the next call to getNext()
. If getNext()
hasn’t been called yet, mCurrentElement
is equal to begin(mElements)
. Note the use of the typename
keyword in front of the line declaring mCurrentElement
. So far, you’ve only seen that keyword used to specify template parameters, but there is another use for it. You must specify typename
explicitly whenever you access a type based on one or more template parameters. In this case, the template parameter T
is used to access the iterator
type. Thus, you must specify typename
. This is another example of arcane C++ syntax.
The class also prevents assignment and pass-by-value because of the mCurrentElement
data member. To make assignment and pass-by-value work, you would have to implement an assignment operator and copy constructor and make sure mCurrentElement
is valid in the destination object.
The implementation of the RoundRobin
class follows with comments explaining the code. Note the use of reserve()
in the constructor, and the extensive use of iterators in add()
, remove()
, and getNext()
. The trickiest aspect is handling mCurrentElement
in the add()
and remove()
methods.
template <typename T> RoundRobin<T>::RoundRobin(size_t numExpected)
{
// If the client gave a guideline, reserve that much space.
mElements.reserve(numExpected);
// Initialize mCurrentElement even though it isn't used until
// there's at least one element.
mCurrentElement = begin(mElements);
}
// Always add the new element at the end
template <typename T> void RoundRobin<T>::add(const T& element)
{
// Even though we add the element at the end, the vector could
// reallocate and invalidate the mCurrentElement iterator with
// the push_back() call. Take advantage of the random-access
// iterator features to save our spot.
int pos = mCurrentElement - begin(mElements);
// Add the element.
mElements.push_back(element);
// Reset our iterator to make sure it is valid.
mCurrentElement = begin(mElements) + pos;
}
template <typename T> void RoundRobin<T>::remove(const T& element)
{
for (auto it = begin(mElements); it != end(mElements); ++it) {
if (*it == element) {
// Removing an element invalidates the mCurrentElement iterator
// if it refers to an element past the point of the removal.
// Take advantage of the random-access features of the iterator
// to track the position of the current element after removal.
int newPos;
if (mCurrentElement == end(mElements) - 1 &&
mCurrentElement == it) {
// mCurrentElement refers to the last element in the list,
// and we are removing that last element, so wrap back to
// the beginning.
newPos = 0;
} else if (mCurrentElement <= it) {
// Otherwise, if mCurrentElement is before or at the one
// we're removing, the new position is the same as before.
newPos = mCurrentElement - begin(mElements);
} else {
// Otherwise, it's one less than before.
newPos = mCurrentElement - begin(mElements) - 1;
}
// Erase the element (and ignore the return value).
mElements.erase(it);
// Now reset our iterator to make sure it is valid.
mCurrentElement = begin(mElements) + newPos;
return;
}
}
}
template <typename T> T& RoundRobin<T>::getNext()
{
// First, make sure there are elements.
if (mElements.empty()) {
throw std::out_of_range("No elements in the list");
}
// Store the current element which we need to return.
auto& toReturn = *mCurrentElement;
// Increment the iterator modulo the number of elements.
++mCurrentElement;
if (mCurrentElement == end(mElements)) {
mCurrentElement = begin(mElements);
}
// Return a reference to the element.
return toReturn;
}
Here’s a simple implementation of a scheduler that uses the RoundRobin
class template, with comments explaining the code:
// Simple Process class.
class Process final
{
public:
// Constructor accepting the name of the process.
Process(string_view name) : mName(name) {}
// Implementation of doWorkDuringTimeSlice() would let the process
// perform its work for the duration of a time slice.
// Actual implementation omitted.
void doWorkDuringTimeSlice() {
cout << "Process " << mName
<< " performing work during time slice." << endl;
}
// Needed for the RoundRobin::remove() method to work.
bool operator==(const Process& rhs) {
return mName == rhs.mName;
}
private:
string mName;
};
// Simple round-robin based process scheduler.
class Scheduler final
{
public:
// Constructor takes a vector of processes.
Scheduler(const vector<Process>& processes);
// Selects the next process using a round-robin scheduling
// algorithm and allows it to perform some work during
// this time slice.
void scheduleTimeSlice();
// Removes the given process from the list of processes.
void removeProcess(const Process& process);
private:
RoundRobin<Process> mProcesses;
};
Scheduler::Scheduler(const vector<Process>& processes)
{
// Add the processes
for (auto& process : processes) {
mProcesses.add(process);
}
}
void Scheduler::scheduleTimeSlice()
{
try {
mProcesses.getNext().doWorkDuringTimeSlice();
} catch (const out_of_range&) {
cerr << "No more processes to schedule." << endl;
}
}
void Scheduler::removeProcess(const Process& process)
{
mProcesses.remove(process);
}
int main()
{
vector<Process> processes = { Process("1"), Process("2"), Process("3") };
Scheduler scheduler(processes);
for (int i = 0; i < 4; ++i)
scheduler.scheduleTimeSlice();
scheduler.removeProcess(processes[1]);
cout << "Removed second process" << endl;
for (int i = 0; i < 4; ++i)
scheduler.scheduleTimeSlice();
return 0;
}
The output should be as follows:
Process 1 performing work during time slice.
Process 2 performing work during time slice.
Process 3 performing work during time slice.
Process 1 performing work during time slice.
Removed second process
Process 3 performing work during time slice.
Process 1 performing work during time slice.
Process 3 performing work during time slice.
Process 1 performing work during time slice.
The C++ standard requires a partial specialization of vector
for bool
s, with the intention that it optimizes space allocation by “packing” the Boolean values. Recall that a bool
is either true
or false
, and thus could be represented by a single bit, which can take on exactly two values. C++ does not have a native type that stores exactly one bit. Some compilers represent a Boolean value with a type the same size as a char
, other compilers use an int
. The vector
<bool
> specialization is supposed to store the “array of bool
s” in single bits, thus saving space.
In a half-hearted attempt to provide some bit-field routines for vector<bool>
, there is one additional method: flip()
. This method can be called either on the container—in which case it complements all the elements in the container—or on a single reference returned from operator[]
or a similar method, in which case it complements that single element.
At this point, you should be wondering how you can call a method on a reference to bool
. The answer is that you can’t. The vector<bool>
specialization actually defines a class called reference
that serves as a proxy for the underlying bool
(or bit). When you call operator[]
, at()
, or a similar method, then vector<bool>
returns a reference
object, which is a proxy for the real bool
.
In practice, the little amount of space saved by packing bool
s hardly seems worth the extra effort. Even worse, accessing and modifying elements in a vector<bool>
is much slower than, for example, in a vector<int>
. Many C++ experts recommend avoiding vector<bool>
in favor of the bitset
. If you do need a dynamically sized bit field, then it’s recommended to use something like vector<std::int_fast8_t>
or vector<unsigned char>
. The std::int
_fast8
_t
type is defined in <cstdint>
. It is a signed integer type for which the compiler has to use the fastest integer type it has that is at least 8 bits.
deque
(the abbreviation for double-ended queue) is almost identical to vector
, but is used far less frequently. It is defined in the <deque>
header file. The principle differences are as follows:
deque
supports true constant-time insertion and removal of elements at both the front and the back (a vector
supports amortized constant time at just the back).deque
provides push_front()
, pop_front()
, and emplace_front()
, which the vector
omits. Since C++17, emplace_front()
returns a reference to the inserted element, instead of void
.deque
does not invalidate iterators when inserting elements at the front or at the back.deque
does not expose its memory management scheme via reserve()
or capacity()
.deque
s are rarely used, as opposed to vector
s, so they are not further discussed. Consult a Standard Library Reference for a detailed list of all supported methods.
The Standard Library list
class template, defined in the <list>
header file, is a standard doubly linked list. It supports constant-time insertion and deletion of elements at any point in the list, but provides slow (linear) time access to individual elements. In fact, the list does not even provide random-access operations like operator[]
. Only through iterators can you access individual elements.
Most of the list
operations are identical to those of vector
, including the constructors, destructor, copying operations, assignment operations, and comparison operations. This section focuses on those methods that differ from those of vector
.
The only methods provided by a list
to access elements are front()
and back()
, both of which run in constant time. These methods return a reference to the first and last elements in a list
. All other element access must be performed through iterators.
A list
supports begin()
, returning an iterator referring to the first element in the list
, and end()
, returning an iterator referring to one past the last element in the list
. It also supports cbegin()
, cend()
, rbegin()
, rend()
, crbegin()
, and crend()
, similar to a vector
.
A list
iterator is bidirectional, not random access like a vector
iterator. That means that you cannot add and subtract list
iterators from each other, or perform other pointer arithmetic on them. For example, if p
is a list
iterator, you can traverse through the elements of the list
by doing ++p
or --p
, but you cannot use the addition or subtraction operator; p+n
and p-n
do not work.
A list
supports the same add element and remove element methods as a vector
, including push_back()
, pop_back()
, emplace()
, emplace_back()
, the five forms of insert()
, the two forms of erase()
, and clear()
. Like a deque
, it also provides push_front()
, emplace_front()
, and pop_front()
. All these methods (except for clear()
) run in constant time, once you’ve found the correct position. Thus, a list
could be appropriate for applications that perform many insertions and deletions from the data structure, but do not need quick index-based element access. And even then, a vector
might still be faster. Use a performance profiler to make sure.
Like deque
s, and unlike vector
s, list
s do not expose their underlying memory model. Consequently, they support size()
, empty()
, and resize()
, but not reserve()
or capacity()
. Note that the size()
method on a list
has constant complexity, which is not the case for the size()
method on a forward_list
(discussed later).
A list
provides several special operations that exploit its quick element insertion and deletion. This section provides an overview and examples. Consult a Standard Library Reference for a thorough reference of all the methods.
The linked-list characteristics of a list
allow it to splice, or insert, an entire list
at any position in another list
in constant time. The simplest version of this method works as follows:
// Store the a words in the main dictionary.
list<string> dictionary{ "aardvark", "ambulance" };
// Store the b words.
list<string> bWords{ "bathos", "balderdash" };
// Add the c words to the main dictionary.
dictionary.push_back("canticle");
dictionary.push_back("consumerism");
// Splice the b words into the main dictionary.
if (!bWords.empty()) {
// Get an iterator to the last b word.
auto iterLastB = --(cend(bWords));
// Iterate up to the spot where we want to insert b words.
auto it = cbegin(dictionary);
for (; it != cend(dictionary); ++it) {
if (*it > *iterLastB)
break;
}
// Add in the b words. This action removes the elements from bWords.
dictionary.splice(it, bWords);
}
// Print out the dictionary.
for (const auto& word : dictionary) {
cout << word << endl;
}
The result from running this program looks like this:
aardvark
ambulance
bathos
balderdash
canticle
consumerism
There are also two other forms of splice()
: one that inserts a single element from another list
, and one that inserts a range from another list
. Additionally, all forms of splice()
are available with either a normal reference or an rvalue reference to the source list
.
In addition to splice()
, a list
provides special implementations of several of the generic Standard Library algorithms. The generic forms are covered in Chapter 18. Here, only the specific versions provided by list
are discussed.
The following table summarizes the algorithms for which list
provides special implementations as methods. See Chapter 18 for more details on the algorithms.
METHOD | DESCRIPTION |
remove() remove_if() |
Removes certain elements from a list . |
unique() |
Removes duplicate consecutive elements from a list , based on operator== or a user-supplied binary predicate. |
merge() |
Merges two list s. Both list s must be sorted to start, according to operator< or a user-defined comparator. Like splice() , merge() is destructive to the list passed as an argument. |
sort() |
Performs a stable sort on elements in a list . |
reverse() |
Reverses the order of the elements in a list . |
Suppose that you are writing a computer registration system for a university. One feature you might provide is the ability to generate a complete list of enrolled students in the university from lists of the students in each class. For the sake of this example, assume that you must write only a single function that takes a vector
of list
s of student names (as string
s), plus a list
of students that have been dropped from their courses because they failed to pay tuition. This method should generate a complete list
of all the students in all the courses, without any duplicates, and without those students who have been dropped. Note that students might be in more than one course.
Here is the code for this method, with comments explaining the code. With the power of Standard Library list
s, the method is practically shorter than its written description! Note that the Standard Library allows you to “nest” containers: in this case, you can use a vector
of list
s.
// courseStudents is a vector of lists, one for each course. The lists
// contain the students enrolled in those courses. They are not sorted.
//
// droppedStudents is a list of students who failed to pay their
// tuition and so were dropped from their courses.
//
// The function returns a list of every enrolled (non-dropped) student in
// all the courses.
list<string> getTotalEnrollment(const vector<list<string>>& courseStudents,
const list<string>& droppedStudents)
{
list<string> allStudents;
// Concatenate all the course lists onto the master list
for (auto& lst : courseStudents) {
allStudents.insert(cend(allStudents), cbegin(lst), cend(lst));
}
// Sort the master list
allStudents.sort();
// Remove duplicate student names (those who are in multiple courses).
allStudents.unique();
// Remove students who are on the dropped list.
// Iterate through the dropped list, calling remove on the
// master list for each student in the dropped list.
for (auto& str : droppedStudents) {
allStudents.remove(str);
}
// done!
return allStudents;
}
A forward_list
, defined in the <forward_list>
header file, is similar to a list
except that it is a singly linked list, while a list
is a doubly linked list. This means that forward_list
supports only forward iteration and, because of this, ranges need to be specified differently compared to a list
. If you want to modify any list, you need access to the element before the first element of interest. Because a forward_lis
t does not have an iterator that supports going backward, there is no easy way to get to the preceding element. For this reason, ranges that will be modified—for example, ranges supplied to erase()
and splice()
—must be open at the beginning. The begin()
function that was discussed earlier returns an iterator to the first element, and thus can only be used to construct a range that is closed at the beginning. The forward_lis
t class therefore defines a before_begin(
) method, which returns an iterator that points to an imaginary element before the beginning of the list. You cannot dereference this iterator as it points to invalid data. However, incrementing this iterator by one makes it the same as the iterator returned by begin()
; as a result, it can be used to make a range that is open at the beginning. The following table sums up the differences between a list
and a forward_list
. A filled box () means the container supports that operation, while an empty box () means the operation is not supported.
OPERATION | LIST | FORWARD_LIST |
assign() |
||
back() |
||
before_begin() |
||
begin() |
||
cbefore_begin() |
||
cbegin() |
||
cend() |
||
clear() |
||
crbegin() |
||
crend() |
||
emplace() |
||
emplace_after() |
||
emplace_back() |
||
emplace_front() |
||
empty() |
||
end() |
||
erase() |
||
erase_after() |
||
front() |
||
insert() |
||
insert_after() |
||
iterator / const_iterator |
||
max_size() |
||
merge() |
||
pop_back() |
||
pop_front() |
||
push_back() |
||
push_front() |
||
rbegin() |
||
remove() |
||
remove_if() |
||
rend() |
||
resize() |
||
reverse() |
||
reverse_iterator / const_reverse_iterator |
||
size() |
||
sort() |
||
splice() |
||
splice_after() |
||
swap() |
||
unique() |
Constructors and assignment operators are similar between a list
and a forward_list
. The C++ standard states that forward_list
s should try to use minimal space. That’s the reason why there is no size()
method, because by not providing it, there is no need to store the size of the list. The following example demonstrates the use of forward_list
s:
// Create 3 forward lists using an initializer_list
// to initialize their elements (uniform initialization).
forward_list<int> list1 = { 5, 6 };
forward_list<int> list2 = { 1, 2, 3, 4 };
forward_list<int> list3 = { 7, 8, 9 };
// Insert list2 at the front of list1 using splice.
list1.splice_after(list1.before_begin(), list2);
// Add number 0 at the beginning of the list1.
list1.push_front(0);
// Insert list3 at the end of list1.
// For this, we first need an iterator to the last element.
auto iter = list1.before_begin();
auto iterTemp = iter;
while (++iterTemp != end(list1)) {
++iter;
}
list1.insert_after(iter, cbegin(list3), cend(list3));
// Output the contents of list1.
for (auto& i : list1) {
cout << i << ' ';
}
To insert list3
, you need an iterator to the last element in the list. However, because this is a forward_list
, you cannot use --end(list1)
, so you need to iterate over the list from the beginning and stop at the last element. The output of this example is as follows:
0 1 2 3 4 5 6 7 8 9
An array
, defined in the <array>
header file, is similar to a vector
except that it is of a fixed size; it cannot grow or shrink in size. The purpose of a fixed size is to allow an array
to be allocated on the stack, rather than always demanding heap access as vector
does. Just like vector
s, array
s support random-access iterators, and elements are stored in contiguous memory. An array
has support for front()
, back()
, at()
, and operator[]
. It also supports a fill()
method to fill the array
with a specific element. Because it is fixed in size, it does not support push_back()
, pop_back()
, insert()
, erase()
, clear()
, resize()
, reserve()
, or capacity()
. A disadvantage compared to a vector
is that the swap()
method of an array
runs in linear time, while it has constant complexity for a vector
. An array
can also not be moved in constant time, while a vector
can. An array
has a size()
method, which is a clear advantage over C-style arrays. The following example demonstrates how to use the array
class. Note that the array
declaration requires two template parameters: the first specifies the type of the elements, and the second specifies the fixed number of elements in the array
.
// Create an array of 3 integers and initialize them
// with the given initializer_list using uniform initialization.
array<int, 3> arr = { 9, 8, 7 };
// Output the size of the array.
cout << "Array size = " << arr.size() << endl; // or std::size(arr);
// Output the contents using a range-based for loop.
for (const auto& i : arr) {
cout << i << endl;
}
cout << "Performing arr.fill(3)…" << endl;
// Use the fill method to change the contents of the array.
arr.fill(3);
// Output the contents of the array using iterators.
for (auto iter = cbegin(arr); iter != cend(arr); ++iter) {
cout << *iter << endl;
}
The output of the preceding code is as follows:
Array size = 3
9
8
7
Performing arr.fill(3)…
3
3
3
You can use the std::get<n>()
function template to retrieve an element from an std::array
at the given index n. The index has to be a constant expression, so it cannot, for example, be a loop variable. The benefit of using std::get<n>()
is that the compiler checks at compile time that the given index is valid; otherwise, it results in a compilation error, as in this example:
array<int, 3> myArray{ 11, 22, 33 };
cout << std::get<1>(myArray) << endl;
cout << std::get<10>(myArray) << endl; // Compilation error!
In addition to the standard sequential containers, the Standard Library provides three container adaptors: queue
, priority_queue
, and stack
. Each of these adaptors is a wrapper around one of the sequential containers. They allow you to swap the underlying container without having to change the rest of the code. The intent of the adaptors is to simplify the interface and to provide only those features that are appropriate for the stack
or queue
abstraction. For example, the adaptors don’t provide iterators or the capability to insert or erase multiple elements simultaneously.
The queue
container adaptor, defined in the header file <queue>
, provides standard “first-in, first-out” (FIFO) semantics. As usual, it’s written as a class template, which looks like this:
template <class T, class Container = deque<T>> class queue;
The T
template parameter specifies the type that you intend to store in the queue
. The second template parameter allows you to stipulate the underlying container that the queue
adapts. However, the queue
requires the sequential container to support both push_back()
and pop_front()
, so you have only two built-in choices: deque
and list
. For most purposes, you can just stick with the default deque
.
The queue
interface is extremely simple: there are only eight methods plus the constructor and the normal comparison operators. The push()
and emplace()
methods add a new element to the tail of the queue, while pop()
removes the element at the head of the queue. You can retrieve references to, without removing, the first and last elements with front()
and back()
, respectively. As usual, when called on const
objects, front()
and back()
return const
references; and when called on non-const
objects, they return non-const
(read/write) references.
The queue also supports size()
, empty()
, and swap()
.
When two computers communicate over a network, they send information to each other divided up into discrete chunks called packets. The networking layer of the computer’s operating system must pick up the packets and store them as they arrive. However, the computer might not have enough bandwidth to process all of them at once. Thus, the networking layer usually buffers, or stores, the packets until the higher layers have a chance to attend to them. The packets should be processed in the order they arrive, so this problem is perfect for a queue
structure. The following is a small PacketBuffer
class, with comments explaining the code, which stores incoming packets in a queue
until they are processed. It’s a class template so that different layers of the networking layer can use it for different kinds of packets, such as IP packets or TCP packets. It allows the client to specify a maximum size because operating systems usually limit the number of packets that can be stored, so as not to use too much memory. When the buffer is full, subsequently arriving packets are ignored.
template <typename T>
class PacketBuffer
{
public:
// If maxSize is 0, the size is unlimited, because creating
// a buffer of size 0 makes little sense. Otherwise only
// maxSize packets are allowed in the buffer at any one time.
PacketBuffer(size_t maxSize = 0);
virtual ~PacketBuffer() = default;
// Stores a packet in the buffer.
// Returns false if the packet has been discarded because
// there is no more space in the buffer, true otherwise.
bool bufferPacket(const T& packet);
// Returns the next packet. Throws out_of_range
// if the buffer is empty.
T getNextPacket();
private:
std::queue<T> mPackets;
size_t mMaxSize;
};
template <typename T> PacketBuffer<T>::PacketBuffer(size_t maxSize/*= 0*/)
: mMaxSize(maxSize)
{
}
template <typename T> bool PacketBuffer<T>::bufferPacket(const T& packet)
{
if (mMaxSize > 0 && mPackets.size() == mMaxSize) {
// No more space. Drop the packet.
return false;
}
mPackets.push(packet);
return true;
}
template <typename T> T PacketBuffer<T>::getNextPacket()
{
if (mPackets.empty()) {
throw std::out_of_range("Buffer is empty");
}
// Retrieve the head element
T temp = mPackets.front();
// Pop the head element
mPackets.pop();
// Return the head element
return temp;
}
A practical application of this class would require multiple threads. C++ provides synchronization classes to allow thread-safe access to shared objects. Without explicit synchronization, no Standard Library object can be used safely from multiple threads when at least one of the threads modifies the object. Synchronization is discussed in Chapter 23. The focus in this example is on the queue
class, so here is a single-threaded example of using the PacketBuffer
:
class IPPacket final
{
public:
IPPacket(int id) : mID(id) {}
int getID() const { return mID; }
private:
int mID;
};
int main()
{
PacketBuffer<IPPacket> ipPackets(3);
// Add 4 packets
for (int i = 1; i <= 4; ++i) {
if (!ipPackets.bufferPacket(IPPacket(i))) {
cout << "Packet " << i << " dropped (queue is full)." << endl;
}
}
while (true) {
try {
IPPacket packet = ipPackets.getNextPacket();
cout << "Processing packet " << packet.getID() << endl;
} catch (const out_of_range&) {
cout << "Queue is empty." << endl;
break;
}
}
return 0;
}
The output of this program is as follows:
Packet 4 dropped (queue is full).
Processing packet 1
Processing packet 2
Processing packet 3
Queue is empty.
A priority queue is a queue that keeps its elements in sorted order. Instead of a strict FIFO ordering, the element at the head of the queue at any given time is the one with the highest priority. This element could be the oldest on the queue or the most recent. If two elements have equal priority, their relative order in the queue is undefined.
The priority_queue
container adaptor is also defined in <queue>
. Its template definition looks something like this (slightly simplified):
template <class T, class Container = vector<T>,
class Compare = less<T>>;
It’s not as complicated as it looks. You’ve seen the first two parameters before: T
is the element type stored in the priority_queue
and Container
is the underlying container on which the priority_queue
is adapted. The priority_queue
uses vector
as the default, but deque
works as well. list
does not work because the priority_queue
requires random access to its elements. The third parameter, Compare
, is trickier. As you’ll learn more about in Chapter 18, less
is a class template that supports comparison of two objects of type T
with operator<
. What this means for you is that the priority of elements in a priority_queue
is determined according to operator<
. You can customize the comparison used, but that’s a topic for Chapter 18. For now, just make sure that you define operator<
appropriately for the types stored in a priority_queue
.
A priority_queue
provides even fewer operations than does a queue
. The push()
and emplace()
methods allow you to insert elements, pop()
allows you to remove elements, and top()
returns a const
reference to the head element.
Like a queue
, a priority_queue
supports size()
, empty()
, and swap()
. However, it does not provide any comparison operators.
Single failures on a system can often cause multiple errors to be generated from different components. A good error-handling system uses error correlation to process the most important errors first. You can use a priority_queue
to write a very simple error correlator. Assume all error events encode their own priority. The error correlator simply sorts error events according to their priority, so that the highest-priority errors are always processed first. Here is the class definition:
// Sample Error class with just a priority and a string error description.
class Error final
{
public:
Error(int priority, std::string_view errorString)
: mPriority(priority), mErrorString(errorString) {}
int getPriority() const { return mPriority; }
std::string_view getErrorString() const { return mErrorString; }
private:
int mPriority;
std::string mErrorString;
};
bool operator<(const Error& lhs, const Error& rhs);
std::ostream& operator<<(std::ostream& os, const Error& err);
// Simple ErrorCorrelator class that returns highest priority errors first.
class ErrorCorrelator final
{
public:
// Add an error to be correlated.
void addError(const Error& error);
// Retrieve the next error to be processed.
Error getError();
private:
std::priority_queue<Error> mErrors;
};
Here are the definitions of the functions and methods:
bool operator<(const Error& lhs, const Error& rhs)
{
return (lhs.getPriority() < rhs.getPriority());
}
ostream& operator<<(ostream& os, const Error& err)
{
os << err.getErrorString() << " (priority " << err.getPriority() << ")";
return os;
}
void ErrorCorrelator::addError(const Error& error)
{
mErrors.push(error);
}
Error ErrorCorrelator::getError()
{
// If there are no more errors, throw an exception.
if (mErrors.empty()) {
throw out_of_range("No more errors.");
}
// Save the top element.
Error top = mErrors.top();
// Remove the top element.
mErrors.pop();
// Return the saved element.
return top;
}
Here is a simple unit test showing how to use the ErrorCorrelator
. Realistic use would require multiple threads so that one thread adds errors, while another processes them. As mentioned earlier with the queue
example, this requires explicit synchronization, which is only discussed in Chapter 23.
ErrorCorrelator ec;
ec.addError(Error(3, "Unable to read file"));
ec.addError(Error(1, "Incorrect entry from user"));
ec.addError(Error(10, "Unable to allocate memory!"));
while (true) {
try {
Error e = ec.getError();
cout << e << endl;
} catch (const out_of_range&) {
cout << "Finished processing errors" << endl;
break;
}
}
The output of this program is as follows:
Unable to allocate memory! (priority 10)
Unable to read file (priority 3)
Incorrect entry from user (priority 1)
Finished processing errors
A stack
is almost identical to a queue
, except that it provides first-in, last-out (FILO) semantics, also known as last-in, first-out (LIFO), instead of FIFO. It is defined in the <stack>
header file. The template definition looks like this:
template <class T, class Container = deque<T>> class stack;
You can use vector
, list
, or deque
as the underlying container for the stack
.
Like the queue
, the stack
provides push()
, emplace()
, and pop()
. The difference is that push()
adds a new element to the top of the stack
, “pushing down” all elements inserted earlier, and pop()
removes the element from the top of the stack
, which is the most recently inserted element. The top()
method returns a const
reference to the top element if called on a const
object, and a non-const
reference if called on a non-const
object.
The stack
supports empty()
, size()
, swap()
, and the standard comparison operators.
You can rewrite the previous ErrorCorrelator
class so that it gives out the most recent error instead of the one with the highest priority. The only change required is to change mErrors
from a priority_queue
to a stack
. With this change, the errors are distributed in LIFO order instead of priority order. Nothing in the method definitions needs to change because the push()
, pop()
, top()
, and empty()
methods exist on both a priority_queue
and a stack
.
Unlike the sequential containers, the ordered associative containers do not store elements in a linear configuration. Instead, they provide a mapping of keys to values. They generally offer insertion, deletion, and lookup times that are equivalent to each other.
There are four ordered associative containers provided by the Standard Library: map
, multimap
, set
, and multiset
. Each of these containers stores its elements in a sorted, tree-like data structure. There are also four unordered associative containers: unordered_map
, unordered_multimap
, unordered_set
, and unordered_multiset
. These are discussed later in this chapter.
Before learning about the ordered associative containers, you must become familiar with the pair
class, which is defined in the <utility>
header file. pair
is a class template that groups together two values of possibly different types. The values are accessible through the first
and second
public data members. operator==
and operator<
are defined for pairs to compare both the first
and second
elements. Here are some examples:
// Two-argument constructor and default constructor
pair<string, int> myPair("hello", 5);
pair<string, int> myOtherPair;
// Can assign directly to first and second
myOtherPair.first = "hello";
myOtherPair.second = 6;
// Copy constructor
pair<string, int> myThirdPair(myOtherPair);
// operator<
if (myPair < myOtherPair) {
cout << "myPair is less than myOtherPair" << endl;
} else {
cout << "myPair is greater than or equal to myOtherPair" << endl;
}
// operator==
if (myOtherPair == myThirdPair) {
cout << "myOtherPair is equal to myThirdPair" << endl;
} else {
cout << "myOtherPair is not equal to myThirdPair" << endl;
}
The output is as follows:
myPair is less than myOtherPair
myOtherPair is equal to myThirdPair
The library also provides a utility function template, make_pair()
, that constructs a pair
from two values, as in this example:
pair<int, double> aPair = make_pair(5, 10.10);
Of course, in this case you could have just used the two-argument constructor. However, make_pair()
is more useful when you want to pass a pair
to a function, or assign it to a pre-existing variable. Unlike class templates, function templates can infer types from parameters, so you can use make_pair()
to construct a pair
without explicitly specifying the types. You can also use make_pair()
in combination with the auto
keyword as follows:
auto aSecondPair = make_pair(5, 10.10);
C++17 introduces template parameter deduction for constructors, as discussed in Chapter 12. This allows you to forget about make_pair()
, and simply write the following:
auto aThirdPair = pair(5, 10.10);
Structured bindings is another C++17 feature introduced in Chapter 1. It can be used to decompose the elements of a pair
into separate variables. Here’s an example:
pair<string, int> myPair("hello", 5);
auto[theString, theInt] = myPair; // Decompose using structured bindings
cout << "theString: " << theString << endl;
cout << "theInt: " << theInt << endl;
A map
, defined in the <map>
header file, stores key/value pairs instead of just single values. Insertion, lookup, and deletion are all based on the key; the value is just “along for the ride.” The term map comes from the conceptual understanding that the container “maps” keys to values.
A map
keeps elements in sorted order, based on the keys, so that insertion, deletion, and lookup all take logarithmic time. Because of the order, when you enumerate the elements, they come out in the ordering imposed by the type’s operator<
or a user-defined comparator. It is usually implemented as some form of balanced tree, such as a red-black tree. However, the tree structure is not exposed to the client.
You should use a map
whenever you need to store and retrieve elements based on a “key” and you would like to have them in a certain order.
The map
class template takes four types: the key type, the value type, the comparison type, and the allocator type. As always, the allocator is ignored in this chapter. The comparison type is similar to the comparison type for a priority_queue
described earlier. It allows you to specify a different comparison class than the default. In this chapter, only the default less
comparison is used. When using the default, make sure that your keys all respond to operator<
appropriately. If you’re interested in further detail, Chapter 18 explains how to write your own comparison classes.
If you ignore the comparison and allocator parameters, constructing a map
is just like constructing a vector
or a list
, except that you specify the key and value types separately in the template instantiation. For example, the following code constructs a map
that uses int
s as the key and stores objects of the Data
class:
class Data final
{
public:
explicit Data(int value = 0) : mValue(value) { }
int getValue() const { return mValue; }
void setValue(int value) { mValue = value; }
private:
int mValue;
};
…
map<int, Data> dataMap;
A map
also supports uniform initialization:
map<string, int> m = {
{ "Marc G.", 123 },
{ "Warren B.", 456 },
{ "Peter V.W.", 789 }
};
Inserting an element into sequential containers such as vector
and list
always requires you to specify the position at which the element is to be added. A map
, along with the other ordered associative containers, is different. The map
’s internal implementation determines the position in which to store the new element; you need only to supply the key and the value.
When inserting elements, it is important to keep in mind that map
s require unique keys: every element in the map
must have a different key. If you want to support multiple elements with the same key, you have two options: you can either use a map
and store another container such as a vector
or an array
as the value for a key, or you can use multimap
s, described later.
The insert()
method can be used to add elements to a map
, and has the advantage of allowing you to detect if a key already exists. You must specify the key/value pair as a pair
object or as an initializer_list
. The return type of the basic form of insert()
is a pair
of an iterator
and a bool
. The reason for the complicated return type is that insert()
does not overwrite an element value if one already exists with the specified key. The bool
element of the returned pair
specifies whether or not the insert()
actually inserted the new key/value pair. The iterator
refers to the element in the map
with the specified key (with a new or old value, depending on whether the insert succeeded or failed). map
iterators are discussed in more detail in the next section. Continuing the map
example from the previous section, you can use insert()
as follows:
map<int, Data> dataMap;
auto ret = dataMap.insert({ 1, Data(4) }); // Using an initializer_list
if (ret.second) {
cout << "Insert succeeded!" << endl;
} else {
cout << "Insert failed!" << endl;
}
ret = dataMap.insert(make_pair(1, Data(6))); // Using a pair object
if (ret.second) {
cout << "Insert succeeded!" << endl;
} else {
cout << "Insert failed!" << endl;
}
The type of the ret
variable is a pair
as follows:
pair<map<int, Data>::iterator, bool> ret;
The first element of the pair
is a map
iterator for a map
with keys of type int
and values of type Data
. The second element of the pair
is a Boolean value.
The output of the program is as follows:
Insert succeeded!
Insert failed!
With initializers for if
statements (C++17), inserting the data into the map
and checking the result can be done with a single statement as follows:
if (auto result = dataMap.insert({ 1, Data(4) }); result.second) {
cout << "Insert succeeded!" << endl;
} else {
cout << "Insert failed!" << endl;
}
This can even be combined with C++17 structured bindings:
if (auto [iter, success] = dataMap.insert({ 1, Data(4) }); success) {
cout << "Insert succeeded!" << endl;
} else {
cout << "Insert failed!" << endl;
}
insert_or_assign()
has a similar return type as insert()
. However, if an element with the given key already exists, insert_or_assign()
overwrites the old value with the new value, while insert()
does not overwrite the old value in that case. Another difference with insert()
is that insert_or_assign()
has two separate parameters: the key and the value. Here is an example:
ret = dataMap.insert_or_assign(1, Data(7));
if (ret.second) {
cout << "Inserted." << endl;
} else {
cout << "Overwritten." << endl;
}
Another method to insert elements into a map
is through the overloaded operator[]
. The difference is mainly in the syntax: you specify the key and value separately. Additionally, operator[]
always succeeds. If no element value with the given key exists, it creates a new element with that key and value. If an element with the key already exists, operator[]
replaces the element value with the newly specified value. Here is part of the previous example using operator[]
instead of insert()
:
map<int, Data> dataMap;
dataMap[1] = Data(4);
dataMap[1] = Data(6); // Replaces the element with key 1
There is, however, one major caveat to operator[]
: it always constructs a new value object, even if it doesn’t need to use it. Thus, it requires a default constructor for your element values, and can be less efficient than insert()
.
The fact that operator[]
creates a new element in a map
if the requested element does not already exist means that this operator is not marked as const
. This sounds obvious, but might sometimes look counterintuitive. For example, suppose you have the following function:
void func(const map<int, int>& m)
{
cout << m[1] << endl; // Error
}
This fails to compile, even though you appear to be just reading the value m[1]
. It fails because the variable m
is a const
reference to a map
, and operator[]
is not marked as const
. Instead, you should use the find()
method described in the section “Looking Up Elements.”
A map
supports emplace()
and emplace_hint()
to construct elements in-place, similar to the emplace methods of a vector
. C++17 adds a try_emplace()
method that inserts an element in-place if the given key does not exist yet, or does nothing if the key already exists in the map
.
map
iterators work similarly to the iterators on the sequential containers. The major difference is that the iterators refer to key/value pair
s instead of just the values. In order to access the value, you must retrieve the second
field of the pair
object. map
iterators are bidirectional, meaning you can traverse them in both directions. Here is how you can iterate through the map
from the previous example:
for (auto iter = cbegin(dataMap); iter != cend(dataMap); ++iter) {
cout << iter->second.getValue() << endl;
}
Take another look at the expression used to access the value:
iter->second.getValue()
iter
refers to a key/value pair
, so you can use the ->
operator to access the second
field of that pair
, which is a Data
object. You can then call the getValue()
method on that Data
object.
Note that the following code is functionally equivalent:
(*iter).second.getValue()
Using a range-based for
loop, the loop can be written more readable as follows:
for (const auto& p : dataMap) {
cout << p.second.getValue() << endl;
}
It can be implemented even more elegantly using a combination of a range-based for
loop and C++17 structured bindings:
for (const auto& [key, data] : dataMap) {
cout << data.getValue() << endl;
}
A map
provides logarithmic lookup of elements based on a supplied key. If you already know that an element with a given key is in a map, the simplest way to look it up is through operator[]
as long as you call it on a non-const
map
or a non-const
reference to a map
. The nice thing about operator[]
is that it returns a reference to the element that you can use and modify directly, without worrying about pulling the value out of a pair
object. Here is an extension to the preceding example to call the setValue()
method on the Data
object value at key 1
:
map<int, Data> dataMap;
dataMap[1] = Data(4);
dataMap[1] = Data(6);
dataMap[1].setValue(100);
However, if you don’t know whether the element exists, you may not want to use operator[]
, because it will insert a new element with that key if it doesn’t find one already. As an alternative, map
provides a find()
method that returns an iterator
referring to the element with the specified key, if it exists, or the end()
iterator if it’s not in the map
. Here is an example using find()
to perform the same modification to the Data
object with key 1
:
auto it = dataMap.find(1);
if (it != end(dataMap)) {
it->second.setValue(100);
}
As you can see, using find()
is a bit clumsier, but it’s sometimes necessary.
If you only want to know whether or not an element with a certain key is in a map
, you can use the count()
method. It returns the number of elements in a map
with a given key. For map
s, the result will always be 0
or 1
because there can be no elements with duplicate keys.
A map
allows you to remove an element at a specific iterator position or to remove all elements in a given iterator range, in amortized constant and logarithmic time, respectively. From the client perspective, these two erase()
methods are equivalent to those in the sequential containers. A great feature of a map
, however, is that it also provides a version of erase()
to remove an element matching a key. Here is an example:
map<int, Data> dataMap;
dataMap[1] = Data(4);
cout << "There are " << dataMap.count(1) << " elements with key 1" << endl;
dataMap.erase(1);
cout << "There are " << dataMap.count(1) << " elements with key 1" << endl;
The output is as follows:
There are 1 elements with key 1
There are 0 elements with key 1
All the ordered and unordered associative containers are so-called node-based data structures. Starting with C++17, the Standard Library provides direct access to nodes in the form of node handles. The exact type is unspecified, but each container has a type alias called node_type
that specifies the type of a node handle for that container. A node handle can only be moved, and is the owner of the element stored in a node. It provides read/write access to both the key and the value.
Nodes can be extracted from an associative container as a node handle using the extract()
method, based either on a given iterator position or on a given key. Extracting a node from a container removes it from the container, because the returned node handle is the sole owner of the extracted element.
New insert()
overloads are provided that allow you to insert a node handle into a container.
Using extract()
to extract node handles, and insert()
to insert node handles, you can effectively transfer data from one associative container to another one without any copying or moving involved. You can even move nodes from a map
to a multimap
, and from a set
to a multiset
. Continuing with the example from the previous section, the following code snippet transfers the node with key 1 to a second map
:
map<int, Data> dataMap2;
auto extractedNode = dataMap.extract(1);
dataMap2.insert(std::move(extractedNode));
The last two lines can be combined into one:
dataMap2.insert(dataMap.extract(1));
One additional operation is available to move all nodes from one associative container to another one: merge()
. Nodes that cannot be moved because they would cause, for example, duplicates in a target container that does not allow duplicates, are left in the source container. Here is an example:
map<int, int> src = { {1, 11}, {2, 22} };
map<int, int> dst = { {2, 22}, {3, 33}, {4, 44}, {5, 55} };
dst.merge(src);
After the merge operation, src
still contains one element, {2, 22}, because the destination already contains such an element, so it cannot be moved. dst
contains {1, 11}, {2, 22}, {3, 33}, {4, 44}, and {5, 55} after the operation.
You can implement a simple bank account database using a map
. A common pattern is for the key to be one field of a class
or struct
that is stored in a map
. In this case, the key is the account number. Here are simple BankAccount
and BankDB
classes:
class BankAccount final
{
public:
BankAccount(int acctNum, std::string_view name)
: mAcctNum(acctNum), mClientName(name) {}
void setAcctNum(int acctNum) { mAcctNum = acctNum; }
int getAcctNum() const { return mAcctNum; }
void setClientName(std::string_view name) { mClientName = name; }
std::string_view getClientName() const { return mClientName; }
private:
int mAcctNum;
std::string mClientName;
};
class BankDB final
{
public:
// Adds account to the bank database. If an account exists already
// with that number, the new account is not added. Returns true
// if the account is added, false if it's not.
bool addAccount(const BankAccount& account);
// Removes the account acctNum from the database.
void deleteAccount(int acctNum);
// Returns a reference to the account represented
// by its number or the client name.
// Throws out_of_range if the account is not found.
BankAccount& findAccount(int acctNum);
BankAccount& findAccount(std::string_view name);
// Adds all the accounts from db to this database.
// Deletes all the accounts from db.
void mergeDatabase(BankDB& db);
private:
std::map<int, BankAccount> mAccounts;
};
Here are the implementations of the BankDB
methods, with comments explaining the code:
bool BankDB::addAccount(const BankAccount& acct)
{
// Do the actual insert, using the account number as the key
auto res = mAccounts.emplace(acct.getAcctNum(), acct);
// or: auto res = mAccounts.insert(make_pair(acct.getAcctNum(), acct));
// Return the bool field of the pair specifying success or failure
return res.second;
}
void BankDB::deleteAccount(int acctNum)
{
mAccounts.erase(acctNum);
}
BankAccount& BankDB::findAccount(int acctNum)
{
// Finding an element via its key can be done with find()
auto it = mAccounts.find(acctNum);
if (it == end(mAccounts)) {
throw out_of_range("No account with that number.");
}
// Remember that iterators into maps refer to pairs of key/value
return it->second;
}
BankAccount& BankDB::findAccount(string_view name)
{
// Finding an element by a non-key attribute requires a linear
// search through the elements. Uses C++17 structured bindings.
for (auto& [acctNum, account] : mAccounts) {
if (account.getClientName() == name) {
return account; // found it!
}
}
// If your compiler doesn't support the above C++17 structured
// bindings yet, you can use the following implementation
//for (auto& p : mAccounts) {
// if (p.second.getClientName() == name) { return p.second; }
//}
throw out_of_range("No account with that name.");
}
void BankDB::mergeDatabase(BankDB& db)
{
// Use C++17 merge().
mAccounts.merge(db.mAccounts);
// Or: mAccounts.insert(begin(db.mAccounts), end(db.mAccounts));
// Now clear the source database.
db.mAccounts.clear();
}
You can test the BankDB
class with the following code:
BankDB db;
db.addAccount(BankAccount(100, "Nicholas Solter"));
db.addAccount(BankAccount(200, "Scott Kleper"));
try {
auto& acct = db.findAccount(100);
cout << "Found account 100" << endl;
acct.setClientName("Nicholas A Solter");
auto& acct2 = db.findAccount("Scott Kleper");
cout << "Found account of Scott Kleper" << endl;
auto& acct3 = db.findAccount(1000);
} catch (const out_of_range& caughtException) {
cout << "Unable to find account: " << caughtException.what() << endl;
}
The output is as follows:
Found account 100
Found account of Scott Kleper
Unable to find account: No account with that number.
A multimap
is a map
that allows multiple elements with the same key. Like map
s, multimap
s support uniform initialization. The interface is almost identical to the map
interface, with the following differences:
multimap
s do not provide operator[]
and at()
. The semantics of these do not make sense if there can be multiple elements with a single key.multimap
s always succeed. Thus, the multimap::insert()
method that adds a single element returns just an iterator
instead of a pair
.insert_or_assign()
and try_emplace()
methods supported by map
are not supported by a multimap
.The trickiest aspect of multimap
s is looking up elements. You can’t use operator[]
, because it is not provided. find()
isn’t very useful because it returns an iterator
referring to any one of the elements with a given key (not necessarily the first element with that key).
However, multimap
s store all elements with the same key together and provide methods to obtain iterator
s for this subrange of elements with the same key in the container. The lower_bound()
and upper_bound()
methods each return a single iterator
referring to the first and one-past-the-last elements matching a given key. If there are no elements matching that key, the iterator
s returned by lower_bound()
and upper_bound()
will be equal to each other.
If you need to obtain both iterator
s bounding the elements with a given key, it’s more efficient to use equal_range()
instead of calling lower_bound()
followed by calling upper_bound()
. The equal_range()
method returns a pair
of the two iterator
s that would be returned by lower_bound()
and upper_bound()
.
Most of the numerous online chat programs allow users to have a “buddy list” or list of friends. The chat program confers special privileges on users in the buddy list, such as allowing them to send unsolicited messages to the user.
One way to implement the buddy lists for an online chat program is to store the information in a multimap
. One multimap
could store the buddy lists for every user. Each entry in the container stores one buddy for a user. The key is the user and the value is the buddy. For example, if Harry Potter and Ron Weasley had each other on their individual buddy lists, there would be two entries of the form “Harry Potter” maps to “Ron Weasley” and “Ron Weasley” maps to “Harry Potter.” A multimap
allows multiple values for the same key, so the same user is allowed multiple buddies. Here is the BuddyList
class definition:
class BuddyList final
{
public:
// Adds buddy as a friend of name.
void addBuddy(const std::string& name, const std::string& buddy);
// Removes buddy as a friend of name
void removeBuddy(const std::string& name, const std::string& buddy);
// Returns true if buddy is a friend of name, false otherwise.
bool isBuddy(const std::string& name, const std::string& buddy) const;
// Retrieves a list of all the friends of name.
std::vector<std::string> getBuddies(const std::string& name) const;
private:
std::multimap<std::string, std::string> mBuddies;
};
Here is the implementation, with comments explaining the code. It demonstrates the use of lower_bound()
, upper_bound()
, and equal_range()
:
void BuddyList::addBuddy(const string& name, const string& buddy)
{
// Make sure this buddy isn't already there. We don't want
// to insert an identical copy of the key/value pair.
if (!isBuddy(name, buddy)) {
mBuddies.insert({ name, buddy }); // Using initializer_list
}
}
void BuddyList::removeBuddy(const string& name, const string& buddy)
{
// Obtain the beginning and end of the range of elements with
// key 'name'. Use both lower_bound() and upper_bound() to demonstrate
// their use. Otherwise, it's more efficient to call equal_range().
auto begin = mBuddies.lower_bound(name); // Start of the range
auto end = mBuddies.upper_bound(name); // End of the range
// Iterate through the elements with key 'name' looking
// for a value 'buddy'. If there are no elements with key 'name',
// begin equals end, so the loop body doesn't execute.
for (auto iter = begin; iter != end; ++iter) {
if (iter->second == buddy) {
// We found a match! Remove it from the map.
mBuddies.erase(iter);
break;
}
}
}
bool BuddyList::isBuddy(const string& name, const string& buddy) const
{
// Obtain the beginning and end of the range of elements with
// key 'name' using equal_range(), and C++17 structured bindings.
auto [begin, end] = mBuddies.equal_range(name);
// Iterate through the elements with key 'name' looking
// for a value 'buddy'.
for (auto iter = begin; iter != end; ++iter) {
if (iter->second == buddy) {
// We found a match!
return true;
}
}
// No matches
return false;
}
vector<string> BuddyList::getBuddies(const string& name) const
{
// Obtain the beginning and end of the range of elements with
// key 'name' using equal_range(), and C++17 structured bindings.
auto [begin, end] = mBuddies.equal_range(name);
// Create a vector with all names in the range (all buddies of name).
vector<string> buddies;
for (auto iter = begin; iter != end; ++iter) {
buddies.push_back(iter->second);
}
return buddies;
}
This implementation uses C++17 structured bindings, as in this example:
auto [begin, end] = mBuddies.equal_range(name);
If your compiler doesn’t support structured bindings yet, then you can write the following:
auto range = mBuddies.equal_range(name);
auto begin = range.first; // Start of the range
auto end = range.second; // End of the range
Note that removeBuddy()
can’t simply use the version of erase()
that erases all elements with a given key, because it should erase only one element with the key, not all of them. Note also that getBuddies()
can’t use insert()
on the vector
to insert the elements in the range returned by equal_range()
, because the elements referred to by the multimap iterator
s are key/value pair
s, not string
s. The getBuddies()
method must iterate explicitly through the range extracting the string
from each key/value pair
and pushing it onto the new vector
to be returned.
Here is a test of the BuddyList
:
BuddyList buddies;
buddies.addBuddy("Harry Potter", "Ron Weasley");
buddies.addBuddy("Harry Potter", "Hermione Granger");
buddies.addBuddy("Harry Potter", "Hagrid");
buddies.addBuddy("Harry Potter", "Draco Malfoy");
// That's not right! Remove Draco.
buddies.removeBuddy("Harry Potter", "Draco Malfoy");
buddies.addBuddy("Hagrid", "Harry Potter");
buddies.addBuddy("Hagrid", "Ron Weasley");
buddies.addBuddy("Hagrid", "Hermione Granger");
auto harrysFriends = buddies.getBuddies("Harry Potter");
cout << "Harry's friends: " << endl;
for (const auto& name : harrysFriends) {
cout << " " << name << endl;
}
The output is as follows:
Harry's friends:
Ron Weasley
Hermione Granger
Hagrid
A set
, defined in <set>
, is very similar to a map
. The difference is that instead of storing key/value pairs, in set
s the key is the value. set
s are useful for storing information in which there is no explicit key, but which you want to have in sorted order without any duplicates, with quick insertion, lookup, and deletion.
The interface supplied by set
is almost identical to that of map
. The main difference is that set
doesn’t provide operator[]
, insert_or_assign()
, and try_emplace()
.
You cannot change the key/value of elements in a set
because modifying elements of a set
while they are in the container would destroy the order.
One way to implement basic security on a computer system is through access control lists. Each entity on the system, such as a file or a device, has a list of users with permissions to access that entity. Users can generally be added to and removed from the permissions list for an entity only by users with special privileges. Internally, a set
provides a nice way to represent the access control list. You could use one set
for each entity, containing all the usernames who are allowed to access the entity. Here is a class definition for a simple access control list:
class AccessList final
{
public:
// Default constructor
AccessList() = default;
// Constructor to support uniform initialization.
AccessList(std::initializer_list<std::string_view> initlist);
// Adds the user to the permissions list.
void addUser(std::string_view user);
// Removes the user from the permissions list.
void removeUser(std::string_view user);
// Returns true if the user is in the permissions list.
bool isAllowed(std::string_view user) const;
// Returns a vector of all the users who have permissions.
std::vector<std::string> getAllUsers() const;
private:
std::set<std::string> mAllowed;
};
Here are the method definitions:
AccessList::AccessList(initializer_list<string_view> initlist)
{
mAllowed.insert(begin(initlist), end(initlist));
}
void AccessList::addUser(string_view user)
{
mAllowed.emplace(user);
}
void AccessList::removeUser(string_view user)
{
mAllowed.erase(string(user));
}
bool AccessList::isAllowed(string_view user) const
{
return (mAllowed.count(string(user)) != 0);
}
vector<string> AccessList::getAllUsers() const
{
return { begin(mAllowed), end(mAllowed) };
}
Take a look at the interesting one-line implementation of getAllUsers()
. That one line constructs a vector<string>
to return, by passing a begin and end iterator of mAllowed
to the vector
constructor. If you want, you can split this over two lines:
vector<string> users(begin(mAllowed), end(mAllowed));
return users;
Finally, here is a simple test program:
AccessList fileX = { "pvw", "mgregoire", "baduser" };
fileX.removeUser("baduser");
if (fileX.isAllowed("mgregoire")) {
cout << "mgregoire has permissions" << endl;
}
if (fileX.isAllowed("baduser")) {
cout << "baduser has permissions" << endl;
}
auto users = fileX.getAllUsers();
for (const auto& user : users) {
cout << user << " ";
}
One of the constructors for AccessList
uses an initializer_list
as a parameter, so that you can use the uniform initialization syntax, as demonstrated in the test program for initializing fileX
.
The output of this program is as follows:
mgregoire has permissions
mgregoire pvw
A multiset
is to a set
what a multimap
is to a map
. A multiset
supports all the operations of a set
, but it allows multiple elements that are equal to each other to be stored in the container simultaneously. An example of a multiset
is not shown because it’s so similar to set
and multimap
.
The Standard Library has support for unordered associative containers or hash tables. There are four of them: unordered_map
, unordered_multimap
, unordered_set
, and unordered_multiset
. The map
, multimap
, set
, and multiset
containers discussed earlier sort their elements, while these unordered variants do not sort their elements.
The unordered associative containers are also called hash tables. That is because the implementation makes use of so-called hash functions. The implementation usually consists of some kind of array where each element in the array is called a bucket. Each bucket has a specific numerical index like 0, 1, 2, up until the last bucket. A hash function transforms a key into a hash value, which is then transformed into a bucket index. The value associated with that key is then stored in that bucket.
The result of a hash function is not always unique. The situation in which two or more keys hash to the same bucket index is called a collision. A collision can occur when different keys result in the same hash value, or when different hash values transform to the same bucket index. There are many approaches to handling collisions, including quadratic re-hashing and linear chaining, among others. If you are interested, you can consult one of the references in the Algorithms and Data Structures section in Appendix B. The Standard Library does not specify which collision-handling algorithm is required, but most current implementations have chosen to resolve collisions by linear chaining. With linear chaining, buckets do not directly contain the data values associated with the keys, but contain a pointer to a linked list. This linked list contains all the data values for that specific bucket. Figure 17-1 shows how this works.
In Figure 17-1, there are two collisions. The first collision is because applying the hash function to the keys “Marc G.” and “John D.” results in the same hash value which maps to bucket index 128. This bucket then points to a linked list containing the keys “Marc G.” and “John D.” together with their associated data values. The second collision is caused by the hash values for “Scott K.” and “Johan G.” mapping to the same bucket index 129.
From Figure 17-1, it is also clear how lookups based on keys work and what the complexity is. A lookup involves a single hash function call to calculate the hash value. This hash value is then transformed to a bucket index. Once the bucket index is known, one or more equality operations are required to find the right key in the linked list. This shows that lookups can be much faster compared to lookups with normal map
s, but it all depends on how many collisions there are.
The choice of the hash function is very important. A hash function that creates no collisions is known as a “perfect hash.” A perfect hash has a lookup time that is constant; a regular hash has a lookup time that is, on average, close to 1, independent of the number of elements. As the number of collisions increases, the lookup time increases, reducing performance. Collisions can be reduced by increasing the basic hash table size, but you need to take cache sizes into account.
The C++ standard provides hash functions for pointers and all primitive data types such as bool
, char
, int
, float
, double
, and so on. Hash functions are also provided for error_code
, error_condition
, optional
, variant
, bitset
, unique_ptr
, shared_ptr
, type_index
, string
, string_view
, vector<bool>
, and thread::id
. If there is no standard hash function available for the type of keys you want to use, then you have to implement your own hash function. Creating a perfect hash is a nontrivial exercise, even when the set of keys is fixed and known. It requires deep mathematical analysis. However, even creating a non-perfect one, but one which is good enough and has decent performance, is still challenging. It’s outside the scope of this book to explain the mathematics behind hash functions in detail. Instead, only an example of a very simple hash function is given.
The following code demonstrates how to write a custom hash function. This implementation simply forwards the request to one of the available standard hash functions. The code defines a class IntWrapper
that just wraps a single integer. An operator==
is provided because that’s a requirement for keys used in unordered associative containers.
class IntWrapper
{
public:
IntWrapper(int i) : mWrappedInt(i) {}
int getValue() const { return mWrappedInt; }
private:
int mWrappedInt;
};
bool operator==(const IntWrapper& lhs, const IntWrapper& rhs)
{
return lhs.getValue() == rhs.getValue();
}
To write a hash function for IntWrapper
, you write a specialization of the std::hash
template for IntWrapper
. The std::hash
template is defined in <functional>
. This specialization needs an implementation of the function call operator that calculates and returns the hash of a given IntWrapper
instance. For this example, the request is simply forwarded to the standard hash function for integers:
namespace std
{
template<> struct hash<IntWrapper>
{
using argument_type = IntWrapper;
using result_type = size_t;
result_type operator()(const argument_type& f) const {
return std::hash<int>()(f.getValue());
}
};
}
Note that you normally are not allowed to put anything in the std
namespace; however, std
class template specializations are an exception to this rule. The two type definitions are required by the hash
class template. The implementation of the function call operator is just one line. It creates an instance of the standard hash function for integers—std::hash<int>()
—and then calls the function call operator on it with, as argument, f.getValue()
. Note that this forwarding works in this example because IntWrapper
contains just one data member, an integer. If the class contained multiple data members, then a hash value would need to be calculated taking all these data members into account; however, those details fall outside the scope of this book.
The unordered_map
container is defined in <unordered_map>
as a class template:
template <class Key,
class T,
class Hash = hash<Key>,
class Pred = std::equal_to<Key>,
class Alloc = std::allocator<std::pair<const Key, T>>>
class unordered_map;
There are five template parameters: the key type, the value type, the hash type, the equal comparison type, and the allocator type. With the last three parameters you can specify your own hash function, equal comparison function, and allocator function, respectively. These parameters can usually be ignored because they have default values. I recommend you keep those default values when possible. The most important parameters are the first two. As with map
s, uniform initialization can be used to initialize an unordered_map
, as shown in the following example:
unordered_map<int, string> m = {
{1, "Item 1"},
{2, "Item 2"},
{3, "Item 3"},
{4, "Item 4"}
};
// Using C++17 structured bindings.
for (const auto&[key, value] : m) {
cout << key << " = " << value << endl;
}
// Without structured bindings.
for (const auto& p : m) {
cout << p.first << " = " << p.second << endl;
}
The following table summarizes the differences between a map
and an unordered_map
. A filled box () means the container supports that operation, while an empty box () means the operation is not supported.
OPERATION | <char:InlineCodeUserInput>map</char:InlineCodeUserInput> | <char:InlineCodeUserInput>unordered_map</char:InlineCodeUserInput> |
at() |
||
begin() |
||
begin(n) |
||
bucket() |
||
bucket_count() |
||
bucket_size() |
||
cbegin() |
||
cbegin(n) |
||
cend() |
||
cend(n) |
||
clear() |
||
count() |
||
crbegin() |
||
crend() |
||
emplace() |
||
emplace_hint() |
||
empty() |
||
end() |
||
end(n) |
||
equal_range() |
||
erase() |
||
extract() |
||
find() |
||
insert() |
||
insert_or_assign() |
||
iterator / const_iterator |
||
load_factor() |
||
local_iterator / const_local_iterator |
||
lower_bound() |
||
max_bucket_count() |
||
max_load_factor() |
||
max_size() |
||
merge() |
||
operator[] |
||
rbegin() |
||
rehash() |
||
rend() |
||
reserve() |
||
reverse_iterator / const_reverse_iterator |
||
size() |
||
swap() |
||
try_emplace() |
||
upper_bound() |
As with a normal map
, all keys in an unordered_map
should be unique. The preceding table includes a number of hash-specific methods. For example, load_factor()
returns the average number of elements per bucket to give you an indication of the number of collisions. The bucket_count()
method returns the number of buckets in the container. It also provides a local_iterator
and const_local_iterator
, allowing you to iterate over the elements in a single bucket; however, these may not be used to iterate across buckets. The bucket(key)
method returns the index of the bucket that contains the given key; begin(n)
returns a local_iterator
referring to the first element in the bucket with index n
, and end(n)
returns a local_iterator
referring to one-past-the-last element in the bucket with index n
. The example in the next section demonstrates how to use these methods.
The following example uses an unordered_map
to represent a phone book. The name of a person is the key, while the phone number is the value associated with that key.
template<class T>
void printMap(const T& m)
{
for (auto& [key, value] : m) {
cout << key << " (Phone: " << value << ")" << endl;
}
cout << "-------" << endl;
}
int main()
{
// Create a hash table.
unordered_map<string, string> phoneBook = {
{ "Marc G.", "123-456789" },
{ "Scott K.", "654-987321" } };
printMap(phoneBook);
// Add/remove some phone numbers.
phoneBook.insert(make_pair("John D.", "321-987654"));
phoneBook["Johan G."] = "963-258147";
phoneBook["Freddy K."] = "999-256256";
phoneBook.erase("Freddy K.");
printMap(phoneBook);
// Find the bucket index for a specific key.
const size_t bucket = phoneBook.bucket("Marc G.");
cout << "Marc G. is in bucket " << bucket
<< " which contains the following "
<< phoneBook.bucket_size(bucket) << " elements:" << endl;
// Get begin and end iterators for the elements in this bucket.
// 'auto' is used here. The compiler deduces the type of
// both as unordered_map<string, string>::const_local_iterator
auto localBegin = phoneBook.cbegin(bucket);
auto localEnd = phoneBook.cend(bucket);
for (auto iter = localBegin; iter != localEnd; ++iter) {
cout << " " << iter->first << " (Phone: "
<< iter->second << ")" << endl;
}
cout << "-------" << endl;
// Print some statistics about the hash table
cout << "There are " << phoneBook.bucket_count() << " buckets." << endl;
cout << "Average number of elements in a bucket is "
<< phoneBook.load_factor() << endl;
return 0;
}
A possible output is as follows. Note that the output can be different on a different system, because it depends on the implementation of both the hash function and the unordered_map
itself being used.
Scott K. (Phone: 654-987321)
Marc G. (Phone: 123-456789)
-------
Scott K. (Phone: 654-987321)
Marc G. (Phone: 123-456789)
Johan G. (Phone: 963-258147)
John D. (Phone: 321-987654)
-------
Marc G. is in bucket 1 which contains the following 2 elements:
Scott K. (Phone: 654-987321)
Marc G. (Phone: 123-456789)
-------
There are 8 buckets.
Average number of elements in a bucket is 0.5
An unordered_multimap
is an unordered_map
that allows multiple elements with the same key. Their interfaces are almost identical, with the following differences:
unordered_multimap
s do not provide operator[]
and at()
. The semantics of these do not make sense if there can be multiple elements with a single key.unordered_multimap
s always succeed. Thus, the unordered_multimap::insert()
method that adds a single element returns just an iterator
instead of a pair
.insert_or_assign()
and try_emplace()
methods supported by unordered_map
are not supported by an unordered_multimap
.As discussed earlier with multimap
s, looking up elements in unordered_multimap
s cannot be done using operator[]
because it is not provided. You can use find()
but it returns an iterator referring to any one of the elements with a given key (not necessarily the first element with that key). Instead, it’s best to use the equal_range()
method, which returns a pair of iterators: one referring to the first element matching a given key, and one referring to one-past-the-last element matching that key. The use of equal_range()
is exactly the same as discussed for multimap
s, so you can look at the example given for multimap
s to see how it works.
The <unordered_set>
header file defines unordered_set
and unordered_multiset
, which are very similar to set
and multiset
, respectively, except that they do not sort their keys but they use a hash function. The differences between unordered_set
and unordered_map
are similar to the differences between set
and map
as discussed earlier in this chapter, so they are not discussed in detail here. Consult a Standard Library Reference for a thorough summary of unordered_set
and unordered_multiset
operations.
There are several other parts of the C++ language that work with the Standard Library to varying degrees, including standard C-style arrays, string
s, streams, and bitset
.
Recall that “dumb”/raw pointers are bona fide iterators because they support the required operations. This point is more than just a piece of trivia. It means that you can treat standard C-style arrays as Standard Library containers by using pointers to their elements as iterators. Standard C-style arrays, of course, don’t provide methods like size()
, empty()
, insert()
, and erase()
, so they aren’t true Standard Library containers. Nevertheless, because they do support iterators through pointers, you can use them in the algorithms described in Chapter 18 and in some of the methods described in this chapter.
For example, you could copy all the elements of a standard C-style array into a vector
using the insert()
method of a vector
that takes an iterator range from any container. This insert()
method’s prototype looks like this:
template <class InputIterator> iterator insert(const_iterator position,
InputIterator first, InputIterator last);
If you want to use a standard C-style int
array as the source, then the templatized type of InputIterator
becomes int*
. Here is a full example:
const size_t count = 10;
int arr[count]; // standard C-style array
// Initialize each element of the array to the value of its index.
for (int i = 0; i < count; i++) {
arr[i] = i;
}
// Insert the contents of the array at the end of a vector.
vector<int> vec;
vec.insert(end(vec), arr, arr + count);
// Print the contents of the vector.
for (const auto& i : vec) {
cout << i << " ";
}
Note that the iterator referring to the first element of the array is the address of the first element, which is arr
in this case. The name of an array alone is interpreted as the address of the first element. The iterator referring to the end must be one-past-the-last element, so it’s the address of the first element plus count
, or arr+count
.
It’s easier to use std::begin()
or std::cbegin()
to get an iterator to the first element of a statically allocated C-style array not accessed through pointers, and std::end()
or std::cend()
to get an iterator to one-past-the-last element of such an array. For example, the call to insert()
in the previous example can be written as follows:
vec.insert(end(vec), cbegin(arr), cend(arr));
You can think of a string
as a sequential container of characters. Thus, it shouldn’t be surprising to learn that a C++ string
is a full-fledged sequential container. It contains begin()
and end()
methods that return iterators into the string
, insert()
, push_back()
, and erase()
methods, size()
, empty()
, and all the rest of the sequential container basics. It resembles a vector
quite closely, even providing the methods reserve()
and capacity()
.
You can use string
as a Standard Library container just as you would use vector
. Here is an example:
string myString;
myString.insert(cend(myString), 'h');
myString.insert(cend(myString), 'e');
myString.push_back('l');
myString.push_back('l');
myString.push_back('o');
for (const auto& letter : myString) {
cout << letter;
}
cout << endl;
for (auto it = cbegin(myString); it != cend(myString); ++it) {
cout << *it;
}
cout << endl;
In addition to the Standard Library sequential container methods, string
s provide a host of useful methods and friend
functions. The string
interface is actually quite a good example of a cluttered interface, one of the design pitfalls discussed in Chapter 6. The string
class is discussed in detail in Chapter 2.
Input and output streams are not containers in the traditional sense because they do not store elements. However, they can be considered sequences of elements, and as such share some characteristics with Standard Library containers. C++ streams do not provide any Standard Library–related methods directly, but the Standard Library supplies special iterators called istream_iterator
and ostream_iterator
that allow you to “iterate” through input and output streams. Chapter 21 explains how to use them.
A bitset
is a fixed-length abstraction of a sequence of bits. A bit can represent only two values, 1 and 0, which can be referred to as on/off, true/false, and so on. A bitset
also uses the terminology set and unset. You can toggle or flip a bit from one value to the other.
A bitset
is not a true Standard Library container: it’s of fixed size, it’s not templatized on an element type, and it doesn’t support iteration. However, it’s a useful utility class, which is often lumped with the containers, so a brief introduction is provided here. Consult a Standard Library Reference for a thorough summary of the bitset
operations.
A bitset
, defined in <bitset>
, is templatized on the number of bits it stores. The default constructor initializes all fields of a bitset
to 0. An alternative constructor creates a bitset
from a string
of 0
and 1
characters.
You can adjust the values of individual bits with the set()
, reset()
, and flip()
methods, and you can access and set individual fields with an overloaded operator[]
. Note that operator[]
on a non-const
object returns a proxy object to which you can assign a Boolean value, call flip()
, or complement with operator~
. You can also access individual fields with the test()
method. Additionally, you can stream bitset
s with the normal insertion and extraction operators. A bitset
is streamed as a string
of 0
and 1
characters.
Here is a small example:
bitset<10> myBitset;
myBitset.set(3);
myBitset.set(6);
myBitset[8] = true;
myBitset[9] = myBitset[3];
if (myBitset.test(3)) {
cout << "Bit 3 is set!"<< endl;
}
cout << myBitset << endl;
The output is as follows:
Bit 3 is set!
1101001000
Note that the leftmost character in the output string
is the highest numbered bit. This corresponds to our intuitions about binary number representations, where the low-order bit representing 20 = 1 is the rightmost bit in the printed representation.
In addition to the basic bit manipulation routines, a bitset
provides implementations of all the bitwise operators: &
, |
, ^
, ~
, <<
, >>
, &=
, |=
, ^=
, <<=
, and >>=
. They behave just as they would on a “real” sequence of bits. Here is an example:
auto str1 = "0011001100";
auto str2 = "0000111100";
bitset<10> bitsOne(str1);
bitset<10> bitsTwo(str2);
auto bitsThree = bitsOne & bitsTwo;
cout << bitsThree << endl;
bitsThree <<= 4;
cout << bitsThree << endl;
The output of the program is as follows:
0000001100
0011000000
One possible use of bitset
s is tracking channels of cable subscribers. Each subscriber could have a bitset
of channels associated with their subscription, with set bits representing the channels to which they actually subscribe. This system could also support “packages” of channels, also represented as bitset
s, which represent commonly subscribed combinations of channels.
The following CableCompany
class is a simple example of this model. It uses two map
s, each of string
/bitset
. One stores the cable packages, while the other stores subscriber information.
const size_t kNumChannels = 10;
class CableCompany final
{
public:
// Adds the package with the specified channels to the database.
void addPackage(std::string_view packageName,
const std::bitset<kNumChannels>& channels);
// Removes the specified package from the database.
void removePackage(std::string_view packageName);
// Retrieves the channels of a given package.
// Throws out_of_range if the package name is invalid.
const std::bitset<kNumChannels>& getPackage(
std::string_view packageName) const;
// Adds customer to database with initial channels found in package.
// Throws out_of_range if the package name is invalid.
// Throws invalid_argument if the customer is already known.
void newCustomer(std::string_view name, std::string_view package);
// Adds customer to database with given initial channels.
// Throws invalid_argument if the customer is already known.
void newCustomer(std::string_view name,
const std::bitset<kNumChannels>& channels);
// Adds the channel to the customers profile.
// Throws invalid_argument if the customer is unknown.
void addChannel(std::string_view name, int channel);
// Removes the channel from the customers profile.
// Throws invalid_argument if the customer is unknown.
void removeChannel(std::string_view name, int channel);
// Adds the specified package to the customers profile.
// Throws out_of_range if the package name is invalid.
// Throws invalid_argument if the customer is unknown.
void addPackageToCustomer(std::string_view name,
std::string_view package);
// Removes the specified customer from the database.
void deleteCustomer(std::string_view name);
// Retrieves the channels to which a customer subscribes.
// Throws invalid_argument if the customer is unknown.
const std::bitset<kNumChannels>& getCustomerChannels(
std::string_view name) const;
private:
// Retrieves the channels for a customer. (non-const)
// Throws invalid_argument if the customer is unknown.
std::bitset<kNumChannels>& getCustomerChannelsHelper(
std::string_view name);
using MapType = std::map<std::string, std::bitset<kNumChannels>>;
MapType mPackages, mCustomers;
};
Here are the implementations of all methods, with comments explaining the code:
void CableCompany::addPackage(string_view packageName,
const bitset<kNumChannels>& channels)
{
mPackages.emplace(packageName, channels);
}
void CableCompany::removePackage(string_view packageName)
{
mPackages.erase(packageName.data());
}
const bitset<kNumChannels>& CableCompany::getPackage(
string_view packageName) const
{
// Get a reference to the specified package.
auto it = mPackages.find(packageName.data());
if (it == end(mPackages)) {
// That package doesn't exist. Throw an exception.
throw out_of_range("Invalid package");
}
return it->second;
}
void CableCompany::newCustomer(string_view name, string_view package)
{
// Get the channels for the given package.
auto& packageChannels = getPackage(package);
// Create the account with the bitset representing that package.
newCustomer(name, packageChannels);
}
void CableCompany::newCustomer(string_view name,
const bitset<kNumChannels>& channels)
{
// Add customer to the customers map.
auto result = mCustomers.emplace(name, channels);
if (!result.second) {
// Customer was already in the database. Nothing changed.
throw invalid_argument("Duplicate customer");
}
}
void CableCompany::addChannel(string_view name, int channel)
{
// Get the current channels for the customer.
auto& customerChannels = getCustomerChannelsHelper(name);
// We found the customer; set the channel.
customerChannels.set(channel);
}
void CableCompany::removeChannel(string_view name, int channel)
{
// Get the current channels for the customer.
auto& customerChannels = getCustomerChannelsHelper(name);
// We found this customer; remove the channel.
customerChannels.reset(channel);
}
void CableCompany::addPackageToCustomer(string_view name, string_view package)
{
// Get the channels for the given package.
auto& packageChannels = getPackage(package);
// Get the current channels for the customer.
auto& customerChannels = getCustomerChannelsHelper(name);
// Or-in the package to the customer's existing channels.
customerChannels |= packageChannels;
}
void CableCompany::deleteCustomer(string_view name)
{
mCustomers.erase(name.data());
}
const bitset<kNumChannels>& CableCompany::getCustomerChannels(
string_view name) const
{
// Use const_cast() to forward to getCustomerChannelsHelper()
// to avoid code duplication.
return const_cast<CableCompany*>(this)->getCustomerChannelsHelper(name);
}
bitset<kNumChannels>& CableCompany::getCustomerChannelsHelper(
string_view name)
{
// Find a reference to the customer.
auto it = mCustomers.find(name.data());
if (it == end(mCustomers)) {
throw invalid_argument("Unknown customer");
}
// Found it.
// Note that 'it' is a reference to a name/bitset pair.
// The bitset is the second field.
return it->second;
}
Finally, here is a simple program demonstrating how to use the CableCompany
class:
CableCompany myCC;
auto basic_pkg = "1111000000";
auto premium_pkg = "1111111111";
auto sports_pkg = "0000100111";
myCC.addPackage("basic", bitset<kNumChannels>(basic_pkg));
myCC.addPackage("premium", bitset<kNumChannels>(premium_pkg));
myCC.addPackage("sports", bitset<kNumChannels>(sports_pkg));
myCC.newCustomer("Marc G.", "basic");
myCC.addPackageToCustomer("Marc G.", "sports");
cout << myCC.getCustomerChannels("Marc G.") << endl;
The output is as follows:
1111100111
This chapter introduced the Standard Library containers. It also presented sample code illustrating a variety of uses for these containers. Hopefully, you appreciate the power of vector
, deque
, list
, forward_list
, array
, stack
, queue
, priority_queue
, map
, multimap
, set
, multiset
, unordered_map
, unordered_multimap
, unordered_set
, unordered_multiset
, string
, and bitset
. Even if you don’t incorporate them into your programs immediately, you should keep them in the back of your mind for future projects.
Now that you are familiar with the containers, the next chapter illustrates the true power of the Standard Library with a discussion of generic algorithms.