16
Overview of the C++ Standard Library

CODING PRINCIPLES

The Standard Library makes heavy use of the C++ features called templates and operator overloading.

Use of Templates

Templates are used to allow generic programming. They make it possible to write code that can work with all kinds of objects, even objects unknown to the programmer when writing the code. The obligation of the programmer writing the template code is to specify the requirements of the classes that define these objects; for example, that they have an operator for comparison, or a copy constructor, or whatever is deemed appropriate, and then making sure the code that is written uses only those required capabilities. The obligation of the programmer creating the objects is to supply those operators and methods that the template writer requires.

Unfortunately, many programmers consider templates to be the most difficult part of C++ and, for that reason, tend to avoid them. However, even if you never write your own templates, you need to understand their syntax and capabilities in order to use the Standard Library. Templates are described in detail in Chapter 12. If you skipped that chapter and are not familiar with templates, I suggest you first read Chapter 12, and then come back to learn more about the Standard Library.

Use of Operator Overloading

Operator overloading is another feature used extensively by the C++ Standard Library. Chapter 9 has a whole section devoted to operator overloading. Make sure you read that section and understand it before tackling this and subsequent chapters. In addition, Chapter 15 presents much more detail on the subject of operator overloading, but those details are not required to understand the following chapters.

OVERVIEW OF THE C++ STANDARD LIBRARY

This section introduces the various components of the Standard Library from a design perspective. You will learn what facilities are available for you to use, but you will not learn the coding details. Those details are covered in other chapters.

Strings

C++ provides a built-in string class, defined in the <string> header. Although you may still use C-style strings of character arrays, the C++ string class is superior in almost every way. It handles the memory management; provides some bounds checking, assignment semantics, and comparisons; and supports manipulations such as concatenation, substring extraction, and substring or character replacement.

The Standard Library also provides a string_view class, defined in <string_view>. It is a read-only view of all kind of string representations, and can be used as a drop-in replacement for const string&, but without the overhead. It never copies strings!

C++ provides support for Unicode and localization. These features allow you to write programs that work with different languages, for example, Chinese. Locales, defined in <locale>, allow you to format data such as numbers and dates according to the rules of a certain country or region.

In case you missed it, Chapter 2 provides all the details of the string and string_view classes, while Chapter 19 discusses Unicode and localization.

Regular Expressions

Regular expressions are available through the <regex> header file. They make it easy to perform so-called pattern-matching, often used in text processing. Pattern-matching allows you to search special patterns in strings and optionally to replace them with a new pattern. Regular expressions are discussed in Chapter 19.

I/O Streams

C++ includes a model for input and output called streams. The C++ library provides routines for reading and writing built-in types from and to files, console/keyboard, and strings. C++ also provides the facilities for coding your own routines for reading and writing your own objects. The I/O functionality is defined in several header files: <fstream>, <iomanip>, <ios>, <iosfwd>, <iostream>, <istream>, <ostream>, <sstream>, <streambuf>, and <strstream>. Chapter 1 reviews the basics of I/O streams, and Chapter 13 discusses streams in detail.

Smart Pointers

One of the problems faced in robust programming is knowing when to delete an object. There are several failures that can happen. A first problem is not deleting the object at all (failing to free the storage). These are known as memory leaks, where objects accumulate and take up space but are not used. Another problem is where a piece of code deletes the storage while another piece of code is still pointing to that storage, resulting in pointers to storage that either is no longer in use or has been reallocated for another purpose. These are known as dangling pointers. One more problem is when one piece of code frees the storage, and another piece of code attempts to free the same storage. This is known as double-freeing. All of these problems tend to result in program failures of some sort. Some failures are readily detected; others cause the program to produce erroneous results. Most of these errors are difficult to discover and repair.

C++ addresses these problems with smart pointers: unique_ptr, shared_ptr, and weak_ptr. shared_ptr and weak_ptr are thread-safe. They are all defined in the <memory> header. These smart pointers are introduced in Chapter 1 and discussed in more detail in Chapter 7.

Before C++11, the functionality of unique_ptr was handled by a type called auto_ptr, which has been removed from C++17 and should not be used anymore. There was no equivalent to shared_ptr in the earlier Standard Library, although many third-party libraries (for example, Boost) did provide this capability.

Exceptions

The C++ language supports exceptions, which allow functions or methods to pass errors of various types up to calling functions or methods. The C++ Standard Library provides a class hierarchy of exceptions that you can use in your code as is, or that you can derive from to create your own exception types. Exception support is defined in a couple of header files: <exception>, <stdexcept>, and <system_error>. Chapter 14 covers the details of exceptions and the standard exception classes.

Mathematical Utilities

The C++ Standard Library provides a collection of mathematical utility classes and functions.

A whole range of common mathematical functions is available, such as abs(), remainder(), fma(), exp(), log(), pow(), sqrt(), sin(), atan2(), sinh(), erf(), tgamma(), ceil(), floor(), and more. C++17 adds a number of special mathematical functions to work with Legendre polynomials, beta functions, elliptic integrals, Bessel functions, cylindrical Neumann functions, and so on.

There is a complex number class called complex, defined in <complex>, which provides an abstraction for working with numbers that contain both real and imaginary components.

The compile-time rational arithmetic library provides a ratio class template, defined in the <ratio> header file. This ratio class template can exactly represent any finite rational number defined by a numerator and denominator. This library is discussed in Chapter 20.

The Standard Library also contains a class called valarray, defined in <valarray>, which is similar to the vector class but is more optimized for high-performance numerical applications. The library provides several related classes to represent the concept of vector slices. From these building blocks, it is possible to build classes to perform matrix mathematics. There is no built-in matrix class; however, there are third-party libraries like Boost that include matrix support. The valarray class is not further discussed in this book.

C++ also provides a standard way to obtain information about numeric limits, such as the maximum possible value for an integer on the current platform. In C, you could access #defines, such as INT_MAX. While those are still available in C++, it’s recommended to use the numeric_limits class template defined in the <limits> header file. Its use is straightforward, as shown in the following code:

cout << "int:" << endl;
cout << "Max int value: " << numeric_limits<int>::max() << endl;
cout << "Min int value: " << numeric_limits<int>::min() << endl;
cout << "Lowest int value: " << numeric_limits<int>::lowest() << endl;

cout << endl << "double:" << endl;
cout << "Max double value: " << numeric_limits<double>::max() << endl;
cout << "Min double value: " << numeric_limits<double>::min() << endl;
cout << "Lowest double value: " << numeric_limits<double>::lowest() << endl;

The output of this code snippet on my system is as follows:

int:
Max int value: 2147483647
Min int value: -2147483648
Lowest int value: -2147483648

double:
Max double value: 1.79769e+308
Min double value: 2.22507e-308
Lowest double value: -1.79769e+308

Note the differences between min() and lowest(). For an integer, the minimum value equals the lowest value. However, for floating-point types, the minimum value is the smallest positive value that can be represented, while the lowest value is the most negative value representable, which equals -max().

Time Utilities

C++ includes the chrono library, defined in the <chrono> header file. This library makes it easy to work with time; for example, to time certain durations, or to perform actions based on timing. The chrono library is discussed in detail in Chapter 20. Other time and date utilities are provided in the <ctime> header.

Random Numbers

C++ already has support for generating pseudo-random numbers for a long time with the srand() and rand() functions. However, those functions provide only very basic random numbers. For example, you cannot change the distribution of the generated random numbers.

Since C++11, a random number library has been added to the standard, which is much more powerful. The new library is defined in <random>, and comes with random number engines, random number engine adaptors, and random number distributions. All of these can be used to give you random numbers more suited to your problem domain, such as normal distributions, negative exponential distributions, and so on.

Consult Chapter 20 for details on this library.

Initializer Lists

Initializer lists are defined in the <initializer_list> header file. They make it easy to write functions that can accept a variable number of arguments and are discussed in Chapter 1.

Pair and Tuple

The <utility> header defines the pair template, which can store two elements with two different types. This is known as storing heterogeneous elements. All Standard Library containers discussed further in this chapter store homogeneous elements, meaning that all the elements in the container must have the same type. A pair allows you to store exactly two elements of completely unrelated types in one object. The pair class template is discussed in more detail in Chapter 17.

tuple, defined in <tuple>, is a generalization of pair. It is a sequence with a fixed size that can have heterogeneous elements. The number and type of elements for a tuple instantiation is fixed at compile time. Tuples are discussed in Chapter 20.

image optional, variant, and any

C++17 introduces the following new classes:

  • optional, defined in <optional>, holds a value of a specific type, or nothing. It can be used for parameters or return types of a function if you want to allow for values to be optional.
  • variant, defined in <variant>, can hold a single value of one of a given set of types, or nothing.
  • any, defined in <any>, is a class that can contain a single value of any type.

These three types are discussed in Chapter 20.

Function Objects

A class that implements a function call operator is called a function object. Function objects can, for example, be used as predicates for certain Standard Library algorithms. The <functional> header file defines a number of predefined function objects and supports creating new function objects based on existing ones.

Function objects are discussed in detail in Chapter 18 together with the Standard Library algorithms.

image Filesystem

C++17 introduces a filesystem support library. Everything is defined in the <filesystem> header, and lives in the std::filesystem namespace. It allows you to write portable code to work with a filesystem. You can use it for querying whether something is a directory or a file, iterating over the contents of a directory, manipulating paths, and retrieving information about files such as their size, extension, creation time, and so on. This filesystem support library is discussed in Chapter 20.

Multithreading

All major CPU vendors are selling processors with multiple cores. They are being used for everything from servers to consumer computers and even smartphones. If you want your software to take advantage of all these cores, then you need to write multithreaded code. The Standard Library provides a couple of basic building blocks for writing such code. Individual threads can be created using the thread class from the <thread> header.

In multithreaded code you need to take care that several threads are not reading and writing to the same piece of data at the same time. To prevent this, you can use atomics, defined in <atomic>, which give you thread-safe atomic access to a piece of data. Other thread synchronization mechanisms are provided by <condition_variable> and <mutex>.

If you just need to calculate something, possibly on a different thread, and get the result back with proper exception handling, you can use async and future. These are defined in the <future> header, and are easier to use than directly using the thread class.

Writing multithreaded code is discussed in detail in Chapter 23.

Type Traits

Type traits are defined in the <type_traits> header file and provide information about types at compile time. They are useful when writing advanced templates and are discussed in Chapter 22.

Standard Integer Types

The <cstdint> header file defines a number of standard integer types such as int8_t, int64_t and so on. It also includes macros specifying minimum and maximum values of those types. These integer types are discussed in the context of writing cross-platform code in Chapter 30.

Containers

The Standard Library provides implementations of commonly used data structures such as linked lists and queues. When you use C++, you should not need to write such data structures again. The data structures are implemented using a concept called containers, which store information called elements, in a way that implements the data structure (linked list, queue, and so on) appropriately. Different data structures have different insertion, deletion, and access behavior and performance characteristics. It is important to be familiar with the available data structures so that you can choose the most appropriate one for any given task.

All the containers in the Standard Library are class templates, so you can use them to store any type, from built-in types such as int and double to your own classes. Each container instance stores objects of only one type; that is, they are homogeneous collections. If you need non-fixed-sized heterogeneous collections, you can wrap each element in an std::any instance and store those any instances in a container. Alternatively, you can store std::variant instances in a container. A variant can be used if the number of different required types is limited and known at compile time. Both any and variant are introduced with C++17, and are discussed in Chapter 20. If you needed heterogeneous collections before C++17, you could create a class that had multiple derived classes, and each derived class could wrap an object of your required type.

Note that the C++ standard specifies the interface, but not the implementation, of each container and algorithm. Thus, different vendors are free to provide different implementations. However, the standard also specifies performance requirements as part of the interface, which the implementations must meet.

This section provides an overview of the various containers available in the Standard Library.

vector

The <vector> header file defines vector, which stores a sequence of elements and provides random access to these elements. You can think of a vector as an array of elements that grows dynamically as you insert elements, and additionally provides some bounds checking. Like an array, the elements of a vector are stored in contiguous memory.

vectors provide fast element insertion and deletion (amortized constant time) at the end of the vector. Amortized constant time insertion means that most of the time insertions are done in constant time O(1) (Chapter 4 explains big-O notation). However, sometimes the vector needs to grow in size to accommodate new elements, which has a complexity of O(N). On average, this results in O(1) complexity or amortized constant time. Details are explained in Chapter 17. A vector has slower (linear time) insertion and deletion anywhere else, because the operation must move all the elements “up” or “down” by one to make room for the new element or to fill the space left by the deleted element. Like arrays, vectors provide fast (constant time) access to any of their elements.

Even though inserting and removing elements in the middle of a vector requires moving other elements up or down, a vector should be your default container! Often, a vector will be faster than, for example, a linked list, even for inserting and removing elements in the middle. The reason is that a vector is stored contiguously in memory, while a linked list is scattered around in memory. Computers are extremely efficient to work with contiguous data, which makes vector operations fast. You should only use something like a linked list if a performance profiler (discussed in Chapter 25) tells you that it performs better than a vector.

There is a template specialization available for vector<bool> to store Boolean values in a vector. This specialization optimizes space allocation for the Boolean elements; however, the standard does not specify how an implementation of vector<bool> should optimize space. The difference between the vector<bool> specialization and the bitset discussed further in this chapter is that the bitset container is of fixed size.

list

A Standard Library list is a doubly linked list structure and is defined in <list>. Like an array or vector, it stores a sequence of elements. However, unlike an array or vector, the elements of a list are not necessarily contiguous in memory. Instead, each element in the list specifies where to find the next and previous elements in the list (usually via pointers), hence the name doubly linked list.

The performance characteristics of a list are the exact opposite of a vector. They provide slow (linear time) element lookup and access, but quick (constant time) insertion and deletion of elements once the relevant position has been found. Still, as discussed in the previous section, a vector is usually faster than a list. Use a profiler to be sure.

forward_list

The forward_list, defined in <forward_list>, is a singly linked list, compared to the list container, which is doubly linked. forward_list supports forward iteration only, and requires less memory than a list. Like lists, forward_lists allow constant time insertion and deletion anywhere once the relevant position has been found, and there is no fast random access to elements.

deque

The name deque is an abbreviation for a double-ended queue. A deque, defined in <deque>, provides quick (constant time) element access. It also provides fast (constant time) insertion and deletion at both ends of the sequence, but it provides slow (linear time) insertion and deletion in the middle of the sequence. The elements of a deque are not stored contiguously in memory, and thus a deque might be slower than a vector.

You could use a deque instead of a vector when you need to insert or remove elements from either end of the sequence but still need fast access to all elements. However, this requirement does not apply to many programming problems; in most cases, a vector is recommended.

array

The <array> header defines array, which is a replacement for standard C-style arrays. Sometimes you know the exact number of elements in your container up front and you don’t need the flexibility of a vector or a list, which are able to grow dynamically to accommodate new elements. An array is perfect for such fixed-sized collections, and it has a bit less overhead compared to a vector; it’s basically a thin wrapper around standard C-style arrays. There are a number of advantages in using arrays instead of standard C-style arrays: they always know their own size, and do not automatically get cast to a pointer to avoid certain types of bugs. Also, arrays do not provide insertion or deletion; they have a fixed size. The advantage of having a fixed size is that this allows an array to be allocated on the stack, rather than always demanding heap access as vector does. Access to elements is very fast (constant time), just as with vectors.

queue

The name queue comes directly from the definition of the English word queue, which means a line of people or objects. The queue container is defined in <queue> and provides standard first in, first out (or FIFO) semantics. A queue is a container in which you insert elements at one end and take them out at the other end. Both insertion (amortized constant time) and removal (constant time) of elements are quick.

You should use a queue structure when you want to model real-life “first-come, first-served” semantics. For example, consider a bank. As customers arrive at the bank, they get in line. As tellers become available, they serve the next customer in line, thus providing “first-come, first-served” behavior. You could implement a bank simulation by storing customer objects in a queue. As customers arrive at the bank, they are added to the end of the queue. As tellers serve customers, they start with customers at the front of the queue.

priority_queue

A priority_queue, also defined in <queue>, provides queue functionality in which each element has a priority. Elements are removed from the queue in priority order. In the case of priority ties, the order in which elements are removed is undefined. priority_queue insertion and deletion are generally slower than simple queue insertion and deletion, because the elements must be reordered to support the priority ordering.

You can use priority_queues to model “queues with exceptions.” For example, in the preceding bank simulation, suppose that customers with business accounts take priority over regular customers. Many real-life banks implement this behavior with two separate lines: one for business customers and one for everyone else. Any customers in the business queue are taken before customers in the other line. However, banks could also provide this behavior with a single line in which business customers move to the front of the line ahead of any non-business customers. In your program, you could use a priority_queue in which customers have one of two priorities: business or regular. All business customers would be serviced before all regular customers.

stack

The <stack> header defines the stack class, which provides standard first-in, last-out (FILO) semantics, also known as last-in, first-out (LIFO). Like a queue, elements are inserted and removed from the container. However, in a stack, the most recent element inserted is the first one removed. The name stack derives from a visualization of this structure as a stack of objects in which only the top object is visible. When you add an object to the stack, you hide all the objects underneath it.

The stack container provides fast (constant time) insertion and removal of elements. You should use the stack structure when you want FILO semantics. For example, an error-processing tool might want to store errors on a stack so that the most recent error is the first one available for a human administrator to read. Processing errors in a FILO order is often useful because newer errors sometimes obviate older ones.

set and multiset

The set class template is defined in the <set> header file, and, as the name suggests, it is a set of elements, loosely analogous to the notion of a mathematical set: each element is unique, and there is at most one instance of the element in the set. One difference between the mathematical concept of set, and set as implemented in the Standard Library, is that in the Standard Library the elements are kept in an order. The reason for the order is that when the client enumerates the elements, they come out in the ordering imposed by the type’s operator< or a user-defined comparator. The set provides logarithmic insertion, deletion, and lookup. This means that, in theory, insertions and deletions are faster than for a vector but slower than for a list; while lookups are faster than for a list, but slower than for a vector. As always, use a profiler to make sure which container is faster for your use case.

You could use a set when you need the elements to be in an order, to have equal amounts of insertion/deletion and lookups, and you want to optimize performance for both as much as possible. For example, an inventory-tracking program in a busy bookstore might want to use a set to store the books. The list of books in stock must be updated whenever books arrive or are sold, so insertion and deletion should be quick. Customers also need the ability to look for a specific book, so the program should provide fast lookup as well.

Note that a set does not allow duplicate elements. That is, each element in a set must be unique. If you want to store duplicate elements, you must use a multiset, also defined in the <set> header file.

map and multimap

The <map> header defines the map class template, which is an associative array. You can use it as an array in which the index can be any type; for example, a string. A map stores key/value pairs, and keeps its elements in sorted order, based on the keys, not the values. It also provides an operator[], which a set does not provide. In most other respects, it is identical to a set. You could use a map when you want to associate keys and values. For example, in the preceding bookstore example, you might want to store the books in a map where the key is the ISBN number of the book and the value is a Book object containing detailed information for that specific book.

A multimap, also defined in <map>, has the same relation to a map as a multiset does to a set. Specifically, a multimap is a map that allows duplicate keys.

Unordered Associative Containers / Hash Tables

The Standard Library supports hash tables, also called unordered associative containers. There are four unordered associative containers:

  • unordered_map
  • unordered_multimap
  • unordered_set
  • unordered_multiset

The first two containers are defined in <unordered_map>, and the other two containers in <unordered_set>. Better names would have been hash_map, hash_set, and so on. Unfortunately, hash tables were not part of the C++ Standard Library before C++11, which means a lot of third-party libraries implemented hash tables themselves by using names with “hash” as a prefix, like hash_map. Because of this, the C++ standard committee decided to use the prefix “unordered” instead of “hash” to avoid name clashes.

These unordered associative containers behave similar to their ordered counterparts. An unordered_map is similar to a standard map except that the standard map sorts its elements while the unordered_map doesn’t sort its elements.

Insertion, deletion, and lookup with these unordered associative containers can be done on average in constant time. In a worst-case scenario, it will be in linear time. Lookup of elements in an unordered container can be much faster than with a normal map or set, especially when there are a lot of elements in the container.

Chapter 17 explains how these unordered associative containers work and why they are also called hash tables.

bitset

C and C++ programmers commonly store a set of flags in a single int or long, using one bit for each flag. They set and access these bits with the bitwise operators: &, |, ^, ~, <<, and >>. The C++ Standard Library provides a bitset class that abstracts this bit field manipulation, so you shouldn’t need to use the bit manipulation operators anymore.

The <bitset> header file defines the bitset container, but this is not a container in the normal sense, in that it does not implement a specific data structure in which you insert and remove elements. A bitset has a fixed size and does not support iterators. You can think of them as a sequence of Boolean values that you can read and write. However, unlike the C-style way of handling bits, the bitset is not limited to the size of an int or other elementary data types. Thus, you can have a 40-bit bitset, or a 213-bit bitset. The implementation will use as much storage as it needs to implement N bits when you declare your bitset with bitset<N>.

Summary of Standard Library Containers

The following table summarizes the containers provided by the Standard Library. It uses the big-O notation introduced in Chapter 4 to present the performance characteristics on a container of N elements. An N/A entry in the table means that the operation is not part of the container semantics.

CONTAINER CLASS NAME CONTAINER TYPE INSERT PERFORMANCE DELETE PERFORMANCE LOOKUP PERFORMANCE
vector Sequential Amortized O(1) at the end; O(N) otherwise. O(1) at the end; O(N) otherwise. O(1)
When to Use: This should be your default container. Only use another container after using a profiler to confirm it is faster than a vector.
list Sequential O(1) at the beginning and the end, and once you are at the position where you want to insert the element. O(1) at the beginning and the end, and once you are at the position where you want to delete the element. O(1) to access the first or last element; O(N) otherwise.
When to Use: Rarely. You should use a vector, unless a profiler tells you a list is faster for your use case.
forward_list Sequential O(1) at the beginning, and once you are at the position where you want to insert the element. O(1) at the beginning, and once you are at the position where you want to delete the element. O(1) to access the first element; O(N) otherwise.
When to Use: Rarely. You should use a vector, unless a profiler tells you a forward_list is faster for your use case.
deque Sequential O(1) at the beginning or end; O(N) otherwise. O(1) at the beginning or end; O(N) otherwise. O(1)
When to Use: Not usually needed; use a vector instead.
array Sequential N/A N/A O(1)
When to Use: When you need a fixed-size array to replace a standard C-style array.
queue Container adaptor Depends on the underlying container; O(1) for list and deque. Depends on the underlying container; O(1) for list and deque. N/A
When to Use: When you want a FIFO structure.
priority_queue Container adaptor Depends on the underlying container; amortized O(log(N)) for vector, O(log(N)) for deque. Depends on the underlying container; O(log(N)) for vector and deque. N/A
When to Use: When you want a queue with priority.
stack Container adaptor Depends on the underlying container; O(1) for list and deque, amortized O(1) for vector. Depends on the underlying container; O(1) for list, vector, and deque. N/A
When to Use: When you want a FILO/LIFO structure.
set
multiset
Sorted associative O(log(N)) O(log(N)) O(log(N))
When to Use: When you want a sorted collection of elements with equal lookup, insertion, and deletion times. Use a set when you want a collection of elements without duplicates.
map
multimap
Sorted associative O(log(N)) O(log(N)) O(log(N))
When to Use: When you want a sorted collection to associate keys with values, that is, an associative array, with equal lookup, insertion, and deletion times.
unordered_map
unordered_multimap
Unordered associative / hash table Average case O(1); worst case O(N). Average case O(1); worst case O(N). Average case O(1); worst case O(N).
When to Use: When you want to associate keys with values with equal lookup, insertion, and deletion times, and you don’t require the elements to be sorted. Performance can be better than with a normal map, but that depends on the elements.
unordered_set
unordered_multiset
Unordered associative/hash table Average case O(1); worst case O(N). Average case O(1); worst case O(N). Average case O(1); worst case O(N).
When to Use: When you want a collection of elements with equal lookup, insertion, and deletion times, and you don’t require the elements to be sorted. Performance can be better than with a normal set, but that depends on the elements.
bitset Special N/A N/A O(1)
When to Use: When you want a collection of flags.

Note that strings are technically containers as well. They can be thought of as vectors of characters. Thus, some of the algorithms described in the material that follows also work on strings.

Algorithms

In addition to containers, the Standard Library provides implementations of many generic algorithms. An algorithm is a strategy for performing a particular task, such as sorting or searching. These algorithms are implemented as function templates, so they work on most of the different container types. Note that the algorithms are not generally part of the containers. The Standard Library takes the approach of separating the data (containers) from the functionality (algorithms). Although this approach seems counter to the spirit of object-oriented programming, it is necessary in order to support generic programming in the Standard Library. The guiding principle of orthogonality maintains that algorithms and containers are independent, with (almost) any algorithm working with (almost) any container.

Note that the generic algorithms do not work directly on the containers. They use an intermediary called an iterator. Each container in the Standard Library provides an iterator that supports traversing the elements in the container in a sequence. The different iterators for the various containers adhere to standard interfaces, so algorithms can perform their work by using iterators without worrying about the underlying container implementation. The <iterator> header defines the following helper functions that return specific iterators for containers.

FUNCTION NAME FUNCTION SYNOPSIS
begin()
end()
Returns a non-const iterator to the first, and one past the last, element in a sequence.
cbegin()
cend()
Returns a const iterator to the first, and one past the last, element in a sequence.
rbegin()
rend()
Returns a non-const reverse iterator to the last, and one before the first, element in a sequence.
crbegin()
crend()
Returns a const reverse iterator to the last, and one before the first, element in a sequence.

This section gives an overview of what kinds of algorithms are available in the Standard Library without going into detail. Chapter 17 goes deeper into iterators, and Chapter 18 discusses a selection of algorithms with coding examples. For the exact prototypes of all the available algorithms, consult a Standard Library Reference.

There are approximately 100 algorithms in the Standard Library, depending on how you count them. The following sections divide these algorithms into different categories. The algorithms are defined in the <algorithm> header file unless otherwise noted. Note that whenever the following algorithms are specified as working on a “sequence” of elements, that sequence is presented to the algorithm via iterators.

Non-modifying Sequence Algorithms

The non-modifying algorithms are those that look at a sequence of elements and return some information about the elements. As “non-modifying” algorithms, they cannot change the values of elements or the order of elements within the sequence. This category contains three types of algorithms. The following tables list and provide brief summaries of the various non-modifying algorithms. With these algorithms, you should rarely need to write a for loop to iterate over a sequence of values.

Search Algorithms

These algorithms do not require the sequence to be sorted. N is the size of the sequence to search in, and M is the size of the pattern to find.

ALGORITHM NAME ALGORITHM SYNOPSIS COMPLEXITY
adjacent_find() Finds the first instance of two consecutive elements that are equal to each other or are equivalent to each other as specified by a predicate. O(N)
find()
find_if()
Finds the first element that matches a value or causes a predicate to return true. O(N)
find_first_of() Like find, but searches for one of several elements at the same time. O(NM)
find_if_not() Finds the first element that causes a predicate to return false. O(N)
find_end() Finds the last subsequence in a sequence that matches another sequence or whose elements are equivalent, as specified by a predicate. O(M*(N-M))
search() Finds the first subsequence in a sequence that matches another sequence or whose elements are equivalent, as specified by a predicate.* O(NM)*
search_n() Finds the first instance of n consecutive elements that are equal to a given value or relate to that value according to a predicate. O(N)

*Since C++17, search() accepts an optional extra parameter to specify the searching algorithm to use (default_searcher, boyer_moore_searcher, or boyer_moore_horspool_searcher). With the Boyer-Moore searchers, the worst-case complexity is O(N+M) when the pattern is not found, and O(NM) when the pattern is found.

Comparison Algorithms

The following comparison algorithms are provided. None of them require the source sequences to be ordered. All of them have a linear worst-case complexity.

ALGORITHM NAME ALGORITHM SYNOPSIS
equal() Determines if two sequences are equal by checking if parallel elements are equal or match a predicate.
mismatch() Returns the first element in each sequence that does not match the element in the same location in the other sequence.
lexicographical_compare() Compares two sequences to determine their “lexicographical” ordering. This algorithm compares each element of the first sequence with its equivalent element in the second. If one element is less than the other, that sequence is lexicographically first. If the elements are equal, it compares the next elements in order.
Counting Algorithms
ALGORITHM NAME ALGORITHM SYNOPSIS
all_of() Returns true if the predicate returns true for all the elements in the sequence or if the sequence is empty; false otherwise.
any_of() Returns true if the predicate returns true for at least one element in the sequence; false otherwise.
none_of() Returns true if the predicate returns false for all the elements in the sequence or if the sequence is empty; false otherwise.
count()
count_if()
Counts the number of elements matching a value or that cause a predicate to return true.

Modifying Sequence Algorithms

The modifying algorithms modify some or all of the elements in a sequence. Some of them modify elements in place, so that the original sequence changes. Others copy the results to a different sequence, so that the original sequence remains unchanged. All of them have a linear worst-case complexity. The following table summarizes the modifying algorithms.

ALGORITHM NAME ALGORITHM SYNOPSIS
copy()
copy_backward()
Copies elements from one sequence to another.
copy_if() Copies elements for which a predicate returns true from one sequence to another.
copy_n() Copies n elements from one sequence to another.
fill() Sets all elements in the sequence to a new value.
fill_n() Sets the first n elements in the sequence to a new value.
generate() Calls a specified function to generate a new value for each element in the sequence.
generate_n() Calls a specified function to generate a new value for the first n elements in the sequence.
move()
move_backward()
Moves elements from one sequence to another. This uses efficient move semantics (see Chapter 9).
remove()
remove_if()
remove_copy()
remove_copy_if()
Removes elements that match a given value or that cause a predicate to return true, either in place or by copying the results to a different sequence.
replace()
replace_if()
replace_copy()
replace_copy_if()
Replaces all elements matching a value or that cause a predicate to return true with a new element, either in place or by copying the results to a different sequence.
reverse()
reverse_copy()
Reverses the order of the elements in the sequence, either in place or by copying the results to a different sequence.
rotate()
rotate_copy()
Swaps the first and second “halves” of the sequence, either in place or by copying the results to a different sequence. The two subsequences to be swapped need not be equal in size.
image sample() Selects n random elements from the sequence.
shuffle()
random_shuffle()
Shuffles the sequence by randomly reordering the elements. It is possible to specify the properties of the random number generator used for shuffling. random_shuffle() is deprecated since C++14, and is removed from C++17.
transform() Calls a unary function on each element of a sequence or a binary function on parallel elements of two sequences. This is an in-place transformation.
unique()
unique_copy()
Removes consecutive duplicates from the sequence, either in place or by copying results to a different sequence.

Operational Algorithms

Operational algorithms execute a function on individual elements of a sequence. There are two operational algorithms provided. Both have a linear complexity and do not require the source sequence to be ordered.

ALGORITHM NAME ALGORITHM SYNOPSIS
for_each() Executes a function on each element in the sequence. The sequence is specified with a begin and end iterator.
image for_each_n() Similar to for_each() but only processes the first n elements in the sequence. The sequence is specified by a begin iterator and a number of elements (n).

Swap and Exchange Algorithms

The C++ Standard Library provides the following swap and exchange algorithms.

ALGORITHM NAME ALGORITHM SYNOPSIS
iter_swap()
swap_ranges()
Swaps two elements or sequences of elements.
swap() Swaps two values, defined in the <utility> header.
image exchange() Replaces a given value with a new value and returns the old value. Defined in the <utility> header.

Partition Algorithms

A sequence is partitioned on a certain predicate, if all elements for which the predicate returns true are before all elements for which it returns false. The first element in the sequence that does not satisfy the predicate is called the partition point. The Standard Library provides the following partition algorithms.

ALGORITHM NAME ALGORITHM SYNOPSIS COMPLEXITY
is_partitioned() Returns true if all elements for which a predicate returns true are before all elements for which it returns false. Linear
partition() Sorts the sequence such that all elements for which a predicate returns true are before all elements for which it returns false, without preserving the original order of the elements within each partition. Linear
stable_partition() Sorts the sequence such that all elements for which a predicate returns true are before all elements for which it returns false, while preserving the original order of the elements within each partition. Linear logarithmic
partition_copy() Copies elements from one sequence to two different sequences. The target sequence is selected based on the result of a predicate, either true or false. Linear
partition_point() Returns an iterator such that all elements before this iterator return true for a predicate and all elements after this iterator return false for that predicate. Logarithmic

Sorting Algorithms

The Standard Library provides several different sorting algorithms with varying performance guarantees.

ALGORITHM NAME ALGORITHM SYNOPSIS COMPLEXITY
is_sorted()
is_sorted_until()
Checks if a sequence is sorted or which subsequence is sorted. Linear
nth_element() Relocates the nth element of the sequence such that the element in the position pointed to by nth is the element that would be in that position if the whole range were sorted, and it rearranges all elements such that all elements preceding the nth element are less than the new nth element, and the ones following it are greater than the new nth element. Linear
partial_sort()
partial_sort_copy()
Partially sorts the sequence: the first n elements (specified by iterators) are sorted; the rest are not. They are sorted either in place or by copying them to a new sequence. Linear logarithmic
sort()
stable_sort()
Sorts elements in place, either preserving the order of duplicate elements or not. Linear logarithmic

Binary Search Algorithms

The following binary search algorithms are normally used on sorted sequences. Technically, they only require the sequence to be at least partitioned on the element that is searched for. This could, for example, be achieved by applying std::partition(). A sorted sequence also meets this requirement. All these algorithms have logarithmic complexity.

ALGORITHM NAME ALGORITHM SYNOPSIS
lower_bound() Finds the first element in a sequence not less than (that is greater or equal to) a given value.
upper_bound() Finds the first element in a sequence greater than a given value.
equal_range() Returns a pair containing the result of both lower_bound() and upper_bound().
binary_search() Returns true if a given value is found in a sequence; false otherwise.

Set Algorithms

Set algorithms are special modifying algorithms that perform set operations on sequences. They are most appropriate on sequences from set containers, but work on sorted sequences from most containers.

ALGORITHM NAME ALGORITHM SYNOPSIS COMPLEXITY
inplace_merge() Merges two sorted sequences in place. Linear logarithmic
merge() Merges two sorted sequences by copying them to a new sequence. Linear
includes() Determines if every element from one sorted sequence is in another sorted sequence. Linear
set_union()
set_intersection()
set_difference()
set_symmetric_difference()
Performs the specified set operation on two sorted sequences, copying results to a third sorted sequence. Linear

Heap Algorithms

A heap is a standard data structure in which the elements of an array or sequence are ordered in a semi-sorted fashion, so that finding the “top” element is quick. For example, a heap data structure is typically used to implement a priority_queue. Six algorithms allow you to work with heaps.

ALGORITHM NAME ALGORITHM SYNOPSIS COMPLEXITY
is_heap() Checks if a range of elements is a heap. Linear
is_heap_until() Finds the largest subrange in the given range of elements that is a heap. Linear
make_heap() Creates a heap from a range of elements. Linear
push_heap()
pop_heap()
Adds an element to, or removes an element from a heap. Logarithmic
sort_heap() Converts a heap into a range of ascending sorted elements. Linear logarithmic

Minimum/Maximum Algorithms

ALGORITHM NAME ALGORITHM SYNOPSIS
image clamp() Makes sure a value (v) is between a given minimum (lo) and maximum (hi). Returns a reference to lo if v < lo; returns a reference to hi if v > hi; otherwise returns a reference to v.
min()
max()
Returns the minimum or maximum of two or more values.
minmax() Returns the minimum and maximum of two or more values as a pair.
min_element()
max_element()
Returns the minimum or maximum element in a sequence.
minmax_element() Returns the minimum and maximum element in a sequence as a pair.

Numerical Processing Algorithms

The <numeric> header provides the following numerical processing algorithms. None of them require the source sequences to be ordered. All of them have a linear complexity.

ALGORITHM NAME ALGORITHM SYNOPSIS
iota() Fills a sequence with successively incrementing values starting with a given value.
image gcd() Returns the greatest common divisor of two integer types.
image lcm() Returns the least common multiple of two integer types.
adjacent_difference() Generates a new sequence in which each element is the difference (or other binary operation) of the second and first of each adjacent pair of elements in the source sequence.
partial_sum() Generates a new sequence in which each element is the sum (or other binary operation) of an element and all its preceding elements in the source sequence.
image exclusive_scan()
inclusive_scan()
These are similar to partial_sum(). An inclusive scan is identical to a partial sum if the given summation operation is associative. However, inclusive_scan() sums in a non-deterministic order, while partial_sum() left to right, so for non-associative summation operations the result of the former is non-deterministic. The exclusive_scan() algorithm also sums in a non-deterministic order.
For inclusive_scan(), the ith element is included in the ith sum, just as for partial_sum(). For exclusive_scan(), the ith element is not included in the ith sum.
image transform_exclusive_scan()
transform_inclusive_scan()
Applies a transformation to each element in a sequence, then performs an exclusive/inclusive scan.
accumulate() “Accumulates” the values of all the elements in a sequence. The default behavior is to sum the elements, but the caller can supply a different binary function instead.
inner_product() Similar to accumulate(), but works on two sequences. This algorithm calls a binary function (multiplication by default) on parallel elements in the sequences, accumulating the result using another binary function (addition by default). If the sequences represent mathematical vectors, the algorithm calculates the dot product of the vectors.
image reduce() Similar to accumulate(), but supports parallel execution. The order of evaluation for reduce() is non-deterministic, while it’s from left to right for accumulate(). This means that the behavior of the former is non-deterministic if the given binary operation is not associative or not commutative.
image transform_reduce() Applies a transformation to each element in a sequence, then performs a reduce().

Permutation Algorithms

A permutation of a sequence contains the same elements but in a different order. The following algorithms are provided to work with permutations.

ALGORITHM NAME ALGORITHM SYNOPSIS COMPLEXITY
is_permutation() Returns true if the elements in one range are a permutation of the elements in another range. Quadratic
next_permutation()
prev_permutation()
Modifies the sequence by transforming it into its “next” or “previous” lexicographical permutation. Successive calls to one or the other will permute the sequence into all possible permutations of its elements, if you start with a properly sorted sequence. This algorithm returns false if no more permutations exist. Linear

Choosing an Algorithm

The number and capabilities of the algorithms might overwhelm you at first. It can also be difficult to see how to apply them in the beginning. However, now that you have an idea of the available options, you are better able to tackle your program designs. The following chapters cover the details of how to use these algorithms in your code.

What’s Missing from the Standard Library

The Standard Library is powerful, but it’s not perfect. Here are two examples of missing functionality:

  • The Standard Library does not guarantee any thread safety for accessing containers simultaneously from multiple threads.
  • The Standard Library does not provide any generic tree or graph structures. Although maps and sets are generally implemented as balanced binary trees, the Standard Library does not expose this implementation in the interface. If you need a tree or graph structure for something like writing a parser, you need to implement your own or find an implementation in another library.

It is important to keep in mind that the Standard Library is extensible. You can write your own containers or algorithms that work with existing algorithms or containers. So, if the Standard Library doesn’t provide exactly what you need, consider writing your desired code such that it works with the Standard Library. Chapter 21 covers the topic of customizing and extending the Standard Library.

SUMMARY

This chapter provided an overview of the C++ Standard Library, which is the most important library that you will use in your code. It subsumes the C library, and includes additional facilities for strings, I/O, error handling, and other tasks. It also includes generic containers and algorithms. The following chapters describe the Standard Library in more detail.

NOTE

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset