21
Customizing and Extending the Standard Library

ALLOCATORS

Every Standard Library container takes an Allocator type as a template parameter, for which the default usually suffices. For example, the vector template definition looks like this:

template <class T, class Allocator = allocator<T>> class vector;

The container constructors then allow you to specify an object of type Allocator. This permits you to customize the way the containers allocate memory. Every memory allocation performed by a container is made with a call to the allocate() method of the Allocator object. Conversely, every deallocation is performed with a call to the deallocate() method of the Allocator object. The Standard Library provides a default Allocator class called allocator, which implements these methods as wrappers for operator new and operator delete.

If you want containers in your program to use a custom memory allocation and deallocation scheme, you can write your own Allocator class. There are several reasons for using custom allocators. For example, if the underlying allocator has unacceptable performance, there are alternatives that can be constructed. Or, if memory fragmentation is a problem (a lot of different allocations and deallocations leaving unusable, small holes in memory), a single “pool” of objects of one type can be created, called a memory pool. When OS-specific capabilities, such as shared memory segments, must be allocated, using custom allocators allows the use of Standard Library containers in those shared memory segments. The use of custom allocators is complex, and there are many potential problems if you are not careful, so this should not be approached lightly.

Any class that provides allocate(), deallocate(), and several other required methods and type aliases can be used in place of the default allocator class.

image C++17 introduces the concept of polymorphic memory allocators. Basically, the problem with the allocator for a container being specified as a template type parameter is that two containers that are similar but have different allocator types are completely different. For example, two vector<int> containers with different Allocator template type parameters are different, and so cannot be assigned to one another.

The polymorphic memory allocators, defined in <memory_resource> in the std::pmr namespace, help to solve this problem. The class std::pmr::polymorphic_allocator is a proper Allocator class because it satisfies all the requirements, such as having an allocate() and deallocate() method. The allocation behavior of a polymorphic_allocator depends on the memory_resource it’s given during construction, and not on any template type parameters. As such, different polymorphic_allocators can behave in completely different ways when allocating and deallocating memory, even though they all have the same type, that is, polymorphic_allocator. The standard provides some built-in memory resources that you can use to initialize a polymorphic memory allocator: synchronized_pool_resource, unsynchronized_pool_resource, and monotonic_buffer_resource.

However, in my experience, both custom allocators and polymorphic memory allocators are rather advanced and rarely used features. I’ve never used them myself, so a detailed discussion falls outside the scope of this book. For more information, consult one of the books listed in Appendix B that specifically covers the C++ Standard Library.

STREAM ITERATORS

The Standard Library provides four stream iterators. These are iterator-like class templates that allow you to treat input and output streams as input and output iterators. Using these stream iterators, you can adapt input and output streams so that they can serve as sources and destinations, respectively, for various Standard Library algorithms. The following stream iterators are available:

  • ostream_iterator—an output stream iterator
  • istream_iterator—an input stream iterator

There is also an ostreambuf_iterator and an istreambuf_iterator, but these are rarely used and are not further discussed here.

Output Stream Iterator

The ostream_iterator class is an output stream iterator. It is a class template that takes the element type as a type parameter. Its constructor takes an output stream and a delimiter string to write to the stream following each element. The ostream_iterator class writes elements using operator<<.

For example, you can use the ostream_iterator with the copy() algorithm to print the elements of a container with only one line of code. The first parameter of copy() is the start iterator of the range to copy, the second parameter is the end iterator of the range, and the third parameter is the destination iterator:

vector<int> myVector(10);
iota(begin(myVector), end(myVector), 1);   // Fill vector with 1,2,3...10

// Print the contents of the vector.
copy(cbegin(myVector), cend(myVector), ostream_iterator<int>(cout, " "));

The output is as follows:

1 2 3 4 5 6 7 8 9 10

Input Stream Iterator

You can use the input stream iterator, istream_iterator, to read values from an input stream using the iterator abstraction. It is a class template that takes the element type as a type parameter. Elements are read using operator>>. You can use an istream_iterator as a source for algorithms and container methods.

For example, the following piece of code reads integers from the console until the end of the stream is reached. On Windows, this happens when you press Ctrl+Z followed by Enter, while on Linux you press Enter followed by Ctrl+D. The accumulate() algorithm is used to sum all the integers together. Note that the default constructor of istream_iterator creates an end iterator.

cout << "Enter numbers separated by white space." << endl;
cout << "Press Ctrl+Z followed by Enter to stop." << endl;
istream_iterator<int> numbersIter(cin);
istream_iterator<int> endIter;
int sum = accumulate(numbersIter, endIter, 0);
cout << "Sum: " << sum << endl;

Take a moment to reflect on this code. If you remove all the output statements and the variable declarations, the only line left is the call to accumulate(). Thanks to algorithms and input stream iterators, this single line of code reads any number of integers from the console and sums them together, without using any explicit loops.

ITERATOR ADAPTORS

The Standard Library provides three iterator adaptors, which are special iterators built on top of other iterators. All three are defined in the <iterator> header. It’s also possible to write your own iterator adaptors, but this is not covered in this book. Consult one of the books on the Standard Library listed in Appendix B for details.

Reverse Iterators

The Standard Library provides an std::reverse_iterator class template that iterates through a bidirectional or random access iterator in a reverse direction. Every reversible container in the Standard Library, which happens to be every container that’s part of the standard except forward_list and the unordered associative containers, supplies a reverse_iterator type alias and methods called rbegin() and rend(). These reverse_iterator type aliases are of type std::reverse_iterator<T> with T equal to the iterator type alias of the container. The method rbegin() returns a reverse_iterator pointing to the last element of the container, and rend() returns a reverse_iterator pointing to the element before the first element of the container. Applying operator++ to a reverse_iterator calls operator-- on the underlying container iterator, and vice versa. For example, iterating over a collection from the beginning to the end can be done as follows:

for (auto iter = begin(collection); iter != end(collection); ++iter) {}

Iterating over the elements in the collection from the end to the beginning can be done using a reverse_iterator by calling rbegin() and rend(). Note that you still call ++iter:

for (auto iter = rbegin(collection); iter != rend(collection); ++iter) {}

An std::reverse_iterator is useful mostly with algorithms in the Standard Library that have no equivalents that work in reverse order. For example, the basic find() algorithm searches for the first element in a sequence. If you want to find the last element in the sequence, you can use a reverse_iterator instead. Note that when you call an algorithm such as find() with a reverse_iterator, it returns a reverse_iterator as well. You can always obtain the underlying iterator from a reverse_iterator by calling the base() method on the reverse_iterator. However, due to the implementation details of reverse_iterator, the iterator returned from base() always refers to one element past the element referred to by the reverse_iterator on which it’s called. In order to get to the same element, you must subtract one.

Here is an example of find() with a reverse_iterator:

// The implementation of populateContainer() is identical to that shown in
// Chapter 18, so it is omitted here.

vector<int> myVector;
populateContainer(myVector);

int num;
cout << "Enter a number to find: ";
cin >> num;

auto it1 = find(begin(myVector), end(myVector), num);
auto it2 = find(rbegin(myVector), rend(myVector), num);
if (it1 != end(myVector)) {
    cout << "Found " << num << " at position " << it1 - begin(myVector)
         << " going forward." << endl;
    cout << "Found " << num << " at position "
         << it2.base() - 1 - begin(myVector) << " going backward." << endl;
} else {
    cout << "Failed to find " << num << endl;
}

A possible output of this program is as follows:

Enter a number (0 to quit): 11
Enter a number (0 to quit): 22
Enter a number (0 to quit): 33
Enter a number (0 to quit): 22
Enter a number (0 to quit): 11
Enter a number (0 to quit): 0
Enter a number to find: 22
Found 22 at position 1 going forward.
Found 22 at position 3 going backward.

Insert Iterators

As Chapter 18 mentions, algorithms like copy() don’t insert elements into a container; they simply replace old elements in a range with new ones. In order to make algorithms like copy() more useful, the Standard Library provides three insert iterator adaptors that actually insert elements into a container: insert_iterator, back_insert_iterator, and front_insert_iterator. They are all templatized on a container type, and take the actual container reference in their constructor. Because they supply the necessary iterator interfaces, these adaptors can be used as the destination iterators of algorithms like copy(). However, instead of replacing elements in the container, they make calls on their container to actually insert new elements.

The basic insert_iterator calls insert(position, element) on the container, the back_insert_iterator calls push_back(element), and the front_insert_iterator calls push_front(element).

The following example uses a back_insert_iterator with the copy_if() algorithm to populate vectorTwo with all elements from vectorOne that are not equal to 100:

vector<int> vectorOne, vectorTwo;
populateContainer(vectorOne);

back_insert_iterator<vector<int>> inserter(vectorTwo);
copy_if(cbegin(vectorOne), cend(vectorOne), inserter,
    [](int i){ return i != 100; });

copy(cbegin(vectorTwo), cend(vectorTwo), ostream_iterator<int>(cout, " "));

As you can see, when you use insert iterators, you don’t need to size the destination containers ahead of time.

You can also use the std::back_inserter() utility function to create a back_insert_iterator. In the previous example, you can remove the line that defines the inserter variable, and rewrite the copy_if() call as follows. The result is exactly the same as the previous implementation:

copy_if(cbegin(vectorOne), cend(vectorOne),
    back_inserter(vectorTwo), [](int i){ return i != 100; });

With C++17’s template argument deduction for constructors, this can also be written as follows:

copy_if(cbegin(vectorOne), cend(vectorOne),
    back_insert_iterator(vectorTwo), [](int i) { return i != 100; });

The front_insert_iterator and insert_iterator work similarly, except that the insert_iterator also takes an initial iterator position in its constructor, which it passes to the first call to insert(position, element). Subsequent iterator position hints are generated based on the return value from each insert() call.

One benefit of using an insert_iterator is that it allows you to use associative containers as destinations of the modifying algorithms. Chapter 18 explains that the problem with associative containers is that you are not allowed to modify the elements over which you iterate. By using an insert_iterator, you can instead insert elements. Associative containers actually support a form of insert() that takes an iterator position, and are supposed to use the position as a “hint,” which they can ignore. When you use an insert_iterator on an associative container, you can pass the begin() or end() iterator of the container to use as the hint. The insert_iterator modifies the iterator hint that it passes to insert() after each call to insert(), such that the position is one past the just-inserted element.

Here is the previous example modified so that the destination container is a set instead of a vector:

vector<int> vectorOne;
set<int> setOne;
populateContainer(vectorOne);

insert_iterator<set<int>> inserter(setOne, begin(setOne));
copy_if(cbegin(vectorOne), cend(vectorOne), inserter,
    [](int i){ return i != 100; });

copy(cbegin(setOne), cend(setOne), ostream_iterator<int>(cout, " "));

Similar to the back_insert_iterator example, you can use the std::inserter() utility function to create an insert_iterator:

copy_if(cbegin(vectorOne), cend(vectorOne), 
    inserter(setOne, begin(setOne)),
    [](int i){ return i != 100; });

Or, you can use C++17’s template argument deduction for constructors:

copy_if(cbegin(vectorOne), cend(vectorOne),
    insert_iterator(setOne, begin(setOne)),
    [](int i) { return i != 100; });

Move Iterators

Chapter 9 discusses move semantics, which can be used to prevent unnecessary copying in cases where you know that the source object will be destroyed after an assignment operation or copy construction. There is an iterator adaptor called std::move_iterator. The dereferencing operator for a move_iterator automatically converts the value to an rvalue reference, which means that the value can be moved to a new destination without the overhead of copying. Before you can use move semantics, you need to make sure your objects are supporting it. The following MoveableClass supports move semantics. For more details, see Chapter 9.

class MoveableClass
{
    public:
        MoveableClass() {
            cout << "Default constructor" << endl;
        }
        MoveableClass(const MoveableClass& src) {
            cout << "Copy constructor" << endl;
        }
        MoveableClass(MoveableClass&& src) noexcept {
            cout << "Move constructor" << endl;
        }
        MoveableClass& operator=(const MoveableClass& rhs) {
            cout << "Copy assignment operator" << endl;
            return *this;
        }
        MoveableClass& operator=(MoveableClass&& rhs) noexcept {
            cout << "Move assignment operator" << endl;
            return *this;
        }
};

The constructors and assignment operators are not doing anything useful here, except printing a message to make it easy to see which one is being called. Now that you have this class, you can define a vector and store a few MoveableClass instances in it as follows:

vector<MoveableClass> vecSource;
MoveableClass mc;
vecSource.push_back(mc);
vecSource.push_back(mc);

The output could be as follows:

Default constructor  // [1]
Copy constructor     // [2]
Copy constructor     // [3]
Move constructor     // [4]

The second line of the code creates a MoveableClass instance by using the default constructor, [1]. The first push_back() call triggers the copy constructor to copy mc into the vector, [2]. After this operation, the vector has space for one element, the first copy of mc. Note that this discussion is based on the growth strategy and the initial size of a vector as implemented by Microsoft Visual C++ 2017. The C++ standard does not specify the initial capacity of a vector or its growth strategy, so the output can be different with different compilers.

The second push_back() call triggers the vector to resize itself, to allocate space for the second element. This resizing causes the move constructor to be called to move every element from the old vector to the new resized vector, [4]. The copy constructor is triggered to copy mc a second time into the vector, [3]. The order of moving and copying is undefined, so [3] and [4] could be reversed.

You can create a new vector called vecOne that contains a copy of the elements from vecSource as follows:

vector<MoveableClass> vecOne(cbegin(vecSource), cend(vecSource));

Without using move_iterators, this code triggers the copy constructor two times, once for every element in vecSource:

Copy constructor
Copy constructor

By using std::make_move_iterator() to create move_iterators, the move constructor of MoveableClass is called instead of the copy constructor:

vector<MoveableClass> vecTwo(make_move_iterator(begin(vecSource)),
                             make_move_iterator(end(vecSource)));

This generates the following output:

Move constructor
Move constructor

You can also use C++17’s template argument deduction for constructors:

vector<MoveableClass> vecTwo(move_iterator(begin(vecSource)),
                             move_iterator(end(vecSource)));

EXTENDING THE STANDARD LIBRARY

The Standard Library includes many useful containers, algorithms, and iterators that you can use in your applications. It is impossible, however, for any library to include all possible utilities that all potential clients might need. Thus, the best libraries are extensible: they allow clients to adapt and add to the basic capabilities to obtain exactly the functionality they require. The Standard Library is inherently extensible because of its fundamental structure of separating data from the algorithms that operate on them. You can write your own containers that can work with the Standard Library algorithms by providing iterators that conforms to the Standard Library guidelines. Similarly, you can write your own algorithms that work with iterators from the standard containers. Note that you are not allowed to put your own containers and algorithms in the std namespace.

Why Extend the Standard Library?

If you sit down to write an algorithm or container in C++, you can either make it adhere to the Standard Library conventions or not. For simple containers and algorithms, it might not be worth the extra effort to follow the Standard Library requirements. However, for substantial code that you plan to reuse, the effort pays off. First, the code will be easier for other C++ programmers to understand, because you follow well-established interface guidelines. Second, you will be able to use your container or algorithm with the other parts of the Standard Library (algorithms or containers) without needing to provide special hacks or adaptors. Finally, it will force you to employ the necessary rigor required to develop solid code.

Writing a Standard Library Algorithm

Chapter 18 describes a useful set of algorithms that are part of the Standard Library, but you will inevitably encounter situations in your programs for which you need new algorithms. When that happens, it is usually not difficult to write your own algorithm that works with Standard Library iterators just like the standard algorithms.

find_all()

Suppose that you want to find all the elements matching a predicate in a given range. The find() and find_if() algorithms are the most likely candidates, but each returns an iterator referring to only one element. You can use copy_if() to find all elements matching a given predicate, but it fills the output with copies of the found elements. If you want to avoid copies, you can use copy_if() with a back_insert_iterator into a vector<reference_wrapper<T>>, but this does not give you the position of the found elements. In fact, there is no standard algorithm to get iterators to all the elements matching a predicate. However, you can write your own version of this functionality called find_all().

The first task is to define the function prototype. You can follow the model established by copy_if(), that is, a function template with three template type parameters: the input iterator type, the output iterator type, and the predicate type. The arguments of the function are start and end iterators of the input sequence, a start iterator of the output sequence, and a predicate object. As with copy_if(), the algorithm returns an iterator into the output sequence that is one-past-the-last element stored in the output sequence. Here is the prototype:

template <typename InputIterator, typename OutputIterator, typename Predicate>
OutputIterator find_all(InputIterator first, InputIterator last,
                        OutputIterator dest, Predicate pred);

Another option would be to omit the output iterator, and to return an iterator into the input sequence that iterates over all the matching elements in the input sequence. This would require you to write your own iterator class, which is discussed later in this chapter.

The next task is to write the implementation. The find_all() algorithm iterates over all elements in the input sequence, calls the predicate on each element, and stores iterators of matching elements in the output sequence. Here is the implementation:

template <typename InputIterator, typename OutputIterator, typename Predicate>
OutputIterator find_all(InputIterator first, InputIterator last,
                        OutputIterator dest, Predicate pred)
{
    while (first != last) {
        if (pred(*first)) {
            *dest = first;
            ++dest;
        }
        ++first;
    }
    return dest;
}

Similar to copy_if(), the algorithm only overwrites existing elements in the output sequence, so make sure the output sequence is large enough to hold the result, or use an iterator adaptor such as back_insert_iterator, as demonstrated in the following code. After finding all matching elements, the code counts the number of elements found, which is the number of iterators in matches. Then, it iterates through the result, printing each element.

vector<int> vec{ 3, 4, 5, 4, 5, 6, 5, 8 };
vector<vector<int>::iterator> matches;

find_all(begin(vec), end(vec), back_inserter(matches),
    [](int i){ return i == 5; });

cout << "Found " << matches.size() << " matching elements: " << endl;
for (const auto& it : matches) {
    cout << *it << " at position " << (it - cbegin(vec)) << endl;;
}

The output is as follows:

Found 3 matching elements:
5 at position 2
5 at position 4
5 at position 6

Iterator Traits

Some algorithm implementations need additional information about their iterators. For example, they might need to know the type of the elements referred to by the iterator in order to store temporary values, or perhaps they want to know whether the iterator is bidirectional or random access.

C++ provides a class template called iterator_traits that allows you to find this information. You instantiate the iterator_traits class template with the iterator type of interest, and access one of five type aliases: value_type, difference_type, iterator_category, pointer, or reference. For example, the following function template declares a temporary variable of the type that an iterator of type IteratorType refers to. Note the use of the typename keyword in front of the iterator _traits line. You must specify typename explicitly whenever you access a type based on one or more template parameters. In this case, the template parameter IteratorType is used to access the value_type type.

#include <iterator>

template <typename IteratorType>
void iteratorTraitsTest(IteratorType it)
{
   typename std::iterator_traits<IteratorType>::value_type temp;
   temp = *it;
   cout << temp << endl;
}

This function can be tested with the following code:

vector<int> v{ 5 };
iteratorTraitsTest(cbegin(v));

With this code, the variable temp in iteratorTraitsTest() is of type int. The output is 5.

In this example, the auto keyword could be used to simplify the code, but that wouldn’t show you how to use iterator_traits.

Writing a Standard Library Container

The C++ standard contains a list of requirements that any container must fulfill in order to qualify as a Standard Library container.

Additionally, if you want your container to be sequential (like a vector), ordered associative (like a map), or unordered associative (like an unordered_map), it must conform to supplementary requirements.

My suggestion when writing a new container is to write the basic container first, following the general Standard Library rules such as making it a class template, but without worrying too much yet about the specific details of Standard Library conformity. After you’ve developed the basic implementation, you can add the iterator and methods so that it can work with the Standard Library framework. This chapter takes that approach to develop a hash map.

A Basic Hash Map

C++11 added support for hash tables, which are discussed in Chapter 17. However, pre-C++11 did not include hash tables. Unlike the Standard Library map and set, which provide logarithmic insertion, lookup, and deletion times, a hash table provides constant time insertion, deletion, and lookup in the average case, linear in the worst case. Instead of storing elements in sorted order, a hash table hashes, or maps, each element to a particular bucket. As long as the number of elements stored isn’t significantly greater than the number of buckets, and the hash function distributes the elements uniformly between the buckets, the insertion, deletion, and lookup operations all run in constant time.

This section implements a simple, but fully functional, hash_map. Like a map, a hash_map stores key/value pairs. In fact, the operations it provides are almost identical to those provided by the map, but with different performance characteristics.

This hash_map implementation uses chained hashing (also called open hashing), and does not attempt to provide advanced features such as rehashing. Chapter 17 explains the concept of chained hashing in the section on unordered associative containers.

The Hash Function

The first choice when writing a hash_map is how to handle hash functions. Recalling the adage that a good abstraction makes the easy case easy and the hard case possible, a good hash_map interface allows clients to specify their own hash function and number of buckets in order to customize the hashing behavior for their particular workload. On the other hand, clients that do not have the desire, or ability, to write a good hash function and choose a number of buckets should still be able to use the container without doing so. One solution is to allow clients to provide a hash function and number of buckets in the hash_map constructor, but also to provide default values. In this implementation, the hash function is a simple function object containing just a single function call operator. The function object is templatized on the key type that it hashes in order to support a templatized hash_map container. Template specialization can be used to write custom hash functions for certain types. Here is the basic hash function object:

template <typename T>
class hash
{
    public:
        size_t operator()(const T& key) const;
};

Note that everything for the hash_map implementation is inside a ProCpp namespace so that names don’t clash with already existing names. The implementation of the hash function call operator is tricky because it must apply to keys of any type. The following implementation computes an integer-sized hash value by simply treating the key as a sequence of bytes:

// Calculate a hash by treating the key as a sequence
// of bytes and summing the ASCII values of the bytes.
template <typename T>
size_t hash<T>::operator()(const T& key) const
{
    const size_t bytes = sizeof(key);
    size_t sum = 0;
    for (size_t i = 0; i < bytes; ++i) {
        unsigned char b = *(reinterpret_cast<const unsigned char*>(&key) + i);
        sum += b;
    }
    return sum;
}

Unfortunately, when using this hashing method on strings, the function calculates the hash of the entire string object, and not just of the actual text. The actual text is probably on the heap, and the string object only contains a length and a pointer to the text on the heap. The pointer will be different, even if the text it refers to is the same. The result is that two string objects with the same text will generate different hash values. Therefore, it’s a good idea to provide a specialization of the hash template for strings, and in general for any class that contains dynamically allocated memory. Template specialization is discussed in detail in Chapter 12.

// A hash specialization for strings
template <>
class hash<std::string>
{
    public:
        size_t operator()(const std::string& key) const;
}; 

// Calculate a hash by summing the ASCII values of all characters.
size_t hash<std::string>::operator()(const std::string& key) const
{
    size_t sum = 0;
    for (auto c : key) {
        sum += static_cast<unsigned char>(c);
    }
    return sum;
}

If you want to use other pointer types or objects as the key, you should write your own hash specialization for those types.

The Hash Map Interface

A hash_map supports three basic operations: insertion, deletion, and lookup. It is also swappable. Of course, it provides a constructor and destructor as well. The copy and move constructors are explicitly defaulted, and the copy and move assignment operators are provided. Here is the public portion of the hash_map class template:

template <typename Key, typename T, typename KeyEqual = std::equal_to<>,
    typename Hash = hash<Key>>
class hash_map
{
    public:
        using key_type = Key;
        using mapped_type = T;
        using value_type = std::pair<const Key, T>;

        virtual ~hash_map() = default;  // Virtual destructor

        // Throws invalid_argument if the number of buckets is illegal.
        explicit hash_map(const KeyEqual& equal = KeyEqual(),
            size_t numBuckets = 101, const Hash& hash = Hash());

        // Copy constructor
        hash_map(const hash_map<Key, T, KeyEqual, Hash>& src) = default;
        // Move constructor
        hash_map(hash_map<Key, T, KeyEqual, Hash>&& src) noexcept = default;

        // Copy assignment operator
        hash_map<Key, T, KeyEqual, Hash>& operator=(
            const hash_map<Key, T, KeyEqual, Hash>& rhs);
        // Move assignment operator
        hash_map<Key, T, KeyEqual, Hash>& operator=(
            hash_map<Key, T, KeyEqual, Hash>&& rhs) noexcept;

        // Inserts the key/value pair x.
        void insert(const value_type& x);

        // Removes the element with key k, if it exists.
        void erase(const key_type& k);

        // Removes all elements.
        void clear() noexcept;

        // Find returns a pointer to the element with key k.
        // Returns nullptr if no element with that key exists.
        value_type* find(const key_type& k);
        const value_type* find(const key_type& k) const;

        // operator[] finds the element with key k, or inserts an
        // element with that key if none exists yet. Returns a reference to
        // the value corresponding to that key.
        T& operator[] (const key_type& k);

        // Swaps two hash_maps.
        void swap(hash_map<Key, T, KeyEqual, Hash>& other) noexcept;
    private:
        // Implementation details not shown yet
};

As you can see, the key and value types are both template parameters, similar as for the Standard Library map. A hash_map stores pair<const Key, T> as the actual elements in the container. The insert(), erase(), find(), clear(), and operator[] methods are straightforward. However, a few aspects of this interface require further explanation.

The KeyEqual Template Parameter

Like a map, set, and other standard containers, a hash_map allows the client to specify the comparison type as a template parameter and to pass a specific comparison object of that type in the constructor. Unlike a map and set, a hash_map does not sort elements by key, but must still compare keys for equality. Thus, instead of using less as the default comparator, it uses the transparent equal_to<> comparator. The comparison object is used only to detect attempts to insert duplicate keys into the container.

The Hash Template Parameter

You should be able to change the hashing function to make it better suit the type of elements you want to store in the hash map. Thus, the hash_map template takes four template parameters: the key type, the value type, the comparator type, and the hash type.

The Type Aliases

The hash_map class template defines three type aliases:

using key_type = Key;
using mapped_type = T;
using value_type = std::pair<const Key, T>;

The value_type, in particular, is useful for referring to the more cumbersome pair<const Key, T> type. As you will see, these type aliases are required to satisfy the Standard Library container requirements.

The Implementation

After you finalize the hash_map interface, you need to choose the implementation model. The basic hash table structure generally consists of a fixed number of buckets, each of which can store one or more elements. The buckets should be accessible in constant time based on a bucket-id (the result of hashing a key). Thus, a vector is the most appropriate container for the buckets. Each bucket must store a list of elements, so the Standard Library list can be used as the bucket type. Thus, the final structure is a vector of lists of pair<const Key, T> elements.1 Here are the private members of the hash_map class:

private:
    using ListType = std::list<value_type>;
    std::vector<ListType> mBuckets;
    size_t mSize = 0;
    KeyEqual mEqual;
    Hash mHash;

Without the type aliases for value_type and ListType, the line declaring mBuckets would look like this:

std::vector<std::list<std::pair<const Key, T>>> mBuckets;

The mEqual and mHash members store the comparison and hashing objects, respectively, and mSize stores the number of elements currently in the container.

The Constructor

The constructor initializes all the fields. It constructs mBuckets with the correct number of buckets. Unfortunately, the template syntax is somewhat dense. If the syntax confuses you, consult Chapter 12 for details on writing class templates.

// Construct mBuckets with the correct number of buckets.
template <typename Key, typename T, typename KeyEqual, typename Hash>
hash_map<Key, T, KeyEqual, Hash>::hash_map(
    const KeyEqual& equal, size_t numBuckets, const Hash& hash)
    : mBuckets(numBuckets), mEqual(equal), mHash(hash)
{
    if (numBuckets == 0) {
        throw std::invalid_argument("Number of buckets must be positive");
    }
}

The implementation requires at least one bucket, so the constructor enforces that restriction.

Searching Elements

Each of the three major operations (lookup, insertion, and deletion) requires code to find an element with a given key. Thus, it is helpful to have a private helper method that performs that task. findElement() first uses the hash object to calculate the hash of the key and limits the calculated hash value to the number of hash buckets by taking the modulo of the calculated value. Then, it searches all the elements in that bucket for an element with a key matching the given key. The elements stored are key/value pairs, so the actual comparison must be done on the first field of the element. The comparison function object specified in the constructor is used to perform the comparison.

template <typename Key, typename T, typename KeyEqual, typename Hash>
std::pair<
    typename hash_map<Key, T, KeyEqual, Hash>::ListType::iterator, size_t>
        hash_map<Key, T, KeyEqual, Hash>::findElement(const key_type& k)
{

    // Hash the key to get the bucket.
    size_t bucket = mHash(k) % mBuckets.size();

    // Search for the key in the bucket.
    auto iter = find_if(begin(mBuckets[bucket]), end(mBuckets[bucket]),
        [this, &k](const auto& element) { return mEqual(element.first, k); });

    // Return a pair of the iterator and the bucket index.
    return std::make_pair(iter, bucket);
}

findElement() returns a pair containing an iterator and a bucket index. The bucket index is the index of the bucket to which the given key maps, independent of whether or not the given key is actually in the container. The returned iterator refers to an element in the bucket list, the list representing the bucket to which the key mapped. If the element is found, the iterator refers to that element; otherwise, it is the end iterator for that list.

The syntax in the function header of this method is somewhat confusing, particularly the use of the typename keyword. You must use the typename keyword whenever you are using a type that is dependent on a template parameter. Specifically, the type ListType::iterator , which is list<pair<const Key, T>>::iterator , is dependent on both the Key and T template parameters.

You can implement the find() method as a simple wrapper for findElement():

template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::value_type*
    hash_map<Key, T, KeyEqual, Hash>::find(const key_type& k)
{
    // Use the findElement() helper, and C++17 structured bindings.
    auto[it, bucket] = findElement(k);
    if (it == end(mBuckets[bucket])) {
        // Element not found -- return nullptr.
        return nullptr;
    }
    // Element found -- return a pointer to it.
    return &(*it);
}

The const version of find() uses a const_cast to forward the request to the non-const version to avoid code duplication:

template <typename Key, typename T, typename KeyEqual, typename Hash>
const typename hash_map<Key, T, KeyEqual, Hash>::value_type*
    hash_map<Key, T, KeyEqual, Hash>::find(const key_type& k) const
{
    return const_cast<hash_map<Key, T, KeyEqual, Hash>*>(this)->find(k);
}

The operator[] implementation uses findElement() and if it does not find the element, it inserts it as follows:

template <typename Key, typename T, typename KeyEqual, typename Hash>
T& hash_map<Key, T, KeyEqual, Hash>::operator[] (const key_type& k)
{
    // Try to find the element. If it doesn't exist, add a new element.
    auto[it, bucket] = findElement(k);
    if (it == end(mBuckets[bucket])) {
        mSize++;
        mBuckets[bucket].push_back(std::make_pair(k, T()));
        return mBuckets[bucket].back().second;
    } else {
        return it->second;
    }
}

Inserting Elements

insert() must first check if an element with that key is already in the hash_map. If not, it can add the element to the list in the appropriate bucket. Note that findElement() returns the bucket index to which a key hashes, even if an element with that key is not found:

template <typename Key, typename T, typename KeyEqual, typename Hash>
void hash_map<Key, T, KeyEqual, Hash>::insert(const value_type& x)
{
    // Try to find the element.
    auto[it, bucket] = findElement(x.first);
    if (it != end(mBuckets[bucket])) {
        // The element already exists.
        return;
    } else {
        // We didn't find the element, so insert a new one.
        mSize++;
        mBuckets[bucket].push_back(x);
    }
}

Note that this implementation of insert() returns void, so the caller does not know whether the element was inserted or if it was already in the hash_map. This shortcoming is resolved later in this chapter, once iterators have been implemented for hash_map.

Deleting Elements

erase() follows the same pattern as insert(): It first attempts to find the element by calling findElement(). If the element exists, it erases it from the list in the appropriate bucket. Otherwise, it does nothing.

template <typename Key, typename T, typename KeyEqual, typename Hash>
void hash_map<Key, T, KeyEqual, Hash>::erase(const key_type& k)
{
    // First, try to find the element.
    auto[it, bucket] = findElement(k);
    if (it != end(mBuckets[bucket])) {
        // The element exists -- erase it.
        mBuckets[bucket].erase(it);
        mSize--;
    }
}

Removing All Elements

clear() simply clears each bucket, and sets the size of the hash_map to zero:

template <typename Key, typename T, typename KeyEqual, typename Hash>
void hash_map<Key, T, KeyEqual, Hash>::clear() noexcept
{
    // Call clear on each bucket.
    for (auto& bucket : mBuckets) {
        bucket.clear();
    }
    mSize = 0;
}

Swapping

The swap() method just swaps all data members using std::swap():

template <typename Key, typename T, typename KeyEqual, typename Hash>
void hash_map<Key, T, KeyEqual, Hash>::swap(
    hash_map<Key, T, KeyEqual, Hash>& other) noexcept
{
    using std::swap;

    swap(mBuckets, other.mBuckets);
    swap(mSize, other.mSize);
    swap(mEqual, other.mEqual);
    swap(mHash, other.mHash);
}

The following standalone swap() function is also provided, which simply forwards to the swap() method:

template <typename Key, typename T, typename KeyEqual, typename Hash>
void swap(hash_map<Key, T, KeyEqual, Hash>& first,
          hash_map<Key, T, KeyEqual, Hash>& second) noexcept
{
    first.swap(second);
}

Assignment Operators

Here are the implementations of the copy and move assignment operators. See Chapter 9 for a discussion of the copy-and-swap idiom.

template <typename Key, typename T, typename KeyEqual, typename Hash>
hash_map<Key, T, KeyEqual, Hash>&
    hash_map<Key, T, KeyEqual, Hash>::operator=(
        const hash_map<Key, T, KeyEqual, Hash>& rhs)
{
    // check for self-assignment
    if (this == &rhs) {
        return *this;
    }

    // Copy-and-swap idiom
    auto copy = rhs;  // Do all the work in a temporary instance
    swap(copy);       // Commit the work with only non-throwing operations
    return *this;
}

template <typename Key, typename T, typename KeyEqual, typename Hash>
hash_map<Key, T, KeyEqual, Hash>&
    hash_map<Key, T, KeyEqual, Hash>::operator=(
        hash_map<Key, T, KeyEqual, Hash>&& rhs) noexcept
{
    swap(rhs);
    return *this;
}

Using the Basic Hash Map

Here is a small test program demonstrating the basic hash_map class template:

hash_map<int, int> myHash;
myHash.insert(make_pair(4, 40));
myHash.insert(make_pair(6, 60));

// x will have type hash_map<int, int>::value_type*
auto x = myHash.find(4);
if (x != nullptr) {
    cout << "4 maps to " << x->second << endl;
} else {
    cout << "cannot find 4 in map" << endl;
}

myHash.erase(4);

auto x2 = myHash.find(4);
if (x2 != nullptr) {
    cout << "4 maps to " << x2->second << endl;
} else {
    cout << "cannot find 4 in map" << endl;
}

myHash[4] = 35;
myHash[4] = 60;

auto x3 = myHash.find(4);
if (x3 != nullptr) {
    cout << "4 maps to " << x3->second << endl;
} else {
    cout << "cannot find 4 in map" << endl;
}

// Test std::swap().
hash_map<int, int> other(std::equal_to<>(), 11);
swap(other, myHash);

// Test copy construction and copy assignment.
hash_map<int, int> myHash2(other);
hash_map<int, int> myHash3;
myHash3 = myHash2;

// Test move construction and move assignment.
hash_map<int, int> myHash4(std::move(myHash3));
hash_map<int, int> myHash5;
myHash5 = std::move(myHash4);

The output is as follows:

4 maps to 40
cannot find 4 in map
4 maps to 60

Making hash_map a Standard Library Container

The basic hash_map shown in the previous section follows the spirit, but not the letter, of the Standard Library. For most purposes, the preceding implementation is good enough. However, if you want to use the Standard Library algorithms on your hash_map, you must do a bit more work. The C++ standard specifies methods and type aliases that a data structure must provide in order to qualify as a Standard Library container.

Required Type Aliases

The C++ standard specifies that every Standard Library container must provide the following public type aliases.

TYPE NAME DESCRIPTION
value_type The element type stored in the container
reference A reference to the element type stored in the container
const_reference A const reference to the element type stored in the container
iterator The type for iterating over elements of the container
const_iterator A version of iterator for iterating over const elements of the container
size_type A type that can represent the number of elements in the container; this is usually just size_t (from <cstddef>).
difference_type A type that can represent the difference of two iterators for the container; this is usually just ptrdiff_t (from <cstddef>).

Here are the definitions for the hash_map class template of all these type aliases except iterator and const_iterator. Writing an iterator is covered in detail in a subsequent section. Note that value_type (plus key_type and mapped_type, which are discussed later) was already defined in the previous version of the hash_map. This implementation also adds a type alias hash_map_type to give a shorter name to a specific template instantiation of hash_map:

template <typename Key, typename T, typename KeyEqual = std::equal_to<>,
    typename Hash = hash<Key>>
class hash_map
{
    public:
        using key_type = Key;
        using mapped_type = T;
        using value_type = std::pair<const Key, T>;
        using reference = value_type&;
        using const_reference = const value_type&;
        using size_type = size_t;
        using difference_type = ptrdiff_t;
        using hash_map_type = hash_map<Key, T, KeyEqual, Hash>;
        // Remainder of class definition omitted for brevity
};

Required Methods

In addition to the obligatory type aliases, every container must provide the following methods.

METHOD DESCRIPTION WORST-CASE COMPLEXITY
Default constructor Constructs an empty container Constant
Copy constructor Performs a deep copy of the container Linear
Move constructor Performs a move constructing operation Constant
Copy assignment operator Performs a deep copy of the container Linear
Move assignment operator Performs a move assignment operation Constant
Destructor Destroys dynamically allocated memory; this method calls destructor on all elements left in the container. Linear
iterator begin();
const_iterator
begin() const;
Returns an iterator referring to the first element in the container Constant
iterator end();
const_iterator
end() const;
Returns an iterator referring to one-past-the-last element in the container Constant
const_iterator
cbegin() const;
Returns a const iterator referring to the first element in the container Constant
const_iterator
cend() const;
Returns a const iterator referring to one-past-the-last element in the container Constant
operator==
operator!=
Comparison operators that compare two containers, element by element Linear
void swap(Container&)
noexcept;
Swaps the contents of the container passed to the method with the object on which the method is called Constant
size_type size()
const;
Returns the number of elements in the container Constant
size_type max_size()
const;
Returns the maximum number of elements the container can hold Constant
bool empty() const; Returns whether the container has any elements Constant

The following code snippet shows the declarations of the remaining methods except for begin(), end(), cbegin(), and cend(). Those are covered in the next section.

template <typename Key, typename T, typename KeyEqual = std::equal_to<>,
    typename Hash = hash<Key>>
class hash_map
{
    public:
        // Type aliases omitted for brevity

        // Size methods
        bool empty() const;
        size_type size() const;
        size_type max_size() const;

        // Other methods omitted for brevity
};

The implementations of size() and empty() are easy because the hash_map implementation tracks its size in the mSize data member. Note that size_type is one of the type aliases defined in the class. Because it is a member of the class, such a return type in the implementation must be fully qualified with typename hash_map<Key, T, KeyEqual, Hash>:

template <typename Key, typename T, typename KeyEqual, typename Hash>
bool hash_map<Key, T, KeyEqual, Hash>::empty() const
{
    return mSize == 0;
}

template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::size_type
    hash_map<Key, T, KeyEqual, Hash>::size() const
{
    return mSize;
}

max_size() is a little trickier. At first, you might think the maximum size of a hash_map container is the sum of the maximum size of all the lists. However, the worst-case scenario is that all the elements hash to the same bucket. Thus, the maximum size it can claim to support is the maximum size of a single list:

template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::size_type
    hash_map<Key, T, KeyEqual, Hash>::max_size() const
{
    return mBuckets[0].max_size();
}

Writing an Iterator

The most important container requirement is the iterator. In order to work with the generic algorithms, every container must provide an iterator for accessing the elements in the container. Your iterator should generally provide overloaded operator* and operator->, plus some other operations depending on its specific behavior. As long as your iterator provides the basic iteration operations, everything should be fine.

The first decision to make about your iterator is what kind it will be: forward, bidirectional, or random access. Random-access iterators don’t make much sense for associative containers, so bidirectional seems like the logical choice for a hash_map iterator. That means you must also provide operator++, operator--, operator==, and operator!=. Consult Chapter 17 for more details on the requirements for the different iterators.

The second decision is how to order the elements of your container. The hash_map is unsorted, so iterating in a sorted order is probably too difficult. Instead, your iterator can just step through the buckets, starting with the elements in the first bucket and progressing to those in the last bucket. This order will appear random to the client, but will be consistent and repeatable.

The third decision is how to represent your iterator internally. The implementation is usually quite dependent on the internal implementation of the container. The first purpose of an iterator is to refer to a single element in the container. In the case of a hash_map, each element is in a Standard Library list, so perhaps a hash_map iterator can be a wrapper around a list iterator referring to the element in question. However, the second purpose of a bidirectional iterator is to allow the client to progress to the next or previous element from the current one. In order to progress from one bucket to the next, you need to track the current bucket and the hash_map object to which the iterator refers.

Once you’ve chosen your implementation, you must decide on a consistent representation for the end iterator. Recall that the end iterator should really be the “past-the-end” marker: the iterator that’s reached by applying ++ to an iterator referring to the final element in the container. A hash_map iterator can use as its end iterator the end iterator of the list of the final bucket in the hash_map.

A container needs to provide both a const iterator and a non-const iterator. The non-const iterator must be convertible to a const iterator. This implementation defines a const_hash_map_iterator class with hash_map_iterator deriving from it.

The const_hash_map_iterator Class

Given the decisions made in the previous section, it’s time to define the const_hash_map_iterator class. The first thing to note is that each const_hash_map_iterator object is an iterator for a specific instantiation of the hash_map class. In order to provide this one-to-one mapping, the const_hash_map_iterator must also be a class template with the hash map type as a template parameter called HashMap.

The main question in the class definition is how to conform to the bidirectional iterator requirements. Recall that anything that behaves like an iterator is an iterator. Your class is not required to derive from another class in order to qualify as a bidirectional iterator. However, if you want your iterator to be usable in the generic algorithms functions, you must specify its traits. The discussion earlier in this chapter explains that iterator_traits is a class template that defines, for each iterator type, five type aliases: value_type, difference_type, iterator_category, pointer, and reference. The iterator_traits class template coud be partially specialized for your new iterator type if you want. Alternatively, the default implementation of the iterator_traits class template just grabs the five type aliases out of the iterator class itself. Thus, you can simply define those type aliases directly in your iterator class. The const_hash_map_iterator is a bidirectional iterator, so you specify bidirectional_iterator_tag as the iterator category. Other legal iterator categories are input_iterator_tag, output_iterator_tag, forward_iterator_tag, and random_access_iterator_tag. For the const_hash_map_iterator, the element type is typename HashMap::value_type.

Here is the basic const_hash_map_iterator class definition:

template <typename HashMap>
class const_hash_map_iterator
{
    public:
        using value_type = typename HashMap::value_type;
        using difference_type = ptrdiff_t;
        using iterator_category = std::bidirectional_iterator_tag;
        using pointer = value_type*;
        using reference = value_type&;
        using list_iterator_type = typename HashMap::ListType::const_iterator;

        // Bidirectional iterators must supply a default constructor.
        // Using an iterator constructed with the default constructor
        // is undefined, so it doesn't matter how it's initialized.
        const_hash_map_iterator() = default;

        const_hash_map_iterator(size_t bucket, list_iterator_type listIt,
            const HashMap* hashmap);

        // Don't need to define a copy constructor or operator= because the
        // default behavior is what we want.

        // Don't need destructor because the default behavior
        // (not deleting mHashmap) is what we want!

        const value_type& operator*() const;

        // Return type must be something to which -> can be applied.
        // Return a pointer to a pair<const Key, T>, to which the compiler
        // will apply -> again.
        const value_type* operator->() const;

        const_hash_map_iterator<HashMap>& operator++();
        const_hash_map_iterator<HashMap> operator++(int);

        const_hash_map_iterator<HashMap>& operator--();
        const_hash_map_iterator<HashMap> operator--(int);

        // The following are ok as member functions because we don't
        // support comparisons of different types to this one.
        bool operator==(const const_hash_map_iterator<HashMap>& rhs) const;
        bool operator!=(const const_hash_map_iterator<HashMap>& rhs) const;
    protected:
        size_t mBucketIndex = 0;
        list_iterator_type mListIterator;
        const HashMap* mHashmap = nullptr;

        // Helper methods for operator++ and operator--
        void increment();
        void decrement();
};

Consult Chapter 15 for details on operator overloading if the definitions and implementations (shown in the next section) of the overloaded operators confuse you.

The const_hash_map_iterator Method Implementations

The const_hash_map_iterator constructor initializes the three member variables:

template<typename HashMap>
const_hash_map_iterator<HashMap>::const_hash_map_iterator(size_t bucket,
    list_iterator_type listIt, const HashMap* hashmap)
    : mBucketIndex(bucket), mListIterator(listIt), mHashmap(hashmap)
{
}

The default constructor is defaulted so that clients can declare const_hash_map_iterator variables without initializing them. An iterator constructed with the default constructor does not need to refer to any value, and attempting any operations on it is allowed to have undefined results.

The implementations of the dereferencing operators are concise, but can be tricky. Chapter 15 explains that operator* and operator-> are asymmetric; operator* returns a reference to the actual underlying value, which in this case is the element to which the iterator refers, while operator-> must return something to which the arrow operator can be applied again. Thus, it returns a pointer to the element. The compiler then applies -> to the pointer, which results in accessing a field of the element.

// Return a reference to the actual element.
template<typename HashMap>
const typename const_hash_map_iterator<HashMap>::value_type&
    const_hash_map_iterator<HashMap>::operator*() const
{
    return *mListIterator;
}

// Return a pointer to the actual element, so the compiler can
// apply -> to it to access the actual desired field.
template<typename HashMap>
const typename const_hash_map_iterator<HashMap>::value_type*
    const_hash_map_iterator<HashMap>::operator->() const
{
    return &(*mListIterator);
}

The increment and decrement operators are implemented as follows. They defer the actual incrementing and decrementing to the private increment() and decrement() helper methods.

// Defer the details to the increment() helper.
template<typename HashMap>
const_hash_map_iterator<HashMap>&
    const_hash_map_iterator<HashMap>::operator++()
{
    increment();
    return *this;
}

// Defer the details to the increment() helper.
template<typename HashMap>
const_hash_map_iterator<HashMap>
    const_hash_map_iterator<HashMap>::operator++(int)
{
    auto oldIt = *this;
    increment();
    return oldIt;
}

// Defer the details to the decrement() helper.
template<typename HashMap>
const_hash_map_iterator<HashMap>&
    const_hash_map_iterator<HashMap>::operator--()
{
    decrement();
    return *this;
}

// Defer the details to the decrement() helper.
template<typename HashMap>
const_hash_map_iterator<HashMap>
    const_hash_map_iterator<HashMap>::operator--(int)
{
    auto oldIt = *this;
    decrement();
    return oldIt;
}

Incrementing a const_hash_map_iterator tells it to refer to the “next” element in the container. This method first increments the list iterator, then checks if it has reached the end of its bucket. If so, it finds the next non-empty bucket in the hash_map and sets the list iterator equal to the start element in that bucket. Note that it can’t simply move to the next bucket, because there might not be any elements in it. If there are no more non-empty buckets, mListIterator is set, by the convention chosen for this example, to the end iterator of the last bucket in the hash_map, which is the special “end” position of the const_hash_map_iterator. Iterators are not required to be any safer than dumb pointers, so error-checking for things like incrementing an iterator already at the end is not required.

// Behavior is undefined if mListIterator already refers to the past-the-end
// element, or is otherwise invalid.
template<typename HashMap>
void const_hash_map_iterator<HashMap>::increment()
{
    // mListIterator is an iterator into a single bucket. Increment it.
    ++mListIterator;

    // If we're at the end of the current bucket,
    // find the next bucket with elements.
    auto& buckets = mHashmap->mBuckets;
    if (mListIterator == end(buckets[mBucketIndex])) {
        for (size_t i = mBucketIndex + 1; i < buckets.size(); i++) {
            if (!buckets[i].empty()) {
                // We found a non-empty bucket.
                // Make mListIterator refer to the first element in it.
                mListIterator = begin(buckets[i]);
                mBucketIndex = i;
                return;
            }
        }
        // No more non-empty buckets. Set mListIterator to refer to the
        // end iterator of the last list.
        mBucketIndex = buckets.size() - 1;
        mListIterator = end(buckets[mBucketIndex]);
    }
}

Decrement is the inverse of increment: it makes the iterator refer to the “previous” element in the container. However, there is an asymmetry because of the asymmetry between the way the start and end positions are represented: start is the first element, but end is “one past” the last element. The algorithm for decrement first checks if the underlying list iterator is at the start of its current bucket. If not, it can just be decremented. Otherwise, the code needs to check for the first non-empty bucket before the current one. If one is found, the list iterator must be set to refer to the last element in that bucket, which is the end iterator decremented by one. If no non-empty buckets are found, the decrement is invalid, so the code can do anything it wants (behavior is undefined). Note that the for loop needs to use a signed integer type for its loop variable and not an unsigned type such as size_t because the loop uses --i:

// Behavior is undefined if mListIterator already refers to the first
// element, or is otherwise invalid.
template<typename HashMap>
void const_hash_map_iterator<HashMap>::decrement()
{
    // mListIterator is an iterator into a single bucket.
    // If it's at the beginning of the current bucket, don't decrement it.
    // Instead, try to find a non-empty bucket before the current one.
    auto& buckets = mHashmap->mBuckets;
    if (mListIterator == begin(buckets[mBucketIndex])) {
        for (int i = mBucketIndex - 1; i >= 0; --i) {
            if (!buckets[i].empty()) {
                mListIterator = --end(buckets[i]);
                mBucketIndex = i;
                return;
            }
        }
        // No more non-empty buckets. This is an invalid decrement.
        // Set mListIterator to refer to the end iterator of the last list.
        mBucketIndex = buckets.size() - 1;
        mListIterator = end(buckets[mBucketIndex]);
    } else {
        // We're not at the beginning of the bucket, so just move down.
        --mListIterator;
    }
}

Note that both increment() and decrement() access private members of the hash_map class. Thus, the hash_map class must declare const_hash_map_iterator to be a friend class.

After increment() and decrement(), operator== and operator!= are positively simple. They just compare each of the three data members of the objects:

template<typename HashMap>
bool const_hash_map_iterator<HashMap>::operator==(
    const const_hash_map_iterator<HashMap>& rhs) const
{
    // All fields, including the hash_map to which the iterators refer,
    // must be equal.
    return (mHashmap == rhs.mHashmap &&
        mBucketIndex == rhs.mBucketIndex &&
        mListIterator == rhs.mListIterator);
}

template<typename HashMap>
bool const_hash_map_iterator<HashMap>::operator!=(
    const const_hash_map_iterator<HashMap>& rhs) const
{
    return !(*this == rhs);
}

The hash_map_iterator Class

The hash_map_iterator class derives from const_hash_map_iterator. It does not need to override operator==, operator!=, increment(), and decrement() because the base class versions are just fine:

template <typename HashMap>
class hash_map_iterator : public const_hash_map_iterator<HashMap>
{
    public:
        using value_type =
            typename const_hash_map_iterator<HashMap>::value_type;
        using difference_type = ptrdiff_t;
        using iterator_category = std::bidirectional_iterator_tag;
        using pointer = value_type*;
        using reference = value_type&;
        using list_iterator_type = typename HashMap::ListType::iterator;

        hash_map_iterator() = default;
        hash_map_iterator(size_t bucket, list_iterator_type listIt,
            HashMap* hashmap);

        value_type& operator*();
        value_type* operator->();

        hash_map_iterator<HashMap>& operator++();
        hash_map_iterator<HashMap> operator++(int);

        hash_map_iterator<HashMap>& operator--();
        hash_map_iterator<HashMap> operator--(int);
};

The hash_map_iterator Method Implementations

The implementations of the hash_map_iterator methods are rather straightforward. The constructor just calls the base class constructor. The operator* and operator-> use const_cast to return a non-const type. operator++ and operator-- just use the increment() and decrement() from the base class, but return a hash_map_iterator instead of a const_hash_map_iterator. Note that the C++ name lookup rules require you to explicitly use the this pointer to refer to data members and methods in a base class template:

template<typename HashMap>
hash_map_iterator<HashMap>::hash_map_iterator(size_t bucket,
    list_iterator_type listIt, HashMap* hashmap)
    : const_hash_map_iterator<HashMap>(bucket, listIt, hashmap)
{
}

// Return a reference to the actual element.
template<typename HashMap>
typename hash_map_iterator<HashMap>::value_type&
    hash_map_iterator<HashMap>::operator*()
{
    return const_cast<value_type&>(*this->mListIterator);
}

// Return a pointer to the actual element, so the compiler can
// apply -> to it to access the actual desired field.
template<typename HashMap>
typename hash_map_iterator<HashMap>::value_type*
    hash_map_iterator<HashMap>::operator->()
{
    return const_cast<value_type*>(&(*this->mListIterator));
}

// Defer the details to the increment() helper in the base class.
template<typename HashMap>
hash_map_iterator<HashMap>& hash_map_iterator<HashMap>::operator++()
{
    this->increment();
    return *this;
}

// Defer the details to the increment() helper in the base class.
template<typename HashMap>
hash_map_iterator<HashMap> hash_map_iterator<HashMap>::operator++(int)
{
    auto oldIt = *this;
    this->increment();
    return oldIt;
}

// Defer the details to the decrement() helper in the base class.
template<typename HashMap>
hash_map_iterator<HashMap>& hash_map_iterator<HashMap>::operator--()
{
    this->decrement();
    return *this;
}

// Defer the details to the decrement() helper in the base class.
template<typename HashMap>
hash_map_iterator<HashMap> hash_map_iterator<HashMap>::operator--(int)
{
    auto oldIt = *this;
    this->decrement();
    return oldIt;
}

Iterator Type Aliases and Access Methods

The final piece involved in providing iterator support for hash_map is to supply the necessary type aliases in the hash_map class template, and to write the begin(), end(), cbegin(), and cend() methods. The type aliases and method prototypes look like this:

template <typename Key, typename T, typename KeyEqual = std::equal_to<>,
    typename Hash = hash<Key>>
class hash_map
{
    public:
        // Other type aliases omitted for brevity
        using iterator = hash_map_iterator<hash_map_type>;
        using const_iterator = const_hash_map_iterator<hash_map_type>;

        // Iterator methods
        iterator begin();
        iterator end();
        const_iterator begin() const;
        const_iterator end() const;
        const_iterator cbegin() const;
        const_iterator cend() const;
        // Remainder of class definition omitted for brevity
};

The implementation of begin() includes an optimization for the case when there are no elements in the hash_map, in which case the end iterator is returned. Here is the code:

template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::iterator
    hash_map<Key, T, KeyEqual, Hash>::begin()
{
    if (mSize == 0) {
        // Special case: there are no elements, so return the end iterator.
        return end();
    }

    // We know there is at least one element. Find the first element.
    for (size_t i = 0; i < mBuckets.size(); ++i) {
        if (!mBuckets[i].empty()) {
            return hash_map_iterator<hash_map_type>(i,
                std::begin(mBuckets[i]), this);
        }
    }
    // Should never reach here, but if we do, return the end iterator.
    return end();
}

end() creates a hash_map_iterator referring to the end iterator of the last bucket:

template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::iterator
    hash_map<Key, T, KeyEqual, Hash>::end()
{
    // The end iterator is the end iterator of the list of the last bucket.
    size_t bucket = mBuckets.size() - 1;
    return hash_map_iterator<hash_map_type>(bucket,
        std::end(mBuckets[bucket]), this);
}

The implementations of the const versions of begin() and end() use const_cast to call the non-const versions. These non-const versions return a hash_map_iterator, which is convertible to a const_hash_map_iterator:

template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::const_iterator
    hash_map<Key, T, KeyEqual, Hash>::begin() const
{
    return const_cast<hash_map_type*>(this)->begin();
}

template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::const_iterator
    hash_map<Key, T, KeyEqual, Hash>::end() const
{
    return const_cast<hash_map_type*>(this)->end();
}

The cbegin() and cend() methods forward the request to the const versions of begin() and end():

template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::const_iterator
    hash_map<Key, T, KeyEqual, Hash>::cbegin() const
{
    return begin();
}

template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::const_iterator
    hash_map<Key, T, KeyEqual, Hash>::cend() const
{
    return end();
}

Using the hash_map Iterators

Now that hash_map supports iteration, you can iterate over its elements just as you would on any Standard Library container, and you can pass the iterators to methods and functions. Here are some examples:

hash_map<string, int> myHash;
myHash.insert(make_pair("KeyOne", 100));
myHash.insert(make_pair("KeyTwo", 200));
myHash.insert(make_pair("KeyThree", 300));

for (auto it = myHash.cbegin(); it != myHash.cend(); ++it) {
    // Use both -> and * to test the operations.
    cout << it->first << " maps to " << (*it).second << endl;
}

// Print elements using a range-based for loop
for (auto& p : myHash) {
    cout << p.first << " maps to " << p.second << endl;
}

// Print elements using a range-based for loop and C++17 structured bindings
for (auto&[key, value] : myHash) {
    cout << key << " maps to " << value << endl;
}

// Create an std::map with all the elements in the hash_map.
map<string, int> myMap(cbegin(myHash), cend(myHash));
for (auto& p : myMap) {
    cout << p.first << " maps to " << p.second << endl;
}

The last piece of code also shows that the non-member functions such as std::cbegin() and std::cend() are working as expected.

Note on Allocators

As described earlier in this chapter, all the Standard Library containers allow you to specify a custom memory allocator. A “good citizen” hash_map implementation should do the same. However, those details are omitted because they obscure the main points of this implementation, and because custom allocators are rarely used.

Note on Reversible Containers

If your container supplies a bidirectional or random access iterator, it is considered reversible. Reversible containers are supposed to supply two additional type aliases.

TYPE NAME DESCRIPTION
reverse_iterator The type for iterating over elements of the container in reverse order
const_reverse_iterator A version of reverse_iterator for iterating over const elements of the container in reverse order

Additionally, the container should provide rbegin() and rend(), which are symmetric with begin() and end(); it should also provide crbegin() and crend(), which are symmetric with cbegin() and cend(). The usual implementations just use the std::reverse_iterator adaptor described earlier in this chapter. These are left as an exercise for you to try.

Making hash_map an Unordered Associative Container

In addition to the basic container requirements that were shown already, you can also make your container adhere to additional requirements for ordered associative, unordered associative, or sequential containers. This section modifies the hash_map class template to satisfy a few additional unordered associative container requirements.

Unordered Associative Container Type Alias Requirements

Unordered associative containers require the following type aliases.

TYPE NAME DESCRIPTION
key_type The key type with which the container is instantiated
mapped_type The element type with which the container is instantiated
value_type pair<const Key, T>
hasher The hash type with which the container is instantiated
key_equal The equality predicate with which the container is instantiated
local_iterator An iterator type to iterate through a single bucket. Cannot be used to iterate across buckets.
const_local_iterator A const iterator type to iterate through a single bucket. Cannot be used to iterate across buckets.
node_type A type to represent a node. See Chapter 17 for a discussion on nodes. Not further discussed in this section.

Here is the hash_map definition with an updated set of type aliases. Note that the definition of ListType has been moved, because the definitions of the local iterators require ListType.

template <typename Key, typename T, typename KeyEqual = std::equal_to<>,
    typename Hash = hash<Key>>
class hash_map
{
    public:
        using key_type = Key;
        using mapped_type = T;
        using value_type = std::pair<const Key, T>;
        using hasher = Hash;
        using key_equal = KeyEqual;
        using reference = value_type&;
        using const_reference = const value_type&;
        using size_type = size_t;
        using difference_type = ptrdiff_t;
        using hash_map_type = hash_map<Key, T, KeyEqual, Hash>;
        using iterator = hash_map_iterator<hash_map_type>;
        using const_iterator = const_hash_map_iterator<hash_map_type>;

    private:
        using ListType = std::list<value_type>;
    public:
        using local_iterator = typename ListType::iterator;
        using const_local_iterator = typename ListType::const_iterator;

        // Remainder of hash_map class definition omitted for brevity
};

Unordered Associative Container Method Requirements

The standard prescribes quite a few additional method requirements for unordered associative containers, as listed in the following table. In the last column, n is the number of elements in the container.

METHOD DESCRIPTION COMPLEXITY
Constructor taking an iterator range Constructs the container and inserts the elements from the iterator range. The iterator range need not refer to another container of the same type.
Note that all constructors of unordered associative containers must take an equality predicate. The constructors should provide a default constructed object as the default value.
On average O(n), worst case O(n2)
Constructor taking an initializer_list<value_type> as parameter Constructs the container and inserts the elements from the initializer list into the container. On average O(n), worst case O(n2)
Assignment operator with an initializer_list<value_type> as right-hand side Replaces all elements from the container with the elements from the initializer list. On average O(n), worst case O(n2)
hasher hash_function()
const;
Returns the hash function. Constant
key_equal key_eq()
const;
Returns the equality predicate for comparing keys. Constant
pair<iterator, bool>
insert(value_type&);
iterator insert(
const_iterator hint,
value_type&);
Two different forms of insert.
The hint can be ignored by the implementation.
Containers that allow duplicate keys return just iterator for the first form, because insert() always succeeds in that case.
On average O(1), worst case O(n)
void insert(
InputIterator start,
InputIterator end);
Inserts a range of elements. The range need not be from a container of the same type. On average O(m) with m the number of elements to insert. Worst case O(m*n+m)
void insert(
initializer_list<
value_type>);
Inserts the elements from the initializer list into the container. On average O(m) with m the number of elements to insert. Worst case O(m*n+m)
pair<iterator, bool>
emplace(Args&&…);
iterator emplace_hint(
const_iterator hint,
Args&&...);
Implements the emplace operations to construct objects in-place. In-place construction is discussed in Chapter 17. On average O(1), worst case O(n)
size_type
erase(key_type&);
iterator erase(
iterator position);
iterator erase(
iterator start,
iterator end);
Three different forms of erase.
The first form returns the number of values erased (0 or 1 in containers that do not allow duplicate keys).
The second and third forms erase the elements at position, or in the range start to end, and return an iterator to the element following the last erased element.
Worst case O(n)
void clear(); Erases all elements. O(n)
Iterator
find(key_type&);
const_iterator
find(key_type&)
const;
Finds the element with the specified key. On average O(1), worst case O(n)
size_type
count(key_type&)
const;
Returns the number of elements with the specified key (0 or 1 in containers that do not allow duplicate keys). On average O(1), worst case O(n)
pair<iterator,iterator>
equal_range(
key_type&);
pair<const_iterator,
const_iterator>
equal_range(
key_type&) const;
Returns iterators referring to the first element with the specified key and one past the last element with the specified key. Worst case O(n)

Note that hash_map does not allow duplicate keys, so equal_range() always returns a pair of identical iterators.

C++17 adds extract() and merge() methods to the list of requirements. These have to do with handling nodes as discussed in Chapter 17, but are omitted for this hash_map implementation.

Here is the complete hash_map class definition. The prototypes for insert(), erase(), and find() need to change slightly from the previous versions shown because those initial versions don’t have the right return types required for unordered associative containers.

template <typename Key, typename T, typename KeyEqual = std::equal_to<>,
    typename Hash = hash<Key>>
class hash_map
{
    public:
        using key_type = Key;
        using mapped_type = T;
        using value_type = std::pair<const Key, T>;
        using hasher = Hash;
        using key_equal = KeyEqual;
        using reference = value_type&;
        using const_reference = const value_type&;
        using size_type = size_t;
        using difference_type = ptrdiff_t;
        using hash_map_type = hash_map<Key, T, KeyEqual, Hash>;
        using iterator = hash_map_iterator<hash_map_type>;
        using const_iterator = const_hash_map_iterator<hash_map_type>;

    private:
        using ListType = std::list<value_type>;
    public:
        using local_iterator = typename ListType::iterator;
        using const_local_iterator = typename ListType::const_iterator;

        // The iterator classes need access to all members of the hash_map
        friend class hash_map_iterator<hash_map_type>;
        friend class const_hash_map_iterator<hash_map_type>;

        virtual ~hash_map() = default;    // Virtual destructor

        // Throws invalid_argument if the number of buckets is illegal.
        explicit hash_map(const KeyEqual& equal = KeyEqual(),
            size_type numBuckets = 101, const Hash& hash = Hash());

        // Throws invalid_argument if the number of buckets is illegal.
        template <typename InputIterator>
        hash_map(InputIterator first, InputIterator last,
            const KeyEqual& equal = KeyEqual(),
            size_type numBuckets = 101, const Hash& hash = Hash());

        // Initializer list constructor
        // Throws invalid_argument if the number of buckets is illegal. 
        explicit hash_map(std::initializer_list<value_type> il,
            const KeyEqual& equal = KeyEqual(), size_type numBuckets = 101,
            const Hash& hash = Hash());

        // Copy constructor
        hash_map(const hash_map_type& src) = default;
        // Move constructor
        hash_map(hash_map_type&& src) noexcept = default;

        // Copy assignment operator
        hash_map_type& operator=(const hash_map_type& rhs);
        // Move assignment operator
        hash_map_type& operator=(hash_map_type&& rhs) noexcept;
        // Initializer list assignment operator
        hash_map_type& operator=(std::initializer_list<value_type> il);

        // Iterator methods
        iterator begin();
        iterator end();
        const_iterator begin() const;
        const_iterator end() const;
        const_iterator cbegin() const;
        const_iterator cend() const;

        // Size methods
        bool empty() const;
        size_type size() const;
        size_type max_size() const;

        // Element insert methods
        T& operator[](const key_type& k);
        std::pair<iterator, bool> insert(const value_type& x);
        iterator insert(const_iterator hint, const value_type& x);
        template <typename InputIterator>
        void insert(InputIterator first, InputIterator last);
        void insert(std::initializer_list<value_type> il);

        // Element delete methods
        size_type erase(const key_type& k);
        iterator erase(iterator position);
        iterator erase(iterator first, iterator last);

        // Other modifying utilities
        void swap(hash_map_type& other) noexcept;
        void clear() noexcept;

        // Access methods for Standard Library conformity
        key_equal key_eq() const;
        hasher hash_function() const;

        // Lookup methods
        iterator find(const key_type& k);
        const_iterator find(const key_type& k) const;
        std::pair<iterator, iterator> equal_range(const key_type& k);
        std::pair<const_iterator, const_iterator>
            equal_range(const key_type& k) const;

        size_type count(const key_type& k) const;

    private:
        // Returns a pair containing an iterator to the found element with
        // a given key, and the index of that element's bucket.
        std::pair<typename ListType::iterator, size_t> findElement(
            const key_type& k);

        std::vector<ListType> mBuckets;
        size_type mSize = 0;
        KeyEqual mEqual;
        Hash mHash;
};

hash_map Iterator Range Constructor

The constructor accepting an iterator range is a method template so that it can take an iterator range from any container, not just other hash_maps. If it were not a method template, it would need to specify the InputIterator type explicitly as hash_map_iterator, limiting it to iterators from hash_maps. Despite the syntax, the implementation is uncomplicated: it delegates the construction to the explicit constructor to initialize all the data members, then calls insert() to actually insert all the elements in the specified range.

// Make a call to insert() to actually insert the elements.
template <typename Key, typename T, typename KeyEqual, typename Hash>
template <typename InputIterator>
hash_map<Key, T, KeyEqual, Hash>::hash_map(
    InputIterator first, InputIterator last, const KeyEqual& equal,
    size_type numBuckets, const Hash& hash)
    : hash_map(equal, numBuckets, hash)
{
    insert(first, last);
} 

hash_map Initializer List Constructor

Initializer lists are discussed in Chapter 1. Following is the implementation of the hash_map constructor that takes an initializer list, which is very similar to the implementation of the constructor accepting an iterator range:

template <typename Key, typename T, typename KeyEqual, typename Hash>
hash_map<Key, T, KeyEqual, Hash>::hash_map(
    std::initializer_list<value_type> il,
    const KeyEqual& equal, size_type numBuckets, const Hash& hash)
    : hash_map(equal, numBuckets, hash) 
{
    insert(std::begin(il), std::end(il));
}

With this initializer list constructor, a hash_map can be constructed as follows:

hash_map<string, int> myHash {
    { "KeyOne", 100 },
    { "KeyTwo", 200 },
    { "KeyThree", 300 } };

hash_map Initializer List Assignment Operator

Assignment operators can also accept an initializer list on the right-hand side. Following is an implementation of an initializer list assignment operator for hash_map. It uses a copy-and-swap-like algorithm to guarantee strong exception safety.

template <typename Key, typename T, typename KeyEqual, typename Hash>
hash_map<Key, T, KeyEqual, Hash>& hash_map<Key, T, KeyEqual, Hash>::operator=(
    std::initializer_list<value_type> il)
{
    // Do all the work in a temporary instance
    hash_map_type newHashMap(il, mEqual, mBuckets.size(), mHash);
    swap(newHashMap);  // Commit the work with only non-throwing operations
    return *this;
}

With this assignment operator, you can write code as follows:

myHash = {
    { "KeyOne", 100 },
    { "KeyTwo", 200 },
    { "KeyThree", 300 } };

hash_map Insertion Operations

In the earlier section, “Using the Basic Hash Map,” a simple insert() method is given. In this version, four insert() methods are provided with additional features:

  • The simple insert() operation returns a pair<iterator, bool>, which indicates both where the item is inserted and whether or not it was newly created.
  • The version of insert() that takes a hint is useless for a hash_map, but it is provided for symmetry with other kinds of collections. The hint is ignored, and it merely calls the first version.
  • The third form of insert() is a method template, so ranges of elements from arbitrary containers can be inserted into the hash_map.
  • The last form of insert() accepts an initializer_list<value_type>.

Note that technically, the following versions of insert() can also be provided. These accept rvalue references.

std::pair<iterator, bool> insert(value_type&& x);
iterator insert(const_iterator hint, value_type&& x);

The hash_map does not provide these. Additionally, there are two versions of insert() related to handling nodes. Nodes are discussed in Chapter 17. The hash_map omits these as well.

The first two insert() methods are implemented as follows:

template <typename Key, typename T, typename KeyEqual, typename Hash>
std::pair<typename hash_map<Key, T, KeyEqual, Hash>::iterator, bool>
    hash_map<Key, T, KeyEqual, Hash>::insert(const value_type& x)
{
    // Try to find the element.
    auto[it, bucket] = findElement(x.first);
    bool inserted = false;
    if (it == std::end(mBuckets[bucket])) {
        // We didn't find the element, so insert a new one.
        it = mBuckets[bucket].insert(it, x);
        inserted = true;
        mSize++;
    }
    return std::make_pair(
        hash_map_iterator<hash_map_type>(bucket, it, this), inserted);
}

template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::iterator
    hash_map<Key, T, KeyEqual, Hash>::insert(
        const_iterator /*hint*/, const value_type& x)
{
    // Completely ignore position.
    return insert(x).first;
}

The third form of insert() is a method template for the same reason as the constructor shown earlier: it should be able to insert elements by using iterators from containers of any type. The actual implementation uses an insert_iterator, described earlier in this chapter.

template <typename Key, typename T, typename KeyEqual, typename Hash>
template <typename InputIterator>
void hash_map<Key, T, KeyEqual, Hash>::insert(
    InputIterator first, InputIterator last)
{
    // Copy each element in the range by using an insert_iterator adaptor.
    // Give begin() as a dummy position -- insert ignores it anyway.
    std::insert_iterator<hash_map_type> inserter(*this, begin());
    std::copy(first, last, inserter);
}

The last insert operation accepts an initializer list. The implementation for hash_map simply forwards the work to the insert() method accepting an iterator range:

template <typename Key, typename T, typename KeyEqual, typename Hash>
void hash_map<Key, T, KeyEqual, Hash>::insert(
    std::initializer_list<value_type> il)
{
    insert(std::begin(il), std::end(il));
}

With this insert() method, you can write code as follows:

myHash.insert({
    { "KeyFour", 400 },
    { "KeyFive", 500 } });

hash_map Emplace Operations

Emplace operations construct objects in-place. They are discussed in Chapter 17. The emplace methods for hash_map look as follows :

template <typename... Args>
std::pair<iterator, bool> emplace(Args&&... args);

template <typename... Args>
iterator emplace_hint(const_iterator hint, Args&&... args);

The ... in these lines are not typos. These are so-called variadic templates, that is, templates with a variable number of template type parameters and a variable number of function parameters. Variadic templates are discussed in Chapter 22. This hash_map implementation omits emplace operations.

hash_map Erase Operations

The version of erase() in the earlier section, “A Basic Hash Map,” is not compliant with Standard Library requirements. You need to implement the following versions:

  • A version that takes as a parameter a key_type and returns a size_type for the number of elements removed from the collection (for hash_map there are only two possible return values, 0 and 1).
  • A version that erases a value at a specific iterator position, and returns an iterator to the element following the erased element.
  • A version that erases a range of elements based on two iterators, and returns an iterator to the element following the last erased element.

The first version is implemented as follows:

template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::size_type
    hash_map<Key, T, KeyEqual, Hash>::erase(const key_type& k)
{
    // First, try to find the element.
    auto[it, bucket] = findElement(k);
    if (it != std::end(mBuckets[bucket])) {
        // The element exists -- erase it.
        mBuckets[bucket].erase(it);
        mSize--;
        return 1;
    } else {
        return 0;
    }
}

The second version of erase() must remove the element at a specific iterator position. The iterator given is, of course, a hash_map_iterator. Thus, hash_map must have some ability to obtain the underlying bucket and list iterator from the hash_map_iterator. The approach taken here is to make the hash_map class a friend of the hash_map_iterator (not shown in the earlier class definition). Here is the implementation of this version of erase():

template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::iterator
    hash_map<Key, T, KeyEqual, Hash>::erase(iterator position)
{
    iterator next = position;
    ++next;
    // Erase the element from its bucket.
    mBuckets[position.mBucketIndex].erase(position.mListIterator);
    mSize--;
    return next;
}

The final version of erase() removes a range of elements. It iterates from first to last, calling erase() on each element, thus letting the previous version of erase() do all the work:

template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::iterator
    hash_map<Key, T, KeyEqual, Hash>::erase(iterator first, iterator last)
{
    // Erase all the elements in the range.
    for (iterator next = first; next != last;) {
        next = erase(next);
    }
    return last;
}

hash_map Accessor Operations

The standard requires methods called key_eq() and hash_function() to retrieve the equality predicate and the hash function respectively:

template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::key_equal
    hash_map<Key, T, KeyEqual, Hash>::key_eq() const
{
    return mEqual;
}

template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::hasher
    hash_map<Key, T, KeyEqual, Hash>::hash_function() const
{
    return mHash;
}

The find() method is identical to the version shown earlier for the basic hash_map, except for the return code. Instead of returning a pointer to the element, it constructs a hash_map_iterator referring to it:

template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::iterator
    hash_map<Key, T, KeyEqual, Hash>::find(const key_type& k)
{
    // Use the findElement() helper, and C++17 structured bindings.
    auto[it, bucket] = findElement(k);
    if (it == std::end(mBuckets[bucket])) {
        // Element not found -- return the end iterator.
        return end();
    }
    // Element found -- convert the bucket/iterator to a hash_map_iterator.
    return hash_map_iterator<hash_map_type>(bucket, it, this);
}

The const version of find() returns a const_hash_map_iterator. It uses const_cast to call the non-const version of find() to avoid code duplication. Note that the non-const find() returns a hash_map_iterator, which is convertible to a const_hash_map_iterator.

template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::const_iterator
    hash_map<Key, T, KeyEqual, Hash>::find(const key_type& k) const
{
    return const_cast<hash_map_type*>(this)->find(k);
}

The implementations of both versions of equal_range() are identical, except that one returns a pair of hash_map_iterators while the other returns a pair of const_hash_map_iterators. They both simply forward the request to find(). A hash_map cannot have elements with duplicate keys, so the result of equal_range() for hash_map is always a pair of identical iterators.

template <typename Key, typename T, typename KeyEqual, typename Hash>
std::pair<
    typename hash_map<Key, T, KeyEqual, Hash>::iterator,
    typename hash_map<Key, T, KeyEqual, Hash>::iterator>
    hash_map<Key, T, KeyEqual, Hash>::equal_range(const key_type& k)
{
    auto it = find(k);
    return std::make_pair(it, it);
}

Because hash_map does not allow duplicate keys, count() can only return 1 or 0: 1 if it finds the element, and 0 if it doesn’t. The implementation just wraps a call to find(). The find() method returns the end iterator if it can’t find the element. count() retrieves an end iterator by calling end() in order to compare it.

template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::size_type
    hash_map<Key, T, KeyEqual, Hash>::count(const key_type& k) const
{
    // There are either 1 or 0 elements matching key k.
    // If we can find a match, return 1, otherwise return 0.
    if (find(k) == end()) {
        return 0;
    } else {
        return 1;
    }
}

The final method, operator[], is not required by the standard, but is provided for convenience of the programmer, and to be symmetric with std::map. The prototype is identical to the one for std::map. The comments explain the potentially confusing one-line implementation.

template <typename Key, typename T, typename KeyEqual, typename Hash>
T& hash_map<Key, T, KeyEqual, Hash>::operator[] (const key_type& k)
{
    // It's a bit cryptic, but it basically attempts to insert
    // a new key/value pair of k and a zero-initialized value. Regardless
    // of whether the insert succeeds or fails, insert() returns a pair of
    // an iterator/bool. The iterator refers to a key/value pair, the
    // second element of which is the value we want to return.
    return ((insert(std::make_pair(k, T()))).first)->second;
}

hash_map Bucket Operations

Unordered associative containers should also provide a number of bucket-related methods:

  • bucket_count() returns the number of buckets in the container.
  • max_bucket_count() returns the maximum number of buckets that are supported.
  • bucket(key) returns the index of the bucket to which the given key maps.
  • bucket_size(n) returns the number of elements in the bucket with given index.
  • begin(n), end(n), cbegin(n), cend(n) return local begin and end iterators to the bucket with given index.

Here are the implementations for the hash_map:

template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::size_type
    hash_map<Key, T, KeyEqual, Hash>::bucket_count() const
{
    return mBuckets.size();
}

template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::size_type
    hash_map<Key, T, KeyEqual, Hash>::max_bucket_count() const
{
    return mBuckets.max_size();
}

template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::size_type
    hash_map<Key, T, KeyEqual, Hash>::bucket(const Key& k) const
{
    return const_cast<hash_map_type*>(this)->findElement(k).second;
}

template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::size_type
    hash_map<Key, T, KeyEqual, Hash>::bucket_size(size_type n) const
{
    return mBuckets[n].size();
}

template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::local_iterator
    hash_map<Key, T, KeyEqual, Hash>::begin(size_type n)
{
    return mBuckets[n].begin();
}

The implementations for the other begin(n), end(n), cbegin(n), and cend(n) methods are similar; they simply forward the call to the correct bucket list based on the given index.

Finally, an unordered associative container should also provide load_factor(), max_load_ factor(), rehash(), and reserve() methods. These are omitted for hash_map.

Note on Sequential Containers

The hash_map developed in the preceding sections is an unordered associative container. However, you could also write a sequential container, or an ordered associative container, in which case you would need to follow a different set of requirements. Instead of listing them here, it’s easier to point out that the deque container follows the prescribed sequential container requirements almost exactly. The only difference is that it provides an extra resize() method (not required by the standard). An example of an ordered associative container is the map, on which you can model your own ordered associative containers.

SUMMARY

The final example in this chapter showed almost the complete development of a hash_map unordered associative container and its iterators. This hash_map implementation is given here only to teach you how to write your own Standard Library-compliant containers and iterators. C++ does include its own set of unordered associative containers. You should use those instead of your own implementation.

In the process of reading this chapter, you hopefully gained an appreciation for the steps involved in developing containers. Even if you never write another Standard Library algorithm or container, you understand better the Standard Library’s mentality and capabilities, and you can put it to better use.

This chapter concludes the tour of the Standard Library. Even with all the details given in this book, some features are still omitted. If this material excited you, and you would like more information, consult some of the resources in Appendix B. Don’t feel compelled to use all the features discussed here. Forcing them into your programs without a true need will just complicate your code. However, I encourage you to consider incorporating aspects of the Standard Library into your programs where they make sense. Start with the containers, maybe throw in an algorithm or two, and before you know it, you’ll be a convert!

NOTE

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset