Every Standard Library container takes an Allocator
type as a template parameter, for which the default usually suffices. For example, the vector
template definition looks like this:
template <class T, class Allocator = allocator<T>> class vector;
The container constructors then allow you to specify an object of type Allocator
. This permits you to customize the way the containers allocate memory. Every memory allocation performed by a container is made with a call to the allocate()
method of the Allocator
object. Conversely, every deallocation is performed with a call to the deallocate()
method of the Allocator
object. The Standard Library provides a default Allocator
class called allocator
, which implements these methods as wrappers for operator new
and operator delete
.
If you want containers in your program to use a custom memory allocation and deallocation scheme, you can write your own Allocator
class. There are several reasons for using custom allocators. For example, if the underlying allocator has unacceptable performance, there are alternatives that can be constructed. Or, if memory fragmentation is a problem (a lot of different allocations and deallocations leaving unusable, small holes in memory), a single “pool” of objects of one type can be created, called a memory pool. When OS-specific capabilities, such as shared memory segments, must be allocated, using custom allocators allows the use of Standard Library containers in those shared memory segments. The use of custom allocators is complex, and there are many potential problems if you are not careful, so this should not be approached lightly.
Any class that provides allocate()
, deallocate()
, and several other required methods and type aliases can be used in place of the default allocator
class.
C++17 introduces the concept of polymorphic memory allocators. Basically, the problem with the allocator for a container being specified as a template type parameter is that two containers that are similar but have different allocator types are completely different. For example, two vector<int>
containers with different Allocator
template type parameters are different, and so cannot be assigned to one another.
The polymorphic memory allocators, defined in <memory_resource>
in the std::pmr
namespace, help to solve this problem. The class std::pmr::polymorphic_allocator
is a proper Allocator
class because it satisfies all the requirements, such as having an allocate()
and deallocate()
method. The allocation behavior of a polymorphic_allocator
depends on the memory_resource
it’s given during construction, and not on any template type parameters. As such, different polymorphic_allocator
s can behave in completely different ways when allocating and deallocating memory, even though they all have the same type, that is, polymorphic_allocator
. The standard provides some built-in memory resources that you can use to initialize a polymorphic memory allocator: synchronized_pool_resource
, unsynchronized_pool_resource
, and monotonic_buffer_resource
.
However, in my experience, both custom allocators and polymorphic memory allocators are rather advanced and rarely used features. I’ve never used them myself, so a detailed discussion falls outside the scope of this book. For more information, consult one of the books listed in Appendix B that specifically covers the C++ Standard Library.
The Standard Library provides four stream iterators. These are iterator-like class templates that allow you to treat input and output streams as input and output iterators. Using these stream iterators, you can adapt input and output streams so that they can serve as sources and destinations, respectively, for various Standard Library algorithms. The following stream iterators are available:
ostream_iterator
—an output stream iteratoristream_iterator
—an input stream iteratorThere is also an ostreambuf_iterator
and an istreambuf_iterator
, but these are rarely used and are not further discussed here.
The ostream_iterator
class is an output stream iterator. It is a class template that takes the element type as a type parameter. Its constructor takes an output stream and a delimiter string
to write to the stream following each element. The ostream_iterator
class writes elements using operator<<
.
For example, you can use the ostream_iterator
with the copy()
algorithm to print the elements of a container with only one line of code. The first parameter of copy()
is the start iterator of the range to copy, the second parameter is the end iterator of the range, and the third parameter is the destination iterator:
vector<int> myVector(10);
iota(begin(myVector), end(myVector), 1); // Fill vector with 1,2,3...10
// Print the contents of the vector.
copy(cbegin(myVector), cend(myVector), ostream_iterator<int>(cout, " "));
The output is as follows:
1 2 3 4 5 6 7 8 9 10
You can use the input stream iterator, istream_iterator
, to read values from an input stream using the iterator abstraction. It is a class template that takes the element type as a type parameter. Elements are read using operator>>
. You can use an istream_iterator
as a source for algorithms and container methods.
For example, the following piece of code reads integers from the console until the end of the stream is reached. On Windows, this happens when you press Ctrl+Z followed by Enter, while on Linux you press Enter followed by Ctrl+D. The accumulate()
algorithm is used to sum all the integers together. Note that the default constructor of istream_iterator
creates an end iterator.
cout << "Enter numbers separated by white space." << endl;
cout << "Press Ctrl+Z followed by Enter to stop." << endl;
istream_iterator<int> numbersIter(cin);
istream_iterator<int> endIter;
int sum = accumulate(numbersIter, endIter, 0);
cout << "Sum: " << sum << endl;
Take a moment to reflect on this code. If you remove all the output statements and the variable declarations, the only line left is the call to accumulate()
. Thanks to algorithms and input stream iterators, this single line of code reads any number of integers from the console and sums them together, without using any explicit loops.
The Standard Library provides three iterator adaptors, which are special iterators built on top of other iterators. All three are defined in the <iterator>
header. It’s also possible to write your own iterator adaptors, but this is not covered in this book. Consult one of the books on the Standard Library listed in Appendix B for details.
The Standard Library provides an std::reverse_iterator
class template that iterates through a bidirectional or random access iterator in a reverse direction. Every reversible container in the Standard Library, which happens to be every container that’s part of the standard except forward_list
and the unordered associative containers, supplies a reverse_iterator
type alias and methods called rbegin()
and rend()
. These reverse_iterator
type aliases are of type std::reverse_iterator<T>
with T
equal to the iterator
type alias of the container. The method rbegin()
returns a reverse_iterator
pointing to the last element of the container, and rend()
returns a reverse_iterator
pointing to the element before the first element of the container. Applying operator++
to a reverse_iterator
calls operator--
on the underlying container iterator, and vice versa. For example, iterating over a collection from the beginning to the end can be done as follows:
for (auto iter = begin(collection); iter != end(collection); ++iter) {}
Iterating over the elements in the collection from the end to the beginning can be done using a reverse_iterator
by calling rbegin()
and rend()
. Note that you still call ++iter
:
for (auto iter = rbegin(collection); iter != rend(collection); ++iter) {}
An std::reverse_iterator
is useful mostly with algorithms in the Standard Library that have no equivalents that work in reverse order. For example, the basic find()
algorithm searches for the first element in a sequence. If you want to find the last element in the sequence, you can use a reverse_iterator
instead. Note that when you call an algorithm such as find()
with a reverse_iterator
, it returns a reverse_iterator
as well. You can always obtain the underlying iterator
from a reverse_iterator
by calling the base()
method on the reverse_iterator
. However, due to the implementation details of reverse_iterator
, the iterator
returned from base()
always refers to one element past the element referred to by the reverse_iterator
on which it’s called. In order to get to the same element, you must subtract one.
Here is an example of find()
with a reverse_iterator
:
// The implementation of populateContainer() is identical to that shown in
// Chapter 18, so it is omitted here.
vector<int> myVector;
populateContainer(myVector);
int num;
cout << "Enter a number to find: ";
cin >> num;
auto it1 = find(begin(myVector), end(myVector), num);
auto it2 = find(rbegin(myVector), rend(myVector), num);
if (it1 != end(myVector)) {
cout << "Found " << num << " at position " << it1 - begin(myVector)
<< " going forward." << endl;
cout << "Found " << num << " at position "
<< it2.base() - 1 - begin(myVector) << " going backward." << endl;
} else {
cout << "Failed to find " << num << endl;
}
A possible output of this program is as follows:
Enter a number (0 to quit): 11
Enter a number (0 to quit): 22
Enter a number (0 to quit): 33
Enter a number (0 to quit): 22
Enter a number (0 to quit): 11
Enter a number (0 to quit): 0
Enter a number to find: 22
Found 22 at position 1 going forward.
Found 22 at position 3 going backward.
As Chapter 18 mentions, algorithms like copy()
don’t insert elements into a container; they simply replace old elements in a range with new ones. In order to make algorithms like copy()
more useful, the Standard Library provides three insert iterator adaptors that actually insert elements into a container: insert_iterator
, back_insert_iterator
, and front_insert_iterator
. They are all templatized on a container type, and take the actual container reference in their constructor. Because they supply the necessary iterator interfaces, these adaptors can be used as the destination iterators of algorithms like copy()
. However, instead of replacing elements in the container, they make calls on their container to actually insert new elements.
The basic insert_iterator
calls insert(position, element)
on the container, the back_insert_iterator
calls push_back(element)
, and the front_insert_iterator
calls push_front(element)
.
The following example uses a back_insert_iterator
with the copy_if()
algorithm to populate vectorTwo
with all elements from vectorOne
that are not equal to 100:
vector<int> vectorOne, vectorTwo;
populateContainer(vectorOne);
back_insert_iterator<vector<int>> inserter(vectorTwo);
copy_if(cbegin(vectorOne), cend(vectorOne), inserter,
[](int i){ return i != 100; });
copy(cbegin(vectorTwo), cend(vectorTwo), ostream_iterator<int>(cout, " "));
As you can see, when you use insert iterators, you don’t need to size the destination containers ahead of time.
You can also use the std::back_inserter()
utility function to create a back_insert_iterator
. In the previous example, you can remove the line that defines the inserter
variable, and rewrite the copy_if()
call as follows. The result is exactly the same as the previous implementation:
copy_if(cbegin(vectorOne), cend(vectorOne),
back_inserter(vectorTwo), [](int i){ return i != 100; });
With C++17’s template argument deduction for constructors, this can also be written as follows:
copy_if(cbegin(vectorOne), cend(vectorOne),
back_insert_iterator(vectorTwo), [](int i) { return i != 100; });
The front_insert_iterator
and insert_iterator
work similarly, except that the insert_iterator
also takes an initial iterator position in its constructor, which it passes to the first call to insert(position, element)
. Subsequent iterator position hints are generated based on the return value from each insert()
call.
One benefit of using an insert_iterator
is that it allows you to use associative containers as destinations of the modifying algorithms. Chapter 18 explains that the problem with associative containers is that you are not allowed to modify the elements over which you iterate. By using an insert_iterator
, you can instead insert elements. Associative containers actually support a form of insert()
that takes an iterator position, and are supposed to use the position as a “hint,” which they can ignore. When you use an insert_iterator
on an associative container, you can pass the begin()
or end()
iterator of the container to use as the hint. The insert_iterator
modifies the iterator hint that it passes to insert()
after each call to insert()
, such that the position is one past the just-inserted element.
Here is the previous example modified so that the destination container is a set
instead of a vector
:
vector<int> vectorOne;
set<int> setOne;
populateContainer(vectorOne);
insert_iterator<set<int>> inserter(setOne, begin(setOne));
copy_if(cbegin(vectorOne), cend(vectorOne), inserter,
[](int i){ return i != 100; });
copy(cbegin(setOne), cend(setOne), ostream_iterator<int>(cout, " "));
Similar to the back_insert_iterator
example, you can use the std::inserter()
utility function to create an insert_iterator
:
copy_if(cbegin(vectorOne), cend(vectorOne),
inserter(setOne, begin(setOne)),
[](int i){ return i != 100; });
Or, you can use C++17’s template argument deduction for constructors:
copy_if(cbegin(vectorOne), cend(vectorOne),
insert_iterator(setOne, begin(setOne)),
[](int i) { return i != 100; });
Chapter 9 discusses move semantics, which can be used to prevent unnecessary copying in cases where you know that the source object will be destroyed after an assignment operation or copy construction. There is an iterator adaptor called std::move_iterator
. The dereferencing operator for a move_iterator
automatically converts the value to an rvalue reference, which means that the value can be moved to a new destination without the overhead of copying. Before you can use move semantics, you need to make sure your objects are supporting it. The following MoveableClass
supports move semantics. For more details, see Chapter 9.
class MoveableClass
{
public:
MoveableClass() {
cout << "Default constructor" << endl;
}
MoveableClass(const MoveableClass& src) {
cout << "Copy constructor" << endl;
}
MoveableClass(MoveableClass&& src) noexcept {
cout << "Move constructor" << endl;
}
MoveableClass& operator=(const MoveableClass& rhs) {
cout << "Copy assignment operator" << endl;
return *this;
}
MoveableClass& operator=(MoveableClass&& rhs) noexcept {
cout << "Move assignment operator" << endl;
return *this;
}
};
The constructors and assignment operators are not doing anything useful here, except printing a message to make it easy to see which one is being called. Now that you have this class, you can define a vector
and store a few MoveableClass
instances in it as follows:
vector<MoveableClass> vecSource;
MoveableClass mc;
vecSource.push_back(mc);
vecSource.push_back(mc);
The output could be as follows:
Default constructor // [1]
Copy constructor // [2]
Copy constructor // [3]
Move constructor // [4]
The second line of the code creates a MoveableClass
instance by using the default constructor, [1]. The first push_back()
call triggers the copy constructor to copy mc
into the vector
, [2]. After this operation, the vector has space for one element, the first copy of mc
. Note that this discussion is based on the growth strategy and the initial size of a vector
as implemented by Microsoft Visual C++ 2017. The C++ standard does not specify the initial capacity of a vector
or its growth strategy, so the output can be different with different compilers.
The second push_back()
call triggers the vector
to resize itself, to allocate space for the second element. This resizing causes the move constructor to be called to move every element from the old vector
to the new resized vector
, [4]. The copy constructor is triggered to copy mc
a second time into the vector
, [3]. The order of moving and copying is undefined, so [3] and [4] could be reversed.
You can create a new vector
called vecOne
that contains a copy of the elements from vecSource
as follows:
vector<MoveableClass> vecOne(cbegin(vecSource), cend(vecSource));
Without using move_iterator
s, this code triggers the copy constructor two times, once for every element in vecSource
:
Copy constructor
Copy constructor
By using std::make_move_iterator()
to create move_iterator
s, the move constructor of MoveableClass
is called instead of the copy constructor:
vector<MoveableClass> vecTwo(make_move_iterator(begin(vecSource)),
make_move_iterator(end(vecSource)));
This generates the following output:
Move constructor
Move constructor
You can also use C++17’s template argument deduction for constructors:
vector<MoveableClass> vecTwo(move_iterator(begin(vecSource)),
move_iterator(end(vecSource)));
The Standard Library includes many useful containers, algorithms, and iterators that you can use in your applications. It is impossible, however, for any library to include all possible utilities that all potential clients might need. Thus, the best libraries are extensible: they allow clients to adapt and add to the basic capabilities to obtain exactly the functionality they require. The Standard Library is inherently extensible because of its fundamental structure of separating data from the algorithms that operate on them. You can write your own containers that can work with the Standard Library algorithms by providing iterators that conforms to the Standard Library guidelines. Similarly, you can write your own algorithms that work with iterators from the standard containers. Note that you are not allowed to put your own containers and algorithms in the std
namespace.
If you sit down to write an algorithm or container in C++, you can either make it adhere to the Standard Library conventions or not. For simple containers and algorithms, it might not be worth the extra effort to follow the Standard Library requirements. However, for substantial code that you plan to reuse, the effort pays off. First, the code will be easier for other C++ programmers to understand, because you follow well-established interface guidelines. Second, you will be able to use your container or algorithm with the other parts of the Standard Library (algorithms or containers) without needing to provide special hacks or adaptors. Finally, it will force you to employ the necessary rigor required to develop solid code.
Chapter 18 describes a useful set of algorithms that are part of the Standard Library, but you will inevitably encounter situations in your programs for which you need new algorithms. When that happens, it is usually not difficult to write your own algorithm that works with Standard Library iterators just like the standard algorithms.
Suppose that you want to find all the elements matching a predicate in a given range. The find()
and find_if()
algorithms are the most likely candidates, but each returns an iterator referring to only one element. You can use copy_if()
to find all elements matching a given predicate, but it fills the output with copies of the found elements. If you want to avoid copies, you can use copy_if()
with a back_insert_iterator
into a vector<reference_wrapper<T>>
, but this does not give you the position of the found elements. In fact, there is no standard algorithm to get iterators to all the elements matching a predicate. However, you can write your own version of this functionality called find_all()
.
The first task is to define the function prototype. You can follow the model established by copy_if()
, that is, a function template with three template type parameters: the input iterator type, the output iterator type, and the predicate type. The arguments of the function are start and end iterators of the input sequence, a start iterator of the output sequence, and a predicate object. As with copy_if()
, the algorithm returns an iterator into the output sequence that is one-past-the-last element stored in the output sequence. Here is the prototype:
template <typename InputIterator, typename OutputIterator, typename Predicate>
OutputIterator find_all(InputIterator first, InputIterator last,
OutputIterator dest, Predicate pred);
Another option would be to omit the output iterator, and to return an iterator into the input sequence that iterates over all the matching elements in the input sequence. This would require you to write your own iterator class, which is discussed later in this chapter.
The next task is to write the implementation. The find_all()
algorithm iterates over all elements in the input sequence, calls the predicate on each element, and stores iterators of matching elements in the output sequence. Here is the implementation:
template <typename InputIterator, typename OutputIterator, typename Predicate>
OutputIterator find_all(InputIterator first, InputIterator last,
OutputIterator dest, Predicate pred)
{
while (first != last) {
if (pred(*first)) {
*dest = first;
++dest;
}
++first;
}
return dest;
}
Similar to copy_if()
, the algorithm only overwrites existing elements in the output sequence, so make sure the output sequence is large enough to hold the result, or use an iterator adaptor such as back_insert_iterator
, as demonstrated in the following code. After finding all matching elements, the code counts the number of elements found, which is the number of iterators in matches
. Then, it iterates through the result, printing each element.
vector<int> vec{ 3, 4, 5, 4, 5, 6, 5, 8 };
vector<vector<int>::iterator> matches;
find_all(begin(vec), end(vec), back_inserter(matches),
[](int i){ return i == 5; });
cout << "Found " << matches.size() << " matching elements: " << endl;
for (const auto& it : matches) {
cout << *it << " at position " << (it - cbegin(vec)) << endl;;
}
The output is as follows:
Found 3 matching elements:
5 at position 2
5 at position 4
5 at position 6
Some algorithm implementations need additional information about their iterators. For example, they might need to know the type of the elements referred to by the iterator in order to store temporary values, or perhaps they want to know whether the iterator is bidirectional or random access.
C++ provides a class template called iterator_traits
that allows you to find this information. You instantiate the iterator_traits
class template with the iterator type of interest, and access one of five type aliases: value_type
, difference_type
, iterator_category
, pointer
, or reference
. For example, the following function template declares a temporary variable of the type that an iterator of type IteratorType
refers to. Note the use of the typename
keyword in front of the iterator _traits
line. You must specify typename
explicitly whenever you access a type based on one or more template parameters. In this case, the template parameter IteratorType
is used to access the value_type
type.
#include <iterator>
template <typename IteratorType>
void iteratorTraitsTest(IteratorType it)
{
typename std::iterator_traits<IteratorType>::value_type temp;
temp = *it;
cout << temp << endl;
}
This function can be tested with the following code:
vector<int> v{ 5 };
iteratorTraitsTest(cbegin(v));
With this code, the variable temp
in iteratorTraitsTest()
is of type int
. The output is 5
.
In this example, the auto
keyword could be used to simplify the code, but that wouldn’t show you how to use iterator_traits
.
The C++ standard contains a list of requirements that any container must fulfill in order to qualify as a Standard Library container.
Additionally, if you want your container to be sequential (like a vector
), ordered associative (like a map
), or unordered associative (like an unordered_map
), it must conform to supplementary requirements.
My suggestion when writing a new container is to write the basic container first, following the general Standard Library rules such as making it a class template, but without worrying too much yet about the specific details of Standard Library conformity. After you’ve developed the basic implementation, you can add the iterator and methods so that it can work with the Standard Library framework. This chapter takes that approach to develop a hash map.
It is recommended to use the standard C++ unordered associative containers, also called hash tables, instead of implementing your own. These unordered associative containers, explained in Chapter 17, are called unordered _map
, unordered_multimap
, unordered_set
, and unordered_multiset
. The hash_map
in this chapter is only used to demonstrate writing Standard Library containers.
C++11 added support for hash tables, which are discussed in Chapter 17. However, pre-C++11 did not include hash tables. Unlike the Standard Library map
and set
, which provide logarithmic insertion, lookup, and deletion times, a hash table provides constant time insertion, deletion, and lookup in the average case, linear in the worst case. Instead of storing elements in sorted order, a hash table hashes, or maps, each element to a particular bucket. As long as the number of elements stored isn’t significantly greater than the number of buckets, and the hash function distributes the elements uniformly between the buckets, the insertion, deletion, and lookup operations all run in constant time.
This section assumes that you are familiar with hashed data structures. If you are not, consult Chapter 17, which includes a discussion on hash tables, or one of the standard data structure texts listed in Appendix B.
This section implements a simple, but fully functional, hash_map
. Like a map
, a hash_map
stores key/value pairs. In fact, the operations it provides are almost identical to those provided by the map
, but with different performance characteristics.
This hash_map
implementation uses chained hashing (also called open hashing), and does not attempt to provide advanced features such as rehashing. Chapter 17 explains the concept of chained hashing in the section on unordered associative containers.
The first choice when writing a hash_map
is how to handle hash functions. Recalling the adage that a good abstraction makes the easy case easy and the hard case possible, a good hash_map
interface allows clients to specify their own hash function and number of buckets in order to customize the hashing behavior for their particular workload. On the other hand, clients that do not have the desire, or ability, to write a good hash function and choose a number of buckets should still be able to use the container without doing so. One solution is to allow clients to provide a hash function and number of buckets in the hash_map
constructor, but also to provide default values. In this implementation, the hash function is a simple function object containing just a single function call operator. The function object is templatized on the key type that it hashes in order to support a templatized hash_map
container. Template specialization can be used to write custom hash functions for certain types. Here is the basic hash
function object:
template <typename T>
class hash
{
public:
size_t operator()(const T& key) const;
};
Note that everything for the hash_map
implementation is inside a ProCpp
namespace so that names don’t clash with already existing names. The implementation of the hash
function call operator is tricky because it must apply to keys of any type. The following implementation computes an integer-sized hash value by simply treating the key as a sequence of bytes:
// Calculate a hash by treating the key as a sequence
// of bytes and summing the ASCII values of the bytes.
template <typename T>
size_t hash<T>::operator()(const T& key) const
{
const size_t bytes = sizeof(key);
size_t sum = 0;
for (size_t i = 0; i < bytes; ++i) {
unsigned char b = *(reinterpret_cast<const unsigned char*>(&key) + i);
sum += b;
}
return sum;
}
Unfortunately, when using this hashing method on string
s, the function calculates the hash of the entire string
object, and not just of the actual text. The actual text is probably on the heap, and the string object only contains a length and a pointer to the text on the heap. The pointer will be different, even if the text it refers to is the same. The result is that two string
objects with the same text will generate different hash values. Therefore, it’s a good idea to provide a specialization of the hash
template for string
s, and in general for any class that contains dynamically allocated memory. Template specialization is discussed in detail in Chapter 12.
// A hash specialization for strings
template <>
class hash<std::string>
{
public:
size_t operator()(const std::string& key) const;
};
// Calculate a hash by summing the ASCII values of all characters.
size_t hash<std::string>::operator()(const std::string& key) const
{
size_t sum = 0;
for (auto c : key) {
sum += static_cast<unsigned char>(c);
}
return sum;
}
If you want to use other pointer types or objects as the key, you should write your own hash specialization for those types.
The hash functions shown in this section are very basic. They do not guarantee uniform hashing for all key universes. If you need more mathematically rigorous hash functions, or if you don’t know what uniform hashing is, consult an algorithm’s reference from Appendix B.
A hash_map
supports three basic operations: insertion, deletion, and lookup. It is also swappable. Of course, it provides a constructor and destructor as well. The copy and move constructors are explicitly defaulted, and the copy and move assignment operators are provided. Here is the public portion of the hash_map
class template:
template <typename Key, typename T, typename KeyEqual = std::equal_to<>,
typename Hash = hash<Key>>
class hash_map
{
public:
using key_type = Key;
using mapped_type = T;
using value_type = std::pair<const Key, T>;
virtual ~hash_map() = default; // Virtual destructor
// Throws invalid_argument if the number of buckets is illegal.
explicit hash_map(const KeyEqual& equal = KeyEqual(),
size_t numBuckets = 101, const Hash& hash = Hash());
// Copy constructor
hash_map(const hash_map<Key, T, KeyEqual, Hash>& src) = default;
// Move constructor
hash_map(hash_map<Key, T, KeyEqual, Hash>&& src) noexcept = default;
// Copy assignment operator
hash_map<Key, T, KeyEqual, Hash>& operator=(
const hash_map<Key, T, KeyEqual, Hash>& rhs);
// Move assignment operator
hash_map<Key, T, KeyEqual, Hash>& operator=(
hash_map<Key, T, KeyEqual, Hash>&& rhs) noexcept;
// Inserts the key/value pair x.
void insert(const value_type& x);
// Removes the element with key k, if it exists.
void erase(const key_type& k);
// Removes all elements.
void clear() noexcept;
// Find returns a pointer to the element with key k.
// Returns nullptr if no element with that key exists.
value_type* find(const key_type& k);
const value_type* find(const key_type& k) const;
// operator[] finds the element with key k, or inserts an
// element with that key if none exists yet. Returns a reference to
// the value corresponding to that key.
T& operator[] (const key_type& k);
// Swaps two hash_maps.
void swap(hash_map<Key, T, KeyEqual, Hash>& other) noexcept;
private:
// Implementation details not shown yet
};
As you can see, the key and value types are both template parameters, similar as for the Standard Library map
. A hash_map
stores pair<const Key, T>
as the actual elements in the container. The insert()
, erase()
, find()
, clear()
, and operator[]
methods are straightforward. However, a few aspects of this interface require further explanation.
Like a map
, set
, and other standard containers, a hash_map
allows the client to specify the comparison type as a template parameter and to pass a specific comparison object of that type in the constructor. Unlike a map
and set
, a hash_map
does not sort elements by key, but must still compare keys for equality. Thus, instead of using less
as the default comparator, it uses the transparent equal_to<>
comparator. The comparison object is used only to detect attempts to insert duplicate keys into the container.
You should be able to change the hashing function to make it better suit the type of elements you want to store in the hash map. Thus, the hash_map
template takes four template parameters: the key type, the value type, the comparator type, and the hash type.
The hash_map
class template defines three type aliases:
using key_type = Key;
using mapped_type = T;
using value_type = std::pair<const Key, T>;
The value_type
, in particular, is useful for referring to the more cumbersome pair<const Key, T>
type. As you will see, these type aliases are required to satisfy the Standard Library container requirements.
After you finalize the hash_map
interface, you need to choose the implementation model. The basic hash table structure generally consists of a fixed number of buckets, each of which can store one or more elements. The buckets should be accessible in constant time based on a bucket-id (the result of hashing a key). Thus, a vector
is the most appropriate container for the buckets. Each bucket must store a list of elements, so the Standard Library list
can be used as the bucket type. Thus, the final structure is a vector
of list
s of pair<const Key, T>
elements.1 Here are the private
members of the hash_map
class:
private:
using ListType = std::list<value_type>;
std::vector<ListType> mBuckets;
size_t mSize = 0;
KeyEqual mEqual;
Hash mHash;
Without the type aliases for value_type
and ListType
, the line declaring mBuckets
would look like this:
std::vector<std::list<std::pair<const Key, T>>> mBuckets;
The mEqual
and mHash
members store the comparison and hashing objects, respectively, and mSize
stores the number of elements currently in the container.
The constructor initializes all the fields. It constructs mBuckets
with the correct number of buckets. Unfortunately, the template syntax is somewhat dense. If the syntax confuses you, consult Chapter 12 for details on writing class templates.
// Construct mBuckets with the correct number of buckets.
template <typename Key, typename T, typename KeyEqual, typename Hash>
hash_map<Key, T, KeyEqual, Hash>::hash_map(
const KeyEqual& equal, size_t numBuckets, const Hash& hash)
: mBuckets(numBuckets), mEqual(equal), mHash(hash)
{
if (numBuckets == 0) {
throw std::invalid_argument("Number of buckets must be positive");
}
}
The implementation requires at least one bucket, so the constructor enforces that restriction.
Each of the three major operations (lookup, insertion, and deletion) requires code to find an element with a given key. Thus, it is helpful to have a private
helper method that performs that task. findElement()
first uses the hash object to calculate the hash of the key and limits the calculated hash value to the number of hash buckets by taking the modulo of the calculated value. Then, it searches all the elements in that bucket for an element with a key matching the given key. The elements stored are key/value pairs, so the actual comparison must be done on the first field of the element. The comparison function object specified in the constructor is used to perform the comparison.
template <typename Key, typename T, typename KeyEqual, typename Hash>
std::pair<
typename hash_map<Key, T, KeyEqual, Hash>::ListType::iterator, size_t>
hash_map<Key, T, KeyEqual, Hash>::findElement(const key_type& k)
{
// Hash the key to get the bucket.
size_t bucket = mHash(k) % mBuckets.size();
// Search for the key in the bucket.
auto iter = find_if(begin(mBuckets[bucket]), end(mBuckets[bucket]),
[this, &k](const auto& element) { return mEqual(element.first, k); });
// Return a pair of the iterator and the bucket index.
return std::make_pair(iter, bucket);
}
findElement()
returns a pair
containing an iterator and a bucket index. The bucket index is the index of the bucket to which the given key maps, independent of whether or not the given key is actually in the container. The returned iterator refers to an element in the bucket list
, the list
representing the bucket to which the key mapped. If the element is found, the iterator refers to that element; otherwise, it is the end iterator for that list
.
The syntax in the function header of this method is somewhat confusing, particularly the use of the typename
keyword. You must use the typename
keyword whenever you are using a type that is dependent on a template parameter. Specifically, the type ListType::iterator
, which is list<pair<const Key, T>>::iterator
, is dependent on both the Key
and T
template parameters.
You can implement the find()
method as a simple wrapper for findElement()
:
template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::value_type*
hash_map<Key, T, KeyEqual, Hash>::find(const key_type& k)
{
// Use the findElement() helper, and C++17 structured bindings.
auto[it, bucket] = findElement(k);
if (it == end(mBuckets[bucket])) {
// Element not found -- return nullptr.
return nullptr;
}
// Element found -- return a pointer to it.
return &(*it);
}
The const
version of find()
uses a const_cast
to forward the request to the non-const
version to avoid code duplication:
template <typename Key, typename T, typename KeyEqual, typename Hash>
const typename hash_map<Key, T, KeyEqual, Hash>::value_type*
hash_map<Key, T, KeyEqual, Hash>::find(const key_type& k) const
{
return const_cast<hash_map<Key, T, KeyEqual, Hash>*>(this)->find(k);
}
The operator[]
implementation uses findElement()
and if it does not find the element, it inserts it as follows:
template <typename Key, typename T, typename KeyEqual, typename Hash>
T& hash_map<Key, T, KeyEqual, Hash>::operator[] (const key_type& k)
{
// Try to find the element. If it doesn't exist, add a new element.
auto[it, bucket] = findElement(k);
if (it == end(mBuckets[bucket])) {
mSize++;
mBuckets[bucket].push_back(std::make_pair(k, T()));
return mBuckets[bucket].back().second;
} else {
return it->second;
}
}
insert()
must first check if an element with that key is already in the hash_map
. If not, it can add the element to the list
in the appropriate bucket. Note that findElement()
returns the bucket index to which a key hashes, even if an element with that key is not found:
template <typename Key, typename T, typename KeyEqual, typename Hash>
void hash_map<Key, T, KeyEqual, Hash>::insert(const value_type& x)
{
// Try to find the element.
auto[it, bucket] = findElement(x.first);
if (it != end(mBuckets[bucket])) {
// The element already exists.
return;
} else {
// We didn't find the element, so insert a new one.
mSize++;
mBuckets[bucket].push_back(x);
}
}
Note that this implementation of insert()
returns void
, so the caller does not know whether the element was inserted or if it was already in the hash_map
. This shortcoming is resolved later in this chapter, once iterators have been implemented for hash_map
.
erase()
follows the same pattern as insert()
: It first attempts to find the element by calling findElement()
. If the element exists, it erases it from the list in the appropriate bucket. Otherwise, it does nothing.
template <typename Key, typename T, typename KeyEqual, typename Hash>
void hash_map<Key, T, KeyEqual, Hash>::erase(const key_type& k)
{
// First, try to find the element.
auto[it, bucket] = findElement(k);
if (it != end(mBuckets[bucket])) {
// The element exists -- erase it.
mBuckets[bucket].erase(it);
mSize--;
}
}
clear()
simply clears each bucket, and sets the size of the hash_map
to zero:
template <typename Key, typename T, typename KeyEqual, typename Hash>
void hash_map<Key, T, KeyEqual, Hash>::clear() noexcept
{
// Call clear on each bucket.
for (auto& bucket : mBuckets) {
bucket.clear();
}
mSize = 0;
}
The swap()
method just swaps all data members using std::swap()
:
template <typename Key, typename T, typename KeyEqual, typename Hash>
void hash_map<Key, T, KeyEqual, Hash>::swap(
hash_map<Key, T, KeyEqual, Hash>& other) noexcept
{
using std::swap;
swap(mBuckets, other.mBuckets);
swap(mSize, other.mSize);
swap(mEqual, other.mEqual);
swap(mHash, other.mHash);
}
The following standalone swap()
function is also provided, which simply forwards to the swap()
method:
template <typename Key, typename T, typename KeyEqual, typename Hash>
void swap(hash_map<Key, T, KeyEqual, Hash>& first,
hash_map<Key, T, KeyEqual, Hash>& second) noexcept
{
first.swap(second);
}
Here are the implementations of the copy and move assignment operators. See Chapter 9 for a discussion of the copy-and-swap idiom.
template <typename Key, typename T, typename KeyEqual, typename Hash>
hash_map<Key, T, KeyEqual, Hash>&
hash_map<Key, T, KeyEqual, Hash>::operator=(
const hash_map<Key, T, KeyEqual, Hash>& rhs)
{
// check for self-assignment
if (this == &rhs) {
return *this;
}
// Copy-and-swap idiom
auto copy = rhs; // Do all the work in a temporary instance
swap(copy); // Commit the work with only non-throwing operations
return *this;
}
template <typename Key, typename T, typename KeyEqual, typename Hash>
hash_map<Key, T, KeyEqual, Hash>&
hash_map<Key, T, KeyEqual, Hash>::operator=(
hash_map<Key, T, KeyEqual, Hash>&& rhs) noexcept
{
swap(rhs);
return *this;
}
Here is a small test program demonstrating the basic hash_map
class template:
hash_map<int, int> myHash;
myHash.insert(make_pair(4, 40));
myHash.insert(make_pair(6, 60));
// x will have type hash_map<int, int>::value_type*
auto x = myHash.find(4);
if (x != nullptr) {
cout << "4 maps to " << x->second << endl;
} else {
cout << "cannot find 4 in map" << endl;
}
myHash.erase(4);
auto x2 = myHash.find(4);
if (x2 != nullptr) {
cout << "4 maps to " << x2->second << endl;
} else {
cout << "cannot find 4 in map" << endl;
}
myHash[4] = 35;
myHash[4] = 60;
auto x3 = myHash.find(4);
if (x3 != nullptr) {
cout << "4 maps to " << x3->second << endl;
} else {
cout << "cannot find 4 in map" << endl;
}
// Test std::swap().
hash_map<int, int> other(std::equal_to<>(), 11);
swap(other, myHash);
// Test copy construction and copy assignment.
hash_map<int, int> myHash2(other);
hash_map<int, int> myHash3;
myHash3 = myHash2;
// Test move construction and move assignment.
hash_map<int, int> myHash4(std::move(myHash3));
hash_map<int, int> myHash5;
myHash5 = std::move(myHash4);
The output is as follows:
4 maps to 40
cannot find 4 in map
4 maps to 60
The basic hash_map
shown in the previous section follows the spirit, but not the letter, of the Standard Library. For most purposes, the preceding implementation is good enough. However, if you want to use the Standard Library algorithms on your hash_map
, you must do a bit more work. The C++ standard specifies methods and type aliases that a data structure must provide in order to qualify as a Standard Library container.
The C++ standard specifies that every Standard Library container must provide the following public
type aliases.
TYPE NAME | DESCRIPTION |
value_type |
The element type stored in the container |
reference |
A reference to the element type stored in the container |
const_reference |
A const reference to the element type stored in the container |
iterator |
The type for iterating over elements of the container |
const_iterator |
A version of iterator for iterating over const elements of the container |
size_type |
A type that can represent the number of elements in the container; this is usually just size_t (from <cstddef> ). |
difference_type |
A type that can represent the difference of two iterator s for the container; this is usually just ptrdiff_t (from <cstddef> ). |
Here are the definitions for the hash_map
class template of all these type aliases except iterator
and const_iterator
. Writing an iterator is covered in detail in a subsequent section. Note that value_type
(plus key_type
and mapped_type
, which are discussed later) was already defined in the previous version of the hash_map
. This implementation also adds a type alias hash_map_type
to give a shorter name to a specific template instantiation of hash_map
:
template <typename Key, typename T, typename KeyEqual = std::equal_to<>,
typename Hash = hash<Key>>
class hash_map
{
public:
using key_type = Key;
using mapped_type = T;
using value_type = std::pair<const Key, T>;
using reference = value_type&;
using const_reference = const value_type&;
using size_type = size_t;
using difference_type = ptrdiff_t;
using hash_map_type = hash_map<Key, T, KeyEqual, Hash>;
// Remainder of class definition omitted for brevity
};
In addition to the obligatory type aliases, every container must provide the following methods.
METHOD | DESCRIPTION | WORST-CASE COMPLEXITY |
Default constructor | Constructs an empty container | Constant |
Copy constructor | Performs a deep copy of the container | Linear |
Move constructor | Performs a move constructing operation | Constant |
Copy assignment operator | Performs a deep copy of the container | Linear |
Move assignment operator | Performs a move assignment operation | Constant |
Destructor | Destroys dynamically allocated memory; this method calls destructor on all elements left in the container. | Linear |
iterator begin(); const_iterator begin() const; |
Returns an iterator referring to the first element in the container | Constant |
iterator end(); const_iterator end() const; |
Returns an iterator referring to one-past-the-last element in the container | Constant |
const_iterator cbegin() const; |
Returns a const iterator referring to the first element in the container |
Constant |
const_iterator cend() const; |
Returns a const iterator referring to one-past-the-last element in the container |
Constant |
operator== operator!= |
Comparison operators that compare two containers, element by element | Linear |
void swap(Container&) noexcept; |
Swaps the contents of the container passed to the method with the object on which the method is called | Constant |
size_type size() const; |
Returns the number of elements in the container | Constant |
size_type max_size() const; |
Returns the maximum number of elements the container can hold | Constant |
bool empty() const; |
Returns whether the container has any elements | Constant |
In this hash_map
example, comparison operators are omitted. Implementing them would be a good exercise for you to try, but you first have to think about the semantics of equality for two hash_map
s. One possibility could be that two hash_map
s are equal if and only if they have exactly the same number of buckets with the same contents. Similarly, you'll have to think about what it means for a hash_map
to be less than another hash_map
. An option is to define it as a pairwise comparison of the elements.
The following code snippet shows the declarations of the remaining methods except for begin()
, end()
, cbegin()
, and cend()
. Those are covered in the next section.
template <typename Key, typename T, typename KeyEqual = std::equal_to<>,
typename Hash = hash<Key>>
class hash_map
{
public:
// Type aliases omitted for brevity
// Size methods
bool empty() const;
size_type size() const;
size_type max_size() const;
// Other methods omitted for brevity
};
The implementations of size()
and empty()
are easy because the hash_map
implementation tracks its size in the mSize
data member. Note that size_type
is one of the type aliases defined in the class. Because it is a member of the class, such a return type in the implementation must be fully qualified with typename hash_map<Key, T, KeyEqual, Hash>
:
template <typename Key, typename T, typename KeyEqual, typename Hash>
bool hash_map<Key, T, KeyEqual, Hash>::empty() const
{
return mSize == 0;
}
template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::size_type
hash_map<Key, T, KeyEqual, Hash>::size() const
{
return mSize;
}
max_size()
is a little trickier. At first, you might think the maximum size of a hash_map
container is the sum of the maximum size of all the list
s. However, the worst-case scenario is that all the elements hash to the same bucket. Thus, the maximum size it can claim to support is the maximum size of a single list
:
template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::size_type
hash_map<Key, T, KeyEqual, Hash>::max_size() const
{
return mBuckets[0].max_size();
}
The most important container requirement is the iterator. In order to work with the generic algorithms, every container must provide an iterator for accessing the elements in the container. Your iterator should generally provide overloaded operator*
and operator->
, plus some other operations depending on its specific behavior. As long as your iterator provides the basic iteration operations, everything should be fine.
The first decision to make about your iterator is what kind it will be: forward, bidirectional, or random access. Random-access iterators don’t make much sense for associative containers, so bidirectional seems like the logical choice for a hash_map
iterator. That means you must also provide operator++
, operator--
, operator==
, and operator!=
. Consult Chapter 17 for more details on the requirements for the different iterators.
The second decision is how to order the elements of your container. The hash_map
is unsorted, so iterating in a sorted order is probably too difficult. Instead, your iterator can just step through the buckets, starting with the elements in the first bucket and progressing to those in the last bucket. This order will appear random to the client, but will be consistent and repeatable.
The third decision is how to represent your iterator internally. The implementation is usually quite dependent on the internal implementation of the container. The first purpose of an iterator is to refer to a single element in the container. In the case of a hash_map
, each element is in a Standard Library list
, so perhaps a hash_map
iterator can be a wrapper around a list
iterator referring to the element in question. However, the second purpose of a bidirectional iterator is to allow the client to progress to the next or previous element from the current one. In order to progress from one bucket to the next, you need to track the current bucket and the hash_map
object to which the iterator refers.
Once you’ve chosen your implementation, you must decide on a consistent representation for the end iterator. Recall that the end iterator should really be the “past-the-end” marker: the iterator that’s reached by applying ++
to an iterator referring to the final element in the container. A hash_map
iterator can use as its end iterator the end iterator of the list
of the final bucket in the hash_map
.
A container needs to provide both a const
iterator and a non-const
iterator. The non-const
iterator must be convertible to a const
iterator. This implementation defines a const_hash_map_iterator
class with hash_map_iterator
deriving from it.
Given the decisions made in the previous section, it’s time to define the const_hash_map_iterator
class. The first thing to note is that each const_hash_map_iterator
object is an iterator for a specific instantiation of the hash_map
class. In order to provide this one-to-one mapping, the const_hash_map_iterator
must also be a class template with the hash map type as a template parameter called HashMap
.
The main question in the class definition is how to conform to the bidirectional iterator requirements. Recall that anything that behaves like an iterator is an iterator. Your class is not required to derive from another class in order to qualify as a bidirectional iterator. However, if you want your iterator to be usable in the generic algorithms functions, you must specify its traits. The discussion earlier in this chapter explains that iterator_traits
is a class template that defines, for each iterator type, five type aliases: value_type
, difference_type
, iterator_category
, pointer
, and reference
. The iterator_traits
class template coud be partially specialized for your new iterator type if you want. Alternatively, the default implementation of the iterator_traits
class template just grabs the five type aliases out of the iterator class itself. Thus, you can simply define those type aliases directly in your iterator class. The const_hash_map_iterator
is a bidirectional iterator, so you specify bidirectional_iterator_tag
as the iterator category. Other legal iterator categories are input_iterator_tag
, output_iterator_tag
, forward_iterator_tag
, and random_access_iterator_tag
. For the const_hash_map_iterator
, the element type is typename HashMap::value_type
.
Here is the basic const_hash_map_iterator
class definition:
template <typename HashMap>
class const_hash_map_iterator
{
public:
using value_type = typename HashMap::value_type;
using difference_type = ptrdiff_t;
using iterator_category = std::bidirectional_iterator_tag;
using pointer = value_type*;
using reference = value_type&;
using list_iterator_type = typename HashMap::ListType::const_iterator;
// Bidirectional iterators must supply a default constructor.
// Using an iterator constructed with the default constructor
// is undefined, so it doesn't matter how it's initialized.
const_hash_map_iterator() = default;
const_hash_map_iterator(size_t bucket, list_iterator_type listIt,
const HashMap* hashmap);
// Don't need to define a copy constructor or operator= because the
// default behavior is what we want.
// Don't need destructor because the default behavior
// (not deleting mHashmap) is what we want!
const value_type& operator*() const;
// Return type must be something to which -> can be applied.
// Return a pointer to a pair<const Key, T>, to which the compiler
// will apply -> again.
const value_type* operator->() const;
const_hash_map_iterator<HashMap>& operator++();
const_hash_map_iterator<HashMap> operator++(int);
const_hash_map_iterator<HashMap>& operator--();
const_hash_map_iterator<HashMap> operator--(int);
// The following are ok as member functions because we don't
// support comparisons of different types to this one.
bool operator==(const const_hash_map_iterator<HashMap>& rhs) const;
bool operator!=(const const_hash_map_iterator<HashMap>& rhs) const;
protected:
size_t mBucketIndex = 0;
list_iterator_type mListIterator;
const HashMap* mHashmap = nullptr;
// Helper methods for operator++ and operator--
void increment();
void decrement();
};
Consult Chapter 15 for details on operator overloading if the definitions and implementations (shown in the next section) of the overloaded operators confuse you.
The const_hash_map_iterator
constructor initializes the three member variables:
template<typename HashMap>
const_hash_map_iterator<HashMap>::const_hash_map_iterator(size_t bucket,
list_iterator_type listIt, const HashMap* hashmap)
: mBucketIndex(bucket), mListIterator(listIt), mHashmap(hashmap)
{
}
The default constructor is defaulted so that clients can declare const_hash_map_iterator
variables without initializing them. An iterator constructed with the default constructor does not need to refer to any value, and attempting any operations on it is allowed to have undefined results.
The implementations of the dereferencing operators are concise, but can be tricky. Chapter 15 explains that operator*
and operator->
are asymmetric; operator*
returns a reference to the actual underlying value, which in this case is the element to which the iterator refers, while operator->
must return something to which the arrow operator can be applied again. Thus, it returns a pointer to the element. The compiler then applies ->
to the pointer, which results in accessing a field of the element.
// Return a reference to the actual element.
template<typename HashMap>
const typename const_hash_map_iterator<HashMap>::value_type&
const_hash_map_iterator<HashMap>::operator*() const
{
return *mListIterator;
}
// Return a pointer to the actual element, so the compiler can
// apply -> to it to access the actual desired field.
template<typename HashMap>
const typename const_hash_map_iterator<HashMap>::value_type*
const_hash_map_iterator<HashMap>::operator->() const
{
return &(*mListIterator);
}
The increment and decrement operators are implemented as follows. They defer the actual incrementing and decrementing to the private increment()
and decrement()
helper methods.
// Defer the details to the increment() helper.
template<typename HashMap>
const_hash_map_iterator<HashMap>&
const_hash_map_iterator<HashMap>::operator++()
{
increment();
return *this;
}
// Defer the details to the increment() helper.
template<typename HashMap>
const_hash_map_iterator<HashMap>
const_hash_map_iterator<HashMap>::operator++(int)
{
auto oldIt = *this;
increment();
return oldIt;
}
// Defer the details to the decrement() helper.
template<typename HashMap>
const_hash_map_iterator<HashMap>&
const_hash_map_iterator<HashMap>::operator--()
{
decrement();
return *this;
}
// Defer the details to the decrement() helper.
template<typename HashMap>
const_hash_map_iterator<HashMap>
const_hash_map_iterator<HashMap>::operator--(int)
{
auto oldIt = *this;
decrement();
return oldIt;
}
Incrementing a const_hash_map_iterator
tells it to refer to the “next” element in the container. This method first increments the list
iterator, then checks if it has reached the end of its bucket. If so, it finds the next non-empty bucket in the hash_map
and sets the list
iterator equal to the start element in that bucket. Note that it can’t simply move to the next bucket, because there might not be any elements in it. If there are no more non-empty buckets, mListIterator
is set, by the convention chosen for this example, to the end iterator of the last bucket in the hash_map
, which is the special “end” position of the const_hash_map_iterator
. Iterators are not required to be any safer than dumb pointers, so error-checking for things like incrementing an iterator already at the end is not required.
// Behavior is undefined if mListIterator already refers to the past-the-end
// element, or is otherwise invalid.
template<typename HashMap>
void const_hash_map_iterator<HashMap>::increment()
{
// mListIterator is an iterator into a single bucket. Increment it.
++mListIterator;
// If we're at the end of the current bucket,
// find the next bucket with elements.
auto& buckets = mHashmap->mBuckets;
if (mListIterator == end(buckets[mBucketIndex])) {
for (size_t i = mBucketIndex + 1; i < buckets.size(); i++) {
if (!buckets[i].empty()) {
// We found a non-empty bucket.
// Make mListIterator refer to the first element in it.
mListIterator = begin(buckets[i]);
mBucketIndex = i;
return;
}
}
// No more non-empty buckets. Set mListIterator to refer to the
// end iterator of the last list.
mBucketIndex = buckets.size() - 1;
mListIterator = end(buckets[mBucketIndex]);
}
}
Decrement is the inverse of increment: it makes the iterator refer to the “previous” element in the container. However, there is an asymmetry because of the asymmetry between the way the start and end positions are represented: start is the first element, but end is “one past” the last element. The algorithm for decrement first checks if the underlying list
iterator is at the start of its current bucket. If not, it can just be decremented. Otherwise, the code needs to check for the first non-empty bucket before the current one. If one is found, the list
iterator must be set to refer to the last element in that bucket, which is the end iterator decremented by one. If no non-empty buckets are found, the decrement is invalid, so the code can do anything it wants (behavior is undefined). Note that the for
loop needs to use a signed integer type for its loop variable and not an unsigned type such as size_t
because the loop uses --i
:
// Behavior is undefined if mListIterator already refers to the first
// element, or is otherwise invalid.
template<typename HashMap>
void const_hash_map_iterator<HashMap>::decrement()
{
// mListIterator is an iterator into a single bucket.
// If it's at the beginning of the current bucket, don't decrement it.
// Instead, try to find a non-empty bucket before the current one.
auto& buckets = mHashmap->mBuckets;
if (mListIterator == begin(buckets[mBucketIndex])) {
for (int i = mBucketIndex - 1; i >= 0; --i) {
if (!buckets[i].empty()) {
mListIterator = --end(buckets[i]);
mBucketIndex = i;
return;
}
}
// No more non-empty buckets. This is an invalid decrement.
// Set mListIterator to refer to the end iterator of the last list.
mBucketIndex = buckets.size() - 1;
mListIterator = end(buckets[mBucketIndex]);
} else {
// We're not at the beginning of the bucket, so just move down.
--mListIterator;
}
}
Note that both increment()
and decrement()
access private
members of the hash_map
class. Thus, the hash_map
class must declare const_hash_map_iterator
to be a friend
class.
After increment()
and decrement()
, operator==
and operator!=
are positively simple. They just compare each of the three data members of the objects:
template<typename HashMap>
bool const_hash_map_iterator<HashMap>::operator==(
const const_hash_map_iterator<HashMap>& rhs) const
{
// All fields, including the hash_map to which the iterators refer,
// must be equal.
return (mHashmap == rhs.mHashmap &&
mBucketIndex == rhs.mBucketIndex &&
mListIterator == rhs.mListIterator);
}
template<typename HashMap>
bool const_hash_map_iterator<HashMap>::operator!=(
const const_hash_map_iterator<HashMap>& rhs) const
{
return !(*this == rhs);
}
The hash_map_iterator
class derives from const_hash_map_iterator
. It does not need to override operator==
, operator!=
, increment()
, and decrement()
because the base class versions are just fine:
template <typename HashMap>
class hash_map_iterator : public const_hash_map_iterator<HashMap>
{
public:
using value_type =
typename const_hash_map_iterator<HashMap>::value_type;
using difference_type = ptrdiff_t;
using iterator_category = std::bidirectional_iterator_tag;
using pointer = value_type*;
using reference = value_type&;
using list_iterator_type = typename HashMap::ListType::iterator;
hash_map_iterator() = default;
hash_map_iterator(size_t bucket, list_iterator_type listIt,
HashMap* hashmap);
value_type& operator*();
value_type* operator->();
hash_map_iterator<HashMap>& operator++();
hash_map_iterator<HashMap> operator++(int);
hash_map_iterator<HashMap>& operator--();
hash_map_iterator<HashMap> operator--(int);
};
The implementations of the hash_map_iterator
methods are rather straightforward. The constructor just calls the base class constructor. The operator*
and operator->
use const_cast
to return a non-const
type. operator++
and operator--
just use the increment()
and decrement()
from the base class, but return a hash_map_iterator
instead of a const_hash_map_iterator
. Note that the C++ name lookup rules require you to explicitly use the this
pointer to refer to data members and methods in a base class template:
template<typename HashMap>
hash_map_iterator<HashMap>::hash_map_iterator(size_t bucket,
list_iterator_type listIt, HashMap* hashmap)
: const_hash_map_iterator<HashMap>(bucket, listIt, hashmap)
{
}
// Return a reference to the actual element.
template<typename HashMap>
typename hash_map_iterator<HashMap>::value_type&
hash_map_iterator<HashMap>::operator*()
{
return const_cast<value_type&>(*this->mListIterator);
}
// Return a pointer to the actual element, so the compiler can
// apply -> to it to access the actual desired field.
template<typename HashMap>
typename hash_map_iterator<HashMap>::value_type*
hash_map_iterator<HashMap>::operator->()
{
return const_cast<value_type*>(&(*this->mListIterator));
}
// Defer the details to the increment() helper in the base class.
template<typename HashMap>
hash_map_iterator<HashMap>& hash_map_iterator<HashMap>::operator++()
{
this->increment();
return *this;
}
// Defer the details to the increment() helper in the base class.
template<typename HashMap>
hash_map_iterator<HashMap> hash_map_iterator<HashMap>::operator++(int)
{
auto oldIt = *this;
this->increment();
return oldIt;
}
// Defer the details to the decrement() helper in the base class.
template<typename HashMap>
hash_map_iterator<HashMap>& hash_map_iterator<HashMap>::operator--()
{
this->decrement();
return *this;
}
// Defer the details to the decrement() helper in the base class.
template<typename HashMap>
hash_map_iterator<HashMap> hash_map_iterator<HashMap>::operator--(int)
{
auto oldIt = *this;
this->decrement();
return oldIt;
}
The final piece involved in providing iterator support for hash_map
is to supply the necessary type aliases in the hash_map
class template, and to write the begin()
, end()
, cbegin()
, and cend()
methods. The type aliases and method prototypes look like this:
template <typename Key, typename T, typename KeyEqual = std::equal_to<>,
typename Hash = hash<Key>>
class hash_map
{
public:
// Other type aliases omitted for brevity
using iterator = hash_map_iterator<hash_map_type>;
using const_iterator = const_hash_map_iterator<hash_map_type>;
// Iterator methods
iterator begin();
iterator end();
const_iterator begin() const;
const_iterator end() const;
const_iterator cbegin() const;
const_iterator cend() const;
// Remainder of class definition omitted for brevity
};
The implementation of begin()
includes an optimization for the case when there are no elements in the hash_map
, in which case the end iterator is returned. Here is the code:
template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::iterator
hash_map<Key, T, KeyEqual, Hash>::begin()
{
if (mSize == 0) {
// Special case: there are no elements, so return the end iterator.
return end();
}
// We know there is at least one element. Find the first element.
for (size_t i = 0; i < mBuckets.size(); ++i) {
if (!mBuckets[i].empty()) {
return hash_map_iterator<hash_map_type>(i,
std::begin(mBuckets[i]), this);
}
}
// Should never reach here, but if we do, return the end iterator.
return end();
}
end()
creates a hash_map_iterator
referring to the end iterator of the last bucket:
template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::iterator
hash_map<Key, T, KeyEqual, Hash>::end()
{
// The end iterator is the end iterator of the list of the last bucket.
size_t bucket = mBuckets.size() - 1;
return hash_map_iterator<hash_map_type>(bucket,
std::end(mBuckets[bucket]), this);
}
The implementations of the const
versions of begin()
and end()
use const_cast
to call the non-const
versions. These non-const
versions return a hash_map_iterator
, which is convertible to a const_hash_map_iterator
:
template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::const_iterator
hash_map<Key, T, KeyEqual, Hash>::begin() const
{
return const_cast<hash_map_type*>(this)->begin();
}
template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::const_iterator
hash_map<Key, T, KeyEqual, Hash>::end() const
{
return const_cast<hash_map_type*>(this)->end();
}
The cbegin()
and cend()
methods forward the request to the const
versions of begin()
and end()
:
template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::const_iterator
hash_map<Key, T, KeyEqual, Hash>::cbegin() const
{
return begin();
}
template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::const_iterator
hash_map<Key, T, KeyEqual, Hash>::cend() const
{
return end();
}
Now that hash_map
supports iteration, you can iterate over its elements just as you would on any Standard Library container, and you can pass the iterators to methods and functions. Here are some examples:
hash_map<string, int> myHash;
myHash.insert(make_pair("KeyOne", 100));
myHash.insert(make_pair("KeyTwo", 200));
myHash.insert(make_pair("KeyThree", 300));
for (auto it = myHash.cbegin(); it != myHash.cend(); ++it) {
// Use both -> and * to test the operations.
cout << it->first << " maps to " << (*it).second << endl;
}
// Print elements using a range-based for loop
for (auto& p : myHash) {
cout << p.first << " maps to " << p.second << endl;
}
// Print elements using a range-based for loop and C++17 structured bindings
for (auto&[key, value] : myHash) {
cout << key << " maps to " << value << endl;
}
// Create an std::map with all the elements in the hash_map.
map<string, int> myMap(cbegin(myHash), cend(myHash));
for (auto& p : myMap) {
cout << p.first << " maps to " << p.second << endl;
}
The last piece of code also shows that the non-member functions such as std::cbegin()
and std::cend()
are working as expected.
As described earlier in this chapter, all the Standard Library containers allow you to specify a custom memory allocator. A “good citizen” hash_map
implementation should do the same. However, those details are omitted because they obscure the main points of this implementation, and because custom allocators are rarely used.
If your container supplies a bidirectional or random access iterator, it is considered reversible. Reversible containers are supposed to supply two additional type aliases.
TYPE NAME | DESCRIPTION |
reverse_iterator |
The type for iterating over elements of the container in reverse order |
const_reverse_iterator |
A version of reverse_iterator for iterating over const elements of the container in reverse order |
Additionally, the container should provide rbegin()
and rend()
, which are symmetric with begin()
and end()
; it should also provide crbegin()
and crend()
, which are symmetric with cbegin()
and cend()
. The usual implementations just use the std::reverse_iterator
adaptor described earlier in this chapter. These are left as an exercise for you to try.
In addition to the basic container requirements that were shown already, you can also make your container adhere to additional requirements for ordered associative, unordered associative, or sequential containers. This section modifies the hash_map
class template to satisfy a few additional unordered associative container requirements.
Unordered associative containers require the following type aliases.
TYPE NAME | DESCRIPTION |
key_type |
The key type with which the container is instantiated |
mapped_type |
The element type with which the container is instantiated |
value_type |
pair<const Key, T> |
hasher |
The hash type with which the container is instantiated |
key_equal |
The equality predicate with which the container is instantiated |
local_iterator |
An iterator type to iterate through a single bucket. Cannot be used to iterate across buckets. |
const_local_iterator |
A const iterator type to iterate through a single bucket. Cannot be used to iterate across buckets. |
node_type |
A type to represent a node. See Chapter 17 for a discussion on nodes. Not further discussed in this section. |
Here is the hash_map
definition with an updated set of type aliases. Note that the definition of ListType
has been moved, because the definitions of the local iterators require ListType
.
template <typename Key, typename T, typename KeyEqual = std::equal_to<>,
typename Hash = hash<Key>>
class hash_map
{
public:
using key_type = Key;
using mapped_type = T;
using value_type = std::pair<const Key, T>;
using hasher = Hash;
using key_equal = KeyEqual;
using reference = value_type&;
using const_reference = const value_type&;
using size_type = size_t;
using difference_type = ptrdiff_t;
using hash_map_type = hash_map<Key, T, KeyEqual, Hash>;
using iterator = hash_map_iterator<hash_map_type>;
using const_iterator = const_hash_map_iterator<hash_map_type>;
private:
using ListType = std::list<value_type>;
public:
using local_iterator = typename ListType::iterator;
using const_local_iterator = typename ListType::const_iterator;
// Remainder of hash_map class definition omitted for brevity
};
The standard prescribes quite a few additional method requirements for unordered associative containers, as listed in the following table. In the last column, n is the number of elements in the container.
METHOD | DESCRIPTION | COMPLEXITY |
Constructor taking an iterator range |
Constructs the container and inserts the elements from the iterator range. The iterator range need not refer to another container of the same type. Note that all constructors of unordered associative containers must take an equality predicate. The constructors should provide a default constructed object as the default value. |
On average O(n), worst case O(n2) |
Constructor taking an initializer_list<value_type> as parameter |
Constructs the container and inserts the elements from the initializer list into the container. | On average O(n), worst case O(n2) |
Assignment operator with an initializer_list<value_type> as right-hand side |
Replaces all elements from the container with the elements from the initializer list. | On average O(n), worst case O(n2) |
hasher hash_function() const; |
Returns the hash function. | Constant |
key_equal key_eq() const; |
Returns the equality predicate for comparing keys. | Constant |
pair<iterator, bool> insert(value_type&); iterator insert( const_iterator hint, value_type&); |
Two different forms of insert .The hint can be ignored by the implementation.Containers that allow duplicate keys return just iterator for the first form, because insert() always succeeds in that case.
|
On average O(1), worst case O(n) |
void insert( InputIterator start, InputIterator end); |
Inserts a range of elements. The range need not be from a container of the same type. | On average O(m) with m the number of elements to insert. Worst case O(m*n+m) |
void insert( initializer_list< value_type>); |
Inserts the elements from the initializer list into the container. | On average O(m) with m the number of elements to insert. Worst case O(m*n+m) |
pair<iterator, bool> emplace(Args&&…); iterator emplace_hint( const_iterator hint, Args&&...); |
Implements the emplace operations to construct objects in-place. In-place construction is discussed in Chapter 17. | On average O(1), worst case O(n) |
size_type erase(key_type&); iterator erase( iterator position); iterator erase( iterator start, iterator end); |
Three different forms of erase. The first form returns the number of values erased (0 or 1 in containers that do not allow duplicate keys). The second and third forms erase the elements at position , or in the range start to end , and return an iterator to the element following the last erased element.
|
Worst case O(n) |
void clear(); |
Erases all elements. | O(n) |
Iterator find(key_type&); const_iterator find(key_type&) const; |
Finds the element with the specified key. | On average O(1), worst case O(n) |
size_type count(key_type&) const; |
Returns the number of elements with the specified key (0 or 1 in containers that do not allow duplicate keys). | On average O(1), worst case O(n) |
pair<iterator,iterator> equal_range( key_type&); pair<const_iterator, const_iterator> equal_range( key_type&) const; |
Returns iterators referring to the first element with the specified key and one past the last element with the specified key. | Worst case O(n) |
Note that hash_map
does not allow duplicate keys, so equal_range()
always returns a pair of identical iterators.
C++17 adds extract()
and merge()
methods to the list of requirements. These have to do with handling nodes as discussed in Chapter 17, but are omitted for this hash_map
implementation.
Here is the complete hash_map
class definition. The prototypes for insert()
, erase()
, and find()
need to change slightly from the previous versions shown because those initial versions don’t have the right return types required for unordered associative containers.
template <typename Key, typename T, typename KeyEqual = std::equal_to<>,
typename Hash = hash<Key>>
class hash_map
{
public:
using key_type = Key;
using mapped_type = T;
using value_type = std::pair<const Key, T>;
using hasher = Hash;
using key_equal = KeyEqual;
using reference = value_type&;
using const_reference = const value_type&;
using size_type = size_t;
using difference_type = ptrdiff_t;
using hash_map_type = hash_map<Key, T, KeyEqual, Hash>;
using iterator = hash_map_iterator<hash_map_type>;
using const_iterator = const_hash_map_iterator<hash_map_type>;
private:
using ListType = std::list<value_type>;
public:
using local_iterator = typename ListType::iterator;
using const_local_iterator = typename ListType::const_iterator;
// The iterator classes need access to all members of the hash_map
friend class hash_map_iterator<hash_map_type>;
friend class const_hash_map_iterator<hash_map_type>;
virtual ~hash_map() = default; // Virtual destructor
// Throws invalid_argument if the number of buckets is illegal.
explicit hash_map(const KeyEqual& equal = KeyEqual(),
size_type numBuckets = 101, const Hash& hash = Hash());
// Throws invalid_argument if the number of buckets is illegal.
template <typename InputIterator>
hash_map(InputIterator first, InputIterator last,
const KeyEqual& equal = KeyEqual(),
size_type numBuckets = 101, const Hash& hash = Hash());
// Initializer list constructor
// Throws invalid_argument if the number of buckets is illegal.
explicit hash_map(std::initializer_list<value_type> il,
const KeyEqual& equal = KeyEqual(), size_type numBuckets = 101,
const Hash& hash = Hash());
// Copy constructor
hash_map(const hash_map_type& src) = default;
// Move constructor
hash_map(hash_map_type&& src) noexcept = default;
// Copy assignment operator
hash_map_type& operator=(const hash_map_type& rhs);
// Move assignment operator
hash_map_type& operator=(hash_map_type&& rhs) noexcept;
// Initializer list assignment operator
hash_map_type& operator=(std::initializer_list<value_type> il);
// Iterator methods
iterator begin();
iterator end();
const_iterator begin() const;
const_iterator end() const;
const_iterator cbegin() const;
const_iterator cend() const;
// Size methods
bool empty() const;
size_type size() const;
size_type max_size() const;
// Element insert methods
T& operator[](const key_type& k);
std::pair<iterator, bool> insert(const value_type& x);
iterator insert(const_iterator hint, const value_type& x);
template <typename InputIterator>
void insert(InputIterator first, InputIterator last);
void insert(std::initializer_list<value_type> il);
// Element delete methods
size_type erase(const key_type& k);
iterator erase(iterator position);
iterator erase(iterator first, iterator last);
// Other modifying utilities
void swap(hash_map_type& other) noexcept;
void clear() noexcept;
// Access methods for Standard Library conformity
key_equal key_eq() const;
hasher hash_function() const;
// Lookup methods
iterator find(const key_type& k);
const_iterator find(const key_type& k) const;
std::pair<iterator, iterator> equal_range(const key_type& k);
std::pair<const_iterator, const_iterator>
equal_range(const key_type& k) const;
size_type count(const key_type& k) const;
private:
// Returns a pair containing an iterator to the found element with
// a given key, and the index of that element's bucket.
std::pair<typename ListType::iterator, size_t> findElement(
const key_type& k);
std::vector<ListType> mBuckets;
size_type mSize = 0;
KeyEqual mEqual;
Hash mHash;
};
The constructor accepting an iterator range is a method template so that it can take an iterator range from any container, not just other Initializer lists are discussed in Chapter 1. Following is the implementation of the
With this initializer list constructor, a Assignment operators can also accept an initializer list on the right-hand side. Following is an implementation of an initializer list assignment operator for
With this assignment operator, you can write code as follows: In the earlier section, “Using the Basic Hash Map,” a simple Note that technically, the following versions of
The The first two
The third form of
The last insert operation accepts an initializer list. The implementation for
With this Emplace operations construct objects in-place. They are discussed in Chapter 17. The emplace methods for
The The version of The first version is implemented as follows:
The second version of
The final version of The standard requires methods called
The
The
The implementations of both versions of
Because
The final method, Unordered associative containers should also provide a number of bucket-related methods: Here are the implementations for the
The implementations for the other Finally, an unordered associative container should also provide hash_map Iterator Range Constructor
hash_map
s. If it were not a method template, it would need to specify the InputIterator
type explicitly as hash_map_iterator
, limiting it to iterators from hash_map
s. Despite the syntax, the implementation is uncomplicated: it delegates the construction to the explicit constructor to initialize all the data members, then calls insert()
to actually insert all the elements in the specified range.// Make a call to insert() to actually insert the elements.
template <typename Key, typename T, typename KeyEqual, typename Hash>
template <typename InputIterator>
hash_map<Key, T, KeyEqual, Hash>::hash_map(
InputIterator first, InputIterator last, const KeyEqual& equal,
size_type numBuckets, const Hash& hash)
: hash_map(equal, numBuckets, hash)
{
insert(first, last);
}
hash_map Initializer List Constructor
hash_map
constructor that takes an initializer list, which is very similar to the implementation of the constructor accepting an iterator range:template <typename Key, typename T, typename KeyEqual, typename Hash>
hash_map<Key, T, KeyEqual, Hash>::hash_map(
std::initializer_list<value_type> il,
const KeyEqual& equal, size_type numBuckets, const Hash& hash)
: hash_map(equal, numBuckets, hash)
{
insert(std::begin(il), std::end(il));
}
hash_map
can be constructed as follows:hash_map<string, int> myHash {
{ "KeyOne", 100 },
{ "KeyTwo", 200 },
{ "KeyThree", 300 } };
hash_map Initializer List Assignment Operator
hash_map
. It uses a copy-and-swap-like algorithm to guarantee strong exception safety.template <typename Key, typename T, typename KeyEqual, typename Hash>
hash_map<Key, T, KeyEqual, Hash>& hash_map<Key, T, KeyEqual, Hash>::operator=(
std::initializer_list<value_type> il)
{
// Do all the work in a temporary instance
hash_map_type newHashMap(il, mEqual, mBuckets.size(), mHash);
swap(newHashMap); // Commit the work with only non-throwing operations
return *this;
}
myHash = {
{ "KeyOne", 100 },
{ "KeyTwo", 200 },
{ "KeyThree", 300 } };
hash_map Insertion Operations
insert()
method is given. In this version, four insert()
methods are provided with additional features:
insert()
operation returns a pair<iterator, bool>
, which indicates both where the item is inserted and whether or not it was newly created.insert()
that takes a hint is useless for a hash_map
, but it is provided for symmetry with other kinds of collections. The hint is ignored, and it merely calls the first version.insert()
is a method template, so ranges of elements from arbitrary containers can be inserted into the hash_map
.insert()
accepts an initializer_list<value_type>
.insert()
can also be provided. These accept rvalue references.std::pair<iterator, bool> insert(value_type&& x);
iterator insert(const_iterator hint, value_type&& x);
hash_map
does not provide these. Additionally, there are two versions of insert()
related to handling nodes. Nodes are discussed in Chapter 17. The hash_map
omits these as well.insert()
methods are implemented as follows:template <typename Key, typename T, typename KeyEqual, typename Hash>
std::pair<typename hash_map<Key, T, KeyEqual, Hash>::iterator, bool>
hash_map<Key, T, KeyEqual, Hash>::insert(const value_type& x)
{
// Try to find the element.
auto[it, bucket] = findElement(x.first);
bool inserted = false;
if (it == std::end(mBuckets[bucket])) {
// We didn't find the element, so insert a new one.
it = mBuckets[bucket].insert(it, x);
inserted = true;
mSize++;
}
return std::make_pair(
hash_map_iterator<hash_map_type>(bucket, it, this), inserted);
}
template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::iterator
hash_map<Key, T, KeyEqual, Hash>::insert(
const_iterator /*hint*/, const value_type& x)
{
// Completely ignore position.
return insert(x).first;
}
insert()
is a method template for the same reason as the constructor shown earlier: it should be able to insert elements by using iterators from containers of any type. The actual implementation uses an insert_iterator
, described earlier in this chapter.template <typename Key, typename T, typename KeyEqual, typename Hash>
template <typename InputIterator>
void hash_map<Key, T, KeyEqual, Hash>::insert(
InputIterator first, InputIterator last)
{
// Copy each element in the range by using an insert_iterator adaptor.
// Give begin() as a dummy position -- insert ignores it anyway.
std::insert_iterator<hash_map_type> inserter(*this, begin());
std::copy(first, last, inserter);
}
hash_map
simply forwards the work to the insert()
method accepting an iterator range:template <typename Key, typename T, typename KeyEqual, typename Hash>
void hash_map<Key, T, KeyEqual, Hash>::insert(
std::initializer_list<value_type> il)
{
insert(std::begin(il), std::end(il));
}
insert()
method, you can write code as follows:myHash.insert({
{ "KeyFour", 400 },
{ "KeyFive", 500 } });
hash_map Emplace Operations
hash_map
look as follows :template <typename... Args>
std::pair<iterator, bool> emplace(Args&&... args);
template <typename... Args>
iterator emplace_hint(const_iterator hint, Args&&... args);
...
in these lines are not typos. These are so-called variadic templates, that is, templates with a variable number of template type parameters and a variable number of function parameters. Variadic templates are discussed in Chapter 22. This hash_map
implementation omits emplace operations.hash_map Erase Operations
erase()
in the earlier section, “A Basic Hash Map,” is not compliant with Standard Library requirements. You need to implement the following versions:
key_type
and returns a size_type
for the number of elements removed from the collection (for hash_map
there are only two possible return values, 0 and 1).template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::size_type
hash_map<Key, T, KeyEqual, Hash>::erase(const key_type& k)
{
// First, try to find the element.
auto[it, bucket] = findElement(k);
if (it != std::end(mBuckets[bucket])) {
// The element exists -- erase it.
mBuckets[bucket].erase(it);
mSize--;
return 1;
} else {
return 0;
}
}
erase()
must remove the element at a specific iterator position. The iterator given is, of course, a hash_map_iterator
. Thus, hash_map
must have some ability to obtain the underlying bucket and list
iterator from the hash_map_iterator
. The approach taken here is to make the hash_map
class a friend of the hash_map_iterator
(not shown in the earlier class definition). Here is the implementation of this version of erase()
:template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::iterator
hash_map<Key, T, KeyEqual, Hash>::erase(iterator position)
{
iterator next = position;
++next;
// Erase the element from its bucket.
mBuckets[position.mBucketIndex].erase(position.mListIterator);
mSize--;
return next;
}
erase()
removes a range of elements. It iterates from first
to last
, calling erase()
on each element, thus letting the previous version of erase()
do all the work:template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::iterator
hash_map<Key, T, KeyEqual, Hash>::erase(iterator first, iterator last)
{
// Erase all the elements in the range.
for (iterator next = first; next != last;) {
next = erase(next);
}
return last;
}
hash_map Accessor Operations
key_eq()
and hash_function()
to retrieve the equality predicate and the hash function respectively:template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::key_equal
hash_map<Key, T, KeyEqual, Hash>::key_eq() const
{
return mEqual;
}
template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::hasher
hash_map<Key, T, KeyEqual, Hash>::hash_function() const
{
return mHash;
}
find()
method is identical to the version shown earlier for the basic hash_map
, except for the return code. Instead of returning a pointer to the element, it constructs a hash_map_iterator
referring to it:template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::iterator
hash_map<Key, T, KeyEqual, Hash>::find(const key_type& k)
{
// Use the findElement() helper, and C++17 structured bindings.
auto[it, bucket] = findElement(k);
if (it == std::end(mBuckets[bucket])) {
// Element not found -- return the end iterator.
return end();
}
// Element found -- convert the bucket/iterator to a hash_map_iterator.
return hash_map_iterator<hash_map_type>(bucket, it, this);
}
const
version of find()
returns a const_hash_map_iterator
. It uses const_cast
to call the non-const
version of find()
to avoid code duplication. Note that the non-const
find()
returns a hash_map_iterator
, which is convertible to a const_hash_map_iterator
.template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::const_iterator
hash_map<Key, T, KeyEqual, Hash>::find(const key_type& k) const
{
return const_cast<hash_map_type*>(this)->find(k);
}
equal_range()
are identical, except that one returns a pair of hash_map_iterator
s while the other returns a pair of const_hash_map_iterator
s. They both simply forward the request to find()
. A hash_map
cannot have elements with duplicate keys, so the result of equal_range()
for hash_map
is always a pair of identical iterators.template <typename Key, typename T, typename KeyEqual, typename Hash>
std::pair<
typename hash_map<Key, T, KeyEqual, Hash>::iterator,
typename hash_map<Key, T, KeyEqual, Hash>::iterator>
hash_map<Key, T, KeyEqual, Hash>::equal_range(const key_type& k)
{
auto it = find(k);
return std::make_pair(it, it);
}
hash_map
does not allow duplicate keys, count()
can only return 1 or 0: 1 if it finds the element, and 0 if it doesn’t. The implementation just wraps a call to find()
. The find()
method returns the end iterator if it can’t find the element. count()
retrieves an end iterator by calling end()
in order to compare it.template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::size_type
hash_map<Key, T, KeyEqual, Hash>::count(const key_type& k) const
{
// There are either 1 or 0 elements matching key k.
// If we can find a match, return 1, otherwise return 0.
if (find(k) == end()) {
return 0;
} else {
return 1;
}
}
operator[]
, is not required by the standard, but is provided for convenience of the programmer, and to be symmetric with std::map
. The prototype is identical to the one for std::map
. The comments explain the potentially confusing one-line implementation.template <typename Key, typename T, typename KeyEqual, typename Hash>
T& hash_map<Key, T, KeyEqual, Hash>::operator[] (const key_type& k)
{
// It's a bit cryptic, but it basically attempts to insert
// a new key/value pair of k and a zero-initialized value. Regardless
// of whether the insert succeeds or fails, insert() returns a pair of
// an iterator/bool. The iterator refers to a key/value pair, the
// second element of which is the value we want to return.
return ((insert(std::make_pair(k, T()))).first)->second;
}
hash_map Bucket Operations
bucket_count()
returns the number of buckets in the container.max_bucket_count()
returns the maximum number of buckets that are supported.bucket(key)
returns the index of the bucket to which the given key maps.bucket_size(n)
returns the number of elements in the bucket with given index.begin(n)
, end(n)
, cbegin(n)
, cend(n)
return local begin and end iterators to the bucket with given index.hash_map
:template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::size_type
hash_map<Key, T, KeyEqual, Hash>::bucket_count() const
{
return mBuckets.size();
}
template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::size_type
hash_map<Key, T, KeyEqual, Hash>::max_bucket_count() const
{
return mBuckets.max_size();
}
template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::size_type
hash_map<Key, T, KeyEqual, Hash>::bucket(const Key& k) const
{
return const_cast<hash_map_type*>(this)->findElement(k).second;
}
template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::size_type
hash_map<Key, T, KeyEqual, Hash>::bucket_size(size_type n) const
{
return mBuckets[n].size();
}
template <typename Key, typename T, typename KeyEqual, typename Hash>
typename hash_map<Key, T, KeyEqual, Hash>::local_iterator
hash_map<Key, T, KeyEqual, Hash>::begin(size_type n)
{
return mBuckets[n].begin();
}
begin(n)
, end(n)
, cbegin(n)
, and cend(n)
methods are similar; they simply forward the call to the correct bucket list
based on the given index.load_factor()
, max_load_ factor()
, rehash()
, and reserve()
methods. These are omitted for hash_map
.
The hash_map
developed in the preceding sections is an unordered associative container. However, you could also write a sequential container, or an ordered associative container, in which case you would need to follow a different set of requirements. Instead of listing them here, it’s easier to point out that the deque
container follows the prescribed sequential container requirements almost exactly. The only difference is that it provides an extra resize()
method (not required by the standard). An example of an ordered associative container is the map
, on which you can model your own ordered associative containers.
The final example in this chapter showed almost the complete development of a hash_map
unordered associative container and its iterators. This hash_map
implementation is given here only to teach you how to write your own Standard Library-compliant containers and iterators. C++ does include its own set of unordered associative containers. You should use those instead of your own implementation.
In the process of reading this chapter, you hopefully gained an appreciation for the steps involved in developing containers. Even if you never write another Standard Library algorithm or container, you understand better the Standard Library’s mentality and capabilities, and you can put it to better use.
This chapter concludes the tour of the Standard Library. Even with all the details given in this book, some features are still omitted. If this material excited you, and you would like more information, consult some of the resources in Appendix B. Don’t feel compelled to use all the features discussed here. Forcing them into your programs without a true need will just complicate your code. However, I encourage you to consider incorporating aspects of the Standard Library into your programs where they make sense. Start with the containers, maybe throw in an algorithm or two, and before you know it, you’ll be a convert!
const
Key
in the pair<const
Key,
T>
elements stored in the list
, you cannot use a vector
instead of a list
in this case.