How it works...

In this recipe, we will learn how to monitor the amount of memory an application consumes, as well as the different ways that C++ can allocate memory behind the scenes. To start, let's look at a simple application that does nothing:

int main(void)
{
}

To see how much memory this application uses, we will run it under Valgrind, a dynamic analysis tool, as follows:

As shown in the preceding example, our application has allocated heap memory even though main() is empty (heap memory is memory obtained using new/delete or malloc()/free()). To determine where this allocation occurred, let's use Valgrind again, but this time, we will enable a tool called Massif (selected with --tool=massif), which records where each heap allocation came from:

To see the output of the preceding example, we must print the file that Massif created for us automatically:

> cat massif.out.*

This produces the following output:

As we can see, the allocation, which is 72,704 bytes in size, is made from the dynamic linker's init() function, that is, while the loaded libraries are being initialized and before main() runs. To further demonstrate how to use Valgrind, let's take a look at this simple example, where we perform our own allocation:

int main(void)
{
    auto ptr = new int;
    delete ptr;
}

To see the memory allocation of the preceding source, we need to run Valgrind again:

As we can see, we have allocated 72,708 bytes. Since we know that the application will allocate 72,704 bytes for us automatically, we can see that Valgrind has successfully detected the 4 bytes we allocated (the size of an integer on Intel 64-bit systems running Linux). To see where this allocation occurred, let's use Massif again:

As we can see, we've added --threshold=0.1 to the command-line options. This tells Massif to report individually any allocation that makes up at least 0.1% of the heap memory in use, instead of the default threshold of 1%; entries below the threshold are aggregated. Let's cat the results (the cat program simply echoes the contents of a file to the console):

> cat massif.out.*

By doing this, we get the following output:

As we can see, Valgrind has detected the memory allocations from the init() function, as well as from our main() function.

Now that we know how to analyze the memory allocations our application makes, let's look at some different C++ APIs to see what types of memory allocations they make behind the scenes. To start, let's look at an std::vector, as follows:

#include <vector>
std::vector<int> data;

int main(void)
{
    for (auto i = 0; i < 10000; i++) {
        data.push_back(i);
    }
}

Here, we've created a global vector of integers and then added 10,000 integers to the vector. Using Valgrind, we get the following output:

Here, we can see 16 allocations, with a total of 203,772 bytes. We know that the application will allocate 72,704 bytes for us, so we must remove this from our total, leaving us with 131,068 bytes of memory. We also know that we allocated 10,000 integers, which is 40,000 bytes in total. So, the question is, where did the other 91,068 bytes come from?

The answer is in how std::vector works under the hood. std::vector must store its elements in one contiguous block of memory at all times, which means that when an insertion occurs and the std::vector is out of space, it must allocate a new, larger buffer and then copy (or move) the contents of the old buffer into the new buffer. The problem is that std::vector doesn't know what the total size of the buffer will be when all of the insertions are complete, so when the first insertion is performed, it creates a small buffer to ensure memory is not wasted and then allocates a progressively larger buffer each time it runs out of space, resulting in several memory allocations and memory copies. Assuming the library doubles the capacity on each reallocation, as libstdc++ does, the 15 buffers (from 4 bytes up to 65,536 bytes) add up to exactly the 131,068 bytes reported, even though only the last buffer is still in use at the end.

To prevent these repeated allocations, std::vector provides the reserve() function, which lets the user state up front how much capacity they expect to need. For example, consider the following code:

#include <vector>
std::vector<int> data;

int main(void)
{
    data.reserve(10000);  // <--- added optimization 

    for (auto i = 0; i < 10000; i++) {
        data.push_back(i);
    }
}

The code in the preceding example is the same as it is in the previous example, with the difference being that we added a call to the reserve() function, which tells the std::vector how large we think the vector will be. Valgrind's output is as follows:

As we can see, the application allocated 112,704 bytes. If we remove our 72,704 bytes that the application creates by default, we are left with 40,000 bytes, which is the exact size we expected (since we are adding 10,000 integers to the vector, with each integer being 4 bytes in size).

Data structures are not the only type of C++ Standard Library API that performs hidden allocations. Let's look at an std::any, as follows:

#include <any>
#include <string>

std::any data;

int main(void)
{
    data = 42;
    data = std::string{"The answer is: 42"};
}

In this example, we created an std::any and assigned it an integer and an std::string. Let's look at the output of Valgrind:

As we can see, 3 allocations occurred. The first allocation occurs by default, while the second allocation is produced by the std::string. The last allocation is produced by the std::any. This occurs because std::any can hold a value of any type, so its internal storage cannot be sized in advance for every type it might see. A small value such as an int typically fits in a small built-in buffer, but a larger object such as a std::string is placed on the heap. In other words, to handle a generic data type, the library has to perform an allocation. This is made worse if we keep changing the data type. For example, consider the following code:

#include <any>
#include <string>

std::any data;

int main(void)
{
    data = 42;
    data = std::string{"The answer is: 42"};
    data = 42;                                 // <--- keep swapping
    data = std::string{"The answer is: 42"};   // <--- keep swapping
    data = 42;                                 // <--- keep swapping
    data = std::string{"The answer is: 42"};   // ...
    data = 42;
    data = std::string{"The answer is: 42"};
}

The preceding code is identical to the previous example, with the only difference being that we keep swapping between data types. Valgrind produces the following output:

As we can see, 9 allocations occurred instead of 3 (the default allocation, plus two for each of the four std::string assignments). To solve this problem, we need to use an std::variant instead of std::any, as follows:

#include <variant>
#include <string>

std::variant<int, std::string> data;

int main(void)
{
    data = 42;
    data = std::string{"The answer is: 42"};
}

The difference between std::any and std::variant is that std::variant requires the user to state which types the variant must support. Its storage can therefore be sized at compile time to fit the largest of those types, removing the need for dynamic memory allocation on assignment. Valgrind's output is as follows:

Now, we only have 2 allocations, as expected (the default allocation and the allocation from std::string). As shown in this recipe, libraries, including the C++ Standard Library, can hide memory allocations, potentially slowing down your code and using more memory resources than you intended. Tools such as Valgrind can be used to identify these types of problems, allowing you to create more efficient C++ code.
