The first rule of debugging is to be honest with yourself and admit that your code will contain bugs! This realistic assessment enables you to put your best effort into preventing bugs from crawling into your code in the first place, while you simultaneously include the necessary features to make debugging as easy as possible.
A bug in a computer program is incorrect run-time behavior. This undesirable behavior includes both catastrophic and noncatastrophic bugs. Examples of catastrophic bugs are program death, data corruption, operating system failures, or some other horrific outcome. A catastrophic bug can also manifest itself external to the software or computer system running the software; for example, medical software might contain a catastrophic bug causing a massive radiation overdose to a patient. Noncatastrophic bugs are bugs that cause the program to behave incorrectly in more subtle ways; for example, a web browser might return the wrong web page, or a spreadsheet application might calculate the standard deviation of a column incorrectly.
There are also so-called cosmetic bugs, where something is visually not correct, but otherwise works correctly. For example, a button in a user interface is kept enabled when it shouldn’t be, but clicking it does nothing. All computations are perfectly correct, the program does not crash, but it doesn’t look as “nice” as it should.
The underlying cause, or root cause, of the bug is the mistake in the program that causes this incorrect behavior. The process of debugging a program includes both determining the root cause of the bug, and fixing the code so that the bug will not occur again.
It’s impossible to write completely bug-free code, so debugging skills are important. However, a few tips can help you to minimize the number of bugs.
override keyword, and so on. This makes it easier for other people to understand your code.
Your programs should contain features that enable easier debugging when the inevitable bugs arise. This section describes these features and presents sample implementations, where appropriate, that you can incorporate into your own programs.
Imagine this scenario: You have just released a new version of your flagship product, and one of the first users reports that the program “stopped working.” You attempt to pry more information from the user, and eventually discover that the program died in the middle of an operation. The user can’t quite remember what he was doing, or if there were any error messages. How will you debug this problem?
Now imagine the same scenario, but in addition to the limited information from the user, you are also able to examine the error log on the user’s computer. In the log you see a message from your program that says, “Error: unable to open config.xml file.” Looking at the code near the spot where that error message was generated, you find a line in which you read from the file without checking whether the file was opened successfully. You’ve found the root cause of your bug!
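The root cause in this scenario, reading from a file without checking that it was opened, could be avoided along the following lines. This is only a minimal sketch: the names logError() and loadConfig() and the log file name errors.log are assumptions made for illustration, not part of any real product.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: appends one error message to a log file.
// The file name "errors.log" is an assumption for this sketch.
void logError(const std::string& message)
{
    std::ofstream logfile("errors.log", std::ios_base::app);
    if (logfile) {
        logfile << "Error: " << message << std::endl;
    }
}

// Hypothetical loader: logs an error and returns false when the file
// cannot be opened, instead of reading from a stream in a failed state.
bool loadConfig(const std::string& fileName, std::string& firstLine)
{
    std::ifstream config(fileName);
    if (!config) {
        logError("unable to open " + fileName + " file.");
        return false;
    }
    std::getline(config, firstLine);
    return true;
}
```

The caller can then decide how to react to a false return value, while the log keeps a record of what went wrong.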
Error logging is the process of writing error messages to persistent storage so that they will be available following an application, or even machine, death. Despite the example scenario, you might still have doubts about this strategy. Won’t it be obvious by your program’s behavior if it encounters errors? Won’t the user notice if something goes wrong? As the preceding example shows, user reports are not always accurate or complete. In addition, many programs, such as the operating system kernel and long-running daemons like inetd or syslogd on Unix, are not interactive and run unattended on a machine. The only way these programs can communicate with users is through error logging. In many cases, a program might also want to automatically recover from certain errors, and hide those errors from the user. Still, having logs of those errors available can be invaluable to improve the overall stability of the program.
Thus, your program should log errors as it encounters them. That way, if a user reports a bug, you will be able to examine the log files on the machine to see if your program reported any errors prior to encountering the bug. Unfortunately, error logging is platform dependent: C++ does not contain a standard logging mechanism. Examples of platform-specific logging mechanisms include the syslog facility in Unix, and the event reporting API in Windows. You should consult the documentation for your development platform. There are also some open-source implementations of cross-platform logging frameworks. Here are two examples:
http://log4cpp.sourceforge.net/
http://www.boost.org/
Now that you’re convinced that logging is a great feature to add to your programs, you might be tempted to log messages every few lines in your code so that, in the event of any bug, you’ll be able to trace the code path that was executing. These types of log messages are appropriately called traces.
However, you should not write these traces to log files for two reasons. First, writing to persistent storage is slow. Even on systems that write the logs asynchronously, logging that much information will slow down your program. Second, and most important, most of the information that you would put in your traces is not appropriate for the end user to see. It will just confuse the user, leading to unwarranted service calls. That said, tracing is an important debugging technique under the correct circumstances, as described in the next section.
Here are some specific guidelines for the types of errors your programs should log:
It is also useful to log warnings, or recoverable errors, which allows you to investigate if you can possibly avoid them.
Most logging APIs allow you to specify a log level or error level, typically error, warning, and info. You can log non-error conditions under a log level that is less severe than “error.” For example, you might want to log significant state changes in your application, or startup and shutdown of the program. You also might consider giving your users a way to adjust the log level of your program at run time so that they can customize the amount of logging that occurs.
When debugging complicated problems, public error messages generally do not contain enough information. You often need a complete trace of the code path taken, or values of variables before the bug showed up. In addition to basic messages, it’s sometimes helpful to include the following information in debug traces:
You can add this tracing to your program through a special debug mode, or via a ring buffer. Both of these methods are explained in detail in the following sections. Note that in multithreaded programs you have to make your trace logging thread-safe. See Chapter 23 for details on multithreaded programming.
The first technique to add debug traces is to provide a debug mode for your program. In debug mode, the program writes trace output to standard error or to a file, and perhaps does extra checking during execution. There are several ways to add a debug mode to your program. Note that all these examples are writing traces in text format.
Start-time debug mode allows your application to run with or without debug mode depending on a command-line argument. This strategy includes the debug code in the “release” binary, and allows debug mode to be enabled at a customer site. However, it does require users to restart the program in order to run it in debug mode, which may prevent you from obtaining useful information about certain bugs.
The following example is a simple program implementing a start-time debug mode. This program doesn’t do anything useful; it is only for demonstrating the technique.
All logging functionality is wrapped in a Logger class. This class has two static data members: the name of the log file, and a Boolean saying whether logging is enabled or disabled. The class has a static public log() variadic template method. Variadic templates are discussed in Chapter 22. Note that the log file is opened, flushed, and closed on each call to log(). This might lower performance a bit; however, it does guarantee correct logging, which is more important.
class Logger
{
public:
    static void enableLogging(bool enable) { msLoggingEnabled = enable; }
    static bool isLoggingEnabled() { return msLoggingEnabled; }

    template<typename... Args>
    static void log(const Args&... args)
    {
        if (!msLoggingEnabled)
            return;
        ofstream logfile(msDebugFileName, ios_base::app);
        if (logfile.fail()) {
            cerr << "Unable to open debug file!" << endl;
            return;
        }
        // Use a C++17 unary right fold, see Chapter 22.
        ((logfile << args), ...);
        logfile << endl;
    }

private:
    static const string msDebugFileName;
    static bool msLoggingEnabled;
};

const string Logger::msDebugFileName = "debugfile.out";
bool Logger::msLoggingEnabled = false;
The following helper macro is defined to make it easy to log something. It uses __func__, a predefined variable defined by the C++ standard that contains the name of the current function.
#define log(...) Logger::log(__func__, "(): ", __VA_ARGS__)
This macro replaces every call to log() in your code with a call to Logger::log(). The macro automatically includes the function name as first argument to Logger::log(). For example, suppose you call the macro as follows:
log("The value is: ", value);
The log() macro replaces this with the following:
Logger::log(__func__, "(): ", "The value is: ", value);
Start-time debug mode needs to parse the command-line arguments to find out whether or not it should enable debug mode. Unfortunately, there is no standard functionality in C++ for parsing command-line arguments. This program uses a simple isDebugSet() function to check for the debug flag among all the command-line arguments, but a function to parse all command-line arguments would need to be more sophisticated.
bool isDebugSet(int argc, char* argv[])
{
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "-d") == 0) {
            return true;
        }
    }
    return false;
}
Some arbitrary test code is used to exercise the debug mode in this example. Two classes are defined, ComplicatedClass and UserCommand. Both classes define an operator<< to write instances of them to a stream. The Logger class uses this operator to dump objects to the log file.
class ComplicatedClass { /* … */ };
ostream& operator<<(ostream& ostr, const ComplicatedClass& src)
{
    ostr << "ComplicatedClass";
    return ostr;
}

class UserCommand { /* … */ };
ostream& operator<<(ostream& ostr, const UserCommand& src)
{
    ostr << "UserCommand";
    return ostr;
}
Here is some test code with a number of log calls:
UserCommand getNextCommand(ComplicatedClass* obj)
{
    UserCommand cmd;
    return cmd;
}

void processUserCommand(UserCommand& cmd)
{
    // details omitted for brevity
}

void trickyFunction(ComplicatedClass* obj)
{
    log("given argument: ", *obj);
    for (size_t i = 0; i < 100; ++i) {
        UserCommand cmd = getNextCommand(obj);
        log("retrieved cmd ", i, ": ", cmd);
        try {
            processUserCommand(cmd);
        } catch (const exception& e) {
            log("exception from processUserCommand(): ", e.what());
        }
    }
}

int main(int argc, char* argv[])
{
    Logger::enableLogging(isDebugSet(argc, argv));
    if (Logger::isLoggingEnabled()) {
        // Print the command-line arguments to the trace
        for (int i = 0; i < argc; i++) {
            log(argv[i]);
        }
    }
    ComplicatedClass obj;
    trickyFunction(&obj);
    // Rest of the function not shown
    return 0;
}
There are two ways to run this application:
> STDebug
> STDebug -d
Debug mode is activated only when the -d argument is specified on the command line.
Instead of enabling or disabling debug mode through a command-line argument, you could also use a preprocessor symbol such as DEBUG_MODE and #ifdefs to selectively compile the debug code into your program. In order to generate a debug version of this program, you would have to compile it with the symbol DEBUG_MODE defined. Your compiler should allow you to define symbols during compilation; consult your compiler’s documentation for details. For example, GCC allows you to specify -Dsymbol on the command line. Microsoft VC++ allows you to specify the symbols through the Visual Studio IDE, or by specifying /D symbol if you use the VC++ command-line tools. Visual C++ automatically defines the _DEBUG symbol for debug builds. However, that symbol is Visual C++ specific, so the example in this section uses a custom symbol called DEBUG_MODE.
The advantage of this method is that your debug code is not compiled into the “release” binary, and so does not increase its size. The disadvantage is that there is no way to enable debugging at a customer site for testing or following the discovery of a bug.
An example implementation is given in CTDebug.cpp in the downloadable source code archive. One important remark on this implementation is that it contains the following definition for the log() macro:
#ifdef DEBUG_MODE
    #define log(...) Logger::log(__func__, "(): ", __VA_ARGS__)
#else
    #define log(...)
#endif
That is, if DEBUG_MODE is not defined, then all calls to log() are replaced with nothing, called no-ops.
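One consequence of such no-op macros is that the arguments of a trace call are not evaluated at all in a build without the symbol, so they should not have side effects that the program relies on. The following sketch illustrates this. The macro name TRACE and the helper functions are assumptions chosen for this illustration (a different macro name is used to keep the sketch independent of the Logger class), and it writes to std::cerr instead of a file.

```cpp
#include <iostream>

// Writes the function name followed by all arguments to std::cerr.
template<typename... Args>
void writeTrace(const char* func, const Args&... args)
{
    std::cerr << func << "(): ";
    ((std::cerr << args), ...);
    std::cerr << std::endl;
}

// Same shape as the compile-time macro: a real call in debug builds,
// nothing at all otherwise.
#ifdef DEBUG_MODE
    #define TRACE(...) writeTrace(__func__, __VA_ARGS__)
#else
    #define TRACE(...)
#endif

// Hypothetical function with a side effect: hands out increasing IDs.
int nextId()
{
    static int id = 0;
    return ++id;
}

// Returns how many IDs were consumed by the trace statement:
// 1 when DEBUG_MODE is defined, 0 when it is not.
int idsConsumedByTracing()
{
    int before = nextId();
    TRACE("id: ", nextId()); // Evaluated only if DEBUG_MODE is defined.
    int after = nextId();
    return after - before - 1;
}
```

Because the debug and release builds would hand out different IDs here, it is safer to compute such values outside of the trace call and pass only the result.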
The most flexible way to provide a debug mode is to allow it to be enabled or disabled at run time. One way to provide this feature is to supply an asynchronous interface that controls debug mode on the fly. This interface could be an asynchronous command that makes an interprocess call into the application (for example, using sockets, signals, or remote procedure calls). This interface could also take the form of a menu command in the user interface. C++ provides no standard way to perform interprocess communication, so an example of this technique is not shown.
Debug mode is useful for debugging reproducible problems and for running tests. However, bugs often appear when the program is running in non-debug mode, and by the time you or the customer enables debug mode, it is too late to gain any information about the bug. One solution to this problem is to enable tracing in your program at all times. You usually need only the most recent traces to debug a program, so you should store only the most recent traces at any point in a program’s execution. One way to provide for this is through careful use of log file rotations.
However, for performance reasons, it is better that your program doesn’t log these traces continuously to disk. Instead, it should store them in memory and provide a mechanism to dump all the trace messages to standard error or to a log file if the need arises.
A common technique is to use a ring buffer, also known as a circular buffer, to store a fixed number of messages, or messages in a fixed amount of memory. When the buffer fills up, it starts writing messages at the beginning of the buffer again, overwriting the older messages. This cycle can repeat indefinitely. The following sections provide an implementation of a ring buffer and show you how you can use it in your programs.
The following RingBuffer class provides a simple debug ring buffer. The client specifies the number of entries in the constructor, and adds messages with the addEntry() method. Once the number of entries exceeds the number allowed, new entries overwrite the oldest entries in the buffer. The buffer also provides the option to output entries to a stream as they are added to the buffer. The client can specify an output stream in the constructor, and can reset it with the setOutput() method. Finally, the operator<< streams the entire buffer to an output stream. This implementation uses a variadic template method. Variadic templates are discussed in Chapter 22.
class RingBuffer
{
public:
    // Constructs a ring buffer with space for numEntries.
    // Entries are written to *ostr as they are queued (optional).
    explicit RingBuffer(size_t numEntries = kDefaultNumEntries,
        std::ostream* ostr = nullptr);
    virtual ~RingBuffer() = default;

    // Adds an entry to the ring buffer, possibly overwriting the
    // oldest entry in the buffer (if the buffer is full).
    template<typename... Args>
    void addEntry(const Args&... args)
    {
        std::ostringstream os;
        // Use a C++17 unary right fold, see Chapter 22.
        ((os << args), ...);
        addStringEntry(os.str());
    }

    // Streams the buffer entries, separated by newlines, to ostr.
    friend std::ostream& operator<<(std::ostream& ostr, RingBuffer& rb);

    // Streams entries as they are added to the given stream.
    // Specify nullptr to disable this feature.
    // Returns the old output stream.
    std::ostream* setOutput(std::ostream* newOstr);

private:
    std::vector<std::string> mEntries;
    std::vector<std::string>::iterator mNext;
    std::ostream* mOstr;
    bool mWrapped;
    static const size_t kDefaultNumEntries = 500;
    void addStringEntry(std::string&& entry);
};
This implementation of the ring buffer stores a fixed number of string objects. This approach certainly is not the most efficient solution. Another possibility would be to provide a fixed number of bytes of memory for the buffer. However, this implementation should be sufficient unless you’re writing a high-performance application.
For multithreaded programs, it’s useful to add the ID of the thread and a timestamp to each trace entry. Of course, the ring buffer has to be made thread-safe before using it in a multithreaded application. See Chapter 23 for multithreaded programming.
Here are the implementations:
// Initialize the vector to hold exactly numEntries. The vector size
// does not need to change during the lifetime of the object.
// Initialize the other members.
RingBuffer::RingBuffer(size_t numEntries, ostream* ostr)
    : mEntries(numEntries), mOstr(ostr), mWrapped(false)
{
    if (numEntries == 0)
        throw invalid_argument("Number of entries must be > 0.");
    mNext = begin(mEntries);
}

// The addStringEntry algorithm is pretty simple: add the entry to the next
// free spot, then reset mNext to indicate the free spot after
// that. If mNext reaches the end of the vector, it starts over at 0.
//
// The buffer needs to know if the buffer has wrapped or not so
// that it knows whether to print the entries past mNext in operator<<.
void RingBuffer::addStringEntry(string&& entry)
{
    // If there is a valid ostream, write this entry to it.
    if (mOstr) {
        *mOstr << entry << endl;
    }
    // Move the entry to the next free spot and increment
    // mNext to point to the free spot after that.
    *mNext = std::move(entry);
    ++mNext;
    // Check if we've reached the end of the buffer. If so, we need to wrap.
    if (mNext == end(mEntries)) {
        mNext = begin(mEntries);
        mWrapped = true;
    }
}

// Set the output stream.
ostream* RingBuffer::setOutput(ostream* newOstr)
{
    return std::exchange(mOstr, newOstr);
}
// operator<< uses an ostream_iterator to "copy" entries directly
// from the vector to the output stream.
//
// operator<< must print the entries in order. If the buffer has wrapped,
// the earliest entry is one past the most recent entry, which is the entry
// indicated by mNext. So, first print from entry mNext to the end.
//
// Then (even if the buffer hasn't wrapped) print from beginning to mNext-1.
ostream& operator<<(ostream& ostr, RingBuffer& rb)
{
    if (rb.mWrapped) {
        // If the buffer has wrapped, print the elements from
        // the earliest entry to the end.
        copy(rb.mNext, end(rb.mEntries), ostream_iterator<string>(ostr, "\n"));
    }
    // Now, print up to the most recent entry.
    // Go up to mNext because the range is not inclusive on the right side.
    copy(begin(rb.mEntries), rb.mNext, ostream_iterator<string>(ostr, "\n"));
    return ostr;
}
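For multithreaded programs, the thread ID and timestamp mentioned earlier could be added along the following lines. This is a minimal sketch and not a thread-safe version of RingBuffer itself: the ThreadSafeTrace class is a hypothetical stand-in that keeps every entry in a vector and protects it with a mutex, and the entry format is an assumption.

```cpp
#include <chrono>
#include <mutex>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical thread-safe trace store. It keeps every entry; a real
// ring buffer would overwrite the oldest ones instead.
class ThreadSafeTrace
{
public:
    // Prefixes the message with a timestamp and the calling thread's ID.
    void addEntry(const std::string& message)
    {
        using namespace std::chrono;
        auto now = duration_cast<milliseconds>(
            steady_clock::now().time_since_epoch()).count();
        std::ostringstream os;
        os << '[' << now << " ms] [thread "
           << std::this_thread::get_id() << "] " << message;

        std::lock_guard<std::mutex> lock(mMutex); // One writer at a time.
        mEntries.push_back(os.str());
    }

    // Returns a copy of all entries, taken while holding the lock.
    std::vector<std::string> snapshot() const
    {
        std::lock_guard<std::mutex> lock(mMutex);
        return mEntries;
    }

private:
    mutable std::mutex mMutex;
    std::vector<std::string> mEntries;
};
```

The entry is formatted before the lock is taken so that the mutex is held only for the short time needed to store the string.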
In order to use the ring buffer, you can create an instance of it and start adding messages to it. When you want to print the buffer, just use operator<< to print it to the appropriate ostream. Here is the earlier start-time debug mode program modified to use a ring buffer instead. The definitions of the ComplicatedClass and UserCommand classes, and the functions getNextCommand(), processUserCommand(), and trickyFunction() are not shown. They are exactly the same as before.
RingBuffer debugBuf;
#define log(...) debugBuf.addEntry(__func__, "(): ", __VA_ARGS__)

int main(int argc, char* argv[])
{
    // Log the command-line arguments
    for (int i = 0; i < argc; i++) {
        log(argv[i]);
    }
    ComplicatedClass obj;
    trickyFunction(&obj);
    // Print the current contents of the debug buffer to cout
    cout << debugBuf;
    return 0;
}
Storing trace debug messages in memory is a great start, but in order for them to be useful, you need a way to access these traces for debugging.
Your program should provide a “hook” to tell it to export the messages. This hook could be similar to the interface you would use to enable debugging at run time. Additionally, if your program encounters a fatal error that causes it to exit, it could export the ring buffer automatically to a log file before exiting.
Another way to retrieve these messages is to obtain a memory dump of the program. Each platform handles memory dumps differently, so you should consult a reference or expert for your platform.
The <cassert> header defines an assert macro. It takes a Boolean expression and, if the expression evaluates to false, prints an error message and terminates the program. If the expression evaluates to true, it does nothing.
Assertions allow you to “force” your program to exhibit a bug at the exact point where that bug originates. If you didn’t assert at that point, your program might proceed with those incorrect values, and the bug might not show up until much later. Thus, assertions allow you to detect bugs earlier than you otherwise would.
You could use assertions in your code whenever you are “assuming” something about the state of your variables. For example, if you call a library function that is supposed to return a pointer and claims never to return nullptr, throw in an assert after the function call to make sure that the pointer isn’t nullptr.
Note that you should assume as little as possible. For example, if you are writing a library function, don’t assert that the parameters are valid. Instead, check the parameters, and return an error code or throw an exception if they are invalid.
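The difference between the two kinds of checks can be sketched as follows. The functions findLargest() and largestValue() are hypothetical examples: the first stands for a helper that is documented never to return nullptr, and the second for a library function that validates its parameters.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical internal helper, documented never to return nullptr.
// It requires a non-empty vector.
const int* findLargest(const std::vector<int>& values)
{
    const int* largest = &values.front();
    for (const int& v : values) {
        if (v > *largest) {
            largest = &v;
        }
    }
    return largest;
}

// Hypothetical library function: invalid arguments are reported with an
// exception, while the internal assumption is guarded by an assert.
int largestValue(const std::vector<int>& values)
{
    if (values.empty()) {
        throw std::invalid_argument("values must not be empty.");
    }
    const int* result = findLargest(values);
    assert(result != nullptr); // Assumption about our own code.
    return *result;
}
```

The exception reports a mistake made by the caller, whereas the assert documents an assumption that can only be violated by a bug in the library itself.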
As a rule, assertions should only be used for cases that are truly problematic, and should therefore never be ignored when they occur during development. If you hit an assertion during development, fix the underlying problem; don’t just disable the assertion.
Make sure your programs create crash dumps, also called memory dumps, core dumps, and so on. A crash dump is a dump file that is created when your application crashes. It contains information about which threads were running at the time of the crash, a call stack of all the threads, and so on. How you create such dumps is platform dependent, so you should consult the documentation of your platform, or use a third-party library that takes care of it for you. Breakpad is an example of such an open-source cross-platform library that can write and process crash dumps.
Also make sure you set up a symbol server and a source code server. The symbol server is used to store debugging symbols of released binaries of your software. These symbols are used later on to interpret crash dumps received from customers. The source code server, discussed in Chapter 24, stores all revisions of your source code. When debugging crash dumps, this source code server is used to download the correct source code for the revision of your software that created the crash dump.
The exact procedure of analyzing crash dumps depends on your platform and compiler, so consult their documentation.
From my personal experience, I have found that a crash dump is often worth more than a thousand bug reports.
The assertions discussed earlier in this chapter are evaluated at run time. static_assert() allows assertions evaluated at compile time. A call to static_assert() accepts two parameters: an expression to evaluate at compile time and a string. When the expression evaluates to false, the compiler issues an error that contains the given string. A simple example could be to check that you are compiling with a 64-bit compiler:
static_assert(sizeof(void*) == 8, "Requires 64-bit compilation.");
If you compile this with a 32-bit compiler where a pointer is four bytes, the compiler issues an error that can look like this:
test.cpp(3): error C2338: Requires 64-bit compilation.
Since C++17, the string parameter is optional, as in this example:
static_assert(sizeof(void*) == 8);
In this case, if the expression evaluates to false, you get a compiler-dependent error message. For example, Microsoft Visual C++ 2017 gives the following error:
test.cpp(3): error C2607: static assertion failed
Another example where static_asserts are pretty powerful is in combination with type traits, which are discussed in Chapter 22. For example, if you write a function template or class template, you could use static_asserts together with type traits to issue compiler errors when template types don’t satisfy certain conditions. The following example requires that the template type for the process() function template has Base1 as a base class:
class Base1 {};
class Base1Child : public Base1 {};
class Base2 {};
class Base2Child : public Base2 {};

template<typename T>
void process(const T& t)
{
    static_assert(is_base_of_v<Base1, T>, "Base1 should be a base for T.");
}

int main()
{
    process(Base1());
    process(Base1Child());
    //process(Base2()); // Error
    //process(Base2Child()); // Error
}
If you try to call process() with an instance of Base2 or Base2Child, the compiler issues an error that could look like this:
test.cpp(13): error C2338: Base1 should be a base for T.
test.cpp(21) : see reference to function template
instantiation 'void process<Base2>(const T &)' being compiled
with
[
T=Base2
]
Debugging a program can be incredibly frustrating. However, with a systematic approach it becomes significantly easier. Your first step in trying to debug a program should always be to reproduce the bug. Depending on whether or not you can reproduce the bug, your subsequent approach will differ. The next four sections explain how to reproduce bugs, how to debug reproducible bugs, how to debug nonreproducible bugs, and how to debug regressions. Additional sections explain details about debugging memory errors and debugging multithreaded programs.
If you can reproduce the bug consistently, it will be much easier to determine the root cause. Finding the root cause of bugs that are not reproducible is difficult, if not impossible.
As a first step to reproduce the bug, run the program with exactly the same inputs as the run when the bug first appeared. Be sure to include all inputs, from the program’s startup to the time of the bug’s appearance. A common mistake is to attempt to reproduce the bug by performing only the triggering action. This technique may not reproduce the bug because the bug might be caused by an entire sequence of actions.
For example, if your web browser program dies when you request a certain web page, it may be due to memory corruption triggered by that particular request’s network address. On the other hand, it may be because your program records all requests in a queue, with space for one million entries, and this entry was number one million and one. Starting the program over and sending one request certainly wouldn’t trigger the bug in that case.
Sometimes it is impossible to emulate the entire sequence of events that leads to the bug. Perhaps the bug was reported by someone who can’t remember everything that she did. Alternatively, maybe the program was running for too long to emulate every input. In that case, do your best to reproduce the bug. It takes some guesswork, and can be time-consuming, but effort at this point will save time later in the debugging process. Here are some techniques you can try:
After you are able to reproduce the bug consistently, you should attempt to find the smallest sequence that triggers the bug. You can start with the minimum sequence, containing only the triggering action, and slowly expand the sequence to cover the entire sequence from startup until the bug is triggered. This will result in the simplest and most efficient test case to reproduce it, which makes it simpler to find the root cause of the problem, and easier to verify the fix.
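If the recorded inputs can be replayed automatically, this search for the smallest sequence could be sketched as follows. The representation of actions as strings, the function name shortestReproducingSuffix(), and the reproduces predicate (which stands for replaying a sequence and checking whether the bug appears) are all assumptions made for this illustration.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

using Actions = std::vector<std::string>;

// Starts with only the triggering (last) action and expands the sequence
// one action at a time toward startup. Returns the shortest trailing part
// of 'recorded' for which 'reproduces' returns true, or the full sequence
// if no shorter one triggers the bug.
Actions shortestReproducingSuffix(const Actions& recorded,
    const std::function<bool(const Actions&)>& reproduces)
{
    for (std::size_t length = 1; length < recorded.size(); ++length) {
        auto first = recorded.end() - static_cast<std::ptrdiff_t>(length);
        Actions candidate(first, recorded.end());
        if (reproduces(candidate)) {
            return candidate;
        }
    }
    return recorded;
}
```

This only considers contiguous sequences that end with the triggering action; actions in the middle of the result might still be unnecessary and can be removed by hand afterward.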
When you can reproduce a bug consistently and efficiently, it’s time to figure out the problem in the code that causes the bug. Your goal at this point is to find the exact lines of code that trigger the problem. You can use two different strategies.
The debugging example at the end of this chapter demonstrates both these approaches.
Fixing bugs that are not reproducible is significantly more difficult than fixing reproducible bugs. You often have very little information and must employ a lot of guesswork. Nevertheless, a few strategies can aid you.
Once you have found the root cause of a nonreproducible bug, you should create a reproducible test case and move it to the “reproducible bugs” category. It is important to be able to reproduce a bug before you actually fix it. Otherwise, how will you test the fix? A common mistake when debugging nonreproducible bugs is to fix the wrong problem in the code. Because you can’t reproduce the bug, you don’t know if you’ve really fixed it, so you shouldn’t be surprised when it shows up again a month later.
If a feature contains a regression bug, it means that the feature used to work correctly, but stopped working due to the introduction of a bug.
A useful debugging technique for investigating regressions is to look at the change log of relevant files. If you know at what time the feature was still working, look at all the change logs since that time. You might notice something suspicious that could lead you to the root cause.
Another approach that can save you a lot of time when debugging regressions is to use a binary search approach with older versions of the software to try and figure out when it started to go wrong. You can use binaries of older versions if you keep them, or revert the source code to an older revision. Once you know when it started to go wrong, inspect the change logs to see what changed at that time. This mechanism is only possible when you can reproduce the bug.
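The binary search over older versions can be sketched as follows, assuming that revisions are numbered with increasing integers and that the feature stays broken once it breaks. The function name findFirstBrokenRevision() and the isBroken predicate, which stands for building and testing one revision, are placeholders invented for this illustration.

```cpp
#include <functional>

// Finds the first revision at which the feature is broken.
// Preconditions (assumptions of this sketch): revision 'lastGood' works,
// revision 'firstBad' is broken, and lastGood < firstBad.
int findFirstBrokenRevision(int lastGood, int firstBad,
    const std::function<bool(int)>& isBroken)
{
    while (firstBad - lastGood > 1) {
        int middle = lastGood + (firstBad - lastGood) / 2;
        if (isBroken(middle)) {
            firstBad = middle;  // The regression is at or before middle.
        } else {
            lastGood = middle;  // The regression is after middle.
        }
    }
    return firstBad;
}
```

With 100 revisions between the known good and known bad versions, this needs at most seven builds and test runs instead of up to 99.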
Most catastrophic bugs, such as application death, are caused by memory errors. Many noncatastrophic bugs are triggered by memory errors as well. Some memory bugs are obvious. If your program attempts to dereference a nullptr pointer, the default action is to terminate the program. However, nearly every platform enables you to respond to catastrophic errors and take remedial action. The amount of effort you devote to the response depends on the importance of this kind of recovery to your end users. For example, a text editor really needs to make a best attempt to save the modified buffers (possibly under a “recovered” name), while for other programs, users may find the default behavior acceptable, even if it is unpleasant.
Some memory bugs are more insidious. If you write past the end of an array in C++, your program will probably not crash at that point. However, if that array was on the stack, you may have written into a different variable or array, changing values that won’t show up until later in the program. Alternatively, if the array was on the heap, you could cause memory corruption in the heap, which will cause errors later when you attempt to allocate or free more memory dynamically.
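When you suspect such an overrun, one way to make it show up at the point of the write is to use bounds-checked access while debugging. The following minimal sketch assumes that the suspect code can use std::array (or std::vector) instead of a C-style array; the function name checkedWrite() is invented for this illustration.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Writes 'value' at 'index' using bounds-checked access. Returns false
// instead of silently overwriting neighboring memory when the index is
// out of range.
bool checkedWrite(std::array<int, 4>& values, std::size_t index, int value)
{
    try {
        values.at(index) = value; // Throws std::out_of_range if invalid.
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

Unlike operator[], at() checks the index on every access, so the error is reported where the invalid write happens rather than later in the program.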
Chapter 7 introduces some of the common memory errors from the perspective of what to avoid when you’re coding. This section discusses memory errors from the perspective of identifying problems in code that exhibits bugs. You should be familiar with the discussion in Chapter 7 before continuing with this section.
In order to debug memory problems, you should be familiar with the types of errors that can occur. This section describes the major categories of memory errors. Each category lists different types of memory errors, including a small code example demonstrating each error, and a list of possible symptoms that you might observe. Note that a symptom is not the same thing as a bug: a symptom is an observable behavior caused by a bug.
The following table summarizes five major errors that involve freeing memory.
ERROR TYPE | SYMPTOMS | EXAMPLE |
Memory leak | Process memory usage grows over time. Process runs more slowly over time. Eventually, depending on the OS, operations and system calls fail because of lack of memory. | void memoryLeak() { int* p = new int[1000]; return; /* BUG! Not freeing p. */ } |
Using mismatched allocation and free operations | Does not usually cause a crash immediately. This type of error can cause memory corruption on some platforms, which might show up as a crash later in the program. Certain mismatches can also cause memory leaks. | void mismatchedFree() { int* p1 = (int*)malloc(sizeof(int)); int* p2 = new int; int* p3 = new int[1000]; delete p1; /* BUG! Should use free() */ delete[] p2; /* BUG! Should use delete */ free(p3); /* BUG! Should use delete[] */ } |
Freeing memory more than once | Can cause a crash if the memory at that location has been handed out in another allocation between the two calls to delete. | void doubleFree() { int* p1 = new int[1000]; delete[] p1; int* p2 = new int[1000]; delete[] p1; /* BUG! Freeing p1 twice */ } /* BUG! Leaking memory of p2 */ |
Freeing unallocated memory | Will usually cause a crash. | void freeUnallocated() { int* p = reinterpret_cast<int*>(10000); delete p; /* BUG! p is not a valid pointer. */ } |
Freeing stack memory | Technically a special case of freeing unallocated memory. This will usually cause a crash. | void freeStack() { int x; int* p = &x; delete p; /* BUG! Freeing stack memory */ } |
The crashes mentioned in this table can have different manifestations depending on your platform, such as segmentation faults, bus errors, access violations, and so on.
As you can see, some of the errors do not cause immediate program termination. These bugs are more subtle, leading to problems later in the program’s execution.
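Several of the freeing errors in the table cannot be written at all once ownership is handed to a smart pointer. The following is a minimal sketch, not code from this chapter: `sumWithUniquePtr()` and `resetTwiceIsSafe()` are hypothetical counterparts of `memoryLeak()` and `doubleFree()`.

```cpp
#include <cstddef>
#include <memory>

// Counterpart of memoryLeak(): the array is released automatically when
// the unique_ptr goes out of scope, on every return path.
int sumWithUniquePtr(std::size_t count)
{
    auto values = std::make_unique<int[]>(count);  // Elements start at 0
    int sum = 0;
    for (std::size_t i = 0; i < count; ++i) {
        values[i] = static_cast<int>(i);
        sum += values[i];
    }
    return sum;
}   // delete[] runs here, exactly once and in the matching form.

// Counterpart of doubleFree(): reset() frees the memory and leaves the
// pointer null, so a second reset() does nothing.
bool resetTwiceIsSafe()
{
    auto p = std::make_unique<int>(42);
    p.reset();
    p.reset();
    return p == nullptr;
}
```

Because `std::make_unique<int[]>` yields a `std::unique_ptr<int[]>`, the array form of delete is selected by the type, which also rules out the mismatched allocation and free operations shown in the table.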
Another category of memory errors involves the actual reading and writing of memory.
ERROR TYPE | SYMPTOMS | EXAMPLE |
Accessing invalid memory | Almost always causes the program to crash immediately. | void accessInvalid() { int* p = reinterpret_cast<int*>(10000); *p = 5; /* BUG! p is not a valid pointer. */ } |
Accessing freed memory | Does not usually cause a program crash. If the memory has been handed out in another allocation, this error type can cause “strange” values to appear unexpectedly. | void accessFreed() { int* p1 = new int; delete p1; int* p2 = new int; *p1 = 5; /* BUG! The memory pointed to by p1 has been freed. */ } |
Accessing memory in a different allocation | Does not cause a crash. This error type can cause “strange” and potentially dangerous values to appear unexpectedly. | void accessElsewhere() { int x, y[10], z; x = 0; z = 0; for (int i = 0; i <= 10; i++) { y[i] = 5; /* BUG for i == 10! Element 10 is past the end of the array. */ } } |
Reading uninitialized memory | Does not cause a crash, unless you use the uninitialized value as a pointer and dereference it (as in the example). Even then, it will not always cause a crash. | void readUninitialized() { int* p; cout << *p; /* BUG! p is uninitialized. */ } |
Memory-access errors don’t always cause a crash. They can instead lead to subtle errors, in which the program does not terminate but instead produces erroneous results. Erroneous results can lead to serious consequences, for example, when external devices (such as robotic arms, X-ray machines, radiation treatments, life support systems, and so on) are being controlled by the computer.
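One way to turn a silent overrun into an immediate, visible failure is bounds-checked access. The sketch below repeats the off-by-one loop from `accessElsewhere()` on a `std::array` and uses `at()`, which throws `std::out_of_range` instead of writing into neighboring memory. The function name `detectsOverrun()` is hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Returns true if the out-of-bounds write was detected.
bool detectsOverrun()
{
    std::array<int, 10> y{};
    try {
        // The loop bound is deliberately wrong (<= instead of <).
        for (std::size_t i = 0; i <= 10; ++i) {
            y.at(i) = 5;  // Throws std::out_of_range when i == 10
        }
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

The check costs a comparison per access, so it is a trade-off rather than a universal replacement for `operator[]`, but during debugging it moves the symptom to the exact line that contains the bug.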
Note that the symptoms discussed here for both memory-freeing errors and memory-access errors are the default symptoms for release builds of your program. Debug builds will most likely behave differently, and when you run the program inside a debugger, the debugger might break into the code when an error occurs.
Memory-related bugs often show up in slightly different places in the code each time you run the program. This is usually the case with heap memory corruption. Heap memory corruption is like a time bomb, ready to explode at some attempt to allocate, free, or use memory on the heap. So, when you see a bug that is reproducible, but that shows up in slightly different places, you should suspect memory corruption.
If you suspect a memory bug, your best option is to use a memory-checking tool for C++. Debuggers often provide options to run the program while checking for memory errors. For example, if you run a debug build of your application in the Microsoft Visual C++ debugger, it will catch almost all types of errors discussed in the previous sections. Additionally, there are some excellent third-party tools such as Purify from Rational Software (now owned by IBM) and Valgrind for Linux (discussed in Chapter 7). Microsoft also provides a free download called Application Verifier, which can be used with release builds of your applications in a Windows environment. It is a run-time verification tool to help you find subtle programming errors like the previously discussed memory errors. These debuggers and tools work by interposing their own memory-allocation and -freeing routines in order to check for any misuse of dynamic memory, such as freeing unallocated memory, dereferencing unallocated memory, or writing off the end of an array.
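To make the idea of interposed allocation routines concrete, here is a toy model. It is an assumption-laden sketch and not how any of the tools named above are implemented: `TrackingAllocator` and its member functions are hypothetical, and clients have to call it explicitly, whereas real tools hook `malloc()`, `operator new`, and their counterparts transparently.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <set>
#include <stdexcept>

// Forwards to malloc()/free(), but records every live block so that
// misuse is reported instead of silently corrupting the heap.
class TrackingAllocator
{
public:
    TrackingAllocator() = default;
    TrackingAllocator(const TrackingAllocator&) = delete;
    TrackingAllocator& operator=(const TrackingAllocator&) = delete;

    ~TrackingAllocator()
    {
        // Release leaked blocks so that the sketch itself stays clean.
        for (void* block : mLiveBlocks) { std::free(block); }
    }

    void* allocate(std::size_t bytes)
    {
        void* block = std::malloc(bytes);
        if (!block) { throw std::bad_alloc(); }
        mLiveBlocks.insert(block);
        return block;
    }

    // Throws if the pointer was never handed out or was already freed.
    void deallocate(void* block)
    {
        auto it = mLiveBlocks.find(block);
        if (it == mLiveBlocks.end()) {
            throw std::logic_error("freeing unallocated or already freed memory");
        }
        mLiveBlocks.erase(it);
        std::free(block);
    }

    // A nonzero count at shutdown indicates a leak.
    std::size_t liveBlockCount() const { return mLiveBlocks.size(); }

private:
    std::set<void*> mLiveBlocks;
};
```

Passing the address of a stack variable to `deallocate()` is rejected because that address was never recorded, and a second `deallocate()` of the same block is rejected for the same reason. Detecting writes past the end of a block takes more machinery, such as guard bytes around each allocation, which this sketch leaves out.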
If you don’t have a memory-checking tool at your disposal, and the normal strategies for debugging are not helping, you may need to resort to code inspection. First, narrow down the part of the code containing the bug. Then, as a general rule, look at all raw pointers. Provided that you work on moderate- to good-quality code, most pointers should already be wrapped in smart pointers. If you do encounter raw pointers, take a closer look at how they are used, because they might be the cause of the error. Here are some more items to look for in your code.
- Make sure that move constructors and move assignment operators set the pointers in the source object to nullptr so that their destructors don’t try to free that memory.
- Check casts between pointer types, and when possible, use dynamic_casts.
- Make sure that every call to new is matched with exactly one call to delete. Similarly, every call to malloc, calloc, or realloc should be matched with one call to free, and every call to new[] should be matched with one call to delete[]. To avoid freeing memory multiple times or using freed memory, it’s recommended to set your pointer to nullptr after freeing its memory. Of course, the best solution is to avoid using raw pointers to handle ownership of resources, and instead use smart pointers.
- When you declare a pointer, always initialize it: use T* p = nullptr; or T* p = new T; but never T* p;. Better yet, use smart pointers!
- Similarly, make sure that pointer data members are always initialized, either by allocating memory for them or by setting them to nullptr. Also here, the best solution is to use smart pointers.
C++ includes a threading support library that provides mechanisms for threading and synchronization between threads. This threading support library is discussed in Chapter 23. Multithreaded C++ programs are common, so it is important to think about the special issues involved in debugging a multithreaded program. Bugs in multithreaded programs are often caused by variations in timings in the operating system scheduling, and can be difficult to reproduce. Thus, debugging multithreaded programs requires a special set of techniques.
One such technique is to insert forced sleeps and context switches into your code. The <thread> header defines sleep_until() and sleep_for() in the std::this_thread namespace, which you can use to sleep. The time to sleep is specified as an std::chrono::time_point or an std::chrono::duration, respectively, both part of the chrono library discussed in Chapter 20. Sleeping for several seconds right before releasing a lock, immediately before signaling a condition variable, or directly before accessing shared data can reveal race conditions that would otherwise go undetected. If this debugging technique reveals the root cause, it must be fixed, so that it works correctly after removing these forced sleeps and context switches. Never leave these forced sleeps and context switches in your code! That would be the wrong “fix” for the problem.
This section presents a buggy program and shows you the steps to take in order to debug it and fix the problem.
Suppose that you’re part of a team writing a web page that allows users to search for the research articles that cite a particular paper. This type of service is useful for authors who are trying to find work similar to their own. Once they find one paper representing a related work, they can look for every paper that cites that one to find other related work.
In this project, you are responsible for the code that reads the raw citation data from text files. For simplicity, assume that the citation information for each paper is found in its own file. Furthermore, assume that the first line of each file contains the author, title, and publication information for the paper; that the second line is always empty; and that all subsequent lines contain the citations from the article (one on each line). Here is an example file for one of the most important papers in computer science:
Alan Turing, "On Computable Numbers, with an Application to the Entscheidungsproblem", Proceedings of the London Mathematical Society, Series 2, Vol.42 (1936-37), 230-265.
Gödel, "Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme, I", Monatshefte Math. Phys., 38 (1931), 173-198.
Alonzo Church. "An unsolvable problem of elementary number theory", American J. of Math., 58 (1936), 345-363.
Alonzo Church. "A note on the Entscheidungsproblem", J. of Symbolic Logic, 1 (1936), 40-41.
E.W. Hobson, "Theory of functions of a real variable (2nd ed., 1921)", 87-88.
Buggy Implementation of an ArticleCitations Class
You may decide to structure your program by writing an ArticleCitations class that reads the file and stores the information. This class stores the article information from the first line in one string, and the citations in a C-style array of strings.
The ArticleCitations class definition looks like this:
    class ArticleCitations
    {
        public:
            ArticleCitations(std::string_view fileName);
            virtual ~ArticleCitations();
            ArticleCitations(const ArticleCitations& src);
            ArticleCitations& operator=(const ArticleCitations& rhs);
            std::string_view getArticle() const { return mArticle; }
            int getNumCitations() const { return mNumCitations; }
            std::string_view getCitation(int i) const { return mCitations[i]; }
        private:
            void readFile(std::string_view fileName);
            void copy(const ArticleCitations& src);
            std::string mArticle;
            std::string* mCitations;
            int mNumCitations;
    };
The implementation is as follows. Keep in mind that this program is buggy! Don’t use it verbatim or as a model.
    ArticleCitations::ArticleCitations(string_view fileName)
        : mCitations(nullptr), mNumCitations(0)
    {
        // All we have to do is read the file.
        readFile(fileName);
    }

    ArticleCitations::ArticleCitations(const ArticleCitations& src)
    {
        copy(src);
    }

    ArticleCitations& ArticleCitations::operator=(const ArticleCitations& rhs)
    {
        // Check for self-assignment.
        if (this == &rhs) {
            return *this;
        }
        // Free the old memory.
        delete [] mCitations;
        // Copy the data
        copy(rhs);
        return *this;
    }

    void ArticleCitations::copy(const ArticleCitations& src)
    {
        // Copy the article name, author, etc.
        mArticle = src.mArticle;
        // Copy the number of citations
        mNumCitations = src.mNumCitations;
        // Allocate an array of the correct size
        mCitations = new string[mNumCitations];
        // Copy each element of the array
        for (int i = 0; i < mNumCitations; i++) {
            mCitations[i] = src.mCitations[i];
        }
    }

    ArticleCitations::~ArticleCitations()
    {
        delete [] mCitations;
    }

    void ArticleCitations::readFile(string_view fileName)
    {
        // Open the file and check for failure.
        ifstream inputFile(fileName.data());
        if (inputFile.fail()) {
            throw invalid_argument("Unable to open file");
        }
        // Read the article author, title, etc. line.
        getline(inputFile, mArticle);
        // Skip the white space before the citations start.
        inputFile >> ws;
        int count = 0;
        // Save the current position so we can return to it.
        streampos citationsStart = inputFile.tellg();
        // First count the number of citations.
        while (!inputFile.eof()) {
            // Skip white space before the next entry.
            inputFile >> ws;
            string temp;
            getline(inputFile, temp);
            if (!temp.empty()) {
                count++;
            }
        }
        if (count != 0) {
            // Allocate an array of strings to store the citations.
            mCitations = new string[count];
            mNumCitations = count;
            // Seek back to the start of the citations.
            inputFile.seekg(citationsStart);
            // Read each citation and store it in the new array.
            for (count = 0; count < mNumCitations; count++) {
                string temp;
                getline(inputFile, temp);
                if (!temp.empty()) {
                    mCitations[count] = temp;
                }
            }
        } else {
            mNumCitations = -1;
        }
    }
Testing the ArticleCitations class
The following program asks the user for a filename, constructs an ArticleCitations instance for that file, and passes this instance by value to the processCitations() function, which prints out all the information:
    void processCitations(ArticleCitations cit)
    {
        cout << cit.getArticle() << endl;
        int num = cit.getNumCitations();
        for (int i = 0; i < num; i++) {
            cout << cit.getCitation(i) << endl;
        }
    }

    int main()
    {
        while (true) {
            cout << "Enter a file name (\"STOP\" to stop): ";
            string fileName;
            cin >> fileName;
            if (fileName == "STOP") {
                break;
            }
            ArticleCitations cit(fileName);
            processCitations(cit);
        }
        return 0;
    }
You decide to test the program on the Alan Turing example (stored in a file called paper1.txt). Here is the output:
    Enter a file name ("STOP" to stop): paper1.txt
    Alan Turing, "On Computable Numbers, with an Application to the Entscheidungsproblem", Proceedings of the London Mathematical Society, Series 2, Vol.42 (1936-37), 230-265.
    [ 4 empty lines omitted for brevity ]
    Enter a file name ("STOP" to stop): STOP
That doesn’t look right. There are supposed to be four citations printed instead of four blank lines.
Message-Based Debugging
For this bug, you decide to try log-based debugging, and because this is a console application, you decide to just print messages to cout. In this case, it makes sense to start by looking at the function that reads the citations from the file. If that doesn’t work right, then obviously the object won’t have the citations. You can modify readFile() as follows:
    void ArticleCitations::readFile(string_view fileName)
    {
        // Code omitted for brevity
        // First count the number of citations.
        cout << "readFile(): counting number of citations" << endl;
        while (!inputFile.eof()) {
            // Skip white space before the next entry.
            inputFile >> ws;
            string temp;
            getline(inputFile, temp);
            if (!temp.empty()) {
                cout << "Citation " << count << ": " << temp << endl;
                count++;
            }
        }
        cout << "Found " << count << " citations" << endl;
        cout << "readFile(): reading citations" << endl;
        if (count != 0) {
            // Allocate an array of strings to store the citations.
            mCitations = new string[count];
            mNumCitations = count;
            // Seek back to the start of the citations.
            inputFile.seekg(citationsStart);
            // Read each citation and store it in the new array.
            for (count = 0; count < mNumCitations; count++) {
                string temp;
                getline(inputFile, temp);
                if (!temp.empty()) {
                    cout << temp << endl;
                    mCitations[count] = temp;
                }
            }
        } else {
            mNumCitations = -1;
        }
        cout << "readFile(): finished" << endl;
    }
Running the same test with this program gives the following output:
    Enter a file name ("STOP" to stop): paper1.txt
    readFile(): counting number of citations
    Citation 0: Gödel, "Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme, I", Monatshefte Math. Phys., 38 (1931), 173-198.
    Citation 1: Alonzo Church. "An unsolvable problem of elementary number theory", American J. of Math., 58 (1936), 345-363.
    Citation 2: Alonzo Church. "A note on the Entscheidungsproblem", J. of Symbolic Logic, 1 (1936), 40-41.
    Citation 3: E.W. Hobson, "Theory of functions of a real variable (2nd ed., 1921)", 87-88.
    Found 4 citations
    readFile(): reading citations
    readFile(): finished
    Alan Turing, "On Computable Numbers, with an Application to the Entscheidungsproblem", Proceedings of the London Mathematical Society, Series 2, Vol.42 (1936-37), 230-265.
    [ 4 empty lines omitted for brevity ]
    Enter a file name ("STOP" to stop): STOP
As you can see from the output, the first time the program reads the citations from the file, in order to count them, it reads them correctly. However, the second time, they are not read correctly; nothing is printed between “readFile(): reading citations” and “readFile(): finished”. Why not? One way to delve deeper into this issue is to add some debugging code to check the state of the file stream after each attempt to read a citation:
    void printStreamState(const istream& inputStream)
    {
        if (inputStream.good()) {
            cout << "stream state is good" << endl;
        }
        if (inputStream.bad()) {
            cout << "stream state is bad" << endl;
        }
        if (inputStream.fail()) {
            cout << "stream state is fail" << endl;
        }
        if (inputStream.eof()) {
            cout << "stream state is eof" << endl;
        }
    }

    void ArticleCitations::readFile(string_view fileName)
    {
        // Code omitted for brevity
        // First count the number of citations.
        cout << "readFile(): counting number of citations" << endl;
        while (!inputFile.eof()) {
            // Skip white space before the next entry.
            inputFile >> ws;
            printStreamState(inputFile);
            string temp;
            getline(inputFile, temp);
            printStreamState(inputFile);
            if (!temp.empty()) {
                cout << "Citation " << count << ": " << temp << endl;
                count++;
            }
        }
        cout << "Found " << count << " citations" << endl;
        cout << "readFile(): reading citations" << endl;
        if (count != 0) {
            // Allocate an array of strings to store the citations.
            mCitations = new string[count];
            mNumCitations = count;
            // Seek back to the start of the citations.
            inputFile.seekg(citationsStart);
            // Read each citation and store it in the new array.
            for (count = 0; count < mNumCitations; count++) {
                string temp;
                getline(inputFile, temp);
                printStreamState(inputFile);
                if (!temp.empty()) {
                    cout << temp << endl;
                    mCitations[count] = temp;
                }
            }
        } else {
            mNumCitations = -1;
        }
        cout << "readFile(): finished" << endl;
    }
When you run your program this time, you find some interesting information:
    Enter a file name ("STOP" to stop): paper1.txt
    readFile(): counting number of citations
    stream state is good
    stream state is good
    Citation 0: Gödel, "Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme, I", Monatshefte Math. Phys., 38 (1931), 173-198.
    stream state is good
    stream state is good
    Citation 1: Alonzo Church. "An unsolvable problem of elementary number theory", American J. of Math., 58 (1936), 345-363.
    stream state is good
    stream state is good
    Citation 2: Alonzo Church. "A note on the Entscheidungsproblem", J. of Symbolic Logic, 1 (1936), 40-41.
    stream state is good
    stream state is good
    Citation 3: E.W. Hobson, "Theory of functions of a real variable (2nd ed., 1921)", 87-88.
    stream state is eof
    stream state is fail
    stream state is eof
    Found 4 citations
    readFile(): reading citations
    stream state is fail
    stream state is fail
    stream state is fail
    stream state is fail
    readFile(): finished
    Alan Turing, "On Computable Numbers, with an Application to the Entscheidungsproblem", Proceedings of the London Mathematical Society, Series 2, Vol.42 (1936-37), 230-265.
    [ 4 empty lines omitted for brevity ]
    Enter a file name ("STOP" to stop): STOP
It looks like the stream state is good until after the final citation is read for the first time. Because the paper1.txt file contains an empty last line, the while loop is executed one more time after having read the last citation. In this last loop, inputFile >> ws reads the white-space of the last line, which causes the stream state to become eof. Then, the code still tries to read a line using getline(), which causes the stream state to become fail and eof. That is expected. What is not expected is that the stream state remains as fail after all attempts to read the citations a second time. That doesn’t appear to make sense at first: the code uses seekg() to seek back to the beginning of the citations before reading them a second time.
However, Chapter 13 explains that streams maintain their error states until you clear them explicitly; seekg() doesn’t clear the fail state automatically. When in an error state, streams fail to read data correctly, which explains why the stream state is also fail after trying to read the citations a second time. A closer look at the code reveals that it fails to call clear() on the istream after reaching the end of the file. If you modify the code by adding a call to clear(), it will read the citations properly.
Here is the corrected readFile() method without the debugging cout statements:
    void ArticleCitations::readFile(string_view fileName)
    {
        // Code omitted for brevity
        if (count != 0) {
            // Allocate an array of strings to store the citations.
            mCitations = new string[count];
            mNumCitations = count;
            // Clear the stream state.
            inputFile.clear();
            // Seek back to the start of the citations.
            inputFile.seekg(citationsStart);
            // Read each citation and store it in the new array.
            for (count = 0; count < mNumCitations; count++) {
                string temp;
                getline(inputFile, temp);
                if (!temp.empty()) {
                    mCitations[count] = temp;
                }
            }
        } else {
            mNumCitations = -1;
        }
    }
Running the same test again on paper1.txt now shows you the correct four citations.
Using the GDB Debugger on Linux
Now that your ArticleCitations class seems to work well on one citations file, you decide to blaze ahead and test some special cases, starting with a file with no citations. The file looks like this, and is stored in a file named paper2.txt:
    Author with no citations
When you try to run your program on this file, depending on your version of Linux and your compiler, you might get a crash that looks something like the following:
    Enter a file name ("STOP" to stop): paper2.txt
    terminate called after throwing an instance of 'std::bad_alloc'
    what(): std::bad_alloc
    Aborted (core dumped)
The message “core dumped” means that the program crashed. This time you decide to give the debugger a shot. The GNU Debugger (GDB) is widely available on Unix and Linux platforms. First, you must compile your program with debugging information (-g with g++). Then you can launch the program under GDB. Here’s an example session using the debugger to find the root cause of this problem. This example assumes your compiled executable is called buggyprogram. Text that you have to type follows the > or (gdb) prompt, or the program’s own prompt.
    > gdb buggyprogram
    [ Start-up messages omitted for brevity ]
    Reading symbols from /home/marc/c++/gdb/buggyprogram…done.
    (gdb) run
    Starting program: buggyprogram
    Enter a file name ("STOP" to stop): paper2.txt
    terminate called after throwing an instance of 'std::bad_alloc'
    what(): std::bad_alloc
    Program received signal SIGABRT, Aborted.
    0x00007ffff7535c39 in raise () from /lib64/libc.so.6
    (gdb)
When the program crashes, the debugger breaks the execution, and allows you to poke around in the state of the program at that time. The backtrace or bt command shows the current stack trace. The last operation is at the top, with frame number zero (#0):
    (gdb) bt
    #0 0x00007ffff7535c39 in raise () from /lib64/libc.so.6
    #1 0x00007ffff7537348 in abort () from /lib64/libc.so.6
    #2 0x00007ffff7b35f85 in __gnu_cxx::__verbose_terminate_handler() () from /lib64/libstdc++.so.6
    #3 0x00007ffff7b33ee6 in ?? () from /lib64/libstdc++.so.6
    #4 0x00007ffff7b33f13 in std::terminate() () from /lib64/libstdc++.so.6
    #5 0x00007ffff7b3413f in __cxa_throw () from /lib64/libstdc++.so.6
    #6 0x00007ffff7b346cd in operator new(unsigned long) () from /lib64/libstdc++.so.6
    #7 0x00007ffff7b34769 in operator new[](unsigned long) () from /lib64/libstdc++.so.6
    #8 0x00000000004016ea in ArticleCitations::copy (this=0x7fffffffe090, src=…) at ArticleCitations.cpp:40
    #9 0x00000000004015b5 in ArticleCitations::ArticleCitations (this=0x7fffffffe090, src=…)
        at ArticleCitations.cpp:16
    #10 0x0000000000401d0c in main () at ArticleCitationsTest.cpp:20
When you get a stack trace like this, you should try to find the first stack frame from the top that is in your own code. In this example, this is stack frame #8. From this frame, you can see that there seems to be a problem in the copy() method of ArticleCitations. This method is invoked because main() calls processCitations() and passes the argument by value, which triggers a call to the copy constructor, which calls copy(). Of course, in production code you should pass a const reference, but pass-by-value is used in this example of a buggy program. You can tell the debugger to switch to stack frame #8 with the frame command, which requires the index of the frame to jump to:
    (gdb) frame 8
    #8 0x00000000004016ea in ArticleCitations::copy (this=0x7fffffffe090, src=…) at ArticleCitations.cpp:40
    40      mCitations = new string[mNumCitations];
This output shows that the following line caused a problem:
    mCitations = new string[mNumCitations];
Now, you can use the list command to show the code in the current stack frame around the offending line:
    (gdb) list
    35      // Copy the article name, author, etc.
    36      mArticle = src.mArticle;
    37      // Copy the number of citations
    38      mNumCitations = src.mNumCitations;
    39      // Allocate an array of the correct size
    40      mCitations = new string[mNumCitations];
    41      // Copy each element of the array
    42      for (int i = 0; i < mNumCitations; i++) {
    43          mCitations[i] = src.mCitations[i];
    44      }
In GDB, you can print values available in the current scope with the print command. In order to find the root cause of the problem, you can try printing some of the variables. The error happens inside the copy() method, so checking the value of the src parameter is a good start:
    (gdb) print src
    $1 = (const ArticleCitations &) @0x7fffffffe060: {
      _vptr.ArticleCitations = 0x401fb0 <vtable for ArticleCitations+16>,
      mArticle = "Author with no citations", mCitations = 0x7fffffffe080, mNumCitations = -1}
Ah-ha! Here’s the problem. This article isn’t supposed to have any citations. Why is mNumCitations set to the strange value -1? Take another look at the code in readFile() for the case where there are no citations. In that case, it looks like mNumCitations is erroneously set to -1. The fix is easy: you always need to initialize mNumCitations to 0, instead of setting it to -1 when there are no citations. Another problem is that readFile() can be called multiple times on the same ArticleCitations object, so you also need to free a previously allocated mCitations array. Here is the fixed code:
    void ArticleCitations::readFile(string_view fileName)
    {
        // Code omitted for brevity
        delete [] mCitations;  // Free previously allocated citations.
        mCitations = nullptr;
        mNumCitations = 0;
        if (count != 0) {
            // Allocate an array of strings to store the citations.
            mCitations = new string[count];
            mNumCitations = count;
            // Clear the stream state.
            inputFile.clear();
            // Seek back to the start of the citations.
            inputFile.seekg(citationsStart);
            // Code omitted for brevity
        }
    }
As this example shows, bugs don’t always show up right away. It often takes a debugger and some persistence to find them.
Using the Visual C++ 2017 Debugger
This section explains the same debugging procedure as described in the previous section, but uses the Microsoft Visual C++ 2017 debugger instead of GDB.
First, you need to create a project. Start VC++ and click File ➪ New ➪ Project. In the project template tree on the left, select Visual C++ ➪ Win32 (or Windows Desktop). Then select the Win32 Console Application (or Windows Console Application) template in the list in the middle of the window. At the bottom, you can specify a name for the project and a location where to save it. Specify ArticleCitations as the name, choose a folder in which to save the project, and click OK. A wizard opens. In this wizard, click Next, select Console application, select Empty Project, and click Finish.
Once your new project is created, you can see a list of project files in the Solution Explorer. If this docking window is not visible, go to View ➪ Solution Explorer. There should be no files in the solution right now. Right-click the ArticleCitations project in the Solution Explorer and click Add ➪ Existing Item. Add all the files from the 06_ArticleCitations\06_VisualStudio folder in the downloadable source code archive to the project. After this, your Solution Explorer should look similar to Figure 27-1.
VC++ 2017 does not yet automatically enable C++17 features. Because this example uses std::string_view from C++17, you have to tell VC++ to enable C++17 features. In the Solution Explorer window, right-click the ArticleCitations project and click Properties. In the properties window, go to Configuration Properties ➪ C/C++ ➪ Language, and set the C++ Language Standard option to “ISO C++17 Standard” or “ISO C++ Latest Draft Standard”, whichever is available in your version of Visual C++.
Visual C++ supports so-called precompiled headers, a topic outside the scope of this book. In general, I recommend using precompiled headers if your compiler supports them. However, the ArticleCitations implementation does not use precompiled headers, so you have to disable that feature for this particular project. In the Solution Explorer window, right-click the ArticleCitations project and click Properties. In the properties window, go to Configuration Properties ➪ C/C++ ➪ Precompiled Headers, and set the Precompiled Header option to “Not Using Precompiled Headers.”
Now you can compile the program. Click Build ➪ Build Solution. Then copy the paper1.txt and paper2.txt test files to your ArticleCitations project folder, which is the folder containing the ArticleCitations.vcxproj file.
Run the application with Debug ➪ Start Debugging, and test the program by first specifying the paper1.txt file. It should properly read the file and output the result to the console. Then, test paper2.txt. The debugger breaks the execution with a message similar to Figure 27-2.
This immediately shows you the line where the crash happened. If you only see disassembly code, right-click anywhere on the disassembly and select Go To Source Code. You can now inspect variables by simply hovering your mouse over the name of a variable. If you hover over src, you’ll notice that mNumCitations is -1. The reason and the fix are exactly the same as in the earlier example.
You can come to the same conclusion by inspecting the call stack (Debug ➪ Windows ➪ Call Stack). In this call stack, you need to find the first line that contains code that you wrote. This is shown in Figure 27-3. Just as with GDB, you see that the problem is in copy(). You can double-click that line in the call stack window to jump to the right place in the code.
Instead of hovering over variables to inspect their values, you can also use the Debug ➪ Windows ➪ Autos window, which shows a list of variables. Figure 27-4 shows this list with the src variable expanded to show its data members. From this window, you can also see that mNumCitations is -1.
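Both bugs in this example grew out of manual bookkeeping: a second pass over the stream that needed clear(), and a separate counter that could disagree with the array it described. The following sketch shows one alternative design that sidesteps both. It is not the implementation used in this chapter; `ArticleData`, `readArticle()`, and `readArticleFromText()` are hypothetical names, and the code assumes the same file layout (article line first, then one citation per line, with blank lines ignored).

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical storage type: the members manage their own memory, so no
// destructor, copy constructor, or assignment operator has to be written.
struct ArticleData
{
    std::string article;
    std::vector<std::string> citations;
};

// Reads everything in a single pass. The number of citations is simply
// citations.size(), so it cannot get out of sync with the stored data.
ArticleData readArticle(std::istream& input)
{
    ArticleData result;
    std::getline(input, result.article);
    std::string line;
    while (std::getline(input, line)) {
        // Skip lines that are empty or contain only white space.
        if (line.find_first_not_of(" \t\r") != std::string::npos) {
            result.citations.push_back(line);
        }
    }
    return result;
}

// Convenience wrapper for trying the reader without a file on disk.
ArticleData readArticleFromText(const std::string& text)
{
    std::istringstream input(text);
    return readArticle(input);
}
```

Because `readArticle()` takes an `std::istream&`, the same function works with an `std::ifstream` for real files and with a string stream for quick tests. A file with no citations just produces an empty vector, so there is no special value such as -1 to get wrong.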
You might be inclined to disregard this example as too small to be representative of real debugging. Although the buggy code is not lengthy, many classes that you write will not be much bigger, even in large projects. Imagine if you had failed to test this example thoroughly before integrating it with the rest of the project. If these bugs showed up later, you and other engineers would have to spend more time narrowing down the problem before you could debug it as shown here. Additionally, the techniques shown in this example apply to all debugging, whether on a large or small scale.
The most important concept in this chapter is the fundamental law of debugging: avoid bugs when you’re coding, but plan for bugs in your code. The reality of programming is that bugs will appear. If you’ve prepared your program properly, with error logging, debug traces, and assertions, then the actual debugging will be significantly easier.
In addition to these techniques, this chapter also presented specific approaches for debugging bugs. The most important rule when actually debugging is to reproduce the problem. Then, you can use a symbolic debugger, or log-based debugging, to track down the root cause. Memory errors present particular difficulties, and account for the majority of bugs in legacy C++ code. This chapter described the various categories of memory bugs and their symptoms, and showed examples of debugging errors in a program.
Debugging is a hard skill to learn. To take your C++ skills to a professional level, you will have to practice debugging a lot.