27
Conquering Debugging

THE FUNDAMENTAL LAW OF DEBUGGING

The first rule of debugging is to be honest with yourself and admit that your code will contain bugs! This realistic assessment enables you to put your best effort into preventing bugs from crawling into your code in the first place, while you simultaneously include the necessary features to make debugging as easy as possible.

BUG TAXONOMIES

A bug in a computer program is incorrect run-time behavior. This undesirable behavior includes both catastrophic and noncatastrophic bugs. Examples of catastrophic bugs are program death, data corruption, operating system failures, or some other horrific outcome. A catastrophic bug can also manifest itself external to the software or computer system running the software; for example, medical software might contain a catastrophic bug causing a massive radiation overdose to a patient. Noncatastrophic bugs are bugs that cause the program to behave incorrectly in more subtle ways; for example, a web browser might return the wrong web page, or a spreadsheet application might calculate the standard deviation of a column incorrectly.

There are also so-called cosmetic bugs, where something is visually not correct, but otherwise works correctly. For example, a button in a user interface is kept enabled when it shouldn’t be, but clicking it does nothing. All computations are perfectly correct, the program does not crash, but it doesn’t look as “nice” as it should.

The underlying cause, or root cause, of the bug is the mistake in the program that causes this incorrect behavior. The process of debugging a program includes both determining the root cause of the bug, and fixing the code so that the bug will not occur again.

AVOIDING BUGS

It’s impossible to write completely bug-free code, so debugging skills are important. However, a few tips can help you to minimize the number of bugs.

  • Read this book from cover to cover: Learn the C++ language intimately, especially pointers and memory management. Then, recommend this book to your friends and coworkers so they avoid bugs too.
  • Design before you code: Designing while you code tends to lead to convoluted designs that are harder to understand and are more error-prone. It also makes you more likely to omit possible edge cases and error conditions.
  • Do code reviews: In a professional environment, every single line of code should be peer-reviewed. Sometimes it takes a fresh perspective to notice problems.
  • Test, test, and test again: Thoroughly test your code, and have others test your code! They are more likely to find problems you haven’t thought of.
  • Write automated unit tests: Unit tests are designed to test isolated functionality. You should write unit tests for all implemented features. Run these unit tests automatically as part of your continuous integration setup, or automatically after each local compilation. Chapter 26 discusses unit testing in detail.
  • Expect error conditions, and handle them appropriately: In particular, plan for and handle errors when working with files. They will occur. See Chapters 13 and 14.
  • Use smart pointers to avoid memory leaks: Smart pointers automatically free resources when they are not needed anymore.
  • Don’t ignore compiler warnings: Configure your compiler to compile with a high warning level. Do not blindly ignore warnings. Ideally, you should enable an option in your compiler to treat warnings as errors. This forces you to address each warning.
  • Use static code analysis: A static code analyzer helps you to pinpoint problems in your code by analyzing your source code. Ideally, static code analysis is done automatically by your build process to detect problems early.
  • Use good coding style: Strive for readability and clarity, use meaningful names, don’t use abbreviations, add code comments (not only interface comments), use the override keyword, and so on. This makes it easier for other people to understand your code.

PLANNING FOR BUGS

Your programs should contain features that enable easier debugging when the inevitable bugs arise. This section describes these features and presents sample implementations, where appropriate, that you can incorporate into your own programs.

Error Logging

Imagine this scenario: You have just released a new version of your flagship product, and one of the first users reports that the program “stopped working.” You attempt to pry more information from the user, and eventually discover that the program died in the middle of an operation. The user can’t quite remember what he was doing, or if there were any error messages. How will you debug this problem?

Now imagine the same scenario, but in addition to the limited information from the user, you are also able to examine the error log on the user’s computer. In the log you see a message from your program that says, “Error: unable to open config.xml file.” Looking at the code near the spot where that error message was generated, you find a line in which you read from the file without checking whether the file was opened successfully. You’ve found the root cause of your bug!
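The missing check in this scenario could look something like the following sketch. The function name loadConfig(), the errorLog parameter (standing in for whatever logging facility you use), and the choice to return std::optional are all assumptions made for illustration; they are not taken from any particular product.

```cpp
#include <fstream>
#include <iostream>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical helper: reads a whole configuration file into a string.
// Returns std::nullopt and reports an error if the file cannot be opened.
// errorLog is a stand-in for a real logging facility.
std::optional<std::string> loadConfig(const std::string& fileName,
    std::ostream& errorLog = std::cerr)
{
    std::ifstream input(fileName);
    if (!input) {
        // Check the stream before reading from it, and record the failure.
        errorLog << "Error: unable to open " << fileName << " file."
                 << std::endl;
        return std::nullopt;
    }

    // The file is open, so it is now safe to read from it.
    std::ostringstream contents;
    contents << input.rdbuf();
    return contents.str();
}
```

A caller can then test the returned optional and decide whether to fall back to defaults or to stop, instead of reading from a stream that was never opened.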

Error logging is the process of writing error messages to persistent storage so that they will be available following an application, or even machine, death. Despite the example scenario, you might still have doubts about this strategy. Won’t it be obvious by your program’s behavior if it encounters errors? Won’t the user notice if something goes wrong? As the preceding example shows, user reports are not always accurate or complete. In addition, many programs, such as the operating system kernel and long-running daemons like inetd or syslogd on Unix, are not interactive and run unattended on a machine. The only way these programs can communicate with users is through error logging. In many cases, a program might also want to automatically recover from certain errors, and hide those errors from the user. Still, having logs of those errors available can be invaluable to improve the overall stability of the program.

Thus, your program should log errors as it encounters them. That way, if a user reports a bug, you will be able to examine the log files on the machine to see if your program reported any errors prior to encountering the bug. Unfortunately, error logging is platform dependent: C++ does not contain a standard logging mechanism. Examples of platform-specific logging mechanisms include the syslog facility in Unix, and the event reporting API in Windows. You should consult the documentation for your development platform. There are also some open-source implementations of cross-platform logging frameworks.

Now that you’re convinced that logging is a great feature to add to your programs, you might be tempted to log messages every few lines in your code so that, in the event of any bug, you’ll be able to trace the code path that was executing. These types of log messages are appropriately called traces.

However, you should not write these traces to log files for two reasons. First, writing to persistent storage is slow. Even on systems that write the logs asynchronously, logging that much information will slow down your program. Second, and most important, most of the information that you would put in your traces is not appropriate for the end user to see. It will just confuse the user, leading to unwarranted service calls. That said, tracing is an important debugging technique under the correct circumstances, as described in the next section.

Here are some specific guidelines for the types of errors your programs should log:

  • Unrecoverable errors, such as a system call failing unexpectedly.
  • Errors for which an administrator can take action, such as low memory, an incorrectly formatted data file, an inability to write to disk, or a network connection being down.
  • Unexpected errors such as a code path that you never expected to take or variables with unexpected values. Note that your code should “expect” users to enter invalid data and should handle it appropriately. An unexpected error represents a bug in your program.
  • Potential security breaches, such as a network connection attempted from an unauthorized address, or too many network connections attempted (denial of service).

It is also useful to log warnings, or recoverable errors, which allows you to investigate if you can possibly avoid them.

Most logging APIs allow you to specify a log level or error level, typically error, warning, and info. You can log non-error conditions under a log level that is less severe than “error.” For example, you might want to log significant state changes in your application, or startup and shutdown of the program. You also might consider giving your users a way to adjust the log level of your program at run time so that they can customize the amount of logging that occurs.

Debug Traces

When debugging complicated problems, public error messages generally do not contain enough information. You often need a complete trace of the code path taken, or values of variables before the bug showed up. In addition to basic messages, it’s sometimes helpful to include the following information in debug traces:

  • The thread ID, if it’s a multithreaded program
  • The name of the function that generates the trace
  • The name of the source file in which the code that generates the trace lives

You can add this tracing to your program through a special debug mode, or via a ring buffer. Both of these methods are explained in detail in the following sections. Note that in multithreaded programs you have to make your trace logging thread-safe. See Chapter 23 for details on multithreaded programming.
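As a rough sketch of how the extra information listed above could be captured, the following helper prefixes each trace with the thread ID, the source file, and the function name. The names formatTrace(), writeTrace(), and TRACE are hypothetical, and writing to standard error behind a single mutex is just one simple way to keep lines from different threads apart.

```cpp
#include <iostream>
#include <mutex>
#include <sstream>
#include <string>
#include <thread>

// Hypothetical helper: formats one trace line with the thread ID, source
// file, and function name, followed by the streamed arguments.
template<typename... Args>
std::string formatTrace(const char* file, const char* function,
    const Args&... args)
{
    std::ostringstream os;
    os << "[thread " << std::this_thread::get_id() << "] "
       << file << ": " << function << "(): ";
    ((os << args), ...);
    return os.str();
}

// Writes a formatted trace line to standard error. The mutex prevents
// lines from different threads from interleaving.
inline void writeTrace(const std::string& line)
{
    static std::mutex traceMutex;
    std::lock_guard<std::mutex> lock(traceMutex);
    std::cerr << line << std::endl;
}

// __FILE__ and __func__ are expanded at the call site, so each trace
// identifies the code that produced it. Requires at least one argument.
#define TRACE(...) writeTrace(formatTrace(__FILE__, __func__, __VA_ARGS__))
```

A call such as TRACE("value = ", value); then produces a single line that identifies the thread, file, and function that generated it.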

Debug Mode

The first technique to add debug traces is to provide a debug mode for your program. In debug mode, the program writes trace output to standard error or to a file, and perhaps does extra checking during execution. There are several ways to add a debug mode to your program. Note that all these examples are writing traces in text format.

Start-Time Debug Mode

Start-time debug mode allows your application to run with or without debug mode depending on a command-line argument. This strategy includes the debug code in the “release” binary, and allows debug mode to be enabled at a customer site. However, it does require users to restart the program in order to run it in debug mode, which may prevent you from obtaining useful information about certain bugs.

The following example is a simple program implementing a start-time debug mode. This program doesn’t do anything useful; it is only for demonstrating the technique.

All logging functionality is wrapped in a Logger class. This class has two static data members: the name of the log file, and a Boolean saying whether logging is enabled or disabled. The class has a static public log() variadic template method. Variadic templates are discussed in Chapter 22. Note that the log file is opened, flushed, and closed on each call to log(). This might lower performance a bit; however, it does guarantee correct logging, which is more important.

class Logger
{
    public:
        static void enableLogging(bool enable) { msLoggingEnabled = enable; }
        static bool isLoggingEnabled() { return msLoggingEnabled; }

        template<typename... Args>
        static void log(const Args&... args)
        {
            if (!msLoggingEnabled)
                return;

            ofstream logfile(msDebugFileName, ios_base::app);
            if (logfile.fail()) {
                cerr << "Unable to open debug file!" << endl;
                return;
            }
            // Use a C++17 unary right fold, see Chapter 22.
            ((logfile << args), ...);
            logfile << endl;
        }
    private:
        static const string msDebugFileName;
        static bool msLoggingEnabled;
};

const string Logger::msDebugFileName = "debugfile.out";
bool Logger::msLoggingEnabled = false;

The following helper macro is defined to make it easy to log something. It uses __func__, a predefined variable defined by the C++ standard that contains the name of the current function.

#define log(...) Logger::log(__func__, "(): ", __VA_ARGS__)

This macro replaces every call to log() in your code with a call to Logger::log(). The macro automatically includes the function name as first argument to Logger::log(). For example, suppose you call the macro as follows:

log("The value is: ", value);

The log() macro replaces this with the following:

Logger::log(__func__, "(): ", "The value is: ", value);

Start-time debug mode needs to parse the command-line arguments to find out whether or not it should enable debug mode. Unfortunately, there is no standard functionality in C++ for parsing command-line arguments. This program uses a simple isDebugSet() function to check for the debug flag among all the command-line arguments, but a function to parse all command-line arguments would need to be more sophisticated.

bool isDebugSet(int argc, char* argv[])
{
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "-d") == 0) {
            return true;
        }
    }
    return false;
}

Some arbitrary test code is used to exercise the debug mode in this example. Two classes are defined, ComplicatedClass and UserCommand. Both classes define an operator<< to write instances of them to a stream. The Logger class uses this operator to dump objects to the log file.

class ComplicatedClass { /* … */ };
ostream& operator<<(ostream& ostr, const ComplicatedClass& src)
{
    ostr << "ComplicatedClass";
    return ostr;
}

class UserCommand { /* … */ };
ostream& operator<<(ostream& ostr, const UserCommand& src)
{
    ostr << "UserCommand";
    return ostr;
}

Here is some test code with a number of log calls:

UserCommand getNextCommand(ComplicatedClass* obj)
{
    UserCommand cmd;
    return cmd;
}

void processUserCommand(UserCommand& cmd)
{
    // details omitted for brevity
}

void trickyFunction(ComplicatedClass* obj)
{
    log("given argument: ", *obj);

    for (size_t i = 0; i < 100; ++i) {
        UserCommand cmd = getNextCommand(obj);
        log("retrieved cmd ", i, ": ", cmd);

        try {
            processUserCommand(cmd);
        } catch (const exception& e) {
            log("exception from processUserCommand(): ", e.what());
        }
    }
}

int main(int argc, char* argv[])
{
    Logger::enableLogging(isDebugSet(argc, argv));

    if (Logger::isLoggingEnabled()) {
        // Print the command-line arguments to the trace
        for (int i = 0; i < argc; i++) {
            log(argv[i]);
        }
    }

    ComplicatedClass obj;
    trickyFunction(&obj);

    // Rest of the function not shown
    return 0;
}

There are two ways to run this application:

> STDebug
> STDebug -d

Debug mode is activated only when the -d argument is specified on the command line.

Compile-Time Debug Mode

Instead of enabling or disabling debug mode through a command-line argument, you could also use a preprocessor symbol such as DEBUG_MODE and #ifdefs to selectively compile the debug code into your program. In order to generate a debug version of this program, you would have to compile it with the symbol DEBUG_MODE defined. Your compiler should allow you to define symbols during compilation; consult your compiler’s documentation for details. For example, GCC allows you to specify -Dsymbol on the command line. Microsoft VC++ allows you to specify the symbols through the Visual Studio IDE, or by specifying /D symbol if you use the VC++ command-line tools. Visual C++ automatically defines the _DEBUG symbol for debug builds. However, that symbol is Visual C++ specific, so the example in this section uses a custom symbol called DEBUG_MODE.

The advantage of this method is that your debug code is not compiled into the “release” binary, and so does not increase its size. The disadvantage is that there is no way to enable debugging at a customer site for testing or following the discovery of a bug.

An example implementation is given in CTDebug.cpp in the downloadable source code archive. One important remark on this implementation is that it contains the following definition for the log() macro:

#ifdef DEBUG_MODE
    #define log(...) Logger::log(__func__, "(): ", __VA_ARGS__)
#else
    #define log(...)
#endif

That is, if DEBUG_MODE is not defined, all calls to log() are replaced with nothing; they become no-ops.
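One consequence of a macro that expands to nothing is that its argument expressions disappear as well, so they are never evaluated in a non-debug build. The following sketch illustrates this with a hypothetical debugLog() macro and printAll() helper (renamed stand-ins so that the example is self-contained); it is a demonstration of the pitfall, not part of CTDebug.cpp.

```cpp
#include <iostream>

// Hypothetical stand-in for the log() macro, renamed so that this
// sketch is self-contained.
#ifdef DEBUG_MODE
    #define debugLog(...) printAll(__VA_ARGS__)
#else
    #define debugLog(...)
#endif

// Streams all arguments to standard error, followed by a newline.
template<typename... Args>
void printAll(const Args&... args)
{
    ((std::cerr << args), ...);
    std::cerr << std::endl;
}

// Returns how many times nextId() was called by the logging statement.
inline int countLoggedCalls()
{
    int calls = 0;
    auto nextId = [&calls] { return ++calls; };
    debugLog("id: ", nextId());   // Disappears entirely without DEBUG_MODE.
    (void)nextId;                 // Avoids an unused-variable warning.
    return calls;                 // 1 with DEBUG_MODE defined, 0 without.
}
```

For this reason, it is safest to pass only side-effect-free expressions to such a macro; otherwise, the debug and release builds can behave differently.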

Run-Time Debug Mode

The most flexible way to provide a debug mode is to allow it to be enabled or disabled at run time. One way to provide this feature is to supply an asynchronous interface that controls debug mode on the fly. This interface could be an asynchronous command that makes an interprocess call into the application (for example, using sockets, signals, or remote procedure calls). This interface could also take the form of a menu command in the user interface. C++ provides no standard way to perform interprocess communication, so an example of this technique is not shown.

Ring Buffers

Debug mode is useful for debugging reproducible problems and for running tests. However, bugs often appear when the program is running in non-debug mode, and by the time you or the customer enables debug mode, it is too late to gain any information about the bug. One solution to this problem is to enable tracing in your program at all times. You usually need only the most recent traces to debug a program, so you should store only the most recent traces at any point in a program’s execution. One way to provide for this is through careful use of log file rotations.

However, for performance reasons, it is better that your program doesn’t log these traces continuously to disk. Instead, it should store them in memory and provide a mechanism to dump all the trace messages to standard error or to a log file if the need arises.

A common technique is to use a ring buffer, also known as a circular buffer, to store a fixed number of messages, or messages in a fixed amount of memory. When the buffer fills up, it starts writing messages at the beginning of the buffer again, overwriting the older messages. This cycle can repeat indefinitely. The following sections provide an implementation of a ring buffer and show you how you can use it in your programs.

Ring Buffer Interface

The following RingBuffer class provides a simple debug ring buffer. The client specifies the number of entries in the constructor, and adds messages with the addEntry() method. Once the number of entries exceeds the number allowed, new entries overwrite the oldest entries in the buffer. The buffer also provides the option to output entries to a stream as they are added to the buffer. The client can specify an output stream in the constructor, and can reset it with the setOutput() method. Finally, the operator<< streams the entire buffer to an output stream. This implementation uses a variadic template method. Variadic templates are discussed in Chapter 22.

class RingBuffer
{
    public:
        // Constructs a ring buffer with space for numEntries.
        // Entries are written to *ostr as they are queued (optional).
        explicit RingBuffer(size_t numEntries = kDefaultNumEntries,
            std::ostream* ostr = nullptr);
        virtual ~RingBuffer() = default;

        // Adds an entry to the ring buffer, possibly overwriting the
        // oldest entry in the buffer (if the buffer is full).
        template<typename... Args>
        void addEntry(const Args&... args)
        {
            std::ostringstream os;
            // Use a C++17 unary right fold, see Chapter 22.
            ((os << args), ...);
            addStringEntry(os.str());
        }

        // Streams the buffer entries, separated by newlines, to ostr.
        friend std::ostream& operator<<(std::ostream& ostr, RingBuffer& rb);

        // Streams entries as they are added to the given stream.
        // Specify nullptr to disable this feature.
        // Returns the old output stream.
        std::ostream* setOutput(std::ostream* newOstr);

    private:
        std::vector<std::string> mEntries;
        std::vector<std::string>::iterator mNext;

        std::ostream* mOstr;
        bool mWrapped;

        static const size_t kDefaultNumEntries = 500;

        void addStringEntry(std::string&& entry);
};

Ring Buffer Implementation

This implementation of the ring buffer stores a fixed number of string objects. This approach certainly is not the most efficient solution. Another possibility would be to provide a fixed number of bytes of memory for the buffer. However, this implementation should be sufficient unless you’re writing a high-performance application.

For multithreaded programs, it’s useful to add the ID of the thread and a timestamp to each trace entry. Of course, the ring buffer has to be made thread-safe before using it in a multithreaded application. See Chapter 23 for multithreaded programming.

Here are the implementations:

// Initialize the vector to hold exactly numEntries. The vector size
// does not need to change during the lifetime of the object.
// Initialize the other members.
RingBuffer::RingBuffer(size_t numEntries, ostream* ostr)
    : mEntries(numEntries), mOstr(ostr), mWrapped(false)
{
    if (numEntries == 0)
        throw invalid_argument("Number of entries must be > 0.");
    mNext = begin(mEntries);
}

// The addStringEntry algorithm is pretty simple: add the entry to the next
// free spot, then reset mNext to indicate the free spot after
// that. If mNext reaches the end of the vector, it starts over at 0.
//
// The buffer needs to know if the buffer has wrapped or not so
// that it knows whether to print the entries past mNext in operator<<.
void RingBuffer::addStringEntry(string&& entry)
{
    // If there is a valid ostream, write this entry to it.
    if (mOstr) {
        *mOstr << entry << endl;
    }

    // Move the entry to the next free spot and increment
    // mNext to point to the free spot after that.
    *mNext = std::move(entry);
    ++mNext;

    // Check if we've reached the end of the buffer. If so, we need to wrap.
    if (mNext == end(mEntries)) {
        mNext = begin(mEntries);
        mWrapped = true;
    }
}

// Set the output stream.
ostream* RingBuffer::setOutput(ostream* newOstr)
{
    return std::exchange(mOstr, newOstr);
}

// operator<< uses an ostream_iterator to "copy" entries directly
// from the vector to the output stream.
//
// operator<< must print the entries in order. If the buffer has wrapped,
// the earliest entry is one past the most recent entry, which is the entry
// indicated by mNext. So, first print from entry mNext to the end.
//
// Then (even if the buffer hasn't wrapped) print from beginning to mNext-1.
ostream& operator<<(ostream& ostr, RingBuffer& rb)
{
    if (rb.mWrapped) {
        // If the buffer has wrapped, print the elements from
        // the earliest entry to the end.
        copy(rb.mNext, end(rb.mEntries), ostream_iterator<string>(ostr, "\n"));
    }

    // Now, print up to the most recent entry.
    // Go up to mNext because the range is not inclusive on the right side.
    copy(begin(rb.mEntries), rb.mNext, ostream_iterator<string>(ostr, "\n"));

    return ostr;
}

Using the Ring Buffer

In order to use the ring buffer, you can create an instance of it and start adding messages to it. When you want to print the buffer, just use operator<< to print it to the appropriate ostream. Here is the earlier start-time debug mode program modified to use a ring buffer instead. The definitions of the ComplicatedClass and UserCommand classes, and the functions getNextCommand(), processUserCommand(), and trickyFunction() are not shown. They are exactly the same as before.

RingBuffer debugBuf;

#define log(...) debugBuf.addEntry(__func__, "(): ", __VA_ARGS__)

int main(int argc, char* argv[])
{
    // Log the command-line arguments
    for (int i = 0; i < argc; i++) {
        log(argv[i]);
    }

    ComplicatedClass obj;
    trickyFunction(&obj);

    // Print the current contents of the debug buffer to cout
    cout << debugBuf;

    return 0;
}

Displaying the Ring Buffer Contents

Storing trace debug messages in memory is a great start, but in order for them to be useful, you need a way to access these traces for debugging.

Your program should provide a “hook” to tell it to export the messages. This hook could be similar to the interface you would use to enable debugging at run time. Additionally, if your program encounters a fatal error that causes it to exit, it could export the ring buffer automatically to a log file before exiting.

Another way to retrieve these messages is to obtain a memory dump of the program. Each platform handles memory dumps differently, so you should consult a reference or expert for your platform.

Assertions

The <cassert> header defines an assert macro. It takes a Boolean expression and, if the expression evaluates to false, prints an error message and terminates the program. If the expression evaluates to true, it does nothing.

Assertions allow you to “force” your program to exhibit a bug at the exact point where that bug originates. If you didn’t assert at that point, your program might proceed with those incorrect values, and the bug might not show up until much later. Thus, assertions allow you to detect bugs earlier than you otherwise would.

You could use assertions in your code whenever you are “assuming” something about the state of your variables. For example, if you call a library function that is supposed to return a pointer and claims never to return nullptr, throw in an assert after the function call to make sure that the pointer isn’t nullptr.

Note that you should assume as little as possible. For example, if you are writing a library function, don’t assert that the parameters are valid. Instead, check the parameters, and return an error code or throw an exception if they are invalid.

As a rule, assertions should only be used for cases that are truly problematic, and should therefore never be ignored when they occur during development. If you hit an assertion during development, fix it, don’t just disable the assertion.
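The difference between asserting an assumption and checking a parameter can be sketched as follows. The functions findDefaultName(), defaultNameLength(), and average() are hypothetical and exist only for this illustration. Keep in mind that the standard assert macro is disabled when NDEBUG is defined, which is typically the case for release builds, so an assertion must never be the only guard against invalid external input.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical internal helper whose documentation promises that it
// never returns nullptr.
inline const std::string* findDefaultName()
{
    static const std::string defaultName = "default";
    return &defaultName;
}

// Internal code: the non-null result is an assumption, so assert it.
inline std::size_t defaultNameLength()
{
    const std::string* name = findDefaultName();
    assert(name != nullptr && "findDefaultName() must not return nullptr");
    return name->size();
}

// Library-style entry point: invalid arguments are an expected error
// condition, so check them and throw instead of asserting.
inline double average(const double* values, std::size_t count)
{
    if (values == nullptr || count == 0) {
        throw std::invalid_argument("average() requires at least one value.");
    }
    double sum = 0.0;
    for (std::size_t i = 0; i < count; ++i) {
        sum += values[i];
    }
    return sum / static_cast<double>(count);
}
```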

Crash Dumps

Make sure your programs create crash dumps, also called memory dumps, core dumps, and so on. A crash dump is a dump file that is created when your application crashes. It contains information about which threads were running at the time of the crash, a call stack of all the threads, and so on. How you create such dumps is platform dependent, so you should consult the documentation of your platform, or use a third-party library that takes care of it for you. Breakpad is an example of such an open-source cross-platform library that can write and process crash dumps.

Also make sure you set up a symbol server and a source code server. The symbol server is used to store debugging symbols of released binaries of your software. These symbols are used later on to interpret crash dumps received from customers. The source code server, discussed in Chapter 24, stores all revisions of your source code. When debugging crash dumps, this source code server is used to download the correct source code for the revision of your software that created the crash dump.

The exact procedure of analyzing crash dumps depends on your platform and compiler, so consult their documentation.

From my personal experience, I have found that a crash dump is often worth more than a thousand bug reports.

STATIC ASSERTIONS

The assertions discussed earlier in this chapter are evaluated at run time. static_assert() lets you write assertions that are evaluated at compile time. A call to static_assert() accepts two parameters: an expression to evaluate at compile time and a string. When the expression evaluates to false, the compiler issues an error that contains the given string. A simple example could be to check that you are compiling with a 64-bit compiler:

static_assert(sizeof(void*) == 8, "Requires 64-bit compilation.");

If you compile this with a 32-bit compiler where a pointer is four bytes, the compiler issues an error that can look like this:

test.cpp(3): error C2338: Requires 64-bit compilation.

Since C++17, the string parameter is optional, as in this example:

static_assert(sizeof(void*) == 8);

In this case, if the expression evaluates to false, you get a compiler-dependent error message. For example, Microsoft Visual C++ 2017 gives the following error:

test.cpp(3): error C2607: static assertion failed

Another example where static_asserts are pretty powerful is in combination with type traits, which are discussed in Chapter 22. For example, if you write a function template or class template, you could use static_asserts together with type traits to issue compiler errors when template types don’t satisfy certain conditions. The following example requires that the template type for the process() function template has Base1 as a base class:

class Base1 {};
class Base1Child : public Base1 {};

class Base2 {};
class Base2Child : public Base2 {};

template<typename T>
void process(const T& t)
{
    static_assert(is_base_of_v<Base1, T>, "Base1 should be a base for T.");
}

int main()
{
    process(Base1());
    process(Base1Child());
    //process(Base2());      // Error
    //process(Base2Child()); // Error
}

If you try to call process() with an instance of Base2 or Base2Child, the compiler issues an error that could look like this:

test.cpp(13): error C2338: Base1 should be a base for T.
    test.cpp(21) : see reference to function template
    instantiation 'void process<Base2>(const T &)' being compiled
    with
    [
        T=Base2
    ]

DEBUGGING TECHNIQUES

Debugging a program can be incredibly frustrating. However, with a systematic approach it becomes significantly easier. Your first step in trying to debug a program should always be to reproduce the bug. Depending on whether or not you can reproduce the bug, your subsequent approach will differ. The next four sections explain how to reproduce bugs, how to debug reproducible bugs, how to debug nonreproducible bugs, and how to debug regressions. Additional sections explain details about debugging memory errors and debugging multithreaded programs.

Reproducing Bugs

If you can reproduce the bug consistently, it will be much easier to determine the root cause. Finding the root cause of bugs that are not reproducible is difficult, if not impossible.

As a first step to reproduce the bug, run the program with exactly the same inputs as the run when the bug first appeared. Be sure to include all inputs, from the program’s startup to the time of the bug’s appearance. A common mistake is to attempt to reproduce the bug by performing only the triggering action. This technique may not reproduce the bug because the bug might be caused by an entire sequence of actions.

For example, if your web browser program dies when you request a certain web page, it may be due to memory corruption triggered by that particular request’s network address. On the other hand, it may be because your program records all requests in a queue, with space for one million entries, and this entry was number one million and one. Starting the program over and sending one request certainly wouldn’t trigger the bug in that case.

Sometimes it is impossible to emulate the entire sequence of events that leads to the bug. Perhaps the bug was reported by someone who can’t remember everything that she did. Alternatively, maybe the program was running for too long to emulate every input. In that case, do your best to reproduce the bug. It takes some guesswork, and can be time-consuming, but effort at this point will save time later in the debugging process. Here are some techniques you can try:

  • Repeat the triggering action in the correct environment and with as many inputs as possible similar to the initial report.
  • Do a quick review of the code related to the bug. More often than not, you’ll find a likely cause that will guide you in reproducing the problem.
  • Run automated tests that exercise similar functionality. Reproducing bugs is one benefit of automated tests. If it takes 24 hours of testing before the bug shows up, it’s preferable to let those tests run on their own rather than spend 24 hours of your time trying to reproduce the bug.
  • If you have the necessary hardware available, running slight variations of tests concurrently on different machines can sometimes save time.
  • Run stress tests that exercise similar functionality. If your program is a web server that died on a particular request, try running as many browsers as possible simultaneously that make that request.

After you are able to reproduce the bug consistently, you should attempt to find the smallest sequence that triggers the bug. You can start with the minimum sequence, containing only the triggering action, and slowly expand the sequence to cover the entire sequence from startup until the bug is triggered. This will result in the simplest and most efficient test case to reproduce it, which makes it simpler to find the root cause of the problem, and easier to verify the fix.

Debugging Reproducible Bugs

When you can reproduce a bug consistently and efficiently, it’s time to figure out the problem in the code that causes the bug. Your goal at this point is to find the exact lines of code that trigger the problem. You can use two different strategies.

  1. Logging debug messages: By adding enough debug messages to your program and watching its output when you reproduce the bug, you should be able to pinpoint the exact lines of code where the bug occurs. If you have a debugger at your disposal, adding debug messages is usually not recommended because it requires modifications to the program and can be time-consuming. However, if you have already instrumented your program with debug messages as described earlier, you might be able to find the root cause of your bug by running your program in debug mode while reproducing the bug. Note that bugs sometimes disappear simply when you enable logging because the act of enabling logging can slightly change the timings of your application.
  2. Using a debugger: Debuggers allow you to step through the execution of your program and to view the state of memory and the values of variables at various points. They are often indispensable tools for finding the root cause of bugs. When you have access to the source code, you will use a symbolic debugger: a debugger that utilizes the variable names, class names, and other symbols in your code. In order to use a symbolic debugger, you must instruct your compiler to generate debug symbols. Check the documentation of your compiler for details on how to enable symbol generation.

The debugging example at the end of this chapter demonstrates both these approaches.
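As a minimal sketch of the first strategy, the following code sends debug messages to a stream while a function runs. The debugLog() helper and the average() function are hypothetical names used only for illustration; they are not part of the logging facilities described earlier in this chapter.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string_view>

// Hypothetical helper: writes one debug line to the given stream,
// prefixed with the name of the function being traced.
template <typename... Args>
void debugLog(std::ostream& out, std::string_view where, const Args&... args)
{
    out << "[debug] " << where << ": ";
    (out << ... << args);   // C++17 fold expression: streams each argument in order
    out << '\n';
}

// Hypothetical function under investigation: computes an integer average
// and logs its intermediate state on every iteration.
int average(const int* values, int count, std::ostream& log)
{
    assert(values != nullptr && count > 0);
    debugLog(log, "average", "count=", count);
    int sum = 0;
    for (int i = 0; i < count; ++i) {
        sum += values[i];
        debugLog(log, "average", "i=", i, " sum=", sum);
    }
    return sum / count;
}
```

Passing the stream as a parameter is one possible design choice: it allows the messages to go to std::cout, to a file, or to a std::ostringstream that a test can inspect afterward.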

Debugging Nonreproducible Bugs

Fixing bugs that are not reproducible is significantly more difficult than fixing reproducible bugs. You often have very little information and must employ a lot of guesswork. Nevertheless, a few strategies can aid you.

  1. Try to turn a nonreproducible bug into a reproducible bug. By using educated guesses, you can often determine approximately where the bug lies. It’s worthwhile to spend some time trying to reproduce the bug. Once you have a reproducible bug, you can figure out its root cause by using the techniques described earlier.
  2. Analyze error logs. This is easy to do if you have instrumented your program with error log generation, as described earlier. You should sift through this information because any errors that were logged directly before the bug occurred are likely to have contributed to the bug itself. If you’re lucky (or if you coded your program well), your program will have logged the exact reason for the bug at hand.
  3. Obtain and analyze traces. Again, this is easy to do if you have instrumented your program with tracing output, for example, via a ring buffer as described earlier. At the time of the bug’s occurrence, you hopefully obtained a copy of the traces. These traces should lead you right to the location of the bug in your code.
  4. Examine a crash/memory dump file, if it exists. Some platforms generate memory dump files of applications that terminate abnormally. On Unix and Linux, these memory dumps are called core files. Each platform provides tools for analyzing these memory dumps. They can, for example, be used to generate a stack trace of the application, or to view the contents of its memory before the application died.
  5. Inspect the code. Unfortunately, this is often the only strategy to determine the cause of a nonreproducible bug. Surprisingly, it often works. When you examine code, even code that you wrote yourself, with the perspective of the bug that just occurred, you can often find mistakes that you overlooked previously. I don’t recommend spending hours staring at your code, but tracing through the code path manually can often lead you directly to the problem.
  6. Use a memory-watching tool, such as one of those described in the section “Debugging Memory Problems,” later in this chapter. Such tools often alert you to memory errors that don’t always cause your program to misbehave, but could potentially be the cause of the bug in question.
  7. File or update a bug report. Even if you can’t find the root cause of the bug right away, the report will be a useful record of your attempts if the problem is encountered again.
  8. If you are unable to find the root cause of the bug, be sure to add extra logging or tracing, so that you will have a better chance next time the bug occurs.

Once you have found the root cause of a nonreproducible bug, you should create a reproducible test case and move it to the “reproducible bugs” category. It is important to be able to reproduce a bug before you actually fix it. Otherwise, how will you test the fix? A common mistake when debugging nonreproducible bugs is to fix the wrong problem in the code. Because you can’t reproduce the bug, you don’t know if you’ve really fixed it, so you shouldn’t be surprised when it shows up again a month later.
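To make strategy 3 more concrete, here is a minimal sketch of a trace buffer that keeps only the most recent messages. The TraceBuffer class and its member names are assumptions made for this illustration; the ring buffer presented earlier in the chapter may be organized differently.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal trace buffer: retains the last `capacity` messages.
class TraceBuffer
{
    public:
        explicit TraceBuffer(std::size_t capacity) : mEntries(capacity)
        {
            assert(capacity > 0);
        }

        // Stores a message, overwriting the oldest one once the buffer is full.
        void add(std::string message)
        {
            mEntries[mNext] = std::move(message);
            mNext = (mNext + 1) % mEntries.size();
            if (mCount < mEntries.size()) {
                ++mCount;
            }
        }

        // Returns the retained messages, oldest first.
        std::vector<std::string> snapshot() const
        {
            std::vector<std::string> result;
            // Until the buffer wraps, the oldest entry is at index 0;
            // afterward it is the entry that would be overwritten next.
            std::size_t start = (mCount < mEntries.size()) ? 0 : mNext;
            for (std::size_t i = 0; i < mCount; ++i) {
                result.push_back(mEntries[(start + i) % mEntries.size()]);
            }
            return result;
        }
    private:
        std::vector<std::string> mEntries;
        std::size_t mNext = 0;
        std::size_t mCount = 0;
};
```

Because old entries are overwritten, memory use stays bounded no matter how long the program runs, which is what makes it practical to leave this kind of tracing enabled while waiting for a rare bug to occur again.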

Debugging Regressions

If a feature contains a regression bug, it means that the feature used to work correctly, but stopped working due to the introduction of a bug.

A useful debugging technique for investigating regressions is to look at the change log of relevant files. If you know at what time the feature was still working, look at all the change logs since that time. You might notice something suspicious that could lead you to the root cause.

Another approach that can save you a lot of time when debugging regressions is to perform a binary search over older versions of the software to figure out when it started to go wrong. You can use binaries of older versions if you keep them, or revert the source code to an older revision. Once you know when it started to go wrong, inspect the change logs to see what changed at that time. This mechanism is only possible when you can reproduce the bug.
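The binary search over versions can be sketched as follows. The function name findFirstBadRevision() and the idea of numbering revisions with consecutive integers are assumptions made for this illustration; the isBad callback stands in for “build this revision and try to reproduce the bug.” Some version control systems automate this kind of search, for example, git bisect.

```cpp
#include <cassert>
#include <functional>

// Returns the first revision in the range (lastGood, firstBad] that exhibits
// the bug. Assumes that revision lastGood is known to work, that revision
// firstBad is known to fail, and that once the bug appears it is present in
// every later revision.
int findFirstBadRevision(int lastGood, int firstBad,
    const std::function<bool(int)>& isBad)
{
    assert(lastGood < firstBad);
    while (firstBad - lastGood > 1) {
        int middle = lastGood + (firstBad - lastGood) / 2;
        if (isBad(middle)) {
            firstBad = middle;   // The bug was introduced at or before middle.
        } else {
            lastGood = middle;   // The bug was introduced after middle.
        }
    }
    return firstBad;
}
```

With 100 revisions between the last known good version and the first known bad one, this search needs at most 7 reproduction attempts instead of up to 99.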

Debugging Memory Problems

Most catastrophic bugs, such as application death, are caused by memory errors. Many noncatastrophic bugs are triggered by memory errors as well. Some memory bugs are obvious. If your program attempts to dereference a null pointer, the default action on most platforms is to terminate the program. However, nearly every platform enables you to respond to catastrophic errors and take remedial action. The amount of effort you devote to the response depends on the importance of this kind of recovery to your end users. For example, a text editor really needs to make a best attempt to save the modified buffers (possibly under a “recovered” name), while for other programs, users may find the default behavior acceptable, even if it is unpleasant.

Some memory bugs are more insidious. If you write past the end of an array in C++, your program will probably not crash at that point. However, if that array was on the stack, you may have written into a different variable or array, changing values that won’t show up until later in the program. Alternatively, if the array was on the heap, you could cause memory corruption in the heap, which will cause errors later when you attempt to allocate or free more memory dynamically.

Chapter 7 introduces some of the common memory errors from the perspective of what to avoid when you’re coding. This section discusses memory errors from the perspective of identifying problems in code that exhibits bugs. You should be familiar with the discussion in Chapter 7 before continuing with this section.

Categories of Memory Errors

In order to debug memory problems, you should be familiar with the types of errors that can occur. This section describes the major categories of memory errors. Each category lists different types of memory errors, including a small code example demonstrating each error, and a list of possible symptoms that you might observe. Note that a symptom is not the same thing as a bug: a symptom is an observable behavior caused by a bug.

Memory-Freeing Errors

The following table summarizes five major errors that involve freeing memory.

ERROR TYPE: Memory leak
SYMPTOMS:
  • Process memory usage grows over time.
  • Process runs more slowly over time.
  • Eventually, depending on the OS, operations and system calls fail because of lack of memory.
EXAMPLE:
    void memoryLeak()
    {
        int* p = new int[1000];
        return; // BUG! Not freeing p.
    }

ERROR TYPE: Using mismatched allocation and free operations
SYMPTOMS:
  • Does not usually cause a crash immediately.
  • This type of error can cause memory corruption on some platforms, which might show up as a crash later in the program.
  • Certain mismatches can also cause memory leaks.
EXAMPLE:
    void mismatchedFree()
    {
        int* p1 = (int*)malloc(sizeof(int));
        int* p2 = new int;
        int* p3 = new int[1000];
        delete p1; // BUG! Should use free()
        delete[] p2; // BUG! Should use delete
        free(p3); // BUG! Should use delete[]
    }

ERROR TYPE: Freeing memory more than once
SYMPTOMS:
  • Can cause a crash if the memory at that location has been handed out in another allocation between the two calls to delete.
EXAMPLE:
    void doubleFree()
    {
        int* p1 = new int[1000];
        delete[] p1;
        int* p2 = new int[1000];
        delete[] p1; // BUG! freeing p1 twice
    } // BUG! Leaking memory of p2

ERROR TYPE: Freeing unallocated memory
SYMPTOMS:
  • Will usually cause a crash.
EXAMPLE:
    void freeUnallocated()
    {
        int* p = reinterpret_cast<int*>(10000);
        delete p; // BUG! p not a valid pointer.
    }

ERROR TYPE: Freeing stack memory
SYMPTOMS:
  • Technically a special case of freeing unallocated memory. This will usually cause a crash.
EXAMPLE:
    void freeStack()
    {
        int x;
        int* p = &x;
        delete p; // BUG! Freeing stack memory
    }

The crashes mentioned in this table can have different manifestations depending on your platform, such as segmentation faults, bus errors, access violations, and so on.

As you can see, some of the errors do not cause immediate program termination. These bugs are more subtle, leading to problems later in the program’s execution.
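For contrast, here is a minimal sketch of how the leak, mismatch, and double-free errors are avoided when ownership is handed to smart pointers and Standard Library containers. The Tracked structure and the noLeakNoMismatch() function are hypothetical; Tracked exists only to count live objects so that the cleanup can be observed.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical helper type that counts how many instances are alive.
struct Tracked
{
    static inline int liveCount = 0;   // C++17 inline static data member
    Tracked() { ++liveCount; }
    Tracked(const Tracked&) { ++liveCount; }
    ~Tracked() { --liveCount; }
};

void noLeakNoMismatch()
{
    const int before = Tracked::liveCount;

    auto single = std::make_unique<Tracked>();       // Released with delete
    auto array = std::make_unique<Tracked[]>(10);    // Released with delete[]
    std::vector<Tracked> elements(5);                // Manages its own storage

    assert(Tracked::liveCount == before + 16);
}   // All 16 objects are destroyed here, each exactly once.
```

Because each owner selects the matching deallocation itself, there is no delete, delete[], or free() call to forget, mismatch, or repeat.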

Memory-Access Errors

Another category of memory errors involves the actual reading and writing of memory.

ERROR TYPE: Accessing invalid memory
SYMPTOMS:
  • Almost always causes the program to crash immediately.
EXAMPLE:
    void accessInvalid()
    {
        int* p = reinterpret_cast<int*>(10000);
        *p = 5; // BUG! p is not a valid pointer.
    }

ERROR TYPE: Accessing freed memory
SYMPTOMS:
  • Does not usually cause a program crash.
  • If the memory has been handed out in another allocation, this error type can cause “strange” values to appear unexpectedly.
EXAMPLE:
    void accessFreed()
    {
        int* p1 = new int;
        delete p1;
        int* p2 = new int;
        *p1 = 5; // BUG! The memory pointed to
                 // by p1 has been freed.
    }

ERROR TYPE: Accessing memory in a different allocation
SYMPTOMS:
  • Does not cause a crash.
  • This error type can cause “strange” and potentially dangerous values to appear unexpectedly.
EXAMPLE:
    void accessElsewhere()
    {
        int x, y[10], z;
        x = 0;
        z = 0;
        for (int i = 0; i <= 10; i++) {
            y[i] = 5; // BUG for i==10! element 10
                      // is past end of array.
        }
    }

ERROR TYPE: Reading uninitialized memory
SYMPTOMS:
  • Does not cause a crash, unless you use the uninitialized value as a pointer and dereference it (as in the example). Even then, it will not always cause a crash.
EXAMPLE:
    void readUninitialized()
    {
        int* p;
        cout << *p; // BUG! p is uninitialized
    }

Memory-access errors don’t always cause a crash. They can instead lead to subtle errors, in which the program does not terminate but instead produces erroneous results. Erroneous results can lead to serious consequences, for example, when external devices (such as robotic arms, X-ray machines, radiation treatments, life support systems, and so on) are being controlled by the computer.

Note that the symptoms discussed here for both memory-freeing errors and memory-access errors are the default symptoms for release builds of your program. Debug builds will most likely behave differently, and when you run the program inside a debugger, the debugger might break into the code when an error occurs.
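One way to turn a silent out-of-bounds write, such as the one in accessElsewhere(), into an immediately visible error is to use a Standard Library container with checked access. The following is a minimal sketch; the function name writeIsRejected() is an assumption made for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Attempts to write to the given index of a 10-element array.
// Returns true if the write was rejected as out of bounds.
bool writeIsRejected(std::size_t index)
{
    std::array<int, 10> y{};   // All elements value-initialized to 0
    try {
        y.at(index) = 5;       // at() throws std::out_of_range if index >= size()
        assert(index < y.size());   // Only reached for valid indices
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

Calling writeIsRejected(10) reports the error at the exact point of the bad access, instead of silently modifying a neighboring variable. Note that operator[] on std::array performs no such check, so this benefit applies only where at() is used.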

Tips for Debugging Memory Errors

Memory-related bugs often show up in slightly different places in the code each time you run the program. This is usually the case with heap memory corruption. Heap memory corruption is like a time bomb, ready to explode at some attempt to allocate, free, or use memory on the heap. So, when you see a bug that is reproducible, but that shows up in slightly different places, you should suspect memory corruption.

If you suspect a memory bug, your best option is to use a memory-checking tool for C++. Debuggers often provide options to run the program while checking for memory errors. For example, if you run a debug build of your application in the Microsoft Visual C++ debugger, it will catch almost all types of errors discussed in the previous sections. Additionally, there are some excellent third-party tools such as Purify from Rational Software (now owned by IBM) and Valgrind for Linux (discussed in Chapter 7). Microsoft also provides a free download called Application Verifier, which can be used with release builds of your applications in a Windows environment. It is a run-time verification tool to help you find subtle programming errors like the previously discussed memory errors. These debuggers and tools work by interposing their own memory-allocation and -freeing routines in order to check for any misuse of dynamic memory, such as freeing unallocated memory, dereferencing unallocated memory, or writing off the end of an array.

If you don’t have a memory-checking tool at your disposal, and the normal strategies for debugging are not helping, you may need to resort to code inspection. First, narrow down the part of the code containing the bug. Then, as a general rule, look at all raw pointers. Provided that you work on moderate- to good-quality code, most pointers should already be wrapped in smart pointers. If you do encounter raw pointers, take a closer look at how they are used, because they might be the cause of the error. Here are some more items to look for in your code.

Object and Class-Related Errors
  • Verify that your classes with dynamically allocated memory have destructors that free exactly the memory that’s allocated in the object: no more, and no less.
  • Ensure that your classes handle copying and assignment correctly with copy constructors and assignment operators, as described in Chapter 9. Make sure move constructors and move assignment operators properly set pointers in the source object to nullptr so that their destructors don’t try to free that memory.
  • Check for suspicious casts. If you are casting a pointer to an object from one type to another, make sure that it’s valid. When possible, use dynamic_casts.
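The rule about move operations can be sketched as follows. The Buffer class is hypothetical and owns a raw array only to make the pointer handling visible; in real code, a std::vector or a smart pointer would make these hand-written operations unnecessary.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical class that owns a dynamically allocated array.
class Buffer
{
    public:
        explicit Buffer(std::size_t size)
            : mData(new int[size]{}), mSize(size) {}
        ~Buffer() { delete[] mData; }   // delete[] on nullptr is a no-op

        Buffer(const Buffer&) = delete;
        Buffer& operator=(const Buffer&) = delete;

        Buffer(Buffer&& src) noexcept
            : mData(src.mData), mSize(src.mSize)
        {
            // Without these two lines, both destructors would free the array.
            src.mData = nullptr;
            src.mSize = 0;
        }

        Buffer& operator=(Buffer&& rhs) noexcept
        {
            if (this != &rhs) {
                delete[] mData;         // Free the memory currently owned
                mData = rhs.mData;
                mSize = rhs.mSize;
                rhs.mData = nullptr;    // Source no longer owns the array
                rhs.mSize = 0;
            }
            return *this;
        }

        int& at(std::size_t i) { assert(i < mSize); return mData[i]; }
        const int* data() const { return mData; }
        std::size_t size() const { return mSize; }
    private:
        int* mData;
        std::size_t mSize;
};
```

When you inspect a class like this during debugging, check that every path that transfers mData to another object also resets it in the source object.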
General Memory Errors
  • Make sure that every call to new is matched with exactly one call to delete. Similarly, every block of memory obtained from malloc, calloc, or realloc should be released with exactly one call to free, and every call to new[] should be matched with one call to delete[]. To avoid freeing memory multiple times or using freed memory, it’s recommended to set your pointer to nullptr after freeing its memory. Of course, the best solution is to avoid using raw pointers to handle ownership of resources, and instead use smart pointers.
  • Check for buffer overruns. Whenever you iterate over an array or write into or read from a C-style string, verify that you are not accessing memory past the end of the array or string. These problems can often be avoided by using Standard Library containers and strings.
  • Check for dereferencing of invalid pointers.
  • When declaring a pointer on the stack, make sure you always initialize it as part of its declaration. For example, use T* p = nullptr; or T* p = new T; but never T* p;. Better yet, use smart pointers!
  • Similarly, make sure your classes always initialize pointer data members with in-class initializers or in their constructors, by either allocating memory in the constructor or setting the pointers to nullptr. Also here, the best solution is to use smart pointers.

Debugging Multithreaded Programs

C++ includes a threading support library that provides mechanisms for threading and synchronization between threads. This threading support library is discussed in Chapter 23. Multithreaded C++ programs are common, so it is important to think about the special issues involved in debugging a multithreaded program. Bugs in multithreaded programs are often caused by variations in timings in the operating system scheduling, and can be difficult to reproduce. Thus, debugging multithreaded programs requires a special set of techniques.

  1. Use a debugger: A debugger makes it relatively easy to diagnose certain multithreaded problems, for example, deadlocks. When the deadlock appears, break into the debugger and inspect the different threads. You will be able to see which threads are blocked and on which line in the code they are blocked. Combining this with trace logs that show you how you came into the deadlock situation should be enough to fix deadlocks.
  2. Use log-based debugging: When debugging multithreaded programs, log-based debugging can sometimes be more effective than using a debugger to debug certain problems. You can add log statements to your program before and after critical sections, and before acquiring and after releasing locks. Log-based debugging is extremely useful in investigating race conditions. However, the act of adding log statements slightly changes run-time timings, which might hide the bug.
  3. Insert forced sleeps and context switches: If you are having trouble consistently reproducing the problem, or you have a hunch about the root cause but want to verify it, you can force certain thread-scheduling behavior by making your threads sleep for specific amounts of time. The <thread> header defines sleep_until() and sleep_for() in the std::this_thread namespace, which you can use to sleep. The time to sleep is specified as a std::chrono::time_point or a std::chrono::duration, respectively, both part of the chrono library discussed in Chapter 20. Sleeping for several seconds right before releasing a lock, immediately before signaling a condition variable, or directly before accessing shared data can reveal race conditions that would otherwise go undetected. If this debugging technique reveals the root cause, it must be fixed, so that it works correctly after removing these forced sleeps and context switches. Never leave these forced sleeps and context switches in your code! That would be the wrong “fix” for the problem.
  4. Perform code review: Reviewing your thread synchronization code often helps in fixing race conditions. Try to prove over and over that what happened is not possible, until you see how it is. It doesn’t hurt to write down these “proofs” in code comments. Also, ask a coworker to do pair debugging; she might see something you are overlooking.
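The forced-sleep technique from item 3 can be sketched as follows. Both function names are hypothetical. The first function contains a deliberate race: each thread reads a counter, sleeps, and then writes back the incremented value, so one update can be lost. The second function shows one possible fix, in which a mutex protects the entire read-modify-write sequence.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>

// BUGGY: the load and the store are each atomic, but the combined
// read-modify-write is not. The forced sleep widens the window in which
// the other thread can interleave, making a lost update far more likely.
int racyIncrements(std::chrono::milliseconds forcedSleep)
{
    assert(forcedSleep.count() >= 0);
    std::atomic<int> counter{0};
    auto worker = [&counter, forcedSleep] {
        int value = counter.load();
        std::this_thread::sleep_for(forcedSleep);   // For debugging only!
        counter.store(value + 1);
    };
    std::thread t1(worker);
    std::thread t2(worker);
    t1.join();
    t2.join();
    return counter.load();   // Usually 1 with a long sleep; 2 was intended.
}

// FIXED: the lock covers the whole read-modify-write sequence, so the result
// stays correct even with the forced sleep still in place.
int lockedIncrements(std::chrono::milliseconds forcedSleep)
{
    int counter = 0;
    std::mutex counterMutex;
    auto worker = [&counter, &counterMutex, forcedSleep] {
        std::lock_guard<std::mutex> lock(counterMutex);
        int value = counter;
        std::this_thread::sleep_for(forcedSleep);   // Remove after verifying.
        counter = value + 1;
    };
    std::thread t1(worker);
    std::thread t2(worker);
    t1.join();
    t2.join();
    return counter;
}
```

A fix that survives the forced sleep is good evidence that the synchronization is correct; once that is verified, the sleep_for() calls should be removed.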

Debugging Example: Article Citations

This section presents a buggy program and shows you the steps to take in order to debug it and fix the problem.

Suppose that you’re part of a team writing a web page that allows users to search for the research articles that cite a particular paper. This type of service is useful for authors who are trying to find work similar to their own. Once they find one paper representing a related work, they can look for every paper that cites that one to find other related work.

In this project, you are responsible for the code that reads the raw citation data from text files. For simplicity, assume that the citation information for each paper is found in its own file. Furthermore, assume that the first line of each file contains the author, title, and publication information for the paper; that the second line is always empty; and that all subsequent lines contain the citations from the article (one on each line). Here is an example file for one of the most important papers in computer science:

Alan Turing, "On Computable Numbers, with an Application to the Entscheidungsproblem", Proceedings of the London Mathematical Society, Series 2, Vol.42 (1936-37), 230-265.

Gödel, "Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme, I", Monatshefte Math. Phys., 38 (1931), 173-198.
Alonzo Church. "An unsolvable problem of elementary number theory", American J. of Math., 58 (1936), 345-363.
Alonzo Church. "A note on the Entscheidungsproblem", J. of Symbolic Logic, 1 (1936), 40-41.
E.W. Hobson, "Theory of functions of a real variable (2nd ed., 1921)", 87-88.

Buggy Implementation of an ArticleCitations Class

You may decide to structure your program by writing an ArticleCitations class that reads the file and stores the information. This class stores the article information from the first line in one string, and the citations in a C-style array of strings.

The ArticleCitations class definition looks like this:

class ArticleCitations
{
    public:
        ArticleCitations(std::string_view fileName);
        virtual ~ArticleCitations();
        ArticleCitations(const ArticleCitations& src);
        ArticleCitations& operator=(const ArticleCitations& rhs);

        std::string_view getArticle() const { return mArticle; }
        int getNumCitations() const { return mNumCitations; }
        std::string_view getCitation(int i) const { return mCitations[i]; }
    private:
        void readFile(std::string_view fileName);
        void copy(const ArticleCitations& src);

        std::string mArticle;
        std::string* mCitations;
        int mNumCitations;
};

The implementation is as follows. Keep in mind that this program is buggy! Don’t use it verbatim or as a model.

ArticleCitations::ArticleCitations(string_view fileName)
    : mCitations(nullptr), mNumCitations(0)
{
    // All we have to do is read the file.
    readFile(fileName);
}

ArticleCitations::ArticleCitations(const ArticleCitations& src)
{
    copy(src);
}

ArticleCitations& ArticleCitations::operator=(const ArticleCitations& rhs)
{
    // Check for self-assignment.
    if (this == &rhs) {
        return *this;
    }
    // Free the old memory.
    delete [] mCitations;
    // Copy the data
    copy(rhs);
    return *this;
}

void ArticleCitations::copy(const ArticleCitations& src)
{
    // Copy the article name, author, etc.
    mArticle = src.mArticle;
    // Copy the number of citations
    mNumCitations = src.mNumCitations;
    // Allocate an array of the correct size
    mCitations = new string[mNumCitations];
    // Copy each element of the array
    for (int i = 0; i < mNumCitations; i++) {
        mCitations[i] = src.mCitations[i];
    }
}

ArticleCitations::~ArticleCitations()
{
    delete [] mCitations;
}

void ArticleCitations::readFile(string_view fileName)
{
    // Open the file and check for failure.
    ifstream inputFile(fileName.data());
    if (inputFile.fail()) {
        throw invalid_argument("Unable to open file");
    }
    // Read the article author, title, etc. line.
    getline(inputFile, mArticle);

    // Skip the white space before the citations start.
    inputFile >> ws;

    int count = 0;
    // Save the current position so we can return to it.
    streampos citationsStart = inputFile.tellg();
    // First count the number of citations.
    while (!inputFile.eof()) {
        // Skip white space before the next entry.
        inputFile >> ws;
        string temp;
        getline(inputFile, temp);
        if (!temp.empty()) {
            count++;
        }
    }

    if (count != 0) {
        // Allocate an array of strings to store the citations.
        mCitations = new string[count];
        mNumCitations = count;
        // Seek back to the start of the citations.
        inputFile.seekg(citationsStart);
        // Read each citation and store it in the new array.
        for (count = 0; count < mNumCitations; count++) {
            string temp;
            getline(inputFile, temp);
            if (!temp.empty()) {
                mCitations[count] = temp;
            }
        }
    } else {
        mNumCitations = -1;
    }
}

Testing the ArticleCitations Class

The following program asks the user for a filename, constructs an ArticleCitations instance for that file, and passes this instance by value to the processCitations() function, which prints out all the information:

void processCitations(ArticleCitations cit)
{
    cout << cit.getArticle() << endl;
    int num = cit.getNumCitations();
    for (int i = 0; i < num; i++) {
        cout << cit.getCitation(i) << endl;
    }
}

int main()
{
    while (true) {
        cout << "Enter a file name (\"STOP\" to stop): ";
        string fileName;
        cin >> fileName;
        if (fileName == "STOP") {
            break;
        }

        ArticleCitations cit(fileName);
        processCitations(cit);
    }
    return 0;
}

You decide to test the program on the Alan Turing example (stored in a file called paper1.txt). Here is the output:

Enter a file name ("STOP" to stop): paper1.txt
Alan Turing, "On Computable Numbers, with an Application to the Entscheidungsproblem", Proceedings of the London Mathematical Society, Series 2, Vol.42 (1936-37), 230-265.
[ 4 empty lines omitted for brevity ]
Enter a file name ("STOP" to stop): STOP

That doesn’t look right. There are supposed to be four citations printed instead of four blank lines.

Message-Based Debugging

For this bug, you decide to try log-based debugging, and because this is a console application, you decide to just print messages to cout. In this case, it makes sense to start by looking at the function that reads the citations from the file. If that doesn’t work right, then obviously the object won’t have the citations. You can modify readFile() as follows:

void ArticleCitations::readFile(string_view fileName)
{
    // Code omitted for brevity

    // First count the number of citations.
    cout << "readFile(): counting number of citations" << endl;
    while (!inputFile.eof()) {
        // Skip white space before the next entry.
        inputFile >> ws;
        string temp;
        getline(inputFile, temp);
        if (!temp.empty()) {
            cout << "Citation " << count << ": " << temp << endl;
            count++;
        }
    }

    cout << "Found " << count << " citations" << endl;
    cout << "readFile(): reading citations" << endl;
    if (count != 0) {
        // Allocate an array of strings to store the citations.
        mCitations = new string[count];
        mNumCitations = count;
        // Seek back to the start of the citations.
        inputFile.seekg(citationsStart);
        // Read each citation and store it in the new array.
        for (count = 0; count < mNumCitations; count++) {
            string temp;
            getline(inputFile, temp);
            if (!temp.empty()) {
                cout << temp << endl;
                mCitations[count] = temp;
            }
        }
    } else {
        mNumCitations = -1;
    }
    cout << "readFile(): finished" << endl;
}

Running the same test with this program gives the following output:

Enter a file name ("STOP" to stop): paper1.txt
readFile(): counting number of citations
Citation 0: Gödel, "Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme, I", Monatshefte Math. Phys., 38 (1931), 173-198.
Citation 1: Alonzo Church. "An unsolvable problem of elementary number theory", American J. of Math., 58 (1936), 345-363.
Citation 2: Alonzo Church. "A note on the Entscheidungsproblem", J. of Symbolic Logic, 1 (1936), 40-41.
Citation 3: E.W. Hobson, "Theory of functions of a real variable (2nd ed., 1921)", 87-88.
Found 4 citations
readFile(): reading citations
readFile(): finished
Alan Turing, "On Computable Numbers, with an Application to the Entscheidungsproblem", Proceedings of the London Mathematical Society, Series 2, Vol.42 (1936-37), 230-265.
[ 4 empty lines omitted for brevity ]
Enter a file name ("STOP" to stop): STOP

As you can see from the output, the first time the program reads the citations from the file, in order to count them, it reads them correctly. However, the second time, they are not read correctly; nothing is printed between “readFile(): reading citations” and “readFile(): finished”. Why not? One way to delve deeper into this issue is to add some debugging code to check the state of the file stream after each attempt to read a citation:

void printStreamState(const istream& inputStream)
{
    if (inputStream.good()) {
        cout << "stream state is good" << endl;
    }
    if (inputStream.bad()) {
        cout << "stream state is bad" << endl;
    }
    if (inputStream.fail()) {
        cout << "stream state is fail" << endl;
    }
    if (inputStream.eof()) {
        cout << "stream state is eof" << endl;
    }
}

void ArticleCitations::readFile(string_view fileName)
{
    // Code omitted for brevity

    // First count the number of citations.
    cout << "readFile(): counting number of citations" << endl;
    while (!inputFile.eof()) {
        // Skip white space before the next entry.
        inputFile >> ws;
        printStreamState(inputFile);
        string temp;
        getline(inputFile, temp);
        printStreamState(inputFile);
        if (!temp.empty()) {
            cout << "Citation " << count << ": " << temp << endl;
            count++;
        }
    }

    cout << "Found " << count << " citations" << endl;
    cout << "readFile(): reading citations" << endl;
    if (count != 0) {
        // Allocate an array of strings to store the citations.
        mCitations = new string[count];
        mNumCitations = count;
        // Seek back to the start of the citations.
        inputFile.seekg(citationsStart);
        // Read each citation and store it in the new array.
        for (count = 0; count < mNumCitations; count++) {
            string temp;
            getline(inputFile, temp);
            printStreamState(inputFile);
            if (!temp.empty()) {
                cout << temp << endl;
                mCitations[count] = temp;
            }
        }
    } else {
        mNumCitations = -1;
    }
    cout << "readFile(): finished" << endl;
}

When you run your program this time, you find some interesting information:

Enter a file name ("STOP" to stop): paper1.txt
readFile(): counting number of citations
stream state is good
stream state is good
Citation 0: Gödel, "Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme, I", Monatshefte Math. Phys., 38 (1931), 173-198.
stream state is good
stream state is good
Citation 1: Alonzo Church. "An unsolvable problem of elementary number theory", American J. of Math., 58 (1936), 345-363.
stream state is good
stream state is good
Citation 2: Alonzo Church. "A note on the Entscheidungsproblem", J. of Symbolic Logic, 1 (1936), 40-41.
stream state is good
stream state is good
Citation 3: E.W. Hobson, "Theory of functions of a real variable (2nd ed., 1921)", 87-88.
stream state is eof
stream state is fail
stream state is eof
Found 4 citations
readFile(): reading citations
stream state is fail
stream state is fail
stream state is fail
stream state is fail
readFile(): finished
Alan Turing, "On Computable Numbers, with an Application to the Entscheidungsproblem", Proceedings of the London Mathematical Society, Series 2, Vol.42 (1936-37), 230-265.
[ 4 empty lines omitted for brevity ]
Enter a file name ("STOP" to stop): STOP

It looks like the stream state is good until after the final citation is read for the first time. Because the paper1.txt file contains an empty last line, the while loop is executed one more time after having read the last citation. In this last loop, inputFile >> ws reads the white-space of the last line, which causes the stream state to become eof. Then, the code still tries to read a line using getline() which causes the stream state to become fail and eof. That is expected. What is not expected is that the stream state remains as fail after all attempts to read the citations a second time. That doesn’t appear to make sense at first: the code uses seekg() to seek back to the beginning of the citations before reading them a second time.

However, Chapter 13 explains that streams maintain their error states until you clear them explicitly; seekg() doesn’t clear the fail state automatically. When in an error state, streams fail to read data correctly, which explains why the stream state is also fail after trying to read the citations a second time. A closer look at the code reveals that it fails to call clear() on the istream after reaching the end of the file. If you modify the code by adding a call to clear(), it will read the citations properly.

Here is the corrected readFile() method without the debugging cout statements:

void ArticleCitations::readFile(string_view fileName)
{
    // Code omitted for brevity

    if (count != 0) {
        // Allocate an array of strings to store the citations.
        mCitations = new string[count];
        mNumCitations = count;
        // Clear the stream state.
        inputFile.clear();
        // Seek back to the start of the citations.
        inputFile.seekg(citationsStart);
        // Read each citation and store it in the new array.
        for (count = 0; count < mNumCitations; count++) {
            string temp;
            getline(inputFile, temp);
            if (!temp.empty()) {
                mCitations[count] = temp;
            }
        }
    } else {
        mNumCitations = -1;
    }
}

Running the same test again on paper1.txt now shows you the correct four citations.

Using the GDB Debugger on Linux

Now that your ArticleCitations class seems to work well on one citations file, you decide to blaze ahead and test some special cases, starting with a file with no citations. The file looks like this, and is stored in a file named paper2.txt:

Author with no citations

When you try to run your program on this file, depending on your version of Linux and your compiler, you might get a crash that looks something like the following:

Enter a file name ("STOP" to stop): paper2.txt
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted (core dumped)

The message “core dumped” means that the program crashed. This time you decide to give the debugger a shot. The GNU Debugger (GDB) is widely available on Unix and Linux platforms. First, you must compile your program with debugging information (-g with g++). Then you can launch the program under GDB. Here’s an example session using the debugger to find the root cause of this problem. This example assumes your compiled executable is called buggyprogram. Text that you have to type is shown in bold.

> gdb buggyprogram
[ Start-up messages omitted for brevity ]
Reading symbols from /home/marc/c++/gdb/buggyprogram…done.
(gdb) run
Starting program: buggyprogram
Enter a file name ("STOP" to stop): paper2.txt
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Program received signal SIGABRT, Aborted.
0x00007ffff7535c39 in raise () from /lib64/libc.so.6
(gdb)

When the program crashes, the debugger breaks the execution, and allows you to poke around in the state of the program at that time. The backtrace or bt command shows the current stack trace. The last operation is at the top, with frame number zero (#0):

(gdb) bt
#0  0x00007ffff7535c39 in raise () from /lib64/libc.so.6
#1  0x00007ffff7537348 in abort () from /lib64/libc.so.6
#2  0x00007ffff7b35f85 in __gnu_cxx::__verbose_terminate_handler() () from /lib64/libstdc++.so.6
#3  0x00007ffff7b33ee6 in ?? () from /lib64/libstdc++.so.6
#4  0x00007ffff7b33f13 in std::terminate() () from /lib64/libstdc++.so.6
#5  0x00007ffff7b3413f in __cxa_throw () from /lib64/libstdc++.so.6
#6  0x00007ffff7b346cd in operator new(unsigned long) () from /lib64/libstdc++.so.6
#7  0x00007ffff7b34769 in operator new[](unsigned long) () from /lib64/libstdc++.so.6
#8  0x00000000004016ea in ArticleCitations::copy (this=0x7fffffffe090, src=…) at ArticleCitations.cpp:40
#9  0x00000000004015b5 in ArticleCitations::ArticleCitations (this=0x7fffffffe090, src=…)
    at ArticleCitations.cpp:16
#10 0x0000000000401d0c in main () at ArticleCitationsTest.cpp:20

When you get a stack trace like this, you should try to find the first stack frame from the top that is in your own code. In this example, this is stack frame #8. From this frame, you can see that there seems to be a problem in the copy() method of ArticleCitations. This method is invoked because main() calls processCitations() and passes the argument by value, which triggers a call to the copy constructor, which calls copy(). Of course, in production code you should pass a const reference, but pass-by-value is used in this example of a buggy program. You can tell the debugger to switch to stack frame #8 with the frame command, which requires the index of the frame to jump to:

(gdb) frame 8
#8  0x00000000004016ea in ArticleCitations::copy (this=0x7fffffffe090, src=…) at ArticleCitations.cpp:40
40    mCitations = new string[mNumCitations];

This output shows that the following line caused a problem:

mCitations = new string[mNumCitations];

Now, you can use the list command to show the code in the current stack frame around the offending line:

(gdb) list
35    // Copy the article name, author, etc.
36    mArticle = src.mArticle;
37    // Copy the number of citations
38    mNumCitations = src.mNumCitations;
39    // Allocate an array of the correct size
40    mCitations = new string[mNumCitations];
41    // Copy each element of the array
42    for (int i = 0; i < mNumCitations; i++) {
43        mCitations[i] = src.mCitations[i];
44    }

In GDB, you can print values available in the current scope with the print command. In order to find the root cause of the problem, you can try printing some of the variables. The error happens inside the copy() method, so checking the value of the src parameter is a good start:

(gdb) print src
$1 = (const ArticleCitations &) @0x7fffffffe060: {
  _vptr.ArticleCitations = 0x401fb0 <vtable for ArticleCitations+16>, 
  mArticle = "Author with no citations", mCitations = 0x7fffffffe080, mNumCitations = -1}

Ah-ha! Here’s the problem. This article isn’t supposed to have any citations. Why is mNumCitations set to the strange value -1? Take another look at the code in readFile() for the case where there are no citations. In that case, it looks like mNumCitations is erroneously set to -1. The fix is easy: you always need to initialize mNumCitations to 0, instead of setting it to -1 when there are no citations. Another problem is that readFile() can be called multiple times on the same ArticleCitations object, so you also need to free a previously allocated mCitations array. Here is the fixed code:

void ArticleCitations::readFile(string_view fileName)
{
    // Code omitted for brevity

    delete [] mCitations;  // Free previously allocated citations.
    mCitations = nullptr;
    mNumCitations = 0;
    if (count != 0) {
        // Allocate an array of strings to store the citations.
        mCitations = new string[count];
        mNumCitations = count;
        // Clear the stream state.
        inputFile.clear();
        // Seek back to the start of the citations.
        inputFile.seekg(citationsStart);

        // Code omitted for brevity
    }
}

As this example shows, bugs don’t always show up right away. It often takes a debugger and some persistence to find them.

Using the Visual C++ 2017 Debugger

This section explains the same debugging procedure as described in the previous section, but uses the Microsoft Visual C++ 2017 debugger instead of GDB.

First, you need to create a project. Start VC++ and click File ➪ New ➪ Project. In the project template tree on the left, select Visual C++ ➪ Win32 (or Windows Desktop). Then select the Win32 Console Application (or Windows Console Application) template in the list in the middle of the window. At the bottom, you can specify a name for the project and a location in which to save it. Specify ArticleCitations as the name, choose a folder in which to save the project, and click OK. A wizard opens. In this wizard, click Next, select Console application, select Empty Project, and click Finish.

Once your new project is created, you can see a list of project files in the Solution Explorer. If this docking window is not visible, go to View ➪ Solution Explorer. There should be no files in the solution right now. Right-click the ArticleCitations project in the Solution Explorer and click Add ➪ Existing Item. Add all the files from the 06_ArticleCitations6_VisualStudio folder in the downloadable source code archive to the project. After this, your Solution Explorer should look similar to Figure 27-1.

FIGURE 27-1

VC++ 2017 does not yet automatically enable C++17 features. Because this example uses std::string_view from C++17, you have to tell VC++ to enable C++17 features. In the Solution Explorer window, right-click the ArticleCitations project and click Properties. In the properties window, go to Configuration Properties ➪ C/C++ ➪ Language, and set the C++ Language Standard option to “ISO C++17 Standard” or “ISO C++ Latest Draft Standard”, whichever is available in your version of Visual C++.

Visual C++ supports so-called precompiled headers, a topic outside the scope of this book. In general, I recommend using precompiled headers if your compiler supports them. However, the ArticleCitations implementation does not use precompiled headers, so you have to disable that feature for this particular project. In the Solution Explorer window, right-click the ArticleCitations project and click Properties. In the properties window, go to Configuration Properties ➪ C/C++ ➪ Precompiled Headers, and set the Precompiled Header option to “Not Using Precompiled Headers.”

Now you can compile the program. Click Build ➪ Build Solution. Then copy the paper1.txt and paper2.txt test files to your ArticleCitations project folder, which is the folder containing the ArticleCitations.vcxproj file.

Run the application with Debug ➪ Start Debugging, and test the program by first specifying the paper1.txt file. It should properly read the file and output the result to the console. Then, test paper2.txt. The debugger breaks the execution with a message similar to Figure 27-2.

FIGURE 27-2

This immediately shows you the line where the crash happened. If you only see disassembly code, right-click anywhere on the disassembly and select Go To Source Code. You can now inspect variables by simply hovering your mouse over the name of a variable. If you hover over src, you’ll notice that mNumCitations is -1. The reason and the fix are exactly the same as in the earlier example.

You can come to the same conclusion by inspecting the call stack (Debug ➪ Windows ➪ Call Stack). In this call stack, you need to find the first line that contains code that you wrote. This is shown in Figure 27-3.

FIGURE 27-3

Just as with GDB, you see that the problem is in copy(). You can double-click that line in the call stack window to jump to the right place in the code.

Instead of hovering over variables to inspect their values, you can also use the Debug ➪ Windows ➪ Autos window, which shows a list of variables. Figure 27-4 shows this list with the src variable expanded to show its data members. From this window, you can also see that mNumCitations is -1.

FIGURE 27-4

Lessons from the ArticleCitations Example

You might be inclined to disregard this example as too small to be representative of real debugging. Although the buggy code is not lengthy, many classes that you write will not be much bigger, even in large projects. Imagine if you had failed to test this example thoroughly before integrating it with the rest of the project. If these bugs showed up later, you and other engineers would have to spend more time narrowing down the problem before you could debug it as shown here. Additionally, the techniques shown in this example apply to all debugging, whether on a large or small scale.

SUMMARY

The most important concept in this chapter is the fundamental law of debugging: avoid bugs when you’re coding, but plan for bugs in your code. The reality of programming is that bugs will appear. If you’ve prepared your program properly, with error logging, debug traces, and assertions, then the actual debugging will be significantly easier.

In addition to these techniques, this chapter also presented specific approaches for tracking down bugs. The most important rule when actually debugging is to reproduce the problem. Then, you can use a symbolic debugger, or log-based debugging, to track down the root cause. Memory errors present particular difficulties, and account for the majority of bugs in legacy C++ code. This chapter described the various categories of memory bugs and their symptoms, and showed examples of debugging errors in a program.

Debugging is a hard skill to learn. To take your C++ skills to a professional level, you will have to practice debugging a lot.

NOTES
