Chapter 8

Debugging

In brief, debugging is what you do after you have executed a successful test case. Remember that a successful test case is one that shows that a program does not do what it was designed to do. Debugging is a two-step process that begins when you find an error as a result of a successful test case. Step 1 is the determination of the exact nature and location of the suspected error within the program. Step 2 consists of fixing the error.

As necessary and integral as debugging is to program testing, it seems to be the one aspect of the software production process that programmers enjoy the least, primarily for these reasons:

  • Your ego may get in the way. Like it or not, debugging confirms that programmers are not perfect; they commit errors in either the design or the coding of the program.
  • You may run out of steam. Of all the software development activities, debugging is the most mentally taxing. Moreover, debugging usually is performed under a tremendous amount of organizational or self-induced pressure to fix the problem as quickly as possible.
  • You may lose your way. Debugging is mentally taxing because the error you've found could lie in virtually any statement within the program. Without examining the program first, you can't be absolutely sure, for example, that the origin of a numerical error in a paycheck produced by a payroll program is not a subroutine that asks the operator to load a particular form into the printer. Contrast this with debugging a physical system, such as an automobile. If a car stalls when moving up an incline (the symptom), you can immediately and validly eliminate certain parts of the system as the cause of the problem: the AM/FM radio, for example, or the speedometer or the trunk lock. The problem must be in the engine, and, based on your overall knowledge of automotive engines, you can even rule out certain engine components such as the water pump and the oil filter.
  • You may be on your own. Compared with other software development activities, relatively little research, literature, and formal instruction exist on the process of debugging.

Although this is a book about software testing, not debugging, the two processes are obviously related. Of the two aspects of debugging, locating the error and correcting it, locating the error represents perhaps 95 percent of the problem. Hence, this chapter concentrates on the process of finding the location of an error, given that a successful test case has found one.

Debugging by Brute Force

The most common scheme for debugging a program is the so-called brute-force method. It is popular because it requires little thought and is the least mentally taxing of the methods; unfortunately, it is inefficient and generally unsuccessful.

Brute-force methods can be partitioned into at least three categories:

  • Debugging with a storage dump.
  • Debugging according to the common suggestion to “scatter print statements throughout your program.”
  • Debugging with automated debugging tools.

The first, debugging with a storage dump (usually a crude display of all storage locations in hexadecimal or octal format), is the most inefficient of the brute-force methods. Here's why:

  • It is difficult to establish a correspondence between memory locations and the variables in a source program.
  • With any program of reasonable complexity, such a memory dump will produce a massive amount of data, most of which is irrelevant.
  • A memory dump is a static picture of the program, showing the state of the program at only one instant in time; to find errors, you have to study the dynamics of a program (state changes over time).
  • A memory dump is rarely produced at the exact point of the error, so it doesn't show the program's state at the point of the error. Program actions between the time of the dump and the time of the error can mask the clues you need to find the error.
  • Adequate methodologies don't exist for finding errors by analyzing a memory dump; as a result, many programmers stare at the dump with glazed eyes, wistfully expecting the error to expose itself magically.

Scattering print statements throughout a failing program to display variable values isn't much better. It may be superior to a memory dump because it shows the dynamics of a program and lets you examine information that is easier to relate to the source program, but this method, too, has many shortcomings:

  • Rather than encouraging you to think about the problem, it is largely a hit-or-miss method.
  • It produces a massive amount of data to be analyzed.
  • It requires you to change the program; such changes can mask the error, alter critical timing relationships, or introduce new errors.
  • It may work on small programs, but the cost of using it in large programs is quite high. Furthermore, it often is not even feasible on certain types of programs such as operating systems or process control programs.

Automated debugging tools work similarly to inserting print statements within the program, but rather than changing the program, you analyze its dynamics with the debugging features of the programming language or with special interactive debugging tools. Typical language features include facilities that produce printed traces of statement executions, subroutine calls, and/or alterations of specified variables. A common capability of debugging tools is setting breakpoints, which suspend the program when a particular statement is executed or a particular variable is altered, enabling the programmer to examine the current state of the program. This method, too, is largely hit or miss, however, and often produces an excessive amount of irrelevant data.
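As a concrete illustration of the breakpoint capability, the following minimal sketch uses Python's standard pdb debugger; the interest() function and its values are hypothetical, and the same idea applies to the interactive debuggers of other languages:

    # A minimal sketch of breakpoint-style debugging with Python's standard pdb module.
    # The interest() function and the account values are hypothetical illustrations.
    import pdb

    def interest(balance, rate, periods):
        total = balance
        for _ in range(periods):
            total = total * (1 + rate)
        return total - balance

    pdb.set_trace()      # execution suspends here; at the (Pdb) prompt you can step ('n'),
                         # print variables ('p total'), set further breakpoints ('b'), or
                         # continue ('c'), examining the program's state as you go
    print(interest(1000.0, 0.05, 12))

Used this way, the breakpoint replaces the scattered print statements of the previous method, but the caution above still applies: the tool shows you state, not causes.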

The general problem with these brute-force methods is that they ignore the process of thinking. You can draw an analogy between program debugging and solving a homicide. In virtually all murder mystery novels, the crime is solved by careful analysis of the clues and by piecing together seemingly insignificant details. This is not a brute-force method; setting up roadblocks or conducting property searches would be.

There also is some evidence to indicate that, whether the debugging teams are composed of experienced programmers or students, people who use their brains rather than a set of aids work faster and more accurately in finding program errors. Therefore, we recommend brute-force methods only (1) when all other methods fail, or (2) as a supplement to, not a substitute for, the thought processes we'll describe next.

Debugging by Induction

It should be obvious that careful thought will find most errors without the debugger even going near the computer. One particular thought process is induction, where you move from the particulars of a situation to the whole. That is, start with the clues (the symptoms of the error and possibly the results of one or more test cases) and look for relationships among the clues. The induction process is illustrated in Figure 8.1.

Figure 8.1 The Inductive Debugging Process.


The steps are as follows:

1. Locate the pertinent data. A major mistake debuggers make is failing to take account of all available data or symptoms about the problem. Therefore, the first step is the enumeration of all you know about what the program did correctly and what it did incorrectly—the symptoms that led you to believe there was an error. Additional valuable clues are provided by similar, but different, test cases that do not cause the symptoms to appear.

2. Organize the data. Remember that induction implies that you're progressing from the particulars to the general, so the second step is to structure the pertinent data to let you observe patterns. Of particular importance is the search for contradictions, events such as “the error occurs only when the customer has no outstanding balance in his or her margin account.”

You can use a form such as the one shown in Figure 8.2 to structure the available data. In the “what” boxes list the general symptoms; in the “where” boxes describe where the symptoms were observed; in the “when” boxes list anything you know about the times when the symptoms occurred; and in the “to what extent” boxes describe the scope and magnitude of the symptoms. Notice the “is” and “is not” columns: In them describe the contradictions that may eventually lead to a hypothesis about the error.

3. Devise a hypothesis. Next, study the relationships among the clues and, using the patterns that might be visible in their structure, devise one or more hypotheses about the cause of the error. If you can't devise a theory, more data are needed, perhaps from new test cases. If multiple theories seem possible, select the most probable one first.

4. Prove the hypothesis. A major mistake at this point, given the pressures under which debugging usually is performed, is to skip this step and jump to a conclusion about how to fix the problem. Resist this urge; it is vital to prove the reasonableness of the hypothesis before you proceed. If you skip this step, you'll probably succeed in correcting only a symptom of the problem, not the problem itself. Prove the hypothesis by comparing it to the original clues or data, making sure that it completely explains the existence of those clues. If it does not, either the hypothesis is invalid, the hypothesis is incomplete, or multiple errors are present.

5. Fix the problem. Once you have completed the previous steps, you can proceed with fixing the problem. By taking the time to work fully through each step, you can feel confident that your fix will correct the bug. Remember, though, that you still need to perform some type of regression testing to ensure your bug fix didn't create problems in other program areas. As the application grows larger, so does the likelihood that your fix will cause problems elsewhere.

Figure 8.2 A Method for Structuring the Clues.


As a simple example, assume that an apparent error has been reported in the examination grading program described in Chapter 4. The apparent error is that the median grade seems incorrect in some, but not all, instances. In a particular test case, 51 students were graded. The mean score was correctly printed as 73.2, but the median was printed as 26 instead of the expected value of 82. By examining the results of this test case and a few other test cases, the clues are organized as shown in Figure 8.3.

Figure 8.3 An Example of Clue Structuring.


The next step is to derive a hypothesis about the error by looking for patterns and contradictions. One contradiction we see is that the error seems to occur only in test cases that use an odd number of students. This might be a coincidence, but it seems significant, since you compute a median differently for sets of odd and even numbers. There's another strange pattern: In the test cases observed, the calculated median is always less than or equal to the number of students (26 ≤ 51 and 1 ≤ 1). One possible avenue at this point is to run the 51-student test case again, giving the students different grades from before, to see how this affects the median calculation. If we do so, the median is still 26, so the “to what extent / is not” box could be filled in with “the median seems to be independent of the actual grades.” Although this result provides a valuable clue, we might have been able to surmise the error without it.

From the available data, the calculated median appears to equal half the number of students, rounded up to the next integer. In other words, if you think of the grades as being stored in a sorted table, the program is printing the entry number of the middle student rather than his or her grade. Hence, we have a firm hypothesis about the precise nature of the error. Next, we prove the hypothesis by examining the code or by running a few extra test cases.
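To make the hypothesis concrete, here is a hedged sketch, in Python, of the kind of defect being hypothesized; it is illustrative only and not the actual grading program of Chapter 4:

    # Illustrative sketch of the hypothesized defect: the *position* of the middle entry
    # in the sorted table is reported instead of the grade stored at that position.
    def median_grade(grades):
        ordered = sorted(grades)
        middle = (len(ordered) + 1) // 2    # 1-based position of the middle entry
        return middle                       # BUG: the entry number (26 for 51 students,
                                            # regardless of the grades themselves)
        # return ordered[middle - 1]        # the intended result: the middle student's grade

    print(median_grade([82, 65, 97, 70, 88]))   # prints 3; the expected median is 82

Note how the sketch reproduces the key clues: the result equals half the class size rounded up, and it is independent of the actual grades.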

Debugging by Deduction

The process of deduction proceeds from some general theories or premises, using the processes of elimination and refinement, to arrive at a conclusion (the location of the error), as shown in Figure 8.4.

Figure 8.4 The Deductive Debugging Process.


In a murder case, for example, induction means inferring a suspect from the clues; with deduction, by contrast, you start with a set of suspects and, by the process of elimination (the gardener has a valid alibi) and refinement (it must be someone with red hair), decide that the butler must have done it. The steps are as follows:

1. Enumerate the possible causes or hypotheses. The first step is to develop a list of all conceivable causes of the error. They don't have to be complete explanations; they are merely theories to help you structure and analyze the available data.

2. Use the data to eliminate possible causes. Carefully examine all of the data, particularly by looking for contradictions (you could use Figure 8.2 here), and try to eliminate all but one of the possible causes. If all are eliminated, you need more data gained from additional test cases to devise new theories. If more than one possible cause remains, select the most probable cause—the prime hypothesis—first.

3. Refine the remaining hypothesis. The possible cause at this point might be correct, but it is unlikely to be specific enough to pinpoint the error. Hence, the next step is to use the available clues to refine the theory. For example, you might start with the idea that “there is an error in handling the last transaction in the file” and refine it to “the last transaction in the buffer is overlaid with the end-of-file indicator.”

4. Prove the remaining hypothesis. This vital step is identical to step 4 in the induction method.

5. Fix the error. Again, this step is identical to step 5 in the induction method. To re-emphasize, though: You should thoroughly test your fix to ensure it does not create problems elsewhere in the application.

As an example, assume that we are commencing the function testing of the DISPLAY command discussed in Chapter 4. Of the 38 test cases identified by the process of cause-effect graphing, we start by running four test cases. As part of the process of establishing input conditions, we initialize memory so that the first, fifth, ninth, . . . words have the value 0000; the second, sixth, . . . words have the value 4444; the third, seventh, . . . words have the value 8888; and the fourth, eighth, . . . words have the value CCCC. That is, each memory word is initialized to the low-order hexadecimal digit in the address of the first byte of the word (the values of locations 23FC, 23FD, 23FE, and 23FF are C).
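A brief sketch of this initialization pattern, in Python, may make it easier to follow; the 4-byte word size and the memory limit are assumptions based on the surrounding text:

    # Illustrative sketch of the initialization described above: each word holds the
    # low-order hexadecimal digit of the address of its first byte, repeated.
    def init_memory(size_bytes=0x8000):
        memory = {}
        for addr in range(0, size_bytes, 4):
            digit = format(addr & 0xF, "X")   # low-order hex digit of the word's first byte
            memory[addr] = digit * 4          # "0000", "4444", "8888", "CCCC", "0000", ...
        return memory

    mem = init_memory()
    print(mem[0x23FC])    # "CCCC": locations 23FC through 23FF hold C, as noted above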

The test cases, their expected output, and the actual output after the test are shown in Figure 8.5.

Figure 8.5 Test Case Results from the DISPLAY Command.


Obviously, we have some problems, since apparently none of the test cases produced the expected results (all were successful test cases). But let's start by debugging the error associated with the first test case. The command indicates that, starting at location 0 (the default), E locations (14 in decimal) are to be displayed. (Recall that the specification stated that all output will contain four words, or 16 bytes, per line.)

Enumerating the possible causes for the unexpected error message, we might get:

1. The program does not accept the word DISPLAY.

2. The program does not accept the period.

3. The program does not allow a default as a first operand; it expects a storage address to precede the period.

4. The program does not allow an E as a valid byte count.

The next step is to try to eliminate the causes. If all are eliminated, we must retreat and expand the list. If more than one remains, we might want to examine additional test cases to arrive at a single error hypothesis, or proceed with the most probable cause. Since we have other test cases at hand, we see that the second test case in Figure 8.5 seems to eliminate the first hypothesis; and the third test case, although it produced an incorrect result, seems to eliminate the second and third hypotheses.

The next step is to refine the fourth hypothesis. It seems specific enough, but intuition might tell us that there is more to it than meets the eye—it sounds like an instance of a more general error. We might contend, then, that the program does not recognize the special hexadecimal characters A through F. The absence of such characters in the other test cases makes this sound like a viable explanation.

Rather than jumping to a conclusion, however, we should first consider all of the available information. The fourth test case might represent a totally different error, or it might provide a clue about the current error. Given that the highest valid address in our system is 7FFF, how could the fourth test case display an area that appears to be nonexistent? The fact that the displayed values are our initialized values, and not garbage, might lead to the supposition that this command is somehow displaying something in the range 0 through 7FFF. One idea that may arise is that this could occur if the program were treating the operands in the command as decimal values rather than hexadecimal, as stated in the specification. This is borne out by the third test case: Rather than displaying 32 bytes of memory (the next 16-byte increment above hexadecimal 11, which is 17 in base 10), it displays 16 bytes, which is consistent with our hypothesis that the 11 is being treated as a base-10 value. Hence, the refined hypothesis is that the program is treating the storage address and byte-count operands in the command as decimal values, and is printing the storage addresses on the output listing in decimal as well.

The last step is to prove this hypothesis. Looking at the fourth test case, if 8000 is interpreted as a decimal number, the corresponding base-16 value is 1F40, which would lead to the output shown. As further proof, examine the second test case. The output is incorrect, but if 21 and 29 are treated as decimal numbers, the locations of storage addresses 15 through 1D would be displayed; this is consistent with the erroneous result of the test case. Hence, we have almost certainly located the error: The program is assuming that the operands are decimal values and is printing the memory addresses as decimal values, which is inconsistent with the specification. Moreover, this error seems to be the cause of the erroneous results of all four test cases. A little thought has led us to the error, and it has also solved three other problems that, at first glance, appeared to be unrelated.

Note that the error probably manifests itself at two locations in the program: the part that interprets the input command and the part that prints memory addresses on the output listing.
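A hedged sketch of the hypothesized defect, written in Python purely for illustration (the actual DISPLAY implementation is not shown in the text), makes clear how a single wrong assumption about the radix explains every symptom:

    # Illustrative sketch: operands the specification defines as hexadecimal are instead
    # converted as decimal, and addresses are printed in decimal as well.
    def parse_operand_buggy(text):
        return int(text)        # rejects "E" outright and reads "8000" as decimal 8000

    def parse_operand_per_spec(text):
        return int(text, 16)    # "E" -> 14, "11" -> 17, "8000" -> 32768

    print(parse_operand_per_spec("11"))      # 17 bytes requested: two 16-byte lines expected
    print(parse_operand_buggy("11"))         # 11 bytes requested: one line, the observed symptom
    print(hex(parse_operand_buggy("8000")))  # 0x1f40, the area the fourth test case displayed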

As an aside, this error, likely caused by a misunderstanding of the specification, reinforces the suggestion that a programmer should not attempt to test his or her own program. If the programmer who created this error is also designing the test cases, he or she likely will make the same mistake while writing the test cases. In other words, the programmer's expected outputs would not be those of Figure 8.5; they would be the outputs calculated under the assumption that the operands are decimal values. Therefore, this fundamental error probably would go unnoticed.

Debugging by Backtracking

An effective method for locating errors in small programs is to backtrack the incorrect results through the logic of the program until you find the point where the logic went astray. In other words, start at the point where the program gives the incorrect result—such as where incorrect data were printed. Here, you deduce from the observed output what the values of the program's variables must have been. By performing a mental reverse execution of the program from this point and repeatedly applying the if-then logic that states “if this was the state of the program at this point, then this must have been the state of the program up here,” you can quickly pinpoint the error. In essence, you are looking for the location in the program between the last point where the state of the program was what you expected and the first point where it was not.
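As a small, hypothetical illustration (the function and values are invented for this sketch, in Python), backtracking over a few statements might look like this:

    # Illustrative sketch of backtracking: the printed total is too small, so we reason
    # backward from the output to the first point where the program state went astray.
    def invoice_total(prices, tax_rate):
        subtotal = 0.0
        for price in prices:
            subtotal = price      # (2) ...which points here: the loop overwrites instead of
                                  #     accumulating (it should read `subtotal += price`)
        return subtotal * (1 + tax_rate)
                                  # (1) the output is wrong, yet tax_rate is known to be
                                  #     correct, so subtotal must already be wrong here...

    print(invoice_total([10.0, 20.0, 30.0], 0.05))   # prints 31.5; 63.0 was expected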

Debugging by Testing

The last “thinking type” debugging method is the use of test cases. This probably sounds a bit peculiar since, at the beginning of this chapter, we distinguished debugging from testing. However, consider two types of test cases: test cases for testing, whose purpose is to expose a previously undetected error, and test cases for debugging, whose purpose is to provide information useful in locating a suspected error. The difference between the two is that test cases for testing tend to be “fat,” in that you are trying to cover many conditions in a small number of test cases. Test cases for debugging, on the other hand, are “slim,” because you want to cover only a single condition or a few conditions in each test case.

In other words, after you have discovered a symptom of a suspected error, you write variants of the original test case to attempt to pinpoint the error. Actually, this is not an entirely separate method; it often is used in conjunction with the induction method to obtain information needed to generate a hypothesis and/or to prove a hypothesis. It also is used with the deduction method to eliminate suspected causes, refine the remaining hypothesis, and/or prove a hypothesis.
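The contrast can be sketched with a pair of pytest-style tests; the word_count() function below is hypothetical and exists only to give the tests something to exercise:

    # Illustrative contrast between a "fat" test case for testing and "slim" variants
    # for debugging. The function under test is a stand-in, not from the text.
    def word_count(text):
        return len(text.split())

    def test_word_count_fat():
        # testing: one case covers many conditions at once
        # (tabs, punctuation, repeated spaces, leading/trailing blanks)
        assert word_count("  one\ttwo,  three four.  ") == 4

    def test_word_count_slim_repeated_spaces():
        # debugging: the single suspect condition, with the smallest data that shows it
        assert word_count("one  two") == 2

    def test_word_count_slim_tab_separator():
        # debugging: another condition isolated on its own
        assert word_count("one\ttwo") == 2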

Debugging Principles

In this section, we want to discuss a set of debugging principles that are psychological in nature. As with the testing principles in Chapter 2, many of these debugging principles are intuitively obvious, yet they are often forgotten or overlooked.

Since debugging is a two-part process—locating an error and then repairing it—we discuss two sets of principles here.

Error-Locating Principles

Think

As implied in the previous section, debugging is a problem-solving process. The most effective method of debugging involves a mental analysis of the information associated with the error's symptoms. An efficient program debugger should be able to pinpoint most errors without going near a computer. Here's how:

1. Position yourself in a quiet place, where outside stimuli—voices of coworkers, telephones, radios, or other potential interruptions—won't interfere with your concentration.

2. Without looking at the program code, review in your mind how the program is designed and how the software should be performing within the area that is behaving incorrectly.

3. Concentrate on the process for correct performance, and then imagine ways in which the code may be incorrectly designed.

This sort of prethinking, done before the physical debugging process begins, will in many cases lead you directly to the area of the program that is causing problems and help you achieve a fix quickly.

If You Reach an Impasse, Sleep on It

The human subconscious is a potent problem solver. What we often refer to as inspiration is simply the subconscious mind working on a problem when the conscious mind is focused on something else, such as eating, walking, or watching a movie. If you cannot locate an error in a reasonable amount of time (perhaps 30 minutes for a small program, several hours for a larger one), drop it and turn your attention to something else, since your thinking efficiency is about to collapse anyway. After putting aside the problem for a while, your subconscious mind will have solved the problem, or your conscious mind will be clear for a fresh examination of its symptoms.

We have used this technique regularly over the years, both as a development process and as a debugging process. It may take some practice to accept this extraordinary functioning of the human brain, and to make efficient use of it, but it does work. We have actually awakened in the night to realize we had solved a software problem while asleep. For this reason, we recommend that you keep by your bedside a small tape recorder, a telephone capable of voice recording, a PDA, or a notepad to capture the solution you found while sleeping. Resist the temptation to return to sleep believing you will be able to regenerate the solution in the morning. You probably won't—at least not in our experience.

If You Reach an Impasse, Describe the Problem to Someone Else

Talking about the problem with someone else may help you discover something new. In fact, often, simply by describing the problem to a good listener, you will suddenly see the solution without any assistance from the person.

Use Debugging Tools Only as a Second Resort

Turn to debugging tools only after you've tried other methods, and then only as an adjunct to, not a substitute for, thinking. As noted earlier in this chapter, debugging tools, such as dumps and traces, represent a haphazard approach to debugging. Experiments show that people who shun such tools, even when they are debugging programs that are unfamiliar to them, are more successful than people who use the tools.

Why should this be so? Depending on a tool to solve a problem can short-circuit the diagnostic process. If you believe that the tool can solve the problem, you are likely to be less attentive to the clues you already have picked up, information that could help you solve the problem directly, without the help of a generic diagnostic tool.

Avoid Experimentation—Use It Only as a Last Resort

The most common mistake novice debuggers make is to try to solve a problem by making experimental changes to the program. You might think, “I know what is wrong, so I'll change this DO statement and see what happens.” This totally haphazard approach cannot even be considered debugging; it represents an act of blind hope. Not only does it have a minuscule chance of success, but you will often compound the problem by adding new errors to the program.

Error-Repairing Techniques

Where There Is One Bug, There Is Likely to Be Another

This is a restatement of principle 9 in Chapter 2, which states that when you find an error in a section of a program, the probability of the existence of another error in that same section is higher than if you hadn't already found one error. In other words, errors tend to cluster. When repairing an error, examine its immediate vicinity for anything else that looks suspicious.

Fix the Error, Not Just a Symptom of It

Another common failing is repairing the symptoms of the error, or just one instance of the error, rather than the error itself. If the proposed correction does not match all the clues about the error, you may be fixing only a part of the error.

The Probability of the Fix Being Correct Is Not 100 Percent

Tell this to someone in general conversation and of course he or she would agree; but tell it to someone in the process of correcting an error and you may get a different answer—“Yes, in most cases, but this correction is so minor that it just has to work.” Never assume that code added to a program to fix an error is correct. Statement for statement, corrections are much more error prone than the original code in the program. One implication is that error corrections must be tested, perhaps more rigorously than the original program. A solid regression testing plan can help ensure that correcting an error does not introduce another error somewhere else in the application.
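For instance, one lightweight way to make a correction stick is to turn the very case that exposed the error into a permanent, automated regression test. The sketch below is hypothetical (pytest style, reusing the median defect from the induction example):

    # Illustrative sketch: the failing case that exposed the error becomes a permanent
    # regression test, so the fix cannot quietly regress later.
    def median_grade(grades):
        ordered = sorted(grades)
        return ordered[(len(ordered) + 1) // 2 - 1]   # corrected: the grade, not the position

    def test_median_regression_51_students():
        # modeled on the earlier symptom: 51 students whose median should be 82, not 26
        grades = [60] * 25 + [82] * 26
        assert median_grade(grades) == 82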

The Probability of the Fix Being Correct Drops as the Size of the Program Increases

Stating it differently, in our experience, the ratio of errors caused by incorrect fixes, versus original errors, increases in large programs. In one widely used large program, one of every six new errors discovered is an error in a prior correction to the program.

If you accept this as fact, how can you avoid causing problems by trying to fix them? Read the first three techniques in this section, for starters. One error found does not mean all errors have been found, and you must be sure you are correcting the actual error, not just its symptom.

Beware of the Possibility That an Error Correction Creates a New Error

Not only do you have to worry about incorrect corrections; you also have to worry about a seemingly valid correction that has an undesirable side effect and thus introduces a new error. One implication is that not only must you test the error situation after the correction is made, but you must also perform regression testing to determine whether a new error has been introduced.

The Process of Error Repair Should Put You Temporarily Back into the Design Phase

Realize that error correction is a form of program design. Given the error-prone nature of corrections, common sense says that whatever procedures, methodologies, and formalism were used in the design process should also apply to the error-correction process. For instance, if the project decided that code inspections were desirable, then it must be doubly important that they be performed after correcting an error.

Change the Source Code, Not the Object Code

When debugging large systems, particularly those written in an assembly language, occasionally there is the tendency to correct an error by making an immediate change to the object code, with the intention of changing the source program later. Two problems are associated with this approach: (1) It usually is a sign that “debugging by experimentation” is being practiced; and (2) the object code and source program are now out of synchronization, meaning that the error could easily resurface when the program is recompiled or reassembled. This practice is an indication of a sloppy, unprofessional approach to debugging.

Error Analysis

The last point to realize about program debugging is that in addition to its value in removing an error from the program, it can have another valuable effect: It can tell us something about the nature of software errors, something we still know too little about. Information about the nature of software errors can provide valuable feedback in terms of improving future design, coding, and testing processes.

Every programmer and programming organization could improve immensely by performing a detailed analysis of the detected errors, or at least a subset of them. Admittedly, it is a difficult and time-consuming task, for it implies much more than a superficial grouping such as “x percent of the errors are logic design errors,” or “x percent of the errors occur in IF statements.” A careful analysis might include the following studies:

  • Where was the error made? This question is the most difficult one to answer, because it requires a backward search through the documentation and history of the project; at the same time, it also is the most valuable question. It requires that you pinpoint the original source and time of the error. For example, the original source of the error might be an ambiguous statement in a specification, a correction to a prior error, or a misunderstanding of an end-user requirement.
  • Who made the error? Wouldn't it be useful to discover that 60 percent of the design errors were created by one of the 10 analysts, or that programmer X makes three times as many mistakes as the other programmers? (Not for the purposes of punishment but for the purposes of education.)
  • What was done incorrectly? It is not sufficient to determine when and by whom each error was made; the missing link is a determination of exactly why the error occurred. Was it caused by someone's inability to write clearly? Someone's lack of education in the programming language? A typing mistake? An invalid assumption? A failure to consider valid input?
  • How could the error have been prevented? What can be done differently in the next project to prevent this type of error? The answer to this question constitutes much of the valuable feedback or learning for which we are searching.
  • Why wasn't the error detected earlier? If the error was detected during a test phase, you should study why the error was not unearthed during earlier testing phases, code inspections, and design reviews.
  • How could the error have been detected earlier? The answer to this offers another piece of valuable feedback. How can the review and testing processes be improved to find this type of error earlier in future projects? Providing that we are not analyzing an error found by an end user (that is, the error was found by a test case), we should realize that something valuable has happened: We have written a successful test case. Why was this test case successful? Can we learn something from it that will result in additional successful test cases, either for this program or for future programs?

We repeat: This analysis process is difficult and costly, but the answers you may discover by going through it can be invaluable in improving subsequent programming efforts. The quality of future products will increase while the capital investment will decrease. It is alarming that the vast majority of programmers and programming organizations do not employ it.

Summary

The main focus of this book is on software testing: How do you go about uncovering as many software errors as possible? Therefore, we don't want to spend too much time on the next step—debugging—but the simple fact is, errors found by successful test cases lead directly to it.

In this chapter we touched on some of the more important aspects of software debugging. The least desirable method, debugging by brute force, involves such techniques as dumping memory locations, placing print statements throughout the program, or using automated tools. Brute-force techniques may point you to the solution for some errors uncovered during testing, but they are not an efficient way to go about debugging.

We demonstrated that you can begin debugging by studying the error symptoms, or clues, and moving from them to the larger picture (inductive debugging). Another technique begins the debugging process by considering general theories, then, through the process of elimination, identifies the error locations (deductive debugging). We also covered program backtracking—starting with the error and moving backwards through the program to determine where incorrect information originated. Finally, we discussed debugging by testing.

If, however, we were to offer a single directive to those tasked with debugging a software system, we would say, “Think!” Review the numerous debugging principles described in this chapter. We believe they can lead you in the right direction, toward accurate and efficient debugging. But the bottom line is, depend on your expertise and knowledge of the program itself. Open your mind to creative solutions, review what you know, and let your knowledge and subconscious lead you to the error locations.

In the next chapter we take on the subject of extreme testing, techniques well suited to help uncover errors in extreme programming environments such as agile development.
