Are you one of those programmers who scoff at the very idea of using a debugger to trace through code? Is it your philosophy that if the code is too complex for even the programmer to understand, the programmer deserves no mercy when it comes to bugs? Do you step through your code, mentally, using a magnifying glass and a toothpick? Are your bugs usually caused by a single-character omission, such as using the = operator when you mean +=?
Then perhaps you should meet gdb--the GNU debugger. Whether or not you know it, gdb is your friend. It can locate obscure and difficult-to-find bugs that result in core dumps, memory leaks, and erratic behavior (both for the program and the programmer). Sometimes even the most harmless-looking glitches in your code can cause everything to go haywire, and without the aid of a debugger like gdb, finding these problems can be nearly impossible—especially for programs longer than a few hundred lines. In this section, we introduce you to the most useful features of gdb by way of examples. There’s a book on gdb: Debugging with GDB (Free Software Foundation).
gdb is capable of either debugging programs as they run or examining the cause for a program crash with a core dump. Programs debugged at runtime with gdb can either be executed from within gdb itself or can be run separately; that is, gdb can attach itself to an already running process to examine it. We first discuss how to debug programs running within gdb and then move on to attaching to running processes and examining core dumps.
Our first example is a program called trymh that detects edges in a grayscale image. trymh takes as input an image file, does some calculations on the data, and spits out another image file. Unfortunately, it crashes whenever it is invoked, like so:
papaya$ trymh < image00.pgm > image00.pbm
Segmentation fault (core dumped)
Now, using gdb, we could analyze the resulting core file, but for this example, we’ll show how to trace the program as it runs.
Before we use gdb to trace through the executable trymh, we need to ensure that the executable has been compiled with debugging code (see "Enabling Debugging Code,” earlier in this chapter). To do so, we should compile trymh using the -g switch with gcc.
Note that enabling optimization (-O) with debug code (-g) is legal but discouraged. The problem is that gcc is too smart for its own good. For example, if you have two identical lines of code in two different places in a function, gdb may unexpectedly jump to the second occurrence of the line, instead of the first, as expected. This is because gcc combined the two lines into a single line of machine code used in both instances.
Some of the automatic optimizations performed by gcc can be confusing when using a debugger. To turn off all optimization (even optimizations performed without specifying -O), use the -O0 (that’s dash-oh-zero) option with gcc.
Now we can fire up gdb to see what the problem might be:
papaya$ gdb trymh
GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i586-suse-linux".
(gdb)
Now gdb is waiting for a command. (The command help displays information on the available commands.) The first thing we want to do is start running the program so that we can observe its behavior. However, if we immediately use the run command, the program simply executes until it exits or crashes.
First, we need to set a breakpoint somewhere in the program. A breakpoint is just a location in the program where gdb should stop and allow us to control execution of the program. For the sake of simplicity, let’s set a breakpoint on the first line of actual code so that the program stops just as it begins to execute. The list command displays several lines of code (an amount that is variable) at a time:
(gdb) list
12 main() {
13
14 FloatImage inimage;
15 FloatImage outimage;
16 BinaryImage binimage;
17 int i,j;
18
19 inimage = (FloatImage)imLoadF(IMAGE_FLOAT,stdin);
20 outimage = laplacian_float(inimage);
21
(gdb) break 19
Breakpoint 1 at 0x289c: file trymh.c, line 19.
(gdb)
A breakpoint is now set at line 19 in the current source file. You can set many breakpoints in the program; breakpoints may be conditional (that is, triggered only when a certain expression is true), unconditional, delayed, temporarily disabled, and so on. You may set breakpoints on a particular line of code, a particular function, or a set of functions, and in a slew of other ways. You may also set a watchpoint, using the watch command, which is similar to a breakpoint but is triggered whenever a certain event takes place—not necessarily at a specific line of code within the program. We’ll talk more about breakpoints and watchpoints later in the chapter.
Next, we use the run command to start running the program. run takes as arguments the same arguments you’d give trymh on the command line; these can include shell wildcards and input/output redirection, as the command is passed to /bin/sh for execution:
(gdb) run < image00.pgm > image00.pfm
Starting program: /amd/dusk/d/mdw/vis/src/trymh < image00.pgm > image00.pfm
Breakpoint 1, main () at trymh.c:19
19 inimage = (FloatImage)imLoadF(IMAGE_FLOAT,stdin);
(gdb)
As expected, the breakpoint is reached immediately at the first line of code. We can now take over.
The most useful program-stepping commands are next and step. Both commands execute the next line of code in the program, except that step descends into any function calls in the program, and next steps directly to the next line of code in the same function. next quietly executes any function calls that it steps over but does not descend into their code for us to examine.
imLoadF is a function that loads an image from a disk file. We know this function is not at fault (you’ll have to trust us on that one), so we wish to step over it using the next command:
(gdb) next
20 outimage = laplacian_float(inimage);
(gdb)
Here, we are interested in tracing the suspicious-looking laplacian_float function, so we use the step command:
(gdb) step
laplacian_float (fim=0x0) at laplacian.c:21
21 i = 20.0;
(gdb)
Let’s use the list command to get some idea of where we are:
(gdb) list
16 FloatImage laplacian_float(FloatImage fim) {
17
18 FloatImage mask;
19 float i;
20
21 i = 20.0;
22 mask=(FloatImage)imNew(IMAGE_FLOAT,3,3);
23 imRef(mask,0,0) = imRef(mask,2,0) = imRef(mask,0,2) = 1.0;
24 imRef(mask,2,2) = 1.0; imRef(mask,1,0) = imRef(mask,0,1) = i/5;
25 imRef(mask,2,1) = imRef(mask,1,2) = i/5; imRef(mask,1,1) = -i;
(gdb) list
26
27 return convolveFloatWithFloat(fim,mask);
28 }
(gdb)
As you can see, using list multiple times just displays more of the code. Because we don’t want to step manually through this code, and we’re not interested in the imNew function on line 22, let’s continue execution until line 27. For this, we use the until command:
(gdb) until 27
laplacian_float (fim=0x0) at laplacian.c:27
27 return convolveFloatWithFloat(fim,mask);
(gdb)
Before we step into the convolveFloatWithFloat function, let’s be sure the two parameters, fim and mask, are valid. The print command examines the value of a variable:
(gdb) print mask
$1 = (struct {...} *) 0xe838
(gdb) print fim
$2 = (struct {...} *) 0x0
(gdb)
mask looks fine, but fim, the input image, is null. Obviously, laplacian_float was passed a null pointer instead of a valid image. If you have been paying close attention, you noticed this as we entered laplacian_float earlier.
Instead of stepping deeper into the program (as it’s apparent that something has already gone wrong), let’s continue execution until the current function returns. The finish command accomplishes this:
(gdb) finish
Run till exit from #0 laplacian_float (fim=0x0) at laplacian.c:27
0x28c0 in main () at trymh.c:20
20 outimage = laplacian_float(inimage);
Value returned is $3 = (struct {...} *) 0x0
(gdb)
Now we’re back in main. To determine the source of the problem, let’s examine the values of some variables:
(gdb) list
15 FloatImage outimage;
16 BinaryImage binimage;
17 int i,j;
18
19 inimage = (FloatImage)imLoadF(IMAGE_FLOAT,stdin);
20 outimage = laplacian_float(inimage);
21
22 binimage = marr_hildreth(outimage);
23 if (binimage == NULL) {
24 fprintf(stderr,"trymh: binimage returned NULL ");
(gdb) print inimage
$6 = (struct {...} *) 0x0
(gdb)
The variable inimage, containing the input image returned from imLoadF, is null. Passing a null pointer into the image manipulation routines certainly would cause a core dump in this case. However, we know imLoadF to be tried and true because it’s in a well-tested library, so what’s the problem?
As it turns out, our library function imLoadF returns NULL on failure (if the input format is bad, for example). Because we never checked the return value of imLoadF before passing it along to laplacian_float, the program goes haywire when inimage is assigned NULL. To correct the problem, we simply insert code to cause the program to exit with an error message if imLoadF returns a null pointer.
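The fix described above might look something like the following sketch. The source of trymh and of the image library is not shown in this section, so the Image type, load_image (a stand-in for imLoadF), and run are hypothetical names invented for illustration; only the pattern of testing the loader's result before using it is the point.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical image type; the real FloatImage layout is not shown here. */
typedef struct {
    int width;
    int height;
} Image;

/* Hypothetical stand-in for imLoadF: reads "width height" from the stream
   and returns NULL if the input is malformed or memory runs out. */
Image *load_image(FILE *in)
{
    int w, h;
    if (fscanf(in, "%d %d", &w, &h) != 2 || w <= 0 || h <= 0)
        return NULL;
    Image *img = malloc(sizeof *img);
    if (img == NULL)
        return NULL;
    img->width = w;
    img->height = h;
    return img;
}

/* Loads an image and reports failure instead of passing NULL onward.
   Returns 0 on success and 1 on failure; a real main would exit with it. */
int run(FILE *in)
{
    Image *inimage = load_image(in);
    if (inimage == NULL) {
        fprintf(stderr, "trymh: could not load input image\n");
        return 1;
    }
    assert(inimage->width > 0 && inimage->height > 0);
    /* ... processing such as laplacian_float(inimage) would go here ... */
    free(inimage);
    return 0;
}
```

With a check like this in place, bad input produces an error message rather than a segmentation fault several calls later.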
To quit gdb, just use the command quit. Unless the program has finished execution, gdb will complain that the program is still running:
(gdb) quit
The program is running. Quit anyway (and kill it)? (y or n) y
papaya$
In the following sections we examine some specific features provided by the debugger, given the general picture just presented.
Do you hate it when a program crashes and spites you again by leaving a 20-MB core file in your working directory, wasting much-needed space? Don’t be so quick to delete that core file; it can be very helpful. A core file is just a dump of the memory image of a process at the time of the crash. You can use the core file with gdb to examine the state of your program (such as the values of variables and data) and determine the cause for failure.
The core file is written to disk by the operating system whenever certain failures occur. The most frequent reason for a crash and the subsequent core dump is a memory violation, that is, trying to read or write memory to which your program does not have access. For example, attempting to write data using a null pointer can cause a segmentation fault, which is essentially a fancy way of saying, “you screwed up.” Segmentation faults are a common error and occur when you try to access (read from or write to) a memory address that does not belong to your process’s address space. This includes the address 0, as often happens with uninitialized pointers. Segmentation faults are often caused by trying to access an array item outside the declared size of the array, and are commonly a result of an off-by-one error. They can also be caused by a failure to allocate memory for a data structure.
Other errors that result in core files are so-called bus errors and floating-point exceptions. Bus errors result from using incorrectly aligned data and are therefore rare on the Intel architecture, which does not pose the strong alignment conditions that other architectures do. Floating-point exceptions point to a severe problem in a floating-point calculation like an overflow, but the most usual case is a division by zero.
However, not all such memory errors will cause immediate crashes. For example, you may overwrite memory in some way, but the program continues to run, not knowing the difference between actual data and instructions or garbage. Subtle memory violations can cause programs to behave erratically. One of the authors once witnessed a bug that caused the program to jump randomly around, but without tracing it with gdb, it still appeared to work normally. The only evidence of a bug was that the program returned output that meant, roughly, that two and two did not add up to four. Sure enough, the bug was an attempt to write one too many characters into a block of allocated memory. That single-byte error caused hours of grief.
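As an illustration of the "one too many characters" kind of bug, here is a minimal sketch of a string-copying helper. The function name copy_string is our own invention; the comment marks where the classic mistake (forgetting room for the terminating null byte) would creep in.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a heap copy of s, or NULL if allocation fails. Caller frees. */
char *copy_string(const char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    /* Bug pattern: malloc(len) leaves no room for the terminating '\0',
       so the copy below would write one byte past the end of the block. */
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, s, len + 1);   /* copies the '\0' as well */
    return copy;
}
```

A version that allocates only len bytes may appear to work for a long time, because the stray byte often lands in memory that nothing else happens to be using yet.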
You can prevent these kinds of memory problems (even the best programmers make these mistakes!) using the Valgrind package, a set of memory-management routines that replaces the commonly used malloc() and free() functions as well as their C++ counterparts, the operators new and delete. We talk about Valgrind in "Using Valgrind,” later in this chapter.
However, if your program does cause a memory fault, it will crash and dump core. Under Linux, core files are named, appropriately, core. The core file appears in the current working directory of the running process, which is usually the working directory of the shell that started the program; on occasion, however, programs may change their own working directory.
Some shells provide facilities for controlling whether core files are written. Under bash, for example, the default behavior is not to write core files. To enable core file output, you should use the command:
ulimit -c unlimited
probably in your .bashrc initialization file. You can specify a maximum size for core files other than unlimited, but truncated core files may not be of use when debugging applications.
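The shell's ulimit -c setting corresponds to the RLIMIT_CORE resource limit of the process, so a program can also raise its own limit with the getrlimit and setrlimit system calls. The following is a minimal sketch; the function name enable_core_dumps is ours, and it raises the soft limit only as far as the hard limit allows.

```c
#include <assert.h>
#include <sys/resource.h>

/* Raises the soft core-file size limit to the hard limit.
   Returns 0 on success, -1 on failure (errno is set by the system call). */
int enable_core_dumps(void)
{
    struct rlimit lim;
    if (getrlimit(RLIMIT_CORE, &lim) != 0)
        return -1;
    assert(lim.rlim_cur <= lim.rlim_max);  /* soft never exceeds hard */
    lim.rlim_cur = lim.rlim_max;
    return setrlimit(RLIMIT_CORE, &lim);
}
```

Calling such a function early in main has an effect similar to ulimit -c for that one program, without changing the shell's settings.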
Also, in order for a core file to be useful, the program must be compiled with debugging code enabled, as described in the previous section. Most binaries on your system will not contain debugging code, so the core file will be of limited value.
Our example for using gdb with a core file is yet another mythical program called cross. Like trymh in the previous section, cross takes an image file as input, does some calculations on it, and outputs another image file. However, when running cross, we get a segmentation fault:
papaya$ cross < image30.pfm > image30.pbm
Segmentation fault (core dumped)
papaya$
To invoke gdb for use with a core file, you must specify not only the core filename, but also the name of the executable that goes along with that core file. This is because the core file does not contain all the information necessary for debugging:
papaya$ gdb cross core
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.8, Copyright 1993 Free Software Foundation, Inc...
Core was generated by `cross'.
Program terminated with signal 11, Segmentation fault.
#0 0x2494 in crossings (image=0xc7c8) at cross.c:31
31 if ((image[i][j] >= 0) &&
(gdb)
gdb tells us that the core file was created when the program terminated with signal 11. A signal is a kind of message that is sent to a running program from the kernel, the user, or the program itself. Signals are generally used to terminate a program (and possibly cause it to dump core). For example, when you type the interrupt character, a signal is sent to the running program, which will probably kill the program.
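To make the idea of a signal a little more concrete, here is a small sketch in which a process sends a signal to itself with raise() and catches it with a handler. The names record_signal and send_self are hypothetical. The interrupt character delivers SIGINT in the same way; signal 11 on Linux is SIGSEGV, which by default kills the process and dumps core.

```c
#include <assert.h>
#include <signal.h>

/* Set by the handler; sig_atomic_t is safe to write from a handler. */
static volatile sig_atomic_t last_signal = 0;

static void record_signal(int signo)
{
    last_signal = signo;
}

/* Installs a handler for signo, sends the signal to this process, and
   returns the number the handler saw (0 if none arrived or on error). */
int send_self(int signo)
{
    assert(signo > 0);
    last_signal = 0;
    if (signal(signo, record_signal) == SIG_ERR)
        return 0;
    if (raise(signo) != 0)
        return 0;
    return last_signal;
}
```

Because a handler is installed, send_self(SIGINT) returns normally instead of terminating the program, which is what the default action for SIGINT would do.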
In this case, signal 11 was sent to the running cross process by the kernel when cross attempted to read or write to memory to which it did not have access. This signal caused cross to die and dump core. gdb says that the illegal memory reference occurred on line 31 of the source file cross.c:
(gdb) list
26 xmax = imGetWidth(image)-1;
27 ymax = imGetHeight(image)-1;
28
29 for (j=1; j<xmax; j++) {
30 for (i=1; i<ymax; i++) {
31 if ((image[i][j] >= 0) &&
32 (image[i-1][j-1] < 0) ||
33 (image[i-1][j] < 0) ||
34 (image[i-1][j+1] < 0) ||
35 (image[i][j-1] < 0) ||
(gdb)
Here, we see several things. First of all, there is a loop across the two index variables i and j, presumably in order to do calculations on the input image. Line 31 is an attempt to reference data from image[i][j], a two-dimensional array. When a program dumps core while attempting to access data from an array, it’s usually a sign that one of the indices is out of bounds. Let’s check them:
(gdb) print i
$1 = 1
(gdb) print j
$2 = 1194
(gdb) print xmax
$3 = 1551
(gdb) print ymax
$4 = 1194
(gdb)
Here we see the problem. The program was attempting to reference element image[1][1194]; however, the array extends only to image[1550][1193] (remember that arrays in C are indexed from 0 to max - 1). In other words, we attempted to read the 1195th row of an image that has only 1194 rows.
If we look at lines 29 and 30, we see the problem: the values xmax and ymax are reversed. The variable j should range from 1 to ymax (because it is the row index of the array), and i should range from 1 to xmax. Fixing the two for loops on lines 29 and 30 corrects the problem.
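A corrected pair of loops might look like the sketch below. The full source of cross.c is not shown, so this is not the program's actual code: the function name count_crossings, the array-of-column-pointers layout, and the shortened test inside the loop are all assumptions made for illustration. What matters is that i is bounded by xmax and j by ymax.

```c
#include <assert.h>
#include <stddef.h>

/* Counts interior pixels that are >= 0 and whose neighbor at i-1 is
   negative (a simplified test, not the one used in cross.c).
   Assumed layout: image[i][j] with i in [0,width) and j in [0,height). */
int count_crossings(float **image, int width, int height)
{
    assert(image != NULL);
    int xmax = width - 1;
    int ymax = height - 1;
    int count = 0;

    for (int j = 1; j < ymax; j++) {        /* j is bounded by ymax */
        for (int i = 1; i < xmax; i++) {    /* i is bounded by xmax */
            if (image[i][j] >= 0 && image[i - 1][j] < 0)
                count++;
        }
    }
    return count;
}
```

Stopping each loop one short of its maximum keeps neighbor references such as i+1 and j+1 inside the array as well.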
Let’s say that your program is crashing within a function that is called from many different locations, and you want to determine where the function was invoked from and what situation led up to the crash. The backtrace command displays the call stack of the program at the time of failure. If you are like the author of this section and are too lazy to type backtrace all the time, you will be delighted to hear that you can also use the shortcut bt.
The call stack is the list of functions that led up to the current one. For example, if the program starts in function main, which calls function foo, which calls bamf, the call stack looks like this:
(gdb) backtrace
#0 0x1384 in bamf () at goop.c:31
#1 0x4280 in foo () at goop.c:48
#2 0x218 in main () at goop.c:116
(gdb)
As each function is called, it pushes certain data onto the stack, such as saved registers, function arguments, local variables, and so forth. Each function has a certain amount of space allocated on the stack for its use. The chunk of memory on the stack for a particular function is called a stack frame, and the call stack is the ordered list of stack frames.
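The source of goop.c is not shown, but a pair of functions with the same calling pattern could be as simple as this sketch (the bodies are our own invention). Each call creates a new stack frame holding that function's argument and local variables.

```c
#include <assert.h>

/* Innermost function: while it runs, it occupies frame #0. */
int bamf(int depth)
{
    assert(depth >= 0);
    return depth;
}

/* Called from main; while bamf runs, foo's frame is #1. */
int foo(int depth)
{
    int local = depth + 1;   /* lives in foo's stack frame */
    return bamf(local);
}
```

If main calls foo(1) and you set a breakpoint in bamf, backtrace should list bamf, foo, and main in that order, though with different addresses and line numbers. Compiling with -O0 helps here, because an optimizing compiler may merge small functions like these into their callers.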
In the following example, we are looking at a core file for an X-based animation program. Using backtrace gives us the following:
(gdb) backtrace
#0 0x602b4982 in _end ()
#1 0xbffff934 in _end ()
#2 0x13c6 in stream_drawimage (wgt=0x38330000, sn=4)
at stream_display.c:94
#3 0x1497 in stream_refresh_all () at stream_display.c:116
#4 0x49c in control_update_all () at control_init.c:73
#5 0x224 in play_timeout (Cannot access memory at address 0x602b7676.
(gdb)
This is a list of stack frames for the process. The most recently called function is frame 0, which is the “function” _end in this case. Here, we see that play_timeout called control_update_all, which called stream_refresh_all, and so on. Somehow, the program jumped to _end, where it crashed.
However, _end is not a function; it is simply a label that specifies the end of the process data segment. When a program branches to an address such as _end, which is not a real function, it is a sign that something must have caused the process to go haywire, corrupting the call stack. (This is known in hacker jargon as “jumping to hyperspace.”) In fact, the error Cannot access memory at address 0x602b7676 is another indication that something bizarre has occurred.
We can see, however, that the last “real” function called was stream_drawimage, and we might guess that it is the source of the problem. To examine the state of stream_drawimage, we need to select its stack frame (frame number 2), using the frame command:
(gdb) frame 2
#2 0x13c6 in stream_drawimage (wgt=0x38330000, sn=4)
at stream_display.c:94
94 XCopyArea(mydisplay,streams[sn].frames[currentframe], XtWindow(wgt),
(gdb) list
91
92 printf("CopyArea frame %d, sn %d, wid %d ",currentframe,sn,wgt);
93
94 XCopyArea(mydisplay,streams[sn].frames[currentframe], XtWindow(wgt),
95 picGC,0,0,streams[sn].width,streams[sn].height,0,0);
(gdb)
Well, not knowing anything else about the program at hand, we can’t see anything wrong here, unless the variable sn (being used as an index into the array streams) is out of range. From the output of frame, we see that stream_drawimage was called with an sn parameter of 4. (Function parameters are displayed in the output of backtrace, as well as whenever we change frames.)
Let’s move up another frame, to stream_refresh_all, to see how stream_drawimage was called. To do this, we use the up command, which selects the stack frame above the current one:
(gdb) up
#3 0x1497 in stream_refresh_all () at stream_display.c:116
116 stream_drawimage(streams[i].drawbox,i);
(gdb) list
113 void stream_refresh_all(void) {
114 int i;
115 for (i=0; i<=numstreams; i++) {
116 stream_drawimage(streams[i].drawbox,i);
117
(gdb) print i
$2 = 4
(gdb) print numstreams
$3 = 4
(gdb)
Here, we see that the index variable i is looping from 0 to numstreams, and indeed i here is 4, the second parameter to stream_drawimage. However, numstreams is also 4. What’s going on?
The for loop on line 115 looks funny; it should read as follows:
for (i=0; i<numstreams; i++) {
The error is in the use of the <= comparison operator. The streams array is indexed from 0 to numstreams-1, not from 0 to numstreams. This simple off-by-one error caused the program to go berserk.
As you can see, using gdb with a core dump allows you to browse through the image of a crashed program to find bugs. Never again will you delete those pesky core files, right?
gdb can also debug a program that is already running, allowing you to interrupt it, examine it, and then return the process to its regularly scheduled execution. This is very similar to running a program from within gdb, and there are only a few new commands to learn.
The attach command attaches gdb to a running process. In order to use attach you must also have access to the executable that corresponds to the process.
For example, if you have started the program pgmseq with process ID 254, you can start up gdb with
papaya$ gdb pgmseq
and once inside gdb, use the command
(gdb) attach 254
Attaching program `/home/loomer/mdw/pgmseq/pgmseq', pid 254
__select (nd=4, in=0xbffff96c, out=0xbffff94c, ex=0xbffff92c, tv=0x0)
at __select.c:22
__select.c:22: No such file or directory.
(gdb)
The No such file or directory error is given because gdb can’t locate the source file for __select. This is often the case with system calls and library functions, and it’s nothing to worry about.
You can also start gdb with the command
papaya$ gdb pgmseq 254
Once gdb attaches to the running process, it temporarily suspends the program and lets you take over, issuing gdb commands. Or you can set a breakpoint or watchpoint (with the break and watch commands) and use continue to cause the program to continue execution until the breakpoint is triggered.
The detach command detaches gdb from the running process. You can then use attach again, on another process, if necessary. If you find a bug, you can detach the current process, make changes to the source, recompile, and use the file command to load the new executable into gdb. You can then start the new version of the program and use the attach command to debug it. All without leaving gdb!
In fact, gdb allows you to debug three programs concurrently: one running directly under gdb, one tracing with a core file, and one running as an independent process. The target command allows you to select which one you wish to debug.
To examine the values of variables in your program, you can use the print, x, and ptype commands. The print command is the most commonly used data inspection command; it takes as an argument an expression in the source language (usually C or C++) and returns its value. For example:
(gdb) print mydisplay
$10 = (struct _XDisplay *) 0x9c800
(gdb)
This displays the value of the variable mydisplay, as well as an indication of its type. Because this variable is a pointer, you can examine its contents by dereferencing the pointer, as you would in C:
(gdb) print *mydisplay
$11 = {ext_data = 0x0, free_funcs = 0x99c20, fd = 5, lock = 0,
proto_major_version = 11, proto_minor_version = 0,
vendor = 0x9dff0 "XFree86", resource_base = 41943040,
...
error_vec = 0x0, cms = {defaultCCCs = 0xa3d80 "",
clientCmaps = 0x991a0 "'",
perVisualIntensityMaps = 0x0}, conn_checker = 0, im_filters = 0x0}
(gdb)
mydisplay is an extensive structure used by X programs; we have abbreviated the output for your reading enjoyment.
print can print the value of just about any expression, including C function calls (which it executes on the fly, within the context of the running program):
(gdb) print getpid()
$11 = 138
(gdb)
Of course, not all functions may be called in this manner. Only those functions that have been linked to the running program may be called. If a function has not been linked to the program and you attempt to call it, gdb will complain that there is no such symbol in the current context.
More complicated expressions may be used as arguments to print as well, including assignments to variables. For example:
(gdb) print mydisplay->vendor = "Linux"
$19 = 0x9de70 "Linux"
(gdb)
assigns to the vendor member of the mydisplay structure the value "Linux" instead of "XFree86" (a useless modification, but interesting nonetheless). In this way, you can interactively change data in a running program to correct errant behavior or test uncommon situations.
Note that after each print command, the value displayed is saved in gdb's value history under a number such as $10, which you can use in later expressions. For example, to recall the value of mydisplay in the previous example, we need merely print the value of $10:
(gdb) print $10
$21 = (struct _XDisplay *) 0x9c800
(gdb)
You may also use expressions, such as typecasts, with the print command. Almost anything goes.
The ptype command gives you detailed (and often long-winded) information about a variable’s type or the definition of a struct or typedef. To get a full definition for the struct _XDisplay used by the mydisplay variable, we use:
(gdb) ptype mydisplay
type = struct _XDisplay {
struct _XExtData *ext_data;
struct _XFreeFuncs *free_funcs;
int fd;
int lock;
int proto_major_version;
....
struct _XIMFilter *im_filters;
} *
(gdb)
If you’re interested in examining memory on a more fundamental level, beyond the petty confines of defined types, you can use the x command. x takes a memory address as an argument. If you give it a variable, it uses the value of that variable as the address.
x also takes a count and a type specification as optional arguments. The count is the number of objects of the given type to display. For example, x/100x 0x4200 displays 100 objects (four-byte words, unless you specify another unit size) in hexadecimal format, starting at the address 0x4200. Use help x to get a description of the various output formats.
To examine the value of mydisplay->vendor, we can use:
(gdb) x mydisplay->vendor
0x9de70 <_end+35376>: 76 'L'
(gdb) x/6c mydisplay->vendor
0x9de70 <_end+35376>: 76 'L' 105 'i' 110 'n' 117 'u' 120 'x' 0 '\000'
(gdb) x/s mydisplay->vendor
0x9de70 <_end+35376>: "Linux"
(gdb)
The first field of each line gives the absolute address of the data. The second represents the address as some symbol (in this case, _end) plus an offset in bytes. The remaining fields give the actual value of memory at that address, first in decimal, then as an ASCII character. As described earlier, you can force x to print the data in other formats.
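If it helps to see what a character dump like x/6c amounts to, the following sketch formats a run of bytes as decimal value and character pairs. It is only an imitation written for this discussion, not gdb's implementation; the name dump_bytes is ours, and unlike gdb it shows nonprintable bytes as a period rather than as an escape sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Writes count bytes starting at addr into out as "<decimal> '<char>'"
   pairs separated by spaces. Nonprintable bytes are shown as '.'.
   Returns the number of characters written, or -1 if out is too small. */
int dump_bytes(const void *addr, size_t count, char *out, size_t outsize)
{
    const unsigned char *p = addr;
    size_t used = 0;

    assert(out != NULL && outsize > 0);
    out[0] = '\0';
    for (size_t k = 0; k < count; k++) {
        int shown = (p[k] >= 32 && p[k] < 127) ? p[k] : '.';
        int n = snprintf(out + used, outsize - used, "%s%d '%c'",
                         k ? " " : "", p[k], shown);
        if (n < 0 || (size_t)n >= outsize - used)
            return -1;          /* output was truncated */
        used += (size_t)n;
    }
    return (int)used;
}
```

Passing it the string "Linux" and a count of 6 yields the same decimal values seen in the x/6c output above, including the final zero byte that terminates the string.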
The info command provides information about the status of the program being debugged. There are many subcommands under info; use help info to see them all. For example, info program displays the execution status of the program:
(gdb) info program
Using the running image of child process 138.
Program stopped at 0x9e.
It stopped at breakpoint 1.
(gdb)
Another useful command is info locals, which displays the names and values of all local variables in the current function:
(gdb) info locals
inimage = (struct {...} *) 0x2000
outimage = (struct {...} *) 0x8000
(gdb)
This is a rather cursory description of the variables. The print or x commands describe them further.
Similarly, info variables displays a list of all known variables in the program, ordered by source file. Note that many of the variables displayed will be from sources outside your actual program—for example, the names of variables used within the library code. The values for these variables are not displayed because the list is culled more or less directly from the executable’s symbol table. Only those local variables in the current stack frame and global (static) variables are actually accessible from gdb. info address gives you information about exactly where a certain variable is stored. For example:
(gdb) info address inimage
Symbol "inimage" is a local variable at frame offset -20.
(gdb)
By frame offset, gdb means that inimage is stored 20 bytes below the top of the stack frame.
You can get information on the current frame using the info frame command, like so:
(gdb) info frame
Stack level 0, frame at 0xbffffaa8:
eip = 0x9e in main (main.c:44); saved eip 0x34
source language c.
Arglist at 0xbffffaa8, args: argc=1, argv=0xbffffabc
Locals at 0xbffffaa8, Previous frame's sp is 0x0
Saved registers:
ebx at 0xbffffaa0, ebp at 0xbffffaa8, esi at 0xbffffaa4, eip at
0xbffffaac
(gdb)
This kind of information is useful if you’re debugging at the assembly-language level with the disass, nexti, and stepi commands (see "Instruction-level debugging,” later in this chapter).
We have barely scratched the surface of what gdb can do. It is an amazing program with a lot of power; we have introduced you only to the most commonly used commands. In this section, we look at other features of gdb and then send you on your way.
If you’re interested in learning more about gdb, we encourage you to read the gdb manual page and the Free Software Foundation manual. The manual is also available as an online Info file. (Info files may be read under Emacs or using the info reader; see "Tutorial and Online Help" in Chapter 19 for details.)
As promised, this section demonstrates further use of breakpoints and watchpoints. Breakpoints are set with the break command; similarly, watchpoints are set with the watch command. The only difference between the two is that breakpoints must break at a particular location in the program (on a certain line of code, for example), whereas watchpoints are triggered whenever the value of a certain expression changes, regardless of location within the program. Though powerful, watchpoints can be extremely inefficient when gdb has to implement them in software, because it must then single-step the program and reevaluate the expression after every step.
When a breakpoint or watchpoint is triggered, gdb suspends the program and returns control to you. Breakpoints and watchpoints allow you to run the program (using the run and continue commands) and stop only in certain situations, thus saving you the trouble of using many next and step commands to walk through the program manually.
There are many ways to set a breakpoint in the program. You can specify a line number, as in break 20. Or, you can specify a particular function, as in break stream_unload. You can also specify a line number in another source file, as in break foo.c:38. Use help break to see the complete syntax.
Breakpoints may be conditional; that is, the breakpoint triggers only when a certain expression is true. For example, using the command:
break 184 if (status == 0)
sets a conditional breakpoint at line 184 in the current source file, which triggers only when the variable status is zero. The variable status must be either a global variable or a local variable in the current stack frame. The expression may be any valid expression in the source language that gdb understands, identical to the expressions used by the print command. You can change the breakpoint condition (if it is conditional) using the condition command.
Using the command info break gives you a list of all breakpoints and watchpoints and their status. This allows you to delete or disable breakpoints, using the commands clear, delete, or disable. A disabled breakpoint is merely inactive, until you reenable it (with the enable command). A breakpoint that has been deleted, on the other hand, is gone from the list of breakpoints for good. You can also specify that a breakpoint be enabled once, meaning that once it is triggered, it will be disabled again.
To set a watchpoint, use the watch command, as in the following example:
watch (numticks < 1024 && incoming != clear)
The watched expression may be any valid source expression, as with conditional breakpoints; gdb stops the program each time the value of the expression changes.
gdb is capable of debugging on the processor-instruction level, allowing you to watch the innards of your program with great scrutiny. However, understanding what you see requires not only knowledge of the processor architecture and assembly language, but also some idea of how the operating system sets up process address space. For example, it helps to understand the conventions used for setting up stack frames, calling functions, passing parameters and return values, and so on. Any book on protected-mode 80386/80486 programming can fill you in on these details. But be warned: protected-mode programming on this processor is quite different from real-mode programming (as is used in the MS-DOS world). Be sure that you’re reading about native protected-mode ’386 programming, or else you might subject yourself to terminal confusion.
The primary gdb commands used for instruction-level debugging are nexti, stepi, and disass. nexti is equivalent to next, except that it steps to the next instruction, not the next source line. Similarly, stepi is the instruction-level analog of step.
The disass command displays a disassembly of an address range that you supply. This address range may be specified by literal address or function name. For example, to display a disassembly of the function play_timeout, use the following command:
(gdb) disass play_timeout
Dump of assembler code for function play_timeout:
to 0x2ac:
0x21c <play_timeout>: pushl %ebp
0x21d <play_timeout+1>: movl %esp,%ebp
0x21f <play_timeout+3>: call 0x494 <control_update_all>
0x224 <play_timeout+8>: movl 0x952f4,%eax
0x229 <play_timeout+13>: decl %eax
0x22a <play_timeout+14>: cmpl %eax,0x9530c
0x230 <play_timeout+20>: jne 0x24c <play_timeout+48>
0x232 <play_timeout+22>: jmp 0x29c <play_timeout+128>
0x234 <play_timeout+24>: nop
0x235 <play_timeout+25>: nop
...
0x2a8 <play_timeout+140>: addb %al,(%eax)
0x2aa <play_timeout+142>: addb %al,(%eax)
(gdb)
This is equivalent to using the command disass 0x21c (where 0x21c is the literal address of the beginning of play_timeout).
You can specify an optional second argument to disass, which will be used as the address where disassembly should stop. Using disass 0x21c 0x232 will display only the first seven lines of the assembly listing in the previous example (the instruction starting with 0x232 itself will not be displayed). Recent versions of gdb expect the two addresses to be separated by a comma, as in disass 0x21c,0x232.
If you use nexti and stepi often, you may wish to use the command:
display/i $pc
This causes the current instruction to be displayed after every nexti or stepi command. display specifies expressions to be printed each time the program stops, which includes after every stepping command. $pc is a gdb internal register that corresponds to the processor’s program counter, pointing to the current instruction.
(X)Emacs (described in Chapter 19) provides a debugging mode that lets you run gdb, or another debugger, within the integrated program-tracing environment provided by Emacs. This so-called Grand Unified Debugger library is very powerful and allows you to debug and edit your programs entirely within Emacs.
To start gdb under Emacs, use the Emacs command M-x gdb and give the name of the executable to debug as the argument. A buffer will be created for gdb, which is similar to using gdb alone. You can then use core-file to load a core file or attach to attach to a running process, if you wish.
Whenever you step to a new frame (for example, when you first trigger a breakpoint), gdb opens a separate window that displays the source corresponding to the current stack frame. You may use this buffer to edit the source text just as you normally would with Emacs, but the current source line is highlighted with an arrow (the characters =>). This allows you to watch the source in one window and execute gdb commands in the other.
Within the debugging window, you can use several special key sequences. They are fairly long, though, so it’s not clear that you’ll find them more convenient than just entering gdb commands directly. Some of the more common commands include the following:
C-x C-a C-s
The equivalent of a gdb step command, updating the source window appropriately
C-x C-a C-i
The equivalent of a stepi command
C-x C-a C-n
The equivalent of a next command
C-x C-a C-r
The equivalent of a continue command
C-x C-a <
The equivalent of an up command
C-x C-a >
The equivalent of a down command
If you do enter commands in the traditional manner, you can use M-p to move backward to previously issued commands and M-n to move forward. You can also move around in the buffer using Emacs commands for searching, cursor movement, and so on.
All in all, using gdb within Emacs is more
convenient than using it from the shell.
In addition, you may edit the source text in the gdb source buffer; the prefix arrow will not be present in the source when it is saved.
Emacs is very easy to customize, and you can write many extensions to this gdb interface yourself. You can define Emacs keys for other commonly used gdb commands or change the behavior of the source window. (For example, you can highlight all breakpoints in some fashion or provide keys to disable or clear breakpoints.)
[*] The sample programs in this section are not programs you’re likely to run into anywhere; they were thrown together by the authors for the purpose of demonstration.