In this chapter we explore the rudimentary facilities within MDB for analyzing kernel crash images and debugging live kernels. The objective is not to provide an all-encompassing kernel crash analysis tutorial, but rather to introduce the most relevant MDB dcmds and techniques.
A more comprehensive guide to crash dump analysis can be found in some of the recommended reference texts, for example, Panic! by Chris Drake and Kimberly Brown for SPARC [8], and “Crash Dump Analysis” by Frank Hoffman for x86/x64 [12].
The most common type of kernel debug target is a core file, saved from a prior system crash. In the following sections, we highlight some of the introductory steps as used with mdb to explore a kernel core image.
If a system has crashed, then we should have a core image saved in /var/crash on the target machine. The mdb debugger should be invoked from a system with the same architecture and Solaris revision as the crash image. The first steps are to locate the appropriate saved image and then to invoke mdb.
# cd /var/crash/nodename
# ls
bounds    unix.1    unix.3    unix.5    unix.7    vmcore.1  vmcore.3  vmcore.5  vmcore.7
unix.0    unix.2    unix.4    unix.6    vmcore.0  vmcore.2  vmcore.4  vmcore.6
# mdb -k unix.7 vmcore.7
Loading modules: [ unix krtld genunix specfs dtrace ufs ip sctp usba uhci s1394 fcp fctl nca lofs zfs random nfs audiosup sppp crypto md fcip logindmux ptm ipc ]
>
The kernel core contains important summary information from which we can extract the following:
Revision of the kernel
Hostname
CPU and platform architecture of the system
Panic string
Module causing the panic
We can use the ::showrev, ::status, and ::panicinfo dcmds to extract this information.
> ::showrev
Hostname: zones-internal
Release: 5.11
Kernel architecture: i86pc
Application architecture: i386
Kernel version: SunOS 5.11 i86pc snv_27
Platform: i86pc
> ::status
debugging crash dump vmcore.2 (32-bit) from zones-internal
operating system: 5.11 snv_27 (i86pc)
panic message:
BAD TRAP: type=e (#pf Page fault) rp=d2a587c8 addr=0 occurred in module "unix" due to a NULL pointer dereference
dump content: kernel pages only
> ::panicinfo
             cpu                0
          thread         d2a58de0
         message BAD TRAP: type=e (#pf Page fault) rp=d2a587c8 addr=0 occurred in module "unix" due to a NULL pointer dereference
              gs         fe8301b0
              fs         fec30000
              es         fe8d0160
              ds         d9820160
             edi                0
             esi         dc062298
             ebp         d2a58828
             esp         d2a58800
             ebx         de453000
             edx         d2a58de0
             ecx                1
             eax                0
          trapno                e
             err                2
             eip         fe82ca58
              cs              158
          eflags            10282
            uesp         fe89ab0d
              ss                0
             gdt     fec1f2f002cf
             idt     fec1f5c007ff
             ldt              140
            task              150
             cr0         8005003b
             cr2                0
             cr3          4cb3000
             cr4              6d8
The kernel keeps a cyclic buffer of the recent kernel messages. In this buffer we can observe the messages up to the time of the panic. The ::msgbuf dcmd shows the contents of the buffer.
> ::msgbuf
MESSAGE
/pseudo/zconsnex@1/zcons@5 (zcons5) online
/pseudo/zconsnex@1/zcons@6 (zcons6) online
/pseudo/zconsnex@1/zcons@7 (zcons7) online
pseudo-device: ramdisk1024
...
panic[cpu0]/thread=d2a58de0:
BAD TRAP: type=e (#pf Page fault) rp=d2a587c8 addr=0 occurred in module "unix" due to a
NULL pointer dereference
sched:
#pf Page fault
Bad kernel fault at addr=0x0
pid=0, pc=0xfe82ca58, sp=0xfe89ab0d, eflags=0x10282
cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6d8<xmme,fxsr,pge,mce,pse,de>
cr2: 0 cr3: 4cb3000
gs: fe8301b0 fs: fec30000 es: fe8d0160 ds: d9820160
edi: 0 esi: dc062298 ebp: d2a58828 esp: d2a58800
ebx: de453000 edx: d2a58de0 ecx: 1 eax: 0
trp: e err: 2 eip: fe82ca58 cs: 158
efl: 10282 usp: fe89ab0d ss: 0
...
We can obtain a stack backtrace of the current thread by using the $C command. Note that the displayed arguments to each function are not necessarily accurate. On each platform, the meaning of the shown arguments is as follows:
SPARC. The values of the arguments if they are available from a saved stack frame, assuming they are not overwritten by use of registers during the called function. With SPARC architectures, a function’s input argument registers are sometimes saved on the way out of a function—if the input registers are reused during the function, then values of the input arguments are overwritten and lost.
x86. Accurate values of the input arguments. Input arguments are always saved onto the stack and can be accurately displayed.
x64. The values of the arguments, assuming they are available. As with the SPARC architectures, input arguments are passed in registers and may be overwritten.
> $C
d2a58828 atomic_add_32+8(0)
d2a58854 nfs4_async_inactive+0x3b(dc1c29c0, 0)
d2a58880 nfs4_inactive+0x41()
d2a5889c fop_inactive+0x15(dc1c29c0, 0)
d2a588b0 vn_rele+0x4b(dc1c29c0)
d2a588c0 snf_smap_desbfree+0x59(dda94080)
d2a588dc dblk_lastfree_desb+0x13(de45b520, d826fb40)
d2a588f4 dblk_decref+0x4e(de45b520, d826fb40)
d2a58918 freemsg+0x69(de45b520)
d2a5893c FreeTxSwPacket+0x3b(d38b84f0)
d2a58968 CleanTxInterrupts+0xb4(d2f9cac0)
d2a589a4 e1000g_send+0xf6(d2f9cac0, d9ffba00)
d2a589c0 e1000g_m_tx+0x22()
d2a589dc dls_tx+0x16(d4520f68, d9ffba00)
d2a589f4 str_mdata_fastpath_put+0x1e(d3843f20, d9ffba00)
d2a58a40 tcp_send_data+0x62d(db0ecac0, d97ee250, d9ffba00)
d2a58aac tcp_send+0x6b6(d97ee250, db0ecac0, 564, 28, 14, 0)
d2a58b40 tcp_wput_data+0x622(db0ecac0, 0, 0)
d2a58c28 tcp_rput_data+0x2560(db0ec980, db15bd20, d2d45f40)
d2a58c40 tcp_input+0x3c(db0ec980, db15bd20, d2d45f40)
d2a58c78 squeue_enter_chain+0xe9(d2d45f40, db15bd20, db15bd20, 1, 1)
d2a58cec ip_input+0x658(d990e554, d3164010, 0, e)
d2a58d40 i_dls_link_ether_rx+0x156(d4523db8, d3164010, db15bd20)
d2a58d70 mac_rx+0x56(d3520200, d3164010, db15bd20)
d2a58dac e1000g_intr+0xa6(d2f9cac0, 0)
d2a58ddc intr_thread+0x122()
If the stack trace is of a kernel housekeeping or interrupt thread, the process reported for the thread will be that of p0 (“sched”). The process pointer for the thread can be obtained with ::thread, and ::ps will then display summary information about that process. In this example, the thread is an interrupt thread (as indicated by the intr_thread entry in the stack from $C), and the process name maps to sched.
> d2a58de0::thread -p
    ADDR     PROC      LWP     CRED
d2a58de0 fec1d280        0 d9d1cf38
> fec1d280::ps -t
S    PID   PPID   PGID    SID    UID      FLAGS     ADDR NAME
R      0      0      0      0      0 0x00000001 fec1d280 sched
        T t0 <TS_STOPPED>
Once we’ve located the thread of interest, we often learn more about what happened by disassembling the target and looking at the instruction that reportedly caused the panic. MDB’s ::dis dcmd will disassemble the code around the target instruction that we extract from the stack backtrace.
> $C
d2a58828 atomic_add_32+8(0)
d2a58854 nfs4_async_inactive+0x3b(dc1c29c0, 0)
d2a58880 nfs4_inactive+0x41()
d2a5889c fop_inactive+0x15(dc1c29c0, 0)
d2a588b0 vn_rele+0x4b(dc1c29c0)
...
> nfs4_async_inactive+0x3b::dis
nfs4_async_inactive+0x1a: pushl $0x28
nfs4_async_inactive+0x1c: call +0x51faa30 <kmem_alloc>
nfs4_async_inactive+0x21: addl $0x8,%esp
nfs4_async_inactive+0x24: movl %eax,%esi
nfs4_async_inactive+0x26: movl $0x0,(%esi)
nfs4_async_inactive+0x2c: movl -0x4(%ebp),%eax
nfs4_async_inactive+0x2f: movl %eax,0x4(%esi)
nfs4_async_inactive+0x32: movl 0xc(%ebp),%edi
nfs4_async_inactive+0x35: pushl %edi
nfs4_async_inactive+0x36: call +0x51b7cdc <crhold>
nfs4_async_inactive+0x3b: addl $0x4,%esp
nfs4_async_inactive+0x3e: movl %edi,0x8(%esi)
nfs4_async_inactive+0x41: movl $0x4,0xc(%esi)
nfs4_async_inactive+0x48: leal 0xe0(%ebx),%eax
nfs4_async_inactive+0x4e: movl %eax,-0x8(%ebp)
nfs4_async_inactive+0x51: pushl %eax
nfs4_async_inactive+0x52: call +0x51477f4 <mutex_enter>
nfs4_async_inactive+0x57: addl $0x4,%esp
nfs4_async_inactive+0x5a: cmpl $0x0,0xd4(%ebx)
nfs4_async_inactive+0x61: je +0x7e <nfs4_async_inactive+0xdf>
nfs4_async_inactive+0x63: cmpl $0x0,0xd0(%ebx)
> crhold::dis
crhold: pushl %ebp
crhold+1: movl %esp,%ebp
crhold+3: andl $0xfffffff0,%esp
crhold+6: pushl $0x1
crhold+8: movl 0x8(%ebp),%eax
crhold+0xb: pushl %eax
crhold+0xc: call -0x6e0b8 <atomic_add_32>
crhold+0x11: movl %ebp,%esp
crhold+0x13: popl %ebp
crhold+0x14: ret
> atomic_add_32::dis
atomic_add_32: movl 0x4(%esp),%eax
atomic_add_32+4: movl 0x8(%esp),%ecx
atomic_add_32+8: lock addl %ecx,(%eax)
atomic_add_32+0xb: ret
In this example, the system had a NULL pointer reference at atomic_add_32+8(0). The faulting instruction was an atomic (lock-prefixed) add, referencing the memory at the location pointed to by %eax. By looking at the registers at the time of the panic, we can see that %eax was indeed NULL. The next step is to attempt to find out why %eax was NULL.
> ::regs
%cs = 0x0158 %eax = 0x00000000
%ds = 0xd9820160 %ebx = 0xde453000
%ss = 0x0000 %ecx = 0x00000001
%es = 0xfe8d0160 %edx = 0xd2a58de0
%fs = 0xfec30000 %esi = 0xdc062298
%gs = 0xfe8301b0 %edi = 0x00000000
%eip = 0xfe82ca58 atomic_add_32+8
%ebp = 0xd2a58828
%esp = 0xd2a58800
%eflags = 0x00010282
id=0 vip=0 vif=0 ac=0 vm=0 rf=1 nt=0 iopl=0x0
status=<of,df,IF,tf,SF,zf,af,pf,cf>
%uesp = 0xfe89ab0d
%trapno = 0xe
%err = 0x2
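The %err value in the register dump can be decoded by hand. The following is a minimal sketch in C, assuming the standard x86 page-fault error-code layout (bit 0 set: fault on a present page; bit 1 set: write access; bit 2 set: access from user mode). The names pf_info and decode_pf_err are hypothetical and are not part of MDB or the kernel.

```c
#include <stdint.h>

/* Hypothetical names for the low three bits of the x86 page-fault error code. */
#define DEMO_PF_PRESENT 0x1u  /* set: protection violation; clear: page not present */
#define DEMO_PF_WRITE   0x2u  /* set: the faulting access was a write */
#define DEMO_PF_USER    0x4u  /* set: the access came from user mode */

struct pf_info {
    int protection_violation;
    int write;
    int user_mode;
};

/* Split a page-fault error code into its three basic flags. */
struct pf_info
decode_pf_err(uint32_t err)
{
    struct pf_info info;

    info.protection_violation = (err & DEMO_PF_PRESENT) != 0;
    info.write = (err & DEMO_PF_WRITE) != 0;
    info.user_mode = (err & DEMO_PF_USER) != 0;
    return info;
}
```

For err = 0x2 this reports a write to a non-present page from kernel mode, which is consistent with the locked add through a NULL %eax.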
The function prototype for atomic_add_32() reveals that the first argument is a pointer to the memory location to be added to. Since this was an x86 machine, the arguments reported by the stack backtrace are known to be useful, and we can look to see where the NULL pointer was handed down, in this case nfs4_async_inactive().
void
atomic_add_32(volatile uint32_t *target, int32_t delta)
{
        *target += delta;
}
> atomic_add_32::dis
atomic_add_32: movl 0x4(%esp),%eax
atomic_add_32+4: movl 0x8(%esp),%ecx
atomic_add_32+8: lock addl %ecx,(%eax)
atomic_add_32+0xb: ret
> $C
d2a58828 atomic_add_32+8(0)
d2a58854 nfs4_async_inactive+0x3b(dc1c29c0, 0)
d2a58880 nfs4_inactive+0x41()
d2a5889c fop_inactive+0x15(dc1c29c0, 0)
d2a588b0 vn_rele+0x4b(dc1c29c0)
...
> nfs4_async_inactive+0x3b::dis
nfs4_async_inactive+0x1a: pushl $0x28
nfs4_async_inactive+0x1c: call +0x51faa30 <kmem_alloc>
nfs4_async_inactive+0x21: addl $0x8,%esp
nfs4_async_inactive+0x24: movl %eax,%esi
nfs4_async_inactive+0x26: movl $0x0,(%esi)
nfs4_async_inactive+0x2c: movl -0x4(%ebp),%eax
nfs4_async_inactive+0x2f: movl %eax,0x4(%esi)
nfs4_async_inactive+0x32: movl 0xc(%ebp),%edi
nfs4_async_inactive+0x35: pushl %edi
nfs4_async_inactive+0x36: call +0x51b7cdc <crhold>
nfs4_async_inactive+0x3b: addl $0x4,%esp
nfs4_async_inactive+0x3e: movl %edi,0x8(%esi)
nfs4_async_inactive+0x41: movl $0x4,0xc(%esi)
nfs4_async_inactive+0x48: leal 0xe0(%ebx),%eax
nfs4_async_inactive+0x4e: movl %eax,-0x8(%ebp)
nfs4_async_inactive+0x51: pushl %eax
nfs4_async_inactive+0x52: call +0x51477f4 <mutex_enter>
nfs4_async_inactive+0x57: addl $0x4,%esp
nfs4_async_inactive+0x5a: cmpl $0x0,0xd4(%ebx)
nfs4_async_inactive+0x61: je +0x7e <nfs4_async_inactive+0xdf>
nfs4_async_inactive+0x63: cmpl $0x0,0xd0(%ebx)
...
Looking at the disassembly, it appears that there is an additional function call, which is omitted from the stack backtrace (typically due to tail call compiler optimization). The call is to crhold(), passing the address of a credential structure from the arguments to nfs4_async_inactive(). Here we can see that crhold() does in fact call atomic_add_32().
/*
* Put a hold on a cred structure.
*/
void
crhold(cred_t *cr)
{
atomic_add_32(&cr->cr_ref, 1);
}
> crhold::dis
crhold: pushl %ebp
crhold+1: movl %esp,%ebp
crhold+3: andl $0xfffffff0,%esp
crhold+6: pushl $0x1
crhold+8: movl 0x8(%ebp),%eax
crhold+0xb: pushl %eax
crhold+0xc: call -0x6e0b8 <atomic_add_32>
crhold+0x11: movl %ebp,%esp
crhold+0x13: popl %ebp
crhold+0x14: ret
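One detail worth checking is why the faulting address is exactly 0 rather than a small offset from it. In the disassembly, crhold() pushes its argument for atomic_add_32() unchanged (movl 0x8(%ebp),%eax followed by pushl %eax), which is consistent with cr_ref sitting at offset 0 of the credential structure, so that &cr->cr_ref equals cr. The sketch below illustrates the arithmetic with a hypothetical stand-in structure, demo_cred; it is not the kernel's cred_t definition, and member_address is an illustrative helper only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for cred_t; only the position of each member matters. */
struct demo_cred {
    uint32_t cr_ref;   /* reference count, first member */
    uint32_t cr_uid;   /* some later member */
};

/*
 * Compute the address that &base->member evaluates to, without
 * dereferencing base. With base == 0 (NULL) the result is simply
 * the member's offset within the structure.
 */
uintptr_t
member_address(uintptr_t base, size_t member_offset)
{
    return base + member_offset;
}
```

With a NULL base, a first member yields address 0, while any later member would fault at a small nonzero address; that difference is often a useful hint about which field a NULL structure pointer was being used to reach.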
Next, we look into the situation in which nfs4_async_inactive() was called. The first argument is a vnode pointer, and the second is our suspicious credential pointer. The vnode pointer can be examined with the CTF information and the ::print dcmd. We can see that we were performing an nfs4_async_inactive function on the vnode referencing a PDF file in this case.
void
nfs4_async_inactive(vnode_t *vp, cred_t *cr)
{
> $C
d2a58828 atomic_add_32+8(0)
d2a58854 nfs4_async_inactive+0x3b(dc1c29c0, 0)
> dc1c29c0::print vnode_t
{
...
    v_type = 1 (VREG)
    v_rdev = 0
...
    v_path = 0xdc3de800 "/zones/si/root/home/ftp/book/solarisinternals_projtaskipc.pdf"
...
}
Looking further at the stack backtrace and the code, we can try to identify where the credentials were derived from. nfs4_async_inactive() was called by nfs4_inactive(), which is one of the standard VOP methods (VOP_INACTIVE).
> $C
d2a58828 atomic_add_32+8(0)
d2a58854 nfs4_async_inactive+0x3b(dc1c29c0, 0)
d2a58880 nfs4_inactive+0x41()
d2a5889c fop_inactive+0x15(dc1c29c0, 0)
d2a588b0 vn_rele+0x4b(dc1c29c0)
The credential can be followed all the way up to vn_rele(), which derives the pointer from CRED(), which references the current thread’s t_cred.
vn_rele(vnode_t *vp)
{
        if (vp->v_count == 0)
                cmn_err(CE_PANIC, "vn_rele: vnode ref count 0");
        mutex_enter(&vp->v_lock);
        if (vp->v_count == 1) {
                mutex_exit(&vp->v_lock);
                VOP_INACTIVE(vp, CRED());
...
#define CRED() curthread->t_cred
We know which thread called vn_rele(): the interrupt thread with a thread pointer of d2a58de0. We can use ::print to take a look at the thread’s t_cred.
> d2a58de0::print kthread_t t_cred
t_cred = 0xd9d1cf38
Interestingly, it’s not NULL! A further look around the code gives us some clues as to what’s going on. In the initialization code during the creation of an interrupt thread, the t_cred is set to NULL:
/*
 * Create and initialize an interrupt thread.
 * Returns non-zero on error.
 * Called at spl7() or better.
 */
void
thread_create_intr(struct cpu *cp)
{
...
        /*
         * Nobody should ever reference the credentials of an interrupt
         * thread so make it NULL to catch any such references.
         */
        tp->t_cred = NULL;
Our curthread->t_cred is not NULL, but NULL was passed in when CRED() accessed it in the not-too-distant past, an interesting situation indeed. It turns out that the NFS client code wills credentials to the interrupt thread’s t_cred, so what we are in fact seeing is a race condition, where vn_rele() is called from the interrupt thread with no credentials. In this case, a bug was logged accordingly and the problem was fixed!
Another good source of information is the ::cpuinfo dcmd. It shows a rich set of information about the processors in the system. For each CPU, the details of the thread currently running on that processor are shown. If the current CPU is handling an interrupt, then the thread running the interrupt and the preempted thread are shown. In addition, a list of threads waiting in the run queue for this processor is shown.
In this example, we can see that the idle thread was preempted by a level 6 interrupt. Three threads are on the run queue: the thread that was running immediately before preemption and two other threads waiting to be scheduled on the run queue. We can traverse these manually, by traversing the stack of the thread pointer with ::findstack.
> da509de0::findstack
stack pointer for thread da509de0: da509d08
da509d3c swtch+0x165()
da509d60 cv_timedwait+0xa3()
da509dc8 taskq_d_thread+0x149()
da509dd8 thread_start+8()
The CPU containing the thread that caused the panic will, we hope, be reported in the panic string and, furthermore, will be used by MDB as the default thread for other dcmds in the core image. Once we determine the status of the CPU, we can observe which thread was involved in the panic.
Additionally, we can use the CPU’s run queue (cpu_dispq) to provide a stack list for other threads queued up to run. We might do this just to gather a little more information about the circumstance in which the panic occurred.
> fec225b8::walk cpu_dispq |::thread
    ADDR STATE FLG PFLG SFLG PRI EPRI PIL INTR DISPTIME BOUND PR
da509de0 run     8    0   13  60    0   0  n/a   7e6f9c    -1  0
da0cdde0 run     8 2000   13  60    0   0  n/a   7e8452    -1  0
da0d6de0 run     8 2000   13  60    0   0  n/a   7e8452    -1  0
> fec225b8::walk cpu_dispq |::findstack
stack pointer for thread da509de0: da509d08
da509d3c swtch+0x165()
da509d60 cv_timedwait+0xa3()
da509dc8 taskq_d_thread+0x149()
da509dd8 thread_start+8()
stack pointer for thread da0cdde0: da0cdd48
da0cdd74 swtch+0x165()
da0cdd84 cv_wait+0x4e()
da0cddc8 nfs4_async_manager+0xc9()
da0cddd8 thread_start+8()
stack pointer for thread da0d6de0: da0d6d48
da0d6d74 swtch+0x165()
da0d6d84 cv_wait+0x4e()
da0d6dc8 nfs4_async_manager+0xc9()
da0d6dd8 thread_start+8()
We briefly mentioned in Section 14.1.4 some of the problems we encounter when trying to glean argument values from stack backtraces. In the SPARC architecture, the values of the input arguments’ registers are saved into register windows at the exit of each function. In most cases, we can traverse the stack frames to look at the values of the registers as they are saved in register windows. Historically, this was done by manually traversing the stack frames (as illustrated in Panic!). Conveniently, MDB has a dcmd that understands and walks SPARC stack frames. We can use the ::stackregs dcmd to display the SPARC input registers and locals (%l0-%l7) for each frame on the stack.
> ::stackregs
000002a100d074c1 vpanic(12871f0, e, e, fffffffffffffffe, 1, 185d400)
%l0-%l3: 0 2a100d07f10 2a100d07f40 ffffffff
%l4-%l7: fffffffffffffffe 0 1845400 1287000
px_err_fabric_intr+0xbc: call -0x1946c0 <fm_panic>
000002a100d07571 px_err_fabric_intr+0xbc(600024f9880, 31, 340, 600024d75d0,
30000842020, 0)
%l0-%l3: 0 2a100d07f10 2a100d07f40 ffffffff
%l4-%l7: fffffffffffffffe 0 1845400 1287000
px_msiq_intr+0x1ac: call -0x13b0 <px_err_fabric_intr>
000002a100d07651 px_msiq_intr+0x1ac(60002551db8, 0, 127dcc8, 6000252e9e0, 30000828a58,
30000842020)
%l0-%l3: 0 2a100d07f10 2a100d07f40 2a100d07f10
%l4-%l7: 0 31 30000842020 600024d21d8
current_thread+0x174: jmpl %o5, %o7
000002a100d07751 current_thread+0x174(16, 2000, ddf7dfff, ddf7ffff, 2000, 12)
%l0-%l3: 100994c 2a100cdf021 e 7b9
%l4-%l7: 0 0 0 2a100cdf8d0
cpu_halt+0x134: call -0x29dcc <enable_vec_intr>
000002a100cdf171 cpu_halt+0x134(16, d, 184bbd0, 30001334000, 16, 1)
%l0-%l3: 60001db16c8 0 60001db16c8 ffffffffffffffff
%l4-%l7: 0 0 0 10371d0
idle+0x124: jmpl %l7, %o7
000002a100cdf221 idle+0x124(1819800, 0, 30001334000, ffffffffffffffff, e, 1818400)
%l0-%l3: 60001db16c8 1b 0 ffffffffffffffff
%l4-%l7: 0 0 0 10371d0
thread_start+4: jmpl %i7, %o7
000002a100cdf2d1 thread_start+4(0, 0, 0, 0, 0, 0)
%l0-%l3: 0 0 0 0
%l4-%l7: 0 0 0 0
SPARC input registers become output registers, which are then saved on the stack. To qualify registers as valid arguments, we need to ascertain whether they were overwritten during the function before they were saved in the stack frame. A common technique is to disassemble the target function and look for reuse of the input registers (%i0-%i7) in the function’s code body. A quick and dirty way to look for register usage is to pipe ::dis to a UNIX grep; a full examination of the code for use of input registers is left as an exercise for the reader. For example, to see whether the value of the first argument to cpu_halt() is valid, we could check whether %i0 is reused during the cpu_halt() function before we branch out at cpu_halt+0x134.
> cpu_halt::dis !grep i0
cpu_halt+0x24: ld [%g1 + 0x394], %i0
cpu_halt+0x28: cmp %i0, 1
cpu_halt+0x90: add %i2, 0x120, %i0
cpu_halt+0xd0: srl %i4, 0, %i0
cpu_halt+0x100: srl %i4, 0, %i0
cpu_halt+0x144: ldub [%i3 + 0xf9], %i0
cpu_halt+0x150: and %i0, 0xfd, %l7
cpu_halt+0x160: add %i2, 0x120, %i0
As we can see in this case, %i0 is reused very early in cpu_halt() and would be invalid in the stack backtrace.
We can obtain the list of processes by using the ::ps dcmd. In addition, we can search for processes by using the pgrep(1)-like ::pgrep dcmd.
> ::ps -f
S    PID   PPID   PGID    SID    UID      FLAGS     ADDR NAME
R      0      0      0      0      0 0x00000001 fec1d280 sched
R      3      0      0      0      0 0x00020001 d318d248 fsflush
R      2      0      0      0      0 0x00020001 d318daa8 pageout
R      1      0      0      0      0 0x42004000 d318e308 /sbin/init
R   9066      1   9066   9066      1 0x52000400 da2b7130 /usr/lib/nfs/nfsmapid
R   9065      1   9063   9063      1 0x42000400 d965a978 /usr/lib/nfs/nfs4cbd
R   4125      1   4125   4125      0 0x42000400 d9659420 /local/local/bin/httpd -k start
R   9351   4125   4125   4125  40000 0x52000000 da2c0428 /local/local/bin/httpd -k start
R   4118      1   4117   4117      1 0x42000400 da2bc988 /usr/lib/nfs/nfs4cbd
R   4116      1   4116   4116      1 0x52000400 d8da7240 /usr/lib/nfs/nfsmapid
R   4105      1   4105   4105      0 0x42000400 d9664108 /usr/apache/bin/httpd
R   4263   4105   4105   4105  60001 0x52000000 da2bf368 /usr/apache/bin/httpd
...
> ::ps -t
S    PID   PPID   PGID    SID    UID      FLAGS     ADDR NAME
R      0      0      0      0      0 0x00000001 fec1d280 sched
        T t0 <TS_STOPPED>
R      3      0      0      0      0 0x00020001 d318d248 fsflush
        T 0xd3108a00 <TS_SLEEP>
R      2      0      0      0      0 0x00020001 d318daa8 pageout
        T 0xd3108c00 <TS_SLEEP>
R      1      0      0      0      0 0x42004000 d318e308 init
        T 0xd3108e00 <TS_SLEEP>
R   9066      1   9066   9066      1 0x52000400 da2b7130 nfsmapid
        T 0xd942be00 <TS_SLEEP>
        T 0xda68f000 <TS_SLEEP>
        T 0xda4e8800 <TS_SLEEP>
        T 0xda48f800 <TS_SLEEP>
...
> ::pgrep http
S    PID   PPID   PGID    SID    UID      FLAGS     ADDR NAME
R   4125      1   4125   4125      0 0x42000400 d9659420 httpd
R   9351   4125   4125   4125  40000 0x52000000 da2c0428 httpd
R   4105      1   4105   4105      0 0x42000400 d9664108 httpd
R   4263   4105   4105   4105  60001 0x52000000 da2bf368 httpd
R   4111   4105   4105   4105  60001 0x52000000 da2b2138 httpd
...
We can observe several aspects of the user process by using the ptools-like dcmds.
> ::pgrep nscd
S    PID   PPID   PGID    SID    UID      FLAGS             ADDR NAME
R    575      1    575    575      0 0x42000000 ffffffff866f1878 nscd
> 0t575::pid2proc |::walk thread |::findstack
(or)
> ffffffff82f5f860::walk thread |::findstack
stack pointer for thread ffffffff866cb060: fffffe8000c7fdd0
[ fffffe8000c7fdd0 _resume_from_idle+0xde() ]
fffffe8000c7fe10 swtch+0x185()
fffffe8000c7fe80 cv_wait_sig_swap_core+0x17a()
fffffe8000c7fea0 cv_wait_sig_swap+0x1a()
fffffe8000c7fec0 pause+0x59()
fffffe8000c7ff10 sys_syscall32+0x101()
...
> ffffffff866f1878::ptree
fffffffffbc23640 sched
     ffffffff82f6b148 init
          ffffffff866f1878 nscd
> ffffffff866f1878::pfiles
FD TYPE VNODE            INFO
 0 CHR  ffffffff833d4700 /devices/pseudo/mm@0:null
 1 CHR  ffffffff833d4700 /devices/pseudo/mm@0:null
 2 CHR  ffffffff833d4700 /devices/pseudo/mm@0:null
 3 DOOR ffffffff86a0eb40 [door to 'nscd' (proc=ffffffff866f1878)]
 4 SOCK ffffffff835381c0
> ffffffff866f1878::pmap
SEG              BASE             SIZE  RES   PATH
ffffffff85e416c0 0000000008046000 8k    8k    [ anon ]
ffffffff866ab5e8 0000000008050000 48k         /usr/sbin/nscd
ffffffff839b1950 000000000806c000 8k    8k    /usr/sbin/nscd
ffffffff866ab750 000000000806e000 520k  480k  [ anon ]
...
The major buckets of memory allocation are available with the ::memstat dcmd.
> ::memstat
Page Summary Pages MB %Tot
------------ ---------------- ---------------- ----
Kernel 49022 191 19%
Anon 68062 265 27%
Exec and libs 3951 15 2%
Page cache 4782 18 2%
Free (cachelist) 7673 29 3%
Free (freelist) 118301 462 47%
Total 251791 983
Physical 251789 983
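The MB and %Tot columns follow directly from the page counts. The sketch below assumes a 4 KB base page size, which the output does not state, and infers from this one listing that megabytes are truncated while percentages are rounded to the nearest integer; both the rounding behavior and the function names pages_to_mb and percent_of_total are assumptions made for illustration, not a description of how ::memstat is implemented.

```c
#include <stdint.h>

/* Convert a page count to whole megabytes, truncating any fraction. */
uint64_t
pages_to_mb(uint64_t pages, uint64_t page_size)
{
    return (pages * page_size) / (1024 * 1024);
}

/* Share of the total as an integer percentage, rounded to the nearest value. */
unsigned
percent_of_total(uint64_t pages, uint64_t total)
{
    return (unsigned)((pages * 100 + total / 2) / total);
}
```

With the Kernel row above, pages_to_mb(49022, 4096) gives 191 and percent_of_total(49022, 251791) gives 19, matching the listing under these assumptions.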
We can use the ::netstat dcmd to obtain the list of network connections.
> ::netstat
TCPv4 St Local Address Remote Address Zone
da348600 6 10.0.5.104.63710 10.0.5.10.38189 7
da348a80 0 10.0.5.106.1016 10.0.5.10.2049 2
da34fc40 0 10.0.5.108.1018 10.0.5.10.2049 3
da3501c0 0 10.0.4.106.22 192.18.42.17.64836 2
d8ed2800 0 10.0.4.101.22 192.18.42.17.637
...
A stack backtrace of all threads in the kernel can be obtained with the ::threadlist dcmd. (If you are familiar with adb, this is a modern version of adb’s $<threadlist macro.) With this dcmd, we can quickly and easily capture a useful snapshot of all current activity in text form, for deeper analysis.
> ::threadlist
    ADDR     PROC      LWP CMD/LWPID
fec1dae0 fec1d280 fec1fdc0 sched/1
d296cde0 fec1d280        0 idle()
d2969de0 fec1d280        0 taskq_thread()
d2966de0 fec1d280        0 taskq_thread()
d2963de0 fec1d280        0 taskq_thread()
d2960de0 fec1d280        0 taskq_thread()
d29e3de0 fec1d280        0 taskq_thread()
d29e0de0 fec1d280        0 taskq_thread()
...
> ::threadlist -v
    ADDR     PROC      LWP CLS PRI WCHAN
fec1dae0 fec1d280 fec1fdc0   0  96     0
  PC: 0xfe82b507 CMD: sched
  stack pointer for thread fec1dae0: fec33df8
    swtch+0x165()
    sched+0x3aa()
    main+0x365()
d296cde0 fec1d280        0   0  -1     0
  PC: 0xfe82b507 THREAD: idle()
  stack pointer for thread d296cde0: d296cd88
    swtch+0x165()
    idle+0x32()
    thread_start+8()
...
# echo "::threadlist" |mdb -k >mythreadlist.txt
The ::findleaks dcmd efficiently detects memory leaks in kernel crash dumps when the full set of kmem debug features has been enabled. The first execution of ::findleaks processes the dump for memory leaks (this can take a few minutes), then coalesces the leaks by the allocation stack trace. The findleaks report shows a bufctl address and the topmost stack frame for each memory leak that was identified. See Section 11.4.9.1 in Solaris™ Internals for more information on ::findleaks.
> ::findleaks
CACHE LEAKED BUFCTL CALLER
70039ba8 1 703746c0 pm_autoconfig+0x708
70039ba8 1 703748a0 pm_autoconfig+0x708
7003a028 1 70d3b1a0 sigaddq+0x108
7003c7a8 1 70515200 pm_ioctl+0x187c
------------------------------------------------------
Total 4 buffers, 376 bytes
If the -v option is specified, the dcmd prints more verbose messages as it executes. If an explicit address is specified prior to the dcmd, the report is filtered and only leaks whose allocation stack traces contain the specified function address are displayed.
The ::vatopfn dcmd translates virtual addresses to physical addresses, using the appropriate platform translation tables.
> fec4b8d0::vatopfn
level=1 htable=d9d53848 pte=30007e3
Virtual fec4b8d0 maps Physical 304b8d0
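The physical address printed above can be reproduced from the PTE and the virtual address. The following sketch assumes that level=1 denotes a large-page mapping with a 22-bit page offset (4 MB pages); that page size is an assumption, not something the output states, although a 21-bit offset gives the same result for these particular values. It also assumes the PTE carries no attribute bits above the frame number. The function name pte_to_pa is hypothetical.

```c
#include <stdint.h>

/*
 * Combine the frame base held in a page table entry with the page
 * offset taken from the virtual address. page_shift is the number of
 * offset bits (12 for 4 KB pages, 21 or 22 for x86 large pages) and
 * must be less than 64.
 */
uint64_t
pte_to_pa(uint64_t pte, uint64_t va, unsigned page_shift)
{
    uint64_t offset_mask = ((uint64_t)1 << page_shift) - 1;

    /* Low PTE bits are flags; they are discarded along with the offset bits. */
    return (pte & ~offset_mask) | (va & offset_mask);
}
```

Under those assumptions, pte_to_pa(0x30007e3, 0xfec4b8d0, 22) yields 0x304b8d0, the same physical address that ::vatopfn reports.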
The ::whatis dcmd attempts to determine if the address is a pointer to a kmem-managed buffer or another type of special memory region, such as a thread stack, and reports its findings. When the -a option is specified, the dcmd reports all matches instead of just the first match to its queries. When the -b option is specified, the dcmd also attempts to determine if the address is referred to by a known kmem bufctl. When the -v option is specified, the dcmd reports its progress as it searches various kernel data structures. See Section 11.4.9.2 in Solaris™ Internals.
> 0x705d8640::whatis
705d8640 is 705d8640+0, allocated from streams_mblk
The ::kgrep dcmd lets you search the kernel for occurrences of a supplied value. This is particularly useful when you are trying to debug software with multiple instances of a value.
> 0x705d8640::kgrep
400a3720
70580d24
7069d7f0
706a37ec
706add34
A kernel crash dump can save memory pages of user processes in Solaris. We explain how to save process memory pages and how to examine user processes by using the kernel crash dump.
We must modify the dump configuration to save process pages. We confirm the dump configuration by running dumpadm with no options.
# /usr/sbin/dumpadm
Dump content: all pages
Dump device: /dev/dsk/c0t0d0s1 (swap)
Savecore directory: /var/crash/example
Savecore enabled: yes
If Dump content is not all pages or curproc, no process memory page will be dumped. In that case, we run dumpadm -c all or dumpadm -c curproc.
We gather a crash dump and confirm that user pages are contained.
# /usr/bin/mdb unix.0 vmcore.0
Loading modules: [ unix krtld genunix ufs_log ip nfs random ptm logindmux ]
> ::status
debugging crash dump vmcore.0 (64-bit) from rmcferrari
operating system: 5.11 snv_31 (i86pc)
panic message: forced crash dump initiated at user request
dump content: all kernel and user pages
The dump content line shows that this dump includes user pages.
Next, we search for process information with which we are concerned. We use nscd as the target of this test case. The first thing to find is the address of the process.
> ::pgrep nscd
S PID PPID PGID SID UID FLAGS ADDR NAME
R 575 1 575 575 0 0x42000000 ffffffff866f1878 nscd
The address of the process is ffffffff866f1878. As a sanity check, we can look at the kernel thread stacks for each thread in the process. We’ll use these later to double-check that the user stack matches the kernel stack, for those threads blocked in a system call.
> 0t575::pid2proc |::print proc_t p_tlist |::list kthread_t t_forw |::findstack
stack pointer for thread ffffffff866cb060: fffffe8000c7fdd0
[ fffffe8000c7fdd0 _resume_from_idle+0xde() ]
fffffe8000c7fe10 swtch+0x185()
fffffe8000c7fe80 cv_wait_sig_swap_core+0x17a()
fffffe8000c7fea0 cv_wait_sig_swap+0x1a()
fffffe8000c7fec0 pause+0x59()
fffffe8000c7ff10 sys_syscall32+0x101()
stack pointer for thread ffffffff866cc140: fffffe8000c61d70
[ fffffe8000c61d70 _resume_from_idle+0xde() ]
fffffe8000c61db0 swtch+0x185()
fffffe8000c61e10 cv_wait_sig+0x150()
fffffe8000c61e50 door_unref+0x94()
fffffe8000c61ec0 doorfs32+0x90()
fffffe8000c61f10 sys_syscall32+0x101()
stack pointer for thread ffffffff866cba80: fffffe8000c6dd10
[ fffffe8000c6dd10 _resume_from_idle+0xde() ]
fffffe8000c6dd50 swtch_to+0xc9()
fffffe8000c6ddb0 shuttle_resume+0x376()
fffffe8000c6de50 door_return+0x228()
fffffe8000c6dec0 doorfs32+0x157()
fffffe8000c6df10 sys_syscall32+0x101()
stack pointer for thread ffffffff866cb720: fffffe8000c73cf0
[ fffffe8000c73cf0 _resume_from_idle+0xde() ]
fffffe8000c73d30 swtch+0x185()
fffffe8000c73db0 cv_timedwait_sig+0x1a3()
fffffe8000c73e30 cv_waituntil_sig+0xab()
fffffe8000c73ec0 nanosleep+0x141()
fffffe8000c73f10 sys_syscall32+0x101()
...
It appears that the first few threads on the process are blocked in the pause(), door(), and nanosleep() system calls. We’ll double-check against these later when we traverse the user stacks.
The next things to find are the stack pointers for the user threads, which are stored in each thread’s lwp.
> ffffffff866f1878::walk thread |::print kthread_t t_lwp->lwp_regs|::print "struct regs" r_rsp |=X
8047d54 fecc9f80 febbac08 fea9df78 fe99df78 fe89df78
fe79df78 fe69df78 fe59df78 fe49df78 fe39df58 fe29df58
fe19df58 fe09df58 fdf9df58 fde9df58 fdd9df58 fdc9df58
fdb9df58 fda9df58 fd99df58 fd89d538 fd79bc08
Each entry is a thread’s stack pointer in the user process’s address space. We can use these to traverse the stack in the user process’s context.
The mdb dcmd <proc address>::context switches the debugger context to the specified user process.
> ffffffff866f1878::context
debugger context set to proc ffffffff866f1878
After the context is switched, several mdb commands return process information rather than kernel information. For example:
> ::nm
Value             |Size              |Type |Bind |Other|Shndx |Name
0x0000000000000000|0x0000000000000000|NOTY |LOCL |0x0 |UNDEF |
0x0000000008056c29|0x0000000000000076|FUNC |GLOB |0x0 |10 |gethost_revalidate
0x0000000008056ad2|0x0000000000000024|FUNC |GLOB |0x0 |10 |getgr_uid_reaper
0x000000000805be5f|0x0000000000000000|OBJT |GLOB |0x0 |14 |_etext
0x0000000008052778|0x0000000000000000|FUNC |GLOB |0x0 |UNDEF |strncpy
0x0000000008052788|0x0000000000000000|FUNC |GLOB |0x0 |UNDEF |_uncached_getgrnam_r
0x000000000805b364|0x000000000000001b|FUNC |GLOB |0x0 |12 |_fini
0x0000000008058f54|0x0000000000000480|FUNC |GLOB |0x0 |10 |nscd_parse
0x0000000008052508|0x0000000000000000|FUNC |GLOB |0x0 |UNDEF |pause
0x00000000080554e0|0x0000000000000076|FUNC |GLOB |0x0 |10 |getpw_revalidate
...
> ::mappings
    BASE    LIMIT     SIZE NAME
 8046000  8048000     2000 [ anon ]
 8050000  805c000     c000 /usr/sbin/nscd
 806c000  806e000     2000 /usr/sbin/nscd
 806e000  80f0000    82000 [ anon ]
fd650000 fd655000     5000 /lib/nss_files.so.1
fd665000 fd666000     1000 /lib/nss_files.so.1
fd680000 fd690000    10000 [ anon ]
fd6a0000 fd79e000    fe000 [ anon ]
fd7a0000 fd89e000    fe000 [ anon ]
...
Unlike examining the kernel, where we would ordinarily use the stack-related mdb commands like ::stack or ::findstack, we need to use stack pointers to traverse a process stack. In this case, nscd is an x86 32-bit application, so “stack pointer + 0x38” and “stack pointer + 0x3c” show the stack pointer and the program counter of the previous frame.
/*
 * In the Intel world, a stack frame looks like this:
 *
 * %fp0->|                               |
 *       |-------------------------------|
 *       |  Args to next subroutine      |
 *       |-------------------------------|-\
 * %sp0->|  One-word struct-ret address  | |
 *       |-------------------------------|  > minimum stack frame
 * %fp1->|  Previous frame pointer (%fp0)| |
 *       |-------------------------------|-/
 *       |  Local variables              |
 * %sp1->|-------------------------------|
 *
 * For amd64, the minimum stack frame is 16 bytes and the frame pointer must
 * be 16-byte aligned.
 */
struct frame {
        greg_t fr_savfp;        /* saved frame pointer */
        greg_t fr_savpc;        /* saved program counter */
};
#ifdef _SYSCALL32
/*
 * Kernel's view of a 32-bit stack frame.
 */
struct frame32 {
        greg32_t fr_savfp;      /* saved frame pointer */
        greg32_t fr_savpc;      /* saved program counter */
};
#endif /* _SYSCALL32 */
See sys/stack.h
Each individual stack frame is defined as follows:
/*
 * In the x86 world, a stack frame looks like this:
 *
 *              |---------------------------|
 * 4n+8(%ebp) ->| argument word n           |
 *              | ...                       |   (Previous frame)
 *    8(%ebp) ->| argument word 0           |
 *              |---------------------------|--------------------
 *    4(%ebp) ->| return address            |
 *              |---------------------------|
 *    0(%ebp) ->| previous %ebp (optional)  |
 *              |---------------------------|
 *   -4(%ebp) ->| unspecified               |   (Current frame)
 *              | ...                       |
 *    0(%esp) ->| variable size             |
 *              |---------------------------|
 */
See sys/stack.h
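The layout above implies a simple traversal: the word at 0(%ebp) links to the caller's frame, and the word at 4(%ebp) is the return address. A minimal Python sketch of that walk follows. It assumes every function saved %ebp (the diagram marks this as optional), and both walk_frames and the memory image are hypothetical, made up for illustration.

```python
def walk_frames(read_word, ebp, max_frames=64):
    """Follow the chain of saved %ebp values starting at ebp.

    read_word(addr) must return the 32-bit word stored at addr.
    Returns a list of (frame_pointer, return_address) pairs.
    """
    frames = []
    while ebp != 0 and len(frames) < max_frames:
        saved_ebp = read_word(ebp)       # 0(%ebp): previous %ebp
        ret_addr = read_word(ebp + 4)    # 4(%ebp): return address
        frames.append((ebp, ret_addr))
        if saved_ebp <= ebp:
            # The stack grows downward, so a caller's frame must sit at a
            # higher address; anything else (including 0) ends the walk.
            break
        ebp = saved_ebp
    return frames


# Hypothetical memory image: three frames linked through saved %ebp.
memory = {
    0x1000: 0x1020, 0x1004: 0xAAAA,
    0x1020: 0x1040, 0x1024: 0xBBBB,
    0x1040: 0x0,    0x1044: 0xCCCC,
}
for fp, pc in walk_frames(memory.__getitem__, 0x1000):
    print(f"fp={fp:#x} return={pc:#x}")
```

The max_frames bound and the "must move upward" check guard against a corrupted or looping chain, which is a common precaution when reading a stack from a damaged process image.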
We can explore the stack frames from Section 14.2.4.
> ffffffff866f1878::walk thread |::print kthread_t t_lwp->lwp_regs|::print "struct regs" r_rsp |=X
        8047d54         fecc9f80        febbac08        fea9df78
        fe99df78        fe89df78        fe79df78        fe69df78
        fe59df78        fe49df78        fe39df58        fe29df58
        fe19df58        fe09df58        fdf9df58        fde9df58
        fdd9df58        fdc9df58        fdb9df58        fda9df58
        fd99df58        fd89d538        fd79bc08
> 8047d54/X
0x8047d54:      fedac74f
> fedac74f/
libc.so.1'pause+0x67:           8e89c933 = xorl %ecx,%ecx
> febbac08/X
0xfebbac08:     feda83ec
> feda83ec/
libc.so.1'_door_return+0xac:    eb14c483 = addl $0x14,%esp
> fea9df78/X
0xfea9df78:     fedabe4c
> fedabe4c/
libc.so.1'_sleep+0x88:          8908c483 = addl $0x8,%esp
Thus, we observe user stacks of pause()
, door_return()
, and sleep()
, as we expected.
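The symbol+offset form that mdb prints for each return address can be reproduced by finding the nearest symbol at or below the address. The Python sketch below illustrates the idea under stated assumptions: the three symbol start addresses are not printed in the output above but are derived by subtracting the displayed offsets, and resolve is a hypothetical helper.

```python
import bisect

# Symbol start addresses, derived from the output above (address minus offset).
SYMBOLS = sorted([
    (0xfedac74f - 0x67, "libc.so.1'pause"),
    (0xfeda83ec - 0xac, "libc.so.1'_door_return"),
    (0xfedabe4c - 0x88, "libc.so.1'_sleep"),
])
STARTS = [start for start, _ in SYMBOLS]


def resolve(addr):
    """Return 'symbol+0xoff' for the nearest symbol at or below addr."""
    i = bisect.bisect_right(STARTS, addr) - 1
    if i < 0:
        return hex(addr)          # below every known symbol
    start, name = SYMBOLS[i]
    return f"{name}+{addr - start:#x}"


for ret in (0xfedac74f, 0xfeda83ec, 0xfedabe4c):
    print(f"{ret:#x} -> {resolve(ret)}")
```

A real symbol table also records each symbol's size (the Size column of ::nm), which lets a debugger reject addresses that lie past the end of the nearest symbol; this sketch omits that check.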
In the process context, we can examine process memory as usual. For example, we can disassemble instructions from a process’s address space:
> libc.so.1'_sleep+0x88::dis
libc.so.1'_sleep+0x67: pushq $-0x13
libc.so.1'_sleep+0x69: call -0x5cb59 <0xfed4f2d4>
libc.so.1'_sleep+0x6e: addl $0x4,%esp
libc.so.1'_sleep+0x71: movl %esp,%eax
libc.so.1'_sleep+0x73: movl %eax,0x22c(%rsi)
libc.so.1'_sleep+0x79: leal 0x14(%rsp),%eax
libc.so.1'_sleep+0x7d: pushq %rax
libc.so.1'_sleep+0x7e: leal 0x10(%rsp),%eax
libc.so.1'_sleep+0x82: pushq %rax
libc.so.1'_sleep+0x83: call +0xc419 <0xfedb8260>
libc.so.1'_sleep+0x88: addl $0x8,%esp
libc.so.1'_sleep+0x8b: movl %edi,0x22c(%rsi)
libc.so.1'_sleep+0x91: movb 0xb3(%rsi),%cl
libc.so.1'_sleep+0x97: movb %cl,0xb2(%rsi)
libc.so.1'_sleep+0x9d: jmp +0x14 <libc.so.1'_sleep+0xb1>
libc.so.1'_sleep+0x9f: leal 0x14(%rsp),%eax
libc.so.1'_sleep+0xa3: pushq %rax
libc.so.1'_sleep+0xa4: leal 0x10(%rsp),%eax
libc.so.1'_sleep+0xa8: pushq %rax
libc.so.1'_sleep+0xa9: call +0xc3f3 <0xfedb8260>
libc.so.1'_sleep+0xae: addl $0x8,%esp
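The call lines in this listing show both a signed displacement and an absolute target in angle brackets. Checking the numbers suggests that, in this listing, the displacement is counted from the address of the call instruction itself. The Python sketch below verifies that arithmetic; it assumes only that libc.so.1'_sleep+0x88 is at 0xfedabe4c, as the earlier stack walk showed, and call_target is a hypothetical helper.

```python
SLEEP_PLUS_0X88 = 0xfedabe4c            # libc.so.1'_sleep+0x88, from the stack walk
SLEEP_BASE = SLEEP_PLUS_0X88 - 0x88     # start address of libc.so.1'_sleep


def call_target(insn_offset, displacement):
    """Target of a call located at _sleep+insn_offset, wrapped to 32 bits."""
    return (SLEEP_BASE + insn_offset + displacement) & 0xffffffff


# (offset of the call within _sleep, displacement shown in the listing)
for offset, disp in [(0x69, -0x5cb59), (0x83, 0xc419), (0xa9, 0xc3f3)]:
    print(f"_sleep+{offset:#x}: call {disp:+#x} -> {call_target(offset, disp):#x}")
```

All three computed targets agree with the bracketed addresses in the listing, and the two calls at +0x83 and +0xa9 reach the same function.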
The userland debugger, mdb
, debugs the running kernel and kernel crash dumps. It can also control and debug live user processes as well as user core dumps. kmdb
extends the debugger’s functionality to include instruction-level execution control of the kernel. mdb
, by contrast, can only observe the running kernel.
The goal for kmdb
is to bring the advanced debugging functionality of mdb
, to the maximum extent practicable, to in-situ kernel debugging. This includes loadable-debugger module support, debugger commands, ability to process symbolic debugging information, and the various other features that make mdb
so powerful.
kmdb
is often compared with tracing tools like DTrace. DTrace is designed for tracing in the large—for safely examining kernel and user process execution at a function level, with minimal impact upon the running system. kmdb
, on the other hand, grabs the system by the throat, stopping it in its tracks. It then allows for micro-level (per-instruction) analysis, letting users observe the execution of individual instructions and observe and change processor state. Whereas DTrace spends a great deal of energy trying to be safe, kmdb
scoffs at safety, letting developers wreak unpleasantness upon the machine in furtherance of the debugging of their code.
Diagnosing problems with kmdb
builds on the techniques used with mdb
. In this section, we cover some basic examples of how to use kmdb
to boot the system.
kmdb
can be started from the command line of the console login with mdb
and the -K
option.
# mdb -K
Welcome to kmdb
Loaded modules: [ audiosup cpc uppc ptm ufs unix zfs krtld s1394 sppp nca lofs
genunix ip logindmux usba specfs pcplusmp nfs md random sctp ]
[0]> $c
kmdbmod'kaif_enter+8()
kdi_dvec_enter+0x13()
kmdbmod'kctl_modload_activate+0x112(0, fffffe85ad938000, 1)
kmdb'kdrv_activate+0xfa(4c6450)
kmdb'kdrv_ioctl+0x32(ab00000000, db0001, 4c6450, 202001, ffffffff8b483570,
fffffe8000c48edc)
cdev_ioctl+0x55(ab00000000, db0001, 4c6450, 202001, ffffffff8b483570,
fffffe8000c48edc)
specfs'spec_ioctl+0x99(ffffffffbc4cc880, db0001, 4c6450, 202001,
ffffffff8b483570, fffffe8000c48edc)
fop_ioctl+0x2d(ffffffffbc4cc880, db0001, 4c6450, 202001, ffffffff8b483570,
fffffe8000c48edc)
ioctl+0x180(4, db0001, 4c6450)
sys_syscall+0x17b()
[0]> :c
If you experience hangs or panics during Solaris boot, whether during installation or after you’ve already installed, using the kernel debugger can be a big help in collecting the first set of “what happened” information.
You invoke the kernel debugger by supplying the -k
switch in the kernel boot arguments. So a common request from a kernel engineer starting to examine a problem is often “try booting with kmdb.”
Sometimes it’s useful either to set a breakpoint to pause the kernel startup and examine something, or to just set a kernel variable to enable or disable a feature or to enable debugging output. If you use -k
to invoke kmdb
but also supply the -d
switch, the debugger will be entered before the kernel really starts to do anything of consequence, so you can set kernel variables or breakpoints.
To enter the debugger at boot with Solaris 10, enter b -kd
at the appropriate prompt; the procedure differs slightly depending on whether you’re installing or booting an already installed system.
ok boot kmdb -d
Loading kmdb...
Welcome to kmdb
[0]>
If, instead, you’re doing this on a system where GRUB boots Solaris, add -kd to the “kernel” line in the GRUB menu entry (you can edit a GRUB menu entry for the current boot from the GRUB menu by pressing the “e” (for edit) key).
kernel /platform/i86pc/multiboot -kd -B console=ttya
Either way, you’ll drop into the kernel debugger in short order, which will announce itself with this prompt:
[0]>
Now we’re in the kernel debugger. The number in square brackets is the CPU that is running the kernel debugger; that number might change for later entries into the debugger.
Solaris uses a bitmap screen and keyboard by default. To facilitate remote debugging, it is often desirable to configure the system to use a serial tty console. To do this, change the bootenv.rc
and GRUB boot configuration.
setprop ttya-rts-dtr-off true
setprop console 'text'
See /boot/solaris/bootenv.rc
Edit the GRUB boot configuration to include -B console=ttya
via the GRUB menu at boot time, or via bootadm(1M)
.
kernel /platform/i86pc/multiboot -kd -B console=ttya
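The kernel-line edits described above amount to inserting the debugger flags after the multiboot path and appending a -B console property. The following Python sketch shows that transformation on a single line of text; add_kernel_args is a hypothetical helper written for illustration, not part of GRUB or bootadm(1M), and it does not merge with an existing -B property list.

```python
def add_kernel_args(kernel_line, flags="-kd", console=None):
    """Return a GRUB 'kernel' line with kmdb flags (and optionally a console)."""
    parts = kernel_line.split()
    if len(parts) < 2 or parts[0] != "kernel":
        raise ValueError("not a GRUB kernel line")
    if flags not in parts:
        parts.insert(2, flags)                   # right after the multiboot path
    if console is not None and "-B" not in parts:
        parts += ["-B", f"console={console}"]    # simplistic: no merging of -B lists
    return " ".join(parts)


line = add_kernel_args("kernel /platform/i86pc/multiboot", console="ttya")
print(line)
```

Applying the helper twice leaves the line unchanged, which makes it safe to run against an entry that has already been edited.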
For investigating hangs, try turning on module debugging output. You can set the value of a kernel variable by using the /W
command (“write a 32-bit value”). Here’s how you set moddebug
to 0x80000000 and then continue execution of the kernel.
[0]> moddebug/W 80000000
[0]> :c
This command gives you debug output for each kernel module that loads. The bit masks for moddebug
are shown below. Often, 0x80000000
is sufficient for the majority of initial exploratory debugging.
/*
* bit definitions for moddebug.
*/
#define MODDEBUG_LOADMSG 0x80000000 /* print "[un]loading..." msg */
#define MODDEBUG_ERRMSG 0x40000000 /* print detailed error msgs */
#define MODDEBUG_LOADMSG2 0x20000000 /* print 2nd level msgs */
#define MODDEBUG_FINI_EBUSY 0x00020000 /* pretend fini returns EBUSY */
#define MODDEBUG_NOAUL_IPP 0x00010000 /* no Autounloading ipp mods */
#define MODDEBUG_NOAUL_DACF 0x00008000 /* no Autounloading dacf mods */
#define MODDEBUG_KEEPTEXT 0x00004000 /* keep text after unloading */
#define MODDEBUG_NOAUL_DRV 0x00001000 /* no Autounloading Drivers */
#define MODDEBUG_NOAUL_EXEC 0x00000800 /* no Autounloading Execs */
#define MODDEBUG_NOAUL_FS 0x00000400 /* no Autounloading File sys */
#define MODDEBUG_NOAUL_MISC 0x00000200 /* no Autounloading misc */
#define MODDEBUG_NOAUL_SCHED 0x00000100 /* no Autounloading scheds */
#define MODDEBUG_NOAUL_STR 0x00000080 /* no Autounloading streams */
#define MODDEBUG_NOAUL_SYS 0x00000040 /* no Autounloading syscalls */
#define MODDEBUG_NOCTF 0x00000020 /* do not load CTF debug data */
#define MODDEBUG_NOAUTOUNLOAD 0x00000010 /* no autounloading at all */
#define MODDEBUG_DDI_MOD 0x00000008 /* ddi_mod{open,sym,close} */
#define MODDEBUG_MP_MATCH 0x00000004 /* dev_minorperm */
#define MODDEBUG_MINORPERM 0x00000002 /* minor perm modctls */
#define MODDEBUG_USERDEBUG 0x00000001 /* bpt after init_module() */
See sys/modctl.h
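Because moddebug is a bit mask, several of these flags can be combined by ORing them together, and a value read back from the kernel can be decoded the same way. A minimal Python sketch, using only the bit definitions listed above; decode_moddebug is a hypothetical helper for illustration.

```python
# Bit definitions copied from the sys/modctl.h excerpt above.
MODDEBUG_BITS = {
    0x80000000: "MODDEBUG_LOADMSG",
    0x40000000: "MODDEBUG_ERRMSG",
    0x20000000: "MODDEBUG_LOADMSG2",
    0x00020000: "MODDEBUG_FINI_EBUSY",
    0x00010000: "MODDEBUG_NOAUL_IPP",
    0x00008000: "MODDEBUG_NOAUL_DACF",
    0x00004000: "MODDEBUG_KEEPTEXT",
    0x00001000: "MODDEBUG_NOAUL_DRV",
    0x00000800: "MODDEBUG_NOAUL_EXEC",
    0x00000400: "MODDEBUG_NOAUL_FS",
    0x00000200: "MODDEBUG_NOAUL_MISC",
    0x00000100: "MODDEBUG_NOAUL_SCHED",
    0x00000080: "MODDEBUG_NOAUL_STR",
    0x00000040: "MODDEBUG_NOAUL_SYS",
    0x00000020: "MODDEBUG_NOCTF",
    0x00000010: "MODDEBUG_NOAUTOUNLOAD",
    0x00000008: "MODDEBUG_DDI_MOD",
    0x00000004: "MODDEBUG_MP_MATCH",
    0x00000002: "MODDEBUG_MINORPERM",
    0x00000001: "MODDEBUG_USERDEBUG",
}


def decode_moddebug(value):
    """Return (list of flag names set in value, bits with no known name)."""
    names = [name for bit, name in MODDEBUG_BITS.items() if value & bit]
    unknown = value & ~sum(MODDEBUG_BITS)   # the bits are distinct, so sum == OR
    return names, unknown


# Load messages plus detailed error messages, formatted for moddebug/W.
mask = 0x80000000 | 0x40000000
print(f"moddebug/W {mask:x}")
print(decode_moddebug(mask))
```

Reporting unrecognized bits separately is a deliberate choice: the set of flags differs between releases, so a value may carry bits that this table does not name.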
When the kernel panics, it drops into the debugger and prints some interesting information. Usually the most interesting item is the stack backtrace, which shows, in reverse order, all the functions that were active at the time of the panic. To generate a stack backtrace, use the following:
[0]> $c
Two other useful informational commands during a panic are ::msgbuf
and ::status
, as shown in Section 14.1.
[0]> ::msgbuf - shows the last things the kernel printed onscreen
[0]> ::status - shows a summary of the state of the machine in panic
If you’re running the kernel while the kernel debugger is active and you experience a hang, you may be able to break into the debugger to examine the system state; you can do this by pressing the <F1> and <A> keys at the same time (a sort of “F1-shifted-A” keypress). (On SPARC systems, this key sequence is <Stop>-<A>.) This should give you the same debugger prompt as above, although on a multi-CPU system you may see that the CPU number in the prompt is something other than 0. Once in the kernel debugger, you can get a stack backtrace as above; you can also use ::switch
to change the CPU and get stack backtraces on a different CPU, which might shed more light on the hang. For instance, if you break into the debugger on CPU 1, you could switch to CPU 0 with the following:
[1]> 0::switch
For the most part, the execution control facilities provided by kmdb
for the kernel mirror those provided by the mdb
process target. Breakpoints (::bp
), watchpoints (::wp
), ::continue
, and the various flavors of ::step
can be used.
We discuss more about debugging targets in Section 13.3 and Section 14.1. The common commands for controlling kmdb
targets are summarized in Table 14.1.
Table 14.1. Core kmdb
dcmds
dcmd | Description |
---|---|
::status | Print summary of current target. |
$r ::regs | Display current register values for target. |
$c ::stack $C | Print current stack trace ( |
addr[,b] ::dump [-g sz] [-e] | Dump at least |
addr::dis | Disassemble text, starting around |
[ addr ] :b [ addr ] ::bp [+/-dDestT] [-n count] sym ... addr | Set breakpoint at |
$b | Display all breakpoints. |
::branches | Display the last branches taken by the CPU. (x86 only) |
addr ::delete [id | all] addr :d [id | all] | Delete a breakpoint at |
:z | Delete all breakpoints. |
function ::call [arg [arg ...]] | Call the specified function, using the specified arguments. |
[cpuid] ::cpuregs [-c cpuid] | Display the current general-purpose register set. |
[cpuid] ::cpustack [-c cpuid] | Print a C stack backtrace for the specified CPU. |
::cont :c | Continue the target program. |
$M | List the macro files that are cached by |
::next :e | Step the target program one instruction, but step over subroutine calls. |
::step [branch | over | out] | Step the target program one instruction. |
$<systemdump | Initiate a panic/dump. |
::quit [-u] $q | Cause the debugger to exit. When the |
addr [, len]::wp [+/-dDestT] [-rwx] [-ip] [-n count] | Set a watchpoint at the specified address. |
addr [, len]:a [cmd ...] addr [, len]:p [cmd ...] addr [, len]:w [cmd ...] |
Setting breakpoints with kmdb
is done in the same way as with generic mdb
targets, using the :b
dcmd. Refer to Table 13.12 for a complete list of debugger dcmds.
# mdb -K
Loaded modules: [ crypto ]
kmdb: target stopped at:
kmdbmod'kaif_enter+8:   popfq
[0]> resume:b
[0]> :c
kmdb: stop at resume
kmdb: target stopped at:
resume:         movq   %gs:0x18,%rax
[0]> :z
[0]> :c
#
The following example shows how to force a crash dump and reboot of the x86-based system by using the halt -d
and boot commands. Use this method to force a crash dump of the system. Afterwards, reboot the system manually.
# halt -d
May 30 15:35:15 wacked.Central.Sun.COM halt: halted by user
panic[cpu0]/thread=ffffffff83246ec0: forced crash dump initiated at user request
fffffe80006bbd60 genunix:kadmin+4c1 ()
fffffe80006bbec0 genunix:uadmin+93 ()
fffffe80006bbf10 unix:sys_syscall32+101 ()
syncing file systems... done
dumping to /dev/dsk/c1t0d0s1, offset 107675648, content: kernel
NOTICE: adpu320: bus reset
100% done: 38438 pages dumped, compression ratio 4.29, dump succeeded
Welcome to kmdb
Loaded modules: [ audiosup crypto ufs unix krtld s1394 sppp nca uhci lofs
genunix ip usba specfs nfs md random sctp ]
[0]>
kmdb: Do you really want to reboot? (y/n) y
If you cannot use the reboot -d
or the halt -d
command, you can use the kernel debugger, kmdb
, to force a crash dump. The kernel debugger must have been loaded, either at boot or with the mdb -K
command, for the following procedure to work. Enter kmdb
by using L1–A on SPARC, F1-A on x86, or break on a tty.
[0]> $<systemdump
panic[cpu0]/thread=ffffffff83246ec0: forced crash dump initiated at user request
fffffe80006bbd60 genunix:kadmin+4c1 ()
fffffe80006bbec0 genunix:uadmin+93 ()
fffffe80006bbf10 unix:sys_syscall32+101 ()
syncing file systems... done
dumping to /dev/dsk/c1t0d0s1, offset 107675648, content: kernel
NOTICE: adpu320: bus reset
100% done: 38438 pages dumped, compression ratio 4.29, dump succeeded
  dcmd $< - replace input with macro
  dcmd $<< - source macro
  dcmd $> - log session to a file
  dcmd $? - print status and registers
  dcmd $C - print stack backtrace
  dcmd $G - enable/disable C++ demangling support
  dcmd $M - list macro aliases
  dcmd $P - set debugger prompt string
  dcmd $Q - quit debugger
  dcmd $V - get/set disassembly mode
  dcmd $W - reopen target in write mode
  dcmd $X - print floating-point registers
  dcmd $Y - print floating-point registers
  dcmd $b - list traced software events
  dcmd $c - print stack backtrace
  dcmd $d - get/set default output radix
  dcmd $e - print listing of global symbols
  dcmd $f - print listing of source files
  dcmd $g - get/set C++ demangling options
  dcmd $i - print signals that are ignored
  dcmd $l - print the representative thread's lwp id
  dcmd $m - print address space mappings
  dcmd $p - change debugger target context
  dcmd $q - quit debugger
  dcmd $r - print general-purpose registers
  dcmd $s - get/set symbol matching distance
  dcmd $v - print non-zero variables
  dcmd $w - get/set output page width
  dcmd $x - print floating-point registers
  dcmd $y - print floating-point registers
  dcmd / - format data from virtual as
  dcmd :A - attach to process or core file
  dcmd :R - release the previously attached process
  dcmd :a - set read access watchpoint
  dcmd :b - set breakpoint at the specified address
  dcmd :c - continue target execution
  dcmd :d - delete traced software events
  dcmd :e - step target over next instruction
  dcmd :i - ignore signal (delete all matching events)
  dcmd :k - forcibly kill and release target
  dcmd :p - set execute access watchpoint
  dcmd :r - run a new target process
  dcmd :s - single-step target to next instruction
  dcmd :t - stop on delivery of the specified signals
  dcmd :u - step target out of current function
  dcmd :w - set write access watchpoint
  dcmd :z - delete all traced software events
  dcmd = - format immediate value
  dcmd > - assign variable
  dcmd ? - format data from object file
  dcmd @ - format data from physical as
  dcmd \ - format data from physical as
  dcmd array - print each array element's address
  dcmd attach - attach to process or corefile
  dcmd bp - set breakpoint at the specified addresses or symbols
  dcmd cat - concatenate and display files
  dcmd cont - continue target execution
  dcmd context - change debugger target context
  dcmd dcmds - list available debugger commands
  dcmd delete - delete traced software events
  dcmd dem - demangle C++ symbol names
  dcmd dis - disassemble near addr
  dcmd disasms - list available disassemblers
  dcmd dismode - get/set disassembly mode
  dcmd dmods - list loaded debugger modules
  dcmd dump - dump memory from specified address
  dcmd echo - echo arguments
  dcmd enum - print an enumeration
  dcmd eval - evaluate the specified command
  dcmd events - list traced software events
  dcmd evset - set software event specifier attributes
  dcmd files - print listing of source files
  dcmd fltbp - stop on machine fault
  dcmd formats - list format specifiers
  dcmd fpregs - print floating point registers
  dcmd grep - print dot if expression is true
  dcmd head - limit number of elements in pipe
  dcmd help - list commands/command help
  dcmd kill - forcibly kill and release target
  dcmd list - walk list using member as link pointer
  dcmd load - load debugger module
  dcmd log - log session to a file
  dcmd map - print dot after evaluating expression
  dcmd mappings - print address space mappings
  dcmd next - step target over next instruction
  dcmd nm - print symbols
  dcmd nmadd - add name to private symbol table
  dcmd nmdel - remove name from private symbol table
  dcmd objects - print load objects information
  dcmd offsetof - print the offset of a given struct or union member
  dcmd print - print the contents of a data structure
  dcmd quit - quit debugger
  dcmd regs - print general-purpose registers
  dcmd release - release the previously attached process
  dcmd run - run a new target process
  dcmd set - get/set debugger properties
  dcmd showrev - print version information
  dcmd sigbp - stop on delivery of the specified signals
  dcmd sizeof - print the size of a type
  dcmd stack - print stack backtrace
  dcmd stackregs - print stack backtrace and registers
  dcmd status - print summary of current target
  dcmd step - single-step target to next instruction
  dcmd sysbp - stop on entry or exit from system call
  dcmd term - display current terminal type
  dcmd typeset - set variable attributes
  dcmd unload - unload debugger module
  dcmd unset - unset variables
  dcmd vars - print listing of variables
  dcmd version - print debugger version string
  dcmd vtop - print physical mapping of virtual address
  dcmd walk - walk data structure
  dcmd walkers - list available walkers
  dcmd whence - show source of walk or dcmd
  dcmd which - show source of walk or dcmd
  dcmd wp - set a watchpoint at the specified address
  dcmd xdata - print list of external data buffers
krtld
  dcmd ctfinfo - list module CTF information
  dcmd modctl - list modctl structures
  dcmd modhdrs - given modctl, dump module ehdr and shdrs
  dcmd modinfo - list module information
  walk modctl - list modctl structures
mdb_kvm
  ctor 0x8076f20 - target constructor
  dcmd $? - print status and registers
  dcmd $C - print stack backtrace
  dcmd $c - print stack backtrace
  dcmd $r - print general-purpose registers
  dcmd regs - print general-purpose registers
  dcmd stack - print stack backtrace
  dcmd stackregs - print stack backtrace and registers
  dcmd status - print summary of current target