Chapter 14. Debugging Kernels

In this chapter we explore the rudimentary facilities within MDB for analyzing kernel crash images and debugging live kernels. The objective is not to provide an all-encompassing kernel crash analysis tutorial, but rather to introduce the most relevant MDB dcmds and techniques.

A more comprehensive guide to crash dump analysis can be found in some of the recommended reference texts, for example, Panic! by Chris Drake and Kimberly Brown for SPARC [8], and “Crash Dump Analysis” by Frank Hofmann for x86/x64 [12].

Working with Kernel Cores

The most common type of kernel debug target is a core file, saved from a prior system crash. In the following sections, we highlight some of the introductory steps as used with mdb to explore a kernel core image.

Locating and Attaching the Target

If a system has crashed, then we should have a core image saved in /var/crash on the target machine. The mdb debugger should be invoked from a system with the same architecture and Solaris revision as the crash image. The first steps are to locate the appropriate saved image and then to invoke mdb.

#  cd /var/crash/nodename

# ls
bounds    unix.1    unix.3    unix.5    unix.7    vmcore.1  vmcore.3  vmcore.5  vmcore.7
unix.0    unix.2    unix.4    unix.6    vmcore.0  vmcore.2  vmcore.4  vmcore.6

#  mdb -k unix.7 vmcore.7
Loading modules: [ unix krtld genunix specfs dtrace ufs ip sctp usba uhci s1394 fcp fctl nca lofs zfs random nfs
audiosup sppp crypto md fcip logindmux ptm ipc ]
>

Examining Kernel Core Summary Information

The kernel core contains important summary information from which we can extract the following:

Revision of the kernel
Hostname
CPU and platform architecture of the system
Panic string
Module causing the panic

We can use the ::showrev, ::status, and ::panicinfo dcmds to extract this information.

> ::showrev
Hostname: zones-internal
Release: 5.11
Kernel architecture: i86pc
Application architecture: i386
Kernel version: SunOS 5.11 i86pc snv_27
Platform: i86pc
> ::status
debugging crash dump vmcore.2 (32-bit) from zones-internal
operating system: 5.11 snv_27 (i86pc)
panic message: BAD TRAP: type=e (#pf Page fault) rp=d2a587c8 addr=0 occurred in module
"unix" due to a NULL pointer dereference
dump content: kernel pages only
> ::panicinfo
             cpu         0
          thread d2a58de0
         message BAD TRAP: type=e (#pf Page fault) rp=d2a587c8 addr=0 occurred in module
"unix" due to a NULL pointer dereference
              gs fe8301b0
              fs fec30000
              es fe8d0160
              ds d9820160
             edi        0
             esi dc062298
             ebp d2a58828
             esp d2a58800
             ebx de453000
             edx d2a58de0
             ecx        1
             eax        0
          trapno        e
             err        2
             eip fe82ca58
              cs      158
          eflags    10282
            uesp fe89ab0d
             ss         0
             gdt fec1f2f002cf
             idt fec1f5c007ff
             ldt      140
            task      150
             cr0 8005003b
             cr2        0
             cr3  4cb3000
             cr4      6d8
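
The trapno and err values reported by ::panicinfo can be decoded by hand. The following is a minimal sketch, assuming the standard x86 page-fault error code layout (bit 0 distinguishes a protection violation from a not-present page, bit 1 marks a write, bit 2 marks user mode). The names pf_err_t and decode_pf_err are illustrative only; they are not part of MDB or the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of an x86 page-fault error code (the "err" value). */
typedef struct pf_err {
	bool protection;	/* bit 0: 1 = protection violation, 0 = page not present */
	bool write;		/* bit 1: 1 = write access, 0 = read access */
	bool user;		/* bit 2: 1 = fault in user mode, 0 = kernel mode */
} pf_err_t;

/* Split the low three bits of the error code into named fields. */
pf_err_t
decode_pf_err(uint32_t err)
{
	pf_err_t d;

	d.protection = (err & 0x1) != 0;
	d.write = (err & 0x2) != 0;
	d.user = (err & 0x4) != 0;
	return (d);
}
```

Applied to the err value of 2 shown above, this reports a write to a not-present page while in kernel mode, which is consistent with the NULL pointer dereference named in the panic message.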

Examining the Message Buffer

The kernel keeps a cyclic buffer of the recent kernel messages. In this buffer we can observe the messages up to the time of the panic. The ::msgbuf dcmd shows the contents of the buffer.

> ::msgbuf
MESSAGE
/pseudo/zconsnex@1/zcons@5 (zcons5) online
/pseudo/zconsnex@1/zcons@6 (zcons6) online
/pseudo/zconsnex@1/zcons@7 (zcons7) online
pseudo-device: ramdisk1024
...
panic[cpu0]/thread=d2a58de0:
BAD TRAP: type=e (#pf Page fault) rp=d2a587c8 addr=0 occurred in module "unix" due to a
NULL pointer dereference


sched:
#pf Page fault
Bad kernel fault at addr=0x0
pid=0, pc=0xfe82ca58, sp=0xfe89ab0d, eflags=0x10282
cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6d8<xmme,fxsr,pge,mce,pse,de>
cr2: 0 cr3: 4cb3000
         gs: fe8301b0  fs: fec30000  es: fe8d0160  ds: d9820160
        edi:        0 esi: dc062298 ebp: d2a58828 esp: d2a58800
        ebx: de453000 edx: d2a58de0 ecx:        1 eax:        0
        trp:        e err:        2 eip: fe82ca58  cs:      158
        efl:    10282 usp: fe89ab0d  ss:        0
...

Obtaining a Stack Trace of the Running Thread

We can obtain a stack backtrace of the current thread by using the $C command. Note that the displayed arguments to each function are not necessarily accurate. On each platform, the meaning of the shown arguments is as follows:

SPARC. The values of the arguments if they are available from a saved stack frame, assuming they are not overwritten by use of registers during the called function. With SPARC architectures, a function’s input argument registers are sometimes saved on the way out of a function—if the input registers are reused during the function, then values of the input arguments are overwritten and lost.
x86. Accurate values of the input arguments. Input arguments are always saved onto the stack and can be accurately displayed.
x64. The values of the arguments, assuming they are available. As with the SPARC architectures, input arguments are passed in registers and may be overwritten.

> $C
d2a58828 atomic_add_32+8(0)
d2a58854 nfs4_async_inactive+0x3b(dc1c29c0, 0)
d2a58880 nfs4_inactive+0x41()
d2a5889c fop_inactive+0x15(dc1c29c0, 0)
d2a588b0 vn_rele+0x4b(dc1c29c0)
d2a588c0 snf_smap_desbfree+0x59(dda94080)
d2a588dc dblk_lastfree_desb+0x13(de45b520, d826fb40)
d2a588f4 dblk_decref+0x4e(de45b520, d826fb40)
d2a58918 freemsg+0x69(de45b520)
d2a5893c FreeTxSwPacket+0x3b(d38b84f0)
d2a58968 CleanTxInterrupts+0xb4(d2f9cac0)
d2a589a4 e1000g_send+0xf6(d2f9cac0, d9ffba00)
d2a589c0 e1000g_m_tx+0x22()
d2a589dc dls_tx+0x16(d4520f68, d9ffba00)
d2a589f4 str_mdata_fastpath_put+0x1e(d3843f20, d9ffba00)
d2a58a40 tcp_send_data+0x62d(db0ecac0, d97ee250, d9ffba00)
d2a58aac tcp_send+0x6b6(d97ee250, db0ecac0, 564, 28, 14, 0)
d2a58b40 tcp_wput_data+0x622(db0ecac0, 0, 0)
d2a58c28 tcp_rput_data+0x2560(db0ec980, db15bd20, d2d45f40)
d2a58c40 tcp_input+0x3c(db0ec980, db15bd20, d2d45f40)
d2a58c78 squeue_enter_chain+0xe9(d2d45f40, db15bd20, db15bd20, 1, 1)
d2a58cec ip_input+0x658(d990e554, d3164010, 0, e)
d2a58d40 i_dls_link_ether_rx+0x156(d4523db8, d3164010, db15bd20)
d2a58d70 mac_rx+0x56(d3520200, d3164010, db15bd20)
d2a58dac e1000g_intr+0xa6(d2f9cac0, 0)
d2a58ddc intr_thread+0x122()
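
To illustrate why 32-bit x86 arguments can be recovered reliably, the sketch below walks a chain of saved frame pointers in a toy stack image. It assumes the conventional frame layout (saved %ebp at the frame pointer, return address at +4, first argument at +8). The stack_image_t type and both function names are hypothetical, and this is not how MDB itself is implemented.

```c
#include <stddef.h>
#include <stdint.h>

/* A toy 32-bit stack: words[0] lives at address 'base'. */
typedef struct stack_image {
	uint32_t base;		/* address of words[0] */
	const uint32_t *words;	/* stack contents, one 32-bit word per slot */
	size_t nwords;		/* number of valid slots */
} stack_image_t;

/* Read the word at 'addr'; returns 0 on success, -1 if out of range. */
int
stack_read(const stack_image_t *s, uint32_t addr, uint32_t *out)
{
	size_t idx;

	if (addr < s->base || (addr - s->base) % 4 != 0)
		return (-1);
	idx = (addr - s->base) / 4;
	if (idx >= s->nwords)
		return (-1);
	*out = s->words[idx];
	return (0);
}

/*
 * Follow the saved-%ebp chain starting at 'ebp'. For each frame, record
 * the return address at ebp+4 and the first argument at ebp+8. Stops at
 * a zero or unreadable frame pointer, or after 'max' frames. Returns the
 * number of frames recorded.
 */
size_t
walk_frames(const stack_image_t *s, uint32_t ebp, uint32_t *ret_addrs,
    uint32_t *first_args, size_t max)
{
	size_t n = 0;

	while (n < max && ebp != 0) {
		uint32_t next, ret, arg0;

		if (stack_read(s, ebp, &next) != 0 ||
		    stack_read(s, ebp + 4, &ret) != 0 ||
		    stack_read(s, ebp + 8, &arg0) != 0)
			break;
		ret_addrs[n] = ret;
		first_args[n] = arg0;
		n++;
		/* The stack grows down, so a caller's frame is at a higher address. */
		if (next != 0 && next <= ebp)
			break;
		ebp = next;
	}
	return (n);
}
```

Because the arguments sit in memory next to the saved frame pointer, they survive in the dump even after the callee has reused its registers, which is the property the x86 entry above relies on.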

Which Process?

If the stack trace is of a kernel housekeeping or interrupt thread, the process reported for the thread will be p0, “sched.” The process pointer for the thread can be obtained with ::thread, and ::ps will then display summary information about that process. In this example, the thread is an interrupt thread (as indicated by intr_thread() at the base of the stack shown by $C), and the process name maps to sched.

> d2a58de0::thread -p
    ADDR     PROC      LWP      CRED
d2a58de0 fec1d280        0 d9d1cf38
> fec1d280::ps -t
S    PID   PPID   PGID    SID    UID      FLAGS     ADDR NAME
R      0      0      0      0      0 0x00000001 fec1d280 sched
        T        t0 <TS_STOPPED>

Disassembling the Suspect Code

Once we’ve located the thread of interest, we often learn more about what happened by disassembling the target and looking at the instruction that reportedly caused the panic. MDB’s ::dis dcmd will disassemble the code around the target instruction that we extract from the stack backtrace.

> $C
d2a58828 atomic_add_32+8(0)
d2a58854 nfs4_async_inactive+0x3b(dc1c29c0, 0)
d2a58880 nfs4_inactive+0x41()
d2a5889c fop_inactive+0x15(dc1c29c0, 0)
d2a588b0 vn_rele+0x4b(dc1c29c0)
...
> nfs4_async_inactive+0x3b::dis
nfs4_async_inactive+0x1a:       pushl  $0x28
nfs4_async_inactive+0x1c:       call   +0x51faa30       <kmem_alloc>
nfs4_async_inactive+0x21:       addl   $0x8,%esp
nfs4_async_inactive+0x24:       movl   %eax,%esi
nfs4_async_inactive+0x26:       movl   $0x0,(%esi)
nfs4_async_inactive+0x2c:       movl   -0x4(%ebp),%eax
nfs4_async_inactive+0x2f:       movl   %eax,0x4(%esi)
nfs4_async_inactive+0x32:       movl   0xc(%ebp),%edi
nfs4_async_inactive+0x35:       pushl  %edi
nfs4_async_inactive+0x36:       call   +0x51b7cdc       <crhold>
nfs4_async_inactive+0x3b:       addl   $0x4,%esp
nfs4_async_inactive+0x3e:       movl   %edi,0x8(%esi)
nfs4_async_inactive+0x41:       movl   $0x4,0xc(%esi)
nfs4_async_inactive+0x48:       leal   0xe0(%ebx),%eax
nfs4_async_inactive+0x4e:       movl   %eax,-0x8(%ebp)
nfs4_async_inactive+0x51:       pushl  %eax
nfs4_async_inactive+0x52:       call   +0x51477f4       <mutex_enter>
nfs4_async_inactive+0x57:       addl   $0x4,%esp
nfs4_async_inactive+0x5a:       cmpl   $0x0,0xd4(%ebx)
nfs4_async_inactive+0x61:       je     +0x7e    <nfs4_async_inactive+0xdf>
nfs4_async_inactive+0x63:       cmpl   $0x0,0xd0(%ebx)
> crhold::dis
crhold:                         pushl  %ebp
crhold+1:                       movl   %esp,%ebp
crhold+3:                       andl   $0xfffffff0,%esp
crhold+6:                       pushl  $0x1
crhold+8:                       movl   0x8(%ebp),%eax
crhold+0xb:                     pushl  %eax
crhold+0xc:                     call   -0x6e0b8 <atomic_add_32>
crhold+0x11:                    movl   %ebp,%esp
crhold+0x13:                    popl   %ebp
crhold+0x14:                    ret
> atomic_add_32::dis
atomic_add_32:                  movl   0x4(%esp),%eax
atomic_add_32+4:                movl   0x8(%esp),%ecx
atomic_add_32+8:                lock   addl %ecx,(%eax)
atomic_add_32+0xb:              ret

Displaying General-Purpose Registers

In this example, the system had a NULL pointer dereference at atomic_add_32+8(0). The faulting instruction was the atomic lock addl, which references the memory at the location pointed to by %eax. By looking at the registers at the time of the panic, we can see that %eax was indeed NULL. The next step is to attempt to find out why %eax was NULL.

> ::regs
%cs = 0x0158             %eax = 0x00000000
%ds = 0xd9820160                 %ebx = 0xde453000
%ss = 0x0000             %ecx = 0x00000001
%es = 0xfe8d0160                 %edx = 0xd2a58de0
%fs = 0xfec30000                 %esi = 0xdc062298
%gs = 0xfe8301b0                 %edi = 0x00000000

%eip = 0xfe82ca58 atomic_add_32+8
%ebp = 0xd2a58828
%esp = 0xd2a58800

%eflags = 0x00010282
  id=0 vip=0 vif=0 ac=0 vm=0 rf=1 nt=0 iopl=0x0
  status=<of,df,IF,tf,SF,zf,af,pf,cf>

  %uesp = 0xfe89ab0d
%trapno = 0xe
   %err = 0x2
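
The flag names printed by ::regs can be cross-checked against the raw %eflags value. The sketch below assumes the standard x86 EFLAGS bit positions; the EFL_* macro names and the two helper functions are illustrative, not taken from the Solaris headers.

```c
#include <stdint.h>

/* Standard x86 EFLAGS bit masks (names chosen for this example). */
#define	EFL_CF	0x00000001u	/* carry */
#define	EFL_PF	0x00000004u	/* parity */
#define	EFL_AF	0x00000010u	/* auxiliary carry */
#define	EFL_ZF	0x00000040u	/* zero */
#define	EFL_SF	0x00000080u	/* sign */
#define	EFL_TF	0x00000100u	/* trap */
#define	EFL_IF	0x00000200u	/* interrupts enabled */
#define	EFL_DF	0x00000400u	/* direction */
#define	EFL_OF	0x00000800u	/* overflow */
#define	EFL_RF	0x00010000u	/* resume */

/* Return 1 if 'flag' is set in 'eflags', otherwise 0. */
int
eflags_is_set(uint32_t eflags, uint32_t flag)
{
	return ((eflags & flag) != 0);
}

/* Extract the two-bit I/O privilege level held in bits 12 and 13. */
unsigned
eflags_iopl(uint32_t eflags)
{
	return ((eflags >> 12) & 0x3);
}
```

For the value 0x00010282 shown above, only IF, SF, and RF test as set and the IOPL is 0, matching the uppercase IF and SF entries and the rf=1 and iopl=0x0 fields in the ::regs output.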

Navigating the Stack Backtrace

The function prototype for atomic_add_32() reveals that the first argument is a pointer to the memory location to be added to. Since this was an x86 machine, the arguments reported by the stack backtrace are known to be reliable, and we can look to see where the NULL pointer was handed down: in this case, nfs4_async_inactive().

void
atomic_add_32(volatile uint32_t *target, int32_t delta)
{
        *target += delta;
}


> atomic_add_32::dis
atomic_add_32:                  movl    0x4(%esp),%eax
atomic_add_32+4:                movl    0x8(%esp),%ecx
atomic_add_32+8:                lock addl %ecx,(%eax)
atomic_add_32+0xb:              ret
> $C
d2a58828 atomic_add_32+8(0)
d2a58854 nfs4_async_inactive+0x3b(dc1c29c0, 0)
d2a58880 nfs4_inactive+0x41()
d2a5889c fop_inactive+0x15(dc1c29c0, 0)
d2a588b0 vn_rele+0x4b(dc1c29c0)
...

> nfs4_async_inactive+0x3b::dis
nfs4_async_inactive+0x1a:       pushl  $0x28
nfs4_async_inactive+0x1c:       call   +0x51faa30       <kmem_alloc>
nfs4_async_inactive+0x21:       addl   $0x8,%esp
nfs4_async_inactive+0x24:       movl   %eax,%esi
nfs4_async_inactive+0x26:       movl   $0x0,(%esi)
nfs4_async_inactive+0x2c:       movl   -0x4(%ebp),%eax
nfs4_async_inactive+0x2f:       movl   %eax,0x4(%esi)
nfs4_async_inactive+0x32:       movl   0xc(%ebp),%edi
nfs4_async_inactive+0x35:       pushl  %edi
nfs4_async_inactive+0x36:       call   +0x51b7cdc       <crhold>
nfs4_async_inactive+0x3b:       addl   $0x4,%esp
nfs4_async_inactive+0x3e:       movl   %edi,0x8(%esi)
nfs4_async_inactive+0x41:       movl   $0x4,0xc(%esi)
nfs4_async_inactive+0x48:       leal   0xe0(%ebx),%eax
nfs4_async_inactive+0x4e:       movl   %eax,-0x8(%ebp)
nfs4_async_inactive+0x51:       pushl  %eax
nfs4_async_inactive+0x52:       call   +0x51477f4       <mutex_enter>
nfs4_async_inactive+0x57:       addl   $0x4,%esp
nfs4_async_inactive+0x5a:       cmpl   $0x0,0xd4(%ebx)
nfs4_async_inactive+0x61:       je     +0x7e    <nfs4_async_inactive+0xdf>
nfs4_async_inactive+0x63:       cmpl   $0x0,0xd0(%ebx)
...

Looking at the disassembly, it appears that there is an additional function call, which is omitted from the stack backtrace (typically due to tail call compiler optimization). The call is to crhold(), passing the address of a credential structure from the arguments to nfs4_async_inactive(). Here we can see that crhold() does in fact call atomic_add_32().

/*
 * Put a hold on a cred structure.
 */
void
crhold(cred_t *cr)
{
        atomic_add_32(&cr->cr_ref, 1);
}


> crhold::dis
crhold:                         pushl  %ebp
crhold+1:                       movl   %esp,%ebp
crhold+3:                       andl   $0xfffffff0,%esp
crhold+6:                       pushl  $0x1
crhold+8:                       movl   0x8(%ebp),%eax
crhold+0xb:                     pushl  %eax
crhold+0xc:                     call   -0x6e0b8 <atomic_add_32>
crhold+0x11:                    movl   %ebp,%esp
crhold+0x13:                    popl   %ebp
crhold+0x14:                    ret
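
The disassembly shows crhold() pushing its argument unchanged as the first argument to atomic_add_32(), which suggests that cr_ref sits at offset 0 of the credential structure. The sketch below models that inference with a simplified stand-in structure. The type model_cred_t, its second field, and the helper cr_ref_address() are assumptions made for illustration; this is not the real cred_t definition.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified stand-in for cred_t. Only the position of cr_ref matters
 * here; the second field is a placeholder for whatever follows it.
 */
typedef struct model_cred {
	uint32_t cr_ref;	/* reference count, assumed to be the first member */
	uint32_t cr_other;	/* placeholder, not a real cred_t field */
} model_cred_t;

/*
 * Compute the address that &cr->cr_ref would evaluate to for a cred
 * located at 'cr', without dereferencing anything.
 */
uintptr_t
cr_ref_address(uintptr_t cr)
{
	return (cr + offsetof(model_cred_t, cr_ref));
}
```

With cr_ref at offset 0, a NULL credential pointer yields a target address of 0, which agrees with both the addr=0 in the panic message and the argument of 0 reported for atomic_add_32().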

Next, we look into the situation in which nfs4_async_inactive() was called. The first argument is a vnode pointer, and the second is our suspicious credential pointer. The vnode pointer can be examined with the CTF information and the ::print dcmd. In this case, we can see that nfs4_async_inactive() was operating on a vnode that references a PDF file.

void
nfs4_async_inactive(vnode_t *vp, cred_t *cr)
{
...

> $C
d2a58828 atomic_add_32+8(0)
d2a58854 nfs4_async_inactive+0x3b(dc1c29c0, 0)
> dc1c29c0::print vnode_t
{
...
    v_type = 1 (VREG)
    v_rdev = 0
...
    v_path = 0xdc3de800 "/zones/si/root/home/ftp/book/solarisinternals_projtaskipc.pdf"
...
}

Looking further at the stack backtrace and the code, we can try to identify where the credentials were derived from. nfs4_async_inactive() was called by nfs4_inactive(), which is one of the standard VOP methods (VOP_INACTIVE).

> $C
d2a58828 atomic_add_32+8(0)
d2a58854 nfs4_async_inactive+0x3b(dc1c29c0, 0)
d2a58880 nfs4_inactive+0x41()
d2a5889c fop_inactive+0x15(dc1c29c0, 0)
d2a588b0 vn_rele+0x4b(dc1c29c0)

The credential can be followed all the way up to vn_rele(), which derives the pointer from CRED(), which references the current thread’s t_cred.

vn_rele(vnode_t *vp)
{
        if (vp->v_count == 0)
                cmn_err(CE_PANIC, "vn_rele: vnode ref count 0");
        mutex_enter(&vp->v_lock);
        if (vp->v_count == 1) {
                mutex_exit(&vp->v_lock);
                VOP_INACTIVE(vp, CRED());
...

#define CRED()          curthread->t_cred

We know which thread called vn_rele()—the interrupt thread with a thread pointer of d2a58de0. We can use ::print to take a look at the thread’s t_cred.

> d2a58de0::print kthread_t t_cred
t_cred = 0xd9d1cf38

Interestingly, it’s not NULL! A further look around the code gives us some clues as to what’s going on. In the initialization code during the creation of an interrupt thread, the t_cred is set to NULL:

/*
 * Create and initialize an interrupt thread.
 *      Returns non-zero on error.
 *      Called at spl7() or better.
 */
void
thread_create_intr(struct cpu *cp)
{
...
        /*
         * Nobody should ever reference the credentials of an interrupt
         * thread so make it NULL to catch any such references.
         */
        tp->t_cred = NULL;

Our curthread->t_cred is not NULL, but NULL was passed in when CRED() accessed it in the not-too-distant past—an interesting situation indeed. It turns out that the NFS client code wills credentials to the interrupt thread’s t_cred, so what we are in fact seeing is a race condition, where vn_rele() is called from the interrupt thread with no credentials. In this case, a bug was logged accordingly and the problem was fixed!
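
The mismatch between the NULL that was passed down and the non-NULL t_cred seen in the dump follows from CRED() being evaluated once, at the call site. The toy, single-threaded model below shows only that ordering effect. All names (model_cred_t, model_thread_t, model_curthread, MODEL_CRED, cred_seen_by_caller) are hypothetical, and the code makes no claim about where in the NFS client the later store actually occurs.

```c
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-ins for cred_t and kthread_t. */
typedef struct model_cred {
	uint32_t cr_ref;
} model_cred_t;

typedef struct model_thread {
	model_cred_t *t_cred;
} model_thread_t;

/* Stand-in for the kernel's curthread; set explicitly in this model. */
model_thread_t *model_curthread;

/* Same shape as the CRED() macro quoted above. */
#define	MODEL_CRED()	(model_curthread->t_cred)

/*
 * Read the credential the way a caller of VOP_INACTIVE() would, then
 * store a new value into t_cred afterward. Returns the pointer that
 * would have been passed down the call chain.
 */
model_cred_t *
cred_seen_by_caller(model_thread_t *intr, model_cred_t *assigned_later)
{
	model_cred_t *passed_down;

	model_curthread = intr;
	passed_down = MODEL_CRED();	/* evaluated once, at the call site */
	intr->t_cred = assigned_later;	/* a later store to t_cred */
	return (passed_down);
}
```

Starting from an interrupt thread whose t_cred is NULL, the function returns NULL even though the thread structure holds a valid pointer afterward, which is the same picture the crash dump presents.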

Looking at the Status of the CPUs

Another good source of information is the ::cpuinfo dcmd. It shows a rich set of information about the processors in the system. For each CPU, the details of the thread currently running on that processor are shown. If the current CPU is handling an interrupt, then the thread running the interrupt and the preempted thread are shown. In addition, a list of threads waiting in the run queue for this processor is shown.

In this example, we can see that the idle thread was preempted by a level 6 interrupt. Three threads are on the run queue: the thread that was running immediately before preemption and two other threads waiting to be scheduled. We can examine each of these manually by passing its thread pointer to ::findstack.

> da509de0::findstack
stack pointer for thread da509de0: da509d08
  da509d3c swtch+0x165()
  da509d60 cv_timedwait+0xa3()
  da509dc8 taskq_d_thread+0x149()
  da509dd8 thread_start+8()

The CPU containing the thread that caused the panic will, we hope, be reported in the panic string and, furthermore, will be used by MDB as the default thread for other dcmds in the core image. Once we determine the status of the CPU, we can observe which thread was involved in the panic.

Additionally, we can use the CPU’s run queue (cpu_dispq) to provide a stack list for other threads queued up to run. We might do this just to gather a little more information about the circumstance in which the panic occurred.

> fec225b8::walk cpu_dispq |::thread
    ADDR    STATE  FLG PFLG SFLG   PRI  EPRI PIL     INTR DISPTIME BOUND PR
da509de0 run         8    0   13    60     0   0      n/a   7e6f9c    -1  0
da0cdde0 run         8 2000   13    60     0   0      n/a   7e8452    -1  0
da0d6de0 run         8 2000   13    60     0   0      n/a   7e8452    -1  0

> fec225b8::walk cpu_dispq |::findstack
stack pointer for thread da509de0: da509d08
  da509d3c swtch+0x165()
  da509d60 cv_timedwait+0xa3()
  da509dc8 taskq_d_thread+0x149()
  da509dd8 thread_start+8()
stack pointer for thread da0cdde0: da0cdd48
  da0cdd74 swtch+0x165()
  da0cdd84 cv_wait+0x4e()
  da0cddc8 nfs4_async_manager+0xc9()
  da0cddd8 thread_start+8()
stack pointer for thread da0d6de0: da0d6d48
  da0d6d74 swtch+0x165()
  da0d6d84 cv_wait+0x4e()
  da0d6dc8 nfs4_async_manager+0xc9()
  da0d6dd8 thread_start+8()

Traversing Stack Frames in SPARC Architectures

We briefly mentioned in Section 14.1.4 some of the problems we encounter when trying to glean argument values from stack backtraces. In the SPARC architecture, the values of the input arguments’ registers are saved into register windows at the exit of each function. In most cases, we can traverse the stack frames to look at the values of the registers as they are saved in register windows. Historically, this was done by manually traversing the stack frames (as illustrated in Panic!). Conveniently, MDB has a dcmd that understands and walks SPARC stack frames. We can use the ::stackregs dcmd to display the SPARC input registers and locals (%l0-%l7) for each frame on the stack.

> ::stackregs
000002a100d074c1 vpanic(12871f0, e, e, fffffffffffffffe, 1, 185d400)
  %l0-%l3:                0      2a100d07f10      2a100d07f40         ffffffff
  %l4-%l7: fffffffffffffffe                0          1845400          1287000
  px_err_fabric_intr+0xbc: call      -0x1946c0     <fm_panic>

000002a100d07571 px_err_fabric_intr+0xbc(600024f9880, 31, 340, 600024d75d0,
30000842020, 0)
  %l0-%l3:                0      2a100d07f10      2a100d07f40         ffffffff
  %l4-%l7: fffffffffffffffe                0          1845400          1287000
  px_msiq_intr+0x1ac:      call      -0x13b0       <px_err_fabric_intr>

000002a100d07651 px_msiq_intr+0x1ac(60002551db8, 0, 127dcc8, 6000252e9e0, 30000828a58,
30000842020)
  %l0-%l3:                0      2a100d07f10      2a100d07f40      2a100d07f10
  %l4-%l7:                0               31      30000842020      600024d21d8
  current_thread+0x174:    jmpl      %o5, %o7

000002a100d07751 current_thread+0x174(16, 2000, ddf7dfff, ddf7ffff, 2000, 12)
  %l0-%l3:          100994c      2a100cdf021                e              7b9
  %l4-%l7:                0                0                0      2a100cdf8d0
  cpu_halt+0x134:          call      -0x29dcc      <enable_vec_intr>

000002a100cdf171 cpu_halt+0x134(16, d, 184bbd0, 30001334000, 16, 1)
  %l0-%l3:      60001db16c8                0      60001db16c8 ffffffffffffffff
  %l4-%l7:                0                0                0           10371d0
  idle+0x124:              jmpl      %l7, %o7

000002a100cdf221 idle+0x124(1819800, 0, 30001334000, ffffffffffffffff, e, 1818400)
  %l0-%l3:      60001db16c8               1b                0 ffffffffffffffff
  %l4-%l7:                0                0                0          10371d0
  thread_start+4:          jmpl      %i7, %o7

000002a100cdf2d1 thread_start+4(0, 0, 0, 0, 0, 0)
  %l0-%l3:                0                0                0                0
  %l4-%l7:                0                0                0                0

SPARC input registers become output registers, which are then saved on the stack. To decide whether a register value shown in the backtrace is still the original argument, we need to know whether the function overwrote that register before it was saved in the stack frame. A common technique is to disassemble the target function and look for reuse of the input registers (%i0-%i7) in the function's code body. A quick and dirty way to look for register usage is to pipe the output of ::dis to the UNIX grep command; a complete examination of the code is left as an exercise for the reader. For example, to check whether the value of the first argument to cpu_halt() is valid, we can see whether %i0 is reused during cpu_halt() before we branch out at cpu_halt+0x134.

> cpu_halt::dis !grep i0
cpu_halt+0x24:                  ld        [%g1 + 0x394], %i0
cpu_halt+0x28:                  cmp       %i0, 1
cpu_halt+0x90:                  add       %i2, 0x120, %i0
cpu_halt+0xd0:                  srl       %i4, 0, %i0
cpu_halt+0x100:                 srl       %i4, 0, %i0
cpu_halt+0x144:                 ldub      [%i3 + 0xf9], %i0
cpu_halt+0x150:                 and       %i0, 0xfd, %l7
cpu_halt+0x160:                 add       %i2, 0x120, %i0

As we can see in this case, %i0 is reused very early in cpu_halt() and would be invalid in the stack backtrace.
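
The same check can be automated over saved ::dis output. The sketch below is a rough heuristic that relies on SPARC assembler syntax placing the destination register last. It assumes each line begins with a "function+offset:" label as in the output above, and the three function names are illustrative only. It does not model every instruction form.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return 1 if 'line' names 'reg' as its final operand, which in SPARC
 * assembler syntax is the destination; otherwise return 0.
 */
int
writes_register(const char *line, const char *reg)
{
	size_t len = strlen(line);
	size_t rlen = strlen(reg);
	char before;

	/* Ignore trailing whitespace. */
	while (len > 0 && (line[len - 1] == ' ' || line[len - 1] == '\t' ||
	    line[len - 1] == '\n'))
		len--;
	if (len < rlen || strncmp(line + len - rlen, reg, rlen) != 0)
		return (0);
	if (len == rlen)
		return (1);
	/* The match must be a whole operand, preceded by a separator. */
	before = line[len - rlen - 1];
	return (before == ' ' || before == ',' || before == '\t');
}

/*
 * Parse the hexadecimal offset from a label such as "cpu_halt+0x24:".
 * Returns -1 if the line has no "+offset" label before the colon.
 */
long
label_offset(const char *line)
{
	const char *plus = strchr(line, '+');
	const char *colon = strchr(line, ':');

	if (plus == NULL || colon == NULL || plus > colon)
		return (-1);
	return (strtol(plus + 1, NULL, 16));
}

/*
 * Return the first offset below 'limit' at which 'reg' is written,
 * or -1 if no such line is found.
 */
long
first_write_before(const char *const *lines, size_t n, const char *reg,
    long limit)
{
	size_t i;

	for (i = 0; i < n; i++) {
		long off = label_offset(lines[i]);

		if (off >= 0 && off < limit && writes_register(lines[i], reg))
			return (off);
	}
	return (-1);
}
```

Run against the grep output above with a limit of 0x134, the first match for %i0 is the ld at cpu_halt+0x24, while the cmp at cpu_halt+0x28 is correctly ignored because it only reads %i0.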

Listing Processes and Process Stacks

We can obtain the list of processes by using the ::ps dcmd. In addition, we can search for processes by using the pgrep(1)-like ::pgrep dcmd.

> ::ps -f
S    PID   PPID   PGID    SID    UID      FLAGS     ADDR NAME
R      0      0      0      0      0 0x00000001 fec1d280 sched
R      3      0      0      0      0 0x00020001 d318d248 fsflush
R      2      0      0      0      0 0x00020001 d318daa8 pageout
R      1      0      0      0      0 0x42004000 d318e308 /sbin/init
R   9066      1   9066   9066      1 0x52000400 da2b7130 /usr/lib/nfs/nfsmapid
R   9065      1   9063   9063      1 0x42000400 d965a978 /usr/lib/nfs/nfs4cbd
R   4125      1   4125   4125      0 0x42000400 d9659420 /local/local/bin/httpd -k start
R   9351   4125   4125   4125  40000 0x52000000 da2c0428 /local/local/bin/httpd -k start
R   4118      1   4117   4117      1 0x42000400 da2bc988 /usr/lib/nfs/nfs4cbd
R   4116      1   4116   4116      1 0x52000400 d8da7240 /usr/lib/nfs/nfsmapid
R   4105      1   4105   4105      0 0x42000400 d9664108 /usr/apache/bin/httpd
R   4263   4105   4105   4105  60001 0x52000000 da2bf368 /usr/apache/bin/httpd
...
> ::ps -t
S    PID   PPID   PGID    SID    UID      FLAGS     ADDR NAME
R      0      0      0      0      0 0x00000001 fec1d280 sched
        T        t0 <TS_STOPPED>
R      3      0      0      0      0 0x00020001 d318d248 fsflush
        T  0xd3108a00 <TS_SLEEP>
R      2      0      0      0      0 0x00020001 d318daa8 pageout
        T  0xd3108c00 <TS_SLEEP>
R      1      0      0      0      0 0x42004000 d318e308 init
        T  0xd3108e00 <TS_SLEEP>
R   9066      1   9066   9066      1 0x52000400 da2b7130 nfsmapid
        T  0xd942be00 <TS_SLEEP>
        T  0xda68f000 <TS_SLEEP>
        T  0xda4e8800 <TS_SLEEP>
        T  0xda48f800 <TS_SLEEP>
...
> ::pgrep http
S    PID   PPID   PGID    SID    UID      FLAGS     ADDR NAME
R   4125      1   4125   4125      0 0x42000400 d9659420 httpd
R   9351   4125   4125   4125  40000 0x52000000 da2c0428 httpd
R   4105      1   4105   4105      0 0x42000400 d9664108 httpd
R   4263   4105   4105   4105  60001 0x52000000 da2bf368 httpd
R   4111   4105   4105   4105  60001 0x52000000 da2b2138 httpd
...

We can observe several aspects of the user process by using the ptool-like dcmds.

> ::pgrep nscd
S    PID   PPID   PGID    SID    UID      FLAGS              ADDR NAME
R    575      1    575    575       0 0x42000000 ffffffff866f1878 nscd

> 0t575::pid2proc |::walk thread |::findstack
(or)
> ffffffff82f5f860::walk thread |::findstack
stack pointer for thread ffffffff866cb060: fffffe8000c7fdd0
[ fffffe8000c7fdd0 _resume_from_idle+0xde() ]
  fffffe8000c7fe10 swtch+0x185()
  fffffe8000c7fe80 cv_wait_sig_swap_core+0x17a()
  fffffe8000c7fea0 cv_wait_sig_swap+0x1a()
  fffffe8000c7fec0 pause+0x59()
  fffffe8000c7ff10 sys_syscall32+0x101()
...

> ffffffff866f1878::ptree
fffffffffbc23640  sched
     ffffffff82f6b148  init
          ffffffff866f1878  nscd

> ffffffff866f1878::pfiles
FD   TYPE            VNODE INFO
   0  CHR ffffffff833d4700 /devices/pseudo/mm@0:null
   1  CHR ffffffff833d4700 /devices/pseudo/mm@0:null
   2  CHR ffffffff833d4700 /devices/pseudo/mm@0:null
   3 DOOR ffffffff86a0eb40 [door to 'nscd' (proc=ffffffff866f1878)]
   4 SOCK ffffffff835381c0

> ffffffff866f1878::pmap
             SEG             BASE     SIZE      RES PATH
ffffffff85e416c0 0000000008046000       8k        8k [ anon ]
ffffffff866ab5e8 0000000008050000      48k           /usr/sbin/nscd
ffffffff839b1950 000000000806c000       8k        8k /usr/sbin/nscd
ffffffff866ab750 000000000806e000     520k      480k [ anon ]
...

Global Memory Summary

The major buckets of memory allocation are available with the ::memstat dcmd.

> ::memstat
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                      49022               191   19%
Anon                        68062               265   27%
Exec and libs                3951                15    2%
Page cache                   4782                18    2%
Free (cachelist)             7673                29    3%
Free (freelist)            118301               462   47%

Total                      251791               983
Physical                   251789               983
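
The MB and %Tot columns can be reproduced from the page counts. The sketch below assumes a 4096-byte base page for this i86pc system, truncation for the megabyte figure, and rounding to the nearest whole number for the percentage. These choices match the numbers shown above, but they are inferred from the output rather than taken from the ::memstat source, and both function names are illustrative.

```c
#include <stdint.h>

/* Convert a page count to whole megabytes, discarding any fraction. */
uint64_t
pages_to_mb(uint64_t pages, uint64_t pagesize)
{
	return ((pages * pagesize) / (1024 * 1024));
}

/* Express 'pages' as a percentage of 'total', rounded to nearest. */
unsigned
percent_of_total(uint64_t pages, uint64_t total)
{
	if (total == 0)
		return (0);
	return ((unsigned)((pages * 100 + total / 2) / total));
}
```

For example, the Kernel row's 49022 pages come to 191 MB and 19% of the 251791-page total, and the Exec and libs row's 3951 pages round up to 2%.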

Listing Network Connections

We can use the ::netstat dcmd to obtain the list of network connections.

> ::netstat
TCPv4    St   Local Address        Remote Address        Zone
da348600  6      10.0.5.104.63710       10.0.5.10.38189     7
da348a80  0      10.0.5.106.1016        10.0.5.10.2049      2
da34fc40  0      10.0.5.108.1018        10.0.5.10.2049      3
da3501c0  0      10.0.4.106.22       192.18.42.17.64836     2
d8ed2800  0      10.0.4.101.22       192.18.42.17.637
...

Listing All Kernel Threads

A stack backtrace of all threads in the kernel can be obtained with the ::threadlist dcmd. (If you are familiar with adb, this is a modern version of adb’s $<threadlist macro). With this dcmd, we can quickly and easily capture a useful snapshot of all current activity in text form, for deeper analysis.

> ::threadlist
    ADDR     PROC      LWP CMD/LWPID
fec1dae0 fec1d280 fec1fdc0 sched/1
d296cde0 fec1d280        0 idle()
d2969de0 fec1d280        0 taskq_thread()
d2966de0 fec1d280        0 taskq_thread()
d2963de0 fec1d280        0 taskq_thread()
d2960de0 fec1d280        0 taskq_thread()
d29e3de0 fec1d280        0 taskq_thread()
d29e0de0 fec1d280        0 taskq_thread()
...
> ::threadlist -v
    ADDR     PROC      LWP CLS PRI     WCHAN
fec1dae0 fec1d280 fec1fdc0   0  96         0
  PC: 0xfe82b507    CMD: sched
  stack pointer for thread fec1dae0: fec33df8
    swtch+0x165()
    sched+0x3aa()
    main+0x365()

d296cde0 fec1d280        0   0  -1        0
  PC: 0xfe82b507    THREAD: idle()
  stack pointer for thread d296cde0: d296cd88
    swtch+0x165()
    idle+0x32()
    thread_start+8()
...

# echo "::threadlist" |mdb -k >mythreadlist.txt

Other Notable Kernel dcmds

The ::findleaks dcmd efficiently detects memory leaks in kernel crash dumps when the full set of kmem debug features has been enabled. The first execution of ::findleaks processes the dump for memory leaks (this can take a few minutes), then coalesces the leaks by the allocation stack trace. The ::findleaks report shows a bufctl address and the topmost stack frame for each memory leak that was identified. See Section 11.4.9.1 in Solaris™ Internals for more information on ::findleaks.

> ::findleaks
CACHE     LEAKED   BUFCTL CALLER
70039ba8       1 703746c0 pm_autoconfig+0x708
70039ba8       1 703748a0 pm_autoconfig+0x708
7003a028       1 70d3b1a0 sigaddq+0x108
7003c7a8       1 70515200 pm_ioctl+0x187c
------------------------------------------------------
   Total       4 buffers, 376 bytes

If the -v option is specified, the dcmd prints more verbose messages as it executes. If an explicit address is specified prior to the dcmd, the report is filtered and only leaks whose allocation stack traces contain the specified function address are displayed.

The ::vatopfn dcmd translates virtual addresses to physical addresses, using the appropriate platform translation tables.

> fec4b8d0::vatopfn
        level=1 htable=d9d53848 pte=30007e3
Virtual fec4b8d0 maps Physical 304b8d0
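
The physical address reported above can be checked by hand from the pte value. The sketch below assumes that level=1 denotes a large page whose offset is taken from the low bits of the virtual address (4 MB, a shift of 22, on 32-bit x86 without PAE). It ignores any high-order flag bits in the entry, and va_to_pa is an illustrative name rather than an MDB or kernel function.

```c
#include <stdint.h>

/*
 * Combine the page base held in 'pte' with the in-page offset from 'va'.
 * 'page_shift' is log2 of the page size: 12 for 4 KB, 22 for 4 MB.
 */
uint64_t
va_to_pa(uint64_t va, uint64_t pte, unsigned page_shift)
{
	uint64_t offset_mask = ((uint64_t)1 << page_shift) - 1;

	/* Masking the pte drops the flag bits stored in its low bits. */
	return ((pte & ~offset_mask) | (va & offset_mask));
}
```

With va = 0xfec4b8d0, pte = 0x30007e3, and a shift of 22, the result is 0x304b8d0, matching the ::vatopfn output. Treating the same entry as a 4 KB mapping (shift of 12) would give 0x30008d0 instead, which is why the level matters.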

The ::whatis dcmd attempts to determine if the address is a pointer to a kmem-managed buffer or another type of special memory region, such as a thread stack, and reports its findings. When the -a option is specified, the dcmd reports all matches instead of just the first match to its queries. When the -b option is specified, the dcmd also attempts to determine if the address is referred to by a known kmem bufctl. When the -v option is specified, the dcmd reports its progress as it searches various kernel data structures. See Section 11.4.9.2 in Solaris™ Internals for more information on ::whatis.

> 0x705d8640::whatis
705d8640 is 705d8640+0, allocated from streams_mblk

The ::kgrep dcmd lets you search the kernel for occurrences of a supplied value. This is particularly useful when you are trying to debug software with multiple instances of a value.

> 0x705d8640::kgrep
400a3720
70580d24
7069d7f0
706a37ec
706add34

Examining User Process Stacks within a Kernel Image

A kernel crash dump can save memory pages of user processes in Solaris. We explain how to save process memory pages and how to examine user processes by using the kernel crash dump.

Enabling Process Pages in a Dump

We must modify the dump configuration to save process pages. We first check the current dump configuration by running dumpadm with no options.

# /usr/sbin/dumpadm
         Dump content: all pages
          Dump device: /dev/dsk/c0t0d0s1 (swap)
   Savecore directory: /var/crash/example
     Savecore enabled: yes

If Dump content is not all pages or curproc, no process memory page will be dumped. In that case, we run dumpadm -c all or dumpadm -c curproc.

Invoking MDB to Examine the Kernel Image

We gather a crash dump, open it with mdb, and confirm that it contains user pages.

# /usr/bin/mdb unix.0 vmcore.0
   Loading modules: [ unix krtld genunix ufs_log ip nfs random ptm
   logindmux ]

> ::status
debugging crash dump vmcore.0 (64-bit) from rmcferrari
operating system: 5.11 snv_31 (i86pc)
panic message: forced crash dump initiated at user request
dump content: all kernel and user pages

The dump content line shows that this dump includes user pages.

Locating the Target Process

Next, we locate the process we are interested in. We use nscd as the target of this test case. The first thing to find is the address of the process.

> ::pgrep nscd
S    PID   PPID   PGID    SID    UID      FLAGS             ADDR NAME
R    575      1    575    575      0 0x42000000 ffffffff866f1878 nscd

The address of the process is ffffffff866f1878. As a sanity check, we can look at the kernel stack of each thread in the process. We’ll use these later to double-check that the user stack matches the kernel stack for those threads blocked in a system call.

> 0t575::pid2proc |::print proc_t p_tlist |::list kthread_t t_forw |::findstack
stack pointer for thread ffffffff866cb060: fffffe8000c7fdd0
[ fffffe8000c7fdd0 _resume_from_idle+0xde() ]
  fffffe8000c7fe10 swtch+0x185()
  fffffe8000c7fe80 cv_wait_sig_swap_core+0x17a()
  fffffe8000c7fea0 cv_wait_sig_swap+0x1a()
  fffffe8000c7fec0 pause+0x59()
  fffffe8000c7ff10 sys_syscall32+0x101()
stack pointer for thread ffffffff866cc140: fffffe8000c61d70
[ fffffe8000c61d70 _resume_from_idle+0xde() ]
  fffffe8000c61db0 swtch+0x185()
  fffffe8000c61e10 cv_wait_sig+0x150()
  fffffe8000c61e50 door_unref+0x94()
  fffffe8000c61ec0 doorfs32+0x90()
  fffffe8000c61f10 sys_syscall32+0x101()
stack pointer for thread ffffffff866cba80: fffffe8000c6dd10
[ fffffe8000c6dd10 _resume_from_idle+0xde() ]
  fffffe8000c6dd50 swtch_to+0xc9()
  fffffe8000c6ddb0 shuttle_resume+0x376()
  fffffe8000c6de50 door_return+0x228()
  fffffe8000c6dec0 doorfs32+0x157()
  fffffe8000c6df10 sys_syscall32+0x101()
stack pointer for thread ffffffff866cb720: fffffe8000c73cf0
[ fffffe8000c73cf0 _resume_from_idle+0xde() ]
  fffffe8000c73d30 swtch+0x185()
  fffffe8000c73db0 cv_timedwait_sig+0x1a3()
  fffffe8000c73e30 cv_waituntil_sig+0xab()
  fffffe8000c73ec0 nanosleep+0x141()
  fffffe8000c73f10 sys_syscall32+0x101()
...

It appears that the first few threads on the process are blocked in the pause(), door(), and nanosleep() system calls. We’ll double-check against these later when we traverse the user stacks.

Extracting the User-Mode Stack Frame Pointers

The next things to find are the stack pointers for the user threads, which are stored in each thread’s lwp.

> ffffffff866f1878::walk thread |::print kthread_t t_lwp->lwp_regs |::print "struct regs" r_rsp |=X
                8047d54         fecc9f80        febbac08        fea9df78        fe99df78        fe89df78        fe79df78
                fe69df78        fe59df78        fe49df78        fe39df58        fe29df58        fe19df58        fe09df58
                fdf9df58        fde9df58        fdd9df58        fdc9df58        fdb9df58        fda9df58        fd99df58
                fd89d538        fd79bc08

Each entry is a thread’s stack pointer in the user process’s address space. We can use these to traverse the stack in the user process’s context.

Switching MDB to Debug a Specific Process

The mdb command <proc address>::context switches the debugger context to the specified user process.

> ffffffff866f1878::context
debugger context set to proc ffffffff866f1878

After the context is switched, several mdb commands return process information rather than kernel information. For example:

> ::nm
Value              Size               Type  Bind  Other Shndx    Name
0x0000000000000000|0x0000000000000000|NOTY |LOCL |0x0  |UNDEF   |
0x0000000008056c29|0x0000000000000076|FUNC |GLOB |0x0  |10      |gethost_revalidate
0x0000000008056ad2|0x0000000000000024|FUNC |GLOB |0x0  |10      |getgr_uid_reaper
0x000000000805be5f|0x0000000000000000|OBJT |GLOB |0x0  |14      |_etext
0x0000000008052778|0x0000000000000000|FUNC |GLOB |0x0  |UNDEF   |strncpy
0x0000000008052788|0x0000000000000000|FUNC |GLOB |0x0  |UNDEF   |_uncached_getgrnam_r
0x000000000805b364|0x000000000000001b|FUNC |GLOB |0x0  |12      |_fini
0x0000000008058f54|0x0000000000000480|FUNC |GLOB |0x0  |10      |nscd_parse
0x0000000008052508|0x0000000000000000|FUNC |GLOB |0x0  |UNDEF   |pause
0x00000000080554e0|0x0000000000000076|FUNC |GLOB |0x0  |10      |getpw_revalidate
...

> ::mappings
            BASE            LIMIT             SIZE NAME
         8046000          8048000             2000 [ anon ]
         8050000          805c000             c000 /usr/sbin/nscd
         806c000          806e000             2000 /usr/sbin/nscd
         806e000          80f0000            82000 [ anon ]
        fd650000         fd655000             5000 /lib/nss_files.so.1
        fd665000         fd666000             1000 /lib/nss_files.so.1
        fd680000         fd690000            10000 [ anon ]
        fd6a0000         fd79e000            fe000 [ anon ]
        fd7a0000         fd89e000            fe000 [ anon ]
...

Constructing the Process Stack

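Unlike examining the kernel, where we would ordinarily use stack-related mdb commands like ::stack or ::findstack, we need to use the saved stack pointers to traverse a process stack by hand. In this case, nscd is a 32-bit x86 application, so within each frame the previous frame pointer is saved at 0(%ebp) and the return address at 4(%ebp), as the frame layouts below show. (The offsets “stack pointer + 0x38” and “stack pointer + 0x3c” for the previous frame’s stack pointer and program counter apply to 32-bit SPARC frames, not to x86.)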

/*
 * In the Intel world, a stack frame looks like this:
 *
 * %fp0->|                               |
 *       |-------------------------------|
 *       |  Args to next subroutine      |
 *       |-------------------------------|-
 * %sp0->|  One-word struct-ret address  | |
 *       |-------------------------------|  > minimum stack frame
 * %fp1->|  Previous frame pointer (%fp0)| |
 *       |-------------------------------|-/
 *       |  Local variables              |
 * %sp1->|-------------------------------|
 *
 * For amd64, the minimum stack frame is 16 bytes and the frame pointer must
 * be 16-byte aligned.
 */

struct frame {
        greg_t  fr_savfp;               /* saved frame pointer */
        greg_t  fr_savpc;               /* saved program counter */
};

#ifdef _SYSCALL32

/*
 * Kernel's view of a 32-bit stack frame.
 */
struct frame32 {
        greg32_t fr_savfp;              /* saved frame pointer */
        greg32_t fr_savpc;              /* saved program counter */
};

#endif  /* _SYSCALL32 */
                                                                        See sys/stack.h

Each individual stack frame is defined as follows:

/*
 * In the x86 world, a stack frame looks like this:
 *
 *              |---------------------------|
 * 4n+8(%ebp) ->| argument word n           |
 *              | ...                       |    (Previous frame)
 *    8(%ebp) ->| argument word 0           |
 *              |---------------------------|--------------------
 *    4(%ebp) ->| return address            |
 *              |---------------------------|
 *    0(%ebp) ->| previous %ebp (optional)  |
 *              |---------------------------|
 *   -4(%ebp) ->| unspecified               |    (Current frame)
 *              | ...                       |
 *    0(%esp) ->| variable size             |
 *              |---------------------------|
 */
                                                                        See sys/stack.h
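
The frame layout above implies a simple traversal: read the previous %ebp at 0(%ebp) and the return address at 4(%ebp), then repeat from the previous frame. The following sketch shows that loop over a toy memory image. It assumes the code was built with frame pointers; the dictionary contents are synthetic values chosen for illustration, not data from the dump.

```python
from typing import Dict, List, Tuple


def walk_frames(mem: Dict[int, int], fp: int,
                max_frames: int = 64) -> List[Tuple[int, int]]:
    """Follow saved frame pointers; return (frame pointer, return address) pairs."""
    frames = []
    while fp != 0 and len(frames) < max_frames:
        saved_fp = mem.get(fp)        # 0(%ebp): previous %ebp
        saved_pc = mem.get(fp + 4)    # 4(%ebp): return address
        if saved_fp is None or saved_pc is None:
            break                     # ran off the memory we have
        frames.append((fp, saved_pc))
        if saved_fp <= fp:            # the stack grows down, so callers sit higher
            break
        fp = saved_fp
    return frames


if __name__ == "__main__":
    # Synthetic two-frame stack: address -> 32-bit word stored there.
    toy_stack = {
        0x8047d00: 0x8047d40, 0x8047d04: 0xfedac74f,
        0x8047d40: 0x0,       0x8047d44: 0x8052508,
    }
    for frame, pc in walk_frames(toy_stack, 0x8047d00):
        print(hex(frame), hex(pc))
```

The check that each saved frame pointer is higher than the current one guards against looping forever on a corrupted or frame-pointer-less stack.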

We can explore the stack frames from Section 14.2.4.

> ffffffff866f1878::walk thread |::print kthread_t t_lwp->lwp_regs |::print "struct regs" r_rsp |=X
                8047d54         fecc9f80        febbac08        fea9df78        fe99df78        fe89df78        fe79df78
                fe69df78        fe59df78        fe49df78        fe39df58        fe29df58        fe19df58        fe09df58
                fdf9df58        fde9df58        fdd9df58        fdc9df58        fdb9df58        fda9df58        fd99df58
                fd89d538        fd79bc08

> 8047d54/X
0x8047d54:      fedac74f
> fedac74f/
libc.so.1`pause+0x67:           8e89c933        = xorl   %ecx,%ecx

> febbac08/X
0xfebbac08:     feda83ec
> feda83ec/
libc.so.1`_door_return+0xac:    eb14c483        = addl   $0x14,%esp

> fea9df78/X
0xfea9df78:     fedabe4c
> fedabe4c/
libc.so.1`_sleep+0x88:          8908c483        = addl   $0x8,%esp

Thus, we observe user stacks of pause(), door_return(), and sleep(), as we expected.
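
The symbol+offset form that mdb prints for each return address comes from finding the nearest symbol at or below the address. The sketch below mimics that lookup. The symbol start addresses are not printed anywhere in the session; they are derived here by subtracting each printed offset from its return address, and the function name resolve is hypothetical.

```python
import bisect

# Start addresses derived from the output above, e.g. 0xfedac74f - 0x67.
SYMBOLS = sorted([
    (0xfeda83ec - 0xac, "_door_return"),
    (0xfedabe4c - 0x88, "_sleep"),
    (0xfedac74f - 0x67, "pause"),
])
STARTS = [start for start, _ in SYMBOLS]


def resolve(addr):
    """Return (symbol, offset) for the nearest symbol at or below addr."""
    i = bisect.bisect_right(STARTS, addr) - 1
    if i < 0:
        return None                   # below the first known symbol
    start, name = SYMBOLS[i]
    return name, addr - start


if __name__ == "__main__":
    for ret in (0xfedac74f, 0xfeda83ec, 0xfedabe4c):
        name, off = resolve(ret)
        print("%x = %s+0x%x" % (ret, name, off))
```

This simplified lookup ignores symbol sizes, so an address that lies past the end of a function would still be attributed to it; a real debugger has more information available to avoid that.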

Examining the Process Memory

In the process context, we can examine process memory as usual. For example, we can disassemble instructions from a process’s address space:

> libc.so.1`_sleep+0x88::dis
libc.so.1`_sleep+0x67:          pushq   $-0x13
libc.so.1`_sleep+0x69:          call    -0x5cb59 <0xfed4f2d4>
libc.so.1`_sleep+0x6e:          addl    $0x4,%esp
libc.so.1`_sleep+0x71:          movl    %esp,%eax
libc.so.1`_sleep+0x73:          movl    %eax,0x22c(%rsi)
libc.so.1`_sleep+0x79:          leal    0x14(%rsp),%eax
libc.so.1`_sleep+0x7d:          pushq   %rax
libc.so.1`_sleep+0x7e:          leal    0x10(%rsp),%eax
libc.so.1`_sleep+0x82:          pushq   %rax
libc.so.1`_sleep+0x83:          call    +0xc419  <0xfedb8260>
libc.so.1`_sleep+0x88:          addl    $0x8,%esp
libc.so.1`_sleep+0x8b:          movl    %edi,0x22c(%rsi)
libc.so.1`_sleep+0x91:          movb    0xb3(%rsi),%cl
libc.so.1`_sleep+0x97:          movb    %cl,0xb2(%rsi)
libc.so.1`_sleep+0x9d:          jmp     +0x14    <libc.so.1`_sleep+0xb1>
libc.so.1`_sleep+0x9f:          leal    0x14(%rsp),%eax
libc.so.1`_sleep+0xa3:          pushq   %rax
libc.so.1`_sleep+0xa4:          leal    0x10(%rsp),%eax
libc.so.1`_sleep+0xa8:          pushq   %rax
libc.so.1`_sleep+0xa9:          call    +0xc3f3  <0xfedb8260>
libc.so.1`_sleep+0xae:          addl    $0x8,%esp
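
The listing can be cross-checked against the return address found on the stack earlier. The sketch below recomputes the call targets and the return address. It assumes that the displacement in the listing is measured from the start of the call instruction (which is what the printed numbers imply) and that each call is five bytes long, as the offsets +0x83 and +0x88 suggest. The base address of _sleep is derived from the return address fedabe4c, and the function names are hypothetical.

```python
# _sleep+0x88 was the return address fedabe4c, so _sleep starts 0x88 below it.
SLEEP_BASE = 0xfedabe4c - 0x88


def call_target(func_base: int, insn_offset: int, displacement: int) -> int:
    """Target of a call whose displacement is shown relative to the call itself."""
    return func_base + insn_offset + displacement


def return_address(func_base: int, insn_offset: int, call_length: int = 5) -> int:
    """Address of the instruction that follows the call."""
    return func_base + insn_offset + call_length


if __name__ == "__main__":
    print(hex(call_target(SLEEP_BASE, 0x83, 0xc419)))    # prints 0xfedb8260
    print(hex(call_target(SLEEP_BASE, 0xa9, 0xc3f3)))    # prints 0xfedb8260
    print(hex(call_target(SLEEP_BASE, 0x69, -0x5cb59)))  # prints 0xfed4f2d4
    print(hex(return_address(SLEEP_BASE, 0x83)))         # prints 0xfedabe4c
```

Both calls at +0x83 and +0xa9 reach the same target, and the instruction after the call at +0x83 is exactly the return address that was read from the thread’s stack.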

`kmdb`, the Kernel Modular Debugger

The userland debugger, mdb, debugs the running kernel and kernel crash dumps. It can also control and debug live user processes as well as user core dumps. kmdb extends the debugger’s functionality to include instruction-level execution control of the kernel. mdb, by contrast, can only observe the running kernel.

The goal for kmdb is to bring the advanced debugging functionality of mdb, to the maximum extent practicable, to in-situ kernel debugging. This includes loadable-debugger module support, debugger commands, ability to process symbolic debugging information, and the various other features that make mdb so powerful.

kmdb is often compared with tracing tools like DTrace. DTrace is designed for tracing in the large: safely examining kernel and user process execution at a function level, with minimal impact upon the running system. kmdb, on the other hand, grabs the system by the throat, stopping it in its tracks. It then allows for micro-level (per-instruction) analysis, letting users observe the execution of individual instructions and both observe and change processor state. Whereas DTrace spends a great deal of energy trying to be safe, kmdb scoffs at safety, letting developers wreak unpleasantness upon the machine in furtherance of the debugging of their code.

Diagnosing with `kmdb` and `moddebug`

Diagnosing problems with kmdb builds on the techniques used with mdb. In this section, we cover some basic examples of how to start kmdb, boot the system under it, and collect information about hangs and panics.

Starting `kmdb` from the Console

kmdb can be started from the command line of the console login with mdb and the -K option.

# mdb -K

Welcome to kmdb
Loaded modules: [ audiosup cpc uppc ptm ufs unix zfs krtld s1394 sppp nca lofs
genunix ip logindmux usba specfs pcplusmp nfs md random sctp ]
[0]> $c
kmdbmod`kaif_enter+8()
kdi_dvec_enter+0x13()
kmdbmod`kctl_modload_activate+0x112(0, fffffe85ad938000, 1)
kmdb`kdrv_activate+0xfa(4c6450)
kmdb`kdrv_ioctl+0x32(ab00000000, db0001, 4c6450, 202001, ffffffff8b483570,
fffffe8000c48edc)
cdev_ioctl+0x55(ab00000000, db0001, 4c6450, 202001, ffffffff8b483570,
fffffe8000c48edc)
specfs`spec_ioctl+0x99(ffffffffbc4cc880, db0001, 4c6450, 202001,
ffffffff8b483570, fffffe8000c48edc)
fop_ioctl+0x2d(ffffffffbc4cc880, db0001, 4c6450, 202001, ffffffff8b483570,
fffffe8000c48edc)
ioctl+0x180(4, db0001, 4c6450)
sys_syscall+0x17b()
[0]> :c

Booting with the Kernel Debugger

If you experience hangs or panics during Solaris boot, whether during installation or after you’ve already installed, using the kernel debugger can be a big help in collecting the first set of “what happened” information.

You invoke the kernel debugger by supplying the -k switch in the kernel boot arguments. So a common request from a kernel engineer starting to examine a problem is often “try booting with kmdb.”

Sometimes it’s useful either to set a breakpoint to pause the kernel startup and examine something, or to just set a kernel variable to enable or disable a feature or to enable debugging output. If you use -k to invoke kmdb but also supply the -d switch, the debugger will be entered before the kernel really starts to do anything of consequence, so you can set kernel variables or breakpoints.

To enter the debugger at boot with Solaris 10, enter b -kd at the appropriate prompt; this is slightly different whether you’re installing or booting an already installed system.

ok boot kmdb -d
Loading kmdb...

Welcome to kmdb
[0]>

If, instead, you’re doing this with a system where GRUB boots Solaris, you add the -kd to the “kernel” line in the GRUB menu entry (you can edit GRUB menu entries for this boot by using the GRUB menu interface, and the “e” (for edit) key).

kernel /platform/i86pc/multiboot -kd -B console=ttya

Either way, you’ll drop into the kernel debugger in short order, which will announce itself with this prompt:

[0]>

Now we’re in the kernel debugger. The number in square brackets is the CPU that is running the kernel debugger; that number might change for later entries into the debugger.

Configuring a tty Console on x86

Solaris uses a bitmap screen and keyboard by default. To facilitate remote debugging, it is often desirable to configure the system to use a serial tty console. To do this, change the bootenv.rc and GRUB boot configuration.

setprop ttya-rts-dtr-off true
setprop console 'text'
                                                           See /boot/solaris/bootenv.rc

Edit the GRUB boot configuration to include -B console=ttya via the GRUB menu at boot time, or via bootadm(1M).

kernel /platform/i86pc/multiboot -kd -B console=ttya

Investigating Hangs

For investigating hangs, try turning on module debugging output. You can set the value of a kernel variable by using the /W command (“write a 32-bit value”). Here’s how you set moddebug to 0x80000000 and then continue execution of the kernel.

[0]> moddebug/W 80000000
[0]> :c

This command gives you debug output for each kernel module that loads. The bit masks for moddebug are shown below. Often, 0x80000000 is sufficient for the majority of initial exploratory debugging.

/*
 * bit definitions for moddebug.
 */
#define MODDEBUG_LOADMSG        0x80000000       /* print "[un]loading..." msg */
#define MODDEBUG_ERRMSG         0x40000000       /* print detailed error msgs */
#define MODDEBUG_LOADMSG2       0x20000000       /* print 2nd level msgs */
#define MODDEBUG_FINI_EBUSY     0x00020000       /* pretend fini returns EBUSY */
#define MODDEBUG_NOAUL_IPP      0x00010000       /* no Autounloading ipp mods */
#define MODDEBUG_NOAUL_DACF     0x00008000       /* no Autounloading dacf mods */
#define MODDEBUG_KEEPTEXT       0x00004000       /* keep text after unloading */
#define MODDEBUG_NOAUL_DRV      0x00001000       /* no Autounloading Drivers */
#define MODDEBUG_NOAUL_EXEC     0x00000800       /* no Autounloading Execs */
#define MODDEBUG_NOAUL_FS       0x00000400       /* no Autounloading File sys */
#define MODDEBUG_NOAUL_MISC     0x00000200       /* no Autounloading misc */
#define MODDEBUG_NOAUL_SCHED    0x00000100       /* no Autounloading scheds */
#define MODDEBUG_NOAUL_STR      0x00000080       /* no Autounloading streams */
#define MODDEBUG_NOAUL_SYS      0x00000040       /* no Autounloading syscalls */
#define MODDEBUG_NOCTF          0x00000020       /* do not load CTF debug data */
#define MODDEBUG_NOAUTOUNLOAD   0x00000010       /* no autounloading at all */
#define MODDEBUG_DDI_MOD        0x00000008       /* ddi_mod{open,sym,close} */
#define MODDEBUG_MP_MATCH       0x00000004       /* dev_minorperm */
#define MODDEBUG_MINORPERM      0x00000002       /* minor perm modctls */
#define MODDEBUG_USERDEBUG      0x00000001       /* bpt after init_module() */
                                                                        See sys/modctl.h
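
Because moddebug is a bit mask, several of these flags can be combined into one value before it is written with /W. The sketch below is a small helper for composing and decoding such values; the table is copied from the definitions above, while the function names are hypothetical.

```python
# Bit definitions copied from the sys/modctl.h excerpt above.
MODDEBUG_FLAGS = {
    0x80000000: "MODDEBUG_LOADMSG",   0x40000000: "MODDEBUG_ERRMSG",
    0x20000000: "MODDEBUG_LOADMSG2",  0x00020000: "MODDEBUG_FINI_EBUSY",
    0x00010000: "MODDEBUG_NOAUL_IPP", 0x00008000: "MODDEBUG_NOAUL_DACF",
    0x00004000: "MODDEBUG_KEEPTEXT",  0x00001000: "MODDEBUG_NOAUL_DRV",
    0x00000800: "MODDEBUG_NOAUL_EXEC", 0x00000400: "MODDEBUG_NOAUL_FS",
    0x00000200: "MODDEBUG_NOAUL_MISC", 0x00000100: "MODDEBUG_NOAUL_SCHED",
    0x00000080: "MODDEBUG_NOAUL_STR", 0x00000040: "MODDEBUG_NOAUL_SYS",
    0x00000020: "MODDEBUG_NOCTF",     0x00000010: "MODDEBUG_NOAUTOUNLOAD",
    0x00000008: "MODDEBUG_DDI_MOD",   0x00000004: "MODDEBUG_MP_MATCH",
    0x00000002: "MODDEBUG_MINORPERM", 0x00000001: "MODDEBUG_USERDEBUG",
}
BY_NAME = {name: bit for bit, name in MODDEBUG_FLAGS.items()}


def encode_moddebug(*names):
    """OR together the named flags into a single moddebug value."""
    value = 0
    for name in names:
        value |= BY_NAME[name]
    return value


def decode_moddebug(value):
    """List the flag names set in value, highest bit first."""
    return [name for bit, name in sorted(MODDEBUG_FLAGS.items(), reverse=True)
            if value & bit]


if __name__ == "__main__":
    combined = encode_moddebug("MODDEBUG_LOADMSG", "MODDEBUG_ERRMSG")
    print("%x" % combined)            # prints c0000000
    print(decode_moddebug(combined))
```

The hexadecimal value printed by the first line is in the form that the /W command expects, for example moddebug/W c0000000 to enable both load messages and detailed error messages.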

Collecting Information about Panics

When the kernel panics, it drops into the debugger and prints some interesting information. Usually the most useful item is the stack backtrace, which shows, in reverse order, all the functions that were active at the time of the panic. To generate a stack backtrace, use the following:

[0]> $c

Two other useful commands during a panic are ::msgbuf and ::status, as shown in Section 14.1.

[0]> ::msgbuf
[0]> ::status

::msgbuf shows the last things the kernel printed onscreen, and ::status shows a summary of the state of the machine in panic.

If you’re running the kernel while the kernel debugger is active and you experience a hang, you may be able to break into the debugger to examine the system state; you can do this by pressing the <F1> and <A> keys at the same time (a sort of “F1-shifted-A” keypress). (On SPARC systems, this key sequence is <Stop>-<A>.) This should give you the same debugger prompt as above, although on a multi-CPU system you may see that the CPU number in the prompt is something other than 0. Once in the kernel debugger, you can get a stack backtrace as above; you can also use ::switch to change the CPU and get stack backtraces on the different CPU, which might shed more light on the hang. For instance, if you break into the debugger on CPU 1, you could switch to CPU 0 with the following:

[1]> 0::switch

Working with Debugging Targets

For the most part, the execution control facilities provided by kmdb for the kernel mirror those provided by the mdb process target. Breakpoints (::bp), watchpoints (::wp), ::cont, and the various flavors of ::step can be used.

We discuss more about debugging targets in Section 13.3 and Section 14.1. The common commands for controlling kmdb targets are summarized in Table 14.1.

Table 14.1. Core kmdb dcmds

dcmd	Description
::status	Print summary of current target.
$r, ::regs	Display current register values for target.
$c, ::stack, $C	Print current stack trace (`$C`: with frame pointers).
addr[,b] ::dump [-g sz] [-e]	Dump at least `b` bytes starting at address `addr`. `-g` sets the group size; for 64-bit debugging, `-g 8` is useful.
addr::dis	Disassemble text, starting around `addr`.
[addr] :b, [addr] ::bp [+/-dDestT] [-n count] sym ... addr	Set breakpoint at `addr`.
$b	Display all breakpoints.
::branches	Display the last branches taken by the CPU. (x86 only)
addr ::delete [id | all], addr :d [id | all]	Delete a breakpoint at `addr`.
:z	Delete all breakpoints.
function ::call [arg [arg ...]]	Call the specified function, using the specified arguments.
[cpuid] ::cpuregs [-c cpuid]	Display the current general-purpose register set.
[cpuid] ::cpustack [-c cpuid]	Print a C stack backtrace for the specified CPU.
::cont, :c	Continue the target program.
$M	List the macro files that are cached by `kmdb` for use with the `$<` dcmd.
::next, :e	Step the target program one instruction, but step over subroutine calls.
::step [branch | over | out]	Step the target program one instruction.
$<systemdump	Initiate a panic/dump.
::quit [-u], $q	Cause the debugger to exit. When the `-u` option is used, the system is resumed and the debugger is unloaded.
addr[,len] ::wp [+/-dDestT] [-rwx] [-ip] [-n count], addr[,len] :a [cmd ...], addr[,len] :p [cmd ...], addr[,len] :w [cmd ...]	Set a watchpoint at the specified address (`:a` read access, `:p` execute access, `:w` write access).

Setting Breakpoints

Setting breakpoints with kmdb is done in the same way as with generic mdb targets, using the :b dcmd. Refer to Table 13.12 for a complete list of debugger dcmds.

# mdb -K
Loaded modules: [ crypto ]
kmdb: target stopped at:
kmdbmod`kaif_enter+8:   popfq
[0]> resume:b
[0]> :c
kmdb: stop at resume
kmdb: target stopped at:
resume:         movq   %gs:0x18,%rax
[0]> :z
[0]> :c
#

Forcing a Crash Dump with `halt -d`

The following example shows how to force a crash dump of an x86-based system by using the halt -d command. Afterwards, reboot the system manually.

# halt -d
May 30 15:35:15 wacked.Central.Sun.COM halt: halted by user

panic[cpu0]/thread=ffffffff83246ec0: forced crash dump initiated at user request

fffffe80006bbd60 genunix:kadmin+4c1 ()
fffffe80006bbec0 genunix:uadmin+93 ()
fffffe80006bbf10 unix:sys_syscall32+101 ()

syncing file systems... done
dumping to /dev/dsk/c1t0d0s1, offset 107675648, content: kernel
NOTICE: adpu320: bus reset
100% done: 38438 pages dumped, compression ratio 4.29, dump succeeded

Welcome to kmdb
Loaded modules: [ audiosup crypto ufs unix krtld s1394 sppp nca uhci lofs
genunix ip usba specfs nfs md random sctp ]
[0]>
kmdb: Do you really want to reboot? (y/n) y

Forcing a Dump with `kmdb`

If you cannot use the reboot -d or the halt -d command, you can use the kernel debugger, kmdb, to force a crash dump. The kernel debugger must have been loaded, either at boot or with the mdb -K command, for the following procedure to work. Enter kmdb by using L1-A (Stop-A) on SPARC, F1-A on x86, or break on a tty.

[0]> $<systemdump
panic[cpu0]/thread=ffffffff83246ec0: forced crash dump initiated at user request

fffffe80006bbd60 genunix:kadmin+4c1 ()
fffffe80006bbec0 genunix:uadmin+93 ()
fffffe80006bbf10 unix:sys_syscall32+101 ()

syncing file systems... done
dumping to /dev/dsk/c1t0d0s1, offset 107675648, content: kernel
NOTICE: adpu320: bus reset
100% done: 38438 pages dumped, compression ratio 4.29, dump succeeded

Kernel Built-In MDB dcmds

  dcmd $<                    - replace input with macro
  dcmd $<<                   - source macro
  dcmd $>                    - log session to a file
  dcmd $?                    - print status and registers
  dcmd $C                    - print stack backtrace
  dcmd $G                    - enable/disable C++ demangling support
  dcmd $M                    - list macro aliases
  dcmd $P                    - set debugger prompt string
  dcmd $Q                    - quit debugger
  dcmd $V                    - get/set disassembly mode
  dcmd $W                    - reopen target in write mode
  dcmd $X                    - print floating-point registers
  dcmd $Y                    - print floating-point registers
  dcmd $b                    - list traced software events
  dcmd $c                    - print stack backtrace
  dcmd $d                    - get/set default output radix
  dcmd $e                    - print listing of global symbols
  dcmd $f                    - print listing of source files
  dcmd $g                    - get/set C++ demangling options
  dcmd $i                    - print signals that are ignored
  dcmd $l                    - print the representative thread's lwp id
  dcmd $m                    - print address space mappings
  dcmd $p                    - change debugger target context
  dcmd $q                    - quit debugger
  dcmd $r                    - print general-purpose registers
  dcmd $s                    - get/set symbol matching distance
  dcmd $v                    - print non-zero variables
  dcmd $w                    - get/set output page width
  dcmd $x                    - print floating-point registers
  dcmd $y                    - print floating-point registers
  dcmd /                     - format data from virtual as
  dcmd :A                    - attach to process or core file
  dcmd :R                    - release the previously attached process
  dcmd :a                    - set read access watchpoint
  dcmd :b                    - set breakpoint at the specified address
  dcmd :c                    - continue target execution
  dcmd :d                    - delete traced software events
  dcmd :e                    - step target over next instruction
  dcmd :i                    - ignore signal (delete all matching events)
  dcmd :k                    - forcibly kill and release target
  dcmd :p                    - set execute access watchpoint
  dcmd :r                    - run a new target process
  dcmd :s                    - single-step target to next instruction
  dcmd :t                    - stop on delivery of the specified signals
  dcmd :u                    - step target out of current function
  dcmd :w                    - set write access watchpoint
  dcmd :z                    - delete all traced software events
  dcmd =                     - format immediate value
  dcmd >                     - assign variable
  dcmd ?                     - format data from object file
  dcmd @                     - format data from physical as
  dcmd \                     - format data from physical as
  dcmd array                 - print each array element's address
  dcmd attach                - attach to process or corefile
  dcmd bp                    - set breakpoint at the specified addresses or symbols
  dcmd cat                   - concatenate and display files
  dcmd cont                  - continue target execution
  dcmd context               - change debugger target context
  dcmd dcmds                 - list available debugger commands
  dcmd delete                - delete traced software events
  dcmd dem                   - demangle C++ symbol names
  dcmd dis                   - disassemble near addr
  dcmd disasms               - list available disassemblers
  dcmd dismode               - get/set disassembly mode
  dcmd dmods                 - list loaded debugger modules
  dcmd dump                  - dump memory from specified address
  dcmd echo                  - echo arguments
  dcmd enum                  - print an enumeration
  dcmd eval                  - evaluate the specified command
  dcmd events                - list traced software events
  dcmd evset                 - set software event specifier attributes
  dcmd files                 - print listing of source files
  dcmd fltbp                 - stop on machine fault
  dcmd formats               - list format specifiers
  dcmd fpregs                - print floating point registers
  dcmd grep                  - print dot if expression is true
  dcmd head                  - limit number of elements in pipe
  dcmd help                  - list commands/command help
  dcmd kill                  - forcibly kill and release target
  dcmd list                  - walk list using member as link pointer
  dcmd load                  - load debugger module
  dcmd log                   - log session to a file
  dcmd map                   - print dot after evaluating expression
  dcmd mappings              - print address space mappings
  dcmd next                  - step target over next instruction
  dcmd nm                    - print symbols
  dcmd nmadd                 - add name to private symbol table
  dcmd nmdel                 - remove name from private symbol table
  dcmd objects               - print load objects information
  dcmd offsetof              - print the offset of a given struct or union member
  dcmd print                 - print the contents of a data structure
  dcmd quit                  - quit debugger
  dcmd regs                  - print general-purpose registers
  dcmd release               - release the previously attached process
  dcmd run                   - run a new target process
  dcmd set                   - get/set debugger properties
  dcmd showrev               - print version information
  dcmd sigbp                 - stop on delivery of the specified signals
  dcmd sizeof                - print the size of a type
  dcmd stack                 - print stack backtrace
  dcmd stackregs             - print stack backtrace and registers
  dcmd status                - print summary of current target
  dcmd step                  - single-step target to next instruction
  dcmd sysbp                 - stop on entry or exit from system call
  dcmd term                  - display current terminal type
  dcmd typeset               - set variable attributes
  dcmd unload                - unload debugger module
  dcmd unset                 - unset variables
  dcmd vars                  - print listing of variables
  dcmd version               - print debugger version string
  dcmd vtop                  - print physical mapping of virtual address
  dcmd walk                  - walk data structure
  dcmd walkers               - list available walkers
  dcmd whence                - show source of walk or dcmd
  dcmd which                 - show source of walk or dcmd
  dcmd wp                    - set a watchpoint at the specified address
  dcmd xdata                 - print list of external data buffers

krtld
  dcmd ctfinfo               - list module CTF information
  dcmd modctl                - list modctl structures
  dcmd modhdrs               - given modctl, dump module ehdr and shdrs
  dcmd modinfo               - list module information
  walk modctl                - list modctl structures
mdb_kvm
  ctor 0x8076f20             - target constructor
  dcmd $?                    - print status and registers
  dcmd $C                    - print stack backtrace
  dcmd $c                    - print stack backtrace
  dcmd $r                    - print general-purpose registers
  dcmd regs                  - print general-purpose registers
  dcmd stack                 - print stack backtrace
  dcmd stackregs             - print stack backtrace and registers
  dcmd status                - print summary of current target
