7.6. Using the trace facility

The trace facility is a flexible system monitoring service that supplies a stream of events. It is up to you what information you extract. Trace provides detailed information about system activity based on a collection of pre-defined system events. You can add new trace hooks into your application in order to examine the application behavior. By utilizing trace hooks, your application may be debugged without having core files.

Please refer to the “Analyzing Performance with the Trace Facility” section of the AIX 5L Version 5.2 Performance Management Guide for further information.

7.6.1. Introduction to trace

Events are the basics of every trace running. These events are compiled into kernel or application code, but are only traced if tracing is active. You activate tracing with the trace command or the trcstart() subroutine. Tracing is stopped with the trcstop command or the trcstop() subroutine.

When tracing is active, you suspend or resume it with the trcoff and trcon commands or with the trcoff() and trcon() subroutines.

During the tracing, the events are stored in the trace log file. The recorded events can be selected and formatted with the trcrpt command to be a human-readable report.

Note

Tracing may impact system performance, if many events are traced.


Trace hooks

The events monitored by trace are the trace hooks. Each hook is assigned to a unique number. The trace hooks are defined in the /usr/include/sys/trchkid.h header file.

You can define additional user trace hooks used for tracing your application. The range for user defined hooks is between 0x010 and 0x0FF.

Trace report

Once the trace events are stored in the log file, use the trcrpt command to produce a human-readable report.

7.6.2. Tracing an application on the command line

This is the most common and useful way of tracing user defined trace hooks in user applications.

Defining the trace hook in the application

Before starting trace, you have to modify your application code. The trace interfaces in your application will be only active if the trace daemon is active and is collecting data.

We use the example source code shown in Example 7-5 to demonstrate a simple usage of trace hook in the application source code.

Example 7-5. Sample source code using trchook (prog11.c)
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    unsigned int d1,d2,d3,d4,d5;
    int a,b;

    trchook(0x02010000,d1,d2,d3,d4,d5); /* trace hook 0x020 */
    trchook(0x02110000,d1,d2,d3,d4,d5); /* trace hook 0x021 */

    a = 13;
    b = 3;
    trchook(0x02220000,(a+b),d2,d3,d4,d5);/* trace hook 0x022 */

    exit(0);
}

The trchook() subroutine has the following syntax:

void trchook(unsigned int HkWord, unsigned int d1
            , unsigned int d2, unsigned int d3
            , unsigned int d4, unsigned int d5);

where HkWord is an unsigned integer consisting of a hook ID (HkID), a hook type (Hk_Type), and two bytes of data from the calling program (HkData). A hook ID is a 12-bit value. For user programs, the hook ID may be a value from 0x010 to 0x0FF. Hook identifiers are defined in the /usr/include/sys/trchkid.h file. Hk_Type is a 4-bit value that identifies the amount of trace data to be recorded (see Figure 7-1).

Figure 7-1. Definition of HkWord


Defining trace templates

Prepare the trace report by naming the desired messages in the output by defining your user trace hooks (in our example, hook id 020, 021, and 022).

The trace templates for the trcrpt command are stored in the /etc/trcfmt file. Instead of using the default /etc/trcfmt file, we have made our own copy file, mytrcfmt, so that we can refer to this file by using the -t filename option.

In our example, we have inserted the following definition entries in mytrcfmt:

020 1.0 L=APPL "USER EVENT - HKWD_USER2" 02.0      
 
               "Program runs over here"
021 1.0 L=INT  "USER EVENT - HKWD_USER2" 02.0      
 
               "Simulatied Interrupt"
022 1.0 L=APPL "USER EVENT - HKWD_USER2" 02.0      
 
               "Program runs over here value:" U4

Please refer to AIX 5L Version 5.2 Files Reference for the complete definition of trace templates.

Tracing the application

We have compiled the program as follows:

$ cc prog11.c -o prog11

Then we start to trace it as follows:

$ trace
-> trcon
-> ! prog11
-> trcoff
-> q

When invoked, the trace command gives you the sub-command prompt “->”. The trcon sub-command starts the collection of trace hooks. We started the program, prog11, and stopped the collection with the trcoff sub-command.

Reporting output of the trace events

The following example shows the trace report of newly defined trace hooks:

# trcrpt -d 020,021,022 -t mytrcfmt
Mon Feb  3 17:31:25 2003
System: AIX lpar02 Node: 5
Machine: 0021768A4C00
Internet Address: 09030442 9.3.4.66
The system contains 2 cpus, of which 2 were traced.
Buffering: Kernel Heap
This is from a 32-bit kernel.
Tracing all hooks.

trace


ID    ELAPSED_SEC DELTA_MSEC   APPL SYSCALL KERNEL  INTERRUPT

006   0.864997761 864.997761                TRACEBUFFER WRAPAROUND 0002
006   1.802352538 937.354777                TRACEBUFFER WRAPAROUND 0003
006   2.789985334 987.632796                TRACEBUFFER WRAPAROUND 0004
006   3.687272139 897.286805                TRACEBUFFER WRAPAROUND 0005
006   4.454988847 767.716708                TRACEBUFFER WRAPAROUND 0006
006   4.657498888 202.510041                TRACEBUFFER WRAPAROUND 0007
020   4.669032274*             USER EVENT - HKWD_USER2
							Program runs over here
							201   4.669032274*                                   USER EVENT - HKWD_USER2
							Simulatied Interrupt
							022   4.669032274*             USER EVENT - HKWD_USER2
							Program runs over here value: 16
006   4.749999720 80.967446                  TRACEBUFFER WRAPAROUND 0008
006   5.695142879 945.143159                 TRACEBUFFER WRAPAROUND 0009
005   5.695142879*                           LOGFILE WRAPAROUND 0001
006   6.534979135 839.836256                 TRACEBUFFER WRAPAROUND 000A
002   6.609241540 74.262405                  TRACE OFF channel 0000 Mon Feb 3
17:31:33 2003

7.6.3. Tracing an application with subroutine calls

You can include the trace control, starting the trace, enable and resume data collection, as well as trace stopping completely inside the application.

Defining the trace hook in the application

In the prog10.c sample, we use a predefined macro called TRCHKL1T (see Example 7-6). The TCHKL1T macro found in /usr/include/sys/trcmacros.h is defined to call the utrchook() subroutine, which is similar to the trchook() subroutine used in our previous example, but used for non generic hook events. The utrchook() subroutine is called by the macro with hook ID 0x010 and HType 2.

In Example 7-6, instead of using trcon() and trcoff() subroutines, we use ioctl() calls for the trcon and trcoff functions. trcon() and trcoff() subroutines perform the following tasks:

1.
Opens the trace control device (/dev/systrctl).

2.
Issues the appropriate ioctl() subroutine.

3.
Closes the control device.

4.
Returns to the calling program.

So, using the ioctl()-calls directly tunes the trace, because we do not need additional I/O and system calls.

Example 7-6. Sample source code using TRCHKL1T (prog10.c)
#include <stdlib.h>
#include <stdio.h>
#include <sys/trcctl.h>
#include <sys/trcmacros.h>
#include <sys/trchkid.h>

#include <sys/trchkid.h>
char *ctl_file = "/dev/systrctl";
int ctlfd;
int i;

main(int argc, char *argv[])
{
    printf("Configuring trace collection 
");
    if (trcstart("-ad")) {
        perror("trcstart");
        exit(1);
    }

    printf("Opening the trace device 
");
    if ((ctlfd = open(ctl_file, 0)) < 0) {
        perror(ctl_file);
        exit(1);
    }

    printf("Turning trace on 
");
    if (ioctl(ctlfd, TRCON, 0)) {
        perror("TRCON");
        exit(1);
    }

    for (i=1;i <6; i++) {
        TRCHKL1T(HKWD_USER1, i);
    }

    printf("Turning trace off
");
    if (ioctl(ctlfd, TRCSTOP, 0)) {
        perror("TRCOFF");
        exit(1);
    }

    printf("Stopping the trace daemon 
");
    if (trcstop(0)) {
        perror("trcstop");
        exit(1);
    }

    exit(0);
}

We have compiled the application as follows:

$ cc prog10.c -o prog10 -lrts

Tracing the application

When the application starts, it automatically starts and stops the trace:

$ prog10
Configuring trace collection
Opening the trace device
Turning  trace on
Turning trace off
Stopping the trace daemon

Reporting the output of the trace events

After defining the template for hook ID 0x010, we can generate the trace report, as shown in Example 7-7 on page 273.

Example 7-7. Trace report for the hook ID 010
$ trcrpt -d 010 -t mytrcfmt

Mon Feb  3 18:20:06 2003
System: AIX lpar02 Node: 5
Machine: 0021768A4C00
Internet Address: 09030442 9.3.4.66
The system contains 2 cpus, of which 2 were traced.
Buffering: Kernel Heap
This is from a 32-bit kernel.
Tracing all hooks.

/usr/sbin/trace -ad


ID     ELAPSED_SEC     DELTA_MSEC    APPL    SYSCALL KERNEL  INTERRUPT

001    0.000000000     0.000000                      TRACE ON channel 0
                                                     Mon Feb  3 18:20:06 2003
010    0.000004057     0.004057      USER EVENT - HKWD_USER1
                                     The # of loop iterations =1
                                     The elapsed time of the last loop =
010    0.000005344     0.001287      USER EVENT - HKWD_USER1
                                     The # of loop iterations =2
                                     The elapsed time of the last loop = [1 usec]
010    0.000006565     0.001221      USER EVENT - HKWD_USER1
                                     The # of loop iterations =3
                                     The elapsed time of the last loop = [1 usec]
010    0.000007785     0.001220      USER EVENT - HKWD_USER1
                                     The # of loop iterations =4
                                     The elapsed time of the last loop = [1 usec]
010    0.000009020     0.001235      USER EVENT - HKWD_USER1
                                     The # of loop iterations =5
                                     The elapsed time of the last loop = [1 usec]
002    0.000043623     0.034603                      TRACE OFF channel 0000 Mon Feb  3 18
:20:06
2003

For more information about the trace facility, please refer to the “Trace Facility” section of AIX 5L Version 5.2 General Programming Concepts: Writing and Debugging Programs and the “Analyzing Performance with the Trace Facility” section of AIX 5L Version 5.2 Performance Management Guide. See the AIX 5L Version 5.2 Files Reference for the trcfmt file format.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset