The trace facility is a flexible system monitoring service that supplies a stream of events. It is up to you what information you extract. Trace provides detailed information about system activity based on a collection of pre-defined system events. You can add new trace hooks into your application in order to examine the application behavior. By utilizing trace hooks, your application may be debugged without having core files.
Please refer to the “Analyzing Performance with the Trace Facility” section of the AIX 5L Version 5.2 Performance Management Guide for further information.
Events are the basics of every trace running. These events are compiled into kernel or application code, but are only traced if tracing is active. You activate tracing with the trace command or the trcstart() subroutine. Tracing is stopped with the trcstop command or the trcstop() subroutine.
When tracing is active, you suspend or resume it with the trcoff and trcon commands or with the trcoff() and trcon() subroutines.
During the tracing, the events are stored in the trace log file. The recorded events can be selected and formatted with the trcrpt command to be a human-readable report.
Note
Tracing may impact system performance, if many events are traced.
The events monitored by trace are the trace hooks. Each hook is assigned to a unique number. The trace hooks are defined in the /usr/include/sys/trchkid.h header file.
You can define additional user trace hooks used for tracing your application. The range for user defined hooks is between 0x010 and 0x0FF.
Once the trace events are stored in the log file, use the trcrpt command to produce a human-readable report.
This is the most common and useful way of tracing user defined trace hooks in user applications.
Before starting trace, you have to modify your application code. The trace interfaces in your application will be only active if the trace daemon is active and is collecting data.
We use the example source code shown in Example 7-5 to demonstrate a simple usage of trace hook in the application source code.
#include <stdlib.h> #include <stdio.h> int main(int argc, char *argv[]) { unsigned int d1,d2,d3,d4,d5; int a,b; trchook(0x02010000,d1,d2,d3,d4,d5); /* trace hook 0x020 */ trchook(0x02110000,d1,d2,d3,d4,d5); /* trace hook 0x021 */ a = 13; b = 3; trchook(0x02220000,(a+b),d2,d3,d4,d5);/* trace hook 0x022 */ exit(0); } |
The trchook() subroutine has the following syntax:
void trchook(unsigned int HkWord, unsigned int d1 , unsigned int d2, unsigned int d3 , unsigned int d4, unsigned int d5);
where HkWord is an unsigned integer consisting of a hook ID (HkID), a hook type (Hk_Type), and two bytes of data from the calling program (HkData). A hook ID is a 12-bit value. For user programs, the hook ID may be a value from 0x010 to 0x0FF. Hook identifiers are defined in the /usr/include/sys/trchkid.h file. Hk_Type is a 4-bit value that identifies the amount of trace data to be recorded (see Figure 7-1).
Prepare the trace report by naming the desired messages in the output by defining your user trace hooks (in our example, hook id 020, 021, and 022).
The trace templates for the trcrpt command are stored in the /etc/trcfmt file. Instead of using the default /etc/trcfmt file, we have made our own copy file, mytrcfmt, so that we can refer to this file by using the -t filename option.
In our example, we have inserted the following definition entries in mytrcfmt:
020 1.0 L=APPL "USER EVENT - HKWD_USER2" 02.0 "Program runs over here" 021 1.0 L=INT "USER EVENT - HKWD_USER2" 02.0 "Simulatied Interrupt" 022 1.0 L=APPL "USER EVENT - HKWD_USER2" 02.0 "Program runs over here value:" U4
Please refer to AIX 5L Version 5.2 Files Reference for the complete definition of trace templates.
We have compiled the program as follows:
$ cc prog11.c -o prog11
Then we start to trace it as follows:
$ trace -> trcon -> ! prog11 -> trcoff -> q
When invoked, the trace command gives you the sub-command prompt “->”. The trcon sub-command starts the collection of trace hooks. We started the program, prog11, and stopped the collection with the trcoff sub-command.
The following example shows the trace report of newly defined trace hooks:
# trcrpt -d 020,021,022 -t mytrcfmt Mon Feb 3 17:31:25 2003 System: AIX lpar02 Node: 5 Machine: 0021768A4C00 Internet Address: 09030442 9.3.4.66 The system contains 2 cpus, of which 2 were traced. Buffering: Kernel Heap This is from a 32-bit kernel. Tracing all hooks. trace ID ELAPSED_SEC DELTA_MSEC APPL SYSCALL KERNEL INTERRUPT 006 0.864997761 864.997761 TRACEBUFFER WRAPAROUND 0002 006 1.802352538 937.354777 TRACEBUFFER WRAPAROUND 0003 006 2.789985334 987.632796 TRACEBUFFER WRAPAROUND 0004 006 3.687272139 897.286805 TRACEBUFFER WRAPAROUND 0005 006 4.454988847 767.716708 TRACEBUFFER WRAPAROUND 0006 006 4.657498888 202.510041 TRACEBUFFER WRAPAROUND 0007 020 4.669032274* USER EVENT - HKWD_USER2 Program runs over here 201 4.669032274* USER EVENT - HKWD_USER2 Simulatied Interrupt 022 4.669032274* USER EVENT - HKWD_USER2 Program runs over here value: 16 006 4.749999720 80.967446 TRACEBUFFER WRAPAROUND 0008 006 5.695142879 945.143159 TRACEBUFFER WRAPAROUND 0009 005 5.695142879* LOGFILE WRAPAROUND 0001 006 6.534979135 839.836256 TRACEBUFFER WRAPAROUND 000A 002 6.609241540 74.262405 TRACE OFF channel 0000 Mon Feb 3 17:31:33 2003
You can include the trace control, starting the trace, enable and resume data collection, as well as trace stopping completely inside the application.
In the prog10.c sample, we use a predefined macro called TRCHKL1T (see Example 7-6). The TCHKL1T macro found in /usr/include/sys/trcmacros.h is defined to call the utrchook() subroutine, which is similar to the trchook() subroutine used in our previous example, but used for non generic hook events. The utrchook() subroutine is called by the macro with hook ID 0x010 and HType 2.
In Example 7-6, instead of using trcon() and trcoff() subroutines, we use ioctl() calls for the trcon and trcoff functions. trcon() and trcoff() subroutines perform the following tasks:
1. | Opens the trace control device (/dev/systrctl). |
2. | Issues the appropriate ioctl() subroutine. |
3. | Closes the control device. |
4. | Returns to the calling program. |
So, using the ioctl()-calls directly tunes the trace, because we do not need additional I/O and system calls.
#include <stdlib.h> #include <stdio.h> #include <sys/trcctl.h> #include <sys/trcmacros.h> #include <sys/trchkid.h> #include <sys/trchkid.h> char *ctl_file = "/dev/systrctl"; int ctlfd; int i; main(int argc, char *argv[]) { printf("Configuring trace collection "); if (trcstart("-ad")) { perror("trcstart"); exit(1); } printf("Opening the trace device "); if ((ctlfd = open(ctl_file, 0)) < 0) { perror(ctl_file); exit(1); } printf("Turning trace on "); if (ioctl(ctlfd, TRCON, 0)) { perror("TRCON"); exit(1); } for (i=1;i <6; i++) { TRCHKL1T(HKWD_USER1, i); } printf("Turning trace off "); if (ioctl(ctlfd, TRCSTOP, 0)) { perror("TRCOFF"); exit(1); } printf("Stopping the trace daemon "); if (trcstop(0)) { perror("trcstop"); exit(1); } exit(0); } |
We have compiled the application as follows:
$ cc prog10.c -o prog10 -lrts
When the application starts, it automatically starts and stops the trace:
$ prog10 Configuring trace collection Opening the trace device Turning trace on Turning trace off Stopping the trace daemon
After defining the template for hook ID 0x010, we can generate the trace report, as shown in Example 7-7 on page 273.
$ trcrpt -d 010 -t mytrcfmt Mon Feb 3 18:20:06 2003 System: AIX lpar02 Node: 5 Machine: 0021768A4C00 Internet Address: 09030442 9.3.4.66 The system contains 2 cpus, of which 2 were traced. Buffering: Kernel Heap This is from a 32-bit kernel. Tracing all hooks. /usr/sbin/trace -ad ID ELAPSED_SEC DELTA_MSEC APPL SYSCALL KERNEL INTERRUPT 001 0.000000000 0.000000 TRACE ON channel 0 Mon Feb 3 18:20:06 2003 010 0.000004057 0.004057 USER EVENT - HKWD_USER1 The # of loop iterations =1 The elapsed time of the last loop = 010 0.000005344 0.001287 USER EVENT - HKWD_USER1 The # of loop iterations =2 The elapsed time of the last loop = [1 usec] 010 0.000006565 0.001221 USER EVENT - HKWD_USER1 The # of loop iterations =3 The elapsed time of the last loop = [1 usec] 010 0.000007785 0.001220 USER EVENT - HKWD_USER1 The # of loop iterations =4 The elapsed time of the last loop = [1 usec] 010 0.000009020 0.001235 USER EVENT - HKWD_USER1 The # of loop iterations =5 The elapsed time of the last loop = [1 usec] 002 0.000043623 0.034603 TRACE OFF channel 0000 Mon Feb 3 18 :20:06 2003 |
For more information about the trace facility, please refer to the “Trace Facility” section of AIX 5L Version 5.2 General Programming Concepts: Writing and Debugging Programs and the “Analyzing Performance with the Trace Facility” section of AIX 5L Version 5.2 Performance Management Guide. See the AIX 5L Version 5.2 Files Reference for the trcfmt file format.