Chapter 9. Software Trace Analysis Patterns

Introduction

We expand the domain of software defect research into Trace Analysis Patterns. In addition to Citrix CDF / Microsoft ETW traces we cover other variants based on extensive software engineering background in the past where we used tracing in software products ranging from soft multi-platform real-time systems to static code analysis tools. Connection with memory dump analysis is covered too because, sometimes, the combination of static and dynamic data leads to interesting observations and helps to troubleshoot and resolve customer problems especially when not all data can be collected dynamically.

In fact, stack traces (Volume 1, page 395) and their collections (Volume 1, page 409) are specializations of the more general traces. Another example is historical information (Volume 1, page 457) in memory dump files especially when it is somehow time stamped.

Periodic Error

This is the obvious and to some extent the trivial pattern. This is an error or a status value that is observed periodically many times:

No     PID  TID   Date      Time          Message
[...]
664957 1788 22504 4/23/2009 17:59:14.600 MyClass::Initialize:  Cannot
open connection  "Client ID: 310", status=5
[...]
668834 1788 19868 4/23/2009 19:11:52.979 MyClass::Initialize:  Cannot
open connection  "Client ID: 612", status=5
[...]

or

No     PID  TID   Date      Time          Message
[...]
202314 1788 19128 4/21/2009 16:03:46.861 HandleDataLevel:  Error 12005
Getting Mask
[...]
347653 1788 17812 4/22/2009 13:26:00.735 HandleDataLevel : Error 12005
Getting Mask
[...]

Here single trace entries can be isolated from the trace and studied in detail. We should be aware though that some modules might report periodic errors that are false positive, in the sense, that they are expected as a part of implementation details, for example, when a function returns an error to indicate that the bigger buffer is required or to estimate its size for a subsequent call.

Basic Facts

A typical trace is a detailed narrative. It is accompanied by a problem description that lists essential facts. Therefore the first task of any trace analysis is to check the presence of Basic Facts in the trace. If they are not visible or do not correspond then the trace was possibly not recorded during the problem occurrence or was taken from a different computer or under different conditions. Here is an example. A user "test01″ can-not connect to a published application in a terminal services environment. We look at the trace and find this statement:

No   PID  TID  Date      Time         Statement
[...]
3903 3648 5436 4/29/2009 16:17:36.150 User Name: test01
[...]

At least we can be sure that this trace was taken for the user test01 especially when we expect this or similar trace statements. If we don't see this trace statement we may suppose that the trace was taken at the wrong time, for example, when the problem had already happened.

Circular Trace

This is an obvious structural trace analysis pattern. Sometimes information about circularity is missing in the problem description or the trace metadata doesn't reflect it. Then circular traces can be detected by trace file size (usually large) and from time-stamps, like this 100Mb CDF trace snippet:

No     Module  PID  TID  Date      Time         Statement
[Begin of trace listing]
1      ModuleA 4280 1736 5/28/2009 08:53:50.496 [... Trace statement 1]
2      ModuleB 6212 6216 5/28/2009 08:53:52.876 [... Trace statement 2]
3      ModuleA 4280 4776 5/28/2009 08:54:13.537 [... Trace statement 3]
[... Some traced exceptions helpful for analysis ...]
3799   ModuleA 4280 3776 5/28/2009 09:15:00.853 [... Trace statement 3799]
3800   ModuleA 4280 1736 5/27/2009 09:42:12.029 [... Trace statement 3800]
[... Skipped ...]
[... Skipped ...]
[... Skipped ...]
579210 ModuleA 4280 4776 5/28/2009 08:53:35.989 [... Trace statement 579210]
[End of trace listing]

In such traces the analysis region usually can be found at the beginning at the trace because as soon as elusive and hard to reproduce problem happens the trace is stopped.

Intra-Correlation

Sometimes we see a functional activity and basic facts (page 345) in a trace. Then we might want to find a correlation between that activity and facts in another part of the trace. If that intra-correlation fits into our problem description we may claim a possible explanation or, if we are lucky, we have just found, an inference to the best explanation, as philosophers of science like to say. Here is an example, but this time using Citrix WindowHistory tracing tool[33]. A third-party application was frequently loosing the focus and the suspicion was on a terminal services client process. It was found that the following WindowHistory trace fragment corresponds to that application:

Handle: 00050586 Class: "Application A Class" Title: ""
     Title changed at 15:52:4:3 to "Application A"
     Title changed at 15:52:10:212 to "Application A - File1"
[...]
   Process ID: 89c
   Thread ID: d6c
[...]
   Visible: true
   Window placement command: SW_SHOWNORMAL
        Placement changed at 15:54:57:506 to SW_SHOWMINIMIZED
        Placement changed at 15:55:2:139 to SW_SHOWNORMAL
   Foreground: false
        Foreground changed at 15:52:4:3 to true
        Foreground changed at 15:53:4:625 to false
        Foreground changed at 15:53:42:564 to true
        Foreground changed at 15:53:44:498 to false
        Foreground changed at 15:53:44:498 to true
        Foreground changed at 15:53:44:592 to false
        Foreground changed at 15:53:45:887 to true
        Foreground changed at 15:53:47:244 to false
        Foreground changed at 15:53:47:244 to true
        Foreground changed at 15:53:47:353 to false
        Foreground changed at 15:54:26:416 to true
        Foreground changed at 15:54:27:55 to false
        Foreground changed at 15:54:27:55 to true
        Foreground changed at 15:54:27:180 to false
        Foreground changed at 15:54:28:428 to true
        Foreground changed at 15:54:28:771 to false
        Foreground changed at 15:54:28:865 to true
        Foreground changed at 15:54:29:99 to false
        Foreground changed at 15:54:30:877 to true
        Foreground changed at 15:54:57:521 to false
        Foreground changed at 15:55:2:76 to true
        Foreground changed at 15:57:3:378 to false
        Foreground changed at 15:57:11:396 to true
        Foreground changed at 15:57:29:601 to false
        Foreground changed at 15:57:39:803 to true
        Foreground changed at 15:58:54:41 to false
        Foreground changed at 15:59:8:96 to true
        Foreground changed at 16:1:19:478 to false
        Foreground changed at 16:1:27:527 to true
        Foreground changed at 16:1:39:914 to false
        Foreground changed at 16:2:0:515 to true
        Foreground changed at 16:7:14:628 to false
        Foreground changed at 16:7:24:246 to true
        Foreground changed at 16:9:53:523 to false
        Foreground changed at 16:10:15:919 to true
        Foreground changed at 16:10:31:426 to false
        Foreground changed at 16:11:12:818 to true
        Foreground changed at 16:11:59:538 to false
        Foreground changed at 16:12:39:456 to true
        Foreground changed at 16:13:6:364 to false

Corresponding terminal services client window trace fragment doesn't show any foreground changes but another application main window has lots of them:

Handle: 000D0540 Class: "Application B Class" Title: "Application B"
[...]
   Process ID: 3ac
   Thread ID: bd4
[...]
   Foreground: false
        Foreground changed at 15:50:36:972 to true
        Foreground changed at 15:50:53:732 to false
        Foreground changed at 15:50:53:732 to true
        Foreground changed at 15:50:53:826 to false
        Foreground changed at 15:51:51:352 to true
        Foreground changed at 15:51:53:941 to false
        Foreground changed at 15:53:8:135 to true
        Foreground changed at 15:53:8:182 to false
        Foreground changed at 15:53:10:178 to true
        Foreground changed at 15:53:13:938 to false
        Foreground changed at 15:53:30:443 to true
        Foreground changed at 15:53:31:20 to false
        Foreground changed at 15:53:31:20 to true
        Foreground changed at 15:53:31:129 to false
        Foreground changed at 15:53:34:78 to true
        Foreground changed at 15:53:34:795 to false
        Foreground changed at 15:53:34:795 to true
        Foreground changed at 15:53:34:873 to false
        Foreground changed at 15:53:36:901 to true
        Foreground changed at 15:53:42:502 to false
        Foreground changed at 15:53:42:502 to true
        Foreground changed at 15:53:42:564 to false
        Foreground changed at 15:57:3:425 to true
        Foreground changed at 15:57:4:595 to false
        Foreground changed at 15:57:10:507 to true
        Foreground changed at 15:57:11:318 to false
        Foreground changed at 15:57:29:632 to true
        Foreground changed at 15:57:31:67 to false
        Foreground changed at 15:57:32:721 to true
        Foreground changed at 15:57:33:844 to false
        Foreground changed at 15:58:54:88 to true
        Foreground changed at 15:58:56:178 to false
        Foreground changed at 15:59:6:505 to true
        Foreground changed at 15:59:7:987 to false
        Foreground changed at 16:1:19:525 to true
        Foreground changed at 16:1:19:961 to false
        Foreground changed at 16:1:26:607 to true
        Foreground changed at 16:1:27:434 to false
        Foreground changed at 16:1:39:914 to true
        Foreground changed at 16:1:39:992 to false
        Foreground changed at 16:1:49:798 to true
        Foreground changed at 16:2:0:437 to false
        Foreground changed at 16:7:14:628 to true
        Foreground changed at 16:7:14:847 to false
        Foreground changed at 16:7:18:76 to true
        Foreground changed at 16:7:24:106 to false
        Foreground changed at 16:9:58:790 to true
        Foreground changed at 16:10:4:16 to false
        Foreground changed at 16:10:4:874 to true
        Foreground changed at 16:10:4:890 to false
        Foreground changed at 16:10:8:634 to true
        Foreground changed at 16:10:15:779 to false
        Foreground changed at 16:10:56:766 to true
        Foreground changed at 16:10:59:402 to false
        Foreground changed at 16:10:59:652 to true
        Foreground changed at 16:10:59:667 to false
        Foreground changed at 16:12:9:397 to true
        Foreground changed at 16:12:39:347 to false
        Foreground changed at 16:13:18:375 to true
        Foreground changed at 16:14:33:656 to false

We can see that most of the time when Application A window looses focus, Application B window gets it.



[33] http://www.dumpanalysis.org/blog/index.php/2007/02/15/windowhistory-40/

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset