I

 

Identification Messages

Often, we need to identify the source of messages based on problem object or subsystem description (what question) before we proceed to answer a “where” question (where, in a trace, we can find messages related to a problem). Even when we know where are messages, there can be many sources to select from (if we don’t know the where question we can use Indirect Message analysis pattern). To answer “what” question, we propose Identification Messages analysis pattern. Basic Fact problem description may include properties and behavioral description of the problem object or subsystem. Based on that, we can map them to the log messages that such an object can produce:

These messages may not be Error Messages or some other type of messages reflecting abnormal behavior. These messages are only used to identify the software object, module, or subsystem.

For example, in one case, there were problems with the custom status bar. However, the window handle for it or its parent wasn’t specified in the problem report. In the log file, we had many messages describing the GUI behavior of many windows. To find out the status bar, we thought that it should have a small height but long width. Indeed, we found one such child window. Also, for this window, the log file contained many messages related to frequent window text changes, possibly reflecting the status bar updates. Having identified the window handle, we proceeded to the analysis of another log with thousands of window messages. Because of the known window handle, we were able to select only messages related to our problem status bar.

 

Implementation Discourse

If we look at any non-trivial trace, we see different Implementation Discourses. Components are written in different languages and adhere to different runtime environments, binary models, and interface frameworks. All these implementation variations influence the structure, syntax, and semantics of trace messages. For example, .NET debugging traces differ from file system driver or COM debugging messages. For this reason, we establish the new field of Software Trace Linguistics as a science of software trace languages. Some parallels can be drawn here towards software linguistics (the science of software languages) although we came to that conclusion independently while thinking about applying “ethnography of speaking” to software trace narration.

 

Impossible Trace

Sometimes, we look at a trace and say it is impossible. For example, this fragment shows that the function foo had been called:

# Module PID TID Message​
[...]​
1001 ModuleA 202 404 foo: start​
1002 ModuleA 202 404 foo: end​
[...]

However, if we look at the corresponding source code (PLOT71) we see that something is missing: the function bar must have been called with its own set of trace messages we don’t see in the trace:

void foo()​
{​
TRACE("foo: start");​
bar(); ​
TRACE("foo: end");​
}

void bar()​
{​
TRACE("bar: start");​
// some code ... ​
TRACE("bar: end");​
}

We suspect the runtime code was modified, perhaps by patching. In other cases of missing messages, we can also suspect thrown exceptions, or local buffer overflows that lead to wrong return address skipping the code with expected tracing statements. The mismatch between the trace and the source code we are looking at is also possible if the old source code did not have bar function called.

 

Incomplete History

Typical software narrative history consists of requests and responses, for example, function or object method calls and returns:

#     Module PID  TID  Time         File    Function Message​
[...]​
26060 dllA   1604 7108 10:06:21.746 fileA.c foo      Calling bar​
[...]​
26232 dllA   1604 7108 10:06:22.262 fileA.c foo      bar returns 0x5​
[...]

The code that generates execution history is response-complete if it traces both requests and responses. For such code (except in cases where tracing is stopped before a response) the absence of expected responses could be a sign of blocked threads or quiet exception processing. The code that generates execution history is exception-complete if it also traces exception processing. Response-complete and exception-complete code is called call-complete. If we do not see response messages for call-complete code, we have Incomplete History.

In general, we can talk about the absence of certain messages in a trace as a deviation from the standard trace sequence template corresponding to a use case. The difference there is in a missing request too.

 

Indexical Trace

This pattern describes Inter-Correlation pattern variant when we have a trace that has messages of interest pointing to specific Activity Regions in another trace. The latter trace can be huge, taken from another computer, and split into many parts (Split Trace). This pattern is very helpful when we need to diagnose the problem in the large split trace, but we do not know when it happened. Then an index trace that may have recorded software execution account (for example, in the case of a broker-like architecture) can point to the right trace fragment from the split trace.

 

Indirect Facts

Sometimes in the case of missing Basic Facts, we can discern Indirect Facts from the message text and even from other patterns. For example, in one incident we were interested in all messages from the certain process name, but its PID was missing from the problem description. Fortunately, we were able to get its PID from one of the individual messages from a completely different source:

 

Indirect Message

Sometimes we have Basic Facts in a problem description but can’t find messages corresponding to them in a trace or log file, but we are sure the tracing (logging) was done correctly. This may be because we have Sparse Trace, or we are not familiar well with product or system tracing messages (such as with Implementation Discourse).

In such a case, we search for Indirect Message of a possible cause:

Having found such a message we may hypothesize that Missing Message should have located nearby (this is based on the semantics of both messages), and we then explore corresponding Message Context:

The same analysis strategy is possible for missing causal messages. Here we search for effect or side effect messages:

Having found them, we proceed with further analysis:

 

Inter-Correlation 

This pattern is analogous to the previously described Intra-Correlation pattern, but it involves several traces from possibly different trace agents recorded (most commonly) at the same time or during an overlapping time interval:

Let’s look at a typical example of an application subclassing windows to add the additional look and feel element to its GUI or inject hooks into window messaging. Suppose this application also records important trace data like window parameters before and after subclassing using ETW technology. When we run the application in a terminal services environment, all windows (including other processes) are shown with an incorrect dimension. We, for this reason, request the application trace and, also, WindowHistory72 trace to see how coordinates of all windows change over time. We easily find some Basic Facts in both traces such as the window class name or time, but it looks like the window handle is different. In another set of traces recorded for comparison, we have the same window handle values; the class name is absent from the ETW trace, but a process and thread ID for the same window handle are different. We, for this reason, do not see a correlation between these traces and suspect that both traces in two sets were recorded in different terminal sessions, for example:

ETW trace:

#     PID  TID  Time          Message​
[...] ​
46750 
5890 6960 10:17:18.825  Subclassing, handle=0×100B8, class=MyWindowClass, [...]​
[...]

WindowHistory trace:

Handle: 0001006E Class: “MyWindowClass” Title: “”​
   Captured at: 10:17:19:637​
   Process ID: 19e0​
   Thread ID: 16e4​
   Parent: 0​
   Screen position (l,t,r,b): (-2,896,1282,1026)​
   Client rectangle (l,t,r,b): (0,0,1276,122)​
   Visible: true​
   Window placement command: SW_SHOWNORMAL​
   Foreground: false​
   HungApp: false​
   Minimized: false​
   Maximized: false​
[...]

 

Interspace

General traces and logs73 may have Message Space regions “surrounded” by the so-called Interspace. Such Interspace regions may link individual Message Space regions like in this diagram generalizing WinDbg !process 0 3f command output:

 

Intra-Correlation

Sometimes we see a functional activity and Basic Facts in a trace. Then we might want to find a correlation between that activity and facts in another part of the trace. If that intra-correlation fits into our problem description, we may claim a possible explanation or, if we are lucky, we have just found, an inference to the best explanation, as philosophers of science like to say. Here is an example, but this time using WindowHistory tracing tool74. A third-party application was frequently losing the focus, and the suspicion was on a terminal services client process. We found that the following WindowHistory trace fragment corresponded to that application:

Handle: 00050586 Class: "Application A Class" Title: ""​
     Title changed at 15:52:4:3 to "Application A"​
     Title changed at 15:52:10:212 to "Application A - File1"​
[...]​
   Process ID: 89c​
   Thread ID: d6c​
[...]​
   Visible: true​
   Window placement command: SW_SHOWNORMAL​
        Placement changed at 15:54:57:506 to SW_SHOWMINIMIZED​
        Placement changed at 15:55:2:139 to SW_SHOWNORMAL​
   Foreground: false​
        Foreground changed at 15:52:4:3 to true​
        Foreground changed at 15:53:4:625 to false​
        Foreground changed at 15:53:42:564 to true​
        Foreground changed at 15:53:44:498 to false​
        Foreground changed at 15:53:44:498 to true​
        Foreground changed at 15:53:44:592 to false​
        Foreground changed at 15:53:45:887 to true​
        Foreground changed at 15:53:47:244 to false​
        Foreground changed at 15:53:47:244 to true​
        Foreground changed at 15:53:47:353 to false​
        Foreground changed at 15:54:26:416 to true​
        Foreground changed at 15:54:27:55 to false​
        Foreground changed at 15:54:27:55 to true​
        Foreground changed at 15:54:27:180 to false​
        Foreground changed at 15:54:28:428 to true​
        Foreground changed at 15:54:28:771 to false​
        Foreground changed at 15:54:28:865 to true​
        Foreground changed at 15:54:29:99 to false​
        Foreground changed at 15:54:30:877 to true​
        Foreground changed at 15:54:57:521 to false​
        Foreground changed at 15:55:2:76 to true​
        Foreground changed at 15:57:3:378 to false​
        Foreground changed at 15:57:11:396 to true​
        Foreground changed at 15:57:29:601 to false​
        Foreground changed at 15:57:39:803 to true​
        Foreground changed at 15:58:54:41 to false​
        Foreground changed at 15:59:8:96 to true​
        Foreground changed at 16:1:19:478 to false​
        Foreground changed at 16:1:27:527 to true​
        Foreground changed at 16:1:39:914 to false​
        Foreground changed at 16:2:0:515 to true​
        Foreground changed at 16:7:14:628 to false​
        Foreground changed at 16:7:24:246 to true​
        Foreground changed at 16:9:53:523 to false​
        Foreground changed at 16:10:15:919 to true​
        Foreground changed at 16:10:31:426 to false​
        Foreground changed at 16:11:12:818 to true​
        Foreground changed at 16:11:59:538 to false​
        Foreground changed at 16:12:39:456 to true​
        Foreground changed at 16:13:6:364 to false

Corresponding terminal services client window trace fragment does not show any foreground changes, but another application’s main window has lots of them:

Handle: 000D0540 Class: "Application B Class" Title: "Application B"​
[...]​
   Process ID: 3ac​
   Thread ID: bd4​
[...]​
   Foreground: false​
        Foreground changed at 15:50:36:972 to true​
        Foreground changed at 15:50:53:732 to false​
        Foreground changed at 15:50:53:732 to true​
        Foreground changed at 15:50:53:826 to false​
        Foreground changed at 15:51:51:352 to true​
        Foreground changed at 15:51:53:941 to false​
        Foreground changed at 15:53:8:135 to true​
        Foreground changed at 15:53:8:182 to false​
        Foreground changed at 15:53:10:178 to true​
        Foreground changed at 15:53:13:938 to false​
        Foreground changed at 15:53:30:443 to true​
        Foreground changed at 15:53:31:20 to false​
        Foreground changed at 15:53:31:20 to true​
        Foreground changed at 15:53:31:129 to false​
        Foreground changed at 15:53:34:78 to true​
        Foreground changed at 15:53:34:795 to false​
        Foreground changed at 15:53:34:795 to true​
        Foreground changed at 15:53:34:873 to false​
        Foreground changed at 15:53:36:901 to true​
        Foreground changed at 15:53:42:502 to false​
        Foreground changed at 15:53:42:502 to true​
        Foreground changed at 15:53:42:564 to false​
        Foreground changed at 15:57:3:425 to true​
        Foreground changed at 15:57:4:595 to false​
        Foreground changed at 15:57:10:507 to true​
        Foreground changed at 15:57:11:318 to false​
        Foreground changed at 15:57:29:632 to true​
        Foreground changed at 15:57:31:67 to false​
        Foreground changed at 15:57:32:721 to true​
        Foreground changed at 15:57:33:844 to false​
        Foreground changed at 15:58:54:88 to true​
        Foreground changed at 15:58:56:178 to false​
        Foreground changed at 15:59:6:505 to true​
        Foreground changed at 15:59:7:987 to false​
        Foreground changed at 16:1:19:525 to true​
        Foreground changed at 16:1:19:961 to false​
        Foreground changed at 16:1:26:607 to true​
        Foreground changed at 16:1:27:434 to false​
        Foreground changed at 16:1:39:914 to true​
        Foreground changed at 16:1:39:992 to false​
        Foreground changed at 16:1:49:798 to true​
        Foreground changed at 16:2:0:437 to false​
        Foreground changed at 16:7:14:628 to true​
        Foreground changed at 16:7:14:847 to false​
        Foreground changed at 16:7:18:76 to true​
        Foreground changed at 16:7:24:106 to false​
        Foreground changed at 16:9:58:790 to true​
        Foreground changed at 16:10:4:16 to false​
        Foreground changed at 16:10:4:874 to true​
        Foreground changed at 16:10:4:890 to false​
        Foreground changed at 16:10:8:634 to true​
        Foreground changed at 16:10:15:779 to false​
        Foreground changed at 16:10:56:766 to true​
        Foreground changed at 16:10:59:402 to false​
        Foreground changed at 16:10:59:652 to true​
        Foreground changed at 16:10:59:667 to false​
        Foreground changed at 16:12:9:397 to true​
        Foreground changed at 16:12:39:347 to false​
        Foreground changed at 16:13:18:375 to true​
        Foreground changed at 16:14:33:656 to false

We can see that most of the time when Application A window loses focus, Application B window gets it.

 

Intrinsic ID

In order to perform Inter-Correlational analysis among traces and logs especially when we have Indexical Trace we need a dual operation: an ability to identify traces and Use Case Trails, and, if necessary, find their corresponding Braids of Activity in an index trace. Some data from the tracing domain or use case description may serve is Intrinsic ID. It can be itself some numeric ID, a user or computer name.

A typical log analysis case from a distributed environment is illustrated in the following diagram:

 

 

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset