C

 

Calibrating Trace

Multiple traces and logs are usually collected for diagnosing distributed systems. Different tools and tracing settings (circular, sequential, file size limit) may be used, systems may be unsynchronized, and individual system tracing may be started at different times due to manual tracing setup and switching between systems. There may be Blackouts, Circular, and Truncated traces. When we analyze such a trace set (Inter-Correlation), we usually select one trace or log that is used as Calibrating Trace. It is used for measuring all other traces against Basic Facts such as start and end tracing times, and the time of the problem. One such scenario is illustrated in the following diagram:
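The measurement against Basic Facts can also be sketched in code. The following is a minimal illustration, not a prescribed procedure: the `TraceSpan` type, the `measure_against` function, and all times are hypothetical, and we assume timestamps have already been converted to a common clock (which unsynchronized systems do not guarantee).

```python
from dataclasses import dataclass

@dataclass
class TraceSpan:
    name: str
    start: float  # seconds on a common clock (assumption: already aligned)
    end: float

def measure_against(calibrating, others, problem_time):
    """Compare each trace with the calibrating trace and the problem time."""
    report = {}
    for trace in others:
        report[trace.name] = {
            # Positive start offset: tracing started later than calibrating trace.
            "start_offset": trace.start - calibrating.start,
            # Negative end offset: tracing ended earlier than calibrating trace.
            "end_offset": trace.end - calibrating.end,
            # Does the trace include the moment the problem occurred?
            "covers_problem": trace.start <= problem_time <= trace.end,
        }
    return report

# Hypothetical trace set: the server trace is chosen as Calibrating Trace.
calibrating = TraceSpan("server", 100.0, 500.0)
others = [TraceSpan("client", 150.0, 450.0), TraceSpan("network", 90.0, 300.0)]
report = measure_against(calibrating, others, problem_time=400.0)
for name, facts in report.items():
    print(name, facts)
```

In this made-up set, the network trace ends before the problem time, so it would be of limited use for analyzing the problem itself.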

 

Cartesian Trace

The Cartesian Trace analysis pattern has its analogical roots in the Cartesian product [27]. It covers a case where we have a long trace and a few Small DA+TA configuration traces (files). The messages of the former are associated with the messages of the latter (content or content changes), as depicted in the following diagram:

Think about a rectangle as a product of two line fragments or a cylinder as a product of a circle and a line fragment. In contrast to Fiber Bundle, Trace Presheaf, or Trace Extension, both traces are completely independent.
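The product analogy can be illustrated with a small sketch. It assumes the simplest reading of the pattern, where every message of the long trace is paired with every entry of a configuration file; the message and configuration strings are invented for illustration.

```python
from itertools import product

# Hypothetical long trace (shortened) and a small configuration file.
trace_messages = ["Trace statement 1", "Trace statement 2", "Trace statement 3"]
config_entries = ["timeout=30", "retries=5"]

# Cartesian product: each trace message is associated with each config entry.
pairs = list(product(trace_messages, config_entries))

for message, entry in pairs:
    print(f"{message} | {entry}")
```

Because the two inputs are independent, the number of pairs is simply the product of their lengths.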

 

Characteristic Message Block

A bird’s-eye view of software traces [28] makes it easier to see their coarse block structure:

Finer structure is also discernible, and we can even see nested blocks:

We can see some blocks of output when scrolling a trace viewer window, but if a viewer supports zooming, it is possible to get an overview and jump directly into a Characteristic Message Block, for example, debug messages of repeated attempts to query a database. If a viewer supports message coloring, it also helps here. Sometimes, the latter technique is useful when we want to ignore bulk messages and start an analysis around block boundaries.

 

Circular Trace

This is an obvious structural trace analysis pattern. Sometimes, the information about circularity is missing in the problem description, or the trace metadata does not reflect it. Then Circular Traces can be detected by trace File Size (usually large) and from timestamps, as in this 100 MB CDF trace snippet:

No Module PID TID Date Time Statement

[Begin of trace listing]

1 ModuleA 4280 1736 5/28/2009 08:53:50.496 [... Trace statement 1]

2 ModuleB 6212 6216 5/28/2009 08:53:52.876 [... Trace statement 2]

3 ModuleA 4280 4776 5/28/2009 08:54:13.537 [... Trace statement 3]

[... Some traced exceptions helpful for analysis ...]

3799 ModuleA 4280 3776 5/28/2009 09:15:00.853 [... Trace statement 3799]

3800 ModuleA 4280 1736 5/27/2009 09:42:12.029 [... Trace statement 3800]

[... Skipped ...]

[... Skipped ...]

[... Skipped ...]

579210 ModuleA 4280 4776 5/28/2009 08:53:35.989 [... Trace statement 579210]

[End of trace listing]
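The timestamp check used on the listing above can be automated. The following is a minimal sketch that assumes the date and time columns have the format shown in the listing; the function names are ours and not part of any tracing tool.

```python
from datetime import datetime

def parse_timestamp(date_text, time_text):
    # Timestamps in the listing look like "5/28/2009 08:53:50.496".
    return datetime.strptime(f"{date_text} {time_text}", "%m/%d/%Y %H:%M:%S.%f")

def find_wrap_points(rows):
    """Return message numbers whose timestamp is earlier than the previous one.

    rows: iterable of (message_number, timestamp) in file order.
    """
    wrap_points = []
    previous = None
    for number, stamp in rows:
        if previous is not None and stamp < previous:
            wrap_points.append(number)
        previous = stamp
    return wrap_points

# The visible rows of the listing above (skipped rows are omitted).
rows = [
    (1, parse_timestamp("5/28/2009", "08:53:50.496")),
    (2, parse_timestamp("5/28/2009", "08:53:52.876")),
    (3, parse_timestamp("5/28/2009", "08:54:13.537")),
    (3799, parse_timestamp("5/28/2009", "09:15:00.853")),
    (3800, parse_timestamp("5/27/2009", "09:42:12.029")),
    (579210, parse_timestamp("5/28/2009", "08:53:35.989")),
]
print(find_wrap_points(rows))  # [3800]
```

Message 3800 is where the time goes backwards, so the newest messages are the ones before it and the oldest surviving messages start at it.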

We can usually find the analysis region at the beginning of such traces because tracing is stopped as soon as an elusive and hard-to-reproduce problem happens:

 

Combed Trace

A typical software trace or log (for example, from Process Monitor) lists messages from several processes and threads sequentially. However, such a listing may be split into individual process ID or thread ID columns. The same can be done for any Adjoint Thread, as illustrated in the following diagram:

We call this analysis pattern Combed Trace by analogy with multibraiding [29].
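Combing can be sketched as follows. This is an illustration under simple assumptions: messages are given as hypothetical (TID, text) pairs in trace order, and the `comb_by_thread` function is ours rather than a feature of any particular viewer.

```python
def comb_by_thread(messages):
    """Split a sequential trace into one column per thread ID.

    messages: list of (tid, text) pairs in trace order.
    Returns (tids, rows): each row has one cell per TID, and only the
    cell of the thread that emitted the message is filled.
    """
    tids = sorted({tid for tid, _ in messages})
    column = {tid: index for index, tid in enumerate(tids)}
    rows = []
    for tid, text in messages:
        row = [""] * len(tids)
        row[column[tid]] = text
        rows.append(row)
    return tids, rows

# Hypothetical messages from three threads.
messages = [
    (1736, "Trace statement 1"),
    (6216, "Trace statement 2"),
    (4776, "Trace statement 3"),
    (1736, "Trace statement 4"),
]
tids, rows = comb_by_thread(messages)
print(" | ".join(f"{tid:<17}" for tid in tids))
for row in rows:
    print(" | ".join(f"{cell:<17}" for cell in row))
```

The original message order is preserved row by row, so the time sequence can still be read from top to bottom.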

 

Correlated Discontinuity

When analyzing Inter-Correlation or Intra-Correlation and finding Discontinuities in a part of one trace or in a different trace (for example, in client-server environments), it is useful to see if there are corresponding Correlated Discontinuities in another part of the same trace (for example, in a different Thread of Activity) or in a different trace. Such a pattern may point to the underlying communication problem and may suggest gathering a different trace (for example, a network trace) for further analysis.
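One possible way to look for such correspondences is sketched below. It assumes message times are plain seconds on a common clock, treats any silence longer than a chosen threshold as a Discontinuity, and considers two Discontinuities correlated when their time intervals overlap. The threshold, the data, and the function names are all hypothetical.

```python
def find_gaps(timestamps, threshold):
    """Return (start, end) pairs where consecutive messages are more than
    `threshold` seconds apart."""
    return [(a, b) for a, b in zip(timestamps, timestamps[1:]) if b - a > threshold]

def correlated_gaps(gaps_a, gaps_b):
    """Pair up gaps from two traces (or threads) whose intervals overlap."""
    return [
        (gap_a, gap_b)
        for gap_a in gaps_a
        for gap_b in gaps_b
        if gap_a[0] < gap_b[1] and gap_b[0] < gap_a[1]
    ]

# Hypothetical message times from a client trace and a server trace.
client_times = [0.0, 1.0, 2.0, 30.0, 31.0]
server_times = [0.5, 1.5, 2.5, 29.0, 29.5]

client_gaps = find_gaps(client_times, threshold=5.0)
server_gaps = find_gaps(server_times, threshold=5.0)
print(correlated_gaps(client_gaps, server_gaps))
```

Here both traces fall silent over roughly the same period, which is the kind of finding that may justify collecting a network trace.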

 

Corrupt Message

Sometimes log messages are formatted with mistakes; buffers are not cleared before copying; copied strings are truncated; tracing implementation and presentation contain coding defects. There can be internal corruption when messages are formed or “corruption” during presentation, for example, default field conversion rules (like in Excel). We call this pattern Corrupt Message. Such messages may affect trace and log analysis where data search may not show full relevant results. We then recommend double-checking findings by using Data Flow of a different Message Invariant.

 

CoTrace (CoLog, CoData)

When we do trace and log analysis (and software data analysis in general), we look at specific messages found from search (Message Patterns), Error Messages, Significant Events, visit Activity Regions, filter Message Sets, walk through (Adjoint) Threads of Activity, and do other actions necessitated by trace and log analysis patterns. All these can be done in random order (starting from some analysis point), not necessarily representing the flow of Time or some other metric [30]:

Analyzed messages form their own analysis trace that we call CoTrace (CoLog, CoData) where the prefix Co- denotes a space dual to trace (log, data) space:

Instead of messages (or in addition to), we can also form CoTraces consisting of visited Activity Regions or some other areas:

We can apply trace analysis patterns to CoTraces as well. The latter can also be used in the creation of higher-order pattern narratives [31].
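A CoTrace can be recorded mechanically while we analyze. The following is a minimal sketch with a hypothetical `CoTrace` class that remembers which messages were looked at and in what order; the trace content is invented for illustration.

```python
class CoTrace:
    """Records the order in which messages of a trace are visited."""

    def __init__(self, trace):
        self.trace = trace   # list of messages in trace order
        self.visited = []    # message indexes in analysis order

    def visit(self, index):
        # Remember the visit and return the message being analyzed.
        self.visited.append(index)
        return self.trace[index]

    def messages(self):
        # The CoTrace itself: messages in the order we looked at them.
        return [self.trace[index] for index in self.visited]

# Hypothetical trace and an analysis that jumps around in it.
trace = ["start", "query database", "error: timeout", "retry", "stop"]
cotrace = CoTrace(trace)
for index in (2, 1, 3, 1):
    cotrace.visit(index)
print(cotrace.messages())
```

Since the result is again a sequence of messages, the same analysis patterns can be applied to it, and repeated visits (the second look at "query database" here) remain visible.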

 

Counter Value

This pattern covers performance monitoring and its logs. Counter Value is some variable in memory (for example, the Module Variable [32] memory analysis pattern) that is updated periodically to reflect some aspect of state, or that is calculated from different variables, and presented in trace messages. We can organize such messages in a format similar to the ETW-based traces we usually consider as examples for our trace patterns:

Source  PID TID   Function         Value
=================================================
[...]
System    0   0   Committed Memory 12,002,234,654
Process 844   0   Private Bytes    345,206,456
System    0   0   Committed Memory 12,002,236,654
Process 844   0   Working Set      122,160,068
[...]

Therefore, all other trace and log analysis patterns such as Adjoint Thread (can be visualized via different colors on a graph), Focus of Tracing, Characteristic Message Block (for graphs), Activity Region, Significant Event, and others are applicable here. There are also some specific patterns such as Global Monotonicity and Constant Value that we discuss with examples in later reference editions.
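Messages in the format above can be grouped into per-counter series for further analysis. The sketch below assumes whitespace-separated columns with the value in the last column; the parsing rules and the simple non-decreasing check are our assumptions and are not a definition of the Global Monotonicity pattern.

```python
from collections import defaultdict

LOG = """\
Source  PID TID   Function         Value
=================================================
[...]
System    0   0   Committed Memory 12,002,234,654
Process 844   0   Private Bytes    345,206,456
System    0   0   Committed Memory 12,002,236,654
Process 844   0   Working Set      122,160,068
[...]
"""

def counter_series(text):
    """Group counter values by (source, pid, tid, counter name)."""
    series = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        # Skip the header, the separator, and "[...]" placeholders.
        if len(fields) < 5 or not fields[1].isdigit():
            continue
        source, pid, tid = fields[0], int(fields[1]), int(fields[2])
        name = " ".join(fields[3:-1])             # e.g. "Committed Memory"
        value = int(fields[-1].replace(",", ""))  # drop thousands separators
        series[(source, pid, tid, name)].append(value)
    return dict(series)

def is_non_decreasing(values):
    return all(a <= b for a, b in zip(values, values[1:]))

series = counter_series(LOG)
for key, values in series.items():
    print(key, values, is_non_decreasing(values))
```

Each resulting series can then be treated like an Adjoint Thread, for example, plotted in its own color.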

 

Coupled Activities

Sometimes we need to know about the client-server interaction between components, threads, or processes to find out where the problem started. For example, if we have an Error Message or Discontinuity in one PID Adjoint Thread of Activity, and we know that this process uses an API from another PID, we can look at the latter PID Adjoint Thread to see if there are any Error Messages or other problems. The failure in the server can propagate to the client, as illustrated in the following diagram:

We call this pattern Coupled Activities, similar to the Coupled Processes memory analysis pattern [33]. It can help in Intra- and Inter-Correlation analysis, for example, in choosing Adjoint Threads from Sheaf of Activities.
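The lookup in the coupled PID Adjoint Thread can be sketched as a simple filter. The message fields (`time`, `pid`, `is_error`, `text`), the PIDs, and the time window are hypothetical; a real trace would need its own way of recognizing Error Messages.

```python
def errors_before(messages, pid, moment, window):
    """Error messages from process `pid` in the `window` seconds up to `moment`."""
    return [
        m for m in messages
        if m["pid"] == pid and m["is_error"] and moment - window <= m["time"] <= moment
    ]

# Hypothetical trace: client PID 4280 calls an API served by PID 6212.
trace = [
    {"time": 10.0, "pid": 6212, "is_error": False, "text": "request received"},
    {"time": 10.4, "pid": 6212, "is_error": True, "text": "database query failed"},
    {"time": 10.5, "pid": 4280, "is_error": True, "text": "API call returned error"},
]

client_error = trace[2]
suspects = errors_before(trace, pid=6212, moment=client_error["time"], window=2.0)
for message in suspects:
    print(message["time"], message["text"])
```

In this example, the server error shortly before the client error is a candidate for where the problem started.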

 

Critical Point

Based on a mathematical analogy with critical points [34] in topology (Morse theory [35]), we introduce Critical Points in trace and log analysis where they signify the change of trace or log “shape” (topological or “geometric” properties) as illustrated in the following diagram:

Such a point may be an individual message, its Message Context, or Activity Region.

Critical Points are examples of Intra-Correlation whereas Bifurcation Points are examples of Inter-Correlation.

 

 
