S

 

Script Messages

Messages that contain scripting statements can be signs of malnarratives112 that resulted from log injection during attempts to exploit possible cross-channel scripting113 (XCS) and cross-site scripting114 (XSS) vulnerabilities. Such Script Messages may be spread across a log, as illustrated in the following diagram:
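A minimal sketch of scanning a log for Script Messages might look like the following. The regex patterns and function name are illustrative heuristics, not an exhaustive XSS/XCS detector:

```python
import re

# Heuristic patterns that suggest injected scripting content in a log line.
# These are illustrative examples only, not a complete detector.
SCRIPT_PATTERNS = [
    re.compile(r"<\s*script\b", re.IGNORECASE),
    re.compile(r"javascript\s*:", re.IGNORECASE),
    re.compile(r"on(load|error|click)\s*=", re.IGNORECASE),
]

def find_script_messages(lines):
    """Return (line_number, line) pairs that look like Script Messages."""
    hits = []
    for n, line in enumerate(lines, start=1):
        if any(p.search(line) for p in SCRIPT_PATTERNS):
            hits.append((n, line))
    return hits
```

Flagged lines can then be examined together with their Message Context to decide whether they are genuine exploitation attempts.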

 

Sequence Repeat Anomaly

Sometimes we have Periodic Message Blocks of a few adjacent messages, for example, when flags are translated into separate messages per bit. Then we may have a Sequence Repeat Anomaly pattern when one of several message blocks has missing or added messages compared to the more numerous expected identical message blocks. The Message Context of the Missing Message may then be explored further. The following diagram illustrates the pattern:

The name of the pattern comes from the notion of repeated DNA sequences115.
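Once repeated message blocks have been segmented, finding the anomalous one is a simple majority comparison. A minimal sketch, assuming each block is already represented as a tuple of message types:

```python
from collections import Counter

def find_repeat_anomalies(blocks):
    """Given repeated message blocks (each a tuple of message types),
    return indices of blocks that differ from the most common block,
    i.e., blocks with missing or added messages."""
    counts = Counter(blocks)
    expected, _ = counts.most_common(1)[0]
    return [i for i, block in enumerate(blocks) if block != expected]
```

The majority block serves as the expected sequence; any deviating block index is a candidate Sequence Repeat Anomaly.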

 

Shared Point

Sometimes we know from Basic Facts of some data or activity that we seek to identify in different traces collected together for Inter-Correlational analysis. It can be a shared file name, a named synchronization object, a locked file with sharing violations, a common virtual address in kernel space, or just some activity notification. We name this pattern by analogy with intersecting curves in some abstract space.

It is similar to the Linked Messages pattern but is higher-level and not confined to a common parameter (it can be an action description).
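A simple way to surface Shared Point candidates is to intersect the token sets of several traces. This is a rough sketch; the default whitespace tokenizer is a placeholder, and a real analysis would extract file names, object names, or addresses specifically:

```python
def shared_points(traces, extract=lambda line: set(line.split())):
    """Return tokens present in every trace: candidate Shared Points.
    `extract` is a hypothetical tokenizer supplied by the analyst."""
    token_sets = []
    for trace in traces:
        tokens = set()
        for line in trace:
            tokens |= extract(line)
        token_sets.append(tokens)
    return set.intersection(*token_sets) if token_sets else set()
```

The intersection is usually noisy, so the result is a starting point for inspection rather than a definitive answer.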

 

Sheaf of Activities

Inter-Correlation analysis between normal and problem logs to find a Bifurcation Point (and a possible root cause) becomes a difficult task when both traces come from separate environments with different Background Components. Here a new analysis pattern, with a name borrowed from sheaves116 in mathematics, can help. This pattern is also a tool for tracking the properties of trace message subsets. First, we identify important message types around some Activity Region where we hope to find a difference between the two traces:

Then we create several Adjoint Threads from different message types, for example, based on the operation type or function name:

Then we analyze the subtraces separately to find a bifurcation point in each of them, and use this knowledge to find differences between the original full traces.
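The step of creating Adjoint Threads from message types can be sketched as a grouping operation. Here the key function (operation type, function name, and so on) is an assumption supplied by the analyst:

```python
from collections import defaultdict

def adjoint_threads(messages, key):
    """Split a flat trace into Adjoint Threads, one subtrace per key
    value (e.g., operation type or function name), preserving the
    original message order within each subtrace."""
    threads = defaultdict(list)
    for msg in messages:
        threads[key(msg)].append(msg)
    return dict(threads)
```

Each resulting subtrace can then be compared against its counterpart from the other environment to localize the bifurcation point.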

 

Significant Event

When looking at software traces, either searching or just scrolling, certain messages grab attention immediately. We call them Significant Events. It could be a recorded exception (Exception Stack Trace) or an error, a Basic Fact, a trace message from a Vocabulary Index, or just any trace statement that marks the start of some activity we want to explore in depth, for example, a certain DLL attached to the process, a coupled process started, or a function called. The start of a trace and its end are trivial Significant Events and are used in deciding whether the trace is Circular, and in determining the trace recording interval or its average Statement Current.

 

Significant Interval

Sometimes we ask for a log file to see the State and Event pattern, and see it there, only to find that we cannot do a Back Trace of State Dumps from some Significant Event for Inter-Correlation analysis because our Data Interval is truncated (Truncated Trace). This highlights the importance of proper tracing intervals, a pattern we call Significant Interval by analogy with significant digits117 in scientific measurements. The following diagram illustrates the pattern:

If you often get truncated traces and logs, you may want to increase the Statement Current for state logging.

 

Silent Messages

We mostly analyze real messages in software traces and logs. In such message streams, we may easily see detectable Discontinuity patterns. However, in some cases, it is beneficial to analyze the absence of messages. The message stream is not uniform; there may be different Statement Currents. If the time resolution is 1 ms, for example, then we may have a current of N msg/ms, or, in the case of a lower current such as 0.5 msg/ms, we have the so-called Silent Messages (----):

[...]
11 ms: message
12 ms: ----
13 ms: message
14 ms: ----
15 ms: message
16 ms: message
17 ms: ----
18 ms: ----
19 ms: message
[...]

So, by a silent message, we understand a possible message that would occupy the minimal time-resolution gap. If we look at the following illustration, we see that the whole pattern analysis apparatus can be applied to the distribution of silent messages.

This pattern differs from the Discontinuity pattern, because the latter is about large unexpected silences, and from Sparse Trace, which is about missing trace statements in source code.
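Silent messages can be enumerated mechanically once message timestamps are normalized to the chosen time resolution. A minimal sketch over the kind of excerpt shown above:

```python
def silent_slots(timestamps_ms, start_ms, end_ms, resolution_ms=1):
    """Return the time slots (at the given resolution) within
    [start_ms, end_ms] that contain no message: the Silent Messages."""
    occupied = {t // resolution_ms for t in timestamps_ms}
    return [t for t in range(start_ms, end_ms + 1, resolution_ms)
            if t // resolution_ms not in occupied]
```

The returned slot list is itself a message sequence of sorts, so the same distribution analysis (bursts, periodicity, discontinuities) can be applied to it.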

 

Singleton Event

There are events that, by design or system configuration, should be seen in a log only once, or not at all if the code responsible for them was executed before the tracing session. For example, the launch of certain services during system initialization should not be seen again when we trace system activity long after that. These can also be just messages from singleton118 objects in the application log. The appearance of extra Singleton Events may point to design violations or abnormal system events such as a process restart. The latter may Intra-Correlate with the start of a fault handling process such as WerFault.exe in Windows Process Monitor logs (Guest Component).
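Checking for extra Singleton Events reduces to counting occurrences of the invariant parts of messages we expect at most once. A minimal sketch, where the set of singleton invariants is supplied by the analyst:

```python
from collections import Counter

def extra_singleton_events(messages, singleton_invariants):
    """Report message invariants that should appear at most once but
    occur more often; extra occurrences may indicate, e.g., a process
    restart or a design violation."""
    counts = Counter(m for m in messages if m in singleton_invariants)
    return {m: n for m, n in counts.items() if n > 1}
```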

 

Singleton Trace

Status updates, error message boxes, and even abort errors can all be considered examples of software traces and logs. We call such a log with only one message a Singleton Trace. Please don't confuse it with the Singleton Event analysis pattern, which is trivial for Singleton Traces. We illustrate this pattern with a picture of an error message (in Russian it means "The machine doesn't work. Code: SB") from a lemonade-dispensing machine:

The message in such a trace may contain the associated stack trace as a trivial example of Fiber Bundle.

 

Small DA+TA

Recently we performed the diagnostic analysis of a software incident where certain functionality was not available to users and provided a report based on analysis patterns such as Focus of Tracing and Opposition Messages. We also conjectured some hypotheses explaining the observed abnormal behavior. However, in the end, the problem was solved not by the analysis of a lengthy software execution log but by looking at a small configuration INI file where the non-working functionality was simply disabled in one line:

EnableFunctionality = 0

Even before that analysis, we were thinking about the importance of Small DA+TA such as configuration files and registry details that can be considered as general software traces119. Here DA+TA means Dump Artifact + Trace Artifact and Big DA+TA refers to software execution memory dump artifacts and trace artifacts that can be huge. The analysis pattern is illustrated in the following diagram, where we see no difference between working and non-working scenarios due to insufficient trace coverage (Sparse Trace):
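Such small configuration artifacts are easy to check programmatically. A minimal sketch using Python's standard configparser; the `[Features]` section name is a hypothetical addition, since the actual section depends on the application's INI layout:

```python
import configparser

# Hypothetical INI content mirroring the one-line fix discussed above;
# the [Features] section name is an assumption for illustration.
INI_TEXT = """
[Features]
EnableFunctionality = 0
"""

def functionality_enabled(ini_text):
    """Parse Small DA+TA (an INI fragment) and report the flag value."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return parser.getboolean("Features", "EnableFunctionality")
```

Inspecting such flags first can save hours of full-trace analysis when the observed behavior is simply configured, not broken.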

 

Sparse Trace

Sometimes we do not see anything in the trace, or see very little, because trace statements did not cover a particular source code fragment (see also PLOTs120):

The Sparse Trace pattern is different from the Missing Component pattern, where some modules are explicitly not included for logging although there is logging code there, and from the Visibility Limit pattern, where tracing is intrinsically impossible. Often technical support and escalation engineers request more trace statements, and software engineers extend tracing coverage iteratively as needed.

 

Split Trace

Some tracing tools, such as CDFControl121, have the option to split software traces and logs into several files during long recording sessions. Although this should be done judiciously, it is sometimes necessary. What do we do if we get several trace files and want to use some other analysis tool such as CDFAnalyzer122? If we know that the problem happened just before the tracing was stopped, we can look at the last few files from the file sequence (although we recommend a Circular Trace here). Otherwise, we can convert them into CSV files and import them into Excel, which also supports adjoint threading123.
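When a tool cannot open a multi-file trace, the parts can be merged back into one ordered stream before analysis. A minimal sketch, assuming each part is already sorted by timestamp and messages are (timestamp, text) pairs:

```python
import heapq

def merge_split_traces(parts):
    """Merge several split trace parts, each already ordered by
    timestamp, into a single chronologically ordered message stream."""
    return list(heapq.merge(*parts, key=lambda msg: msg[0]))
```

heapq.merge streams the parts lazily, so very large split traces can be merged without loading everything into memory at once (here we materialize a list only for simplicity).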

The following diagram illustrates this pattern:

 

State and Event

For event- or message-driven architectures, it is important to differentiate between event and state messages (including state transitions). For example, a system may be doing some work while being in some particular state with much tracing, and respond to various external events, each of them having a corresponding trace message. Upon such an event, the system transitions to some other state with its own set of possible trace messages. We call such a pattern State and Event. A typical example here is a windowing terminal services system and the WM_ENDSESSION event, illustrated in the following abstract trace diagram with a corresponding state transition diagram below it:

 

State Dump

Introduced in Debugging TV124 Frames episode 0x32 about Android / Java debugging, this pattern solves the problem of program state analysis when memory dump generation is not available, does not help, or is complicated, as in the case of interpreted code. A developer identifies a set of state variables and periodically prints their values to the output logging stream. Such output may also include, but is not limited to, Counter Values.
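The periodic state printing described above can be sketched as a small helper class. The class and variable names are illustrative; a real implementation would hook into the application's logging framework:

```python
import time

class StateDumper:
    """Periodically writes selected state variables to the log stream,
    a sketch of the State Dump pattern for interpreted code."""

    def __init__(self, state, log=print, period_s=5.0):
        self.state = state          # dict: variable name -> getter callable
        self.log = log              # sink for dump lines
        self.period_s = period_s
        self._last = 0.0

    def maybe_dump(self, now=None):
        """Dump the state if the period has elapsed; return the snapshot."""
        now = time.monotonic() if now is None else now
        if now - self._last >= self.period_s:
            self._last = now
            snapshot = {name: get() for name, get in self.state.items()}
            self.log(f"STATE DUMP {snapshot}")
            return snapshot
        return None
```

Calling maybe_dump from the main loop (or a timer) yields a regular series of State Dump messages that later supports Back Trace of State Dumps.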

 

Statement Density and Current

Sometimes we have several disjoint Periodic Errors and possible False Positives. We wonder where we should start, or how to assign relative priorities to troubleshooting suggestions. Here the Statement Density and Current pattern can help. The statement (message) density is simply the ratio of the number of occurrences of a specific trace statement (message) to the total number of all recorded messages.

Consider this software trace with two frequent messages:

N     PID  TID
21    5928 8092 LookupAccountSid failed. Result = -2146238462
[...]
1013  5928 1340 SQL execution needs a retry. Result = 0

We have approx. 7,500 statements for the former and approx. 1,250 statements for the latter. The total number of trace statements is 185,700, so we have the corresponding approx. trace densities: 0.04 and 0.0067. Their relative ratio 7,500/1,250 is 6.

We collected another trace of the same problem at a different time with the same errors. It has 71,100 statements, with only 160 and 27 statements counted for the messages above. We have approx. the same ratio of 160/27 (5.93), which suggests the messages are correlated. However, the statement density is much lower, approx. 0.002 and 0.00038, which suggests a closer look at the second trace to see whether these problems started some time after the start of the recording.

We can also check the Statement Current, the number of statements (messages) per unit of time. We recorded the first trace over 195 seconds and the second over 650 seconds, so we have 952 msg/s and 109 msg/s, respectively. This suggests that the problem might have started at some time during the second trace, or that more modules were selected for the first trace. To make sure, we adjust the total number of messages for these two traces: we find the first occurrence of the error and subtract its message number from the total number of messages. For our first trace, we see that the messages start from the very beginning, and in our second trace they also start almost from the beginning, so such adjustment does not give much better results here. Also, these statements continue to be recorded until the very end of both traces.
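The arithmetic above can be sketched directly, using the counts and durations from the two traces discussed in this section:

```python
def density(occurrences, total_messages):
    """Statement density: occurrences of a message / all recorded messages."""
    return occurrences / total_messages

def current(total_messages, duration_s):
    """Statement current: messages per second over the recording interval."""
    return total_messages / duration_s

# Values from the two traces discussed above.
d1a, d1b = density(7500, 185700), density(1250, 185700)   # trace 1
d2a, d2b = density(160, 71100), density(27, 71100)        # trace 2
c1, c2 = current(185700, 195), current(71100, 650)        # msg/s
```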

To avoid being lost in this discussion, we repeat the main results:

           Density            Relative Density   Current (all msg/s)
Trace 1    0.04 / 0.0067      6                  952
Trace 2    0.002 / 0.00038    5.93               109

The possibility that much more was traced, resulting in the lower density for the second trace, should be discarded because we also have a much lower current. Perhaps the environment was not quite the same for the second tracing session. However, the same relative density for the two different errors suggests that they are correlated, and the higher density of the first error suggests that we should start our investigation with it.

The reason we came up with this statistical trace analysis pattern is that two different engineers analyzed the same trace and suggested different troubleshooting paths based on selected Error Messages from software traces. So, we did a statistical analysis to prioritize their suggestions.

Density part is illustrated in the following diagram:

Current part is illustrated in the following diagram:

 

Surveyor

Sometimes, the presence of some messages in a trace or log shows that some other tracing or logging tool was running or that some process was also doing tracing. We call this analysis pattern Surveyor. Such discovered tracing may not be related to the trace we are looking at (compare to Trace Extension) but may help with finding additional traces in the system as illustrated in the following diagram:

 

Signal

According to the definition by Michael Robinson,125 “a signal consists of a collection of related measurements.” For traces and logs, we can apply a similar definition and consider a Signal as a collection of local messages having the same Message Invariant and related variable data values. Signals are examples of Message Sets. A typical example is a set of related Counter Value messages. Signals can be obtained by constructing the Adjoint Thread of Activity of a specific message (to filter out Background Components “noise”), as illustrated in the following diagram:

Generally, the variable “measurement” part can form Braid of Activity.

We introduce Signal analysis pattern to bridge the gap between Software Narratology126 and Hardware Narratology127.
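Extracting a Signal amounts to collecting the variable measurement part from all messages sharing one Message Invariant. A minimal sketch, where the invariant is expressed as a regular expression with one capture group for the value:

```python
import re

def extract_signal(messages, invariant_re):
    """Collect the variable 'measurement' part from all messages that
    match a single Message Invariant; the value sequence is the Signal."""
    pattern = re.compile(invariant_re)
    signal = []
    for msg in messages:
        m = pattern.search(msg)
        if m:
            signal.append(int(m.group(1)))
    return signal
```

The resulting numeric sequence can then be treated with ordinary signal analysis techniques (trends, periodicity, outliers).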

 

 
