R

 

Random Data

Trace and log message text consists of constant, unchanging Message Invariants and some varying data. The latter can be classified into Random Data, such as memory addresses (especially when ASLR [106] is enabled); Counter Values; and data that may vary but is usually constant, such as error values and NULL pointers. Individual values from Signals are not considered random, but their sequence can be. The accompanying diagram (adapted from the Data Association analysis pattern) depicts this analysis pattern.
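As a rough illustration of the classification above, the following Python sketch labels each whitespace-separated token of a message. The regular expressions, the class names, and the sample message are assumptions made for this example only; real traces need rules tuned to their own formats (for instance, a plain decimal number is not always a counter, and a 32-bit address may look like an error code).

```python
import re
from typing import List, Tuple

# Illustrative assumptions, not a format defined by any tracing tool:
# - NULL, all-zero hex values, and 0x8xxxxxxx error codes vary in principle
#   but are usually constant
# - other hex values of 8 to 16 digits are treated as addresses (Random Data)
# - plain decimal numbers are treated as Counter Values
ERROR_OR_NULL = re.compile(r"(?:NULL|0x0+|0x8[0-9a-fA-F]{7})")
ADDRESS = re.compile(r"0x[0-9a-fA-F]{8,16}")
COUNTER = re.compile(r"\d+")


def classify_token(token: str) -> str:
    """Return a hypothetical class label for one message token."""
    # Error codes are checked first because they also look like addresses.
    if ERROR_OR_NULL.fullmatch(token):
        return "constant-like"
    if ADDRESS.fullmatch(token):
        return "random"
    if COUNTER.fullmatch(token):
        return "counter"
    return "invariant"


def classify_message(text: str) -> List[Tuple[str, str]]:
    """Classify every whitespace-separated token of a message."""
    return [(token, classify_token(token)) for token in text.split()]


if __name__ == "__main__":
    sample = "Allocated buffer 0x000001f4a2b3c4d0 count 17 status 0x80070005 next NULL"
    for token, label in classify_message(sample):
        print(f"{label:14} {token}")
```

Tokens labeled "invariant" approximate the Message Invariant; the remaining labels separate the varying data into the three groups named in the text.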

 

Recovered Messages

If we analyze ETW-based traces such as CDF, we may frequently encounter the No Trace Metafile pattern, especially after product updates and fixes. This complicates pattern analysis because we may not be able to see Significant Events, Anchor Messages, and Error Messages. In some cases, we can recover messages by comparing the Message Context of unknown messages with the context of known ones. Access to the source code, if we have it, may also help. Both approaches are illustrated in the following diagram:

The same approach may also be applied to other kinds of trace artifacts in which some messages are corrupt. In such cases, it is possible to recover diagnostic evidence, and we therefore call this pattern Recovered Messages.
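The context comparison can be sketched in Python as follows. This is a minimal illustration under strong assumptions: messages are plain strings, an unreadable message is represented as `None`, the Message Context is reduced to the immediately preceding and following messages, and a reference trace with readable messages is available. The function name `recover_messages` is hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple


def recover_messages(reference: List[str],
                     damaged: List[Optional[str]]) -> List[Optional[str]]:
    """Fill in unknown (None) messages whose neighbors match a unique
    (previous, next) context in the reference trace."""
    # Map each (previous, next) pair in the reference trace to the set of
    # messages observed between them.
    context_map: Dict[Tuple[str, str], Set[str]] = {}
    for prev, mid, nxt in zip(reference, reference[1:], reference[2:]):
        context_map.setdefault((prev, nxt), set()).add(mid)

    recovered = list(damaged)
    for i in range(1, len(damaged) - 1):
        if damaged[i] is not None:
            continue
        prev, nxt = damaged[i - 1], damaged[i + 1]
        if prev is None or nxt is None:
            continue  # the context itself is unreadable
        candidates = context_map.get((prev, nxt), set())
        if len(candidates) == 1:  # recover only unambiguous cases
            recovered[i] = next(iter(candidates))
    return recovered


if __name__ == "__main__":
    reference = ["Open session", "Read header", "Close session"]
    damaged = ["Open session", None, "Close session"]
    print(recover_messages(reference, damaged))
```

Ambiguous contexts are deliberately left unrecovered, since a wrong guess would plant false diagnostic evidence.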

 

Relative Density

This pattern describes anomalies in semantically related pairs of trace messages, for example, “data arrival” and “data display.” Their Statement Densities can be put in a ratio (also called specific gravity [107]) and compared between working and non-working scenarios. Because the total numbers of trace messages cancel out, only the mutual ratio of the two message types remains. In our hypothetical “data” example, an increased ratio of “data arrival” to “data display” messages accounts for the reported visual data loss and sluggish GUI.
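The ratio can be computed with a few lines of Python. This sketch assumes a trace is a list of message strings and that a message type is recognized by a substring match; the function names and the sample traces are made up for illustration.

```python
from typing import Sequence


def count_messages(trace: Sequence[str], message_type: str) -> int:
    """Count messages that contain the given message type text."""
    return sum(1 for message in trace if message_type in message)


def statement_density(trace: Sequence[str], message_type: str) -> float:
    """Fraction of all trace messages that are of the given type."""
    if not trace:
        raise ValueError("empty trace")
    return count_messages(trace, message_type) / len(trace)


def relative_density(trace: Sequence[str], first: str, second: str) -> float:
    """Ratio of two Statement Densities. The trace length cancels out,
    so the ratio of the raw counts is used directly."""
    second_count = count_messages(trace, second)
    if second_count == 0:
        raise ValueError(f"no messages of type {second!r}")
    return count_messages(trace, first) / second_count


if __name__ == "__main__":
    # Hypothetical traces: the non-working one shows more arrivals per display.
    working = ["data arrival"] * 10 + ["data display"] * 10 + ["other"] * 80
    non_working = ["data arrival"] * 30 + ["data display"] * 10 + ["other"] * 10
    for name, trace in (("working", working), ("non-working", non_working)):
        print(name, relative_density(trace, "data arrival", "data display"))
```

Comparing the two printed ratios, rather than the absolute counts, keeps the comparison meaningful even when the traces differ in length.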

 

Renormalization

Using the metaphor of renormalization [108] from physics, we introduce the Renormalization trace and log analysis pattern, in which a selected message and its Message Context are replaced by a single message:
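A minimal Python sketch of such a replacement is shown here. It assumes that a trace is a list of strings and that the Message Context is a fixed number of messages before and after the selected one; both the `radius` parameter and the analyst-supplied `summary` text are assumptions of this example, not part of the pattern definition.

```python
from typing import List


def renormalize(trace: List[str], index: int, radius: int, summary: str) -> List[str]:
    """Replace the message at `index` and up to `radius` messages on each
    side of it with the single `summary` message."""
    if not 0 <= index < len(trace):
        raise IndexError("selected message is outside the trace")
    if radius < 0:
        raise ValueError("radius must be non-negative")
    start = max(0, index - radius)
    end = min(len(trace), index + radius + 1)
    return trace[:start] + [summary] + trace[end:]


if __name__ == "__main__":
    trace = ["init", "connect", "send request", "receive reply", "shutdown"]
    print(renormalize(trace, index=2, radius=1, summary="request/reply exchange"))
```

The original trace is left unmodified, so the same selection can be renormalized with different context sizes.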

 

Resume Activity

Whereas Break-in Activity is usually unrelated to a thread or an Adjoint Thread that has a discontinuity, the Resume Activity pattern highlights messages from that thread:

The graphical representation of the two traces shows the difference: in a working trace, a break-in preceded resume activity, but in a non-working trace, both patterns were absent.
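One possible way to locate resume activity programmatically is sketched below in Python. It rests on assumptions that the text does not state: messages carry a timestamp and a thread ID, the trace is sorted by time, and a discontinuity is any silence of the thread lasting at least `min_gap` time units. The `Message` type and the function name are hypothetical.

```python
from typing import List, NamedTuple


class Message(NamedTuple):
    time: float
    tid: int
    text: str


def resume_activity(trace: List[Message], tid: int, min_gap: float) -> List[Message]:
    """Return the messages with which thread `tid` resumes after a silence
    of at least `min_gap` time units (trace assumed sorted by time)."""
    thread_messages = [m for m in trace if m.tid == tid]
    resumed = []
    for previous, current in zip(thread_messages, thread_messages[1:]):
        if current.time - previous.time >= min_gap:
            resumed.append(current)
    return resumed


if __name__ == "__main__":
    trace = [
        Message(0.0, 1, "start work"),
        Message(1.0, 1, "waiting"),
        Message(2.0, 2, "break-in from another thread"),
        Message(10.0, 1, "work resumed"),
        Message(11.0, 1, "work done"),
    ]
    for message in resume_activity(trace, tid=1, min_gap=5.0):
        print(message)
```

An empty result for a non-working trace, compared with a non-empty result for a working one, would correspond to the absence of the pattern described above.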

 

Ruptured Trace

Recently, we analyzed a few logs that ended with a specialized Activity Region from a subsystem that sets operational parameters. The problem description stated that the system became unresponsive after parameters were changed in a certain sequence. Usually, for that system, when we stop logging (even after setting parameters), we end up with messages from some Background Components because some time passes between the end of the parameter-setting activity and the moment the operator sends the stop-logging request:

However, in the problem case, the message flow stops right in the middle of the parameter-setting activity:

We therefore advised checking for crashes or hangs. Indeed, the system turned out to be crashing, and in the memory dumps we received for analysis, we found a Top Module [109] from a third-party vendor related to the parameter-setting activity.

Please also note the analogy between normal stack traces of threads that spend most of their time waiting and the stack trace of a Spiking Thread [110] caught in the middle of some function.

We call this pattern Ruptured Trace after a ruptured computation [111].

Note that if it is possible to restart the system and resume the same tracing, we may get an instance of the Blackout analysis pattern.
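The check described in this section, whether a log ends inside an Activity Region instead of trailing off into Background Component messages, can be approximated by the following Python sketch. It assumes each message is a (component, text) pair and that the analyst already knows which component produces the activity and which components count as background; all component names in the example are invented.

```python
from typing import Collection, List, Tuple


def is_ruptured(trace: List[Tuple[str, str]],
                activity_component: str,
                background_components: Collection[str]) -> bool:
    """Return True if no Background Component message follows the last
    message of the activity component, i.e. the trace stops mid-activity."""
    # Walk backwards: whichever kind of message is met first decides.
    for component, _text in reversed(trace):
        if component in background_components:
            return False
        if component == activity_component:
            return True
    return False  # the activity never appears in the trace


if __name__ == "__main__":
    normal = [
        ("ParamSetter", "set parameter A"),
        ("ParamSetter", "set parameter B"),
        ("Heartbeat", "tick"),
        ("Heartbeat", "tick"),
    ]
    problem = [
        ("Heartbeat", "tick"),
        ("ParamSetter", "set parameter A"),
        ("ParamSetter", "set parameter B"),
    ]
    print(is_ruptured(normal, "ParamSetter", {"Heartbeat"}))
    print(is_ruptured(problem, "ParamSetter", {"Heartbeat"}))
```

A positive result is only a hint to look for crashes or hangs; a trace stopped quickly by the operator may produce the same shape.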

 

 
