A

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

While preparing a presentation on malware narratives4, we found that one essential pattern is missing from the current log analysis pattern catalog. Most of the time, we see some abnormal or unexpected value in a software trace or log, such as a network address outside the expected range, and this triggers further investigation. The message structure may have the same Message Invariant, but the variable part may contain such values as depicted graphically:

Please note that we also have Significant Event pattern that is more general and also covers messages without variable part or just suspicious log entries.

Activity Disruption

Sometimes a few Error Messages or Periodic Errors with low Statement Density for specific Activity Regions or Adjoint Threads of Activity (for a specific component, file or function) may constitute Activity Disruption. If the particular functionality was no longer available at the logging time, then its unavailability may not be explained by such disruptions, and such messages may be considered False Positive Errors in relation to the reported problem:

But, if we have Periodic Message Blocks containing only Periodic Errors, Activity Region or Adjoint Thread Discontinuity, or simply No Activity, then we may have the complete cessation of activity that may correlate with the unavailable functionality:

Activity Divergence

Sometimes we have several Threads of Activity, for example, from the same process) visible for a certain period and then suddenly we see only one such thread till the end of a trace (or even none). It may be an indication of an application hang or some other abnormal behavior if several active threads doing logging are normal. If we consider such activities (including Adjoint Threads) as vectors running through some temporal “surface,” we can use an analogy of a divergence⁵:

Activity Overlap

Sometimes specific parts of simultaneous Use Case Trails, blocks of Significant Events or Message Sets in general may overlap. It may point to possible synchronization problems such as race conditions (prognostics) or be visible root causes of them if such problems are reported (diagnostics). We call this pattern Activity Overlap:

For example, a first request may start a new session, and we expect the second request processed by the same already established session:

However, users report the second session started upon the second request. If we filter execution log by session id, we find out that session initialization prologs (Trace Partition, page 256) are overlapped. The new session started because the first session initialization was not completed:

Activity Packet

Sometimes we are interested in Message Set that has the same time attribute value (or rounded to some digit). We call this analysis pattern Activity Packet by analogy with wave packets⁶. It may allow the identification of related threads and activities.

It is different from Activity Quantum analysis pattern where time attribute value may change for continuous Message Set with the same thread ID.

Activity Quantum

Having chosen a trace message, we are interested in its Message Context which can span all “continuous” messages before and after from the same Thread of Activity. We call it Activity Quantum, which is variable and independent from the so-called CPU quantum. Different messages from Activity Quantum may be executed on different CPUs. The following diagram depicts this analysis pattern:

Activity Region

When looking at lengthy traces with thousands and millions of messages (trace statements), we can see regions of activity where the statement current (J_m, msg/s) is much higher than in surrounding temporal regions (Statement Current). Here is an illustration for a typical ETW⁷ / CDF⁸ trace where a middle region of activity (J_m2) signifies a system performing some response function as a user session initialization and application launch:

Activity Theatre

In addition to Message Patterns, there are higher-level patterns of specific activities and Motives. Such activities may or may not coincide with specific components (modules) because they may be grouped based on implementation messages, software internals semantics and not on architectural and design entities (as in Use Case Trail analysis pattern). Moreover, the same components may “play” different activity roles. Once assigned, Activity Theatre “scripts” can be compared with “scripts” from other traces and logs (Inter-Correlation) or different parts of the same log (Intra-Correlation). This pattern is illustrated in the following diagram:

Adjoint Message

By analogy with Adjoint Thread of Activity, we introduce Adjoint Message analysis pattern. Most, if not all, analysis patterns focus on log message text and consider TID, PID, Module, source file, and function as its attributes. However, we can choose one of the attributes and consider it as a message in its own right with the original message text consigned now as another attribute. Then we can analyze the structure of the trace from the perspective of that newly selected message:

Since the number of different message values now is smaller (for example, module names) compared to normal trace messages, we can use them in protein-like encoding and structure analysis schemes (see Software Trace and Logs as Proteins⁹). We metaphorically name Adjoint Messages as Amino-acid-Messages (A-Messages). We can also compress same message sequences into one message which may be useful for pattern matching (and even use different color intensities to represent message cardinalities):

Adjoint Space

Sometimes we need memory reference information not available in software traces and logs, for example, to see the pointer dereferences, to follow pointers and linked structures. In such cases, memory dumps saved during logging sessions may help. In the case of process memory dumps, we can even have several Step Dumps¹⁰. We may force complete or kernel memory dumps after saving a log file. We call such pattern Adjoint Space:

Then we can analyze logs and memory dumps together, for example, to follow pointer data further in memory space:

There is also a reverse situation when we use logs to see past data changes before memory snapshot time (Paratext memory analysis pattern¹¹):

Adjoint Thread of Activity

This pattern is an extension of Thread of Activity based on the concept of multibraiding (see below). There is also an article published in Debugged! MZ/PE magazine¹².

Having considered computational threads as braided strings¹³ and after discerning software trace analysis patterns we can see formatted and tabulated software trace output in a new light and employ the “fabric of traces” and braid metaphors for Adjoint Thread concept. This new concept was motivated by reading about Extended Phenotype¹⁴ and extensive analysis of Citrix ETW¹⁵-based CDF¹⁶ traces using CDFAnalyzer¹⁷. The term Adjoint was borrowed from mathematics¹⁸ because the concept we discuss below resembles this metaphorical formula: (Thread A, B) = [A, Thread B]. Let me first illustrate adjoint threading using simplified trace tables. Consider this generalized software trace example (we omitted date and time column for visual clarity):


#	Source Dir	PID	TID	File Name	Function	Message
1	srcsubsystemA	2792	5676	file1.cpp	fooA	Message text…
2	srcsubsystemA	2792	5676	file1.cpp	fooA	Message text…
3	srcsubsystemA	2792	5676	file1.cpp	fooA	Message text…
4	srclib	2792	5680	file2.cpp	barA	Message text…
5	srcsubsystemA	2792	5680	file1.cpp	fooA	Message text…
6	srcsubsystemA	2792	5676	file1.cpp	fooA	Message text…
7	srclib	2792	5680	file2.cpp	fooA	Message text…
8	srclib	2792	5680	file2.cpp	fooA	Message text…
9	srcsubsystemB	2792	3912	file3.cpp	barB	Message text…
10	srcsubsystemB	2792	3912	file3.cpp	barB	Message text…
11	srcsubsystemB	2792	3912	file3.cpp	barB	Message text…
12	srcsubsystemB	2792	3912	file3.cpp	barB	Message text…
13	srcsubsystemB	2792	3912	file3.cpp	barB	Message text…
14	srcsubsystemB	2792	3912	file3.cpp	barB	Message text…
15	srcsubsystemB	2792	2992	file4.cpp	fooB	Message text…
16	srcsubsystemB	2792	3008	file4.cpp	fooB	Message text…
…	…	…	…	…	…	…

We see several threads in a process PID 2792. In CDFAnalyzer, we can filter trace messages that belong to any column, and if we filter by TID, we get a view of any Thread of Activity. However, each thread can “run” through any source directory, file name or function. If a function belongs to a library, then multiple threads access it. This source location (we can consider it as a subsystem), file or function view of activity is called an Adjoint Thread. For example, if we filter only subsystemA column in the trace above we get this table:


#	Source Dir	PID	TID	File Name	Function	Message
1	srcsubsystemA	2792	5676	file1.cpp	fooA	Message …
2	srcsubsystemA	2792	5676	file1.cpp	fooA	Message …
3	srcsubsystemA	2792	5676	file1.cpp	fooA	Message …
5	srcsubsystemA	2792	5680	file1.cpp	fooA	Message …
6	srcsubsystemA	2792	5676	file1.cpp	fooA	Message …
7005	srcsubsystemA	2792	5664	file1.cpp	fooA	Message …
10198	srcsubsystemA	2792	5664	file1.cpp	fooA	Message …
10364	srcsubsystemA	2792	5664	file1.cpp	fooA	Message …
10417	srcsubsystemA	2792	5664	file1.cpp	fooA	Message …
10420	srcsubsystemA	2792	5676	file1.cpp	fooA	Message …
10422	srcsubsystemA	2792	5680	file1.cpp	fooA	Message …
10587	srcsubsystemA	2792	5664	file1.cpp	fooA	Message …
10767	srcsubsystemA	2792	5680	file1.cpp	fooA	Message …
11126	srcsubsystemA	2792	5668	file1.cpp	fooA	Message …
11131	srcsubsystemA	2792	5680	file1.cpp	fooA	Message …
11398	srcsubsystemA	2792	5676	file1.cpp	fooA	Message …
11501	srcsubsystemA	2792	5668	file1.cpp	fooA	Message …
11507	srcsubsystemA	2792	5668	file1.cpp	fooA	Message …
11509	srcsubsystemA	2792	5664	file1.cpp	fooA	Message …
11513	srcsubsystemA	2792	5680	file1.cpp	fooA	Message …
11524	srcsubsystemA	2792	5668	file1.cpp	fooA	Message …
…	…	…	…	…	…	…

We can graphically view subsystemA as a braid string that “permeates the fabric of threads”:

We can get many different braids by changing filters, hence multbraiding. Here is another example of a driver source file view initially permeating two process contexts and four threads:


#	Source Dir	PID	TID	File Name	Function	Message
41	srcsysdriver	3636	3848	entry.c	DriverEntry	IOCTL …
80	srcsysdriver	3636	3896	entry.c	DriverEntry	IOCTL …
99	srcsysdriver	3636	3896	entry.c	DriverEntry	IOCTL …
102	srcsysdriver	3636	3896	entry.c	DriverEntry	IOCTL …
179	srcsysdriver	3636	3896	entry.c	DriverEntry	IOCTL …
180	srcsysdriver	3636	3896	entry.c	DriverEntry	IOCTL …
311	srcsysdriver	3636	3896	entry.c	DriverEntry	IOCTL …
447	srcsysdriver	3636	3896	entry.c	DriverEntry	IOCTL …
448	srcsysdriver	3636	3896	entry.c	DriverEntry	IOCTL …
457	srcsysdriver	2792	5108	entry.c	DriverEntry	IOCTL …
608	srcsysdriver	3636	3896	entry.c	DriverEntry	IOCTL …
614	srcsysdriver	3636	3896	entry.c	DriverEntry	IOCTL …
655	srcsysdriver	3636	3896	entry.c	DriverEntry	IOCTL …
675	srcsysdriver	3636	3896	entry.c	DriverEntry	IOCTL …
678	srcsysdriver	3636	3896	entry.c	DriverEntry	IOCTL …
680	srcsysdriver	3636	3896	entry.c	DriverEntry	IOCTL …
681	srcsysdriver	3636	3896	entry.c	DriverEntry	IOCTL …
1145	srcsysdriver	3636	4960	entry.c	DriverEntry	IOCTL …
1153	srcsysdriver	3636	4960	entry.c	DriverEntry	IOCTL …
1154	srcsysdriver	3636	4960	entry.c	DriverEntry	IOCTL …
…	…	…	…	…	…	…

The following diagram illustrates Adjoint Thread of Activity for a selected message:

Anchor Messages

When a software trace is lengthy, it is useful to partition it into several regions based on a sequence of Anchor Messages. We can determine the choice of them by Vocabulary Index or Adjoint Thread of Activity. For example, an ETW trace with almost 900,000 messages recorded during a desktop connection for 6 minutes can be split into 14 segments by the adjoint thread of DLL_PROCESS_ATTACH message (the message was generated by DllMain of an injected module, not shown in the trace output for formatting clarity):

# PID TID Time Message
24226 2656 3480 10:41:05.774 AppA.exe: DLL_PROCESS_ATTACH
108813 4288 4072 10:41:05.774 AppB.exe: DLL_PROCESS_ATTACH
112246 4180 3836 10:41:05.940 DllHost.exe: DLL_PROCESS_ATTACH
135473 2040 3296 10:41:12.615 AppC.exe: DLL_PROCESS_ATTACH
694723 1112 1992 10:44:23.393 AppD.exe: DLL_PROCESS_ATTACH
703962 5020 1080 10:44:42.014 DllHost.exe: DLL_PROCESS_ATTACH
705511 4680 3564 10:44:42.197 DllHost.exe: DLL_PROCESS_ATTACH
705891 1528 2592 10:44:42.307 regedit.exe: DLL_PROCESS_ATTACH
785231 2992 4912 10:45:26.516 AppE.exe: DLL_PROCESS_ATTACH
786523 3984 1156 10:45:26.605 powershell.exe: DLL_PROCESS_ATTACH
817979 4188 4336 10:45:48.707 wermgr.exe: DLL_PROCESS_ATTACH
834875 3976 1512 10:45:52.342 LogonUI.exe: DLL_PROCESS_ATTACH
835229 4116 3540 10:45:52.420 AppG.exe: DLL_PROCESS_ATTACH

Each region can be analyzed independently for any anomalies, for example, to look for the answer to a question why wermgr.exe was launched. We illustrate partitioning in the following schematic diagram:

It is also possible to make different trace segmentation by interleaving regions above with another set of Anchor Messages comprising of the adjoint thread of DLL_PROCESS_DETACH message:

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for A

Create new playlist

Sign In

Sign Up

Table of Contents for
A