Integrated Approach for Prediction of Failures

So far we have analyzed sets of metrics—complexity metrics, code churn, code coverage, etc.—in isolation. In this section we address how the metrics may be combined to form stronger predictors of failures.

Figure 23-3 shows a simple network of engineers working together on different binaries. It also shows the code dependencies between the various binaries, and how both pieces of information are combined to integrate people, churn (in terms of edits/contributions), and dependencies into one network.

Figure 23-3. Socio-technical networks [Bird et al. 2009]

For Windows Vista we generate such a network integrating the people, churn contribution, and dependency information. Several social-network measures [Bird et al. 2009], detailed next, are computed for the Windows Vista social network (similar to the network in Figure 23-3).
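To make the idea of one integrated network concrete, here is a minimal sketch of how contribution and dependency data could be merged into a single edge set. It is not the tooling used for Windows Vista; the function name `build_network`, the node tags `"dev"` and `"bin"`, and the sample developers and binaries are all hypothetical and chosen only for illustration.

```python
def build_network(contributions, dependencies):
    """Merge contribution and dependency data into one directed edge map.

    contributions: iterable of (developer, binary, edits) triples (hypothetical format)
    dependencies:  iterable of (source_binary, target_binary) pairs (hypothetical format)
    Returns {(source_node, target_node): {"kind": ..., "weight": ...}}.
    """
    edges = {}
    for developer, binary, edits in contributions:
        key = (("dev", developer), ("bin", binary))
        # Repeated contributions by the same developer accumulate as churn.
        previous = edges.get(key, {}).get("weight", 0)
        edges[key] = {"kind": "contribution", "weight": previous + edits}
    for source, target in dependencies:
        key = (("bin", source), ("bin", target))
        edges[key] = {"kind": "dependency", "weight": 1}
    return edges


# Made-up sample data, not taken from the study.
contributions = [("alice", "kernel.dll", 12), ("bob", "kernel.dll", 3), ("bob", "ui.dll", 7)]
dependencies = [("ui.dll", "kernel.dll")]

network = build_network(contributions, dependencies)
nodes = {node for edge in network for node in edge}
```

Tagging each node with its kind keeps developers and binaries in one graph while still letting later analysis tell them apart.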

Ego network measures [Borgatti et al. 2002] are based on the neighborhood of a particular node. The node being evaluated is denoted ego, and its neighborhood includes ego, the set of nodes connected to ego, and the complete set of edges among these nodes.

Size

The number of nodes in the ego network

Ties

Number of edges in the ego network

Pairs

Number of possible directed edges in the ego network

Density

Proportion of possible ties that actually are present (Ties/Pairs)

Weak Components

Number of weakly connected components

Normalized Weak Components

Number of weakly connected components normalized by size, i.e., (Weak Components/Size)

Two Step Reach

The proportion of nodes that are within two hops of ego

Reach Efficiency

Two Step Reach normalized by size of the network (higher reach efficiency indicates that the ego’s primary contacts are influential in the network)

Brokerage

Number of pairs of nodes that are connected only by ego (thus ego acts as the sole broker for the pair)

Normalized Brokerage

Brokerage normalized by number of pairs

Ego Betweenness

Betweenness of ego within its ego network
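Several of the measures above can be sketched in a few lines of plain Python. This is an illustrative reading of the definitions, not the implementation used in the study. Two interpretations are assumptions: weak components are counted among ego's neighbors with ego removed (with ego included, the neighborhood is always a single component), and brokerage counts unordered pairs of neighbors that have no direct tie. The function name `ego_measures` and the sample graph are hypothetical.

```python
from itertools import combinations


def ego_measures(edges, ego):
    """Compute a few ego-network measures for `ego` in a directed graph.

    edges: iterable of (source, target) pairs.
    """
    edges = {(u, v) for u, v in edges if u != v}
    # Alters: nodes directly connected to ego in either direction.
    alters = {v for u, v in edges if u == ego} | {u for u, v in edges if v == ego}
    nodes = alters | {ego}
    # Edges of the ego network: both endpoints lie inside the neighborhood.
    ego_edges = {(u, v) for u, v in edges if u in nodes and v in nodes}

    size = len(nodes)
    ties = len(ego_edges)
    pairs = size * (size - 1)  # number of possible directed edges
    density = ties / pairs if pairs else 0.0

    # Undirected adjacency among alters only (assumption: ego removed).
    adj = {a: set() for a in alters}
    for u, v in ego_edges:
        if u in adj and v in adj:
            adj[u].add(v)
            adj[v].add(u)

    # Count weakly connected components among the alters with a depth-first search.
    seen, components = set(), 0
    for start in alters:
        if start in seen:
            continue
        components += 1
        stack = [start]
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(adj[node] - seen)

    # Assumption: a pair is "brokered" by ego when the two alters share no direct tie.
    brokerage = sum(1 for a, b in combinations(alters, 2) if b not in adj[a])

    return {
        "size": size,
        "ties": ties,
        "pairs": pairs,
        "density": density,
        "weak_components": components,
        "normalized_weak_components": components / size,
        "brokerage": brokerage,
    }


# Made-up graph: A is tied to B, C, and D; B and C are tied to each other.
sample_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D"), ("D", "E")]
measures = ego_measures(sample_edges, "A")
```

In this sample, node E is outside the ego network of A because it is two hops away, so the edge from D to E does not count toward Ties.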

The preceding social network measures are computed for the complete Vista network (which includes the developers, contributions, and dependencies). Prediction models are then built using these measures as input. We observe that the precision and recall of the models are much higher when the dependency network is also used for prediction. Similar results were observed for multiple versions of Eclipse (the open source IDE from IBM) [Bird et al. 2009]. Table 23-2 shows the precision and recall values along with the model fit and F-scores, which also indicate the increased ability of the socio-technical approach to predict failures. The “combined” model in Table 23-2 denotes a model built simply by adding together the “Contribution” network (people working together) and the “Dependency” network (between various pieces of code), which provides a contrast to the socio-technical network. Further reading in which code complexity, churn, and coverage metrics are combined to predict failures can be found in [Nagappan et al. 2006b].
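As a reminder of how precision, recall, and F-score relate, here is a minimal sketch using the standard definitions, with the F-score taken as the harmonic mean of precision and recall. The function name `precision_recall_f` and the counts are hypothetical and are not drawn from the study.

```python
def precision_recall_f(tp, fp, fn):
    """Return (precision, recall, F-score) from prediction counts.

    tp: failure-prone components correctly predicted as failure-prone
    fp: components wrongly predicted as failure-prone
    fn: failure-prone components the model missed
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Made-up counts for illustration only.
p, r, f = precision_recall_f(tp=30, fp=10, fn=20)
```

Note that when figures are averaged over several runs, the average F-score need not equal the F-score computed from the averaged precision and recall.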

Table 23-2. Overall socio-technical network model efficacy using different releases of Eclipse [Bird et al. 2009]

Release  Network          Precision  Recall  F-score  Nagelkerke R²
2.0      Dependency       0.667      0.779   0.705    0.532
2.0      Contribution     0.808      0.854   0.824    0.702
2.0      Combined         0.826      0.814   0.813    0.909
2.0      Socio-technical  0.755      0.859   0.800    0.747
2.1      Dependency       0.693      0.753   0.710    0.626
2.1      Contribution     0.675      0.780   0.719    0.607
2.1      Combined         0.755      0.777   0.758    0.805
2.1      Socio-technical  0.747      0.809   0.770    0.689
3.0      Dependency       0.631      0.737   0.673    0.494
3.0      Contribution     0.681      0.683   0.673    0.353
3.0      Combined         0.745      0.756   0.743    0.616
3.0      Socio-technical  0.767      0.777   0.769    0.600
3.1      Dependency       0.579      0.718   0.634    0.391
3.1      Contribution     0.639      0.646   0.629    0.295
3.1      Combined         0.693      0.796   0.735    0.689
3.1      Socio-technical  0.820      0.800   0.806    0.668
3.2      Dependency       0.698      0.780   0.731    0.495
3.2      Contribution     0.614      0.720   0.654    0.371
3.2      Combined         0.835      0.866   0.846    0.816
3.2      Socio-technical  0.793      0.784   0.785    0.572
3.3      Dependency       0.693      0.743   0.711    0.433
3.3      Contribution     0.725      0.669   0.688    0.356
3.3      Combined         0.742      0.780   0.754    0.686
3.3      Socio-technical  0.820      0.831   0.823    0.727
