Chapter 4

Synthesizing Knowledge from Software Development Artifacts

Olga Baysal*; Oleksii Kononenko†; Reid Holmes‡; Michael W. Godfrey†
* School of Computer Science, Carleton University, Ottawa, ON, Canada
† David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, Canada
‡ Department of Computer Science, University of British Columbia, Vancouver, BC, Canada

Abstract

When software practitioners make day-to-day design decisions about their projects, they are guided not only by their intuition and experience, but also by the variety of software artifacts that are available to them. This chapter describes how lifecycle models can be used to build a useful and intuitive model of these development artifacts. Lifecycle models capture the dynamic nature of how such artifacts change over time in a graphical form that can be easily understood and communicated. We show how lifecycle models can be generated, and we present two industrial case studies where we applied lifecycle models to assess a project’s code review process.

Keywords

Artifact lifecycle models

Patch lifecycles

Software development artifacts

Knowledge discovery

Decision-making

4.1 Problem Statement

The data is out there. The problem is making practical use of it.

It can be challenging, if not impossible, to process the myriad development artifacts that accrue in repositories, such as issue tracking and version control systems. The sheer size of the data, arriving in unrelenting streams, makes it impractical to process manually just to comprehend the current state of the project, let alone to seek historical trends. Help is needed to corral the information, organize it in a helpful and intuitive way, and provide an overview of development progress. With this in mind, we introduce a data pattern called the lifecycle model, which is a graph-based representation of specific properties within a development artifact and how they change over time.

Lifecycle models can be applied to any data that changes its state, or is annotated with new data, over time. For example, “issues” in issue-tracking systems often start their lives as OPEN, but are eventually RESOLVED or marked as WONTFIX. A lifecycle model for issues aggregates data to provide a graphical representation of how bugs flow through these three states. For example, the lifecycle model would capture the proportion of bugs that are reopened from a WONTFIX state, which might be interesting to a manager considering adjustments to the issue triage process. Such lifecycles are often prescribed by a project’s process model (if one is defined); extracting the actual lifecycle from the history thus enables a comparison between the defined process and the one followed in practice.
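As a minimal sketch, the aggregation described above can be computed directly from per-issue state histories. The state names follow the issue-tracking example; the function names and data layout are illustrative assumptions, not tied to any particular tracker:

```python
from collections import Counter

def build_lifecycle(histories):
    """Aggregate per-issue state histories into transition counts.

    `histories` is a list of state sequences, e.g.
    ["OPEN", "RESOLVED"] or ["OPEN", "WONTFIX", "OPEN", "RESOLVED"].
    """
    transitions = Counter()
    for states in histories:
        for src, dst in zip(states, states[1:]):
            transitions[(src, dst)] += 1
    return transitions

def transition_proportions(transitions):
    """Express each outgoing edge as a fraction of its source state's total."""
    totals = Counter()
    for (src, _), n in transitions.items():
        totals[src] += n
    return {(src, dst): n / totals[src] for (src, dst), n in transitions.items()}

histories = [
    ["OPEN", "RESOLVED"],
    ["OPEN", "WONTFIX"],
    ["OPEN", "WONTFIX", "OPEN", "RESOLVED"],  # a WONTFIX that was reopened
]
edges = build_lifecycle(histories)
props = transition_proportions(edges)
```

With these three toy histories, every WONTFIX issue that was reopened shows up directly as the weight of the WONTFIX → OPEN edge, which is exactly the kind of proportion a manager reviewing the triage process would read off the model.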

In this chapter, we apply lifecycle models to capture how the patch review process works within the Mozilla, WebKit, and Blink projects. We demonstrate how these models can expose interesting trends within individual projects, while at the same time being succinct enough to permit an analyst to easily compare traits between projects.

4.2 Artifact Lifecycle Models

Using metadata history, lifecycle models can be extracted for any data elements that change over time, such as the change status of issues, patches under review, or evolving lines of code. By examining how each artifact evolves, we can build a summary that captures common dynamic evolutionary patterns. Each node in a lifecycle model represents a state that can be derived by examining the evolution of an artifact.

4.2.1 Example: Patch Lifecycle

To model the patch lifecycle, e.g., for the Mozilla project, we first examine Mozilla’s code review policy and processes. Mozilla employs a two-tier code review process for validating submitted patches: review and super-review [1]. The first type of review is performed by a module owner or peers of the module; a reviewer is someone who has domain expertise in the problem area. The second type is called a super-review; these reviews are required if the patch involves integration or modifies core codebase infrastructure. We then extract the states a patch can go through and define the final states it can be assigned to.

Figure 4.1 illustrates a lifecycle model of a patch as a simple graph: the nodes show the various states a patch can go through during the review process, while the edges capture the transitions between lifecycle states. A transition represents a review event: a flag, together with its status, reported during the review process. For simplicity, we show only patches with the key code review flags, “review” (r) and “super-review” (sr), attached to them.1

Figure 4.1 Lifecycle of a patch.

The code review process begins when a patch is submitted and a review is requested; the initial transition is labeled “r? OR sr?,” i.e., a review or super-review has been requested. A patch can be assigned to one of three states: Submitted, Accepted, or Rejected. Once the review is requested (a flag contains a question mark “?” at the end), the patch enters the Submitted state. If a reviewer assigns “+” to a flag (e.g., “r+” or “sr+”), the patch is marked as Accepted; if a flag is reported with a status “–” (e.g., “r −” or “sr −”), the patch is Rejected.
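The flag-to-state mapping just described is mechanical, so it can be sketched in a few lines. This is an illustrative sketch under the flag conventions above, not Mozilla’s actual tooling:

```python
def patch_state(flag):
    """Map a review flag such as 'r?', 'sr+', or 'r-' to a lifecycle state."""
    if flag.endswith("?"):
        return "Submitted"
    if flag.endswith("+"):
        return "Accepted"
    if flag.endswith("-"):
        return "Rejected"
    raise ValueError("unrecognized flag: " + flag)

def patch_transitions(flags):
    """Turn a chronological sequence of flags on one patch into transitions."""
    states = [patch_state(f) for f in flags]
    return list(zip(states, states[1:]))

# A patch reviewed positively and then super-reviewed positively: the second
# "+" produces a self-transition on Accepted (two reviews of the same patch).
example = patch_transitions(["r?", "r+", "sr+"])
```

Running every patch’s flag history through `patch_transitions` and counting the resulting edges yields exactly the node and edge weights drawn in Figure 4.1.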

Note that both the Accepted and Rejected states permit self-transitions. These self-transitions, when taken together with the transitions between the Accepted and Rejected states, illustrate the double-review process. The double-review process takes place when a reviewer believes the patch can benefit from additional reviews, or when code modifications affect several modules and thus need to be reviewed by a reviewer from each affected module.

There are four possible terminal states for a patch:

 Landed—a patch meets the code review criteria and is incorporated into the codebase.

 Resubmitted—a patch is superseded by additional refinements after being accepted or rejected.

 Abandoned—a patch is not improved after being rejected.

 Timeout—a patch with review requests that are never answered.

By definition, the cumulative number of patches in the Landed, Resubmitted, Abandoned, and Timeout states equals the number of Submitted patches.

Each edge may be annotated with quantitative data, such as the time required for a patch to pass from one state to another, or the percentage of all patches that reach a certain state (e.g., how many patches are positively evaluated by reviewers).

4.2.2 Model Extraction

We now demonstrate the process of extracting patch lifecycle models. Lifecycle models can be generated as follows:

1. Determine the key states of the system or process (i.e., the events that can occur) and their attributes.

2. Define the necessary transitions between the states and specify an attribute for each transition.

3. Define the terminal outcomes (optional).

4. Define and gather qualitative or quantitative measurements for each state transition and, if present, for the terminal outcomes. In addition to these measurements, the time spent in a state, or in a transition to another state, can also be analyzed.
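The four steps above can be sketched end-to-end. Here a lifecycle model is built from a stream of timestamped state events; the event-tuple layout and field names are illustrative assumptions, and timestamps are in minutes:

```python
from statistics import median

def extract_model(events):
    """Build a lifecycle model from (artifact_id, state, timestamp) events.

    States and transitions (steps 1-2) are derived from the event stream;
    each edge carries a count and a median duration (step 4). Terminal
    states (step 3) simply appear as states with no outgoing edges.
    """
    by_artifact = {}
    for aid, state, ts in sorted(events, key=lambda e: (e[0], e[2])):
        by_artifact.setdefault(aid, []).append((state, ts))
    counts, durations = {}, {}
    for seq in by_artifact.values():
        for (s1, t1), (s2, t2) in zip(seq, seq[1:]):
            edge = (s1, s2)
            counts[edge] = counts.get(edge, 0) + 1
            durations.setdefault(edge, []).append(t2 - t1)
    return {e: {"count": c, "median_minutes": median(durations[e])}
            for e, c in counts.items()}

events = [
    (1, "Submitted", 0), (1, "Accepted", 60),
    (2, "Submitted", 0), (2, "Rejected", 30), (2, "Submitted", 90),
    (3, "Submitted", 0), (3, "Accepted", 120),
]
model = extract_model(events)
```

The returned dictionary maps each edge to its measurements, which is all that is needed to render a graph like Figure 4.1 with annotated edges.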

The pattern provides means to model, organize, and reason about the data or underlying processes otherwise hidden in individual artifacts. While lifecycle models can be modified or extended depending on the needs, they work best when the state space is well defined. Applying this pattern to complex data may require some abstraction.

4.3 Code Review

Code review is a key element of any mature software development process. It is particularly important for open source software (OSS) development, since contributions — in the form of bug fixes, new features, documentation, etc. — may come not only from core developers but also from members of the greater user community [2–5]. Indeed, community contributions are often the lifeblood of a successful open source project; however, the core developers must also be able to assess the quality of the incoming contributions, lest they negatively impact the overall quality of the system.

The code review process evaluates the quality of source code modifications (submitted as patches) before they are committed to a project’s version control repository. A strict review process is important for ensuring the quality of the system, and some contributions will be championed and succeed while others will not. Consequently, the carefulness, fairness, and transparency of the process will be keenly felt by the contributors.

Here, we want to explore whether code review in a project is “democratic,” i.e., whether contributions are reviewed “equally,” regardless of a developer’s previous involvement in the project. For example, do patches from core developers have a higher chance of being accepted? Do patches from casual contributors take longer to get feedback?

4.3.1 Mozilla Project

As we said earlier, Mozilla employs a two-tiered code review process for evaluating code modifications: reviews and super-reviews [1]. A review is typically performed by the owner of the module or a “peer”; the reviewer has domain expertise in the problem area. A super-review is required if the patch involves integration or modifies core Mozilla infrastructure. Currently, there are 29 super-reviewers [6] spread across all Mozilla modules, with 18 reviewers (peers) on the Firefox module alone [7]. However, any person with level 3 commit access — i.e., core product access to the Mercurial version control system — can become a reviewer.

Bugzilla users flag patches with metadata to capture code review requests and evaluations. A typical patch review process consists of the following steps:

1. Once a patch is ready, its owner requests a review from a module owner or a peer; the review flag is set to “r?”. If the patch owner decides to request a super-review, they may do so, and the flag is set to “sr?”.

2. When the patch passes a review or a super-review, the flag is set to “r+” or “sr+,” respectively. If it fails, the reviewer sets the flag to “r−” or “sr−” and provides an explanation of the decision by adding comments to the bug in Bugzilla.

3. If the patch is rejected, the patch owner may resubmit a new version of the patch that will undergo a review process from the beginning. If the patch is approved, it will be checked into the project’s official codebase.

4.3.2 WebKit Project

WebKit is an HTML layout engine that renders web pages and executes embedded JavaScript code. The WebKit project was started in 2001 as a fork of the KHTML project. Prior to April 2013, developers from more than 30 companies actively contributed to WebKit, including Google, Apple, Adobe, BlackBerry, Digia, Igalia, Intel, Motorola, Nokia, and Samsung. Google and Apple are the two primary contributors, submitting 50% and 20% of patches, respectively.

The WebKit project employs an explicit code review process for evaluating submitted patches; in particular, a WebKit reviewer must approve a patch before it can land in the project’s version control repository. The list of official WebKit reviewers is maintained through a system of voting to ensure that only highly-experienced candidates are eligible to review patches. A reviewer will either accept a patch by marking it “review +” or ask for further revisions from the patch owner by annotating the patch with “review −”. The review process for a particular submission may include multiple iterations between the reviewer and the patch writer before the patch is accepted (lands) in the version control repository.

4.3.3 Blink Project

Google forked WebKit to create the Blink project in April 2013 because it wanted to make larger-scale changes to WebKit to suit its own needs, changes that did not align well with the direction of the WebKit project itself. Several of the organizations that contributed to WebKit migrated to Blink after the fork.

Every Blink patch is submitted to the project’s issue repository. Reviewers on the Blink project approve patches by annotating them with “LGTM” (“Looks Good To Me,” case-insensitive) and reject patches by annotating them with “not LGTM.” In this work, we consider WebKit’s “review +”/“review −” flags and Blink’s “lgtm”/“not lgtm” annotations as equivalent. Since Blink does not have an explicit review request process (e.g., “review?”), we infer requests by adding a “review?” flag to a patch as soon as it is submitted to the repository. Since patches are typically committed to the version control system by an automated process, we define landed patches as those followed by the automated message from the “commit bot”; the last patch on an issue is likely the one that eventually lands in Blink’s source code repository. Committers can optionally perform a manual merge of patches into the version control system, although we do not consider these cases due to their infrequency.
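The inference rules above can be sketched as a small heuristic. The message-matching and the commit-bot signal are simplified assumptions for illustration, not the exact heuristics of the study’s tooling:

```python
def blink_states(messages, landed_by_commit_bot):
    """Infer lifecycle states for one Blink patch from its issue messages.

    Blink has no explicit 'review?' flag, so the Submitted state is
    implied at upload time. `messages` are the comments posted after the
    patch, in order; `landed_by_commit_bot` records whether the automated
    "commit bot" landing message was observed for this patch.
    """
    states = ["Submitted"]  # implicit review request on upload
    for msg in messages:
        text = msg.lower()
        if "not lgtm" in text:    # must be checked before plain "lgtm"
            states.append("Rejected")
        elif "lgtm" in text:
            states.append("Accepted")
    if landed_by_commit_bot:
        states.append("Landed")
    return states
```

Note the ordering of the two checks: since “not lgtm” contains “lgtm” as a substring, testing for the rejection annotation first is what keeps the two outcomes distinct.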

4.4 Lifecycle Analysis

We now apply lifecycle models to the code review processes to highlight how code reviews happen in practice. Here, the goal of creating the model is to identify the ways in which patches typically flow through the code review process, and also to identify exceptional review paths. We have extracted the code review lifecycle model for Mozilla Firefox (Section 4.4.1), WebKit (Section 4.4.2), and Blink (Section 4.4.3).

4.4.1 Mozilla Firefox

We modeled the patch lifecycle by examining Mozilla’s code review policy and processes and comparing them to how developers worked with patches in practice. To generate the lifecycle models, we extracted all review events from Mozilla’s public Bugzilla instance. We then extracted the states a patch can go through and defined the final states it can be assigned to.

All code committed to Mozilla Firefox undergoes code review. Developers submit a patch containing their change to Bugzilla and request a review. Reviewers annotate the patch either positively or negatively to reflect their opinion of the code under review. For highly impactful patches, super-reviews may be requested and performed. Once the reviewers approve a patch, it can be applied to the Mozilla source code repository.

We generated the model by applying the four steps given in Section 4.2.2:

1. State identification: Patches exist in one of three key states: Submitted, Accepted, and Rejected.

2. Transition extraction: Code reviews transition patches between the three primary states. A review request is annotated with “r?”; positive reviews are denoted with “r+”; negative reviews are annotated with “r−”. Super-review flags are prefixed with an “s” (e.g., “sr+”/“sr−”).

3. Terminal state determination: Patches can also end in one of four terminal states, which are defined by considering the entire history of patches submitted to resolve an issue. Landed patches are those that pass the code review process and are included in the version control system. Resubmitted patches are those a developer decides to further refine based on reviewer feedback. Abandoned patches are those the developer decided to abandon based on the feedback they received. Finally, the Timeout state represents patches for which a review was requested but never given.

4. Measurement: For this study we measured the number of times each transition occurred, along with the median time taken to transition between states.

Figure 4.2 illustrates the lifecycle model of the patch review process for core Mozilla contributors. Core contributors are those developers who submitted 100 patches or more during the studied period, while casual contributors are defined as those who wrote 20 patches or fewer [8]. Contributors who submitted more than 20 and fewer than 100 patches (12% of all contributors) were not included in this analysis.
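The partition of contributors described above can be stated directly. The thresholds come from the study; the function name is ours:

```python
def contributor_group(patch_count):
    """Classify a Firefox contributor by patches submitted in the study period."""
    if patch_count >= 100:
        return "core"
    if patch_count <= 20:
        return "casual"
    return "excluded"  # the middle band (21-99 patches) was not analyzed
```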

Figure 4.2 Mozilla Firefox’s patch lifecycle for core developers.

The lifecycle demonstrates some interesting transitions that might not otherwise be obvious. For instance, a large proportion of accepted patches are still resubmitted by authors for revision. We can also see that rejected patches are usually resubmitted, easing concerns that rejecting a borderline patch could cause it to be abandoned. We also see that very few patches time out in practice. From the timing data for core contributors (refer to Table 4.1) we see that it takes a median of 8.9 hours to get an initial “r+” review, while getting a negative “r−” review takes 11.8 hours.

Table 4.1

Median Time of a Transition (in Minutes) for Mozilla Firefox

Transition    Core Contributors    Casual Contributors
r? → r+             534                  494
r? → r−             710                 1024
r+ → r−             390                  402
r− → r+            1218                   15
sr? → sr+           617                  n/a
sr? → sr−          9148                  n/a

We also measured the time it takes for a patch to go from one state to another. Table 4.1 reports the median time (in minutes) each transition of the model takes. The transition “r? → r+” is a lot faster than “r? → r−,” suggesting that reviewers provide faster responses when a patch is of good quality. To our surprise, the fastest “r? → r+” transition is observed for casual developers. Our findings show that contributions from casual developers are less likely to receive a positive review; however, when they do, the median response time is about 8 hours (in contrast to 9 hours for core developers). Super-reviews, in general, are approved very quickly, within 8–10 hours. This finding conforms to Mozilla’s code review policy: super-reviewers are expected to respond within 24 hours of a super-review request. However, it takes much longer (4 to 6 days) for a super-reviewer to reject a patch that requires integration, as such patches often require extensive discussion with others. The “r− → r+” transition appears a lot slower for core developers, mainly because there is only one occurrence of this transition in the “casual” group.

Comparing the lifecycles for core (Figure 4.2) and casual contributors (Figure 4.3), we note that, in general, casual contributors have 7% fewer patches that get accepted or checked into the codebase and 6% more patches that get rejected. The proportions of patches from casual contributors that receive no response or are abandoned are higher by factors of 3.5 and 3.12, respectively. Review requests that time out are likely those directed to the wrong reviewers or to the “General” component, which has no explicit owner. If a review is requested from a default reviewer (a component owner), the patch is likely to receive no response due to the heavy load and long review queue of the default reviewer. Since contributors decide which reviewer to request an approval from, they may send their patch to the “graveyard” by asking the wrong person to review it. The process, by design, lacks transparency about reviewers’ queues.

Figure 4.3 Mozilla Firefox’s patch lifecycle for casual developers.

Moreover, casual contributors are more likely to give up on a patch that fails a review: 16% fewer patches are resubmitted after rejection. Unlike patches from core developers, rejected patches from the “casual” group do not get a second chance (0% on the “Rejected” to “Accepted” transition) and are three times more likely to receive a second negative response.

The results show that patches submitted by casual developers do not require super-reviews, as we found no super-review requests on these patches. This was unsurprising since community members who participate occasionally in a project often submit small and trivial patches [9].

Our findings suggest that patches from casual contributors are more likely to be abandoned by both reviewers and contributors themselves. Thus, these patches should likely receive extra care, both to ensure quality and to encourage future contributions from community members who prefer to participate in collaborative development on a less regular basis.

The results of our analysis of the code review process of Firefox — and comparing the lifecycle models for patch review between core and casual contributors — generated discussion and raised some concerns among the developers on the Mozilla Development Planning team [10]:

“… rapid release has made life harder and more discouraging for the next generation of would-be Mozilla hackers. That’s not good… So the quality of patches from casual contributors is only slightly lower (it’s not that all their patches are rubbish), but the amount of patches that end up abandoned is still more than 3x higher. :-( ”

[Gerv Markham]

“I do agree that it’s worth looking into the “abandoned” data set more carefully to see what happened to those patches, of course.”

[Boris Zbarsky]

4.4.2 WebKit

The lifecycle model can be easily modified according to the dataset at hand. For example, we have applied the pattern to study the code review process of the WebKit project [11]. The model of the patch lifecycle is shown in Figure 4.4.

Figure 4.4 WebKit’s patch lifecycle.

Since WebKit is an industrial project, we were particularly interested to compare its code review process to that of Mozilla, which is run in a more traditional open source development style. To do so, we extracted WebKit’s patch lifecycle (Figure 4.4) and compared it with the previously studied patch lifecycle of Mozilla Firefox [8] (Figure 4.2).

The patch lifecycle captures the various states patches undergo during the review process, and characterizes how the patches transition between these states. The patch lifecycles enable large data sets to be aggregated in a way that is convenient for analysis. For example, we were surprised to discover that a large proportion of patches that have been marked as accepted are subsequently resubmitted by authors for further revision. We can also see that rejected patches are usually resubmitted, which might ease concerns that rejecting a borderline patch could cause it to be abandoned.

While the sets of states in our patch lifecycle models of WebKit and Firefox are the same, WebKit has fewer state transitions; this is because the WebKit project does not employ a “super-review” policy. Furthermore, unlike in Mozilla, there are no self-edges on the “Accepted” and “Rejected” states in WebKit; Mozilla patches are often reviewed by two people, while WebKit patches receive only individual reviews. Finally, the WebKit model introduces a new edge between “Submitted” and “Resubmitted”: WebKit developers frequently “obsolete” their own patches and submit updates before they receive any reviews at all. One reason for this behavior is that submitted patches are automatically validated by the external test system, so developers can submit patches before they are ready for review to see whether any tests fail. Altogether, however, comparing the two patch lifecycles suggests that the WebKit and Firefox code review processes are fairly similar in practice.

4.4.3 Blink

Blink’s patch lifecycle is depicted in Figure 4.5, which shows that 40% of the submitted patches receive positive reviews, while only 0.3% of the submitted patches are rejected. Furthermore, a large portion of patches (40.4%) are resubmitted; this is because Blink developers often update their patches prior to receiving any reviews, a practice that, as with WebKit, enables the patches to be automatically validated. At first glance, outright rejection does not seem to be part of the Blink code review practice; the Rejected state appears to under-represent the number of patches that were actually rejected. In fact, reviewers often leave comments about patch improvements before the patch is accepted.

Figure 4.5 Blink’s patch lifecycle.

The model also illustrates the iterative nature of the patch lifecycle, as patches are frequently “Resubmitted.” The edge from “Submitted” to “Landed” represents patches that have been merged into Blink’s source code repository, often after one or more rounds of updates. Developers often fix “nits” (minor changes) after their patch has been approved, and land the updated version of the patch without receiving additional explicit approval. The lifecycle also shows that nearly 10% of patches are neglected by reviewers (the “Timeout” transition); “Timeout” patches in Blink can be considered “informal” rejects.

Comparing the patch lifecycle models of WebKit and Blink, we noticed that Blink has fewer state transitions. In particular, the edges from “Accepted” and “Rejected” back to “Submitted” are absent in Blink. Since Blink does not provide any explicit indication of a review request on patches, we had to reverse engineer this information for all patches by considering the timestamps on each patch in the series. We automated this process by assigning the “Submitted” label to a patch at the time it was filed in the issue repository.

Blink also accepts a smaller portion of patches (about 40% of all contributions, compared to WebKit’s 55% of submitted patches), but officially rejects less than 1%. “Timeouts” are more frequent for Blink patches than for WebKit ones. Blink appears to exhibit a larger portion of patches being resubmitted (a 10% increase compared to WebKit), including resubmissions after patches have been successfully accepted (16.7%).

Finally, a new edge is introduced between “Submitted” and “Landed”, accounting for those contributions that were committed to the code base without official approval from the reviewers; these cases typically represent patch updates. Both WebKit and Blink developers frequently “obsolete” their own patches and submit updates before they receive any reviews at all.

Comparing the two patch lifecycle models suggests that the WebKit and Blink code review processes are similar in practice; at the same time, it appears that Google’s review policy may not be as strict as the one employed by Apple on the WebKit project.

4.5 Other Applications

Software projects put considerable effort into defining and documenting organizational rules and processes. However, the prescribed processes are not always followed in practice. Lifecycle models provide practitioners with fact-based views about their projects (e.g., code review as described in this chapter). Supporting practitioners with better insights into their processes and systems, these models help them make better data-driven development decisions.

The lifecycle model is a flexible approach that can be used in a variety of software investigation tasks. For example:

 Issues: As developers work on issues, their state changes. Common states here would include NEW, ASSIGNED, WONTFIX, and CLOSED, although these may vary from project to project.

 Component/Priority assignments: Issues are often assigned to specific code components (and are given priority assignments). As an issue is triaged and worked upon, these assignments can change.

 Source code: The evolutionary history of any line or block of source code can be considered, capturing Addition, Deletion, and Modification. This data can be aggregated at the line, block, method, or class level.

 Discussions: Online discussions, e.g., those on StackOverflow, can have statuses such as CLOSED, UNANSWERED, REOPENED, DUPLICATE, and PROTECTED.

4.6 Conclusion

In this chapter we introduced the lifecycle model data pattern. This pattern can be used to capture both common and exceptional cases in the evolution of software artifacts. We have applied it successfully to code review in Mozilla, WebKit, and Blink and have received positive feedback from Mozilla developers as they investigated their own code review processes. Lifecycle models can be easy to generate and interpret, making them practical for use in a wide variety of data modeling applications.

Software developers and managers make decisions based on the understanding they have of their software systems. This understanding is built up both experientially and through investigating various software development artifacts. While artifacts can be investigated individually, being able to summarize characteristics about a set of development artifacts can be useful. In this chapter we proposed artifact lifecycle models as an effective way to gain an understanding of certain development artifacts. Lifecycle models capture the dynamic nature of how various development artifacts change over time in a graphical form that can be easily understood and communicated. Lifecycle models enable reasoning about the underlying processes and dynamics of the artifacts being analyzed. We described how lifecycle models can be generated and demonstrated how they can be applied to the code review processes of three industrial projects.

References

[1] Mozilla. Code review FAQ. https://developer.mozilla.org/en/Code_Review_FAQ, June 2012.

[2] Asundi J, Jayant R. Patch review processes in open source software development communities: a comparative case study. In: Proceedings of the 40th annual Hawaii international conference on system sciences, HICSS ’07. 2007:166c.

[3] Nurolahzade M, Nasehi SM, Khandkar SH, Rawal S. The role of patch review in software evolution: an analysis of the Mozilla Firefox. In: Proceedings of the joint international and annual ERCIM workshops on principles of software evolution (IWPSE) and software evolution (Evol) workshops. 2009:9–18.

[4] Rigby P, German D. A preliminary examination of code review processes in open source projects. Canada: University of Victoria; January 2006 Technical Report DCS-305-IR.

[5] Rigby PC, German DM, Storey MA. Open source software peer review practices: a case study of the Apache server. In: Proceedings of the 30th international conference on software engineering. 2008:541–550.

[6] Mozilla. Code review policy. http://www.mozilla.org/hacking/reviewers.html#The-super-reviewers, June 2012.

[7] MozillaWiki. Modules: Firefox. https://wiki.mozilla.org/Modules/Firefox, June 2012.

[8] Baysal O, Kononenko O, Holmes R, Godfrey MW. The secret life of patches: a Firefox case study. In: Proceedings of the 19th working conference on reverse engineering, WCRE ’12. 2012:447–455.

[9] Weissgerber P, Neu D, Diehl S. Small patches get in!. In: Proceedings of the 2008 international working conference on mining software repositories. 2008:67–76.

[10] Mozilla. The Mozilla development planning forum.

[11] Baysal O, Kononenko O, Holmes R, Godfrey MW. The influence of non-technical factors on code review. In: Proceedings of the 20th working conference on reverse engineering, WCRE ’13. 2013.


1 Other flags on a patch such as “feedback,” “ui-review,” “checkin,” “approval aurora,” or “approval beta,” as well as patches with no flags were excluded from the analysis.
