The Systems

Software development projects vary along many dimensions: the domain of the project, the expertise of the developers, the size of the project, and the programming languages in which the source code is written, to name just a few. We want the projects we analyze to vary across at least some of these dimensions. We also want projects for which archival information about the development is available. Many projects meet these criteria. We chose to include three in our analyses: Evolution, Firefox, and Mylyn.

Table 21-1 shows how these projects vary, giving an overview of each in terms of the length of development, the primary language used for the source code, and the number of modules (see What Is a Module?), lines of code, and changes (see What Is a Change?). Only changes that were analyzed in our study are included in these counts.

Table 21-1. An overview of the three systems we analyzed

| Project | First release | Primary language | Modules | Approximate SLOC[a] | Changes |
|---|---|---|---|---|---|
| Evolution | December 2001 | C | 43 | 300,000 | 1,939 |
| Firefox | November 2004 | C++ | 45 | 4,000,000 | 11,710 |
| Mylyn | November 2006 | Java | 18 | 675,000 | 3,055 |

[a] Source lines of code were measured using cloc (http://cloc.sourceforge.net).

Aggregate statistics are a useful starting point for understanding the systems being analyzed, but they tell only part of the story. In particular, these statistics treat the archives of these systems as static, hiding the dynamics of how developers change each system over time. To better understand how development of the systems compares, we also examine the rate at which each system changes.

We can characterize the rate of change within each system by looking at the number of lines of code modified per day. Figure 21-1 shows the activity on each project over time. Each dot in these graphs corresponds to a single day of data recorded in the project’s code repository; the height of each dot on the vertical axis represents the number of lines of code that were committed to the repository on that day. We see that Evolution is characterized by a period of slow change initially, followed by a long period of sustained activity. Firefox changes less frequently and shows almost no activity after a certain point, as developers completed work on the 3.5 release branch and moved on to the next version of the project. Mylyn appears to grow in periodic bursts over time.

Figure 21-1. Lines of code modified per day throughout the history of Evolution, Firefox, and Mylyn
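The per-day measure plotted above can be sketched in a few lines of Python. This is only an illustration: the function name `lines_modified_per_day`, the (date, lines) record layout, and the sample commits are assumptions made for the example, not the tooling or data used in the study.

```python
from collections import defaultdict
from datetime import date


def lines_modified_per_day(commits):
    """Sum the lines modified by all commits that fall on the same calendar day.

    `commits` is an iterable of (day, lines_modified) pairs; this record
    layout is a hypothetical one chosen for the sketch.
    """
    totals = defaultdict(int)
    for day, lines in commits:
        totals[day] += lines
    # Return the days in chronological order, one entry per day with activity.
    return dict(sorted(totals.items()))


# Hypothetical sample data, not taken from any of the three projects.
commits = [
    (date(2006, 11, 1), 120),
    (date(2006, 11, 1), 30),
    (date(2006, 11, 3), 75),
]
per_day = lines_modified_per_day(commits)
```

Each key in `per_day` corresponds to one dot in a plot like Figure 21-1, and its value gives the dot's height. Days with no commits do not appear in the result.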

We can reach similar conclusions about the rates of change of these systems by looking at the cumulative sum of this data. Figure 21-2 is such a plot for Evolution, clearly showing the initial period of slow change and later sustained activity. Plots for the other systems look similar.

Figure 21-2. Cumulative sum of lines of code modified for Evolution
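A cumulative-sum series of this kind can be derived from daily totals with a running sum. The sketch below assumes the daily totals are already available as a chronologically ordered list; the helper name `cumulative_lines` and the sample numbers are hypothetical and chosen only to mimic a slow start followed by sustained activity.

```python
from itertools import accumulate


def cumulative_lines(daily_totals):
    """Return the running total of lines modified, one value per day."""
    return list(accumulate(daily_totals))


# Hypothetical daily totals: a quiet initial period, then sustained activity.
daily = [0, 5, 0, 10, 400, 350, 420]
cumulative = cumulative_lines(daily)
# cumulative == [0, 5, 5, 15, 415, 765, 1185]
```

In such a series, a flat stretch indicates little change and a steep stretch indicates heavy activity, which is why the two phases are easy to see in the cumulative view.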
