Chapter 12

Applying Software Data Analysis in Industry Contexts

When Research Meets Reality

Madeline Diep*; Linda Esker*; Davide Falessi*; Lucas Layman*; Michele Shaw*; Forrest Shull†
* Fraunhofer Center for Experimental Software Engineering, College Park, MD, USA
† Software Solutions Division, Software Engineering Institute, Arlington, VA, USA

Abstract

Software data analytics is key for helping stakeholders make decisions, and thus establishing a measurement and data analysis program is a recognized best practice within the software industry. However, practical implementation of measurement programs and analytics in industry is challenging. In this chapter, we discuss real-world challenges that arise during the implementation of a software measurement and analytics program. We also report lessons learned for overcoming these challenges and best practices for practical, effective data analysis in industry. The lessons learned provide guidance for researchers who wish to collaborate with industry partners in data analytics, as well as for industry practitioners interested in setting up and realizing the benefits of an effective measurement program.

Keywords

Practical Software Measurement

Data Analysis

Software Best Practices

Software Industry Challenges

12.1 Introduction

Software data analytics programs are founded upon the measurement of software products, processes, and organizations. Measurement is the “act or process of assigning a number or category to an entity to describe an attribute of that entity” [1]. Measurement allows us to build models or representations of what we observe so we can reason about relationships in context [2]. Measurement plays an important role in a number of analytical applications, from forecasting the cost of multi-billion dollar government defense systems to identifying faulty components from runtime execution logs. Measurement is crucial for process improvement, cost and effort estimation, defect prediction, release planning, and resource allocation.

As software data analytics matures as a discipline, the basic challenges remain of quantifying the resources, processes, and artifacts involved in software development. Practical software measurement, which serves as the basis of software data analysis, is challenging to execute. A well-intentioned developer who says, “We should measure the effort we spend on this project,” has just exposed her/himself to a variety of challenges. Management must be convinced that measurement and analytics activities will be worth the cost. Instrumentation must be put in place to collect raw data. Raw measurement data must be cleaned for analysis. Developers may not like their work effort being scrutinized. The initial measurements collected may not provide the whole story of effort spent. There will be disputes, negotiations, and consequences regarding the results.

Our goal in this chapter is to provide best practices and lessons learned for effective software data analysis in the software industry, drawn from Fraunhofer’s 15 years of measurement and data analysis experience in the software industry.1 We provide practical advice both for researchers who need to know the real-world constraints of measurement in industry, and for practitioners interested in setting up a measurement program. In Section 12.2, we provide a brief background of our Fraunhofer team and our credentials in software measurement, together with a number of sources for the curious reader interested in implementing their own measurement program.

In Section 12.3, we discuss challenges and lessons learned around six key topics that must be considered when implementing a measurement program in industry. We present these issues in rough chronological order in which they occur during the lifecycle of implementing an applied software measurement and analytics program. This lifecycle roughly corresponds to a traditional software development lifecycle: gathering stakeholder requirements, formalizing measures, implementing data collection and analysis, and communicating results. The six topics are:

1. Stakeholders, requirements, and planning: the groundwork for a successful measurement program—Obtaining stakeholder buy-in, goal setting, and planning are the critical beginnings of an effective measurement program.

2. Gathering measurements—how, when, and who—Addressing the technical and organizational challenges of gathering measurement data, which extend beyond simply gaining access to data.

3. All data, no information—Facing the challenges of missing data, low quality data, or incorrectly formatted data.

4. The pivotal role of subject-matter expertise—Using qualitative input to understand the data and to interpret the results for practical software data analytics.

5. Responding to changing needs—Responding to changing goals, changing directions from consumers, and technical and budgetary challenges that occur during the lifetime of a project.

6. Effective ways to communicate analysis results to the consumers—Presenting the results to decision-makers and packaging the findings as reusable knowledge.

As we discuss these topics, we provide concrete examples of the challenges encountered and techniques used to overcome those challenges from past experiences in implementing measurement programs in industry. Each topic is punctuated with several short challenges and recommendations for the reader. Finally, in Section 12.4, we summarize the takeaways from our chapter and highlight several open issues in applied software measurement and analysis.

12.2 Background

The background for this chapter consists of four main aspects: our experience in software measurement, terminology, our empirical methods, and our high-level approach to measurement. Each aspect is described in its own subsection below.

12.2.1 Fraunhofer’s Experience in Software Measurement

For 15 years, our team at the Fraunhofer Center for Experimental Software Engineering has performed software data analysis for government and commercial customers to provide actionable conclusions for key decision-makers. These analyses have covered all stages of product and service development, from proposal definition and requirements analysis to implementation, test, and operations. Fraunhofer has implemented measurement for projects of many types, from small web development projects in startup companies to safety-critical systems-of-systems in government agencies. In small commercial projects, we have applied quantitative software data analyses to help organizations improve their process maturity level. In safety-critical, high-maturity contexts, we assisted government civil servants by using software data analytics to evaluate the progress of contractors and to quantify risk. Fraunhofer scientists and engineers have authored 40–50 publications on applied and theoretical software measurement; delivered software measurement keynotes; edited two measurement books [3, 4]; received multiple awards from NASA for measurement-based research and program support; and assisted a commercial client in achieving CMMI®2 Level 5 maturity [5].

Throughout this chapter, we draw on our practical experiences applying software measurement to government and industry projects. Table 12.1 summarizes 11 major projects from which we draw many of the measurement challenges and lessons learned in this chapter. Each project is described via six main characteristics: domain of the project, type of project, size of the team, lines of code, the specific phases covered by the measurement program, and the duration of the project.

Table 12.1

Excerpt of Project Characteristics

| Project Domain | Project Type | Team Size | Lines of Code | Lifecycle Phases Covered by Metrics | Measurement Duration |
|---|---|---|---|---|---|
| Aerospace | Maintenance | Very large (100+) | 1M+ | Implementation | 3+ years |
| Control | Greenfield | Very large (100+) | 1M+ | Implementation | 3+ years |
| Aerospace | Greenfield | Very large (100+) | 1M+ | Design | <1 year |
| Aerospace | Both | Very large (100+) | 1M+ | Test, Operations | 3+ years |
| Military health | Maintenance | Very large (100+) | 1M+ | DoD 5000—all aspects | 1-3 years |
| Web applications | Both | Large (30-100) | 100K-500K | All | 3+ years |
| Telecommunications | Maintenance | Large (30-100) | 1M+ | Implementation | 1-3 years |
| Software development | Maintenance | Very large (100+) | 1M+ | Implementation | 1-3 years |
| Oil company | Maintenance | Large (30-100) | 100K-500K | Operations | 1-3 years |
| FFRDC | Maintenance | N/A | N/A | Initiation | <1 year |
| Aerospace | Both | Small-very large | 100K-500K | All | 3+ years |


12.2.2 Terminology

Although there are other standards for measurement terminology [6], throughout this chapter we will use the following terminology, which we have adapted from the IEEE Standard for a Software Quality Metrics Methodology (IEEE Std 1061-1998) [1]:

 Metric—A function whose inputs are software data and whose output is a single numerical value that can be interpreted as the degree to which that software possesses a given attribute. Examples of metrics include Lines of Code (LOC), defects/LOC, and person-hours.

 To measure—(a) a way to ascertain or appraise value by comparing it with a norm; (b) to apply a metric.

 Measurement—(a) the act or process of assigning a number or category to an entity to describe an attribute of that entity; (b) a figure, extent, or amount obtained by measuring, e.g., we use SonarQube™ (i.e., the act) to determine LOC on our projects.

 Metric value—a metric output or an element that is from the range of a metric, e.g., the metric value for LOC is 520.

12.2.3 Empirical Methods

Ad hoc and opportunistic “measurement for the sake of measurement” rarely yields useful results. Data that are collected merely for convenience suffer from data quality issues and rarely correlate with an organization’s overall quality improvement goals. As a result, the effort required to transform such data into useful information is likely to be substantial. Further, collected data that do not relate to an improvement goal are often viewed as wasted effort. Undirected measurement is harmful in the long term: not only are the resources spent gathering and analyzing data wasted without yielding apparent benefit, but measurement programs in general come to be viewed as a waste of resources.

In response to this phenomenon, many approaches and paradigms have been proposed to make measurement programs more systematic and formal. Fraunhofer has initiated and leveraged several such approaches to support applied software measurement with industry and government partners. Our approaches are goal-directed, where the focus is placed on identifying goals or objectives, using the goals and objectives to systematically derive information needs, and collecting the necessary data to provide the information. Fraunhofer has extensive experience applying several measurement-based approaches and methods, including:

1. The Goal Question Metric (GQM) approach [7] provides mechanisms for defining measurement goals, refining goals into specifications for data collection, and analyzing and interpreting the collected data with respect to the formulated goals. Originally formulated for use at NASA’s Software Engineering Laboratory [8], the GQM approach has been applied in many domains, including aeronautics, telecommunications, the oil industry, defense, and medicine. A minimal illustration of this goal-to-metric refinement appears in the sketch after this list.

2. GQM+Strategies™ (GQM+S) [9] is an extension to the GQM approach that supports alignment among goal definition, strategy development, and measurement implementation across the various levels of the organization. This extension enables integration of measurement across the organization. With GQM+Strategies, organizations make explicit how strategies implemented at lower levels of the organization support its highest-level business goals, and how measurements collected at the lower levels are used to track achievement of those business goals. For example, a top-level business goal of a commercial software company could be to “increase customer satisfaction,” supported by a “perform effective code reviews” strategy in its technical division.

3. Quality Improvement Paradigm (QIP) [10] is a six-phase process for continuous organizational improvement that draws from the knowledge and experience gained executing individual projects. QIP consists of two cycles: (1) the organization-level cycle consists of phases for characterizing the organization, setting improvement goals, choosing processes to be implemented on projects, analyzing results, and packaging the experience for future organizational use; (2) the project-level cycle consists of phases for executing the selected process, analyzing results at the project scope, and providing feedback on the process.

4. Experience Factory (EF) [11] is a conceptual infrastructure supporting QIP for synthesizing, packaging, and storing work products and experiences provided by the projects as “reusable experience,” and supplying the experience to (future) projects on-demand.
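To make the GQM goal-to-metric refinement concrete, the following minimal sketch captures an illustrative goal, its questions, and candidate metrics as a simple data structure. The goal, questions, and metric names are hypothetical examples, not taken from any of the projects, methods, or tools described in this chapter.

```python
from dataclasses import dataclass, field

@dataclass
class Metric:
    name: str
    unit: str

@dataclass
class Question:
    text: str
    metrics: list = field(default_factory=list)

@dataclass
class Goal:
    # GQM goal template: analyze <object> for <purpose> with respect to <focus>
    # from the <viewpoint> in the <context>.
    object: str
    purpose: str
    focus: str
    viewpoint: str
    context: str
    questions: list = field(default_factory=list)

# Illustrative goal: improve the defect detection effectiveness of code reviews.
goal = Goal(
    object="code review process",
    purpose="improve",
    focus="defect detection effectiveness",
    viewpoint="quality manager",
    context="technical division",
    questions=[
        Question("What fraction of defects are found in review?",
                 [Metric("review defects / total defects", "ratio")]),
        Question("How much effort does a review cost?",
                 [Metric("review effort", "person-hours")]),
    ],
)

for q in goal.questions:
    print(q.text, "->", [m.name for m in q.metrics])
```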

12.2.4 Applying Software Measurement in Practice—The General Approach

In this section, we briefly discuss Fraunhofer’s general approach to implementing a measurement program with an industry or government partner. The steps of this process reflect, as we will discuss in Section 12.3, that many of the challenges facing an industry measurement program are not matters of processing or analyzing data, but rather of working with people and organizations.

As exemplified in the QIP, our general approach for measurement is comprised of three phases performed iteratively: (1) requirements gathering; (2) metrics planning and formalization; and (3) metric program execution consisting of implementation, interpretation, communication, and response (Figure 12.1).

Figure 12.1 General applied software measurement approach.

In the requirements gathering phase, we identify the relevant stakeholders and elicit their business needs. Through the elicitation process, the available assets (e.g., existing data, process, and insight) as well as constraints and limitations (e.g., data availability and access, personnel engaged in measurement, etc.) are discovered. We also obtain the stakeholders’ commitments by defining their roles and responsibilities in the measurement program.

In the planning and formalization phase, we articulate the business needs as measurement goals—specifying the purpose, object, focus, and context of the measurement. We also outline, using GQM, how the measures shall be analyzed and interpreted against the goal. We use the measurement goals, constraints, and limitations to define a measurement plan with specific metrics to gather, and the process (who, when, where, etc.) for gathering them. Formalizing the measurement plan includes standardizing the vocabulary adopted in the measurement program, which alleviates misunderstandings when dealing with a heterogeneous set of stakeholders.

The execution phase consists of four main activities: (1) implementation; (2) analysis and interpretation; (3) communication; and (4) response. In implementation, the measurement plan is executed and data is collected. During data gathering, unanticipated changes and/or roadblocks may occur and need to be addressed. Next, the gathered data is analyzed and interpreted with respect to the business goals with the help of Subject Matter Experts (SMEs). Results are then communicated to the relevant stakeholders in an easy-to-understand format. Finally, we gather the organization’s responses to the measurement program findings to define organizational improvement activities as well as evaluations and improvements to the measurement program. New business needs may be identified, and the measurement process may be repeated.

In the remainder of this chapter, our discussion of challenges and lessons learned follows this progression from measurement requirement gathering, to formalization, to the many phases of implementation.

12.3 Six Key Issues when Implementing a Measurement Program in Industry

12.3.1 Stakeholders, Requirements, and Planning: The Groundwork for a Successful Measurement Program

Whether you are applying software analyses in commercial or government settings, in large or small organizations, the measurement process begins with requirements gathering to understand the data analyses that the consumers, or end users, desire. As with any requirements-gathering process, there will be roadblocks along the way. This section highlights specific examples of challenges in three primary areas: stakeholder relationships, goal setting, and measurement program planning.

The first step in a successful measurement program is having a sponsor, i.e., a senior-level representative in the organization who funds the activities, recognizes the value of the measurement program, and communicates its importance to the organization at large. The sponsor may also act as the champion, who leads the measurement program initiative, assigns and motivates personnel, and ensures the measurement plan is managed and tracked to achieve the stated goals. In some cases, potential sponsors or champions may need to be convinced of the value of a measurement program. When doing so, one should focus on the general benefits of measurement to any organization:

 Understanding the business—Data collected from a measurement program can be used to build organizational baseline models to gain knowledge about the organization.

 Managing projects—Project management is supported by measurement, whether planning and estimating, tracking actual progress and cost versus estimates, or validating process/product models.

 Better prediction—Creation of process/product models through data collection improves the predictability of activity and decision-making within the organization.

 Guiding improvement—Measures and reports help to increase understanding by allowing users to analyze and assess the performance of processes and products and generate ideas for improvement.

The existence of both a sponsor and champion becomes critical to overcoming numerous challenges of implementing a measurement program, such as gaining buy-in from customer and supplier stakeholders, prioritizing measurement goals, obtaining data from project teams or subcontractors, attaining participation from subject matter experts, effectively communicating results to stakeholders, and more. As in software development project teams, paying attention to how well measurement program roles work together is extremely important [12].

From the trenches

Implementation of measurement programs is a multifaceted undertaking, particularly for initiatives that are broad and affect many different groups in the organization. These programs are complex organizational change efforts that require effective work, political, and cultural systems to achieve success. In a process improvement/measurement program at a commercial organization, both the sponsor and the champion were clearly identified in the program plan; however, two challenges occurred that ultimately resulted in the program ending without success. First, the champion was not positioned in the organization to influence change in all affected parts of the organization. Second, there was a struggle between the sponsor and champion to keep program goals aligned, concrete, and explicit. On several occasions, non-explicit or continually changing goals confounded the activities of the program, undermining focused effort and measurement success. This outcome highlighted the importance of the sponsor and champion working closely together, including having the sponsor help the champion align the organization’s differing factions and sub-interests to ensure an environment where change is a priority and can take place.

Measurement stakeholders who will provide data, and subject matter experts who will interpret measurements must also be identified early so that they understand the importance of their contributions to the measurement effort. For example, in large acquisition projects, suppliers provide important measurement data to the acquirers so that overall program status can be monitored. In these situations, gaining buy-in from the supplier stakeholders who provide the data is essential. If possible, encourage the sponsor to build data collection requirements into the project requirements.

Stakeholder time is valuable, and measurement is often viewed as an overhead activity. Stakeholder commitment can be easier to obtain if the sponsor communicates the importance of the measurement program and incentivizes participation. When engaging stakeholders, explain the benefits the data and analyses will bring to the project and organization, rather than using metrics as a means to evaluate individuals. Maintaining the goodwill of your stakeholders throughout a measurement program is essential.

“Setting goals is the first step in turning the invisible into the visible.”

– Tony Robbins

Aligning overarching business goals with individual team or organizational unit objectives is critical to a successful measurement program. For example, at a small business client of Fraunhofer’s, the measurement sponsor wanted to increase product quality. The company’s marketing group measured product quality using a customer satisfaction survey, whereas the software group measured software quality by post-release defects. Goals were set using different metrics within each division; however, both sets of metrics and division goals contributed to the overall goal of improving product quality. Aligning business goals with individual team or group goals can be difficult when eliciting measurement requirements with stakeholders. To make this problem easier, we apply the GQM+Strategies approach, which explicitly defines and aligns business goals and group strategies for achieving these goals into an integrated measurement program.

The measurement program must also be tailored to the needs of individual consumers (e.g., safety engineers, managers, technical staff, etc.). For example, managers on different levels have different needs. Project managers need to monitor and control their projects. Program managers need ways to manage a portfolio of projects, and higher-level managers need ways to manage the business based on even higher-level indicators. You cannot plan on a standard set of analyses that will be applicable to all stakeholders.

Consumers of the data analyses may find it difficult to express their desired goals or objectives for measurement and analyses especially when objectives are presented in abstract form, for example, quality objectives such as maintainability and portability. It may be helpful to define what cannot or should not happen, which can then be translated into measurable goals. When eliciting goals, be specific, “speak” in consumers’ language, and, if possible, draw from data already available to ground the measurement objectives in relatable terms. In general, consumers will find it easier to respond to concrete ideas and feedback. We recommend developing a straw man of the measurement requirements using available documentation, including business goals, process documentation, organization website, and information collected at prior meetings, to make the limited time available with stakeholders as productive as possible.

When setting measurement goals, we recommend focusing on mature processes that have been institutionalized across the organization. Focusing on institutionalized processes has several benefits: (1) raw data to measure is more likely to exist; (2) the organization may already have baseline performance measures established; (3) the metrics can be applied to a broader set of projects within the organization; and (4) the measurements can be compared across projects for useful insights. However, while measurement with institutionalized processes is ideal, less mature organizations may not have this luxury. In these situations, measurement goals formulated around product quality may be the best starting point, since usually data on product or service quality are readily available and improved product or service quality goal(s) are valued by most organizations even when business goals are not well defined or explicitly communicated.

Goal development should ultimately result in a measurement plan. The measurement plan becomes the vehicle to capture measurement-related decisions, stakeholder involvement, measurement analysis needs, and measurement activities and risks. A measurement plan must have well-specified individual metrics and well-defined procedures for data collection, extraction, modification, aggregation, analysis, and reporting. Furthermore, the plan is a basis for allocating the resources needed for the measurement program. As mentioned earlier, measurement is often viewed as an overhead or assurance activity, rather than a contributor to product development. Thus, defining the resource needs from the organization is essential for project planning and for ensuring that stakeholders participate in measurement activities. The measurement plan should include regular meetings with stakeholders to ensure that the measurement results are not just a deliverable to be filed away—an invisible measurement program is no better than a non-existent one [11].

In this section, we have discussed some of the key issues when initiating a measurement program in industry. Determining consumers’ needs for measurement analyses is one of the first steps in applying software analytics in industry contexts. Doing so requires attention to stakeholder relationships, goal-setting, and measurement planning. The list below summarizes some of the key lessons learned from our experiences in laying the groundwork for a successful measurement program:

 Establish strong relationships with your sponsors and champion to leverage their support.

 Obtain and maintain buy-in from key stakeholders, including suppliers of data.

 Align consumer needs with the business goals and objectives of the organization.

 Understand measurement goals, questions, and metrics for each stakeholder.

 Tailor planned measurement analyses based on consumer needs.

 Explicitly capture the measurement goals, resources needed, and stakeholder involvement in a measurement plan.

12.3.2 Gathering Measurements—How, When, and Who

“Data! Data! Data!” he cried impatiently. “I can’t make bricks without clay.”

Sir Arthur Conan Doyle, The Adventure of the Copper Beeches

Once the goals of the organization are understood, the next challenge is to get the information for the analyses. Gathering data and other information for analysis can be expensive and time-consuming [13]. The needs of stakeholders often compete for priority when deciding which data to gather, how to collect it, how and where the data should be stored, who is responsible for the collection and integrity of the data, and how the information should be aggregated and reported [14]. Key considerations are automation and tools, establishing access to data, and designing the data storage that supports data collection as well as analysis needs.

A strong automated metrics infrastructure makes analysis easier. Automating the measurement process enables rapid feedback and improvement, and also reduces collection costs.

For even greater efficiency, an organization should leverage data collected by the tools a project already uses, whether for project management, engineering, software development, or testing activities. These data are generated as a result of the natural process of work, and often minimal effort is needed to collect them. For example, a project manager can integrate automatic metrics collection using the project’s build management tools. These tools provide many important metrics on code size, churn (lines added, deleted, and modified), complexity, etc. Thus, data collection can be a matter of copying comma-separated values (CSV) files from a server. Projects can also use an effort estimation tool that stores data in a database as a valuable source of data to which researchers can connect and extract data with no burden on the project team members. By leveraging the output of existing tools and databases, we are able to accomplish more with our analyses in a shorter amount of time.
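As a minimal illustration of this kind of low-burden collection, the sketch below reads a hypothetical CSV export from a build or static-analysis tool and rolls the per-module metrics up into project-level totals. The file name and column names are assumptions for the example, not those of any particular tool.

```python
import csv
from pathlib import Path

# Hypothetical export copied from the build server; column names are assumptions.
METRICS_FILE = Path("build_metrics.csv")

def load_metrics(path):
    """Read per-module metric rows (module, loc, churn, complexity) from a CSV export."""
    with path.open(newline="") as f:
        return list(csv.DictReader(f))

def summarize(rows):
    """Roll per-module rows up into simple project-level totals."""
    total_loc = sum(int(r["loc"]) for r in rows)
    total_churn = sum(int(r["churn"]) for r in rows)
    return {"modules": len(rows), "loc": total_loc, "churn": total_churn}

if METRICS_FILE.exists():
    print(summarize(load_metrics(METRICS_FILE)))
```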

As beneficial as they are, data collection tools are not a “silver bullet” and can also be a hindrance. For example, on a large government project, the prime contractor decided to use a well-known commercial off-the-shelf (COTS) tool to collect data on work completed. The tool made it easy for teams to enter their data and to keep the data private to the specific teams. For basic analyses and predefined roll-ups or aggregations, it also worked well. However, the tool’s ad hoc analysis and reporting capabilities were very limited. When situations changed and the project or government needed to perform other analyses, extracting the data from the tool’s proprietary data stores or reformatting the data within the tool was difficult or impossible. As a result, on subsequent projects we have recommended that the project make the government’s/client’s ability to extract and analyze data independently a priority.

For many projects, one tool or process cannot perform all the data collection and analyses needed. Furthermore, large programs often engage subcontractors or multiple project teams, each with their own processes and tools. To ensure that the data is useful, it is important that the same types of data are collected and that the data share the same semantics. Thus, data must often be imported, processed, and transformed from many disparate files and tools, and merged before it is suitable for analysis. This collation process may be manual, semi-automatic, or automatic; in any case, it often requires significant effort to implement. The cost of collating such data is often overlooked, but a strong up-front measurement plan can help avoid this pain point.
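The following sketch illustrates the collation step on two hypothetical subcontractor defect exports that use different field names and severity scales; the schemas and mappings are invented for the example.

```python
# Minimal sketch of collating defect data from two (hypothetical) subcontractor
# exports into one common schema with one severity vocabulary.
TEAM_A = [{"id": "A-1", "sev": "1", "opened": "2024-01-10"}]            # "1" means critical
TEAM_B = [{"key": "B-7", "priority": "Blocker", "created": "2024-01-12"}]

SEV_MAP_A = {"1": "critical", "2": "major", "3": "minor"}
SEV_MAP_B = {"Blocker": "critical", "Major": "major", "Minor": "minor"}

def normalize_a(rec):
    return {"defect_id": rec["id"], "severity": SEV_MAP_A[rec["sev"]], "opened": rec["opened"]}

def normalize_b(rec):
    return {"defect_id": rec["key"], "severity": SEV_MAP_B[rec["priority"]], "opened": rec["created"]}

merged = [normalize_a(r) for r in TEAM_A] + [normalize_b(r) for r in TEAM_B]
print(merged)  # one schema, same semantics, ready for analysis
```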

Access to data must also be considered when planning the research and analyses. Access to some data may be impossible for both technical and organizational reasons, and these barriers are difficult to overcome. An individual or organization may hide data or not allow access to it because it could make them look bad; the organization may not consider the data worth their time to collect; or stakeholders may simply distrust the researchers or the benefits of the analyses. On one project, we could not perform all of the analysis needed because we could not obtain the raw data, even though it existed somewhere electronically. The organization cited security concerns, and it required extra time and upper management intervention to resolve the stalemate. Therefore, as much as possible, plan for access to data and tools as the project is being initiated to avoid delays or the inability to perform the work later. These issues are not easy to overcome, and they require attacking the problem from viewpoints that include technical access as well as political, social, and ethical concerns (e.g., establishing trust among researchers and stakeholders and formalizing non-disclosure agreements).

Data analytics on large government projects also pose some unique situations and large-project-specific challenges. As explained in the following example from the trenches, the decisions made on how measurement data is stored can derail even the best intentions for a good measurement program.

From the trenches

On large government projects, measurement and analysis programs are often required by contract. When the measurement program is required rather than desired by the organization, the project can focus more on meeting the letter of the data collection and analysis requirements than on making the processes useful and efficient. On one large government contract, the prime contractor was required to keep a variety of data in an electronic data store. Unfortunately, the requirement did not specify what format or what analyses needed to be performed on this data. As a result, the contractor stored all the data and analyses as .pdf files. These .pdf files contained text documents, PowerPoint slides, and various other products developed and used by the project. It was impossible for researchers or the government to subsequently use the stored data for any further analysis without manually re-entering all the tables and numbers from the text reports. Obviously, this metrics repository was not useful and served no purpose. We now recommend that all projects we work with store project data in an electronic format that facilitates data aggregation and further electronic analysis.

In summary, lessons learned related to gathering measurements include:

 Automate data collection and transformation wherever possible:

 Leverage existing data from other tools used by a project while keeping the goals in mind.

 Be prepared to merge data from many sources, if necessary.

 Accessing data can be a difficult obstacle to overcome; it is important to plan ahead for data access.

 Make sure the data collected and stored is usable efficiently/electronically for analyses.

12.3.3 All Data, No Information—When the Data is not What You Need or Expect

“The goal is to turn data into information, and information into insight.”

Carly Fiorina

Once the data is accessible, the next step is to apply the metrics and gather measurements. When dealing with any form of raw data, there will always be issues of missing, incomplete, or incorrect data. However, these are just some of the issues that have to be addressed.

Often, when people think of software measurement, they think of code metrics. Yet much of industry is focused on process improvement, and thus needs to quantify and understand its current processes and not just the process results (e.g., the resulting piece of software). When an organization’s goals include process improvement, measuring the process is inevitable. Processes, unlike products, are much harder to define. For instance, while we have numerous methods for measuring the size of a software product, what is an equivalent metric for the size or scale of a process? To make something measurable implies that the object of measurement has some semi-rigorous definition to enforce measurement consistency. Thus, measuring a process can be a catalyst for defining the process. For example, on several of our projects, the client was interested in assessing the quality of their hazard analysis process. However, when we looked at the process artifacts (hazard reports), there were missing data, out-of-date information, a number of different formats, and inconsistent terminology. On another large defense project, there were no hazard reports to be measured even though, in management’s eyes, they existed. Trying to measure the process artifacts thus revealed a number of process risks, which could then be communicated to project management. Visibility into these risks was then the catalyst for process improvement initiatives. Ultimately, the measurement program provided the added benefit of helping to define the hazard analysis process and its expectations in the organization.

While “code” is the most recognizable process artifact, requirements documents, design diagrams, operating manuals, task descriptions, and other process artifacts are strong candidates for measurement. In our experience, industry partners are as interested in quantifying the quality of these artifacts as they are in code, particularly in projects with a long development period. Unfortunately, these artifacts rarely lend themselves to insightful measurement. Process artifacts such as requirements and designs rarely follow a rigorous structured language, making the application of a scale (beyond a simple word count) a manual task. Those that do follow rigorous structures, such as formal models or standardized design languages (e.g., UML), lend themselves to counting-based analysis. However, artifacts such as requirements and designs are often evaluated according to their semantic content, making fully automated measurement a near impossibility. Thus, when an organization’s goals involve improving non-coding activities, one must plan to spend significant effort understanding and applying metrics to non-code process artifacts. Further, additional effort should be allocated with the organization’s subject matter experts to understand the artifacts and validate the proposed metrics. Partial automated analysis of non-code artifacts, such as requirements, defects, and anomalies, is currently possible via the application of natural language processing techniques [15–17]. However, these techniques usually need a significant amount of effort to be institutionalized in the organization’s process and properly configured to the specific application context.
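As a simple illustration of partially automating the analysis of non-code artifacts, the sketch below flags potentially ambiguous phrasing in requirement statements via keyword matching. The requirement text and the list of weak phrases are illustrative only; a production pipeline based on the natural language processing techniques cited above would need far more tailoring to the organization’s context.

```python
import re

# Hypothetical requirement statements; the "weak phrase" list is illustrative only.
REQUIREMENTS = [
    "The system shall respond to user queries within 2 seconds.",
    "The interface should be as user-friendly as possible.",
]
WEAK_PHRASES = ["as possible", "user-friendly", "etc.", "and/or", "adequate"]

def flag_weak_phrases(text):
    """Return the weak/ambiguous phrases found in one requirement statement."""
    return [p for p in WEAK_PHRASES if re.search(re.escape(p), text, re.IGNORECASE)]

for req in REQUIREMENTS:
    hits = flag_weak_phrases(req)
    if hits:
        print(f"Review suggested: {req!r} -> {hits}")
```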

Even if we are measuring established concepts and sufficient planning for the measurement has been done, problems in data quality still occur. Problems such as missing data, incomplete data, inconsistently reported data, or incorrect data are prevalent and to be expected in any measurement program [18]. Such problems arise from variations in the way processes were performed—both the process for gathering/reporting the data and the process that produced the artifacts being measured—which create variations in the process artifacts. Another reason could simply be that resolving missing data is expensive or impossible. In any case, problems with the quality of the data will likely bias the resulting analyses or render them completely useless. Regardless of how sensitive or sophisticated the data analysis techniques are, the results they produce will not be useful if the underlying data is incorrect.

The general problem of low data quality in empirical software engineering is well documented [19, 20]. One such example is the problem with bug-fix datasets, which has been studied in some detail [21–26]. We recall our own experience with this problem on one of our projects. Our customer employs several Verification and Validation (V&V) techniques, e.g., peer review, user acceptance testing, and automated unit testing, and would like to assess their effectiveness against different types of defects. The customer’s goal is to improve software quality by using the V&V technique that provides the most effective detection, given the types of defects they are expecting. To achieve this, they created and maintained a defect classification schema, which is used in their defect-fix reporting mechanism. However, they found that the defect classification schema was often used incorrectly by the personnel creating the reports, resulting in defect types being reported incorrectly. As a consequence, errors exist in the defect report repository used in their data analysis, leading to the selection of inappropriate V&V techniques [27].

In our experience, we have found the following practices useful in reducing data quality issues:

1. Managers should communicate the importance of good data quality to the team and, when possible, put in place mechanisms for enabling convenient input of data. Managers should also ensure sufficient resources are allocated to enable the team to generate data as complete and as accurate as requested.

2. Data analysts should always validate the received data (e.g., check that data values are valid or reasonable, and check the underlying data if analysis results deviate from expectations), provide the data submitter with constructive feedback when problems with the data are found, and suggest improvements to data collection (e.g., automate collection mechanisms, or employ robust processes that place constraints on how values are entered, such as choosing from drop-down lists instead of free text). A minimal sketch of such a validation check appears after this list.

3. Organizations should aim to institutionalize processes (i.e., processes that are well-documented, trained, repeatable, and consistently implemented) because such processes reduce variability in the artifacts produced.
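The sketch below illustrates the kind of lightweight validation check referred to in practice 2: it verifies that required fields are present and that defect types come from the agreed classification schema. The field names and type vocabulary are assumptions for the example.

```python
# Minimal sketch of validating incoming defect records before analysis
# (field names and the defect-type vocabulary are assumptions).
VALID_TYPES = {"logic", "interface", "data", "documentation"}
REQUIRED_FIELDS = ("id", "type", "detected_by", "reported_on")

def validate_record(rec):
    """Return a list of human-readable problems found in one defect record."""
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS if not rec.get(f)]
    if rec.get("type") and rec["type"] not in VALID_TYPES:
        problems.append(f"unknown defect type: {rec['type']!r}")
    return problems

records = [
    {"id": "D-101", "type": "logic", "detected_by": "peer review", "reported_on": "2024-03-01"},
    {"id": "D-102", "type": "other", "detected_by": "", "reported_on": "2024-03-02"},
]
for rec in records:
    for problem in validate_record(rec):
        print(rec["id"], "->", problem)  # feed back to the data submitter
```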

Note, however, that even with the best intentions, eliminating data problems may still be infeasible, usually due to the significant cost associated with collecting, reviewing, or transforming the data. When analyzing data with quality issues, it is important to understand how the analysis results may be affected. Knowing which issues exist and their possible impact allows them to be taken into account when interpreting the analysis results. The caveats and limitations due to data irregularities should be clearly communicated along with the analysis results. Data analysts also need to be realistic in their expectations of the data and creative in their analyses to gain as much value as possible from the data available.

From the trenches

On a large acquisition project, our customer (i.e., the acquirer) was interested in tracking software development progress. The lead development contractor was expected to provide sufficient data so that planned and actual progress could be compared for deviations. However, for one particular activity, the contractor found that the week-by-week plan data was too difficult to generate; and when generated, the resulting plan was highly volatile, making the analysis ineffective for informing the customer of progress issues. Instead of using the inaccurate weekly data, we decided to analyze the actual progress relative to what needed to be accomplished to complete the activity on schedule. This data was readily available and relatively stable. We then derived the needed progress rate and assessed whether the planned rate was reasonable given past performance and industry standards. While this analysis did not provide insight about deviations on a weekly basis, it was still able to inform our customer when progress was starting to lag behind the expected schedule and the activity’s timely completion was threatened. This measurement program has been perceived as critical to the project, to the point that it is being explored for adoption throughout the acquiring organization.
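A minimal sketch of the rate comparison described above, with purely illustrative numbers: it derives the completion rate needed to finish on schedule and compares it with the rate demonstrated so far.

```python
# Illustrative numbers only: compare the progress rate needed to finish on
# schedule against the rate actually demonstrated to date.
total_items = 400          # e.g., test procedures to be completed
completed_items = 150
weeks_elapsed = 20
weeks_remaining = 15

actual_rate = completed_items / weeks_elapsed                  # items per week so far
needed_rate = (total_items - completed_items) / weeks_remaining

print(f"actual: {actual_rate:.1f}/week, needed: {needed_rate:.1f}/week")
if needed_rate > actual_rate:
    print("Past performance does not support on-time completion; flag as a risk.")
```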

In summary, lessons learned related to data quality include:

 Measuring process artifacts generally requires a well-defined process, but applying measures to an ill-defined process can still reveal useful insights to the organization.

 Extra time and customer resources should be planned when measuring non-code artifacts.

 Engage sponsors and champions to communicate the importance of data quality to the team.

 Validate all data prior to analysis.

 Data quality issues are inevitable, but being aware of their existence and their impact are key for obtaining analysis results that are useful in spite of the presence of problems.

12.3.4 The Pivotal Role of Subject Matter Expertise

Subject Matter Expert—Person with bona fide expert knowledge about what it takes to do a particular job.

U.S. Office of Personnel Management

Now that data are available, it is time to analyze them and extract the insights we hope can be used to improve the project. Regardless of the type of data, the analyst must first understand the context of the data so that it can be measured accurately. Furthermore, the measurements and analysis results need interpretation to be useful for project management—metrics are insightful but rarely, if ever, tell the whole story on their own. Subject matter experts (SMEs) are key to overcoming the challenges in this area, which include understanding the data and interpreting the analysis results.

First, consider the scenario where one must understand the data so that it can be measured accurately. For example, on a government aerospace program, we wanted to measure hazard reports, which are PDF documents written in natural language that capture the output of safety analysis. Many terms used in the reports referred to software systems, but some familiarity with spaceflight systems was required to recognize this. Understanding the language and domain was challenging, and we went through several iterations of measurement and review to ensure that our counts were accurate.

Consider a second scenario where the raw data have been collected and the measurements are ready for analysis. The challenge now is to interpret the results of the analysis. Analysis results may be interpreted in different ways depending on the assumptions and context that the analysts have regarding the data and/or the project. On one project, we calculated the effort estimation error for all of the features in the new release of the system [28] using data automatically extracted from a database. The analysis showed that a small subset of features had a high relative estimation error. However, we knew little about the technical aspects of the features or their implementation history, and could offer no insights for the project manager.

In both scenarios, SMEs from the development groups were essential to overcoming these challenges in a timely fashion. On the government aerospace project, we created a dictionary of common terms in the hazard reports, and asked software safety SMEs whether or not those subsystems contained software. The SMEs told us how to divide the system conceptually according to the development groups, thereby improving the utility of our results to project management. Such valuable insight can be difficult to obtain. Project SMEs are often most concerned with completing their project tasks, not contributing to a metrics initiative that they likely view as an overhead. The people most suitable to answer questions may be those with the least amount of time to spare. Once again, it is essential to have management support and “boots on the ground” who can identify and free the resources to assist in the measurement effort.

From the trenches

For a government aerospace program, one metric was a count of the number of hazards, causes, and controls (all contained within the hazard report) that involved software behavior. The simplest approach was to search for software-related terms, but the hazard reports contained a significant amount of technical jargon about spaceflight hardware design. For example, the same system handling spacecraft flight may be referred to as Avionics, Guidance, GNC, or Guidance Navigation and Control. This makes interpreting these terms as a single concept across artifacts challenging. Further, the relations between terms may be lost, e.g., a reference to the “flight computer” may include GNC and C&DH, but this implied relationship may not be explicit in the artifact. All of these issues arose when trying to accurately determine whether a hazard involved software components.
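A minimal sketch of the SME-validated term dictionary idea described above: domain synonyms are mapped to a canonical subsystem name, and hazards are flagged when they mention subsystems that the SMEs identified as involving software. The dictionary entries and subsystem list here are illustrative, not the program’s actual data.

```python
# Map domain synonyms to one canonical subsystem name (illustrative entries only).
SYNONYMS = {
    "guidance navigation and control": "GNC",
    "guidance": "GNC",
    "gnc": "GNC",
    "avionics": "Avionics",
    "flight computer": "Flight Computer",
}
# Subsystems that involve software, per (assumed) SME input.
SOFTWARE_SUBSYSTEMS = {"GNC", "Flight Computer"}

def software_related(hazard_text):
    """Return the software-involved subsystems mentioned in a hazard description."""
    text = hazard_text.lower()
    found = {canonical for term, canonical in SYNONYMS.items() if term in text}
    return found & SOFTWARE_SUBSYSTEMS

print(software_related("Loss of attitude control due to guidance fault"))  # {'GNC'}
```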

Many of the spacecraft engineers came from a hardware background and thus did not appreciate the extent of software’s involvement in safety-critical aspects of the design. For three major systems on the Program, 45-60% of the hazards were caused or prevented by software, which surprised many stakeholders. By providing an accurate picture of the system decomposition and the parties responsible for each part, our measurements convinced systems engineers to allocate additional costly, yet essential, software assurance effort to certain areas of the system.

Sometimes there is inconsistency between experts’ opinions and quantitative results, and it is important to investigate the reasons for the disagreement. One of the main reasons for such inconsistency is that the underlying data (upon which results are computed) is unreliable or incorrectly filtered. For example, consider a scenario where one project has a high number of reported bugs and another has a low number. If project quality is evaluated by the number of bugs, one may conclude that the project with the lower number of bugs is the higher-quality project. The subject matter experts are puzzled because, based on their own project insights, the project with the lower number of bugs is actually the lower-quality project. Further investigation reveals that the low-bug-count project was not following the bug reporting process and that some bugs were never recorded. Thus, the project is actually of lower quality than the analysis suggests.

Subject matter expertise and familiarity with the development context are necessary to help identify such biases in the data.

On a commercial project, we presented some analysis results in a meeting with project managers and development leads. The features with high relative estimation errors caused concern, but the development leads were able to provide justifiable reasons for the effort estimation errors, such as working with a new technology, developers taking medical leave, and other factors not captured in the metrics. Understanding the development factors behind a measurement result is necessary for taking corrective action. In addition, for a small client pursuing high-maturity processes, we established measurement-focused meetings at regular intervals to get the right people in the room. These meetings focused on assisting the project teams with goal definition, evaluating project performance against established goals and organizational models, using prediction models for decision-making, and adjusting project-specific processes to achieve goals, if needed.

In summary, lessons learned related to the role of subject matter expertise include:

Finding time with SMEs to validate metrics and interpret data is difficult; data analysis is often viewed as an overhead activity.

 The project must engage SMEs early to ensure that the metrics and measurements are meaningful for the project.

 SMEs should be available to answer specific questions about the project and domain.

 SMEs are necessary for interpreting data and taking corrective action.

 Raw data requires time and resources to understand, particularly in an unfamiliar domain.

 Measurements do not tell the whole story, and results can be meaningless or contentious without the qualitative insight of SMEs.

12.3.5 Responding to Changing Needs

To improve is to change; to be perfect is to change often. – Winston Churchill

It never fails: just when you have successfully defined effective measures and established analysis processes, something changes and there are new management and stakeholder priorities; specific events to be analyzed or highlighted; and needs for new collection or aggregation methods that differ from what was planned. Most of the time, these changes cannot be predicted and are needed immediately. In these situations a project is left trying to adapt to change as best it can.

Edmunds and Morris noted succinctly that “A theme stressed in the literature is the paradoxical situation that, although there is an abundance of information available, it is often difficult to obtain useful, relevant information when it is needed” [29]. A challenge facing any measurement program is when the sponsoring organization, despite all of your planning and rigorous metric definitions, asks questions for which there is no direct data captured. This has happened routinely on many projects and it has required that the analysis teams become creative. Project managers/stakeholders urgently need the measurement program and analysts to provide a way to respond to the new situation. Politically, it is not an option to respond that there is just no information to assist the managers/stakeholders.

On one large government program, the project was cancelled, but many functions were almost complete. These pieces of software could be transferred to another program, but the burning question was whether the software was mature enough to keep or whether it should simply be thrown away when the project was terminated. The answer was needed very quickly, and there wasn’t time or budget to collect new data. Answering the program’s question therefore required determining which existing data could provide insight into what a good decision would be. For example, the appropriateness of reusing the software could be characterized by the goodness of the software design, software complexity, quality (execution errors as well as code analysis), the amount of code actually implemented, a qualitative technical assessment of the code, etc. Fortunately, the project had a mature measurement program and had already collected some data for these attributes. We created a matrix with the attributes and the resulting analysis of the data to provide the customer with a valuable assessment of the current state and quality of the software, even though the software was still incomplete and not fully tested.

One common challenge for a software development project is the reorganization of its deliveries and software builds. Technical or logistic dependencies of the software and its functions, changes in priorities, or schedule changes cause the contents and even the structure of major builds to change (possibly multiple times) during the project’s life cycle. Transforming and re-aggregating the data to keep it in alignment with current reporting needs is an arduous task, and it is very tempting not to change the project’s data but to work around it. However, most of the changes to the data can be performed via scripting, and doing so is crucial if analysis of the data is to remain a value-added activity for the project. If the data and resulting analyses are no longer aligned with the current project goals and priorities, the analyses can no longer assist management or other stakeholders in their decision processes.
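As a small illustration of keeping data aligned with a build reorganization via scripting, the sketch below re-aggregates per-feature effort data into a new build structure using a feature-to-build mapping; the features, builds, and numbers are hypothetical.

```python
from collections import defaultdict

# Hypothetical remapping after a build reorganization: features formerly reported
# under Build 3 are now split across Builds 3A and 3B.
FEATURE_TO_NEW_BUILD = {"F-101": "3A", "F-102": "3A", "F-103": "3B"}

# Per-feature effort data collected under the old structure (illustrative hours).
effort_by_feature = {"F-101": 120, "F-102": 80, "F-103": 200}

def reaggregate(effort, mapping):
    """Re-aggregate per-feature effort into the current build structure."""
    totals = defaultdict(int)
    for feature, hours in effort.items():
        totals[mapping.get(feature, "unmapped")] += hours
    return dict(totals)

print(reaggregate(effort_by_feature, FEATURE_TO_NEW_BUILD))  # {'3A': 200, '3B': 200}
```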

In addition to keeping data current with change, the tools that support data analysis must also be able to accommodate change. Tools need to allow the team to easily implement different what-if analyses. On a major government program, we provided the in-house cost estimate for the program. It was a very large program, and even preparing the request for proposals required a few years’ time. Over these years, technology, politics, logistics, and many other aspects of the program changed. We had developed a tool to generate the cost estimate and structured it modularly so that the tool and the estimate could support what-if analyses and adapt as more information became known or as changes to the program were identified. To provide the maximum value, not only must the data be able to adapt to changes, but tool development for analytics needs to be agile as well.
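The sketch below illustrates the kind of modular, parameterized structure that makes what-if analyses easy: the assumptions that tend to change are captured as explicit inputs, so alternative scenarios are just alternative parameter sets. The model and its parameters are placeholders, not the actual cost model used on the program.

```python
from dataclasses import dataclass

@dataclass
class EstimateInputs:
    """Assumptions that tend to change over a long acquisition (placeholder fields)."""
    size_ksloc: float                   # estimated size in thousands of SLOC
    productivity_ksloc_per_pm: float    # KSLOC produced per person-month
    labor_rate_per_pm: float            # cost per person-month
    risk_reserve: float                 # fraction added for risk

def cost_estimate(p: EstimateInputs) -> float:
    """Toy cost model: effort from size and productivity, priced with a risk reserve."""
    person_months = p.size_ksloc / p.productivity_ksloc_per_pm
    return person_months * p.labor_rate_per_pm * (1 + p.risk_reserve)

baseline = EstimateInputs(400, 0.35, 20_000, 0.15)
what_if = EstimateInputs(450, 0.30, 20_000, 0.20)   # scope grows, productivity drops

print(f"baseline: ${cost_estimate(baseline):,.0f}")
print(f"what-if:  ${cost_estimate(what_if):,.0f}")
```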

In summary, lessons learned related to the ability in responding to changing needs include:

Use your investment in data collection and analyses creatively—do not expect every new purpose to have a measurement collected specifically for it. Many measures can support more than one purpose.

 Keep the data aligned with program changes.

 Look for supporting tools that are agile and easily adaptable to change and for “what-if” exercises.

12.3.6 Effective Ways to Communicate Analysis Results to the Consumers

“You can have brilliant ideas, but if you can’t get them across, your ideas won’t get you anywhere.” – Lee Iacocca

Presenting the analysis results in the “right” way is important, as it affects how results are absorbed and understood by the consumers. However, arriving at the correct communication channels (e.g., written report versus oral presentation) and mechanisms is not trivial and is an iterative process.

As the literature recommends [29], today’s professionals need value-added information that is filtered by software or subject matter experts. Reporting too much information at the wrong levels and without an indication of why the analysis results are meaningful to a manager/stakeholder will guarantee that the data and analyses are ignored. The challenge is to be able to summarize metrics across the program at a high level, while at the same time providing insights into the specific problem.

Ultimately, the outcome of the analysis must enable the consumers to make decisions with respect to the business needs that were identified early on. Therefore, the results, as communicated to the consumers, should be meaningful so as to aid the consumers in acting upon them. Because the results of the analysis drive decision-making, effective communication also means providing the information in a timely manner. You want to present the analysis results while the stakeholders are still able to make decisions based on them.

However, measurement programs tend to end up with large amounts of data, all of which are analyzed and dissected in various ways. It is tempting to unload all the data and their analyses onto the customers to show the value of the measurement program. The opposite effect typically results: your customers will drown in the sea of information and lose interest when they have to sift through it just to find the subset that interests them. More problematically, with large amounts of information, especially when presented in incoherent ways, important issues may go unnoticed and the appropriate actions may not be taken, all of which diminishes the value of the measurement program.

When communicating the analysis results, it is important to be aware of whom you are presenting the data to and to tailor the analysis results accordingly. Different stakeholders care about different information, and at different levels of abstraction. For example, a manager tends to want a broader, “30,000-foot view” perspective, while technical personnel care about specific data at a much finer granularity. Therefore, different levels of data abstraction are needed to avoid overwhelming the consumers with the analysis, while still providing the capability to drill down to detail when the need arises. Be aware, however, that abstracting the data can hide important issues, giving a false impression that everything is fine. Recall the outcome of the requirements-gathering phase, especially the elicitation discussions with the stakeholders, to identify the concerns of each stakeholder.
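As a simple illustration of serving both audiences from the same underlying data, the sketch below rolls defect counts up to a subsystem-level summary for managers while retaining the ability to drill down to the component level for technical staff. The records and field names are hypothetical; real data would come from an issue tracker.

    from collections import defaultdict

    # Hypothetical defect records; in practice these come from the issue tracker.
    defects = [
        {"subsystem": "flight", "component": "guidance", "severity": "high"},
        {"subsystem": "flight", "component": "telemetry", "severity": "low"},
        {"subsystem": "ground", "component": "display", "severity": "high"},
    ]

    def roll_up(records, level):
        """Aggregate defect counts at the requested level of abstraction."""
        counts = defaultdict(int)
        for r in records:
            counts[r[level]] += 1
        return dict(counts)

    # Manager's 30,000-foot view: counts per subsystem.
    print(roll_up(defects, "subsystem"))      # {'flight': 2, 'ground': 1}

    # Drill-down for technical staff: components within one subsystem.
    flight_only = [r for r in defects if r["subsystem"] == "flight"]
    print(roll_up(flight_only, "component"))  # {'guidance': 1, 'telemetry': 1}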

The point of a measurement program is to learn from the data through analysis and synthesis in context. Therefore, it is not sufficient simply to present the analysis results; the data must also be presented in context for comparison, to facilitate interpretation. Consider what other information the stakeholders need in order to take an action or make a decision.

From the trenches

On one of our large customer projects, an important business need was to understand how development activities (e.g., requirements definition, code development, software testing) were progressing. The project manager was interested in knowing whether a delay in completing an activity should be anticipated and, if so, how large the delay was likely to be. The manager used this information to make decisions regarding re-planning. When communicating the analysis of the measurements that track development progress, it is important to include how the progress compares against the plan. However, while a comparison of actuals to plan provides a reasonable snapshot of the current state, it is not sufficient for projecting future progress and potential delay. To do so, the manager also needs to know the trend of progress. This information can be used to infer the past rate of progress, to compare it with the rate needed to finish on time, and to assess whether the demonstrated rate supports the plan. Furthermore, if we communicate additional context in the form of an industry benchmark for the progress rate, the manager can be even more confident in the assessment of schedule risk. These presentations of development activity have become essential parts of weekly progress reporting to program and organizational management, and are used to assist in resource allocation and re-planning each month.
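The arithmetic behind this reasoning is simple. The following minimal sketch, with invented numbers purely for illustration, compares the demonstrated progress rate against the rate needed to complete on schedule and projects the likely delay:

    # Hypothetical progress data: requirements completed per reporting week.
    planned_total = 400          # requirements planned for this activity
    completed_so_far = 180       # actuals to date
    weeks_elapsed = 12
    weeks_remaining = 10

    past_rate = completed_so_far / weeks_elapsed                    # demonstrated rate
    needed_rate = (planned_total - completed_so_far) / weeks_remaining

    projected_finish_weeks = (planned_total - completed_so_far) / past_rate
    projected_delay = projected_finish_weeks - weeks_remaining

    print(f"Demonstrated rate: {past_rate:.1f}/week, needed rate: {needed_rate:.1f}/week")
    if past_rate < needed_rate:
        print(f"At the current pace, expect roughly {projected_delay:.1f} weeks of delay.")
    else:
        print("The demonstrated rate supports the planned completion date.")

An industry or organizational benchmark for the progress rate, where one is available, can be reported alongside these two rates to give the manager additional confidence in the risk assessment.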

Visualization is integral to communicating results. Visualization can quickly reveal metric trends and surface issues. More than 80% of people exhibit a strong preference for visual learning [30]. Effective communication entails selecting the appropriate visualization methods. Bar charts and pie charts show proportions more readily than numbers in text. Line charts show trends over time. Diagrams show relationships or process flow. Scatter plots can show groupings and data trends. The human brain is built for pattern matching, and the visual cortex is the primary mechanism for doing this, as compared with our language-processing centers, which interpret numbers and text [31]. When employing visualization, however, our advice is to keep it simple, as complicated visualizations distract from the message being conveyed. If you find yourself expending much effort explaining your visualization, it is usually a sign that the visualization is too complex. Note that there are occasions when your stakeholders will have to look at the results on their own. In such cases, it is important to use correct and easily understood labeling and to be aware of color associations (e.g., red usually signals something that needs attention).
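For instance, a deliberately simple plan-versus-actual line chart, with plain labels and red reserved for the shortfall, can often carry the progress message on its own. The sketch below assumes matplotlib is available; the data are invented for illustration.

    import matplotlib.pyplot as plt

    # Hypothetical monthly progress data (percent of planned work completed).
    months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
    planned = [10, 25, 40, 60, 80, 100]
    actual = [8, 20, 33, 50, 64, 78]

    plt.plot(months, planned, label="Planned", color="gray", linestyle="--")
    plt.plot(months, actual, label="Actual", color="red")   # red flags the shortfall
    plt.ylabel("Work completed (%)")
    plt.title("Development progress: plan vs. actual")
    plt.legend()
    plt.tight_layout()
    plt.savefig("progress_trend.png")   # or plt.show() in an interactive session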

A sign of a successful measurement and analysis program is that measures and analyses are used continuously and regularly by the decision-makers. While you have the eyes and ears of the decision-makers, be careful not to let the communicated results become repetitive, as the message becomes muted over time. This means that one should periodically revisit the communication channels and mechanisms used to ensure that they are still effective in conveying the message. At the same time, one should continuously explore different ways to analyze the data. Information needs evolve over time as projects move through different phases and as new knowledge is formed. Therefore, the analysis you do in the early phase of a project may differ from what you do in the middle or at the end.

In summary, lessons learned related to effectively communicating analysis results include:

 Provide value-added interpretations of all data and analyses.

 Tailor results to the appropriate levels of abstraction for individual consumers.

 Offer sufficient information to allow interpretation of the analysis result in a context that enables decision-making.

 Use visualization, and choose the right type for the message.

 Reflect and iterate on the responses to your communication so that you can improve how you communicate with consumers.

12.4 Conclusions

Drawing from examples from our 15 years of performing software analytics work, in this chapter we have discussed the process that has allowed us to deliver value across many different customers and types of organizations. Ultimately, software data analytics is about helping stakeholders to make decisions, and we have seen in our own work how powerful measurement-based decision-making can be. A grassroots analytics program to track and project development progress on a communications system project has been elevated to the highest levels of agency management as a best practice for managing software development on large acquisition programs. A data analytics program we implemented at a web application development company has driven process improvement to help that organization mature from CMMI Level 1 to CMMI Level 5, and is still used today for effort estimation and defect prediction. Data collected from years of software inspections helped form the basis for NASA’s agency-wide Software Formal Inspections Standard [32]. The benefits of software data analytics will vary from organization to organization and project to project, hence any successful data analytics program must begin by understanding the needs and goals of the stakeholders involved.

Historically, the process of creating a successful data analysis program has been a largely manual one, for reasons that include:

 The necessity of finding champions and engaging with multiple stakeholders, and of conveying results in the language of those stakeholders.

 The necessity of scrubbing and quality-checking data sources, which requires human judgment and experience.

 The necessity of ensuring that conclusions are well-grounded and make sense for the domain.

 The frequent need to recalibrate all of the above in light of ever-changing organizational priorities.

A suite of proven methods (GQM [7, 9], QIP [10], EF [11]) has served us well. However, as in all other areas of software engineering, the fast pace of change in the field and the ever-increasing demand for software in all domains mean that our technologies also need to improve continually to maintain their impact and relevance. Thus our recent research focus has been on topics that can augment our existing measurement framework with new capabilities and faster results. Moving forward, our research vision encompasses:

 Methods for aligning business goals and technical measures, including GQM+Strategies [9], to facilitate our ability to define appropriate metrics and report actionable results.

 Data mining approaches [15, 33] that can quickly find new and subtle relationships in our datasets from real customers, in order to increase the speed with which we can answer customer queries and to enable more fine-grained recommendations for projects.

 Automated and semi-automated tools [34] for extracting metrics from software artifacts into an analyzable form, in order to increase the speed and breadth with which data can be collected for analysis.

 Visualization approaches [35] that can aid in interactive exploration of relationships in the data as well as give our customers the ability to perform what-if analyses.

As promising as the results of the research by ourselves and others in these areas have been, we are well aware that none of these technologies are silver bullets. That is, they stand little chance of enabling effective, data-driven decision-making and other improvements in software development organizations if they are not embedded in an intentional, end-to-end process such as the one we have described here. While technology and the speed of analysis seem to accelerate constantly, the underlying fundamentals of human and organizational behavior do not.

Going forward, we must always guard against algorithmically correct, high-performance analyses that fail to take into account the messy reality of data, i.e., those that ignore data quality problems, that misinterpret biases in data collection (e.g., assuming that projects with few reported defects are high quality rather than possibly delinquent in their reporting), and that fail to sanity-check results against domain understanding. As automation and other research breakthroughs enable ever-larger volumes of data to be processed ever more rapidly, such problems become more likely and have potentially wider repercussions.

References

[1] IEEE. IEEE standard for a software quality metrics methodology. IEEE Std 1061-1998. 1998 p. i.

[2] Chrissis M, Konrad M, Shrum S. CMMI for development: guidelines for process integration and product improvement. In: SEI series in software engineering. 2011:688.

[3] Basili V, Trendowicz A, Kowalczyk M, Heidrich J, Seaman C, Münch J. Aligning organizations through measurement—the GQM+Strategies approach. 2014.

[4] Shull F, Singer J, Sjøberg DIK. Guide to advanced empirical software engineering. Secaucus, NJ, USA: Springer-Verlag New York, Inc.; 2007.

[5] Falessi D, Shaw M, Mullen K. A journey in achieving and maintaining CMMI maturity level 5 in a small organization. IEEE Softw. 2014;31(5).

[6] Fenton NE, Pfleeger SL. Software metrics: a rigorous and practical approach. 1997 vol. 2. p. 38–42.

[7] Basili VR, Caldiera G, Rombach HD. The goal question metric approach. In: Encyclopedia of software engineering. New York: Wiley; 1994.

[8] Basili VR, Weiss DM. A methodology for collecting valid software engineering data. IEEE Trans Softw Eng. 1984;SE-10.

[9] Basili VR, Lindvall M, Regardie M, Seaman C, Heidrich J, Munch J. Linking software development and business strategy through measurement. Computer. 2010;43:57–65.

[10] Basili VR, Caldiera G. Improve software quality by reusing knowledge and experience. Sloan Manage Rev. 1995;37:55–64.

[11] Basili V, Caldiera G, Rombach HD. Experience factory. In: Encyclopedia of software engineering. 1994:469–476 vol. 1.

[12] McConnell S. Software project survival guide. Redmond, WA: Microsoft Press; 1997.

[13] Campbell P, Clewell B. Building evaluation capacity: collecting and using data in cross-project evaluations. Washington, DC: The Urban Institute; 2008.

[14] Esker L, Zubrow D, Dangle K. Getting the most out of your measurement data: approaches for using software metrics. In: Systems software technology conference proceedings; 2006 vol. 18.

[15] Falessi D, Layman L. Automated classification of NASA anomalies using natural language processing techniques. In: IEEE international symposium on software reliability engineering workshops (ISSREW); 2013:5–6.

[16] Falessi D, Cantone G, Canfora G. Empirical principles and an industrial case study in retrieving equivalent requirements via natural language processing techniques. IEEE Trans Softw Eng. 2013;39(1):18–44.

[17] Runeson P, Alexandersson M, Nyholm O. Detection of duplicate defect reports using natural language processing. In: 29th international conference on software engineering (ICSE’07); 2007:499–510.

[18] Wohlin C, Runeson P, Höst M, Ohlsson MC, Regnell B, Wesslén A. Experimentation in software engineering: an introduction. January 2000.

[19] Mockus A. Missing data in software engineering. In: Shull F, Singer J, Sjøberg DIK, eds. Guide to advanced empirical software engineering. London: Springer London; 2008.

[20] Liebchen GA, Shepperd M. Data sets and data quality in software engineering. In: Proceedings of the 4th international workshop on Predictor models in software engineering—PROMISE ’08; 2008:39.

[21] Antoniol G, Ayari K, Penta MD, Khomh F, Guéhéneuc YG. Is it a bug or an enhancement? In: Proceedings of the 2008 conference of the center for advanced studies on collaborative research meeting of minds—CASCON ’08; 2008:304.

[22] Bachmann A, Bird C, Rahman F, Devanbu P, Bernstein A. The missing links. In: Proceedings of the eighteenth ACM SIGSOFT international symposium on foundations of software engineering—FSE ’10; 2010:97.

[23] Herzig K, Just S, Zeller A. It’s not a bug, it’s a feature: how misclassification impacts bug prediction. In: International conference on software engineering (ICSE ’13); 2013:392–401.

[24] Rahman F, Posnett D, Herraiz I, Devanbu P. Sample size vs. bias in defect prediction. In: Proceedings of the 9th joint meeting on foundations of software engineering—ESEC/FSE; 2013:147.

[25] Nguyen THD, Adams B, Hassan AE. A case study of bias in bug-fix datasets. In: Proceedings of the 17th working conference on reverse engineering; 2010:259–268.

[26] Kim S, Zhang H, Wu R, Gong L. Dealing with noise in defect prediction. In: Proceeding of the 33rd international conference on Software engineering—ICSE ’11; 2011:481.

[27] Falessi D, Kidwell B, Hayes JH, Shull F. On failure classification: the impact of ‘getting it wrong’. In: Proceedings of the 36th international conference on software engineering (ICSE), new ideas and emerging results (NIER) track, Hyderabad, India; 2014.

[28] Layman L, Nagappan N, Guckenheimer S, Beehler J, Begel A. Mining software effort data: preliminary analysis of visual studio team system data. In: Proceedings of the 2008 international working conference on Mining software repositories; 2008:43–46.

[29] Edmunds A, Morris A. The problem of information overload in business organisations: a review of the literature. Int J Inform Manage. 2000;20:17–28.

[30] Felder RM, Spurlin J. Applications, reliability and validity of the index of learning styles. Int J Eng Educ. 2005;21:103–112.

[31] Bryant CD, Peck DL. 21st century sociology: a reference handbook. 2007 p. 738.

[32] NASA-STD-8739.9. Software formal inspections standard. NASA; 2013.

[33] Menzies T, Butcher A, Cok D, Marcus A, Layman L, Shull F. Local versus global lessons for defect prediction and effort estimation. IEEE Trans Softw Eng. 2013;39:822–834.

[34] Schumacher J, Zazworka N, Shull F, Seaman C, Shaw M. Building empirical support for automated code smell detection. In: Proceedings of the 2010 ACM-IEEE international symposium on empirical software engineering and measurement—ESEM ’10; 2010:1.

[35] Zazworka N, Basili VR, Shull F. Tool supported detection and judgment of nonconformance in process execution. In: Proceedings of the 3rd international symposium on empirical software engineering and measurement; 2009:312–323.


1 One of the authors, Forrest Shull, established the Software Measurement division at Fraunhofer CESE; he moved to SEI while this chapter was being written.

2 CMMI® is registered in the U.S. Patent and Trademark Office by Carnegie Mellon University.
