What Have We Learned?

The results of the two studies are summarized as follows:

  • Problems with requirements, design, and coding accounted for 34% of the total MRs. Requirements account for about 5% of the total MRs and, although not extremely numerous, are particularly important because they have been found so late in the development process, a period during which they are particularly expensive to fix.

  • Testing large, complex real-time systems often requires elaborate test laboratories that are themselves large, complex real-time systems. In the development of this release, testing-related MRs accounted for 25% of the total MRs.

  • The fact that 16% of the total MRs are “no problems” and the presence of a significant set of design and coding faults such as “unexpected dependencies” and “interface and design/code complexity” indicate that lack of system knowledge is a significant problem in the development of this release.

  • Of the design and coding faults, 78% took five days or less to fix; 22% took six or more days to fix. We note that there is a certain overhead factor that is imposed on the fixing of each fault that includes getting consensus, building the relevant pieces of the system, and using the system test laboratory to validate the repairs. Unfortunately, we do not have data on those overhead factors.

  • Five fault categories account for 60% of the design and coding faults: internal functionality, interface complexity, unexpected dependencies, low-level logic, and design/code complexity. With the exception of “low-level logic,” this set of faults is what we expect would be significant in evolving a large, complex real-time system.

  • Weighting the fault categories by the effort to find and to fix them yielded results that coincide with our intuition of which faults are easy and hard to find and fix.

  • “Incomplete/omitted design,” “lack of knowledge,” and “none given” (which we interpret to mean that sometimes we just make a mistake with no deeper, hidden underlying cause) account for the underlying causes for 64% of design and coding faults. The weighting of the effort to fix these underlying causes coincides very nicely with our intuition: faults caused by requirements problems require the most effort to fix, whereas faults caused by ambiguous design and lack of knowledge were among those that required the least effort to fix.

  • “Application walk-throughs,” “expert person/documentation,” “guideline enforcement,” and “requirements/design templates” represent 64% of the suggested means of preventing design and coding faults. As application walk-throughs accounted for 25% of the suggested means of prevention, we believe that this supports Curtis, Krasner, and Iscoe’s claim [Curtis et al. 1988] that lack of application knowledge is a significant problem.

  • Although informal means of prevention were preferred over formal means, informal means tended to be suggested for faults that required less effort to fix, and formal means tended to be suggested for faults that required more effort to fix.

  • In Perry and Evangelist [Perry and Evangelist 1985], [Perry and Evangelist 1987], interface faults were seen to be a significant portion of the entire set of faults (68%). However, there was no weighting of these faults versus implementation faults. We found in this study that interface faults were roughly 49% of the entire set of design and coding faults and that they were harder to fix than the implementation faults (see the previous discussion). Not surprisingly, formal requirements and formal interface specifications were suggested as significant means of preventing interface faults.
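The effort weighting mentioned in the summary above can be illustrated with a small sketch. The chapter excerpt here does not restate the exact weighting scheme, so the following assumes the simplest one: each fault contributes its days-to-fix to its category's total. The function names and the sample records are hypothetical and invented purely for illustration; they are not data from the studies.

```python
from collections import Counter, defaultdict

# Hypothetical MR records: (fault category, days to fix).
# These values are invented for illustration, not taken from the study.
SAMPLE_MRS = [
    ("internal functionality", 2),
    ("interface complexity", 8),
    ("unexpected dependencies", 6),
    ("low-level logic", 1),
    ("low-level logic", 1),
    ("design/code complexity", 10),
    ("internal functionality", 3),
    ("interface complexity", 7),
]


def rank_by_frequency(mrs):
    """Return (category, share of fault count), most frequent first."""
    counts = Counter(category for category, _ in mrs)
    total = sum(counts.values())
    return [(category, n / total) for category, n in counts.most_common()]


def rank_by_effort(mrs):
    """Return (category, share of total fix effort), most costly first.

    Assumed weighting: a fault's weight is simply its days-to-fix.
    """
    effort = defaultdict(float)
    for category, days in mrs:
        effort[category] += days
    total = sum(effort.values())
    ranked = sorted(effort.items(), key=lambda item: item[1], reverse=True)
    return [(category, days / total) for category, days in ranked]


if __name__ == "__main__":
    for title, ranking in (("By frequency", rank_by_frequency(SAMPLE_MRS)),
                           ("By effort", rank_by_effort(SAMPLE_MRS))):
        print(title)
        for category, share in ranking:
            print(f"  {category:26s} {share:6.1%}")
```

In this made-up sample, a category such as "low-level logic" is frequent but cheap, so it drops to the bottom once effort is taken into account. That is the kind of reordering an effort-weighted view is meant to expose.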

The system reported here was developed and evolved using the current “best practice” techniques and tools with well-qualified practitioners. Because of this fact, we feel that the data point is generalizable to other large-scale real-time systems. With this in mind, we offer the following recommendations to improve the current “best practice”:

  • Obtain fault data throughout the entire development/evolution cycle (not just in the testing cycle), and use it to monitor the progress of the process.

  • Incorporate the fault survey as an integral part of MR closure and gather the fault-related information while it is fresh in the developer’s mind. This data provides the basis for measurement-based process improvement where the current most frequent or most costly faults are remedied.

  • Incorporate the informal, people-intensive means of prevention into the current process (such as application walk-throughs, expert person or documentation, guideline enforcement, etc.). As our survey has shown, this will yield benefits for the majority of the faults reported here.

  • Introduce techniques and tools to increase the precision and completeness of requirements, architecture, and design documents. This will yield benefits for those faults that were generally harder to fix and will help to detect the requirements, architecture, and design problems earlier in the life cycle.

We close with several lessons learned that may go a long way toward the improvement of future system developments:

  • The fastest way to improve a product, as measured by reduced faults, is to hire people who are knowledgeable about the domain of the product. Remember that lack of knowledge tended to dominate the underlying causes.

  • One of the least important ways to improve software developments is to use a “better” programming language. We found relatively few problems that would have been solved by the use of better programming languages.

  • Techniques and tools that help to understand the system and the implications of change should be emphasized in improving a development environment. Remember that knowledge-intensive activities tended to dominate the means of prevention.
