Part 5: Data Analysis at Scale (Big Data)
by Thomas Zimmermann, Tim Menzies, Christian Bird
The Art and Science of Analyzing Software Data
Cover image
Title page
Table of Contents
Copyright
List of Contributors
Chapter 1: Past, Present, and Future of Analyzing Software Data
Abstract
Acknowledgments
1.1 Definitions
1.2 The Past: Origins
1.3 Present Day
1.4 Conclusion
Part 1: Tutorial-Techniques
Chapter 2: Mining Patterns and Violations Using Concept Analysis
Abstract
Acknowledgments
2.1 Introduction
2.2 Patterns and Blocks
2.3 Computing All Blocks
2.4 Mining Shopping Carts with Colibri
2.5 Violations
2.6 Finding Violations
2.7 Two Patterns or One Violation?
2.8 Performance
2.9 Encoding Order
2.10 Inlining
2.11 Related Work
2.12 Conclusions
Chapter 3: Analyzing Text in Software Projects
Abstract
3.1 Introduction
3.2 Textual Software Project Data and Retrieval
3.3 Manual Coding
3.4 Automated Analysis
3.5 Two Industrial Studies
3.6 Summary
Chapter 4: Synthesizing Knowledge from Software Development Artifacts
Abstract
4.1 Problem Statement
4.2 Artifact Lifecycle Models
4.3 Code Review
4.4 Lifecycle Analysis
4.5 Other Applications
4.6 Conclusion
Chapter 5: A Practical Guide to Analyzing IDE Usage Data
Abstract
Acknowledgments
5.1 Introduction
5.2 Usage Data Research Concepts
5.3 How to Collect Data
5.4 How to Analyze Usage Data
5.5 Limits of What You Can Learn from Usage Data
5.6 Conclusion
5.7 Code Listings
Chapter 6: Latent Dirichlet Allocation: Extracting Topics from Software Engineering Data
Abstract
6.1 Introduction
6.2 Applications of LDA in Software Analysis
6.3 How LDA Works
6.4 LDA Tutorial
6.5 Pitfalls and Threats to Validity
6.6 Conclusions
Chapter 7: Tools and Techniques for Analyzing Product and Process Data
Abstract
7.1 Introduction
7.2 A Rational Analysis Pipeline
7.3 Source Code Analysis
7.4 Compiled Code Analysis
7.5 Analysis of Configuration Management Data
7.6 Data Visualization
7.7 Concluding Remarks
Part 2: Data/Problem Focussed
Chapter 8: Analyzing Security Data
Abstract
8.1 Vulnerability
8.2 Security Data “Gotchas”
8.3 Measuring Vulnerability Severity
8.4 Method of Collecting and Analyzing Vulnerability Data
8.5 What Security Data has Told Us Thus Far
8.6 Summary
Chapter 9: A Mixed Methods Approach to Mining Code Review Data: Examples and a Study of Multicommit Reviews and Pull Requests
Abstract
9.1 Introduction
9.2 Motivation for a Mixed Methods Approach
9.3 Review Process and Data
9.4 Quantitative Replication Study: Code Review on Branches
9.5 Qualitative Approaches
9.6 Triangulation
9.7 Conclusion
Chapter 10: Mining Android Apps for Anomalies
Abstract
Acknowledgments
10.1 Introduction
10.2 Clustering Apps by Description
10.3 Identifying Anomalies by APIs
10.4 Evaluation
10.5 Related Work
10.6 Conclusion and Future Work
Chapter 11: Change Coupling Between Software Artifacts: Learning from Past Changes
Abstract
11.1 Introduction
11.2 Change Coupling
11.3 Change Coupling Identification Approaches
11.4 Challenges in Change Coupling Identification
11.5 Change Coupling Applications
11.6 Conclusion
Part 3: Stories from the Trenches
Chapter 12: Applying Software Data Analysis in Industry Contexts: When Research Meets Reality
Abstract
12.1 Introduction
12.2 Background
12.3 Six Key Issues when Implementing a Measurement Program in Industry
12.4 Conclusions
Chapter 13: Using Data to Make Decisions in Software Engineering: Providing a Method to our Madness
Abstract
13.1 Introduction
13.2 Short History of Software Engineering Metrics
13.3 Establishing Clear Goals
13.4 Review of Metrics
13.5 Challenges with Data Analysis on Software Projects
13.6 Example of Changing Product Development Through the Use of Data
13.7 Driving Software Engineering Processes with Data
Chapter 14: Community Data for OSS Adoption Risk Management
Abstract
Acknowledgments
14.1 Introduction
14.2 Background
14.3 An Approach to OSS Risk Adoption Management
14.4 OSS Communities Structure and Behavior Analysis: The XWiki Case
14.5 A Risk Assessment Example: The Moodbile Case
14.6 Related Work
14.7 Conclusions
Chapter 15: Assessing the State of Software in a Large Enterprise: A 12-Year Retrospective
Abstract
Acknowledgments
15.1 Introduction
15.2 Evolution of the Process and the Assessment
15.3 Impact Summary of the State of Avaya Software Report
15.4 Assessment Approach and Mechanisms
15.5 Data Sources
15.6 Examples of Analyses
15.7 Software Practices
15.8 Assessment Follow-up: Recommendations and Impact
15.9 Impact of the Assessments
15.10 Conclusions
15.11 Appendix
Author Biographies
Chapter 16: Lessons Learned from Software Analytics in Practice
Abstract
16.1 Introduction
16.2 Problem Selection
16.3 Data Collection
16.4 Descriptive Analytics
16.5 Predictive Analytics
16.6 Road Ahead
Part 4: Advanced Topics
Chapter 17: Code Comment Analysis for Improving Software Quality
Abstract
17.1 Introduction
17.2 Text Analytics: Techniques, Tools, and Measures
17.3 Studies of Code Comments
17.4 Automated Code Comment Analysis for Specification Mining and Bug Detection
17.5 Studies and Analysis of API Documentation
17.6 Future Directions and Challenges
Chapter 18: Mining Software Logs for Goal-Driven Root Cause Analysis
Abstract
18.1 Introduction
18.2 Approaches to Root Cause Analysis
18.3 Root Cause Analysis Framework Overview
18.4 Modeling Diagnostics for Root Cause Analysis
18.5 Log Reduction
18.6 Reasoning Techniques
18.7 Root Cause Analysis for Failures Induced by Internal Faults
18.8 Root Cause Analysis for Failures due to External Threats
18.9 Experimental Evaluations
18.10 Conclusions
Chapter 19: Analytical Product Release Planning
Abstract
Acknowledgments
19.1 Introduction and Motivation
19.2 Taxonomy of Data-intensive Release Planning Problems
19.3 Information Needs for Software Release Planning
19.4 The Paradigm of Analytical Open Innovation
Analysis phase
Synthesize phase
19.5 Analytical Release Planning—A Case Study
19.6 Summary and Future Research
19.7 Appendix: Feature Dependency Constraints
Part 5: Data Analysis at Scale (Big Data)
Chapter 20: Boa: An Enabling Language and Infrastructure for Ultra-Large-Scale MSR Studies
Abstract
20.1 Objectives
20.2 Getting Started with Boa
20.3 Boa’s Syntax and Semantics
20.4 Mining Project and Repository Metadata
20.5 Mining Source Code with Visitors
20.6 Guidelines for Replicable Research
20.7 Conclusions
20.8 Practice Problems
Project and Repository Metadata Problems
Source Code Problems
Chapter 21: Scalable Parallelization of Specification Mining Using Distributed Computing
Abstract
21.1 Introduction
21.2 Background
21.3 Distributed Specification Mining
21.4 Implementation and Empirical Evaluation
21.5 Related Work
21.6 Conclusion and Future Work
Part 5
Data Analysis at Scale (Big Data)