Part 1: Prolog: Optimizing Compilation
by Sid Touati, Benoit de Dinechin
Advanced Backend Optimization
Cover
Contents
Title Page
Copyright
Introduction
I.1. Inside this book
I.2. Other contributors
I.3. Basics on instruction-level parallelism processor architectures
Part 1: Prolog: Optimizing Compilation
1 On the Decidability of Phase Ordering in Optimizing Compilation
1.1. Introduction to the Phase Ordering Problem
1.2. Background on Phase Ordering
1.3. Toward a Theoretical Model for the Phase Ordering Problem
1.4. Examples of Decidable Simplified Cases
1.5. Compiler Optimization Parameter Space Exploration
1.6. Conclusion on Phase Ordering in Optimizing Compilation
Part 2: Instruction Scheduling
2 Instruction Scheduling Problems and Overview
2.1. VLIW Instruction Scheduling Problems
2.2. Software Pipelining
2.3. Instruction Scheduling and Register Allocation
3 Applications of Machine Scheduling to Instruction Scheduling
3.1. Advances in Machine Scheduling
3.2. List Scheduling Algorithms
3.3. Time-Indexed Scheduling Problem Formulations
4 Instruction Scheduling Before Register Allocation
4.1. Instruction Scheduling for an ILP Processor: Case of a VLIW Architecture
4.2. Large Neighborhood Search for the Resource-Constrained Modulo Scheduling Problem
4.3. Resource-Constrained Modulo Scheduling Problem
4.4. Time-Indexed Integer Programming Formulations
4.5. Large Neighborhood Search Heuristic
4.6. Summary and Conclusions
5 Instruction Scheduling After Register Allocation
5.1. Introduction
5.2. Local Instruction Scheduling
5.3. Global Instruction Scheduling
5.4. Experimental Results
5.5. Conclusions
6 Dealing in Practice With Memory Hierarchy Effects and Instruction Level Parallelism
6.1. The Problem of Hardware Memory Disambiguation at Runtime
6.2. Data Preloading and Prefetching
Part 3: Register Optimization
7 The Register Need of a Fixed Instruction Schedule
7.1. Data Dependence Graph and Processor Model for Register Optimization
7.2. The Acyclic Register Need
7.3. The Periodic Register Need
7.4. Computing the Periodic Register Need
7.5. Some Theoretical Results on the Periodic Register Need
7.6. Conclusion on the Register Requirement
8 The Register Saturation
8.1. Motivations on the Register Saturation Concept
8.2. Computing the Acyclic Register Saturation
8.3. Computing the Periodic Register Saturation
8.4. Conclusion on the Register Saturation
9 Spill Code Reduction
9.1. Introduction to Register Constraints in Software Pipelining
9.2. Related Work in Periodic Register Allocation
9.3. SIRA: Schedule Independent Register Allocation
9.4. SIRALINA: An Efficient Polynomial Heuristic for SIRA
9.5. Experimental Results With SIRA
9.6. Conclusion on Spill Code Reduction
10 Exploiting the Register Access Delays Before Instruction Scheduling
10.1. Introduction
10.2. Problem Description of DDG Circuits With Non-Positive Distances
10.3. Necessary and Sufficient Condition to Avoid Non-Positive Circuits
10.4. Application to the SIRA Framework
10.5. Experimental Results on Eliminating Non-Positive Circuits
10.6. Conclusion on Non-Positive Circuit Elimination
11 Loop Unrolling Degree Minimization for Periodic Register Allocation
11.1. Introduction
11.2. Background
11.3. Problem Description of Unroll Factor Minimization for Unscheduled Loops
11.4. Algorithmic Solution for Unroll Factor Minimization: Single Register Type
11.5. Unroll Factor Minimization in the Presence of Multiple Register Types
11.6. Unroll Factor Reduction for Already Scheduled Loops
11.7. Experimental Results
11.8. Related Work
11.9. Conclusion on Loop Unroll Degree Minimization
Part 4: Epilog: Performance, Open Problems
12 Statistical Performance Analysis: The Speedup-Test Protocol
12.1. Code Performance Variation
12.2. Background and Notations
12.3. Analyzing the Statistical Significance of the Observed Speedups
12.4. The Speedup-Test Software
12.5. Evaluating the Proportion of Accelerated Benchmarks by a Confidence Interval
12.6. Experiments and Applications
12.7. Related Work
12.8. Discussion and Conclusion on the Speedup-Test Protocol
Conclusion
Appendix 1: Presentation of the Benchmarks Used in Our Experiments
A1.1. Qualitative benchmarks presentation
A1.2. Quantitative benchmarks presentation
A1.3. Changing the architectural configuration of the processor
Appendix 2: Register Saturation Computation on Stand-Alone DDG
A2.1. The acyclic register saturation
A2.2. The periodic register saturation
Appendix 3: Efficiency of SIRA on the Benchmarks
A3.1. Efficiency of SIRALINA on stand-alone DDG
A3.2. Efficiency of SIRALINA plugged with an industrial compiler
Appendix 4: Efficiency of Non-Positive Circuit Elimination in the SIRA Framework
A4.1. Experimental setup
A4.2. Comparison between the heuristics execution times
A4.3. Convergence of the proactive heuristic (iterative SIRALINA)
A4.4. Qualitative analysis of the heuristics
A4.5. Conclusion on non-positive circuit elimination strategy
Appendix 5: Loop Unroll Degree Minimization: Experimental Results
A5.1. Stand-alone experiments with single register types
A5.2. Experiments with multiple register types
Appendix 6: Experimental Efficiency of Software Data Preloading and Prefetching for Embedded VLIW
Appendix 7: Appendix of the Speedup-Test Protocol
A7.1. Why is the observed minimal execution time not necessarily a good statistical estimation of program performances?
A7.2. Hypothesis testing in statistical and probability theory
A7.3. What is a reasonable large sample? Observing the central limit theorem in practice
Bibliography
Lists of Figures, Tables and Algorithms
LIST OF FIGURES
LIST OF TABLES
LIST OF ALGORITHMS
Index
PART 1
Prolog: Optimizing Compilation