Engineering a Compiler

Engineering a Compiler

2nd Edition - January 18, 2011

Write a review

  • Authors: Keith Cooper, Linda Torczon
  • Hardcover ISBN: 9780120884780
  • eBook ISBN: 9780080916613

Purchase options

Purchase options
Available
DRM-free (EPub, Mobi, PDF)
Sales tax will be calculated at check-out

Institutional Subscription

Free Global Shipping
No minimum order

Description

This entirely revised second edition of Engineering a Compiler is full of technical updates and new material covering the latest developments in compiler technology. In this comprehensive text you will learn important techniques for constructing a modern compiler. Leading educators and researchers Keith Cooper and Linda Torczon combine basic principles with pragmatic insights from their experience building state-of-the-art compilers. They will help you fully understand important techniques such as compilation of imperative and object-oriented languages, construction of static single assignment forms, instruction scheduling, and graph-coloring register allocation.

Key Features

  • In-depth treatment of algorithms and techniques used in the front end of a modern compiler
  • Focus on code optimization and code generation, the primary areas of recent research and development
  • Improvements in presentation including conceptual overviews for each chapter, summaries and review questions for sections, and prominent placement of definitions for new terms
  • Examples drawn from several different programming languages

Readership

Primarily graduate, some undergraduate students in computer science; professional compiler writers, system software developers, architects and computer system designers

Table of Contents

  • CHAPTER 1 Overview of Compilation

    1.1 Introduction

    1.2 Compiler Structure

    1.3 Overview of Translation

    1.3.1 The Front End

    1.3.2 The Optimizer

    1.3.3 The Back End

    1.4 Summary and Perspective

    CHAPTER 2 Scanners

    2.1 Introduction

    2.2 Recognizing Words

    2.2.1 A Formalism for Recognizers

    2.2.2 Recognizing More Complex Words

    2.3 Regular Expressions

    2.3.1 Formalizing the Notation

    2.3.2 Examples

    2.3.3 Closure Properties of REs

    2.4 From Regular Expression to Scanner

    2.4.1 Nondeterministic Finite Automata

    2.4.2 Regular Expression to NFA: Thompson’s Construction

    2.4.3 NFA to DFA: The Subset Construction

    2.4.4 DFA to Minimal DFA: Hopcroft’s Algorithm

    2.4.5 Using a DFA as a Recognizer

    2.5 Implementing Scanners

    2.5.1 Table-Driven Scanners

    2.5.2 Direct-Coded Scanners

    2.5.3 Hand-coded Scanners

    2.5.4 Handling Keywords

    2.6 Advanced Topics

    2.6.1 DFA to Regular Expression

    2.6.2 Another Approach to DFA Minimization: Brzozowski’s Algorithm

    2.6.3 Closure-free Regular Expressions

    2.7 Chapter Summary and Perspective

    CHAPTER 3 Parsers

    3.1 Introduction

    3.2 Expressing Syntax

    3.2.1 Why Not Regular Expressions?

    3.2.2 Context-Free Grammars

    3.2.3 More Complex Examples

    3.2.4 Encoding Meaning into Structure

    3.2.5 Discovering a Derivation for an Input String

    3.3 Top-Down Parsing

    3.3.1 Transforming a Grammar for Top-Down Parsing

    3.3.2 Top-Down Recursive-Descent Parsers

    3.3.3 Table-Driven LL(1) Parsers

    3.4 Bottom-Up Parsing

    3.4.1 The LR(1) Parsing Algorithm

    3.4.2 Building LR(1) Tables

    3.4.3 Errors in the Table Construction

    3.5 Practical Issues

    3.5.1 Error Recovery

    3.5.2 Unary Operators

    3.5.3 Handling Context-Sensitive Ambiguity

    3.5.4 Left versus Right Recursion

    3.6 Advanced Topics 1

    3.6.1 Optimizing a Grammar

    3.6.2 Reducing the Size of LR(1) Tables

    3.7 Summary and Perspective

    CHAPTER 4 Context-Sensitive Analysis

    4.1 Introduction

    4.2 An Introduction to Type Systems

    4.2.1 The Purpose of Type Systems

    4.2.2 Components of a Type System

    4.3 The Attribute-Grammar Framework

    4.3.1 Evaluation Methods

    4.3.2 Circularity

    4.3.3 Extended Examples

    4.3.4 Problems with the Attribute-Grammar Approach

    4.4 Ad Hoc Syntax-Directed Translation

    4.4.1 Implementing Ad Hoc Syntax-Directed Translation

    4.4.2 Examples

    4.5 Advanced Topics

    4.5.1 Harder Problems in Type Inference

    4.5.2 Changing Associativity

    4.6 Summary and Perspective

    CHAPTER 5 Intermediate Representations

    5.1 Introduction

    5.1.1 A Taxonomy of Intermediate Representations

    5.2 Graphical IRs

    5.2.1 Syntax-Related Trees

    5.2.2 Graphs

    5.2.3 Review Questions

    5.3 Linear IRs

    5.3.1 Stack-Machine Code

    5.3.2 Three-Address Code

    5.3.3 Representing Linear Codes

    5.3.4 Building a Control-flow Graph from a Linear Code

    5.4 Mapping Values to Names

    5.4.1 Naming Temporary Values

    5.4.2 Static Single-Assignment Form

    5.4.3 Memory Models

    5.5 Symbol Tables

    5.5.1 Hash Tables

    5.5.2 Building a Symbol Table

    5.5.3 Handling Nested Scopes

    5.5.4 The Many Uses for Symbol Tables

    5.5.5 Other Uses for Symbol Table Technology

    5.6 Summary and Perspective

    CHAPTER 6 The Procedure Abstraction

    6.1 Introduction

    6.2 Procedure Calls

    6.3 Name Spaces 2

    6.3.1 Name Spaces of Algol-like Languages

    6.3.2 Runtime Structures to Support Algol-like Languages

    6.3.3 Name Spaces of Object-Oriented Languages

    6.3.4 Runtime Structures to Support Object-Oriented Languages

    6.4 Communicating Values Between Procedures

    6.4.1 Passing Parameters

    6.4.2 Returning Values

    6.4.3 Establishing Addressability

    6.5 Standardized Linkages

    6.6 Advanced Topics

    6.6.1 Explicit Heap Management

    6.6.2 Implicit Deallocation

    6.7 Summary and Perspective

    CHAPTER 7 Code Shape

    7.1 Introduction

    7.2 Assigning Storage Locations

    7.2.1 Placing Runtime Data Structures

    7.2.2 Layout for Data Areas

    7.2.3 Keeping Values in Registers

    7.3 Arithmetic Operators

    7.3.1 Reducing Demand for Registers

    7.3.2 Accessing Parameter Values

    7.3.3 Function Calls in an Expression

    7.3.4 Other Arithmetic Operators

    7.3.5 Mixed-Type Expressions

    7.3.6 Assignment as an Operator

    7.4 Boolean and Relational Operators

    7.4.1 Representations

    7.4.2 Hardware Support for Relational Operations

    7.5 Storing and Accessing Arrays

    7.5.1 Referencing a Vector Element

    7.5.2 Array Storage Layout

    7.5.3 Referencing an Array Element

    7.5.4 Range Checking

    7.6 Character Strings

    7.6.1 String Representations

    7.6.2 String Assignment

    7.6.3 String Concatenation

    7.6.4 String Length

    7.7 Structure References

    7.7.1 Understanding Structure Layouts

    7.7.2 Arrays of Structures

    7.7.3 Unions and Runtime Tags

    7.7.4 Pointers and Anonymous Values

    7.8 Control-Flow Constructs

    7.8.1 Conditional Execution

    7.8.2 Loops and Iteration

    7.8.3 Case Statements

    7.9 Procedure Calls

    7.9.1 Evaluating Actual Parameters

    7.9.2 Saving and Restoring Registers

    7.10 Summary and Perspective

    CHAPTER 8 Introduction to Optimization

    8.1 Introduction

    8.2 Background

    8.2.1 Examples

    8.2.2 Considerations for Optimization

    8.2.3 Opportunities for Optimization

    8.3 Scope of Optimization

    8.4 Local Optimization

    8.4.1 Local Value Numbering

    8.4.2 Tree-height Balancing

    8.5 Regional Optimization

    8.5.1 Superlocal Value Numbering

    8.5.2 Loop Unrolling

    8.6 Global Optimization

    8.6.1 Finding Uninitialized Variables with Live Information

    8.6.2 Global Code Placement

    8.7 Interprocedural Optimization

    8.7.1 Inline Substitution

    8.7.2 Procedure Placement

    8.7.3 Compiler Organization for Interprocedural Optimization

    8.8 Summary and Perspective

    CHAPTER 9 Data-Flow Analysis

    9.1 Introduction

    9.2 Iterative Data-Flow Analysis

    9.2.1 Dominance

    9.2.2 Live Variable Analysis

    9.2.3 Limitations on Data-Flow Analysis

    9.2.4 Other Data-Flow Problems

    9.3 Static Single-Assignment Form

    9.3.1 A Simple Method for Building SSA Form

    9.3.2 Dominance Frontiers

    9.3.3 Placing f-Functions

    9.3.4 Renaming

    9.3.5 Translation Out of SSA Form

    9.3.6 Using Static Single Assignment Form

    9.4 Interprocedural Analysis

    9.4.1 Call Graph Construction

    9.4.2 Interprocedural Constant Propagation

    9.5 Advanced Topics

    9.5.1 Structural Data-Flow Algorithms and Reducibility

    9.5.2 Speeding up the Iterative Dominance Framework

    9.6 Summary and Perspective

    CHAPTER 10 Scalar Optimizations

    10.1 Introduction

    10.2 Eliminating Useless and Unreachable Code

    10.2.1 Eliminating Useless Code

    10.2.2 Eliminating Useless Control Flow

    10.2.3 Eliminating Unreachable Code

    10.3 Code Motion

    10.3.1 Lazy Code Motion

    10.3.2 Code Hoisting

    10.4 Specialization

    10.4.1 Tail Call Optimization

    10.4.2 Leaf Call Optimization

    10.4.3 Parameter Promotion

    10.5 Redundancy Elimination

    10.5.1 Value Identity versus Name Identity

    10.5.2 Dominator-based Value Numbering

    10.6 Enabling Other Transformations

    10.6.1 Superblock Cloning

    10.6.2 Procedure Cloning

    10.6.3 Loop Unswitching

    10.6.4 Renaming

    10.7 Advanced Topics

    10.7.1 Combining Optimizations

    10.7.2 Strength Reduction

    10.7.3 Choosing an Optimization Sequence

    10.8 Summary and Perspective

    CHAPTER 11 Instruction Selection

    11.1 Introduction

    11.2 Code Generation

    11.3 Extending the Simple Tree-Walk Scheme

    11.4 Instruction Selection via Tree-Pattern Matching

    11.4.1 Rewrite Rules

    11.4.2 Finding a Tiling

    11.4.3 Tools

    11.5 Instruction Selection via Peephole Optimization

    11.5.1 Peephole Optimization

    11.5.2 Peephole Transformers

    11.6 Advanced Topics

    11.6.1 Learning Peephole Patterns

    11.6.2 Generating Instruction Sequences

    11.7 Summary and Perspective

    CHAPTER 12 Instruction Scheduling

    12.1 Introduction

    12.2 The Instruction-Scheduling Problem

    12.2.1 Other Measures of Schedule Quality

    12.2.2 What Makes Scheduling Hard?

    12.3 Local List Scheduling

    12.3.1 The Algorithm

    12.3.2 Scheduling Operations with Variable Delays

    12.3.3 Extending the Algorithm

    12.3.4 Tie Breaking in the List Scheduling Algorithm

    12.3.5 Forward versus Backward List Scheduling

    12.3.6 Improving the Efficiency of List Scheduling

    12.4 Regional Scheduling

    12.4.1 Scheduling Extended Basic Blocks

    12.4.2 Trace Scheduling

    12.4.3 Cloning for Context

    12.5 Advanced Topics

    12.5.1 The Strategy of Software Pipelining

    12.5.2 An Algorithm for Software Pipelining

    12.6 Summary and Perspective

    CHAPTER 13 Register Allocation

    13.1 Introduction

    13.2 Background Issues

    13.2.1 Memory versus Registers

    13.2.2 Allocation versus Assignment

    13.2.3 Register Classes

    13.3 Local Register Allocation and Assignment

    13.3.1 Top-Down Local Register Allocation

    13.3.2 Bottom-Up Local Register Allocation

    13.3.3 Moving Beyond Single Blocks

    13.4 Global Register Allocation and Assignment

    13.4.1 Discovering Global Live Ranges

    13.4.2 Estimating Global Spill Costs

    13.4.3 Interferences and the Interference Graph

    13.4.4 Top-Down Coloring

    13.4.5 Bottom-Up Coloring

    13.4.6 Coalescing Copies to Reduce Degree

    13.4.7 Comparing Top-Down and Bottom-Up Global Allocators

    13.4.8 Encoding Machine Constraints in the Interference Graph

    13.5 Advanced Topics

    13.5.1 Variations on Graph-Coloring Allocation

    13.5.2 Global Register Allocation over SSA Form

    13.6 Summary and Perspective

    CHAPTER A ILOC

    A.1 Introduction

    A.2 Naming Conventions

    A.3 Individual Operations

    A.3.1 Arithmetic

    A.3.2 Shifts

    A.3.3 Memory Operations

    A.3.4 Register-to-Register Copy Operations

    A.4 Control-Flow Operations

    A.4.1 Alternate Comparison and Branch Syntax

    A.4.2 Jumps

    A.5 Representing SSA Form

    CHAPTER B Data Structures

    B.1 Introduction

    B.2 Representing Sets

    B.2.1 Representing Sets as Ordered Lists

    B.2.2 Representing Sets as Bit Vectors

    B.2.3 Representing Sparse Sets

    B.3 Implementing Intermediate Representations

    B.3.1 Graphical Intermediate Representations

    B.3.2 Linear Intermediate Forms

    B.4 Implementing Hash Tables

    B.4.1 Choosing a Hash Function

    B.4.2 Open Hashing

    B.4.3 Open Addressing

    B.4.4 Storing Symbol Records

    B.4.5 Adding Nested Lexical Scopes

    B.5 A Flexible Symbol-Table Design

Product details

  • No. of pages: 824
  • Language: English
  • Copyright: © Morgan Kaufmann 2011
  • Published: January 18, 2011
  • Imprint: Morgan Kaufmann
  • Hardcover ISBN: 9780120884780
  • eBook ISBN: 9780080916613

About the Authors

Keith Cooper

Dr. Cooper Ph.D., Professor, Dept. of Computer Science at Rice University, is the leader of the Massively Scalar Compiler Project at Rice, which investigates issues relating to optimization and code generation for modern machines. He is also a member of the Center for High Performance Software Research, the Computer and Information Technology Institute, and the Center for Multimedia Communication -- all at Rice. He teaches courses in Compiler Construction at the undergraduate and graduate level.

Affiliations and Expertise

Department of Computer Science, Rice University, Houston, Texas, USA

Linda Torczon

Linda Torczon is a principal investigator on the Massively Scalar Compiler Project at Rice University, and the Grid Application Development Software Project sponsored by the next Generation Software program of the National Science Foundation. She also serves as the executive director of HiPerSoft and of the Los Alamos Computer Science Institute. Her research interests include code generation, interprocedural dataflow analysis and optimization, and programming environments.

Affiliations and Expertise

Principal Investigator on the Massively Scalar Compiler Project, Rice University, Houston, Texas, USA

Ratings and Reviews

Write a review

Latest reviews

(Total rating for all reviews)

  • Wei N. Sun Feb 20 2022

    Very good

    Very good

  • AlexanderNye Thu Mar 28 2019

    Excellent in both depth and

    Excellent in both depth and friendliness to beginners

  • JordanSuber Thu Jan 31 2019

    Great Book

    I paired this book with a book implementing a compiler in Java goes hand and hand Writing Compilers and Interpreters: A Software Engineering Approach 3rd Edition written by Ronald Mak

  • Chen D. Wed Jan 24 2018

    Textbook for teaching compiler optimization

    At Rochester I have been using EAC since the publication of the first edition teaching compiler optimization, which is the second half of the book Chapters 8 through 13. EAC combines the theory (necessary for proving convergence/correctness of optimization) and engineering (important techniques in real compilers and their implementation). It balances between the two, as opposed to focusing more on theory e.g. the Dragon book by Aho et al., or more on engineering, e.g. Appel's Tiger book. This balance is beneficial in teaching the compiler front-end (the first half of the book), and it is as important, if not more so, in teaching program analysis and optimization. In the Rochester course, EAC provides the foundation and the approach for later material including dependence theory and loop optimization/parallelization based on the book by Allen and Kennedy.