Computer Architecture

A Quantitative Approach


  • John Hennessy, President, Stanford University, Palo Alto, CA, USA
  • David Patterson, Pardee Professor of Computer Science, University of California, Berkeley, USA

A new edition of the best-selling title, considered for over a decade to be essential reading for every serious student and practitioner of computer design. Computer Architecture has been updated throughout to address the most important trends facing computer designers today. In this edition, the authors bring their trademark method of quantitative analysis not only to high-performance desktop machine design but also to the design of embedded and server systems. They illustrate their principles with designs from all three of these domains, including examples from consumer electronics, multimedia and web technologies, and high-performance computing.


Audience

First-year graduate students in computer architecture, and anyone involved in designing computers or designing with computers, including architects, computer system engineers, and designers of compilers and operating systems.


Book information

  • Published: May 2002
  • ISBN: 978-1-55860-724-8

Table of Contents

Chapter 1 Fundamentals of Computer Design
  1.1 Introduction
  1.2 The Changing Face of Computing and the Task of the Computer Designer
  1.3 Technology Trends
  1.4 Cost, Price, and Their Trends
  1.5 Measuring and Reporting Performance
  1.6 Quantitative Principles of Computer Design
  1.7 Putting It All Together: Performance and Price-Performance
  1.8 Another View: Power Consumption and Efficiency as the Metric
  1.9 Fallacies and Pitfalls
  1.10 Concluding Remarks
  1.11 Historical Perspective and References
  Exercises

Chapter 2 Instruction Set Principles and Examples
  2.1 Introduction
  2.2 Classifying Instruction Set Architectures
  2.3 Memory Addressing
  2.4 Addressing Modes for Signal Processing
  2.5 Type and Size of Operands
  2.6 Operands for Media and Signal Processing
  2.7 Operations in the Instruction Set
  2.8 Operations for Media and Signal Processing
  2.9 Instructions for Control Flow
  2.10 Encoding an Instruction Set
  2.11 Crosscutting Issues: The Role of Compilers
  2.12 Putting It All Together: The MIPS Architecture
  2.13 Another View: The Trimedia TM32 CPU
  2.14 Fallacies and Pitfalls
  2.15 Concluding Remarks
  2.16 Historical Perspective and References
  Exercises

Chapter 3 Instruction-Level Parallelism and Its Dynamic Exploitation
  3.1 Instruction-Level Parallelism: Concepts and Challenges
  3.2 Overcoming Data Hazards with Dynamic Scheduling
  3.3 Dynamic Scheduling: Examples and the Algorithm
  3.4 Reducing Branch Costs with Dynamic Hardware Prediction
  3.5 High-Performance Instruction Delivery
  3.6 Taking Advantage of More ILP with Multiple Issue
  3.7 Hardware-Based Speculation
  3.8 Studies of the Limitations of ILP
  3.9 Limitations on ILP for Realizable Processors
  3.10 Putting It All Together: The P6 Microarchitecture
  3.11 Another View: Thread-Level Parallelism
  3.12 Crosscutting Issues: Using an ILP Datapath to Exploit TLP
  3.13 Fallacies and Pitfalls
  3.14 Concluding Remarks
  3.15 Historical Perspective and References
  Exercises

Chapter 4 Exploiting Instruction-Level Parallelism with Software Approaches
  4.1 Basic Compiler Techniques for Exposing ILP
  4.2 Static Branch Prediction
  4.3 Static Multiple Issue: The VLIW Approach
  4.4 Advanced Compiler Support for Exposing and Exploiting ILP
  4.5 Hardware Support for Exposing More Parallelism at Compile Time
  4.6 Crosscutting Issues
  4.7 Putting It All Together: The Intel IA-64 Architecture and Itanium Processor
  4.8 Another View: ILP in the Embedded and Mobile Markets
  4.9 Fallacies and Pitfalls
  4.10 Concluding Remarks
  4.11 Historical Perspective and References
  Exercises

Chapter 5 Memory-Hierarchy Design
  5.1 Introduction
  5.2 Review of the ABCs of Caches
  5.3 Cache Performance
  5.4 Reducing Cache Miss Penalty
  5.5 Reducing Miss Rate
  5.6 Reducing Cache Miss Penalty or Miss Rate via Parallelism
  5.7 Reducing Hit Time
  5.8 Main Memory and Organizations for Improving Performance
  5.9 Memory Technology
  5.10 Virtual Memory
  5.11 Protection and Examples of Virtual Memory
  5.12 Crosscutting Issues in the Design of Memory Hierarchies
  5.13 Putting It All Together: Alpha 21264 Memory Hierarchy
  5.14 Another View: The Emotion Engine of the Sony Playstation 2
  5.15 Another View: The Sun Fire 6800 Server
  5.16 Fallacies and Pitfalls
  5.17 Concluding Remarks
  5.18 Historical Perspective and References
  Exercises

Chapter 6 Multiprocessors and Thread-Level Parallelism
  6.1 Introduction
  6.2 Characteristics of Application Domains
  6.3 Symmetric Shared-Memory Architectures
  6.4 Performance of Symmetric Shared-Memory Multiprocessors
  6.5 Distributed Shared-Memory Architectures
  6.6 Performance of Distributed Shared-Memory Multiprocessors
  6.7 Synchronization
  6.8 Models of Memory Consistency: An Introduction
  6.9 Multithreading: Exploiting Thread-Level Parallelism within a Processor
  6.10 Crosscutting Issues
  6.11 Putting It All Together: Sun's Wildfire Prototype
  6.12 Another View: Multithreading in a Commercial Server
  6.13 Another View: Embedded Multiprocessors
  6.14 Fallacies and Pitfalls
  6.15 Concluding Remarks
  6.16 Historical Perspective and References
  Exercises

Chapter 7 Storage Systems
  7.1 Introduction
  7.2 Types of Storage Devices
  7.3 Buses: Connecting I/O Devices to CPU/Memory
  7.4 Reliability, Availability, and Dependability
  7.5 RAID: Redundant Arrays of Inexpensive Disks
  7.6 Errors and Failures in Real Systems
  7.7 I/O Performance Measures
  7.8 A Little Queuing Theory
  7.9 Benchmarks of Storage Performance and Availability
  7.10 Crosscutting Issues
  7.11 Designing an I/O System in Five Easy Pieces
  7.12 Putting It All Together: EMC Symmetrix and Celerra
  7.13 Another View: Sanyo DSC-110 Digital Camera
  7.14 Fallacies and Pitfalls
  7.15 Concluding Remarks
  7.16 Historical Perspective and References
  Exercises

Chapter 8 Interconnection Networks and Clusters
  8.1 Introduction
  8.2 A Simple Network
  8.3 Interconnection Network Media
  8.4 Connecting More Than Two Computers
  8.5 Network Topology
  8.6 Practical Issues for Commercial Interconnection Networks
  8.7 Examples of Interconnection Networks
  8.8 Internetworking
  8.9 Crosscutting Issues for Interconnection Networks
  8.10 Clusters
  8.11 Designing a Cluster
  8.12 Putting It All Together: The Google Cluster of PCs
  8.13 Another View: Inside a Cell Phone
  8.14 Fallacies and Pitfalls
  8.15 Concluding Remarks
  8.16 Historical Perspective and References
  Exercises

Appendix A Pipelining: Basic and Intermediate Concepts
Appendix B Solutions to Selected Exercises

Online Appendices
  Appendix C A Survey of RISC Architectures for Desktop, Server, and Embedded Computers
  Appendix D An Alternative to RISC: The Intel 80x86
  Appendix E Another Alternative to RISC: The VAX Architecture
  Appendix F The IBM 360/370 Architecture for Mainframe Computers
  Appendix G Vector Processors
  Appendix H Computer Arithmetic
  Appendix I Implementing Coherence Protocols