Computer Architecture

A Quantitative Approach

Computer Architecture: A Quantitative Approach explores the ways that software and technology in the cloud are accessed by cell phones, tablets, laptops, and other mobile computing devices. The book became part of Intel's 2012 recommended reading list for developers, and it covers the revolution of mobile computing. The text also highlights the two most important factors in architecture today: parallelism and the memory hierarchy. The book's six chapters follow a consistent framework: an explanation of the chapter's ideas; a "crosscutting issues" section, which shows how the concepts covered in one chapter connect with those in other chapters; a "putting it all together" section that links these concepts by discussing how they are applied in real machines; and detailed examples of misunderstandings and architectural traps commonly encountered by developers and architects. The first chapter includes formulas for energy, static and dynamic power, integrated circuit costs, reliability, and availability. Chapter 2 covers memory hierarchy design, including virtual machines, SRAM and DRAM technologies, and new material on Flash memory. The third chapter covers the exploitation of instruction-level parallelism in high-performance processors, including superscalar execution, dynamic scheduling, and multithreading, followed by an introduction to vector architectures in the fourth chapter. Chapters 5 and 6 describe multicore processors and warehouse-scale computers (WSCs), respectively. This book is an important reference for computer architects, programmers, application developers, compiler and system software developers, and computer system designers.
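
As a taste of Chapter 1's quantitative approach, here is a minimal sketch (in LaTeX) of the standard CMOS energy, power, and availability relations of the kind the chapter develops; the book's exact presentation and notation may differ:

\[ \text{Energy}_{\text{dynamic}} \propto \tfrac{1}{2} \times \text{Capacitive load} \times \text{Voltage}^2 \]
\[ \text{Power}_{\text{dynamic}} \propto \tfrac{1}{2} \times \text{Capacitive load} \times \text{Voltage}^2 \times \text{Frequency switched} \]
\[ \text{Power}_{\text{static}} \propto \text{Current}_{\text{static}} \times \text{Voltage} \]
\[ \text{Module availability} = \frac{\text{MTTF}}{\text{MTTF} + \text{MTTR}} \]

For example, because dynamic power scales with the square of the supply voltage, lowering the voltage to 0.85 of its original value cuts dynamic power at a fixed frequency to roughly \(0.85^2 \approx 0.72\) of its original value, which is why voltage scaling is such a powerful lever in the energy discussions of Chapter 1.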

Audience
Computer Architects, Computer System Designers, Compiler and System Software Developers, Programmers, Application Developers

Paperback, 856 Pages

Published: September 2011

Imprint: Morgan Kaufmann

ISBN: 978-0-12-383872-8

Reviews

  • "What has made this book an enduring classic is that each edition is not an update, but an extensive revision that presents the most current information and unparalleled insight into this fascinating and fast changing field. For me, after over twenty years in this profession, it is also another opportunity to experience that student-grade admiration for two remarkable teachers." - From the Foreword by Luiz André Barroso, Google, Inc.

    "This is an academic textbook that is also suitable for a far broader readership. Each chapter is organised in the same structure, with the main content supported by case studies and exercises… Having read this book I now have a far better understanding of why processors from all the different designers and manufacturers are so different. Memory hierarchies, multicore architectures and compiler optimisation are all covered in great detail. I was particularly interested in their discussion of graphical processing units and how they are suitable for far more than just graphical workloads… What is great about this book is that it moves with the times. There is a lot of content on processors for mobile computing, and power usage is a pervasive theme. At the other extreme there is an excellent chapter on warehouse scale computers, which offers tremendous insight into the cloud computing infrastructure provided by Google, Amazon and others. If your job has anything to do with IT infrastructure then I recommend this book as a must-read. As an academic text book it has both depth and breadth. And if you're just interested in the topic you'll gain a huge amount of insight into the fundamentals of computer architecture."--The Chartered Institute for IT


Contents


  • Foreword

    Preface

    Acknowledgments

    Chapter 1 Fundamentals of Quantitative Design and Analysis

        1.1 Introduction

        1.2 Classes of Computers

        1.3 Defining Computer Architecture

        1.4 Trends in Technology

        1.5 Trends in Power and Energy in Integrated Circuits

        1.6 Trends in Cost

        1.7 Dependability

        1.8 Measuring, Reporting, and Summarizing Performance

        1.9 Quantitative Principles of Computer Design

        1.10 Putting It All Together: Performance, Price, and Power

        1.11 Fallacies and Pitfalls

        1.12 Concluding Remarks

        1.13 Historical Perspectives and References

        Case Studies and Exercises by Diana Franklin

    Chapter 2 Memory Hierarchy Design

        2.1 Introduction

        2.2 Ten Advanced Optimizations of Cache Performance

        2.3 Memory Technology and Optimizations

        2.4 Protection: Virtual Memory and Virtual Machines

        2.5 Crosscutting Issues: The Design of Memory Hierarchies

        2.6 Putting It All Together: Memory Hierarchies in the ARM Cortex-A8 and Intel Core i7

        2.7 Fallacies and Pitfalls

        2.8 Concluding Remarks: Looking Ahead

        2.9 Historical Perspective and References

        Case Studies and Exercises

    Chapter 3 Instruction-Level Parallelism and Its Exploitation

        3.1 Instruction-Level Parallelism: Concepts and Challenges

        3.2 Basic Compiler Techniques for Exposing ILP

        3.3 Reducing Branch Costs with Advanced Branch Prediction

        3.4 Overcoming Data Hazards with Dynamic Scheduling

        3.5 Dynamic Scheduling: Examples and the Algorithm

        3.6 Hardware-Based Speculation

        3.7 Exploiting ILP Using Multiple Issue and Static Scheduling

        3.8 Exploiting ILP Using Dynamic Scheduling, Multiple Issue, and Speculation

        3.9 Advanced Techniques for Instruction Delivery and Speculation

        3.10 Studies of the Limitations of ILP

        3.11 Cross-Cutting Issues: ILP Approaches and the Memory System

        3.12 Multithreading: Exploiting Thread-Level Parallelism to Improve Uniprocessor Throughput

        3.13 Putting It All Together: The Intel Core i7 and ARM Cortex-A8

        3.14 Fallacies and Pitfalls

        3.15 Concluding Remarks: What’s Ahead?

        3.16 Historical Perspective and References

        Case Studies and Exercises

    Chapter 4 Data-Level Parallelism in Vector, SIMD, and GPU Architectures

        4.1 Introduction

        4.2 Vector Architecture

        4.3 SIMD Instruction Set Extensions for Multimedia

        4.4 Graphics Processing Units

        4.5 Detecting and Enhancing Loop-Level Parallelism

        4.6 Crosscutting Issues

        4.7 Putting It All Together: Mobile versus Server GPUs and Tesla versus Core i7

        4.8 Fallacies and Pitfalls

        4.9 Concluding Remarks

        4.10 Historical Perspective and References

        Case Study and Exercises

    Chapter 5 Thread-Level Parallelism

        5.1 Introduction

        5.2 Centralized Shared-Memory Architectures

        5.3 Performance of Symmetric Shared-Memory Multiprocessors

        5.4 Distributed Shared-Memory and Directory-Based Coherence

        5.5 Synchronization: The Basics

        5.6 Models of Memory Consistency: An Introduction

        5.7 Crosscutting Issues

        5.8 Putting It All Together: Multicore Processors and Their Performance

        5.9 Fallacies and Pitfalls

        5.10 Concluding Remarks

        5.11 Historical Perspectives and References

        Case Studies and Exercises

    Chapter 6 Warehouse-Scale Computers to Exploit Request-Level and Data-Level Parallelism

        6.1 Introduction

        6.2 Programming Models and Workloads for Warehouse-Scale Computers

        6.3 Computer Architecture of Warehouse-Scale Computers

        6.4 Physical Infrastructure and Costs of Warehouse-Scale Computers

        6.5 Cloud Computing: The Return of Utility Computing

        6.6 Crosscutting Issues

        6.7 Putting It All Together: A Google Warehouse-Scale Computer

        6.8 Fallacies and Pitfalls

        6.9 Concluding Remarks

        6.10 Historical Perspectives and References

        Case Studies and Exercises

    Appendix A Instruction Set Principles

        A.1 Introduction

        A.2 Classifying Instruction Set Architectures

        A.3 Memory Addressing

        A.4 Type and Size of Operands

        A.5 Operations in the Instruction Set

        A.6 Instructions for Control Flow

        A.7 Encoding an Instruction Set

        A.8 Crosscutting Issues: The Role of Compilers

        A.9 Putting It All Together: The MIPS Architecture

        A.10 Fallacies and Pitfalls

        A.11 Concluding Remarks

        A.12 Historical Perspective and References

        Exercises

    Appendix B Review of Memory Hierarchy

        B.1 Introduction

        B.2 Cache Performance

        B.3 Six Basic Cache Optimizations

        B.4 Virtual Memory

        B.5 Protection and Examples of Virtual Memory

        B.6 Fallacies and Pitfalls

        B.7 Concluding Remarks

        B.8 Historical Perspective and References

        Exercises

    Appendix C Pipelining: Basic and Intermediate Concepts

        C.1 Introduction

C.2 The Major Hurdle of Pipelining: Pipeline Hazards

        C.3 How Is Pipelining Implemented?

        C.4 What Makes Pipelining Hard to Implement?

        C.5 Extending the MIPS Pipeline to Handle Multicycle Operations

        C.6 Putting It All Together: The MIPS R4000 Pipeline

        C.7 Crosscutting Issues

        C.8 Fallacies and Pitfalls

        C.9 Concluding Remarks

        C.10 Historical Perspective and References

        Updated Exercises

    Online Appendices

    Appendix D Storage Systems

    Appendix E Embedded Systems

    Appendix F Interconnection Networks

    Appendix G Vector Processors in More Depth

    Appendix H Hardware and Software for VLIW and EPIC

    Appendix I Large-Scale Multiprocessors and Scientific Applications

    Appendix J Computer Arithmetic

    Appendix K Survey of Instruction Set Architectures

    Appendix L Historical Perspectives and References

    References

    Index



