Architecture Design for Soft Errors


  • Shubu Mukherjee, Principal Engineer and Director, SPEARS (Simulation & Pathfinding of Efficient and Reliable Systems): Intel

This book provides a comprehensive description of the architectural techniques to tackle the soft error problem. It covers the new methodologies for quantitative analysis of soft errors as well as novel, cost-effective architectural techniques to mitigate them. To provide readers with a better grasp of the broader problem definition and solution space, this book also delves into the physics of soft errors and reviews current circuit and software mitigation techniques.


Practitioners in the semiconductor industry, researchers and developers in computer architecture, advanced graduate seminar courses on soft errors, and as a reference book for undergraduate courses in computer architecture. I will describe many basic and advanced techniques to make this book of interest to this broad audience.


Book information

  • Published: February 2008
  • ISBN: 978-0-12-369529-1


"Dr. Shubu Mukherjee's book is a welcome surprise: books by architecture leaders in major companies are few and far between. Written from the viewpoint of a working engineer, the book describes sources of soft errors and solutions involving device, logic, and architecture design to reduce the effects of soft errors." - Max Baron, Microprocessor Report - May 27, 2008

Table of Contents

  • Chapter 1: Introduction
  • Chapter 2: Device- and Circuit-Level Modeling, Measurement, and Mitigation
  • Chapter 3: Architectural Vulnerability Analysis
  • Chapter 4: Advanced Architectural Vulnerability Analysis
  • Chapter 5: Error Coding Techniques
  • Chapter 6: Fault Detection via Redundant Execution
  • Chapter 7: Hardware Error Recovery
  • Chapter 8: Software Detection and Recovery