Reliability, Maintainability and Risk

Practical Methods for Engineers including Reliability Centred Maintenance and Safety-Related Systems

8th Edition - June 20, 2011

  • Author: David Smith
  • eBook ISBN: 9780080969039

Reliability, Maintainability and Risk: Practical Methods for Engineers, Eighth Edition, discusses tools and techniques for reliable and safe engineering and for optimizing maintenance strategies. It emphasizes the importance of using reliability techniques to identify and eliminate potential failures early in the design cycle. The focus is on the techniques known as RAMS (reliability, availability, maintainability, and safety-integrity). The book is organized into five parts. Part 1, on reliability parameters and costs, traces the history of reliability and safety technology and presents a cost-effective approach to quality, reliability, and safety. Part 2 deals with the interpretation of failure rates, while Part 3 focuses on the prediction of reliability and risk. Part 4 discusses design and assurance techniques; review and testing techniques; reliability growth modeling; field data collection and feedback; predicting and demonstrating repair times; quantified reliability centered maintenance; and systematic failures. Part 5 deals with legal, management, and safety issues, such as project management, product liability, and safety legislation.

Key Features

  • 8th edition of this core reference for engineers who deal with the design or operation of any safety-critical systems, processes or operations
  • Answers the question: how can a defect that costs less than $1,000 to identify at the process design stage be prevented from escalating to a $100,000 field defect, or a $1m+ catastrophe?
  • Revised throughout, with new examples and standards, including must-have material on the new edition of the global functional safety standard IEC 61508, which launched in 2010


Readership

Chemical, Process, Plant, Oil & Gas and related systems safety engineers

Table of Contents

  • Preface


    Part 1 Understanding Reliability Parameters and Costs

    Chapter 1: The History of Reliability and Safety Technology

    1.1 Failure Data

    1.2 Hazardous Failures

    1.3 Reliability and Risk Prediction

    1.4 Achieving Reliability and Safety-Integrity

    1.5 The RAMS Cycle

    1.6 Contractual and Legal Pressures

    Chapter 2: Understanding Terms and Jargon

    2.1 Defining Failure and Failure Modes

    2.2 Failure Rate and Mean Time Between Failures

    2.3 Interrelationships of Terms

    2.4 The Bathtub Distribution

    2.5 Down Time and Repair Time

    2.6 Availability, Unavailability and Probability of Failure on Demand

    2.7 Hazard and Risk-Related Terms

    2.8 Choosing the Appropriate Parameter

    Chapter 3: A Cost-Effective Approach to Quality, Reliability and Safety

    3.1 Reliability and Optimum Cost

    3.2 Costs and Safety

    3.3 The Cost of Quality

    Part 2 Interpreting Failure Rates

    Chapter 4: Realistic Failure Rates and Prediction Confidence

    4.1 Data Accuracy

    4.2 Sources of Data

    4.3 Data Ranges

    4.4 Confidence Limits of Prediction

    4.5 Manufacturers’ Data

    4.6 Overall Conclusions

    Chapter 5: Interpreting Data and Demonstrating Reliability

    5.1 The Four Cases

    5.2 Inference and Confidence Levels

    5.3 The Chi-Square Test

    5.4 Understanding the Method in More Detail

    5.5 Double-Sided Confidence Limits

    5.6 Reliability Demonstration

    5.7 Sequential Testing

    5.8 Setting Up Demonstration Tests


    Chapter 6: Variable Failure Rates and Probability Plotting

    6.1 The Weibull Distribution

    6.2 Using the Weibull Method

    6.3 More Complex Cases of the Weibull Distribution

    6.4 Continuous Processes


    Part 3 Predicting Reliability and Risk

    Chapter 7: Basic Reliability Prediction Theory

    7.1 Why Predict RAMS?

    7.2 Probability Theory

    7.3 Reliability of Series Systems

    7.4 Redundancy Rules

    7.5 General Features of Redundancy


    Chapter 8: Methods of Modeling

    8.1 Block Diagrams and Repairable Systems

    8.2 Common Cause (Dependent) Failure

    8.3 Fault Tree Analysis

    8.4 Event Tree Diagrams

    Chapter 9: Quantifying the Reliability Models

    9.1 The Reliability Prediction Method

    9.2 Allowing for Diagnostic Intervals

    9.3 FMEA (Failure Mode and Effect Analysis)

    9.4 Human Factors

    9.5 Simulation

    9.6 Comparing Predictions with Targets


    Chapter 10: Risk Assessment (QRA)

    10.1 Frequency and Consequence

    10.2 Perception of Risk, ALARP and Cost per Life Saved

    10.3 Hazard Identification

    10.4 Factors to Quantify

    Part 4 Achieving Reliability and Maintainability

    Chapter 11: Design and Assurance Techniques

    11.1 Specifying and Allocating the Requirement

    11.2 Stress Analysis

    11.3 Environmental Stress Protection

    11.4 Failure Mechanisms

    11.5 Complexity and Parts

    11.6 Burn-In and Screening

    11.7 Maintenance Strategies

    Chapter 12: Design Review, Test and Reliability Growth

    12.1 Review Techniques

    12.2 Categories of Testing

    12.3 Reliability Growth Modeling


    Chapter 13: Field Data Collection and Feedback

    13.1 Reasons for Data Collection

    13.2 Information and Difficulties

    13.3 Times to Failure

    13.4 Spreadsheets and Databases

    13.5 Best Practice and Recommendations

    13.6 Analysis and Presentation of Results

    13.7 Manufacturers’ Data

    13.8 Anecdotal Data

    13.9 Examples of Failure Report Forms

    Chapter 14: Factors Influencing Down Time

    14.1 Key Design Areas

    14.2 Maintenance Strategies and Handbooks

    Chapter 15: Predicting and Demonstrating Repair Times

    15.1 Prediction Methods

    15.2 Demonstration Plans

    Chapter 16: Quantified Reliability Centered Maintenance

    16.1 What is QRCM?

    16.2 The QRCM Decision Process

    16.3 Optimum Replacement (Discard)

    16.4 Optimum Spares

    16.5 Optimum Proof Test

    16.6 Condition Monitoring

    Chapter 17: Systematic Failures, Especially Software

    17.1 Programmable Devices

    17.2 Software-related Failures

    17.3 Software Failure Modeling

    17.4 Software Quality Assurance (Life Cycle Activities)

    17.5 Modern/Formal Methods

    17.6 Software Checklists

    Part 5 Legal, Management and Safety Considerations

    Chapter 18: Project Management and Competence

    18.1 Setting Objectives and Making Specifications

    18.2 Planning, Feasibility and Allocation

    18.3 Program Activities

    18.4 Responsibilities and Competence

    18.5 Functional Safety Capability

    18.6 Standards and Guidance Documents

    Chapter 19: Contract Clauses and Their Pitfalls

    19.1 Essential Areas

    19.2 Other Areas

    19.3 Pitfalls

    19.4 Penalties

    19.5 Subcontracted Reliability Assessments


    Chapter 20: Product Liability and Safety Legislation

    20.1 The General Situation

    20.2 Strict Liability

    20.3 The Consumer Protection Act 1987

    20.4 Health and Safety at Work Act 1974

    20.5 Insurance and Product Recall

    Chapter 21: Major Incident Legislation

    21.1 History of Major Incidents

    21.2 Development of Major Incident Legislation

    21.3 CIMAH Safety Reports

    21.4 Offshore Safety Cases

    21.5 Problem Areas

    21.6 The COMAH Directive (1999 and 2005 Amendment)

    21.7 Rail

    21.8 Corporate Manslaughter and Corporate Homicide

    Chapter 22: Integrity of Safety-Related Systems

    22.1 Safety-Related or Safety-Critical?

    22.2 Safety-Integrity Levels (SILs)

    22.3 Programmable Electronic Systems (PESs)

    22.4 Current Guidance

    22.5 Framework for Certification

    Chapter 23: A Case Study: The Datamet Project

    23.1 Introduction

    23.2 The Datamet Concept

    23.3 The Contract

    23.4 Detailed Design

    23.5 Syndicate Study

    23.6 Hints

    Chapter 24: A Case Study: Gas Detection System

    24.1 Safety-Integrity Target

    24.2 Random Hardware Failures

    24.3 ALARP

    24.4 Architectures

    24.5 Life-Cycle Activities

    24.6 Functional Safety Capability

    Chapter 25: A Case Study: Pressure Control System

    25.1 The Unprotected System

    25.2 Protection System

    25.3 Assumptions

    25.4 Reliability Block Diagram

    25.5 Failure Rate Data

    25.6 Quantifying the Model

    25.7 Proposed Design and Maintenance Modifications

    25.8 Modeling Common Cause Failure (Pressure Transmitters)

    25.9 Quantifying the Revised Model

    25.10 ALARP

    25.11 Architectural Constraints

    Appendix 1: Glossary

    A1.1 Terms Related to Failure

    A1.1.1 Failure

    A1.1.2 Failure Mode

    A1.1.3 Failure Mechanism

    A1.1.4 Failure Rate

    A1.1.5 Mean Time Between Failures and Mean Time to Fail

    A1.1.6 Common Cause Failure

    A1.1.7 Common Mode Failure

    A1.2 Reliability Terms

    A1.2.1 Reliability

    A1.2.2 Redundancy

    A1.2.3 Diversity

    A1.2.4 Failure Mode and Effect Analysis

    A1.2.5 Fault Tree Analysis

    A1.2.6 Cause Consequence Analysis (Event Trees)

    A1.2.7 Reliability Growth

    A1.2.8 Reliability Centered Maintenance

    A1.3 Maintainability Terms

    A1.3.1 Maintainability

    A1.3.2 Mean Time to Repair (MTTR)

    A1.3.3 Repair Rate

    A1.3.4 Repair Time

    A1.3.5 Down Time

    A1.3.6 Corrective Maintenance

    A1.3.7 Preventive Maintenance

    A1.3.8 Least Replaceable Assembly (LRA)

    A1.3.9 Second-Line Maintenance

    A1.4 Terms Associated with Software

    A1.4.1 Software

    A1.4.2 Programmable Device

    A1.4.3 High-Level Language

    A1.4.4 Assembler

    A1.4.5 Compiler

    A1.4.6 Diagnostic Software

    A1.4.7 Simulation

    A1.4.8 Emulation

    A1.4.9 Load Test

    A1.4.10 Functional Test

    A1.4.11 Software Error

    A1.4.12 Bit Error Rate

    A1.4.13 Automatic Test Equipment (ATE)

    A1.4.14 Data Corruption

    A1.5 Terms Related to Safety

    A1.5.1 Hazard

    A1.5.2 Major Hazard

    A1.5.3 Hazard Analysis

    A1.5.4 HAZOP

    A1.5.5 LOPA

    A1.5.6 Risk

    A1.5.7 Consequence Analysis

    A1.5.8 Safe Failure Fraction

    A1.5.9 Safety-Integrity

    A1.5.10 Safety-Integrity Level

    A1.6 General Terms

    A1.6.1 Availability (Steady State)

    A1.6.2 Unavailability (PFD)

    A1.6.3 Burn-In

    A1.6.4 Confidence Interval

    A1.6.5 Consumer’s Risk

    A1.6.6 Derating

    A1.6.7 Ergonomics

    A1.6.8 Mean

    A1.6.9 Median

    A1.6.10 PFD

    A1.6.11 Producer’s Risk

    A1.6.12 Quality

    A1.6.13 Random

    A1.6.14 FRACAS

    A1.6.15 RAMS

    Appendix 2: Percentage Points of the Chi-Square Distribution

    Appendix 3: Microelectronics Failure Rates

    Appendix 4: General Failure Rates

    Appendix 5: Failure Mode Percentages

    Appendix 6: Human Error Probabilities

    Appendix 7: Fatality Rates

    Appendix 8: Answers to Exercises

    Chapter 2

    Chapter 5

    Chapter 6

    Chapter 7

    Chapter 9


    Chapter 12

    Chapter 25

    25.2: Protection System

    25.4: Reliability Block Diagram

    25.6: Quantifying the Model

    25.7: Revised Diagrams

    25.10: ALARP

    25.11: Architectural Constraints

    Appendix 9: Bibliography

    Appendix 10: Scoring Criteria for BETAPLUS Common Cause Model

    A10.1 Checklist and Scoring for Equipment Containing Programmable Electronics

    A10.2 Checklist and Scoring for Non-Programmable Equipment For Programmable Electronics For Sensors and Actuators

    Appendix 11: Example of HAZOP

    A11.1 Equipment Details

    A11.2 HAZOP Worksheets

    A11.3 Potential Consequences


    Appendix 12: HAZID Checklist

    Appendix 13: Markov Analysis of Redundant Systems


Product details

  • No. of pages: 436
  • Language: English
  • Copyright: © Butterworth-Heinemann 2011
  • Published: June 20, 2011
  • Imprint: Butterworth-Heinemann
  • eBook ISBN: 9780080969039

About the Author

David Smith

Dr. David J. Smith is the proprietor of Technis Consultancy. He has written numerous books on reliability and safety over the last 40 years. His FARADIP database has become widely used, and his other software packages are also used throughout the profession. His PhD thesis was on the subject of reliability prediction and common cause failure. He contributed to the first drafting of IEC 61508 and chairs the IGEM panel which produces SR/15 (the gas industry safety-related guidance). David is a past President of the Safety and Reliability Society.

Affiliations and Expertise

Independent Consultant, Technis, Tonbridge, UK
