Measuring Data Quality for Ongoing Improvement

A Data Quality Assessment Framework

1st Edition - December 31, 2012

  • Author: Laura Sebastian-Coleman
  • eBook ISBN: 9780123977540
  • Paperback ISBN: 9780123970336

Purchase options

DRM-free (EPub, Mobi, PDF)

The Data Quality Assessment Framework shows you how to measure and monitor data quality, ensuring quality over time. You’ll start with general concepts of measurement and work your way through a detailed framework of more than three dozen measurement types related to five objective dimensions of quality: completeness, timeliness, consistency, validity, and integrity. Ongoing measurement, rather than one-time activities, will help your organization reach a new level of data quality. This plain-language approach to measuring data can be understood by both business and IT, and it provides practical guidance on how to apply the DQAF within any organization, enabling you to prioritize measurements and report effectively on results. The book includes strategies for using data measurement to govern and improve the quality of data, along with guidelines for applying the framework within a data asset. You’ll come away able to prioritize which measurement types to implement, knowing where to place them in a data flow and how frequently to measure. Also included are common conceptual models for defining and storing data quality results for trend analysis, as well as generic business requirements for ongoing measuring and monitoring, including the calculations and comparisons that make the measurements meaningful, help you understand trends, and detect anomalies.

Key Features

  • Demonstrates how to leverage a technology-independent data quality measurement framework for your specific business priorities and data quality challenges
  • Enables discussions between business and IT with a non-technical vocabulary for data quality measurement
  • Describes how to measure data quality on an ongoing basis with generic measurement types that can be applied to any situation


Data quality engineers, managers, and analysts; application program managers and developers; data stewards; data managers and analysts; compliance analysts; business intelligence professionals; database designers and administrators; business and IT managers

Table of Contents

  • Dedication



    Author Biography

    Introduction: Measuring Data Quality for Ongoing Improvement

    Data Quality Measurement: the Problem we are Trying to Solve

    Recurring Challenges in the Context of Data Quality

    DQAF: the Data Quality Assessment Framework

    Overview of Measuring Data Quality for Ongoing Improvement

    Intended Audience

    What Measuring Data Quality for Ongoing Improvement Does Not Do

    Why I Wrote Measuring Data Quality for Ongoing Improvement

    Section 1. Concepts and Definitions

    Chapter 1. Data



    Data as Representation

    Data as Facts

    Data as a Product

    Data as Input to Analyses

    Data and Expectations


    Concluding Thoughts

    Chapter 2. Data, People, and Systems


    Enterprise or Organization

    IT and the Business

    Data Producers

    Data Consumers

    Data Brokers

    Data Stewards and Data Stewardship

    Data Owners

    Data Ownership and Data Governance

    IT, the Business, and Data Owners, Redux

    Data Quality Program Team


    Systems and System Design

    Concluding Thoughts

    Chapter 3. Data Management, Models, and Metadata


    Data Management

    Database, Data Warehouse, Data Asset, Dataset

    Source System, Target System, System of Record

    Data Models

    Types of Data Models

    Physical Characteristics of Data


    Metadata as Explicit Knowledge

    Data Chain and Information Life Cycle

    Data Lineage and Data Provenance

    Concluding Thoughts

    Chapter 4. Data Quality and Measurement


    Data Quality

    Data Quality Dimensions


    Measurement as Data

    Data Quality Measurement and the Business/IT Divide

    Characteristics of Effective Measurements

    Data Quality Assessment

    Data Quality Dimensions, DQAF Measurement Types, Specific Data Quality Metrics

    Data Profiling

    Data Quality Issues and Data Issue Management

    Reasonability Checks

    Data Quality Thresholds

    Process Controls

    In-line Data Quality Measurement and Monitoring

    Concluding Thoughts

    Section 2. DQAF Concepts and Measurement Types

    Chapter 5. DQAF Concepts


    The Problem the DQAF Addresses

    Data Quality Expectations and Data Management

    The Scope of the DQAF

    DQAF Quality Dimensions

    Defining DQAF Measurement Types

    Metadata Requirements

    Objects of Measurement and Assessment Categories

    Functions in Measurement: Collect, Calculate, Compare

    Concluding Thoughts

    Chapter 6. DQAF Measurement Types


    Consistency of the Data Model

    Ensuring the Correct Receipt of Data for Processing

    Inspecting the Condition of Data upon Receipt

    Assessing the Results of Data Processing

    Assessing the Validity of Data Content

    Assessing the Consistency of Data Content

    Comments on the Placement of In-line Measurements

    Periodic Measurement of Cross-table Content Integrity

    Assessing Overall Database Content

    Assessing Controls and Measurements

    The Measurement Types: Consolidated Listing

    Concluding Thoughts

    Section 3. Data Assessment Scenarios


    Assessment Scenarios

    Metadata: Knowledge before Assessment

    Chapter 7. Initial Data Assessment


    Initial Assessment

    Input to Initial Assessments

    Data Expectations

    Data Profiling

    Column Property Profiling

    Structure Profiling

    Profiling an Existing Data Asset

    From Profiling to Assessment

    Deliverables from Initial Assessment

    Concluding Thoughts

    Chapter 8. Assessment in Data Quality Improvement Projects


    Data Quality Improvement Efforts

    Measurement in Improvement Projects

    Chapter 9. Ongoing Measurement


    The Case for Ongoing Measurement

    Example: Health Care Data

    Inputs for Ongoing Measurement

    Criticality and Risk



    Periodic Measurement

    Deliverables from Ongoing Measurement

    In-Line versus Periodic Measurement

    Concluding Thoughts

    Section 4. Applying the DQAF to Data Requirements


    Chapter 10. Requirements, Risk, Criticality


    Business Requirements

    Data Quality Requirements and Expected Data Characteristics

    Data Quality Requirements and Risks to Data

    Factors Influencing Data Criticality

    Specifying Data Quality Metrics

    Concluding Thoughts

    Chapter 11. Asking Questions


    Asking Questions

    Understanding the Project

    Learning about Source Systems

    Your Data Consumers’ Requirements

    The Condition of the Data

    The Data Model, Transformation Rules, and System Design

    Measurement Specification Process

    Concluding Thoughts

    Section 5. A Strategic Approach to Data Quality

    Chapter 12. Data Quality Strategy


    The Concept of Strategy

    Systems Strategy, Data Strategy, and Data Quality Strategy

    Data Quality Strategy and Data Governance

    Decision Points in the Information Life Cycle

    General Considerations for Data Quality Strategy

    Concluding Thoughts

    Chapter 13. Directives for Data Quality Strategy


    Directive 1: Obtain Management Commitment to Data Quality

    Directive 2: Treat Data as an Asset

    Directive 3: Apply Resources to Focus on Quality

    Directive 4: Build Explicit Knowledge of Data

    Directive 5: Treat Data as a Product of Processes that can be Measured and Improved

    Directive 6: Recognize Quality is Defined by Data Consumers

    Directive 7: Address the Root Causes of Data Problems

    Directive 8: Measure Data Quality, Monitor Critical Data

    Directive 9: Hold Data Producers Accountable for the Quality of their Data (and Knowledge about that Data)

    Directive 10: Provide Data Consumers with the Knowledge they Require for Data Use

    Directive 11: Data Needs and Uses will Evolve—Plan for Evolution

    Directive 12: Data Quality Goes beyond the Data—Build a Culture Focused on Quality

    Concluding Thoughts: Using the Current State Assessment

    Section 6. The DQAF in Depth

    Functions for Measurement: Collect, Calculate, Compare

    Features of the DQAF Measurement Logical Data Model

    Facets of the DQAF Measurement Types

    Chapter 14. Functions of Measurement: Collection, Calculation, Comparison


    Functions in Measurement: Collect, Calculate, Compare

    Collecting Raw Measurement Data

    Calculating Measurement Data

    Comparing Measurements to Past History


    The Control Chart: A Primary Tool for Statistical Process Control

    The DQAF and Statistical Process Control

    Concluding Thoughts

    Chapter 15. Features of the DQAF Measurement Logical Model


    Metric Definition and Measurement Result Tables

    Optional Fields

    Denominator Fields

    Automated Thresholds

    Manual Thresholds

    Emergency Thresholds

    Manual or Emergency Thresholds and Results Tables

    Additional System Requirements

    Support Requirements

    Concluding Thoughts

    Chapter 16. Facets of the DQAF Measurement Types


    Facets of the DQAF

    Organization of the Chapter

    Measurement Type #1: Dataset Completeness—Sufficiency of Metadata and Reference Data

    Measurement Type #2: Consistent Formatting in One Field

    Measurement Type #3: Consistent Formatting, Cross-table

    Measurement Type #4: Consistent Use of Default Value in One Field

    Measurement Type #5: Consistent Use of Default Values, Cross-table

    Measurement Type #6: Timely Delivery of Data for Processing

    Measurement Type #7: Dataset Completeness—Availability for Processing

    Measurement Type #8: Dataset Completeness—Record Counts to Control Records

    Measurement Type #9: Dataset Completeness—Summarized Amount Field Data

    Measurement Type #10: Dataset Completeness—Size Compared to Past Sizes

    Measurement Type #11: Record Completeness—Length

    Measurement Type #12: Field Completeness—Non-Nullable Fields

    Measurement Type #13: Dataset Integrity—De-Duplication

    Measurement Type #14: Dataset Integrity—Duplicate Record Reasonability Check

    Measurement Type #15: Field Content Completeness—Defaults from Source

    Measurement Type #16: Dataset Completeness Based on Date Criteria

    Measurement Type #17: Dataset Reasonability Based on Date Criteria

    Measurement Type #18: Field Content Completeness—Received Data is Missing Fields Critical to Processing

    Measurement Type #19: Dataset Completeness—Balance Record Counts Through a Process

    Measurement Type #20: Dataset Completeness—Reasons for Rejecting Records

    Measurement Type #21: Dataset Completeness Through a Process—Ratio of Input to Output

    Measurement Type #22: Dataset Completeness Through a Process—Balance Amount Fields

    Measurement Type #23: Field Content Completeness—Ratio of Summed Amount Fields

    Measurement Type #24: Field Content Completeness—Defaults from Derivation

    Measurement Type #25: Data Processing Duration

    Measurement Type #26: Timely Availability of Data for Access

    Measurement Type #27: Validity Check, Single Field, Detailed Results

    Measurement Type #28: Validity Check, Roll-up

    Measurement Logical Data Model

    Measurement Type #29: Validity Check, Multiple Columns within a Table, Detailed Results

    Measurement Type #30: Consistent Column Profile

    Measurement Type #31: Consistent Dataset Content, Distinct Count of Represented Entity, with Ratios to Record Counts

    Measurement Type #32: Consistent Dataset Content, Ratio of Distinct Counts of Two Represented Entities

    Measurement Type #33: Consistent Multicolumn Profile

    Measurement Type #34: Chronology Consistent with Business Rules within a Table

    Measurement Type #35: Consistent Time Elapsed (hours, days, months, etc.)

    Measurement Type #36: Consistent Amount Field Calculations Across Secondary Fields

    Measurement Type #37: Consistent Record Counts by Aggregated Date

    Measurement Type #38: Consistent Amount Field Data by Aggregated Date

    Measurement Type #39: Parent/Child Referential Integrity

    Measurement Type #40: Child/Parent Referential Integrity

    Measurement Type #41: Validity Check, Cross Table, Detailed Results

    Measurement Type #42: Consistent Cross-table Multicolumn Profile

    Measurement Type #43: Chronology Consistent with Business Rules Across-tables

    Measurement Type #44: Consistent Cross-table Amount Column Calculations

    Measurement Type #45: Consistent Cross-Table Amount Columns by Aggregated Dates

    Measurement Type #46: Consistency Compared to External Benchmarks

    Measurement Type #47: Dataset Completeness—Overall Sufficiency for Defined Purposes

    Measurement Type #48: Dataset Completeness—Overall Sufficiency of Measures and Controls

    Concluding Thoughts: Know Your Data




    Online Materials

    Appendix A. Measuring the Value of Data

    Appendix B. Data Quality Dimensions


    Richard Wang’s and Diane Strong’s Data Quality Framework, 1996

    Thomas Redman’s Dimensions of Data Quality, 1996

    Larry English’s Information Quality Characteristics and Measures, 1999

    Appendix C. Completeness, Consistency, and Integrity of the Data Model


    Process Input and Output

    High-Level Assessment

    Detailed Assessment

    Quality of Definitions


    Appendix D. Prediction, Error, and Shewhart’s Lost Disciple, Kristo Ivanov


    Limitations of the Communications Model of Information Quality

    Error, Prediction, and Scientific Measurement

    What Do We Learn from Ivanov?

    Ivanov’s Concept of the System as Model

    Appendix E. Quality Improvement and Data Quality


    A Brief History of Quality Improvement

    Process Improvement Tools

    Implications for Data Quality

    Limitations of the Data as Product Metaphor

    Concluding Thoughts: Building Quality in Means Building Knowledge in

Product details

  • No. of pages: 376
  • Language: English
  • Copyright: © Morgan Kaufmann 2013
  • Published: December 31, 2012
  • Imprint: Morgan Kaufmann
  • eBook ISBN: 9780123977540
  • Paperback ISBN: 9780123970336

About the Author

Laura Sebastian-Coleman

Laura Sebastian-Coleman, Data Quality Director at Prudential, has been a data quality practitioner since 2003. She has implemented data quality metrics and reporting, launched and facilitated working stewardship groups, contributed to data consumer training programs, and led efforts to establish data standards and manage metadata. In 2009, she led a group of analysts in developing the Data Quality Assessment Framework (DQAF), which is the basis for her 2013 book, Measuring Data Quality for Ongoing Improvement. An active professional, Laura has delivered papers, tutorials, and keynotes at data-focused conferences, such as MIT’s Information Quality Program, Data Governance and Information Quality (DGIQ), Enterprise Data World (EDW), Data Modeling Zone, and Data Management Association (DAMA)-sponsored events. From 2009 to 2010, she served as IAIDQ’s Director of Member Services. In 2015, she received the IAIDQ Distinguished Member Award. She served as DAMA Publications Officer (2015 to 2018) and as production editor for the DAMA-DMBOK2 (2017), and she is the author of Navigating the Labyrinth: An Executive Guide to Data Management (2018). In 2018, she received the DAMA award for excellence in the data management profession. She holds a CDMP (Certified Data Management Professional) from DAMA, an IQCP (Information Quality Certified Professional) from IAIDQ, a Certificate in Information Quality from MIT, a B.A. in English and History from Franklin & Marshall College, and a Ph.D. in English Literature from the University of Rochester.

Affiliations and Expertise

Data Quality Director, Prudential

Ratings and Reviews

Latest reviews

  • David F. Thu May 05 2022

    Detailed and business-friendly DQ book

    I purchased this book in 2015 when researching how to go about ERP master data quality assessments due to my then-new role in master data governance at Siemens Healthcare. In subsequent years I made it required reading for my team of data stewards. It is one of the few books on the subject which is accessible to the business yet goes into sufficient detail (you'll get 48 measurement focus areas here!) to make it a reference book that will be used again and again.

  • JamesCooper Tue Sep 17 2019

    Measuring Data Quality for Ongoing Improvement

    This provided a needed summary of the purpose and process of data quality assessment and provides examples for implementing these methods.