Entity Resolution and Information Quality

Customers and products are the heart of any business, and corporations collect more data about them every year. However, just because you have data doesn’t mean you can use it effectively. If not properly integrated, data can actually encourage false conclusions that result in bad decisions and lost opportunities. Entity Resolution (ER) is a powerful tool for transforming data into accurate, value-added information. Using entity resolution methods and techniques, you can identify equivalent records from multiple sources corresponding to the same real-world person, place, or thing.

This emerging area of data management is clearly explained throughout the book. It teaches you the process of locating and linking information about the same entity, eliminating duplications, and making crucial business decisions based on the results. This book is an authoritative, vendor-independent technical reference for researchers, graduate students, and practitioners, including architects, technical analysts, and solution developers. In short, Entity Resolution and Information Quality gives you the applied-level know-how you need to aggregate data from disparate sources and form accurate customer and product profiles that support effective marketing and sales. It is an invaluable guide for succeeding in today’s info-centric environment.

Audience

Database administrators, data/information analysts, information and enterprise architects, data warehouse and systems engineers, and software developers working on an identity resolution engine or middleware stack.

Paperback, 256 Pages

Published: December 2010

Imprint: Morgan Kaufmann

ISBN: 978-0-12-381972-7

Reviews

  • "This book is comprehensive, timely, and on the leading edge of the topic. In addition to being comprehensive and systematic, the book has two distinct characteristics: (1) it addresses the issue of entity relationships, which go beyond entity matching. This novel approach generates much richer information about entities; (2) it discusses not only techniques, but also systems that implement the techniques. This system-oriented approach helps the reader to see how to apply the techniques for problem solving."--Dr. Hongwei (Harry) Zhu - Assistant Professor of Information Technology in the College of Business and Public Administration, Old Dominion University

    "Talburt, the author of this book, is one of the organizers of the first graduate degree program in information quality, hosted by the University of Arkansas at Little Rock. The book contains seven easy-to-read chapters. A chapter on trends and research topics in entity resolution closes this short textbook. Some of the suggestions will undoubtedly encourage graduate students to pursue their research on data integration topics. The book offers interesting pointers and bibliographic references for exploring new avenues of research."--Computing Reviews

    "Talburt (information science, U. of Arkansas-Little Rock) presents a textbook developed from a graduate course on the two emerging specialties within information science. Students tend to come from a number of disciplines, so no deep background in information science is assumed, and the material may even be suitable for upper-level undergraduate courses. He covers principles of entity resolution and information quality, entity resolution models and systems, entity-based data integration, the OYSTER open-source software development project, and trends in research and applications."--SciTech Book News


Contents

  • Foreword

    Preface

    Acknowledgements

    Chapter 1 Principles of Entity Resolution

    Entity Resolution

    Entity Resolution Activities

    Summary

    Review Questions

    Chapter 2 Principles of Information Quality

    Information Quality

    IQ and the Quality of Information

    Two IP Examples

    IQ Management

    Information versus Process

    IQ and HPC

    The Evolution of Information Quality

    IQ as an Academic Discipline

    IQ and ER

    Summary

    Review Questions

    Chapter 3 Entity Resolution Models

    Overview

    The Fellegi-Sunter Model

    SERF Model

    Algebraic Model

    ENRES Meta-Model

    Summary

    Review Questions

    Chapter 4 Entity-Based Data Integration

    Introduction

    Formal Framework for Describing EBDI

    Optimizing Selection Operator Accuracy

    More Complex Selection Rules

    Summary

    Review Questions

    Chapter 5 Entity Resolution Systems

    Introduction

    DataFlux dfPowerStudio

    Infoglide Identity Resolution Engine

    Acxiom AbiliTec

    Summary

    Review Questions

    Chapter 6 The OYSTER Project

    Background

    OYSTER Logic

    Transitive Equivalence Example

    Asserted Equivalence Example

    Febrl: Open-Source Project

    Summary

    Review Questions

    Chapter 7 Trends in Entity Resolution Research and Applications

    Introduction

    ER and Information Hubs

    Association Analysis and Social Networks

    HPC in ER

    Integration of ER and IQ

    Entity-Based Data Integration

    Fundamental ER Research

    Summary

    Review Questions

    Bibliography

    Glossary

    Appendix

    Index
