Managing Data in Motion

Data Integration Best Practice Techniques and Technologies

1st Edition - February 26, 2013

  • Author: April Reeve
  • Paperback ISBN: 9780123971678
  • eBook ISBN: 9780123977915

Description

Managing Data in Motion describes techniques that have been developed for significantly reducing the complexity of managing system interfaces and enabling scalable architectures. Author April Reeve brings over two decades of experience to present a vendor-neutral approach to moving data between computing environments and systems. Readers will learn the techniques, technologies, and best practices for managing the passage of data between computer systems and integrating disparate data in an enterprise environment. The average enterprise's computing environment comprises hundreds to thousands of computer systems that have been built, purchased, and acquired over time. The data from these various systems needs to be integrated for reporting and analysis, shared for business transaction processing, and converted from one format to another when old systems are replaced and new systems are acquired. The management of "data in motion" in organizations is rapidly becoming one of the biggest concerns for business and IT management. Data warehousing and conversion, real-time data integration, and cloud and "big data" applications are just a few of the challenges facing organizations and businesses today. Managing Data in Motion tackles these and other topics in a style easily understood by business and IT managers as well as programmers and architects.

Key Features

  • Presents a vendor-neutral overview of the different technologies and techniques for moving data between computer systems, including emerging solutions for unstructured as well as structured data types
  • Explains, in non-technical terms, the architecture and components required to perform data integration
  • Describes how to reduce the complexity of managing system interfaces and enable a scalable data architecture that can handle the dimensions of "Big Data"

Readership

Data Warehouse Professionals; Data Modelers and Architects; Database and Network Administrators; ETL and Application Programmers; Project Managers; IT and Data Center Managers; CIO/CTO

Table of Contents

  • Dedication

    Foreword

    Acknowledgements

    Biography

    Introduction

    What this book is about and why it’s necessary

    What the reader will learn

    Who should read this book

    How this book is organized

    Part 1: Introduction to data integration

    Part 2: Batch data integration

    Part 3: Real-time data integration

    Part 4: Big data integration

    Part 1: Introduction to Data Integration

    Chapter 1. The Importance of Data Integration

    The natural complexity of data interfaces

    The rise of purchased vendor packages

    Key enablement of big data and virtualization

    Chapter 2. What Is Data Integration?

    Data in motion

    Integrating into a common format—transforming data

    Migrating data from one system to another

    Moving data around the organization

    Pulling information from unstructured data

    Moving process to data

    Chapter 3. Types and Complexity of Data Integration

    The differences and similarities in managing data in motion and persistent data

    Batch data integration

    Real-time data integration

    Big data integration

    Data virtualization

    Chapter 4. The Process of Data Integration Development

    The data integration development life cycle

    Inclusion of business knowledge and expertise

    Part 2: Batch Data Integration

    Chapter 5. Introduction to Batch Data Integration

    What is batch data integration?

    Batch data integration life cycle

    Chapter 6. Extract, Transform, and Load

    What is ETL?

    Profiling

    Extract

    Staging

    Access layers

    Transform

    Load

    Chapter 7. Data Warehousing

    What is data warehousing?

    Layers in an enterprise data warehouse architecture

    Types of data to load in a data warehouse

    Chapter 8. Data Conversion

    What is data conversion?

    Data conversion life cycle

    Data conversion analysis

    Best practice data loading

    Improving source data quality

    Mapping to target

    Configuration data

    Testing and dependencies

    Private data

    Proving

    Environments

    Chapter 9. Data Archiving

    What is data archiving?

    Selecting data to archive

    Can the archived data be retrieved?

    Conforming data structures in the archiving environment

    Flexible data structures

    Chapter 10. Batch Data Integration Architecture and Metadata

    What is batch data integration architecture?

    Profiling tool

    Modeling tool

    Metadata repository

    Data movement

    Transformation

    Scheduling

    Part 3: Real Time Data Integration

    Chapter 11. Introduction to Real-Time Data Integration

    Why real-time data integration?

    Why two sets of technologies?

    Chapter 12. Data Integration Patterns

    Interaction patterns

    Loose coupling

    Hub and spoke

    Synchronous and asynchronous interaction

    Request and reply

    Publish and subscribe

    Two-phase commit

    Integrating interaction types

    Chapter 13. Core Real-Time Data Integration Technologies

    Confusing terminology

    Enterprise service bus (ESB)

    Service-oriented architecture (SOA)

    Extensible markup language (XML)

    Data replication and change data capture

    Enterprise application integration (EAI)

    Enterprise information integration (EII)

    Chapter 14. Data Integration Modeling

    Canonical modeling

    Message modeling

    Chapter 15. Master Data Management

    Introduction to master data management

    Reasons for a master data management solution

    Purchased packages and master data

    Reference data

    Masters and slaves

    External data

    Master data management functionality

    Types of master data management solutions—registry and data hub

    Chapter 16. Data Warehousing with Real-Time Updates

    Corporate information factory

    Operational data store

    Master data moving to the data warehouse

    Chapter 17. Real-Time Data Integration Architecture and Metadata

    What is real-time data integration metadata?

    Modeling

    Profiling

    Metadata repository

    Enterprise service bus—data transformation and orchestration

    Data movement and middleware

    External interaction

    Part 4: Big, Cloud, Virtual Data

    Chapter 18. Introduction to Big Data Integration

    Data integration and unstructured data

    Big data, cloud data, and data virtualization

    Chapter 19. Cloud Architecture and Data Integration

    Why is data integration important in the cloud?

    Public cloud

    Cloud security

    Cloud latency

    Cloud redundancy

    Chapter 20. Data Virtualization

    A technology whose time has come

    Business uses of data virtualization

    Data virtualization architecture

    Chapter 21. Big Data Integration

    What is big data?

    Big data dimension—volume

    Big data dimension—variety

    Big data dimension—velocity

    Traditional big data use cases

    More big data use cases

    Leveraging the power of big data—real-time decision support

    Big data architecture

    Chapter 22. Conclusion to Managing Data in Motion

    Data integration architecture

    Data integration engines

    Data integration hubs

    Metadata management

    The end

    References

    Index

Product details

  • No. of pages: 204
  • Language: English
  • Copyright: © Morgan Kaufmann 2013
  • Published: February 26, 2013
  • Imprint: Morgan Kaufmann
  • Paperback ISBN: 9780123971678
  • eBook ISBN: 9780123977915

About the Author

April Reeve
April Reeve has spent the last 25 years working as an enterprise architect and program manager, primarily for large financial services firms. She currently works for EMC Consulting as a Business Consultant in the Enterprise Information Management practice. April is an expert in multiple data management disciplines, including data conversion, data warehousing, business intelligence, master data management, data integration, and data governance. She is a regular speaker on data management topics at industry conferences.

Affiliations and Expertise

April Reeve is a Business Consultant in the Enterprise Information Management practice at EMC Consulting.
