Managing Data in Motion - 1st Edition - ISBN: 9780123971678, 9780123977915

Managing Data in Motion

1st Edition

Data Integration Best Practice Techniques and Technologies

Authors: April Reeve
eBook ISBN: 9780123977915
Paperback ISBN: 9780123971678
Imprint: Morgan Kaufmann
Published Date: 15th March 2013
Page Count: 204

Table of Contents

Dedication

Foreword

Acknowledgements

Biography

Introduction

What this book is about and why it’s necessary

What the reader will learn

Who should read this book

How this book is organized

Part 1: Introduction to data integration

Part 2: Batch data integration

Part 3: Real-time data integration

Part 4: Big data integration

Part 1: Introduction to Data Integration

Chapter 1. The Importance of Data Integration

The natural complexity of data interfaces

The rise of purchased vendor packages

Key enablement of big data and virtualization

Chapter 2. What Is Data Integration?

Data in motion

Integrating into a common format—transforming data

Migrating data from one system to another

Moving data around the organization

Pulling information from unstructured data

Moving process to data

Chapter 3. Types and Complexity of Data Integration

The differences and similarities in managing data in motion and persistent data

Batch data integration

Real-time data integration

Big data integration

Data virtualization

Chapter 4. The Process of Data Integration Development

The data integration development life cycle

Inclusion of business knowledge and expertise

Part 2: Batch Data Integration

Chapter 5. Introduction to Batch Data Integration

What is batch data integration?

Batch data integration life cycle

Chapter 6. Extract, Transform, and Load

What is ETL?

Profiling

Extract

Staging

Access layers

Transform

Load

Chapter 7. Data Warehousing

What is data warehousing?

Layers in an enterprise data warehouse architecture

Types of data to load in a data warehouse

Chapter 8. Data Conversion

What is data conversion?

Data conversion life cycle

Data conversion analysis

Best practice data loading

Improving source data quality

Mapping to target

Configuration data

Testing and dependencies

Private data

Proving

Environments

Chapter 9. Data Archiving

What is data archiving?

Selecting data to archive

Can the archived data be retrieved?

Conforming data structures in the archiving environment

Flexible data structures

Chapter 10. Batch Data Integration Architecture and Metadata

What is batch data integration architecture?

Profiling tool

Modeling tool

Metadata repository

Data movement

Transformation

Scheduling

Part 3: Real-Time Data Integration

Chapter 11. Introduction to Real-Time Data Integration

Why real-time data integration?

Why two sets of technologies?

Chapter 12. Data Integration Patterns

Interaction patterns

Loose coupling

Hub and spoke

Synchronous and asynchronous interaction

Request and reply

Publish and subscribe

Two-phase commit

Integrating interaction types

Chapter 13. Core Real-Time Data Integration Technologies

Confusing terminology

Enterprise service bus (ESB)

Service-oriented architecture (SOA)

Extensible markup language (XML)

Data replication and change data capture

Enterprise application integration (EAI)

Enterprise information integration (EII)

Chapter 14. Data Integration Modeling

Canonical modeling

Message modeling

Chapter 15. Master Data Management

Introduction to master data management

Reasons for a master data management solution

Purchased packages and master data

Reference data

Masters and slaves

External data

Master data management functionality

Types of master data management solutions—registry and data hub

Chapter 16. Data Warehousing with Real-Time Updates

Corporate information factory

Operational data store

Master data moving to the data warehouse

Chapter 17. Real-Time Data Integration Architecture and Metadata

What is real-time data integration metadata?

Modeling

Profiling

Metadata repository

Enterprise service bus—data transformation and orchestration

Data movement and middleware

External interaction

Part 4: Big, Cloud, Virtual Data

Chapter 18. Introduction to Big Data Integration

Data integration and unstructured data

Big data, cloud data, and data virtualization

Chapter 19. Cloud Architecture and Data Integration

Why is data integration important in the cloud?

Public cloud

Cloud security

Cloud latency

Cloud redundancy

Chapter 20. Data Virtualization

A technology whose time has come

Business uses of data virtualization

Data virtualization architecture

Chapter 21. Big Data Integration

What is big data?

Big data dimension—volume

Big data dimension—variety

Big data dimension—velocity

Traditional big data use cases

More big data use cases

Leveraging the power of big data—real-time decision support

Big data architecture

Chapter 22. Conclusion to Managing Data in Motion

Data integration architecture

Data integration engines

Data integration hubs

Metadata management

The end

References

Index


Description

Managing Data in Motion describes techniques that have been developed for significantly reducing the complexity of managing system interfaces and enabling scalable architectures. Author April Reeve brings over two decades of experience to present a vendor-neutral approach to moving data between computing environments and systems. Readers will learn the techniques, technologies, and best practices for managing the passage of data between computer systems and integrating disparate data together in an enterprise environment.

The average enterprise's computing environment comprises hundreds to thousands of computer systems that have been built, purchased, and acquired over time. The data from these various systems needs to be integrated for reporting and analysis, shared for business transaction processing, and converted from one format to another when old systems are replaced and new systems are acquired.

The management of the "data in motion" in organizations is rapidly becoming one of the biggest concerns for business and IT management. Data warehousing and conversion, real-time data integration, and cloud and "big data" applications are just a few of the challenges facing organizations and businesses today. Managing Data in Motion tackles these and other topics in a style easily understood by business and IT managers as well as programmers and architects.

Key Features

  • Presents a vendor-neutral overview of the different technologies and techniques for moving data between computer systems, including emerging solutions for unstructured as well as structured data types
  • Explains, in non-technical terms, the architecture and components required to perform data integration
  • Describes how to reduce the complexity of managing system interfaces and enable a scalable data architecture that can handle the dimensions of "Big Data"

Readership

Data Warehouse Professionals; Data Modelers and Architects; Database and Network Administrators; ETL and Application Programmers; Project Managers; IT and Data Center Managers; CIO/CTO


Details

No. of pages: 204
Language: English
Copyright: © Morgan Kaufmann 2013
Published: 15th March 2013
Imprint: Morgan Kaufmann
eBook ISBN: 9780123977915
Paperback ISBN: 9780123971678

Reviews

"The highlight of the book is that the author is able to present a broad, complicated subject in a coherent, consolidated, and readable manner. This is ideal for busy information technology managers and chief technical officers who have little time for outside reading. The book is especially valuable for those managers who are not familiar with modern data management or who are fluent in general computing technology but are tasked with managing data on a larger scale."--ComputingReviews.com, November 4, 2013
"Reeve, an enterprise information consultant with EMC, describes different techniques, technologies, and best practices for managing the transfer of data between computer systems and integrating disparate databases together within a large organization."--Reference and Research Book News, August 2013
"Few if any enterprises have the luxury of a single unified integrated data platform. Yet one of the least considered areas in enterprise information management is how we should treat and manage the growing numbers of interfaces. April Reeve presents a much-needed overview and guide to the challenges of data integration."--John Ladley, Principal of IMCue Solutions, Editor of the Data Strategy Journal


About the Authors

April Reeve

April Reeve has spent the last 25 years working as an enterprise architect and program manager, primarily for large financial services firms. She currently works for EMC Consulting as a Business Consultant in the Enterprise Information Management practice. April is an expert in multiple Data Management disciplines including Data Conversion, Data Warehousing, Business Intelligence, Master Data Management, Data Integration, and Data Governance. She is a regular speaker on Data Management topics at industry conferences.

Affiliations and Expertise

April Reeve is a Business Consultant in the Enterprise Information Management practice at EMC Consulting.