Information Management
1st Edition
Strategies for Gaining a Competitive Advantage with Data
Table of Contents
Foreword
In praise of Information Management
Preface
Chapter One. You’re in the Business of Information
An Architecture for Information Success
The Glue is Architecture
Workload Success
Information in Action
Judgment Still Necessary
Chapter Two. Relational Theory In Practice
Relational Theory
Multidimensional Databases
RDBMS Platforms
Action Plan
Chapter Three. You’re in the Business of Analytics
What Distinguishes Analytics?
Predictive Analytics
Building Predictive Analytic Models
Analytics and Information Architecture
Analytics Requires Analysts
Action Plan
Chapter Four. Data Quality: Passing the Standard
Data Quality Defect Categories
Sources of Poor Data Quality
Cures for Poor Data Quality
Action Plan
Chapter Five. Columnar Databases
Columnar Operation
Compression
Workloads
Workload Examples
Columnar Conclusions
Action Plan
Chapter Six. Data Warehouses and Appliances
Data Warehousing
The Data Warehouse Appliance
Data Appliances and the Use of Memory
Action Plan
Chapter Seven. Master Data Management: One Chapter Here, but Ramifications Everywhere
MDM Justification
A Subject-Area Culture
Mastering Data
The Architecture of MDM
MDM Governance
Data Quality and MDM
MDM Roles and Responsibilities
MDM Technology
Action Items
Chapter Eight. Data Stream Processing: When Storing the Data Happens Later
Uses of Data Stream Processing
Data Stream Processing Brings Power
Stream SQL Extensions
In Conclusion
Action Plan
References
Chapter Nine. Data Virtualization: The Perpetual Short-Term Solution
The History of Data Virtualization
Controlling Your Information Asset
Action Plan
Reference
Chapter Ten. Operational Big Data: Key-Value, Document, and Column Stores: Hash Tables Reborn
When to Yes NoSQL
NoSQL Attributes
NoSQL Categorization
Key-Value Stores
Document Stores
Column Stores
NoSQL Solution Checklist
Action Plan
Chapter Eleven. Analytical Big Data: Hadoop: Analytics at Scale
Big Data for Hadoop
Hadoop Defined
Hadoop Distributed File System
MapReduce for Hadoop
Failover
Hadoop Distributions
Supporting Tools
Hadoop Challenges
Hadoop is Not
Summary
Action Plan
Chapter Twelve. Graph Databases: When Relationships are the Data
Terms
Structure
Centrality Analysis
Cypher, a Graph Database Language
Graph Database Models
Action Plan
Chapter Thirteen. Cloud Computing: On-Demand Elasticity
Defining Cloud Computing
Benefits of the Cloud
Challenges with the Cloud
Cloud Deployment Models
Information Management in the Cloud
Action Plan
Chapter Fourteen. An Elegant Architecture Where Information Flows
The Starting Point
Plenty of Work to be Done
Information Management Maturity
Leadership
Action Plan
Chapter Fifteen. Modern Business Intelligence—Collaboration, Mobile, and Self-Service: Organizing the Discussion and Tethering the User to Information
The Mobile Revolution
Mobile Business Intelligence
Self-Service Business Intelligence
Collaborative Business Intelligence
Action Plan
Chapter Sixteen. Agile Practices for Information Management
Traditional Waterfall Methodology
Agile Approaches
SCRUM
SCRUM and Methodology Themes
Action Plan
Chapter Seventeen. Organizational Change Management: The Soft Stuff is the Hard Stuff
Organizational Change Management Work Products
Organization Change Management is Essential to Project Success
Action Plan
Index
Description
Information Management: Gaining a Competitive Advantage with Data is about making smart decisions to make the most of company information. Expert author William McKnight develops the value proposition for information in the enterprise and succinctly outlines the numerous forms of data storage. Information Management will enlighten you, challenge your preconceived notions, and help activate information in the enterprise. Get the big picture on managing data so that your team can make smart decisions by understanding how everything from workload allocation to data stores fits together.
The practical, hands-on guidance in this book includes:
- Part 1: The importance of information management and analytics to business, and how data warehouses are used
- Part 2: The technologies and data that advance an organization, and extend data warehouses and related functionality
- Part 3: Big Data and NoSQL, and how technologies like Hadoop enable management of new forms of data
- Part 4: Pulls it all together, while addressing topics of agile development, modern business intelligence, and organizational change management
Read the book cover to cover, or keep it within reach as a quick and useful resource. Either way, this book will enable you to master the possibilities for data and gain the broadest view across the enterprise.
Key Features
- Balances business and technology, with non-product-specific technical detail
- Shows how to leverage data to deliver ROI for a business
- Engaging and approachable, with practical advice on the pros and cons of each domain, so that you learn how information fits together into a complete architecture
- Provides a path for the data warehouse professional into the new normal of heterogeneity, including NoSQL solutions
Readership
IT organizations, vendors, and consultants; DBAs; information architects; and managers and directors of information management.
Details
- No. of pages: 214
- Language: English
- Copyright: © Morgan Kaufmann 2014
- Published: 12th December 2013
- Imprint: Morgan Kaufmann
- eBook ISBN: 9780124095267
- Paperback ISBN: 9780124080560
Reviews
From the Author: The Information Architecture To Pursue
A top priority of CIOs and organizations everywhere is how to best adapt the environment to manage the information asset. There is a plethora of available systems to throw into that equation. The possibilities can be daunting.
- "One size fits all" does not apply to information architecture.
- Gone are the days when vendors could bring their laminated architectures to a client with credibility.
- Organizations must go forward incrementally from where they are and deliver business returns with each increment, with turnaround measured in quarters, not years.
For such an important asset, the barometer cannot be a competitor’s environment. Early adopters of good practices will reap the most rewards. Following are several key actions to take to improve a company’s information architecture.
Move Key Operational Systems To In-Memory
In-memory for operational systems is appropriate wherever SQL is used operationally and the performance gains of in-memory can be utilized.
Configurations differ. Products like VoltDB are NewSQL systems purpose-built for the storage and throughput demands of transactional systems. NewSQL is used today for traditional high-performance applications such as capital markets data feeds, financial trading, telco record streams, sensor-based distribution systems, wireless, online gaming, fraud detection, digital ad exchanges, and micro-transaction systems.
NewSQL systems are in-memory, schema-based DBMSs that scale out in a cluster. They have high-availability architectures that use synchronous, multi-master, active-active replication. As the name implies, NewSQL supports full SQL: aggregate functions, LIKE, UNION, materialized views, indexes, etc.
In-memory is also found in DBMS environments that primarily scale up, such as SAP HANA, Teradata, and IBM PureData. With in-memory systems like HANA, a company can store its entire operational database in RAM as the primary persistence layer. With multi-core CPUs becoming standard, CPUs can process larger data volumes in parallel. Main memory is no longer a limited resource, and these systems fully exploit it. Caches and layers are eliminated because the entire physical database is sitting on the motherboard and is therefore in memory all the time. By providing added performance and full ACID compliance, these systems are pushing up the threshold of size and complexity where NoSQL systems make sense.
Selectively Utilize Data Stream Processing
Data stream processing and event stream processing can hardly be considered data stores alongside DBMSs and file systems, since they don't actually store data. They are, however, data processing platforms. Data is only stored in data stores for later processing anyway, so if an organization can perform all processing without the storage, it can skip the storage. Profile data, such as that found in a Master Data Management hub, can be added to the processing alongside the stream, providing instantaneous, context-sensitive processing in real time.
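As a rough illustration of processing a stream against profile data without storing the events, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the profile dictionary stands in for a Master Data Management hub, and the event fields, the `enrich_and_flag` function, and the flagging rule are hypothetical, not taken from any stream processing product.

```python
from typing import Dict, Iterable, Iterator

# Hypothetical profile data, standing in for what an MDM hub would serve.
CUSTOMER_PROFILES: Dict[str, Dict[str, str]] = {
    "c1": {"tier": "gold"},
    "c2": {"tier": "standard"},
}


def enrich_and_flag(
    events: Iterable[dict],
    profiles: Dict[str, Dict[str, str]],
    threshold: float = 1000.0,
) -> Iterator[dict]:
    """Process each event as it arrives; nothing is written to a data store."""
    for event in events:
        # Look up profile context for the event's customer, if any exists.
        profile = profiles.get(event["customer_id"], {})
        tier = profile.get("tier", "unknown")
        # Illustrative context-sensitive rule: large amounts from
        # customers who are not in the gold tier are flagged.
        flagged = event["amount"] > threshold and tier != "gold"
        yield {**event, "tier": tier, "flagged": flagged}


def sample_events() -> Iterator[dict]:
    """Hypothetical event source; a real stream would be unbounded."""
    yield {"customer_id": "c1", "amount": 2500.0}
    yield {"customer_id": "c2", "amount": 2500.0}
    yield {"customer_id": "c3", "amount": 10.0}


results = list(enrich_and_flag(sample_events(), CUSTOMER_PROFILES))
```

The generator handles one event at a time and keeps no history, which is the point being illustrated: the decision is made in flight, and whether to persist the event afterward is a separate choice.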
Examine The Syndicated Data Marketplace
Data has been available for purchase for a while, but it has mostly been sourced for a very specific need, such as a marketing list for a promotion. As organizations move to more widespread data access and leverageable data structures, an investment in syndicated data can be leveraged throughout the enterprise.
Embrace Master Data Management
When facing a mounting workload of adding value to the enterprise through information management, it is wise to consider which key components of each application can be managed separately. The most prominent of these components is master data. Building master data in a scalable, sharable manner, such as with a master data management approach, will streamline project development time and bring consistent data to multiple applications.
Utilize Data Virtualization
Dispense with the notion that each query can come from one data store. With data virtualization making big gains in recent years, data can be selectively stored in its best-fit platform and still be served to queries requiring data from elsewhere. Data virtualization can save the day for one-off queries, or selective queries that use it can be architected into scheduled operations.
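The idea of one query drawing on more than one store can be sketched in a few lines of Python. This is only a toy under stated assumptions: an in-memory SQLite database stands in for a relational platform, a dictionary stands in for a second, non-relational store, and the table, rows, and `total_by_customer_name` function are hypothetical. Real data virtualization products add query planning, pushdown, and security that this sketch does not attempt.

```python
import sqlite3
from typing import Dict, List, Tuple

# Stand-in for a relational store (hypothetical table and rows).
relational = sqlite3.connect(":memory:")
relational.execute("CREATE TABLE orders (customer_id TEXT, amount REAL)")
relational.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("c1", 120.0), ("c2", 80.0), ("c1", 30.0)],
)

# Stand-in for a second, non-relational store (hypothetical documents).
document_store: Dict[str, dict] = {
    "c1": {"name": "Acme"},
    "c2": {"name": "Globex"},
}


def total_by_customer_name() -> List[Tuple[str, float]]:
    """Answer one query from two stores, joining at query time."""
    # Aggregate where the order data lives...
    rows = relational.execute(
        "SELECT customer_id, SUM(amount) FROM orders "
        "GROUP BY customer_id ORDER BY customer_id"
    ).fetchall()
    # ...then resolve names from the other store without copying either data set.
    return [(document_store[cid]["name"], total) for cid, total in rows]
```

Calling `total_by_customer_name()` returns order totals keyed by customer name even though neither store holds both pieces of information, and each data set stays in the platform chosen for it.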
Marginalize Multidimensional Databases
The hyper-denormalized multidimensional structure has proved very difficult to use effectively. When created "spot on" to a query need, it performs well. When mismatched, either because it has too few columns (requiring "drill through") or too many (creating overhead), it becomes an encumbrance.
Use Data Warehouses Strategically
The data warehouse concept is still necessary in any modern environment. The idea of sharing the data, the DBMS platform, a model, the methods, and the tools across different data sets and subject areas brings many benefits.
Making data warehouse data columnar in orientation generally would help a data warehouse more than it would hurt. However, people don't generally like any downsides with their upgrades. A data warehouse community is not just multiple people. It's very disparate user groups. The "groupthink" of the data warehouse will also limit finding the value proposition for the in-memory data warehouse, although SSD storage is a must.
The data warehouse can be the lowest common denominator approach to storing data, which is not bad for the mid-specification analytic workload.
Data warehouses will see evolutionary change, but new applications and those who want specific analytic features may just source their data from the data warehouse. There are many of these "marts" being built today. The expansion of platform features in DBMS will continue as marts go searching for their best-fit platform.
Make Analytic Marts Columnar And In-Memory
Analytic marts built to support a single application, subject area, or department are well served to optimize around the specific requirements. These marts, which are multiplying throughout enterprises, have eschewed joining forces with the data warehouse, often because the analytic features will not be turned on for the data warehouse.
At less than 5 terabytes, analytic marts provide a great playground, with little of the downside and bureaucracy that would otherwise prevent trying out a columnar orientation and in-memory processing. Once tried, these features have quick appeal because the analytic processing performance they deliver is often orders of magnitude greater.
Praise for Information Management:
"This is an excerpt from the first chapter of Information Management: Strategies for Gaining a Competitive Advantage with Data, written by William McKnight…he addresses the relationship between information management and business value, explores data management technologies, and offers advice on maximizing the potential of enterprise information."--SearchDataManagement.com, March 31, 2014
"I challenge any Information Management professional to not get value from this book. William covers a range of topics, and has so much knowledge he is able to offer usable insights across them all. The book is unique in the way it provides such a solid grounding for anyone making architectural or process decisions in the field of information management, and should be required reading for organizations looking to understand how newer approaches and technologies can be used to enable better decision making."--Michael Whitehead, CEO and Co-Founder, WhereScape Software
About the Authors
William McKnight, Author
William is President of McKnight Consulting Group (www.mcknightcg.com). He is an internationally recognized authority in information management. His consulting work has included many of the Global 2000 and numerous midmarket companies. His teams have won several best practice competitions for their implementations and many of his clients have gone public with their success stories. His strategies form the information management plan for leading companies in various industries.
William is a very popular speaker worldwide and a prolific writer with hundreds of articles and white papers published. William is a distinguished entrepreneur, and a former Fortune 50 technology executive and software engineer. He provides clients with strategies, architectures, platform and tool selection, and complete programs to manage information.
Affiliations and Expertise
President of McKnight Consulting Group