In praise of Information Management
Chapter One. You’re in the Business of Information
An Architecture for Information Success
The Glue is Architecture
Information in Action
Judgment Still Necessary
Chapter Two. Relational Theory In Practice
Chapter Three. You’re in the Business of Analytics
What Distinguishes Analytics?
Building Predictive Analytic Models
Analytics and Information Architecture
Analytics Requires Analysts
Chapter Four. Data Quality: Passing the Standard
Data Quality Defect Categories
Sources of Poor Data Quality
Cures for Poor Data Quality
Chapter Five. Columnar Databases
Chapter Six. Data Warehouses and Appliances
The Data Warehouse Appliance
Data Appliances and the Use of Memory
Chapter Seven. Master Data Management: One Chapter Here, but Ramifications Everywhere
A Subject-Area Culture
The Architecture of MDM
Data Quality and MDM
MDM Roles and Responsibilities
Chapter Eight. Data Stream Processing: When Storing the Data Happens Later
Uses of Data Stream Processing
Data Stream Processing Brings Power
Stream SQL Extensions
Chapter Nine. Data Virtualization: The Perpetual Short-Term Solution
The History of Data Virtualization
Controlling Your Information Asset
Chapter Ten. Operational Big Data: Key-Value, Document, and Column Stores: Hash Tables Reborn
When to Yes NoSQL
NoSQL Solution Checklist
Chapter Eleven. Analytical Big Data: Hadoop: Analytics at Scale
Big Data for Hadoop
Hadoop Distributed File System
MapReduce for Hadoop
Hadoop is Not
Chapter Twelve. Graph Databases: When Relationships are the Data
Cypher, a Graph Database Language
Graph Database Models
Chapter Thirteen. Cloud Computing: On-Demand Elasticity
Defining Cloud Computing
Benefits of the Cloud
Challenges with the Cloud
Cloud Deployment Models
Information Management in the Cloud
Chapter Fourteen. An Elegant Architecture Where Information Flows
The Starting Point
Plenty of Work to be Done
Information Management Maturity
Chapter Fifteen. Modern Business Intelligence—Collaboration, Mobile, and Self-Service: Organizing the Discussion and Tethering the User to Information
The Mobile Revolution
Mobile Business Intelligence
Self-Service Business Intelligence
Collaborative Business Intelligence
Chapter Sixteen. Agile Practices for Information Management
Traditional Waterfall Methodology
SCRUM and Methodology Themes
Chapter Seventeen. Organizational Change Management: The Soft Stuff is the Hard Stuff
Organizational Change Management Work Products
Organization Change Management is Essential to Project Success
Information Management: Gaining a Competitive Advantage with Data is about making smart decisions to make the most of company information. Expert author William McKnight develops the value proposition for information in the enterprise and succinctly outlines the numerous forms of data storage. Information Management will enlighten you, challenge your preconceived notions, and help activate information in the enterprise. Get the big picture on managing data so that your team can make smart decisions by understanding how everything from workload allocation to data stores fits together.
The practical, hands-on guidance in this book includes:
- Part 1: The importance of information management and analytics to business, and how data warehouses are used
- Part 2: The technologies and data that advance an organization, and extend data warehouses and related functionality
- Part 3: Big Data and NoSQL, and how technologies like Hadoop enable management of new forms of data
- Part 4: Pulls it all together, while addressing topics of agile development, modern business intelligence, and organizational change management
Read the book cover-to-cover, or keep it within reach as a quick and useful resource. Either way, this book will enable you to master the possibilities for data and gain the broadest view across the enterprise.
- Balances business and technology, with non-product-specific technical detail
- Shows how to leverage data to deliver ROI for a business
- Engaging and approachable, with practical advice on the pros and cons of each domain, so that you learn how information fits together into a complete architecture
- Provides a path for the data warehouse professional into the new normal of heterogeneity, including NoSQL solutions
IT organizations/vendors/consultants, DBAs, information architects, managers/directors of information management.
- © Morgan Kaufmann 2014
- 12th December 2013
- Morgan Kaufmann
From the Author: The Information Architecture To Pursue
A top priority of CIOs and organizations everywhere is how to best adapt the environment to manage the information asset. There is a plethora of available systems to throw into that equation. The possibilities can be daunting.
- "One size fits all" does not apply to information architecture.
- Gone are the days when vendors could bring their laminated architectures to a client with credibility.
- Organizations must go forward incrementally from where they are and deliver business returns with each increment, on a quarterly rather than yearly turnaround.
For such an important asset, the barometer cannot be a competitor’s environment. Early adopters of good practices will reap the most rewards. Following are several key actions to take to improve a company’s information architecture.
Move Key Operational Systems To In-Memory
In-memory for operational systems is appropriate wherever SQL is used operationally and the performance gains of in-memory can be utilized.
Configurations differ. Products like VoltDB are NewSQL systems purpose-built for the data storage and throughput demands of transactional systems. NewSQL is used today for traditional high-performance applications such as capital markets data feeds, financial trading, telco record streams, sensor-based distribution systems, wireless, online gaming, fraud detection, digital ad exchanges, and micro-transaction systems.
NewSQL systems are in-memory, schema-based DBMS systems that scale out in a cluster. They have high availability architectures that use synchronous, multi-master, active-active replication. As the name implies, NewSQL supports full SQL – aggregate functions, LIKE, UNION, materialized views, indexes, etc.
In-memory is also found in DBMS environments that primarily scale up, such as SAP HANA, Teradata, and IBM PureData. With in-memory systems like HANA, a company can hold its entire operational database in RAM as the primary persistence layer. With multi-core CPUs becoming standard, CPUs can process increased data volumes in parallel. Main memory is no longer a limited resource, and these systems recognize this and fully exploit it. Caches and layers are eliminated because the entire physical database sits on the motherboard and is therefore in memory all the time. By providing added performance and full ACID compliance, these systems are pushing up the threshold of size and complexity at which NoSQL systems make sense.
Selectively Utilize Data Stream Processing
Data stream processing and event stream processing can hardly be considered data stores alongside DBMS and file systems, since they do not actually store data. They are, however, data processing platforms. Data is only stored in data stores for processing later anyway, so if an organization can perform all processing without the storage, it can skip the storage. Profile data, such as that found in a Master Data Management hub, can be added to the processing alongside the stream, providing instantaneous, context-sensitive processing in real time.
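To make the idea concrete, here is a minimal Python sketch of processing a stream against profile data without storing the events. It is an illustration only, not a method from the book: the `PROFILES` dictionary stands in for a master data hub, and the event fields, the `credit_limit` attribute, and the function names are all hypothetical.

```python
from typing import Dict, Iterable, Iterator

# Hypothetical master data: customer profiles keyed by customer id.
PROFILES: Dict[str, dict] = {
    "c1": {"name": "Acme", "credit_limit": 500.0},
    "c2": {"name": "Globex", "credit_limit": 5000.0},
}


def over_limit_alerts(
    events: Iterable[dict], profiles: Dict[str, dict]
) -> Iterator[dict]:
    """Yield an alert for each event whose amount exceeds the customer's limit.

    Events are consumed one at a time and are never persisted.
    """
    for event in events:
        profile = profiles.get(event["customer_id"])
        if profile is None:
            continue  # no master data for this customer; skipped in this sketch
        if event["amount"] > profile["credit_limit"]:
            yield {
                "customer": profile["name"],
                "amount": event["amount"],
                "limit": profile["credit_limit"],
            }


def sample_stream() -> Iterator[dict]:
    """Hypothetical event source; a real one would read from a feed."""
    yield {"customer_id": "c1", "amount": 120.0}
    yield {"customer_id": "c1", "amount": 750.0}
    yield {"customer_id": "c2", "amount": 750.0}
    yield {"customer_id": "c3", "amount": 10.0}


alerts = list(over_limit_alerts(sample_stream(), PROFILES))
```

The same 750.0 event triggers an alert for one customer and not the other, which is the context sensitivity that the profile data supplies.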
Examine The Syndicated Data Marketplace
Data has existed for purchase for a while, but it has mostly been sourced for a very specific need, such as a marketing list for a promotion. As organizations move to more widespread data access and leverageable data structures, an investment in syndicated data can be leveraged throughout the enterprise.
Embrace Master Data Management
When facing a mounting workload of adding value to an enterprise through information management, it is wise to consider which key components of each application can be managed separately. The most prominent of these components is master data. Building master data in a scalable, sharable manner, such as with a master data management approach, will streamline project development time and bring consistent data to multiple applications.
Utilize Data Virtualization
Dispense with the notion that each query must come from one data store. With data virtualization making big gains in recent years, data can be selectively stored in its best-fit platform and still be served to queries requiring data from elsewhere. Data virtualization can save the day for one-off queries, or selective queries that use it can be architected into scheduled operations.
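As a rough illustration of one query drawing on two stores, the Python sketch below combines a relational store with a key-value style store at query time, without copying either into the other. It is a toy under stated assumptions, not how any data virtualization product works: the in-memory SQLite database, the `document_store` dictionary, the table and field names, and `revenue_by_segment` are all hypothetical.

```python
import sqlite3
from typing import Dict

# Store 1 (hypothetical): a relational store holding order facts.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE orders (customer_id TEXT, amount REAL)")
warehouse.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("c1", 10.0), ("c2", 25.0), ("c1", 5.0)],
)

# Store 2 (hypothetical): a key-value style store holding customer attributes.
document_store: Dict[str, dict] = {
    "c1": {"segment": "retail"},
    "c2": {"segment": "wholesale"},
}


def revenue_by_segment(conn: sqlite3.Connection, docs: Dict[str, dict]) -> Dict[str, float]:
    """Answer one question from two stores, leaving the data where it lives."""
    totals: Dict[str, float] = {}
    cursor = conn.execute(
        "SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id"
    )
    for customer_id, amount in cursor:
        # Look up the attribute in the second store at query time.
        segment = docs.get(customer_id, {}).get("segment", "unknown")
        totals[segment] = totals.get(segment, 0.0) + amount
    return totals


result = revenue_by_segment(warehouse, document_store)
```

The aggregation is pushed down to the store that holds the facts, and only the small per-customer result is joined with the attributes from the other store.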
Marginalize Multidimensional Databases
The hyper-denormalized multidimensional structure has proved very difficult to use effectively. When created "spot on" to a query need, it performs well. When mismatched, either through too few columns in the structure (requiring "drill through") or too many (creating overhead), it becomes an encumbrance.
Use Data Warehouses Strategically
The data warehouse concept is still necessary in any modern environment. The idea of sharing the data, the DBMS platform, a model, the methods, and the tools across different data sets and subject areas brings many benefits.
Making data warehouse data columnar in orientation would generally help a data warehouse more than it would hurt. However, people generally dislike any downsides with their upgrades. A data warehouse community is not just multiple people; it consists of very disparate user groups. The "groupthink" of the data warehouse will also limit finding the value proposition for the in-memory data warehouse, although SSD storage is a must.
The data warehouse can be the lowest common denominator approach to storing data, which is not bad for the mid-specification analytic workload.
Data warehouses will see evolutionary change, but new applications and those who want specific analytic features may just source their data from the data warehouse. There are many of these "marts" being built today. The expansion of platform features in DBMS will continue as marts go searching for their best-fit platform.
Make Analytic Marts Columnar And In-Memory
Analytic marts built to support a single application, subject area, or department are well served to optimize around the specific requirements. These marts, which are multiplying throughout enterprises, have eschewed joining forces with the data warehouse, often because the analytic features will not be turned on for the data warehouse.
At less than 5 terabytes, analytic marts provide a great playground, with little of the downside and bureaucracy that would otherwise prevent trying out a columnar orientation and in-memory processing. Once tried, these features have quick appeal because the analytic performance they deliver is often orders of magnitude greater.
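The difference between row and columnar orientation can be sketched in a few lines of Python. This is a toy illustration with hypothetical data and function names, not a description of any product's storage engine, and it makes no performance claim of its own: it only shows that an aggregate over one column needs to touch just that column when the data is laid out column by column.

```python
from typing import Dict, List

# Row orientation: each record is stored together (hypothetical sample data).
rows: List[dict] = [
    {"order_id": 1, "region": "east", "amount": 10.0},
    {"order_id": 2, "region": "west", "amount": 25.0},
    {"order_id": 3, "region": "east", "amount": 5.0},
]


def to_columns(records: List[dict]) -> Dict[str, list]:
    """Pivot row-oriented records into one list of values per column."""
    columns: Dict[str, list] = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)


def total_row_store(records: List[dict]) -> float:
    # Every whole record is visited even though only one field is needed.
    return sum(record["amount"] for record in records)


def total_column_store(cols: Dict[str, list]) -> float:
    # Only the "amount" column is read; other columns are never touched.
    return sum(cols["amount"])
```

Both functions return the same total; the columnar version simply reads less data to get there, which is why analytic queries that scan a few columns over many rows tend to favor this layout.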
Praise for Information Management:
"This is an excerpt from the first chapter of Information Management: Strategies for Gaining a Competitive Advantage with Data, written by William McKnight…he addresses the relationship between information management and business value, explores data management technologies, and offers advice on maximizing the potential of enterprise information."--SearchDataManagement.com, March 31, 2014
"…overall it does provide some very useful information and guidance that could be used as part of a preparation and planning exercise towards developing a suitable data and information management strategy… it would make a suitable first guide for anyone who has been given the task of developing such a scheme, and might help to clarify some of the key issues in such a way as to make the task a little bit easier." Score: 7 out of 10--BCS.org, April 2014
"William McKnight has delivered a very clear and concise explanation about how to get the most from your organization’s data. He steps the reader through an assortment of data processing technologies and approaches and show which deliver the best ROI for which types of workloads. This is a desperately needed mapping that many users will find invaluable!"--Wayne Eckerson, business intelligence thought leader and president of Eckerson Group, a business-technology management consulting firm specializing in BI, performance management, and analytics.
"A blueprint and action plan for a corporate information management strategy, this book is a useful guide for anyone who wishes to improve business success with technology. Author William McKnight provides the foundation and tools for information managers to set policies and programs for the improved management of information, while addressing advances in architecture and technology principles."--Julie Langenkamp-Muenkel, Editorial Director of Information-Management.com
"I always enjoy William’s writing, especially his balance between inspiring foresight and pragmatic advice rooted in real-world experience. He has skillfully shown that poise again: with his guidance you’ll find Information Management transforms what can be a burdensome responsibility into an insightful practice."--Donald Farmer, VP Product Management, qlikview.com
"Many claim we're in the golden age of data management; every traditional paradigm and approach seems to have a newer, better, and faster alternative. This book provides a terrific overview of the new class of technologies that must be integrated into every CIO's technology plan."--Evan Levy, Co-Author, Customer Data Integration: Reaching a Single Version of the Truth
"Big data is no longer just an IT topic. It’s one that’s now top-of-mind for executives, too. William McKnight takes the increasingly knotty hairball of information management—its practices, technologies, and skills—and unravels it in this timely and relevant book. A must-read for business and IT pros alike."--Jill Dyché, SAS Vice President and author of The New IT
"I challenge any Information Management professional to not get value from this book. William covers a range of topics, and has so much knowledge he is able to offer usable insights across them all. The book is unique in the way it provides such a solid grounding for anyone making architectural or process decisions in the field of information management, and should be required reading for organizations looking to understand how newer approaches and technologies can be used to enable better decision making."--Michael Whitehead, CEO and Co-Founder, WhereScape Software
William is President of McKnight Consulting Group (www.mcknightcg.com). He is an internationally recognized authority in information management. His consulting work has included many of the Global 2000 and numerous midmarket companies. His teams have won several best practice competitions for their implementations and many of his clients have gone public with their success stories. His strategies form the information management plan for leading companies in various industries.
William is a very popular speaker worldwide and a prolific writer with hundreds of articles and white papers published. William is a distinguished entrepreneur, and a former Fortune 50 technology executive and software engineer. He provides clients with strategies, architectures, platform and tool selection, and complete programs to manage information.
President of McKnight Consulting Group