Data Architecture: A Primer for the Data Scientist: Big Data, Data Warehouse and Data Vault, Second Edition addresses how Big Data fits within the existing information infrastructure and data warehousing systems. This is an essential topic as researchers and engineers increasingly need to deal with large and complex sets of data. Until data is gathered and placed into an existing framework or architecture, it cannot be used to its full potential. Drawing upon years of practical experience and using numerous examples and case studies from across industries, the authors explain where Big Data fits, giving data scientists the necessary context for how pieces of the puzzle should fit together.
- Reviews the exponential growth of Big Data integration and applications across industries – from healthcare to finance
- Places new emphasis on end state architecture as a lens for understanding the architecture of Big Data
- Explains how Big Data fits within an existing systems environment, as well as the value of data transformation and redundancy
- Includes new chapters on data lakes, ponds, landing zones, IoT, edge computing, data modeling and taxonomies
Data analysts, data managers, researchers, and engineers who need to deal with large and complex sets of data; master's-level students in data analytics programs
1. Introduction to architecture
2. “Diagram of the world”, end state architecture
3. Transformation and redundancy
4. Big Data
5. Siloed applications
6. Data vault
7. Data lake, ponds, landing zone
8. IoT, Edge computing
9. Operational environment
10. The evolution of data architecture
11. Repetitive data, the sandbox
12. Non-repetitive data, contextualization
13. Operational performance
14. Integration of data
15. Personal computing
16. Managing text, taxonomies
17. System of record
18. The intellectual roadmap – data modelling, taxonomies, etc.
19. Business value across the architecture
20. Virtualization, streaming
21. The end of evolution
- © Academic Press 2019
- 1st June 2019
- Academic Press
Best known as the “Father of Data Warehousing,” Bill Inmon has become the most prolific and well-known author worldwide in the big data analysis, data warehousing and business intelligence arena. In addition to authoring more than 50 books and 650 articles, Bill has been a monthly columnist with the Business Intelligence Network, EIM Institute and Data Management Review. In 2007, Bill was named by Computerworld as one of the “Ten IT People Who Mattered in the Last 40 Years” of the computer profession. With 35 years of experience in database technology and data warehouse design, he is known globally for his seminars on developing data warehouses and information architectures. Bill has been an in-demand keynote speaker for numerous computing associations, industry conferences and trade shows. He also has an extensive entrepreneurial background: he founded and took public Prism Solutions in 1991, and founded Pine Cone Systems (later renamed Ambeo) in 1995. Bill consults with a large number of Fortune 1000 clients and leading IT executives on data warehousing, business intelligence and database management, offering data warehouse design and database management services and producing methodologies and technologies that advance the enterprise architectures of large and small organizations worldwide. He has worked for American Management Systems and Coopers & Lybrand. Bill received his Bachelor of Science degree in Mathematics from Yale University and his Master of Science degree in Computer Science from New Mexico State University.
Inmon Data Systems, Castle Rock, CO, USA
Dan Linstedt has more than 25 years of experience in the Data Warehousing and Business Intelligence field and is internationally known for inventing the Data Vault 1.0 model and the Data Vault 2.0 System of Business Intelligence. He helps business and government organizations around the world achieve BI excellence by applying his proven knowledge of Big Data, unstructured information management, agile methodologies and product development. He has held training classes and presented at TDWI, Teradata Partners, DAMA, Informatica, Oracle user groups and the Data Modeling Zone conference. He has a background in SEI/CMMI Level 5 and has contributed to architecture efforts for petabyte-scale data warehouses. He also offers high-quality online training and consulting services for Data Vault.
Founder and Principal of Empowered Holdings, LLC, St. Albans, VT, USA
Mary Levins is recognized as a leader in Data Governance with over 20 years of experience working with organizations to bring value through data strategies that drive business results. Mary has a BS and MS in Industrial Engineering, and her experience spans many industries, including manufacturing, healthcare, energy/utilities, automotive, electronics, and financial services (including consumer credit bureaus and credit unions). Today, Mary is the founder of Sierra Creek Consulting, a specialized firm delivering Data Governance, Data Management, and Data Solutions to help companies bring value through data.
Sierra Creek Consulting LLC, Dacula, GA, USA