Principles of Big Data

Preparing, Sharing, and Analyzing Complex Information

By

  • Jules Berman, Ph.D., M.D., freelance author with expertise in informatics, computer programming, and cancer biology, Columbia, MD, USA

Principles of Big Data helps readers avoid the common mistakes that endanger all Big Data projects. By stressing simple, fundamental concepts, this book teaches readers how to organize large volumes of complex data and how to achieve data permanence when the content of the data is constantly changing. General methods for data verification and validation, as specifically applied to Big Data resources, are emphasized throughout. The book demonstrates how adept analysts can find relationships among data objects held in disparate Big Data resources when those objects are endowed with semantic support (i.e., organized in classes of uniquely identified data objects). Readers will learn how their data can be integrated with data from other resources, and how data extracted from Big Data resources can be used for purposes beyond those imagined by the data creators.

Audience

Data managers, data analysts, and statisticians

Book information

  • Published: May 2013
  • Imprint: Morgan Kaufmann
  • ISBN: 978-0-12-404576-7

Reviews

"By stressing simple, fundamental concepts, this book teaches readers how to organize large volumes of complex data, and how to achieve data permanence when the content of the data is constantly changing. General methods for data verification and validation, as specifically applied to Big Data resources, are stressed throughout the book."--ODBMS.org, March 21, 2014
"The book is written in a colloquial style and is full of anecdotes, quotations from famous people, and personal opinions."--ComputingReviews.com, February 3, 2014
"The author has produced a sober, serious treatment of this emerging phenomenon, avoiding hype and gee-whiz cases in favor of concepts and mature advice. For example, the author offers ten distinctions between big data and small data, including such factors as goals, location, data structure, preparation, and longevity. This characterization provides much greater insight into the phenomenon than the standard 3V treatment (volume, velocity, and variety)."--ComputingReviews.com, October 3, 2013

Table of Contents

Preface

Introduction

1. Big Data Moves to the Center of the Universe

2. Measurement

3. Annotation

4. Identification, De-identification, and Re-identification

5. Ontologies and Semantics: How information is endowed with meaning

6. Standards and their Versions

7. Legacy Data

8. Hypothesis Testing

9. Prediction

10. Software

11. Complexity

12. Vulnerabilities

13. Legalities

14. Social and Ethical Issues