Principles and Practice of Big Data - 2nd Edition - ISBN: 9780128156094, 9780128156100

Principles and Practice of Big Data

2nd Edition

Preparing, Sharing, and Analyzing Complex Information

Author: Jules Berman
Paperback ISBN: 9780128156094
eBook ISBN: 9780128156100
Imprint: Academic Press
Published Date: 25th July 2018
Page Count: 480


Principles and Practice of Big Data: Preparing, Sharing, and Analyzing Complex Information, Second Edition updates and expands on the first edition, bringing a set of techniques and algorithms that are tailored to Big Data projects. The book stresses the point that most data analyses conducted on large, complex data sets can be achieved without the use of specialized suites of software (e.g., Hadoop), and without expensive hardware (e.g., supercomputers). The core of every algorithm described in the book can be implemented in a few lines of code using just about any popular programming language (Python snippets are provided).

Through multiple new examples, this edition demonstrates that if we understand our data, and if we know how to ask the right questions, we can learn a great deal from large and complex data collections. The book will assist students and professionals from all scientific backgrounds who are interested in stepping outside the traditional boundaries of their chosen academic disciplines.

Key Features

  • Presents new methodologies that are widely applicable to just about any project involving large and complex datasets
  • Offers readers informative new case studies across a range of scientific and engineering disciplines
  • Provides insights into semantics, identification, de-identification, vulnerabilities and regulatory/legal issues
  • Utilizes a combination of pseudocode and very short snippets of Python code to show readers how they may develop their own projects without downloading or learning new software


Readership

Researchers, engineers, data analysts, and data managers who need to deal with large and complex sets of data

Table of Contents

1. Introduction
2. Providing Structure to Unstructured Data
3. Identification, Deidentification, and Reidentification
4. Metadata, Semantics, and Triples
5. Classifications and Ontologies
6. Introspection
7. Data Integration and Software Interoperability
8. Immutability and Immortality
9. Assessing the Adequacy of a Big Data Resource
10. Measurement
11. Indispensable Tips for Fast and Simple Big Data Analysis
12. Finding the Clues in Large Collections of Data
13. Using Random Numbers to Bring Your Big Data Analytic Problems Down to Size
14. Special Considerations in Big Data Analysis
15. Big Data Failures and How to Avoid (Some of) Them
16. Legalities
17. Data Sharing
18. Data Reanalysis: Much More Important than Analysis
19. Repurposing Big Data



About the Author

Jules Berman

Jules Berman holds two bachelor of science degrees from MIT (Mathematics, and Earth and Planetary Sciences), a PhD from Temple University, and an MD from the University of Miami. He was a graduate researcher at the Fels Cancer Research Institute at Temple University, and at the American Health Foundation in Valhalla, New York. His post-doctoral studies were completed at the U.S. National Institutes of Health, and his residency was completed at the George Washington University Medical Center in Washington, D.C. Dr. Berman served as Chief of Anatomic Pathology, Surgical Pathology and Cytopathology at the Veterans Administration Medical Center in Baltimore, Maryland, where he held joint appointments at the University of Maryland Medical Center and at the Johns Hopkins Medical Institutions. In 1998, he transferred to the U.S. National Institutes of Health as a Medical Officer and as the Program Director for Pathology Informatics in the Cancer Diagnosis Program at the National Cancer Institute. Dr. Berman is a past President of the Association for Pathology Informatics, and the 2011 recipient of the association's Lifetime Achievement Award. He is a listed author on over 200 scientific publications and has written more than a dozen books in his three areas of expertise: informatics, computer programming, and cancer biology. Dr. Berman is currently a freelance writer.

Affiliations and Expertise

Freelance author with expertise in informatics, computer programming, and cancer biology
