Repurposing Legacy Data

Innovative Case Studies

1st Edition - March 13, 2015

  • Author: Jules Berman
  • eBook ISBN: 9780128029152
  • Paperback ISBN: 9780128028827

Repurposing Legacy Data: Innovative Case Studies looks at how data scientists have repurposed legacy data, whether their own or legacy data that has been donated to the public domain. Most of the data stored worldwide is legacy data: data created some time in the past, for a particular purpose, and left in obsolete formats. As with keepsakes in an attic, we retain this information thinking it may have value in the future, though we have no current use for it. The case studies in this book, drawn from fields as diverse as cosmology, quantum physics, high-energy physics, microbiology, psychiatry, medicine, and hospital administration, demonstrate how innovative people draw value from legacy data. By following the case examples, readers will learn how legacy data is restored, merged, and analyzed for purposes that were never imagined by the original data creators.

Key Features

  • Discusses how combining existing data with other data sets of the same kind can produce an aggregate data set that serves to answer questions that could not be answered with any of the original data
  • Presents a method for re-analyzing original data sets using alternate or improved methods that can provide outcomes more precise and reliable than those produced in the original analysis
  • Explains how to integrate heterogeneous data sets for the purpose of answering questions or developing concepts that span several different scientific fields


Primary Market: data scientists, Big Data curators, statisticians, researchers; Secondary Market: graduate-level students in computer science, statistics, information sciences

Table of Contents

    • Author Biography
    • Chapter 1. Introduction
      • 1.1 Why Bother?
      • 1.2 What Is Data Repurposing?
      • 1.3 Data Worth Preserving
      • 1.4 Basic Data Repurposing Tools
      • 1.5 Personal Attributes of Data Repurposers
      • References
    • Chapter 2. Learning from the Masters
      • 2.1 New Physics from Old Data
      • 2.2 Repurposing the Physical and Abstract Property of Uniqueness
      • 2.3 Repurposing a 2,000-Year-Old Classification
      • 2.4 Decoding the Past
      • 2.5 What Makes Data Useful for Repurposing Projects?
      • References
    • Chapter 3. Dealing with Text
      • 3.1 Thus It Is Written
      • 3.2 Search and Retrieval
      • 3.3 Indexing Text
      • 3.4 Coding Text
      • References
    • Chapter 4. New Life for Old Data
      • 4.1 New Algorithms
      • 4.2 Taking Closer Looks
      • 4.3 Crossing Data Domains
      • References
    • Chapter 5. The Purpose of Data Analysis Is to Enable Data Reanalysis
      • 5.1 Every Initial Data Analysis on Complex Datasets Is Flawed
      • 5.2 Unrepeatability of Complex Analyses
      • 5.3 Obligation to Verify and Validate
      • 5.4 Asking What the Data Really Means
      • References
    • Chapter 6. Dark Legacy: Making Sense of Someone Else’s Data
      • 6.1 Excavating Treasures from Lost and Abandoned Data Mines
      • 6.2 Nonstandard Standards
      • 6.3 Specifications, Not Standards
      • 6.4 Classifications and Ontologies
      • 6.5 Identity and Uniqueness
      • 6.6 When to Terminate (or Reconsider) a Data Repurposing Project
      • References
    • Chapter 7. Social and Economic Issues
      • 7.1 Data Sharing and Reproducible Research
      • 7.2 Acquiring and Storing Data
      • 7.3 Keeping Your Data Forever
      • 7.4 Data Immutability
      • 7.5 Privacy and Confidentiality
      • 7.6 The Economics of Data Repurposing
      • References
    • Appendix A. Index of Case Studies
    • Appendix B. Glossary
      • References

Product details

  • No. of pages: 176
  • Language: English
  • Copyright: © Elsevier 2015
  • Published: March 13, 2015
  • Imprint: Elsevier
  • eBook ISBN: 9780128029152
  • Paperback ISBN: 9780128028827

About the Author

Jules Berman

Jules Berman holds two Bachelor of Science degrees from MIT (in Mathematics and in Earth and Planetary Sciences), a PhD from Temple University, and an MD from the University of Miami. He was a graduate researcher at the Fels Cancer Research Institute (Temple University) and at the American Health Foundation in Valhalla, New York. He completed his postdoctoral studies at the US National Institutes of Health, and his residency at the George Washington University Medical Center in Washington, DC. Dr. Berman served as Chief of anatomic pathology, surgical pathology, and cytopathology at the Veterans Administration Medical Center in Baltimore, Maryland, where he held joint appointments at the University of Maryland Medical Center and at the Johns Hopkins Medical Institutions. In 1998, he transferred to the US National Institutes of Health as a Medical Officer and as the Program Director for Pathology Informatics in the Cancer Diagnosis Program at the National Cancer Institute. Dr. Berman is a past President of the Association for Pathology Informatics and is the 2011 recipient of the Association’s Lifetime Achievement Award. He is a listed author of more than 200 scientific publications and has written more than a dozen books in his three areas of expertise: informatics, computer programming, and pathology. Dr. Berman is currently a freelance writer.

Affiliations and Expertise

Freelance author with expertise in informatics, computer programming, and cancer biology
