Data Simplification - 1st Edition - ISBN: 9780128037812, 9780128038543

Data Simplification

1st Edition

Taming Information With Open Source Tools

Author: Jules Berman
eBook ISBN: 9780128038543
Paperback ISBN: 9780128037812
Imprint: Morgan Kaufmann
Published Date: 9th March 2016
Page Count: 398

Description

Data Simplification: Taming Information With Open Source Tools addresses the simple fact that modern data is too big and complex to analyze in its native form. Data simplification is the process whereby large and complex data is rendered usable. Complex data must be simplified before it can be analyzed, but the process of data simplification is anything but simple, requiring a specialized set of skills and tools.

This book provides data scientists from every scientific discipline with the methods and tools to simplify their data for immediate analysis or long-term storage in a form that can be readily repurposed or integrated with other data.

Drawing upon years of practical experience, and using numerous examples and use cases, Jules Berman discusses the principles, methods, and tools that must be studied and mastered to achieve data simplification; open source tools, free utilities, and snippets of code that can be reused and repurposed to simplify data; natural language processing and machine translation as tools to simplify data; and data summarization and visualization, and the role they play in making data useful for the end user.

Key Features

  • Discusses data simplification principles, methods, and tools that must be studied and mastered
  • Provides open source tools, free utilities, and snippets of code that can be reused and repurposed to simplify data
  • Explains how to best utilize indexes to search, retrieve, and analyze textual data
  • Shows the data scientist how to apply ontologies, classifications, classes, properties, and instances to data using tried and true methods

Readership

Researchers in academia and graduate students in Computer Science with an interest in machine learning.

Table of Contents

Chapter 1. The Simple Life

Chapter 2. Structuring Text

Chapter 3. Indexing Text

Chapter 4. Understanding Your Data

Chapter 5. Identifying and Deidentifying Data

Chapter 6. Giving Meaning to Data

Chapter 7. Object-Oriented Data

Chapter 8. Problem Simplification

Details

No. of pages: 398
Language: English
Copyright: © Morgan Kaufmann 2016
Published: 9th March 2016
Imprint: Morgan Kaufmann
eBook ISBN: 9780128038543
Paperback ISBN: 9780128037812

About the Author

Jules Berman

Jules Berman holds two bachelor of science degrees from MIT (Mathematics, and Earth and Planetary Sciences), a PhD from Temple University, and an MD from the University of Miami. He was a graduate researcher in the Fels Cancer Research Institute at Temple University, and at the American Health Foundation in Valhalla, New York. His post-doctoral studies were completed at the U.S. National Institutes of Health, and his residency was completed at the George Washington University Medical Center in Washington, D.C. Dr. Berman served as Chief of Anatomic Pathology, Surgical Pathology and Cytopathology at the Veterans Administration Medical Center in Baltimore, Maryland, where he held joint appointments at the University of Maryland Medical Center and at the Johns Hopkins Medical Institutions. In 1998, he transferred to the U.S. National Institutes of Health as a Medical Officer and as the Program Director for Pathology Informatics in the Cancer Diagnosis Program at the National Cancer Institute. Dr. Berman is a past President of the Association for Pathology Informatics and the 2011 recipient of the association's Lifetime Achievement Award. He is a listed author on over 200 scientific publications and has written more than a dozen books in his three areas of expertise: informatics, computer programming, and cancer biology. Dr. Berman is currently a freelance writer.

Affiliations and Expertise

Ph.D., M.D., freelance author with expertise in informatics, computer programming, and cancer biology

Reviews

"As there is a 'gold rush' encouraging the workforce training of data scientists, this gritty 'Rules of the Road' monograph should serve as a constant companion for modern data scientists. Berman convincingly portrays the value of programmers and analysts who have facility with Perl, Python, or Ruby and who understand the critical role of metadata, indexing, and data visualization. These professionals will be high on my shopping list of talent to add to our biomedical informatics team in Pittsburgh."

"Data Simplification provides easy, free solutions to the unintended consequences of data complexity. This book should be the first (and probably most important) guide to success in the data sciences. I will be providing copies to my trainees, programmers, analysts, and faculty, as required reading."

-Michael J. Becich, MD, PhD, Associate Vice-Chancellor for Informatics in the Health Sciences, Chairman and Distinguished University Professor, Department of Biomedical Informatics, Director, Center for Commercial Application (CCA) of Healthcare Data, University of Pittsburgh School of Medicine