Data Deduplication Approaches - 1st Edition - ISBN: 9780128233955, 9780128236338

Data Deduplication Approaches

1st Edition

Concepts, Strategies, and Challenges

Editors: Tin Thein Thwel, G. R. Sinha
Paperback ISBN: 9780128233955
eBook ISBN: 9780128236338
Imprint: Academic Press
Published Date: 25th November 2020
Page Count: 404

In the age of data science, the rapidly increasing amount of data is a major concern in numerous applications of computing operations and data storage. Duplicated or redundant data is a major challenge in data science research. Data Deduplication Approaches: Concepts, Strategies, and Challenges shows readers the various methods that can be used to eliminate multiple copies of the same files, as well as duplicated segments or chunks of data within those files. Because data duplication keeps increasing, deduplication has become an especially useful field of research for storage environments, in particular persistent data storage. Data Deduplication Approaches gives readers an overview of the concepts and background of data deduplication, then demonstrates in technical detail the strategies and challenges of real-time implementations for handling big data, data science, data backup, and recovery. The book also includes future research directions, case studies, and real-world applications of data deduplication, focusing on reduced storage, backup, recovery, and reliability.
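
As a rough illustration of the chunk-level idea mentioned in the description, the sketch below splits a byte string into fixed-size chunks, stores each distinct chunk once under its SHA-256 digest, and keeps an ordered list of digests for rebuilding the original. It is not taken from the book: the function names `deduplicate_chunks` and `restore`, the tiny chunk size, and the choice of fixed-size chunking are all assumptions made for illustration.

```python
import hashlib


def deduplicate_chunks(data: bytes, chunk_size: int = 4):
    """Split data into fixed-size chunks and keep each distinct chunk once.

    Hypothetical helper for illustration only; real systems typically use
    much larger chunks and often content-defined chunk boundaries.
    """
    store = {}   # digest -> chunk bytes (each unique chunk stored once)
    recipe = []  # ordered digests needed to rebuild the original data
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # skip chunks already stored
        recipe.append(digest)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original bytes from the chunk store and the recipe."""
    return b"".join(store[digest] for digest in recipe)


data = b"ABCDABCDEFGHABCD"
store, recipe = deduplicate_chunks(data, chunk_size=4)
print(len(recipe), "chunks referenced,", len(store), "chunks stored")
```

Here the 16-byte input is referenced as four chunks, but only the distinct chunks are kept in `store`; `restore(store, recipe)` returns the original bytes.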

Key Features

  • Includes data deduplication methods for a wide variety of applications
  • Includes concepts and implementation strategies that will help the reader to use the suggested methods
  • Provides a robust set of methods that will help readers to appropriately and judiciously use the suitable methods for their applications
  • Focuses on reduced storage, backup, recovery, and reliability, which are the most important aspects of implementing data deduplication approaches
  • Includes case studies


Biomedical engineers and researchers in biomedical engineering, applied informatics, and data science

Students and researchers in artificial intelligence, data analytics, and data science

Table of Contents

    1. Introduction to data deduplication approaches
    2. Data deduplication concepts
    3. Concepts, strategies, and challenges of data deduplication
    4. Existing mechanisms for data deduplication
    5. Classification criteria for data deduplication methods
    6. File chunking approaches
    7. Study of data deduplication for file chunking approaches
    8. Essentials of data deduplication using open-source toolkit
    9. Efficient data deduplication scheme for scale-out distributed storage
    10. Identification of duplicate bug reports in software bug repositories: a systematic review, challenges and future scope
    11. A survey and critical analysis on energy generation from datacenter
    12. Review of MODIS EVI and NDVI data for data mining applications
    13. Performance modeling for secure migration processes of legacy systems to the cloud computing
    14. DedupCloud: an optimized efficient virtual machine deduplication algorithm in cloud computing environment
    15. Data deduplication for cloud storage
    16. Data duplication using Amazon Web Services cloud storage
    17. Game-theoretic analysis of encrypted cloud data deduplication
    18. Data deduplication applications in cognitive science and computer vision research


© Academic Press 2020

About the Editors

Tin Thein Thwel

Tin Thein Thwel, PhD is a Professor at Myanmar Institute of Information Technology (MIIT), Mandalay, Myanmar. She received her PhD in Information Technology from the University of Computer Studies, Yangon (UCSY), Myanmar. She is a reviewer and technical committee member of the International Conference on Computer and Applications (ICCA) on data deduplication, cyber security, data mining, and information retrieval. She has 16 years of teaching experience at the university level and her research interests include data deduplication, cyber security, data mining and data science, information retrieval, and distributed computing.

Affiliations and Expertise

Professor, Myanmar Institute of Information Technology (MIIT), Mandalay, Myanmar

G. R. Sinha

G. R. Sinha, PhD is an Adjunct Professor at the International Institute of Information Technology (IIIT) Bangalore, India, and is presently deputed as Professor at Myanmar Institute of Information Technology (MIIT), Mandalay, Myanmar. He has published 259 research papers in various international and national journals and conferences. He has edited four books with Elsevier, Springer, and IOP, and is currently editing seven more books with reputed publishers. He is a Visiting Professor (Honorary) of Sri Lanka Technological Campus Colombo. He is a Senate member of MIIT and an ACM Distinguished Speaker in the field of Digital Signal Processing. His research areas include cognitive science, brain computing, image processing, and data science applications.

Affiliations and Expertise

Adjunct Professor, International Institute of Information Technology (IIIT), Bangalore, India
