Computation and Storage in the Cloud

Understanding the Trade-Offs

Computation and Storage in the Cloud is the first comprehensive and systematic work to investigate the trade-off between computation and storage in the cloud as a way to reduce overall application cost. Scientific applications are usually computation- and data-intensive: complex computation tasks take a long time to execute, and the generated datasets are often terabytes or petabytes in size. Storing valuable generated datasets saves the cost of regenerating them when they are reused, as well as the waiting time that regeneration causes. However, the sheer size of scientific datasets makes storing them a major challenge. By proposing innovative concepts, theorems and algorithms, this book helps both cloud users and service providers substantially reduce the cost of running computation- and data-intensive scientific applications in the cloud.
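The store-or-regenerate decision described above can be illustrated with a minimal sketch. This is not the book's cost model or any of its algorithms; the function name, parameters, and prices are hypothetical, and it assumes a single dataset with a flat monthly storage price and a fixed cost per regeneration.

```python
def cheaper_to_store(size_gb, storage_price_per_gb_month, regen_cost, uses_per_month):
    """Return True when keeping the dataset costs less per month than
    deleting it and regenerating it every time it is needed.

    All parameters are illustrative assumptions, not values from the book.
    """
    # Monthly cost of keeping the dataset in cloud storage.
    storage_cost_rate = size_gb * storage_price_per_gb_month
    # Monthly cost of recomputing the dataset on each use.
    regeneration_cost_rate = regen_cost * uses_per_month
    return storage_cost_rate < regeneration_cost_rate


# Hypothetical 500 GB dataset at $0.10 per GB-month, $30 per regeneration.
print(cheaper_to_store(500, 0.10, 30, uses_per_month=4))
print(cheaper_to_store(500, 0.10, 30, uses_per_month=1))
```

With these made-up numbers, storage costs about $50 per month, so storing wins when the dataset is reused four times a month ($120 of regeneration) and loses when it is reused only once ($30). The sketch ignores waiting time and dependencies between datasets, both of which the description identifies as part of the real problem.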

  • Covers cost models and benchmarking that explain the necessary tradeoffs for both cloud providers and users
  • Describes several novel strategies for storing application datasets in the cloud
  • Includes real-world case studies of scientific research applications

Audience

Researchers, practitioners, and graduate students in scientific computing seeking guidance for managing application datasets.

Paperback, 128 Pages

Published: February 2013

Imprint: Elsevier

ISBN: 978-0-12-407767-6

Reviews

  • "Cloud computing systems charge for both data storage and for calculating, say Yuan, Yang… and Chen… so there is a trade-off between storing large data sets in the cloud or deleting them and regenerating them each time they are needed. They suggest some approaches to figuring out which is cheaper."--Reference & Research Book News, December 2013
  • "…this book does a good job at tackling a variety of complex subjects. It brings forward state-of-the-art concepts and elaborate algorithms, illustrates issues related to cost-effectiveness, and helps both cloud providers and users get a grip on the intricate world of cloud computing."--Help Net Security online, August 28, 2013


Contents

  • CHAPTER 1  INTRODUCTION
    1.1    Scientific Applications in the Cloud
    1.2    Key Issues of this Research
    1.3    Overview of this Book
    CHAPTER 2 LITERATURE REVIEW
    2.1    Data Management of Scientific Applications in Traditional Distributed Systems
    2.1.1    Data Management in Grid
    2.1.2    Data Management in Grid Workflows
    2.1.3    Data Management in Other Distributed Systems
    2.2    Cost-Effectiveness of Scientific Applications in the Cloud
    2.2.1    Cost-Effectiveness of Deploying Scientific Applications in the Cloud
    2.2.2    Trade-Off between Computation and Storage in the Cloud
    2.3    Data Provenance in Scientific Applications
    2.4    Summary
    CHAPTER 3 MOTIVATING EXAMPLE AND RESEARCH ISSUES
    3.1    Motivating Example
    3.2    Problem Analysis
    3.2.1    Requirements and Challenges of Deploying Scientific Applications in the Cloud
    3.2.2    Bandwidth Cost of Deploying Scientific Applications in the Cloud
    3.3    Research Issues
    3.3.1    Cost Model for Datasets Storage in the Cloud
    3.3.2    Minimum Cost Benchmarking Approaches
    3.3.3    Cost-Effective Storage Strategies
    3.4    Summary
    CHAPTER 4 COST MODEL OF DATASETS STORAGE IN THE CLOUD
    4.1    Classification of Application Data in the Cloud
    4.2    Data Provenance and Data Dependency Graph (DDG)
    4.3    Datasets Storage Cost Model in the Cloud
    4.4    Summary
    CHAPTER 5  MINIMUM COST BENCHMARKING APPROACHES
    5.1    Static On-Demand Minimum Cost Benchmarking Approach
    5.1.1    CTT-SP Algorithm for Linear DDG
    5.1.2    Minimum Cost Benchmarking Algorithm for DDG with One Block
    5.1.2.1    Constructing CTT for DDG with one block
    5.1.2.2    Setting weights to different types of edges
    5.1.2.3    Steps of finding MCSS for DDG with one sub-branch in one block
    5.1.3    Minimum Cost Benchmarking Algorithm for General DDG
    5.1.3.1    General CTT-SP algorithm for different situations
    5.1.3.2    Pseudo-code of general CTT-SP algorithm
    5.2    Dynamic on-the-fly Minimum Cost Benchmarking Approach
    5.2.1    PSS for a DDG_LS
    5.2.1.1    Different MCSSs of a DDG_LS in a solution space
    5.2.1.2    Range of MCSSs’ cost rates for a DDG_LS
    5.2.1.3    Distribution of MCSSs in the PSS of a DDG_LS
    5.2.2    Algorithms for Calculating PSS of a DDG_LS
    5.2.3    PSS for a General DDG (or DDG Segment)
    5.2.3.1    Three dimension PSS of DDG segment with two branches
    5.2.3.2    High dimension PSS of a general DDG
    5.2.4    Dynamic on-the-fly Minimum Cost Benchmarking
    5.2.4.1    Minimum cost benchmarking by merging and saving PSSs in a hierarchy
    5.2.4.2    Updating of the minimum cost benchmark on the fly
    5.3    Summary
    CHAPTER 6  COST-EFFECTIVE DATASETS STORAGE STRATEGIES
    6.1    Data Accessing Delay and Users’ Preferences in Storage Strategies
    6.2    Cost Rate Based Storage Strategy
    6.2.1    Algorithms for the Strategy
    6.2.1.1    Algorithm for deciding newly generated datasets’ storage status
    6.2.1.2    Algorithm for deciding stored datasets’ storage status due to usage frequencies change
    6.2.1.3    Algorithm for deciding regenerated datasets’ storage status
    6.2.2    Cost-Effectiveness Analysis
    6.3    Local-Optimisation Based Storage Strategy
    6.3.1    Algorithms and Rules for the Strategy
    6.3.1.1    Enhanced CTT-SP algorithm for linear DDG
    6.3.1.2    Rules in the Strategy
    6.3.2    Cost-Effectiveness Analysis
    6.4    Summary
    CHAPTER 7  EXPERIMENTS AND EVALUATIONS
    7.1    Experiment Environment
    7.2    Evaluation of Minimum Cost Benchmarking Approaches
    7.2.1    Cost-Effectiveness Evaluation of the Minimum Cost Benchmark
    7.2.2    Efficiency Evaluation of Two Benchmarking Approaches
    7.3    Evaluation of Cost-Effective Storage Strategies
    7.3.1    Cost-Effectiveness of Two Storage Strategies
    7.3.2    Efficiency Evaluation of Two Storage Strategies
    7.4    Case Study of Pulsar Searching Application
    7.4.1    Utilisation of Minimum Cost Benchmarking Approaches
    7.4.2    Utilisation of Cost-Effective Storage Strategies
    7.5    Summary
    CHAPTER 8 CONCLUSIONS AND FUTURE WORK
    8.1    Summary of This Book
    8.2    Key Contributions of This Book
    APPENDIX A NOTATION INDEX
    APPENDIX B PROOFS OF THEOREMS, LEMMAS AND COROLLARIES
    APPENDIX C METHOD OF CALCULATING Λ BASED ON USERS’ EXTRA BUDGET
    BIBLIOGRAPHY
