Advances in GPU Research and Practice

1st Edition - September 6, 2016
Author: Hamid Sarbazi-Azad
Language: English
Paperback ISBN: 978-0-12-803738-6
eBook ISBN: 978-0-12-803788-1

Advances in GPU Research and Practice focuses on research and practice in GPU-based systems. Its topics range from hardware and architectural issues to high-level issues such as application systems, parallel programming, middleware, and power and energy.

Divided into six parts, this edited volume provides the latest research on GPU computing. Part I: Architectural Solutions focuses on architectural topics that improve GPU performance. Part II: System Software discusses the operating systems, compilers, libraries, programming environments, languages, and paradigms proposed and analyzed to support GPU programmers. Part III: Power and Reliability Issues covers different aspects of energy, power, and reliability concerns in GPUs. Part IV: Performance Analysis illustrates mathematical and analytical techniques for predicting different performance metrics in GPUs. Part V: Algorithms presents how to design efficient algorithms for GPUs and analyze their complexity. Part VI: Applications and Related Topics provides use cases and examples of how GPUs are used across many sectors.

Dedication
List of Contributors
Preface
Acknowledgments
Part 1: Programming and Tools
- Chapter 1: Formal analysis techniques for reliable GPU programming: current solutions and call to action
  - Abstract
  - Acknowledgments
  - 1 GPUs in Support of Parallel Computing
  - 2 A quick introduction to GPUs
  - 3 Correctness issues in GPU programming
  - 4 The need for effective tools
  - 5 Call to Action
- Chapter 2: SnuCL: A unified OpenCL framework for heterogeneous clusters
  - Abstract
  - Acknowledgments
  - 1 Introduction
  - 2 OpenCL
  - 3 Overview of SnuCL framework
  - 4 Memory management in SnuCL Cluster
  - 5 SnuCL extensions to OpenCL
  - 6 Performance evaluation
  - 7 Conclusions
- Chapter 3: Thread communication and synchronization on massively parallel GPUs
  - Abstract
  - 1 Introduction
  - 2 Coarse-Grained Communication and Synchronization
  - 3 Built-In Atomic Functions on Regular Variables
  - 4 Fine-Grained Communication and Synchronization
  - 5 Conclusion and Future Research Direction
- Chapter 4: Software-level task scheduling on GPUs
  - Abstract
  - Acknowledgments
  - 1 Introduction, Problem Statement, and Context
  - 2 Nondeterministic behaviors caused by the hardware
  - 3 SM-centric transformation
  - 4 Scheduling-enabled optimizations
  - 5 Other scheduling work on GPUs
  - 6 Conclusion and future work
- Chapter 5: Data placement on GPUs
  - Abstract
  - 1 Introduction
  - 2 Overview
  - 3 Memory specification through MSL
  - 4 Compiler support
  - 5 Runtime support
  - 6 Results
  - 7 Related work
  - 8 Summary
Part 2: Algorithms and Applications
- Chapter 6: Biological sequence analysis on GPU
  - Abstract
  - 1 Introduction
  - 2 Pairwise Sequence Comparison and Sequence-Profile Comparison
  - 3 Design aspects of GPU solutions for biological sequence analysis
  - 4 GPU Solutions for Pairwise Sequence Comparison
  - 5 GPU Solutions for Sequence-Profile Comparison
  - 6 Conclusion and perspectives
- Chapter 7: Graph algorithms on GPUs
  - Abstract
  - 1 Graph representation for GPUs
  - 2 Graph traversal algorithms: the breadth first search (BFS)
  - 3 The single-source shortest path (SSSP) problem
  - 4 The APSP problem
  - 5 Load Balancing and Memory Accesses: Issues and Management Techniques
- Chapter 8: GPU alignment of two and three sequences
  - Abstract
  - 1 Introduction
  - 2 GPU architecture
  - 3 Pairwise alignment
  - 4 Alignment of three sequences
  - 5 Conclusion
- Chapter 9: Augmented Block Cimmino Distributed Algorithm for solving tridiagonal systems on GPU
  - Abstract
  - 1 Introduction
  - 2 ABCD Solver for tridiagonal systems
  - 3 GPU implementation and optimization
  - 4 Performance evaluation
  - 5 Conclusion and future work
- Chapter 10: GPU computing applied to linear and mixed-integer programming
  - Abstract
  - Acknowledgments
  - 1 Introduction
  - 2 Operations Research in Practice
  - 3 Exact Optimization Algorithms
  - 4 Metaheuristics
  - 5 Conclusions
  - Conflicts of Interest
- Chapter 11: GPU-accelerated shortest paths computations for planar graphs
  - Abstract
  - 1 Introduction
  - 2 Related work
  - 3 Partitioned Approaches
  - 4 Computational Complexity Analysis
  - 5 Experiments and results
  - About the Authors
- Chapter 12: GPU sorting algorithms
  - Abstract
  - 1 Introduction
  - 2 Generic Programming Strategies for GPU
  - 3 Sorting algorithms
- Chapter 13: MPC: An effective floating-point compression algorithm for GPUs
  - Abstract
  - Acknowledgments
  - 1 Introduction
  - 2 Methodology
  - 3 Experimental results
  - 4 Summary and Conclusions
- Chapter 14: Adaptive sparse matrix representation for efficient matrix-vector multiplication
  - Abstract
  - 1 Introduction
  - 2 Sparse matrix-vector multiplication
  - 3 GPU architecture and programming model
  - 4 Optimization principles for SpMV
  - 5 Platform (Adaptive Runtime System)
  - 6 Results and analysis
  - 7 Summary
Part 3: Architecture and Performance
- Chapter 15: A framework for accelerating bottlenecks in GPU execution with assist warps
  - Abstract
  - Acknowledgments
  - 1 Introduction
  - 2 Background
  - 3 Motivation
  - 4 The CABA Framework
  - 5 A Case for CABA: Data Compression
  - 6 Methodology
  - 7 Results
  - 8 Other Uses of the CABA Framework
  - 9 Related Work
  - 10 Conclusion
- Chapter 16: Accelerating GPU accelerators through neural algorithmic transformation
  - Abstract
  - 1 Introduction
  - 2 Neural transformation for GPUs
  - 3 Instruction-set-architecture design
  - 4 Neural accelerator: design and integration
  - 5 Controlling quality trade-offs
  - 6 Evaluation
  - 7 Related work
  - 8 Conclusion
- Chapter 17: The need for heterogeneous network-on-chip architectures with GPGPUs: A case study with photonic interconnects
  - Abstract
  - 1 Introduction
  - 2 Background
  - 3 The Need for Heterogeneous Interconnections
  - 4 Characterization of GPGPU Performance
  - 5 Conclusion
- Chapter 18: Accurately modeling GPGPU frequency scaling with the CRISP performance model
  - Abstract
  - Acknowledgments
  - 1 Introduction
  - 2 Motivation and related work
  - 3 GPGPU DVFS performance model
  - 4 Methodology
  - 5 Results
  - 6 Conclusion
Part 4: Power and Reliability
- Chapter 19: Energy and power considerations of GPUs
  - Abstract
  - 1 Introduction
  - 2 Evaluation methodology
  - 3 Power profiling of regular and irregular programs
  - 4 Affecting power and energy on GPUs
  - 5 Summary
  - Appendix
  - About the authors
- Chapter 20: Architecting the last-level cache for GPUs using STT-MRAM nonvolatile memory
  - Abstract
  - 1 Introduction
  - 2 Background
  - 3 Related Work
  - 4 Two-Part L2 Cache Architecture
  - 5 Dynamic Write Threshold Detection Mechanism
  - 6 Implementation
  - 7 Evaluation Result
  - 8 Conclusion
- Chapter 21: Power management of mobile GPUs
  - Abstract
  - Acknowledgments
  - 1 Introduction
  - 2 GPU Power Management for Mobile Games
  - 3 GPU Power Management for GPGPU Applications
  - 4 Future Outlook
  - 5 Conclusions
- Chapter 22: Advances in GPU reliability research
  - Abstract
  - 1 Introduction
  - 2 Evaluating GPU Reliability
  - 3 Hardware Reliability Enhancements
  - 4 Software Reliability Enhancements
  - 5 Summary
- Chapter 23: Addressing hardware reliability challenges in general-purpose GPUs
  - Abstract
  - 1 Introduction
  - 2 GPGPUs Architecture
  - 3 Modeling and Characterizing GPGPUs Reliability in the Presence of Soft Errors [25]
  - 4 RISE: Improving the Streaming Processors’ Reliability Against Soft Errors in GPGPUs [36]
  - 5 Mitigating the Susceptibility of GPGPUs to PVs [43]
Author Index
Subject Index
