GPU Computing Gems Jade Edition

1st Edition - September 28, 2011

  • Editor-in-Chief: Wen-mei Hwu
  • eBook ISBN: 9780123859648
  • Hardcover ISBN: 9780123859631

Description

GPU Computing Gems, Jade Edition, offers hands-on, proven techniques for general-purpose GPU programming based on the successful application experiences of leading researchers and developers. One of the few resources available that distills the best practices of the CUDA programming community, this second volume contains 100% new material of interest across industry, including finance, medicine, imaging, engineering, gaming, environmental science, and green computing. It covers new tools and frameworks for productive GPU computing application development and provides immediate benefit to researchers developing improved programming environments for GPUs.

Divided into six sections, the book explains how efficient GPU execution is achieved through algorithm implementation techniques and approaches to data structure layout. More specifically, it considers three general requirements: a high level of parallelism, coherent memory access by threads within warps, and coherent control flow within warps. Chapters explore topics such as accelerating database searches, leveraging the Fermi GPU architecture to further accelerate prefix operations, and implementing hash tables on the GPU. There are also discussions of the state of GPU computing in interactive physics and artificial intelligence; programming tools and techniques for GPU computing; and edge versus node parallelism for computing graph centrality metrics, including an alternative approach that balances computation regardless of node degree variance. Software engineers, programmers, hardware engineers, and advanced students will find this book extremely useful. For the source code discussed throughout the book, the editors invite readers to visit the book's companion website.
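
The three requirements named above can be seen in even the simplest kernels. The following is a minimal, illustrative CUDA sketch (not taken from the book): one thread per element provides a high degree of parallelism, consecutive threads in a warp touch consecutive elements (coalesced memory access), and the only branch depends uniformly on the thread index, so control flow within a warp stays coherent.

    #include <cuda_runtime.h>
    #include <cstdio>

    __global__ void saxpy(int n, float a, const float *x, float *y)
    {
        // Global thread index: thread t of a warp accesses element base + t,
        // so the warp's 32 loads/stores fall in contiguous memory and coalesce.
        int i = blockIdx.x * blockDim.x + threadIdx.x;

        // The bounds check depends only on i, so (except at the array tail)
        // every thread of a warp takes the same path: no warp divergence.
        if (i < n)
            y[i] = a * x[i] + y[i];
    }

    int main()
    {
        const int n = 1 << 20;
        float *x, *y;
        cudaMallocManaged(&x, n * sizeof(float));
        cudaMallocManaged(&y, n * sizeof(float));
        for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

        // One thread per element: a high level of parallelism.
        saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, x, y);
        cudaDeviceSynchronize();

        printf("y[0] = %f\n", y[0]);   // expect 4.0
        cudaFree(x);
        cudaFree(y);
        return 0;
    }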

Key Features

  • This second volume of GPU Computing Gems offers 100% new material of interest across industry, including finance, medicine, imaging, engineering, gaming, environmental science, green computing, and more
  • Covers new tools and frameworks for productive GPU computing application development and offers immediate benefit to researchers developing improved programming environments for GPUs
  • Even more hands-on, proven techniques demonstrating how general purpose GPU computing is changing scientific research
  • Distills the best practices of the community of CUDA programmers; each chapter provides insights and ideas as well as 'hands on' skills applicable to a variety of fields

Readership

Software engineers, programmers, hardware engineers, advanced students

Table of Contents

  • Editors, Reviewers, and Authors

Editor-in-Chief

    Managing Editor

    NVIDIA Editor

    Area Editors

    Reviewers

    Authors

    Introduction

    State of GPU Computing

    Section 1: Parallel Algorithms and Data Structures

    Introduction

    In this Section

    Chapter 1. Large-Scale GPU Search

    1.1 Introduction

    1.2 Memory Performance

    1.3 Searching Large Data Sets

    1.4 Experimental Evaluation

    1.5 Conclusion

    References

    Chapter 2. Edge v. Node Parallelism for Graph Centrality Metrics

    2.1 Introduction

    2.2 Background

    2.3 Node v. Edge Parallelism

    2.4 Data Structure

    2.5 Implementation

    2.6 Analysis

    2.7 Results

    2.8 Conclusions

    References

    Chapter 3. Optimizing Parallel Prefix Operations for the Fermi Architecture

    3.1 Introduction to Parallel Prefix Operations

    3.2 Efficient Binary Prefix Operations on Fermi

    3.3 Conclusion

    References

    Chapter 4. Building an Efficient Hash Table on the GPU

    4.1 Introduction

    4.2 Overview

    4.3 Building and Querying a Basic Hash Table

    4.4 Specializing the Hash Table

    4.5 Analysis

    4.6 Conclusion

    Acknowledgments

    References

    Chapter 5. Efficient CUDA Algorithms for the Maximum Network Flow Problem

    5.1 Introduction, Problem Statement, and Context

    5.2 Core Method

    5.3 Algorithms, Implementations, and Evaluations

    5.4 Final Evaluation

    5.5 Future Directions

    References

    Chapter 6. Optimizing Memory Access Patterns for Cellular Automata on GPUs

    6.1 Introduction, Problem Statement, and Context

    6.2 Core Methods

    6.3 Algorithms, Implementations, and Evaluations

    6.4 Final Results

    6.5 Future Directions

    References

    Chapter 7. Fast Minimum Spanning Tree Computation

    7.1 Introduction, Problem Statement, and Context

    7.2 The MST Algorithm: Overview

    7.3 CUDA Implementation of MST

    7.4 Evaluation

    7.5 Conclusions

    References

    Chapter 8. Comparison-Based In-Place Sorting with CUDA

    8.1 Introduction

    8.2 Bitonic Sort

    8.3 Implementation

    8.4 Evaluation

    8.5 Conclusion

    References

    Section 2: Numerical Algorithms

    Introduction

    State of GPU-Based Numerical Algorithms

    In this Section

    Chapter 9. Interval Arithmetic in CUDA

    9.1 Interval Arithmetic

    9.2 Importance of Rounding Modes

    9.3 Interval Operators in CUDA

    9.4 Some Evaluations: Synthetic Benchmark

    9.5 Application-Level Benchmark

    9.6 Conclusion

    References

    Chapter 10. Approximating the erfinv Function

    10.1 Introduction

    10.2 New erfinv Approximations

    10.3 Performance and Accuracy

    10.4 Conclusions

    References

    Chapter 11. A Hybrid Method for Solving Tridiagonal Systems on the GPU

    11.1 Introduction

    11.3 Algorithms

    11.4 Implementation

    11.5 Results and Evaluation

    11.6 Future Directions

    Source code

    References

    Chapter 12. Accelerating CULA Linear Algebra Routines with Hybrid GPU and Multicore Computing

    12.1 Introduction, Problem Statement, and Context

    12.2 Core Methods

    12.3 Algorithms, Implementations, and Evaluations

12.4 Final Evaluation and Validation of Results, Total Benefits, and Limitations

    12.5 Future Directions

    References

    Chapter 13. GPU Accelerated Derivative-Free Mesh Optimization

    13.1 Introduction, Problem Statement, and Context

    13.2 Core Method

    13.3 Algorithms, Implementations, and Evaluations

    13.4 Final Evaluation

    13.5 Future Direction

    References

    Section 3: Engineering Simulation

    Introduction

    State of GPU Computing in Engineering Simulations

    In this Section

    Chapter 14. Large-Scale Gas Turbine Simulations on GPU Clusters

    14.1 Introduction, Problem Statement, and Context

    14.2 Core Method

    14.3 Algorithms, Implementations, and Evaluations

    14.4 Final Evaluation

    14.5 Test Case and Parallel Performance

    14.6 Future Directions

    References

    Chapter 15. GPU Acceleration of Rarefied Gas Dynamic Simulations

    15.1 Introduction, Problem Statement, and Context

    15.2 Core Methods

    15.3 Algorithms, Implementations, and Evaluations

    15.4 Final Evaluation

    15.5 Future Directions

    References

    Chapter 16. Application of Assembly of Finite Element Methods on Graphics Processors for Real-Time Elastodynamics

    16.1 Introduction, Problem Statement, and Context

    16.2 Core Method

    16.3 Algorithms, Implementations, and Evaluations

    16.4 Evaluation and Validation of Results, Total Benefits, Limitations

    16.5 Future Directions

    Acknowledgments

    References

    Chapter 17. CUDA Implementation of Vertex-Centered, Finite Volume CFD Methods on Unstructured Grids with Flow Control Applications

    17.1 Introduction, Problem Statement, and Context

    17.2 Core (CFD and Optimization) Methods

    17.3 Implementations and Evaluation

    17.4 Applications to Flow Control — Optimization

    References

    Chapter 18. Solving Wave Equations on Unstructured Geometries

    18.1 Introduction, Problem Statement, and Context

    18.2 Core Method

    18.3 Algorithms, Implementations, and Evaluations

    18.4 Final Evaluation

    18.5 Future Directions

    Acknowledgments

    References

    Chapter 19. Fast Electromagnetic Integral Equation Solvers on Graphics Processing Units

    19.1 Problem Statement and Background

    19.2 Algorithms Introduction

    19.3 Algorithm Description

    19.4 GPU Implementations

    19.5 Results

    19.6 Integrating the GPU NGIM Algorithms with Iterative IE Solvers

19.7 Future Directions

    References

    Section 4: Interactive Physics and AI for Games and Engineering Simulation

    Introduction

    State of GPU Computing in Interactive Physics and AI

    In this Section

    Chapter 20. Solving Large Multibody Dynamics Problems on the GPU

    20.1 Introduction, Problem Statement, and Context

    20.2 Core Method

    20.3 The Time-Stepping Scheme

    20.4 Algorithms, Implementations, and Evaluations

    20.5 Final Evaluation

    20.6 Future Directions

    Acknowledgments

    References

    Chapter 21. Implicit FEM Solver on GPU for Interactive Deformation Simulation

    21.1 Problem Statement and Context

    21.2 Core Method

    21.3 Algorithms and Implementations

    21.4 Results and Evaluation

    21.5 Future Directions

    Acknowledgements

    References

    Chapter 22. Real-Time Adaptive GPU Multiagent Path Planning

    22.1 Introduction

    22.2 Core Method

    22.3 Implementation

    22.4 Results

    References

    Section 5: Computational Finance

    Introduction

    State of GPU Computing in Computational Finance

    In this Section

    Chapter 23. Pricing Financial Derivatives with High Performance Finite Difference Solvers on GPUs

    23.1 Introduction, Problem Statement, and Context

    23.2 Core Method

    23.3 Algorithms, Implementations, and Evaluations

    23.4 Final Evaluation

    23.5 Future Directions

    References

    Chapter 24. Large-Scale Credit Risk Loss Simulation

    24.1 Introduction, Problem Statement, and Context

    24.2 Core Methods

    24.3 Algorithms, Implementations, Evaluations

    24.4 Results and Conclusions

    24.5 Future Developments

    Acknowledgements

    References

    Chapter 25. Monte Carlo–Based Financial Market Value-at-Risk Estimation on GPUs

    25.1 Introduction, Problem Statement, and Context

    25.2 Core Methods

    25.3 Algorithms, Implementations, and Evaluations

    25.4 Final Results

    25.5 Conclusion

    References

    Section 6: Programming Tools and Techniques

    Introduction

    Programming Tools and Techniques for GPU Computing

    In this Section

    Chapter 26. Thrust: A Productivity-Oriented Library for CUDA

    26.1 Motivation

    26.2 Diving In

    26.3 Generic Programming

    26.4 Benefits of Abstraction

    26.5 Best Practices

    References

    Chapter 27. GPU Scripting and Code Generation with PyCUDA

    27.1 Introduction, Problem Statement, and Context

    27.2 Core Method

    27.3 Algorithms, Implementations, and Evaluations

    27.4 Evaluation

    27.5 Availability

    27.6 Future Directions

    Acknowledgment

    References

    Chapter 28. Jacket: GPU Powered MATLAB Acceleration

    28.1 Introduction

    28.2 Jacket

    28.3 Benchmarking Procedures

    28.4 Experimental Results

    28.5 Future Directions

    References

    Chapter 29. Accelerating Development and Execution Speed with Just-in-Time GPU Code Generation

    29.1 Introduction, Problem Statement, and Context

    29.2 Core Methods

    29.3 Algorithms, Implementations, and Evaluations

    29.4 Final Evaluation

    29.5 Future Directions

    References

    Chapter 30. GPU Application Development, Debugging, and Performance Tuning with GPU Ocelot

    30.1 Introduction

    30.2 Core Technology

    30.3 Algorithm, Implementation, and Benefits

    30.4 Future Directions

    Acknowledgements

    References

    Chapter 31. Abstraction for AoS and SoA Layout in C++

    31.1 Introduction, Problem Statement, and Context

    31.2 Core Method

    31.3 Implementation

    31.4 ASA in Practice

    31.5 Final Evaluation

    Acknowledgments

    References

    Chapter 32. Processing Device Arrays with C++ Metaprogramming

    32.1 Introduction, Problem Statement, and Context

    32.2 Core Method

    32.3 Implementation

    32.4 Evaluation

    32.5 Future Directions

    References

    Chapter 33. GPU Metaprogramming: A Case Study in Biologically Inspired Machine Vision

    33.1 Introduction, Problem Statement, and Context

    33.2 Core Method

    33.3 Algorithms, Implementations, and Evaluations

    33.4 Final Evaluation

    33.5 Future Directions

    References

    Chapter 34. A Hybridization Methodology for High-Performance Linear Algebra Software for GPUs

    34.1 Introduction, Problem Statement, and Context

    34.2 Core Method

    34.3 Algorithms, Implementations, and Evaluations

    34.4 Final Evaluation

    34.5 Future Directions

    References

    Chapter 35. Dynamic Load Balancing Using Work-Stealing

    35.1 Introduction

    35.2 Core Method

    35.3 Algorithms and Implementations

    35.4 Case Studies and Evaluation

    35.5 Future Directions

    Acknowledgments

    References

    Chapter 36. Applying Software-Managed Caching and CPU/GPU Task Scheduling for Accelerating Dynamic Workloads

    36.1 Introduction, Problem Statement, and Context

    36.2 Core Method

    36.3 Algorithms, Implementations, and Evaluations

    36.4 Final Evaluation

    References

    Index

Product details

  • No. of pages: 560
  • Language: English
  • Copyright: © Morgan Kaufmann 2011
  • Published: September 28, 2011
  • Imprint: Morgan Kaufmann
  • eBook ISBN: 9780123859648
  • Hardcover ISBN: 9780123859631

About the Editor-in-Chief

Wen-mei Hwu

Wen-mei W. Hwu is a Professor and holds the Sanders-AMD Endowed Chair in the Department of Electrical and Computer Engineering at the University of Illinois at Urbana-Champaign. His research interests are in the areas of architecture, implementation, compilation, and algorithms for parallel computing. He is the chief scientist of the Parallel Computing Institute and director of the IMPACT research group (www.impact.crhc.illinois.edu), and is a co-founder and CTO of MulticoreWare. For his contributions to research and teaching, he has received the ACM SigArch Maurice Wilkes Award, the ACM Grace Murray Hopper Award, the Tau Beta Pi Daniel C. Drucker Eminent Faculty Award, the ISCA Influential Paper Award, the IEEE Computer Society B. R. Rau Award, and the Distinguished Alumni Award in Computer Science of the University of California, Berkeley. He is a Fellow of the IEEE and the ACM. He directs the UIUC CUDA Center of Excellence and serves as one of the principal investigators of the NSF Blue Waters petascale computing project. Dr. Hwu received his Ph.D. in Computer Science from the University of California, Berkeley.

Affiliations and Expertise

CTO, MulticoreWare; Professor specializing in compiler design, computer architecture, microarchitecture, and parallel processing, University of Illinois at Urbana-Champaign, USA
