GPU Computing Gems Jade Edition


  • Wen-mei Hwu, CTO of MulticoreWare and professor specializing in compiler design, computer architecture, microarchitecture, and parallel processing, University of Illinois at Urbana-Champaign

GPU Computing Gems, Jade Edition describes successful application experiences in GPU computing and the techniques that contributed to that success. Divided into six sections, the book explains how efficient GPU execution is achieved through algorithm implementation techniques and data structure layout. In particular, it considers three general requirements: a high level of parallelism, coherent memory access by threads within warps, and coherent control flow within warps. The book begins with an overview of parallel algorithms and data structures; its opening chapters cover accelerating database searches, leveraging the Fermi GPU architecture to further accelerate prefix operations, and implementing hash tables on the GPU. The reader is then walked systematically through the fundamental optimization steps for implementing a bandwidth-limited algorithm, GPU-based libraries of numerical algorithms and software products for numerical analysis with dedicated GPU support, and the adoption of GPU computing techniques in production engineering simulation codes. Later chapters discuss the state of GPU computing in interactive physics and artificial intelligence, programming tools and techniques for GPU computing, and the edge and node parallelism approaches to computing graph centrality metrics, including an alternative approach that balances computation regardless of node degree variance. This book will be useful to application developers in a wide range of application areas.


Book information

  • Published: September 2011
  • ISBN: 978-0-12-385963-1


It wasn't until recently that parallel [GPU] computing made people realize that there are whole areas in computing science that we can tackle. … When you can do something 10 or 100 times faster, something magical happens and you can do something completely different.

-Jen-Hsun Huang, CEO, NVIDIA

Table of Contents

Editors, Reviewers, and Authors


Section 1 Parallel Algorithms and Data Structures

    Chapter 1 Large-Scale GPU Search

    Chapter 2 Edge v. Node Parallelism for Graph Centrality Metrics

    Chapter 3 Optimizing Parallel Prefix Operations for the Fermi Architecture

    Chapter 4 Building an Efficient Hash Table on the GPU

    Chapter 5 Efficient CUDA Algorithms for the Maximum Network Flow Problem

    Chapter 6 Optimizing Memory Access Patterns for Cellular Automata on GPUs

    Chapter 7 Fast Minimum Spanning Tree Computation

    Chapter 8 Comparison-Based In-Place Sorting with CUDA

Section 2 Numerical Algorithms

    Chapter 9 Interval Arithmetic in CUDA

    Chapter 10 Approximating the erfinv Function

    Chapter 11 A Hybrid Method for Solving Tridiagonal Systems on the GPU

    Chapter 12 Accelerating CULA Linear Algebra Routines with Hybrid GPU and Multicore Computing

    Chapter 13 GPU Accelerated Derivative-Free Mesh Optimization

Section 3 Engineering Simulation

    Chapter 14 Large-Scale Gas Turbine Simulations on GPU Clusters

    Chapter 15 GPU Acceleration of Rarefied Gas Dynamic Simulations

    Chapter 16 Application of Assembly of Finite Element Methods on Graphics Processors for Real-Time Elastodynamics

    Chapter 17 CUDA Implementation of Vertex-Centered, Finite Volume CFD Methods on Unstructured Grids with Flow Control Applications

    Chapter 18 Solving Wave Equations on Unstructured Geometries

    Chapter 19 Fast Electromagnetic Integral Equation Solvers on Graphics Processing Units

Section 4 Interactive Physics and AI for Games and Engineering Simulation

    Chapter 20 Solving Large Multibody Dynamics Problems on the GPU

    Chapter 21 Implicit FEM Solver on GPU for Interactive Deformation Simulation

    Chapter 22 Real-Time Adaptive GPU Multiagent Path Planning

Section 5 Computational Finance

    Chapter 23 Pricing Financial Derivatives with High Performance Finite Difference Solvers on GPUs

    Chapter 24 Large-Scale Credit Risk Loss Simulation

    Chapter 25 Monte Carlo-Based Financial Market Value-at-Risk Estimation on GPUs

Section 6 Programming Tools and Techniques

    Chapter 26 Thrust: A Productivity-Oriented Library for CUDA

    Chapter 27 GPU Scripting and Code Generation with PyCUDA

    Chapter 28 Jacket: GPU Powered MATLAB Acceleration

    Chapter 29 Accelerating Development and Execution Speed with Just-in-Time GPU Code Generation

    Chapter 30 GPU Application Development, Debugging, and Performance Tuning with GPU Ocelot

    Chapter 31 Abstraction for AoS and SoA Layout in C++

    Chapter 32 Processing Device Arrays with C++ Metaprogramming

    Chapter 33 GPU Metaprogramming: A Case Study in Biologically Inspired Machine Vision

    Chapter 34 A Hybridization Methodology for High-Performance Linear Algebra Software for GPUs

    Chapter 35 Dynamic Load Balancing Using Work-Stealing

    Chapter 36 Applying Software-Managed Caching and CPU/GPU Task Scheduling for Accelerating Dynamic Workloads