GPU Computing Gems Jade Edition

This is the second volume of Morgan Kaufmann's GPU Computing Gems, offering an all-new set of insights, ideas, and practical "hands-on" skills from researchers and developers worldwide. Each chapter gives you a window into the work being performed across a variety of application domains, and the opportunity to witness the impact of parallel GPU computing on the efficiency of scientific research.

GPU Computing Gems: Jade Edition showcases the latest research solutions with GPGPU and CUDA, including:

  • Improving memory access patterns for cellular automata using CUDA
  • Large-scale gas turbine simulations on GPU clusters
  • Identifying and mitigating credit risk using large-scale economic capital simulations
  • GPU-powered MATLAB acceleration with Jacket
  • Biologically-inspired machine vision
  • An efficient CUDA algorithm for the maximum network flow problem
  • 30 more chapters of innovative GPU computing ideas, written to be accessible to researchers from any industry

GPU Computing Gems: Jade Edition contains 100% new material covering a variety of application domains: algorithms and data structures, engineering, interactive physics for games, computational finance, and programming tools.


Audience: Software engineers, programmers, hardware engineers, and advanced students

Hardbound, 560 Pages

Published: September 2011

Imprint: Morgan Kaufmann

ISBN: 978-0-12-385963-1


  • It wasn't until recently that parallel [GPU] computing made people realize that there are whole areas in computing science that we can tackle. … When you can do something 10 or 100 times faster, something magical happens and you can do something completely different.

    -Jen-Hsun Huang, CEO, NVIDIA


    Part 1: Parallel Algorithms and Data Structures - Paulius Micikevicius, NVIDIA

    1 Large-Scale GPU Search

    2 Edge v. Node Parallelism for Graph Centrality Metrics

    3 Optimizing parallel prefix operations for the Fermi architecture

    4 Building an Efficient Hash Table on the GPU

    5 An Efficient CUDA Algorithm for the Maximum Network Flow Problem

    6 On Improved Memory Access Patterns for Cellular Automata Using CUDA

    7 Fast Minimum Spanning Tree Computation on Large Graphs

    8 Fast in-place sorting with CUDA based on bitonic sort

    Part 2: Numerical Algorithms - Frank Jargstorff, NVIDIA

    9 Interval Arithmetic in CUDA

    10 Approximating the erfinv Function

    11 A Hybrid Method for Solving Tridiagonal Systems on the GPU

    12 LU Decomposition in CULA

    13 GPU Accelerated Derivative-free Optimization

    Part 3: Engineering Simulation - Peng Wang, NVIDIA

    14 Large-scale gas turbine simulations on GPU clusters

    15 GPU acceleration of rarefied gas dynamic simulations

    16 Assembly of Finite Element Methods on Graphics Processors

    17 CUDA implementation of Vertex-Centered, Finite Volume CFD methods on Unstructured Grids with Flow Control Applications

    18 Solving Wave Equations on Unstructured Geometries

    19 Fast electromagnetic integral equation solvers on graphics processing units (GPUs)

    Part 4: Interactive Physics and AI for Games and Engineering Simulation - Richard Tonge, NVIDIA

    20 Solving Large Multi-Body Dynamics Problems on the GPU

    21 Implicit FEM Solver in CUDA

    22 Real-time Adaptive GPU multi-agent path planning

    Part 5: Computational Finance - Thomas Bradley, NVIDIA

    23 High performance finite difference PDE solvers on GPUs for financial option pricing

    24 Identifying and Mitigating Credit Risk using Large-scale Economic Capital Simulations

    25 Financial Market Value-at-Risk Estimation using the Monte Carlo Method

    Part 6: Programming Tools and Techniques - Cliff Woolley, NVIDIA

    26 Thrust: A Productivity-Oriented Library for CUDA

    27 GPU Scripting and Code Generation with PyCUDA

    28 Jacket: GPU Powered MATLAB Acceleration

    29 Accelerating Development and Execution Speed with Just In Time GPU Code Generation

    30 GPU Application Development, Debugging, and Performance Tuning with GPU Ocelot

    31 Abstraction for AoS and SoA Layout in C++

    32 Processing Device Arrays with C++ Metaprogramming

    33 GPU Metaprogramming: A Case Study in Biologically-Inspired Machine Vision

    34 A Hybridization Methodology for High-Performance Linear Algebra Software for GPUs

    35 Dynamic Load Balancing using Work-Stealing

    36 Applying software-managed caching and CPU/GPU task scheduling for accelerating dynamic workloads

