GPU Computing Gems Emerald Edition


  • Wen-mei Hwu, CTO of MulticoreWare and professor specializing in compiler design, computer architecture, microarchitecture, and parallel processing, University of Illinois at Urbana-Champaign

"...the perfect companion to Programming Massively Parallel Processors by Hwu & Kirk." -Nicolas Pinto, Research Scientist at Harvard & MIT, NVIDIA Fellow 2009-2010

Graphics processing units (GPUs) can do much more than render graphics. Scientists and researchers increasingly look to GPUs to improve the efficiency and performance of computationally intensive experiments across a range of disciplines.

GPU Computing Gems: Emerald Edition brings their techniques to you, showcasing GPU-based solutions including:

  • Black hole simulations with CUDA
  • GPU-accelerated computation and interactive display of molecular orbitals
  • Temporal data mining for neuroscience
  • GPU-based parallelization for fast circuit optimization
  • Fast graph cuts for computer vision
  • Real-time stereo on GPGPU using progressive multi-resolution adaptive windows
  • GPU image demosaicing
  • Tomographic image reconstruction from unordered lines with CUDA
  • Medical image processing using GPU-accelerated ITK image filters
  • 41 more chapters of innovative GPU computing ideas, written to be accessible to researchers from any domain

GPU Computing Gems: Emerald Edition is the first volume in Morgan Kaufmann's Applications of GPU Computing Series, offering the latest insights and research in computer vision, electronic design automation, emerging data-intensive applications, life sciences, medical imaging, ray tracing and rendering, scientific simulation, signal and audio processing, statistical modeling, and video/image processing.



computer programmers, software engineers, hardware engineers, computer science students


Book information

  • Published: January 2011
  • ISBN: 978-0-12-384988-5


Praise for GPU Computing Gems: Emerald Edition:

“GPU computing is becoming an outstanding field in high performance computing. Due to its easiness, the CUDA approach enables programmers to take advantage of GPU-acceleration very quickly… My research in complex science as well as applications in high frequency trading benefited significantly from GPU computing.”--Dr. Tobias Preis, ETH Zurich, Switzerland
“This book is an important reference for everyone working on GPU/CUDA, and contains definitive work in a selection of fields. The patterns of CUDA parallelization it describes can often be adapted to applications in other fields.”--Dr. Ming Ouyang, Assistant Professor – Director Visualization and Intensive Graphics Lab, University of Louisville
“Diving into the world of GPU computing has never been more important these days. GPU Computing Gems: Emerald Edition takes you through the looking glass into this fascinating world.”--Martin Eisemann, Computer Graphics Lab, TU Braunschweig
“…an outstanding collection of vignettes of how to program GPUs for a breathtaking range of applications.”--Dr. Amitabh Varshney, Director, Institute for Advanced Computer Studies, University of Maryland
"The book features a useful index that might help readers mine the gems in search of a solution to a specific algorithmic problem. The index is accompanied by online resources containing source code samples—and further information—for some of the chapters. A second volume with another 30 chapters of GPGPU application reports, somewhat more focused on generic algorithms and programming techniques, is currently in the pipeline and scheduled to appear as the "Jade Edition" sometime this month."--Computing in Science and Engineering
"The book is an excellent selection of important papers describing various applications of GPUs. As such, I believe it would be a valuable addition to the bookshelf of any researcher in modeling and simulation…This is not a substitute for a more detailed text on massively parallel programming...Instead, it is a nice practical addition to that text."--Computing Reviews, August 2012

Table of Contents

Editor’s Introduction: State of GPU Computing

Section 1: Scientific Simulation

State of GPU Computing in Scientific Simulation

1: GPU-Accelerated Computation and Interactive Display of Molecular Orbitals

2: Large-Scale Chemical Informatics on GPUs

3: Dynamical Quadrature Grids: Applications in Density Functional Calculations

4: Fast Molecular Electrostatics Algorithms on GPUs

5: Quantum Chemistry: Propagation of Electronic Structure on GPU

6: An Efficient CUDA Implementation of the Tree-based Barnes Hut n-Body Algorithm

7: Leveraging the Untapped Computation Power of GPUs: Fast Spectral Synthesis Using Texture Interpolation

8: Black Hole Simulations with CUDA

9: Treecode and Fast Multipole Method for N-body Simulation with CUDA

10: Wavelet-based Density Functional Theory Calculation on Massively Parallel Hybrid Architectures

Section 2: Life Sciences

State of GPU Computing in Life Sciences

11: Accurate Scanning of Sequence Databases with the Smith-Waterman Algorithm

12: Massive Parallel Computing to Accelerate Genome-Matching

13: GPU-Supercomputer Acceleration of Pattern Matching

14: GPU Accelerated RNA Folding Algorithm

15: Temporal Data Mining for Neuroscience

Section 3: Statistical Modeling

State of GPU Computing in Statistical Modeling

16: Parallelization Techniques for Random Number Generations

17: Monte Carlo Photon Transport on the GPU

18: High Performance Iterated Function Systems

Section 4: Emerging Data-intensive Applications

State of GPU Computing in Data-intensive Applications

19: Large Scale Machine Learning

20: Multiclass Support Vector Machine

21: Template Driven Agent Based Modeling and Simulation with CUDA

22: GPU-Accelerated Ant Colony Optimization

Section 5: Electronic Design Automation

State of GPU Computing in Electronic Design Automation

23: High Performance Gate-Level Simulation with GP-GPUs

24: GPU-Based Parallel Computing for Fast Circuit Optimization

Section 6: Ray Tracing and Rendering

State of GPU Computing in Ray Tracing and Rendering

25: Lattice-Boltzmann Lighting Models

26: Path Regeneration for Random Walks

27: From Sparse Mocap to Highly-detailed Facial Animation

28: A Programmable Graphics Pipeline in CUDA for Order Independent Transparency

Section 7: Computer Vision

State of GPU Computing in Computer Vision

29: Fast Graph Cuts for Computer Vision

30: Visual Saliency Model on Multi-GPU

31: Real-Time Stereo on GPGPU Using Progressive Multi-Resolution Adaptive Windows

32: Real-Time Speed-Limit-Sign Recognition on an Embedded System Using a GPU

33: Haar Classifiers for Object Detection with CUDA

Section 8: Video and Image Processing

State of GPU Computing in Video and Image Processing

34: Experiences on Image and Video Processing with CUDA and OpenCL

35: Connected Component Labeling in CUDA

36: Image Demosaicing

Section 9: Signal and Audio Processing

State of GPU Computing in Signal and Audio Processing

37: Efficient Automatic Speech Recognition on the GPU

38: Parallel LDPC Decoding

39: Large-Scale Fast Fourier Transform

Section 10: Medical Imaging

State of GPU Computing in Medical Imaging

40: GPU Acceleration of Iterative Digital Breast Tomosynthesis

41: Parallelization of Katsevich CT Image Reconstruction Algorithm on Generic Multi-Core Processors and GPGPU

42: 3-D Tomographic Image Reconstruction from Randomly Ordered Lines with CUDA

43: Using GPUs to Learn Effective Parameter Settings for GPU-Accelerated Iterative CT Reconstruction Algorithms

44: Using GPUs to Accelerate Advanced MRI Reconstruction with Field Inhomogeneity Compensation

45: ℓ1 Minimization in ℓ1-SPIRiT Compressed Sensing MRI Reconstruction

46: Medical Image Processing Using GPU-accelerated ITK Image Filters

47: Deformable Volumetric Registration Using B-splines

48: Multi-scale Unbiased Diffeomorphic Atlas Construction on Multi-GPUs

49: GPU-accelerated Brain Connectivity Reconstruction and Visualization in Large-Scale Electron Micrographs

50: Fast Simulation of Radiographic Images Using a Monte Carlo X-Ray Transport Algorithm Implemented in CUDA