GPU Computing Gems Emerald Edition

"...the perfect companion to Programming Massively Parallel Processors by Hwu & Kirk."--Nicolas Pinto, Research Scientist at Harvard & MIT, NVIDIA Fellow 2009-2010

Graphics processing units (GPUs) can do much more than render graphics. Scientists and researchers increasingly look to GPUs to improve the efficiency and performance of computationally intensive experiments across a range of disciplines.

GPU Computing Gems: Emerald Edition brings their techniques to you, showcasing GPU-based solutions including:

  • Black hole simulations with CUDA
  • GPU-accelerated computation and interactive display of molecular orbitals
  • Temporal data mining for neuroscience
  • GPU-based parallelization for fast circuit optimization
  • Fast graph cuts for computer vision
  • Real-time stereo on GPGPU using progressive multi-resolution adaptive windows
  • GPU image demosaicing
  • Tomographic image reconstruction from unordered lines with CUDA
  • Medical image processing using GPU-accelerated ITK image filters
  • 41 more chapters of innovative GPU computing ideas, written to be accessible to researchers from any domain

GPU Computing Gems: Emerald Edition is the first volume in Morgan Kaufmann's Applications of GPU Computing Series, offering the latest insights and research in computer vision, electronic design automation, emerging data-intensive applications, life sciences, medical imaging, ray tracing and rendering, scientific simulation, signal and audio processing, statistical modeling, and video/image processing.


Audience: Computer programmers, software engineers, hardware engineers, computer science students

Hardbound, 886 Pages

Published: January 2011

Imprint: Morgan Kaufmann

ISBN: 978-0-12-384988-5


  • Praise for GPU Computing Gems: Emerald Edition:

    “GPU computing is becoming an outstanding field in high performance computing. Due to its easiness, the CUDA approach enables programmers to take advantage of GPU-acceleration very quickly… My research in complex science as well as applications in high frequency trading benefited significantly from GPU computing.”--Dr. Tobias Preis, ETH Zurich, Switzerland
    “This book is an important reference for everyone working on GPU/CUDA, and contains definitive work in a selection of fields. The patterns of CUDA parallelization it describes can often be adapted to applications in other fields.”--Dr. Ming Ouyang, Assistant Professor – Director Visualization and Intensive Graphics Lab, University of Louisville
    “Diving into the world of GPU computing has never been more important these days. GPU Computing Gems: Emerald Edition takes you through the looking glass into this fascinating world.”--Martin Eisemann, Computer Graphics Lab, TU Braunschweig
    “…an outstanding collection of vignettes of how to program GPUs for a breathtaking range of applications.”--Dr. Amitabh Varshney, Director, Institute for Advanced Computer Studies, University of Maryland
    "The book features a useful index that might help readers mine the gems in search of a solution to a specific algorithmic problem. The index is accompanied by online resources containing source code samples—and further information—for some of the chapters. A second volume with another 30 chapters of GPGPU application reports, somewhat more focused on generic algorithms and programming techniques, is currently in the pipeline and scheduled to appear as the "Jade Edition" sometime this month."--Computing in Science and Engineering
    "The book is an excellent selection of important papers describing various applications of GPUs. As such, I believe it would be a valuable addition to the bookshelf of any researcher in modeling and simulation…This is not a substitute for a more detailed text on massively parallel programming...Instead, it is a nice practical addition to that text."--Computing Reviews, August 2012


  • Editor’s Introduction: State of GPU Computing

    Section 1: Scientific Simulation

    State of GPU Computing in Scientific Simulation

    1: GPU-Accelerated Computation and Interactive Display of Molecular Orbitals

    2: Large-Scale Chemical Informatics on GPUs

    3: Dynamical Quadrature Grids: Applications in Density Functional Calculations

    4: Fast Molecular Electrostatics Algorithms on GPUs

    5: Quantum Chemistry: Propagation of Electronic Structure on GPU

    6: An Efficient CUDA Implementation of the Tree-based Barnes Hut n-Body Algorithm

    7: Leveraging the Untapped Computation Power of GPUs: Fast Spectral Synthesis Using Texture Interpolation

    8: Black Hole Simulations with CUDA

    9: Treecode and Fast Multipole Method for N-body Simulation with CUDA

    10: Wavelet-based Density Functional Theory Calculation on Massively Parallel Hybrid Architectures

    Section 2: Life Sciences

    State of GPU Computing in Life Sciences

    11: Accurate Scanning of Sequence Databases with the Smith-Waterman Algorithm

    12: Massive Parallel Computing to Accelerate Genome-Matching

    13: GPU-Supercomputer Acceleration of Pattern Matching

    14: GPU Accelerated RNA Folding Algorithm

    15: Temporal Data Mining for Neuroscience

    Section 3: Statistical Modeling

    State of GPU Computing in Statistical Modeling

    16: Parallelization Techniques for Random Number Generations

    17: Monte Carlo Photon Transport on the GPU

    18: High Performance Iterated Function Systems

    Section 4: Emerging Data-intensive Applications

    State of GPU Computing in Data-intensive Applications

    19: Large Scale Machine Learning

    20: Multiclass Support Vector Machine

    21: Template Driven Agent Based Modeling and Simulation with CUDA

    22: GPU-Accelerated Ant Colony Optimization

    Section 5: Electronic Design Automation

    State of GPU Computing in Electronic Design Automation

    23: High Performance Gate-Level Simulation with GP-GPUs

    24: GPU-Based Parallel Computing for Fast Circuit Optimization

    Section 6: Ray Tracing and Rendering

    State of GPU Computing in Ray Tracing and Rendering

    25: Lattice-Boltzmann Lighting Models

    26: Path Regeneration for Random Walks

    27: From Sparse Mocap to Highly-detailed Facial Animation

    28: A Programmable Graphics Pipeline in CUDA for Order Independent Transparency

    Section 7: Computer Vision

    State of GPU Computing in Computer Vision

    29: Fast Graph Cuts for Computer Vision

    30: Visual Saliency Model on Multi-GPU

    31: Real-Time Stereo on GPGPU Using Progressive Multi-Resolution Adaptive Windows

    32: Real-Time Speed-Limit-Sign Recognition on an Embedded System Using a GPU

    33: Haar Classifiers for Object Detection with CUDA

    Section 8: Video and Image Processing

    State of GPU Computing in Video and Image Processing

    34: Experiences on Image and Video Processing with CUDA and OpenCL

    35: Connected Component Labeling in CUDA

    36: Image Demosaicing

    Section 9: Signal and Audio Processing

    State of GPU Computing in Signal and Audio Processing

    37: Efficient Automatic Speech Recognition on the GPU

    38: Parallel LDPC Decoding

    39: Large-Scale Fast Fourier Transform

    Section 10: Medical Imaging

    State of GPU Computing in Medical Imaging

    40: GPU Acceleration of Iterative Digital Breast Tomosynthesis

    41: Parallelization of Katsevich CT Image Reconstruction Algorithm on Generic Multi-Core Processors and GPGPU

    42: 3-D Tomographic Image Reconstruction from Randomly Ordered Lines with CUDA

    43: Using GPUs to Learn Effective Parameter Settings for GPU-Accelerated Iterative CT Reconstruction Algorithms

    44: Using GPUs to Accelerate Advanced MRI Reconstruction with Field Inhomogeneity Compensation

    45: ℓ1 Minimization in ℓ1-SPIRiT Compressed Sensing MRI Reconstruction

    46: Medical Image Processing Using GPU-accelerated ITK Image Filters

    47: Deformable Volumetric Registration Using B-splines

    48: Multi-scale Unbiased Diffeomorphic Atlas Construction on Multi-GPUs

    49: GPU-accelerated Brain Connectivity Reconstruction and Visualization in Large-Scale Electron Micrographs

    50: Fast Simulation of Radiographic Images Using a Monte Carlo X-Ray Transport Algorithm Implemented in CUDA