GPU Computing Gems Jade Edition

1st Edition

Editor-in-Chief: Wen-mei Hwu
Hardcover ISBN: 9780123859631
eBook ISBN: 9780123859648
Imprint: Morgan Kaufmann
Published Date: 28th September 2011
Page Count: 560

Table of Contents

Editors, Reviewers, and Authors


Managing Editor


Area Editors




State of GPU Computing

Section 1: Parallel Algorithms and Data Structures


In this Section

Chapter 1. Large-Scale GPU Search

1.1 Introduction

1.2 Memory Performance

1.3 Searching Large Data Sets

1.4 Experimental Evaluation

1.5 Conclusion


Chapter 2. Edge v. Node Parallelism for Graph Centrality Metrics

2.1 Introduction

2.2 Background

2.3 Node v. Edge Parallelism

2.4 Data Structure

2.5 Implementation

2.6 Analysis

2.7 Results

2.8 Conclusions


Chapter 3. Optimizing Parallel Prefix Operations for the Fermi Architecture

3.1 Introduction to Parallel Prefix Operations

3.2 Efficient Binary Prefix Operations on Fermi

3.3 Conclusion


Chapter 4. Building an Efficient Hash Table on the GPU

4.1 Introduction

4.2 Overview

4.3 Building and Querying a Basic Hash Table

4.4 Specializing the Hash Table

4.5 Analysis

4.6 Conclusion



Chapter 5. Efficient CUDA Algorithms for the Maximum Network Flow Problem

5.1 Introduction, Problem Statement, and Context

5.2 Core Method

5.3 Algorithms, Implementations, and Evaluations

5.4 Final Evaluation

5.5 Future Directions


Chapter 6. Optimizing Memory Access Patterns for Cellular Automata on GPUs

6.1 Introduction, Problem Statement, and Context

6.2 Core Methods

6.3 Algorithms, Implementations, and Evaluations

6.4 Final Results

6.5 Future Directions


Chapter 7. Fast Minimum Spanning Tree Computa


GPU Computing Gems, Jade Edition describes successful application experiences in GPU computing and the techniques that contributed to that success. Divided into five sections, the book explains how GPU execution is achieved with algorithm implementation techniques and approaches to data structure layout. More specifically, it considers three general requirements: a high level of parallelism, coherent memory access by threads within warps, and coherent control flow within warps.
This book begins with an overview of parallel algorithms and data structures. The first few chapters focus on accelerating database searches, leveraging the Fermi GPU architecture to further accelerate prefix operations, and implementing hash tables on the GPU. The reader is then systematically walked through the fundamental optimization steps for implementing a bandwidth-limited algorithm, GPU-based libraries of numerical algorithms and software products for numerical analysis with dedicated GPU support, and the adoption of GPU computing techniques in production engineering simulation codes. The next chapters discuss the state of GPU computing in interactive physics and artificial intelligence, programming tools and techniques for GPU computing, and the edge and node parallelism approaches for computing graph centrality metrics. The book also proposes an alternative approach that balances computation regardless of node degree variance. This book will be useful to application developers in a wide range of application areas.

Key Features

  • This second volume of GPU Computing Gems offers 100% new material of interest across industry, including finance, medicine, imaging, engineering, gaming, environmental science, green computing, and more
  • Covers new tools and frameworks for productive GPU computing application development and offers immediate benefit to researchers developing improved programming environments for GPUs
  • Even more hands-on, proven techniques demonstrating how general purpose GPU computing is changing scientific research
  • Distills the best practices of the community of CUDA programmers; each chapter provides insights and ideas as well as 'hands on' skills applicable to a variety of fields


Software engineers, programmers, hardware engineers, advanced students


© Morgan Kaufmann 2011


It wasn't until recently that parallel [GPU] computing made people realize that there are whole areas in computing science that we can tackle. … When you can do something 10 or 100 times faster, something magical happens and you can do something completely different.

—Jen-Hsun Huang, CEO, NVIDIA

About the Editor-in-Chief

Wen-mei Hwu Editor-in-Chief

Wen-mei W. Hwu is a Professor and holds the Sanders-AMD Endowed Chair in the Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign. His research interests are in the area of architecture, implementation, compilation, and algorithms for parallel computing. He is the chief scientist of the Parallel Computing Institute and director of the IMPACT research group. He is a co-founder and CTO of MulticoreWare. For his contributions in research and teaching, he received the ACM SIGARCH Maurice Wilkes Award, the ACM Grace Murray Hopper Award, the Tau Beta Pi Daniel C. Drucker Eminent Faculty Award, the ISCA Influential Paper Award, the IEEE Computer Society B. R. Rau Award, and the Distinguished Alumni Award in Computer Science of the University of California, Berkeley. He is a fellow of IEEE and ACM. He directs the UIUC CUDA Center of Excellence and serves as one of the principal investigators of the NSF Blue Waters Petascale computer project. Dr. Hwu received his Ph.D. degree in Computer Science from the University of California, Berkeley.

Affiliations and Expertise

CTO, MulticoreWare and professor specializing in compiler design, computer architecture, microarchitecture, and parallel processing, University of Illinois at Urbana-Champaign