CUDA Application Design and Development - 1st Edition - ISBN: 9780123884268, 9780123884329

CUDA Application Design and Development

1st Edition

Author: Rob Farber
eBook ISBN: 9780123884329
Paperback ISBN: 9780123884268
Imprint: Morgan Kaufmann
Published Date: 8th October 2011
Page Count: 336


As the computer industry retools to leverage massively parallel graphics processing units (GPUs), this book is designed to meet the needs of working software developers who need to understand GPU programming with CUDA and increase efficiency in their projects. CUDA Application Design and Development starts with an introduction to parallel computing concepts for readers with no previous parallel experience, and focuses on issues of immediate importance to working software developers: achieving high performance, maintaining competitiveness, analyzing CUDA benefits versus costs, and determining application lifespan.

The book then details the thought behind CUDA and teaches how to create, analyze, and debug CUDA applications. Throughout, the focus is on software engineering issues: how to use CUDA in the context of existing application code, with existing compilers, languages, software tools, and industry-standard API libraries.

Using an approach refined in a series of well-received articles in Dr. Dobb's Journal, author Rob Farber takes the reader step by step from fundamentals to implementation, moving from language theory to practical coding.

Key Features

  • Includes multiple examples building from simple to more complex applications in four key areas: machine learning, visualization, vision recognition, and mobile computing
  • Addresses the foundational issues for CUDA development: multi-threaded programming and the different memory hierarchy
  • Includes teaching chapters designed to give a full understanding of CUDA tools, techniques, and structure
  • Presents CUDA techniques in the context of the hardware they are implemented on as well as other styles of programming that will help readers bridge into the new material


Software engineers, programmers, hardware engineers, advanced students

Table of Contents

CHAPTER 1 First Programs and How to Think in CUDA

Source Code and Wiki

Distinguishing CUDA from Conventional Programming with a Simple Example

Choosing a CUDA API

Some Basic CUDA Concepts

Understanding Our First Runtime Kernel

Three Rules of GPGPU Programming

Big-O Considerations and Data Transfers

CUDA and Amdahl’s Law

Data and Task Parallelism

Hybrid Execution: Using Both CPU and GPU Resources

Regression Testing and Accuracy

Silent Errors

Introduction to Debugging

UNIX Debugging

Windows Debugging with Parallel Nsight


CHAPTER 2 CUDA for Machine Learning and Optimization

Modeling and Simulation

Machine Learning and Neural Networks

XOR: An Important Nonlinear Machine-Learning Problem

Performance Results on XOR

Performance Discussion


The C++ Nelder-Mead Template

CHAPTER 3 The CUDA Tool Suite: Profiling a PCA/NLPCA



Obtaining Basic Profile Information

Gprof: A Common UNIX Profiler

The NVIDIA Visual Profiler: Computeprof

Parallel Nsight for Microsoft Visual Studio

Tuning and Analysis Utilities (TAU)


CHAPTER 4 The CUDA Execution Model

GPU Architecture Overview

Warp Scheduling and TLP

ILP: Higher Performance at Lower Occupancy

Little’s Law

CUDA Tools to Identify Limiting Factors



CHAPTER 5 CUDA Memory

The CUDA Memory Hierarchy

GPU Memory

L2 Cache

L1 Cache

CUDA Memory Types

Global Memory


CHAPTER 6 Efficiently Using GPU Memory


Utilizing Irregular Data Structures

Sparse Matrices and the CUSP Library

Graph Algorithms

SoA, AoS, and Other Structures

Tiles and Stencils


CHAPTER 7 Techniques to Increase Parallelism

CUDA Contexts Extend Parallelism

Streams and Contexts

Out-of-Order Execution with Multiple Streams

Tying Data to Computation


CHAPTER 8 CUDA for All GPU and CPU Applications

Pathways from CUDA to Multiple Hardware Backends

Accessing CUDA from Other Languages





CHAPTER 9 Mixing CUDA and Rendering



Introduction to the Files in the Framework


CHAPTER 10 CUDA in a Cloud and Cluster Environments

The Message Passing Interface (MPI)

How MPI Communicates


Balance Ratios

Considerations for Large MPI Runs

Cloud Computing

A Code Example


CHAPTER 11 CUDA for Real Problems

Working with High-Dimensional Data


Force-Directed Graphs

Monte Carlo Methods

Molecular Modeling

Quantum Chemistry

Interactive Workflows

A Plethora of Projects


CHAPTER 12 Application Focus on Live Streaming Video

Topics in Machine Vision


TCP Server

Live Stream Application

The simpleVBO.cpp File

The callbacksVBO.cpp File

Building and Running the Code

The Future


Listing for simpleVBO.cpp



About the Author

Rob Farber

Rob Farber has served as a scientist in Europe at the Irish Center for High-End Computing as well as at U.S. national labs in Los Alamos, Berkeley, and the Pacific Northwest. He has also been on the external faculty at the Santa Fe Institute, a consultant to Fortune 100 companies, and a co-founder of two computational startups that achieved liquidity events. He is the author of “CUDA Application Design and Development” as well as numerous articles and tutorials that have appeared in Dr. Dobb's Journal, Scientific Computing, The Code Project, and other publications.

Affiliations and Expertise

CEO/Publisher of, Wall Street Analyst, and consultant to scientific and commercial technology companies around the world.


The book by Rob Farber on CUDA Application Design and Development is required reading for anyone who wants to understand and efficiently program CUDA for scientific and visual programming. It provides a hands-on exposure to the details in a readable and easy-to-understand form.
Jack Dongarra, Innovative Computing Laboratory, EECS Department, University of Tennessee

GPUs have the potential to take computational simulations to new levels of scale and detail. Many scientists are already realising these benefits, tackling larger and more complex problems that are not feasible on conventional CPU-based systems. This book provides the tools and techniques for anyone wishing to join these pioneers, in an accessible though thorough text that a budding CUDA programmer would do well to keep close to hand.
Dr. George Beckett, EPCC, University of Edinburgh

With his book, Farber takes us on a journey to the exciting world of programming multi-core processor machines with CUDA. Farber's pragmatic approach is effective in guiding the reader across challenges and their solutions. Farber's broader presentation of parallel programming with CUDA, ranging from CUDA in Cloud and Cluster environments to CUDA for real problems and applications, helps the reader learn about the unique opportunities this parallel programming language can offer to the scientific community. This book is definitely a must for students, teachers, and developers!
Michela Taufer, Assistant Professor, Department of Computer and Information Sciences, University of Delaware

Rob Farber has written an enlightening and accessible book on the application of CUDA to real research tasks, with an eye to developing scalable and distributed GPU applications. He supplies clear and usable code examples combined with insight about _why_ one should use a particular approach. This is an excellent book filled with practical advice for experienced CUDA programmers and ground-up guidance for beginners wondering if CUDA can accelerate their time to solution.
Paul A. Navrátil, Manager, Visualization Software, Texas Advanced Computing Center

The book provides a solid introduction to the CUDA programming language, starting with the basics and progressively exposing the reader to advanced concepts through the well-annotated implementation of real-world applications. It makes a first-rate presentation of CUDA, its use in the implementation of portable and efficient applications, and the underlying architecture of GPGPU/CPU systems, with particular emphasis on memory hierarchies. This is complemented by a thorough presentation both of the CUDA Tool Suite and of techniques for the parallelisation of applications. Farber's book is a valuable addition to the bookshelves of both the advanced and novice CUDA programmer.
Francis Wray, Independent Consultant and Visiting Professor at the Faculty of Computing, Information Systems and Mathematics at the University of Kingston

At a brisk pace, "CUDA Application Design and Development" will take one from the basics of CUDA programming to the level where real-time video processing becomes a stroll in the park. Along the way, the reader can get a clear understanding of how the hybrid CPU-GPU computing idea can be capitalized on, and how a 500-GPU configuration can be used in large-scale machine learning problems. Wasting no time on obscure issues of little relevance, the book provides an excellent account of the CUDA execution model, memory access issues, opportunities to increase parallelism in a program, and how advanced profiling can squeeze performance out of a code. Rob provides a snapshot of everything that is relevant in CUDA-based GPU computing in a style honed through a long series of Dr. Dobb’s articles that have delighted scores of CUDA programmers. His followers will be delighted once again.
Dan Negrut, Associate Professor, University of Wisconsin-Madison, NVIDIA CUDA Fellow
