Multicore and GPU Programming

1st Edition

An Integrated Approach

Author: Gerassimos Barlas
eBook ISBN: 9780124171404
Paperback ISBN: 9780124171374
Imprint: Morgan Kaufmann
Published Date: 17th November 2014
Page Count: 698

Description

Multicore and GPU Programming offers broad coverage of the key parallel computing skill sets: multicore CPU programming and manycore "massively parallel" computing. Using threads, OpenMP, MPI, and CUDA, it teaches the design and development of software that can exploit today’s computing platforms, which combine CPU and GPU hardware, and explains how to transition from sequential programming to a parallel computing paradigm.
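
The fragment below is not taken from the book; it is a minimal, illustrative C++/OpenMP sketch of the sequential-to-parallel transition described above, turning a serial dot-product loop into a multithreaded one with a single directive (the array size and variable names are assumptions made for this example):

    // Illustrative only. Compile with: g++ -fopenmp dot.cpp
    #include <cstdio>
    #include <vector>
    #include <omp.h>

    int main() {
        const int N = 1000000;
        std::vector<double> a(N, 1.0), b(N, 2.0);
        double sum = 0.0;

        // Serial form: for (int i = 0; i < N; ++i) sum += a[i] * b[i];
        // Parallel form: the directive splits the iterations across the
        // available cores and combines the per-thread partial sums.
        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < N; ++i)
            sum += a[i] * b[i];

        std::printf("dot = %f using up to %d threads\n", sum, omp_get_max_threads());
        return 0;
    }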

Presenting material refined over more than a decade of teaching parallel computing, author Gerassimos Barlas eases the learning curve with numerous examples, extensive case studies, and full source code. With this book, you will learn to develop programs that run over distributed-memory machines using MPI, create multi-threaded applications with either libraries or directives, write optimized applications that balance the workload among the available computing resources, and profile and debug programs targeting multicore machines.
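
Again as an illustration rather than an excerpt from the book, a distributed-memory MPI program of the kind mentioned above reduces to processes (ranks) exchanging messages. This minimal C++ sketch uses the standard MPI C bindings to send one integer from rank 0 to rank 1; the payload value is arbitrary:

    // Illustrative only. Compile with mpicxx and run with, e.g., mpirun -np 2 ./a.out
    #include <cstdio>
    #include <mpi.h>

    int main(int argc, char* argv[]) {
        MPI_Init(&argc, &argv);

        int rank = 0, size = 0;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        int token = 0;
        if (size >= 2 && rank == 0) {
            token = 42;  // arbitrary payload for the example
            MPI_Send(&token, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (size >= 2 && rank == 1) {
            MPI_Recv(&token, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            std::printf("rank 1 received %d from rank 0\n", token);
        }

        MPI_Finalize();
        return 0;
    }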

Key Features

  • Comprehensive coverage of all major multicore programming tools, including threads, OpenMP, MPI, and CUDA
  • Demonstrates parallel programming design patterns and examples of how different tools and paradigms can be integrated for superior performance
  • Particular focus on the emerging area of divisible load theory and its impact on load balancing and distributed systems
  • Download source code, examples, and instructor support materials on the book's companion website

Readership

Graduate students in parallel computing courses covering both traditional and GPU computing (or a two-semester sequence); professionals and researchers looking to master parallel computing.

Table of Contents

  • Dedication
  • List of Tables
  • Preface
    • What is in this Book
    • Using this Book as a Textbook
    • Software and Hardware Requirements
    • Sample Code
  • Chapter 1: Introduction
    • Abstract
    • In this chapter you will
    • 1.1 The era of multicore machines
    • 1.2 A taxonomy of parallel machines
    • 1.3 A glimpse of contemporary computing machines
    • 1.4 Performance metrics
    • 1.5 Predicting and measuring parallel program performance
    • Exercises
  • Chapter 2: Multicore and parallel program design
    • Abstract
    • In this chapter you will
    • 2.1 Introduction
    • 2.2 The PCAM methodology
    • 2.3 Decomposition patterns
    • 2.4 Program structure patterns
    • 2.5 Matching decomposition patterns with program structure patterns
    • Exercises
  • Chapter 3: Shared-memory programming: threads
    • Abstract
    • In this chapter you will
    • 3.1 Introduction
    • 3.2 Threads
    • 3.3 Design concerns
    • 3.4 Semaphores
    • 3.5 Applying semaphores in classical problems
    • 3.6 Monitors
    • 3.7 Applying monitors in classical problems
    • 3.8 Dynamic vs. static thread management
    • 3.9 Debugging multithreaded applications
    • 3.10 Higher-level constructs: multithreaded programming without threads
    • Exercises
  • Chapter 4: Shared-memory programming: OpenMP
    • Abstract
    • In this chapter you will
    • 4.1 Introduction
    • 4.2 Your first OpenMP program
    • 4.3 Variable scope
    • 4.4 Loop-level parallelism
    • 4.5 Task parallelism
    • 4.6 Synchronization constructs
    • 4.7 Correctness and optimization issues
    • 4.8 A case study: sorting in OpenMP
  • Chapter 5: Distributed memory programming
    • Abstract
    • In this chapter you will
    • 5.1 Communicating processes
    • 5.2 MPI
    • 5.3 Core concepts
    • 5.4 Your first MPI program
    • 5.5 Program architecture
    • 5.6 Point-to-point communication
    • 5.7 Alternative point-to-point communication modes
    • 5.8 Non-blocking communications
    • 5.9 Point-to-point communications: summary
    • 5.10 Error reporting and handling
    • 5.11 Collective communications
    • 5.12 Communicating objects
    • 5.13 Node management: communicators and groups
    • 5.14 One-sided communications
    • 5.15 I/O considerations
    • 5.16 Combining MPI processes with threads
    • 5.17 Timing and performance measurements
    • 5.18 Debugging and profiling MPI Programs
    • 5.19 The Boost.MPI Library
    • 5.20 A case study: diffusion-limited aggregation
    • 5.21 A case study: brute-force encryption cracking
    • 5.22 A case study: MPI implementation of the master-worker pattern
    • Exercises
  • Chapter 6: GPU programming
    • Abstract
    • In this chapter you will
    • 6.1 GPU programming
    • 6.2 CUDA’s programming model: Threads, blocks, and grids
    • 6.3 CUDA’s execution model: Streaming multiprocessors and warps
    • 6.4 CUDA compilation process
    • 6.5 Putting together a CUDA project
    • 6.6 Memory hierarchy
    • 6.7 Optimization techniques
    • 6.8 Dynamic parallelism
    • 6.9 Debugging CUDA programs
    • 6.10 Profiling CUDA programs
    • 6.11 CUDA and MPI
    • 6.12 Case studies
    • Exercises
  • Chapter 7: The Thrust template library
    • Abstract
    • In this chapter you will
    • 7.1 Introduction
    • 7.2 First steps in Thrust
    • 7.3 Working with Thrust datatypes
    • 7.4 Thrust algorithms
    • 7.5 Fancy iterators
    • 7.6 Switching device back ends
    • 7.7 Case studies
    • Exercises
  • Chapter 8: Load balancing
    • Abstract
    • In this chapter you will
    • 8.1 Introduction
    • 8.2 Dynamic load balancing: the Linda legacy
    • 8.3 Static load balancing: the divisible load theory approach
    • 8.4 DLTlib: a library for partitioning workloads
    • 8.5 Case studies
    • Exercises
  • Appendix A: Compiling Qt programs
    • A.1 Using an IDE
    • A.2 The qmake Utility
  • Appendix B: Running MPI programs: preparatory configuration steps
    • B.1 Preparatory Steps
    • B.2 Computing Nodes Discovery for MPI Program Deployment
  • Appendix C: Time measurement
    • C.1 Introduction
    • C.2 POSIX High-Resolution Timing
    • C.3 Timing in Qt
    • C.4 Timing in OpenMP
    • C.5 Timing in MPI
    • C.6 Timing in CUDA
  • Appendix D: Boost.MPI
    • D.1 Mapping from MPI C to Boost.MPI
  • Appendix E: Setting up CUDA
    • E.1 Installation
    • E.2 Issues with GCC
    • E.3 Running CUDA Without an Nvidia GPU
    • E.4 Running CUDA on Optimus-Equipped Laptops
    • E.5 Combining CUDA with Third-Party Libraries
  • Appendix F: DLTlib
    • F.1 DLTlib Functions
    • F.2 DLTlib Files
  • Glossary
  • Bibliography
  • Index

Details

No. of pages: 698
Language: English
Copyright: © Morgan Kaufmann 2015
Published: 17th November 2014
Imprint: Morgan Kaufmann
eBook ISBN: 9780124171404
Paperback ISBN: 9780124171374

About the Author

Gerassimos Barlas

Gerassimos Barlas is a Professor with the Computer Science & Engineering Department, American University of Sharjah, Sharjah, UAE. His research interests include parallel algorithms, the development of analysis and modeling frameworks for load balancing, and distributed video-on-demand. Prof. Barlas has taught parallel computing for more than 12 years, has been involved with parallel computing since the early 1990s, and is active in the emerging field of Divisible Load Theory for parallel and distributed systems.

Affiliations and Expertise

Professor, Computer Science & Engineering Department, American University of Sharjah, UAE