A hands-on introduction to parallel programming based on the Message-Passing Interface (MPI) standard, the de facto industry standard adopted by major vendors of commercial parallel systems. This textbook and tutorial, based on the C language, contains many fully developed examples and exercises. The complete source code for the examples is available in both C and Fortran 77. Students and professionals will find that the portability of MPI, combined with a thorough grounding in parallel programming principles, will allow them to program any parallel system, from a network of workstations to a parallel supercomputer.
- Proceeds from basic blocking sends and receives to the most esoteric aspects of MPI.
- Includes extensive coverage of performance and debugging.
- Discusses a variety of approaches to the problem of basic I/O on parallel machines.
- Provides exercises and programming assignments.
Chapter 1 Introduction 1.1 The Need for More Computational Power 1.2 The Need for Parallel Computing 1.3 The Bad News 1.4 MPI 1.5 The Rest of the Book 1.6 Typographic Conventions
Chapter 2 An Overview of Parallel Computing 2.1 Hardware 2.1.1 Flynn's Taxonomy 2.1.2 The Classical von Neumann Machine 2.1.3 Pipeline and Vector Architectures 2.1.4 SIMD Systems 2.1.5 General MIMD Systems 2.1.6 Shared-Memory MIMD 2.1.7 Distributed-Memory MIMD 2.1.8 Communication and Routing 2.2 Software Issues 2.2.1 Shared-Memory Programming 2.2.2 Message Passing 2.2.3 Data-Parallel Languages 2.2.4 RPC and Active Messages 2.2.5 Data Mapping 2.3 Summary 2.4 References 2.5 Exercises
Chapter 3 Greetings! 3.1 The Program 3.2 Execution 3.3 MPI 3.3.1 General MPI Programs 3.3.2 Finding Out about the Rest of the World 3.3.3 Message: Data + Envelope 3.3.4 Sending Messages 3.4 Summary 3.5 References 3.6 Exercises 3.7 Programming Assignment
Chapter 4 An Application: Numerical Integration 4.1 The Trapezoidal Rule 4.2 Parallelizing the Trapezoidal Rule 4.3 I/O on Parallel Systems 4.4 Summary 4.5 References 4.6 Exercises 4.7 Programming Assignments
Chapter 5 Collective Communication 5.1 Tree-Structured Communication 5.2 Broadcast 5.3 Tags, Safety, Buffering, and Synchronization 5.4 Reduce 5.5 Dot Product 5.6 Allreduce 5.7 Gather and Scatter 5.8 Summary 5.10 References 5.11 Exercises 5.12 Programming Assignments
Chapter 6 Grouping Data for Communication 6.1 The Count Parameter 6.2 Derived Types and MPI_Type_struct 6.3 Other Derived Datatype Constructors 6.4 Type Matching 6.5 Pack/Unpack 6.6 Deciding Which Method to Use 6.7 Summary 6.8 References 6.9 Exercises 6.10 Programming Assignments
Chapter 7 Communicators and Topologies 7.1 Matrix Multiplication 7.2 Fox's Algorithm 7.3 Communicators 7.4 Working with Groups, Contexts, and Communicators 7.5 MPI_Comm_split 7.6 Topologies 7.7 MPI_Cart_sub 7.8 Implementation of Fox's Algorithm 7.9 Summary 7.10 References 7.11 Exercises 7.12 Programming Assignments
Chapter 8 Dealing with I/O 8.1 Dealing with stdin, stdout, and stderr 8.1.1 Attribute Caching 8.1.2 Callback Functions 8.1.3 Identifying the I/O Process Rank 8.1.4 Caching an I/O Process Rank 8.1.5 Retrieving the I/O Process Rank 8.1.6 Reading from stdin 8.1.7 Writing to stdout 8.1.8 Writing to stderr and Error Checking 8.2 Limited Access to stdin 8.3 File I/O 8.4 Array I/O 8.4.1 Data Distributions 8.4.2 Model Problem 8.4.3 Distribution of the Input 8.4.4 Derived Datatypes 8.4.5 The Extent of a Derived Datatype 8.4.6 The Input Code 8.4.7 Printing the Array 8.4.8 An Example 8.5 Summary 8.6 References 8.7 Exercises 8.8 Programming Assignments
Chapter 9 Debugging Your Program 9.1 Quick Review of Serial Debugging 9.1.1 Examine the Source Code 9.1.2 Add Debugging Output 9.1.3 Use a Debugger 9.2 More on Serial Debugging 9.3 Parallel Debugging 9.4 Nondeterminism 9.5 An Example 9.5.1 The Program? 9.5.2 Debugging the Program 9.5.3 A Brief Discussion of Parallel Debuggers 9.5.4 The Old Standby: printf/fflush 9.5.5 The Classical Bugs in Parallel Programs 9.5.6 First Fix 9.5.7 Many Parallel Programming Bugs Are Really Serial Programming Bugs 9.5.8 Different Systems, Different Errors 9.5.9 Moving to Multiple Processes 9.5.10 Confusion about I/O 9.5.11 Finishing Up 9.6 Error Handling in MPI 9.7 Summary 9.8 References 9.9 Exercises 9.10 Programming Assignments
Chapter 10 Design and Coding of Parallel Programs 10.1 Data-Parallel Programs 10.2 Jacobi's Method 10.3 Parallel Jacobi's Method 10.4 Coding Parallel Programs 10.5 An Example: Sorting 10.5.1 Main Program 10.5.2 The "Input" Functions 10.5.3 All-to-all Scatter/Gather 10.5.4 Redistributing the Keys 10.5.5 Pause to Clean Up 10.5.6 Find_alltoall_send_params 10.5.7 Finishing Up 10.6 Summary 10.7 References 10.8 Exercises 10.9 Programming Assignments
Chapter 11 Performance 11.1 Serial Program Performance 11.2 An Example: The Serial Trapezoidal Rule 11.3 What about the I/O? 11.4 Parallel Program Performance Analysis 11.5 The Cost of Communication 11.6 An Example: The Parallel Trapezoidal Rule 11.7 Taking Timings 11.8 Summary 11.9 References 11.10 Exercises 11.11 Programming Assignments
Chapter 12 More on Performance 12.1 Amdahl's Law 12.2 Work and Overhead 12.3 Sources of Overhead 12.4 Scalability 12.5 Potential Problems in Estimating Performance 12.5.1 Networks of Workstations and Resource Contention 12.5.2 Load Balancing and Idle Time 12.5.3 Overlapping Communication and Computation 12.5.4 Collective Communication 12.6 Performance Evaluation Tools 12.6.1 MPI's Profiling Interface 12.6.2 Upshot 12.7 Summary 12.8 References 12.9 Exercises 12.10 Programming Assignments
Chapter 13 Advanced Point-to-Point Communication 13.1 An Example: Coding Allgather 13.1.1 Function Parameters 13.1.2 Ring Pass Allgather 13.2 Hypercubes 13.2.1 Additional Issues in the Hypercube Exchange 13.2.2 Details of the Hypercube Algorithm 13.3 Send-receive 13.4 Null Processes 13.5 Nonblocking Communication 13.5.1 Ring Allgather with Nonblocking Communication 13.5.2 Hypercube Allgather with Nonblocking Communication 13.6 Persistent Communication Requests 13.7 Communication Modes 13.7.1 Synchronous Mode 13.7.2 Ready Mode 13.7.3 Buffered Mode 13.8 The Last Word on Point-to-Point Communication 13.9 Summary 13.10 References 13.11 Exercises 13.12 Programming Assignments
Chapter 14 Parallel Algorithms 14.1 Designing a Parallel Algorithm 14.2 Sorting 14.3 Serial Bitonic Sort 14.4 Parallel Bitonic Sort 14.5 Tree Searches and Combinatorial Optimization 14.6 Serial Tree Search 14.7 Parallel Tree Search 14.7.1 Par_dfs 14.7.2 Service_requests 14.7.3 Work_remains 14.7.4 Distributed Termination Detection 14.8 Summary 14.9 References 14.10 Exercises 14.11 Programming Assignments
Chapter 15 Parallel Libraries 15.1 Using Libraries: Pro and Con 15.2 Using More than One Language 15.3 ScaLAPACK 15.4 An Example of a ScaLAPACK Program 15.5 PETSc 15.6 A PETSc Example 15.7 Summary 15.8 References 15.9 Exercises 15.10 Programming Assignments
Chapter 16 Wrapping Up 16.1 Where to Go from Here 16.2 The Future of MPI
Appendix A Summary of MPI Commands A.1 Point-to-Point Communication Functions A.1.1 Blocking Sends and Receives A.1.2 Communication Modes A.1.3 Buffer Allocation A.1.4 Nonblocking Communication A.1.5 Probe and Cancel A.1.6 Persistent Communication Requests A.1.7 Send-receive A.2 Derived Datatypes and MPI_Pack/Unpack A.2.1 Derived Datatypes A.2.2 MPI_Pack and MPI_Unpack A.3 Collective Communication Functions A.3.1 Barrier and Broadcast A.3.2 Gather Scatter A.3.3 Reduction Operations A.4 Groups, Contexts, and Communicators A.4.1 Group Management A.4.2 Communicator Management A.4.3 Inter-communicators A.4.4 Attribute Caching A.5 Process Topologies A.5.1 General Topology Functions A.5.2 Cartesian Topology Management A.5.3 Graph Topology Management A.6 Environmental Management A.6.1 Implementation Information A.6.2 Error Handling A.6.3 Timers A.6.4 Startup A.7 Profiling A.8 Constants A.9 Type Definitions
Appendix B MPI on the Internet B.1 Implementations of MPI B.2 The MPI FAQ B.3 MPI Web Pages B.4 MPI Newsgroup B.5 MPI-2 and MPI-IO B.6 Parallel Programming with MPI
- © Morgan Kaufmann 1996
- 26th November 1996
- Morgan Kaufmann
Peter Pacheco received a PhD in mathematics from Florida State University. After
completing graduate school, he became one of the first professors in UCLA’s “Program
in Computing,” which teaches basic computer science to students at the College
of Letters and Sciences there. Since leaving UCLA, he has been on the faculty of
the University of San Francisco. At USF Peter has served as chair of the computer
science department and is currently chair of the mathematics department.
His research is in parallel scientific computing. He has worked on the development
of parallel software for circuit simulation, speech recognition, and the simulation
of large networks of biologically accurate neurons. Peter has been teaching
parallel computing at both the undergraduate and graduate levels for nearly twenty
years. He is the author of Parallel Programming with MPI, published by Morgan Kaufmann.
University of San Francisco, USA
Intel Recommended Reading List for Developers, 1st Half 2013 – Books for Software Developers, Intel
Intel Recommended Reading List for Developers, 2nd Half 2013 – Books for Software Developers, Intel
Intel Recommended Reading List for Developers, 1st Half 2014 – Books for Software Developers, Intel
"...the detailed discussion of many complex and confusing issues makes the book an important information source for programmers developing large applications using MPI." -—L.M. Liebrock, ACM Computing Reviews