# Industrial Strength Parallel Computing

## 1st Edition

**Editors:** Alice Koniges

**eBook ISBN:** 9780080495385

**Imprint:** Morgan Kaufmann

**Published Date:** 25th October 1999

**Page Count:** 597

## Description

Today, parallel computing experts can solve problems previously deemed impossible and make the "merely difficult" problems economically feasible to solve. This book presents and synthesizes the recent experiences of renowned expert developers who design robust and complex parallel computing applications. They demonstrate how to adapt and implement today's most advanced, most effective parallel computing techniques.

The book begins with a highly focused introductory course designed to provide a working knowledge of all the relevant architectures, programming models, and performance issues, as well as the basic approaches to assessment, optimization, scheduling, and debugging.

Next comes a series of seventeen detailed case studies—all dealing with production-quality industrial and scientific applications, all presented firsthand by the actual code developers. Each chapter follows the same comparison-inviting format, presenting lessons learned and algorithms developed in the course of meeting real, non-academic challenges. A final section highlights the case studies' most important insights and turns an eye to the future of the discipline.

## Key Features

Provides in-depth case studies of seventeen parallel computing applications, some built from scratch, others developed through parallelizing existing applications.

Explains elements critical to all parallel programming environments, including:

- Terminology and architectures
- Programming models and methods
- Performance analysis and debugging tools

Teaches primarily by example, showing how scientists in many fields have solved daunting problems using parallel computing.

Covers a wide range of application areas—biology, aerospace, semiconductor design, environmental modeling, data imaging and analysis, fluid dynamics, and more.

Summarizes the state of the art while looking to the future of parallel computing.

Presents technical animations and visualizations from many of the applications detailed in the case studies via a companion web site.

## Table of Contents

Contents

Preface

Color Plates

PART I - The Parallel Computing Environment

Chapter 1 - Parallel Computing Architectures

Alice E. Koniges, David C. Eder, Margaret Cahir

1.1 Historical Parallel Computing Architectures

1.2 Contemporary Parallel Computing Architectures

1.2.1 MPP Processors

1.2.2 MPP Memory

1.2.3 MPP Interconnect Network

References

Chapter 2 - Parallel Application Performance

Alice E. Koniges

2.1 Defining Performance

2.2 Measuring Performance

2.2.1 MPP Application Speedup

References

Chapter 3 - Programming Models and Methods

Margaret Cahir, Robert Moench, Alice E. Koniges

3.1 Message-Passing Models

3.1.1 PVM

3.1.2 MPI

3.1.3 SHMEM

3.2 Data-Parallel Models

3.2.1 High-Performance Fortran

3.3 Parallel Programming Methods

3.3.1 Nested- and Mixed-Model Methods

3.3.2 POSIX Threads and Mixed Models

3.3.3 Compiler Extensions for Explicit Parallelism with Distributed Objects

3.3.4 Work-Sharing Models

References

Chapter 4 - Parallel Programming Tools

Margaret Cahir, Robert Moench, Alice E. Koniges

4.1 The Apprentice Performance Analysis Tool

4.2 Debuggers

4.2.1 Process Control

4.2.2 Data Viewing

Chapter 5 - Optimizing for Single-Processor Performance

Jeff Brooks, Sara Graffunder, Alice E. Koniges

5.1 Using the Functional Units Effectively

5.2 Hiding Latency with the Cache

5.3 Stream Buffer Optimizations

5.4 E-Register Operations

5.5 How Much Performance Can Be Obtained on a Single Processor?

References

Chapter 6 - Scheduling Issues

Morris A. Jette

6.1 Gang Scheduler Implementation

6.2 Gang Scheduler Performance

References

PART II - The Applications

Chapter 7 - Ocean Modeling and Visualization

Yi Chao, P. Peggy Li, Ping Wang, Daniel S. Katz, Benny N. Cheng, Scott Whitman

7.1 Introduction

7.2 Model Description

7.3 Computational Considerations

7.3.1 Parallel Software Tools

7.3.2 Compiler Options

7.3.3 Memory Optimization and Arithmetic Pipelines

7.3.4 Optimized Libraries

7.3.5 Replacement of If/Where Statements by Using Mask Arrays

7.3.6 Computational Performance

7.4 Visualization on MPP Machines

7.5 Scientific Results

7.6 Summary and Future Challenges

Acknowledgments

References

Chapter 8 - Impact of Aircraft on Global Atmospheric Chemistry

Douglas A. Rotman, John R. Tannahill, Steven L. Baughcum

8.1 Introduction

8.2 Industrial Considerations

8.3 Project Objectives and Application Code

8.4 Computational Considerations

8.4.1 Why Use an MPP?

8.4.2 Programming Considerations

8.4.3 Algorithm Considerations

8.5 Computational Results

8.5.1 Performance

8.5.2 Subsidiary Technology

8.6 Industrial Results

8.7 Summary

References

Chapter 9 - Petroleum Reservoir Management

Michael DeLong, Allyson Gajraj, Wayne Joubert, Olaf Lubeck, James Sanderson, Robert E. Stephenson, Gautam S. Shiralkar, Bart van Bloemen Waanders

9.1 Introduction

9.2 The Need for Parallel Simulations

9.3 Basic Features of the Falcon Simulator

9.4 Parallel Programming Model and Implementation

9.5 IMPES Linear Solver

9.6 Fully Implicit Linear Solver

9.7 Falcon Performance Results

9.8 Amoco Field Study

9.9 Summary

Acknowledgments

References

Chapter 10 - An Architecture-Independent Navier-Stokes Code

Johnson C. T. Wang, Stephen Taylor

10.1 Introduction

10.2 Basic Equations

10.2.1 Nomenclature

10.3 A Navier-Stokes Solver

10.4 Parallelization of a Navier-Stokes Solver

10.4.1 Domain Decomposition

10.4.2 Parallel Algorithm

10.5 Computational Results

10.5.1 Supersonic Flow over Two Wedges

10.5.2 Titan IV Launch Vehicle

10.5.3 Delta II 7925 Vehicle

10.6 Summary

Acknowledgments

References

Chapter 11 - Gaining Insights into the Flow in a Static Mixer

Olivier Byrde, Mark L. Sawley

11.1 Introduction

11.1.1 Overview

11.1.2 Description of the Application

11.2 Computational Aspects

11.2.1 Why Use an MPP?

11.2.2 Flow Computation

11.2.3 Particle Tracking

11.3 Performance Results

11.3.1 Flow Computation

11.3.2 Particle Tracking

11.4 Industrial Results

11.4.1 Numerical Solutions

11.4.2 Optimization Results

11.4.3 Future Work

11.5 Summary

Acknowledgments

References

Chapter 12 - Modeling Groundwater Flow and Contaminant Transport

William J. Bosl, Steven F. Ashby, Chuck Baldwin, Robert D. Falgout, Steven G. Smith, Andrew F. B. Tompson

12.1 Introduction

12.2 Numerical Simulation of Groundwater Flow

12.2.1 Flow and Transport Model

12.2.2 Discrete Solution Approach

12.3 Parallel Implementation

12.3.1 Parallel Random Field Generation

12.3.2 Preconditioned Conjugate Gradient Solver

12.3.3 Gridding and Data Distribution

12.3.4 Parallel Computations in ParFlow

12.3.5 Scalability

12.4 The MGCG Algorithm

12.4.1 Heuristic Semicoarsening Strategy

12.4.2 Operator-Induced Prolongation and Restriction

12.4.3 Definition of Coarse Grid Operator

12.4.4 Smoothers

12.4.5 Coarsest Grid Solvers

12.4.6 Stand-Alone Multigrid versus Multigrid As a Preconditioner

12.5 Numerical Results

12.5.1 The Effect of Coarsest Grid Solver Strategy

12.5.2 Increasing the Spatial Resolution

12.5.3 Enlarging the Size of the Domain

12.5.4 Increasing the Degree of Heterogeneity

12.5.5 Parallel Performance on the Cray T3D

12.6 Summary

Acknowledgments

References

Chapter 13 - Simulation of Plasma Reactors

Stephen Taylor, Marc Rieffel, Jerrell Watts, Sadasivan Shankar

13.1 Introduction

13.2 Computational Considerations

13.2.1 Grid Generation and Partitioning Techniques

13.2.2 Concurrent DSMC Algorithm

13.2.3 Grid Adaption Technique

13.2.4 Library Technology

13.3 Simulation Results

13.4 Performance Results

13.5 Summary

Acknowledgments

References

Chapter 14 - Electron-Molecule Collisions for Plasma Modeling

Carl Winstead, Chuo-Han Lee, Vincent McKoy

14.1 Introduction

14.2 Computing Electron-Molecule Cross Sections

14.2.1 Theoretical Outline

14.2.2 Implementation

14.2.3 Parallel Organization

14.3 Performance

14.4 Summary

Acknowledgments

References

Chapter 15 - Three-Dimensional Plasma Particle-in-Cell Calculations of Ion Thruster Backflow Contamination

Robie I. Samanta Roy, Daniel E. Hastings, Stephen Taylor

15.1 Introduction

15.2 The Physical Model

15.2.1 Beam Ions

15.2.2 Neutral Efflux

15.2.3 CEX Propellant Ions

15.2.4 Electrons

15.3 The Numerical Model

15.4 Parallel Implementation

15.4.1 Partitioning

15.4.2 Parallel PIC Algorithm

15.5 Results

15.5.1 3D Plume Structure

15.5.2 Comparison of 2D and 3D Results

15.6 Parallel Study

15.7 Summary

Acknowledgments

References

Chapter 16 - Advanced Atomic-Level Materials Design

Lin H. Yang

16.1 Introduction

16.2 Industrial Considerations

16.3 Computational Considerations and Parallel Implementations

16.4 Applications to Grain Boundaries in Polycrystalline Diamond

16.5 Summary

Acknowledgments

References

Chapter 17 - Solving Symmetric Eigenvalue Problems

David C. O'Neal, Raghurama Reddy

17.1 Introduction

17.2 Jacobi's Method

17.3 Classical Jacobi Method

17.4 Serial Jacobi Method

17.5 Tournament Orderings

17.6 Parallel Jacobi Method

17.7 Macro Jacobi Method

17.8 Computational Experiments

17.8.1 Test Problems

17.8.2 Convergence

17.8.3 Scaling

17.9 Summary

Acknowledgments

References

Chapter 18 - Nuclear Magnetic Resonance Simulations

Alan J. Benesi, Kenneth M. Merz, James J. Vincent, Ravi Subramanya

18.1 Introduction

18.2 Scientific Considerations

18.3 Description of the Application

18.4 Computational Considerations

18.4.1 Algorithmic Considerations

18.4.2 Programming Considerations

18.5 Computational Results

18.6 Scientific Results

18.6.1 Validation of Simulation

18.6.2 Interesting Scientific Results

18.7 Summary

Acknowledgments

References

Chapter 19 - Molecular Dynamics Simulations Using Particle-Mesh Ewald Methods

Michael F. Crowley, David W. Deerfield II, Tom A. Darden, Thomas E. Cheatham III

19.1 Introduction: Industrial Considerations

19.1.1 Overview

19.1.2 Cutoff Problem for Long-Distance Forces

19.1.3 Particle-Mesh Ewald Method

19.2 Computational Considerations

19.2.1 Why Use an MPP?

19.2.2 Parallel PME

19.2.3 Coarse-Grain Parallel PME

19.3 Computational Results

19.3.1 Performance

19.3.2 Parallel 3D FFT and Groups

19.4 Industrial Strength Results

19.5 The Future

19.6 Summary

References

Chapter 20 - Radar Scattering and Antenna Modeling

Tom Cwik, Cinzia Zuffada, Daniel S. Katz, Jay Parker

20.1 Introduction

20.2 Electromagnetic Scattering and Radiation

20.2.1 Formulation of the Problem

20.2.2 Why This Formulation Addresses the Problem

20.3 Finite Element Modeling

20.3.1 Discretization of the Problem

20.3.2 Why Use a Scalable MPP?

20.4 Computational Formulation and Results

20.4.1 Constructing the Matrix Problem

20.4.2 Beginning the Matrix Solution

20.4.3 Completing the Solution of the Matrix Problem

20.4.4 The Three Stages of the Application

20.5 Results for Radar Scattering and Antenna Modeling

20.5.1 Anisotropic Scattering

20.5.2 Patch Antennas-Modeling Conformal Antennas with PHOEBE

20.6 Summary and Future Challenges

Acknowledgments

References

Chapter 21 - Functional Magnetic Resonance Imaging Dataset Analysis

Nigel H. Goddard, Greg Hood, Jonathan D. Cohen, Leigh E. Nystrom, William F. Eddy, Christopher R. Genovese, Douglas C. Noll

21.1 Introduction

21.2 Industrial Considerations

21.2.1 Overview

21.2.2 Description of the Application

21.2.3 Parallelization and the Online Capability

21.3 Computational Considerations

21.3.1 Why Use an MPP?

21.3.2 Programming Considerations

21.3.3 Algorithm Considerations

21.4 Computational Results

21.4.1 Performance

21.4.2 Subsidiary Technologies

21.5 Clinical and Scientific Results

21.5.1 Supercomputing '96 Demonstration

21.5.2 Science Application

21.5.3 What Are the Next Problems to Tackle?

21.6 Summary

Acknowledgments

References

Chapter 22 - Selective and Sensitive Comparison of Genetic Sequence Data

Alexander J. Ropelewski, Hugh B. Nicholas, Jr., David W. Deerfield II

22.1 Introduction

22.2 Industrial Considerations

22.2.1 Overview/Statement of the Problem

22.3 Approaches Used to Compare Sequences

22.3.1 Visualization of Sequence Comparison

22.3.2 Basic Sequence-Sequence Comparison Algorithm

22.3.3 Basic Sequence-Profile Comparison Algorithm

22.3.4 Other Approaches to Sequence Comparison

22.4 Computational Considerations

22.4.1 Why Use an MPP?

22.4.2 Programming Considerations

22.4.3 Algorithm Considerations

22.5 Computational Results

22.5.1 Performance

22.6 Industrial and Scientific Considerations

22.7 The Next Problems to Tackle

22.8 Summary

Acknowledgments

References

Chapter 23 - Interactive Optimization of Video Compression Algorithms

Henri Nicolas, Fred Jordan

23.1 Introduction

23.2 Industrial Considerations

23.3 General Description of the System

23.3.1 General Principle

23.3.2 Main Advantages Offered by Direct View

23.4 Parallel Implementation

23.4.1 Remarks

23.5 Description of the Compression Algorithm

23.6 Experimental Results

23.7 Summary

Acknowledgments

References

PART III - Conclusions and Predictions

Chapter 24 - Designing Industrial Parallel Applications

Alice E. Koniges, David C. Eder, Michael A. Heroux

24.1 Design Lessons from the Applications

24.1.1 Meso- to Macroscale Environmental Modeling

24.1.2 Applied Fluid Dynamics

24.1.3 Applied Plasma Dynamics

24.1.4 Material Design and Modeling

24.1.5 Data Analysis

24.2 Design Issues

24.2.1 Code Conversion Issues

24.2.2 The Degree of Parallelism in the Application

24.3 Additional Design Issues

Chapter 25 - The Future of Industrial Parallel Computing

Michael A. Heroux, Horst Simon, Alice E. Koniges

25.1 The Role of Parallel Computing in Industry

25.2 Microarchitecture Issues

25.2.1 Prediction

25.2.2 Discussion

25.3 Macroarchitecture Issues

25.3.1 Prediction

25.3.2 Discussion

25.4 System Software Issues

25.4.1 Prediction

25.4.2 Discussion

25.5 Programming Environment Issues

25.5.1 Prediction

25.5.2 Discussion

25.6 Applications Issues

25.6.1 Prediction

25.6.2 Discussion

25.7 Parallel Computing in Industry

25.7.1 Area 1: Parallel Execution of a Single Analysis: Incompressible CFD Analysis

25.7.2 Area 2: Design Optimization: Noise, Vibration, Harshness (NVH) Analysis

25.7.3 Area 3: Design Studies: Crash Design Optimization

25.7.4 Area 4: Interactive, Intuitive, Immersive Simulation Environments: Large-Scale Particle Tracing

25.8 Looking Forward: The Role of Parallel Computing in the Digital Information Age

25.8.1 Increasing the Demand for Parallel Computing

25.8.2 The Importance of Advanced User Interfaces

25.8.3 Highly Integrated Computing

25.8.4 The Future

References

Appendix: Mixed Models with Pthreads and MPI

Vijay Sonnad, Chary G. Tamirisa, Gyan Bhanot

Glossary

Index

Contributors

## Details

- No. of pages: 597
- Language: English
- Copyright: © Morgan Kaufmann 2000
- Published: 25th October 1999
- Imprint: Morgan Kaufmann
- eBook ISBN: 9780080495385

## About the Editor

### Alice Koniges

Alice E. Koniges is an internationally known authority on parallel application development. As leader of the Parallel Applications Technology Program at Lawrence Livermore National Laboratory, she directed researchers in the largest set of agreements between industries and national laboratories ever funded by the US Department of Energy. She has served as a consultant to the Max Planck Institutes of Germany on parallelization and high performance computing issues. She was the first woman to be awarded a Ph.D. in Applied Mathematics from Princeton University.
