Analyzing parallel programs
CPS343
Parallel and High Performance Computing
Spring 2013
Outline
1 Analyzing Parallel Programs
Speedup
Amdahl's Law and Gustafson-Barsis's Law
Evaluating

MPI Derived Datatypes
CPS343
Parallel and High Performance Computing
Spring 2013
Outline
1 Motivating Example
Cartesian Grids
2 MPI Datatypes
Datatypes in MPI
Derived Datatypes in MPI
3
Ex

Parallel Alternating Direction Implicit Solver for the Two-Dimensional
Heat Diffusion Problem on Graphics Processing Units
Khor Shu Heng
Engineering Science Programme
National University of Singapore
Abstract
This paper presents a parallel alternating dir

A BRIEF OVERVIEW OF CHAPEL¹
(PRE-PRINT OF AN UPCOMING BOOK CHAPTER)
Bradford L. Chamberlain, Cray Inc.
January 2013
revision 1.0
¹ This material is based upon work supported by the Defense Advanced Research Projects Agency under its Agreement No. HR

CPS343: Homework on Computing the Mandelbrot Set using CUDA
Homework on Computing the Mandelbrot Set using CUDA
The Mandelbrot Set
The Mandelbrot Set is the set of all complex numbers c for which the iteration
$$z_{k+1} = z_k^2 + c, \qquad z_0 = 0 \tag{1}$$
remains bounded

CPS343: Homework on BLAS and HDF files
Homework on BLAS and HDF files
Refer to the HDF User's Guide and HDF Reference Manual (both at
http://www.hdfgroup.org/HDF5/doc/index.html) and the DGEMM manual page as needed in carrying
out the following two problems.

CPS343: Homework on Foster PCAM
Homework on Foster PCAM
Due: Friday February 15
Problem 1. In section 2.3.2 (Global Communication) of Foster you read about various
approaches to summing a set of numbers that is distributed among the tasks. This is an
ex

CPS343: Homework on Matrix-vector multiplication
Homework on Matrix-vector multiplication
Problem 1. Before we begin looking at parallel programming, it will first be helpful to
think about a few issues that impact performance in general. One common operat

CPS343: Homework on Quinn Chapter 1
Homework on Quinn Chapter 1
Problem 1. Quinn Chapter 1 Exercise 1.3. You have been assigned the task of computing
the sum of 1,000 four-digit numbers as rapidly as possible. You hold in your hands a stack
of 1,000 ind

CPS343: Homework on Quinn Chapter 7
Homework on Quinn Chapter 7
Problem 1. Quinn Chapter 7 Exercise 7.4. Benchmarking of a sequential program reveals
that 95% of the execution time is spent inside functions that are amenable to parallelization.
What is