Problems from Chapter 3
9/15/11
3.2
A
Max. degree of concurrency
Critical path length
Max. achievable speedup (no limit on processes)
Min. number of processes needed for max. speedup
Max. speedup with number of processes = 2
Max. speedup with number of processes = 4
Max. speedup with number of processes = 8
B
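The quantities in this table follow from the task-dependency graph. As a minimal sketch, assuming a small hypothetical graph (not graph A or B from Problem 3.2) with unit task weights, the critical path length and the maximum achievable speedup (total work divided by critical path length) can be computed like this. The names `tasks` and `critical_path_length` are illustrative only.

```python
from functools import lru_cache

# Hypothetical task graph: task -> (work, list of predecessor tasks).
# This is an assumed example, not one of the graphs in the problem.
tasks = {
    "a": (1, []),
    "b": (1, []),
    "c": (1, ["a", "b"]),
    "d": (1, ["c"]),
}

def critical_path_length(tasks):
    """Length of the longest weighted path through the task graph."""
    @lru_cache(maxsize=None)
    def finish(t):
        work, preds = tasks[t]
        # A task can finish only after its slowest predecessor finishes.
        return work + max((finish(p) for p in preds), default=0)
    return max(finish(t) for t in tasks)

total_work = sum(work for work, _ in tasks.values())
cpl = critical_path_length(tasks)
# With no limit on processes, speedup is bounded by total work / critical path.
max_speedup = total_work / cpl
print(total_work, cpl, max_speedup)
```

The sketch deliberately leaves out the maximum degree of concurrency, which needs the graph's level structure rather than just path lengths.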
10/13/11
Example 5.15
Cost-optimality and the Isoefficiency Function
Minimum Execution Time, Tpmin
Finding minimum cost-optimal Exec. Time, Tpcost-opt
Problem 5.3 from Chapter 5
W, # of tasks
pmax = c(W)
Critical Path
Smax, max speedup
TP, at p= pmax/2
Sp
CS 6643 F'11
Synchronization
Synchronization of several processes using
barrier synchronization
use mutexes, semaphores, locks, etc. to control
access to shared variables
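Both mechanisms can be illustrated with Python's standard `threading` module. This is a sketch under assumptions: the worker count, the shared `counter`, and the `worker` function are hypothetical, and threads stand in for the processes discussed in the lecture.

```python
import threading

NUM_WORKERS = 4          # hypothetical number of workers
counter = 0              # shared variable
lock = threading.Lock()  # controls access to the shared variable
barrier = threading.Barrier(NUM_WORKERS)
seen_after_barrier = []

def worker(rank):
    global counter
    for _ in range(1000):
        with lock:       # mutual exclusion around the shared update
            counter += 1
    barrier.wait()       # nobody proceeds until all workers arrive
    with lock:
        # Past the barrier, every worker has finished its updates.
        seen_after_barrier.append(counter)

threads = [threading.Thread(target=worker, args=(r,)) for r in range(NUM_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter, seen_after_barrier)
```

Without the lock, concurrent updates to `counter` could be lost; without the barrier, a worker could read `counter` before the others had finished.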
Communication
Communication among processes through
reads and writes of share
CS 6643 F '11 Lec 22
The latter case is preferred. So if i < j, the elements in Pi ≤ the
elements in Pj
Output: sorted sequence is placed in the memory of one
process or distributed evenly among all processes
We assume the latter
Input: stored in on
CS 6643 F '11 Lec 23
Distribute vector x so that
each process has the entire x
Each process performs dot
product to get an element of
the result vector
Steps
Process i, 1 ≤ i ≤ n, has row i of
matrix A and element i of
vector x
Dot product of two vectors i
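The row-wise scheme above can be sketched as a sequential simulation. This is an illustration under assumptions, not the lecture's code: the function name `matvec_rowwise` is hypothetical, the list `local_x` stands in for the per-process copies of x, and a real implementation would use message passing for the distribution step.

```python
def matvec_rowwise(A, x):
    """Simulate n processes, where process i owns row i of A and element i of x."""
    n = len(A)
    # Distribution step: after it, every process holds the entire vector x.
    local_x = [list(x) for _ in range(n)]
    # Computation step: process i forms the dot product of its row with x,
    # which yields element i of the result vector.
    return [sum(a * b for a, b in zip(A[i], local_x[i])) for i in range(n)]

A = [[1, 2],
     [3, 4]]
x = [1, 1]
y = matvec_rowwise(A, x)
print(y)
```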
CS 6643 F '11 Lec 26
G=(V,E,w) is a weighted graph such that w: E -> R is a real-valued
function
A forest is an acyclic graph
A tree is a connected acyclic graph
A weighted graph has weights for each edge
Every vertex is adjacent to all other vertices
G is c
CS6643 F '11 Lec28
Search directories, subdirectories for
documents (look for .html, .txt, .tex, etc.)
Using a dictionary of key words, create a
profile vector for each document
Store profile vectors
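The steps above can be sketched sequentially as follows. The keyword list, the function names, and the choice of raw keyword counts as the profile are all assumptions made for illustration; the lecture may define the profile vector differently.

```python
import re
from pathlib import Path

KEYWORDS = ["parallel", "process", "matrix"]  # hypothetical dictionary
EXTENSIONS = {".html", ".txt", ".tex"}        # file types to look for

def find_documents(root):
    """Search a directory and its subdirectories for candidate documents."""
    return sorted(p for p in Path(root).rglob("*")
                  if p.is_file() and p.suffix in EXTENSIONS)

def profile_vector(text, keywords=KEYWORDS):
    """Count how often each dictionary keyword occurs in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return [words.count(k) for k in keywords]

def build_profiles(root):
    """Create and store one profile vector per document found under root."""
    return {str(p): profile_vector(p.read_text(errors="ignore"))
            for p in find_documents(root)}

print(profile_vector("Parallel process; parallel matrix!"))
```

Each document can be profiled independently, which is what makes the problem a natural candidate for parallelization.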
Document Classification Problem
Document Classificati
Fall 2011
CS 6643 Parallel Processing
Practice Questions
Answer all questions. Be concise, specific, and give quantitative justifications where needed.
Note: p is number of processors; m is message size; ts is startup overhead; tw is per word transfer tim
Analytical Modeling of Parallel System Performance
10/11/11
Amdahl's Law
Gives a quick upper bound on the reduction of exec. time for a given size problem with p increasing
Ignores comm. => may overestimate speedup
Ignores Problem size scaling => may unde
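As a small worked illustration of the bound, assume a fraction f of the work is inherently serial (the symbol f and the function name are assumptions, not notation from the slide). Amdahl's Law then gives a speedup of 1 / (f + (1 - f) / p), which can never exceed 1 / f.

```python
def amdahl_speedup(serial_fraction, p):
    """Upper bound on speedup with p processors for a fixed problem size."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / p)

f = 0.1  # assumed serial fraction
for p in (2, 10, 100, 1000):
    print(p, round(amdahl_speedup(f, p), 3))
```

As p grows, the speedup approaches but never reaches 1 / f = 10, because the formula models no communication cost and keeps the problem size fixed.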
CS 6643 Parallel Processing
Homework 3 Solutions
4.2
Fall 2011
10/4/11
Solved in the class.
(b) Time taken = ts lg p + (p-1) m tw
Step i takes ts + 2^i m tw, 0 <= i < lg p:
Link between nodes 2^i - 1 and 2^i is used by 2^i messages.
Improvement: Half the message
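The total in (b) can be checked numerically. The sketch below reads the per-step cost as ts + 2^i m tw and assumes p is a power of two; the function names and the sample parameter values are hypothetical.

```python
def summed_step_time(p, m, ts, tw):
    """Add up the per-step costs ts + 2^i * m * tw for 0 <= i < lg p."""
    lg_p = p.bit_length() - 1  # exact log2 when p is a power of two
    return sum(ts + (2 ** i) * m * tw for i in range(lg_p))

def closed_form_time(p, m, ts, tw):
    """Closed form from the solution: ts lg p + (p - 1) m tw."""
    lg_p = p.bit_length() - 1
    return ts * lg_p + (p - 1) * m * tw

# Assumed sample values: p = 8 processors, m = 4 words, ts = 10, tw = 2.
print(summed_step_time(8, 4, 10, 2), closed_form_time(8, 4, 10, 2))
```

The two agree because the geometric series 2^0 + 2^1 + ... + 2^(lg p - 1) sums to p - 1.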
Both techniques are closely related to each other
CS 6643 F11 Lec01 3
Interprocessor communication cost is high and is not used
frequently
Problem can be decomposed into several relatively independent
units of work
Transaction processing
Distributed
Parallelism extracted by compiler
Easier for programmer
Hard for compiler writer
Parallelism is not fully exploited
In practice, combination of the two
Implicit Parallel Programming
Parallel part spelled out by programmer
Hard for programmer
Explicit
CS6643 F'11 Lec05 3
A task-dependency graph is influenced by decomposition
of computation into tasks and organization of tasks
Note: dependencies limit concurrency
A task-dependency graph with very few or no directed edges is
ideal for paralle
Boundary value problem
Finding the maximum
The n-body problem
Case Studies
Algorithm Design Case Studies
Ice water
Rod
Insulation
Boundary Value Problem
Foster's Design Methodology
CS6643 F'11 Lec05 2
Identify communication pattern between primitive ta
CS 6643 Fall '11 Lec08
OAB: a process sends some data to all other processes
Reduction: the data from all processes are combined through an
associative operator and accumulated at a single destination
process
One-to-All Broadcast (OAB)
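Both operations can be simulated sequentially. The sketch below assumes p is a power of two and uses a recursive-doubling pattern, which is one common way to implement them and not necessarily the scheme the lecture goes on to present. The function names are hypothetical, and the reduction assumes the operator is commutative as well as associative.

```python
import operator

def one_to_all_broadcast(p, source, value):
    """Simulate a broadcast from `source` to p = 2^d processes in d steps."""
    d = p.bit_length() - 1
    data = [None] * p
    data[source] = value
    for i in range(d):
        # Every process that already holds the data sends it to the partner
        # whose rank differs in bit i, so the number of holders doubles.
        holders = [n for n in range(p) if data[n] is not None]
        for n in holders:
            data[n ^ (1 << i)] = data[n]
    return data

def all_to_one_reduction(values, op):
    """Combine one value per process with `op` and accumulate at process 0."""
    p = len(values)
    d = p.bit_length() - 1
    acc = list(values)
    for i in range(d):
        for r in range(p):
            # Ranks with bit i set and lower bits clear send their partial
            # result to the partner across bit i, which combines it.
            if r % (1 << (i + 1)) == (1 << i):
                receiver = r ^ (1 << i)
                acc[receiver] = op(acc[receiver], acc[r])
    return acc[0]

print(one_to_all_broadcast(8, 3, "x"))
print(all_to_one_reduction([1, 2, 3, 4], operator.add))
```

The reduction follows the broadcast's communication pattern in reverse, which is why the two operations are usually treated together.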
Rajendra V. Boppa
CS6643 F'11 Lec12
Prime Numbers
A prime number is
Greater than 1
Can be divided without a remainder by itself and 1 only
All numbers greater than 1 that are not prime are called composite
numbers
2 is the smallest prime and the only even prime
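The definition translates directly into a trial-division test. This is a minimal sketch for illustration, not an efficient method for finding many primes; the function name `is_prime` is hypothetical.

```python
def is_prime(n):
    """Return True if n is greater than 1 and divisible only by 1 and itself."""
    if n <= 1:
        return False
    d = 2
    # Any composite n has a divisor no larger than its square root.
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

primes = [n for n in range(1, 20) if is_prime(n)]
print(primes)
```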
Slides adapted from M.J.