
# cs240a-pprefix - Parallel Prefix Algorithms or Tricks with...


Parallel Prefix Algorithms, or Tricks with Trees

Some slides from Jim Demmel, Kathy Yelick, Alan Edelman, and a cast of thousands …

## Parallel Vector Operations

- Vector add: $z = x + y$
  - Embarrassingly parallel if vectors are aligned
- DAXPY: $z = a \cdot x + y$ (a is scalar)
  - Broadcast a, followed by independent * and +
- DDOT: $s = x^T y = \sum_j x[j] \cdot y[j]$
  - Independent * followed by + reduction
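
The three operations above can be sketched in plain Python. This is only an illustration of which parts are independent per element and which part is a reduction. The function names `vector_add`, `daxpy`, and `ddot` are chosen here for the example and are not taken from any particular library.

```python
def vector_add(x, y):
    # Each z[j] depends only on x[j] and y[j]: embarrassingly parallel.
    return [xj + yj for xj, yj in zip(x, y)]


def daxpy(a, x, y):
    # The scalar a is shared by every element (the "broadcast");
    # after that, each multiply-add is independent.
    return [a * xj + yj for xj, yj in zip(x, y)]


def ddot(x, y):
    # Independent multiplies, followed by a + reduction over the products.
    products = [xj * yj for xj, yj in zip(x, y)]
    return sum(products)


x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]
print(vector_add(x, y))   # [5.0, 7.0, 9.0]
print(daxpy(2.0, x, y))   # [6.0, 9.0, 12.0]
print(ddot(x, y))         # 32.0
```

The code runs serially; the comments only mark where a parallel implementation could split the work.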
## Broadcast and reduction

- Broadcast of 1 value to p processors in log p time
- Reduction of p values to 1 in log p time
- Takes advantage of associativity in +, *, min, max, etc.

[Figure: broadcast of a value a; add-reduction of 1, 3, 1, 0, 4, -6, 3, 2 to 8]
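
A minimal sketch of why a reduction needs only about log p rounds: in each round, neighbouring values are combined pairwise, so the number of values halves. The helper name `tree_reduce` is hypothetical, and the code simulates the rounds serially rather than running on p processors. It assumes the operator is associative and the input is non-empty.

```python
import operator


def tree_reduce(values, op=operator.add):
    """Combine values pairwise, round by round; return (result, rounds)."""
    level = list(values)
    if not level:
        raise ValueError("tree_reduce needs at least one value")
    rounds = 0
    while len(level) > 1:
        # All pairs in one round are independent and could run in parallel.
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            nxt.append(level[-1])  # an unpaired last value is carried forward
        level = nxt
        rounds += 1
    return level[0], rounds


total, rounds = tree_reduce([1, 3, 1, 0, 4, -6, 3, 2])
print(total, rounds)  # 8 3
```

With 8 values the result is reached in 3 rounds, matching log2(8). Passing `max` or `min` as `op` works the same way because those operators are associative too.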

## Parallel Prefix Algorithms

- A theoretical secret for turning serial into parallel
- Surprising parallel algorithms: if “there is no way to parallelize this algorithm!” … it’s probably a variation on parallel prefix!
## Example of a prefix

Sum prefix:

- Input: $x = (x_1, x_2, \ldots, x_n)$
- Output: $y = (y_1, y_2, \ldots, y_n)$, where $y_i = \sum_{j=1}^{i} x_j$

Example:

- x = (1, 2, 3, 4, 5, 6, 7, 8)
- y = (1, 3, 6, 10, 15, 21, 28, 36)

Prefix functions: outputs depend upon an initial string.
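
As a quick check of the definition, Python's standard `itertools.accumulate` computes exactly this kind of running result. The sketch below reproduces the example and, as an added illustration not taken from the slides, shows the same call with `max` in place of addition.

```python
from itertools import accumulate

x = [1, 2, 3, 4, 5, 6, 7, 8]

# Sum prefix: y[i] is the sum of x[0..i].
y = list(accumulate(x))
print(y)  # [1, 3, 6, 10, 15, 21, 28, 36]

# The same pattern with another associative operator (running maximum).
running_max = list(accumulate([3, 1, 4, 1, 5], max))
print(running_max)  # [3, 3, 4, 4, 5]
```

`accumulate` is a serial routine; it is used here only to pin down what output a parallel prefix algorithm must produce.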

## What do you think?

Can we really parallelize this? It looks like this kind of code:

    y(0) = 0;
    for i = 1:n
        y(i) = y(i-1) + x(i);

The ith iteration of the loop depends completely on the (i-1)st iteration.

Work = n, span = n, parallelism = 1.

Impossible to parallelize, right?
## A clue?

- x = (1, 2, 3, 4, 5, 6, 7, 8)
- y = (1, 3, 6, 10, 15, 21, 28, 36)

Questions to consider:

- Is there any value in adding, say, 4+5+6+7?
- If we separately have 1+2+3, what can we do?
- Suppose we added 1+2, 3+4, etc. pairwise. What could we do?

## Prefix sum in parallel

- Input: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
- Pairwise sums: 3 7 11 15 19 23 27 31
- (Recursively compute prefix sums): 3 10 21 36 55 78 105 136
- Output: 1 3 6 10 15 21 28 36 45 55 66 78 91 105 120 136

Algorithm:

1. Pairwise sum
2. Recursive prefix
3. Pairwise sum
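
The three steps can be sketched as a short recursive function. This is one possible rendering, not the slides' own code. The name `parallel_prefix` is hypothetical, the list comprehension and loop stand in for operations that would run in parallel, and indices are 0-based, so the positions filled directly from the recursive result are the odd indices here.

```python
def parallel_prefix(x):
    """Prefix sums via pairwise sum, recursive prefix, pairwise update."""
    n = len(x)
    if n <= 1:
        return list(x)

    # Step 1: pairwise sums (all independent of each other).
    pairs = [x[i] + x[i + 1] for i in range(0, n - 1, 2)]

    # Step 2: recursive prefix on the half-length sequence.
    pref = parallel_prefix(pairs)

    # Step 3: fill in the output (each position is independent).
    y = [None] * n
    y[0] = x[0]
    for i in range(1, n):
        if i % 2 == 1:
            # Ends on a pair boundary: taken directly from the recursion.
            y[i] = pref[i // 2]
        else:
            # One element past a pair boundary: add x[i] to the previous prefix.
            y[i] = pref[i // 2 - 1] + x[i]
    return y


print(parallel_prefix(list(range(1, 17))))
# [1, 3, 6, 10, 15, 21, 28, 36, 45, 55, 66, 78, 91, 105, 120, 136]
```

The sketch also handles lengths that are not powers of two, because an unpaired last element is picked up in step 3.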
## Parallel prefix cost

What’s the total work?

- Input: 1 2 3 4 5 6 7 8
- Pairwise sums: 3 7 11 15
- Recursive prefix: 3 10 21 36
- Update “odds”: 1 3 6 10 15 21 28 36

Cost:

- Work: $T_1(n) = n/2 + n/2 + T_1(n/2) = n + T_1(n/2) = 2n - 1$
- Span: $T_\infty(n) = 2 \log n$

Parallelism at the cost of more work!

(Slide dated 12/27/11.)
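
The closed forms can be checked by unrolling the recurrences. The derivation below is a sketch under assumptions not stated on the slide: $n = 2^k$ is a power of two, the base cases are $T_1(1) = 1$ and $T_\infty(1) = 0$ (chosen so the slide's closed forms come out exactly), and the span recurrence charges one parallel step for the pairwise sums and one for the update.

$$
\begin{aligned}
T_1(n) &= n + T_1(n/2) = n + \frac{n}{2} + \frac{n}{4} + \cdots + 2 + T_1(1) = (2n - 2) + 1 = 2n - 1, \\
T_\infty(n) &= 2 + T_\infty(n/2) = 2\log_2 n + T_\infty(1) = 2\log_2 n, \\
\frac{T_1(n)}{T_\infty(n)} &= \frac{2n - 1}{2\log_2 n} \approx \frac{n}{\log_2 n}.
\end{aligned}
$$

Compared with the serial loop (work $n$, span $n$), the work roughly doubles while the span drops from $n$ to $2 \log_2 n$.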
