Parallel Algorithms, Part 2
15-211: Fundamental Data Structures and Algorithms
Margaret Reid-Miller
20 April 2010

Announcements
  Quiz 6: Wednesday, 21 April in recitation
  HW6 Theory, due Tuesday 27 in class
  Quiz 7: Wednesday, 28 April in recitation
  HW6 Chess, due Wednesday
  Final Exam Review Session: WH 7500, Sunday, May 2, 2:00-4:00 pm
  Final Exam: UC McConomy, Tuesday, May 4, 1:00-4:00 pm
Parallel algorithms so far
  Sum: doubling technique
  Prefix Sum: all partial (running) sums on an array
  Quicksort
    Recursive calls - “task parallel”
    Partition with pivot - “data parallel”, uses prefix sum
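To make the recap concrete, here is a minimal Python sketch of a prefix sum computed by stride doubling, and of a pivot partition built on top of it. It is a sequential simulation under stated assumptions, not the lecture's own code: the function names `prefix_sum_doubling` and `partition_less` are hypothetical, and the doubling scheme shown is one common reading of the "doubling technique" (it takes about log2(n) rounds but does O(n log n) work in total).

```python
def prefix_sum_doubling(values):
    """Inclusive prefix sums by stride doubling (sequential simulation)."""
    result = list(values)
    stride = 1
    while stride < len(result):
        # Each round reads only the previous round's array, so on a
        # parallel machine every position could be updated at once.
        previous = result
        result = [
            previous[i] + previous[i - stride] if i >= stride else previous[i]
            for i in range(len(previous))
        ]
        stride *= 2
    return result


def partition_less(values, pivot):
    """Collect the elements smaller than pivot, keeping their order.

    Hypothetical helper: flags mark the elements to keep, and the prefix
    sum of the flags gives each kept element its output position.
    """
    flags = [1 if v < pivot else 0 for v in values]
    positions = prefix_sum_doubling(flags)
    total = positions[-1] if positions else 0
    out = [None] * total
    # Every write targets a distinct slot, so the loop is data parallel.
    for i, v in enumerate(values):
        if flags[i]:
            out[positions[i] - 1] = v
    return out
```

For example, `prefix_sum_doubling([3, 1, 4, 1, 5])` yields `[3, 4, 8, 9, 14]`, and `partition_less([5, 2, 8, 1, 9, 3], 4)` yields `[2, 1, 3]`. The same flag-and-scan step, repeated for the elements equal to and greater than the pivot, would give the full quicksort partition.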

PRAM – Parallel RAM
  Multiple processors that access shared memory
  [Figure: processors P1, P2, P3, ..., Pn connected to a shared memory]
Parallel programming model
  Nested-parallel programming constructs:
    Parallel loop: Apply an expression to each element in parallel (for … in parallel)
    Parallel do: Evaluate several tasks in parallel (fork-join)
  Leads to series-parallel DAGs
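The two constructs can be imitated with Python's standard `concurrent.futures` module. This is only a sketch of their structure: the helper names `parallel_loop` and `parallel_do` are hypothetical, and Python threads show the fork-join shape rather than real speedup for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_loop(func, items, executor):
    """Parallel loop: apply func to every element, results in input order."""
    return list(executor.map(func, items))


def parallel_do(executor, *tasks):
    """Parallel do: fork each zero-argument task, then join on all of them."""
    futures = [executor.submit(task) for task in tasks]  # fork
    return [future.result() for future in futures]       # join
```

A possible use, assuming a small pool of four workers:

```python
with ThreadPoolExecutor(max_workers=4) as pool:
    squares = parallel_loop(lambda x: x * x, [1, 2, 3, 4], pool)
    total, largest = parallel_do(pool, lambda: sum(range(10)), lambda: max([3, 7, 2]))
```

Calling `parallel_do` from inside a task running on the same bounded pool can block when all workers are waiting on children, so this sketch keeps the two constructs unnested.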

Programming Languages
  Several programming languages support series-parallel algorithms:
    NESL – functional (CMU)
    Cilk++ – C++ extension (Intel)
      (Aside: Cilkchess was the winner of the Dutch open computer chess competition in 1996)
    TBB – Threading Building Blocks (Intel)
    CUDA – for GPUs (Nvidia)
    OpenMP – widely used, but not well suited to divide-and-conquer (D&C)
Parallel Algorithm Performance Metrics
  Definition: Work W(n) is the total number of operations in the algorithm. It is the time to execute the parallel algorithm on a sequential machine.
  Definition: Depth D(n) is the length of the longest chain of dependencies among its operations.
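As an illustration of the two definitions, the following Python sketch sums a list by divide and conquer and counts its own work and depth. It assumes that only additions count as operations, and the function name `tree_sum` is hypothetical.

```python
def tree_sum(values):
    """Return (total, work, depth) for a divide-and-conquer sum.

    Assumption: work and depth count additions only.
    """
    n = len(values)
    if n == 0:
        return 0, 0, 0
    if n == 1:
        return values[0], 0, 0
    mid = n // 2
    left_total, left_work, left_depth = tree_sum(values[:mid])
    right_total, right_work, right_depth = tree_sum(values[mid:])
    # The two halves are independent, so their work adds up while their
    # depths overlap; the final addition costs one more unit of each.
    return (
        left_total + right_total,
        left_work + right_work + 1,
        max(left_depth, right_depth) + 1,
    )
```

For the eight values 1 through 8, `tree_sum` reports a total of 36 with work 7 and depth 3: the work is n - 1 additions, while the longest dependency chain has length log2(n).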

