cs240a-cilkapps

CS 240A: Examples with Cilk++

Thanks to Charles E. Leiserson for some of these slides.

- Divide & Conquer Paradigm for Cilk++
- Solving recurrences
- Sorting: Quicksort and Mergesort
- Graph traversal: Breadth-First Search

Work and Span (Recap)

- $T_P$ = execution time on $P$ processors
- $T_1$ = work
- $T_\infty$ = span*
- Speedup on $P$ processors: $T_1 / T_P$
- Parallelism: $T_1 / T_\infty$

* Also called critical-path length or computational depth.

Sorting

- Sorting is possibly the most frequently executed operation in computing!
- Quicksort is often the fastest sorting algorithm in practice, with an average running time of $O(N \log N)$ but a worst-case running time of $O(N^2)$.
- Mergesort has a worst-case running time of $O(N \log N)$ for sorting $N$ elements.
- Both are based on the recursive divide-and-conquer paradigm.

QUICKSORT

Basic quicksort, sorting an array $S$, works as follows:

1. If the number of elements in $S$ is 0 or 1, then return.
2. Pick any element $v$ in $S$. Call this the pivot.
3. Partition the set $S - \{v\}$ into two disjoint groups (an element equal to $v$ may go to either group, but only one):
   $S_1 = \{x \in S - \{v\} \mid x \le v\}$
   $S_2 = \{x \in S - \{v\} \mid x \ge v\}$
4. Return quicksort($S_1$), followed by $v$, followed by quicksort($S_2$).
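
The four steps above can be sketched directly in C++. This is a minimal illustration, not the course's code: the function name `quicksort`, the choice of the middle element as pivot, and the convention that S1 takes elements no larger than the pivot while S2 takes the rest are all assumptions made for the example. It copies elements into new vectors for clarity instead of sorting in place.

```cpp
#include <cstddef>
#include <vector>

// Serial quicksort written to mirror the four steps literally.
// Hypothetical helper for illustration; it copies rather than sorting in place.
std::vector<int> quicksort(const std::vector<int>& s) {
    // Step 1: zero or one element is already sorted.
    if (s.size() <= 1) return s;

    // Step 2: pick any element as the pivot v (here the middle one, an arbitrary choice).
    const std::size_t pivot_index = s.size() / 2;
    const int v = s[pivot_index];

    // Step 3: split S - {v} into S1 (x <= v) and S2 (x > v).
    std::vector<int> s1, s2;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (i == pivot_index) continue;  // leave out the pivot itself
        (s[i] <= v ? s1 : s2).push_back(s[i]);
    }

    // Step 4: quicksort(S1), then v, then quicksort(S2).
    std::vector<int> result = quicksort(s1);
    result.push_back(v);
    const std::vector<int> right = quicksort(s2);
    result.insert(result.end(), right.begin(), right.end());
    return result;
}
```

Because the pivot is removed before recursing, each recursive call works on a strictly smaller input, so the recursion always terminates.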

QUICKSORT: Select Pivot

13 21 34 56 32 31 45 78 14

QUICKSORT: Partition around Pivot

13 21 34 56 32 31 45 78 14   (pivot: 34)
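
As a rough sketch of the partition step, the standard `std::partition` algorithm can rearrange the array around the pivot value 34. The helper name `partition_around` is hypothetical, and using "strictly smaller than the pivot" as the split rule is an assumption; with that rule the pivot itself lands somewhere in the right-hand group.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Partition `a` in place around the value `pivot`.
// Returns the index of the first element that is not smaller than the pivot:
// everything before that index is < pivot, everything from it onward is >= pivot.
std::size_t partition_around(std::vector<int>& a, int pivot) {
    auto middle = std::partition(a.begin(), a.end(),
                                 [pivot](int x) { return x < pivot; });
    return static_cast<std::size_t>(middle - a.begin());
}
```

For the array 13 21 34 56 32 31 45 78 14 and pivot 34, five elements (13, 21, 32, 31, 14) are smaller than the pivot, so the returned index is 5. The order of elements within each group is not specified by `std::partition`.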

QUICKSORT: Quicksort recursively

13 14 21 32 31 45 56 78 34

Parallelizing Quicksort

Serial quicksort sorts an array $S$ as follows:

1. If the number of elements in $S$ is 0 or 1, then return.
2. Pick any element $v$ in $S$. Call this the pivot.
3. Partition the set $S - \{v\}$ into two disjoint groups:
   $S_1 = \{x \in S - \{v\} \mid x \le v\}$
   $S_2 = \{x \in S - \{v\} \mid x \ge v\}$
4. Return quicksort($S_1$), followed by $v$, followed by quicksort($S_2$).

Do the two recursive sorts in step 4 have to run one after the other? Not necessarily so!

Parallel Quicksort (Basic)

    template <typename T>
    void qsort(T begin, T end) {
      if (begin != end) {
        // Move every element smaller than the pivot *begin to the front.
        T middle = partition(
            begin, end,
            bind2nd(less<typename iterator_traits<T>::value_type>(), *begin));
        cilk_spawn qsort(begin, middle);     // left part may run in parallel
        qsort(max(begin + 1, middle), end);  // right part runs in this strand
        cilk_sync;                           // wait for the spawned call
      }
    }

- The second recursive call to qsort does not depend on the results of the first recursive call.
- We have an opportunity to speed up the sort by making both calls in parallel.
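
The `cilk_spawn` and `cilk_sync` keywords need a Cilk-enabled compiler. For readers without one, the same fork-join shape can be approximated in standard C++ with `std::async`. This is only a sketch under stated assumptions: the name `parallel_qsort` and the `depth` cutoff are hypothetical, `std::async` creates threads rather than using a work-stealing scheduler, and the code partitions twice so that all copies of the pivot are excluded from both recursive calls, which differs from the slide's `max(begin + 1, middle)`.

```cpp
#include <algorithm>
#include <future>
#include <iterator>
#include <vector>

// Fork-join quicksort in standard C++ (a stand-in, not Cilk++).
// std::async plays the role of cilk_spawn and future::get() the role of cilk_sync.
// `depth` is a hypothetical cutoff that limits how many threads get created.
template <typename It>
void parallel_qsort(It begin, It end, int depth = 3) {
    if (begin == end) return;
    using V = typename std::iterator_traits<It>::value_type;
    const V pivot = *begin;  // copy, because partitioning moves elements around

    // [begin, lower) < pivot, [lower, upper) == pivot, [upper, end) > pivot.
    It lower = std::partition(begin, end,
                              [&pivot](const V& x) { return x < pivot; });
    It upper = std::partition(lower, end,
                              [&pivot](const V& x) { return !(pivot < x); });

    if (depth > 0) {
        // "Spawn" the left half, sort the right half here, then "sync".
        auto left = std::async(std::launch::async, [=] {
            parallel_qsort(begin, lower, depth - 1);
        });
        parallel_qsort(upper, end, depth - 1);
        left.get();
    } else {
        // Below the cutoff, fall back to plain serial recursion.
        parallel_qsort(begin, lower, 0);
        parallel_qsort(upper, end, 0);
    }
}
```

A call such as `parallel_qsort(v.begin(), v.end())` sorts a `std::vector<int> v` in place. The two halves touch disjoint ranges of the array, which is why they can run concurrently without locking.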

Performance

./qsort 500000 -cilk_set_worker_count 1
>> 0.083 seconds
./qsort 500000 -cilk_set_worker_count 16
>> 0.014 seconds
Speedup = $T_1 / T_{16}$ = 0.083/0.014 = 5.93

./qsort 50000000 -cilk_set_worker_count 1
>> 10.57 seconds
./qsort 50000000 -cilk_set_worker_count 16
>> 1.58 seconds
Speedup = $T_1 / T_{16}$ = 10.57/1.58 = 6.69

Measure Work/Span Empirically

cilkscreen -w ./qsort 50000000
  Work = 21593799861
  Span = 1261403043
  Burdened span = 1261600249
  Parallelism = 17.1189
  Burdened parallelism = 17.1162
  #Spawn = 50000000
  #Atomic instructions = 14

cilkscreen -w ./qsort 500000
  Work = 178835973
  Span = 14378443
  Burdened span = 14525767
  Parallelism = 12.4378
  Burdened parallelism = 12.3116
  #Spawn = 500000
  #Atomic instructions = 8

    workspan ws;
    ws.start();
    sample_qsort(a, a + n);
    ws.stop();
    ws.report(std::cout);
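
To relate the reported numbers to the definitions of speedup ($T_1 / T_P$) and parallelism (work divided by span), here is a small sketch. The function names `speedup` and `parallelism` are hypothetical helpers, not part of any Cilk tool.

```cpp
#include <cassert>

// Speedup on P processors: T_1 / T_P, both measured in the same time unit.
double speedup(double t1, double tp) {
    assert(tp > 0.0);
    return t1 / tp;
}

// Parallelism: work / span, both measured in the same unit.
double parallelism(double work, double span) {
    assert(span > 0.0);
    return work / span;
}
```

With the figures reported for 50000000 elements, `speedup(10.57, 1.58)` is about 6.7 and `parallelism(21593799861.0, 1261403043.0)` is about 17.12. Since running time on any number of processors cannot drop below the span, speedup cannot exceed the parallelism, and the measured speedup on 16 workers stays well under both limits.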

Analyzing Quicksort

13 14 21 32 31 45 56 78 34   (Quicksort recursively)

Assume we have a "great" partitioner that always generates two balanced sets.

Work:

$T_1(n) = 2T_1(n/2) + \Theta(n)$
$2T_1(n/2) = 4T_1(n/4) + 2\Theta(n/2)$
….
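
The derivation is cut off at this point in the preview. As a sketch of how such a recurrence is usually unrolled, assume the partitioner is perfectly balanced, that $n$ is a power of 2, and that the $\Theta(n)$ partitioning cost can be written as $cn$ for some constant $c$ (the constant $c$ and the level counter $k$ are introduced here for illustration only):

```latex
\begin{align*}
T_1(n) &= 2\,T_1(n/2) + cn \\
       &= 4\,T_1(n/4) + 2\,c\,(n/2) + cn = 4\,T_1(n/4) + 2cn \\
       &= 8\,T_1(n/8) + 3cn \\
       &\;\;\vdots \\
       &= 2^k\,T_1(n/2^k) + k\,cn \\
       &= n\,T_1(1) + cn \log_2 n \qquad (k = \log_2 n) \\
       &= \Theta(n \log n)
\end{align*}
```

Each level of the recursion contributes $cn$ work in total, and there are $\log_2 n$ levels, which gives the familiar $\Theta(n \log n)$ work for balanced quicksort.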

This note was uploaded on 12/27/2011 for the course CMPSC 240A taught by Professor Gilbert during the Fall '09 term at UCSB.
