cs140-multicorematrix

CS 140: Numerical Examples on Shared Memory with Cilk++

• Matrix-matrix multiplication
• Matrix-vector multiplication
• Hyperobjects

Thanks to Charles E. Leiserson for some of these slides.

Work and Span (Recap)

$T_P$ = execution time on $P$ processors
$T_1$ = work
$T_\infty$ = span (also called critical-path length or computational depth)

Speedup on $p$ processors: $T_1/T_p$
Parallelism: $T_1/T_\infty$

Cilk Loops: Divide and Conquer

Vector addition:

    cilk_for (int i=0; i<n; ++i) {
      A[i]+=B[i];
    }

Implementation: divide and conquer over the iteration range, with grain size $G$. Assume that $G = \Theta(1)$.

Work: $T_1 = \Theta(n)$
Span: $T_\infty = \Theta(\lg n)$
Parallelism: $T_1/T_\infty = \Theta(n/\lg n)$

Square-Matrix Multiplication

$$
\underbrace{\begin{pmatrix}
c_{11} & c_{12} & \cdots & c_{1n} \\
c_{21} & c_{22} & \cdots & c_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
c_{n1} & c_{n2} & \cdots & c_{nn}
\end{pmatrix}}_{C}
=
\underbrace{\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix}}_{A}
\cdot
\underbrace{\begin{pmatrix}
b_{11} & b_{12} & \cdots & b_{1n} \\
b_{21} & b_{22} & \cdots & b_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
b_{n1} & b_{n2} & \cdots & b_{nn}
\end{pmatrix}}_{B}
$$

$$ c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj} $$

Assume for simplicity that $n = 2^k$.

Parallelizing Matrix Multiply

    cilk_for (int i=0; i<n; ++i) {
      cilk_for (int j=0; j<n; ++j) {
        for (int k=0; k<n; ++k) {
          C[i][j] += A[i][k] * B[k][j];
        }
      }
    }

Work: $T_1 = \Theta(n^3)$
Span: $T_\infty = \Theta(n)$
Parallelism: $T_1/T_\infty = \Theta(n^2)$

For $1000 \times 1000$ matrices, parallelism ≈ $(10^3)^2 = 10^6$.

Recursive Matrix Multiplication

8 multiplications of $n/2 \times n/2$ matrices.
1 addition of $n \times n$ matrices.
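The operation count above translates directly into a work recurrence. As a sketch, write $M_1(n)$ for the work of the recursive multiply (a symbol introduced here, not taken from the slides) and assume the $n \times n$ addition performs one scalar addition per element, so it costs $\Theta(n^2)$ work:

```latex
\begin{align*}
M_1(n) &= 8\,M_1(n/2) + \Theta(n^2) \\
       &= \Theta\!\left(n^{\log_2 8}\right)
          && \text{master theorem: } n^{\log_2 8} = n^3 \text{ dominates } n^2 \\
       &= \Theta(n^3).
\end{align*}
```

Under that assumption the recursive scheme performs the same asymptotic work as the triple loop.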
Divide and conquer:

$$
\begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix}
=
\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}
\cdot
\begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}
=
\begin{pmatrix} A_{11}B_{11} & A_{11}B_{12} \\ A_{21}B_{11} & A_{21}B_{12} \end{pmatrix}
+
\begin{pmatrix} A_{12}B_{21} & A_{12}B_{22} \\ A_{22}B_{21} & A_{22}B_{22} \end{pmatrix}
$$

D&C Matrix Multiplication

    template <typename T>
    void MMult(T *C, T *A, T *B, int n) {   // n: row/column length of matrices
      T * D = new T[n*n];                   // temporary for the second set of products
      // base case & partition matrices
      //   - coarsen the base case for efficiency
      //   - determine submatrices by index calculation
      cilk_spawn MMult(C11, A11, B11, n/2);
      cilk_spawn MMult(C12, A11, B12, n/2);
      cilk_spawn MMult(C22, A21, B12, n/2);
      cilk_spawn MMult(C21, A21, B11, n/2);
      cilk_spawn MMult(D11, A12, B21, n/2);
      cilk_spawn MMult(D12, A12, B22, n/2);
      cilk_spawn MMult(D22, A22, B22, n/2);
      MMult(D21, A22, B21, n/2);
      cilk_sync;                            // wait for all eight products
      MAdd(C, D, n); // C += D;
      delete[] D;                           // free the temporary
    }

Matrix Addition

    template <typename T>...
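The slide code leaves the base case and the partitioning as comments. The following is a minimal serial sketch of one way to fill them in, not the course's implementation. It assumes row-major storage, passes an explicit row length (`ldc`, `lda`, `ldb`) so that quadrants can be addressed by index calculation, and uses a hypothetical coarsening threshold `kBase`. The names `MMultSketch`, `MAddSketch`, `Quad`, and `kBase` are illustrative and do not appear in the slides. The Cilk keywords are shown only as comments so the block compiles with any C++14 compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int kBase = 2;  // hypothetical coarsening threshold; tune in practice

// Pointer to quadrant (r, c) of a row-major matrix with row length ld,
// where each quadrant is h x h. Pure index calculation, no copying.
template <typename P>
P Quad(P M, int ld, int h, int r, int c) {
  return M + static_cast<std::ptrdiff_t>(r) * h * ld + c * h;
}

// C += D for n x n blocks.
template <typename T>
void MAddSketch(T* C, int ldc, const T* D, int ldd, int n) {
  for (int i = 0; i < n; ++i)
    for (int j = 0; j < n; ++j)
      C[i * ldc + j] += D[i * ldd + j];
}

// C = A * B for n x n blocks, n a power of two.
template <typename T>
void MMultSketch(T* C, int ldc, const T* A, int lda,
                 const T* B, int ldb, int n) {
  assert(n > 0 && (n & (n - 1)) == 0);
  if (n <= kBase) {  // coarsened base case: ordinary triple loop
    for (int i = 0; i < n; ++i)
      for (int j = 0; j < n; ++j) {
        T sum = T();
        for (int k = 0; k < n; ++k) sum += A[i * lda + k] * B[k * ldb + j];
        C[i * ldc + j] = sum;
      }
    return;
  }
  const int h = n / 2;
  std::vector<T> Dbuf(static_cast<std::size_t>(n) * n);  // temporary D, row length n
  T* D = Dbuf.data();
  // The eight products write disjoint quadrants of C and D. In Cilk the
  // first seven calls would carry cilk_spawn, as on the slide.
  MMultSketch(Quad(C, ldc, h, 0, 0), ldc, Quad(A, lda, h, 0, 0), lda, Quad(B, ldb, h, 0, 0), ldb, h);  // C11 = A11*B11
  MMultSketch(Quad(C, ldc, h, 0, 1), ldc, Quad(A, lda, h, 0, 0), lda, Quad(B, ldb, h, 0, 1), ldb, h);  // C12 = A11*B12
  MMultSketch(Quad(C, ldc, h, 1, 1), ldc, Quad(A, lda, h, 1, 0), lda, Quad(B, ldb, h, 0, 1), ldb, h);  // C22 = A21*B12
  MMultSketch(Quad(C, ldc, h, 1, 0), ldc, Quad(A, lda, h, 1, 0), lda, Quad(B, ldb, h, 0, 0), ldb, h);  // C21 = A21*B11
  MMultSketch(Quad(D, n, h, 0, 0), n, Quad(A, lda, h, 0, 1), lda, Quad(B, ldb, h, 1, 0), ldb, h);      // D11 = A12*B21
  MMultSketch(Quad(D, n, h, 0, 1), n, Quad(A, lda, h, 0, 1), lda, Quad(B, ldb, h, 1, 1), ldb, h);      // D12 = A12*B22
  MMultSketch(Quad(D, n, h, 1, 1), n, Quad(A, lda, h, 1, 1), lda, Quad(B, ldb, h, 1, 1), ldb, h);      // D22 = A22*B22
  MMultSketch(Quad(D, n, h, 1, 0), n, Quad(A, lda, h, 1, 1), lda, Quad(B, ldb, h, 1, 0), ldb, h);      // D21 = A22*B21
  // A cilk_sync would go here, before the results are combined.
  MAddSketch(C, ldc, D, n, n);  // C += D
}
```

For a full `n x n` row-major matrix, a call would look like `MMultSketch(C, n, A, n, B, n, n)`. Using `std::vector` for the temporary is a design choice of this sketch: it releases `D` automatically on return, where the slide code uses `new`.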
This note was uploaded on 12/27/2011 for the course CMPSC 140 taught by Professor Gilbert during the Fall '11 term at UCSB.
