# Dense-Matrix-Alg - Dense matrix algorithms


We will first study algorithms involving dense matrices (as opposed to sparse matrices).

A very important issue is how to map a matrix onto processors: the combination of a proper mapping and an efficient algorithm is performance-critical.

The main mapping schemes are:

- striped partitioning
- blocked partitioning
- checkerboard partitioning

CSE 721, Winter 2011, Dense Matrix Algorithms

## Striped partitioning

[Figure: ways of partitioning a 16 × 16 matrix on 4 processors]

## Checkerboard partitioning

[Figure: ways of partitioning an 8 × 8 matrix on 16 processors]

Checkerboard partitioning splits both rows and columns.
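
The two layouts can be made concrete with a small ownership function. The sketch below is only an illustration, not taken from the slides: the function names are hypothetical, the striped case is rowwise, the cyclic variant is an assumed alternative to contiguous blocks, and the checkerboard case assumes that p is a perfect square and that √p divides n.

```python
import math

def striped_row_owner(i, n, p, cyclic=False):
    """Processor that owns row i of an n x n matrix striped rowwise over p processors.

    Assumes p divides n. With cyclic=True rows are dealt out round-robin;
    otherwise each processor gets one contiguous block of n/p rows.
    """
    if n % p != 0:
        raise ValueError("this sketch assumes p divides n")
    if cyclic:
        return i % p
    return i // (n // p)

def checkerboard_owner(i, j, n, p):
    """Grid coordinates of the processor that owns element (i, j).

    Assumes p is a perfect square and sqrt(p) divides n, so the matrix is
    cut into sqrt(p) x sqrt(p) square blocks.
    """
    q = math.isqrt(p)
    if q * q != p or n % q != 0:
        raise ValueError("this sketch assumes p is a perfect square and sqrt(p) divides n")
    block = n // q
    return (i // block, j // block)

# The two cases named in the figures: 16 x 16 on 4 processors, 8 x 8 on 16 processors.
print(striped_row_owner(5, 16, 4))      # row 5 falls in the second block of 4 rows
print(checkerboard_owner(5, 2, 8, 16))  # 2 x 2 blocks on a 4 x 4 processor grid
```

With striping, a processor owns whole rows, so the owner depends on one index only; with checkerboard partitioning both indices matter, which is what "splits both rows and columns" means.
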

## Matrix-vector multiplication

Standard serial algorithm (complexity: $W = n^2$):

    procedure MAT_VECT (A, x, y)
    begin
       for i := 0 to n - 1 do
       begin
          y[i] := 0
          for j := 0 to n - 1 do
             y[i] := y[i] + A[i, j] * x[j]
       end
    end MAT_VECT
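
For readers who want to run the serial algorithm, here is a direct Python transcription of MAT_VECT. It is a minimal sketch: unlike the pseudocode, it returns `y` instead of filling an output parameter, and it assumes `A` is a square list of lists whose size matches `x`.

```python
def mat_vect(A, x):
    """Serial matrix-vector product y = A x for an n x n matrix A."""
    n = len(x)
    y = [0] * n
    for i in range(n):
        # One inner loop of n multiply-adds per row gives n * n operations in total.
        for j in range(n):
            y[i] += A[i][j] * x[j]
    return y

print(mat_vect([[1, 2], [3, 4]], [5, 6]))
```

The doubly nested loop performs one multiply-add per matrix element, which is where $W = n^2$ comes from.
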
## Matrix-vector multiplication (rowwise striping)

Sequential run time: $W = n^2$

Simple case: $p = n$

- all-to-all broadcast of vector elements: $\Theta(n)$
- single row multiplication: $\Theta(n)$
- total time: $\Theta(n)$
- processor-time product: $\Theta(n^2)$ (cost-optimal)

General case: $p < n$

- all-to-all broadcast of vector elements:
  - $t_s \log p + t_w n$ (hypercube)
  - $2 t_s \sqrt{p} + t_w n$ (mesh)
- row multiplication: $\Theta(n^2/p)$
- total time:
  - $n^2/p + t_s \log p + t_w n$ (hypercube)
  - $n^2/p + 2 t_s \sqrt{p} + t_w n$ (mesh)
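
The rowwise-striped algorithm can be imitated on a single machine to see the two phases. This is a sketch under stated assumptions, not a real parallel program: the function names are hypothetical, p is assumed to divide n, each simulated processor is assumed to start with n/p rows and the matching n/p vector elements, and the cost function assumes unit time per multiply-add and a base-2 logarithm in the hypercube formula.

```python
import math

def rowwise_mat_vect(A, x, p):
    """Simulate rowwise-striped y = A x on p processors (assumes p divides n)."""
    n = len(x)
    if n % p != 0:
        raise ValueError("this sketch assumes p divides n")
    rows_per_proc = n // p

    # Initial distribution: processor k holds n/p consecutive rows and vector elements.
    local_rows = [A[k * rows_per_proc:(k + 1) * rows_per_proc] for k in range(p)]
    local_x = [x[k * rows_per_proc:(k + 1) * rows_per_proc] for k in range(p)]

    # All-to-all broadcast: afterwards every processor holds the full vector.
    full_x = [elem for piece in local_x for elem in piece]

    # Local phase: each processor multiplies its n/p rows by the full vector.
    y = []
    for k in range(p):
        for row in local_rows[k]:
            y.append(sum(a * b for a, b in zip(row, full_x)))
    return y

def hypercube_time(n, p, t_s, t_w):
    """Total time n^2/p + t_s log p + t_w n, with log taken base 2 (an assumption)."""
    return n * n / p + t_s * math.log2(p) + t_w * n

print(rowwise_mat_vect([[1, 2], [3, 4]], [5, 6], 2))
print(hypercube_time(1024, 16, t_s=100.0, t_w=1.0))
```

The simulation makes the structure of the total time visible: the communication phase does not shrink as p grows, while the local phase scales as $n^2/p$.
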

This preview has intentionally blurred sections. Sign up to view the full version.

View Full Document
## Matrix-vector multiplication (checkerboard partitioning)

Simple case: $p = n^2$

one-to-one communication + one-to-all broadcast + single-node accumulation per row + multiplication

$= \Theta(n) + \Theta(n) + \Theta(n) + \Theta(1) = \Theta(n)$ (mesh)

$= \Theta$ …
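
The four steps of the simple case can be traced in a small single-machine simulation. This is only a sketch: the function name is hypothetical, and the starting location of the vector (element `x[i]` on the last-column processor of row `i`) and the use of diagonal processors as broadcast sources are assumptions, since the preview does not state them.

```python
def checkerboard_mat_vect(A, x):
    """Simulate y = A x on an n x n processor grid, one matrix element per processor."""
    n = len(x)

    # Assumption: x[i] starts on the last-column processor (i, n - 1).
    holds_x = {(i, n - 1): x[i] for i in range(n)}

    # Step 1, one-to-one communication: x[i] moves along row i to processor (i, i).
    diag = {i: holds_x[(i, n - 1)] for i in range(n)}

    # Step 2, one-to-all broadcast: processor (j, j) sends x[j] down column j.
    x_at = [[diag[j] for j in range(n)] for _ in range(n)]

    # Step 3, multiplication: every processor does exactly one multiply.
    prod = [[A[i][j] * x_at[i][j] for j in range(n)] for i in range(n)]

    # Step 4, single-node accumulation: the products of each row are summed.
    return [sum(prod[i]) for i in range(n)]

print(checkerboard_mat_vect([[1, 2], [3, 4]], [5, 6]))
```

Each processor performs a single multiplication, so the run time is dominated by the three communication steps, matching the $\Theta(1)$ term in the sum above.
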