Dense Matrix Algorithms

•Matrix multiplication, system of linear equations, …
•Few or no usable zero elements
•Data decomposition techniques give efficient task partitioning
•Partitioning schemes for matrices
–1- and 2-dimensional block, cyclic and block-cyclic partitionings
•One task per process, generally
–Plenty of work per task
[Figures: 1-D Block Partitioning; 2-D Block Partitioning]

CS 6643 F '11 Lec 23

Matrix-Vector Multiplication

•Dot product of two vectors is the key operation
•n×n matrix and n×1 vector are multiplied to obtain n×1 vector
•T_s = W = Θ(n^2)
•Consider 1-D row-wise block partitioning
•1 row/process
–Process i, 1 ≤ i ≤ n, has row i of matrix A and element i of vector x
•Steps
–Distribute vector x so that each process has the entire x
–Each process performs dot product to get an element of the result vector

Matrix-Vector Multiplication

•AAB is used to distribute the x vector: message size, m = n/p
•For n = p,
–AAB takes Θ(n) under single-port model; dot product takes Θ(n) time
–T_p = Θ(n)
–pT_p = Θ(n^2) = W ⇒ cost-optimal

Matrix-Vector Product

•If n > p, each process
–has n/p rows and n/p elements of x at the beginning
–computes n/p elements of the result vector
•AAB with m = n/p takes t_s log p + t_w (n/p)(p-1) ≅ t_s log p + t_w n
•Each process spends Θ(n^2/p) time to compute n/p elements
•T_p = n^2/p + t_s log p + t_w n
•pT_p = n^2 + t_s p log p + t_w np
•T_o = t_s p log p + t_w np
•For isoefficiency, equate each term of T_o to W and get
–W = K_1 t_s p log p
–W = K_2 t_w np ⇒ W = Θ(p^2)
•Since max. degree of concurrency, C(W) = n, p = O(n) ⇒ W = n^2 = Ω(p^2)
•Isoefficiency function is the maximum of these three, which is Θ(p^2)

Matrix-Vector Product

•2-D partitioning
•n^2 processes, each with one element of matrix A
•Let the processes be in a grid-like topology
•n elements of x are distributed one per process among the last column of processes
•Steps
–Align x so that x[i] is at process [i,i]
–OABs among groups of n processes: process[j,j] broadcasts its x...
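
The two steps of the 1-D row-wise block partitioning scheme (distribute x with an all-to-all broadcast, then compute local dot products) can be sketched as a sequential simulation. This is only an illustration, not the course's code: the function names `all_to_all_broadcast` and `matvec_rowwise` are hypothetical, the "processes" are plain list entries rather than real MPI ranks, and the sketch assumes p divides n.

```python
def all_to_all_broadcast(local_pieces):
    """Simulate AAB: every process ends up with all pieces, in rank order."""
    full = [value for piece in local_pieces for value in piece]
    # Each simulated process receives its own copy of the full vector.
    return [list(full) for _ in local_pieces]


def matvec_rowwise(A, x, p):
    """Compute y = A x with rows of A and elements of x block-distributed
    over p simulated processes (assumption: p divides n)."""
    n = len(A)
    if n % p != 0:
        raise ValueError("this sketch assumes p divides n")
    block = n // p

    # Initial distribution: process i owns n/p rows of A and n/p elements of x.
    local_rows = [A[i * block:(i + 1) * block] for i in range(p)]
    local_x = [x[i * block:(i + 1) * block] for i in range(p)]

    # Step 1: all-to-all broadcast so every process has the entire x.
    full_x = all_to_all_broadcast(local_x)

    # Step 2: each process computes the dot products for its own rows,
    # producing n/p elements of the result vector.
    local_y = [
        [sum(a * b for a, b in zip(row, full_x[i])) for row in local_rows[i]]
        for i in range(p)
    ]

    # Concatenate the local results only so the caller can inspect them;
    # in the scheme above the result stays distributed.
    return [value for piece in local_y for value in piece]


if __name__ == "__main__":
    A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    x = [1, 0, -1, 2]
    print(matvec_rowwise(A, x, p=2))
```

With p = n this reduces to the one-row-per-process case, and with p = 1 it is the serial algorithm, so the same function covers both ends of the discussion.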
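
To see what the Θ(p^2) isoefficiency of the 1-D scheme means in practice, the cost expressions for T_p and T_o can be evaluated numerically. The sketch below is an assumption-laden illustration: the function names are hypothetical, the values chosen for t_s and t_w are arbitrary, one multiply-add is taken as unit time, the logarithm is taken base 2, and the approximated communication term t_w n is used in place of t_w (n/p)(p-1).

```python
import math


def parallel_time(n, p, t_s=10.0, t_w=1.0):
    """T_p = n^2/p + t_s log p + t_w n (t_s and t_w defaults are arbitrary)."""
    return n * n / p + t_s * math.log2(p) + t_w * n


def overhead(n, p, t_s=10.0, t_w=1.0):
    """T_o = p T_p - W = t_s p log p + t_w n p."""
    return t_s * p * math.log2(p) + t_w * n * p


def efficiency(n, p, t_s=10.0, t_w=1.0):
    """E = W / (W + T_o), with W = n^2."""
    work = n * n
    return work / (work + overhead(n, p, t_s, t_w))


if __name__ == "__main__":
    # Fixed problem size: efficiency drops as processes are added.
    for p in (4, 16, 64):
        print("n=64     p=%-3d E=%.3f" % (p, efficiency(64, p)))
    # Scale n linearly with p, so W = n^2 grows as p^2: efficiency is held.
    for p in (4, 16, 64):
        print("n=8p=%-4d p=%-3d E=%.3f" % (8 * p, p, efficiency(8 * p, p)))
```

Growing n in proportion to p makes W = n^2 grow as p^2, which is the rate the isoefficiency analysis says is needed to keep efficiency from falling.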
This note was uploaded on 01/29/2012 for the course CS 6643 taught by Professor Staff during the Fall '08 term at The University of Texas at San Antonio- San Antonio.