Dense Matrix Algorithms

• Matrix multiplication, systems of linear equations, …
• Few or no usable zero elements
• Data decomposition techniques give efficient task partitioning
• Partitioning schemes for matrices
  – 1- and 2-dimensional block, cyclic, and block-cyclic partitionings
• Generally one task per process
  – Plenty of work per task

[Figures: 1D Block Partitioning; 2D Block Partitioning]

CS 6643 F '11 Lec 23

Matrix-Vector Multiplication

• The dot product of two vectors is the key operation
• An n×n matrix and an n×1 vector are multiplied to obtain an n×1 vector
• Serial run time: $T_s = W = \Theta(n^2)$
• Consider 1D row-wise block partitioning
• 1 row per process
  – Process i, 1 ≤ i ≤ n, has row i of matrix A and element i of vector x
• Steps
  – Distribute vector x so that each process has the entire x
  – Each process performs a dot product to get one element of the result vector

Matrix-Vector Multiplication

• An all-to-all broadcast (AAB) is used to distribute the x vector: message size $m = n/p$
• For n = p,
  – AAB takes $\Theta(n)$ time under the single-port model; the dot product takes $\Theta(n)$ time
  – $T_p = \Theta(n)$
  – $pT_p = \Theta(n^2) = W$ ⇒ cost-optimal

Matrix-Vector Product

• If n > p, each process
  – has n/p rows of A and n/p elements of x at the beginning
  – computes n/p elements of the result vector
• AAB with $m = n/p$ takes $t_s \log p + t_w (n/p)(p-1) \approx t_s \log p + t_w n$
• Each process spends $\Theta(n^2/p)$ time to compute n/p elements
• $T_p = n^2/p + t_s \log p + t_w n$
• $pT_p = n^2 + t_s p \log p + t_w n p$
• $T_o = pT_p - W = t_s p \log p + t_w n p$
• For isoefficiency, equate each term of $T_o$ to W:
  – $W = K_1 t_s p \log p$
  – $W = K_2 t_w n p$; since $W = n^2$, this gives $n = K_2 t_w p$ and hence $W = K_2^2 t_w^2 p^2 = \Theta(p^2)$
• Since the maximum degree of concurrency is $C(W) = n$, we need $p = O(n)$ ⇒ $W = n^2 = \Omega(p^2)$
• The isoefficiency function is the maximum of these three, which is $\Theta(p^2)$

Matrix-Vector Product

• 2D partitioning
• $n^2$ processes, each with one element of matrix A
• Let the processes be arranged in a grid-like topology
• The n elements of x are distributed one per process among the last column of processes
• Steps
  – Align x so that x[i] is at process [i,i]
  – One-to-all broadcasts (OABs) among groups of n processes: process [j,j] broadcasts its x...
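
The 1D row-wise scheme above (all-to-all broadcast of x, then local dot products) can be illustrated with a small serial simulation. This is only a sketch under stated assumptions: the p "processes" are simulated with Python lists rather than real message passing, p is assumed to divide n, and the function names (`all_to_all_broadcast`, `matvec_1d_rowwise`, `parallel_time_model`) are hypothetical names chosen for illustration, not part of the course material. The cost function simply evaluates the slide's expression $T_p = n^2/p + t_s \log p + t_w n$, assuming a base-2 logarithm.

```python
import math


def all_to_all_broadcast(local_blocks):
    """Simulated AAB: every process ends up with all blocks concatenated."""
    full = [value for block in local_blocks for value in block]
    # Each simulated process receives its own copy of the full vector.
    return [list(full) for _ in local_blocks]


def matvec_1d_rowwise(A, x, p):
    """Compute y = A x with a simulated 1D row-wise block partitioning.

    Assumption for this sketch: p divides n, so each process owns
    exactly n/p rows of A and n/p elements of x.
    """
    n = len(A)
    if n % p != 0:
        raise ValueError("this sketch assumes p divides n")
    b = n // p  # rows (and x elements) per process

    # Initial distribution: process i owns rows and x entries [i*b, (i+1)*b).
    local_A = [A[i * b:(i + 1) * b] for i in range(p)]
    local_x = [x[i * b:(i + 1) * b] for i in range(p)]

    # Step 1: all-to-all broadcast so every process holds the entire x.
    full_x = all_to_all_broadcast(local_x)

    # Step 2: each process computes n/p dot products with its own rows.
    local_y = [
        [sum(a * v for a, v in zip(row, full_x[i])) for row in local_A[i]]
        for i in range(p)
    ]

    # Gather the result blocks in process order (for inspection only).
    return [value for block in local_y for value in block]


def parallel_time_model(n, p, t_s, t_w):
    """Evaluate T_p = n^2/p + t_s log p + t_w n (log taken base 2 here)."""
    return n * n / p + t_s * math.log2(p) + t_w * n


A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
x = [1, 0, -1, 2]
print(matvec_1d_rowwise(A, x, p=2))  # [6, 14, 22, 30]
```

The result is the same for any p that divides n; only the modeled communication and computation costs change with p.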
This note was uploaded on 01/29/2012 for the course CS 6643 taught by Professor Staff during the Fall '08 term at The University of Texas at San Antonio.