11f6643lec23 - Dense Matrix Algorithms Dense Matrix...


Dense Matrix Algorithms
• Matrix multiplication, systems of linear equations, …
• Few or no usable zero elements
• Data decomposition techniques give efficient task partitioning
• Partitioning schemes for matrices
  – 1- and 2-dimensional block, cyclic and block-cyclic partitionings
• One task per process, generally
  – Plenty of work per task
[Figures: 1-D Block Partitioning; 2-D Block Partitioning]
CS 6643 F '11 Lec 23

Matrix-Vector Multiplication
• Dot product of two vectors is the key operation
• An $n \times n$ matrix and an $n \times 1$ vector are multiplied to obtain an $n \times 1$ vector
• $T_s = W = \Theta(n^2)$
• Consider 1-D row-wise block partitioning
• 1 row/process
  – Process $i$, $1 \le i \le n$, has row $i$ of matrix $A$ and element $i$ of vector $x$
• Steps
  – Distribute vector $x$ so that each process has the entire $x$
  – Each process performs a dot product to get one element of the result vector

Matrix-Vector Multiplication
• All-to-all broadcast (AAB) is used to distribute the $x$ vector: message size $m = n/p$
• For $n = p$,
  – AAB takes $\Theta(n)$ under the single-port model; the dot product takes $\Theta(n)$ time
  – $T_p = \Theta(n)$
  – $pT_p = \Theta(n^2) = W \Rightarrow$ cost-optimal

Matrix-Vector Product
• If $n > p$, each process
  – has $n/p$ rows and $n/p$ elements of $x$ at the beginning
  – computes $n/p$ elements of the result vector
• AAB with $m = n/p$ takes $t_s \log p + t_w (n/p)(p-1) \approx t_s \log p + t_w n$
• Each process spends $\Theta(n^2/p)$ time to compute $n/p$ elements
• $T_p = n^2/p + t_s \log p + t_w n$
• $pT_p = n^2 + t_s p \log p + t_w n p$
• $T_o = t_s p \log p + t_w n p$
• For isoefficiency, equate each term of $T_o$ to $W$ and get
  – $W = K_1 t_s p \log p$
  – $W = K_2 t_w n p$; since $W = n^2$, this gives $n = K_2 t_w p$, so $W = K_2^2 t_w^2 p^2 \Rightarrow W = \Theta(p^2)$
• Since the maximum degree of concurrency is $C(W) = n$, $p = O(n) \Rightarrow W = n^2 = \Omega(p^2)$
• The isoefficiency function is the maximum of these three, which is $\Theta(p^2)$

Matrix-Vector Product
• 2-D partitioning
• $n^2$ processes, each with one element of matrix $A$
• Let the processes be arranged in a grid-like topology
• The $n$ elements of $x$ are distributed one per process among the last column of processes
• Steps
  – Align $x$ so that $x[i]$ is at process $[i,i]$
  – One-to-all broadcasts (OABs) among groups of $n$ processes: process $[j,j]$ broadcasts its x...
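The 1-D row-wise scheme for n > p can be illustrated with a small sequential simulation. This is only a sketch under stated assumptions: the "processes" are plain Python lists rather than real MPI ranks, p is assumed to divide n, and the function names `all_to_all_broadcast` and `matvec_1d_rowwise` are hypothetical names chosen for this example, not taken from the lecture.

```python
def all_to_all_broadcast(local_blocks):
    """Simulate AAB: every process ends up with all blocks, in rank order."""
    full = [value for block in local_blocks for value in block]
    # One private copy of the gathered vector per simulated process.
    return [list(full) for _ in local_blocks]


def matvec_1d_rowwise(A, x, p):
    """Multiply the n x n matrix A by x using p simulated processes.

    Assumption of this sketch: p divides n, so each process owns n/p rows
    of A and the matching n/p elements of x.
    """
    n = len(A)
    if n % p != 0:
        raise ValueError("this sketch assumes p divides n")
    block = n // p

    # Initial distribution: process k owns rows and x-elements k*block .. (k+1)*block - 1.
    row_blocks = [A[k * block:(k + 1) * block] for k in range(p)]
    x_blocks = [x[k * block:(k + 1) * block] for k in range(p)]

    # Step 1: all-to-all broadcast with message size m = n/p.
    x_copies = all_to_all_broadcast(x_blocks)

    # Step 2: each process computes its n/p dot products (Theta(n^2/p) work).
    y = []
    for k in range(p):
        for row in row_blocks[k]:
            y.append(sum(a * b for a, b in zip(row, x_copies[k])))
    return y


A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
x = [1, 0, -1, 2]
print(matvec_1d_rowwise(A, x, p=2))  # prints [6, 14, 22, 30]
```

Setting p = n reproduces the one-row-per-process case from the earlier slide; the result does not depend on p, only the work split does.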
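The Θ(p²) isoefficiency result for 1-D partitioning can be checked numerically from the cost expressions T_p = n²/p + t_s log p + t_w n and W = n². The sketch below is illustrative only: the values of t_s and t_w are made-up constants, the logarithm is assumed to be base 2, and the function names are hypothetical. It compares efficiency E = W / (p T_p) when n is held fixed against the case where n grows linearly with p, so that W grows as p².

```python
import math


def parallel_time(n, p, t_s, t_w):
    """T_p = n^2/p + t_s log p + t_w n (log taken base 2 as an assumption)."""
    return n * n / p + t_s * math.log2(p) + t_w * n


def efficiency(n, p, t_s, t_w):
    """E = W / (p * T_p) with W = n^2."""
    return (n * n) / (p * parallel_time(n, p, t_s, t_w))


# Hypothetical machine parameters, chosen only for illustration.
t_s, t_w = 10.0, 1.0

for p in (4, 16, 64):
    fixed = efficiency(256, p, t_s, t_w)      # problem size held constant
    scaled = efficiency(64 * p, p, t_s, t_w)  # n proportional to p, so W grows as p^2
    print(f"p={p:3d}  E(fixed n)={fixed:.3f}  E(n=64p)={scaled:.3f}")
```

With n fixed, efficiency falls as p grows; with n proportional to p it stays nearly constant, which is what an isoefficiency function of Θ(p²) predicts.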

This note was uploaded on 01/29/2012 for the course CS 6643 taught by Professor Staff during the Fall '08 term at The University of Texas at San Antonio- San Antonio.
