# Lec07-Matrix multiplication - Lecture 7 Matrix...


Lecture 7
• Matrix multiplication - continued
• Scalability
• Revisiting communication performance and correctness

10/12/06 Scott B. Baden / CSE 260 / Fall 2006

Announcements
• Dr. Bob Lucas will give a special lecture on sparse matrix linear algebra on Weds 11/8
  – Meeting time?
• There will be only one lecture the week of 11/13
  – Friday 11/17
  – Meeting time?
• Poster session to present projects
  – Suggested date: Friday 11/30
  – Meeting time?

An improved matrix multiply
• Difficulties with Cannon’s Algorithm arise when
  – P is not a perfect square
  – A and B are not square, and not evenly divisible by p
• Interoperation with applications and other libraries is difficult or expensive
• The SUMMA algorithm offers a practical alternative
  – Uses a shift algorithm to broadcast
  – A variant is used in ScaLAPACK

R. van de Geijn and J. Watts, “SUMMA: Scalable universal matrix multiplication algorithm,” Concurrency: Practice and Experience, 9: 255-74 (1997)
www.netlib.org/lapack/lawns/lawn96.ps

Loop reordering
• The simplest formulation of matrix multiply is the so-called “ijk” formulation, named after the order of the loops
    for i := 0 to n-1, j := 0 to n-1, k := 0 to n-1
        C[i,j] += A[i,k] * B[k,j]
• Now consider the “kij” formulation
    for k := 0 to n-1, i := 0 to n-1, j := 0 to n-1
        C[i,j] += A[i,k] * B[k,j]
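The two loop orders can be checked against each other with a small sketch. This is plain Python on lists of lists, not code from the lecture; the function names `matmul_ijk` and `matmul_kij` are made up for illustration. Both perform the same update C[i,j] += A[i,k] * B[k,j] and differ only in the nesting order.

```python
def matmul_ijk(A, B):
    """C = A * B for square n x n lists of lists, loops nested i, j, k."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C


def matmul_kij(A, B):
    """Same update as matmul_ijk, loops nested k, i, j."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul_ijk(A, B))  # [[19, 22], [43, 50]]
print(matmul_kij(A, B))  # [[19, 22], [43, 50]]
```

With integer entries the two results are identical. With floating-point entries each C[i,j] still accumulates its terms in increasing k, so only the interleaving across different (i, j) changes.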

Formulation
• The matrices may be non-square: A is n1 × n3, B is n3 × n2, and C is n1 × n2
    for k := 0 to n3-1
        for i := 0 to n1-1
            for j := 0 to n2-1
                C[i,j] += A[i,k] * B[k,j]
• The two innermost loops compute one outer product per iteration of k, n3 in total
    for k := 0 to n3-1
        C[:,:] += A[:,k] ⊗ B[k,:]
  where ⊗ is the outer product; row by row, C[i,:] += A[i,k] * B[k,:]
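A minimal sketch of the non-square formulation, assuming matrices stored as Python lists of rows. The name `matmul_outer` is hypothetical. The outer loop runs over k, and the row update C[i,:] += A[i,k] * B[k,:] stands in for the two inner loops.

```python
def matmul_outer(A, B):
    """Multiply an n1 x n3 matrix by an n3 x n2 matrix as a sum of n3 outer products."""
    n1, n3, n2 = len(A), len(B), len(B[0])
    assert all(len(row) == n3 for row in A), "inner dimensions must agree"
    C = [[0] * n2 for _ in range(n1)]
    for k in range(n3):
        row_k = B[k]                      # B[k,:]
        for i in range(n1):
            a_ik = A[i][k]                # A[i,k]
            # C[i,:] += A[i,k] * B[k,:]
            C[i] = [c + a_ik * b for c, b in zip(C[i], row_k)]
    return C


A = [[1, 2, 3],
     [4, 5, 6]]   # 2 x 3
B = [[1, 0],
     [0, 1],
     [1, 1]]      # 3 x 2
print(matmul_outer(A, B))  # [[4, 5], [10, 11]]
```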

Outer product
• Recall that when we multiply an m × n matrix by an n × p matrix, we get an m × p matrix
• Outer product of column vector a^T and vector b = matrix C
  – An m × 1 times a 1 × n
  – Multiplication table with the rows formed by a[:] and the columns by b[:]
  – Example with a[1,3] and x[3,1]:

$$(a, b, c) \otimes (x, y, z)^{T} = \begin{pmatrix} ax & ay & az \\ bx & by & bz \\ cx & cy & cz \end{pmatrix}$$

• SUMMA computes n partial outer products:
    for k := 0 to n-1
        C[:,:] += A[:,k] ⊗ B[k,:]
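The "multiplication table" view can be sketched in a few lines of Python. The function `outer` and its `mul` parameter are illustrative names, not part of the lecture. Passing string concatenation as `mul` reproduces the symbolic 3 × 3 table from the slide.

```python
import operator


def outer(a, b, mul=operator.mul):
    """Return the len(a) x len(b) table whose (i, j) entry is mul(a[i], b[j])."""
    return [[mul(ai, bj) for bj in b] for ai in a]


print(outer([1, 2, 3], [4, 5, 6]))
# [[4, 5, 6], [8, 10, 12], [12, 15, 18]]

# Symbolic version: string concatenation stands in for multiplication.
print(outer("abc", "xyz", mul=lambda s, t: s + t))
# [['ax', 'ay', 'az'], ['bx', 'by', 'bz'], ['cx', 'cy', 'cz']]
```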

Serial algorithm
Each row k of B, paired with column k of A, contributes one of the n partial outer products:
    for k := 0 to n-1
        C[:,:] += A[:,k] ⊗ B[k,:]

Parallel algorithm
• Set up a processor geometry P = px × py
• Blocked multiply, panel size = b << n/max(px,py)
    for k := 0 to n-1 by b
        multicast A[:, k:k+b-1] along processor rows
        multicast B[k:k+b-1, :] along processor columns
        C += A[:, k:k+b-1] * B[k:k+b-1, :]    // Local MM
• Each row and column of processors independently participates in a broadcast of a panel
• Owner of the panel changes with k
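The panel loop above can be simulated in a single Python process to see how ownership of the panel moves with k. This is only a sketch under stated assumptions, not the lecture's or ScaLAPACK's implementation: there is no real message passing, loops over (r, c) stand in for the px × py processes, slicing the owner's local block stands in for the multicast, the matrix dimensions are assumed divisible by px and py, and b is assumed to divide the local block sizes so that each panel has a single owner. All function names (`block`, `matmul_ref`, `summa`) are made up for illustration.

```python
def block(M, r0, r1, c0, c1):
    """Copy of the submatrix M[r0:r1, c0:c1]."""
    return [row[c0:c1] for row in M[r0:r1]]


def matmul_ref(A, B):
    """Plain triple-loop product, used only to check the simulation."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def summa(A, B, px, py, b):
    """Simulate SUMMA on a px x py grid with panel width b (assumptions above)."""
    n1, n3, n2 = len(A), len(B), len(B[0])
    assert n1 % px == 0 and n2 % py == 0 and n3 % px == 0 and n3 % py == 0
    mr, nc = n1 // px, n2 // py          # local C block is mr x nc
    ka, kb = n3 // py, n3 // px          # local A width, local B height
    assert ka % b == 0 and kb % b == 0, "each panel must have a single owner"

    # Block-distribute A, B, and a zeroed C over the grid.
    A_loc = [[block(A, r * mr, (r + 1) * mr, c * ka, (c + 1) * ka)
              for c in range(py)] for r in range(px)]
    B_loc = [[block(B, r * kb, (r + 1) * kb, c * nc, (c + 1) * nc)
              for c in range(py)] for r in range(px)]
    C_loc = [[[[0] * nc for _ in range(mr)] for _ in range(py)]
             for _ in range(px)]

    for k in range(0, n3, b):
        own_c, off_a = divmod(k, ka)     # process column owning A[:, k:k+b]
        own_r, off_b = divmod(k, kb)     # process row owning B[k:k+b, :]
        for r in range(px):
            # "Multicast" along processor row r: the owner's A panel (mr x b).
            a_panel = block(A_loc[r][own_c], 0, mr, off_a, off_a + b)
            for c in range(py):
                # "Multicast" along processor column c: the owner's B panel (b x nc).
                b_panel = block(B_loc[own_r][c], off_b, off_b + b, 0, nc)
                C_rc = C_loc[r][c]
                for i in range(mr):      # local matrix multiply
                    for j in range(nc):
                        C_rc[i][j] += sum(a_panel[i][t] * b_panel[t][j]
                                          for t in range(b))

    # Gather the local C blocks back into one matrix.
    return [sum((C_loc[r][c][i] for c in range(py)), [])
            for r in range(px) for i in range(mr)]


A = [[1, 2, 3, 4],
     [5, 6, 7, 8]]                               # 2 x 4
B = [[(i + 2 * j) % 5 for j in range(6)] for i in range(4)]   # 4 x 6
print(summa(A, B, px=2, py=2, b=1) == matmul_ref(A, B))  # True
print(summa(A, B, px=2, py=2, b=2) == matmul_ref(A, B))  # True
```

In a real distributed code, each pair of slicing steps would become a broadcast within a row or column communicator, and only the local update would run on each process.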

## This note was uploaded on 02/20/2008 for the course CSE 260 taught by Professor Baden during the Fall '06 term at UCSD.
