csb-spaa - Parallel Sparse Matrix-Vector and...

1 Parallel Sparse Matrix-Vector and Matrix-Transpose-Vector Multiplication using Compressed Sparse Blocks
Aydın Buluç (UCSB), Jeremy T. Fineman (MIT), Matteo Frigo (Cilk Arts), John R. Gilbert (UCSB), Charles E. Leiserson (MIT & Cilk Arts)

2 Sparse Matrix-Dense Vector Multiplication (SpMV)
$y \leftarrow Ax$ and $y \leftarrow A^T x$, where A is an n-by-n sparse matrix with $nnz \ll n^2$ nonzeros.
Applications:
- Iterative methods for solving linear systems: Krylov subspace methods based on Lanczos biorthogonalization
- Graph analysis: betweenness centrality computation
3 The Landscape: Where does our work fit?
- Equally fast $y = Ax$ and $y = A^T x$ (simultaneously)
- Plenty of parallelism (for any nonzero distribution)
- Hardware-specific optimizations (prefetching, TLB blocking, vectorization)
- Matrix-specific optimizations (permutations, index/value compression, register blocking)
This is our plane of focus! Our Contribution

4 Theoretical and Experimental: Main Results
Our parallel algorithms for $y \leftarrow Ax$ and $y \leftarrow A^T x$ using the new compressed sparse blocks (CSB) layout have $O(\sqrt{n} \lg n)$ span and $\Theta(nnz)$ work, yielding $\Theta(nnz / (\sqrt{n} \lg n))$ parallelism.
[Figure: MFlops/sec (0 to 600) versus number of processors (1 to 8) for our CSB algorithms, Star-P (CSR + block-row distribution), and serial (naïve CSR).]
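To make the parallelism claim concrete, here is a short worked calculation. It assumes the work and span bounds are $\Theta(nnz)$ and $O(\sqrt{n} \lg n)$ as in the published CSB paper, and it ignores constant factors. The matrix size ($n = 2^{20}$, eight nonzeros per row) is a hypothetical example, not a data point from the slides.

```latex
% Parallelism is work divided by span.
\[
  T_1 = \Theta(nnz), \qquad
  T_\infty = O(\sqrt{n}\,\lg n), \qquad
  \frac{T_1}{T_\infty} = \Omega\!\left(\frac{nnz}{\sqrt{n}\,\lg n}\right).
\]
% Hypothetical instance: n = 2^{20}, nnz = 8n = 2^{23}.
\[
  \sqrt{n}\,\lg n = 2^{10} \cdot 20 = 20480, \qquad
  \frac{nnz}{\sqrt{n}\,\lg n} = \frac{8388608}{20480} \approx 410 .
\]
```

Under these assumptions, the available parallelism is on the order of hundreds, far more than the 8 processors shown in the plot.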
5 Compressed Sparse Rows (CSR): A Standard Layout
- Stores entries in row-major order: a dense collection of "sparse rows"
- Uses $n \lg nnz + nnz \lg n$ bits of index data
- Reading rows in parallel is easy, but reading columns is hard
[Figure: CSR arrays (row pointers, colind, data) for an n × n matrix with nnz nonzeros. Row pointers: 0 4 8 10 11 12 13 16 17. colind: 0 2 3 4 0 1 5 7 2 3 3 4 5 4 5 6 7.]
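The CSR layout on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `csr_spmv` and the 3-by-3 example matrix are assumptions made for the example. It shows why rows are easy to process independently: each `y[i]` is written by exactly one iteration of the outer loop.

```python
# Hypothetical 3-by-3 example matrix (not from the slides):
#   [1 0 2]
#   [0 3 0]
#   [4 0 5]
rowptr = [0, 2, 3, 5]            # row i occupies positions rowptr[i] .. rowptr[i+1]-1
colind = [0, 2, 1, 0, 2]         # column index of each stored nonzero
data = [1.0, 2.0, 3.0, 4.0, 5.0]  # value of each stored nonzero


def csr_spmv(rowptr, colind, data, x):
    """Compute y = A x for a square matrix A stored in CSR form."""
    n = len(rowptr) - 1
    y = [0.0] * n
    for i in range(n):
        # Only iteration i writes y[i], so the rows are independent.
        for k in range(rowptr[i], rowptr[i + 1]):
            y[i] += data[k] * x[colind[k]]
    return y


print(csr_spmv(rowptr, colind, data, [1.0, 1.0, 1.0]))  # [3.0, 3.0, 9.0]
```

Because the writes to `y` are disjoint across rows, the outer loop could be run in parallel without synchronization.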

6 Parallelizing SpMV_T is hard using the standard CSR format
CSR_SPMV_T(A, x, y)
  for i ← 0 to n-1 do
    for k ← A.rowptr[i] to A.rowptr[i+1]-1 do
      y[A.colind[k]] ← y[A.colind[k]] + A.data[k] · x[i]
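The pseudocode above translates directly into Python. The sketch below is a serial illustration only; the function name `csr_spmv_t` and the 3-by-3 example matrix are hypothetical, and the matrix is assumed to be square, as in the slides. The comment marks where the difficulty arises: different rows `i` may update the same `y[colind[k]]`, so running the outer loop in parallel would require atomic updates or per-thread copies of `y`.

```python
# Hypothetical 3-by-3 example matrix (not from the slides):
#   [1 0 2]
#   [0 3 0]
#   [4 0 5]
rowptr = [0, 2, 3, 5]
colind = [0, 2, 1, 0, 2]
data = [1.0, 2.0, 3.0, 4.0, 5.0]


def csr_spmv_t(rowptr, colind, data, x):
    """Compute y = A^T x for a square matrix A stored in CSR form."""
    n = len(rowptr) - 1
    y = [0.0] * n
    for i in range(n):
        for k in range(rowptr[i], rowptr[i + 1]):
            # Scattered write: rows 0 and 2 both update y[0] and y[2] here,
            # so parallel iterations over i would race on y.
            y[colind[k]] += data[k] * x[i]
    return y


print(csr_spmv_t(rowptr, colind, data, [1.0, 2.0, 3.0]))  # [13.0, 6.0, 17.0]
```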