CudaPA1

    ndex = threadStartIndex + N;
    int i;
    for( i=threadStartIndex; i<threadEndIndex; ++i ){
        C[i] = A[i] + B[i];
    }
}

vecadd: your job
•  The code described does not coalesce. Why?
•  What do you need to do to make it coalesce?
   Consecutive threads read consecutive memory locations.

Matrix Multiply
•  Generic case:
   Problem: A is an m × k matrix, B is a k × n matrix → C is an m × n matrix
   GridBlock: computes a block of C with a certain “footprint”...
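One way to see the vecadd coalescing question concretely is to put the two access patterns side by side. The sketch below is an illustration, not the assignment's solution. It assumes (the preview does not show this) that threadStartIndex is computed as the global thread id times N, and the kernel names vecaddBlocked and vecaddInterleaved, the helper runAndCountErrors, the launch sizes, and the use of managed memory are all hypothetical choices made for the example. In the blocked version, neighbouring threads are N elements apart at every loop iteration; in the interleaved version, neighbouring threads touch neighbouring elements at every iteration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Blocked layout (assumed form of the loop above): each thread owns N
// consecutive elements, so at any iteration thread t and thread t+1
// access addresses that are N elements apart.
__global__ void vecaddBlocked(const float *A, const float *B, float *C, int N)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    int threadStartIndex = tid * N;              // assumption, not in the preview
    int threadEndIndex = threadStartIndex + N;
    for (int i = threadStartIndex; i < threadEndIndex; ++i) {
        C[i] = A[i] + B[i];
    }
}

// Interleaved layout: at iteration j, thread tid handles element
// j * totalThreads + tid, so consecutive threads read consecutive locations.
__global__ void vecaddInterleaved(const float *A, const float *B, float *C, int N)
{
    int totalThreads = gridDim.x * blockDim.x;
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    for (int j = 0; j < N; ++j) {
        int i = j * totalThreads + tid;
        C[i] = A[i] + B[i];
    }
}

// Hypothetical helper: runs one of the kernels and counts wrong results.
int runAndCountErrors(bool interleaved)
{
    const int blocks = 4, threadsPerBlock = 128, N = 64;
    const int total = blocks * threadsPerBlock * N;
    float *A, *B, *C;
    cudaMallocManaged(&A, total * sizeof(float));
    cudaMallocManaged(&B, total * sizeof(float));
    cudaMallocManaged(&C, total * sizeof(float));
    for (int i = 0; i < total; ++i) {
        A[i] = (float)i;
        B[i] = 2.0f * (float)i;
        C[i] = 0.0f;
    }
    if (interleaved) {
        vecaddInterleaved<<<blocks, threadsPerBlock>>>(A, B, C, N);
    } else {
        vecaddBlocked<<<blocks, threadsPerBlock>>>(A, B, C, N);
    }
    cudaDeviceSynchronize();
    int errors = 0;
    for (int i = 0; i < total; ++i) {
        if (C[i] != 3.0f * (float)i) ++errors;
    }
    cudaFree(A);
    cudaFree(B);
    cudaFree(C);
    return errors;
}

int main()
{
    printf("blocked errors: %d\n", runAndCountErrors(false));
    printf("interleaved errors: %d\n", runAndCountErrors(true));
    return 0;
}
```

Both kernels compute the same C; only the mapping from thread to element differs. Timing the two (for example with a profiler) on larger inputs is a reasonable way to check how much the access pattern matters on a given GPU.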

This note was uploaded on 02/12/2014 for the course CS 475 taught by Professor Staff during the Fall '08 term at Colorado State.
