Lec5Cache



...performance

Matrix multiply: B is accessed in column order. If arrays are stored in row-major order (as in C), cache lines do not help, and every access to B can cause a cache miss.

    for i = 1 to n
      for j = 1 to n
        C[i,j] = 0
        for k = 1 to n
          C[i,j] += A[i,k] * B[k,j]

Solution: transpose B.

Tiling

•  Instead of reading a whole row of A and computing n inner products of that whole row with whole columns of B, we can read a block of A and compute smaller inner products with sub-columns of B.
•  These partial products are then added up.

Conventional matrix multiply

For a fixed i, all elements of B are used once, while all elements of row A[i] are used n times. A[i,*] may fit in the cache, B will probably not!

Tiling A and B

A k x k tile of A (which can fit in the cache) block multiplies...
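
The three access patterns discussed in the slides (conventional, transposed B, and tiled) can be sketched in C as below. This is a minimal illustration, not the lecture's own code: the function names, the flat row-major `double` arrays, the caller-supplied scratch buffer `Bt`, and the `tile` parameter are all assumptions made for the example, and the tiled loop order shown is one common choice among several.

```c
#include <assert.h>
#include <stddef.h>

/* Conventional multiply on n x n row-major matrices.
 * In the inner loop, B[k*n + j] jumps n elements per step (column order),
 * so consecutive reads of B land on different cache lines. */
void matmul_naive(size_t n, const double *A, const double *B, double *C)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += A[i * n + k] * B[k * n + j];
            C[i * n + j] = sum;
        }
}

/* "Transpose B": copy B into Bt (hypothetical scratch buffer, n*n doubles)
 * so that both A and Bt are read with unit stride in the inner loop. */
void matmul_transposed(size_t n, const double *A, const double *B,
                       double *Bt, double *C)
{
    for (size_t k = 0; k < n; k++)
        for (size_t j = 0; j < n; j++)
            Bt[j * n + k] = B[k * n + j];

    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += A[i * n + k] * Bt[j * n + k];
            C[i * n + j] = sum;
        }
}

/* Tiled multiply: work on tile x tile blocks and accumulate the partial
 * products into C. The tile size is an assumed tuning parameter; it does
 * not have to divide n, since block ends are clamped to n. */
void matmul_tiled(size_t n, size_t tile, const double *A, const double *B,
                  double *C)
{
    assert(tile > 0);

    for (size_t x = 0; x < n * n; x++)
        C[x] = 0.0;

    for (size_t ii = 0; ii < n; ii += tile)
        for (size_t kk = 0; kk < n; kk += tile)
            for (size_t jj = 0; jj < n; jj += tile) {
                size_t i_end = ii + tile < n ? ii + tile : n;
                size_t k_end = kk + tile < n ? kk + tile : n;
                size_t j_end = jj + tile < n ? jj + tile : n;

                /* Multiply one block of A by one block of B and add the
                 * partial products into the matching block of C. */
                for (size_t i = ii; i < i_end; i++)
                    for (size_t k = kk; k < k_end; k++) {
                        double a = A[i * n + k];
                        for (size_t j = jj; j < j_end; j++)
                            C[i * n + j] += a * B[k * n + j];
                    }
            }
}
```

All three functions compute the same product; they differ only in the order in which memory is touched. Whether the transposed or tiled version is actually faster depends on n, the tile size, and the cache sizes of the machine, so any speedup should be measured rather than assumed.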