Lec5Cache

# David Patterson et al., Parallel Computing Lab, Berkeley

... with a k x k tile of B (which can fit in the cache) and thus reuses the B tile k times: better cache use.

- Loops become nested loops
- The outer loop visits tile origins
- The inner loops visit the tile points
- We can parameterize our program with k and experiment

## Tiled matrix multiply

Do the whole block A11 x B11 multiply.

Then do the block A11 x B12 multiply.

How many times are A and B elements used now? etc.

## Knapsack

Knapsack Problem: Bottom-Up Dynamic Programming

- Knapsack. Fill an n-by-W array.

    Input: n, W, w1,…,wN, v1,…,vN

    for w = 0 to W
        M[0, w] = 0

    for i = 1 to n
        for w = 0 to W
            if wi > w :
                M[i, w] = M[i-1, w]
            else :
                M[i, w] = max (M[i-1, w], vi + M[i-1, w-wi])

    return M[n, W]

## Knapsack data dependence

M[i, w] depends on M[i-1, w] and M[i-1, w-wi].

How can we parallelize knapsack? There are no in-row dependences.

## Knapsack parallelization

M[i, w...
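The tiling idea and the knapsack recurrence above can be sketched in plain Python. This is a minimal illustration, not the lecture's own code: the function names `tiled_matmul` and `knapsack`, the list-of-lists matrix representation, and the 0-based item indexing are all assumptions made for the example. The first function shows outer loops over tile origins and inner loops over tile points, with the tile size `k` as a parameter. The second fills the table row by row and marks the loop whose iterations are independent, since each entry of row i reads only row i-1.

```python
def tiled_matmul(A, B, k):
    """Multiply square n x n matrices (lists of lists) using k x k tiles.

    Hypothetical helper for illustration; k is the tile size to experiment with.
    """
    n = len(A)
    C = [[0] * n for _ in range(n)]
    # Outer loops visit tile origins (ii, jj, kk).
    for ii in range(0, n, k):
        for jj in range(0, n, k):
            for kk in range(0, n, k):
                # Inner loops visit the points inside the current tiles.
                # min(...) handles the case where k does not divide n.
                for i in range(ii, min(ii + k, n)):
                    for j in range(jj, min(jj + k, n)):
                        acc = C[i][j]
                        for p in range(kk, min(kk + k, n)):
                            acc += A[i][p] * B[p][j]
                        C[i][j] = acc
    return C


def knapsack(weights, values, W):
    """Bottom-up 0/1 knapsack: fill an (n+1) x (W+1) table M.

    weights[i-1] and values[i-1] play the roles of wi and vi (0-based lists).
    """
    n = len(weights)
    M = [[0] * (W + 1) for _ in range(n + 1)]  # row 0 is all zeros
    for i in range(1, n + 1):
        wi, vi = weights[i - 1], values[i - 1]
        prev, cur = M[i - 1], M[i]
        # Every cur[w] reads only prev (row i-1), never cur itself,
        # so the iterations of this inner loop are independent of each other.
        for w in range(W + 1):
            if wi > w:
                cur[w] = prev[w]
            else:
                cur[w] = max(prev[w], vi + prev[w - wi])
    return M[n][W]
```

Python's inner loop here runs sequentially; the comment only marks where the independence lies. The rows themselves must still be computed in order, because row i depends on row i-1.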
