

i-cache and d-cache: 32 KB, 8-way, access: 4 cycles
L2 unified cache: 256 KB, 8-way, access: 11 cycles
L3 unified cache: 8 MB, 16-way, access: 30-40 cycles
Block size: 64 bytes for all caches.

University of Washington

Memory and Caches

- Cache basics
- Principle of locality
- Memory hierarchies
- Cache organization
- Program optimizations that consider caches

Caches and Program Optimizations

Optimizations for the Memory Hierarchy

- Write code that has locality
  - Spatial: access data contiguously
  - Temporal: make sure accesses to the same data are not too far apart in time
- How to achieve this?
  - Proper choice of algorithm
  - Loop transformations

Example: Matrix Multiplication

    c = (double *) calloc(n*n, sizeof(double));

    /* Multiply n x n matrices a and b */
    void mmm(double *a, double *b, double *c, int n) {
        int i, j, k;
        for (i = 0; i < n; i++)
            for (j = 0; j < n; j++)
                for (k = 0; k < n; k++)
                    c[i*n + j] += a[i*n + k]*b[k*n + j];
    }

[Figure: c = a * b; element (i, j) of c is computed from row i of a and column j of b]

Cache Miss Analysis

- Assume:
  - Matrix elements are doubles
  - Cache block = 64 bytes = 8 doubles
  - Cache size C << n (much smaller than n)
- First iteration:
  - n/8 + n = 9n/8 misses (omitting matrix c): the row of a is contiguous, so it misses once per 8 doubles; the column of b touches a different block on every access, so it misses n times
  - Afterwards in cache (schematic): [Figure: the row of a and an 8-wide strip of b]
- Other iterations:
  - Again: n/8 + n = 9n/8 misses (omitting matrix c)
- Total misses:
  - 9n/8 * n^2 = (9/8) * n^3

Blocked Matrix Multiplication

    c = (double *) calloc(n*n, sizeof(double));

    /* Multiply n x n matrices a and b */
    void mmm(double *a, double *b, double *c, int n) {
        int i, j, k;
        int i1, j1, k1;
        for (i = 0; i < n; i+=B)
            for (j = 0; j < n; j+=B)
                for (k = 0; k < n; k+=B)
                    /* B x B mini matrix multiplications */
                    for (i1 = i; i1 < i+B; i1++)
                        for (j1 = j; j1 < j+B; j1++)
                            for (k1 = k; k1 < k+B; k1++)
                                c[i1*n + j1] += a[i1*n + k1]*b[k1*n + j1];
    }

[Figure: c = a * b, computed block by block; block size B x B]

Cache Miss Analysis

- Assume:
  - Cache block = 64 bytes = 8 doubles
  - Cache size C << n (much smaller than n)
  - Three block...
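
The naive and blocked loop nests from the slides can be checked against each other with a small harness. What follows is only a sketch under stated assumptions, not code from the slides: the block size is passed as a parameter `bsize` instead of the constant B, it is assumed to divide n evenly, and the helper `max_abs_diff` and its test inputs are hypothetical names and values chosen for illustration.

```c
#include <stdlib.h>

/* Naive triple loop, same structure as mmm() on the slide. */
void mmm_naive(const double *a, const double *b, double *c, int n) {
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            for (int k = 0; k < n; k++)
                c[i*n + j] += a[i*n + k] * b[k*n + j];
}

/* Blocked version. bsize plays the role of B on the slide; passing it as
 * a parameter is an assumption of this sketch. It must divide n evenly. */
void mmm_blocked(const double *a, const double *b, double *c,
                 int n, int bsize) {
    for (int i = 0; i < n; i += bsize)
        for (int j = 0; j < n; j += bsize)
            for (int k = 0; k < n; k += bsize)
                /* bsize x bsize mini matrix multiplication */
                for (int i1 = i; i1 < i + bsize; i1++)
                    for (int j1 = j; j1 < j + bsize; j1++)
                        for (int k1 = k; k1 < k + bsize; k1++)
                            c[i1*n + j1] += a[i1*n + k1] * b[k1*n + j1];
}

/* Hypothetical helper: multiplies two n x n test matrices with both
 * versions and returns the largest absolute difference between the
 * results. Returns -1.0 if the arguments are invalid or an allocation
 * fails. */
double max_abs_diff(int n, int bsize) {
    if (n <= 0 || bsize <= 0 || n % bsize != 0)
        return -1.0;

    size_t count = (size_t)n * (size_t)n;
    double *a  = malloc(count * sizeof *a);
    double *b  = malloc(count * sizeof *b);
    double *c1 = calloc(count, sizeof *c1);  /* results start at zero */
    double *c2 = calloc(count, sizeof *c2);
    double worst = -1.0;

    if (a && b && c1 && c2) {
        /* Small integer inputs keep every product and sum exact. */
        for (size_t x = 0; x < count; x++) {
            a[x] = (double)(x % 7);
            b[x] = (double)(x % 5);
        }
        mmm_naive(a, b, c1, n);
        mmm_blocked(a, b, c2, n, bsize);

        worst = 0.0;
        for (size_t x = 0; x < count; x++) {
            double d = c1[x] - c2[x];
            if (d < 0)
                d = -d;
            if (d > worst)
                worst = d;
        }
    }
    free(a);
    free(b);
    free(c1);
    free(c2);
    return worst;
}
```

For example, `max_abs_diff(8, 4)` compares the two versions on 8 x 8 matrices using 4 x 4 blocks. Blocking changes only the order in which memory is visited, not the arithmetic: each element of c still accumulates its n products, so the two results should agree.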

This document was uploaded on 04/04/2014.
