324_Book

Making this code run fast requires two types of


    ... < n; j++)
        for (k = 0; k < n; k++) {
            r = B[k][j];
            for (i = 0; i < n; i++)
                C[i][j] += A[i][k]*r;
        }
code/mem/matmult/mm.c
(c) Version .

    for (k = 0; k < n; k++)
        for (j = 0; j < n; j++) {
            r = B[k][j];
            for (i = 0; i < n; i++)
                C[i][j] += A[i][k]*r;
        }
code/mem/matmult/mm.c
(d) Version .

    for (k = 0; k < n; k++)
        for (i = 0; i < n; i++) {
            r = A[i][k];
            for (j = 0; j < n; j++)
                C[i][j] += r*B[k][j];
        }
code/mem/matmult/mm.c
(e) Version .

    for (i = 0; i < n; i++)
        for (k = 0; k < n; k++) {
            r = A[i][k];
            for (j = 0; j < n; j++)
                C[i][j] += r*B[k][j];
        }
code/mem/matmult/mm.c
(f) Version .

Figure 6.45: Six versions of matrix multiply.

Matrix multiply version (class)    & ( )    & ( )    & ( )
Loads per iter                     2        2        2
Stores per iter                    0        1        1
A misses per iter                  0.25     1.00     0.00
B misses per iter                  1.00     0.00     0.25
C misses per iter                  0.00     1.00     0.25
Total misses per iter              1.25     2.00     0.50

Figure 6.46: Analysis of matrix multiply inner loops. The six version...
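To make the contrast between the inner loops concrete, here is a minimal sketch that wraps two of the loop orders in functions and compares their results. The function names (mm_jki, mm_kij, max_abs_diff) and the fixed capacity N are hypothetical, chosen only for illustration, and are not taken from mm.c. In the first function the inner loop walks down columns of A and C, which under the assumption of row-major arrays and four doubles per cache block appears to match the column with 2.00 total misses per iteration. In the second, the inner loop walks along rows of B and C, which appears to match the column with 0.50.

```c
#include <assert.h>

#define N 32  /* hypothetical fixed capacity for this sketch */

/* Loop order j, k, i: the inner loop steps through A[i][k] and C[i][j]
   with i varying, so both are accessed column-wise (stride of one row). */
void mm_jki(int n, double A[N][N], double B[N][N], double C[N][N])
{
    assert(n >= 0 && n <= N);
    for (int j = 0; j < n; j++)
        for (int k = 0; k < n; k++) {
            double r = B[k][j];
            for (int i = 0; i < n; i++)
                C[i][j] += A[i][k] * r;
        }
}

/* Loop order k, i, j: the inner loop steps through B[k][j] and C[i][j]
   with j varying, so both are accessed row-wise (stride 1). */
void mm_kij(int n, double A[N][N], double B[N][N], double C[N][N])
{
    assert(n >= 0 && n <= N);
    for (int k = 0; k < n; k++)
        for (int i = 0; i < n; i++) {
            double r = A[i][k];
            for (int j = 0; j < n; j++)
                C[i][j] += r * B[k][j];
        }
}

/* Largest absolute element-wise difference between two n x n results. */
double max_abs_diff(int n, double X[N][N], double Y[N][N])
{
    double worst = 0.0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            double d = X[i][j] - Y[i][j];
            if (d < 0.0)
                d = -d;
            if (d > worst)
                worst = d;
        }
    return worst;
}
```

Both functions accumulate into C, so the caller is expected to zero C first. Each computes the same product, and only the order in which memory is touched differs, which is what the per-iteration miss counts describe.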

This note was uploaded on 09/02/2010 for the course ELECTRICAL 360 taught by Professor Schultz during the Spring '10 term at BYU.
