On early computer systems linking was performed


... to notice about this graph:

- For large n, the fastest version runs three times faster than the slowest version, even though each performs the same number of floating-point arithmetic operations.
- Versions with the same number and locality of memory accesses have roughly the same measured performance.
- The two versions with the worst memory behavior, in terms of the number of accesses and misses per iteration, run significantly slower than the other four versions, which have fewer misses or fewer accesses, or both.
- The routines in one class (2 memory accesses and 1.25 misses per iteration) perform somewhat better on this particular machine than the routines in another class (3 memory accesses and 0.5 misses per iteration), which trade off an additional memory reference for a lower miss rate.

The point is that cache misses are not the whole story when it comes to performance. The number of memory accesses is also important, and in many cases, finding the best performance involves a tradeoff between the two.

Problems 6.32...
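The tradeoff between accesses and misses can be made concrete with a minimal sketch. It assumes the versions being compared are different loop orderings of an n x n matrix multiply. The excerpt does not show the code, so the function names `mm_ijk` and `mm_kij`, the size `N`, and the pairing of each ordering with a particular class are illustrative assumptions, not the author's listings.

```c
#include <stddef.h>

#define N 4 /* hypothetical matrix size, chosen only for illustration */

/*
 * ijk ordering: the inner loop walks a row of a (stride 1) and a column
 * of b (stride N). The running sum can stay in a register, so each inner
 * iteration makes two loads and no stores.
 * Accumulates into c; the caller is expected to zero c first.
 */
void mm_ijk(double a[N][N], double b[N][N], double c[N][N])
{
    for (size_t i = 0; i < N; i++) {
        for (size_t j = 0; j < N; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < N; k++)
                sum += a[i][k] * b[k][j];
            c[i][j] += sum;
        }
    }
}

/*
 * kij ordering: the inner loop walks rows of b and c (both stride 1),
 * which lowers the miss rate, but each inner iteration now makes two
 * loads and one store because c[i][j] is updated in memory.
 * Accumulates into c; the caller is expected to zero c first.
 */
void mm_kij(double a[N][N], double b[N][N], double c[N][N])
{
    for (size_t k = 0; k < N; k++) {
        for (size_t i = 0; i < N; i++) {
            double r = a[i][k]; /* loop-invariant in the inner loop */
            for (size_t j = 0; j < N; j++)
                c[i][j] += r * b[k][j];
        }
    }
}
```

Both functions perform the same multiply-adds and produce the same result when c starts at zero. They differ only in how many memory references each inner iteration makes and in the stride of those references, which is the kind of difference the observations above describe. Which ordering wins on a given machine has to be measured rather than assumed.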

This note was uploaded on 09/02/2010 for the course ELECTRICAL 360 taught by Professor Schultz during the Spring '10 term at BYU.
