Lecture12 - CSCE 2610 Memory Hierarchy: Improving Cache...

CSCE 2610 Memory Hierarchy: Improving Cache Performance

Caching Effects
Algorithm choice may need to consider the behavior of a cache.
Radix sort is better asymptotically (according to Big-Oh).
However, Quicksort makes better use of locality.
Cache Performance
At any given time during program execution, the CPU is either executing something or waiting for memory.
CPU time = (CPU execution cycles + Memory-stall cycles) x Clock cycle time
Memory-stall cycles come (mostly) from cache misses.
Memory-stall cycles = Read-stall cycles + Write-stall cycles
Reducing CPU time means reducing the number of memory-stall cycles.
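As a rough illustration of the CPU time formula, the following Python sketch plugs in made-up cycle counts. The function name `cpu_time` and every numeric value are assumptions chosen for illustration; none of them come from the lecture.

```python
def cpu_time(execution_cycles, memory_stall_cycles, clock_cycle_time):
    """CPU time = (CPU execution cycles + memory-stall cycles) x clock cycle time."""
    return (execution_cycles + memory_stall_cycles) * clock_cycle_time


# Hypothetical workload (illustrative values only, not from the lecture).
execution_cycles = 1_000_000_000   # cycles spent executing instructions
memory_stall_cycles = 500_000_000  # cycles spent waiting for memory
clock_cycle_time = 0.5e-9          # seconds per cycle (a 2 GHz clock)

total = cpu_time(execution_cycles, memory_stall_cycles, clock_cycle_time)
no_stalls = cpu_time(execution_cycles, 0, clock_cycle_time)

print(f"CPU time with stalls:    {total:.2f} s")
print(f"CPU time without stalls: {no_stalls:.2f} s")
```

With these assumed numbers, a third of the total run time is spent stalled on memory, which is why the rest of the lecture focuses on the stall term.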

Cache Performance
Read-stall cycles are based upon the time to move data from memory to the cache (the miss penalty).
Read-stall cycles = Reads/program x Read miss rate x Read miss penalty
Write-stall cycles are similar:
Write-stall cycles = (Writes/program x Write miss rate x Write miss penalty) + Write-buffer stalls
Write-buffer stalls can usually be ignored.
In a well-designed system, the write buffer will be large enough to handle the average write behavior.
Write-buffer stalls would also be difficult to calculate, since they depend on how close together the write instructions occur.
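The two stall formulas can be sketched in Python as follows. The function names, the access counts, and the miss rates and penalty used here are hypothetical; write-buffer stalls default to zero, following the slide's remark that they can usually be ignored.

```python
def read_stall_cycles(reads, read_miss_rate, read_miss_penalty):
    """Read-stall cycles = reads/program x read miss rate x read miss penalty."""
    return reads * read_miss_rate * read_miss_penalty


def write_stall_cycles(writes, write_miss_rate, write_miss_penalty,
                       write_buffer_stalls=0):
    """Write-stall cycles = writes x miss rate x miss penalty + write-buffer stalls."""
    return writes * write_miss_rate * write_miss_penalty + write_buffer_stalls


# Hypothetical program (illustrative values only, not from the lecture).
reads = 1_000_000
writes = 500_000

read_stalls = read_stall_cycles(reads, read_miss_rate=0.04, read_miss_penalty=100)
write_stalls = write_stall_cycles(writes, write_miss_rate=0.04, write_miss_penalty=100)
memory_stalls = read_stalls + write_stalls

print(f"Read-stall cycles:   {read_stalls:,.0f}")
print(f"Write-stall cycles:  {write_stalls:,.0f}")
print(f"Memory-stall cycles: {memory_stalls:,.0f}")
```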
Cache Performance
We can assume that the miss rate is the same for reads and writes.
We can also assume that the miss penalty is the same for reads and writes (at least in a write-through cache).
This leads us to the following:
Memory-stall cycles = Memory accesses/program x Miss rate x Miss penalty
Let's see an example.
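The simplification can be written out in a few steps. The symbols R, W, m, and p are shorthand introduced here (reads per program, writes per program, the shared miss rate, and the shared miss penalty), and write-buffer stalls are taken to be zero:

```latex
\begin{align*}
\text{Memory-stall cycles}
  &= \text{Read-stall cycles} + \text{Write-stall cycles} \\
  &= R \times m \times p + W \times m \times p \\
  &= (R + W) \times m \times p \\
  &= \text{Memory accesses/program} \times \text{Miss rate} \times \text{Miss penalty}
\end{align*}
```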

Performance Example
Assume:
Miss rates: I-cache = 2%, D-cache = 4%
Miss penalty: 100 cycles
Base CPI (ideal cache) = 2
Data memory accesses (loads and stores) = 36% of instructions
What is the overall CPI based upon the cache behavior?
Instruction-stall cycles per instruction = 0.02 x 100 = 2
Data-access-stall cycles per instruction = 0.36 x 0.04 x 100 = 1.44
Overall CPI = 2 + 2 + 1.44 = 5.44
A CPU with an ideal cache would be 5.44 / 2 = 2.72 times faster!
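The arithmetic in this example can be checked with a short Python sketch. The numbers are the ones from the slide; the helper name `stall_cycles_per_instruction` is hypothetical, and the sketch assumes exactly one instruction fetch per instruction and reads the 36% figure as data accesses per instruction.

```python
def stall_cycles_per_instruction(accesses_per_instruction, miss_rate, miss_penalty):
    """Stall cycles per instruction = accesses/instruction x miss rate x miss penalty."""
    return accesses_per_instruction * miss_rate * miss_penalty


base_cpi = 2.0        # CPI with an ideal (never-missing) cache
miss_penalty = 100    # cycles per miss

# Assumption: each instruction is fetched once, so 1.0 I-cache access per instruction.
instruction_stalls = stall_cycles_per_instruction(1.0, 0.02, miss_penalty)
# Assumption: 36% of instructions perform one data access each.
data_stalls = stall_cycles_per_instruction(0.36, 0.04, miss_penalty)

overall_cpi = base_cpi + instruction_stalls + data_stalls
ideal_speedup = overall_cpi / base_cpi

print(f"Instruction-stall cycles per instruction: {instruction_stalls:.2f}")
print(f"Data-access-stall cycles per instruction: {data_stalls:.2f}")
print(f"Overall CPI: {overall_cpi:.2f}")
print(f"Ideal-cache speedup: {ideal_speedup:.2f}x")
```

Changing `miss_penalty` or either miss rate in the sketch shows how quickly the stall terms come to dominate the base CPI.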
Average Memory Access Time
An important measure of memory system performance is the “Average Memory Access Time” (AMAT): when you access a location in memory, how long will it take, on average?
AMAT = Hit time + Miss rate x Miss penalty
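A minimal Python sketch of the AMAT formula follows. The function name `amat` and the sample values (a 1-cycle hit time, a 5% miss rate, and a 20-cycle miss penalty) are assumptions for illustration and are not taken from the lecture.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """AMAT = hit time + miss rate x miss penalty (same time unit as the inputs)."""
    return hit_time + miss_rate * miss_penalty


# Hypothetical cache parameters, all measured in clock cycles.
average = amat(hit_time=1, miss_rate=0.05, miss_penalty=20)
print(f"AMAT: {average:.2f} cycles")
```

Under these assumed values the average access costs twice the hit time, even though 95% of accesses hit.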


This note was uploaded on 02/28/2012 for the course CSCE 3510 taught by Professor Unt during the Spring '12 term at North Texas.
