9 - cache - Cache Memory Hierarchy: 3 Cs and 7 Ways to Reduce...

1 Cache Memory Hierarchy: 3 Cs and 7 Ways to Reduce Misses
2 Review: Who Cares About the Memory Hierarchy?
[Figure: Processor-Memory Performance Gap. Performance (log scale, 1 to 1000) versus year (1980 to 2000). CPU curve ("Moore's Law"): μProc 60%/yr. DRAM curve: 7%/yr. The gap between them grows 50% / year.]
Processor Only Thus Far in Course: CPU cost/performance, ISA, Pipelined Execution
CPU-DRAM Gap:
1980: no cache in μproc
1989: first Intel μproc with a cache on chip
1995: 2-level cache on chip
3 Processor-Memory Performance Gap "Tax"
Processor          % Area (≈cost)   % Transistors (≈power)
Alpha 21164        37%              77%
StrongArm SA110    61%              94%
Pentium Pro        64%              88%
  2 dies per package: Proc/I$/D$ + L2$
Caches have no inherent value; they only try to close the performance gap.
4 Generations of Microprocessors
Time of a full cache miss in instructions executed:
1st Alpha (7000): 340 ns / 5.0 ns = 68 clks x 2 or 136
2nd Alpha (8400): 266 ns / 3.3 ns = 80 clks x 4 or 320
3rd Alpha (t.b.d.): 180 ns / 1.7 ns = 108 clks x 6 or 648
1/2X latency x 3X clock rate x 3X Instr/clock => about 5X
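The arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the multipliers 2, 4 and 6 are read here as instructions issued per clock, which is an assumption based on the "Instr/clock" term on the slide. The slide's latencies and cycle times are rounded, so the computed values for the second and third generations land near, not exactly on, 320 and 648.

```python
def miss_cost_in_instructions(miss_latency_ns, cycle_time_ns, instrs_per_clock):
    """Instruction slots lost during one full cache miss.

    clocks lost = miss latency / clock cycle time
    instructions lost = clocks lost * instructions issued per clock (assumed meaning)
    """
    clocks = miss_latency_ns / cycle_time_ns
    return clocks * instrs_per_clock


# (name, miss latency in ns, cycle time in ns, instructions per clock), from the slide
generations = [
    ("1st Alpha (7000)", 340.0, 5.0, 2),
    ("2nd Alpha (8400)", 266.0, 3.3, 4),
    ("3rd Alpha (t.b.d.)", 180.0, 1.7, 6),
]

for name, latency_ns, cycle_ns, width in generations:
    cost = miss_cost_in_instructions(latency_ns, cycle_ns, width)
    print(f"{name}: about {cost:.0f} instructions per full miss")
```

The point of the comparison is that the miss latency in nanoseconds shrinks across generations while the miss cost in lost instructions grows.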
5 Review: Four Questions for Memory Hierarchy Designers
Q1: Where can a block be placed in the upper level? (Block placement)
    Fully Associative, Set Associative, Direct Mapped
Q2: How is a block found if it is in the upper level? (Block identification)
    Tag/Block
Q3: Which block should be replaced on a miss? (Block replacement)
    Random, LRU
Q4: What happens on a write? (Write strategy)
    Write Back or Write Through (with Write Buffer)
6 Review: Cache Performance
CPU time = (CPU execution clock cycles + Memory stall clock cycles) x Clock cycle time
Memory stall clock cycles = (Reads x Read miss rate x Read miss penalty + Writes x Write miss rate x Write miss penalty)
Memory stall clock cycles = Memory accesses x Miss rate x Miss penalty
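As a rough illustration of how the two stall-cycle formulas relate, the sketch below evaluates both. The function names and the access counts, miss rates and penalties are made-up values for the example, not figures from the course. When reads and writes share the same miss rate and miss penalty, the split form reduces to the combined form.

```python
def stall_cycles_split(reads, read_miss_rate, read_miss_penalty,
                       writes, write_miss_rate, write_miss_penalty):
    """Memory stall clock cycles with separate read and write terms."""
    return (reads * read_miss_rate * read_miss_penalty
            + writes * write_miss_rate * write_miss_penalty)


def stall_cycles_combined(memory_accesses, miss_rate, miss_penalty):
    """Memory stall clock cycles with a single averaged miss rate and penalty."""
    return memory_accesses * miss_rate * miss_penalty


# Hypothetical workload: 700 reads, 300 writes, 5% miss rate, 40-cycle penalty.
split = stall_cycles_split(700, 0.05, 40, 300, 0.05, 40)
combined = stall_cycles_combined(700 + 300, 0.05, 40)
print(split, combined)  # both come to roughly 2000 stall cycles
```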
7 Review: Cache Performance
CPUtime = Instruction Count x (CPI execution + Mem accesses per instruction x Miss rate x Miss penalty) x Clock cycle time
Misses per instruction = Memory accesses per instruction x Miss rate
CPUtime = IC x (CPI execution + Misses per instruction x Miss penalty) x Clock cycle time
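A small worked example may make the CPU time formula concrete. The sketch below is an illustration under assumed inputs: the function names and every number in the usage lines (instruction count, CPI, accesses per instruction, miss rate, miss penalty, clock period) are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def misses_per_instruction(mem_accesses_per_instr, miss_rate):
    """Misses per instruction = memory accesses per instruction x miss rate."""
    return mem_accesses_per_instr * miss_rate


def cpu_time(instruction_count, cpi_execution, mem_accesses_per_instr,
             miss_rate, miss_penalty_cycles, clock_cycle_time_s):
    """CPU time = IC x (CPI execution + misses per instr x miss penalty) x cycle time."""
    mpi = misses_per_instruction(mem_accesses_per_instr, miss_rate)
    effective_cpi = cpi_execution + mpi * miss_penalty_cycles
    return instruction_count * effective_cpi * clock_cycle_time_s


# Hypothetical machine: 1 million instructions, base CPI 2.0, 1.5 memory
# accesses per instruction, 2% miss rate, 50-cycle miss penalty, 1 ns clock.
with_misses = cpu_time(1_000_000, 2.0, 1.5, 0.02, 50, 1e-9)
perfect_cache = cpu_time(1_000_000, 2.0, 1.5, 0.0, 50, 1e-9)
print(f"with misses: {with_misses * 1e3:.2f} ms, perfect cache: {perfect_cache * 1e3:.2f} ms")
```

With these assumed inputs the memory stalls add 1.5 cycles to a base CPI of 2.0, so the miss term accounts for a large share of the total time even at a 2% miss rate.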
8 Review: Improving Cache Performance
1. Reduce the miss rate,
2. Reduce the miss penalty, or
3. Reduce the time to hit in the cache.
9 Reducing Misses
Classifying Misses: 3 Cs
Compulsory: The first access to a block is not in the cache, so the block must be brought into the cache. Also called cold start misses or first reference misses. (Misses in even an Infinite Cache)
Capacity: If the cache cannot contain all the blocks needed during execution of a program, capacity misses will occur due to blocks being discarded and later retrieved. (Misses in Fully Associative Size X Cache)
Conflict: If the block-placement strategy is set associative or direct mapped, conflict misses (in addition to compulsory and capacity misses) will occur because a block can be discarded and later retrieved if too many blocks map to its set. Also called collision misses or interference misses.
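The three definitions above suggest a simple way to classify each miss in a trace. The sketch below is one possible illustration, not the course's own tool: it studies a direct-mapped cache and compares it against two reference models, an infinite cache (a set of blocks seen so far) and a fully associative cache of the same size. The LRU replacement policy for the fully associative model, the function name, and the parameter names are all assumptions made for the example.

```python
from collections import OrderedDict


def classify_misses(block_addresses, num_blocks):
    """Classify misses of a direct-mapped cache using the 3 Cs model.

    block_addresses: sequence of block numbers (byte address // block size).
    num_blocks: cache capacity in blocks (hypothetical parameter).
    """
    seen = set()                         # infinite cache: every block ever referenced
    fully_assoc = OrderedDict()          # fully associative LRU cache, same capacity
    direct_mapped = [None] * num_blocks  # the cache under study
    counts = {"hit": 0, "compulsory": 0, "capacity": 0, "conflict": 0}

    for block in block_addresses:
        # Reference model: fully associative cache with LRU replacement.
        fa_hit = block in fully_assoc
        if fa_hit:
            fully_assoc.move_to_end(block)       # mark as most recently used
        else:
            if len(fully_assoc) == num_blocks:
                fully_assoc.popitem(last=False)  # evict least recently used
            fully_assoc[block] = True

        # Cache under study: direct mapped, index = block number mod cache size.
        index = block % num_blocks
        dm_hit = direct_mapped[index] == block
        direct_mapped[index] = block

        if dm_hit:
            counts["hit"] += 1
        elif block not in seen:
            counts["compulsory"] += 1   # would miss even in an infinite cache
        elif not fa_hit:
            counts["capacity"] += 1     # would miss even if fully associative
        else:
            counts["conflict"] += 1     # only the placement restriction caused it
        seen.add(block)
    return counts


# Blocks 0 and 4 map to the same line of a 4-block direct-mapped cache.
print(classify_misses([0, 4, 0, 4], num_blocks=4))
# Five distinct blocks do not fit in four lines, so re-touching block 0 is a capacity miss.
print(classify_misses([0, 1, 2, 3, 4, 0], num_blocks=4))
```

In the first trace the two repeat accesses are conflict misses, because a fully associative cache of the same size would have held both blocks. In the second trace the final access to block 0 is a capacity miss, because even the fully associative model has evicted it by then.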

This note was uploaded on 01/21/2012 for the course CSCI 593 taught by Professor Hamnes during the Spring '11 term at St. Cloud.
