11 - Memory

CS 4290/6290 Caching and Memory Technology

CS 4290/6290 – Spring 2009 – Prof. Milos Prvulovic

AMAT = hit time + miss rate × miss penalty
• Reduce hit time (done)
• Reduce miss penalty (done)
• Reduce miss rate

Cycles_MemoryStall = CacheMisses × (MissLatency_Total - MissLatency_Overlapped)
• Increase overlapped miss latency
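The two formulas above can be written directly as code. The following is a minimal sketch; the function and parameter names are illustrative and do not come from the slides, and all times are assumed to be in cycles.

```c
#include <assert.h>

/* Average memory access time, in cycles.
 * hit_time and miss_penalty are in cycles; miss_rate is a fraction in [0, 1]. */
double amat(double hit_time, double miss_rate, double miss_penalty)
{
    assert(miss_rate >= 0.0 && miss_rate <= 1.0);
    return hit_time + miss_rate * miss_penalty;
}

/* Memory stall cycles: only the part of each miss's latency that is NOT
 * overlapped with useful work actually stalls the processor. */
double memory_stall_cycles(double cache_misses,
                           double miss_latency_total,
                           double miss_latency_overlapped)
{
    assert(miss_latency_overlapped <= miss_latency_total);
    return cache_misses * (miss_latency_total - miss_latency_overlapped);
}
```

With assumed values of a 1-cycle hit time, a 5% miss rate, and a 100-cycle miss penalty, `amat(1.0, 0.05, 100.0)` comes to about 6 cycles. Overlapping 30 of those 100 cycles across 1000 misses reduces the stall from 100000 to 70000 cycles.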

The “3 Cs”
• Compulsory: have to have these
  • Miss the first time each block is accessed
• Capacity: due to limited cache capacity
  • Would not have them if cache size was infinite
• Conflict: due to limited associativity
  • Would not have them if cache was fully associative
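One common way to apply these definitions is to replay a trace of block addresses against the real cache and against a fully associative cache of the same capacity. The sketch below assumes a direct-mapped cache of 4 blocks and LRU replacement in the fully associative reference cache. The sizes, the names, and the LRU policy are assumptions made for illustration, not details from the slides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_LINES  4   /* cache capacity in blocks (assumed, tiny on purpose) */
#define MAX_BLOCKS 64  /* block addresses must be below this (assumed) */

struct miss_counts { int hits, compulsory, capacity, conflict; };

/* Fully associative LRU cache of NUM_LINES blocks; lru[0] is most recent.
 * Returns true on a hit and always leaves `block` as the most recent entry. */
static bool fa_lru_access(int lru[], int *used, int block)
{
    int pos = -1;
    for (int i = 0; i < *used; i++)
        if (lru[i] == block) { pos = i; break; }
    bool hit = (pos >= 0);
    if (!hit) {
        if (*used < NUM_LINES) (*used)++;
        pos = *used - 1;           /* new slot, or the LRU slot when full */
    }
    for (int i = pos; i > 0; i--)  /* shift younger entries back by one */
        lru[i] = lru[i - 1];
    lru[0] = block;
    return hit;
}

struct miss_counts classify_misses(const int trace[], size_t n)
{
    struct miss_counts c = {0, 0, 0, 0};
    int  dm[NUM_LINES];                  /* direct-mapped cache contents */
    bool dm_valid[NUM_LINES] = {false};
    bool seen[MAX_BLOCKS] = {false};     /* has this block ever been accessed? */
    int  lru[NUM_LINES];
    int  used = 0;

    for (size_t k = 0; k < n; k++) {
        int block = trace[k];
        assert(block >= 0 && block < MAX_BLOCKS);
        int line = block % NUM_LINES;
        bool dm_hit = dm_valid[line] && dm[line] == block;
        bool fa_hit = fa_lru_access(lru, &used, block);

        if (dm_hit)            c.hits++;
        else if (!seen[block]) c.compulsory++;  /* first access ever */
        else if (!fa_hit)      c.capacity++;    /* full associativity would miss too */
        else                   c.conflict++;    /* only limited associativity misses */

        seen[block] = true;
        dm[line] = block;
        dm_valid[line] = true;
    }
    return c;
}
```

For the trace 0, 4, 0, 4, both blocks map to line 0, so the two repeat accesses count as conflict misses. For 0, 1, 2, 3, 4, 0, the final access to block 0 misses even in the fully associative cache, so it counts as a capacity miss.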

Victim Caches
• Recently kicked-out blocks are kept in a small cache
  • If we miss on those blocks, we can get them fast
• Why does it work: conflict misses
  • Misses that we have in our N-way set-associative cache, but would not have if the cache was fully associative
• Example: direct-mapped L1 cache and a 16-line fully associative victim cache
  • Victim cache prevents thrashing when several “popular” blocks want to go to the same entry
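The thrashing case in the example can be simulated in a few lines. This is a rough sketch, not the slide author's design: it assumes an 8-line direct-mapped L1, FIFO replacement in the victim cache, and a swap of the two blocks on a victim hit. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define L1_LINES     8   /* direct-mapped L1 size in blocks (assumed) */
#define VICTIM_LINES 16  /* fully associative victim cache, as in the example */

struct vc_stats { int l1_hits, victim_hits, misses; };

struct vc_stats run_with_victim_cache(const int trace[], size_t n)
{
    struct vc_stats s = {0, 0, 0};
    int  l1[L1_LINES];
    bool l1_valid[L1_LINES] = {false};
    int  victim[VICTIM_LINES];
    bool victim_valid[VICTIM_LINES] = {false};
    int  next_victim = 0;  /* FIFO replacement pointer (assumed policy) */

    for (size_t k = 0; k < n; k++) {
        int block = trace[k];
        assert(block >= 0);
        int line = block % L1_LINES;
        if (l1_valid[line] && l1[line] == block) { s.l1_hits++; continue; }

        /* L1 miss: search every entry of the victim cache. */
        int slot = -1;
        for (int v = 0; v < VICTIM_LINES; v++)
            if (victim_valid[v] && victim[v] == block) { slot = v; break; }

        if (slot >= 0) {
            /* Victim hit: swap with the block currently in the L1 line.
             * The line is valid here, because the block was evicted from it. */
            s.victim_hits++;
            victim[slot] = l1[line];
        } else {
            /* Full miss: the block evicted from L1 moves to the victim cache. */
            s.misses++;
            if (l1_valid[line]) {
                victim[next_victim] = l1[line];
                victim_valid[next_victim] = true;
                next_victim = (next_victim + 1) % VICTIM_LINES;
            }
        }
        l1[line] = block;
        l1_valid[line] = true;
    }
    return s;
}
```

On the trace 0, 8, 0, 8, 0, 8, blocks 0 and 8 compete for L1 line 0. Only the first two accesses go to memory; the remaining four are served from the victim cache.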

Larger blocks
• Helps if there is more spatial locality

Larger caches
• Fewer capacity misses, but longer hit latency!
Higher associativity
• Fewer conflict misses, but longer hit latency!
… need to work through the AMAT equations to figure out which is better
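A short worked comparison shows why the answer depends on the numbers. All values here are hypothetical: design A is a direct-mapped cache with a 1-cycle hit time and a 5% miss rate, and design B is a more associative cache with a 2-cycle hit time and a 3% miss rate.

```latex
% Hypothetical numbers, for illustration only.
% AMAT = hit time + miss rate * miss penalty, all times in cycles.
\begin{align*}
\text{Penalty } 100:\quad & \text{AMAT}_A = 1 + 0.05 \times 100 = 6,
  \quad \text{AMAT}_B = 2 + 0.03 \times 100 = 5 \\
\text{Penalty } 20:\quad  & \text{AMAT}_A = 1 + 0.05 \times 20 = 2,
  \quad \text{AMAT}_B = 2 + 0.03 \times 20 = 2.6
\end{align*}
```

With the long miss penalty the lower miss rate of B wins; with the short penalty the faster hit time of A wins.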

Pseudo-Associative Caches
• Similar to way prediction
• Start with a direct-mapped cache
• If we miss on the “primary” entry, try another entry
Compiler optimizations
• Loop interchange
• Blocking
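The pseudo-associative lookup can be sketched as follows. The slide does not say how the second entry is chosen or what happens after a secondary hit. This sketch assumes the secondary entry is found by flipping the top index bit, and that a secondary hit swaps the two blocks so the next access hits the primary entry. Names and sizes are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_LINES 8  /* must be a power of two (assumed size) */

enum lookup_result { HIT_PRIMARY, HIT_SECONDARY, MISS };

struct pa_cache {
    int  block[NUM_LINES];
    bool valid[NUM_LINES];
};

/* Secondary entry: flip the top index bit (an assumed, commonly used choice). */
static int secondary_line(int line) { return line ^ (NUM_LINES / 2); }

enum lookup_result pa_lookup(struct pa_cache *c, int block)
{
    assert(block >= 0);
    int p = block % NUM_LINES;
    int s = secondary_line(p);

    if (c->valid[p] && c->block[p] == block)
        return HIT_PRIMARY;               /* fast hit, same as direct-mapped */

    if (c->valid[s] && c->block[s] == block) {
        /* Slow hit: swap the two entries so the next access is a fast hit. */
        int  old_block = c->block[p];
        bool old_valid = c->valid[p];
        c->block[p] = block;     c->valid[p] = true;
        c->block[s] = old_block; c->valid[s] = old_valid;
        return HIT_SECONDARY;
    }

    /* Miss: demote the primary occupant to the secondary entry, then
     * install the new block in the primary entry. */
    c->block[s] = c->block[p];
    c->valid[s] = c->valid[p];
    c->block[p] = block;
    c->valid[p] = true;
    return MISS;
}
```

Starting from a zero-initialized `struct pa_cache`, the sequence 0, 8, 0 gives a miss, a miss, and then a secondary hit, where a plain direct-mapped cache of the same size would miss all three times.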

Loop interchange
• For loops over multi-dimensional arrays
  • Example: matrices (2-dim arrays)
• Change order of iteration to match layout
  • Gets better spatial locality
  • Layout in C: last index changes first

Before (inner loop jumps from row to row):
    for (j = 0; j < 10000; j++)
        for (i = 0; i < 40000; i++)
            c[i][j] = a[i][j] + b[i][j];

After (inner loop walks along one row):
    for (i = 0; i < 40000; i++)
        for (j = 0; j < 10000; j++)
            c[i][j] = a[i][j] + b[i][j];

• a[i][j] and a[i+1][j] are 10000 elements apart
• a[i][j] and a[i][j+1] are next to each other
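The two distance claims follow from C's row-major layout and can be checked directly. The sketch below keeps the 10000-element row length from the example but assumes only 4 rows, so the array stays small; the function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>

#define ROWS 4      /* kept small so the array fits comfortably in memory */
#define COLS 10000  /* row length, as in the example */

static int a[ROWS][COLS];

/* Distance in elements between a[i][j] and a[i+1][j] (same column, next row).
 * Computed on byte addresses within the single object `a`. */
ptrdiff_t row_stride(void)
{
    return ((char *)&a[1][0] - (char *)&a[0][0]) / (ptrdiff_t)sizeof(int);
}

/* Distance in elements between a[i][j] and a[i][j+1] (same row, next column). */
ptrdiff_t column_stride(void)
{
    return ((char *)&a[0][1] - (char *)&a[0][0]) / (ptrdiff_t)sizeof(int);
}
```

Here `row_stride()` returns 10000 and `column_stride()` returns 1, which is why making `j` the inner loop index touches consecutive memory locations.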

Idea: overlap miss latency with useful work
• Also called “latency hiding”
Non-blocking caches
• A blocking cache services one access at a time
  • While a miss is serviced, other accesses are blocked (wait)
• Non-blocking caches remove this limitation
  • While a miss is serviced, the cache can process other requests
Prefetching
• Predict what will be needed and get it ahead of time
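Prefetching can be done by hardware or requested by software. As a small illustration of the software form, the sketch below uses `__builtin_prefetch`, a builtin provided by GCC and Clang (it is not standard C). The prefetch distance of 16 elements is an assumed, untuned value, and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define PREFETCH_DISTANCE 16  /* elements ahead; an assumed, untuned value */

/* Sums an array while asking the cache to start fetching data that will be
 * needed a few iterations from now, so the miss overlaps with the additions. */
long sum_with_prefetch(const int *data, size_t n)
{
    assert(data != NULL || n == 0);
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE]);  /* a hint only */
        sum += data[i];
    }
    return sum;
}
```

The prefetch is only a hint and never changes the result. For a simple sequential scan like this one, a hardware prefetcher may already hide most of the latency, so any benefit has to be measured rather than assumed.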