EE108b Lecture 13, C. Kozyrakis

EE108B Lecture 13
Caches Wrap Up
Processes, Interrupts, and Exceptions
Christos Kozyrakis
Stanford University
http://eeclass.stanford.edu/ee108b

Announcements
• Lab3 and PA2.1 are due on 2/27
• Don’t forget to study the textbook
• Don’t forget the review sessions

Measuring Performance
• Memory system
  – Stalls include both cache miss stalls and write buffer stalls
  – Cache access time often determines the overall system clock cycle time, since the cache is often the slowest functional unit in the pipeline
• Memory stalls hurt processor performance
  – CPU Time = (CPU Cycles + Memory Stall Cycles) × Cycle Time
• Memory stalls are caused by both reads and writes
  – Memory Stall Cycles = Read Stall Cycles + Write Stall Cycles

Memory Performance
• Read stalls are fairly easy to understand
  – Read Stall Cycles = Reads/Program × Read Miss Rate × Read Miss Penalty
• Write stalls depend upon the write policy
  – Write-through: Write Stall Cycles = (Writes/Program × Write Miss Rate × Write Miss Penalty) + Write Buffer Stalls
  – Write-back: Write Stall Cycles = (Writes/Program × Write Miss Rate × Write Miss Penalty)
• “Write miss penalty” can be complex:
  – Can be partially hidden if the processor can continue executing
  – Can include extra time to write back a value we are evicting

Worst-Case Simplicity
• Assume that write and read misses cause the same delay
• In a multi-level cache be careful about local miss rates
  – Miss rate for the 2nd-level cache is often “high” if you look at the local miss rate (misses per reference into the cache)
  – But for misses per instruction, it is often much better

$$\text{Memory Stall Cycles} = \frac{\text{Memory accesses}}{\text{Program}} \times \text{Miss rate} \times \text{Miss penalty}$$

$$\text{Memory Stall Cycles} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Misses}}{\text{Instruction}} \times \text{Miss penalty}$$
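The two forms of the memory stall equation, and the difference between a local miss rate and misses per instruction, can be sketched in a few lines of Python. This is only an illustration: the function names and every workload number below (instruction count, accesses per instruction, miss rates, penalty) are assumptions chosen for the example and do not come from the lecture.

```python
def stall_cycles_from_accesses(accesses, miss_rate, miss_penalty):
    """Memory stall cycles = memory accesses x miss rate x miss penalty."""
    return accesses * miss_rate * miss_penalty


def stall_cycles_from_instructions(instructions, misses_per_instruction, miss_penalty):
    """Memory stall cycles = instructions x misses/instruction x miss penalty."""
    return instructions * misses_per_instruction * miss_penalty


# Hypothetical two-level cache workload (all values are assumptions).
instructions = 1_000_000
accesses_per_instruction = 1.4   # assumed: 1 instruction fetch + 0.4 data references
l1_miss_rate = 0.05              # assumed L1 miss rate per memory access
l2_local_miss_rate = 0.40        # assumed misses per reference that reaches L2
l2_miss_penalty = 100            # assumed cycles to service an L2 miss

# Only L1 misses reach the second-level cache.
l2_accesses = instructions * accesses_per_instruction * l1_miss_rate

# The same L2 behaviour expressed per instruction looks far smaller
# than the 40% local miss rate.
l2_misses_per_instruction = (
    accesses_per_instruction * l1_miss_rate * l2_local_miss_rate
)

by_accesses = stall_cycles_from_accesses(
    l2_accesses, l2_local_miss_rate, l2_miss_penalty
)
by_instructions = stall_cycles_from_instructions(
    instructions, l2_misses_per_instruction, l2_miss_penalty
)

print(f"L2 local miss rate:        {l2_local_miss_rate:.3f}")
print(f"L2 misses per instruction: {l2_misses_per_instruction:.3f}")
print(f"Stall cycles (access form):      {by_accesses:.0f}")
print(f"Stall cycles (instruction form): {by_instructions:.0f}")
```

Both forms give the same stall cycle count because they multiply the same quantities in a different grouping. The local miss rate looks high only because its denominator is the small number of references that reach the second-level cache.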
Cache Performance Example
• Consider the following
  – Miss rate for instruction access is 5%
  – Miss rate for data access is 8%
  – Data references per instruction are 0.4
  – CPI with perfect cache is 2
  – Read and write miss penalty is 20 cycles
    • Including possible write buffer stalls
• What is the performance of this machine relative to one without misses?

Performance Solution
• Find the CPI for the base system without misses
  – CPI_no misses = CPI_perfect = 2
• Find the CPI for the system with misses
  – Misses/instr = I Cache Misses + D Cache Misses = 0.05 + (0.08 × 0.4) = 0.082
  – Memory Stall Cycles = Misses/instr × Miss Penalty = 0.082 × 20 = 1.64
  – CPI_with misses = CPI_perfect + Memory Stall Cycles = 2 + 1.64 = 3.64
• Compare the performance

$$\frac{\text{Performance}_{\text{no misses}}}{\text{Performance}_{\text{with misses}}} = \frac{\text{CPI}_{\text{with misses}}}{\text{CPI}_{\text{no misses}}} = \frac{3.64}{2} = 1.82$$

Another Cache Problem
• Given the following data...
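The arithmetic in the Performance Solution slide can be checked with a short Python sketch. The input values are the ones given in the Cache Performance Example; the function and variable names are hypothetical, and the code assumes exactly one instruction fetch per instruction, which is what the 0.05 term in the slide implies.

```python
def cpi_with_misses(cpi_perfect, instr_miss_rate, data_miss_rate,
                    data_refs_per_instr, miss_penalty):
    """Return (misses per instruction, stall cycles per instruction, CPI)."""
    # Assumption: every instruction performs exactly one instruction fetch.
    misses_per_instr = instr_miss_rate + data_miss_rate * data_refs_per_instr
    stall_cycles_per_instr = misses_per_instr * miss_penalty
    return (misses_per_instr,
            stall_cycles_per_instr,
            cpi_perfect + stall_cycles_per_instr)


# Values from the Cache Performance Example slide.
cpi_perfect = 2.0
misses, stalls, cpi_real = cpi_with_misses(
    cpi_perfect=cpi_perfect,
    instr_miss_rate=0.05,
    data_miss_rate=0.08,
    data_refs_per_instr=0.4,
    miss_penalty=20,
)

# With the same instruction count and cycle time, the performance
# ratio reduces to the ratio of the two CPIs.
slowdown = cpi_real / cpi_perfect

print(f"Misses per instruction:       {misses:.3f}")    # 0.082
print(f"Stall cycles per instruction: {stalls:.2f}")    # 1.64
print(f"CPI with misses:              {cpi_real:.2f}")  # 3.64
print(f"Perfect-cache speedup:        {slowdown:.2f}")  # 1.82
```

The machine with a perfect cache is therefore about 1.82 times faster, matching the slide.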
This note was uploaded on 03/08/2011 for the course EE 108B at Stanford.