lect.13.OS1.4up

lect.13.OS1.4up - EE108b Lecture 13 C. Kozyrakis 1 EE108B...


EE108B Lecture 13
Caches Wrap Up
Processes, Interrupts, and Exceptions
Christos Kozyrakis
Stanford University
http://eeclass.stanford.edu/ee108b

Announcements
• Lab3 and PA2.1 are due on 2/27
• Don’t forget to study the textbook
• Don’t forget the review sessions

Measuring Performance
• Memory system
  – Stalls include both cache miss stalls and write buffer stalls
  – Cache access time often determines the overall system clock cycle time, since the cache is often the slowest functional unit in the pipeline
• Memory stalls hurt processor performance
  – CPU Time = (CPU Cycles + Memory Stall Cycles) × Cycle Time
• Memory stalls are caused by both reading and writing
  – Mem Stall Cycles = Read Stall Cycles + Write Stall Cycles

Memory Performance
• Read stalls are fairly easy to understand
  – Read Stall Cycles = Reads/Prog × Read Miss Rate × Read Miss Penalty
• Write stalls depend upon the write policy
  – Write-through: Write Stall = (Writes/Prog × Write Miss Rate × Write Miss Penalty) + Write Buffer Stalls
  – Write-back: Write Stall = (Writes/Prog × Write Miss Rate × Write Miss Penalty)
• “Write miss penalty” can be complex:
  – Can be partially hidden if the processor can continue executing
  – Can include extra time to write back a value we are evicting

Worst-Case Simplicity
• Assume that write and read misses cause the same delay
  – Memory Stall Cycles = (Memory accesses / Program) × Miss rate × Miss penalty
  – Memory Stall Cycles = (Instructions / Program) × (Misses / Instruction) × Miss penalty
• In a multi-level cache be careful about local miss rates
  – Miss rate for the 2nd level cache is often “high” if you look at the local miss rate (misses per reference into the cache)
  – But for misses per instruction, it is often much better
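The stall formulas on the slides above can be sketched as a few small functions. This is only an illustration: the function names and every number in the usage lines are hypothetical, not taken from the lecture. The last function shows why a second-level cache with a "high" local miss rate can still look good per instruction, on the assumption that only first-level misses reach the second level.

```python
def read_stall_cycles(reads, read_miss_rate, read_miss_penalty):
    """Read Stall Cycles = Reads/Prog x Read Miss Rate x Read Miss Penalty."""
    return reads * read_miss_rate * read_miss_penalty


def write_stall_cycles(writes, write_miss_rate, write_miss_penalty,
                       write_buffer_stalls=0.0):
    """Write stalls; leave write_buffer_stalls at 0 for a write-back cache."""
    return writes * write_miss_rate * write_miss_penalty + write_buffer_stalls


def cpu_time(cpu_cycles, memory_stall_cycles, cycle_time):
    """CPU Time = (CPU Cycles + Memory Stall Cycles) x Cycle Time."""
    return (cpu_cycles + memory_stall_cycles) * cycle_time


def l2_misses_per_instruction(l1_misses_per_instruction, l2_local_miss_rate):
    """Assumes only L1 misses reach L2, so the local rate scales L1 misses."""
    return l1_misses_per_instruction * l2_local_miss_rate


# Hypothetical program: 1000 reads, 500 writes, 20-cycle miss penalty.
reads_stall = read_stall_cycles(1000, 0.05, 20)
writes_stall = write_stall_cycles(500, 0.10, 20, write_buffer_stalls=100)
total_time = cpu_time(10_000, reads_stall + writes_stall, cycle_time=1e-9)

# A 50% local L2 miss rate applied to 0.08 L1 misses per instruction.
l2_mpi = l2_misses_per_instruction(0.08, 0.5)
```

With these made-up inputs, a second-level cache that misses on half of the references it sees still misses only about 0.04 times per instruction, which is the contrast the last slide draws.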

Cache Performance Example
• Consider the following
  – Miss rate for instruction access is 5%
  – Miss rate for data access is 8%
  – Data references per instruction are 0.4
  – CPI with perfect cache is 2
  – Read and write miss penalty is 20 cycles
    • Including possible write buffer stalls
• What is the performance of this machine relative to one without misses?

Performance Solution
• Find the CPI for the base system without misses
  – CPI (no misses) = CPI (perfect) = 2
• Find the CPI for the system with misses
  – Misses/instr = I Cache Misses + D Cache Misses
  – = 0.05 + (0.08 × 0.4) = 0.082
  – Memory Stall Cycles per instruction = Misses/instr × Miss Penalty
  – = 0.082 × 20 = 1.64
  – CPI (with misses) = CPI (perfect) + Memory Stall Cycles per instruction
  – = 2 + 1.64 = 3.64
• Compare the performance
  – Performance (no misses) / Performance (with misses) = CPI (with misses) / CPI (no misses) = 3.64 / 2 = 1.82

Another Cache Problem
• Given the following data...
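
The arithmetic in the Performance Solution slide can be checked with a short script. This is a minimal sketch: the function name `cpi_with_misses` is made up for illustration, and it assumes one instruction fetch per instruction, which is how the slide treats the 5% instruction miss rate.

```python
def cpi_with_misses(cpi_perfect, i_miss_rate, d_miss_rate,
                    data_refs_per_instr, miss_penalty):
    """CPI including memory stalls, with one shared read/write miss penalty."""
    # One instruction fetch per instruction plus the data references.
    misses_per_instr = i_miss_rate + d_miss_rate * data_refs_per_instr
    stall_cycles_per_instr = misses_per_instr * miss_penalty
    return cpi_perfect + stall_cycles_per_instr


cpi_perfect = 2.0
cpi_real = cpi_with_misses(cpi_perfect, i_miss_rate=0.05, d_miss_rate=0.08,
                           data_refs_per_instr=0.4, miss_penalty=20)

# With the same instruction count and clock, performance scales as 1/CPI.
speed_ratio = cpi_real / cpi_perfect

print(f"CPI with misses: {cpi_real:.2f}")          # CPI with misses: 3.64
print(f"Perfect-cache speedup: {speed_ratio:.2f}")  # Perfect-cache speedup: 1.82
```

The ratio of CPIs stands in for the ratio of performance only because instruction count and cycle time are the same for both machines in this example.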

This note was uploaded on 03/08/2011 for the course EE 108B at Stanford.

