ece475-l12

ece475-l12 - 1 ECE 475/CS 416 Computer Architecture- Branch...


ECE 475/CS 416 Computer Architecture: Branch Prediction
Edward Suh
Computer Systems Laboratory
[email protected]
ECE 475/CS 416: Computer Architecture, Fall 2007, Prof. Suh

Announcements
• Code of Academic Integrity
• Homework 3 and Lab 3 out
• Re-grade request (Lab 2)

Review
• Virtual memory systems provide the illusion of a large, private, uniform storage
    ◦ Address translation and protection check
    ◦ Demand paging
• A good VM design needs to be fast and space efficient
    ◦ Space optimization: hierarchical page tables
    ◦ Latency optimization: TLB & virtually-addressed L1 caches
• Now let's go back to the processor core
    ◦ Dynamic scheduling handles various types of data hazards
    ◦ But what about control hazards?

Run-Length Between Branches
Average dynamic instruction mix from SPEC92:

              SPECint92   SPECfp92
    ALU         39 %        13 %
    FPU Add                 20 %
    FPU Mult                13 %
    load        26 %        23 %
    store        9 %         9 %
    branch      16 %         8 %
    other       10 %        12 %

SPECint92: compress, eqntott, espresso, gcc, li
SPECfp92: doduc, ear, hydro2d, mdljdp2, su2cor

What is the average run length between branches?

MIPS Branches and Jumps
Each instruction fetch depends on one or two pieces of information from the preceding instruction:
1) Is the preceding instruction a taken branch?
2) If so, what is the target address?

    Instruction   Taken known?          Target known?
    J             After Inst. Decode    After Inst. Decode
    JR            After Inst. Decode    After Reg. Fetch
    BEQZ/BNEZ     After Reg. Fetch *    After Inst. Decode

* Assuming zero detect on register read

Branch Penalties in Modern Pipelines
UltraSPARC-III instruction fetch pipeline stages (in-order issue, 4-way superscalar, 750 MHz, 2000):

    A   PC Generation/Mux
    P   Instruction Fetch Stage 1
    F   Instruction Fetch Stage 2
    B   Branch Address Calc/Begin Decode
    I   Complete Decode
    J   Steer Instructions to Functional Units
    R   Register File Read
    E   Integer Execute
        Remainder of execute pipeline (+ another 6 stages)

Figure labels: Branch Target Address Known; Branch Direction & Jump Register Target Known

Branch Prediction
• Motivation: branch penalties limit performance of deeply pipelined processors
    ◦ Gets worse when (a) issue width increases, (b) pipeline depth increases
• Required hardware support:
    ◦ Prediction structures: branch history tables, branch target buffers, etc.
        – Must figure out (1) that it's a branch, (2) the outcome, (3) the target address
    ◦ Mispredict recovery mechanisms:
        – Keep result computation separate from commit
        – Kill instructions following the branch in the pipeline
        – Restore state to the state following the branch

Static Branch Prediction
Overall probability a branch is taken is ~60-70% but:...
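Returning to the run-length question posed with the SPEC92 instruction mix: one way to read it is that, if a fraction f of dynamic instructions are branches, a branch shows up on average once every 1/f instructions. The sketch below works that arithmetic for the two branch fractions in the table (16 % and 8 %). The function name and the choice to count the branch itself in the run are assumptions made for illustration, not something the slides specify; the number of non-branch instructions between branches would be one less.

```python
# Branch fractions taken from the SPEC92 dynamic instruction mix in the notes.
BRANCH_FRACTION = {"SPECint92": 0.16, "SPECfp92": 0.08}


def average_run_length(branch_fraction: float) -> float:
    """Average dynamic instructions per branch, counting the branch itself.

    Hypothetical helper: assumes branches are a fixed fraction of the
    dynamic instruction stream, so the mean spacing is 1 / fraction.
    """
    if not 0.0 < branch_fraction <= 1.0:
        raise ValueError("branch fraction must be in (0, 1]")
    return 1.0 / branch_fraction


if __name__ == "__main__":
    for suite, fraction in BRANCH_FRACTION.items():
        run = average_run_length(fraction)
        print(f"{suite}: one branch about every {run:.2f} instructions")
```

Under this reading, integer code hits a branch roughly every 6 instructions and floating-point code roughly every 12, which is why a multi-cycle branch penalty matters more for integer workloads.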

This note was uploaded on 02/19/2008 for the course ECE 4750 taught by Professor Suh during the Fall '07 term at Cornell.

