L7-8_P6_2010

Computer Architecture 2010 – P6 uArch

Computer Architecture
The P6 Micro-Architecture
An Example of an Out-Of-Order Micro-processor
Lihu Rappoport and Adi Yoaz

The P6 Family

Features:
- Out Of Order execution: Data flow analysis
- Register renaming
- Speculative Execution
- Multiple Branch prediction
- Super pipeline: 12 pipe stages

Processor      Year   Freq (MHz)   Bus (MHz)   L2 cache    Process (μ)
Pentium® Pro   1995   150~200      60/66       256/512K*   0.5, 0.35
Pentium® II    1997   233~450      66/100      512K*       0.35, 0.25
Pentium® III   1999   450~1400     100/133     256/512K    0.25, 0.18, 0.13
Pentium® M     2003   900~2260     400/533     1M / 2M     0.13, 90nm
Core™          2005   1660~2330    667         2M          65nm
Core™ 2        2006   1800~2930    800/1066    2/4/8M      65nm
*off die

P6 μArch

In-Order Front End
- BIU: Bus Interface Unit
- IFU: Instruction Fetch Unit (includes IC)
- BPU: Branch Prediction Unit
- ID: Instruction Decoder
- MS: Micro-Instruction Sequencer
- RAT: Register Alias Table

Out-of-order Core
- ROB: Reorder Buffer
- RRF: Real Register File
- RS: Reservation Stations
- IEU: Integer Execution Unit
- FEU: Floating-point Execution Unit
- AGU: Address Generation Unit
- MIU: Memory Interface Unit
- DCU: Data Cache Unit
- MOB: Memory Order Buffer
- L2: Level 2 cache

In-Order Retire

[Figure: P6 μArch block diagram showing the External Bus, BIU, IFU, BPU, ID, MS, RAT, RS, ROB, IEU, FEU, AGU, MIU, MOB, DCU, and L2]

P6 Pipeline

[Figure: pipeline diagram with stages I1-I8 (In-Order Front End), O1, Ex, O3 (Out-of-order Core), and R1-R2 (In-order Retirement); stage labels: Next IP, Icache, Decode, Reg Ren, RS Wr, RS disp, Ex, Retirement]

1: Next IP
2: ICache lookup
3: ILD (instruction length decode)
4: rotate
5: ID1
6: ID2
7: RAT - rename sources, ALLOC - assign destinations
8: ROB - read sources, RS - schedule data-ready uops for dispatch
9: RS - dispatch uops
10: EX
11: Retirement

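Stage 7 of the list above (RAT renames sources, ALLOC assigns destinations) can be illustrated with a small sketch. This is a toy model under stated assumptions, not the actual P6 logic: the `Renamer` class, its method names, the `ROBn` / `RRF.reg` labels, and the 40-entry ROB size are all hypothetical choices made for illustration, and the sketch does not model stalling when the ROB is full.

```python
from dataclasses import dataclass, field


@dataclass
class Renamer:
    """Toy RAT + ROB allocator (hypothetical names; not the real P6 design)."""
    rob_size: int = 40                        # assumed size, not from the slides
    rat: dict = field(default_factory=dict)   # arch register -> ROB entry index
    next_entry: int = 0

    def rename(self, dst, srcs):
        # RAT: rename sources. A source with an in-flight producer points at
        # that producer's ROB entry; otherwise it is read from the RRF.
        renamed = [
            f"ROB{self.rat[s]}" if s in self.rat else f"RRF.{s}" for s in srcs
        ]
        # ALLOC: assign the next ROB entry as the destination.
        # (A full ROB would stall here; that is not modeled.)
        entry = self.next_entry
        self.next_entry = (self.next_entry + 1) % self.rob_size
        self.rat[dst] = entry
        return f"ROB{entry}", renamed

    def retire(self, dst, entry):
        # At retirement the result is committed to the RRF. Drop the mapping
        # only if no younger uop has renamed the same register since.
        if self.rat.get(dst) == entry:
            del self.rat[dst]


r = Renamer()
print(r.rename("eax", ["ebx", "ecx"]))  # ('ROB0', ['RRF.ebx', 'RRF.ecx'])
print(r.rename("edx", ["eax"]))         # ('ROB1', ['ROB0'])
print(r.rename("eax", ["eax", "edx"]))  # ('ROB2', ['ROB0', 'ROB1'])
```

The third uop reads the older `eax` value from ROB0 while writing a new copy to ROB2, which is how renaming removes the false dependence between the two writes to `eax`.
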
In-Order Front End

- BPU – Branch Prediction Unit – predict next fetch address
- IFU – Instruction Fetch Unit
  - iTLB translates virtual to physical address (access PMH on miss)
  - IC supplies 16 bytes/cycle (access L2 cache on miss)
- ILD – Instruction Length Decode – split bytes to instructions
- IQ – Instruction Queue – buffer the instructions
- ID – Instruction Decode – decode instructions into uops
- MS – Micro-Sequencer – provides uops for complex instructions

[Figure: front-end block diagram with Next IP Mux, BPU, IFU, ILD, IQ, ID, MS, and IDQ; the data passed along changes from bytes to instructions to uops]

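The ILD step (split a stream of fetched bytes into instructions) can be sketched as follows. The 16-byte fetch width comes from the slide; everything else is an assumption for illustration. In particular, `toy_length` is a made-up encoding in which the low 4 bits of the first byte give the instruction length. Real x86 length decoding depends on prefixes, opcode, ModRM, and more, and is not modeled here.

```python
FETCH_BYTES = 16  # from the slide: the IC supplies 16 bytes per cycle


def toy_length(first_byte):
    """Hypothetical encoding: low 4 bits of the first byte give the length."""
    return max(1, first_byte & 0x0F)


def split_instructions(code):
    """Yield (cycle, instruction_bytes), consuming one fetch block per cycle."""
    pending = b""  # start of an instruction that straddles a block boundary
    for cycle, start in enumerate(range(0, len(code), FETCH_BYTES)):
        window = pending + code[start:start + FETCH_BYTES]
        pos = 0
        while pos < len(window):
            n = toy_length(window[pos])
            if pos + n > len(window):
                break  # incomplete instruction: wait for the next block
            yield cycle, window[pos:pos + n]
            pos += n
        pending = window[pos:]
    # Any bytes left in `pending` here belong to a truncated instruction
    # and are dropped.


# Instructions of 3, 1, 14, and 2 bytes; the third crosses the block boundary.
code = bytes([0x03, 0xAA, 0xBB, 0x01, 0x0E] + [0] * 13 + [0x02, 0xFF])
for cycle, insn in split_instructions(code):
    print(cycle, len(insn))
```

The instruction that crosses the 16-byte boundary only completes in the second cycle, which is one reason a queue (the IQ) between length decode and the decoders is useful.
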
Branch Prediction Implementation

- Use local history to predict direction
- Need to predict multiple branches
  - Need to predict branches before previous branches are resolved
  - Branch history updated first based on prediction, later based on actual execution (speculative history)
- Target address taken from BTB
- Prediction rate: ~92%
  - ~60 instructions between mispredictions
  - High prediction rate is crucial for long pipelines
  - Especially important for OOOE, speculative execution:

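The speculative-history idea above (update the history with the prediction first, correct it later from actual execution) can be sketched as a small local-history predictor. This is a minimal illustration under assumptions, not the P6 implementation: the `LocalPredictor` class, the 4-bit history length, the 2-bit saturating counters, and the dictionary-based tables are all hypothetical choices, and BTB target prediction is left out.

```python
class LocalPredictor:
    """Toy two-level local predictor with speculative history update."""

    def __init__(self, history_bits=4):      # history length is an assumption
        self.mask = (1 << history_bits) - 1
        self.history = {}    # branch address -> speculative local history
        self.counters = {}   # (address, history) -> 2-bit saturating counter

    def predict(self, pc):
        hist = self.history.get(pc, 0)
        taken = self.counters.get((pc, hist), 1) >= 2  # start weakly not-taken
        # Speculative update: shift the prediction in immediately, so the next
        # prediction for this branch can be made before this one resolves.
        self.history[pc] = ((hist << 1) | int(taken)) & self.mask
        return taken, hist   # hist is the checkpoint needed at resolve time

    def resolve(self, pc, checkpoint, predicted, actual):
        # Train the counter that produced the prediction with the real outcome.
        key = (pc, checkpoint)
        c = self.counters.get(key, 1)
        self.counters[key] = min(3, c + 1) if actual else max(0, c - 1)
        if predicted != actual:
            # Misprediction: rebuild the history from the checkpoint and the
            # actual outcome, discarding any younger speculative bits.
            self.history[pc] = ((checkpoint << 1) | int(actual)) & self.mask


# A loop branch that is taken three times, then not taken, repeatedly.
pattern = [True, True, True, False] * 25
bp = LocalPredictor()
misses = 0
for actual in pattern:
    predicted, checkpoint = bp.predict(0x400)
    bp.resolve(0x400, checkpoint, predicted, actual)
    misses += predicted != actual
print(misses)  # 6
```

All six misses occur during warm-up; once each history pattern has been seen, the repeating loop branch is predicted correctly. As a rough consistency check on the slide's figures, a ~92% prediction rate means about one misprediction per 1/0.08 = 12.5 branches, so ~60 instructions between mispredictions would correspond to roughly one branch every 4.8 instructions.
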

This note was uploaded on 04/14/2011 for the course CS 234267 taught by Professor Rapaport during the Spring '07 term at Technion.
