Compiler Optimizations (continued), Dynamic Scheduling

Computer Organization and Design: The Hardware/Software Interface

CS152 Computer Architecture and Engineering
Lecture 14
Static Scheduling (Continued)
Dynamic Scheduling: Scoreboards
March 17th, 2003
John Kubiatowicz

3/17/04 ©UCB Spring 2004 CS152 / Kubiatowicz Lec14.2

The Big Picture: Where are We Now?
° The Five Classic Components of a Computer: Processor (Control, Datapath), Memory, Input, Output
° Today's Topics:
  – Recap last lecture/Review
  – Scoreboard
  – Administrivia
  – Tomasulo scheduling algorithm
  – Tomasulo loop unrolling

Recall: Can we somehow make CPI closer to 1?
° Let's assume full pipelining.
° Possible delay slots around a 4-cycle multiply instruction:
[Figure: instruction sequences built from multf $F0,$F2,$F4; ld $F0,0($r5); multf $F4,$F0,$F3; sw $F0,4($R2); and addf $F6,$F10,$F0, separated by delay slots delay-1 through delay-3. Below them, a pipeline diagram with stages Fetch, Decode, Ex1, Ex2, Ex3, Ex4, WB shows multf, delay1, delay2, delay3, addf, annotated "Earliest forwarding for 4-cycle instructions" and "Earliest forwarding for 1-cycle instructions".]

Recall: Revised FP Loop Minimizing Stalls

Instruction         Execute latency   Instruction          Use latency
producing result    in cycles         using result         in cycles
FP ALU op           4                 Another FP ALU op    3
FP ALU op           4                 Store double         2
Load double         2                 FP ALU op            1

1 Loop: LD   F0,0(R1)
2       stall
3       ADDD F4,F0,F2
4       SUBI R1,R1,8
5       BNEZ R1,Loop    ;delayed branch
6       SD   8(R1),F4   ;altered when moved past SUBI

° Swap BNEZ and SD by changing the address of SD.
° 6 clocks (CPI = 6/5 = 1.2).
° Unroll the loop 4 times to make the code faster?
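The 6-clock count for the revised loop can be reproduced from the use-latency column of the table. The following is a minimal sketch, not the lecture's own tool: it assumes a single-issue, in-order pipeline, treats every producer/consumer pair missing from the table (for example SUBI feeding BNEZ) as having zero use latency, and uses the hypothetical names `USE_LATENCY` and `issue_cycles`.

```python
# Use latencies from the slide's table: stall cycles required between a
# producer and a dependent consumer.
# ASSUMPTION: pairs not listed in the table need no stall (latency 0).
USE_LATENCY = {
    ("fp_alu", "fp_alu"): 3,
    ("fp_alu", "store"): 2,
    ("load", "fp_alu"): 1,
}


def issue_cycles(program):
    """Return the 1-based issue cycle of each instruction.

    program: list of (kind, dest, sources) tuples in program order.
    Single-issue, in-order model: each instruction issues at least one
    cycle after the previous one, and late enough to respect the use
    latency of every register it reads.
    """
    produced = {}  # register -> (producer kind, producer issue cycle)
    cycles = []
    prev = 0
    for kind, dest, sources in program:
        cycle = prev + 1
        for reg in sources:
            if reg in produced:
                p_kind, p_cycle = produced[reg]
                gap = USE_LATENCY.get((p_kind, kind), 0)
                cycle = max(cycle, p_cycle + 1 + gap)
        if dest is not None:
            produced[dest] = (kind, cycle)
        cycles.append(cycle)
        prev = cycle
    return cycles


# The revised loop from the slide, one tuple per instruction.
loop = [
    ("load",   "F0", ["R1"]),        # LD   F0,0(R1)
    ("fp_alu", "F4", ["F0", "F2"]),  # ADDD F4,F0,F2
    ("int",    "R1", ["R1"]),        # SUBI R1,R1,8
    ("branch", None, ["R1"]),        # BNEZ R1,Loop
    ("store",  None, ["F4", "R1"]),  # SD   8(R1),F4
]

cycles = issue_cycles(loop)
total = cycles[-1]        # last issue cycle = clocks per iteration
cpi = total / len(loop)   # clocks per instruction
print(cycles, total, cpi)
```

Under these assumptions the ADDD waits one cycle for the load, and moving SUBI and BNEZ between ADDD and SD hides the two-cycle gap that the store would otherwise have to stall for.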
Recall: Unrolled Loop That Minimizes Stalls

1  Loop: LD   F0,0(R1)
2        LD   F6,-8(R1)
3        LD   F10,-16(R1)
4        LD   F14,-24(R1)
5        ADDD F4,F0,F2
6        ADDD F8,F6,F2
7        ADDD F12,F10,F2
8        ADDD F16,F14,F2
9        SD   0(R1),F4
10       SD   -8(R1),F8
11       SD   -16(R1),F12
12       SUBI R1,R1,#32
13       BNEZ R1,LOOP
14       SD   8(R1),F16    ; 8-32 = -24

14 clock cycles, or 3.5 per iteration
CPI = 14/14 = 1

° What assumptions were made when we moved code?
  – OK to move the store past SUBI even though SUBI changes the register
  – OK to move loads before stores: do we get the right data?
  – When is it safe for the compiler to make such changes?

Getting CPI < 1: Issuing Multiple Instructions/Cycle
° Two main variations: Superscalar and VLIW
° Superscalar: varying number of instructions/cycle (1 to 6)
  – Parallelism and dependencies determined/resolved by HW
  – IBM PowerPC 604, Sun UltraSparc, DEC Alpha 21164, HP 7100
° Very Long Instruction Words (VLIW): fixed number of instructions (16)
  – Parallelism determined by compiler
  – Pipeline is exposed; compiler must schedule delays to get the right result
° Explicit Parallel Instruction Computer (EPIC)/Intel
  – 128-bit packets containing 3 instructions (can execute sequentially)
  – Can link 128-bit packets together to allow more parallelism
  – Compiler determines parallelism; HW checks dependencies and forwards/stalls

° Superscalar DLX: 2 instructions, 1 FP & 1 anything else
  – Fetch 64 bits/clock cycle; Int on left, FP on right
  – Can only issue 2nd instruction if 1st instruction issues
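The code-motion questions raised on the unrolled-loop slide above can be checked with a small functional model. This is only a sketch under stated assumptions: memory is a Python dict keyed by byte address, the element count is a multiple of 4, the array ends just above address 0 (so `BNEZ R1` terminates the loop), and the function names `scheduled_loop` and `unrolled_loop` are hypothetical. It shows that a store moved past SUBI stays correct once its offset is adjusted (8 in the single-iteration loop, 8 - 32 = -24 relative to the old R1 in the unrolled one), and that hoisting loads above stores is safe here because the four addresses in one unrolled iteration are distinct.

```python
def scheduled_loop(mem, r1, f2):
    """Single-iteration loop from the slides, with SD in the branch delay slot."""
    mem = dict(mem)  # work on a copy so the caller's memory is untouched
    while True:
        f0 = mem[r1]        # LD   F0,0(R1)
        f4 = f0 + f2        # ADDD F4,F0,F2
        r1 -= 8             # SUBI R1,R1,8
        taken = r1 != 0     # BNEZ R1,Loop (delayed branch)
        mem[r1 + 8] = f4    # SD   8(R1),F4: the +8 undoes the SUBI
        if not taken:
            return mem


def unrolled_loop(mem, r1, f2):
    """Loop unrolled 4 times; ASSUMES the element count is a multiple of 4."""
    mem = dict(mem)
    while True:
        # All four loads are hoisted above the stores. This is safe only
        # because the four addresses are distinct.
        f0 = mem[r1]            # LD   F0,0(R1)
        f6 = mem[r1 - 8]        # LD   F6,-8(R1)
        f10 = mem[r1 - 16]      # LD   F10,-16(R1)
        f14 = mem[r1 - 24]      # LD   F14,-24(R1)
        f4 = f0 + f2            # ADDD F4,F0,F2
        f8 = f6 + f2            # ADDD F8,F6,F2
        f12 = f10 + f2          # ADDD F12,F10,F2
        f16 = f14 + f2          # ADDD F16,F14,F2
        mem[r1] = f4            # SD   0(R1),F4
        mem[r1 - 8] = f8        # SD   -8(R1),F8
        mem[r1 - 16] = f12      # SD   -16(R1),F12
        r1 -= 32                # SUBI R1,R1,#32
        taken = r1 != 0         # BNEZ R1,LOOP (delayed branch)
        mem[r1 + 8] = f16       # SD   8(R1),F16: 8 - 32 = -24 from the old R1
        if not taken:
            return mem


# Eight doubles at byte addresses 8, 16, ..., 64; R1 starts at the last one.
memory = {8 * i: float(i) for i in range(1, 9)}
scalar = 10.0
expected = {addr: value + scalar for addr, value in memory.items()}

result_scheduled = scheduled_loop(memory, 64, scalar)
result_unrolled = unrolled_loop(memory, 64, scalar)
print(result_scheduled == expected, result_unrolled == expected)
```

If two of the addresses in one unrolled iteration could alias, or the trip count were not a multiple of 4, the reordering in `unrolled_loop` would no longer be guaranteed to match the original loop, which is the kind of condition a compiler has to establish before making these changes.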