Compiler Optimizations (continued), Dynamic Scheduling

Computer Organization and Design: The Hardware/Software Interface


CS152 Computer Architecture and Engineering
Lecture 14
Static Scheduling (Continued)
Dynamic Scheduling: Scoreboards
March 17th, 2003
John Kubiatowicz

3/17/04 ©UCB Spring 2004 CS152 / Kubiatowicz Lec14.2
The Big Picture: Where are We Now?
° The Five Classic Components of a Computer
  [Figure: Processor (Control, Datapath), Memory, Input, Output]
° Today's Topics:
  – Recap last lecture/Review Scoreboard
  – Administrivia
  – Tomasulo scheduling algorithm
  – Tomasulo loop unrolling

Lec14.3
Recall: Can we somehow make CPI closer to 1?
° Let's assume full pipelining.
° Possible delay slots around a 4-cycle multiply instruction:

    multf $F0,$F2,$F4      multf $F0,$F2,$F4      ld    $F0,0($r5)
    delay-1                delay-1                delay-1
    delay-2                delay-2                multf $F4,$F0,$F3
    delay-3                sw    $F0,4($R2)
    addf  $F6,$F10,$F0

  [Figure: pipeline stages Fetch, Decode, Ex1, Ex2, Ex3, Ex4, WB for the sequence multf, delay1, delay2, delay3, addf, marking the earliest forwarding for 4-cycle instructions and the earliest forwarding for 1-cycle instructions]

Lec14.4
Recall: Revised FP Loop Minimizing Stalls

    Instruction          Execute    Instruction          Use latency
    producing result     latency    using result         in cycles
    FP ALU op            4          Another FP ALU op    3
    FP ALU op            4          Store double         2
    Load double          2          FP ALU op            1

    1 Loop: LD   F0,0(R1)
    2       stall
    3       ADDD F4,F0,F2
    4       SUBI R1,R1,8
    5       BNEZ R1,Loop    ;delayed branch
    6       SD   8(R1),F4   ;altered when moved past SUBI

° Swap BNEZ and SD by changing the address of SD.
° 6 clocks (CPI = 6/5 = 1.2): unroll the loop 4 times to make the code faster?
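The 6-clock count for the revised loop can be checked mechanically from the latency table. The following is a minimal sketch, not the course's own tool: it assumes an in-order, single-issue pipeline, treats the "use latency" column as the number of cycles that must separate a producer from its consumer, and assumes 0 for every producer/consumer pair the table does not list (for example SUBI feeding BNEZ). The names `Instr`, `USE_LATENCY`, and `schedule` are hypothetical, and only one loop iteration is modeled.

```python
from dataclasses import dataclass

# Cycles that must separate a producer from its consumer (table on Lec14.4).
# Pairs not listed default to 0, which is an assumption of this sketch.
USE_LATENCY = {
    ("fp_alu", "fp_alu"): 3,
    ("fp_alu", "store"): 2,
    ("load", "fp_alu"): 1,
}


@dataclass
class Instr:
    text: str          # assembly text, for display only
    kind: str          # "load", "fp_alu", "store", "int", or "branch"
    dest: str = ""     # register written, "" if none
    srcs: tuple = ()   # registers read


def schedule(program):
    """Return the issue cycle of each instruction (in-order, single issue)."""
    last_writer = {}   # register -> (issue cycle, kind) of its latest producer
    issue_cycles = []
    cycle = 0
    for ins in program:
        earliest = cycle + 1
        for reg in ins.srcs:
            if reg in last_writer:
                issued, kind = last_writer[reg]
                gap = USE_LATENCY.get((kind, ins.kind), 0)
                earliest = max(earliest, issued + 1 + gap)
        cycle = earliest
        issue_cycles.append(cycle)
        if ins.dest:
            last_writer[ins.dest] = (cycle, ins.kind)
    return issue_cycles


# The revised loop body; the stall is not listed because the model derives it.
LOOP = [
    Instr("LD   F0,0(R1)", "load", "F0", ("R1",)),
    Instr("ADDD F4,F0,F2", "fp_alu", "F4", ("F0", "F2")),
    Instr("SUBI R1,R1,8", "int", "R1", ("R1",)),
    Instr("BNEZ R1,Loop", "branch", "", ("R1",)),
    Instr("SD   8(R1),F4", "store", "", ("R1", "F4")),
]

issue = schedule(LOOP)
total = issue[-1]              # clocks for one iteration
stalls = total - len(LOOP)     # cycles in which nothing issued
cpi = total / len(LOOP)

for ins, c in zip(LOOP, issue):
    print(f"{c:2d}  {ins.text}")
print(f"{total} clocks, {stalls} stall(s), CPI = {cpi}")
```

Under these assumptions the model places ADDD in cycle 3 (one stall after the load) and SD in cycle 6, which matches the slide's numbering: 5 instructions in 6 clocks.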

Lec14.5
Recall: Unrolled Loop That Minimizes Stalls

    1  Loop: LD   F0,0(R1)
    2        LD   F6,-8(R1)
    3        LD   F10,-16(R1)
    4        LD   F14,-24(R1)
    5        ADDD F4,F0,F2
    6        ADDD F8,F6,F2
    7        ADDD F12,F10,F2
    8        ADDD F16,F14,F2
    9        SD   0(R1),F4
    10       SD   -8(R1),F8
    11       SD   -16(R1),F12
    12       SUBI R1,R1,#32
    13       BNEZ R1,Loop
    14       SD   8(R1),F16    ; 8-32 = -24

14 clock cycles, or 3.5 per iteration
CPI = 14/14 = 1
° What assumptions were made when the code was moved?
  – OK to move the store past SUBI even though SUBI changes the register
  – OK to move loads before stores: do we get the right data?
  – When is it safe for the compiler to make such changes?
° When is it safe to move instructions?

Lec14.6
Getting CPI < 1: Issuing Multiple Instructions/Cycle
° Two main variations: Superscalar and VLIW
° Superscalar: varying no. instructions/cycle (1 to 6)
  – Parallelism and dependencies determined/resolved by HW
  – IBM PowerPC 604, Sun UltraSparc, DEC Alpha 21164, HP 7100
° Very Long Instruction Words (VLIW): fixed number of instructions (16)
  – Parallelism determined by compiler
  – Pipeline is exposed; compiler must schedule delays to get the right result
° Explicit Parallel Instruction Computer (EPIC)/Intel
  – 128-bit packets containing 3 instructions (can execute sequentially)
  – Can link 128-bit packets together to allow more parallelism
  – Compiler determines parallelism, HW checks dependencies and forwards/stalls

Lec14.7
° Superscalar DLX: 2 instructions, 1 FP & 1 anything else
  – Fetch 64 bits/clock cycle; Int on left, FP on right
  – Can only issue 2nd instruction if 1st instruction issues
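One way to see why the store offsets change when SD moves past SUBI is to execute both loop shapes on the same data and compare the results. The sketch below is an illustration, not part of the lecture: it models memory as a Python dict keyed by byte address, assumes the loop adds the scalar in F2 to each 8-byte element at addresses 8 through R1, and assumes the element count is a multiple of 4 so the unrolled loop needs no cleanup code. The function names `rolled` and `unrolled` are hypothetical.

```python
def rolled(mem, r1, f2):
    """One element per iteration; SD sits in the branch delay slot, after SUBI."""
    mem = dict(mem)
    while True:
        f0 = mem[r1]            # LD   F0,0(R1)
        f4 = f0 + f2            # ADDD F4,F0,F2
        r1 -= 8                 # SUBI R1,R1,8
        taken = r1 != 0         # BNEZ R1,Loop (delayed branch)
        mem[r1 + 8] = f4        # SD   8(R1),F4: +8 undoes the SUBI
        if not taken:
            return mem


def unrolled(mem, r1, f2):
    """Four elements per iteration, in the instruction order of the unrolled loop."""
    mem = dict(mem)
    while True:
        f0 = mem[r1]            # LD   F0,0(R1)
        f6 = mem[r1 - 8]        # LD   F6,-8(R1)
        f10 = mem[r1 - 16]      # LD   F10,-16(R1)
        f14 = mem[r1 - 24]      # LD   F14,-24(R1)
        f4 = f0 + f2            # ADDD F4,F0,F2
        f8 = f6 + f2            # ADDD F8,F6,F2
        f12 = f10 + f2          # ADDD F12,F10,F2
        f16 = f14 + f2          # ADDD F16,F14,F2
        mem[r1] = f4            # SD   0(R1),F4
        mem[r1 - 8] = f8        # SD   -8(R1),F8
        mem[r1 - 16] = f12      # SD   -16(R1),F12
        r1 -= 32                # SUBI R1,R1,#32
        taken = r1 != 0         # BNEZ R1,Loop (delayed branch)
        mem[r1 + 8] = f16       # SD   8(R1),F16: 8 - 32 = -24 from the old R1
        if not taken:
            return mem


n = 8                                             # assumed: multiple of 4
initial = {8 * i: float(i) for i in range(1, n + 1)}
expected = {addr: value + 1.5 for addr, value in initial.items()}

rolled_result = rolled(initial, 8 * n, 1.5)
unrolled_result = unrolled(initial, 8 * n, 1.5)

clocks_per_element_rolled = 6 / 1                 # 6 clocks, 1 element
clocks_per_element_unrolled = 14 / 4              # 14 clocks, 4 elements
```

Both versions produce the same memory contents here, but only because every load in an iteration reads an address that no earlier store in that iteration writes. That is the condition behind the slide's question about moving loads before stores; if the addresses could alias, the reordering would not be safe.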
  • Spring '04
  • Kubiatowicz
  • Computer Architecture
