Compiler Optimizations (continued), Dynamic Scheduling

# Computer Organization and Design: The Hardware/Software Interface



CS152 Computer Architecture and Engineering
Lecture 14: Static Scheduling (Continued); Dynamic Scheduling: Scoreboards

March 17th, 2003
John Kubiatowicz

3/17/04 ©UCB Spring 2004 CS152 / Kubiatowicz Lec14.2

## The Big Picture: Where are We Now?

• The Five Classic Components of a Computer

[Figure: Processor (Control, Datapath), Memory, Input, Output]

• Today’s Topics:
  – Recap last lecture/Review
  – Scoreboard
  – Administrivia
  – Tomasulo scheduling algorithm
  – Tomasulo loop unrolling

## Recall: Can we somehow make CPI closer to 1?

• Let’s assume full pipelining:

[Figure: pipeline stages Fetch, Decode, Ex1, Ex2, Ex3, Ex4, WB, showing multf followed by delay1, delay2, delay3, and addf, with the earliest forwarding points for 4-cycle and 1-cycle instructions marked]

• Possible delay slots around a 4-cycle multiply instruction:

    multf $F0,$F2,$F4
    delay-1
    delay-2
    delay-3
    addf  $F6,$F10,$F0

    multf $F0,$F2,$F4
    delay-1
    delay-2
    sw    $F0,4($R2)

    ld    $F0,0($r5)
    delay-1
    multf $F4,$F0,$F3

## Recall: Revised FP Loop Minimizing Stalls

    1 Loop: LD   F0,0(R1)
    2       stall
    3       ADDD F4,F0,F2
    4       SUBI R1,R1,8
    5       BNEZ R1,Loop    ;delayed branch
    6       SD   8(R1),F4   ;altered when move past SUBI

• Swap BNEZ and SD by changing address of SD
• 6 clocks: CPI = 6/5 = 1.2
• Unroll the loop 4 times to make the code faster?

| Instruction producing result | Execute latency in cycles | Instruction using result | Use latency in cycles |
|---|---|---|---|
| FP ALU op | 4 | Another FP ALU op | 3 |
| FP ALU op | 4 | Store double | 2 |
| Load double | 2 | FP ALU op | 1 |
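The 6-clock count for the revised loop can be checked with a small model. The sketch below is only an illustration under stated assumptions: a single-issue, in-order pipeline in which a consumer must wait the table's "use latency" after its producer issues, integer operations have zero use latency (as the slide's schedule implies), and structural hazards are ignored. The names `USE_LATENCY`, `issue_cycles`, `cpi`, and `REVISED_LOOP` are hypothetical and do not come from the lecture.

```python
# Stall cycles needed between a producer and a dependent consumer,
# taken from the "use latency" column of the table above.
USE_LATENCY = {
    ("fp_alu", "fp_alu"): 3,
    ("fp_alu", "store"): 2,
    ("load", "fp_alu"): 1,
}


def issue_cycles(program):
    """Return the issue cycle of each instruction.

    program is a list of (kind, dest, sources) tuples. Pairs missing from
    USE_LATENCY are assumed to need no stall (an assumption of this sketch).
    """
    producer = {}  # register name -> (kind of producing instruction, issue cycle)
    cycles = []
    cycle = 0
    for kind, dest, sources in program:
        earliest = cycle + 1  # in-order, at most one instruction per cycle
        for reg in sources:
            if reg in producer:
                p_kind, p_cycle = producer[reg]
                gap = USE_LATENCY.get((p_kind, kind), 0)
                earliest = max(earliest, p_cycle + 1 + gap)
        cycle = earliest
        cycles.append(cycle)
        if dest is not None:
            producer[dest] = (kind, cycle)
    return cycles


def cpi(program):
    """Cycles per instruction for one pass through the program."""
    return issue_cycles(program)[-1] / len(program)


# The revised loop: LD, ADDD, SUBI, BNEZ, SD (SD sits in the branch delay slot).
REVISED_LOOP = [
    ("load", "F0", ["R1"]),
    ("fp_alu", "F4", ["F0", "F2"]),
    ("int_alu", "R1", ["R1"]),
    ("branch", None, ["R1"]),
    ("store", None, ["R1", "F4"]),
]

print(issue_cycles(REVISED_LOOP))  # [1, 3, 4, 5, 6]
print(cpi(REVISED_LOOP))           # 1.2
```

Under these assumptions the only stall is cycle 2, between the load and the dependent ADDD; the store lands in cycle 6 with no extra wait because SUBI and BNEZ already fill the two cycles it needs after the ADDD.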


## Recall: Unrolled Loop That Minimizes Stalls

     1 Loop: LD   F0,0(R1)
     2       LD   F6,-8(R1)
     3       LD   F10,-16(R1)
     4       LD   F14,-24(R1)
     5       ADDD F4,F0,F2
     6       ADDD F8,F6,F2
     7       ADDD F12,F10,F2
     8       ADDD F16,F14,F2
     9       SD   0(R1),F4
    10       SD   -8(R1),F8
    11       SD   -16(R1),F12
    12       SUBI R1,R1,#32
    13       BNEZ R1,Loop
    14       SD   8(R1),F16   ; 8-32 = -24

• 14 clock cycles, or 3.5 per iteration
• CPI = 14/14 = 1
• When is it safe to move instructions?
• What assumptions were made when the code was moved?
  – OK to move the store past SUBI even though SUBI changes the register
  – OK to move loads before stores: do we still get the right data?
  – When is it safe for the compiler to make such changes?

## Getting CPI < 1: Issuing Multiple Instructions/Cycle

• Two main variations: Superscalar and VLIW
• Superscalar: varying no. instructions/cycle (1 to 6)
  – Parallelism and dependencies determined/resolved by HW
  – IBM PowerPC 604, Sun UltraSparc, DEC Alpha 21164, HP 7100
• Very Long Instruction Words (VLIW): fixed number of instructions (16), parallelism determined by compiler
  – Pipeline is exposed; compiler must schedule delays to get right result
• Explicit Parallel Instruction Computer (EPIC)/Intel
  – 128 bit packets containing 3 instructions (can execute sequentially)
  – Can link 128 bit packets together to allow more parallelism
  – Compiler determines parallelism, HW checks dependencies and forwards/stalls

• Superscalar DLX: 2 instructions, 1 FP & 1 anything else
  – Fetch 64-bits/clock cycle; Int on left, FP on right
  – Can only issue 2nd instruction if 1st instruction issues
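Returning to the unrolled loop earlier in these notes, the question of whether the store may move past SUBI comes down to address arithmetic: once R1 has been decremented by 32, the offset 8 names the same location that -24 named before. The sketch below checks this on a toy memory. It is an illustration, not the lecture's code: memory is modelled as a dictionary keyed by byte address, the element count is assumed to be a multiple of 4, and the function names `rolled` and `unrolled` are hypothetical.

```python
def rolled(mem, r1, f2):
    """Original loop: one element per iteration, walking R1 down by 8."""
    mem = dict(mem)
    while True:
        mem[r1] = mem[r1] + f2      # LD, ADDD, SD 0(R1)
        r1 -= 8                     # SUBI R1,R1,8
        if r1 == 0:                 # BNEZ falls through when R1 reaches 0
            break
    return mem


def unrolled(mem, r1, f2):
    """Unrolled by 4 and rescheduled, with the last store after SUBI."""
    mem = dict(mem)
    while True:
        # Four loads hoisted to the top of the loop body.
        f0, f6, f10, f14 = mem[r1], mem[r1 - 8], mem[r1 - 16], mem[r1 - 24]
        f4, f8, f12, f16 = f0 + f2, f6 + f2, f10 + f2, f14 + f2
        mem[r1] = f4
        mem[r1 - 8] = f8
        mem[r1 - 16] = f12
        r1 -= 32                    # SUBI R1,R1,#32
        mem[r1 + 8] = f16           # delay-slot SD 8(R1): 8 - 32 = -24 from old R1
        if r1 == 0:
            break
    return mem


# Eight doubles at byte addresses 8, 16, ..., 64; R1 starts at the last one.
memory = {8 * i: float(i) for i in range(1, 9)}
assert rolled(memory, 64, 1.5) == unrolled(memory, 64, 1.5)
```

The hoisted loads are safe here only because the four elements touched in one iteration do not overlap; a compiler has to establish that property before it reorders loads and stores this way.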
