VLIW_notes

EE557 Michel Dubois USC 2007 Chapter4.12 3/3/07

BASIC LIW/VLIW

MULTIPLE INDEPENDENT RISC INSTRUCTIONS --called ops-- ARE PACKAGED IN A VLI or VLIW --called instructions--
INDEPENDENT FUNCTIONAL UNITS WITH NO HAZARD DETECTION
MAY HAVE SOME FORWARDING
COMPILER IS RESPONSIBLE FOR ISSUING/SCHEDULING INSTRUCTIONS
EXAMPLE:
  1 INT op OR BRANCH
  2 FP ops
  2 MEMORY REFERENCE ops
EACH op TAKES 16 TO 24 BITS
TOTAL INSTRUCTION IS 112-168 BITS FOR 7 ops (80-120 BITS FOR THE 5 ops LISTED ABOVE)
USE LOCAL AND GLOBAL COMPILER SCHEDULING ALGORITHMS

Chapter4.13

VLIW ARCHITECTURE

[FIGURE: PIPELINE DIAGRAM. ONE PC FEEDS FIVE SLOTS: LD/ST, LD/ST, FPop1, FPop2, INT/BR. STAGES SHOWN: D, E, M, WB]

WITH OR WITHOUT FORWARDING
CYCLIC SCHEDULING: LOOP UNROLLING AND SOFTWARE PIPELINING (APPLICABLE TO LOOPS)
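Since the hardware does no hazard detection, the compiler has to place every op in a cycle and a slot itself. The following is a minimal sketch of local (list) scheduling for the 5-slot example machine, not the algorithm of any particular compiler. The op names, the `Op` tuple, and the two issue distances (2 cycles from a load to its consumer, 3 cycles from an FP add to its consumer) are assumptions made for illustration.

```python
from collections import namedtuple

# One RISC-like op: a name, the functional-unit class it needs, and its
# dependences as (producer name, minimum issue distance in cycles).
Op = namedtuple("Op", ["name", "unit", "deps"])

# Slots per instruction word on the example machine.
SLOTS = {"MEM": 2, "FP": 2, "INT": 1}

# Assumed issue distances (illustrative, not stated on the slide).
LD_TO_USE = 2
FP_TO_USE = 3


def list_schedule(ops):
    """Place each op, in program order, in the earliest cycle that respects
    its dependences and still has a free slot of the right class."""
    cycle_of = {}
    used = {}  # (cycle, unit) -> number of slots already taken
    for op in ops:
        cycle = 1
        for producer, distance in op.deps:
            cycle = max(cycle, cycle_of[producer] + distance)
        while used.get((cycle, op.unit), 0) >= SLOTS[op.unit]:
            cycle += 1
        used[(cycle, op.unit)] = used.get((cycle, op.unit), 0) + 1
        cycle_of[op.name] = cycle
    return cycle_of


def loop_body_copies(n):
    """n independent copies of the body LD -> ADDD -> SD."""
    loads = [Op(f"LD{i}", "MEM", []) for i in range(n)]
    adds = [Op(f"ADDD{i}", "FP", [(f"LD{i}", LD_TO_USE)]) for i in range(n)]
    stores = [Op(f"SD{i}", "MEM", [(f"ADDD{i}", FP_TO_USE)]) for i in range(n)]
    return loads + adds + stores


def to_instructions(cycle_of):
    """Group op names by cycle; a cycle with no ops is an all-NOP word."""
    words = [[] for _ in range(max(cycle_of.values()))]
    for name, cycle in cycle_of.items():
        words[cycle - 1].append(name)
    return words


# Schedule two copies of the loop body and print one line per instruction.
for clock, word in enumerate(to_instructions(list_schedule(loop_body_copies(2))), 1):
    print(clock, word or ["NOP"])
```

With two copies, three of the six instruction words are entirely empty, which is why cyclic techniques such as unrolling and software pipelining are used to find more independent ops.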

Chapter4.14

VLIW--LOOP UNROLLING

PROBLEMS:
  CODE SIZE
  EMPTY SLOTS
  REGISTER PRESSURE
  LOCKSTEP
  BINARY COMPATIBILITY (BINARY TRANSLATION/EMULATION)
  COST OF BRANCHES

UNROLL LOOP 7 TIMES--VLIW PROGRAM:

Clock  MemOp1            MemOp2            FPOp1             FPOp2             Int/BROp
1      LD F0,0(R1)       LD F6,-8(R1)
2      LD F10,-16(R1)    LD F14,-24(R1)
3      LD F18,-32(R1)    LD F22,-40(R1)    ADDD F4,F0,F2     ADDD F8,F6,F2
4      LD F26,-48(R1)                      ADDD F12,F10,F2   ADDD F16,F14,F2
5                                          ADDD F20,F18,F2   ADDD F24,F22,F2
6      SD 0(R1),F4       SD -8(R1),F8      ADDD F28,F26,F2
7      SD -16(R1),F12    SD -24(R1),F16                                        SUBI R1,R1,#56
8      SD 24(R1),F20     SD 16(R1),F24                                         BNEZ R1,LOOP
9      SD 8(R1),F28

Chapter4.15

VLIW--SOFTWARE PIPELINING

Loop:  LD   F0,0(R1)      O1
       ADDD F4,F0,F2      O2
       SD   0(R1),F4      O3

1 LOOP ITERATION PER CLOCK

        ITE1  ITE2  ITE3  ITE4  ITE5  ITE6
INST1   O1
INST2   --    O1
INST3   O2    --    O1
INST4   --    O2    --    O1
INST5   --    --    O2    --    O1
INST6   O3    --    --    O2    --    O1     KERNEL
INST7         O3    --    --    O2    --
INST8               O3    --    --    O2

KERNEL CODE: (WE HAVE 6 ITERATIONS BETWEEN LD AND SD)

Clock  MemOp1         MemOp2          FPOp1            FPOp2   Int/BROp
1      LD F0,0(R1)    SD 48(R1),F4    ADDD F4,F0,F2    NOOP    LOOPBR R1,R4,CLK1

LOOPBR compares R1 and R2, loops back if not equal, and increments RRB
Speedup is 9!
USE ROTATING REGISTERS. REGISTER ROTATION IS A FORM OF REGISTER RENAMING
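The EMPTY SLOTS problem of the unrolled program can be quantified by counting occupied slots. The sketch below transcribes which ops appear in each clock of the 9-instruction schedule and assumes the 5-slot instruction word of the example machine; the function and variable names are illustrative only.

```python
SLOTS_PER_WORD = 5   # 2 memory + 2 FP + 1 integer/branch (example machine)
UNROLL_FACTOR = 7    # loop iterations covered by one pass of the schedule

# Occupied slots per clock, transcribed from the unrolled VLIW program.
SCHEDULE = [
    ["LD", "LD"],
    ["LD", "LD"],
    ["LD", "LD", "ADDD", "ADDD"],
    ["LD", "ADDD", "ADDD"],
    ["ADDD", "ADDD"],
    ["SD", "SD", "ADDD"],
    ["SD", "SD", "SUBI"],
    ["SD", "SD", "BNEZ"],
    ["SD"],
]


def slot_statistics(schedule, slots_per_word, iterations):
    """Summarize how well a schedule fills the instruction words."""
    ops = sum(len(word) for word in schedule)
    clocks = len(schedule)
    return {
        "ops": ops,
        "clocks": clocks,
        "utilization": ops / (clocks * slots_per_word),
        "ops_per_clock": ops / clocks,
        "clocks_per_iteration": clocks / iterations,
    }


stats = slot_statistics(SCHEDULE, SLOTS_PER_WORD, UNROLL_FACTOR)
for key, value in stats.items():
    print(f"{key}: {value:.3f}" if isinstance(value, float) else f"{key}: {value}")
```

The schedule holds 23 ops in 9 instructions, so only 23 of the 45 available slots do useful work even after unrolling 7 times.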
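The iteration/instruction table of the SOFTWARE PIPELINING slide can be regenerated mechanically. This sketch assumes the issue offsets that can be read off that table (O1 at +0, O2 at +2, O3 at +5 within an iteration) and starts one iteration per clock; the function names are hypothetical.

```python
# Issue offset of each op within one iteration, read off the slide's table:
# O1 (LD) at +0, O2 (ADDD) at +2, O3 (SD) at +5.
OFFSETS = {"O1": 0, "O2": 2, "O3": 5}


def pipelined_schedule(num_iterations, offsets=OFFSETS):
    """Start one iteration per clock. Return {inst: {iteration: op}},
    with instructions and iterations both numbered from 1."""
    table = {}
    for ite in range(1, num_iterations + 1):
        for op, off in offsets.items():
            table.setdefault(ite + off, {})[ite] = op
    return table


def is_kernel(row, offsets=OFFSETS):
    """A row is a kernel instance when it holds every op of the loop body."""
    return set(row.values()) == set(offsets)


table = pipelined_schedule(6)
for inst in sorted(table):
    # Every cell without an op is printed as "--".
    cells = [table[inst].get(ite, "--") for ite in range(1, 7)]
    tag = "  KERNEL" if is_kernel(table[inst]) else ""
    print(f"INST{inst:<2} " + " ".join(f"{c:>3}" for c in cells) + tag)
```

With 6 iterations only INST6 contains all three ops, each from a different iteration. With more iterations every instruction from INST6 onward, until the loads run out, is a copy of that kernel, which is what lets a single instruction word serve as the loop body.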
Chapter4.16

VLIW--ROTATING REGISTERS

RRi is rotating register i. RRi maps to physical register (RRB+i) mod 16. RRB is incremented by 1 at every iteration.

WITH ROTATING REGISTERS WE REWRITE THE KERNEL CODE:

Clock  MemOp1          MemOp2           FPOp1              FPOp2   Int/BROp
1      LD RR6,0(R1)    SD 48(R1),RR0    ADDD RR3,RR4,F2    NOOP    LOOPBR R1,R4,CLK1

RR4 maps to P6 (currently RR6) 2 cycles (iterations) later, and RR0 maps to P3 (currently RR3) 3 cycles (iterations) later.

We don’t even need a prologue or epilogue, provided that stores to memory and to registers are disabled when the input register is empty (full/empty bit per register) and that the loop is allowed to overrun (exceptions!).

[FIGURE: DIAGRAM OF THE 16 PHYSICAL REGISTERS P0-P15 WITH THE ROTATING NAMES RR0, RR3, RR4 AND RR6 MARKED]
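The mapping RRi -> P[(RRB+i) mod 16] and the prologue-free execution can be checked with a small simulation. This is a sketch under stated assumptions, not a model of any real machine: `None` stands for an empty register (the full/empty bit), a squashed ADDD marks its destination empty, a load past the end of the input writes an empty register, and memory addressing (R1, the 48-byte offset) is left out so that only the register rotation is modeled. The class and function names are hypothetical.

```python
NUM_PHYSICAL = 16


class RotatingRegisterFile:
    """RRi names physical register (RRB + i) mod 16; RRB advances by one
    per loop iteration."""

    def __init__(self):
        self.rrb = 0
        self.phys = [None] * NUM_PHYSICAL  # None models an "empty" register

    def physical(self, i):
        return (self.rrb + i) % NUM_PHYSICAL

    def read(self, i):
        return self.phys[self.physical(i)]

    def write(self, i, value):
        self.phys[self.physical(i)] = value

    def rotate(self):
        self.rrb = (self.rrb + 1) % NUM_PHYSICAL


def run_kernel(x, c):
    """Execute the one-instruction kernel len(x) + 5 times with no prologue
    or epilogue; ops whose source register is empty are squashed."""
    rf = RotatingRegisterFile()
    out = []
    for step in range(len(x) + 5):
        # All ops of one instruction read their sources before any write.
        store_src = rf.read(0)
        add_src = rf.read(4)
        if store_src is not None:
            out.append(store_src)                      # SD ...,RR0
        # ADDD RR3,RR4,F2 (assumption: a squashed add leaves RR3 empty)
        rf.write(3, None if add_src is None else add_src + c)
        # LD RR6,... (assumption: no data once the input is exhausted)
        rf.write(6, x[step] if step < len(x) else None)
        rf.rotate()                                    # LOOPBR increments RRB
    return out


print(run_kernel([1.0, 2.0, 3.0], 10.0))
```

Each value is loaded through RR6, read two iterations later as RR4, written as RR3, and stored three iterations after that as RR0, so the same kernel instruction is correct in every iteration without copying registers.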

This note was uploaded on 03/19/2008 for the course EE 577B taught by Professor Bhatti during the Spring '08 term at USC.