

1. L, S, branch, or integer ALU operation
2. Any FP operation

Example: Unroll & schedule previous loop
Unroll the loop to make 5 copies of the body

S. Ziavras

Example

        Integ. instr.          FP instr.            Clock cycle
Loop:   L.D    F0,0(R1)                             1
        L.D    F6,-8(R1)                            2
        L.D    F10,-16(R1)     ADD.D F4,F0,F2       3
        L.D    F14,-24(R1)     ADD.D F8,F6,F2       4
        L.D    F18,-32(R1)     ADD.D F12,F10,F2     5
        S.D    F4,0(R1)        ADD.D F16,F14,F2     6
        S.D    F8,-8(R1)       ADD.D F20,F18,F2     7
        S.D    F12,-16(R1)                          8
        DADDUI R1,R1,#-40                           9
        S.D    F16,16(R1)                           10
        BNE    R1,R2,Loop                           11
        S.D    F20,8(R1)                            12

• 12 clock cycles per iteration
• 2.4 clock cycles per element

Dynamic Scheduling for Data Hazards

• Techniques to eliminate or reduce the pipeline stall cycles due to data dependences
  – Data bypassing
  – Data forwarding
• For a data dependence that cannot be hidden, the hazard detection hardware stalls the pipeline (starting with the instr. that uses the result)
  – No new instrs. are fetched or issued until t...
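
The cycle counts quoted for the unrolled loop above can be checked with a short script. This is only a sketch: the names `SCHEDULE`, `FP_OPCODES`, `opcode`, `obeys_issue_rules`, and `cycles_per_element` are hypothetical helpers made up for illustration, and the pairing of instructions into integer and FP slots per clock cycle is an assumed reading of the example's three columns.

```python
# One tuple per clock cycle: (integer-slot instruction, FP-slot instruction).
# None marks an empty FP slot. The pairing is an assumed reading of the example.
SCHEDULE = [
    ("L.D F0,0(R1)", None),
    ("L.D F6,-8(R1)", None),
    ("L.D F10,-16(R1)", "ADD.D F4,F0,F2"),
    ("L.D F14,-24(R1)", "ADD.D F8,F6,F2"),
    ("L.D F18,-32(R1)", "ADD.D F12,F10,F2"),
    ("S.D F4,0(R1)", "ADD.D F16,F14,F2"),
    ("S.D F8,-8(R1)", "ADD.D F20,F18,F2"),
    ("S.D F12,-16(R1)", None),
    ("DADDUI R1,R1,#-40", None),
    ("S.D F16,16(R1)", None),
    ("BNE R1,R2,Loop", None),
    ("S.D F20,8(R1)", None),
]

# Assumption: ADD.D is the only FP operation that appears in this loop.
FP_OPCODES = {"ADD.D"}


def opcode(instr):
    """Return the mnemonic, e.g. 'L.D' for 'L.D F0,0(R1)'."""
    return instr.split()[0]


def obeys_issue_rules(schedule):
    """Check that FP operations appear only in the FP slot and nowhere else."""
    for int_instr, fp_instr in schedule:
        if opcode(int_instr) in FP_OPCODES:
            return False
        if fp_instr is not None and opcode(fp_instr) not in FP_OPCODES:
            return False
    return True


def cycles_per_element(schedule):
    """Clock cycles divided by elements processed (one S.D per element)."""
    elements = sum(1 for int_instr, _ in schedule if opcode(int_instr) == "S.D")
    return len(schedule) / elements


if __name__ == "__main__":
    print("cycles per iteration:", len(SCHEDULE))
    print("cycles per element:", cycles_per_element(SCHEDULE))
    print("issue rules respected:", obeys_issue_rules(SCHEDULE))
```

Counting stores is used here as a stand-in for counting elements, since each of the 5 copies of the loop body writes exactly one result back to memory.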

This document was uploaded on 02/09/2014.
