er makes address change
6 clock cycles per iteration
(Complete 1 iteration & store back 1 array element)

S. Ziavras

Loop Processing/Unrolling (4)

• Observation: 3/6 cycles process array elements
• Objective: include more array operations in 6 cycles

Loop unrolling
    • Replicates loop body many times
    • Uses different regs. for different iterations
    • Adjusts loop termination code

Example: Apply to previous example loop. Assume:
• 4 copies of loop body
• R1 initially contains a multiple of 32
• Eliminate obviously redundant operations
• Do not reuse any of the regs.

Loop Processing/Unrolling (5)

Annotations: Merge DADDUI instrs. Drop unneeded BNE operations.

Loop: L.D     F0,0(R1)
      ADD.D   F4,F0,F2
      S.D     F4,0(R1)       ;drop DADDUI/BNE
      L.D     F6,-8(R1)
      ADD.D   F8,F6,F2
      S.D     F8,-8(R1)      ;drop DADDUI/BNE
      L.D     F10,-16(R1)
      ADD.D   F12,F10,F2
      S.D     F12,-16(R1)    ;drop DADDUI/BNE
      L.D     F14,-24(R1)
      ADD.D   F16,F14,F2
      S.D     ...
      DADDUI  ...
      BNE     ...

Not easy: Requires symboli...
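
The unrolling steps above can be sketched at the C level. This is only an illustrative analogue of the assembly on the slide, not the slide's own code: the function names add_scalar_rolled and add_scalar_unrolled4 are hypothetical, and the sketch assumes the element count n is a multiple of 4, which mirrors the slide's assumption that R1 starts at a multiple of 32 (4 doubles of 8 bytes each). Each of the 4 copies of the loop body uses its own temporary, and the index update and loop test appear once per 4 elements instead of once per element.

```c
#include <stddef.h>

/* Rolled form: one array element per iteration, so every element pays
   for its own index update and loop test. */
void add_scalar_rolled(double *x, size_t n, double s) {
    for (size_t i = n; i > 0; i--) {
        x[i - 1] = x[i - 1] + s;
    }
}

/* Unrolled by 4 (hypothetical helper for illustration).
   ASSUMPTION: n is a multiple of 4; otherwise the index would wrap
   around and the loop would access memory outside the array. */
void add_scalar_unrolled4(double *x, size_t n, double s) {
    for (size_t i = n; i > 0; i -= 4) {   /* single merged index update */
        /* Four copies of the body, each with a distinct temporary,
           analogous to using different registers per iteration. */
        double t0 = x[i - 1] + s;
        x[i - 1] = t0;
        double t1 = x[i - 2] + s;
        x[i - 2] = t1;
        double t2 = x[i - 3] + s;
        x[i - 3] = t2;
        double t3 = x[i - 4] + s;
        x[i - 4] = t3;
    }                                     /* single loop test per 4 elements */
}
```

Both functions leave the array in the same state. The unrolled one spends the loop overhead (index update and branch) once per four elements, which is the point of the slide's objective. Handling an n that is not a multiple of 4 would need an extra cleanup loop, which this sketch leaves out.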
