fa09_cs433_hw3_sol - CS433: Computer Systems Organization...

CS433: Computer Systems Organization
Fall 2009
Homework 3
Assigned: Oct/1
Due in class Oct/13
Total points: 54 for undergraduate students, 62 for graduate students.

Instructions: Please write your name, NetID and an alias on your homework submissions for posting grades (if you don’t want your grades posted, then don’t write an alias). We will use this alias throughout the semester. Homeworks are due in class on the date posted.

Problem 1: Loop Unrolling [18 points]

In this problem, we will use the pipeline shown in Figure A.31 on page A.50 of your book. Its characteristics are:

- If unspecified, its properties are like those in the MIPS pipeline.
- There is 1 integer functional unit, taking 1 cycle to perform integer addition (including effective address calculation for loads/stores), subtraction, logic operations and branch operations.
- There is 1 FP/integer multiplier, taking 8 cycles to perform multiplication. It is pipelined.
- There is 1 FP adder, taking 3 cycles to perform FP additions and subtractions. It is pipelined.
- There is 1 FP/integer divider, taking 24 cycles. It is NOT pipelined.
- There is full forwarding and bypassing, including forwarding from the end of an FU to the MEM stage for stores.
- Loads and stores complete in one cycle. That is, they spend one cycle in the MEM stage after the effective address calculation.
- There are as many registers, both FP and integer, as you need.
- There is one branch delay slot.
- While the hardware has full forwarding and bypassing, it is the responsibility of the compiler to schedule such that the operands of each instruction are available when needed by each instruction.

Loop: L.D     F4, 0(R1)
      MUL.D   F8, F4, F0
      L.D     F6, 0(R2)
      ADD.D   F10, F6, F2
      ADD.D   F12, F8, F10
      S.D     F12, 0(R3)
      DADDUI  R1, R1, 8
      DADDUI  R2, R2, 8
      DADDUI  R3, R3, 8
      DSUB    R5, R4, R1
      BNEZ    R5, Loop
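As a reading aid, the loop body can be written in a high-level language. The sketch below is an interpretation, not part of the assignment: it assumes R1, R2 and R3 point to arrays of doubles (called a, b and c here), that F0 and F2 hold loop-invariant scalars, and that R4 marks the end of the first array. The function name loop_reference is hypothetical.

```python
def loop_reference(a, b, f0, f2):
    """Return c with c[i] = a[i] * f0 + (b[i] + f2).

    Assumed mapping: a <- 0(R1), b <- 0(R2), c <- 0(R3), f0 <- F0, f2 <- F2.
    Note: the assembly loop always runs at least once (test at the bottom);
    this sketch simply returns [] for empty inputs.
    """
    c = []
    for x, y in zip(a, b):
        f8 = x * f0          # MUL.D F8, F4, F0
        f10 = y + f2         # ADD.D F10, F6, F2
        c.append(f8 + f10)   # ADD.D F12, F8, F10 followed by S.D F12, 0(R3)
    return c


print(loop_reference([1.0, 2.0], [3.0, 4.0], 2.0, 1.0))  # prints [6.0, 9.0]
```

Each iteration touches one element of each array and has no dependence on the previous iteration's FP results, which is what makes the loop a candidate for rescheduling and unrolling.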
Part A. [6 points]

Consider the role of the compiler in scheduling the code. Rewrite this loop, but let every row take a cycle. If an instruction can’t be issued on a given cycle (because the current instruction has a dependency that will not be resolved in time), write STALL instead, and move on to the next cycle to see if it can be issued then. Assume that a NOP is scheduled in the branch delay slot (effectively stalling 1 cycle after the branch). Explain all stalls, but don’t reorder instructions. How many cycles elapse before the second iteration begins? Show your work.

Loop: L.D     F4, 0(R1)
      stall                     RAW F4
      MUL.D   F8, F4, F0
      L.D     F6, 0(R2)
      stall                     RAW F6
      ADD.D   F10, F6, F2
      stall                     RAW F8, F10
      stall                     RAW F8, F10
      stall                     RAW F8
      stall                     RAW F8
      ADD.D   F12, F8, F10
      stall                     RAW F12
      S.D     F12, 0(R3)
      DADDUI  R1, R1, #8
      DADDUI  R2, R2, #8
      DADDUI  R3, R3, #8
      DSUB    R5, R4, R1
      BNEZ    R5, Loop
      NOP

19 cycles.
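The Part A count can be cross-checked with a small in-order issue model. This is a minimal sketch, not the course's method. It assumes one instruction issues per cycle, and that a consumer may issue once its producer's latency has elapsed: 2 cycles for a load (EX plus MEM), 8 for MUL.D, 3 for ADD.D, 1 for integer operations. It also assumes a store needs its data one cycle later than other consumers, because the value is forwarded to the MEM stage. Structural hazards and loop-carried dependences are not modeled, and the names LATENCY, LOOP and schedule are hypothetical.

```python
# Cycles from a producer's issue until a dependent instruction may issue
# (assumed reading of the pipeline description, with full forwarding).
LATENCY = {"load": 2, "fmul": 8, "fadd": 3, "int": 1}

# (text, kind, destination, sources); for a store, sources[0] is the data register.
LOOP = [
    ("L.D F4, 0(R1)", "load", "F4", ["R1"]),
    ("MUL.D F8, F4, F0", "fmul", "F8", ["F4", "F0"]),
    ("L.D F6, 0(R2)", "load", "F6", ["R2"]),
    ("ADD.D F10, F6, F2", "fadd", "F10", ["F6", "F2"]),
    ("ADD.D F12, F8, F10", "fadd", "F12", ["F8", "F10"]),
    ("S.D F12, 0(R3)", "store", None, ["F12", "R3"]),
    ("DADDUI R1, R1, #8", "int", "R1", ["R1"]),
    ("DADDUI R2, R2, #8", "int", "R2", ["R2"]),
    ("DADDUI R3, R3, #8", "int", "R3", ["R3"]),
    ("DSUB R5, R4, R1", "int", "R5", ["R4", "R1"]),
    ("BNEZ R5, Loop", "branch", None, ["R5"]),
    ("NOP", "nop", None, []),  # branch delay slot
]


def schedule(program):
    """Return one row per cycle: an instruction's text or the word 'stall'."""
    ready = {}  # register -> first cycle in which a consumer may issue
    rows = []
    cycle = 0   # cycle in which the previous instruction issued
    for text, kind, dest, srcs in program:
        earliest = cycle + 1
        for i, reg in enumerate(srcs):
            need = ready.get(reg, 0)
            if kind == "store" and i == 0:
                need -= 1  # store data is forwarded to MEM, one stage after EX
            earliest = max(earliest, need)
        rows.extend(["stall"] * (earliest - cycle - 1))
        rows.append(text)
        cycle = earliest
        if dest is not None:
            ready[dest] = cycle + LATENCY[kind]
    return rows


rows = schedule(LOOP)
for number, row in enumerate(rows, start=1):
    print(number, row)
print(len(rows), "cycles")  # prints "19 cycles"
```

Under these assumptions the model places the stalls in the same rows as the hand schedule and reports 19 cycles. Reordering the entries in LOOP gives a quick way to sanity-check a candidate schedule, though it does not replace the hand analysis the problem asks for.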
Part B. [6 points] Now reschedule the loop. You can change immediate values and memory offsets. You can reorder instructions, but don’t change anything else. Show any stalls that remain. How many cycles elapse before the second iteration begins? Show your work.