CSEE w4824
Homework 4
Handout 14
Prof. Luca Carloni
November 2, 2011
This homework is due at the beginning of class on Wednesday, November 16 (two-day extension).
A correct answer without adequate explanation or derivation will have points deducted. To get full credit,
• write legibly or type, and
• show all work (label relevant items, show derivations, include explanations).
Problem 1. Tomasulo’s Algorithm. (15 points)
In this problem, you are to simulate the following MIPS code fragment using a processor architecture based on Tomasulo’s Algorithm:
        ADD    R2, R0, R0
        SUB.D  F8, F8, F9
        ADD.D  F7, F7, F9
        MUL.D  F3, F8, F7
        L.D    F4, 0(R2)
        SUB.D  F4, F4, F7
        S.D    F4, 0(R4)
        L.D    F6, 0(R8)
        MUL.D  F1, F9, F3
        BNE    R2, R0, target
        ADDI   R2, R2, #8
        MUL.D  F1, F7, F5
target: MUL.D  F4, F1, F6
        L.D    F6, 0(R4)
        MUL.D  F2, F4, F6
        S.D    F4, 0(R2)
        ADD.D  F4, F1, F6
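When filling in a Tomasulo table by hand, it can help to first list which earlier instruction produces each source operand. The following Python sketch is one possible way to do that bookkeeping, not part of the assignment. The tuple encoding, the name `PROGRAM`, and the helper `raw_producers` are hypothetical. The sketch assumes that the instructions execute in the listed order (that is, the branch falls through) and it ignores dependences through memory.

```python
# Hypothetical encoding of the fragment: (opcode, destination, sources).
# Stores and branches write no register, so their destination is None.
# Base-address registers of loads and stores are listed as sources.
PROGRAM = [
    ("ADD",   "R2", ["R0", "R0"]),
    ("SUB.D", "F8", ["F8", "F9"]),
    ("ADD.D", "F7", ["F7", "F9"]),
    ("MUL.D", "F3", ["F8", "F7"]),
    ("L.D",   "F4", ["R2"]),
    ("SUB.D", "F4", ["F4", "F7"]),
    ("S.D",   None, ["F4", "R4"]),
    ("L.D",   "F6", ["R8"]),
    ("MUL.D", "F1", ["F9", "F3"]),
    ("BNE",   None, ["R2", "R0"]),
    ("ADDI",  "R2", ["R2"]),
    ("MUL.D", "F1", ["F7", "F5"]),
    ("MUL.D", "F4", ["F1", "F6"]),  # target:
    ("L.D",   "F6", ["R4"]),
    ("MUL.D", "F2", ["F4", "F6"]),
    ("S.D",   None, ["F4", "R2"]),
    ("ADD.D", "F4", ["F1", "F6"]),
]


def raw_producers(program):
    """For each instruction, map each source register to the index of the
    most recent earlier instruction that writes it (omitted if none).

    Assumption: straight-line program order; memory dependences are ignored.
    """
    last_writer = {}
    result = []
    for index, (_, dest, sources) in enumerate(program):
        result.append({s: last_writer[s] for s in sources if s in last_writer})
        if dest is not None:
            last_writer[dest] = index
    return result


if __name__ == "__main__":
    for index, producers in enumerate(raw_producers(PROGRAM)):
        print(index, PROGRAM[index][0], producers)
```

Indices are zero-based positions in the listing. A source register with no entry is one whose value is already available in the register file when the fragment starts.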
Make the following assumptions:
• Assume an architecture implementing Tomasulo’s algorithm as in the figure above, which is the same as H&P Fig. 2.9. Although the figure does not show it, assume that one integer functional unit is present to execute ALU instructions.
• Assume that there is also a dedicated branch-execution unit and that the branch is resolved in the ‘Execution’ stage; therefore, it does not have to go through the ‘Write’ stage.
• Assume that there is no branch prediction, i.e., each subsequent instruction waits for the branch to be resolved before being issued.
• Assume that the bandwidth of the common data bus allows the result of only one instruction to be broadcast per cycle from a functional unit to the other functional units.
• Assume the following execution times:

    instruction          cycles
    integer ALU          1
    branch               1
    load/store           1
    FP addition          3
    FP multiplication    5
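The execution times and the one-broadcast-per-cycle limit on the common data bus can be captured in a few lines of Python, as a rough aid for checking a hand-built table. This is a sketch under stated assumptions, not the required solution method. The names `LATENCY`, `last_execute_cycle`, and `schedule_writes` are hypothetical. The sketch counts the start cycle as the first cycle of execution, and it assumes that when several results compete for the bus the oldest instruction in program order wins; the problem statement does not specify a tie-breaking rule, so state your own choice in your answer.

```python
# Execution latencies in cycles, copied from the table above.
LATENCY = {
    "int_alu": 1,
    "branch": 1,
    "load_store": 1,
    "fp_add": 3,
    "fp_mul": 5,
}


def last_execute_cycle(start_cycle, unit_class):
    """Last cycle of execution when execution begins in start_cycle.

    Assumption: start_cycle itself counts as the first execution cycle.
    """
    return start_cycle + LATENCY[unit_class] - 1


def schedule_writes(ready_cycles):
    """Assign each result a write cycle with at most one broadcast per cycle.

    ready_cycles lists, in program order, the earliest cycle in which each
    result could be broadcast. Assumption (not from the problem statement):
    when several results are ready, the oldest instruction goes first.
    """
    pending = set(range(len(ready_cycles)))
    writes = [None] * len(ready_cycles)
    cycle = min(ready_cycles, default=0)
    while pending:
        contenders = [i for i in pending if ready_cycles[i] <= cycle]
        if contenders:
            winner = min(contenders)  # oldest in program order
            writes[winner] = cycle
            pending.remove(winner)
        cycle += 1
    return writes


if __name__ == "__main__":
    print(last_execute_cycle(3, "fp_mul"))
    print(schedule_writes([5, 5, 6]))
```

Instructions that do not use the common data bus (for example the branch, which skips the ‘Write’ stage here) should simply be left out of `ready_cycles`.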