CS151B/EE116C Solutions to Homework #1
Problem (1) C.36
[Figure: Register0 and Register1, each with two D flip-flops (Bit0, Bit1) clocked by CLK; a 2-to-1 MUX (inputs 0 and 1, select Reg#) drives each output bit (Bit0, Bit1).]
Problem (2)
The base address of y, in hexadecimal, is 0x003A9814.
Instruction sequence (operands not fully recoverable): lui, ori, lw, sub, add, sw; registers shown: $9, $9, $10
CS151B/EE116C Solutions to Homework #2
Problem (1) C.24
If there is no overflow, the circuitry shown in Figure C.5.10 is sufficient: the Set output from bit 31
(the sign bit) can be used as the Less input for bit 0. However, if there is an overflow, the in
UNIVERSITY OF CALIFORNIA, LOS ANGELES UCLA
BERKELEY · DAVIS · IRVINE · LOS ANGELES · RIVERSIDE · SAN DIEGO · SAN FRANCISCO · SANTA BARBARA · SANTA CRUZ
CS M151B / EE M116C
Midterm Exam
All work and answers should be written directly on these p
CS151B/EE116C Solutions to Homework #3
Problem (1)
jr rs

Field value:    0      rs     0      0x8
Width (bits):   6      5      15     6

0x8 = 001000
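As a rough illustration of the field layout above, the sketch below packs the four fields (6, 5, 15, and 6 bits) into a 32-bit word. The helper name `encode_jr` is hypothetical and not part of the original solution.

```python
def encode_jr(rs: int) -> int:
    """Pack the jr fields into one 32-bit word.

    Layout, most significant bits first:
    opcode = 0 (6 bits), rs (5 bits), zero (15 bits), funct = 0x8 (6 bits).
    """
    if not 0 <= rs < 32:
        raise ValueError("rs must be a 5-bit register number")
    opcode = 0x0
    funct = 0x8
    # The 15 zero bits occupy positions 20..6, so nothing is OR-ed in there.
    return (opcode << 26) | (rs << 21) | funct


word = encode_jr(31)  # register 31 is $ra
print(f"{word:#010x}")
```

Only the low 6 bits (funct) and the rs field are non-zero in the result, which matches the table.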
Instruction jr has the same opcode as R-type instructions. However, jr can be distinguished from the
other implemented R-type instructions based on the most signific
Homework 5 Solution
5.3 For a direct-mapped cache design with a 32-bit address, the following bits of the address are
used to access the cache
5.3.1 What is the cache block size (in words)?
Cache line size = 2^(offset bits) = 2^5 bytes = 32 bytes = 8 words
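The same arithmetic can be checked with a short sketch. It assumes a 5-bit offset field (as used in the answer above) and 4-byte words; the function name `block_size` is illustrative only.

```python
def block_size(offset_bits: int, bytes_per_word: int = 4) -> tuple[int, int]:
    """Return (size in bytes, size in words) for a cache block.

    The block size in bytes is 2 raised to the number of offset bits.
    """
    size_bytes = 1 << offset_bits
    return size_bytes, size_bytes // bytes_per_word


size_bytes, size_words = block_size(5)  # 5 offset bits, per the answer above
print(size_bytes, size_words)
```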
5.
Solution for Homework1
Question 1.5
Consider three different processors P1, P2, and P3 executing the same instruction set. P1 has a 3 GHz
clock rate and a CPI of 1.5. P2 has a 2.5 GHz clock rate and a CPI of
UNIVERSITY OF CALIFORNIA, LOS ANGELES UCLA
BERKELEY · DAVIS · IRVINE · LOS ANGELES · RIVERSIDE · SAN DIEGO · SAN FRANCISCO · SANTA BARBARA · SANTA CRUZ
CS M151B / EE M116C
Midterm Exam
All work and answers should be written directly on these pages, use the backs of pages
Lingfeng Yang 604251317
HW#2
2.24. For a jump operation, we can jump directly to an address not larger than 2^28.
0x4000 0000 is greater than 2^28. Thus, a single jump operation cannot jump to
0x4000 0000.
For a beq operation, we can have at most
1. I Amdahl-ighted with Tradeoffs (10 points): Given the following problems, suggest one solution and give one
drawback of the solution. Be brief, but specific.
EXAMPLE
Problem: long memory latencies
Solution: Caches
Drawback: when the cache misses, the la
Problem (4) B.30, Answer C
Assumption:
1. The delay of a gate is proportional to its number of fan-ins (inputs): kT (k = fan-in)
2. 1-bit full adder: CarryOut = 2T + 3T = 5T, Sum = 3T (a xor b xor C)
3. Carry-Lookahead Unit implementation: a sum of products (same as
Assume for the rest of this problem that all logic gates have the following delays:
Fan In       Delay
1            T
2            2T
3            3T
4            5T
5            7T
6            10T
7 or more    2T x fan-in
So a 2-input AND gate would have delay 2T and a 4-input OR gate would have delay 5T.
For simplicity, assum
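The delay table above can be written as a small lookup, which may help when adding up delays along a path. This is only a sketch: delays are expressed in units of T, and the names `DELAY_TABLE` and `gate_delay` are illustrative.

```python
# Gate delay in units of T, indexed by fan-in (from the table above).
DELAY_TABLE = {1: 1, 2: 2, 3: 3, 4: 5, 5: 7, 6: 10}


def gate_delay(fan_in: int) -> int:
    """Return the delay of one gate, in units of T."""
    if fan_in < 1:
        raise ValueError("fan-in must be at least 1")
    # Fan-in of 7 or more uses the 2T x fan-in rule.
    return DELAY_TABLE.get(fan_in, 2 * fan_in)


print(gate_delay(2), gate_delay(4), gate_delay(7))
```

The first two values match the worked cases in the text: a 2-input AND gate (2T) and a 4-input OR gate (5T).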
datapath with a new instruction: the ﬁnes instruction. This instruction will be an I-type instruction, and will
have the following effect:
if (M[R[rs]] == SE(I))
R[rt] = R[$t0]
Note that the fmeu always uses register $t0 as the source value that is put into R[
Homework 1
Name:
UID:
1.5
a. IPS (instructions per second)
IPS(P1) = IC/ET = 1/(CPI*CT) = 3 x 10^9 / 1.5 = 2 x 10^9
IPS(P2) = 2.5 x 10^9 / 1 = 2.5 x 10^9
IPS(P3) = 4 x 10^9 / 2.2 = 1.8 x 10^9
P2 has the highest performance expressed in IPS.
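The calculation in part (a) can be reproduced with the sketch below. It uses IPS = clock rate / CPI, with the clock rates and CPI values taken from the lines above; the dictionary and function names are illustrative.

```python
# (clock rate in Hz, CPI) for each processor, as used in part (a).
PROCESSORS = {
    "P1": (3.0e9, 1.5),
    "P2": (2.5e9, 1.0),
    "P3": (4.0e9, 2.2),
}


def ips(clock_hz: float, cpi: float) -> float:
    """Instructions per second = 1 / (CPI * cycle time) = clock rate / CPI."""
    return clock_hz / cpi


results = {name: ips(clock, cpi) for name, (clock, cpi) in PROCESSORS.items()}
fastest = max(results, key=results.get)

for name, value in results.items():
    print(f"{name}: {value:.2e} instructions/s")
print("Highest IPS:", fastest)
```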
b. cycles(P1) = time x clock
CS M151B Homework 2
Name:
UID:
2.24. 0x4000 0000 - 0x2000 0000 > 2^28
So it is not possible to use a jump instruction to set the PC to 0x4000 0000.
It is also not possible with a beq instruction.
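A minimal sketch of the reachability argument follows. It assumes the current PC is 0x2000 0000 (the value subtracted above) and standard MIPS addressing: j keeps the upper 4 bits of PC + 4 and replaces the low 28 bits, while beq adds a signed 16-bit word offset to PC + 4. The function names are hypothetical.

```python
def jump_reachable(pc: int, target: int) -> bool:
    """True if a single j at `pc` can set the PC to `target`."""
    same_region = ((pc + 4) >> 28) == (target >> 28)  # upper 4 bits must match
    return same_region and target % 4 == 0


def beq_reachable(pc: int, target: int) -> bool:
    """True if a single beq at `pc` can branch to `target`."""
    offset_words, remainder = divmod(target - (pc + 4), 4)
    return remainder == 0 and -(1 << 15) <= offset_words <= (1 << 15) - 1


PC = 0x2000_0000      # assumed current PC
TARGET = 0x4000_0000  # desired PC

print(jump_reachable(PC, TARGET), beq_reachable(PC, TARGET))
```

Both checks fail for this pair of addresses, which agrees with the answer above.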
2.26.1 The value in $s2 = 20
2.26.3 For each time in the loop, t
CS151B/EE116C Solutions to Homework #1
Problem (1)
[Figure: Register0 and Register1, each with three D flip-flops (Bit0, Bit1, Bit2) clocked by CLK; a MUX selected by Reg# drives each output bit (Bit0, Bit1, Bit2).]
Problem (2)
The base address of b, 3780220, in hexadecimal is 0x0039AE7C.
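The conversion can be checked with a short sketch. It also splits the address into two 16-bit halves, which is how a 32-bit constant is typically loaded on MIPS (for example with lui followed by ori); that this solution loads it that way is an assumption, and `split_address` is an illustrative name.

```python
def split_address(addr: int) -> tuple[int, int]:
    """Split a 32-bit address into (upper 16 bits, lower 16 bits)."""
    upper = (addr >> 16) & 0xFFFF
    lower = addr & 0xFFFF
    return upper, lower


base = 3780220  # decimal base address of b, from the text
upper, lower = split_address(base)

print(f"{base:#010x}")                 # full 32-bit address in hex
print(f"{upper:#06x} {lower:#06x}")    # upper and lower halves
```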
Carry Select Adder (CSA)
With ripple carry, you're just taking the minimal set of adders and allowing the carry to serially propagate through. With the CLA, we tried to compute the carry ahead of time. We traded off area for speed.
With CSA the theory is
Today's Agenda:
Trade-offs in the design of the ISA.
This is a very high level view of how a processor is connected to memory:
RAM <=> muArchitecture <-non-physical-link-> ISA <-> Software
Similar to how the data of a program is stored in memory, the
CS M151B/EE M117:
Glenn Reinman
Flipped classroom:
- Videos online (the main lecture)
- The lectures are like office hours that talk about material that is helpful for the midterms.
The focus of the class: tradeoffs.
Grading:
10%: Homework
3. Assume for the rest of this problem that all logic gates have the following delays:
Fan In       Delay
1            T
2            2T
3            4T
4            7T
5            9T
6 or more    2T x fan-in
So a 2-input AND gate would have delay 2T and a 4-input OR gate would have delay 7T. For simplicity,
assume t
(13 Mull): Consider the following three properties:
(i.e., the number of logic gates required to imple