UNIVERSITY OF CALIFORNIA, BERKELEY
College of Engineering
Department of Electrical Engineering and Computer Sciences
Elad Alon
Homework #5
EECS141

PROBLEM 1: Logical Effort

For this problem, you should assume that $C_G = 2\,\mathrm{fF}/\mu\mathrm{m}$ and that the transistors are long channel for the purpose of calculating LE.

[Figure: logic chain from In to Out with load $C_{OUT}$]

a) What is the total path effort from In to Out?

$$PE = \Big(\prod LE\Big)\Big(\prod B\Big)F$$
$$\prod LE = (1)\Big(\tfrac{4}{3}\Big)\Big(\tfrac{7}{3}\Big)\Big(\tfrac{4}{3}\Big)(1)\Big(\tfrac{5}{3}\Big) = 6.91$$
$$\prod B = (1)(1)(3)(1)(4)(1) = 12$$
$$F = \frac{C_{OUT}}{C_{IN}} = \frac{200\,\mathrm{fF}}{2\,\mathrm{fF}/\mu\mathrm{m}\,(2\,\mu\mathrm{m} + 1\,\mu\mathrm{m})} = 33.33$$
$$PE = \Big(\prod LE\Big)\Big(\prod B\Big)F = (6.91)(12)(33.33) = 2764$$

b) To minimize the delay, what should the EF/stage for this chain of gates be?

$$EF = \sqrt[6]{PE} = 3.75$$

c) Size the gates in this chain to minimize the delay from In to Out. Only calculate
the input capacitance of the gates; don’t bother to provide the actual transistor
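sizes.

Since
$$EF = f \cdot B \cdot LE = 3.75$$
and
$$f_x = \frac{C_{OUT,x}}{C_{IN,x}},$$
we can calculate the input capacitance of each stage as follows:
$$C_{IN,f} = C_{OUT,f}\,\frac{(B)(LE)}{EF} = 200\,\mathrm{fF}\,\frac{(1)(5/3)}{3.75} \approx 88.89\,\mathrm{fF}$$
$$C_{IN,e} = 88.89\,\mathrm{fF}\,\frac{(1)(1)}{3.75} \approx 23.7\,\mathrm{fF}$$
$$C_{IN,d} = 23.7\,\mathrm{fF}\,\frac{(4)(4/3)}{3.75} \approx 33.7\,\mathrm{fF}$$
$$C_{IN,c} = 33.7\,\mathrm{fF}\,\frac{(1)(7/3)}{3.75} \approx 21\,\mathrm{fF}$$
$$C_{IN,b} = 21\,\mathrm{fF}\,\frac{(3)(4/3)}{3.75} \approx 22.4\,\mathrm{fF}$$
We can check our calculations by confirming that $C_{IN,a}$ does indeed equal 6 fF:
$$C_{IN,a} = 22.4\,\mathrm{fF}\,\frac{(1)(1)}{3.75} \approx 6.0\,\mathrm{fF}$$

d) Using this sizing, what is the delay (in units of $t_{inv}$) of your chain from In rising to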
Out rising? You can assume that the critical input of the complex gates is always
at the “top” of the transistor stacks (i.e., the critical input is always closest to the
output node), and that $C_D/C_G = \gamma = 0.5$.

Solution:

The delay of each stage is given by $D = t_{inv}(p + EF)$. We set the EF of each stage to 3.75, so we just need to find the parasitic delay $p$ for each of the gates. Therefore:
$$t_p = t_{inv}\Big(\sum_{i=1}^{N} p_i + N \cdot EF\Big) = t_{inv}\big(\gamma(1 + 2 + 3 + 2 + 1 + 2) + 6(3.75)\big) = t_{inv}(11\gamma + 22.5) = 28\,t_{inv}$$

e) You present your design to your boss and she tells you that the delay of your
circuit is below the specification for the block. She also tells you that your team is
over budget on die area. Revise your design such that you save the maximum amount of area, while increasing the delay by no more than 10%. [Note: There are
many possible solutions; any solution that takes a reasonable approach to the
problem will receive full credit.]

We can approximate the area of each gate as being directly proportional to the input capacitance times the number of inputs:
$$A_{gate} \propto C_{gate}\,N_{inputs}$$
To calculate the total die area, we also need to account for the branching factors, since each stage beyond a branch is replicated B times. The total die area is proportional to the area of each gate times the cumulative product of the branching factors of the stages before it:
$$A_{tot} \propto \sum_{i=1}^{N} A_i \prod_{j=1}^{i-1} B_j$$
We can therefore calculate what percentage of the die area each stage uses:

[Table: percentage of die area used by each stage]

We can see that the majority of the area is used in the last stage, so we’ll want to
focus on this stage for our optimization. Since we are allowed a 10% increase in
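delay, one simple method is to reduce the size of the last stage. To maximize the benefit we get from this, we should keep in mind that as we downsize the last stage, all of the gates before it will get faster as well (since they see less fanout):
$$\begin{aligned}
1.1\,t_p = 30.8\,t_{inv} &= t_{inv}\sum_{i=1}^{N}\big(p_i + LE_i B_i f_i\big) \\
&= t_{inv}\left[11\gamma + 5\,EF + \Big(\tfrac{5}{3}\Big)(1)\frac{C_{OUT}}{C_{IN,f}}\right] \\
&= t_{inv}\left[11\gamma + 5\left(\frac{C_{IN,f}}{C_{IN,a}}\prod LE \prod B\right)^{1/5} + \Big(\tfrac{5}{3}\Big)\frac{C_{OUT}}{C_{IN,f}}\right] \\
&= t_{inv}\left[11\gamma + 5\left(49.78\,\frac{C_{IN,f}}{C_{IN,a}}\right)^{1/5} + \Big(\tfrac{5}{3}\Big)\frac{C_{OUT}}{C_{IN,f}}\right]
\end{aligned}$$
Here EF is the effort per stage of the five stages before the last one, and the products run over those five stages. This equation can be solved numerically using a calculator or a computer, and the result is that $C_{IN,f} = 33.79\,\mathrm{fF}$ and the EF/stage of the preceding chain is 3.087. Using this approach, the resized chain results in an area savings of 57.1% (with only 10% larger delay).

Another approach is to actually modify the design of the chain. Again, we want to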
start at the last stage since it accounts for the largest amount of area. If we notice that
the ﬁnal NOR is preceded by an inverter in the critical path, we can add an inverter
on the other input of the NOR (since it is not in the critical path) and the NOR gate +
the inverters become logically equivalent to an AND gate. Then we can implement
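the AND with a NAND gate followed by an inverter:

[Figure: modified logic chain with the final inverter + NOR2 replaced by a NAND2 + inverter, driving $C_{OUT}$]

This changes the PE of the chain, so we have to redo the sizing. (Note that you can skip this step, since in the next step we will resize the chain to reduce its area even further, but doing this step shows the benefit just from modifying the gates.)
$$PE = \Big(\prod LE\Big)\Big(\prod B\Big)F$$
$$\prod LE = (1)\Big(\tfrac{4}{3}\Big)\Big(\tfrac{7}{3}\Big)\Big(\tfrac{4}{3}\Big)\Big(\tfrac{4}{3}\Big)(1) = 5.53$$
$$\prod B = (1)(1)(3)(1)(4)(1) = 12$$
$$F = \frac{C_{OUT}}{C_{IN}} = \frac{200\,\mathrm{fF}}{2\,\mathrm{fF}/\mu\mathrm{m}\,(2\,\mu\mathrm{m} + 1\,\mu\mathrm{m})} = 33.33$$
$$PE = \Big(\prod LE\Big)\Big(\prod B\Big)F = (5.53)(12)(33.33) = 2212.3$$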
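As a cross-check on the arithmetic in this problem, the following is a minimal Python sketch of the path-effort, delay, area, and downsizing calculations for both the original chain and the NAND2 + inverter variant. The per-stage assignment of logical effort and branching (B = 3 after stage b, B = 4 after stage d) is inferred from the sizing equations, the parasitic delay of each gate is taken as γ times its number of inputs, and all function names are illustrative rather than part of the original solution.

```python
from math import prod

C_IN, C_OUT, GAMMA = 6.0, 200.0, 0.5  # fF, fF, C_D/C_G

# One (LE, B, number of inputs) tuple per stage a..f (assumed assignment).
ORIGINAL = [(1, 1, 1), (4/3, 3, 2), (7/3, 1, 3), (4/3, 4, 2), (1, 1, 1), (5/3, 1, 2)]
MODIFIED = [(1, 1, 1), (4/3, 3, 2), (7/3, 1, 3), (4/3, 4, 2), (4/3, 1, 2), (1, 1, 1)]

def stage_effort(chain):
    """Delay-optimal EF/stage: the N-th root of the path effort."""
    path_effort = prod(le * b for le, b, _ in chain) * C_OUT / C_IN
    return path_effort ** (1 / len(chain))

def front_effort(chain, c_last):
    """Equal EF/stage of every stage before the last, given the last input cap."""
    front = chain[:-1]
    return (prod(le * b for le, b, _ in front) * c_last / C_IN) ** (1 / len(front))

def input_caps(chain, c_last):
    """Input capacitance of each stage, sized forward from C_IN."""
    ef, caps = front_effort(chain, c_last), [C_IN]
    for le, b, _ in chain[:-2]:
        caps.append(caps[-1] * ef / (le * b))
    return caps + [c_last]

def delay(chain, c_last):
    """Chain delay in units of t_inv."""
    le_f, b_f, _ = chain[-1]
    parasitic = GAMMA * sum(n for _, _, n in chain)
    return (parasitic + (len(chain) - 1) * front_effort(chain, c_last)
            + le_f * b_f * C_OUT / c_last)

def area(chain, c_last):
    """Relative area: input cap x inputs x number of replicated copies."""
    total, copies = 0.0, 1
    for (_, b, n), c in zip(chain, input_caps(chain, c_last)):
        total += c * n * copies
        copies *= b  # every later stage is replicated b times
    return total

def optimal_last_cap(chain):
    le_f, b_f, _ = chain[-1]
    return C_OUT * le_f * b_f / stage_effort(chain)

def downsized_last_cap(chain, target):
    """Bisection: delay falls monotonically between 1 fF and the optimum."""
    lo, hi = 1.0, optimal_last_cap(chain)
    for _ in range(60):
        mid = (lo + hi) / 2
        if delay(chain, mid) > target:
            lo = mid
        else:
            hi = mid
    return hi

baseline = area(ORIGINAL, optimal_last_cap(ORIGINAL))
for name, chain in (("original", ORIGINAL), ("modified", MODIFIED)):
    c_f = downsized_last_cap(chain, 30.8)
    saving = 100 * (1 - area(chain, c_f) / baseline)
    print(f"{name}: C_IN,f = {c_f:.2f} fF, "
          f"EF = {front_effort(chain, c_f):.3f}, area saving = {saving:.1f}%")
```

Under these assumptions the script should land close to the values quoted in this problem for the last-stage capacitance, the EF/stage of the preceding stages, and the area savings; small differences come from the rounding used in the hand calculation.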
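$$EF = \sqrt[6]{PE} = 3.61$$
$$C_{IN,f} = 200\,\mathrm{fF}\,\frac{(1)(1)}{3.61} \approx 55.4\,\mathrm{fF}$$
$$C_{IN,e} = 55.4\,\mathrm{fF}\,\frac{(4/3)(1)}{3.61} \approx 20.5\,\mathrm{fF}$$
$$C_{IN,d} = 20.5\,\mathrm{fF}\,\frac{(4/3)(4)}{3.61} \approx 30.3\,\mathrm{fF}$$
$$C_{IN,c} = 30.3\,\mathrm{fF}\,\frac{(7/3)(1)}{3.61} \approx 19.6\,\mathrm{fF}$$
$$C_{IN,b} = 19.6\,\mathrm{fF}\,\frac{(4/3)(3)}{3.61} \approx 21.7\,\mathrm{fF}$$
$$C_{IN,a} = 21.7\,\mathrm{fF}\,\frac{(1)(1)}{3.61} \approx 6.0\,\mathrm{fF}$$
$$t_p = t_{inv}\Big(\sum_{i=1}^{N} p_i + N \cdot EF\Big) = t_{inv}\big(\gamma(1 + 2 + 3 + 2 + 2 + 1) + 6(3.61)\big) = t_{inv}(11\gamma + 21.66) \approx 27.16\,t_{inv}$$

So, changing the inverter + NOR2 into a NAND2 + inverter results in an area savings of ~45% and a delay savings of nearly 3%. This means we now have $30.8\,t_{inv} - 27.16\,t_{inv} = 3.64\,t_{inv}$ of extra delay we can use to reduce the area even further. Again,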
we can calculate the amount of area taken up by each stage:

[Table: percentage of die area used by each stage of the modified chain]

Most of the area is still concentrated in the last stage. Following the same procedure as last time, we can write the equation for delay as a function of $C_{IN,f}$:
$$\begin{aligned}
1.1\,t_p = 30.8\,t_{inv} &= t_{inv}\sum_{i=1}^{N}\big(p_i + LE_i B_i f_i\big) \\
&= t_{inv}\left[11\gamma + 5\,EF + (1)(1)\frac{C_{OUT}}{C_{IN,f}}\right] \\
&= t_{inv}\left[11\gamma + 5\left(\frac{C_{IN,f}}{C_{IN,a}}\prod LE \prod B\right)^{1/5} + \frac{C_{OUT}}{C_{IN,f}}\right] \\
&= t_{inv}\left[11\gamma + 5\left(66.36\,\frac{C_{IN,f}}{C_{IN,a}}\right)^{1/5} + \frac{C_{OUT}}{C_{IN,f}}\right]
\end{aligned}$$
The result is that $C_{IN,f} = 18.51\,\mathrm{fF}$ and the EF/stage of the preceding stages is 2.899. As a result, the area of the chain is reduced by 76.4% over that of the original chain.

PROBLEM 2: Side Loads

We have so far ignored any fixed capacitive load between the gates in a chain, but in a
real chip, these devices and gates are connected through metal interconnect. In certain
cases, these devices may be placed sufﬁciently far apart that the delay and power may be
affected by the parasitic resistance and capacitance of the wires. For this problem, we will
ignore the resistance of the interconnect and only model the capacitive component.

[Figure: logic chain from In with gate input capacitances $C_1 = C_{IN}$, $C_2$, and $C_3$]

Consider the logic chain shown above, where $C_{IN} = 3\,\mathrm{fF}$. The two inverters buffer a signal which goes across chip to another logic block. The wire that the two inverters drive has a fixed (i.e., independent of sizing) capacitance of $C_{FIXED} = 200\,C_{IN}$. This fixed capacitance is sometimes called a sideload.

a) Derive the equation for the delay of this chain in terms of the input capacitances of the three gates ($C_1$, $C_2$, $C_3$), the capacitances $C_{FIXED}$ and $C_L$, $\gamma$, and $t_{inv}$.

Solution:
$$\begin{aligned}
t_p &= t_{inv}\left(\gamma + \frac{C_2}{C_{IN}}\right) + t_{inv}\left(\gamma + \frac{C_3 + C_{FIXED}}{C_2}\right) + t_{inv}\left(2\gamma + LE_3\,\frac{C_{OUT}}{C_3}\right) \\
&= 4\gamma\,t_{inv} + t_{inv}\left(\frac{C_2}{C_{IN}} + \frac{C_3 + C_{FIXED}}{C_2} + \frac{5}{3}\,\frac{C_{OUT}}{C_3}\right)
\end{aligned}$$

b) Using the values for $C_{IN}$ and $C_L$ we have provided and your equation from part a),
determine the optimal sizing for the gates to minimize the total delay.

Solution:
$$\frac{\partial t_p}{\partial C_2} = \frac{1}{C_{IN}} - \frac{C_3 + C_{FIXED}}{C_2^2} = 0$$
$$\frac{\partial t_p}{\partial C_3} = \frac{1}{C_2} - \frac{5}{3}\,\frac{C_{OUT}}{C_3^2} = 0$$
Since $C_{OUT} = 32\,C_{IN}$ and $C_{FIXED} = 200\,C_{IN}$, the system of equations has two unknowns and can be solved using a calculator or MATLAB; the solution that minimizes the delay is $C_3 = 85.2\,\mathrm{fF}$ and $C_2 = 45.3\,\mathrm{fF}$.

c) We will now explore an alternative heuristic to size the chain. First, pretend that
the sideload doesn’t exist, and calculate the optimal size for the last NOR gate. Leaving the sizing of this last gate constant and reintroducing the sideload, you
can now calculate the total fanout that the ﬁrst two gates must drive. Based on
this total fanout for the first two gates, you can now size the inverters using the
standard method we learned in class. Show your work and the sizing (in terms
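of $C_{IN}$) for each gate.

Solution:
$$PE = \Big(\prod LE\Big)\Big(\prod B\Big)F$$
$$\prod LE = (1)(1)\Big(\tfrac{5}{3}\Big) = \tfrac{5}{3} = 1.667$$
$$\prod B = (1)(1)(1) = 1$$
$$F = \frac{C_{OUT}}{C_{IN}} = 32$$
$$PE = \Big(\prod LE\Big)\Big(\prod B\Big)F = (1.667)(1)(32) = 53.33$$
$$EF = \sqrt[3]{PE} = 3.76$$
$$C_{IN,3} = C_{OUT,3}\,\frac{(B)(LE)}{EF} = 32(3\,\mathrm{fF})\,\frac{(1)(5/3)}{3.76} = 42.55\,\mathrm{fF}$$
Now we know what loading the NOR gate adds to the inverter chain, so we can size the two inverters just as we would if we were sizing a chain of inverters with a load of $C_L = 200\,C_{IN} + C_{IN,3}$.
$$PE = \Big(\prod LE\Big)\Big(\prod B\Big)F$$
$$\prod LE = (1)(1) = 1$$
$$\prod B = (1)(1) = 1$$
$$F = \frac{C_{OUT}}{C_{IN}} = \frac{200(3\,\mathrm{fF}) + 42.55\,\mathrm{fF}}{3\,\mathrm{fF}} = \frac{642.55\,\mathrm{fF}}{3\,\mathrm{fF}} = 214.18$$
$$PE = \Big(\prod LE\Big)\Big(\prod B\Big)F = (1)(1)(214.18) = 214.18$$
$$EF = \sqrt{PE} = 14.64$$
$$C_{IN,2} = 642.55\,\mathrm{fF}\,\frac{(1)(1)}{14.64} = 43.89\,\mathrm{fF}$$
We can double check that $C_{IN,1} = 3\,\mathrm{fF}$:
$$C_{IN,1} = 43.89\,\mathrm{fF}\,\frac{(1)(1)}{14.64} = 3\,\mathrm{fF}$$

d) In terms of $t_{inv}$, what is the delay of the chain from part b)? How does this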
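compare to the delay of the chain in part c)? You may assume $\gamma = 1$. What are the major difference(s) between the two designs?

Solution:
$$t_{p,b} = 4\gamma\,t_{inv} + t_{inv}\left(\frac{C_2}{C_{IN}} + \frac{C_3 + C_{FIXED}}{C_2} + \frac{5}{3}\,\frac{C_{OUT}}{C_3}\right) = 36.1\,t_{inv}$$
$$t_{p,c} = 4\gamma\,t_{inv} + t_{inv}\left(\frac{C_2}{C_{IN}} + \frac{C_3 + C_{FIXED}}{C_2} + \frac{5}{3}\,\frac{C_{OUT}}{C_3}\right) = 37.0\,t_{inv}$$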
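The comparison between the delay-optimal sizing of part b) and the heuristic of part c) can be reproduced with a short Python sketch. The fixed-point iteration used here to solve the two stationarity conditions is one convenient choice, not necessarily how the original solution was computed, and the function names are illustrative.

```python
from math import sqrt

C_IN = 3.0               # fF
C_FIXED = 200 * C_IN     # fF, wire sideload
C_OUT = 32 * C_IN        # fF, load on the NOR gate
LE_NOR = 5 / 3
GAMMA = 1.0

def delay(c2, c3):
    """Chain delay in units of t_inv for given input caps C2 and C3 (fF)."""
    return 4 * GAMMA + c2 / C_IN + (c3 + C_FIXED) / c2 + LE_NOR * C_OUT / c3

def optimal_sizes(iterations=100):
    """Part b): iterate the two conditions dtp/dC2 = 0 and dtp/dC3 = 0."""
    c2 = c3 = C_IN
    for _ in range(iterations):
        c2 = sqrt(C_IN * (c3 + C_FIXED))   # from 1/C_IN = (C3 + C_FIXED)/C2^2
        c3 = sqrt(LE_NOR * C_OUT * c2)     # from 1/C2 = (5/3) C_OUT/C3^2
    return c2, c3

def heuristic_sizes():
    """Part c): size the NOR ignoring the sideload, then size the inverters."""
    ef_last = (LE_NOR * C_OUT / C_IN) ** (1 / 3)
    c3 = C_OUT * LE_NOR / ef_last
    c2 = sqrt(C_IN * (C_FIXED + c3))       # two equal-effort inverter stages
    return c2, c3

for label, (c2, c3) in (("optimal", optimal_sizes()), ("heuristic", heuristic_sizes())):
    print(f"{label}: C2 = {c2:.1f} fF, C3 = {c3:.1f} fF, delay = {delay(c2, c3):.1f} t_inv")
```

The heuristic value of $C_3$ comes out slightly below 42.55 fF here because the script does not round the EF to 3.76 before dividing.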
The major difference between the two designs is that $C_3$ in part b) is more than twice that of part c). Since the fixed sideload is very large, the delay-optimal $C_3$ is relatively large: as long as $C_3$ is small in comparison to the sideload, it has little effect on the delay of the previous stage. However, the improvement in delay vs. the heuristic is minimal ($36.1\,t_{inv}$ versus $37.0\,t_{inv}$), while the area and power costs of achieving the optimal delay are significant.

e) How can you redesign this chain to reduce the delay? What is the new delay?
[Hint: What is the EF/stage with the current design?] Show your work and a
gate-level schematic (with sizes) of your new design.

Solution:

Looking at the design from part c), where the EF of the first two stages is 14.64, it should be clear that if we want to reduce the delay, the main thing we need to do is bring the EF of these two stages closer to 4. The way to do this is to treat the first part of the chain (before the sideload) as a new chain and find the optimal number of stages for that chain:
$$N_{opt} = \log_4 PE = \frac{\log_{10} PE}{\log_{10} 4} = \frac{\log_{10} 214.18}{\log_{10} 4} = 3.87$$
So, we can round the number of stages for the inverter chain that drives the long wire
to 4, leading to an EF/stage of 3.83.
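The stage-count calculation can be sketched in Python as follows. The load value reuses the 42.55 fF NOR input capacitance from part c); variable names are illustrative.

```python
from math import log

C_IN = 3.0                      # fF, input capacitance of the first inverter
LOAD = 200 * C_IN + 42.55       # fF, wire sideload plus NOR input capacitance

path_effort = LOAD / C_IN       # inverter chain: LE = 1 and B = 1 for every stage
n_opt = log(path_effort) / log(4)
n_stages = round(n_opt)
ef = path_effort ** (1 / n_stages)

# Input capacitance of each inverter, sized forward from C_IN.
sizes = [C_IN * ef ** k for k in range(n_stages)]

print(f"N_opt = {n_opt:.2f}, stages = {n_stages}, EF/stage = {ef:.2f}")
print("inverter input caps (fF):", [round(c, 1) for c in sizes])
```

The forward-sized capacitances can differ slightly in the last digit from hand values obtained by dividing backward from the load with a rounded EF.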
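The new schematic (with sizing) is shown below:

[Figure: four inverters with $C_1 = 3\,\mathrm{fF}$, $C_2 = 11.4\,\mathrm{fF}$, $C_3 = 43.8\,\mathrm{fF}$, $C_4 = 167.8\,\mathrm{fF}$ driving $C_{FIXED}$, followed by the NOR gate with $C_5 = 42.55\,\mathrm{fF}$ driving $C_{OUT} = 32\,C_{IN}$]

The new delay is
$$t_p = 6\gamma\,t_{inv} + t_{inv}\big(4(3.83) + (3.76)\big) = 25.1\,t_{inv}$$

PROBLEM 3: MOS Transistor Model

Use the velocity saturation model presented in lecture (shown below) to complete a) and b). For parts c) and d), attach your SPICE netlist and results. Use $V_{TN} = 150\,\mathrm{mV}$, $|V_{TP}| = 300\,\mathrm{mV}$, $v_{SATN} = 1.12 \times 10^7\,\mathrm{cm/s}$, $v_{SATP} = 1 \times 10^7\,\mathrm{cm/s}$, $C_{ox} = 15\,\mathrm{fF}/\mu\mathrm{m}^2$, $\mu_N = 260\,\mathrm{cm^2/(V\,s)}$, $\mu_P = 120\,\mathrm{cm^2/(V\,s)}$.

[Figure: inverter circuit from In to Out with supply labels 1 V and 1.2 V]

$$I_D = W\,v_{SAT}\,C_{ox}\,\frac{(V_{GS} - V_T)^2}{(V_{GS} - V_T) + E_C L}$$

a) Using the model, find $I_{DN}$ of the NMOS in the second inverter when In = 0V and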
Out = 1V. (Note that as we will see later on in the lectures, the current of the
transistors directly impacts the delay of the inverter) Solution:
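$$E_C L = \frac{2\,v_{SAT}}{\mu_N}\,L = \frac{2\,(1.12 \times 10^7\,\mathrm{cm/s})}{260\,\mathrm{cm^2/(V\,s)}}\,90\,\mathrm{nm} = 0.775\,\mathrm{V}$$
$$I_D = (1\,\mu\mathrm{m})(1.12 \times 10^7\,\mathrm{cm/s})(15\,\mathrm{fF}/\mu\mathrm{m}^2)\,\frac{(1050\,\mathrm{mV})^2}{(1050\,\mathrm{mV}) + 0.775\,\mathrm{V}} = 1.015\,\mathrm{mA}$$

b) Choose X such that $I_{DP}$ equals the value you got for $I_{DN}$ in part a) when In = 1.2V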
and Out = 0V. Solution:
$$E_C L = \frac{2\,v_{SAT}}{\mu_P}\,L = \frac{2\,(1.0 \times 10^7\,\mathrm{cm/s})}{120\,\mathrm{cm^2/(V\,s)}}\,90\,\mathrm{nm} = 1.5\,\mathrm{V}$$
$$I_D = (X\,\mu\mathrm{m})(1.0 \times 10^7\,\mathrm{cm/s})(15\,\mathrm{fF}/\mu\mathrm{m}^2)\,\frac{(700\,\mathrm{mV})^2}{(700\,\mathrm{mV}) + 1.5\,\mathrm{V}} = 1.015\,\mathrm{mA}$$
$$X = 3.04\,\mu\mathrm{m}$$

c) Simulate the circuit with your sizing from b), and compare your results for $I_{DP}$ and $I_{DN}$ from parts a) and b). Why are the results different? Use SPICE to find the value for X that makes the currents match. How close was the analytical model to the SPICE result?

Solution:

Using SPICE, the drain current of the NMOS is 999.1 μA and the drain current of the PMOS is 1101 μA. The plot for determining the correct value of X is shown below.

[Figure: "Finding WP to match the ID of 1um NMOS"; drain current (μA) vs. width of PMOS (μm)]

The correct value of X from SPICE is 2.755 μm, which is pretty close to our estimate (within about 10% of 3.04 μm). The reason for the discrepancy between the simulation and the hand calculations is
that the model that we are using does not account for all of the physics of the real
device — in particular, we ignored the fact that the transistor doesn’t act like a perfect
current source even when it is saturated. Here is the SPICE deck:
HW 5-3 solution
.LIB '/home/ff/ee141/MODELS/gpdk090_mos.sp' TT_s1v
.option post nomod
VDD vdd 0 1.0
.param wp = 3.04u
vg1 g1 0 1.2
M1 vdd g1 s1 s1 gpdk090_nmos1v W=1u L=90n
M2 d2 0 vdd vdd gpdk090_pmos1v W=wp L=90n
vt1 s1 0 0
vt2 d2 0 0
.op
.dc wp 1u 3u .005u
.end

d) [BONUS:] Can you make the analytical model fit the SPICE model better by
extracting an additional parameter? If so, extract this parameter and recalculate
the drain currents using this improved model. Solution: One thing we are missing from the velocity saturation model is the channel length
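modulation parameter ($\lambda$). To find $\lambda$, we must remember that
$$\lambda = \frac{1}{r_{DS}\,I_{DS}}$$
Then we can find the $r_{DS}$ of the transistor:
$$r_{DS} = \frac{dV_{DS}}{dI_{DS}}$$
We can find $r_{DS}$ graphically by plotting $I_D$ vs. $V_{DS}$ for the NMOS and PMOS transistors.

[Figure: "ID vs. VDS, Vgp = 1V, Vgn = 1.2V"; y-axis: drain current (μA)]

At $V_{DS} = 1\,\mathrm{V}$, $V_{GS} = 1\,\mathrm{V}$, the slope of the PMOS line is $(3060\,\Omega)^{-1}$. At $V_{GS} = 1.2\,\mathrm{V}$, the slope of the NMOS line is $(3424\,\Omega)^{-1}$. At these points, $I_{DSN} = 999\,\mu\mathrm{A}$ and $I_{DSP} = 997.5\,\mu\mathrm{A}$. This results in $\lambda_N = 0.292\,\mathrm{V}^{-1}$ and $\lambda_P = 0.328\,\mathrm{V}^{-1}$. Recalculating the drain currents:
$$I_{DN} = (1\,\mu\mathrm{m})(1.12 \times 10^7\,\mathrm{cm/s})(15\,\mathrm{fF}/\mu\mathrm{m}^2)\,\frac{(1050\,\mathrm{mV})^2}{(1050\,\mathrm{mV}) + 0.775\,\mathrm{V}}\,\big(1 + \lambda_N(1\,\mathrm{V})\big) = 1.311\,\mathrm{mA}$$
$$I_{DP} = (X\,\mu\mathrm{m})(1.0 \times 10^7\,\mathrm{cm/s})(15\,\mathrm{fF}/\mu\mathrm{m}^2)\,\frac{(700\,\mathrm{mV})^2}{(700\,\mathrm{mV}) + 1.5\,\mathrm{V}}\,\big(1 + \lambda_P(1\,\mathrm{V})\big) = 1.311\,\mathrm{mA}$$
$$X = 2.95\,\mu\mathrm{m}$$
This value for X is quite a bit closer than what we extracted when we neglected channel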
length modulation. There is still some discrepancy however (particularly in the value of
the drain currents), mainly because the original value we used for $v_{SAT}$ was too high.

Here is the SPICE deck:

HW 5-3d solution
.LIB '/home/ff/ee141/MODELS/gpdk090_mos.sp' TT_s1v
.option post nomod
VDD vdd 0 1.0
.param wp = 2.755u
vg1 g1 0 1.2
* vg2 ensures that as VDD is swept, Vgsp = 1V
vg2 vdd g2 1
M1 vdd g1 s1 s1 gpdk090_nmos1v W=1u L=90n
M2 d2 g2 vdd vdd gpdk090_pmos1v W=wp L=90n
vt1 s1 0 0
vt2 d2 0 0
.op
.dc VDD 0 1.2 .1m
.end
...
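The hand calculations in parts a), b), and d) can be reproduced with a small Python sketch of the velocity-saturation model. The unit handling (widths in μm, velocities in cm/s) and the helper names are choices made for this illustration, and the $r_{DS}$ and current values are the ones read from the plot in part d).

```python
C_OX = 15e-15        # F/um^2
L_CM = 90e-7         # channel length: 90 nm expressed in cm
UM_PER_CM = 1e4      # converts v_sat from cm/s to um/s

def ec_l(v_sat, mu):
    """E_C * L in volts, with E_C = 2 * v_sat / mu."""
    return 2 * v_sat / mu * L_CM

def current_per_um(v_sat, mu, v_ov, lam=0.0, v_ds=1.0):
    """Drain current in amperes per micron of width.

    I_D/W = v_sat * C_ox * Vov^2 / (Vov + E_C*L) * (1 + lam * V_DS)
    """
    base = C_OX * v_sat * UM_PER_CM * v_ov ** 2 / (v_ov + ec_l(v_sat, mu))
    return base * (1 + lam * v_ds)

def lam_from_slope(r_ds, i_ds):
    """Channel-length modulation parameter from lambda = 1 / (r_DS * I_DS)."""
    return 1 / (r_ds * i_ds)

NMOS = dict(v_sat=1.12e7, mu=260, v_ov=1.2 - 0.15)   # V_GS = 1.2 V, V_TN = 150 mV
PMOS = dict(v_sat=1.0e7, mu=120, v_ov=1.0 - 0.30)    # |V_GS| = 1 V, |V_TP| = 300 mV

# Parts a) and b): 1 um wide NMOS, and the PMOS width X that matches its current.
i_dn = current_per_um(**NMOS)
x = i_dn / current_per_um(**PMOS)

# Part d): repeat with channel-length modulation at V_DS = 1 V.
lam_n = lam_from_slope(3424, 999e-6)
lam_p = lam_from_slope(3060, 997.5e-6)
i_dn_clm = current_per_um(**NMOS, lam=lam_n)
x_clm = i_dn_clm / current_per_um(**PMOS, lam=lam_p)

print(f"I_DN = {i_dn * 1e3:.3f} mA, X = {x:.2f} um")
print(f"lambda_N = {lam_n:.3f} 1/V, lambda_P = {lam_p:.3f} 1/V")
print(f"with CLM: I_DN = {i_dn_clm * 1e3:.3f} mA, X = {x_clm:.2f} um")
```

Because the script carries full precision through each step, the last digit of X in part d) may differ slightly from the hand value.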
 Spring '08
 Staff
