UNIVERSITY OF CALIFORNIA, BERKELEY
College of Engineering
Department of Electrical Engineering and Computer Sciences

Elad Alon
Homework #5
EECS141

PROBLEM 1: Logical Effort

For this problem, you should assume that $C_G = 2\,\mathrm{fF/\mu m}$ and that the transistors are long-channel for the purpose of calculating LE.

[Figure: gate chain from In to Out; the final gate f drives C_OUT]

a) What is the total path effort from In to Out?

\[ PE = (\Pi LE)(\Pi B)F \]
\[ \Pi LE = (1)\left(\tfrac{4}{3}\right)\left(\tfrac{7}{3}\right)\left(\tfrac{4}{3}\right)(1)\left(\tfrac{5}{3}\right) = 6.91 \]
\[ \Pi B = (1)(1)(3)(1)(4)(1) = 12 \]
\[ F = \frac{C_{OUT}}{C_{IN}} = \frac{200\,\mathrm{fF}}{2\,\mathrm{fF/\mu m}\,(2\,\mathrm{\mu m} + 1\,\mathrm{\mu m})} = 33.33 \]
\[ PE = (\Pi LE)(\Pi B)F = (6.91)(12)(33.33) = 2764 \]

b) To minimize the delay, what should the EF/stage for this chain of gates be?

\[ EF = \sqrt[6]{PE} = 3.75 \]

c) Size the gates in this chain to minimize the delay from In to Out. Only calculate the input capacitance of the gates; don't bother to provide the actual transistor sizes.

Since $EF = f \cdot B \cdot LE = 3.75$ and $f_x = C_{OUT,x}/C_{IN,x}$, we can calculate the input capacitance of each stage as follows:

\[ C_{IN,f} = C_{OUT,f}\frac{(B)(LE)}{EF} = 200\,\mathrm{fF}\,\frac{(1)(5/3)}{3.75} \approx 88.89\,\mathrm{fF} \]
\[ C_{IN,e} = 88.89\,\mathrm{fF}\,\frac{(1)(1)}{3.75} \approx 23.7\,\mathrm{fF} \]
\[ C_{IN,d} = 23.7\,\mathrm{fF}\,\frac{(4)(4/3)}{3.75} \approx 33.7\,\mathrm{fF} \]
\[ C_{IN,c} = 33.7\,\mathrm{fF}\,\frac{(1)(7/3)}{3.75} \approx 21\,\mathrm{fF} \]
\[ C_{IN,b} = 21\,\mathrm{fF}\,\frac{(3)(4/3)}{3.75} \approx 22.4\,\mathrm{fF} \]

We can check our calculations by confirming that $C_{IN,a}$ does indeed equal 6 fF:

\[ C_{IN,a} = 22.4\,\mathrm{fF}\,\frac{(1)(1)}{3.75} \approx 6.0\,\mathrm{fF} \]

d) Using this sizing, what is the delay (in units of $t_{inv}$) of your chain from In rising to Out rising? You can assume that the critical input of the complex gates is always at the "top" of the transistor stacks (i.e., the critical input is always closest to the output node), and that $C_D/C_G = \gamma = 0.5$.

Solution: The delay of each stage is given by $D = t_{inv}(p + EF)$. We set the EF of each stage to 3.75, so we just need to find the parasitic gate delay p for each of the gates. Therefore:

\[ t_p = t_{inv}\left(\sum_{i=1}^{N} p_i + N \cdot EF\right) = t_{inv}\big(\gamma(1 + 2 + 3 + 2 + 1 + 2) + 6(3.75)\big) = t_{inv}(11\gamma + 22.5) = 28\,t_{inv} \]

e) You present your design to your boss and she tells you that the delay of your circuit is below the specification for the block.
She also tells you that your team is over budget on die area. Revise your design such that you save the maximum amount of area, while increasing the delay by no more than 10%. [Note: There are many possible solutions; any solution that takes a reasonable approach to the problem will receive full credit.]

We can approximate the area of each gate as being directly proportional to the input capacitance times the number of inputs:

\[ A_{gate} \propto C_{gate} N_{inputs} \]

To calculate the total die area, we also need to account for the branching factors, since each stage beyond a branch is replicated B times. The total die area is proportional to the area of each gate times the cumulative product of branching factors:

\[ A_{tot} \propto \sum_{i=1}^{N} A_i \prod_{j=1}^{i} B_j \]

We can therefore calculate what percentage of the die area each stage uses:

[Table: share of die area per stage]

We can see that the majority of the area is used in the last stage, so we'll want to focus on this stage for our optimization.

Since we are allowed a 10% increase in delay, one simple method is to reduce the size of the last stage. To maximize the benefit we get from this, we should keep in mind that as we downsize the last stage, all of the gates before it will get faster as well (since they see less fanout):

\[ t_p = 30.8\,t_{inv} = t_{inv}\sum_{i=1}^{N}\left(p_i + LE_i B_i f_i\right) \]
\[ = t_{inv}\left[11\gamma + 5\,EF + \left(\tfrac{5}{3}\right)(1)\left(\frac{C_{OUT,f}}{C_{IN,f}}\right)\right] \]
\[ = t_{inv}\left[11\gamma + 5\sqrt[5]{\left(\frac{C_{IN,f}}{C_{IN,a}}\right)\Pi LE\,\Pi B} + \left(\tfrac{5}{3}\right)\left(\frac{C_{OUT,f}}{C_{IN,f}}\right)\right] \]
\[ = t_{inv}\left[11\gamma + 5\sqrt[5]{\left(\frac{C_{IN,f}}{C_{IN,a}}\right)49.78} + \left(\tfrac{5}{3}\right)\left(\frac{C_{OUT,f}}{C_{IN,f}}\right)\right] \]

Here $\Pi LE$ and $\Pi B$ are taken over the first five stages only. This equation can be solved numerically using a calculator or a computer, and the result is that $C_{IN,f} = 33.79\,\mathrm{fF}$ and the EF/stage of the preceding chain is 3.087. Using this approach, the resized chain results in an area savings of 57.1% (with only 10% larger delay).

Another approach is to actually modify the design of the chain. Again, we want to start at the last stage since it accounts for the largest amount of area.
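If we notice that the final NOR is preceded by an inverter in the critical path, we can add an inverter on the other input of the NOR (since it is not in the critical path), and the NOR gate plus the inverters become logically equivalent to an AND gate. Then we can implement the AND with a NAND gate followed by an inverter:

[Figure: modified gate chain ending in a NAND2 followed by an inverter driving C_OUT]

This changes the PE of the chain, so we have to redo the sizing. (Note that you can skip this step, since in the next step we will resize the chain to reduce its area even further, but doing this step shows the benefit just from modifying the gates.)

\[ PE = (\Pi LE)(\Pi B)F \]
\[ \Pi LE = (1)\left(\tfrac{4}{3}\right)\left(\tfrac{7}{3}\right)\left(\tfrac{4}{3}\right)\left(\tfrac{4}{3}\right)(1) = 5.53 \]
\[ \Pi B = (1)(1)(3)(1)(4)(1) = 12 \]
\[ F = \frac{C_{OUT}}{C_{IN}} = \frac{200\,\mathrm{fF}}{2\,\mathrm{fF/\mu m}\,(2\,\mathrm{\mu m} + 1\,\mathrm{\mu m})} = 33.33 \]
\[ PE = (\Pi LE)(\Pi B)F = (5.53)(12)(33.33) = 2212.3 \]
\[ EF = \sqrt[6]{PE} = 3.61 \]

\[ C_{IN,f} = 200\,\mathrm{fF}\,\frac{(1)(1)}{3.61} \approx 55.4\,\mathrm{fF} \]
\[ C_{IN,e} = 55.4\,\mathrm{fF}\,\frac{(1)(4/3)}{3.61} \approx 20.5\,\mathrm{fF} \]
\[ C_{IN,d} = 20.5\,\mathrm{fF}\,\frac{(4)(4/3)}{3.61} \approx 30.3\,\mathrm{fF} \]
\[ C_{IN,c} = 30.3\,\mathrm{fF}\,\frac{(1)(7/3)}{3.61} \approx 19.6\,\mathrm{fF} \]
\[ C_{IN,b} = 19.6\,\mathrm{fF}\,\frac{(3)(4/3)}{3.61} \approx 21.7\,\mathrm{fF} \]
\[ C_{IN,a} = 21.7\,\mathrm{fF}\,\frac{(1)(1)}{3.61} \approx 6.0\,\mathrm{fF} \]

\[ t_p = t_{inv}\left(\sum_{i=1}^{N} p_i + N \cdot EF\right) = t_{inv}\big(\gamma(1 + 2 + 3 + 2 + 2 + 1) + 6(3.61)\big) = t_{inv}(11\gamma + 21.66) \approx 27.16\,t_{inv} \]

So, changing the inverter + NOR2 into a NAND2 + inverter results in an area savings of ~45% and a delay savings of nearly 3%. This means we now have $30.8\,t_{inv} - 27.16\,t_{inv} = 3.64\,t_{inv}$ of extra delay we can use to reduce the area even further. Again, we can calculate the amount of area taken up by each stage:

[Table: share of die area per stage for the modified chain]

Most of the area is still concentrated in the last stage. Following the same procedure as last time, we can write the equation for delay as a function of $C_{IN,f}$:

\[ t_p = 30.8\,t_{inv} = t_{inv}\sum_{i=1}^{N}\left(p_i + LE_i B_i f_i\right) \]
\[ = t_{inv}\left[11\gamma + 5\,EF + (1)(1)\left(\frac{C_{OUT,f}}{C_{IN,f}}\right)\right] \]
\[ = t_{inv}\left[11\gamma + 5\sqrt[5]{\left(\frac{C_{IN,f}}{C_{IN,a}}\right)\Pi LE\,\Pi B} + \left(\frac{C_{OUT,f}}{C_{IN,f}}\right)\right] \]
\[ = t_{inv}\left[11\gamma + 5\sqrt[5]{\left(\frac{C_{IN,f}}{C_{IN,a}}\right)66.36} + \left(\frac{C_{OUT,f}}{C_{IN,f}}\right)\right] \]

The result is that $C_{IN,f} = 18.51\,\mathrm{fF}$ and the EF/stage of the preceding stages is 2.899. As a result, the area of the chain is reduced by 76.4% over that of the original chain.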
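The Problem 1 numbers can be cross-checked with a short script. The following is a minimal sketch, not part of the original solution: the per-stage logical efforts and branching factors are inferred from the worked sizing above, the function names are hypothetical, and the 30.8 t_inv budget is taken directly from the text. It computes the delay-optimal sizing and then uses bisection to solve the part e) equation for the last-stage input capacitance.

```python
from math import prod

GAMMA = 0.5                 # C_D / C_G, from part d)
C_IN, C_OUT = 6.0, 200.0    # fF; C_IN = 2 fF/um * (2 um + 1 um)

# One tuple per stage a..f: (logical effort, branching at the stage output,
# parasitic delay in units of gamma). Inferred from the worked numbers above.
ORIGINAL = [(1, 1, 1), (4/3, 3, 2), (7/3, 1, 3), (4/3, 4, 2), (1, 1, 1), (5/3, 1, 2)]
# Inverter + NOR2 replaced by NAND2 + inverter.
MODIFIED = ORIGINAL[:4] + [(4/3, 1, 2), (1, 1, 1)]


def path_effort(chain, c_in=C_IN, c_out=C_OUT):
    """PE = (prod LE)(prod B) * C_out / C_in."""
    return prod(le * b for le, b, _ in chain) * c_out / c_in


def size_chain(chain):
    """Return (EF per stage, input capacitances a..f in fF) for minimum delay."""
    ef = path_effort(chain) ** (1 / len(chain))
    caps, load = [], C_OUT
    for le, b, _ in reversed(chain):
        load = load * b * le / ef   # C_in = C_out * B * LE / EF
        caps.append(load)
    return ef, caps[::-1]


def min_delay(chain):
    """Minimum delay in units of t_inv."""
    ef, _ = size_chain(chain)
    return GAMMA * sum(p for _, _, p in chain) + len(chain) * ef


def delay_with_last_stage(chain, c_f):
    """Delay (t_inv) when the last gate has input cap c_f and the
    preceding stages are re-sized for equal effort."""
    front, (le_f, b_f, _) = chain[:-1], chain[-1]
    ef_front = path_effort(front, C_IN, c_f) ** (1 / len(front))
    p_total = GAMMA * sum(p for _, _, p in chain)
    return p_total + len(front) * ef_front + le_f * b_f * C_OUT / c_f


def shrink_last_stage(chain, target, lo=1.0):
    """Bisection: smallest last-stage input cap (fF) that meets `target`."""
    hi = size_chain(chain)[1][-1]   # delay-optimal size, delay < target
    for _ in range(60):
        mid = (lo + hi) / 2
        if delay_with_last_stage(chain, mid) > target:
            lo = mid                # too small: last stage is too slow
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    print(size_chain(ORIGINAL))
    print(shrink_last_stage(ORIGINAL, 30.8), shrink_last_stage(MODIFIED, 30.8))
```

Because the script carries full precision instead of the rounded EF of 3.75, the intermediate capacitances differ slightly from the hand calculation; the two bisection results should land close to the 33.79 fF and 18.51 fF quoted above.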
PROBLEM 2: Side Loads

We have so far ignored any fixed capacitive load between the gates in a chain, but in a real chip, these devices and gates are connected through metal interconnect. In certain cases, these devices may be placed sufficiently far apart that the delay and power may be affected by the parasitic resistance and capacitance of the wires. For this problem, we will ignore the resistance of the interconnect and only model the capacitive component.

[Figure: logic chain from In with gate input capacitances C1 = C_IN, C2, and C3]

Consider the logic chain shown above, where $C_{IN} = 3\,\mathrm{fF}$. The two inverters buffer a signal which goes across chip to another logic block. The wire that the two inverters drive has a fixed (i.e., independent of sizing) capacitance of $C_{FIXED} = 200C_{IN}$. This fixed capacitance is sometimes called a side-load.

a) Derive the equation for the delay of this chain in terms of the input capacitances of the three gates ($C_1$, $C_2$, $C_3$), the capacitances $C_{FIXED}$ and $C_L$, $\gamma$, and $t_{inv}$.

Solution:

\[ t_p = t_{inv}\left(\gamma + \frac{C_2}{C_{IN}}\right) + t_{inv}\left(\gamma + \frac{C_3 + C_{FIXED}}{C_2}\right) + t_{inv}\left(2\gamma + LE_3\frac{C_{OUT}}{C_3}\right) \]
\[ = 4\gamma\,t_{inv} + t_{inv}\left(\frac{C_2}{C_{IN}} + \frac{C_3 + C_{FIXED}}{C_2} + \frac{5}{3}\frac{C_{OUT}}{C_3}\right) \]

b) Using the values for $C_{IN}$ and $C_L$ we have provided and your equation from part a), determine the optimal sizing for the gates to minimize the total delay.

Solution:

\[ \frac{\partial t_p}{\partial C_2} = t_{inv}\left(\frac{1}{C_{IN}} - \frac{C_3 + C_{FIXED}}{C_2^2}\right) = 0 \]
\[ \frac{\partial t_p}{\partial C_3} = t_{inv}\left(\frac{1}{C_2} - \frac{5}{3}\frac{C_{OUT}}{C_3^2}\right) = 0 \]

Since $C_{OUT} = 32C_{IN}$ and $C_{FIXED} = 200C_{IN}$, the system of equations has two unknowns and can be solved using a calculator or MATLAB; the solution that minimizes the delay is $C_3 = 85.2\,\mathrm{fF}$ and $C_2 = 45.3\,\mathrm{fF}$.

c) We will now explore an alternative heuristic to size the chain. First, pretend that the side-load doesn't exist, and calculate the optimal size for the last NOR gate. Leaving the sizing of this last gate constant and re-introducing the side-load, you can now calculate the total fanout that the first two gates must drive. Based on this total fanout for the first two gates, you can now size the inverters using the standard method we learned in class.
Show your work and the sizing (in terms of $C_{IN}$) for each gate.

Solution:

\[ PE = (\Pi LE)(\Pi B)F \]
\[ \Pi LE = (1)(1)\left(\tfrac{5}{3}\right) = \tfrac{5}{3} = 1.667 \]
\[ \Pi B = (1)(1)(1) = 1 \]
\[ F = \frac{C_{OUT}}{C_{IN}} = 32 \]
\[ PE = (\Pi LE)(\Pi B)F = (1.667)(1)(32) = 53.33 \]
\[ EF = \sqrt[3]{PE} = 3.76 \]
\[ C_{IN,3} = C_{OUT,3}\frac{(B)(LE)}{EF} = 32(3\,\mathrm{fF})\frac{(1)(5/3)}{3.76} = 42.55\,\mathrm{fF} \]

Now we know what loading the NOR gate adds to the inverter chain, so we can size the two inverters just as we would if we were sizing a chain of inverters with a load of $C_L = 200C_{IN} + C_{IN,3}$.

\[ PE = (\Pi LE)(\Pi B)F \]
\[ \Pi LE = (1)(1) = 1 \]
\[ \Pi B = (1)(1) = 1 \]
\[ F = \frac{C_{OUT}}{C_{IN}} = \frac{200(3\,\mathrm{fF}) + 42.55\,\mathrm{fF}}{3\,\mathrm{fF}} = \frac{642.55\,\mathrm{fF}}{3\,\mathrm{fF}} = 214.18 \]
\[ PE = (\Pi LE)(\Pi B)F = (1)(1)(214.18) = 214.18 \]
\[ EF = \sqrt{PE} = 14.64 \]
\[ C_{IN,2} = 642.55\,\mathrm{fF}\,\frac{(1)(1)}{14.64} = 43.89\,\mathrm{fF} \]

We can double check that $C_{IN,1} = 3\,\mathrm{fF}$:

\[ C_{IN,1} = 43.89\,\mathrm{fF}\,\frac{(1)(1)}{14.64} = 3\,\mathrm{fF} \]

d) In terms of $t_{inv}$, what is the delay of the chain from part b)? How does this compare to the delay of the chain in part c)? You may assume $\gamma = 1$. What are the major difference(s) between the two designs?

Solution:

\[ t_{p,b} = 4\gamma\,t_{inv} + t_{inv}\left(\frac{C_2}{C_{IN}} + \frac{C_3 + C_{FIXED}}{C_2} + \frac{5}{3}\frac{C_{OUT}}{C_3}\right) = 36.1\,t_{inv} \]
\[ t_{p,c} = 4\gamma\,t_{inv} + t_{inv}\left(\frac{C_2}{C_{IN}} + \frac{C_3 + C_{FIXED}}{C_2} + \frac{5}{3}\frac{C_{OUT}}{C_3}\right) = 37.0\,t_{inv} \]

Both delays use the expression from part a), evaluated with the sizes from part b) and part c) respectively. The major difference between the two designs is that $C_3$ in part b) is more than twice that of part c). Since the fixed side-load is very large, the delay-optimal $C_3$ is relatively large: as long as $C_3$ is small in comparison to the side-load, it has little effect on the delay of the previous stage. However, the improvement in delay vs. the heuristic is minimal, while the area and power costs of achieving the optimal delay are significant.

e) How can you redesign this chain to reduce the delay? What is the new delay? [Hint: What is the EF/stage with the current design?] Show your work and a gate-level schematic (with sizes) of your new design.
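Before the redesign in part e), the sizing results of parts b) through d) can be reproduced numerically. The sketch below is an illustration rather than the original MATLAB or calculator work: it assumes γ = 1 as in part d), the function names are hypothetical, and it solves the two stationarity conditions from part b) by alternating between them (a simple fixed-point iteration) instead of calling a solver.

```python
from math import sqrt

C_IN = 3.0                  # fF
C_FIXED = 200 * C_IN        # side-load on the wire, fF
C_OUT = 32 * C_IN           # final load, fF
LE3 = 5 / 3                 # logical effort of the NOR2
GAMMA = 1.0                 # assumption taken from part d)


def delay(c2, c3, gamma=GAMMA):
    """Chain delay in units of t_inv, using the expression from part a)."""
    return 4 * gamma + c2 / C_IN + (c3 + C_FIXED) / c2 + LE3 * C_OUT / c3


def optimal_sizes(iterations=100):
    """Part b): alternate between the two conditions dtp/dC2 = 0 and dtp/dC3 = 0."""
    c2 = c3 = C_IN
    for _ in range(iterations):
        c2 = sqrt(C_IN * (c3 + C_FIXED))    # from 1/C_IN = (C3 + C_FIXED)/C2^2
        c3 = sqrt(LE3 * C_OUT * c2)         # from 1/C2 = (5/3) C_OUT / C3^2
    return c2, c3


def heuristic_sizes():
    """Part c): size the NOR ignoring the side-load, then size the inverters."""
    ef = (LE3 * C_OUT / C_IN) ** (1 / 3)    # three-stage EF without side-load
    c3 = LE3 * C_OUT / ef
    c2 = sqrt(C_IN * (c3 + C_FIXED))        # equal effort over the two inverters
    return c2, c3


if __name__ == "__main__":
    for name, (c2, c3) in (("optimal", optimal_sizes()), ("heuristic", heuristic_sizes())):
        print(f"{name}: C2 = {c2:.1f} fF, C3 = {c3:.1f} fF, delay = {delay(c2, c3):.1f} t_inv")
```

The alternating update works here because each condition fixes one variable given the other and the coupling between them is weak; a general-purpose optimizer would be the safer choice for chains with more unknowns.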
Solution: Looking at the design from part c), where the EF of the first two stages is 14.64, it should be clear that if we want to reduce the delay, the main thing we need to do is bring the EF of these two stages closer to 4. The way to do this is to treat the first part of the chain (before the side-load) as a new chain and find the optimal number of stages for that chain:

\[ N_{opt} = \log_4 PE = \frac{\log_{10} 214.18}{\log_{10} 4} = 3.87 \]

So, we can round the number of stages for the inverter chain that drives the long wire to 4, leading to an EF/stage of 3.83. The new schematic (with sizing) is shown below:

[Figure: four inverters with C1 = 3 fF, C2 = 11.4 fF, C3 = 43.8 fF, C4 = 167.8 fF driving the side-load, followed by the NOR gate with C5 = 42.55 fF; C_OUT = 32C_IN]

The new delay is

\[ t_p = 6\gamma\,t_{inv} + t_{inv}\big(4(3.83) + 3.76\big) = 25.1\,t_{inv} \]

PROBLEM 3: MOS Transistor Model

Use the velocity saturation model presented in lecture (shown below) to complete a) and b). For parts c) and d), attach your SPICE netlist and results. Use $V_{TN} = 150\,\mathrm{mV}$, $|V_{TP}| = 300\,\mathrm{mV}$, $\upsilon_{SATN} = 1.12 \times 10^7\,\mathrm{cm/s}$, $\upsilon_{SATP} = 1 \times 10^7\,\mathrm{cm/s}$, $C_{ox} = 15\,\mathrm{fF/\mu m^2}$, $\mu_N = 260\,\mathrm{cm^2/(V \cdot s)}$, $\mu_P = 120\,\mathrm{cm^2/(V \cdot s)}$.

[Figure: two-inverter circuit from In to Out, with 1 V and 1.2 V supply labels]

\[ I_D = W\,\upsilon_{SAT}\,C_{ox}\,\frac{(V_{GS} - V_T)^2}{(V_{GS} - V_T) + E_C L} \]

a) Using the model, find $I_{DN}$ of the NMOS in the second inverter when In = 0V and Out = 1V. (Note that as we will see later on in the lectures, the current of the transistors directly impacts the delay of the inverter.)

Solution:

\[ E_C L = \frac{2\upsilon_{SAT}}{\mu_N}L = \frac{2(1.12 \times 10^7\,\mathrm{cm/s})}{260\,\mathrm{cm^2/(V \cdot s)}}\,90\,\mathrm{nm} = 0.775\,\mathrm{V} \]
\[ I_D = (1\,\mathrm{\mu m})(1.12 \times 10^7\,\mathrm{cm/s})(15\,\mathrm{fF/\mu m^2})\frac{(1050\,\mathrm{mV})^2}{(1050\,\mathrm{mV}) + 0.775\,\mathrm{V}} = 1.015\,\mathrm{mA} \]

b) Choose X such that $I_{DP}$ equals the value you got for $I_{DN}$ in part a) when In = 1.2V and Out = 0V.

Solution:

\[ E_C L = \frac{2\upsilon_{SAT}}{\mu_P}L = \frac{2(1.0 \times 10^7\,\mathrm{cm/s})}{120\,\mathrm{cm^2/(V \cdot s)}}\,90\,\mathrm{nm} = 1.5\,\mathrm{V} \]
\[ I_D = (X\,\mathrm{\mu m})(1.0 \times 10^7\,\mathrm{cm/s})(15\,\mathrm{fF/\mu m^2})\frac{(700\,\mathrm{mV})^2}{(700\,\mathrm{mV}) + 1.5\,\mathrm{V}} = 1.015\,\mathrm{mA} \]
\[ X = 3.04\,\mathrm{\mu m} \]

c) Simulate the circuit with your sizing from b), and compare your results for $I_{DP}$ and $I_{DN}$ from parts a) and b). Why are the results different? Use SPICE to find the value for X that makes the currents match.
How close was the analytical model to the SPICE result?

Solution: Using SPICE, the drain current of the NMOS is 999.1 µA and the drain current of the PMOS is 1101 µA. The plot for determining the correct value of X is shown below.

[Figure: Finding W_P to match the I_D of the NMOS; drain current (µA) vs. width of PMOS (µm)]

The correct value of X from SPICE is 2.755 µm, which is pretty close to our estimate. The reason for the discrepancy between the simulation and the hand calculations is that the model that we are using does not account for all of the physics of the real device. In particular, we ignored the fact that the transistor doesn't act like a perfect current source even when it is saturated. Here is the SPICE deck:

    HW 5-3 solution
    .LIB '/home/ff/ee141/MODELS/gpdk090_mos.sp' TT_s1v
    .option post nomod
    VDD vdd 0 1.0
    .param wp = 3.04u
    vg1 g1 0 1.2
    M1 vdd g1 s1 s1 gpdk090_nmos1v W=1u L=90n
    M2 d2 0 vdd vdd gpdk090_pmos1v W=wp L=90n
    vt1 s1 0 0
    vt2 d2 0 0
    .op
    .dc wp 1u 3u .005u
    .end

d) [BONUS:] Can you make the analytical model fit the SPICE model better by extracting an additional parameter? If so, extract this parameter and recalculate the drain currents using this improved model.

Solution: One thing we are missing from the velocity saturation model is the channel length modulation parameter ($\lambda$). To find $\lambda$, we must remember that

\[ \lambda = \frac{1}{r_{DS} I_D} \]

Then we can find the $r_{DS}$ of the transistor:

\[ r_{DS} = \frac{dV_{DS}}{dI_{DS}} \]

We can find $r_{DS}$ graphically by plotting $I_D$ vs. $V_{DS}$ for the NMOS and PMOS transistors.

[Figure: I_D vs. V_DS, V_GSP = 1 V, V_GSN = 1.2 V; drain current (µA)]

At $V_{DS} = 1\,\mathrm{V}$, $V_{GS} = 1\,\mathrm{V}$, the slope of the PMOS line is $(3060\,\Omega)^{-1}$. At $V_{GS} = 1.2\,\mathrm{V}$, the slope of the NMOS line is $(3424\,\Omega)^{-1}$. At these points, $I_{DSN} = 999\,\mathrm{\mu A}$ and $I_{DSP} = 997.5\,\mathrm{\mu A}$. This results in $\lambda_n = 0.292\,\mathrm{V^{-1}}$ and $\lambda_p = 0.328\,\mathrm{V^{-1}}$. Recalculating the drain currents:

\[ I_{Dn} = (1\,\mathrm{\mu m})(1.12 \times 10^7\,\mathrm{cm/s})(15\,\mathrm{fF/\mu m^2})\frac{(1050\,\mathrm{mV})^2}{(1050\,\mathrm{mV}) + 0.775\,\mathrm{V}}\big(1 + \lambda_n(1\,\mathrm{V})\big) = 1.311\,\mathrm{mA} \]
\[ I_{Dp} = (X\,\mathrm{\mu m})(1.0 \times 10^7\,\mathrm{cm/s})(15\,\mathrm{fF/\mu m^2})\frac{(700\,\mathrm{mV})^2}{(700\,\mathrm{mV}) + 1.5\,\mathrm{V}}\big(1 + \lambda_p(1\,\mathrm{V})\big) = 1.311\,\mathrm{mA} \]
\[ X = 2.95\,\mathrm{\mu m} \]

This value for X is quite a bit closer than what we extracted when we neglected channel length modulation. There is still some discrepancy, however (particularly in the value of the drain currents), mainly because the original value we used for $\upsilon_{SAT}$ was too high. Here is the SPICE deck:

    HW 5-3d solution
    .LIB '/home/ff/ee141/MODELS/gpdk090_mos.sp' TT_s1v
    .option post nomod
    VDD vdd 0 1.0
    .param wp = 2.755u
    vg1 g1 0 1.2
    vg2 vdd g2 1
    *ensures that as VDD is swept, Vgsp = 1V
    M1 vdd g1 s1 s1 gpdk090_nmos1v W=1u L=90n
    M2 d2 g2 vdd vdd gpdk090_pmos1v W=wp L=90n
    vt1 s1 0 0
    vt2 d2 0 0
    .op
    .dc VDD 0 1.2 .1m
    .end

...
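The hand calculations in Problem 3 can be checked with a few lines of Python. This is a rough sketch under the parameter values given in the problem statement; the function name `drain_current` and the unit conversions to centimeters are choices made here for illustration, and the λ values are recomputed from the quoted slopes and currents, not taken from SPICE.

```python
L_CM = 90e-7                # 90 nm channel length, in cm
COX = 15e-15 / 1e-8         # 15 fF/um^2 expressed in F/cm^2


def drain_current(w_um, v_sat, mu, v_ov, lam=0.0, v_ds=0.0):
    """Velocity-saturation drain current in amperes.

    w_um: width in um, v_sat: cm/s, mu: cm^2/(V*s),
    v_ov: V_GS - V_T in volts, lam: channel length modulation in 1/V.
    """
    ec_l = 2 * v_sat * L_CM / mu            # E_C * L, in volts
    w_cm = w_um * 1e-4
    return w_cm * v_sat * COX * v_ov**2 / (v_ov + ec_l) * (1 + lam * v_ds)


# Parts a) and b): NMOS with V_GS - V_T = 1.05 V, PMOS with 0.7 V.
i_dn = drain_current(1.0, 1.12e7, 260.0, 1.05)
x = i_dn / drain_current(1.0, 1.0e7, 120.0, 0.70)

# Part d): lambda = 1 / (r_DS * I_D), using the slopes and currents quoted above.
lam_n = 1 / (3424 * 999e-6)
lam_p = 1 / (3060 * 997.5e-6)
i_dn_clm = drain_current(1.0, 1.12e7, 260.0, 1.05, lam_n, 1.0)
x_clm = i_dn_clm / drain_current(1.0, 1.0e7, 120.0, 0.70, lam_p, 1.0)

if __name__ == "__main__":
    print(f"I_DN = {i_dn * 1e3:.3f} mA, X = {x:.2f} um")
    print(f"lambda_n = {lam_n:.3f} 1/V, lambda_p = {lam_p:.3f} 1/V")
    print(f"I_DN with CLM = {i_dn_clm * 1e3:.3f} mA, X = {x_clm:.2f} um")
```

Because the current is linear in W, X follows from a simple ratio of the NMOS current to the PMOS current per micron. The script keeps full precision for λ, so the last digit of X in part d) may differ slightly from the rounded hand calculation.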