Chapter 13
Exercise 13.1 Graphs for cases (a) and (b) are shown in Figure 13.1.
[Figure 13.1: Execution Graphs for Exercise 13.1. (a) sequential execution graph; (b) group sequential execution graph (max of 2 operations per group)]

The execution time for the concurrent graph, considering that the test done by node f4 results true for n occurrences (and false after that), is computed as:

$T_{conc} = 3T + 2nT + 2T = (5 + 2n)T$

where 3T corresponds to the time to execute the path f0 to f4, f5, and f6. Each iteration that results from a True test at f4 has a duration of 2T, which is also the execution time of the last two tasks to finish the computation (f7 and f8). The execution time of the sequential solution shown in Figure 13.1(a), for the same loop condition used for the concurrent case, is:

$T_{seq} = 4T + 3nT + 5T = (9 + 3n)T$

where 4T corresponds to the time to execute the path f0 to f4 for the first time. Each iteration takes 3T and the last sequence of tasks takes 5T.

Solutions Manual - Introduction to Digital Design - November 22, 1999
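The two execution-time expressions of Exercise 13.1 can be tabulated with a few lines of Python. This is only a sketch for checking the arithmetic; the function names and the default unit delay T = 1 are assumptions, not part of the original solution.

```python
def t_concurrent(n, T=1):
    """Concurrent graph: 3T for f0..f4, f5, f6; 2T per True iteration; 2T for f7 and f8."""
    return 3 * T + 2 * n * T + 2 * T


def t_sequential(n, T=1):
    """Sequential graph: 4T for the first pass f0..f4; 3T per iteration; 5T for the tail."""
    return 4 * T + 3 * n * T + 5 * T


def table(values=(0, 1, 5, 10)):
    """Return rows (n, T_conc, T_seq) in units of T."""
    return [(n, t_concurrent(n), t_sequential(n)) for n in values]
```

Calling `table()` lists both times side by side; the gap between them grows by one T per loop iteration.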
Exercise 13.2 The trace of the variables' values for the VHDL description shown in Example 13.6, with xin = 5 and yin = -9, is as follows:
  - Before the loop: a = 5 and w = 9
  - Inside the loop:
      + first iteration: a = 6 and w = 8
      + second iteration: a = 7 and w = 7
      + loop stops
  - zout = 28
Exercise 13.3 The algorithm presented in Example 13.7 is based on the recurrence:

$z_{i+1} = z_i (2 - x z_i)$

The algorithm stops when $|z_i x - 1| < 0.5\epsilon$. When the inputs for the algorithm are $x = 2/3$ and $\epsilon = 10^{-5}$, the variables take the values shown in the next table, where the Error column is $|z_j x - 1|$:

step j    z_j          Error
0         1
1         1.3333333    0.111111
2         1.4814814    0.012346
3         1.4997714    0.000152
4         1.4999999    2 x 10^-8

After 4 iterations, the result has an error that is smaller than $5 \times 10^{-6}$. When the input $x = 1/3$ is applied to the algorithm presented in Example 13.7, considering an error $\epsilon = 10^{-5}$, we obtain the following execution sequence:

step j    z_j          Error
0         1
1         1.6666666    0.444444
2         2.4074074    0.197530
3         2.8829447    0.039018
4         2.9954327    0.001522
5         2.9999930    0.000002

From this trace we can see that the algorithm will converge even with the input being out of range (x should be in the range [1/2, 1]). A wrong estimate of the error, however, can happen, since the condition $|z_k - 1/x| < \epsilon$ is guaranteed by $2|z_k x - 1| < \epsilon$ only when $x \ge 1/2$. Another problem happens if x is out of range and is also too small, resulting in a reciprocal with too many integer bits, while the circuit is designed to handle at most two integer bits. The algorithm will not converge when $x \ge 2$.

Exercise 13.4 Using prescaling, input x is multiplied by 2 until $2^k x$ is in the proper range. The scaled value $2^k x$ is used as input for the algorithm. For example:

$x = 1/3 \rightarrow 2x = 2/3$

which is in the correct range, and corresponds to the input given in Exercise 13.3. So, the algorithm input should be 2x. The output generated by the algorithm (z) must be corrected by multiplying it by the same scaling factor $2^k$ used to prescale the input x. Based on the previous example: z = 1.5, and 2z = 3, which corresponds to the correct result.
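The reciprocal iteration of Exercise 13.3 and the prescaling of Exercise 13.4 can be replayed in floating point. The sketch below assumes the starting value z = 1 used in the traces and the stop test $|z x - 1| < 0.5\epsilon$; the function names and the `max_iter` guard are illustrative additions, and floating-point arithmetic stands in for the fixed-point circuit.

```python
def reciprocal(x, eps=1e-5, max_iter=50):
    """Iterate z <- z(2 - x z) from z = 1 until |z x - 1| < 0.5 * eps.

    Returns the final z and a trace of (step, z, |z x - 1|) tuples.
    """
    z = 1.0
    trace = [(0, z, abs(z * x - 1.0))]
    for j in range(1, max_iter + 1):
        if abs(z * x - 1.0) < 0.5 * eps:
            break
        z = z * (2.0 - x * z)
        trace.append((j, z, abs(z * x - 1.0)))
    return z, trace


def reciprocal_prescaled(x, eps=1e-5):
    """Scale x by 2**k until it is at least 1/2, iterate, then rescale the result."""
    k = 0
    while x * 2 ** k < 0.5:
        k += 1
    z, _ = reciprocal(x * 2 ** k, eps)
    return z * 2 ** k
```

For example, `reciprocal(2/3)[1]` gives the five steps of the first trace, and `reciprocal_prescaled(1/3)` runs the iteration on 2/3 and doubles the result.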
Exercise 13.5 We want to compute $z = \sqrt{a}$ using the recurrence equation:

$z_{i+1} = \frac{1}{2}\left(z_i + \frac{a}{z_i}\right)$

Since $z^2 = a$, the stop condition is defined as $|z^2 - a| < \epsilon$, where $\epsilon$ is the maximum error in the final result. The execution graph is presented in Figure 13.2.
[Figure 13.2: Execution graph - Exercise 13.5. Input a, error; z = 1; w = z*z; if |w - a| < error then output z, otherwise z = (z + a/z)/2 and w is recomputed]

The VHDL code for the algorithm is given in Figure 13.3.

-- VHDL code for Square-root computation - Exercise 13.5
LIBRARY ieee;
USE ieee.std_logic_1164.all;
USE ieee.std_logic_arith.all;

ENTITY sqroot IS
  PORT (a_in : IN  REAL RANGE 0.5 TO 1.0;
        Error: IN  REAL RANGE 0.0 TO 0.1;
        z_out: OUT REAL RANGE 0.7 TO 1.0;
        CLK  : IN  BIT);
END sqroot;

ARCHITECTURE high_level OF sqroot IS
BEGIN
  PROCESS (CLK)
    VARIABLE a,w: REAL RANGE 0.5 TO 1.0;
    VARIABLE z  : REAL RANGE 0.7 TO 1.0;
  BEGIN
    IF (CLK'event AND CLK='1') THEN
      z := 1.0; a := a_in; w := 1.0;
      WHILE (ABS(w-a) > Error) LOOP
        z := (a/z+z)/2.0;
        w := z*z;
      END LOOP;
      z_out <= z;
    END IF;
  END PROCESS;
END high_level;

Figure 13.3: VHDL code for SQRT(a) - Exercise 13.5
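The loop of Figure 13.3 can be mirrored in Python to see how quickly it settles. This is a sketch under the same assumptions as the VHDL (a in [0.5, 1.0], initial z = 1.0 and w = 1.0); the function name and the returned iteration count are additions for illustration.

```python
def sqroot(a, error):
    """Mirror of the VHDL loop: z <- (a/z + z)/2 while |z*z - a| > error."""
    z = 1.0
    w = 1.0
    iterations = 0
    while abs(w - a) > error:
        z = (a / z + z) / 2.0
        w = z * z
        iterations += 1
    return z, iterations
```

With a = 0.5 the iterate starts at 1.0 and decreases toward about 0.7071, so it stays inside the 0.7 to 1.0 range declared for z.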
Exercise 13.6 (a) The concurrent execution graph for the computation of 6! is given in Figure 13.4(a). The execution time for this computation is 3T. The generalization of the factorial computation is based on a recurrent structure as shown in Figure 13.4(b). This structure shows that 2m numbers may be simultaneously multiplied pairwise to generate m products. The tree generated by this structure will have $2^i$ inputs after i levels. To compute n!, n - 1 inputs are needed. Thus, $\lceil \log_2 (n - 1) \rceil$ levels are required. An example of the computation of 11! is also shown in Figure 13.4(b). In this case 4 levels are required in the tree. The execution time for n! is given as $\lceil \log_2 (n - 1) \rceil T$.
[Figure 13.4: Concurrent Execution Graph - Exercise 13.6(a). (a) computation of 6!, with a = 6*5, b = 4*3, c = a*b, and final product c*2; (b) generalization for n!, showing the inputs 11, 10, ..., 2 of 11! reduced over levels 1, 2, ..., i]

(b) The sequential execution graph for the computation of n! is given in Figure 13.5. We can see from the graph that the loop is executed (n - 1) times. Thus, the total execution time is (2 + 3(n - 1))T = (3n - 1)T.

[Figure 13.5: Sequential execution graph - Exercise 13.6b. Input n; r = 1, f = 1; while r < n: r = r + 1, f = f*r; then output f]
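The level count of the concurrent tree and the sequential execution time of Exercise 13.6 can be checked with the following sketch. The pairing order inside each level is an assumption (the figure pairs the operands differently for 6!), but the number of levels does not depend on it.

```python
import math


def tree_levels(n):
    """Levels of a pairwise multiplication tree over the n - 1 inputs 2..n."""
    return math.ceil(math.log2(n - 1))


def tree_factorial(n):
    """Multiply the inputs 2..n pairwise, level by level; return (n!, levels used)."""
    values = list(range(2, n + 1))
    levels = 0
    while len(values) > 1:
        products = [values[i] * values[i + 1] for i in range(0, len(values) - 1, 2)]
        if len(values) % 2:
            products.append(values[-1])  # unpaired input passes to the next level
        values = products
        levels += 1
    return values[0], levels


def sequential_time(n, T=1):
    """Sequential graph: 2T of setup and test plus 3T for each of the n - 1 iterations."""
    return (2 + 3 * (n - 1)) * T
```

`tree_factorial(6)` needs 3 levels and `tree_factorial(11)` needs 4, matching the 3T and 4T quoted for the concurrent graphs.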
Exercise 13.7 This exercise asks for the execution graph of the algorithm that obtains the maximum value among n positive integers. All n integers are stored in a vector V. The vector is scanned from position 0 to position n - 1. In this process, the maximal value is obtained. The computation graph is shown in Figure 13.6.

[Figure 13.6: Execution graph - Exercise 13.7. Input V, N; MAX = V(0), I = 1; while I < N: if MAX < V(I) then MAX = V(I); I = I + 1; then output MAX]
-- VHDL code for MAX of N integers (in vector V) - Exercise 13.7
PACKAGE max_pkg IS
  CONSTANT N: INTEGER := 5;
  TYPE DatainT IS ARRAY (N-1 DOWNTO 0) OF INTEGER;
END max_pkg;

USE work.max_pkg.ALL;

ENTITY max IS
  PORT (V  : IN  DatainT := (10,12,4,2,34);
        M  : OUT INTEGER;
        CLK: IN  BIT);
END max;

ARCHITECTURE high_level OF max IS
BEGIN
  PROCESS (CLK)
    VARIABLE maxv: INTEGER;
    VARIABLE i: INTEGER RANGE 0 TO N;
  BEGIN
    IF (CLK'event AND CLK='1') THEN
      maxv := V(0); i := 1;
      WHILE (i < V'length) LOOP
        IF (maxv < V(i)) THEN
          maxv := V(i);
        END IF;
        i := i+1;
      END LOOP;
      M <= maxv;
    END IF;
  END PROCESS;
END high_level;
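The scan in the listing above corresponds directly to the following Python sketch, which keeps the explicit index variable of the execution graph instead of using the built-in `max`. The function name is an illustrative choice.

```python
def max_scan(v):
    """Scan v from position 0 to len(v) - 1, keeping the largest value seen so far."""
    maxv = v[0]
    i = 1
    while i < len(v):
        if maxv < v[i]:
            maxv = v[i]
        i += 1
    return maxv
```

Applied to the default port values (10, 12, 4, 2, 34) it returns 34, whichever end of the vector is taken as position 0.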
Exercise 13.8 The degree 9 polynomial is represented as:

$P_9(x) = \sum_{i=0}^{9} a_i x^i$

The sequential and concurrent execution graphs for the polynomial computation are shown in Figure 13.7. Using an organization similar to that of Figures 13.5 and 13.6 of the textbook, we obtain the organizations presented in Figure 13.8.

[Figure 13.7: Execution graphs - Exercise 13.8.
 Sequential graph: V = a9, i = 8; V = V*x + ai; if i > 0 then i = i - 1 and repeat, otherwise END.
 Concurrent graph: A = a9*x + a8, B = a7*x + a6, C = a5*x + a4, D = a3*x + a2, E = a1*x + a0, F = x*x; then J = F*B + C, K = F*D + E, L = F*F; then M = L*J + K, N = L*L; then P9(x) = N*A + M.]

[Figure 13.8: Organizations for polynomial computation - Exercise 13.8]
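The two execution graphs of Figure 13.7 can be checked against each other numerically. The sketch below transcribes the node names A to N from the concurrent graph; the function names and the coefficient list `a` (with `a[i]` holding the coefficient of x^i) are assumptions made for illustration.

```python
def p9_sequential(a, x):
    """Sequential graph: V = a9, then V = V*x + a_i for i = 8 down to 0."""
    v = a[9]
    for i in range(8, -1, -1):
        v = v * x + a[i]
    return v


def p9_concurrent(a, x):
    """Concurrent graph: four steps of independent multiply-add operations."""
    # Step 1: five first-degree terms and x^2
    A = a[9] * x + a[8]
    B = a[7] * x + a[6]
    C = a[5] * x + a[4]
    D = a[3] * x + a[2]
    E = a[1] * x + a[0]
    F = x * x
    # Step 2: two third-degree terms and x^4
    J = F * B + C
    K = F * D + E
    L = F * F
    # Step 3: seventh-degree term and x^8
    M = L * J + K
    N = L * L
    # Step 4: final combination
    return N * A + M
```

The sequential version needs ten dependent steps, while the concurrent version needs four steps of at most six operations each.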
Exercise 13.9 The networks generated for each case are shown in Figure 13.9.

[Figure 13.9: Networks for Exercise 13.9, parts (a), (b), and (c)]
Exercise 13.10 The state diagrams corresponding to the VHDL descriptions in this exercise are shown in Figure 13.10.

[Figure 13.10: State diagrams for Exercise 13.10, parts (a), (b), and (c)]
Exercise 13.11 The state diagram for the system operation is shown in Figure 13.11. For N = 5 the trace of the computation is shown in the following table:

clock cycle   state   X        I        ODD(X)   I > 0
0             0       undef.   undef.   undef.   undef.
1             1       5        3        T        T
2             2       26       2        F        T
3             1       26       2        F        T
4             2       13       1        T        T
5             1       13       1        T        T
6             2       66       0        F        F
7             2       66       0        F        F
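The trace of Exercise 13.11 can be regenerated from the state transitions with a short Python sketch. The behaviour of state 2 once I reaches 0 is an assumption read off the trace (the state and the registers simply hold); the function name and tuple layout are illustrative.

```python
def run(n, cycles=7):
    """Return a list of (clock cycle, state, X, I) tuples for the machine of Figure 13.11."""
    state, x, i = 0, None, None
    trace = [(0, state, x, i)]
    for clk in range(1, cycles + 1):
        if state == 0:
            # state 0 -> 1: load X = N and I = 3
            state, x, i = 1, n, 3
        elif state == 1:
            # state 1 -> 2: X = 5X + 1 if X is odd, X = X/2 otherwise; I = I - 1
            x = 5 * x + 1 if x % 2 else x // 2
            i -= 1
            state = 2
        elif i > 0:
            # state 2 -> 1 while I > 0
            state = 1
        # assumption: in state 2 with I = 0 nothing changes, as in the trace
        trace.append((clk, state, x, i))
    return trace
```

`run(5)[-1]` is `(7, 2, 66, 0)`; calling `run(7)` replays the same machine for N = 7.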
[Figure 13.11: State Diagram for the system in Exercise 13.11. State 0 goes to state 1 with X = N, I = 3; state 1 goes to state 2 with X = 5X + 1, I = I - 1 when odd(X), and with X = X/2, I = I - 1 when odd(X)'; state 2 returns to state 1 when I > 0; the remaining transition out of state 2 is labeled I = 0]

Thus, for N = 5, the final value is X = 66. For N = 7 the trace of the computation is shown in the following table:

clock cycle   state   X        I        ODD(X)   I > 0
0             0       undef.   undef.   undef.   undef.
1             1       7        3        T        T
2             2       36       2        F        T
3             1       36       2        F        T
4             2       18       1        F        T
5             1       18       1        F        T
6             2       9        0        T        F
7             2       9        0        T        F

Thus the final value is X = 9.

Exercise 13.12 (a) VHDL description
entity p13_12a is
  port (X: in BIT_VECTOR (0 to 7);
        clk,sel,ldW: in bit;
        W: out BIT_VECTOR (0 to 7));
end p13_12a;

architecture arch of p13_12a is
-- process description - Exercise 13.12(a)
BEGIN
  PROCESS (CLK)
    variable WIN: BIT_VECTOR (0 TO 7);
  BEGIN
    IF (CLK'event AND CLK='1') THEN
      IF (sel = '1') THEN WIN := X;
      ELSE WIN := NOT(X);
      END IF;
      IF (ldw = '1') THEN W <= WIN;
      END if;
    END IF;
  END PROCESS;
END arch;

(b) VHDL description
entity p13_12b is
  port (W,Z: in BIT_VECTOR (0 TO 15);
        clk,ldXY: in BIT;
        X,Y: out BIT_VECTOR (0 TO 15));
end p13_12b;

Architecture arch of p13_12b is
-- process description - Exercise 13.12(b)
begin
  PROCESS (CLK)
    variable XIN,YIN: BIT_VECTOR (0 TO 15);
  BEGIN
    IF (CLK'event AND CLK='1') THEN
      IF (W>Z) THEN XIN := W; YIN := Z;
      ELSE XIN := Z; YIN := W;
      END IF;
      IF (ldxy = '1') THEN X <= XIN; Y <= YIN;
      END IF;
    END IF;
  END PROCESS;
END arch;

Exercise 13.13 Control: Assume that E is connected to the CNT input, and ldK is connected to the LOAD input of the mod-4 counter. Based on these assumptions and the networks presented in Figure 13.31 of the text we obtain the following expressions:

$E = (ldK)'$
$ldK = (A_7 \cdot S1) + (c\_out \cdot S2)$
$ldC = clrB = c\_out \cdot S2$
$ldA = K_1' \cdot K_0'$
$S1 = K_1' \cdot K_0$
$S2 = K_1 \cdot K_0'$
$ldB = K_1 \cdot K_0$

The value of ldK for all possible outputs of the counter is given by the following table. This signal will force the counter to load a zero value.

counter state   ldK
0               0
1               A_7
2               c_out
3               0

Observe that the counter will be forced to zero in state 1 if A_7 = 1, or in state 2 if c_out = 1. The state diagram for the control part is shown in Figure 13.12.

[Figure 13.12: State diagram for Exercise 13.13. States 0/ldA, 1/, 2/, and 3/ldB; transitions labeled A_7 = 1/, A_7 = 0, c_out = 1/ldC, clrB, and c_out = 0/]

The high-level description of the system in VHDL follows. The signals A, B, and C represent the output of each register.
LIBRARY ieee;
USE ieee.std_logic_arith.all;
USE ieee.std_logic_1164.all;
USE ieee.std_logic_unsigned.all;

entity p13_13 is
  port (X: in STD_LOGIC_VECTOR (7 downto 0);
        CLK: in STD_LOGIC;
        Y: out STD_LOGIC_VECTOR (7 downto 0));
end p13_13;

architecture arch of p13_13 is
-- process description - Exercise 13.13
  SIGNAL A,B,C: STD_LOGIC_VECTOR (7 downto 0):=(OTHERS=>'0');
  SIGNAL state: NATURAL RANGE 0 TO 3 := 0;
  SIGNAL c_out: STD_LOGIC := '0';
  SIGNAL adder_out: STD_LOGIC_VECTOR (8 downto 0);
begin
  adder_out <= ('0'&A) + ('0'&B);
  c_out <= adder_out(8);
  CTR: PROCESS (CLK)
  BEGIN
    IF (CLK'event AND CLK='1') THEN
      CASE state is
        WHEN 0 => A <= X; state <= 1;
        WHEN 1 => IF (A(7) ='1') THEN state <= 0;
                  ELSE state <= 2;
                  END IF;
        WHEN 2 => IF (c_out='1') THEN C <= B; B <= "00000000"; state <= 0;
                  ELSE state <= 3;
                  END IF;
        WHEN 3 => B <= adder_out(7 downto 0); state <= 0;
      END CASE;
      Y <= C;
    END IF;
  END PROCESS;
end arch;

The system accumulates all entries with values less than 128, until the accumulated value cannot absorb another input X < 128 without overflowing. In this case, the internal accumulator value is registered as output and the internal state is reset. The trace of the system operation for a chosen sequence of inputs is shown in Figure 13.13.
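The accumulate-and-flush behaviour described above can be sketched at the transaction level in Python, with one pass through states 0 to 3 per input value. This is not a cycle-accurate model: it assumes each input is held on X for a full pass, and the input sequence in the usage note is read off the 8-bit values visible on x in the Figure 13.13 trace.

```python
def accumulate(inputs):
    """Return the registers (A, B, C) after one controller pass per input value."""
    A = B = C = 0
    for x in inputs:
        A = x & 0xFF                 # state 0: ldA
        if A & 0x80:                 # state 1: A_7 = 1, input ignored, back to state 0
            continue
        total = A + B                # 9-bit adder output
        if total > 0xFF:             # state 2: c_out = 1 -> ldC, clrB
            C, B = B, 0
            continue
        B = total                    # state 3: ldB
    return A, B, C
```

With the inputs 104, 73, 151, 105 the accumulator reaches 177, the value 151 is skipped because its most significant bit is set, and 105 would overflow, so 177 is copied to C and B is cleared.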
[Figure 13.13: Trace of execution - Exercise 13.13. Simulator waveform of the signals x, clk, y, a, b, c, state, c_out, and adder_out for entity p13_13, architecture arch, dated Mon Nov 22 16:45:36 PST 1999]
Exercise 13.14 The high-level specification for the two's complement multiplier is:

Inputs: $x, y \in \{-(2^{n-1} - 1), \ldots, -1, 0, 1, \ldots, (2^{n-1} - 1)\}$
Output: $z \in \{-(2^{n-1} - 1), \ldots, -1, 0, 1, \ldots, (2^{n-1} - 1)\}$
Function: $z = x \times y$

For a 32 x 32 bit multiplier we have n = 5. Considering that the inputs and output are represented in the two's complement number system, their bit-vector representations are:

$x = (x_{n-1}, \ldots, x_1, x_0)$
$y = (y_{n-1}, \ldots, y_1, y_0)$
$z = (z_{2n-1}, \ldots, z_1, z_0)$

where $x = -x_{n-1} 2^{n-1} + \sum_{i=0}^{n-2} x_i 2^i$, $y = -y_{n-1} 2^{n-1} + \sum_{i=0}^{n-2} y_i 2^i$, and $z = -z_{2n-1} 2^{2n-1} + \sum_{i=0}^{2n-2} z_i 2^i$.

The computation is based on the recurrence:

$z_{i+1} = \frac{1}{2}(z_i + (x 2^n) y_i)$ if $i < n - 1$
$z_{i+1} = \frac{1}{2}(z_i - (x 2^n) y_i)$ if $i = n - 1$

where the second line represents the case when the most significant bit of the multiplier is 1 (negative number). The general architecture of the multiplier is shown in Figure 13.14.
[Figure 13.14: Architecture of the two's complement multiplier - Exercise 13.14. (a) Data path: register X (ldX), shift register Y (ldY, shY), register Z (ldZ, clrZ), AND gates, complement block controlled by last_bit, and adder; (b) system's block diagram: Control and Data Path blocks with signals clk, start, xin, yin, ldX, ldY, shY, ldZ, clrZ, last_bit, done_out, and zout]

The behavioral description of the multiplier is shown next:
LIBRARY ieee;
USE ieee.std_logic_1164.all;
USE ieee.std_logic_arith.all;

ENTITY multiplierTC IS
  GENERIC (n: NATURAL := 32);  -- number of bits in the operands
  PORT (start   : IN  BIT;
        xin,yin : IN  SIGNED (n-1 downto 0);
        clk     : IN  BIT;
        zout    : OUT SIGNED (2*n-1 downto 0);
        done_out: OUT BIT);
END multiplierTC;

ARCHITECTURE behav OF multiplierTC IS
  TYPE stateT IS (idle,setup,active,done);
  SIGNAL state: stateT := idle;
  SIGNAL x,y: SIGNED (n-1 downto 0);
  SIGNAL z: SIGNED (2*n-1 downto 0);
begin
  PROCESS (clk)
    VARIABLE zero_2n: SIGNED (2*n-1 DOWNTO 0);  -- constant 0
    VARIABLE scale: SIGNED (n-1 DOWNTO 0);      -- aligning vector
    VARIABLE add_out: SIGNED (2*n DOWNTO 0);
    VARIABLE count: NATURAL RANGE 0 TO n;
  BEGIN
    zero_2n := (OTHERS => '0');
    scale := (OTHERS => '0');
    IF (clk'EVENT AND clk='1') THEN
      CASE state IS
        WHEN idle =>
          done_out <= '0';
          IF (start='1') THEN state <= setup;
          ELSE state <= idle;
          END IF;
        WHEN setup =>
          x <= xin; y <= yin; z <= zero_2n;
          count := 0;
          state <= active;
        WHEN active =>
          IF (y(count) = '0') THEN
            add_out := z(2*n-1)&z;  -- range extension
          ELSE
            IF (count = (n-1)) THEN
              add_out := (z(2*n-1)&z) - (x(n-1)&x&scale);
            ELSE
              add_out := (z(2*n-1)&z) + (x(n-1)&x&scale);
            END IF;
          END IF;
          z <= add_out(2*n DOWNTO 1);
          IF (count /= (n-1)) THEN
            state <= active; count := count + 1;
          ELSE
            state <= done; done_out <= '1';
          END IF;
        WHEN done =>
          IF (start='1') THEN state <= done;
          ELSE state <= idle; done_out <= '0';
          END IF;
      END CASE;
    END IF;
  END PROCESS;
  zout <= z;  -- update the output
END behav;

A trace of a 4-bit TC multiplier is shown in Figure 13.15.
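The recurrence behind the behavioral description can be replayed with Python integers to check the partial products step by step. The sketch below is an illustration, not the manual's code: the function name is an assumption, and Python's arithmetic right shift stands in for taking bits 2n down to 1 of the sign-extended adder output.

```python
def tc_multiply(x, y, n):
    """Two's complement multiply: add x*2^n for each 1 bit of y, subtract it for the sign bit.

    Returns the product and the 2n-bit pattern of z after each of the n steps.
    """
    mask = (1 << (2 * n)) - 1
    y_bits = y & ((1 << n) - 1)      # n-bit two's complement pattern of y
    z = 0
    trace = []
    for i in range(n):
        if (y_bits >> i) & 1:
            term = x << n            # x * 2^n
            z = z - term if i == n - 1 else z + term
        z >>= 1                      # halve; z is always even here, so no bits are lost
        trace.append(format(z & mask, "0{}b".format(2 * n)))
    return z, trace
```

`tc_multiply(-7, -3, 4)` corresponds to the inputs xin = 1001 and yin = 1101 of the 4-bit trace and returns 21 together with the four intermediate patterns of z.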
[Figure 13.15: Trace of execution of a 4-bit Two's complement multiplier - Exercise 13.14. Simulator waveform for entity multipliertc, architecture behav, dated Sun Nov 21 20:30:24 PST 1999, with xin = 1001 and yin = 1101; z takes the values 00000000, 11001000, 11100100, 10111010, and 00010101 as the state goes through idle, setup, active, done, and idle]

A numerical example of the multiplier operation is shown in the following table for the values X = -125 = (10000011)_2 and Y = -37 = (11011011)_2.

clk   state   count   X          Y          Z
0     0       x       xxxxxxxx   xxxxxxxx   xxxxxxxxxxxxxxxx
1     1       x       xxxxxxxx   xxxxxxxx   0000000000000000
2     2       x       10000011   11011011   0000000000000000
3     2       1       10000011   01101101   1100000110000000
4     2       2       10000011   00110110   1010001001000000
5     2       3       10000011   00011011   1101000100100000
6     2       4       10000011   00001101   1010101000010000
7     2       5       10000011   00000110   1001011010001000
8     2       6       10000011   00000011   1100101101000100
9     2       7       10000011   00000001   1010011100100010
10    2       0       10000011   00000000   0001001000010001

Exercise 13.15 The high-level specification of the 32 x 32 serial-parallel multiplier for positive integers is given as follows, for n = log2 32 = 5 bits:

Input: $x, y \in \{0, 1, 2, \ldots, 2^n - 1\}$
Output: $z \in \{0, 1, 2, \ldots, 2^{2n} - 2^{n+1} + 1\}$
Function: $z = x \times y$

A block diagram of the data path for the multiplier is shown in Figure 13.16. The left multiplexer is used to select among the 4 multiples of X, that is: 0, X, 2X, and 3X. To enable the generation of the 3X value using the same adder used during the algorithm's iterations, some gates were inserted in the path of the selection signals of this multiplexer to force the X value as an output. Another multiplexer was placed at the other input of the adder to receive the 2X value. After the value 3X is generated, the iterations will be executed. The VHDL specification of the radix-4 multiplication algorithm is given next:
[Figure 13.16: Block Diagram for the Data path of the radix-4 multiplier. (a) Data path: register X (ldX), right 2-shift register Y (ldY, shY), register Z (ldZ, clrZ), 3X register (ld3X), two multiplexers controlled by gen3X, and an (n+2)-bit adder; (b) system's block diagram: Control and Data Path blocks with signals clk, start, xin, yin, ldX, ldY, shY, ldZ, clrZ, last_bit, gen3X, and zout]
LIBRARY ieee;
USE ieee.std_logic_1164.all;
USE ieee.std_logic_arith.all;
USE ieee.std_logic_unsigned.all;

ENTITY radix4_multiplier IS
  GENERIC (n: NATURAL := 8);  -- number of bits in the operands
  PORT (start   : IN  BIT;
        xin,yin : IN  UNSIGNED (n-1 downto 0);
        clk     : IN  BIT;
        zout    : OUT UNSIGNED (2*n-1 downto 0);
        done_out: OUT BIT);
END radix4_multiplier;

ARCHITECTURE behav OF radix4_multiplier IS
  TYPE stateT IS (idle,setup,gen3x,active,done);
  SIGNAL state: stateT := idle;
  SIGNAL x,y: UNSIGNED (n-1 downto 0);
  SIGNAL z: UNSIGNED (2*n-1 downto 0);
  SIGNAL XXX: UNSIGNED (n+1 downto 0);
begin
  PROCESS (clk)
    VARIABLE zero_2n: UNSIGNED (2*n-1 DOWNTO 0);  -- constant 0
    VARIABLE scale: UNSIGNED (n-1 DOWNTO 0);      -- aligning vector
    VARIABLE add_out: UNSIGNED (n+1 DOWNTO 0);    -- adder output
    VARIABLE count: NATURAL RANGE 0 TO n;
  BEGIN
    zero_2n := (OTHERS => '0');
    scale := (OTHERS => '0');
    IF (clk'EVENT AND clk='1') THEN
      CASE state IS
        WHEN idle =>
          done_out <= '0';
          IF (start='1') THEN state <= setup;
          ELSE state <= idle;
          END IF;
        WHEN setup =>
          x <= xin; y <= yin; z <= zero_2n;
          count := 0;
          state <= gen3x;
        WHEN gen3x =>
          XXX <= ('0'&x(n-1 downto 0)&'0') + ("00"&x);
          state <= active;
        WHEN active =>
          CASE conv_integer(y(count+1 downto count)) IS
            WHEN 0 => add_out := "00"&z(2*n-1 downto n);
            WHEN 1 => add_out := ("00"&z(2*n-1 downto n))+x;
            WHEN 2 => add_out := ("00"&z(2*n-1 downto n))+('0'&x&'0');
            WHEN 3 => add_out := ("00"&z(2*n-1 downto n))+XXX;
            WHEN others => add_out := (others => '0');
          END CASE;
          z <= add_out & z(n-1 downto 2);
          IF (count /= (n-2)) THEN
            state <= active; count := count + 2;
          ELSE
            state <= done; done_out <= '1';
          END IF;
        WHEN done =>
          IF (start='0') THEN state <= done;
          ELSE state <= idle; done_out <= '0';
          END IF;
      END CASE;
    END IF;
  END PROCESS;
  zout <= z;  -- update zout
END behav;

A trace of this code execution for two arbitrary inputs is shown in Figure 13.17. The values are represented in octal.
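The radix-4 iteration in the listing above can be mirrored with Python integers to check the contents of Z after every step. The sketch assumes unsigned operands and an even n, as in the listing; the function name and the returned trace format are illustrative additions.

```python
def radix4_multiply(x, y, n):
    """Radix-4 multiply: one 2-bit digit of y per step, using the multiples 0, X, 2X, 3X.

    Returns the product and the 2n-bit pattern of z after each of the n/2 steps.
    """
    low_mask = (1 << n) - 1
    multiples = (0, x, 2 * x, 3 * x)               # 3X is formed once, in the gen3x step
    z = 0
    trace = []
    for count in range(0, n, 2):
        digit = (y >> count) & 3                   # y(count+1 downto count)
        add_out = (z >> n) + multiples[digit]      # (n+2)-bit adder output
        z = (add_out << (n - 2)) | ((z & low_mask) >> 2)   # add_out & z(n-1 downto 2)
        trace.append(format(z, "0{}b".format(2 * n)))
    return z, trace
```

`radix4_multiply(0o262, 0o330, 8)` uses the octal inputs of Figure 13.17, and `radix4_multiply(135, 115, 8)` can be used to replay an 8 x 8 example in binary.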
[Figure 13.17: Trace of execution - 8-bit radix-4 multiplier - Exercise 13.15. Simulator waveform for entity radix4_multiplier, architecture behav, dated Sun Nov 21 20:24:21 PST 1999, with xin = 262, yin = 330, and xxx = 1026 (octal); z takes the values 000000, 054400, 041300, and 113060 as the state goes through idle, setup, gen3x, active, and done]

For an 8 x 8 case, with the operand values x = 135 and y = 115, this is the result of the algorithm execution:

clk   state   count   X          Y          3X           Z
0     idle    x       xxxxxxxx   xxxxxxxx   xxxxxxxxxx   xxxxxxxxxxxxxxxx
1     setup   x       xxxxxxxx   xxxxxxxx   xxxxxxxxxx   xxxxxxxxxxxxxxxx
2     gen3x   0       10000111   01110011   xxxxxxxxxx   0000000000000000
3     active  0       10000111   01110011   0110010101   0000000000000000
4     active  2       10000111   00011100   0110010101   0110010101000000
5     active  4       10000111   00000111   0110010101   0001100101010000
6     active  6       10000111   00000001   0110010101   0110101110010100
7     done    6       10000111   00000000   0110010101   0011110010100101

Exercise 13.16 A reconfigurable CS/CR adder is shown in Figure 13.18. When the control signal c = 0, the FA's input comes from the external input, and the network implements the CSA. When c = 1, the carry-out bit of each FA is connected to one of the inputs of the neighboring module, creating the carry-ripple chain required to reduce the carry vector and sum vector. Only the sum output of the FAs presents the addition result in this case. The delay of the reconfigurable adder is one multiplexer delay longer than that of the simple CS adder.
[Figure 13.18: Reconfigurable CS/CR adder. Four FA modules, each with the control input c]

The critical path of the multiplier is:

$T = T_{add} + T_{Rsetup} + T_{Rprop}$

where $T_{add}$ is the delay of the adder (CS or CR), and $T_{Rsetup}$ and $T_{Rprop}$ are the setup and propagation delays of the register (Z). The number of cycles required for multiplication is n + 1. Assuming that the multiplexer delay ($t_{mux}$) is similar to the delay of the register ($t_{reg}$), the number of cycles for the final addition using the reconfigurable adder would be:

$\frac{n(t_{FA} + t_{mux})}{t_{FA} + t_{mux} + t_{reg}} \le n$ cycles

giving a total of 2n + 1 cycles for the multiplier using CSA. However, the cycle time of the multiplier using CRA is $n t_{FA} + t_{reg}$, while the cycle time for the multiplier using CSA is only $t_{FA} + t_{mux} + t_{reg}$. Thus

$T_{CRA} = (n + 1)(n t_{FA} + t_{reg})$
$T_{CSA} = (2n + 1)(t_{FA} + t_{mux} + t_{reg}) = (2n + 1)(t_{FA} + 2 t_{reg})$

and it is not difficult to see that the CSA implementation is faster than the CRA implementation for even small values of n. A modified controller to operate with the reconfigurable adder is presented next:
ENTITY multctrl IS
  GENERIC(n: NATURAL := 16;   -- number of bits
          m: NATURAL := 16);  -- number of clock cycles for ****
                              -- conversion (m less equal n) ****
  PORT (strt       : IN  BIT;   -- control input
        ldX,ldY,ldZ: OUT BIT;   -- control signals
        shY, clrZ  : OUT BIT;   -- control signals
        CRA_ctr    : OUT BIT;   -- control signal ****
        done       : OUT BIT;   -- control output
        clk        : IN  BIT);
END multctrl;

ARCHITECTURE behavioral OF multctrl IS
  TYPE stateT IS (idle,setup,active,convert,store);  -- ****
  SIGNAL state : stateT:= idle;
  SIGNAL count : NATURAL RANGE 0 TO n-1;
BEGIN
  PROCESS (clk)  -- transition function
  BEGIN
    IF (clk'EVENT AND clk = '1') THEN
      CASE state IS
        WHEN idle   => IF (strt = '1') THEN state <= setup;
                       ELSE state <= idle;
                       END IF;
        WHEN setup  => state <= active; count <= 0;
        -- The active branch does not appear in this listing; the branch below
        -- is an assumed reconstruction (n iterations, then convert).
        WHEN active => IF (count = n-1) THEN
                         count <= 0; state <= convert;
                       ELSE
                         count <= count+1; state <= active;
                       END IF;
        WHEN convert=> IF (count = m-1) THEN              -- ****
                         count <= 0; state <= store;      -- ****
                       ELSE                               -- ****
                         count <= count+1; state <= convert;  -- ****
                       END IF;                            -- ****
        WHEN store  => state <= idle;                     -- ****
      END CASE;
    END IF;
  END PROCESS;

  PROCESS (state,count)  -- output function
    VARIABLE controls: Bit_Vector(6 DOWNTO 0);  -- modified ****
    -- code = (CRA_ctr,done,ldX,ldY,ldZ,shY,clrZ)
  BEGIN
    CASE state IS
      WHEN idle   => controls := "0100000";  -- ****
      WHEN setup  => controls := "0011001";  -- ****
      WHEN active => controls := "0000110";  -- ****
      WHEN convert=> controls := "1000000";  -- ****
      WHEN store  => controls := "1000100";  -- ****
    END CASE;
    done <= controls(5);
    CRA_ctr <= controls(6);  -- ****
    ldX <= controls(4); ldY <= controls(3);
    ldZ <= controls(2); shY <= controls(1);
    clrZ <= controls(0);
  END PROCESS;
END behavioral;

Lines marked with four asterisks were modified or inserted in the original VHDL code for the multiplier control section (as shown in the text). A new control signal CRA_ctr was included in the controller interface. This signal is 1 when we want the carry bit to propagate on the adder. Two other states were also included: convert and store.
During the convert state, CRA_ctr is activated for a given number of cycles $m \le n$ (m is defined based on the delay of the carry propagation through the adder when the carry ripples). In the convert state, there is no shift in the Y register, and no load on the Z register. Once the controller has waited long enough for the carry to propagate, it moves to state store, where the ldZ control signal is activated to store the non-redundant addition result into the Z register. The Z register for a CS representation of the partial product is composed of two sub-registers: PS (sum vector) and C (carry vector). The non-redundant result will be available on the PS register only. ...
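The comparison between the total multiplication times of Exercise 13.16, $T_{CRA}$ and $T_{CSA}$, can be made concrete with a small Python sketch. The unit delays used as defaults (t_FA = t_reg = 1) are hypothetical values chosen only to show where the crossover falls; as in the exercise, t_mux is taken equal to t_reg.

```python
def t_cra(n, t_fa=1.0, t_reg=1.0):
    """Total time with a carry-ripple adder: n + 1 cycles of length n*t_FA + t_reg."""
    return (n + 1) * (n * t_fa + t_reg)


def t_csa(n, t_fa=1.0, t_reg=1.0):
    """Total time with the reconfigurable CS/CR adder: 2n + 1 cycles of length t_FA + 2*t_reg."""
    return (2 * n + 1) * (t_fa + 2 * t_reg)


def crossover(t_fa=1.0, t_reg=1.0):
    """Smallest n for which the CSA-based multiplier is strictly faster."""
    n = 1
    while t_csa(n, t_fa, t_reg) >= t_cra(n, t_fa, t_reg):
        n += 1
    return n
```

Under these assumed unit delays the CSA version becomes faster from n = 5 onward, and for n = 32 the totals are 195 versus 1089 time units.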
This note was uploaded on 10/31/2009 for the course EE EE M16 taught by Professor Eshaghian,m.m. during the Fall '09 term at UCLA.