This preview shows page 1. Sign up to view the full content.
Unformatted text preview: rry-propagate operation. There are many ways to construct carry-save adders, but the simplest is the 3-input 2-output form. This accepts as inputs a partial sum, a partial carry and a partial product, all of the same binary weight, and produces as outputs a new partial sum and a new partial carry where the carry has twice the weight of the sum. The logic function for each bit is identical to a conventional full adder as used in a ripple-carry carry-propagate adder (see Figure 4.10 on page 88), but the structure is different. Figure 4.16 on page 95 illustrates the two structures. The carry-propagate adder takes two conventional (irredundant) binary numbers as inputs and produces a binary sum; the carry-save adder takes one binary and one redundant (partial sum and partial carry) input and produces a sum in redundant binary representation. ARM implementation 95 Figure 4.16 Carry-propagate (a) and carry-save (b) adder structures. During the iterative multiplication stages, the sum is fed back and combined with one new partial product in each iteration. When all the partial products have been added, the redundant representation is converted into a conventional binary number by adding the partial sum and partial carry in the carry-propagate adder in the ALU. High-speed multipliers have several layers of carry-save adder in series, each handling one partial product. If the partial product is produced following a modified Booth's algorithm similar to the one described in Table 4.3 on page 94, each stage of carry-save adder handles two bits of the multiplier in each cycle. The overall structure of the high-performance multiplier used on some ARM cores is shown in Figure 4.17 on page 96. The register names refer to the instruction fields described in Section 5.8 on page 122. The carry-save array has four layers of adders, each handling two multiplier bits, so the array can multiply eight bits per clock cycle. The partial sum and carry registers are cleared at the start of the instruction, or the partial sum register may be initialized to the accumulate value. As the multiplier is shifted right eight bits per cycle in the 'Rs'...
View Full Document
- Spring '09