their internal pipeline very carefully if maximum throughput is to be obtained. Synchronizing the DSP code with the ARM code that is running concurrently is a complex task. ARM Limited has introduced two different extensions to the ARM architecture in attempting to simplify the system design task in applications which require both controller and signal processing functions: the Piccolo coprocessor, and the signal processing instruction set extensions in ARM architecture v5TE.

Piccolo

The Piccolo coprocessor is a sophisticated 16-bit signal processing engine that uses the ARM coprocessor interface to cooperate with the ARM core in the transfer of operands and results from and to memory, but also executes its own instruction set. The organization of Piccolo is illustrated in Figure 8.20 on page 241.

Operands are loaded and results stored via the ARM coprocessor interface, so suitable addresses must be generated by the ARM core. The input and output buffers allow these transfers to move many 16-bit values in a single instruction, and values are transferred in pairs, making full use of the ARM's 32-bit bus width. The input buffer stores values until they are called upon by the signal processing code, and they may be accessed out of order from the buffer.

The Piccolo register set holds operands that may be 16 bits, 32 bits or 48 bits wide. The four 48-bit registers are used to accumulate results such as inner products without risk of overflow. The processing logic can compute a 16 x 16 product and add the result to one of these 48-bit accumulator registers in a single cycle. It also offers good support for fixed point operations and supports saturating arithmetic.

The signal processing operations that use values held in the register file are specified in a separate instruction set which Piccolo loads from memory via the AMBA bus into a local instruction cache.
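The data path described above can be modelled in a few lines of Python. This is only an illustrative sketch, not Piccolo's actual instruction set or register model: the function names (`pack_pair`, `unpack_pair`, `mac`, `inner_product`) are hypothetical, the choice of putting the first value in the low half of each 32-bit word is an assumption, and clamping the accumulator at the 48-bit limits is one assumed interpretation of "saturating arithmetic".

```python
ACC_BITS = 48                          # accumulator width from the text
ACC_MAX = (1 << (ACC_BITS - 1)) - 1    # largest signed 48-bit value
ACC_MIN = -(1 << (ACC_BITS - 1))       # smallest signed 48-bit value


def to_int16(x):
    """Interpret the low 16 bits of x as a signed 16-bit value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def pack_pair(lo, hi):
    """Pack two 16-bit values into one 32-bit word (low half first: an assumption)."""
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)


def unpack_pair(word):
    """Split a 32-bit word into its two signed 16-bit halves (low, high)."""
    return to_int16(word), to_int16(word >> 16)


def saturate(value):
    """Clamp a result to the signed 48-bit range instead of wrapping around."""
    return max(ACC_MIN, min(ACC_MAX, value))


def mac(acc, a, b):
    """One multiply-accumulate step: acc + a*b, with a and b as signed 16-bit operands."""
    return saturate(acc + to_int16(a) * to_int16(b))


def inner_product(words_a, words_b):
    """Inner product of two vectors stored as packed pairs of 16-bit values."""
    acc = 0
    for wa, wb in zip(words_a, words_b):
        a_lo, a_hi = unpack_pair(wa)
        b_lo, b_hi = unpack_pair(wb)
        acc = mac(acc, a_lo, b_lo)
        acc = mac(acc, a_hi, b_hi)
    return acc
```

The largest magnitude a signed 16 x 16 product can reach is 2^30, so a 48-bit accumulator leaves roughly 17 bits of headroom above a single product. That headroom is why long inner products can be summed without overflow, where a 32-bit accumulator would overflow after only a couple of worst-case terms.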
An objective of the Piccolo architecture is to provide sufficient local storage, in the form of registers, the instruction cache and the input and output buffers, that a single AMBA bus can support good throughput. This contrasts with conventional signal processor designs that use two independent data memories and a separate instruction memory. As a result, Piccolo o...
This document was uploaded on 10/30/2011 for the course CSE 378 380 at SUNY Buffalo.
- Spring '09