egularity.

Datapath layout

The ARM datapath is laid out to a constant pitch per bit. The pitch will be a compromise between the optimum for the complex functions (such as the ALU), which are best suited to a wide pitch, and the simple functions (such as the barrel shifter), which are most efficient when laid out on a narrow pitch. Each function is then laid out to this pitch, remembering that there may also be buses passing over a function (for example, the B bus passes through the ALU but is not used by it); space must be allowed for these. It is a good idea to produce a floor-plan for the datapath noting the 'passenger' buses through each block, as illustrated in Figure 4.20. The order of the function blocks is chosen to minimize the number of additional buses passing over the more complex functions.

Figure 4.20 ARM core datapath buses.

Modern CMOS processes allow wiring in several metal layers (the early ARM cores used two metal layers). The wiring layers used for power and ground, bus signals along the datapath and control signals across the datapath must be chosen carefully (for example, on ARM2 Vdd and Vss run along both sides of the datapath in metal 2, control wires pass across the datapath in metal 1 and buses run along it in metal 2).

Control structures

The control logic on the simpler ARM cores has three structural components which relate to each other as shown in Figure 4.21.

1. An instruction decoder PLA (programmable logic array). This unit uses some of the instruction bits and an internal cycle counter to define the class of operation to be performed on the datapath in the next cycle.
2. Distributed secondary control associated with each of the major datapath function blocks. This logic uses the class information from the main decoder PLA to select other instruction bits and/or processor state information to control the datapath.
3. Decentralized control units for specific instructions that take a variable number of cycles to complete (load and store multiple, multiply and coprocessor operations). Here the main decoder PLA locks into a fixed state until the remote control unit indicates completion.

The main decode...
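
The way the three control components listed above interact can be illustrated with a small behavioural model. The following Python sketch is only a toy under stated assumptions, not the ARM control logic: the instruction kinds, the class names `data_op` and `block_transfer`, the `PLA_TABLE` contents and the one-cycle-per-register timing of the remote unit are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical operation classes; the real PLA outputs are not given in the text.
DATA_OP = "data_op"
BLOCK_TRANSFER = "block_transfer"

# Toy main decoder PLA: (instruction kind, cycle counter) -> operation class.
# A real PLA examines selected instruction bits; "kind" stands in for them here.
PLA_TABLE = {
    ("data", 0): DATA_OP,
    ("ldm", 0): BLOCK_TRANSFER,
}


@dataclass
class Instruction:
    kind: str            # stand-in for the bits seen by the main decoder
    alu_op: str = ""     # stand-in for bits selected by the secondary ALU control
    reg_list: int = 0    # register list used by the block transfer unit


def main_decoder(kind, cycle):
    """Return the class of operation for the next datapath cycle."""
    return PLA_TABLE[(kind, cycle)]


def alu_secondary_control(op_class, instr):
    """Secondary control local to one function block (here, the ALU).

    It uses the class from the main decoder to decide which other
    instruction bits matter for this block.
    """
    if op_class == DATA_OP:
        return instr.alu_op
    return "idle"        # hypothetical setting when the ALU is not selected


class BlockTransferUnit:
    """Toy remote control unit for a variable-length instruction.

    Assumption: one cycle per register in the list; real timing differs.
    """

    def __init__(self, reg_list):
        self.remaining = bin(reg_list).count("1")

    def step(self):
        """Advance one cycle; return True to signal completion."""
        self.remaining -= 1
        return self.remaining <= 0


def run(instr):
    """Return one (op_class, alu_control) pair per clock tick."""
    op_class = main_decoder(instr.kind, cycle=0)
    unit = BlockTransferUnit(instr.reg_list) if op_class == BLOCK_TRANSFER else None
    trace = []
    while True:
        trace.append((op_class, alu_secondary_control(op_class, instr)))
        # With no remote unit the instruction finishes in this tick; otherwise
        # the main decoder stays locked in the same state until the unit is done.
        if unit is None or unit.step():
            return trace


if __name__ == "__main__":
    print(run(Instruction("data", alu_op="add")))
    print(run(Instruction("ldm", reg_list=0b1011)))
```

In this model the main decoder is consulted once per instruction, and for the block transfer its output is simply held while the remote unit counts down, which mirrors the 'locks into a fixed state' behaviour described in item 3. Keeping the secondary control as a separate function per block reflects the distributed arrangement in item 2.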