chapt11_ComputerOrganization

From coupler-flange to spindle-guide I see Thy Hand, O God -
Predestination in the stride o' yon connectin'-rod.
- R. Kipling

Introduction

In Chapters 1 through 10, we examined the fundamental principles of digital design. In this chapter and the next, we will focus on applying these techniques to one major class of digital systems: the stored program computer.

A stored program computer consists of a processing unit and an attached memory system. Commands that instruct the processor to perform certain operations are placed in the memory along with the data items to be operated on.

The processing unit consists of datapath and control. The datapath contains registers to hold data and functional units, such as arithmetic logic units and shifters, to operate on data. The control unit is little more than a finite state machine that sequences through its states to (1) fetch the next instruction from memory, (2) decode the instruction to interpret its meaning, and (3) execute the instruction by moving and/or operating on data in the registers and functional units of the datapath.

The critical design issues for a datapath are how to "wire" the various components together to minimize hardware complexity and the number of control states needed to complete a typical operation. For control, the issue is how to organize the relatively complex instruction interpretation finite state machine.

In this chapter, we will discuss how hardware components are organized into computers. In addition, we will apply the techniques of the earlier chapters to the design of datapaths and processor control units. In particular, we will examine:

- Point-to-point, single-bus, and multiple-bus strategies for interconnecting the components of the datapath. There is a trade-off between datapath complexity and control complexity. A more complex datapath can simplify the control and vice versa.

- The structure of the controller finite state machine's state diagram. The state diagram for processor control has a special structure that we will exploit in Chapter 12.

11.1 Structure of a Computer

Figure 11.1 shows a high-level block diagram of a computer. It is decomposed into a central processing unit (CPU), or processor, and an attached memory system. In turn, the processor is decomposed into datapath and control units.

The datapath (also called the execution unit) contains registers for storing intermediate results and combinational circuits for operating on data, such as shifting, adding, and multiplying. The latter are sometimes called functional units because they apply functions to data. Data is moved from memory into registers. It is then moved to the functional units, where the data manipulations take place. The results are placed back into registers and eventually put back into memory. The datapath implements the pathways along which data can flow from registers to functional units and back again.

[Figure 11.1: Structure of a processor.]

The control unit (or instruction unit) implements a finite state machine that fetches a stream of instructions from memory. The instructions describe what operations, such as ADD, should be applied to which operands. The operands can be found in particular registers or in memory locations. The control unit interprets or "executes" instructions by asserting the appropriate signals for manipulating the datapath, at the right time and in the correct sequence.

For example, to add two registers and place the result in a third register, the control unit (1) asserts the necessary control signals to move the contents of the two source registers to the arithmetic logic unit (ALU), (2) instructs the ALU to perform an ADD operation by asserting the appropriate signals, and (3) moves the result to the specified destination register, again by asserting signals that establish a path between the ALU and the register.
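The three control steps for the register-to-register add can be sketched in Python. This is only an illustrative model, not the chapter's hardware design: the register names, the `alu` helper, and the 32-bit width are assumptions made for the example.

```python
WIDTH = 32                       # assumed datapath width
MASK = (1 << WIDTH) - 1


def alu(op, a, b):
    """Hypothetical ALU model: apply the operation selected by the control."""
    if op == "ADD":
        return (a + b) & MASK    # result wraps to the datapath width
    if op == "SUB":
        return (a - b) & MASK
    raise ValueError(f"unsupported operation: {op}")


def execute_add(regs, src_a, src_b, dst):
    """Mimic the three control steps for 'add two registers into a third'."""
    # (1) Move the contents of the two source registers to the ALU inputs.
    alu_a, alu_b = regs[src_a], regs[src_b]
    # (2) Instruct the ALU to perform an ADD.
    result = alu("ADD", alu_a, alu_b)
    # (3) Establish a path from the ALU output to the destination register.
    regs[dst] = result
    return regs


regs = {"R0": 5, "R1": 7, "R2": 0}
execute_add(regs, "R0", "R1", "R2")
print(regs["R2"])  # prints 12
```

In real hardware the three steps are control signals asserted in sequence rather than function calls; the sketch only shows the order in which data moves.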
Instructions can be grouped into three broad classes: data manipulation (add, subtract, etc.), data staging (load/store data from/to memory), and control (conditional and unconditional branches). The last class determines the next instruction to fetch, sometimes conditionally based on inputs from the datapath. For example, the instruction may be to take the branch if the last datapath operation resulted in a negative number.

You are already familiar with the basic building blocks needed to implement the processor. You can interconnect NAND and NOR gates to build adder and logic circuits (Chapter 5) and registers (Chapters 6 and 7). The processor control unit is just another finite state machine (Chapters 8, 9, and 10). In the rest of this section, we will examine the components of a computer in a little more detail, as a prelude to the rest of this chapter.

A processor control unit is considerably more complex than the kinds of finite state machines you have seen so far. A simple finite state machine is little more than next-state and output logic coupled to a state register. A control unit, on the other hand, needs access to its own datapath, a collection of registers containing information that affects the actions of the state machine.

Program Counter/Instruction Register

For example, the control unit may have a register to hold the address of the next memory word to fetch for instruction interpretation. This is frequently called the program counter, or PC. When the instruction is moved from memory into the control unit, it must be held somewhere while the control decodes the kind of instruction it is. This staging memory, often implemented by a register, is called the instruction register, or IR. It is a special-purpose register and is usually not visible to the assembly language programmer.

Basic States of the Control Unit

The control unit can be in one of four basic phases: Reset,
Fetch the Next Instruction, Decode the Instruction, and Execute the Instruction. A high-level state diagram for a typical control unit is shown in Figure 11.2.

Let's begin with the initialization sequence. An external reset signal places the finite state machine in its initial Reset state, from which the processor is initialized. Since the state of the processor contains more than just the state register of the finite state machine, several of the special registers must also be set to an initial value. For example, the PC must be set to some value, such as 0, before the first instruction can be fetched. Perhaps an accumulator register or a special register holding an indication of the condition of the datapath will be set to 0 as well. Although shown as a single state in the figure, the initialization process may be implemented by a sequence of states.

Next, the machine enters the Fetch Instruction state. The contents of the PC are sent as an address to the memory system. Then the control generates the signals needed to commence a memory read. When the operation is complete, the instruction is available on the memory's output wires and must be moved into the control unit's IR. Again, Fetch Instruction looks like a single state in the figure, but the actual implementation involves a sequence of states.

[Figure 11.2: High-level control state diagram. States shown: Reset, Initialize Machine, Fetch Instruction, a different execution sequence for each instruction class (Load/Store, Register-to-Register, Branch taken or not taken), and Increment PC.]

[Figure 11.3: Iterative composition of datapath objects.]

Once the instruction is available in the IR, the control examines certain bits within the instruction to determine its type. Each instruction type leads to a different sequence of execution states.
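The shape of the state diagram in Figure 11.2 can be sketched as a next-state function. This is a minimal model under stated assumptions: each phase is collapsed into a single state, and the state names, the `instr_class` strings, and the `trace` helper are hypothetical labels chosen for the example.

```python
from enum import Enum, auto


class State(Enum):
    RESET = auto()
    FETCH = auto()
    DECODE = auto()
    EXEC_BRANCH = auto()
    EXEC_LOAD_STORE = auto()
    EXEC_REG_REG = auto()
    INCR_PC = auto()


# Multiway decode branch: one execution sequence per instruction class.
EXEC_STATE = {
    "branch": State.EXEC_BRANCH,
    "load_store": State.EXEC_LOAD_STORE,
    "reg_reg": State.EXEC_REG_REG,
}


def next_state(state, instr_class=None, branch_taken=False):
    """Return the successor of `state` in the simplified control diagram."""
    if state is State.RESET:
        return State.FETCH
    if state is State.FETCH:
        return State.DECODE
    if state is State.DECODE:
        return EXEC_STATE[instr_class]
    if state is State.EXEC_BRANCH and branch_taken:
        return State.FETCH          # a taken branch has already set the PC
    return State.INCR_PC if state is not State.INCR_PC else State.FETCH


def trace(instr_class, branch_taken=False):
    """List the states visited for one instruction, starting at FETCH."""
    state, visited = State.FETCH, []
    while True:
        visited.append(state.name)
        state = next_state(state, instr_class, branch_taken)
        if state is State.FETCH:
            return visited


print(trace("reg_reg"))
print(trace("branch", branch_taken=True))
```

A real controller expands each of these states into a sequence of states, as the text notes for Reset and Fetch Instruction.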
For example, the basic execution sequence for a register-to-register add instruction is identical to one for a register-to-register subtract. The operands must be moved to the ALU and the result directed to the correct register destination. The only difference is the operation requested of the ALU. As long as the basic data movements are the same, the control sequences can be parameterized by the specific operation, decoded directly from the instruction.

The state machine in the figure partitions the instructions into three classes: Branch, Load/Store, and Register-to-Register. Of course, there could be more classes. In the limit, there could be a unique execution sequence for each instruction in the processor's instruction set.

The final state takes care of housekeeping operations, such as incrementing the PC, before branching back to fetch the next instruction. The execution sequence for a taken branch modifies the PC itself, so it bypasses this step. The sequence of instruction fetch, execute, and PC increment continues until the machine is reset.

While the details of the state diagram may vary from one instruction set to another, the general sequencing and the shape of the state diagram are generic to CPU state machines. The most distinguishing feature is the multiway decode branch between the instruction fetch and its execution. This influences the design of controllers for simple CPUs that we describe in the next chapter.

The elements of the datapath are built up in a hierarchical and iterative fashion. Consider how we go about constructing a 32-bit arithmetic unit for inclusion in the datapath. At the most primitive level, we begin with the half adder that can add 2 bits. By interconnecting two of these, we create the full adder. Once we have a single "bit slice" for the datapath object, we create as many instances of it as we need for the width of the datapath. For example,
we construct a 32-bit adder of the ALU by iteratively composing 32 instances of a 1-bit-wide adder.

As we saw in Chapter 5, an ALU bit slice is somewhat more complicated than this. It should also include hardware for logic operations and carry lookahead to perform arithmetic operations with reduced delay.

The datapath symbol for a typical arithmetic logic unit is shown in Figure 11.3. The 32-bit A and B data inputs come from other sources in the datapath; the S output goes to a datapath destination. The operation signals come from the control unit; the carry-out signal is routed back to the control unit so that it may detect certain exceptional conditions, such as overflow, that may disrupt the normal sequencing of instructions. We construct other datapath objects, such as shifters, registers, and register files, in an analogous manner.

[Figure 11.4: Control and data flows in a simple CPU.]

A datapath with only a single data register, usually called the accumulator (AC), is the simplest machine organization. Figure 11.4 shows the block diagram for such a single accumulator machine.

Instructions for a single accumulator machine are called single-address instructions. This is because they contain only a single reference to memory. One operand is implicitly the AC; the other is an operand in memory. The instructions are of the form

    AC := AC <operation> Memory[Address]

where <operation> could be ADD, SUBTRACT, AND, OR, and so on. Let's consider an ADD instruction. The old value of the AC is replaced with the sum of the AC's contents and the contents of the specified memory location.

Data and Control Flows

Figure 11.4 shows the flow of data and control between memory, the control registers (IR, MAR, and PC), the data register (AC), and the functional units (ALU). The MAR is the Memory Address Register,
a storage element that holds the address during memory accesses. The figure draws data flows and control flows with different line styles.

The core of the datapath consists of the arithmetic logic unit and the AC. The AC is the source or destination of all transfers. These transfers are initiated by store, arithmetic, or load operations. Let's look at them in more detail.

The instruction identifies not only the operation to be performed but also the address of the memory operand. Store operations move the contents of the AC to a memory location specified by bits within the instruction. The sequencing begins by moving the specified address from the IR to the MAR. Then the contents of the AC are placed on the memory's data input lines while the MAR is placed onto its address lines. Finally, the memory control signals are cycled through a write sequence.

Arithmetic operations take as operands the contents of the accumulator and the memory location specified in the instruction. Again, the control moves the operand address from the IR to the MAR, but this time it invokes a memory read cycle. Data obtained from the load path is combined with the current contents of the AC to form the operation result. The result is then written back to the accumulator.

A load operation is actually a degenerate case of a normal arithmetic operation. The control obtains the B operand along the load path from memory, it places the ALU in a pass-through mode, and it stores the result in the AC.

Whereas load/store and arithmetic instructions manipulate the AC, branch instructions use the PC.
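The store, arithmetic, and load sequences just described can be sketched as a small Python model of the execution step of a single accumulator machine. Everything concrete here is an assumption for illustration: the 16-bit word size, the opcode names, the dictionary used for the CPU registers, and the `execute` function itself.

```python
MASK = 0xFFFF                      # assumed 16-bit word

# ALU functions of the A operand (the AC) and the B operand (from memory).
ALU_OPS = {
    "ADD": lambda a, b: a + b,
    "SUB": lambda a, b: a - b,
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "LOAD": lambda a, b: b,        # pass-through mode: B goes straight to S
}


def execute(cpu, memory, opcode, address):
    """Execute one already-decoded single-address instruction."""
    cpu["MAR"] = address                          # IR<address bits> -> MAR
    if opcode == "STORE":
        memory[cpu["MAR"]] = cpu["AC"]            # AC -> memory, write cycle
    else:
        operand = memory[cpu["MAR"]]              # read cycle, load path
        cpu["AC"] = ALU_OPS[opcode](cpu["AC"], operand) & MASK
    return cpu


cpu = {"AC": 5, "MAR": 0}
memory = [0] * 8
memory[3] = 7
execute(cpu, memory, "ADD", 3)     # AC := AC + Memory[3]
execute(cpu, memory, "STORE", 4)   # Memory[4] := AC
print(cpu["AC"], memory[4])        # prints 12 12
```

Treating LOAD as one more entry in the ALU table mirrors the observation that a load is a degenerate arithmetic operation.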
If the instruction is an unconditional branch, the address portion of the IR replaces the PC, changing the next instruction to be executed. Similarly, a conditional branch replaces the PC if a condition specified in the instruction evaluates to true.

Placement of Instructions and Data

There are two possible ways to connect the memory system to the CPU. The first is the so-called Princeton architecture: instructions and data are mixed in the same memory. In this case, the instruction and load/store paths are the same.

The alternative is the Harvard architecture. Data and instructions are stored in separate memories with independent paths into the processor.

The Princeton architecture is conceptually simpler and requires fewer connections to the memory, but the Harvard architecture has certain performance advantages. A Harvard architecture can fetch the next instruction even while executing the current instruction. If the current instruction needs to access memory to obtain an operand, the next instruction can still be moved into the processor. This strategy is called instruction prefetching, because the instruction is obtained before it is really needed. A Princeton architecture can prefetch instructions too. It is just more complicated to do so. To keep the discussion simple, we will assume a straightforward Princeton architecture in the rest of this chapter.

Detailed Instruction Trace

As an example of the control signal and data flows needed to implement an instruction, let's trace a simple instruction that adds the contents of a specified memory location to the AC:

1. The control moves the PC to the MAR and begins a memory read operation; the instruction that memory returns is loaded into the IR.

2. The control decodes the instruction in the IR and recognizes it as an add from memory.

3. The control moves the operand address from the IR to the MAR and begins a second memory read operation to fetch the operand.

4. Once the data is available from memory along the load path, the control drives the ALU with signals instructing it to ADD its A and B operands to form the S result.

5. The control then moves the S result into the AC to complete the execution of the instruction.

6. The control increments the program counter to point to the next instruction.
The machine returns to the first step.

Most of what the control does is transfer data from one register to another, asserting the appropriate control signals at the correct times. Control sequences are commonly described in terms of register transfer operations:

    Instruction Fetch:
        PC → MAR;                  Move PC to MAR
        Memory Read;               Assert Memory READ signal
        Memory → IR;               Load IR from Memory
    Instruction Decode:
        IF IR<op code> = ADD_FROM_MEMORY THEN
    Instruction Execution:
        IR<address bits> → MAR;    Move operand address to MAR
        Memory Read;               Assert Memory READ signal
        Memory → ALU B;            Gate Memory to ALU B
        AC → ALU A;                Gate AC to ALU A
        ALU ADD;                   Instruct ALU to perform ADD
        ALU S → AC;                Gate ALU result to AC
        PC increment;              Instruct PC to increment

We write the operation statements in terms of the control signals to be asserted, such as Memory Read, ALU ADD, or PC increment. We write register-to-register transfers in the form source register → destination register. The detailed pathways between registers determine the more detailed register transfer description. We will see more register transfer descriptions in Section 11.2.

[Figure 11.5: Memory interface.]

Figure 11.4 showed a reasonably generic interface to memory. A more realistic view for a Princeton architecture machine is shown in Figure 11.5. The key elements are the two special registers, MAR and MBR, and the three control signals, Request, Read/Write, and Wait. Let's start with the registers.

We have seen the MAR before. In Figure 11.5, it can be loaded from the program counter for instruction fetch or from the IR with a load or store address. To decouple the memory from the internal working of the processor, we introduce a second interface register, the Memory Buffer Register, or MBR.
A bidirectional path for load/store data exists between the processor datapath and the MBR, while the pathway for instructions between the MBR and IR is unidirectional.

Besides the address and data lines, the interface to memory consists of three control signals. The Request signal notifies the memory that the processor wishes to access it. The Read/Write signal specifies the direction: read from memory on a load and write to memory on a store. The Wait signal lets memory stall the processor, in effect notifying the processor that its memory request has not yet been serviced. We can think of Wait as the complement of an acknowledgment signal.

Processor-Memory Handshaking

In their most general form, the memory system and the processor do not share a common clock. To ensure proper transfer of data, we should follow the four-cycle signaling convention of Section 6.5.2. The processor asserts the read/write direction, places the address in the MAR (and the data in the MBR if a write), and asserts Request. The memory normally asserts Wait, unasserting it when the read or write is complete.

When the processor notices that Wait is no longer asserted, it latches data into the MBR on a read or tri-states the data connection to memory on a write. The processor unasserts its Request line and must wait for the Wait signal to be reasserted by the memory before it can issue its next memory request.

The signaling waveforms are shown in Figure 11.6. The four-cycle handshake of the Request and Wait signals for the read sequence works as follows:

Cycle 1: Request asserted. Read data placed on memory data bus.
Cycle 2: Wait unasserted. CPU latches read data into MBR.

[Figure 11.6: Memory interface timing waveforms.]
Cycle 3: Request unasserted.
Cycle 4: Wait asserted.

In this signaling convention, a new request can be made only after the Wait signal is asserted. The write cycle is analogous.

Figure 11.7 shows possible state machine fragments for implementing the four-cycle handshake with memory. We assume a Moore machine controller implementation. In the read cycle, we enter a state that drives the address bus from the MAR, asserts the Read and Request signals, and latches the data bus into the MBR. This last transfer catches correct data only if memory has unasserted Wait, so we must loop in this state until this is true. On exit to the next state, the Request signal is unasserted and the address bus is no longer driven. The memory signals that it is ready for a new request by asserting Wait. To remain interlocked with memory, we loop in this state until Wait is asserted. The write cycle is similar.

[Figure 11.7: State machine fragments for the read and write cycles. Read cycle, first state: MAR → Address Bus; 1 → Read/Write; 1 → Request; Data Bus → MBR. Write cycle, first state: MAR → Address Bus; 0 → Read/Write; 1 → Request; MBR → Data Bus. Second state of each cycle: 0 → Request. Transitions are conditioned on Wait.]

Depending on detailed setup and hold time requirements, it may be necessary to insert additional states in the fragments of Figure 11.7. For example, if the memory system requires that the address lines and read/write direction be stable before the request is asserted, this should be done in a state preceding the one that asserts Request.

Remember that only the register transfer operations being asserted in a given state need to be written there. If an operation is not mentioned in a state (or state transition for a Mealy machine), it is implicitly unasserted. Thus,
you don't have to explicitly set Request to its unasserted value in the second state of the handshake fragments. However, you should include such register transfer operations to improve the clarity of your state diagram.

11.1.5 Input/Output: The Third Component of Computer Organization

We have dealt with the interconnections between the processor and memory. The organization of the computer has a third component: input/output devices. We cover them only briefly here.

Input/output devices provide the computer's communication with the outside world. They include displays, printers, and mass storage devices, such as magnetic disks and tapes. For the purposes of this discussion, the main attribute of I/O devices is that they are typically much slower than the processor to which they are attached.

Memory-Mapped I/O

I/O devices can be coupled to the processor in two primary ways: via a dedicated I/O bus or by sharing the same bus as the memory system. Almost all modern computers use the second method. One advantage of this method is that there are no special instructions to perform I/O operations. Load and store operations can initiate device reads and writes if they are directed to addresses that are recognized by the I/O device rather than memory. This strategy is called memory-mapped I/O because the devices appear to the processor as though they were part of the memory system.

I/O access times are measured in milliseconds, whereas memory access times are usually less than a microsecond. It isn't productive to hold up the processor for thousands of instruction times while the I/O device does its job. Therefore, the control coupling between the processor and I/O devices is somewhat more complex than the memory interface.

Polling Versus Interrupts

Because of the (relatively) long time to execute I/O operations, they are normally performed in parallel with CPU processing. An I/O device often has its own controller,
essentially an independent computer that handles the details of device control. The CPU asks the controller to perform an I/O operation, usually by writing information to memory-mapped control registers. The processor continues to execute a stream of instructions while the I/O controller services its request.

The I/O controller notifies the CPU when its operation is complete. It can do this in two main ways: polling and interrupts. In polling, the I/O controller places its status in a memory-mapped register that the CPU can access. Every once in a while, the system software running on the CPU issues an instruction to examine the status register to see if the request is complete.

With interrupts, when the I/O operation is complete, the controller asserts a special control input to the CPU called the interrupt line. This forces the processor's state machine into a special interrupt state. The current state of the processor's registers, such as the PC and AC, is saved to special memory locations. The PC is overwritten with a distinguished address, where the system software's code for interrupt handling can be found. The instructions at this location handle the interrupt by copying data from the I/O device to memory where other programs can access it.

Polling is used in some very high performance computers that cannot afford to have their instruction sequencing disturbed by an I/O device's demand for attention. Interrupt-based I/O is used in almost all other computers, such as personal computers and time-sharing systems.

Changes to the Control State Diagram

We need only modest changes to add interrupt support to the basic processor state diagram of Figure 11.2. Before fetching a new instruction, the processor checks to see whether an interrupt request is pending. If not, it continues with normal instruction fetch and execution.

If an interrupt has been requested, the processor simply enters its special interrupt state sequence.
It saves the state of the machine, particularly the PC, and tells the I/O device through a standard handshake that it has seen the interrupt request. At this point, the machine returns to normal instruction fetch and execution, except that the PC now points to the first instruction of the system software's interrupt handler code.

A machine with interrupts usually provides a Return from Interrupt instruction. The system software executes this instruction at the end of its interrupt handling code, restoring the machine's saved state and returning control to the program that was running when the interrupt took place.

11.2 Busing Strategies

One of the most critical design decisions for the datapath is how to connect together its hardware resources. There are three general strategies: point-to-point connections, a single shared bus interconnection, or multiple special-purpose buses. Each represents a trade-off between datapath and control complexity and the amount of parallelism supported by the hardware. This determines the processor's efficiency, defined as the number of control states (or clock cycles) needed to fetch and execute a typical instruction. When a datapath supports many simultaneous transfers among datapath elements, the control unit requires fewer states (and clock cycles) to execute a given instruction.

In this section, we will examine methods for organizing the interconnection of datapath components, using the example of four general-purpose registers. We will consider how the datapath can support the operation of a register-to-register swap, that is, simultaneous exchange of the contents of one register with another. In register transfer notation, we write the instruction's execution sequence as

    SWAP(Ri, Rj):  Ri → Rj;  Rj → Ri;

where Ri and Rj are the registers whose contents are to be swapped.

In a point-to-point interconnection scheme,
there is a path from every possible source to every possible destination. Figure 11.8 shows how this can be implemented for the four-register example using 4-to-1 multiplexers. Each of the four registers receives its parallel load inputs from an associated multiplexer block. Ri is an edge-triggered register, which is loaded when the LDi input is asserted. We assume the load signal takes effect only on the appropriate clock edge; that is, it is a synchronous control signal. If each register is N bits wide, the multiplexer blocks must contain N 4-to-1 multiplexers, one multiplexer for each bit in the register. These are controlled by the 2-bit-wide selection inputs, Si<1:0> for register Ri.

[Figure 11.8: Point-to-point register interconnection.]

[Figure 11.9: Transfer state diagram. State X asserts 01 → S0<1:0>; 10 → S3<1:0>; 1 → LD0; 1 → LD3; and is followed by state Y.]

Register Transfer Operations and Event Timings

To see some of the possible transfers and how they may be implemented, consider the register transfers R1 → R0 (transfer the contents of R1 to R0) and R2 → R3 (transfer the contents of R2 to R3). The following detailed register transfer operations describe the necessary sequencing of the control signals:

    01 → S0<1:0>;
    10 → S3<1:0>;
    1 → LD0;
    1 → LD3;

The first two register transfer operations connect R0's input lines to the output of R1, and similarly for R3 being driven from R2. The last two assert the load signals for registers R0 and R3, respectively.

Figure 11.9 shows a state diagram fragment to illustrate when the control signals are asserted and when they take effect. We assume a Moore machine implementation (a synchronous Mealy with registered outputs behaves analogously). When entering state X, the multiplexer control signals are asserted, gating R1 and R2 to the inputs of R0 and R3. The state also asserts the R0 and R3 load signals. But because these are synchronous,
they do not take effect until the next state transition. Thus, the R1 and R2 signals have time to propagate through the multiplexer blocks and become stable for the requisite setup times before the clock edge arrives that advances the finite state machine to state Y. The contents of R0 and R3 change on this state transition, not the one that caused the load inputs to become asserted. Since the new values take some time to propagate through the register load circuitry before they emerge at the outputs, the hold time requirements at the register inputs are easily met.

The SWAP Operation

To see how the interconnection scheme can implement a SWAP operation, you need to understand the timing relationship between register transfer operations and their effect on the datapath. Consider a SWAP between registers R1 and R2. The control signal settings are

    01 → S2<1:0>;
    10 → S1<1:0>;
    1 → LD2;
    1 → LD1;

On entering state X, the multiplexer selection signals establish the desired pathways between register outputs and inputs. The load signals are asserted, but the registers have not yet received their new values. This occurs only at the clock edge that causes the transition to state Y. Fortunately, the new values appear at the outputs well after the hold time requirements at the inputs have been met.

Discussion

The point-to-point scheme is so flexible that it can transfer new values into the four registers at the same time. But there is a significant hardware cost. A 4-to-1 multiplexer requires at least five gates for its implementation (recall Figure 4.29). Assuming a 32-bit-wide datapath, this means 160 gates per register or 640 gates for the four-register example. For this reason, point-to-point connections can only be used in rare cases in which the flexibility far outweighs the implementation cost.

A bus is a set of interconnection pathways that are shared by multiple data sources and destinations.
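The point-to-point SWAP timing and the gate count quoted above can be checked with a short Python sketch. It is a behavioral model only, under the assumption that every register samples its multiplexer output on the same clock edge; the `clock_edge` function and the list encoding of the select and load signals are hypothetical conveniences, not part of the original design.

```python
def clock_edge(regs, select, load):
    """Apply one clock edge to a point-to-point register interconnection.

    regs[i]   : current contents of Ri
    select[i] : index of the register gated through Ri's multiplexer
    load[i]   : 1 if LDi is asserted, else 0
    """
    # Multiplexer outputs settle from the values held *before* the edge.
    mux_out = [regs[select[i]] for i in range(len(regs))]
    # At the edge, each register with LDi asserted captures its mux output.
    return [mux_out[i] if load[i] else regs[i] for i in range(len(regs))]


# SWAP R1 and R2 in a single state: S2 selects R1, S1 selects R2.
regs = [10, 11, 12, 13]
select = [0, 2, 1, 0]        # S0 and S3 are don't-cares here
load = [0, 1, 1, 0]          # LD1 and LD2 asserted
regs = clock_edge(regs, select, load)
print(regs)                  # prints [10, 12, 11, 13]

# Hardware cost quoted in the text: five gates per 4-to-1 multiplexer.
GATES_PER_MUX, WIDTH, REGISTERS = 5, 32, 4
gates_per_register = GATES_PER_MUX * WIDTH
total_gates = gates_per_register * REGISTERS
print(gates_per_register, total_gates)   # prints 160 640
```

Because both new values are computed from the old register contents before either register is updated, the exchange completes in one clock cycle without a temporary register.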
If the point-to-point connection scheme is too hardware intensive, a lower-cost alternative is to use a single interconnection bus. This is shown in Figure 11.10. The block with a multiplexer for each register has been replaced by a block with a single multiplexer that is shared by all registers. The hardware cost is 25% of that of the point-to-point approach. The multiplexer places selected data on a bus that feeds the load inputs of all registers.

This dramatic reduction in hardware cost comes at a price: The shared bus (and its multiplexer) is a critical resource because it can be used by only one transfer at a time. However, the single source register can still "broadcast" simultaneously to more than one destination register.

Figure 11.10 Single-bus register interconnection.

The Register Transfers Revisited

To see that transfers now require more states, let's again consider the transfer of R1 to R0 and R2 to R3. These now require two separate states, asserting the following control signals:

    State X: (R1 → R0)  01 → S<1:0>;  1 → LD0;
    State Y: (R2 → R3)  10 → S<1:0>;  1 → LD3;

The SWAP Operation Revisited

Since the datapath no longer supports two simultaneous transfers, the register swap operation becomes much more difficult for us to implement. We must stage the data to be swapped through a temporary register that we introduce into the datapath. Let's call the temporary register R4. This means that we have to expand to a 5-to-1 multiplexer.

Swapping the contents of registers R1 and R2 now requires the following register transfer operations:

    State X: (R1 → R4)  001 → S<2:0>;  1 → LD4;
    State Y: (R2 → R1)  010 → S<2:0>;  1 → LD1;
    State Z: (R4 → R2)  100 → S<2:0>;  1 → LD2;

With point-to-point connections, SWAP could be implemented in a single state and just one clock cycle (we assume one clock cycle per state). Using a single bus interconnection,
however, SWAP requires an extra register, a larger MUX, and three control states.

Discussion

This illustrates a fundamental trade-off in computer hardware: extra complexity in the datapath can reduce the control complexity and vice versa. The correct design decision depends critically on the frequency of operations. If you seldom require multiple simultaneous transfers, your correct choice is the simpler datapath. If you need to SWAP frequently, you should choose the point-to-point method.

A compromise strategy is also possible. It strikes a balance between control and datapath complexity by introducing a small number of additional buses just where they are needed. We will see this in Section 11.2.3.

Multiplexers Versus Tri-State Drivers

So far, we have used multiplexers to make connections between sources and destinations. An alternative that dramatically reduces the necessary hardware takes advantage of tri-state or open-collector buffers. Recall that these kinds of circuits allow multiple sources to share the same wire, as long as only one buffer is driving the shared data line at a time. [...] bus through its tri-state buffers. Most packaged logic registers include tri-state devices, so this form of interconnection is convenient.

Figure 11.11 Single-bus register interconnection with tri-states.

Real datapath designs incorporate the usual engineering trade-offs between control and datapath complexity. Typically, they have more than one bus but less than a full point-to-point scheme. In this subsection, we [...]

Figure 11.12 Single-bus processor datapath.

    Bus as destination: PC → BUS, IR → BUS, AC → BUS, MBR → BUS, ALU Result → BUS;
    Bus as source: BUS → PC, BUS → IR, BUS → AC, BUS → MBR, BUS → ALU B,
    BUS → MAR;
    Hardwired: AC → ALU A;

Single-Bus Cycle-by-Cycle Instruction Execution

Consider the simple instruction "ADD Mem[X]," which adds the contents of memory location X to the AC and stores the result back into the AC. With the connection scheme of Figure 11.12, the register transfer steps to execute the instruction are the following (we group the operations by state and cycle):

    Fetch Operand
    Cycle 1: IR<operand address> → BUS; BUS → MAR;
    Cycle 2: Memory Read; Databus → MBR;
    Perform ADD
    Cycle 3: MBR → BUS; BUS → ALU B; AC → ALU A; ADD;
    Write Result
    Cycle 4: ALU Result → BUS; BUS → AC;

During cycle 3, the bus connects the operand in the MBR to the ALU input. The bus cannot be used at the same time as a pathway between the ALU result and the AC. Thus, this transfer must be deferred to the next cycle. With this organization, the ALU must have a latch to hold the result until it can be transferred at the next cycle.

Multiple Bus Register Transfer Diagram

Figure 11.13 gives an alternative three-bus organization that supports higher parallelism in the datapath. More parallelism means that more transfers can take place in the same state. This should lead us to a reduced state and cycle count for a typical instruction.

We partition the single bus functionally into a Memory Bus (MBUS), Result Bus (RBUS), and Address Bus (ABUS). The first connects the MBR with the ALU and IR, the second establishes a pathway between the ALU result and the AC and MBR, and the last provides connections between the IR, PC, and MAR.

Multiple-Bus Cycle-by-Cycle Instruction Execution

The cycle-by-cycle register transfer operations now become

    Fetch Operand
    Cycle 1: IR<operand address> → ABUS; ABUS → MAR;
    Cycle 2:
    Memory Read; Databus → MBR;
    Perform ADD
    Cycle 3: MBR → MBUS; MBUS → ALU B; AC → ALU A; ADD;
    Write Result
    ALU Result → RBUS; RBUS → AC;

Figure 11.13 Three-bus processor datapath.

Since MBUS and RBUS decouple the ALU inputs from the outputs, we can implement operations like ADD in a single cycle. Introducing the extra buses has decreased the execution cycle count from four to three. This doesn't quite represent a savings of 25% on the execution time of a typical program, since this cycle tally does not include the instruction fetch.

11.3 Finite State Machines for Simple CPUs

In this section, we will derive the state diagram and datapath for a simple processor. The machine will have 16-bit words and just four instructions. Although this may be an oversimplified example, it illustrates the process for deriving the state diagram and datapath and the interaction between the state diagram and the datapath's register transfer operations.

In general, the design of the processor's control goes hand-in-hand with the design of the datapath interconnect. We can summarize the step-by-step process as:

1. Start by developing the state diagram and associated register transfer operations for the processor control unit, assuming that point-to-point connections are supported by the datapath.
2. Next, identify the register interconnections that are not used at the same time in any control state. You can now replace these by bus-structured interconnect.
3. Revise the state diagram to reflect the register transfer operations supported by the modified datapath.
4. Finally, determine how to implement the register transfer operations by detailed control signal sequences. Revise the state diagram to assert these signals in the desired sequences.

Now we are ready to begin the specification of our example machine.
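Before turning to the example machine, it may help to see how the choice of bus structure in step 2 feeds the cycle counts of step 3. The sketch below replays the ADD execution steps of Section 11.2.3 and counts cycles for a single shared bus and for the three-bus organization. The step list, the starts_new_cycle flags, and the function count_cycles are our own simplification: a step is forced into a new cycle either because it reads a register loaded by the previous step or because the bus it needs is already in use.

```python
# Execution steps for ADD Mem[X], in order.
# Each entry: (description, bus used, starts_new_cycle).
# starts_new_cycle is True when the step reads a register that the
# previous step loads on the clock edge.
ADD_STEPS = [
    ("IR<operand address> -> MAR", "ABUS", True),
    ("Memory Read; Databus -> MBR", "DATABUS", True),   # needs MAR loaded
    ("MBR -> ALU B; AC -> ALU A; ADD", "MBUS", True),   # needs MBR loaded
    ("ALU Result -> AC", "RBUS", False),                # combinational after ADD
]

# Map the logical buses onto the physical buses that actually exist.
SINGLE_BUS = {"ABUS": "BUS", "MBUS": "BUS", "RBUS": "BUS"}
THREE_BUS = {}  # every logical bus is its own physical bus


def count_cycles(steps, bus_map):
    """Count cycles, starting a new one on a register dependency or bus clash."""
    cycles = 0
    busy = set()
    for _description, bus, starts_new_cycle in steps:
        physical = bus_map.get(bus, bus)
        if cycles == 0 or starts_new_cycle or physical in busy:
            cycles += 1
            busy = set()
        busy.add(physical)
    return cycles


print(count_cycles(ADD_STEPS, SINGLE_BUS))  # 4
print(count_cycles(ADD_STEPS, THREE_BUS))   # 3
```

Only the final write-back changes between the two organizations: with separate memory and result buses it shares a cycle with the ADD, while on a single bus it must wait for the next cycle.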
Figure 11.14 Instruction format and encoding. Bits 15:14 hold the op code and bits 13:0 the address. Op codes: 00 = LD, 01 = ST, 10 = ADD, 11 = BRN.

Figure 11.15 Processor to memory interface.

Processor Specification

To a first approximation, a computer is described by its instruction set and programmer-visible registers. Ours is a single accumulator machine. Its instruction format and encoding are shown in Figure 11.14. Instruction and data words are 16 bits wide. The two high-order bits of the instruction contain an operation code to denote the operation type. The remaining 14 bits are used as the memory address of the operand word.

Figure 11.15 shows the processor-memory interface, based on the scheme introduced in Section 11.1.4. The memory data bus and memory address bus are 16 bits and 14 bits wide, respectively.

The processor's instructions are:

1. Load from Memory
   LD XXX: Memory[XXX] → AC;
2. Store to Memory
   ST XXX: AC → Memory[XXX];
3. Add from Memory
   ADD XXX: AC + Memory[XXX] → AC;
4. Branch if Accumulator Negative
   BRN XXX: IF AC<15> = 1 THEN XXX → PC;

Figure 11.16 gives a high-level state diagram for the processor's control. This is the starting point for deriving the detailed state machine in this section. You shouldn't be surprised that its major components are the familiar sequence of instruction fetch, operation decode, and operation execution.

Figure 11.16 High-level state diagram for example processor.

The state diagram of Figure 11.16 provides only a rough beginning for the detailed state machine. For example, fetching an instruction involves a memory access, and this requires several states.
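The instruction format and the four instruction definitions above can be exercised with a short Python sketch. It works at the instruction level only, so it ignores the memory handshake and the individual control states derived in the rest of this section. The helper names encode, decode, and execute, the 32-word demonstration memory, and the assumption that the PC wraps at 14 bits are ours, not part of the specification.

```python
OPCODES = {"LD": 0b00, "ST": 0b01, "ADD": 0b10, "BRN": 0b11}
MNEMONICS = {code: name for name, code in OPCODES.items()}
ADDR_BITS = 14
ADDR_MASK = (1 << ADDR_BITS) - 1   # IR<13:0>
WORD_MASK = (1 << 16) - 1          # 16-bit data words


def encode(mnemonic, address):
    """Pack an op code into bits 15:14 and an address into bits 13:0."""
    return (OPCODES[mnemonic] << ADDR_BITS) | (address & ADDR_MASK)


def decode(word):
    """Split a 16-bit instruction word into (mnemonic, address)."""
    return MNEMONICS[(word >> ADDR_BITS) & 0b11], word & ADDR_MASK


def execute(pc, ac, memory):
    """Fetch and execute one instruction; return the new (pc, ac)."""
    mnemonic, address = decode(memory[pc])
    pc = (pc + 1) & ADDR_MASK              # assumed: PC wraps at 14 bits
    if mnemonic == "LD":
        ac = memory[address]
    elif mnemonic == "ST":
        memory[address] = ac
    elif mnemonic == "ADD":
        ac = (ac + memory[address]) & WORD_MASK
    elif mnemonic == "BRN" and (ac >> 15) & 1:
        pc = address                       # taken only when AC<15> = 1
    return pc, ac


# Small demonstration program: load, add, store, then branch if negative.
memory = [0] * 32
memory[0:4] = [encode("LD", 16), encode("ADD", 17),
               encode("ST", 18), encode("BRN", 0)]
memory[16], memory[17] = 5, 0xFFF0

pc, ac = 0, 0
for _ in range(4):
    pc, ac = execute(pc, ac, memory)

print(hex(ac), hex(memory[18]), pc)  # 0xfff5 0xfff5 0
```

The sum 0xFFF5 has its high-order bit set, so the final BRN is taken and the PC returns to address 0.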
And, as pointed out in the previous section, the details of the datapath interconnections affect the number of cycles required to execute the instruction. The final state diagram will contain several more states than we have shown.

Refining the State Diagram: Overview

We start by decomposing the state diagram into its three major components: instruction fetch, operation decode, and operation execution. Throughout this section, we will refine each of these.

To begin, we must decide between a Moore or Mealy style of implementation. Let's choose the latter and assume that our controller will be a synchronous Mealy machine. Control outputs are now associated with transitions rather than states. Asserted control signals take effect when entering the next state.

Reset State

It is a good idea to make Reset (RES) the first state. This starts the machine in a known state when the reset signal is asserted. Also, it provides the place in the state diagram from which a control signal can be asserted to force parts of the datapath to known starting values. Perhaps the most important register to set at Reset is the program counter. In our machine, we will set the PC to 0. The Memory Request line should also be driven to its unasserted value on start-up.

Instruction Fetch

Reset (RES) is followed by a sequence of states to fetch the first instruction from memory (IF0, IF1, IF2). The PC is moved to the MAR, followed by a memory read sequence. Revising the Moore machine state fragment of Figure 11.7, we obtain the four-state Mealy sequence shown in Figure 11.17.

Let's examine the control signals on a transition-by-transition basis. When first detected, the external reset signal forces the state machine into state RES. This state resets the PC and Memory Request signals. It does so by the explicit operation 0 → PC for resetting the PC on entry to RES; Request is unasserted because it is not otherwise mentioned.
We assume that register transfer operations not listed in a transition are implicitly left unasserted.

Once the Reset signal is no longer asserted, the machine advances to state IF0. On this transition, the control signals to transfer the PC to the MAR are asserted. This is as good a place as any to increment the PC, setting it to point to the next sequential instruction. You should remember that register transfer statements are not like statements in a conventional programming language. The PC increment takes place on the same clock edge that causes the MAR to be loaded with the old value of the PC. Assuming edge-triggered devices, the setup/hold times and propagation delays guarantee that the old value of the PC is transferred to the MAR. We will reexamine these timing considerations in the next subsection.

Figure 11.17 Reset and instruction fetch states.

Figure 11.7 showed that the four-cycle handshake with memory can begin only when the memory Wait signal is asserted. So we loop in state IF0 until this Wait is asserted.

Once memory is ready to accept a request and Wait is asserted, we can begin a read memory sequence to obtain the instruction. On the transition to state IF1, we set up control signals to gate the MAR to the Memory Address Bus and assert the Read and Request signals.

Once we have entered IF1, the instruction address has been presented to memory and a memory read request has been made. As long as the Wait signal is asserted, these must remain asserted.

We advance to state IF2 when memory finally unasserts Wait, signaling that data is available on the Memory Databus.
On this transition it is safe to transfer the values on the memory bus into the MBR. This transition also unasserts the Request signal, indicating to memory that the processor is ready to end the memory cycle. The four-cycle handshake keeps us in this state until the Wait signal is again asserted. On this exit transition, the MBR can be transferred to the IR to begin the next major step in the state machine: operation decode.

Operation Decode

Because of the simplicity of our instruction set, the decode stage is simply a single state that tests the op code bits of the instruction register to determine the next state. This is shown in Figure 11.18. The notation on the transitions from state OD indicates a conditional test on IR bits 15 and 14. For example, if IR<15:14> = 00, the next state is LD0.

Figure 11.18 Operation decode state.

Instruction Execution: LOAD

Now we examine the execution sequences for the four instructions, starting with LD. The load execution sequence is given in Figure 11.19. The transition is taken to state LD0 if the op code bits of the IR are both 0. On this transition, we transfer the address portion of the IR to the MAR. States LD0, LD1, and LD2 are almost identical to the instruction fetch states, except that the destination of the data from memory is the AC rather than the IR. The rationale for the state transitions is also identical: When memory is ready, we assert Read and Request and keep these asserted until Wait is unasserted. At this point, the data is latched into the MBR and then moved to the AC.

Instruction Execution: STORE

The store execution sequence is shown in Figure 11.20. In essence, it is a memory write sequence that is similar to the load's read sequence. On the transition from the decode state, the address portion of the current instruction is transferred to the MAR while the AC is moved to the MBR. If memory is ready to accept a new
request, we begin a write cycle by gating MAR and MBR to the appropriate memory buses while asserting Write and Request. These signals remain asserted until Wait is unasserted. At this point, the processor resets the handshake and waits for the memory to do the same.

Figure 11.19 Load execution sequence.

Figure 11.20 Store execution sequence.

Instruction Execution: ADD

Figure 11.21 shows the execution sequence for the ADD instruction. The basic structure repeats the load sequence. Only the transition from state AD2 back to the reset state has a slightly different transfer operation.

Figure 11.21 Add execution sequence.

Instruction Execution: BRANCH NEGATIVE

Figure 11.22 gives the final execution sequence, for the Branch if AC Negative instruction. If the high-order bit of AC is 1, the IR's address bits replace the contents of the PC. Otherwise the current contents of the PC,
already incremented in the previous RES-to-IF0 transition, determine the location of the next instruction.

Figure 11.22 Branch execution sequence.

[...] already checks whether memory is ready to receive a new request. By verifying that Wait is asserted, we can eliminate state ST0. For the same [...] loopback and exit conditions for states [...]. Figure 11.23 gives the complete state diagram.

To this point in our refinement of the state machine, the list of control inputs and outputs is as follows.

Control Signal and Conditional Inputs:
    Reset
    Wait
    IR<15:14>
    AC<15>

Control Signal Outputs:
    0 → PC
    PC + 1 → PC
    PC → MAR
    MAR → Memory Address Bus
    Memory Data Bus → MBR
    MBR → Memory Data Bus
    MBR → IR
    MBR → AC
    AC → MBR
    AC + MBR → AC
    IR<13:0> → MAR
    IR<13:0> → PC
    1 → Read/Write
    0 → Read/Write
    1 → Request

Figure 11.24 gives a revised block diagram showing the flow of signals between the control, datapath, and memory.

In deriving the state diagram of the previous subsection, we assumed there was a direct path between any source and destination of a register transfer operation that we needed. Figure 11.25 shows the implications of this assumption for our datapath. We label the connections with the instruction type (Load, Store, Add, Branch) or stage (Fetch, Decode, Execute) that makes use of the path.

We must now determine which of these point-to-point connections can be combined into shared buses. We can combine connections when they are never (or infrequently) used in the same state.
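The combining rule just stated can be checked mechanically. The Python sketch below lists the register-to-register connections used on a few transitions of the state diagram and reports any transition in which two different sources would have to drive the same candidate bus. The transition labels, the particular grouping into address, memory, and result buses, and the function name conflicts are our own illustrative choices; the hard-wired AC to ALU A path is left out because it is never shared.

```python
# Connections (source, destination) used together on selected transitions.
# The labels are informal names for this sketch, not state names.
TRANSFERS = {
    "leave reset":     [("PC", "MAR")],
    "end of fetch":    [("MBR", "IR")],
    "decode to load":  [("IR", "MAR")],
    "decode to store": [("IR", "MAR"), ("AC", "MBR")],
    "decode to add":   [("IR", "MAR")],
    "load completes":  [("MBR", "AC")],
    "add completes":   [("MBR", "ALU B"), ("ALU", "AC")],
    "branch taken":    [("IR", "PC")],
}

# Candidate grouping of connections into three shared buses.
THREE_BUS = {
    ("PC", "MAR"): "ABUS", ("IR", "MAR"): "ABUS", ("IR", "PC"): "ABUS",
    ("MBR", "IR"): "MBUS", ("MBR", "ALU B"): "MBUS", ("MBR", "AC"): "MBUS",
    ("ALU", "AC"): "RBUS", ("AC", "MBR"): "RBUS",
}

# Alternative: every connection forced onto one shared bus.
ONE_BUS = {connection: "BUS" for connection in THREE_BUS}


def conflicts(bus_of, transfers=TRANSFERS):
    """Return {transition: bus} wherever two sources drive one bus at once."""
    found = {}
    for name, connections in transfers.items():
        driver = {}
        for source, destination in connections:
            bus = bus_of[(source, destination)]
            # setdefault records the first driver; a different later
            # driver on the same bus is a conflict.
            if driver.setdefault(bus, source) != source:
                found[name] = bus
    return found


print(conflicts(THREE_BUS))  # {}
print(conflicts(ONE_BUS))    # {'decode to store': 'BUS', 'add completes': 'BUS'}
```

With the three-bus grouping no transition needs a bus twice, whereas a single bus would force the store and add transitions to be split across additional states.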
Datapath Interconnections

Since instruction fetch and operand fetch take place in different states of the state machine, we can use a single bus to connect the IR, PC, and MAR. Similarly, the connections between the MBR and the IR, ALU B, and AC can be combined in a single bus. The Store and Add paths between the ALU, AC, and MBR can be combined as well, yielding the three-bus architecture of Figure 11.26. This is almost identical to the datapath of Figure 11.13.

Figure 11.23 Complete state diagram.

Figure 11.24 Processor signal flow.

With this organization we can implement the transfer operation AC + MBR → AC in a single state. Otherwise we would need to revise the portion of the state diagram for the ADD execution sequence to reflect the true sequence of transfers needed to implement this operation.

In Figure 11.26 the AC is the only register connected to more than one bus (it can be loaded from the Result Bus or the Memory Bus). This is called a dual-ported configuration, and it requires additional hardware.

It is useful to try to reuse existing connections whenever possible. By using an ALU component that has the ability to pass its B input through to the output, we can implement the load path from the MBR to the AC in the same manner as the add path. We simply instruct the ALU to PASS B rather than ADD A and B. This yields the three-bus architecture introduced in Figure 11.13, eliminating the extra datapath complexity associated with a dual-ported AC.
We assume this organization throughout the rest of this subsection.

Figure 11.25 Point-to-point connections implied by the state diagram.

Figure 11.26 Three-bus processor datapath with dual-ported AC.

Implementation of the Register Transfer Operations

Now that we have settled on the connections supported by the datapath, we are ready to examine how register transfer operations are implemented. A datapath control point is a signal that causes the datapath to perform some operation when it is asserted. Some control operations, such as ADD, PASS B, 0 → PC, and PC + 1 → PC, are implemented directly by the ALU and PC functional units. For other operations, such as PC → MAR, we have to assert more than one control point within the datapath.

These more detailed control signals are often called microoperations. Thus, we can decompose a register transfer operation into one or more microoperations, and there is one microoperation for each control point (for example, a register load or tri-state enable control input) in the datapath.

As an example, let's examine the register transfer operation PC → MAR. To implement this, the PC must be gated to the Address Bus while the MAR is loaded from the same bus. In terms of microoperations, PC → MAR is decomposed into PC → ABUS and ABUS → MAR.

Figure 11.27 shows how these operations manipulate control points in the datapath. The PC is a loadable counter, attached to the ABUS via tri-state buffers. The MAR is a loadable register whose parallel load inputs are driven from the ABUS.
Asserting the microoperation PC → ABUS connects the PC to the ABUS. Asserting the ABUS → MAR microoperation loads the MAR from the ABUS.

Figure 11.27 PC-to-MAR transfer with microoperations.

Timing of Register Transfer Operations

Figure 11.28 shows the timing for these signals. The waveform begins with entering state RES, followed by advancing to state IF0. In this timing diagram we assume the Reset signal is debounced and synchronized with the system clock and 0 → PC is directly tied to the synchronized Reset. We use positive edge-triggered registers and counters with synchronous control inputs throughout. Although we assume positive logic in this timing diagram, you should realize that most components come with active low control signals.

The Reset signal is captured by a synchronizing flip-flop on the first rising edge in the figure. A propagation delay later, the synchronized version of the reset signal is presented as an input to the control. No matter what state the machine is in, the next state is RES if Reset is asserted.

The 0 → PC microoperation is hardwired to the synchronized Reset signal. The synchronous counter CLR input takes effect at the next rising edge. This coincides with the transition into state RES.

Once we are in state RES, we assume that the Reset input becomes unasserted. Otherwise we would loop in the state, continuously setting the PC to 0 until Reset is no longer asserted. With Reset unasserted, IF0 is the next state and the microoperations PC + 1 → PC, PC → ABUS, and ABUS → MAR are asserted.

Figure 11.28 Timing of state changes and microoperations.

Because of the way they are implemented in the datapath,
some of these operations take place immediately while others are delayed until the next clock edge/entry into the next state. For example, asserting PC → ABUS turns on a tri-state buffer. This takes place immediately. Microoperations like PC + 1 → PC (counter increment) and ABUS → MAR (register load) are synchronous and therefore are deferred to the next clock event.

In the waveform, soon after entry into RES with Reset removed, we gate the PC onto the ABUS. Even though the PC count signal is asserted, it will not take effect until the next rising edge, so the ABUS correctly receives 0.

On the next rising edge, the MAR latches the ABUS and the PC is incremented. Because the increment propagation delay exceeds the hold time on the MAR load signal, the 0 value of the PC is still on the ABUS at the time the load is complete. When a bus is a destination, the microoperation usually takes place immediately; if a register is a destination, the microoperation's effect is usually delayed.

Tabulation of Register Transfer Operations and Microoperations

The relationships between register transfer operations and microoperations are listed below, with each register transfer followed by its microoperations:

0 → PC
    0 → PC (delayed);
PC + 1 → PC
    PC + 1 → PC (delayed);
PC → MAR
    PC → ABUS (immediate), ABUS → MAR (delayed);
MAR → Address Bus
    MAR → Address Bus (immediate);
Data Bus → MBR
    Data Bus → MBR (delayed);
MBR → Data Bus
    MBR → Data Bus (immediate);
MBR → IR
    MBR → MBUS (immediate), MBUS → IR (delayed);
MBR → AC
    MBR → MBUS (immediate), MBUS → ALU B (immediate), ALU PASS B (immediate), ALU Result → RBUS (immediate), RBUS → AC (delayed);
AC → MBR
    AC → ALU A (immediate, hard-wired), ALU PASS A (immediate), ALU Result → RBUS (immediate, hard-wired), RBUS → MBR (delayed);
AC + MBR → AC
    AC → ALU A (immediate, hard-wired),
    MBR → MBUS (immediate), MBUS → ALU B (immediate), ALU ADD (immediate), ALU Result → RBUS (immediate, hard-wired), RBUS → AC (delayed);
IR<13:0> → MAR
    IR → ABUS (immediate), ABUS → MAR (delayed);
IR<13:0> → PC
    IR → ABUS (immediate), ABUS → PC (delayed);
1 → Read/Write
    Read (immediate);
0 → Read/Write
    Write (immediate);
1 → Request
    Request (immediate);

Some of these operations can be eliminated because a connection is dedicated to a particular function and thus does not have to be controlled explicitly. AC → ALU A and ALU Result → RBUS are examples of this, since the AC is the only register that connects to ALU A and the ALU Result is the only source of the RBUS.

This leads us to the revised microoperation signal flow of Figure 11.29. Two control signals go to memory, Read/Write and Request, and 16 signals go to the datapath. The control has a total of five inputs: the two op code bits, the high-order bit of the AC, the memory Wait signal, and the external Reset signal. It is critical that the latter two be synchronized to the control clock.

Figure 11.29 Revised processor signal flow.

Chapter Review

In this chapter, we have examined the fundamental structure of computers. A computer, like many digital systems, consists of datapath and control. The datapath contains the storage elements (registers) that hold operands, the functional units (ALUs, shifter registers) that operate on data, and the interconnections (buses) between them.

The computer's control is nothing more than a finite state machine. It cycles through a collection of states that fetch the next instruction from memory, decode this instruction to determine its type, and then execute the instruction.
The control executes the instruction by asserting signals to the datapath to cause it to move data from registers to functional units, perform operations, and return the results to the registers.

Register transfer operations provide a notation for describing functional unit operations and the data movements between registers and functional units. The register transfer operations are normally written in a form that is independent of the detailed interconnections supported by the datapath.

Once the datapath interconnections are determined, we replace each register transfer operation by a sequence of microoperations. These correspond to detailed control signals to the datapath that must be asserted to cause a register transfer operation to take place.

Computers are interesting because they are particularly complex digital hardware systems. The datapath is not where this complexity comes from. It comes from the control portion of the machine. In the next chapter, we will look at ways to organize the complex control state machine of a digital computer.

Further Reading

Unfortunately, most textbooks on logic design do not provide much coverage on computer organization. After all, a detailed treatment of computer architectures, instruction sets, and their implementation is a topic for another book. Notable exceptions include Johnson and Karim, Digital Design: A Pragmatic Approach, PWS Engineering, Boston, 1987, and Prosser and Winkel, The Art of Digital Design, 2nd ed., Prentice-Hall, Englewood Cliffs, NJ, 1987. Both of these have several chapters on computer structures and their implementations in hardware.

For a historical perspective on how computer architectures have developed, a wonderful book is D. Patterson and J. Hennessy, Computer Architecture: A Quantitative Approach, Morgan-Kaufman, Redwood City, CA, 1990.
Exercises

11.1 (Register Transfer and State Diagrams) Assume that you have a bus-connected assembly of a 4-bit subtractor and four registers, as shown in Figure Ex11.1. All registers are positive edge-triggered, and registers R2 and O have tri-state outputs. All buses are 4 bits wide. You are to perform the following sequence of register transfer operations: (1) compute R1 - R0, latching the result into R2, (2) display the result on the LEDs attached to register O, and (3) replace [...] with the result.

a. Tabulate all of the register transfer operations and their detailed microoperations that are supported by this datapath organization.
b. Create a timing waveform for the control signals LD (load), S (select), and OE (output enable) to implement this sequence in a minimum number of clock cycles. This diagram should include traces for all of the control signals in the figure: R0:LD, R1:LD, [...], R2:LD, R2:OE, O:LD, O:OE. Recall that the state changes on the positive edge of the clock, and don't forget to incorporate signal propagation delays in your timing waveforms.
c. Show the state diagram, annotated with control signal assertions and their corresponding register transfer operations, that corresponds to the timing waveform you filled out in part (b).

Figure Ex11.1 Four-bit datapath for Exercise 11.1.

11.2 (Memory Interface) A read-only memory and a microprocessor need to communicate asynchronously. The processor asserts a signal to indicate READ and sets the ADDRESS signals. The memory asserts COMPLETE when the data is available on the DATA signals. Draw two simple state diagrams that indicate the behavior of the processor and the memory controller for a simple read operation. The READ and COMPLETE signals should follow the four-cycle handshake protocol. Assume that the memory controller cycles in an initial state waiting for a memory READ request.
11.3 (Memory Interface) For the memory interface and state diagrams in this chapter, we assumed a Princeton architecture. In this exercise, you will rederive these for a Harvard architecture.

a. Modify the memory interface of Figure 11.5 for a Harvard architecture. Provide two separate memory interfaces, one for instructions and one for data.

b. What changes are necessary in the state diagram of Figure 11.23 to reimplement this machine for a Harvard architecture?

c. What new register transfer operations are needed to support this alternative memory interface? Modify Figure 11.24 to reflect these additions.

11.4 (Instruction Prefetch) In Exercise 11.3 you modified the memory interface to provide separate instruction and data memories. Modify the state diagram of Figure 11.23 to allow the next instruction to be fetched while the current instruction is finishing its execution. Tabulate the register transfer operations used in each state.

11.5 (Control State Diagram) The state diagram derived in Section 11.3.2 assumed a synchronous Mealy implementation. Rederive the state diagram, but this time assume a Moore machine implementation. Associate the appropriate register transfer operations with the states you derive. Also describe for each state the function, such as instruction fetch, operand fetch, or decode, that it is implementing.

11.6 (Datapath Design) Consider the following portion of a simple instruction set encoded in 4-bit words. The machine has a single accumulator (R0), a rotating shifter (R1), four general-purpose registers (R0, R1, R2, R3), four accumulator/shifter-oriented instructions (COMP, INC, RSR, ASR), and two register-register instructions (ADD, AND). In a register transfer-like notation, these instructions are defined as follows:

  Op Code    Op Code
  (Binary)   (Symbolic)   Function
  00 X1X0    ADD          R[0] := R[0] + R[X1X0]
  01 X1X0    AND          R[0] := R[0] AND R[X1X0]
  10 00      COMP         R[0] := ~R[0]
  10 01      INC          R[0] := R[0] + 1
  10 10      RSR          R[1]<0> := R[1]<1>, R[1]<1> := R[1]<2>,
                          R[1]<2> := R[1]<3>, R[1]<3> := R[1]<0>;
  10 11      ASR          R[1]<0> := R[1]<1>, R[1]<1> := R[1]<2>,
                          R[1]<2> := R[1]<3>, R[1]<3> := R[1]<3>;

Note that X1X0 represents a 2-bit encoding of the operand register. RSR is a logical rotating shift right: the 4-bit word is shifted from left to right, with the low-order bit replacing the high-order bit. ASR is an arithmetic shift right: the high-order bit fills the bit to its right during the shift while retaining its value. If the register contains signed data, the shift has the effect of dividing by 2 whether the stored number is positive or negative.

a. Design a point-to-point datapath that is appropriate for this instruction set fragment. Assume that you can use an appropriately designed ALU and a shifter as functional units in your datapath. You may use multiplexers wherever you need them. Draw the register transfer diagram associated with your design.

b. Tabulate the register transfer operations and microoperations supported by your datapath.

c. Consider the execution of the ADD instruction. Draw a state diagram fragment that shows the sequencing of control signal assertions to implement the ADD instruction. How many states are required to execute the instruction?

d. Repeat part (c), but this time for the RSR instruction.

11.7 (Datapath Design) Repeat Exercise 11.6, but this time design the datapath using a single-bus interconnection scheme.

11.8 (Datapath Design) Repeat Exercise 11.6, but this time design a compromise multiple-bus datapath. Your goal should be to reduce the number of states it takes to implement the ADD and RSR instructions, short of using the point-to-point scheme.
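As a quick check on the RSR and ASR definitions in Exercise 11.6, the bit-level transfers can be sketched in Python. This is an illustrative sketch only: the function names are hypothetical, and the 4-bit word is modeled as an integer masked to its low four bits.

```python
def rsr(word):
    """Rotating shift right on a 4-bit word: bit 0 wraps around into bit 3."""
    word &= 0xF
    return (word >> 1) | ((word & 0x1) << 3)


def asr(word):
    """Arithmetic shift right on a 4-bit word: bit 3 fills bit 2 and keeps its value."""
    word &= 0xF
    return (word >> 1) | (word & 0x8)


def to_signed(word):
    """Interpret a 4-bit pattern as a twos complement number (-8 to +7)."""
    return word - 16 if word & 0x8 else word


if __name__ == "__main__":
    for value in (0b0011, 0b0110, 0b1100, 0b1011):
        print(f"{value:04b}  RSR -> {rsr(value):04b}  ASR -> {asr(value):04b}")
```

For signed data, ASR divides by 2 with the result rounded toward negative infinity, so the pattern for -5 becomes the pattern for -3.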
11.9 (Processor State Machine) The instruction set fragment of Exercise 11.6 provides no way to load the registers. We add the following multiple-word instructions to accomplish these functions:

  1100 Y3Y2Y1Y0            XFER   R[Y3Y2] := R[Y1Y0]
  1101 Y7Y6Y5Y4 Y3Y2Y1Y0   LD     R[0] := MEM[Y7Y6Y5Y4 Y3Y2Y1Y0]
  1110 Y7Y6Y5Y4 Y3Y2Y1Y0   ST     MEM[Y7Y6Y5Y4 Y3Y2Y1Y0] := R[0]

The XFER instruction, encoded in two adjacent words in memory, replaces the register indicated by the high-order 2 bits of the instruction's second word with the register denoted by its 2 low-order bits. The LD (load) and ST (store) instructions are encoded in three adjacent words: the first denotes the instruction, the second and third the memory address (this machine can address up to 256 four-bit words). For completeness, the last instruction is BRN, Branch if R0 is Negative:

  1111 Y7Y6Y5Y4 Y3Y2Y1Y0   BRN    IF R[0]<3> = 1 THEN PC := Y7Y6Y5Y4 Y3Y2Y1Y0

a. Draw the state diagram fragment for the instruction fetch and operation decode, given that an instruction may be encoded in one, two, or three 4-bit words.

b. Draw the memory interface register transfer diagram and the control's set of registers (PC, IR, possibly others).

c. What new register transfer operations and microoperations are added by these four instructions?

11.10 (Datapath Design) How does the register-to-register transfer operation (XFER) affect your datapath designs in Exercises 11.6 (point-to-point), 11.7 (single bus), and 11.8 (multiple buses)?

11.11 (Datapath Design) Put together a unified datapath, integrating the control registers (PC, IR, etc.) from Exercise 11.9 with your datapaths from Exercises 11.6, 11.7, and 11.8. How many processor states (total clock cycles) does it take to implement the ADD and RSR instructions in each datapath? Explain how you derived your state/cycle count.

11.12 (Datapath Design) Consider the following change to the instruction set description of Exercise 11.6.
The RSR and ASR are eliminated from the instruction set and replaced by the following two instructions:

  1010   CLC   CARRY := 0
  1011   ADC   R[0] := R[0] + CARRY, CARRY := Carry out from ALU

The ADD instruction now saves its carry-out into a special Carry register. This is cleared by the CLC instruction and can be added to R0 by the ADC (add with carry) instruction.

a. How should the datapath be modified to support these instructions? Assume the ALU can perform only standard ADD, INCrement, AND, and COMPlement operations.

b. Draw the state diagram fragment implementing the execution sequences for the ADC and CLC instructions.

c. Describe how the execution sequence for ADD is changed by introducing these instructions.

11.13 (Processor Datapath and Control) Consider the instruction format for a 16-bit computer in Figure Ex11.13. The high-order 4 bits specify the operation code. Every instruction contains two operands. The first operand is always one of the processor's general-purpose registers, which is specified by the Reg A field in the instruction (the machine has four general-purpose registers, R0, R1, R2, and R3). The second operand is always in memory. Its address is formed by the sum of the contents of a register, indicated by the Reg B field, and the offset value within the instruction. The machine's initial datapath is shown in Figure Ex11.13. The ALU implements ADD, SUB, and so forth.

Figure Ex11.13 Initial datapath for Exercise 11.13.

Instruction format (16 bits):
  Bits 15-12   Op Code   (4 bits)
  Bits 11-10   Reg A     (2 bits)
  Bits 9-8     Reg B     (2 bits)
  Bits 7-0     Offset    (8 bits)

a. Tabulate the microoperations implied by the above datapath. Group the operations by common sources or destinations.

b. Write a sequence of register transfer microoperations to implement the execution of an Add instruction (assume the instruction has already been fetched and decoded), given the following "macrodefinition" of the add:

  ADD Ra, (Rb)offset     Ra := Ra + Memory[Rb + Offset]

For example, if the instruction is "Add R0, (R1)10" and R1 contains 5, then the contents of memory location 15 are added to the contents of R0, with the result stored back into R0. Assume that a memory access requires only a single state. Indicate any changes to or assumptions about the register transfer operations supported by the datapath to implement the instruction's execution sequence (you may need to add additional paths or operations). In your answer, show how register transfer operations should be grouped into states.

c. Write a sequence of register transfer microoperations to implement the execution of a Branch Negative instruction (assume the instruction has already been fetched and decoded), given the following macrodefinition of the branch:

  BRN Ra, (Rb)offset     If Ra < 0 then PC := Rb + Offset

However, unlike the add instruction, the offset in the branch instruction is a twos complement number. In other words, the offset can be interpreted as a number between +127 and -128. State any additions to or assumptions about the datapath that must be made to implement the execution sequence of this instruction. As in part (b), group the register transfer operations into states.

11.14 (Processor Specification and Datapath Design) In this exercise, you will describe the architecture of a simple 4-bit computer. The machine is organized around an evaluation stack, an ALU that supports the twos complement operations SUB, ADD, and INCRement and the logical operators AND, OR, and COMPlement, and a rotating shifter supporting both logical rotating shifts and arithmetic shifts. The machine supports 16 different instructions, including those for accessing memory, call/return from subroutine, and conditional/unconditional branches. These instructions are encoded in one to three 4-bit word parcels. The machine can address 256 four-bit words, organized as a Harvard architecture.
A stack is a data structure in which the element last added is the first to be deleted. Hence, stacks are often called last in, first out data structures. Items are PUSHed to the top of the stack and POPed from the top of the stack. Consider the expression 9 - (5 + 2). This can be implemented by the following sequence of stack-oriented operations:

  PUSHC 9   ; Push constant 9 to top of stack
  PUSHC 5   ; Push constant 5 to top of stack
  PUSHC 2   ; Push constant 2 to top of stack
  ADD       ; Top two elements of the stack are added
            ; Remove and replace with the number 7
  SUB       ; Top two elements of the stack are subtracted
            ; Remove and replace with the number 2

The sequence and its effects on the stack are shown in Figure Ex11.14(a). The expression can be rewritten without parentheses as 9 5 2 + -. We assume that the Top of Stack (TOS) pointer is initialized to -1 at processor restart. This represents an empty stack. The TOS pointer is incremented before an item is added to the stack, and is decremented after an item is removed.

Figure Ex11.14(a) Stack evaluation of 9 - (5 + 2).

Instructions are encoded in one to three four-bit words. Arithmetic, logical, and shift instructions are encoded in a single word: all four bits form the op code. The operands are implicitly the elements on the top of the stack. The arithmetic/logical instructions are ADD, SUB, AND, OR, COMP, INCR, RSR (rotating shift to the right), and ASR (arithmetic shift to the right), and their encodings are the following:

  0000   ADD    Mem[TOS-1] := Mem[TOS-1] + Mem[TOS]; TOS := TOS - 1;
  0001   SUB    Mem[TOS-1] := Mem[TOS-1] - Mem[TOS]; TOS := TOS - 1;
  0010   AND    Mem[TOS-1] := Mem[TOS-1] AND Mem[TOS]; TOS := TOS - 1;
  0011   OR     Mem[TOS-1] := Mem[TOS-1] OR Mem[TOS]; TOS := TOS - 1;
  0100   COMP   Mem[TOS] := ~Mem[TOS];
  0101   INCR   Mem[TOS] := Mem[TOS] + 1;
  0110   RSR    Mem[TOS]<0> := Mem[TOS]<1>, Mem[TOS]<1> := Mem[TOS]<2>,
                Mem[TOS]<2> := Mem[TOS]<3>, Mem[TOS]<3> := Mem[TOS]<0>;
  0111   ASR    Mem[TOS]<0> := Mem[TOS]<1>, Mem[TOS]<1> := Mem[TOS]<2>,
                Mem[TOS]<2> := Mem[TOS]<3>, Mem[TOS]<3> := Mem[TOS]<3>;

The next instruction group places data onto the stack or removes data from the stack. The PUSHC instruction is encoded in two 4-bit words, one that contains the op code and one that contains 4-bit twos complement data to place on the top of the stack. PUSH and POP are three words long: one for the op code and two subsequent 4-bit words that when taken together contain an 8-bit address:

  1000 X3X2X1X0            PUSHC data     TOS := TOS + 1; Mem[TOS] := X3X2X1X0;
  1001 A7A6A5A4 A3A2A1A0   PUSH address   TOS := TOS + 1; Mem[TOS] := Mem[A7A6A5A4 A3A2A1A0];
  1010 A7A6A5A4 A3A2A1A0   POP address    Mem[A7A6A5A4 A3A2A1A0] := Mem[TOS]; TOS := TOS - 1;

The remaining instructions are conditional/unconditional branches and subroutine call and return. All but RTS (Return from Subroutine) are encoded in three 4-bit words: one word for the op code and two words for the target address. The conditional branch instructions, BRZ and BRN, test the top of stack element for = 0 or < 0, respectively. If true, the PC is changed to the target address. In either case, the TOS element is left undisturbed. JSR (jump to subroutine) is a special instruction that is used to implement subroutines. The current value of the PC is placed on the stack and then the PC is changed to the target address. Any values placed on the stack by the subroutine must be POPed before it returns. The RTS instruction restores the PC from the value saved on the stack. The op codes for this group are 1011 (JSR), 1100 (RTS), 1101 (BRZ), 1110 (BRN), and 1111 (JMP); all but RTS are followed by the two address words A7A6A5A4 and A3A2A1A0.
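The stack discipline described in Exercise 11.14 (TOS starts at -1, is incremented before a push, and is decremented after a pop) can be sketched in Python as follows. This is a minimal illustration, not the exercise's datapath: the class name, method names, and stack depth are assumptions, and only PUSHC, ADD, and SUB are modeled, with values kept to 4 bits by masking.

```python
class StackMachine:
    """Toy model of the evaluation stack; names and depth are hypothetical."""

    def __init__(self, depth=16):
        self.mem = [0] * depth    # stack storage (depth is an assumption)
        self.tos = -1             # -1 represents an empty stack

    def pushc(self, value):
        # TOS := TOS + 1; Mem[TOS] := constant
        self.tos += 1
        self.mem[self.tos] = value & 0xF

    def _binary(self, op):
        # Mem[TOS-1] := Mem[TOS-1] op Mem[TOS]; TOS := TOS - 1
        result = op(self.mem[self.tos - 1], self.mem[self.tos])
        self.mem[self.tos - 1] = result & 0xF
        self.tos -= 1

    def add(self):
        self._binary(lambda a, b: a + b)

    def sub(self):
        self._binary(lambda a, b: a - b)

    def top(self):
        return self.mem[self.tos]


if __name__ == "__main__":
    m = StackMachine()
    for constant in (9, 5, 2):    # evaluates 9 - (5 + 2), written 9 5 2 + -
        m.pushc(constant)
    m.add()
    m.sub()
    print(m.top(), m.tos)
```

Because the words are masked to 4 bits, the value 9 is held as the bit pattern 1001; the subtraction still yields the pattern for 2.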
It is important to organize the processor state machine so that the PC always points to the next instruction to be fetched while executing the current instruction.

  JSR address   TOS := TOS + 1; Mem[TOS] := PC; PC := A7A6A5A4 A3A2A1A0;
  RTS           PC := Mem[TOS]; TOS := TOS - 1;
  BRZ address   IF Mem[TOS] = 0 THEN PC := A7A6A5A4 A3A2A1A0;
  BRN address   IF Mem[TOS] < 0 THEN PC := A7A6A5A4 A3A2A1A0;
  JMP address   PC := A7A6A5A4 A3A2A1A0;

Figure Ex11.14(b) Starting datapath design for Exercise 11.14.

Figure Ex11.14(b) shows an initial datapath design for this instruction set.

a. Extend this datapath with a Harvard architecture memory interface.

b. What register transfer operations and microoperations are supported by this datapath?

c. How can the microoperations you found in part (b) be used to implement the execution portion for each of the 16 instructions of the instruction set?

11.15 (Processor Controller Design) Design a state diagram for the processor and instruction set described in Exercise 11.14. You can choose a Moore or Mealy design style.

a. Draw the modified state design. Identify in general terms what is happening in each state.

b. What microoperations should be associated with each state?

c. For each of the 16 instructions in the instruction set, describe ... as required for its fetch and execution. ... the memory system responds in a single processor state, for the purposes of this calculation.

A man is known by the company he organizes.
- Ambrose Bierce

Any sufficiently advanced technology is indistinguishable from magic.
- Arthur C. Clarke

In this, our final chapter, we examine alternative ways to implement a processor's control portion. In real machines, the control unit is often the most complex part of the design. We study four alternative controller organizations. The first is based on classical finite state machines, using a Moore or Mealy structure. This approach is sometimes disparagingly called "random logic" control, to contrast it with more structured methods based on ROMs and other forms of programmable logic. The classical method is the only approach we used in Chapter 11.

The second method is called time state. It decomposes a single classical finite state machine into several simpler, communicating finite state machines. It is a strategy for partitioning a finite state machine that is well matched to the structure of processor controllers.

The third method makes use of jump counters, which we introduced in Chapter 10. This approach extensively uses MSI-level components like counters, multiplexers, and decoders to implement the controller.

The final method, microprogramming, uses ROMs to encode next states and control signals directly as bits stored in a memory. We examine three alternative microprogramming methods: ...

This homework help was uploaded on 04/22/2008 for the course ECSE 2610 taught by Professor Ji during the Spring '08 term at Rensselaer Polytechnic Institute.
