An Example of How a Computer Really Works

A computer is a complex system consisting of many different components. But at the heart -- or the brain, if you want -- of the computer is a single component that does the actual computing. This is the Central Processing Unit, or CPU. In a modern desktop computer, the CPU is a single "chip" on the order of one square inch in size. The job of the CPU is to execute programs.

A program is simply a list of unambiguous instructions meant to be followed mechanically by a computer. A computer is built to carry out instructions that are written in a very simple type of language called machine language. Each type of computer has its own machine language, and the computer can directly execute a program only if the program is expressed in that language. (It can execute programs written in other languages if they are first translated into machine language.)

When the CPU executes a program, that program is stored in the computer's main memory (also called the RAM or random access memory), along with the data that is being used or processed by the program. When the CPU needs to access the program instruction or data in a particular location, it sends the address of that information as a signal to the memory; the memory responds by sending back the data contained in the specified location. The CPU can also store information in memory by specifying the information to be stored and the address of the location where it is to be stored.

On the level of machine language, the operation of the CPU is fairly straightforward (although exactly how it is implemented is quite complicated). The CPU executes a program that is stored as a sequence of machine language instructions in main memory. It does this by repeatedly reading, or fetching, an instruction from memory and then carrying out, or executing, that instruction.
This process -- fetch an instruction, execute it, fetch another instruction, execute it, and so on forever -- is called the fetch-and-execute cycle. This is about all that the CPU ever does. In this way, a computer executes machine language programs mechanically -- that is, without understanding them or thinking about them. This is not an easy concept, and that is the reason we are taking this time to demonstrate the CPU's operation. (Don't worry, we won't ask you to write any programs in machine language in this class.)

The other reason we are introducing this material is as a demonstration of a computer language whose behavior can be quite easily understood. While Java code will be much easier to write than machine code, it can be more difficult to understand what a line of Java code does. Understanding machine code gives us a model we can use to explain what features of Java code really mean.

CPU Internals:

The CPU contains a few internal registers, which are small memory units capable of holding a single number. The CPU uses one of these registers -- the program counter, or PC -- to keep track of where it is in the program it is executing. The PC stores the address of the next instruction that the CPU should execute. At the beginning of each fetch-and-execute cycle, the CPU checks the PC to see which instruction it should fetch. During the course of the fetch-and-execute cycle, the number in the PC is updated to indicate the instruction that is to be executed in the next cycle. (Usually, but not always, this is just the instruction that sequentially follows the current instruction in the program.)

In addition to the PC, the simple processor we'll consider has 8 "general purpose" registers (called r0 - r7). Each of these registers is large enough to hold 4 bytes of data, and these registers are used to hold the values with which the computer is currently working.
The processor also includes 3 condition code registers (called N, Z, and P); each of these registers holds a single bit.

[Figure: the processor, containing the general-purpose registers r0 through r7, the PC, and the condition codes N, Z, and P.]

Instructions:

Instructions have two parts: 1) the opcode, which specifies what operation an instruction performs, and 2) the operands, which specify the registers or memory locations used in performing the operation. For the purpose of this discussion, it is sufficient to consider a processor that has only 6 different opcodes; real processors often have hundreds of different opcodes, but many exist for the sake of efficiency. We list them below:

ZERO_REG: The ZERO_REG instruction takes a single operand, the name of a register, and it writes a zero into that register.

Example:
ZERO_REG r1 (The contents of register r1 are overwritten with a zero.)

ADD: The ADD instruction takes three operands. The first two operands specify the values that should be added together -- we call these "source" operands. The first source must specify a register, but the second source can be either a register or a small constant value specified in the instruction itself. The result of the addition of these two values is written to the register specified by the third operand -- what we call the "destination" register.

Example:
ADD r1 + r2 -> r3 (The contents of register r1 are added to the contents of register r2 and the result is stored in register r3.)
ADD r4 + 1 -> r4 (The contents of register r4 are added to the constant 1 and the result is written back to register r4, overwriting the old value.)

SUB: The SUB instruction corresponds closely to the ADD instruction, with 2 source operands (one register value and either a second register value or a small constant) and one destination register operand, but instead of adding together the two source operands, the second is subtracted from the first. The result of this subtraction is stored in the destination register.

Example:
SUB r5 - r3 -> r2 (The contents of register r3 are subtracted from the contents of register r5 and the result is written into register r2.)

LOAD: The LOAD instruction copies a value from memory into a register. It has two operands -- one source register and one destination register -- and includes a small constant in the instruction. The memory address from which to load is computed by adding together the contents of the source register and the small constant. Because a register holds 4 bytes, we copy not only the byte at the computed address, but also the 3 bytes that follow it in memory (address+1, address+2, and address+3). These 4 bytes are written into the specified destination register. The values in memory do not change.

Example:
LOAD r1 <- [r4 + 8] (The contents of register r4 are added to the value 8 to compute a memory address. If we assume that register r4 holds the value 20, then this load would compute the address 28. The bytes stored at memory addresses 28, 29, 30, and 31 would be copied to register r1.)

STORE: The STORE instruction copies a value from a register to memory. It has two source register operands and includes a small constant. Like the LOAD instruction, the contents of a register are added to the small constant to compute a memory address. The other source register specifies the value to be copied to the four bytes in memory starting at the computed address. The values in the source registers do not change.

Example:
STORE r7 -> [r0 + 20] (The contents of register r0 (assume it held the value 20) would be added to the constant 20 to compute the memory address 40. The contents of register r7 would overwrite the values stored at memory locations 40, 41, 42, and 43.)

FOR ALL OF THE ABOVE: After executing any of the above instructions, the instruction immediately after the current instruction should be executed. This is accomplished by adding 2 to the value in the PC register, because each of these instructions is two bytes long (as we'll see below).

In addition, anytime a general-purpose register is written (which occurs in the ZERO_REG, ADD, SUB, and LOAD instructions), the condition code registers are also updated. The condition codes are called N, Z, and P, and they record whether the last value written to a general-purpose register was Negative, Zero, or Positive, respectively. At all times, exactly one of these registers will hold a 1 and the other two will hold the value 0. Which register is set to 1 is based on the value written to the general-purpose register: if the value was negative, the condition codes will be set to N=1, Z=0, and P=0; if the value was zero, N=0, Z=1, P=0; and if the value was positive, N=0, Z=0, P=1.

BRANCH: Unlike the previous instructions, the branch instruction doesn't read or write general-purpose registers or memory; it reads only the condition codes and writes only the PC register. Two pieces of information are specified as part of the branch: 1) which condition codes should be checked, and 2) how many instructions to skip. If any of the condition codes checked are set to one, then the branch will be TAKEN; otherwise, the PC is set to the next sequential instruction (PC + 2).
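The condition-code rule and the branch decision described above can be sketched in a few lines of Python. This is only an illustration: the function names condition_codes and branch_taken, and the idea of writing the checked codes as a string such as "NZ", are our own assumptions and are not part of the machine.

```python
def condition_codes(value):
    """Return the (N, Z, P) bits for a value just written to a register."""
    n = 1 if value < 0 else 0
    z = 1 if value == 0 else 0
    p = 1 if value > 0 else 0
    return n, z, p


def branch_taken(checked, codes):
    """Return True if any condition code named in `checked` is set.

    `checked` is a string such as "NZ" or "NZP" (an empty string checks none);
    `codes` is the (N, Z, P) tuple produced by condition_codes().
    """
    current = dict(zip("NZP", codes))
    return any(current[name] == 1 for name in checked.upper())


codes = condition_codes(5 - 9)      # e.g. a SUB whose result is negative
print(codes)                        # (1, 0, 0)
print(branch_taken("NZ", codes))    # True
print(branch_taken("P", codes))     # False
print(branch_taken("", codes))      # False (checks nothing, so never taken)
```

Note that exactly one of the three bits is 1 for any value, which is why a branch that checks all three codes is always taken.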
If the branch is taken, then the new PC is computed as:

PC + 2 + 2 * (number of instructions to skip)

There are eight possible settings for which condition codes a branch checks:

NZP = branch always
NZ = branch if value was LESS THAN OR EQUAL TO ZERO
NP = branch if value was NOT EQUAL TO ZERO
N = branch if value was LESS THAN ZERO
ZP = branch if value was GREATER THAN OR EQUAL TO ZERO
Z = branch if value was EQUAL TO ZERO
P = branch if value was GREATER THAN ZERO
- = branch never

Example:
BR.NZ 1 (If the previously written value was either negative or zero -- i.e., if either the N or Z condition code is set -- then set the PC to PC+4, skipping 1 instruction; otherwise, continue to the next instruction by setting PC to PC+2.)
BR.NZP -16 (Always set the PC to PC-30, since PC + 2 + 2 * (-16) = PC - 30; this instruction will always branch because we're guaranteed that one of the N, Z, and P condition codes will be set.)

Storing Instructions in Memory:

Machine language instructions are expressed as binary numbers, just like any value stored in memory. So, a machine language instruction is just a sequence of zeros and ones. Each particular sequence encodes some particular instruction. In the machine we consider here, every instruction is encoded in 16 bits, using the 4 most-significant bits to specify the opcode. In addition to the opcode, each instruction specifies 0, 1, or 2 source registers (labeled SR1, SR2 and BaseR) and 0 or 1 destination registers (DR), each of which takes 3 bits to specify because 3 bits are required to name the 8 possible registers (2^3 = 8). Some instructions include a small signed constant value of 5 or 6 bits (constant5 and offset6, respectively). The branch instruction uses 3 bits to specify the condition codes (n, z, p) it monitors and a signed number of instructions to skip (PCoffset9).

The instructions are encoded as follows:
[Figure: bit-level encodings (bits 15 through 0) of the ZERO_REG, ADD, SUB, LOAD, STORE, and BR instructions, using the fields DR, SR, SR1, SR2, BaseR, constant5, offset6, n/z/p, and PCoffset9. A + marks instructions that modify the condition codes.]
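To make the idea of fields inside a 16-bit word concrete, here is a minimal Python sketch that slices an instruction word into its possible fields. The bit positions are an assumption read off the 16-bit words in this handout's example listing (for instance, load r3 <- [r2+12] is given there as 0110 011 010 001100). The helper names bits, sign_extend, and decode_fields and the dictionary keys are hypothetical, and the sketch deliberately does not map opcode numbers to instruction names.

```python
def bits(word, high, low):
    """Extract bits high..low (inclusive, bit 0 = least significant)."""
    return (word >> low) & ((1 << (high - low + 1)) - 1)


def sign_extend(value, width):
    """Interpret a `width`-bit field as a two's-complement signed number."""
    if value & (1 << (width - 1)):
        return value - (1 << width)
    return value


def decode_fields(word):
    """Split a 16-bit instruction word into every field it could contain.

    Which fields are meaningful depends on the opcode; this sketch only
    slices the bits (assumed layout, see the text above).
    """
    return {
        "opcode": bits(word, 15, 12),
        "reg_11_9": bits(word, 11, 9),   # DR or SR, or the n/z/p bits of a branch
        "reg_8_6": bits(word, 8, 6),     # SR1 or BaseR
        "reg_2_0": bits(word, 2, 0),     # SR2 in the two-register form
        "constant5": sign_extend(bits(word, 4, 0), 5),
        "offset6": sign_extend(bits(word, 5, 0), 6),
        "PCoffset9": sign_extend(bits(word, 8, 0), 9),
    }


word = int("0110 011 010 001100".replace(" ", ""), 2)
fields = decode_fields(word)
print(fields["opcode"], fields["reg_11_9"], fields["reg_8_6"], fields["offset6"])
# prints: 6 3 2 12
```

Read as a LOAD, the decoded fields say "destination register 3, base register 2, offset 12", which matches load r3 <- [r2+12]. The sign_extend helper is what lets a 9-bit field hold a negative number of instructions to skip.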
There are two encodings for each of the ADD and SUB instructions since they can each either take two source registers, or one source register and a small constant.

In the fetch part of the fetch-and-execute cycle, the address in the PC register is supplied to the memory and 2 bytes (16 bits) are read and returned to the CPU. The CPU then inspects the four opcode bits of the instruction to determine what operation will be performed and how the remaining bits of the instruction should be interpreted.

Again, our intention in showing you the binary representation of instructions is to demonstrate that instructions can be stored in memory (further demonstrating that there is no magic in how a computer works). We in no way expect you to memorize these encodings. Examples of instruction encodings are included with the code example below.

Example Code:

To demonstrate the execution of a machine language program, we use the following algorithm written in pseudo-code, which finds the highest quiz score from a series of quiz scores:
1. get num_quizzes
2. count = 0
3. highest_score = 0
4. while count < num_quizzes:
   4.1 temp = get quiz grade
   4.2 if temp > highest_score:
       4.2.1 highest_score = temp
   4.3 count = count + 1
5. display highest_score

To demonstrate the execution of this algorithm as a machine language program, we need to perform two mappings: 1) we need to map the state of the algorithm (e.g., count, highest_score) to locations in memory, and 2) we need to convert the algorithm to machine code.

The mapping of state is quite straightforward given our discussion of data modelling; one mapping is shown below. (Note: we are assuming each of these values is being stored in a 4-byte integer, so each value starts at an address 4 bytes after the previous one.)

address               contents
a0                    count
a4                    highest_score
a8                    num_quizzes
a12, a16, a20, a24    quizzes (one quiz score per 4-byte slot)

Below we show the implementation of the algorithm as machine code. Again, our goal here is not to teach you how to write machine code, but rather to show that algorithms can be implemented in machine code and to demonstrate their execution on a simple processor. We have attempted to show the correspondence between the algorithm and the machine code.

address   instruction            binary representation

(put a zero in a register)
a100      zero_reg r1            0101 001 000100000

2. count = 0
a102      store r1 -> [r1+0]     0111 001 001 000000

3. highest_score = 0
a104      store r1 -> [r1+4]     0111 001 001 000100

4. while count < num_quizzes: (load count, load num_quizzes, compare 2 values with subtract, branch)
a106      load r2 <- [r1+0]      0110 010 001 000000
a108      load r3 <- [r1+8]      0110 011 001 001000
a110      sub r3 - r2 -> r4      0001 100 011 0 00 010
a112      br.nz 12               0000 110 000001100

4.1 temp = get quiz grade (load count, compute 4*count, address = 12 + 4*count, load "count"th quiz grade)
a114      load r2 <- [r1+0]      0110 010 001 000000
a116      add r2 + r2 -> r2      0001 010 010 0 00 010
a118      add r2 + r2 -> r2      0001 010 010 0 00 010
a120      load r3 <- [r2+12]     0110 011 010 001100

4.2 if temp > highest_score: (load highest_score, compare to temp w/subtract, branch)
a122      load r4 <- [r1+4]      0110 100 001 000100
a124      sub r3 - r4 -> r5      0001 101 011 0 00 100
a126      br.nz 1                0000 110 000000001

4.2.1 highest_score = temp
a128      store r3 -> [r1+4]     0111 011 001 000100

4.3 count = count + 1 (load count, add one, store back to memory)
a130      load r2 <- [r1+0]      0110 010 001 000000
a132      add r2 + 1 -> r2       0001 010 010 1 00001
a134      store r2 -> [r1+0]     0111 010 001 000000

go back to "while"
a136      br.nzp -16             0000 111 111110000

5. display highest_score
a138      (display ...)

Because it would be tedious to write up the execution of this code, view the video of the professor manually executing this code. This example execution uses a collection of 3 quiz scores (8, 5, 9). Based on the algorithm above, what would you expect the final value of highest_score to be? Does this match what the machine code execution computes?
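If you would like to check your answer without the video, one option is to simulate the processor in software. The following Python sketch is our own illustration, not the course's tool. It transcribes the listing as tuples (the tuple layouts and the names PROGRAM, run, and highest_score are made up for this example) and makes several assumptions the handout does not state: the 4 bytes of a value are stored most-significant byte first, num_quizzes and the quiz scores are already in memory before the program starts (step 1 has no machine code), the condition codes start out as if a zero had been written, and reaching a138 simply stops the run.

```python
import struct

# Tuple layouts (hypothetical): ("add"/"sub", src1, src2, dst),
# ("addi", src1, constant, dst), ("load", dst, base, offset),
# ("store", src, base, offset), ("br", checked_codes, skip), ("zero_reg", dst)
PROGRAM = {
    100: ("zero_reg", 1),
    102: ("store", 1, 1, 0),      # count = 0
    104: ("store", 1, 1, 4),      # highest_score = 0
    106: ("load", 2, 1, 0),       # r2 = count
    108: ("load", 3, 1, 8),       # r3 = num_quizzes
    110: ("sub", 3, 2, 4),        # r4 = num_quizzes - count
    112: ("br", "NZ", 12),        # leave the loop when count >= num_quizzes
    114: ("load", 2, 1, 0),
    116: ("add", 2, 2, 2),        # r2 = 2 * count
    118: ("add", 2, 2, 2),        # r2 = 4 * count
    120: ("load", 3, 2, 12),      # r3 = temp (the next quiz score)
    122: ("load", 4, 1, 4),       # r4 = highest_score
    124: ("sub", 3, 4, 5),        # r5 = temp - highest_score
    126: ("br", "NZ", 1),         # skip the store unless temp > highest_score
    128: ("store", 3, 1, 4),      # highest_score = temp
    130: ("load", 2, 1, 0),
    132: ("addi", 2, 1, 2),       # r2 = count + 1
    134: ("store", 2, 1, 0),
    136: ("br", "NZP", -16),      # go back to "while"
}


def run(scores, start=100, halt=138):
    """Execute PROGRAM on the given scores and return highest_score (a4)."""
    memory = bytearray(64)
    struct.pack_into(">i", memory, 8, len(scores))        # num_quizzes at a8
    for i, score in enumerate(scores):                    # quizzes from a12
        struct.pack_into(">i", memory, 12 + 4 * i, score)
    regs = [0] * 8
    last = 0          # last value written to a register; N, Z, P derive from it
    pc = start
    while pc != halt:
        op, *args = PROGRAM[pc]       # fetch
        pc += 2                       # every instruction is 2 bytes long
        if op == "zero_reg":
            regs[args[0]] = last = 0
        elif op in ("add", "sub"):
            a, b, d = args
            regs[d] = last = regs[a] + regs[b] if op == "add" else regs[a] - regs[b]
        elif op == "addi":
            a, const, d = args
            regs[d] = last = regs[a] + const
        elif op == "load":
            d, base, off = args
            regs[d] = last = struct.unpack_from(">i", memory, regs[base] + off)[0]
        elif op == "store":
            s, base, off = args
            struct.pack_into(">i", memory, regs[base] + off, regs[s])
        elif op == "br":
            checked, skip = args
            codes = {"N": last < 0, "Z": last == 0, "P": last > 0}
            if any(codes[c] for c in checked):
                pc += 2 * skip        # pc already holds PC + 2
    return struct.unpack_from(">i", memory, 4)[0]


def highest_score(scores):
    """The pseudo-code algorithm, written directly in Python for comparison."""
    highest, count = 0, 0
    while count < len(scores):
        if scores[count] > highest:
            highest = scores[count]
        count = count + 1
    return highest


scores = [8, 5, 9]
print(run(scores) == highest_score(scores))   # True
```

The final line prints True when the simulated machine code and the pseudo-code agree. Printing run(scores) itself shows the value the machine code leaves in highest_score.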
- Fall '08