CA2 - Chapter 2: ISA and MIPS (Csci4203)


Instruction Set Architecture: What Must be Specified?

Instruction Fetch → Instruction Decode → Operand Fetch → Execute → Result Store → Next Instruction

• Instruction format or encoding
  – how is it decoded?
• Location of operands and result
  – where other than memory?
  – how many explicit operands?
  – how are memory operands located?
  – which can or cannot be in memory?
• Data type and size
• Operations
• Successor instruction
  – jumps, conditions, branches

Basic ISA Classes

Accumulator (1 register):
  1 address      add A           acc ← acc + mem[A]
  1+x address    addx A          acc ← acc + mem[A + x]
Stack:
  0 address      add             tos ← tos + next
General Purpose Register (can be memory/memory):
  2 address      add A B         EA[A] ← EA[A] + EA[B]
  3 address      add A B C       EA[A] ← EA[B] + EA[C]
Load/Store:
  3 address      add Ra Rb Rc    Ra ← Rb + Rc
                 load Ra Rb      Ra ← mem[Rb]
                 store Ra Rb     mem[Rb] ← Ra

Comparing Number of Instructions

Code sequence for C = A + B for four classes of instruction sets:

  Stack       Accumulator    Register             Register
                             (register-memory)    (load-store)
  Push A      Load A         Load R1,A            Load R1,A
  Push B      Add B          Add R1,B             Load R2,B
  Add         Store C        Store C,R1           Add R3,R1,R2
  Pop C                                           Store C,R3

Does this mean the load/store architecture is inferior to the others?

Let's try a different code sequence

Code sequence for C = A + B; D = A – B;

  Stack       Accumulator    Register             Register
                             (register-memory)    (load-store)
  Push A      Load A         Load R1,A            Load R1,A
  Push B      Add B          Add R1,B             Load R2,B
  Add         Store C        Store C,R1           Add R3,R1,R2
  Pop C       Load A         Load R2,A            Store C,R3
  Push A      Sub B          Sub R2,B             Sub R3,R1,R2
  Push B      Store D        Store D,R2           Store D,R3
  Sub
  Pop D

Registers can be viewed as a programmable cache.
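The "programmable cache" remark can be checked by counting how many instructions in each sequence name a memory operand. The following is a minimal sketch in C, not part of the original slides: the `Instr` type, the `memory_refs` function, and the array names are hypothetical, and the instruction lists are simply the four sequences for C = A + B; D = A – B transcribed as data.

```c
#include <assert.h>
#include <stddef.h>

/* One instruction of a code sequence. touches_memory is 1 when the
 * instruction names a memory operand (A, B, C or D), otherwise 0.
 * This type is an illustration only; it is not from the slides. */
typedef struct {
    const char *text;
    int touches_memory;
} Instr;

#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

const Instr stack_seq[] = {
    {"Push A", 1}, {"Push B", 1}, {"Add", 0}, {"Pop C", 1},
    {"Push A", 1}, {"Push B", 1}, {"Sub", 0}, {"Pop D", 1},
};

const Instr accumulator_seq[] = {
    {"Load A", 1}, {"Add B", 1}, {"Store C", 1},
    {"Load A", 1}, {"Sub B", 1}, {"Store D", 1},
};

const Instr reg_mem_seq[] = {
    {"Load R1,A", 1}, {"Add R1,B", 1}, {"Store C,R1", 1},
    {"Load R2,A", 1}, {"Sub R2,B", 1}, {"Store D,R2", 1},
};

const Instr load_store_seq[] = {
    {"Load R1,A", 1}, {"Load R2,B", 1}, {"Add R3,R1,R2", 0},
    {"Store C,R3", 1}, {"Sub R3,R1,R2", 0}, {"Store D,R3", 1},
};

/* Count the instructions in a sequence that reference memory. */
size_t memory_refs(const Instr *seq, size_t n) {
    assert(seq != NULL);
    size_t refs = 0;
    for (size_t i = 0; i < n; i++) {
        if (seq[i].touches_memory) {
            refs++;
        }
    }
    return refs;
}
```

With these tables, the stack, accumulator, and register-memory sequences each make six memory references, while the load-store sequence makes four, because A and B stay in R1 and R2 and are reused by the subtraction.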
General Purpose Registers Dominate

• 1975-2000: all machines use general purpose registers
• Advantages of registers
  – registers are faster than memory
  – registers are easier for a compiler to use
    e.g., (A*B) – (C*D) – (E*F) can do the multiplies in any order, unlike a stack (i.e., increased parallel processing potential)
  – registers can hold variables
    memory traffic is reduced, so the program is sped up (since registers are faster than memory)
    code density improves (since a register is named with fewer bits than a memory location)

Instructions:

• Language of the Machine
• We'll be working with the MIPS instruction set architecture
  – similar to other architectures developed since the 1980s
  – almost 100 million MIPS processors manufactured in 2002
  – used by NEC, Nintendo, Cisco, Silicon Graphics, Sony, …

[Figure: bar chart for 1998-2002 broken down by architecture (Other, SPARC, Hitachi SH, PowerPC, Motorola 68K, MIPS, IA-32, ARM); vertical axis from 0 to 1400]

Why MIPS? Simple and General

• A simple instruction set (probably the simplest among all RISCs) that can support general computation needs.
• Chapter 2 shows how it is represented in hardware and the relationship between a high-level language and this primitive one.

Why was it called MIPS?

• Does it mean "Million Instructions Per Second"?
• It was named for "Microprocessor without Interlocked Pipeline Stages" (because John Hennessy was an optimizing compiler writer)

MIPS has long since abandoned the no-interlock approach.
MIPS arithmetic

• All instructions have 3 operands
• Operand order is fixed (destination first)

  Example:
    C code:       a = b + c
    MIPS 'code':  add a, b, c

"The natural number of operands for an operation like addition is three…requiring every instruction to have exactly three operands, no more and no less, conforms to the philosophy of keeping the hardware simple"

MIPS arithmetic

• Design Principle 1: simplicity favors regularity.
• Of course this complicates some things...

    C code:     a = b + c + d;
    MIPS code:  add a, b, c
                add a, a, d

• Operands must be registers, only 32 registers provided
• Each register contains 32 bits (or 64 bits)

Four Design Principles

• Design Principle 1: simplicity favors regularity.
• Design Principle 2: smaller is faster.
• Design Principle 3: make the common case fast.
• Design Principle 4: good design demands good compromises.

Registers vs. Memory

• Arithmetic instruction operands must be registers
  – only 32 registers provided
• Compiler associates variables with registers
• What about programs with lots of variables?

Why don't we design a machine with 1000 registers?

Why not a 1000-register ISA?

• Instruction format – instructions will be much longer
• Register access time – small is fast
• How about data structures? Arrays?
• How about variables that have possible aliases, such as pointer-dereferenced variables?
• How about context switch time?
• ….
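The instruction-format objection above can be made concrete with a little arithmetic: naming one of N registers takes ceil(log2 N) bits, and a three-operand instruction needs three such fields. The sketch below is an illustration only; the function names `reg_specifier_bits` and `three_operand_bits` are hypothetical and not part of the course material.

```c
#include <assert.h>

/* Bits needed to name one of num_regs registers: ceil(log2(num_regs)). */
unsigned reg_specifier_bits(unsigned num_regs) {
    assert(num_regs >= 1 && num_regs <= 0x80000000u);
    unsigned bits = 0;
    /* Find the smallest bits such that 2^bits >= num_regs. */
    while ((1u << bits) < num_regs) {
        bits++;
    }
    return bits;
}

/* Bits taken by the register fields of a three-operand instruction. */
unsigned three_operand_bits(unsigned num_regs) {
    return 3u * reg_specifier_bits(num_regs);
}
```

With 32 registers the three fields take 15 bits of a 32-bit instruction. With 1000 registers they would take 30 bits, leaving only 2 bits for everything else, so the instruction would have to grow.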
Typical Operations

  Data Movement            Load (from memory)
                           Store (to memory)
                           memory-to-memory move
                           register-to-register move
                           input (from I/O device)
                           output (to I/O device)
                           push, pop (to/from stack)
  Arithmetic               integer (binary + decimal) or FP
                           Add, Subtract, Multiply, Divide
  Shift                    shift left/right, rotate left/right
  Logical                  not, and, or, xor, set, clear
  Control (Jump/Branch)    unconditional, conditional
  Subroutine Linkage       call, return
  Interrupt                trap, return
  Synchronization          test & set (atomic r-m-w)
  String                   search, translate
  Graphics (MMX)           parallel subword ops (4 16-bit add)
  Special                  popcount, first-one, parity, data encrypt, ...

Memory Organization

• Viewed as a large, single-dimension array, with an address.
• A memory address is an index into the array
• "Byte addressing" means that the index points to a byte of memory.

Byte Address vs. Word Address

• Bytes are nice, but most data items use larger "words"
• For MIPS, a word is 32 bits or 4 bytes. For Cray, a word is 64 bits. Cray and DEC are word-addressable machines.
• 2^32 bytes with byte addresses from 0 to 2^32 – 1
• Words are aligned, i.e., what are the least 2 significant bits of a word address?

[Figure: memory words at byte addresses 0, 4, 8, 12, ..., each holding 32 bits of data. Registers hold 32 bits of data.]

Addressing Bytes in Word-Addressable Machines

• Need to use logical operations (including shift operations) to examine individual bytes or even bits.
• In scientific/engineering computers, words (or double words) are used more frequently. However, strings and text are popular in commercial computers. Handling bytes on a word-addressable machine incurs high overhead (perhaps not a good compromise).
• Most machines enforce a natural alignment requirement for load/store instructions.
• Some machines support "extract" and "deposit" instructions to handle bytes/bits in registers.
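The shift-and-mask work that a word-addressable machine needs in order to touch a single byte can be sketched in C as follows. This is an illustration under assumptions, not any machine's actual "extract"/"deposit" instruction: the function names are hypothetical, the word is 32 bits, and byte 0 is taken to be the least significant byte.

```c
#include <assert.h>
#include <stdint.h>

/* Extract byte number index (0 = least significant) from a 32-bit word. */
uint8_t extract_byte(uint32_t word, unsigned index) {
    assert(index < 4);
    return (uint8_t)((word >> (8u * index)) & 0xFFu);
}

/* Return word with byte number index replaced by value. */
uint32_t deposit_byte(uint32_t word, unsigned index, uint8_t value) {
    assert(index < 4);
    uint32_t shift = 8u * index;
    uint32_t mask = (uint32_t)0xFFu << shift;   /* selects the target byte */
    return (word & ~mask) | ((uint32_t)value << shift);
}
```

Each byte read costs a shift and a mask on top of the word load, and each byte write costs a load, a mask, a shift, an OR, and a store. That is the overhead the slide refers to.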
Instructions

• Load and store instructions
• Example:

    C code:     A[12] = h + A[8];
    MIPS code:  lw  $t0, 32($s3)
                add $t0, $s2, $t0
                sw  $t0, 48($s3)

• Can refer to registers by name (e.g., $s2, $t2) instead of number
• Store word has destination last
• Remember arithmetic operands are registers, not memory!

    Can't write:  add 48($s3), $s2, 32($s3)

Our First Example

• Can we figure out the code?

    swap(int v[], int k)
    { int temp;
      temp = v[k];
      v[k] = v[k+1];
      v[k+1] = temp;
    }

    swap:
      muli $2, $5, 4
      add  $2, $4, $2
      lw   $15, 0($2)
      lw   $16, 4($2)
      sw   $16, 0($2)
      sw   $15, 4($2)

Machine Language

• Instructions, like registers and words of data, are also 32 bits long
  – Example: add $t0, $s1, $s2
  – registers have numbers: $t0 = 8, $s1 = 17, $s2 = 18
• R-Format:

    000000   10001   10010   01000   00000   100000
    op       rs      rt      rd      shamt   funct

• Can you guess what the field names stand for?

Constants

• How to specify a constant (immediate)?
  – A = A + 1 or B = B * 5
• MIPS has instructions for immediate operands

    addi $29, $29, 2
    andi $29, $29, 6
    ori  $29, $29, 4
    slti $29, $29, 4

• A new format is needed, called I-format or I-type

    op   rs   rt   16 bit number

Constants (cont.)

• Questions
  – Why not leave constants in memory and just load them?
  – Why not just use the shamt field and the R-format?
    Small constants are used more frequently; the most frequently used constants are 0 and 1.
  – How about shifts (e.g., sll, srl)? Why are they not considered immediate instructions?
    e.g.  sll $29, $29, 4

How about larger constants?
• We'd like to be able to load a 32 bit constant into a register
• Must use two instructions; new "load upper immediate" instruction

    lui $t0, 1010101010101010

    $t0:   1010101010101010   0000000000000000     (lower half filled with zeros)

• Then must get the lower order bits right, i.e.,

    ori $t0, $t0, 1010101010101010

           1010101010101010   0000000000000000
    ori    0000000000000000   1010101010101010
           -----------------------------------
           1010101010101010   1010101010101010

I-format for load and store

• Example: lw $t0, 32($s2)

    35    18    8     32
    op    rs    rt    16 bit number

• Where's the compromise?

Where is the compromise?

• All instructions use the same length
• Require different instruction formats: R and I
• Large constants take more instructions to build
• Rely on program locality – offsets greater than 32768 will suffer
• It is difficult to support more than 32 registers

Stored Program Concept

• Instructions are bits
• Programs are stored in memory
  – to be read or written just like data

[Figure: Processor connected to Memory; memory holds data, programs, compilers, editors, etc.]

Stored Program Concept

[Figure: Processor connected to physical memory, which holds: Accounting program (machine code), Editor program (machine code), C compiler (machine code), Payroll data, Book text, Source code in C for the editor program]

Control (branch instruction)

• Decision making instructions
  – alter the control flow,
  – i.e., change the "next" instruction to be executed
• MIPS conditional branch instructions:

    bne $t0, $t1, Label
    beq $t0, $t1, Label

• Example: if (i==j) h = i + j;

            bne $s0, $s1, Label
            add $s3, $s0, $s1
    Label:  ....

Control (jump instruction)

• MIPS unconditional branch instructions:

    j label

• Example:

    if (i!=j)            beq $s4, $s5, Lab1
      h=i+j;             add $s3, $s4, $s5
    else                 j   Lab2
      h=i-j;       Lab1: sub $s3, $s4, $s5
                   Lab2: ...

• Can you build a simple "for loop"?

Control Flow

• We have: beq, bne; what about Branch-if-less-than?
• New instruction:

    slt $t0, $s1, $s2       if $s1 < $s2 then $t0 = 1 else $t0 = 0

• Can use this instruction to build "blt $s1, $s2, Label"
  – can now build general control structures
  – this is a MIPS pseudoinstruction!! Why??
• Note that the assembler needs a register to do this
  – there are policy-of-use conventions for registers

Build a while loop

    while (s[i] == k) i++;

Assume that i and k are in $s3 and $s5, and the base of array s is in $s6.

    Loop: sll  $t1, $s3, 2
          add  $t1, $t1, $s6
          lw   $t0, 0($t1)
          bne  $t0, $s5, Exit
          addi $s3, $s3, 1
          j    Loop
    Exit:

Any optimization opportunity? Assume i is not used outside the loop and the address of s[i] is already in $t1:

    Loop: lw   $t0, 0($t1)
          bne  $t0, $s5, Exit
          addi $t1, $t1, 4
          j    Loop
    Exit:

Case/Switch Statement

Most programming languages have a case or switch statement to select one of many alternatives depending on a single value. How can case/switch be implemented using MIPS control instructions?

We may use a chain of if-then-else, but this is inefficient when the number of cases is large.

We may encode the targets in a table of addresses called a jump address table. The compiler can use the selection value to index into the table. We need an indirect jump instruction to support jump tables.
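The jump address table idea can be imitated in C with an array of function pointers. This is only an analogy and a minimal sketch: a compiler would emit a table of code addresses and an indirect jump rather than function calls, and every name here (`case_handler`, `jump_table`, `dispatch`, and the `case_*` functions) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*case_handler)(int);

/* Bodies of the individual cases; arbitrary work for illustration. */
int case_zero(int x)    { return x + 1; }
int case_one(int x)     { return x * 2; }
int case_two(int x)     { return x - 3; }
int case_default(int x) { return x; }

/* The table of targets: entry k holds the target for selection value k. */
static const case_handler jump_table[] = { case_zero, case_one, case_two };

int dispatch(unsigned selector, int x) {
    size_t n = sizeof jump_table / sizeof jump_table[0];
    if (selector >= n) {
        /* Out-of-range selection values take the default case. */
        return case_default(x);
    }
    assert(jump_table[selector] != NULL);
    /* One indexed load plus one indirect transfer, whatever n is. */
    return jump_table[selector](x);
}
```

The cost of `dispatch` does not grow with the number of cases, whereas an if-then-else chain performs up to one comparison per case.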
MIPS has the JR (Jump Register) instruction. JR is also used as the return branch (jr $31).

Addresses in Branches and Jumps

• Instructions:

    bne $t4,$t5,Label      Next instruction is at Label if $t4 != $t5
    beq $t4,$t5,Label      Next instruction is at Label if $t4 = $t5
    j   Label              Next instruction is at Label

• Formats:

    I    op   rs   rt   16 bit address
    J    op   26 bit address

• Addresses are not 32 bits
  – How do we handle this with load and store instructions?

Addresses in Branches

• Instructions:

    bne $t4,$t5,Label      Next instruction is at Label if $t4 ≠ $t5
    beq $t4,$t5,Label      Next instruction is at Label if $t4 = $t5

• Formats:

    I    op   rs   rt   16 bit address

• Could specify a register (like lw and sw) and add it to the address
  – use the Instruction Address Register (PC = program counter)
  – most branches are local (principle of locality)
• Jump instructions just use the high order bits of the PC

Addresses in Branches

• When the address of a branch target is greater than 16 bits (but less than 26 bits)
  – Can use trampoline code
  – Can apply code repositioning optimization
• When the address of a branch target is greater than 26 bits
  – Need to use the JR (Jump Register) instruction
  – Can use a multiple-trampoline code sequence
  – Forming such a large address in a register can be costly
  – Jump register instructions are usually more expensive due to possible branch misprediction.

Compare & Branch vs. Condition Code

• IA-32 (i.e., x86) uses condition codes
• MIPS uses compare and branch for equal and not-equal (which account for 86% of cases for integer code, and 37% for FP code)
• MIPS uses multiple condition codes (i.e., in registers) for more general branches (le, lt, gt, ge, ...)
• Advantages:
  – fewer instructions
  – increased instruction level parallelism

Supporting Procedures

• Procedures/functions are commonly used to structure programs.
  (perhaps the most important feature to support in an ISA)
• Some machines have specific instructions to support function calls (e.g., calls in DEC VAX; SUN Sparc and Intel IA64 support register windows).
• Main steps involved in a function call:
  – parameter passing
  – control transfer (call)
  – acquire storage (activation record or frame)
  – saving registers
  – do computation
  – restore registers
  – store return value
  – control transfer (return)

MIPS Support for Procedures

• Passing parameters in registers (software convention)
  – $a0-$a3 ($4-$7) for the first 4 arguments
  – $v0-$v1 ($2-$3) for return values
  – $ra ($31) for the return address
• JAL instruction for call
  – Jump and Link: jumps to the target and saves the address of the following instruction in $ra.
• JR for return
• Register partition
  – temporary registers: $t0-$t9 (8-15, 24-25); caller save
  – saved registers: $s0-$s7 (16-23)

Register Partition

• Caller saves $t registers if they are live across the call instruction.
• Callee saves $s registers if they are used in the procedure.

Example: what to save? (caller)

What registers need to be saved?

[Figure: caller code. Registers referenced before the call: $t1, $t3, $s5. Then jal swap. Registers referenced after the call: $t0, $s3, $t3.]

Example: where to insert save/restore (caller)

The caller saves $t3:

    sw  $t3, x($sp)
    jal swap
    lw  $t3, x($sp)

Example: what to save? (callee)

What registers need to be saved?

[Figure: callee code. Registers referenced: $t1, $t3, $s5, $t0, $s3. Registers to save: $s3, $s5.]

Example: where to insert save/restore? (callee)

    sw  $s3, x($sp)
    sw  $s5, y($sp)
    ...
    lw  $s3, x($sp)
    lw  $s5, y($sp)
    jr  $ra

Best Case: no save/restore required

• Caller: no temporary registers are live across the call
• Callee: is a leaf, and no saved registers are used (only uses $t registers)

Accessing Variables

• Two classes: automatic and static

Automatic variables are local to procedures; they are stored on the stack, allocated on entry and discarded on exit of the procedure. Static variables exist across procedure calls.
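The difference between the two classes is visible directly in C. The sketch below is illustrative only and uses hypothetical names (`calls_so_far`, `next_ticket`): one variable lives for the whole program, one is recreated on every call.

```c
#include <assert.h>

/* Declared outside any procedure: static storage, exists for the whole run. */
int calls_so_far = 0;

int next_ticket(void) {
    static int ticket = 100;  /* static: keeps its value across calls */
    int step = 1;             /* automatic: created on entry, discarded on exit */
    assert(step > 0);
    ticket += step;
    calls_so_far++;
    return ticket;
}
```

Calling `next_ticket()` twice returns 101 and then 102: `ticket` remembers its value between calls, while `step` is set up afresh each time the procedure is entered.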
Variables declared outside procedures, or with the static keyword, are static. Static variables are accessed through $gp ($28), called the global pointer. Automatic variables are accessed through $fp ($30), called the frame pointer.

Allocating Space on the Stack

[Figure: positions of $fp and $sp relative to the frame at call and at return; labels: high address, frame, call, return]

What is stored in the frame?

• The frame is also called the activation record
• $fp points to the first word of the frame
• The frame contains
  – Saved argument registers
  – Saved return address
  – Saved $sp and $fp
  – Saved registers (if any)
  – Local variables and structures
  – Spilled registers

Local variable allocation and layout

    int a, b, c;
    double f;
    char buffer[80];

    a         $fp-0
    b         $fp-4
    c         $fp-8
    f         $fp-16
    buffer    $fp-104

Memory Allocation

    Stack
    Dynamic data (heap)
    Static data
    Text
    reserved

How can viruses/worms happen?

[Figure: old frame above new frame; the new frame contains the buffer]

When the buffer overflows, it can write into the bookkeeping information in the previous frame.

So far:

• Instructions

    Instruction              Meaning
    add $s1,$s2,$s3          $s1 = $s2 + $s3
    sub $s1,$s2,$s3          $s1 = $s2 – $s3
    lw  $s1,100($s2)         $s1 = Memory[$s2+100]
    sw  $s1,100($s2)         Memory[$s2+100] = $s1
    bne $s4,$s5,Label        Next instr. is at Label if $s4 ≠ $s5
    beq $s4,$s5,Label        Next instr. is at Label if $s4 = $s5
    j   Label                Next instr. is at Label

• Formats:

    R    op   rs   rt   rd   shamt   funct
    I    op   rs   rt   16 bit address
    J    op   26 bit address

MIPS R3000 Instruction Set Architecture (Summary)

• Instruction Categories
  – Load/Store
  – Computational
  – Jump and Branch
  – Floating Point (coprocessor)
  – Memory Management
  – Special
• Registers: R0 - R31, PC, HI, LO
• 3 Instruction Formats: all 32 bits wide

    OP   rs   rt   rd   sa   funct
    OP   rs   rt   immediate
    OP   jump target

Software Convention for Registers

    Name       Register number   Usage
    $zero      0                 the constant value 0
    $v0-$v1    2-3               values for results and expression evaluation
    $a0-$a3    4-7               arguments
    $t0-$t7    8-15              temporaries
    $s0-$s7    16-23             saved
    $t8-$t9    24-25             more temporaries
    $gp        28                global pointer
    $sp        29                stack pointer
    $fp        30                frame pointer
    $ra        31                return address

Register 1 ($at) is reserved for the assembler; 26-27 ($k0, $k1) are reserved for the operating system.

Assembly Language vs. Machine Language

• Assembly provides a convenient symbolic representation
  – much easier than writing down numbers
  – e.g., destination first
• Machine language is the underlying reality
  – e.g., destination is no longer first
• Assembly can provide 'pseudoinstructions'
  – e.g., "move $t0, $t1" exists only in assembly
  – would be implemented using "add $t0,$t1,$zero"
• When considering performance you should count real instructions

To summarize:

MIPS Operands

    Name                Example                                    Comments
    32 registers        $s0-$s7, $t0-$t7, $zero, $fp, $sp,         Fast locations for data. In MIPS, data must be in
                        $ra, $at, $a0-$a3, $v0-$v1                 registers to perform arithmetic. $zero is always 0.
                                                                   Register $at is reserved for the assembler to handle
                                                                   large constants.
    2^30 memory words   Memory[0], Memory[4], ...,                 Accessed only by data transfer instructions. MIPS uses
                        Memory[4294967292]                         byte addresses, so sequential words differ by 4. Memory
                                                                   holds data structures, such as arrays and saved registers.

MIPS assembly language

    Category             Instruction               Example               Meaning                                    Comments
    Arithmetic           add                       add $s1, $s2, $s3     $s1 = $s2 + $s3                            Three operands; data in registers
                         subtract                  sub $s1, $s2, $s3     $s1 = $s2 - $s3                            Three operands; data in registers
                         add immediate             addi $s1, $s2, 100    $s1 = $s2 + 100                            Used to add constants
    Data transfer        load word                 lw $s1, 100($s2)      $s1 = Memory[$s2 + 100]                    Word from memory to register
                         store word                sw $s1, 100($s2)      Memory[$s2 + 100] = $s1                    Word from register to memory
                         load byte                 lb $s1, 100($s2)      $s1 = Memory[$s2 + 100]                    Byte from memory to register
                         store byte                sb $s1, 100($s2)      Memory[$s2 + 100] = $s1                    Byte from register to memory
                         load upper immediate      lui $s1, 100          $s1 = 100 * 2^16                           Loads constant in upper 16 bits
    Conditional branch   branch on equal           beq $s1, $s2, 25      if ($s1 == $s2) go to PC + 4 + 100         Equal test; PC-relative branch
                         branch on not equal       bne $s1, $s2, 25      if ($s1 != $s2) go to PC + 4 + 100         Not equal test; PC-relative
                         set on less than          slt $s1, $s2, $s3     if ($s2 < $s3) $s1 = 1; else $s1 = 0       Compare less than; for beq, bne
                         set less than immediate   slti $s1, $s2, 100    if ($s2 < 100) $s1 = 1; else $s1 = 0       Compare less than constant
    Unconditional jump   jump                      j 2500                go to 10000                                Jump to target address
                         jump register             jr $ra                go to $ra                                  For switch, procedure return
                         jump and link             jal 2500              $ra = PC + 4; go to 10000                  For procedure call

Five Addressing Modes

1. Immediate addressing: fields op, rs, rt, Immediate; the operand is the immediate itself.
2. Register addressing: fields op, rs, rt, rd, ..., funct; the operand is a register.
3. Base addressing: fields op, rs, rt, Address; register + Address selects a byte, halfword, or word in memory.
4. PC-relative addressing: fields op, rs, rt, Address; PC + Address selects a word in memory.
5. Pseudodirect addressing: fields op, Address; Address combined with the PC selects a word in memory.

Translating and Starting a Program

    C program
      → compiler (cc1)
    Assembly language program
      → assembler (as)
    Object: machine code module  +  Object: library routines
      → linker (ld)
    Executable (a.out)
      → loader
    Memory

Linker

• Symbol resolution
  – associates each symbol reference with exactly one definition
• Relocation
  – relocates each section by associating a memory location with each symbol definition, and then modifies each reference to point to the correct memory location.

Loader

• Reads the executable file header to determine the size of the text and data segments
• Creates the address space
• Copies instructions and data into memory
• Copies parameters to the stack
• Initializes machine registers and $sp
• Jumps to a start-up routine (crt0?).

How Do Compilers Optimize?

[Figure: compiler structure; components shown: Front-End, Translation & Code Gen, Optimizer, Final Object Code, IR (Intermediate Representation)]

How Do Compilers Optimize?
[Figure: compiler structure; components shown: Front-End, High-level Optimizer, Code Gen, Low-level Optimizer; intermediate representations: HIR, LIR]

Types of Optimizations and examples

    Optimization name                                                      gcc level
    (High-level) Procedure inlining, loop parallelization                  O3
    (local) CSE, constant folding and propagation                          O1
    (global) code motion, loop optimization, global CSE,                   O2
      register promotion, …
    (machine dependent) register allocation, instruction scheduling,       O1/O2
      strength reduction, …

Alternative Architectures

• Design alternative:
  – provide more powerful operations
  – goal is to reduce the number of instructions executed
  – danger is a slower cycle time and/or a higher CPI
• Let's look (briefly) at IA-32

IA-32

• 1978: The Intel 8086 is announced (16 bit architecture)
• 1980: The 8087 floating point coprocessor is added
• 1982: The 80286 increases address space to 24 bits, adds instructions
• 1985: The 80386 extends to 32 bits, new addressing modes
• 1989-1995: The 80486, Pentium, Pentium Pro add a few instructions (mostly designed for higher performance)
• 1997: 57 new "MMX" instructions are added, Pentium II
• 1999: The Pentium III added another 70 instructions (SSE)
• 2001: Another 144 instructions (SSE2)

IA-32 (cont.)
• 2003: AMD extends the architecture to increase address space to 64 bits, widens all registers to 64 bits, and makes other changes (AMD64)
• 2004: Intel capitulates and embraces AMD64 (calls it EM64T) and adds more media extensions
• "This history illustrates the impact of the "golden handcuffs" of compatibility"
  "adding new features as someone might add clothing to a packed bag"
  "an architecture that is difficult to explain and impossible to love"

IA-32 Overview

• Complexity:
  – Instructions from 1 to 17 bytes long
  – one operand must act as both a source and destination
  – one operand can come from memory
  – complex addressing modes, e.g., "base or scaled index with 8 or 32 bit displacement"
• Saving grace:
  – the most frequently used instructions are not too difficult to build
  – compilers avoid the portions of the architecture that are slow

IA-32 Registers and Data Addressing

• Registers in the 32-bit subset that originated with the 80386

    Name      Use (bits 31 to 0)
    EAX       GPR 0
    ECX       GPR 1
    EDX       GPR 2
    EBX       GPR 3
    ESP       GPR 4
    EBP       GPR 5
    ESI       GPR 6
    EDI       GPR 7
    CS        Code segment pointer
    SS        Stack segment pointer (top of stack)
    DS        Data segment pointer 0
    ES        Data segment pointer 1
    FS        Data segment pointer 2
    GS        Data segment pointer 3
    EIP       Instruction pointer (PC)
    EFLAGS    Condition codes

IA-32 Register Restrictions

• Registers are not "general purpose" – note the restrictions below

[Table not reproduced in this text]

IA-32 Typical Instructions

• Four major types of integer instructions:
  – Data movement including move, push, pop
  – Arithmetic and logical (destination register or memory)
  – Control flow (use of condition codes / flags)
  – String instructions, including string move and string compare

IA-32 Instruction Formats

• Typical formats (notice the different lengths); field widths are in bits:

    a. JE EIP + displacement     JE (4) | Condition (4) | Displacement (8)
    b. CALL                      CALL (8) | Offset (32)
    c. MOV EBX, [EDI + 45]       MOV (6) | d (1) | w (1) | r/m Postbyte (8) | Displacement (8)
    d. PUSH ESI                  PUSH (5) | Reg (3)
    e. ADD EAX, #6765            ADD (4) | Reg (3) | w (1) | Immediate (32)
    f. TEST EDX, #42             TEST (7) | w (1) | Postbyte (8) | Immediate (32)

Summary

• Instruction complexity is only one variable
  – lower instruction count vs. higher CPI / lower clock rate
• Design Principles:
  – simplicity favors regularity
  – smaller is faster
  – good design demands compromise
  – make the common case fast
• Instruction set architecture
  – a very important abstraction indeed!
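The first summary point, lower instruction count against higher CPI or lower clock rate, follows from the standard performance relation: execution time = instruction count × CPI / clock rate. The sketch below only evaluates that relation; the function name `cpu_time_seconds` is hypothetical and the numbers in the note afterwards are made up for illustration, not measurements of any real machine.

```c
#include <assert.h>

/* Execution time in seconds = instruction count * CPI / clock rate (Hz). */
double cpu_time_seconds(double instruction_count, double cpi, double clock_hz) {
    assert(instruction_count >= 0.0 && cpi > 0.0 && clock_hz > 0.0);
    return instruction_count * cpi / clock_hz;
}
```

For example, a hypothetical design that executes 1.0e9 instructions at a CPI of 1.5 on a 1 GHz clock takes 1.5 s, while a design with more powerful operations that needs only 0.8e9 instructions but has a CPI of 2.0 at the same clock takes 1.6 s. The lower instruction count did not pay off.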