Components of CPU Performance

Course: EE 4720, Spring 2001
School: LSU
02-1 Components of CPU Performance and Performance Equation

Why is my computer fast (or slow)? What would it help to improve?

The CPU performance equation is one way to start answering these questions.

EE 4720 Lecture Transparency. Formatted 10:24, 21 January 2004 from lsli02.

02-2 CPU Performance Decomposed into Three Components

Clock Frequency ($\phi$): Determined by technology and influenced by organization.

Clocks per Instruction (CPI): Determined by ISA, microarchitecture, compiler, and program.

Instruction Count (IC): Determined by program, compiler, and ISA.

These combine to form the CPU performance equation,

$t_T = \frac{1}{\phi} \times \mathrm{CPI} \times \mathrm{IC}$,

where $t_T$ denotes the execution time.

02-3 CPU Performance: Simple System

Execution in program order ... one at a time.

[Figure: timeline of Instr. 1 through Instr. 500,000 executing one at a time; axes: Time/cycles and Time/μs.]

IC = 500,000; $\phi$ = 50 kHz; CPI = 4.

Execution time: IC × CPI × clock period.

Here (and only here) CPI is the number of cycles for each instruction.

02-4 Execution: Pipelined, In Order

To Run Faster: Overlap Instructions (Pipelined Execution)

Result must be the same as one-at-a-time execution ... not too difficult to achieve.

[Figure: timeline of overlapped instructions, Instr. 1 through Instr. 500,000; axes: Time/cycles and Time/μs.]

IC = 500,000; $\phi$ = 200 kHz; CPI = 750,000 / 500,000 = 1.5.

Execution time at best: IC × clock period ... assuming 1 cycle to start each instruction and that an instruction can start each cycle. (Slower in illustration.)
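As a rough check of the simple (one-at-a-time) and pipelined in-order examples, the performance equation can be evaluated directly. The sketch below is illustrative only; the function name `execution_time` and its parameter names are assumptions, not part of the course material.

```python
def execution_time(ic, cpi, clock_hz):
    """Return execution time in seconds: (1 / clock frequency) * CPI * IC."""
    return ic * cpi / clock_hz


# Simple system: one instruction at a time, 4 cycles each, 50 kHz clock.
simple = execution_time(ic=500_000, cpi=4, clock_hz=50_000)

# Pipelined, in order: 750,000 cycles for 500,000 instructions, 200 kHz clock.
pipelined = execution_time(ic=500_000, cpi=750_000 / 500_000, clock_hz=200_000)

print(f"simple:    {simple:.2f} s")     # 40.00 s
print(f"pipelined: {pipelined:.2f} s")  # 3.75 s
```

Both the lower CPI and the higher clock frequency contribute to the shorter time in the pipelined case; the instruction count is the same in both.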
02-5 Execution: Pipelined, Ideal Out of Order

To Run Even Faster: Overlap Instructions and Start Out of Order

Sometimes skip an instruction and execute it later.

[Figure: timeline of overlapped instructions started out of order, Instr. 1 through Instr. 500,000; axes: Time/cycles and Time/μs.]

IC = 500,000; $\phi$ = 200 kHz; CPI = 1.

Execution time at best: IC × clock period ... assuming 1 cycle to start each instruction and that an instruction can start each cycle.

02-6 Execution: Pipelined, Ideal Out of Order, Superscalar

To Run Fastest [1]: Overlap, Out of Order, Start n per Tick (n-Way Superscalar).

Requires about n times as much hardware. (Below, n = 2.)

[Figure: timeline with two instructions started per cycle, Instr. 1 through Instr. 500,000; axes: Time/cycles and Time/μs.]

IC = 500,000; $\phi$ = 500 MHz; CPI = 1/2.

Execution time at best: $\frac{1}{n} \times \mathrm{IC} \times$ clock period ... assuming 1 cycle to start each instruction and that an instruction can start each cycle.

[1] Using a conventional serial instruction set architecture.

02-7 Execution: Pipelined, Out of Order, Superscalar

Data from a real program, perl. CPI is 0.44.

Processor can start four instructions per cycle.

Colors show the steps in processing an instruction; yellow is execution.
    S_regmatch+104
    0x001610d8  stw   %o4, [ %l0 + 16 ]
    0x001610dc  cmp   %o1, 2
    0x001610e0  bne   +3i {S_regmatch+109}
    0x001610e4  lduw  [ %l0 + 24 ], %l6
    0x001610ec  mov   %o0, %i5
    0x001610f0  mov   %i1, %l3
    0x001610f4  cmp   %i1, 0
    0x001610f8  be    -100i {S_regmatch+12}
    0x001610fc  sethi %hi(0x1f0800), %g2
    0x00161100  lduh  [ %l3 + 2 ], %g2
    0x00161104  ldub  [ %l3 + 1 ], %o0
    0x00161108  sll   %g2, 2, %g2
    0x0016110c  add   %l3, %g2, %i1
    0x00161110  cmp   %i1, %l3
    0x00161114  bne   +3i {S_regmatch+122}
    0x00161118  mov   %o0, %l0
    0x00161120  cmp   %o0, 77
    0x00161124  bgu   +3509i {S_regmatch+3632}
    0x00161128  sethi %hi(0x219c00), %g2
    0x0016112c  sethi %hi(0x160c00), %g1
    0x00161130  sll   %o0, 2, %g3
    0x00161134  add   %g1, 512, %g2 {0x160e00}
    0x00161138  lduw  [ %g3 + %g2 ], %g3
    0x0016113c  jmp   %g3 + %g2

[Figure: pipeline execution diagram for the code above. Time 42,084,435; grid 20 instructions × 5 cycles.]

02-8 Component of CPU Performance: Instruction Count

Given a program, there are two ways instructions could be tallied:

Static Instruction Count: The number of instructions making up the program.

Dynamic Instruction Count (IC): The number of instructions executed in a run of the program.

For estimating performance, dynamic instruction count is used.

02-9 Instruction Counts

Example, assembler program that computes $a = \sum_{i=0}^{9} i$.

Written in Simplescalar assembler.

    IC
     1        move r5, r0        ! r0 is always zero.
     1        move r3, r0
        L23:                     ! Branch label.
    10        addu r5, r5, r3    ! Add unsigned.
    10        addu r3, r3, 1
    10        slt  r2, r3, 10    ! r2 = r3 < 10
    10        bne  r2, r0, L23   ! Branch to L23 if r2 not equal 0.

Static count: 6 (number of instructions).

Dynamic count: 42.

02-10 Component of CPU Performance: Clock Frequency

CPUs implemented using synchronous clocked logic.

Typical Clock Cycle

When clock switches from low to high, work starts.
While clock is high, work proceeds.

When clock goes from high to low, work should be complete.

Clock frequency determined by critical path.

Critical Path: Logic doing the most time-consuming work (in a cycle).

If clock frequency is too high, work will not be completed ... and so the system will not perform properly.

For high clock frequencies, keep critical paths short.

02-11 Component of CPU Performance: CPI

Cycles (clocks) per Instruction (CPI)

Oversimplified definition:

CPI: Average number of cycles needed to execute an instruction.

Better definition:

CPI: Number of cycles to execute some code divided by number of instructions.

Difference: Interested in the rate at which instructions are executed in a program ... not the time for any one instruction.

02-12 Review of CPU Performance Equation

$t_T = \frac{1}{\phi} \times \mathrm{CPI} \times \mathrm{IC}$,

where $t_T$ denotes the execution time.

Clock Frequency ($\phi$): Determined by technology and influenced by organization.

Clocks per Instruction (CPI): Determined by organization and instruction mix.

Instruction Count (IC): Determined by program and ISA.

02-13 Interaction of Execution Time Components

Tradeoffs between Clock Frequency, CPI, and Instruction Count

Increasing clock frequency ... reduces the work that can be done in a clock cycle ... and may limit instruction overlap, thereby increasing CPI.

Reducing IC (by adding "powerful" instructions to the ISA) ... may force implementors to increase CPI or lower clock frequency.

Balancing these is an important skill in computer design.

Since the ISA is usually fixed, IC is less of a factor.
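One way to make the clock frequency versus CPI tradeoff concrete is to ask how much CPI may rise before a faster clock stops paying off. The sketch below follows from the performance equation with IC held fixed; the function name `break_even_cpi` and the numbers used are hypothetical, chosen only for illustration.

```python
def break_even_cpi(cpi_old, clock_old_hz, clock_new_hz):
    """CPI at the new clock frequency that gives the same execution time
    as the old design, for a fixed instruction count.

    Execution time is CPI * IC / clock frequency, so equal times require
    cpi_new / clock_new_hz == cpi_old / clock_old_hz.
    """
    return cpi_old * clock_new_hz / clock_old_hz


# Hypothetical numbers, not taken from the lecture.
limit = break_even_cpi(cpi_old=1.0, clock_old_hz=1.0e9, clock_new_hz=1.25e9)
print(f"Faster clock helps only if the new CPI stays below {limit:.2f}")
```

If the redesigned processor's CPI ends up above this limit, the higher clock frequency yields a longer execution time, not a shorter one.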
02-14 Example: Trading off Execution Time Components

Company X is considering two clock frequencies for its next processor, 500 MHz or 400 MHz.

A 500 MHz implementation would execute instructions at 1.7 CPI, the 400 MHz part at 1.1 CPI.

Which would be faster?

Find time to execute 1 instruction.

500 MHz execution time: $1.7 \times 1 \times \frac{1}{500 \times 10^{6}}\,\mathrm{s} = 3.4\,\mathrm{ns}$.

400 MHz execution time: $1.1 \times 1 \times \frac{1}{400 \times 10^{6}}\,\mathrm{s} = 2.75\,\mathrm{ns}$.

The lower clock rate would nevertheless take less time.

Perhaps because at 500 MHz too much work had to be split into multiple cycles.

02-15 IC v. CPI Tradeoffs

Assumption

IC is based on output of a good compiler.

Compiler is tuned for a particular implementation.

Two Cases

1. Same ISA, different implementation.
2. Different ISA, (and of course) different implementation.

02-16 IC v. CPI Tradeoffs, continued.

Case 1: Same ISA, different implementation.

Newer implementation may have lower CPI on existing code ... but even better performance attainable by recompiling ... which may increase CPI.

Compiler writer selects instructions based on performance of implementation.

02-17

Consider two implementations:

Implementation A: add CPI 1 cycle, mul CPI 5 cycles.

Implementation B: add CPI 1 cycle, mul CPI 2 cycles.

Code computes 6x.

    ! Call original value of r1, x.

    ! Code for Implementation A.
    add r1, r1, r1   ! r1 = 2x
    add r2, r1, r1   ! r2 = 4x
    add r1, r1, r2   ! r1 = 6x

    ! Code for Implementation B.
    mul r1, r1, 6    ! r1 = 6x

Implementation A: IC = 3, CPI = 1. (Computing CPI will be covered later.)

Implementation B: IC = 1, CPI = 2.

Implementation B is faster despite higher CPI.

Code compiled for B will run slowly on A.
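The cycle counts in the 6x example can be tallied with a short script. This is a sketch under two assumptions that the example does not state: instructions do not overlap, and both implementations run at the same clock frequency, so cycles can be compared directly. The names `CYCLES`, `total_cycles`, and `cpi` are illustrative.

```python
# Cycles per instruction type for each implementation (from the example).
CYCLES = {
    "A": {"add": 1, "mul": 5},
    "B": {"add": 1, "mul": 2},
}

# Two instruction sequences that both compute 6x.
CODE_FOR_A = ["add", "add", "add"]
CODE_FOR_B = ["mul"]


def total_cycles(code, implementation):
    """Sum the cycles of each instruction, assuming no overlap between them."""
    table = CYCLES[implementation]
    return sum(table[op] for op in code)


def cpi(code, implementation):
    """Cycles for the code divided by its instruction count."""
    return total_cycles(code, implementation) / len(code)


for impl in ("A", "B"):
    for name, code in (("code for A", CODE_FOR_A), ("code for B", CODE_FOR_B)):
        print(f"{name} on implementation {impl}: IC = {len(code)}, "
              f"CPI = {cpi(code, impl):.1f}, cycles = {total_cycles(code, impl)}")
```

Under these assumptions the code for B takes 2 cycles on B, the code for A takes 3 cycles on A, and the code for B takes 5 cycles on A, which is why code compiled for B runs slowly on A.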
02-18 IC v. CPI Tradeoffs, continued.

Case 2: Different ISA, (and of course) different implementation.

Major tradeoffs in complexity and speed.

Consider two implementations:

Implementation A: CPI: load, 2; add and store, 1.

Implementation B: CPI: add (doing load and store), 4.

    ! Code for implementation A.
    load  r1, [r2]      ! Load r1 with data at address in r2.
    add   r3, r1, r4    ! r3 = r1 + r4
    store [r2], r3      ! Store r3 at address in r2.

    ! Code for implementation B.
    add [r2], r4, [r2]

Execution time same.

Implementation A: IC = 3, CPI = 4/3.

Implementation B: IC = 1, CPI = 4.

02-19 Technological Change

Golden Handcuffs: The need to maintain compatibility in a successful product line.

Famously, Intel's IA-32. (Popularly referred to as 80x86.)

The ISA is the handcuffs ... and technological change brings the desire to move your arms.

02-20 Technological Change and Computer Designer

Technology determines "raw materials" for designer.

Raw material: number of gates and their speed.

ISA lifetime can be decades.

Raw materials greatly change over this time.

So, design ISA for now and future.

02-21

How technological advancement affects processor.

Logic Speed, Clock Rate
  No changes to organization or ISA.

Number of Transistors Available for Logic
  Changes to organization and possible changes to ISA.

Memory Size
  Change ISA to use larger address space.
  Can use ISA having larger instruction codings.

Memory Speed Compared to Processor Speed
  Include more sophisticated caching in organization.

02-22 Benchmarks

Benchmark: Program used to evaluate performance.
Uses

  Guide computer design.
  Guide purchasing decisions.
  Marketing tool.

Guiding Computer Design

  Measure overall performance.
  Determine characteristics of programs. E.g., frequency of floating-point operations.
  Determine effect of design options.

02-23 Choosing Benchmark Programs

Important: Choice of programs for evaluation.

Optimal but unrealistic: The exact set of programs customer will run.

Problem: Computers used for different applications.

Therefore, must model typical users' workload.

02-24

Options:

Real Programs: Programs chosen using surveys, for example.
  + Measured performance improvements apply to customer.
  - Large programs hard to run on simulator. (Before system built.)

Kernels: Use part of program responsible for most execution time.
  + Easier to study.
  - Not all programs have small kernels.

Toy Benchmarks: Program performs simplified version of common task.
  + Easier to study.
  - May not be realistic.

02-25

Synthetic Benchmarks: Program "looks like" typical program, but does nothing useful.
  + Easier to study.
  - May not be realistic.

Commonly Used Option

  Overall performance: real programs.
  Test specific features: synthetic benchmarks.

02-26 Benchmark Suites

Benchmark Suite: A named set of programs used to evaluate a system.

Typically:

  Developed and managed by a publication or non-profit organization. E.g., Standard Performance Evaluation Corp., PC Magazine.
  Tests clearly delineated aspects of system. E.g., CPU, graphics, I/O, application.
  Specifies a set of programs and inputs for those programs.
  Specifies reporting requirements for results.
02-27

What Suites Might Measure

Application Performance
  E.g., productivity (office) applications, database programs.
  Usually tests entire system.

CPU and Memory Performance
  Ignores effect of I/O.

Graphics Performance

02-28

Example, SPEC CPU2000 Suites

Respected measure of CPU performance.

Managed by Standard Performance Evaluation Corporation ... a non-profit organization funded by computer companies.

Measures CPU and memory performance on integer and FP code.

Uses common Unix programs such as perl, gcc, gzip.

Requires that results on each program be reported.

Programs compiled with publicly available compilers and libraries.

Programs compiled with and without expert tuning.

02-29

SPEC CPU2000 Suites and Measures

Suite of integer programs run to determine:

  SPECint2000, execution time of tuned code.
  SPECint_base2000, execution time of untuned code.
  SPECint_rate2000, throughput of tuned code.
  SPECint_rate_base2000, throughput of untuned code.

Suite of floating-point programs run to determine:

  SPECfp2000, execution time of tuned code.
  SPECfp_base2000, execution time of untuned code.
  SPECfp_rate2000, throughput of tuned code.
  SPECfp_rate_base2000, throughput of untuned code.

02-30

Other Examples (Fall 2001: This list is out of date.)

  BAPCO Suites, measure productivity app. performance on Windows 95.
  TPC, measure "transaction processing" system performance.
  WinMARK, graphics performance.