Homework5 02

Homework5 02 - LSU EE 4720 Homework 5 Solution Due: 3...

Homework 5 Solution Due: 3 December 2002

To answer the questions below you need to use the PSE dataset viewer program. PSE (pronounced see) runs on Solaris and Linux; you can use the computer accounts distributed in class to run it, a Linux distribution may also be provided for running it on other systems. Procedures for setting up the class account and using PSE are at http://www.ece.lsu.edu/ee4720/proc.html ; preliminary documentation for PSE is at http://www.ece.lsu.edu/ee4720/pse.pdf .

Problem 1: Near the beginning of the semester the performance of a program to compute π was evaluated with and without optimization. It’s back, down below. Follow instructions referred to above to view the execution of the optimized and unoptimized versions of the pi program running on a simulated 4-way dynamically scheduled superscalar machine with a 48-instruction reorder buffer. The datasets to use are pi_opt.ds and pi_noopt.ds .

( a ) Based on the pipeline execution diagram compute the CPI of the main loop for a large number of iterations in the optimized version. Do not use the IPC displayed by PSE, instead base it on the PED. In your answer describe how the CPI was determined.

To find the precise CPI first find a repeating pattern. Fortunately, once the branch predictor warms up and the ROB fills each iteration is identical so a unit of the repeating pattern is one iteration long. One such iteration (not the first) starts at cycle (time) 339, the next starts at 345, for a time of 6 cycles. There are 9 instructions (including the nop ), so the CPI is 6/9 = 2/3.

( b ) Consider first the optimized version of the program. Would it run faster with a larger reorder buffer? Would it run faster on an 8-way superscalar machine? How else might the processor be modified to improve performance? Explain each answer.

An important feature to notice is that, except for nop , instructions wait many cycles before executing. All of the waiting instructions are waiting for operands and so execution time is limited by the critical path through the code. (No instruction in the loop waits for a functional unit, there are enough for this loop.)

[PSE pipeline execution diagram. Status line: Grid 20 insn X 5 cyc; Rank: 4/7; Pos. 1/7; 0.76 IPC over 38 cycles.]
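
The arithmetic in part (a) can be checked with a short sketch. It is not part of the original solution or of PSE; the function names loop_cpi and width_bound_cycles are hypothetical, and the only inputs are numbers quoted above (iteration starts at cycles 339 and 345, 9 instructions per iteration, a 4-way machine). The second function computes the fewest cycles per iteration that the machine width alone would allow, which may help when thinking about part (b).

```python
from fractions import Fraction


def loop_cpi(iter_start_cycles, insns_per_iter):
    """Steady-state CPI from the start cycles of consecutive iterations.

    Hypothetical helper: assumes every iteration takes the same number
    of cycles, as the solution states for the warmed-up loop.
    """
    gaps = [b - a for a, b in zip(iter_start_cycles, iter_start_cycles[1:])]
    if len(set(gaps)) != 1:
        raise ValueError("need at least two starts and a constant spacing")
    return Fraction(gaps[0], insns_per_iter)


def width_bound_cycles(insns_per_iter, width):
    """Average cycles per iteration if only machine width limited the loop."""
    return Fraction(insns_per_iter, width)


cpi = loop_cpi([339, 345], 9)
print(cpi)        # 2/3
print(1 / cpi)    # 3/2, the corresponding IPC

bound = width_bound_cycles(9, 4)
print(bound)      # 9/4 cycles per iteration on a 4-way machine
print(bound < 6)  # True: the observed 6 cycles is above the width bound
```

Because the observed 6 cycles per iteration is well above the 9/4 cycles that a 4-way machine could sustain, the numbers are consistent with the statement that the loop is limited by its critical path rather than by machine width.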

This note was uploaded on 08/01/2009 for the course EE 4720 taught by Professor Staff during the Spring '08 term at LSU.
