hw2_solution

EE108B Spring 2003-2004 Prof. Kozyrakis
Homework #2 Solution

1. a. (5 points)

Copy propagation (1 point): Instructions 0x40, 0x54 and 0x7c are removed.

Arithmetic identity/Algebraic simplification (1 point): Since (i+1)*4 == (i*4)+4, instructions 0x40 and 0x4c, and 0x54 and 0x60, which compute the new A[i] and B[i], are transformed to 0x34 and 0x20 respectively.

Leaf routine optimization (1 point): The routine is a leaf routine, so there is no need to save and restore fp and gp. There is also no need to store i and c on the stack, since they are only used locally. As a result, no stack space needs to be allocated. Thus instructions 0xc-0x18, 0x24, 0x3c, 0x50, 0x68, 0x74, 0x80 and 0x88-0x90 in the unoptimized code are removed, and 0x28-0x2c are reduced to instruction 0x10 in the optimized version.

Loop-invariant code motion (1 point): Since the arrays A and B are in static memory, instructions 0x48 and 0x5c, which load the base addresses of A and B, are moved above the loop (instructions 0x14-0x18 in the optimized code) to reduce the number of dynamic instructions.

Loop inversion (1 point): Since the lower and upper bounds of the for loop are constants, the loop can be transformed into a loop that tests its condition at the bottom (do-while form), which has lower loop overhead. Thus, instructions 0x30-0x38 and 0x84 are transformed to 0x30 and 0x38 in the optimized version.

b. (4 points)

Unoptimized version: 11 (before loop) + 22 (in loop) * 100 + 7 (after loop) = 2218
  ALU     1009/2218 = 46%
  Branch   202/2218 =  9%
  Memory  1007/2218 = 45%

Optimized version: 7 (before loop) + 8 (in loop) * 100 + 1 (after loop) = 808
  ALU     505/808 = 62%
  Branch  101/808 = 13%
  Memory  202/808 = 25%
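The counts in part b can be checked with a short script. This is only a sketch of the arithmetic, not part of the original solution: the helper names `dynamic_count` and `mix` are made up for illustration, and the per-class counts are copied from the figures above rather than derived from the assembly listing.

```python
def dynamic_count(before, in_loop, after, iterations=100):
    """Dynamic instructions: code before the loop + body * trip count + code after."""
    return before + in_loop * iterations + after


def mix(counts):
    """Return each instruction class's share of the total, in percent."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


# Static counts and the 100-iteration trip count are taken from part b.
unopt_total = dynamic_count(before=11, in_loop=22, after=7)
opt_total = dynamic_count(before=7, in_loop=8, after=1)

# Per-class dynamic counts as stated in part b (not recomputed from the code).
unopt_mix = mix({"ALU": 1009, "Branch": 202, "Memory": 1007})
opt_mix = mix({"ALU": 505, "Branch": 101, "Memory": 202})

print("unoptimized:", unopt_total, {k: round(v, 1) for k, v in unopt_mix.items()})
print("optimized:  ", opt_total, {k: round(v, 1) for k, v in opt_mix.items()})
print("dynamic instruction reduction: %.2fx" % (unopt_total / opt_total))
```

The percentages in part b are these shares rounded to whole numbers so that each column sums to 100%. Under the same assumptions, the optimized code executes roughly 2.7 times fewer dynamic instructions, and most of the savings come from memory operations (1007 down to 202).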
c. (2 points) - The constants in instructions 0x0 and 0x4, which is the offset of gp

This note was uploaded on 11/18/2011 for the course EE 108A taught by Professor Dally during the Spring '04 term at Stanford.
