hw1soln

hw1soln - Summer 2009 Prof Schimmel ECE 3055 Chapter 1...


Summer 2009 Prof. Schimmel ECE 3055

Chapter 1 Solutions

1.2.4
2 microseconds from cache ==> 20 microseconds from DRAM.
20 microseconds from DRAM ==> 2 seconds from magnetic disk.
20 microseconds from DRAM ==> 2 ms from flash memory.

Computer Architecture and OS Homework 1 Solution

Problem 1.3

Solution 1.3

1.3.1
P2 has the highest performance.
performance of P1 (instructions/sec) = 2 × 10^9/1.5 = 1.33 × 10^9
performance of P2 (instructions/sec) = 1.5 × 10^9/1.0 = 1.5 × 10^9
performance of P3 (instructions/sec) = 3 × 10^9/2.5 = 1.2 × 10^9

1.3.2
No. cycles = time × clock rate
cycles(P1) = 10 × 2 × 10^9 = 20 × 10^9 cycles
cycles(P2) = 10 × 1.5 × 10^9 = 15 × 10^9 cycles
cycles(P3) = 10 × 3 × 10^9 = 30 × 10^9 cycles
time = (No. instr. × CPI)/clock rate, then No. instructions = No. cycles/CPI
instructions(P1) = 20 × 10^9/1.5 = 13.33 × 10^9
instructions(P2) = 15 × 10^9/1 = 15 × 10^9
instructions(P3) = 30 × 10^9/2.5 = 12 × 10^9

1.3.3
time_new = time_old × 0.7 = 7 s
CPI_new = CPI_old × 1.2, then CPI(P1) = 1.8, CPI(P2) = 1.2, CPI(P3) = 3
clock rate = No. instr. × CPI/time, then
clock rate(P1) = 13.33 × 10^9 × 1.8/7 = 3.42 GHz
clock rate(P2) = 15 × 10^9 × 1.2/7 = 2.57 GHz
clock rate(P3) = 12 × 10^9 × 3/7 = 5.14 GHz

1.3.4
IPC = 1/CPI = No. instr./(time × clock rate)
IPC(P1) = 1.42
IPC(P2) = 2
IPC(P3) = 3.33

1.3.5
time_new/time_old = 7/10 = 0.7. So clock rate_new = clock rate_old/0.7 = 1.5 GHz/0.7 = 2.14 GHz.

1.3.6
time_new/time_old = 9/10 = 0.9. So instructions_new = instructions_old × 0.9 = 30 × 10^9 × 0.9 = 27 × 10^9.

Problem 1.5

Solution 1.5

1.5.1
a. 1G, 0.75G inst/s
b. 1G, 1.5G inst/s

1.5.2
a. P2 is 1.33 times faster than P1
b. P1 is 1.03 times faster than P2

1.5.3
a. P2 is 1.31 times faster than P1
b. P1 is 1.00 times faster than P2

1.5.4
a. 2.05 s
b. 1.93 s

1.5.5
a. 0.71 s
b. 0.86 s

1.5.6
a. 1.30 times faster
b. 1.40 times faster

Solution 1.6

1.6.1
      Compiler A CPI    Compiler B CPI
a.    1.00              1.17
b.    0.80              0.58

1.14.3
MIPS = Clock rate × 10^-6/CPI
MIPS(P1) = 4 × 10^9 × 10^-6/1.25 = 3200
MIPS(P2) = 3 × 10^9 × 10^-6/0.75 = 4000
MIPS(P1) < MIPS(P2), performance(P1) < performance(P2) in this case (from 1.14.1)

1.14.4
a. FP op = 10^6 × 0.4 = 4 × 10^5, clock cycles_fp = CPI × No. FP instr. = 4 × 10^5
   T_fp = 4 × 10^5 × 0.33 × 10^-9 = 1.32 × 10^-4, then MFLOPS = 3.03 × 10^3
b. FP op = 3 × 10^6 × 0.4 = 1.2 × 10^6, clock cycles_fp = CPI × No. FP instr. = 0.70 × 1.2 × 10^6 = 0.84 × 10^6
   T_fp = 0.84 × 10^6 × 0.33 × 10^-9 = 2.77 × 10^-4, then MFLOPS = 4.33 × 10^3

1.14.5
CPU clock cycles = FP cycles + CPI(L/S) × No. instr. (L/S) + CPI(Branch) × No. instr. (Branch)
a. 5 × 10^5 L/S instr., 4 × 10^5 FP instr. and 10^5 Branch instr.
   CPU clock cycles = 4 × 10^5 + 0.75 × 5 × 10^5 + 1.5 × 10^5 = 9.25 × 10^5
   T_cpu = 9.25 × 10^5 × 0.33 × 10^-9 = 3.05 × 10^-4
   MIPS = 10^6/(3.05 × 10^-4 × 10^6) = 3.2 × 10^3
b. 1.2 × 10^6 L/S instr., 1.2 × 10^6 FP instr. and 0.6 × 10^6 Branch instr.
   CPU clock cycles = 0.84 × 10^6 + 1.25 × 1.2 × 10^6 + 1.25 × 0.6 × 10^6 = 3.09 × 10^6
   T_cpu = 3.09 × 10^6 × 0.33 × 10^-9 = 1.01 × 10^-3
   MIPS = 3 × 10^6/(1.01 × 10^-3 × 10^6) = 2.97 × 10^3

1.14.6
a. performance = 1/T_cpu = 3.2 × 10^3
b. performance = 1/T_cpu = 9.9 × 10^2
The second program has the higher performance and the higher MFLOPS figure, but the first program has the higher MIPS figure.

Problem 1.15

Solution 1.15

1.15.1
a. T_fp = 35 × 0.8 = 28 s, T_p1 = 28 + 85 + 50 + 30 = 193 s. Reduction: 3.5%
b. T_fp = 50 × 0.8 = 40 s, T_p4 = 40 + 80 + 50 + 30 = 200 s. Reduction: 4.7%

1.15.2
a. T_p1 = 200 × 0.8 = 160 s, T_fp + T_l/s + T_branch = 115 s, T_int = 45 s. Reduction time INT: 47%
b. T_p4 = 210 × 0.8 = 168 s, T_fp + T_l/s + T_branch = 130 s, T_int = 38 s. Reduction time INT: 52.5%

1.15.3
a. T_p1 = 200 × 0.8 = 160 s, T_fp + T_int + T_l/s = 170 s. NO
b. T_p4 = 210 × 0.8 = 168 s, T_fp + T_int + T_l/s = 180 s. NO

1.15.4
Clock cycles = CPI_fp × No. FP instr. + CPI_int × No. INT instr. + CPI_l/s × No. L/S instr. + CPI_branch × No. branch instr.
T_cpu = clock cycles/clock rate = clock cycles/(2 × 10^9)
a. 1 processor: clock cycles = 8192; T_cpu = 4.096 s
b. 8 processors: clock cycles = 1024; T_cpu = 0.512 s
To halve the number of clock cycles by improving the CPI of FP instructions:
CPI_improved_fp × No. FP instr. + CPI_int × No. INT instr. + CPI_l/s × No. L/S instr. + CPI_branch × No. branch instr. = clock cycles/2
CPI_improved_fp = (clock cycles/2 - (CPI_int × No. INT instr. + CPI_l/s × No. L/S instr. + CPI_branch × No. branch instr.))/No. FP instr.
a. 1 processor: CPI_improved_fp = (4096 - 7632)/560 < 0 ==> not possible
b. 8 processors: CPI_improved_fp = (512 - 944)/80 < 0 ==> not possible

1.15.5
Using the clock cycle data from 1.15.4:
To halve the number of clock cycles by improving the CPI of L/S instructions:
CPI_fp × No. FP instr. + CPI_int × No. INT instr. + CPI_improved_l/s × No. L/S instr. + CPI_branch × No. branch instr. = clock cycles/2
CPI_improved_l/s = (clock cycles/2 - (CPI_fp × No. FP instr. + CPI_int × No. INT instr. + CPI_branch × No. branch instr.))/No. L/S instr.
a. 1 processor: CPI_improved_l/s = (4096 - 3072)/1280 = 0.8
b. 8 processors: CPI_improved_l/s = (512 - 384)/160 = 0.8

1.15.6
Clock cycles = CPI_fp × No. FP instr. + CPI_int × No. INT instr. + CPI_l/s × No. L/S instr. + CPI_branch × No. branch instr.
T_cpu = clock cycles/clock rate = clock cycles/(2 × 10^9)
CPI_int = 0.6 × 1 = 0.6; CPI_fp = 0.6 × 1 = 0.6; CPI_l/s = 0.7 × 4 = 2.8; CPI_branch = 0.7 × 2 = 1.4
a. 1 processor: T_cpu(before improv.) = 4.096 s; T_cpu(after improv.) = 2.739 s
b. 8 processors: T_cpu(before improv.) = 0.512 s; T_cpu(after improv.) = 0.342 s

Solution 1.16

1.16.1
Without reduction in any routine:
a. total time 2 proc = 185 ns
b. total time 16 proc = 34 ns
...
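
The Problem 1.3 arithmetic and the "halve the clock cycles" calculation from 1.15.4 and 1.15.5 can be checked with a short script. This is a minimal sketch, not part of the handout: the clock rates, CPIs and cycle counts are the ones that appear in the solution arithmetic above, while the names `PROCESSORS`, `solve_1_3` and `improved_cpi` are hypothetical and chosen only for illustration.

```python
# Sketch only: clock rates and CPIs are read off the Solution 1.3 arithmetic;
# the function and variable names are illustrative, not from the handout.
PROCESSORS = {
    "P1": (2.0e9, 1.5),  # (clock rate in Hz, CPI)
    "P2": (1.5e9, 1.0),
    "P3": (3.0e9, 2.5),
}


def solve_1_3(run_seconds=10.0, time_factor=0.7, cpi_factor=1.2):
    """Return per-processor results for parts 1.3.1 to 1.3.3."""
    results = {}
    for name, (clock_hz, cpi) in PROCESSORS.items():
        cycles = run_seconds * clock_hz        # 1.3.2: cycles = time x clock rate
        instructions = cycles / cpi            # 1.3.2: instructions = cycles / CPI
        results[name] = {
            "instr_per_sec": clock_hz / cpi,   # 1.3.1
            "cycles": cycles,
            "instructions": instructions,
            # 1.3.3: clock rate = instructions x new CPI / new time
            "new_clock_hz": instructions * (cpi * cpi_factor)
            / (run_seconds * time_factor),
        }
    return results


def improved_cpi(total_cycles, other_cycles, instr_count):
    """CPI one instruction class needs so that total cycles are halved.

    A result <= 0 means the target cannot be reached by improving
    that class alone.
    """
    return (total_cycles / 2 - other_cycles) / instr_count


if __name__ == "__main__":
    for name, r in solve_1_3().items():
        print(f"{name}: {r['instr_per_sec']:.3e} instr/s, "
              f"{r['instructions']:.3e} instr in 10 s, "
              f"new clock {r['new_clock_hz'] / 1e9:.4f} GHz")
    # 1.15.4 a and 1.15.5 a, using the cycle counts quoted in the solution.
    print(improved_cpi(8192, 7632, 560))   # negative: not possible
    print(improved_cpi(8192, 3072, 1280))  # 0.8
```

The script keeps full precision, so a printed value can differ in the last digit from the figure in the solution key (for example, 24/7 is about 3.4286 GHz for P1 in 1.3.3).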