573 a closer look at processor operation as a tool


...lue. The result is stored at *dest only after the loop has been completed. As the following assembly code for the loop shows, the compiler can now use register %eax to hold the accumulated value. Comparing to the loop for combine3, we have reduced the memory operations per iteration from two reads and one write to just a single read. Registers %ecx and %edx are used as before, but there is no need to reference *dest.

    combine4: type=INT, OPER = *
    data in %eax, x in %ecx, i in %edx, length in %esi
    .L24:                             loop:
      imull (%eax,%edx,4),%ecx          Multiply x by data[i]
      incl %edx                         i++
      cmpl %esi,%edx                    Compare i:length
      jl .L24                           If <, goto loop

We see a significant improvement in program performance:

                                              Integer         Floating Point
    Function  Page  Method                    +       *       +       *
    combine3  217   Direct data access        6.00    9.00    8.00    117.00
    combine4  219   Accumulate in temporary   2.00    4.00    3.00    5.00

The most dramatic decline is in the time for floating-point multiplication. Its time becomes comparable to the times for the other combinations of data type and operation. We will examine the caus...
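The difference between the two versions can be sketched in C roughly as follows. This is a minimal illustration, not the book's listing: the function names combine3_sketch and combine4_sketch and the plain array-plus-length parameters are assumptions made here for self-containment, and the operation is fixed to integer multiplication with identity 1 to match the assembly shown for combine4.

```c
#include <stddef.h>

/* Hypothetical simplified signatures: a plain int array and a length
   stand in for whatever data abstraction the original functions use. */

/* combine3 style: the running product lives in *dest, so each
   iteration reads data[i], reads *dest, and writes *dest. */
void combine3_sketch(const int *data, size_t length, int *dest)
{
    *dest = 1;                      /* identity element for * */
    for (size_t i = 0; i < length; i++)
        *dest = *dest * data[i];
}

/* combine4 style: the running product lives in the local x, which the
   compiler can keep in a register; *dest is written once, after the
   loop has completed. */
void combine4_sketch(const int *data, size_t length, int *dest)
{
    int x = 1;                      /* identity element for * */
    for (size_t i = 0; i < length; i++)
        x = x * data[i];            /* only memory access: read data[i] */
    *dest = x;
}
```

Both sketches produce the same result when dest does not point into data; the second simply makes it possible for the loop body to touch memory only to read data[i], which is what the four-instruction loop reflects.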

This note was uploaded on 09/02/2010 for the course ELECTRICAL 360 taught by Professor Schultz during the Spring '10 term at BYU.
