...lue. The result is stored at *dest only after the loop has been completed. As the following assembly code for the loop shows, the compiler can now use register %ecx to hold the accumulated value. Compared with the loop for combine3, we have reduced the memory operations per iteration from two reads and one write to just a single read. Registers %eax and %edx hold the array base and the loop index, and there is no need to reference *dest.
combine4: type=INT, OPER = *
data in %eax, x in %ecx, i in %edx, length in %esi

.L24:                           loop:
  imull (%eax,%edx,4),%ecx        Multiply x by data[i]
  incl %edx                       i++
  cmpl %esi,%edx                  Compare i:length
  jl .L24                         If <, goto loop

We see a significant improvement in program performance:

                                              Integer         Floating Point
Function   Page   Method                      +       *       +       *
combine3   217    Direct data access          6.00    9.00    8.00    117.00
combine4   219    Accumulate in temporary     2.00    4.00    3.00    5.00

The most dramatic decline is in the time for floating-point multiplication. Its time becomes comparable to the times for the other combinations of data type and operation. We will examine the caus...
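The difference between the two loops compared above can be sketched in C. This is only an illustration under stated assumptions: the names combine3_sketch and combine4_sketch are hypothetical, a plain int array with an explicit length stands in for the vector abstraction used by combine3 and combine4, and the operation is fixed to integer multiplication with identity 1.

```c
#include <stddef.h>

/* Hypothetical stand-ins: a plain int array and a length replace the
   original vector type; OPER is fixed to multiplication (identity 1). */

/* combine3 style: the running result lives in memory at *dest, so each
   iteration reads *dest, reads data[i], and writes *dest. */
void combine3_sketch(const int *data, size_t length, int *dest)
{
    *dest = 1;
    for (size_t i = 0; i < length; i++)
        *dest = *dest * data[i];
}

/* combine4 style: the running result lives in the local variable x,
   which the compiler can keep in a register; each iteration reads only
   data[i], and *dest is written once after the loop. */
void combine4_sketch(const int *data, size_t length, int *dest)
{
    int x = 1;
    for (size_t i = 0; i < length; i++)
        x = x * data[i];
    *dest = x;
}
```

For the array {2, 3, 4}, both versions store 24 at *dest. The two are not interchangeable in every case: if dest points into data, the first version sees its own intermediate stores and the second does not, which is one reason a compiler may not make this change on its own.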