[Figure 5.25: Operations for First Iteration of Inner Loop of Two-Way Unrolled, Two-Way Parallel Integer Multiplication. The two multiplication operations are logically independent.]

... the results at the end. For example, let $P_n$ denote the product of elements $a_0, a_1, \ldots, a_{n-1}$:

$$P_n = \prod_{i=0}^{n-1} a_i$$

Assuming $n$ is even, we can also write this as $P_n = PE_n \times PO_n$, where $PE_n$ is the product of the elements with even indices, and $PO_n$ is the product of the elements with odd indices:

$$PE_n = \prod_{i=0}^{n/2-1} a_{2i}$$

$$PO_n = \prod_{i=0}^{n/2-1} a_{2i+1}$$

Figure 5.24 shows code that uses this method. It uses both two-way loop unrolling, to combine more elements per iteration, and two-way parallelism, accumulating elements with even index in variable x0 and elements with odd index in variable x1. As before, we include a second loop to accumulate any remaining array elements for the case where the vector length is not a multiple of 2. We then apply the combining operation to x0 and...
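The figure itself is not reproduced in this preview, so the following is only a minimal sketch of the technique the paragraph describes, not the book's Figure 5.24. The function name product_2x2 and its plain int-array interface are assumptions made for illustration; the accumulator names x0 and x1 follow the text.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not the book's code): multiply all elements of
 * data[0..length-1] using two-way unrolling and two-way parallelism.
 */
int product_2x2(const int *data, long length)
{
    assert(data != NULL || length <= 0);

    long limit = length - 1; /* a pair (i, i+1) fits only while i < limit */
    int x0 = 1;              /* running product of even-indexed elements */
    int x1 = 1;              /* running product of odd-indexed elements  */
    long i;

    /* Main loop: two elements per iteration, two independent accumulators,
       so neither multiplication depends on the other's result. */
    for (i = 0; i < limit; i += 2) {
        x0 *= data[i];
        x1 *= data[i + 1];
    }

    /* Second loop: pick up the leftover element when length is odd. */
    for (; i < length; i++)
        x0 *= data[i];

    /* Combine the two partial products at the end. */
    return x0 * x1;
}
```

For the array {1, 2, 3, 4, 5}, the main loop leaves x0 = 1 * 3 and x1 = 2 * 4, the second loop folds the final element 5 into x0, and the combined result is 15 * 8 = 120. Splitting the work this way relies on multiplication being associative and commutative, which holds for integer arithmetic.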
This note was uploaded on 09/02/2010 for the course ELECTRICAL 360 taught by Professor Schultz during the Spring '10 term at BYU.