324_Book

# For an instruction such as an indirect jump as we saw


[Figure 5.25: Operations for First Iteration of Inner Loop of Two-Way Unrolled, Two-Way Parallel Integer Multiplication. The two multiplication operations are logically independent.]

... the results at the end. For example, let $P_n$ denote the product of elements $a_0, a_1, \ldots, a_{n-1}$:

$$P_n = \prod_{i=0}^{n-1} a_i$$

Assuming $n$ is even, we can also write this as $P_n = PE_n \times PO_n$, where $PE_n$ is the product of the elements with even indices, and $PO_n$ is the product of the elements with odd indices:

$$PE_n = \prod_{i=0}^{n/2-1} a_{2i} \qquad PO_n = \prod_{i=0}^{n/2-1} a_{2i+1}$$

Figure 5.24 shows code that uses this method. It uses both two-way loop unrolling, to combine more elements per iteration, and two-way parallelism, accumulating elements with even index in variable x0 and elements with odd index in variable x1. As before, we include a second loop to accumulate any remaining array elements for the case where the vector length is not a multiple of 2. We then apply the combining operation to x0 and...
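Since Figure 5.24 itself is not part of this preview, the following is only a minimal sketch of the technique the text describes, not the book's code. The function name `product_2x2`, the `long` element type, and the plain array interface are assumptions made for illustration; only the accumulator names x0 and x1 come from the text.

```c
#include <stddef.h>

/* Hypothetical illustration (not Figure 5.24): product of a[0..n-1]
 * using two-way loop unrolling and two-way parallelism. */
long product_2x2(const long *a, size_t n)
{
    long x0 = 1;   /* accumulates elements with even index */
    long x1 = 1;   /* accumulates elements with odd index */
    size_t i = 0;

    /* Main loop: combine two elements per iteration. The two
     * multiplications use different accumulators, so neither
     * depends on the other's result. */
    for (; i + 1 < n; i += 2) {
        x0 *= a[i];
        x1 *= a[i + 1];
    }

    /* Second loop: pick up the leftover element when n is odd. */
    for (; i < n; i++) {
        x0 *= a[i];
    }

    /* Apply the combining operation to the two partial results. */
    return x0 * x1;
}
```

For example, calling `product_2x2` on the array `{1, 2, 3, 4, 5}` with `n = 5` accumulates 1 * 3 in x0 and 2 * 4 in x1 in the main loop, multiplies the leftover 5 into x0, and returns 15 * 8 = 120. The sketch assumes the product fits in a `long`; it makes no attempt to detect overflow.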

## This note was uploaded on 09/02/2010 for the course ELECTRICAL 360 taught by Professor Schultz during the Spring '10 term at BYU.
