Chapter 5: Optimizing Program Performance, Section 5.10: Enhancing Parallelism

Figure 5.26: Scheduling of Operations for Two-Way Unrolled, Two-Way Parallel Integer Multiplication with Unlimited Resources. The multiplier can now generate two values every 4 cycles.

te value, while those with odd indices were very close to 0.0. Then product $PE_n$ might overflow, or $PO_n$ might underflow, even though the final product $P_n$ does not. In most real-life applications, however, such patterns are unlikely. Since most physical phenomena are continuous, numerical data tend to be reasonably smooth and well-behaved. Even when there are discontinuities, they do not generally cause periodic patterns that lead to a condition such as that sketch...
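The overflow and underflow scenario described in this passage can be reproduced with a small C sketch. It is an illustration only, not code from the text: the function names `product_sequential` and `product_two_way`, the use of `double`, and the sample values `1e200` and `1e-200` are all assumptions chosen to make the effect visible. The first function multiplies the elements in order with one accumulator; the second keeps separate accumulators for the even-indexed and odd-indexed elements and combines them at the end.

```c
#include <math.h>

/* Sequential product with a single accumulator (hypothetical helper). */
static double product_sequential(const double *a, int n)
{
    double p = 1.0;
    for (int i = 0; i < n; i++)
        p *= a[i];
    return p;
}

/*
 * Two-way parallel product (hypothetical helper): one accumulator for
 * even-indexed elements, one for odd-indexed elements. The partial
 * products are also returned through pe and po so they can be inspected.
 */
static double product_two_way(const double *a, int n, double *pe, double *po)
{
    double even = 1.0;
    double odd = 1.0;
    int i;

    /* Process two elements per iteration. */
    for (i = 0; i + 1 < n; i += 2) {
        even *= a[i];
        odd *= a[i + 1];
    }
    /* Pick up a leftover element when n is odd. */
    for (; i < n; i++)
        even *= a[i];

    *pe = even;
    *po = odd;
    return even * odd;
}
```

As a usage example, fill a 64-element array with `1e200` at even indices and `1e-200` at odd indices. Assuming IEEE 754 double-precision arithmetic, the sequential product stays close to 1.0 because each large factor is immediately cancelled by a small one. In the two-way version the even partial product overflows to infinity and the odd partial product underflows to 0.0, so their product is NaN. As the passage notes, such strictly alternating data are unlikely in practice.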
This note was uploaded on 09/02/2010 for the course ELECTRICAL 360 taught by Professor Schultz during the Spring '10 term at BYU.