324_Book

# A more pragmatic programmer would argue the advantage


...element of the destination vector per iteration. The second uses a technique known as loop unrolling to compute two elements per iteration. This version will only work properly for even values of $n$. Later in this chapter we cover loop unrolling in more detail, including how to make it work for arbitrary values of $n$.

The time required by such a procedure can be characterized as a constant plus a factor proportional to the number of elements processed. For example, Figure 5.2 shows a plot of the number of clock cycles required by the two functions for a range of values of $n$. Using a least squares fit, we find that the two function run times (in clock cycles) can be approximated by lines with equations $80 + 4.0n$ and $83.5 + 3.5n$, respectively. These equations indicate an overhead of 80 to 84 cycles to initiate the procedure, set up the loop, and complete the procedure, plus a linear factor of 3.5 or 4.0 cycles per element. For large values of $n$ (say greater than 50), the run times will...
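The two functions themselves are cut off in this excerpt, so the following is only a minimal sketch of what the passage describes, assuming an element-wise vector sum. The names `vsum_simple`, `vsum_unrolled2`, and `estimated_cycles`, along with their parameter lists, are hypothetical and chosen for illustration. The first function writes one element of the destination per iteration. The second is unrolled by two and, as the text notes, is only correct for even lengths. The helper evaluates the "constant plus proportional factor" cost model.

```c
#include <stddef.h>

/* Hypothetical names: the original functions are not shown in the excerpt. */

/* Computes one element of dest per loop iteration. */
void vsum_simple(const int *a, const int *b, int *dest, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dest[i] = a[i] + b[i];
}

/* Unrolled by two: computes two elements of dest per loop iteration.
 * Assumption: n is even. For odd n the second statement would read and
 * write one element past the end of the arrays. */
void vsum_unrolled2(const int *a, const int *b, int *dest, size_t n)
{
    for (size_t i = 0; i < n; i += 2) {
        dest[i]     = a[i]     + b[i];
        dest[i + 1] = a[i + 1] + b[i + 1];
    }
}

/* Linear cost model from the text: a fixed overhead plus a per-element
 * cost (cycles per element), both measured in clock cycles. */
double estimated_cycles(double overhead, double cycles_per_element, size_t n)
{
    return overhead + cycles_per_element * (double)n;
}
```

With the figures quoted in the text, an overhead of 80 cycles and 4.0 cycles per element gives `estimated_cycles(80.0, 4.0, 100)` = 480 cycles for 100 elements. The fixed overhead is then about a sixth of the total, and its share keeps shrinking as the length grows.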

## This note was uploaded on 09/02/2010 for the course ELECTRICAL 360 taught by Professor Schultz during the Spring '10 term at BYU.
