The more traditional way of expressing relative

... be dominated by the linear factors. We refer to the coefficients in these terms as the effective number of Cycles per Element, abbreviated “CPE.” Note that we prefer measuring the number of cycles per element rather than the number of cycles per iteration, because techniques such as loop unrolling allow us to use fewer iterations to complete the computation, but our ultimate concern is how fast the procedure will run for a given vector length. We focus our efforts on minimizing the CPE for our computations. By this measure, vsum2, with a CPE of 3.50, is superior to vsum1, with a CPE of 4.0.

Aside: What is a least squares fit?
For a set of data points $(x_1, y_1), \ldots, (x_n, y_n)$, we often try to draw a line that best approximates the X-Y trend represented by this data. With a least squares fit, we look for a line of the form $y = mx + b$ that minimizes the ...

code/opt/vsum.c

    void vsum1(int n)
    {
        int i;

        for (i = 0; i < n; i++)
            c[i] = a[i] + b[i];
    }

    /* Sum vector of n elements (n must be ev...
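
As a rough illustration of how a CPE figure can be obtained from measurements, the sketch below fits a line of the form y = mx + b to (element count, cycle count) pairs. It uses the standard closed-form least squares solution, which minimizes the sum of squared vertical distances between the line and the points. The slope m then serves as the CPE estimate and the intercept b as the fixed overhead. The function name least_squares_fit and its return convention are assumptions made for this example; they do not come from the original text.

```c
#include <stddef.h>

/*
 * Hypothetical helper: fit y = m*x + b to n data points by ordinary
 * least squares. With x[i] = number of elements and y[i] = measured
 * cycles, the slope m is the CPE estimate and b is the fixed overhead.
 *
 * Returns 0 on success, or -1 when the fit is undefined (fewer than
 * two points, or all x values identical).
 */
int least_squares_fit(const double *x, const double *y, size_t n,
                      double *m, double *b)
{
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    double denom;
    size_t i;

    if (n < 2)
        return -1;

    /* Accumulate the sums used by the closed-form solution. */
    for (i = 0; i < n; i++) {
        sx  += x[i];
        sy  += y[i];
        sxx += x[i] * x[i];
        sxy += x[i] * y[i];
    }

    /* The denominator is zero when every x value is the same. */
    denom = (double)n * sxx - sx * sx;
    if (denom == 0.0)
        return -1;

    *m = ((double)n * sxy - sx * sy) / denom;  /* slope: cycles per element */
    *b = (sy - *m * sx) / (double)n;           /* intercept: fixed overhead */
    return 0;
}
```

For instance, hypothetical measurements of 280, 480, 680, and 880 cycles at 50, 100, 150, and 200 elements lie exactly on the line 80 + 4.0n, so the fit yields a slope (CPE) of 4.0 and an intercept of 80 cycles.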
