As an example the cmovll instruction performs a copy


[...] .00 5.00 1.50 2.50 3.00 5.00 1.50 2.50 1.61 2.00 1.87 2.07 1.66 2.00

As this table shows, increasing the degree of loop unrolling and the degree of parallelism helps program performance up to some point, but it yields diminishing improvement or even worse performance when taken to an extreme. In the next section, we will describe two reasons for this phenomenon.

5.10.2 Register Spilling

The benefits of loop parallelism are limited by the ability to express the computation in assembly code. In particular, the IA32 instruction set only has a small number of registers to hold the values being accumulated. If we have a degree of parallelism p that exceeds the number of available registers, then the compiler will resort to spilling, storing some of the temporary values on the stack. Once this happens, the performance drops dramatically. This occurs for our benchmarks when we attempt to have p [...]. Our measurements show the performance for this case is worse than that for p [...]. For the case of the integer data type, there are only eight total i...
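To make the register-pressure argument concrete, here is a minimal C sketch of the same summation written with one, four, and eight parallel accumulators. The function names (sum_p1, sum_p4, sum_p8) and the choice of integer addition are assumptions made for illustration; they are not the benchmark code from the text. Whether the eight-accumulator version actually spills depends on the compiler, the optimization level, and the target, so treat the comments as what one would expect on a register-starved target rather than a guarantee.

```c
#include <stddef.h>

/* Baseline: a single accumulator, so each addition depends on the previous one. */
long sum_p1(const long *a, size_t n)
{
    long acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += a[i];
    return acc;
}

/* Degree of parallelism 4: four independent accumulators.
 * Live values in the loop: 4 accumulators + array pointer + index + bound. */
long sum_p4(const long *a, size_t n)
{
    long acc0 = 0, acc1 = 0, acc2 = 0, acc3 = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        acc0 += a[i];
        acc1 += a[i + 1];
        acc2 += a[i + 2];
        acc3 += a[i + 3];
    }
    for (; i < n; i++)          /* handle the leftover elements */
        acc0 += a[i];

    return (acc0 + acc1) + (acc2 + acc3);
}

/* Degree of parallelism 8: eight independent accumulators.
 * The accumulators alone need eight registers, before counting the
 * pointer, index, and bound. On a target with few integer registers
 * the compiler may have to keep some accumulators on the stack. */
long sum_p8(const long *a, size_t n)
{
    long acc0 = 0, acc1 = 0, acc2 = 0, acc3 = 0;
    long acc4 = 0, acc5 = 0, acc6 = 0, acc7 = 0;
    size_t i = 0;

    for (; i + 8 <= n; i += 8) {
        acc0 += a[i];
        acc1 += a[i + 1];
        acc2 += a[i + 2];
        acc3 += a[i + 3];
        acc4 += a[i + 4];
        acc5 += a[i + 5];
        acc6 += a[i + 6];
        acc7 += a[i + 7];
    }
    for (; i < n; i++)          /* handle the leftover elements */
        acc0 += a[i];

    return ((acc0 + acc1) + (acc2 + acc3)) + ((acc4 + acc5) + (acc6 + acc7));
}
```

One way to check for spilling is to compile for a 32-bit x86 target and read the generated assembly, for example with gcc -m32 -O2 -S, and look for accumulator updates that go through stack locations instead of registers. All three functions return the same result; only the number of simultaneously live values differs.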