The appropriate counter is incremented every time a


    void array_clear_8(int *src, int *dest, int n)
    {
        int i;
        int len = n - 7;

        for (i = 0; i < len; i+=8) {
            dest[i] = 0;
            dest[i+1] = 0;
            dest[i+2] = 0;
            dest[i+3] = 0;
            dest[i+4] = 0;
            dest[i+5] = 0;
            dest[i+6] = 0;
            dest[i+7] = 0;
        }
        for (; i < n; i++)
            dest[i] = 0;
    }

code/opt/copy.c

Figure 5.32: Functions to Clear Array. These illustrate the pipelining of the store operation.

Each successive value of register %edx depends on the result of a load operation having %edx as an operand. Figure 5.31 shows the scheduling of operations for the first three iterations of this function. As can be seen, the latency of the load operation limits the CPE to 3.0.

5.13.2 Store Latency

In all of our examples so far, we have interacted with the memory only by using the load operation to read from a memory location into a register. Its counterpart, the store operation, writes a register value to memory. As Figure 5.12 indicates, this operation also has a nominal latency of three cycles, and an issue time of one cycle. However, its behavior, and its interactions with load opera...
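The contrast between a load-bound loop and a pipelined store loop can be sketched in C. This is only an illustration under stated assumptions: the function that Figure 5.31 schedules is not shown in this excerpt, so `node_t`, `list_len`, and `clear_simple` below are hypothetical names chosen here, not the original code.

```c
#include <stddef.h>

/* Hypothetical list node; the excerpt does not show the original type. */
typedef struct node {
    struct node *next;
    int data;
} node_t;

/*
 * Load-latency-bound loop (assumed shape). Each iteration loads
 * ls->next, and the address for that load is the pointer produced by
 * the previous iteration's load. The loads therefore form a serial
 * chain and cannot overlap.
 */
int list_len(const node_t *ls)
{
    int len = 0;

    while (ls != NULL) {
        len++;
        ls = ls->next;   /* next address depends on this load's result */
    }
    return len;
}

/*
 * Store loop without unrolling (assumed shape). No store produces a
 * value that a later iteration needs, so successive stores can be
 * issued back to back instead of waiting for one another.
 */
void clear_simple(int *dest, int n)
{
    int i;

    for (i = 0; i < n; i++)
        dest[i] = 0;     /* independent of every other iteration */
}
```

In `list_len` the loop-carried value is a pointer that must come back from memory before the next load can start, which is the dependency described for register %edx. In `clear_simple` the only loop-carried value is the index `i`, so the memory operations themselves are not on the critical path.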