Unaligned load designed to avoid cache line splits if


PROGRAMMING WITH SSE3 AND SUPPLEMENTAL SSE3

... to use the SIMD subset of SSE3 extensions, the application should follow the steps illustrated in Section 11.6.2, "Checking for SSE/SSE2 Support." Next, use the additional step provided below:

Check that the processor supports the SIMD and x87 SSE3 extensions (if CPUID.01H:ECX.SSE3[bit 0] = 1). See Example 12-1 for a code example.

Checking for SSE and SSE2 support along with SSE3 gives software flexibility in using SSE3. To use FISTTP, software can use the step above to detect support for SSE3.

In the initial implementation of MONITOR and MWAIT, these two instructions are available at ring 0 and conditionally available at ring levels greater than 0. Before an application attempts to use the MONITOR and MWAIT instructions, the application should use the following steps:

1. Check that the processor supports MONITOR and MWAIT. If CPUID.01H:ECX.MONITOR[bit 3] = 1, MONITOR and MWAIT are available at ring 0.
2. To verify that MONITOR and MWAIT are supported at ring levels greater than 0, use a routine similar to Example 12-2.
3. Query the smallest and largest line size that MONITOR uses. Use CPUID.05H:EAX.smallest[bits 15:0];EBX.largest[bits 15:0]. Values are returned in bytes in EAX and EBX.
4. Ensure the memory address range(s) that will be supplied to MONITOR meet memory type requirements.

MONITOR and MWAIT are targeted for system software that supports efficient thread synchronization. See Chapter 12 in the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A for details.

Example 12-1. Verifying SSE3 Support

    boolean SSE3_SIMD_works = TRUE;
    try {
        IssueSSE3_SIMD_Instructions(); // Use ADDSUBPD
    } except (UNWIND) {
        // if we get here, SSE3 not available
        SSE3_SIMD_works = FALSE;
    }

Example 12-2. Verifying MONITOR/MWAIT Support

    boolean MONITOR_MWAIT_works = TRUE;
    try {
        _asm {
            xor ecx, ecx
            xor edx, edx
            mov eax, MemArea
            monitor
        }
        // Use monitor
    } except (UNWIND) {
        // if we get here, MONITOR/MWAIT is not available
        MONITOR_MWAIT_works = FALSE;
    }

12.4.3 Enable FTZ and DAZ for SIMD Floating-Point Computation

Enabling the FTZ and DAZ flags in the MXCSR register is likely to accelerate SIMD floating-point computation where strict compliance to the IEEE standard 754-1985 is not required. The FTZ flag is available to Intel 64 and IA-32 processors that support SSE; DAZ is available to Intel 64 processors and to most IA-32 processors that support SSE/SSE2/SSE3.

Software can detect the presence of DAZ, modify the MXCSR register, and save and restore state information by following the techniques discussed in Section 11.6.3 through Section 11.6.6.

12.4.4 Programming SSE3 with SSE/SSE2 Extensions

SIMD instructions in SSE3 extensions are intended to complement the use of SSE/SSE2 in programming SIMD applications. Application software that intends to use SSE3 instructions should also check for the availability of SSE/SSE2 instructions.

The FIS...
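The CPUID checks described earlier (the SSE3 flag in CPUID.01H:ECX bit 0, the MONITOR flag in bit 3, and the MONITOR line sizes in CPUID.05H) can be sketched in C roughly as follows. This is a minimal illustration, not the manual's own code: the type `sse3_features` and the functions `decode_sse3_features` and `query_sse3_features` are hypothetical names, and the `__get_cpuid` helper from `<cpuid.h>` is assumed to be available, which holds for GCC and Clang on x86 targets. The sketch covers only the CPUID steps; it does not replace the SSE/SSE2 checks of Section 11.6.2 or the exception-based probes of Examples 12-1 and 12-2.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>          /* GCC/Clang helper; assumed available */
#define HAVE_CPUID_H 1
#endif

/* Feature flags returned in ECX by CPUID leaf 01H. */
#define CPUID_01H_ECX_SSE3    (1u << 0)   /* CPUID.01H:ECX.SSE3[bit 0]    */
#define CPUID_01H_ECX_MONITOR (1u << 3)   /* CPUID.01H:ECX.MONITOR[bit 3] */

/* Hypothetical result type for this sketch. */
typedef struct {
    bool     sse3;                   /* SIMD and x87 SSE3 reported         */
    bool     monitor;                /* MONITOR/MWAIT reported (ring 0)    */
    uint16_t monitor_smallest_line;  /* bytes, CPUID.05H:EAX[bits 15:0]    */
    uint16_t monitor_largest_line;   /* bytes, CPUID.05H:EBX[bits 15:0]    */
} sse3_features;

/* Pure decoding step: turns raw register values into feature flags. */
sse3_features decode_sse3_features(uint32_t leaf1_ecx,
                                   uint32_t leaf5_eax,
                                   uint32_t leaf5_ebx)
{
    sse3_features f;
    f.sse3    = (leaf1_ecx & CPUID_01H_ECX_SSE3) != 0;
    f.monitor = (leaf1_ecx & CPUID_01H_ECX_MONITOR) != 0;
    /* Report line sizes only when MONITOR itself is reported. */
    f.monitor_smallest_line = f.monitor ? (uint16_t)(leaf5_eax & 0xFFFFu) : 0;
    f.monitor_largest_line  = f.monitor ? (uint16_t)(leaf5_ebx & 0xFFFFu) : 0;
    return f;
}

/* Queries the running processor. Returns false when CPUID leaf 01H cannot
 * be read through <cpuid.h> (for example on a non-x86 build). */
bool query_sse3_features(sse3_features *out)
{
#ifdef HAVE_CPUID_H
    unsigned int eax, ebx, ecx, edx;
    unsigned int l5_eax = 0, l5_ebx = 0, l5_ecx, l5_edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;

    if (ecx & CPUID_01H_ECX_MONITOR) {
        /* Leaf 05H holds the MONITOR line sizes; fall back to 0 if absent. */
        if (!__get_cpuid(5, &l5_eax, &l5_ebx, &l5_ecx, &l5_edx))
            l5_eax = l5_ebx = 0;
    }

    *out = decode_sse3_features(ecx, l5_eax, l5_ebx);
    return true;
#else
    (void)out;
    return false;
#endif
}
```

A caller would invoke `query_sse3_features(&f)` once at startup and branch on `f.sse3` and `f.monitor`. Keeping the bit decoding in a separate pure function is a design choice of this sketch so that the masks can be checked without executing CPUID. A set MONITOR flag only indicates ring 0 availability, so use at higher ring levels still needs a probe like Example 12-2.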