le quadword)
-- 128-bit packed byte integers
-- 128-bit packed word integers
-- 128-bit packed doubleword integers
-- 128-bit packed quadword integers

Instructions to support the additional data types and extend existing SIMD integer operations:
-- Packed and scalar double-precision floating-point instructions
-- Additional 64-bit and 128-bit SIMD integer instructions
-- 128-bit versions of SIMD integer instructions introduced with the MMX technology and the SSE extensions
-- Additional cacheability-control and instruction-ordering instructions

Vol. 1 11-1 PROGRAMMING WITH STREAMING SIMD EXTENSIONS 2 (SSE2)

Modifications to existing IA-32 instructions to support SSE2 features:
-- Extensions and modifications to the CPUID instruction
-- Modifications to the RDPMC instruction

These new features extend the IA-32 architecture's SIMD programming model in three important ways:

They provide the ability to perform SIMD operations on pairs of packed double-precision floating-point values. This permits higher-precision computations to be carried out in XMM registers, which enhances processor performance in scientific and engineering applications and in applications that use advanced 3-D geometry techniques (such as ray tracing). Additional flexibility is provided with instructions that operate on single (scalar) double-precision floating-point values located in the low quadword of an XMM register.

They provide the ability to operate on 128-bit packed integers (bytes, words, doublewords, and quadwords) in XMM registers. This provides greater flexibility and greater throughput when performing SIMD operations on packed integers. The capability is particularly useful for applications such as RSA authentication and RC5 encryption.
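As an illustration of the packed and scalar double-precision operations and the 128-bit packed integer operations described above, here is a minimal C sketch using the compiler intrinsics from <emmintrin.h>. The function names (add_packed_doubles, add_scalar_double, add_packed_int32) are hypothetical and chosen only for this example. It assumes an x86 target with SSE2 enabled, which is the default on x86-64 compilers and may need a flag such as -msse2 on 32-bit targets.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: adds two arrays of doubles, two elements per step.
 * _mm_add_pd operates on a pair of packed doubles in XMM registers
 * (it typically compiles to ADDPD). n must be even in this sketch. */
void add_packed_doubles(const double *a, const double *b, double *out, size_t n)
{
    assert(n % 2 == 0);
    for (size_t i = 0; i < n; i += 2) {
        __m128d va = _mm_loadu_pd(a + i);   /* unaligned load of 2 doubles */
        __m128d vb = _mm_loadu_pd(b + i);
        _mm_storeu_pd(out + i, _mm_add_pd(va, vb));
    }
}

/* Hypothetical helper: scalar add on the low quadword only
 * (_mm_add_sd typically compiles to ADDSD). */
double add_scalar_double(double x, double y)
{
    __m128d r = _mm_add_sd(_mm_set_sd(x), _mm_set_sd(y));
    return _mm_cvtsd_f64(r);                /* extract the low double */
}

/* Hypothetical helper: adds four packed 32-bit integers held in one
 * 128-bit register (_mm_add_epi32 typically compiles to PADDD). */
void add_packed_int32(const int32_t a[4], const int32_t b[4], int32_t out[4])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_add_epi32(va, vb));
}
```

Unaligned loads and stores are used so the sketch works on any buffer. Code that can guarantee 16-byte alignment could use the aligned variants (_mm_load_pd, _mm_store_pd) instead.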
Using the full set of SIMD registers, data types, and instructions provided with the MMX technology and SSE/SSE2 extensions, programmers can develop algorithms that finely mix packed single- and double-precision floating-point data and 64- and 128-bit packed integer data.

SSE2 extensions enhance the support introduced with SSE extensions for controlling the cacheability of SIMD data. SSE2 cache control instructions provide the ability to stream data in and out of the XMM registers without polluting the caches and the ability to prefetch data before it is actually used.

SSE2 extensions are fully compatible with all software written for IA-32 processors. All existing software continues to run correctly, without modification, on processors that incorporate SSE2 extensions, as well as in the presence of applications that incorporate these extensions. Enhancements to the CPUID instruction permit detection of the SSE2 extensions. Also, because the SSE2 extensions use the same registers as the SSE extensions, no new operating-system support is required for saving and restoring program state during a context switch beyond that provided for the SSE extensions.

SSE2 extensions are accessible from all IA-32 execution modes: protected mode, real addr...
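The CPUID-based detection and the cache-bypassing stores mentioned above can be sketched in C as follows. This is an illustrative sketch, not the manual's own code: the function names cpu_has_sse2 and stream_fill are hypothetical, and the <cpuid.h> header with its __get_cpuid helper is assumed to be available, as it is with GCC and Clang. The SSE2 feature flag is bit 26 of EDX for CPUID leaf 1.

```c
#include <assert.h>
#include <cpuid.h>      /* GCC/Clang helper for the CPUID instruction (assumed toolchain) */
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if CPUID leaf 1 reports SSE2 (EDX bit 26). */
int cpu_has_sse2(void)
{
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;                       /* leaf 1 not supported */
    return (int)((edx >> 26) & 1u);
}

/* Hypothetical helper: fills dst with value using non-temporal stores,
 * which hint that the data should bypass the caches.
 * dst must be 16-byte aligned and n must be even in this sketch. */
void stream_fill(double *dst, size_t n, double value)
{
    assert(((uintptr_t)dst % 16) == 0);
    assert(n % 2 == 0);
    __m128d v = _mm_set1_pd(value);     /* both lanes hold value */
    for (size_t i = 0; i < n; i += 2)
        _mm_stream_pd(dst + i, v);      /* non-temporal store of 2 doubles */
    _mm_sfence();                       /* make the streamed stores globally visible */
}
```

A caller would typically check cpu_has_sse2() once at startup and fall back to a plain loop when it returns 0. Streaming stores tend to pay off only for large buffers that will not be read again soon.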
- Winter '11
- X86, Intel corporation, 64-bit mode, fpu floating-point exception, FPU Control Instructions