essor's cache hierarchy
Serializes store operations

5.6 SSE2 INSTRUCTIONS

SSE2 extensions represent an extension of the SIMD execution model introduced with MMX technology and the SSE extensions. SSE2 instructions operate on packed double-precision floating-point operands and on packed byte, word, doubleword, and quadword operands located in the XMM registers. For more detail on these instructions, see Chapter 11, "Programming with Streaming SIMD Extensions 2 (SSE2)."

SSE2 instructions can only be executed on Intel 64 and IA-32 processors that support the SSE2 extensions. Support for these instructions can be detected with the CPUID instruction. See the description of the CPUID instruction in Chapter 3, "Instruction Set Reference, A-M," of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 2A.

These instructions are divided into four subgroups (note that the first subgroup is further divided into subordinate subgroups):

- Packed and scalar double-precision floating-point instructions
- Packed single-precision floating-point conversion instructions
- 128-bit SIMD integer instructions
- Cacheability-control and instruction ordering instructions

The following sections give an overview of each subgroup.

5.6.1 SSE2 Packed and Scalar Double-Precision Floating-Point Instructions

SSE2 packed and scalar double-precision floating-point instructions are divided into the following subordinate subgroups: data movement, arithmetic, comparison, conversion, logical, and shuffle operations on double-precision floating-point operands. These are introduced in the sections that follow.

5.6.1.1 SSE2 Data Movement Instructions

SSE2 data movement instructions move double-precision floating-point data between XMM registers and between XMM registers and memory.

MOVAPD      Move two aligned packed double-precision floating-point values between XMM registers or between an XMM register and memory
MOVUPD      Move two unaligned packed double-precision floating-point values between XMM registers or between an XMM register and memory
MOVHPD      Move high packed double-precision floating-point value to and from the high quadword of an XMM register and memory
MOVLPD      Move low packed double-precision floating-point value to and from the low quadword of an XMM register and memory
MOVMSKPD    Extract sign mask from two packed double-precision floating-point values
MOVSD       Move scalar double-precision floating-point value between XMM registers or between an XMM register and memory

5.6.1.2 SSE2 Packed Arithmetic Instructions

The arithmetic instructions perform addition, subtraction, multiply, divide, square root, and maximum/minimum operations on packed and scalar double-precision floating-point operands.

ADDPD       Add packed double-precision floating-point values
ADDSD       Add scalar double-precision floating-point values
SUBPD       Subtract packed double-precision floating-point v...
SUBSD
MULPD
MULSD
DIVPD
DIVSD
SQRTPD
SQRTSD
MAXPD
MAXSD
MINPD
MINSD
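The CPUID check and a few of the data movement and arithmetic instructions above can be exercised from C through compiler intrinsics. The following is a minimal sketch, not taken from the manual. It assumes GCC or Clang on an x86 target (on 32-bit builds, add -msse2), uses the `__get_cpuid` helper from `<cpuid.h>` and the SSE2 intrinsics from `<emmintrin.h>`, and tests the SSE2 feature flag in bit 26 of EDX for CPUID leaf 1. The function names (`has_sse2`, `add_pairs_aligned`, `add_pairs_unaligned`, `sign_mask`, `add_scalar`) are hypothetical, and the instruction named in each comment is the one the intrinsic usually maps to; the compiler may pick an equivalent encoding.

```c
#include <assert.h>
#include <cpuid.h>      /* GCC/Clang wrapper around the CPUID instruction */
#include <emmintrin.h>  /* SSE2 intrinsics (__m128d and friends) */
#include <stdint.h>

/* Returns 1 if CPUID leaf 1 reports SSE2 (EDX bit 26), 0 otherwise. */
static int has_sse2(void) {
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        return 0;  /* leaf 1 not available */
    }
    return (int)((edx >> 26) & 1u);
}

/* out[i] = a[i] + b[i] for i = 0, 1. All pointers must be 16-byte aligned,
   because the aligned load/store form faults on misaligned addresses. */
static void add_pairs_aligned(const double *a, const double *b, double *out) {
    assert(((uintptr_t)a & 15) == 0);
    assert(((uintptr_t)b & 15) == 0);
    assert(((uintptr_t)out & 15) == 0);
    __m128d va = _mm_load_pd(a);            /* aligned load, typically MOVAPD */
    __m128d vb = _mm_load_pd(b);
    _mm_store_pd(out, _mm_add_pd(va, vb));  /* ADDPD, then aligned store */
}

/* Same operation with no alignment requirement on the pointers. */
static void add_pairs_unaligned(const double *a, const double *b, double *out) {
    __m128d va = _mm_loadu_pd(a);           /* unaligned load, typically MOVUPD */
    __m128d vb = _mm_loadu_pd(b);
    _mm_storeu_pd(out, _mm_add_pd(va, vb));
}

/* Bit 0 is the sign bit of v[0], bit 1 is the sign bit of v[1] (MOVMSKPD). */
static int sign_mask(const double *v) {
    return _mm_movemask_pd(_mm_loadu_pd(v));
}

/* Scalar add: only the low double of each XMM register takes part (ADDSD). */
static double add_scalar(double x, double y) {
    __m128d r = _mm_add_sd(_mm_set_sd(x), _mm_set_sd(y));
    return _mm_cvtsd_f64(r);
}
```

A caller would normally test has_sse2() once at startup and fall back to plain scalar code when it returns 0. On Intel 64 the check always succeeds, since SSE2 is part of the baseline architecture.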
This note was uploaded on 10/01/2013 for the course CPE 103 taught by Professor Watlins during the Winter '11 term at Mississippi State.