PROGRAMMING WITH STREAMING SIMD EXTENSIONS (SSE)

ntegers or a scalar single-precision floating-point value into a doubleword integer (see Figure 11-8).

SSE extensions provide conversion instructions between XMM registers and MMX registers, and between XMM registers and general-purpose registers. See Figure 11-8.

The address of a 128-bit packed memory operand must be aligned on a 16-byte boundary, except in the following cases:
- The MOVUPS instruction supports unaligned accesses.
- Scalar instructions use a 4-byte memory operand that is not subject to alignment requirements.

Figure 4-2 shows the byte order of 128-bit (double quadword) data types in memory.

10.4 SSE INSTRUCTION SET

SSE instructions are divided into four functional groups:
- Packed and scalar single-precision floating-point instructions
- 64-bit SIMD integer instructions
- State management instructions
- Cacheability control, prefetch, and memory ordering instructions

The following sections give an overview of the instructions in each of these groups.

10.4.1 SSE Packed and Scalar Floating-Point Instructions

The packed and scalar single-precision floating-point instructions are divided into the following subgroups:
- Data movement instructions
- Arithmetic instructions
- Logical instructions
- Comparison instructions
- Shuffle instructions
- Conversion instructions

The packed single-precision floating-point instructions perform SIMD operations on packed single-precision floating-point operands (see Figure 10-5). Each source operand contains four single-precision floating-point values, and the destination operand contains the results of the operation (OP) performed in parallel on the corresponding values (X0 and Y0, X1 and Y1, X2 and Y2, and X3 and Y3) in each operand.

Figure 10-5. Packed Single-Precision Floating-Point Operation
(source operands: X3 X2 X1 X0 and Y3 Y2 Y1 Y0; destination: X3 OP Y3, X2 OP Y2, X1 OP Y1, X0 OP Y0)
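
The element-wise behavior shown in Figure 10-5 can be sketched in plain C. This is a model of the semantics only, not SSE code: the names `packed_op`, `lane_op`, `op_add`, `op_mul`, and `LANES` are hypothetical and introduced here for illustration, and index 0 of each array is assumed to stand for the low element (X0, Y0).

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4  /* four single-precision values per 128-bit operand */

/* Hypothetical stand-in for the per-element operation "OP". */
typedef float (*lane_op)(float, float);

static float op_add(float a, float b) { return a + b; }
static float op_mul(float a, float b) { return a * b; }

/*
 * Model of a packed operation: dst[i] = x[i] OP y[i] for all four elements.
 * Index 0 models X0/Y0 (the low doubleword), index 3 models X3/Y3.
 * A real packed instruction handles the four elements in parallel;
 * this loop only reproduces the result.
 */
static void packed_op(lane_op op, const float x[LANES],
                      const float y[LANES], float dst[LANES])
{
    assert(op != NULL && x != NULL && y != NULL && dst != NULL);
    for (int i = 0; i < LANES; i++) {
        dst[i] = op(x[i], y[i]);
    }
}
```

With x = {1, 2, 3, 4} and y = {10, 20, 30, 40}, calling `packed_op(op_add, x, y, dst)` leaves dst = {11, 22, 33, 44}: every element pair is combined, and no element of the first source passes through unchanged.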

The scalar single-precision floating-point instructions operate on the low (least significant) doublewords of the two source operands (X0 and Y0); see Figure 10-6. The three most significant doublewords (X1, X2, and X3) of the first source operand are passed through to the destination. The scalar operations are similar to the floating-point operations performed in the x87 FPU data registers with the precision control field in the x87 FPU control word set for single precision (24-bit significand), except that x87 stack operations use a 15-bit exponent range for the result, while SSE operations use an 8-bit exponent range.

Figure 10-6. Scalar Single-Precision Floating-Point Operation
(source operands: X3 X2 X1 X0 and Y3 Y2 Y1 Y0; destination: X3, X2, X1, X0 OP Y0)

10.4.1.1 SSE Data Movement Instructions

SSE data movement instructions move single-precision floating-point data between XMM registers and between an XMM register and memory.

The MOVAPS (move aligned packed single-precision floating-point values) instructi...
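
The scalar operation shown in Figure 10-6 can be modeled in plain C as follows. This is a sketch of the semantics only, not SSE code: `scalar_op`, `lane_op`, `op_add`, and `LANES` are hypothetical names introduced here, and index 0 of each array is assumed to stand for the low doubleword (X0, Y0).

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4  /* four single-precision values per 128-bit operand */

/* Hypothetical stand-in for the operation "OP" applied to the low element. */
typedef float (*lane_op)(float, float);

static float op_add(float a, float b) { return a + b; }

/*
 * Model of a scalar operation: only the low elements are combined,
 *   dst[0]    = x[0] OP y[0]
 *   dst[1..3] = x[1..3]   (passed through from the first source operand)
 * The low result is computed first so the model stays correct when
 * dst is the same array as x.
 */
static void scalar_op(lane_op op, const float x[LANES],
                      const float y[LANES], float dst[LANES])
{
    assert(op != NULL && x != NULL && y != NULL && dst != NULL);
    float low = op(x[0], y[0]);
    for (int i = 1; i < LANES; i++) {
        dst[i] = x[i];
    }
    dst[0] = low;
}
```

With x = {1, 2, 3, 4} and y = {10, 20, 30, 40}, calling `scalar_op(op_add, x, y, dst)` leaves dst = {11, 2, 3, 4}: only X0 and Y0 are combined, and the upper three elements of y are ignored.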
This note was uploaded on 10/01/2013 for the course CPE 103 taught by Professor Watlins during the Winter '11 term at Mississippi State.