10-8 Vol. 1 PROGRAMMING WITH STREAMING SIMD EXTENSIONS (SSE)

... integers or a scalar single-precision floating-point value into a doubleword integer (see Figure 11-8).

SSE extensions provide conversion instructions between XMM registers and MMX registers, and between XMM registers and general-purpose bit registers. See Figure 11-8.

The address of a 128-bit packed memory operand must be aligned on a 16-byte boundary, except in the following cases:

• The MOVUPS instruction supports unaligned accesses.
• Scalar instructions that use a 4-byte memory operand that is not subject to alignment requirements.

Figure 4-2 shows the byte order of 128-bit (double quadword) data types in memory.

10.4 SSE INSTRUCTION SET

SSE instructions are divided into four functional groups:

• Packed and scalar single-precision floating-point instructions
• 64-bit SIMD integer instructions
• State management instructions
• Cacheability control, prefetch, and memory ordering instructions

The following sections give an overview of each of the instructions in these groups.

10.4.1 SSE Packed and Scalar Floating-Point Instructions

The packed and scalar single-precision floating-point instructions are divided into the following subgroups:

• Data movement instructions
• Arithmetic instructions
• Logical instructions
• Comparison instructions
• Shuffle instructions
• Conversion instructions

The packed single-precision floating-point instructions perform SIMD operations on packed single-precision floating-point operands (see Figure 10-5). Each source operand contains four single-precision floating-point values, and the destination operand contains the results of the operation (OP) performed in parallel on the corresponding values (X0 and Y0, X1 and Y1, X2 and Y2, and X3 and Y3) in each operand.

    Source 1:      X3         X2         X1         X0
    Source 2:      Y3         Y2         Y1         Y0
                   OP         OP         OP         OP
    Destination:   X3 OP Y3   X2 OP Y2   X1 OP Y1   X0 OP Y0

Figure 10-5. Packed Single-Precision Floating-Point Operation
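The lane-wise behavior of the packed operation described above can be sketched in plain C. This is only an illustrative model of the semantics, not SSE code: the type `packed4f` and the helpers `packed_apply`, `op_add`, and `op_mul` are hypothetical names introduced here, and an ordinary loop stands in for what the hardware does in parallel.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative model of one 128-bit operand: four single-precision
   lanes, with lane[0] standing for the low element (X0 or Y0).
   The type name is an assumption made for this sketch. */
typedef struct {
    float lane[4];
} packed4f;

/* Hypothetical per-lane operation: takes one value from each source. */
typedef float (*lane_op)(float, float);

static float op_add(float a, float b) { return a + b; }
static float op_mul(float a, float b) { return a * b; }

/* Model of a packed operation: OP is applied independently to each
   pair of corresponding lanes (X0 and Y0, X1 and Y1, and so on).
   Real hardware performs the four operations in parallel; the loop
   here only reproduces the result. */
static packed4f packed_apply(packed4f x, packed4f y, lane_op op) {
    packed4f result;
    assert(op != NULL);
    for (size_t i = 0; i < 4; ++i) {
        result.lane[i] = op(x.lane[i], y.lane[i]);
    }
    return result;
}
```

With sources holding {1, 2, 3, 4} and {10, 20, 30, 40} in lanes 0 through 3, `packed_apply(x, y, op_add)` yields {11, 22, 33, 44}: no lane influences any other.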
The scalar single-precision floating-point instructions operate on the low (least significant) doublewords of the two source operands (X0 and Y0); see Figure 10-6. The three most significant doublewords (X1, X2, and X3) of the first source operand are passed through to the destination. The scalar operations are similar to the floating-point operations performed in the x87 FPU data registers with the precision control field in the x87 FPU control word set for single precision (24-bit significand), except that x87 stack operations use a 15-bit exponent range for the result, while SSE operations use an 8-bit exponent range.

    Source 1:      X3   X2   X1   X0
    Source 2:      Y3   Y2   Y1   Y0
                                  OP
    Destination:   X3   X2   X1   X0 OP Y0

Figure 10-6. Scalar Single-Precision Floating-Point Operation

10.4.1.1 SSE Data Movement Instructions

SSE data movement instructions move single-precision floating-point data between XMM registers and between an XMM register and memory.

The MOVAPS (move aligned packed single-precision floating-point values) instructi...
This note was uploaded on 10/01/2013 for the course CPE 103 taught by Professor Watlins during the Winter '11 term at Mississippi State.