Data Types Introduced with the SSE2 Extensions
All of these data types are operated on in XMM registers or memory. Instructions are provided to convert between these 128-bit data types and the 64-bit and 32-bit data types.

Vol. 1 11-5 PROGRAMMING WITH STREAMING SIMD EXTENSIONS 2 (SSE2)

The address of a 128-bit packed memory operand must be aligned on a 16-byte boundary, except in the following cases:
- a MOVUPD instruction, which supports unaligned accesses
- scalar instructions that use an 8-byte memory operand that is not subject to alignment requirements

Figure 4-2 shows the byte order of 128-bit (double quadword) and 64-bit (quadword) data types in memory.

11.4 SSE2 INSTRUCTIONS

The SSE2 instructions are divided into four functional groups:
- Packed and scalar double-precision floating-point instructions
- 64-bit and 128-bit SIMD integer instructions
- 128-bit extensions of SIMD integer instructions introduced with the MMX technology and the SSE extensions
- Cacheability-control and instruction-ordering instructions

The following sections provide more information about each group.

11.4.1 Packed and Scalar Double-Precision Floating-Point Instructions

The packed and scalar double-precision floating-point instructions are divided into the following subgroups:
- Data movement instructions
- Arithmetic instructions
- Comparison instructions
- Conversion instructions
- Logical instructions
- Shuffle instructions

The packed double-precision floating-point instructions perform SIMD operations similarly to the packed single-precision floating-point instructions (see Figure 11-3). Each source operand contains two double-precision floating-point values, and the destination operand contains the results of the operation (OP) performed in parallel on the corresponding values (X0 and Y0, and X1 and Y1) in each operand.

[Figure 11-3. Packed Double-Precision Floating-Point Operations. Source operands: (X1, X0) and (Y1, Y0). Destination: (X1 OP Y1, X0 OP Y0).]
The scalar double-precision floating-point instructions operate on the low (least significant) quadwords of two source operands (X0 and Y0), as shown in Figure 11-4. The high quadword (X1) of the first source operand is passed through to the destination. The scalar operations are similar to the floating-point operations performed in x87 FPU data registers with the precision control field in the x87 FPU control word set for double precision (53-bit significand), except that x87 stack operations use a 15-bit exponent range for the result while SSE2 operations use an 11-bit exponent range. See Section 11.6.8, "Compatibility of SIMD and x87 FPU Floating-Point Data Types," for more information about obtaining compatible results when performing both scalar double-precision floating-point operations in XMM registers and in x87 FPU data registers.

[Figure 11-4. Scalar Double-Precision Floating-Point Operations. Source operands: (X1, X0) and (Y1, Y0). Destination: (X1, X0 OP Y0).]

11.4....
This note was uploaded on 10/01/2013 for the course CPE 103 taught by Professor Watlins during the Winter '11 term at Mississippi State.