ia-32_volume1_basic-arch



Data Types Introduced with the SSE2 Extensions

All of these data types are operated on in XMM registers or memory. Instructions are provided to convert between these 128-bit data types and the 64-bit and 32-bit data types.

Vol. 1 11-5 PROGRAMMING WITH STREAMING SIMD EXTENSIONS 2 (SSE2)

The address of a 128-bit packed memory operand must be aligned on a 16-byte boundary, except in the following cases:

- a MOVUPD instruction which supports unaligned accesses
- scalar instructions that use an 8-byte memory operand that is not subject to alignment requirements

Figure 4-2 shows the byte order of 128-bit (double quadword) and 64-bit (quadword) data types in memory.

11.4 SSE2 INSTRUCTIONS

The SSE2 instructions are divided into four functional groups:

- Packed and scalar double-precision floating-point instructions
- 64-bit and 128-bit SIMD integer instructions
- 128-bit extensions of SIMD integer instructions introduced with the MMX technology and the SSE extensions
- Cacheability-control and instruction-ordering instructions

The following sections provide more information about each group.

11.4.1 Packed and Scalar Double-Precision Floating-Point Instructions

The packed and scalar double-precision floating-point instructions are divided into the following sub-groups:

- Data movement instructions
- Arithmetic instructions
- Comparison instructions
- Conversion instructions
- Logical instructions
- Shuffle instructions

The packed double-precision floating-point instructions perform SIMD operations similarly to the packed single-precision floating-point instructions (see Figure 11-3). Each source operand contains two double-precision floating-point values, and the destination operand contains the results of the operation (OP) performed in parallel on the corresponding values (X0 and Y0, and X1 and Y1) in each operand.

Figure 11-3. Packed Double-Precision Floating-Point Operations (sources X1 X0 and Y1 Y0; result X1 OP Y1, X0 OP Y0)

The scalar double-precision floating-point instructions operate on the low (least significant) quadwords of two source operands (X0 and Y0), as shown in Figure 11-4. The high quadword (X1) of the first source operand is passed through to the destination. The scalar operations are similar to the floating-point operations performed in x87 FPU data registers with the precision control field in the x87 FPU control word set for double precision (53-bit significand), except that x87 stack operations use a 15-bit exponent range for the result while SSE2 operations use an 11-bit exponent range. See Section 11.6.8, "Compatibility of SIMD and x87 FPU Floating-Point Data Types," for more information about obtaining compatible results when performing both scalar double-precision floating-point operations in XMM registers and in x87 FPU data registers.

Figure 11-4. Scalar Double-Precision Floating-Point Operations (sources X1 X0 and Y1 Y0; result X1, X0 OP Y0)

11.4....
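The packed and scalar data flows described for Figures 11-3 and 11-4, together with the 16-byte alignment rule for 128-bit packed memory operands, can be modeled with a short sketch. This is only an illustrative model in Python, not SSE2 code: the names `Xmm`, `packed_op`, `scalar_op`, and `is_16_byte_aligned` are hypothetical and do not come from the manual, and an XMM register is assumed to be representable as a pair of Python floats (which CPython stores as IEEE 754 double-precision values on common platforms).

```python
import operator
from typing import Callable, Tuple

# Hypothetical model: a 128-bit XMM value holding two double-precision
# values, written as (low quadword, high quadword), i.e. (X0, X1).
Xmm = Tuple[float, float]
BinaryOp = Callable[[float, float], float]


def packed_op(op: BinaryOp, x: Xmm, y: Xmm) -> Xmm:
    """Packed form (Figure 11-3): OP is applied to both pairs,
    giving (X0 OP Y0, X1 OP Y1)."""
    return (op(x[0], y[0]), op(x[1], y[1]))


def scalar_op(op: BinaryOp, x: Xmm, y: Xmm) -> Xmm:
    """Scalar form (Figure 11-4): OP is applied to the low quadwords only;
    the high quadword X1 of the first source passes through unchanged."""
    return (op(x[0], y[0]), x[1])


def is_16_byte_aligned(address: int) -> bool:
    """True if an address satisfies the 16-byte alignment required for
    128-bit packed memory operands (outside the listed exceptions)."""
    return address % 16 == 0


if __name__ == "__main__":
    x = (1.5, 10.0)   # (X0, X1)
    y = (2.0, 20.0)   # (Y0, Y1)
    print(packed_op(operator.add, x, y))   # (3.5, 30.0)
    print(scalar_op(operator.add, x, y))   # (3.5, 10.0)
    print(is_16_byte_aligned(0x1000), is_16_byte_aligned(0x1008))  # True False
```

In the scalar case the second element of the result equals X1 regardless of Y1, which is the pass-through behavior the text describes; the packed case touches both elements.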

This note was uploaded on 10/01/2013 for the course CPE 103 taught by Professor Watlins during the Winter '11 term at Mississippi State.
