ia-32_volume1_basic-arch



the third element of the second operand.

HADDPD    Performs a double-precision addition on contiguous data elements. The first data element of the result is obtained by adding the first and second elements of the first operand; the second element by adding the first and second elements of the second operand.
HSUBPD    Performs a double-precision subtraction on contiguous data elements. The first data element of the result is obtained by subtracting the second element of the first operand from the first element of the first operand; the second element by subtracting the second element of the second operand from the first element of the second operand.

5.7.5 SSE3 SIMD Floating-Point LOAD/MOVE/DUPLICATE Instructions

MOVSHDUP  Loads/moves 128 bits, duplicating the second and fourth 32-bit data elements.
MOVSLDUP  Loads/moves 128 bits, duplicating the first and third 32-bit data elements.
MOVDDUP   Loads/moves 64 bits (bits[63:0] if the source is a register) and returns the same 64 bits in both the lower and upper halves of the 128-bit result register; duplicates the 64 bits from the source.

5.7.6 SSE3 Agent Synchronization Instructions

MONITOR   Sets up an address range used to monitor write-back stores.
MWAIT     Enables a logical processor to enter into an optimized state while waiting for a write-back store to the address range set up by the MONITOR instruction.

5.8 SUPPLEMENTAL STREAMING SIMD EXTENSIONS 3 (SSSE3) INSTRUCTIONS

SSSE3 provides 32 instructions (represented by 14 mnemonics) to accelerate computations on packed integers. These include:

- Twelve instructions that perform horizontal addition or subtraction operations.
- Six instructions that evaluate absolute values.
- Two instructions that perform multiply and add operations and speed up the evaluation of dot products.
- Two instructions that accelerate packed-integer multiply operations and produce integer values with scaling.
- Two instructions that perform a byte-wise, in-place shuffle according to the second shuffle control operand.
- Six instructions that negate packed integers in the destination operand if the corresponding element in the source operand is less than zero.
- Two instructions that align data from the composite of two operands.

SSSE3 instructions can only be executed on Intel 64 and IA-32 processors that support SSSE3 extensions. Support for these instructions can be detected with the CPUID instruction. See the description of the CPUID instruction in Chapter 3, "Instruction Set Reference, A-M," of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 2A.

The sections that follow describe each subgroup.

5.8.1 Horizontal Addition/Subtraction

PHADDW    Adds two adjacent, signed 16-bit integers horizontally from the source and destination operands and packs the signed 16-bit results to the destination operand.

Adds two adjacent, signed 16-bit integers horizontally from the source and destination operands and packs the signed, satura...
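The horizontal operations described in this excerpt (HADDPD, HSUBPD, and PHADDW) can be easier to follow as a small software model. The Python sketch below is illustrative only and does not run the actual instructions. The function names, the use of Python tuples and lists to stand in for register lanes, and the helper `_wrap_int16` are assumptions introduced here. The model of PHADDW assumes 128-bit operands (eight 16-bit lanes), places the sums from the destination in the low four lanes and the sums from the source in the high four lanes, and wraps each result to 16 bits rather than saturating.

```python
def haddpd(first, second):
    """Model HADDPD: each operand holds two double-precision elements.

    Result element 0 is the sum of the first operand's two elements;
    result element 1 is the sum of the second operand's two elements.
    """
    return (first[0] + first[1], second[0] + second[1])


def hsubpd(first, second):
    """Model HSUBPD: subtract each operand's second element from its first."""
    return (first[0] - first[1], second[0] - second[1])


def _wrap_int16(value):
    """Hypothetical helper: wrap an integer to signed 16-bit two's complement."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def _adjacent_sums(lanes):
    """Add each adjacent pair of lanes (0+1, 2+3, ...) with 16-bit wraparound."""
    return [_wrap_int16(lanes[i] + lanes[i + 1]) for i in range(0, len(lanes), 2)]


def phaddw(dest, src):
    """Model PHADDW on 128-bit operands (assumed: eight signed 16-bit lanes each).

    Sums from the destination fill the low half of the result and sums
    from the source fill the high half.
    """
    if len(dest) != 8 or len(src) != 8:
        raise ValueError("expected eight 16-bit lanes per operand")
    return _adjacent_sums(dest) + _adjacent_sums(src)
```

For example, `haddpd((1.0, 2.0), (3.0, 4.0))` yields `(3.0, 7.0)`, and `phaddw([1, 2, 3, 4, 5, 6, 7, 8], [10, 20, 30, 40, 50, 60, 70, 80])` yields `[3, 7, 11, 15, 30, 70, 110, 150]`.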