ia-32_volume1_basic-arch

Vol. 1: PROGRAMMING WITH SSE3 AND SUPPLEMENTAL SSE3

[...]on loads/moves 128-bits, duplicating the second and fourth 32-bit data elements.

MOVSHDUP OperandA, OperandB
-- OperandA (128 bits, four data elements): 3a, 2a, 1a, 0a
-- OperandB (128 bits, four data elements): 3b, 2b, 1b, 0b
-- Result (stored in OperandA): 3b, 3b, 1b, 1b

The MOVSLDUP instruction loads/moves 128-bits, duplicating the first and third 32-bit data elements.

MOVSLDUP OperandA, OperandB
-- OperandA (128 bits, four data elements): 3a, 2a, 1a, 0a
-- OperandB (128 bits, four data elements): 3b, 2b, 1b, 0b
-- Result (stored in OperandA): 2b, 2b, 0b, 0b

The MOVDDUP instruction loads/moves 64-bits; duplicating the 64 bits from the source.

MOVDDUP OperandA, OperandB
-- OperandA (128 bits, two data elements): 1a, 0a
-- OperandB (64 bits, one data element): 0b
-- Result (stored in OperandA): 0b, 0b

12.3.4 SIMD Floating-Point Instructions Provide Packed Addition/Subtraction

The ADDSUBPS instruction has two 128-bit operands. The instruction performs single-precision addition on the second and fourth pairs of 32-bit data elements within the operands; and single-precision subtraction on the first and third pairs.

ADDSUBPS OperandA, OperandB
-- OperandA (128 bits, four data elements): 3a, 2a, 1a, 0a
-- OperandB (128 bits, four data elements): 3b, 2b, 1b, 0b
-- Result (stored in OperandA): 3a+3b, 2a-2b, 1a+1b, 0a-0b

The ADDSUBPD instruction has two 128-bit operands. The instruction performs double-precision addition on the second pair of quadwords, and double-precision subtraction on the first pair.

ADDSUBPD OperandA, OperandB
-- OperandA (128 bits, two data elements): 1a, 0a
-- OperandB (128 bits, two data elements): 1b, 0b
-- Result (stored in OperandA): 1a+1b, 0a-0b

12.3.5 SIMD Floating-Point Instructions Provide Horizontal Addition/Subtraction

Most SIMD instructions operate vertically. This means that the result in position i is a function of the elements in position i of both operands. Horizontal addition/subtraction operates horizontally. This means that contiguous data elements in the same source operand are used to produce a result.

The HADDPS instruction performs a single-precision addition on contiguous data elements. The first data element of the result is obtained by adding the first and second elements of the first operand; the second element by adding the third and fourth elements of the first operand; the third by adding the first and second elements of the second operand; and the fourth by adding the third and fourth elements of the second operand.

HADDPS OperandA, OperandB
-- OperandA (128 bits, four data elements): 3a, 2a, 1a, 0a
-- OperandB (128 bits, four data elements): 3b, 2b, 1b, 0b
-- Result (stored in OperandA): 3b+2b, 1b+0b, 3a+2a, 1a+0a

The HSUBPS instruction performs a single-precision subtraction on contiguous data elements. The first data element of the result is obtained by subtracting the second element of the fir...
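
The element orderings described in this excerpt can be checked with a small scalar model. The sketch below is illustrative only and is not taken from the manual: the function names are hypothetical Python helpers, not real intrinsics, and Python floats are double precision, so the model reproduces only which elements are combined, not single-precision rounding. Note that the lists are written lowest element first (index 0 is element 0), whereas the operand listings above run from element 3 down to element 0. HSUBPS is left out because its description is cut off in the excerpt.

```python
# Scalar model of the SSE3 operations described above (hypothetical helpers).
# Each operand is a list with the lowest data element first: [e0, e1, e2, e3].

def movshdup(b):
    """Duplicate the second and fourth elements: result is 1b, 1b, 3b, 3b."""
    return [b[1], b[1], b[3], b[3]]

def movsldup(b):
    """Duplicate the first and third elements: result is 0b, 0b, 2b, 2b."""
    return [b[0], b[0], b[2], b[2]]

def movddup(b0):
    """Duplicate the single 64-bit source element into both positions."""
    return [b0, b0]

def addsubps(a, b):
    """Subtract in the first and third pairs, add in the second and fourth."""
    return [a[0] - b[0], a[1] + b[1], a[2] - b[2], a[3] + b[3]]

def addsubpd(a, b):
    """Subtract in the first pair, add in the second pair."""
    return [a[0] - b[0], a[1] + b[1]]

def haddps(a, b):
    """Add contiguous elements: two sums from a, then two sums from b."""
    return [a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]]

if __name__ == "__main__":
    a = [1.0, 2.0, 3.0, 4.0]       # 0a, 1a, 2a, 3a
    b = [10.0, 20.0, 30.0, 40.0]   # 0b, 1b, 2b, 3b
    print("MOVSHDUP:", movshdup(b))
    print("MOVSLDUP:", movsldup(b))
    print("MOVDDUP: ", movddup(5.0))
    print("ADDSUBPS:", addsubps(a, b))
    print("ADDSUBPD:", addsubpd(a[:2], b[:2]))
    print("HADDPS:  ", haddps(a, b))
```

Reading each printed list from right to left gives the same order as the Result lines above; for example, the HADDPS output corresponds to 3b+2b, 1b+0b, 3a+2a, 1a+0a.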
This note was uploaded on 10/01/2013 for the course CPE 103 taught by Professor Watlins during the Winter '11 term at Mississippi State.