and the other operates on 64-bit operands. Multiplications are performed on each vertical pair of 16-bit data elements. The data elements in both the source operand and the destination operand are signed integers.

PMULHRSW multiplies vertically each signed 16-bit integer from the destination operand with the corresponding signed 16-bit integer of the source operand, producing intermediate, signed 32-bit integers. Each intermediate 32-bit integer is truncated to its 18 most significant bits. Rounding is always performed by adding 1 to the least significant bit of the 18-bit intermediate result. The final result is obtained by selecting the 16 bits immediately to the right of the most significant bit of each 18-bit intermediate result; these 16-bit results are packed to the destination operand.

12.6.5 Packed Shuffle Bytes

There are two packed-shuffle-bytes instructions (represented by one mnemonic). One operates on 128-bit operands and the other operates on 64-bit operands. The shuffle operations are performed bytewise on the destination operand using the source operand as a control mask.

PSHUFB permutes each byte in place, according to a shuffle control mask. The least significant three bits (64-bit operands) or four bits (128-bit operands) of each shuffle control byte of the control mask form the shuffle index. The shuffle mask is unaffected. If the most significant bit (bit 7) of a shuffle control byte is set, the constant zero is written to the result byte.

12.6.6 Packed Sign

There are six packed-sign instructions (represented by three mnemonics). Three operate on 128-bit operands and three operate on 64-bit operands. The data elements for these instructions are 8-bit, 16-bit, or 32-bit signed integers.

PSIGNB/W/D negates each signed integer element of the destination operand if the corresponding data element in the source operand is less than zero. If the source element is zero, the destination element is set to zero; if the source element is greater than zero, the destination element is left unchanged.
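The three element-wise operations described above (PMULHRSW, PSHUFB, and PSIGNB/W/D) can be sketched as scalar reference models. This is a minimal Python sketch for checking expected results by hand, not an implementation that uses the instructions themselves. The function names `pmulhrsw`, `pshufb`, `psign`, and `wrap_signed` are hypothetical helpers chosen for illustration, and operands are modeled as plain Python lists with element 0 as the least significant element. The `psign` model also zeroes an element whose source element is zero, which is how PSIGN is documented to behave.

```python
def wrap_signed(value, bits):
    """Reduce value to a two's-complement signed integer of the given width."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def pmulhrsw(dest, src):
    """Model of PMULHRSW on lists of signed 16-bit integers."""
    out = []
    for d, s in zip(dest, src):
        # Keep the 18 most significant bits of the 32-bit product
        # (arithmetic shift right by 14), then add 1 to the lowest bit.
        t = ((d * s) >> 14) + 1
        # Select the 16 bits just right of the top bit: bits 16..1 of t.
        out.append(wrap_signed(t >> 1, 16))
    return out


def pshufb(dest, mask):
    """Model of PSHUFB on lists of 8 or 16 unsigned bytes."""
    if len(dest) not in (8, 16) or len(mask) != len(dest):
        raise ValueError("operands must both hold 8 or 16 bytes")
    index_bits = len(dest) - 1  # 0x07 for 64-bit, 0x0F for 128-bit operands
    original = list(dest)       # every result byte reads the unmodified input
    out = []
    for control in mask:
        if control & 0x80:      # bit 7 set: write the constant zero
            out.append(0)
        else:
            out.append(original[control & index_bits])
    return out


def psign(dest, src, bits):
    """Model of PSIGNB/W/D for element widths of 8, 16, or 32 bits."""
    out = []
    for d, s in zip(dest, src):
        if s < 0:
            # Negation wraps at the element width, so the most negative
            # value stays unchanged.
            out.append(wrap_signed(-d, bits))
        elif s == 0:
            out.append(0)
        else:
            out.append(d)
    return out
```

Read as Q15 fixed-point, `pmulhrsw([16384], [16384])` multiplies 0.5 by 0.5 and yields `[8192]`, that is, 0.25. Reversing a 16-byte operand with `pshufb` amounts to passing the control mask `[15, 14, ..., 0]`.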
12.6.7 Packed Align Right

There are two packed-align-right instructions (represented by one mnemonic). One operates on 128-bit operands and the other operates on 64-bit operands. These instructions concatenate the destination and source operands into a composite, and extract the result from the composite according to an immediate constant.

PALIGNR's source operand is appended after the destination operand, forming an intermediate value of twice the width of an operand, with the destination operand in the upper half. The result is extracted from the intermediate value into the destination operand by selecting the 128-bit or 64-bit value that is right-aligned to the byte offset specified by the immediate value.

12.7 WRITING APPLICATIONS WITH SSSE3 EXTENSIONS

The following sections give guidelines for writing application programs and operating-system code that use SSSE3 instructions.

12.7.1 Guidelines for Using SSSE3 Extensions

The following guidelines describe how to maximize the benefits of using SSSE3 extensions:

Ensure that the processor supports SSSE3 extensions.
Ensure that your operating sys...
This note was uploaded on 10/01/2013 for the course CPE 103 taught by Professor Watlins during the Winter '11 term at Mississippi State.