The FISTTP instruction in SSE3 is intended to accelerate x87-style programming where performance is limited by frequent floating-point conversion to integers; this happens when the x87 FPU control word is modified frequently. Use of FISTTP can eliminate the need to access the x87 FPU control word.

12.5 OVERVIEW OF SSSE3 INSTRUCTIONS

SSSE3 provides 32 instructions to accelerate a variety of multimedia and signal processing applications employing SIMD integer data.

See:
- Section 12.6, "SSSE3 Instructions," provides an introduction to individual SSSE3 instructions.
- Intel 64 and IA-32 Architectures Software Developer's Manual, Volumes 2A & 2B, provide detailed information on individual instructions.
- Chapter 12, "System Programming for Streaming SIMD Instruction Sets," in the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A, gives guidelines for integrating SSE/SSE2/SSE3/SSSE3 extensions into an operating-system environment.

Vol. 1 12-9 PROGRAMMING WITH SSE3 AND SUPPLEMENTAL SSE3

12.6 SSSE3 INSTRUCTIONS

SSSE3 instructions include:
- Twelve instructions that perform horizontal addition or subtraction operations.
- Six instructions that evaluate absolute values.
- Two instructions that perform multiply and add operations and speed up the evaluation of dot products.
- Two instructions that accelerate packed-integer multiply operations and produce integer values with scaling.
- Two instructions that perform a byte-wise, in-place shuffle according to the second shuffle control operand.
- Six instructions that negate packed integers in the destination operand if the sign of the corresponding element in the source operand is negative.
- Two instructions that align data from the composite of two operands.

The operands of these instructions are packed integers of byte, word, or doubleword size. The operands are stored as 64-bit or 128-bit data in MMX registers, XMM registers, or memory.
The instructions are discussed in more detail in the following paragraphs.

12.6.1 Horizontal Addition/Subtraction

In analogy to the packed, floating-point horizontal add and subtract instructions in SSE3, SSSE3 offers similar capabilities on packed integer data. Signed word and doubleword data elements are supported. Saturating versions of horizontal add and subtract on signed words are also supported. The horizontal data movement of PHADDD is shown in Figure 12-3.

[Figure 12-3. Horizontal Data Movement in PHADDD. Operands: X3 X2 X1 X0 and Y3 Y2 Y1 Y0. Adjacent elements are added pairwise, giving the result Y2 + Y3, Y0 + Y1, X2 + X3, X0 + X1.]

There are six horizontal add instructions (represented by three mnemonics); three operate on 128-bit operands and three operate on 64-bit operands. The width of each data element is either 16 bits or 32 bits. The mnemonics are listed below.
- PHADDW adds two adjacent, signed 16-bit integers horizontally from the source and destination operands and packs the signed 16-bit results to the destination operand.
- PHADDSW a...
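As a rough illustration of the data movement in Figure 12-3, the following portable C sketch models the 128-bit form of PHADDD in scalar code. The function name phaddd_model and the array layout (element 0 is the least significant doubleword, X plays the role of the destination operand, Y the source) are assumptions made for this example; they are not part of the instruction set or of any Intel header, and the sketch makes no attempt to use the instruction itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not an Intel intrinsic): scalar model of the
 * 128-bit PHADDD data movement shown in Figure 12-3.
 * x[] models the destination operand, y[] the source operand;
 * index 0 is the least significant doubleword. */
static void phaddd_model(int32_t x[4], const int32_t y[4])
{
    int32_t r[4];

    assert(x && y);

    /* PHADDD does not saturate, so each sum wraps modulo 2^32.
     * The addition is done in unsigned arithmetic to avoid signed
     * overflow; converting back to int32_t is implementation-defined
     * but wraps on mainstream two's-complement compilers. */
    r[0] = (int32_t)((uint32_t)x[0] + (uint32_t)x[1]); /* X0 + X1 */
    r[1] = (int32_t)((uint32_t)x[2] + (uint32_t)x[3]); /* X2 + X3 */
    r[2] = (int32_t)((uint32_t)y[0] + (uint32_t)y[1]); /* Y0 + Y1 */
    r[3] = (int32_t)((uint32_t)y[2] + (uint32_t)y[3]); /* Y2 + Y3 */

    /* The result overwrites the destination operand. */
    for (int i = 0; i < 4; i++) {
        x[i] = r[i];
    }
}
```

For example, with X = {1, 2, 3, 4} and Y = {10, 20, 30, 40} (element 0 first), the model leaves {3, 7, 30, 70} in X. The temporary array r is used so that the sums are all computed from the original values of X before any of them is overwritten.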