Existing MMX technology routines using 128-bit SIMD


...: Reserved and may result in unpredictable behavior.
Segment Override (2EH, 36H, 3EH, 26H, 64H, 65H): Affects instructions with a memory operand. Reserved for instructions without a memory operand and may result in unpredictable behavior.
Repeat Prefixes (F2H and F3H): Reserved and may result in unpredictable behavior.
Lock Prefix (F0H): Reserved; generates invalid opcode exception (#UD).
Branch Hint Prefixes (E2H and E3H): Reserved and may result in unpredictable behavior.

CHAPTER 12
PROGRAMMING WITH SSE3 AND SUPPLEMENTAL SSE3

The Pentium 4 processor supporting Hyper-Threading Technology introduces Streaming SIMD Extensions 3 (SSE3). The Intel Xeon processor 5100 series and Intel Core 2 processor families introduced Supplemental Streaming SIMD Extensions 3 (SSSE3). This chapter describes SSE3/SSSE3 and provides information to assist in writing application programs that use these extensions.

12.1 SSE3/SSSE3 PROGRAMMING ENVIRONMENT AND DATA TYPES

The programming environment for using SSE3/SSSE3 is unchanged from that shown in Figure 3-1 and Figure 11-1. SSE3/SSSE3 do not introduce new data types. XMM registers are used to operate on packed integer data, single-precision floating-point data, or double-precision floating-point data.

One SSE3 instruction uses the x87 FPU for x87-style programming. There are two SSE3 instructions that use the general registers for thread synchronization. The MXCSR register governs SIMD floating-point operations. Note, however, that the x87 FPU control word does not affect the SSE3 instruction that is executed by the x87 FPU (FISTTP), other than by unmasking an invalid operand or inexact result exception.

12.1.1 SSE3/SSSE3 in 64-Bit Mode and Compatibility Mode

In compatibility mode, SSE3/SSSE3 function as they do in protected mode. In 64-bit mode, eight additional XMM registers are accessible. Registers XMM8-XMM15 are accessed by using REX prefixes.
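
To make the REX mechanism concrete, the following C sketch assembles the bytes of a register-to-register HADDPS (an SSE3 instruction, opcode F2 0F 7C /r) and emits a REX prefix only when one of the operands is XMM8-XMM15. The function names `rex_byte` and `encode_haddps_rr` are hypothetical helpers invented for this illustration; the sketch assumes the standard 0100WRXB layout of the REX byte and covers only the register-direct form (ModR/M mod = 11), not memory operands.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* REX prefix layout in 64-bit mode: 0100WRXB.
 *   W selects a 64-bit operand size,
 *   R extends the ModR/M reg field,
 *   X extends the SIB index field,
 *   B extends the ModR/M r/m field (or the SIB base field). */
static uint8_t rex_byte(int w, int r, int x, int b)
{
    return (uint8_t)(0x40 | (w << 3) | (r << 2) | (x << 1) | b);
}

/* Hypothetical helper: encode HADDPS xmm<dst>, xmm<src> (F2 0F 7C /r).
 * Writes at most 5 bytes to out and returns the number written. */
static size_t encode_haddps_rr(unsigned dst, unsigned src, uint8_t out[5])
{
    assert(dst < 16 && src < 16);

    size_t n = 0;
    int r = (int)(dst >> 3);   /* high bit of the destination register */
    int b = (int)(src >> 3);   /* high bit of the source register */

    out[n++] = 0xF2;                      /* mandatory prefix of HADDPS */
    if (r || b)
        out[n++] = rex_byte(0, r, 0, b);  /* needed only for XMM8-XMM15 */
    out[n++] = 0x0F;
    out[n++] = 0x7C;
    /* ModR/M: mod = 11 (register direct), reg = dst[2:0], r/m = src[2:0] */
    out[n++] = (uint8_t)(0xC0 | ((dst & 7) << 3) | (src & 7));
    return n;
}
```

Under these assumptions, `encode_haddps_rr(9, 2, buf)` yields the five bytes F2 44 0F 7C CA, while `encode_haddps_rr(1, 2, buf)` needs no REX prefix and yields F2 0F 7C CA. Note that the REX byte is placed after the mandatory F2 prefix and immediately before the opcode bytes.
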
Memory operands are specified using the ModR/M, SIB encoding described in Section 3.7.5. Some SSE3 instructions may be used to operate on general-purpose registers. Use the REX.W prefix to access 64-bit general-purpose registers. Note that if a REX prefix is used when it has no meaning, the prefix is ignored.

12.1.2 Compatibility of SSE3/SSSE3 with MMX Technology, the x87 FPU Environment, and SSE/SSE2 Extensions

SSE3/SSSE3 do not introduce any new state to the Intel 64 and IA-32 execution environments. For SIMD and x87 programming, the FXSAVE and FXRSTOR instructions save and restore the architectural states of XMM, MXCSR, x87 FPU, and MMX registers. The MONITOR and MWAIT instructions use general-purpose registers on input; they do not modify the content of those registers.

12.1.3 Horizontal and Asymmetric Processing

Many SSE/SSE2/SSE3/SSSE3 instructions accelerate SIMD data processing using a model referred to as vertical computation. Using this model, data flow is vertical between the data elements of the inputs and the ou...
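
The difference between vertical and horizontal data flow can be sketched with a scalar C model of a 128-bit register holding four single-precision lanes. This is only an illustration: the type `vec4f` and the functions `add_vertical` and `add_horizontal` are hypothetical names, not part of any Intel API, and the horizontal case is modeled on the documented behavior of the SSE3 HADDPS instruction rather than on the truncated text above. Real code would normally use compiler intrinsics such as `_mm_add_ps` and `_mm_hadd_ps` instead.

```c
#include <stddef.h>

/* Scalar model of an XMM register viewed as four packed floats. */
typedef struct {
    float lane[4];
} vec4f;

/* Vertical computation (the ADDPS pattern):
 * lane i of the result depends only on lane i of each input. */
static vec4f add_vertical(vec4f a, vec4f b)
{
    vec4f r;
    for (size_t i = 0; i < 4; i++)
        r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

/* Horizontal computation (the HADDPS pattern):
 * adjacent lanes of the same input are added together, giving
 * { a0 + a1, a2 + a3, b0 + b1, b2 + b3 }. */
static vec4f add_horizontal(vec4f a, vec4f b)
{
    vec4f r;
    r.lane[0] = a.lane[0] + a.lane[1];
    r.lane[1] = a.lane[2] + a.lane[3];
    r.lane[2] = b.lane[0] + b.lane[1];
    r.lane[3] = b.lane[2] + b.lane[3];
    return r;
}
```

With a = {1, 2, 3, 4} and b = {10, 20, 30, 40}, the vertical model gives {11, 22, 33, 44} and the horizontal model gives {3, 7, 30, 70}. The horizontal form is useful for reductions such as dot products, where values that sit side by side in one register must be combined.
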