ia-32_volume1_basic-arch

Of 0000ffbfh note that this value indicates that bit

Info iconThis preview shows page 1. Sign up to view the full content.

View Full Document Right Arrow Icon
This is the end of the preview. Sign up to access the rest of the document.

Unformatted text preview: as a doubleprecision floating-point value and the processor will operate on them accordingly). Because the data types operated on and the data type expected by the ADDPD instruction were inconsistent, the instruction may result in a SIMD floating-point exception (such as numeric overflow [#O] or invalid operation [#I]) being generated, but the actual source of the problem (inconsistent data types) is not detected. The ability to operate on an operand that contains a data type that is inconsistent with the typing of the instruction being executed, permits some valid operations to be performed. For example, the following instructions load a packed double-precision floating-point operand from memory to register XMM0, and a mask to register XMM1; then they use XORPD to toggle the sign bits of the two packed values in register XMM0. movapd xmm0, [eax] ; EAX register contains pointer to packed ; double-precision floating-point operand movaps xmm1, [ebx] ; EBX register contains pointer to packed ; double-precision floating-point mask xorpd xmm0, xmm1 ; XOR operation toggles sign bits using ; the mask in xmm1 In this example: XORPS or PXOR can be used in place of XORPD and yield the same correct result. However, because of the type mismatch between the operand data type and the instruction data type, a latency penalty will be incurred due to implementations of the instructions at the microarchitecture level. Vol. 1 11-33 PROGRAMMING WITH STREAMING SIMD EXTENSIONS 2 (SSE2) Latency penalties can also be incurred by using move instructions of the wrong type. For example, MOVAPS and MOVAPD can both be used to move a packed single-precision operand from memory to an XMM register. However, if MOVAPD is used, a latency penalty will be incurred when a correctly typed instruction attempts to use the data in the register. Note that these latency penalties are not incurred when moving data from XMM registers to memory. 11.6.10 Interfacing with SSE/SSE2 Procedures and Functions SSE and SSE2 extensions allow direct access to XMM registers. This means that all existing interface conventions between procedures and functions that apply to the use of the general-purpose registers (EAX, EBX, etc.) also apply to XMM register usage. 11.6.10.1 Passing Parameters in XMM Registers The state of XMM registers is preserved across procedure (or function) boundaries. Parameters can be passed from one procedure to another using XMM registers. 11.6.10.2 Saving XMM Register State on a Procedure or Function Call The state of XMM registers can be saved in two ways: using an FXSAVE instruction or a move instruction. FXSAVE saves the state of all XMM registers (along with the state of MXCSR and the x87 FPU registers). This instruction is typically used for major changes in the context of the execution environment, such as a task switch. FXRSTOR restores the XMM, MXCSR, and x87 FPU registers stored with FXSAVE. In cases where only XMM registers must be saved, or where selected XMM registers need to...
View Full Document

This note was uploaded on 10/01/2013 for the course CPE 103 taught by Professor Watlins during the Winter '11 term at Mississippi State.

Ask a homework question - tutors are online