Vol. 2, INSTRUCTION SET REFERENCE, A-M

[...] The non-temporal hint is implemented by using a write combining (WC) memory type protocol (see "Caching of Temporal vs. Non-Temporal Data" in Chapter 10 of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 1). Because the WC protocol uses a weakly-ordered memory consistency model, a fencing operation implemented with the SFENCE or MFENCE instruction should be used in conjunction with MASKMOVDQU instructions if multiple processors might use different memory types to read/write the destination memory locations.

Behavior with a mask of all 0s is as follows:
- No data will be written to memory.
- Signaling of breakpoints (code or data) is not guaranteed; different processor implementations may signal or not signal these breakpoints.
- Exceptions associated with addressing memory and page faults may still be signaled (implementation dependent).
- If the destination memory region is mapped as UC or WP, enforcement of associated semantics for these memory types is not guaranteed (that is, is reserved) and is implementation-specific.

The MASKMOVDQU instruction can be used to improve performance of algorithms that need to merge data on a byte-by-byte basis. MASKMOVDQU should not cause a read for ownership; doing so consumes bandwidth unnecessarily, since the data is written directly using the byte mask without allocating old data prior to the store.

In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).

Operation
IF (MASK[7] = 1)
    THEN DEST[DI/EDI] ← SRC[7:0]
    ELSE (* Memory location unchanged *); FI;
IF (MASK[15] = 1)
    THEN DEST[DI/EDI + 1] ← SRC[15:8]
    ELSE (* Memory location unchanged *); FI;
(* Repeat operation for 3rd through 15th bytes in source operand *)
IF (MASK[127] = 1)
    THEN DEST[DI/EDI + 15] ← SRC[127:120]
    ELSE (* Memory location unchanged *); FI;

Intel C/C++ Compiler Intrinsic Equivalent
void _mm_maskmoveu_si128(__m128i d, __m128i n, char *p)

Protected Mode Exceptions
#GP(0)            For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments (even if mask is all 0s).
                  If the destination operand is in a nonwritable segment.
                  If the DS, ES, FS, or GS register contains a NULL segment selector.
#SS(0)            For an illegal address in the SS segment (even if mask is all 0s).
#PF(fault-code)   For a page fault (implementation specific).
#NM               If CR0.TS[bit 3] = 1.
#UD               If CR0.EM[bit 2] = 1.
                  If CR4.OSFXSR[bit 9] = 0.
                  If CPUID.01H:EDX.SSE2[bit 26] = 0.

Real-Address Mode Exceptions
#GP(0)            If any part of the operand lies outside the effective address space from 0 to FFFFH (even if mask is all 0s).
#NM               If CR0.TS[bit 3] = 1.
#UD               If CR0.EM[bit 2] = 1.
                  If CR4.OSFXSR[bit 9] = 0.
                  If CPUID.01H:EDX.SSE2[bit 26] = 0.

Virtual-8086 Mode Exceptions
Same exceptions as in Real-Address Mode.
#PF(fault-code)   For a page fault (implementation specific).

Compatibility Mode Exceptions
Same exceptions as in Protected Mode.

64-Bit Mode Exce...
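
As an illustration of the Operation section, the byte-masked store can be modeled in portable C and compared with the intrinsic. This is only a sketch under stated assumptions: the helper names maskmov_model, maskmov_sse2 and maskmov_demo are hypothetical (they are not part of any Intel API), the SSE2 path is compiled only when the compiler defines __SSE2__ (as GCC and Clang do on x86-64; other compilers may need a different check), and the store is taken to be selected by the most significant bit of each mask byte. The SFENCE after the store follows the fencing recommendation in the description and is not required for single-threaded use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)   /* assumption: GCC/Clang-style feature macro */
#include <emmintrin.h>
#endif

/* Hypothetical helper: portable model of the Operation pseudocode.
   Byte i of src is stored to dest[i] only when bit 7 of mask[i] is 1;
   otherwise dest[i] is left unchanged. */
static void maskmov_model(uint8_t *dest, const uint8_t src[16],
                          const uint8_t mask[16])
{
    for (size_t i = 0; i < 16; i++) {
        if (mask[i] & 0x80u) {
            dest[i] = src[i];
        }
    }
}

#if defined(__SSE2__)
/* Hypothetical helper: the same store through the documented intrinsic.
   dest does not need to be 16-byte aligned. */
static void maskmov_sse2(uint8_t *dest, const uint8_t src[16],
                         const uint8_t mask[16])
{
    __m128i d = _mm_loadu_si128((const __m128i *)src);
    __m128i n = _mm_loadu_si128((const __m128i *)mask);
    _mm_maskmoveu_si128(d, n, (char *)dest);
    _mm_sfence();  /* order the weakly-ordered WC store before later stores */
}
#endif

/* Hypothetical self-check: store every other byte and confirm that the
   remaining bytes keep their previous value. */
static void maskmov_demo(void)
{
    uint8_t src[16], mask[16], dest[16];
    for (int i = 0; i < 16; i++) {
        src[i]  = (uint8_t)(i + 1);
        mask[i] = (uint8_t)((i % 2 == 0) ? 0x80 : 0x7F); /* only bit 7 matters */
        dest[i] = 0xEE;
    }
    maskmov_model(dest, src, mask);
    for (int i = 0; i < 16; i++) {
        assert(dest[i] == ((i % 2 == 0) ? src[i] : 0xEE));
    }
#if defined(__SSE2__)
    uint8_t dest2[16];
    memset(dest2, 0xEE, sizeof dest2);
    maskmov_sse2(dest2, src, mask);
    assert(memcmp(dest, dest2, sizeof dest) == 0);
#endif
}
```

Calling maskmov_demo() from a test harness exercises the model on any platform and, where SSE2 is available, checks that the intrinsic produces the same bytes. With an all-zero mask, maskmov_model writes nothing, which matches the first item of the all-0s behavior described above; the exception and breakpoint caveats in that list are hardware behavior that the model does not capture.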
This note was uploaded on 10/01/2013 for the course CPE 103 taught by Professor Watlins during the Winter '11 term at Mississippi State.