ction calls and C variables instead of hardware registers. Using these intrinsics frees programmers from having to manage registers and write assembly code. Further, the compiler optimizes the instruction scheduling so that executables run faster. The following sections discuss the intrinsics API and the MMX technology and SIMD floating-point intrinsics. Each intrinsic equivalent is listed with the instruction description. There may be additional intrinsics that do not have an instruction equivalent. Readers are strongly encouraged to consult the compiler documentation for the complete list of supported intrinsics. See Appendix C, "Intel C/C++ Compiler Intrinsics and Functional Equivalents," in the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 2B, for more information on using intrinsics.

The Intrinsics API
The benefit of coding with MMX technology intrinsics and the SSE/SSE2/SSE3 intrinsics is that you can use the syntax of C function calls and C variables instead of hardware registers. This frees you from managing registers and programming assembly. Further, the compiler optimizes the instruction scheduling so that your executable runs faster. For each computational and data manipulation instruction in the new instruction set, there is a corresponding C intrinsic that implements it directly. The intrinsics allow you to specify the underlying implementation (instruction selection) of an algorithm yet leave instruction scheduling and register allocation to the compiler.

MMX™ Technology Intrinsics
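The idea that an intrinsic selects the instruction while the compiler handles registers and scheduling can be sketched with two MMX technology intrinsics. This is a minimal illustration, assuming an x86 toolchain that provides `<mmintrin.h>` (GCC or Clang, for instance); the helper name `add4_i16` is hypothetical and not part of any Intel API.

```c
#include <mmintrin.h>  /* MMX technology intrinsics: __m64, _mm_set_pi16, _mm_add_pi16 */
#include <stdint.h>

/* Hypothetical helper: adds four pairs of 16-bit integers (wraparound on overflow). */
void add4_i16(const int16_t a[4], const int16_t b[4], int16_t out[4])
{
    /* A union is one permitted way to reach the individual elements of an __m64. */
    union { __m64 v; int16_t e[4]; } r;

    /* _mm_set_pi16 takes its arguments from the highest element down to the lowest. */
    __m64 va = _mm_set_pi16(a[3], a[2], a[1], a[0]);
    __m64 vb = _mm_set_pi16(b[3], b[2], b[1], b[0]);

    /* Corresponds to the PADDW instruction; no register names appear in the source. */
    r.v = _mm_add_pi16(va, vb);

    for (int i = 0; i < 4; ++i)
        out[i] = r.e[i];

    /* Corresponds to EMMS: clears the MMX state before any later x87 floating-point code. */
    _mm_empty();
}
```

Note that the `__m64` values are only assigned, passed as parameters, or returned by intrinsics; no `+` or `>>` operator is applied to them directly.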
The MMX technology intrinsics are based on a __m64 data type that represents the specific contents of an MMX technology register. You can specify values in bytes, short integers, 32-bit values, or a 64-bit object. The __m64 data type, however, is not a basic ANSI C data type, and therefore you must observe the following usage restrictions:

- Use __m64 data only on the left-hand side of an assignment, as a return value, or as a parameter. You cannot use it with other arithmetic expressions ("+", ">>", and so on).
- Use __m64 objects in aggregates, such as unions to access the byte elements and structures; the address of an __m64 object may be taken.
- Use __m64 data only with the MMX technology intrinsics described in this manual and Intel C/C++ compiler documentation. See:
  -- http://www.intel.com/support/performancetools/
  -- Appendix C, "Intel C/C++ Compiler Intrinsics and Functional Equivalents," in the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 2B, for more information on using intrinsics.

SSE/SSE2/SSE3 Intrinsics
SSE/SSE2/SSE3 intrinsics all make use of the XMM registers of the Pentium III, Pentium 4, and Intel Xeon processors. There are three data types supported by these intrinsics: __m128, __m128d, and __m128i. The __m128 data type is used to represent the contents of an XMM register used by an SSE intrinsic...
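
As a small illustration of the __m128 type in use, the sketch below adds four single-precision values with SSE intrinsics. It assumes a compiler that provides `<xmmintrin.h>` and an SSE-capable target; the helper name `add4_f32` is hypothetical.

```c
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_loadu_ps, _mm_add_ps, _mm_storeu_ps */

/* Hypothetical helper: adds four pairs of single-precision floats. */
void add4_f32(const float a[4], const float b[4], float out[4])
{
    /* Unaligned loads, so the caller's arrays need no special alignment. */
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);

    /* Corresponds to the ADDPS instruction; the compiler picks the XMM registers. */
    __m128 sum = _mm_add_ps(va, vb);

    _mm_storeu_ps(out, sum);
}
```

The unaligned load and store variants are chosen here only to keep the sketch free of alignment requirements; aligned variants (`_mm_load_ps`, `_mm_store_ps`) exist for 16-byte-aligned data.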