...ked (enabled) exception, or emulating the instruction otherwise.

Example 5-1. SIMD Floating-Point Exception Handler

    SIMD_FP_EXC_HANDLER PROC
        ;PROLOGUE
        ;SAVE REGISTERS THAT MIGHT BE USED BY THE EXCEPTION HANDLER
        PUSH EBP                    ;SAVE EBP
        PUSH EAX                    ;SAVE EAX
        ...
        MOV EBP, ESP                ;SAVE ESP in EBP
        SUB ESP, 512                ;ALLOCATE 512 BYTES
        AND ESP, 0fffffff0h         ;MAKE THE ADDRESS 16-BYTE ALIGNED
        FXSAVE [ESP]                ;SAVE FP, MMX, AND SIMD FP STATE
        PUSH [EBP+EFLAGS_OFFSET]    ;COPY OLD EFLAGS TO STACK TOP
        POPFD                       ;RESTORE THE INTERRUPT ENABLE FLAG IF
                                    ;TO VALUE BEFORE SIMD FP EXCEPTION
        ;BODY
        ;APPLICATION-DEPENDENT EXCEPTION HANDLING CODE GOES HERE
        LDMXCSR LOCAL_MXCSR         ;LOAD LOCAL MXCSR VALUE IF NEEDED
        ...
        ...
        ;EPILOGUE
        FXRSTOR [ESP]               ;RESTORE MODIFIED STATE IMAGE
        MOV ESP, EBP                ;DE-ALLOCATE STACK SPACE
        ...
        POP EAX                     ;RESTORE EAX
        POP EBP                     ;RESTORE EBP
        IRET                        ;RETURN TO INTERRUPTED CALCULATION
    SIMD_FP_EXC_HANDLER ENDP

E.3 EXCEPTION SYNCHRONIZATION

An SSE/SSE2/SSE3 instruction can execute in parallel with other similar instructions, with integer instructions, and with floating-point or MMX instructions. Unlike for x87 instructions, special precaution for exception synchronization is not necessary in this case. This is because floating-point exceptions for SSE/SSE2/SSE3 instructions occur immediately and are not delayed until a subsequent floating-point instruction is executed. However, floating-point emulation may be necessary when unmasked floating-point exceptions are generated.

E.4 SIMD FLOATING-POINT EXCEPTIONS AND THE IEEE STANDARD 754

SSE/SSE2/SSE3 extensions are 100% compatible with the IEEE Standard 754 for Binary Floating-Point Arithmetic, satisfying all of its mandatory requirements (when the flush-to-zero or denormals-are-zeros modes are not enabled).
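The compatibility condition in the preceding sentence depends on two MXCSR control bits: denormals-are-zeros (DAZ, bit 6) and flush-to-zero (FTZ, bit 15). As a minimal sketch, and not part of the original example, the C helper below tests an MXCSR image for that condition. The helper name mxcsr_ieee_mode is hypothetical; how the MXCSR value is obtained (for example from the FXSAVE image saved in the prologue) is left to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* MXCSR control bits: DAZ is bit 6, FTZ is bit 15. */
#define MXCSR_DAZ (1u << 6)
#define MXCSR_FTZ (1u << 15)

/* Hypothetical helper (name is an assumption, not from the manual):
 * returns true when neither flush-to-zero nor denormals-are-zeros is
 * enabled in the given MXCSR image, which is the condition the text
 * attaches to IEEE Standard 754 compatibility. */
static bool mxcsr_ieee_mode(uint32_t mxcsr)
{
    return (mxcsr & (MXCSR_DAZ | MXCSR_FTZ)) == 0;
}
```

With the power-on default MXCSR value 1F80H (all exceptions masked, FTZ and DAZ clear), mxcsr_ieee_mode returns true; setting either bit makes it return false.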
But a programming environment that includes SSE/SSE2/SSE3 instructions will comply with both the obligatory and the strongly recommended requirements of the IEEE Standard 754 regarding floating-point exception handling, only as a combination of hardware and software (which is acceptable). The standard states that a user should be able to request a trap on any of the five floating-point exceptions (note that the denormal exception is an IA-32 addition), and it also specifies the values (operands or result) to be delivered to the exception handler.

The main issue is that for SSE/SSE2/SSE3 instructions that raise post-computation exceptions (traps: overflow, underflow, or inexact), unlike for x87 FPU instructions, the processor does not provide the result recommended by IEEE Standard 754 to the user handler. If a user program needs the result of an instruction that generated a post-computation exception, it is the responsibility of the software to produce this result by emulating the faulting SSE/SSE2/SSE3 instruction.

Another issue is that the standard does not specify explicitly how to handle multiple floating-point exceptions that occur simultaneously. For packed operati...
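The distinction drawn above between post-computation exceptions (overflow, underflow, inexact) and the remaining pre-computation ones can be made concrete with a small C sketch. It is an illustration under stated assumptions, not the manual's handler: it assumes the handler has an MXCSR image available (for example from the FXSAVE area) and that the sticky flag bits were clear before the faulting instruction, so any flag that is set belongs to that instruction. The function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* MXCSR exception flags occupy bits 0-5; the corresponding mask bits
 * occupy bits 7-12, i.e. each mask sits 7 positions above its flag. */
enum {
    MXCSR_IE = 1 << 0,  /* invalid operation  (pre-computation)  */
    MXCSR_DE = 1 << 1,  /* denormal operand   (pre-computation)  */
    MXCSR_ZE = 1 << 2,  /* divide-by-zero     (pre-computation)  */
    MXCSR_OE = 1 << 3,  /* overflow           (post-computation) */
    MXCSR_UE = 1 << 4,  /* underflow          (post-computation) */
    MXCSR_PE = 1 << 5   /* inexact            (post-computation) */
};
#define MXCSR_FLAG_BITS  0x3Fu
#define MXCSR_MASK_SHIFT 7

/* Hypothetical helper: flags that are set while their mask bit is clear. */
static uint32_t mxcsr_unmasked_pending(uint32_t mxcsr)
{
    uint32_t flags = mxcsr & MXCSR_FLAG_BITS;
    uint32_t masks = (mxcsr >> MXCSR_MASK_SHIFT) & MXCSR_FLAG_BITS;
    return flags & ~masks;
}

/* Hypothetical helper: true when an unmasked post-computation exception
 * is pending, the case in which software must emulate the faulting
 * instruction if the program needs its result. */
static bool needs_result_emulation(uint32_t mxcsr)
{
    uint32_t post = (uint32_t)(MXCSR_OE | MXCSR_UE | MXCSR_PE);
    return (mxcsr_unmasked_pending(mxcsr) & post) != 0;
}
```

For instance, with overflow unmasked and flagged (MXCSR image 1B88H), needs_result_emulation returns true; with only invalid-operation unmasked and flagged (1F01H), it returns false because that exception is raised before any result is computed.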
This note was uploaded on 10/01/2013 for the course CPE 103 taught by Professor Watlins during the Winter '11 term at Mississippi State.