Intel Software Developer's Manual


[...] the infinitely precise result. Rounded result is closest to, but no greater in absolute value than, the infinitely precise result.

01B 10B 11B

The round up and round down modes are termed directed rounding and can be used to implement interval arithmetic. Interval arithmetic is used to determine upper and lower bounds for the true result of a multistep computation, when the intermediate results of the computation are subject to rounding. The round toward zero mode (sometimes called the "chop" mode) is commonly used when performing integer arithmetic with the processor.

Whenever possible, the processor produces an infinitely precise result. However, it is often the case that the infinitely precise result of an arithmetic or store operation cannot be encoded exactly in the format of the destination operand. For example, the following value (a) has a 24-bit fraction. The least-significant bit of this fraction (the last bit shown) cannot be encoded exactly in the single-real format (which has only a 23-bit fraction):

(a) 1.0001 0000 1000 0011 1001 0111 E₂ 101

To round this result (a), the processor first selects two representable fractions b and c that most closely bracket a in value (b < a < c).

(b) 1.0001 0000 1000 0011 1001 011 E₂ 101
(c) 1.0001 0000 1000 0011 1001 100 E₂ 101

The processor then sets the result to b or to c according to the rounding mode selected in the RC field. Rounding introduces an error in a result that is less than one unit in the last place to which the result is rounded. The rounded result is called the inexact result. When the processor produces an inexact result, the floating-point precision (inexact) flag (PE) is set in MXCSR.

When the infinitely precise result is between the largest positive finite value allowed in a particular format and +∞, the processor rounds the result as shown in Table 11-3.

11-4 STREAMING SIMD EXTENSIONS SYSTEM PROGRAMMING

Table 11-3. Rounding of Positive Numbers Greater than the Maximum Positive Finite Value

Rounding Mode                      Result
Rounding to nearest (even)         +∞
Rounding down (toward −∞)          Maximum, positive finite value
Rounding up (toward +∞)            +∞
Rounding toward zero (Truncate)    Maximum, positive finite value

When the infinitely precise result is between the largest negative finite value allowed in a particular format and −∞, the processor rounds the result as shown in Table 11-4.

Table 11-4. Rounding of Negative Numbers Smaller than the Maximum Negative Finite Value

Rounding Mode                      Result
Rounding to nearest (even)         −∞
Rounding toward zero (Truncate)    Maximum, negative finite value
Rounding up (toward +∞)            Maximum, negative finite value
Rounding down (toward −∞)          −∞

The rounding modes have no effect on comparison operations, operations that produce exact results, or operations that produce NaN results.

11.3.2.2. FLUSH-TO-ZERO

Turning on the Flush-To-Zero mode has the following effects when tiny results occur (i.e., when the infinitely precise result, rounded to the destination precision with an unbounded exponent, is smaller in absolute value than the smallest normal number that can be represented; this is similar to the underflow condition when underflow traps are unmasked):

• Zero results are returned with the sign of the true result
• Precision and underflow exception flags are set

The IEEE mandated masked response to underflow is to deliver the denormalized result (i.e., gradual underflow); consequently, the flush-to-zero mode is not compatible with IEEE Standard 754. It is provided primarily for performance reasons. At the cost of a slight precision loss, faster execution can be achieved for applications where underflow is common. Underflow for flush-to-zero is defined to occur when the exponent for a computed result, prior to denormalization scaling, falls in the denormal range; this is regardless of whether a loss of accuracy has occurred.
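The flush-to-zero behavior described above can be modeled in software to make the difference from gradual underflow concrete. The following is only a sketch under stated assumptions, not the processor's implementation: it works on exact rational inputs, assumes round-to-nearest (even) and a masked underflow exception, ignores overflow, and the names `round_unbounded` and `masked_underflow_result` are hypothetical helpers invented for this illustration.

```python
import math
from fractions import Fraction

FRACTION_BITS = 23                    # fraction width of the single-real format
MIN_NORMAL = Fraction(1, 2**126)      # smallest positive normal single: 2^-126
DENORMAL_ULP = Fraction(1, 2**149)    # spacing of single-real denormals: 2^-149


def round_unbounded(mag: Fraction) -> Fraction:
    """Round a positive magnitude to 24 significant bits (nearest-even),
    pretending the exponent range is unbounded."""
    e = mag.numerator.bit_length() - mag.denominator.bit_length()
    if Fraction(2) ** e > mag:        # adjust so that 2^e <= mag < 2^(e+1)
        e -= 1
    ulp = Fraction(2) ** (e - FRACTION_BITS)
    return round(mag / ulp) * ulp     # round() on a Fraction is half-to-even


def masked_underflow_result(exact: Fraction, ftz: bool):
    """Hypothetical model: return (result, flags) for an exact result.

    Assumes the underflow exception is masked and overflow does not occur.
    """
    sign = -1.0 if exact < 0 else 1.0
    mag = abs(exact)
    flags = set()
    if mag == 0:
        return math.copysign(0.0, sign), flags
    rounded = round_unbounded(mag)
    if rounded >= MIN_NORMAL:         # not tiny: ordinary rounding
        if rounded != mag:
            flags.add("PE")
        return math.copysign(float(rounded), sign), flags
    if ftz:                           # tiny + flush-to-zero: signed zero, PE and UE
        flags.update({"PE", "UE"})
        return math.copysign(0.0, sign), flags
    # tiny + gradual underflow: deliver the denormalized result
    denormal = round(mag / DENORMAL_ULP) * DENORMAL_ULP
    if denormal != mag:               # flags only when accuracy was lost
        flags.update({"PE", "UE"})
    return math.copysign(float(denormal), sign), flags
```

In this model, an input of 2^-130 is exactly representable as a denormal, so the gradual-underflow path returns it with no flags set, while the flush-to-zero path returns +0 with both PE and UE set, which matches the statement that flush-to-zero underflow is signaled regardless of whether accuracy was lost.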
Unmasking the underflow exception takes precedence over flush-to-zero mode; this means that an exception handler will be invoked for a Streaming SIMD Extensions instruction that generates an underflow condition while this exception is unmasked, regardless of whether flush-to-zero is enabled.

11.4. ENABLING STREAMING SIMD EXTENSIONS SUPPORT

This section describes the interface of the Intel Architecture Streaming SIMD Extensions with the operating system.

11.4.1. Enabling Streaming SIMD Extensions Support

Certain steps must be taken in both the application and the OS to check if the CPU supports Streaming SIMD Extensions and associated unmasked exceptions. This section describes this process, which is conducted using the bits described in Table 11-5 and Table 11-6.

If the OS wants to use FXSAVE/FXRSTOR, it will first check CPUID.FXSR to determine if the CPU supports these instructions. If the CPU does support FXSAVE/FXRSTOR, then the OS can set CR4.OSFXSR without faulting and enable code for context switching that utilizes FXSAVE/FXRSTOR instead of FSAVE/FRSTOR. At this point, if the OS also supports unmasked SIMD floating-point exceptio...
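The FXSAVE/FXRSTOR decision described above can be sketched as a small bit test. This is an illustrative model only: Python cannot execute CPUID or write CR4, so raw register values are passed in as integers, and the function name `plan_context_switch` is hypothetical. The bit positions used (FXSR as bit 24 of the EDX feature flags returned by CPUID function 1, OSFXSR as bit 9 of CR4) are not given in this excerpt; they are assumed here in place of Table 11-5 and Table 11-6.

```python
CPUID_EDX_FXSR = 1 << 24   # assumed position of the FXSR feature flag in EDX
CR4_OSFXSR = 1 << 9        # assumed position of OSFXSR in CR4


def plan_context_switch(cpuid_edx: int, cr4: int):
    """Hypothetical helper: return (new_cr4, (save, restore)).

    If the CPU reports FXSR support, the OS may set CR4.OSFXSR and
    switch contexts with FXSAVE/FXRSTOR; otherwise it leaves CR4
    unchanged and keeps using FSAVE/FRSTOR.
    """
    if cpuid_edx & CPUID_EDX_FXSR:
        return cr4 | CR4_OSFXSR, ("FXSAVE", "FXRSTOR")
    return cr4, ("FSAVE", "FRSTOR")
```

Checking the feature flag before touching CR4 mirrors the order in the text: setting CR4.OSFXSR is only described as safe ("without faulting") once CPUID has confirmed FXSAVE/FXRSTOR support.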

This note was uploaded on 06/07/2013 for the course ECE 1234 taught by Professor Kwhon during the Spring '10 term at University of California, Berkeley.
