ARM.SoC.Architecture

# We start from the binary representation of 1995 which


... may require negative exponents in their normalized form. Rather than use a signed exponent, the standard specifies a 'bias'. This bias (+127 for single precision normalized numbers) is added to the exponent value. Hence 1995 is represented as:

[Figure 6.2: IEEE 754 single precision representation of '1995'.]

The exponent is 127+10 = 137; the fraction is zero-extended to the right to fill the 23-bit field.

Normalized value

In general, the value of a 32-bit normalized number is given by:

$$\text{value (norm)} = (-1)^{S} \times 1.\text{fraction} \times 2^{(\text{exponent} - 127)} \qquad \text{(Equation 15)}$$

Although this format represents a wide range of values efficiently, it has one rather glaring problem: there is no way to represent zero. Therefore the IEEE 754 standard reserves numbers where the exponent is either zero or 255 to represent special values:

- Zero is represented by a zero exponent and fraction (but either sign value, so positive and negative zeros can be represented).
- Plus or minus infinity are represented by the maximum exponent value with a zero fraction and the appropriate sign bit.
- NaN (Not a Number) is indicated by the maximum exponent and a non-zero fraction; 'quiet' NaNs have a '1' in the most significant fraction bit position and 'signalling' NaNs have a '0' in that bit (but a '1' somewhere else, otherwise they look like infinity).
- Denormalized numbers, which are numbers that are just too small to normalize within this format, have a zero exponent, a non-zero fraction and a value given by:

  $$\text{value (denorm)} = (-1)^{S} \times 0.\text{fraction} \times 2^{(-126)} \qquad \text{(Equation 16)}$$

The 'NaN' format is used to represent the results of invalid floating-point operations, such as taking the logarithm of a negative number, and prevents a series of operations from producing an apparently valid result when an intermediate error condition was not checked.

Double precision

For many purposes the accuracy offered by the single precision format is inadequate. Greater accuracy may be achieved by using the double precision format which uses 64 bits to store each floating-point value. The interpretation is similar to...
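
The single precision rules above (the +127 bias, Equations 15 and 16, and the reserved exponent values 0 and 255) can be sketched in a few lines of Python. This is an illustrative sketch rather than anything from the original text: the helper names `fields`, `classify`, `value` and `encode` are hypothetical, and `encode` relies on Python's standard `struct` module to obtain the 32-bit pattern of a float.

```python
import struct

BIAS = 127  # single precision exponent bias, as stated in the text


def fields(bits):
    """Split a 32-bit pattern into (sign, exponent, fraction)."""
    sign = (bits >> 31) & 0x1
    exponent = (bits >> 23) & 0xFF   # 8-bit biased exponent
    fraction = bits & 0x7FFFFF       # 23-bit fraction field
    return sign, exponent, fraction


def classify(bits):
    """Name the kind of value a 32-bit pattern represents."""
    _, exponent, fraction = fields(bits)
    if exponent == 0:
        return "zero" if fraction == 0 else "denormalized"
    if exponent == 255:
        if fraction == 0:
            return "infinity"
        # Most significant fraction bit set means 'quiet', clear means 'signalling'.
        return "quiet NaN" if fraction & 0x400000 else "signalling NaN"
    return "normalized"


def value(bits):
    """Evaluate a 32-bit pattern using Equations 15 and 16."""
    sign, exponent, fraction = fields(bits)
    kind = classify(bits)
    s = -1.0 if sign else 1.0
    if kind == "normalized":
        # Equation 15: (-1)^S x 1.fraction x 2^(exponent - 127)
        return s * (1 + fraction / 2**23) * 2.0 ** (exponent - BIAS)
    if kind in ("zero", "denormalized"):
        # Equation 16: (-1)^S x 0.fraction x 2^(-126); a zero fraction gives +/-0
        return s * (fraction / 2**23) * 2.0 ** (-126)
    if kind == "infinity":
        return s * float("inf")
    return float("nan")


def encode(x):
    """Return the 32-bit single precision pattern of x as an integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]
```

For example, `hex(encode(1995.0))` gives `'0x44f96000'`, and `fields(0x44F96000)` returns a sign of 0, an exponent of 137 (127+10) and the fraction `0x796000`, which is the binary pattern 1111001011 zero-extended to 23 bits.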

## This document was uploaded on 10/30/2011 for the course CSE 378 380 at SUNY Buffalo.
