
# 15_Floating_Point_Numbers_with_ink - Floating Point Numbers...

CMPE12 Cyrus Bazeghi Floating Point Numbers

Registers for real numbers usually contain 32 or 64 bits, allowing $2^{32}$ or $2^{64}$ distinct numbers to be represented.

Which reals should be represented? There are infinitely many reals between two adjacent integers (or between any two distinct reals!).

Which bit patterns should be selected for which reals?

Answer: use scientific notation.

Consider $A \times 10^{B}$, where $A$ is one digit:

| $A$ | $B$ | $A \times 10^{B}$ |
|---|---|---|
| 0 | any | 0 |
| 1 .. 9 | 0 | 1 .. 9 |
| 1 .. 9 | 1 | 10 .. 90 |
| 1 .. 9 | 2 | 100 .. 900 |
| 1 .. 9 | -1 | 0.1 .. 0.9 |
| 1 .. 9 | -2 | 0.01 .. 0.09 |

How do we do scientific notation in binary? The standard is IEEE 754 floating point.
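The same idea carries over to base 2: a nonzero value can be written as $m \times 2^{e}$ with $1 \le |m| < 2$. The following is a minimal sketch of that normalization, not part of the original slides. The helper name `binary_scientific` is hypothetical; `math.frexp` is the standard-library call that does the actual splitting.

```python
import math

def binary_scientific(x):
    """Return (m, e) with x == m * 2**e and 1 <= |m| < 2.

    Hypothetical helper for illustration; x must be nonzero and finite.
    """
    if x == 0 or math.isinf(x) or math.isnan(x):
        raise ValueError("x must be a nonzero finite number")
    m, e = math.frexp(x)   # frexp returns 0.5 <= |m| < 1
    return m * 2, e - 1    # rescale so that 1 <= |m| < 2

for value in (6.0, 0.75, -10.0):
    m, e = binary_scientific(value)
    print(f"{value} = {m} x 2^{e}")
```

For example, 6.0 comes back as 1.5 x 2^2, which is the binary analogue of writing 600 as 6 x 10^2.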

IEEE 754 Single Precision Floating Point Format

Representation, from bit 31 down to bit 0:

| S | E | F |
|---|---|---|
| bit 31 | bits 30 .. 23 | bits 22 .. 0 |

- S is one bit representing the sign of the number.
- E is an 8-bit biased integer representing the exponent.
- F is an unsigned integer.

The true value represented is $(-1)^{S} \times f \times 2^{e}$, where

- $S$ = sign bit
- $e = E - \text{bias}$
- $f = F/2^{n} + 1$

For single precision numbers, $n = 23$ and $\text{bias} = 127$.
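The formula can be checked against a real bit pattern. The sketch below is an illustration rather than the course's own code. The name `decode_single` is hypothetical, and the function only handles normalized numbers: the field values E = 0 and E = 255 are reserved for special cases that this slide does not cover. The standard-library `struct` module is used only to obtain the 32-bit pattern of a known value.

```python
import struct

BIAS = 127   # single-precision exponent bias
N = 23       # number of bits in the F field

def decode_single(bits):
    """Split a 32-bit pattern into S, E, F and evaluate (-1)^S * f * 2^e.

    Hypothetical helper; handles normalized numbers only.
    """
    S = (bits >> 31) & 0x1      # bit 31
    E = (bits >> 23) & 0xFF     # bits 30..23
    F = bits & 0x7FFFFF         # bits 22..0
    if E == 0 or E == 255:
        raise ValueError("special cases are outside this sketch")
    e = E - BIAS                # true exponent
    f = F / 2**N + 1            # fraction with the leading 1 restored
    return S, E, F, (-1)**S * f * 2.0**e

# Cross-check against the machine's own single-precision encoding of -6.5.
bits = struct.unpack(">I", struct.pack(">f", -6.5))[0]
print(hex(bits), decode_single(bits))
```

Here -6.5 is -1.625 x 2^2, so the fields should come out as S = 1, E = 129, and F = 0.625 x 2^23.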

S, E, and F are all fields within a representation. Each is just a bunch of bits.

S is the sign bit:

- It contributes $(-1)^{S}$: $(-1)^{0} = +1$ and $(-1)^{1} = -1$.
- It is just a sign bit, as in signed magnitude.

E is the exponent field:

- The E field is a biased-127 representation.
- The true exponent is $(E - \text{bias})$.
- The base (radix) is always 2 (implied). Some early machines used radix 4 or 16 (IBM).
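To make the bias concrete, the short sketch below goes the other way and computes the stored field E from a true exponent. The function name `to_field` is hypothetical, and the range check (1 to 254) reflects the assumption that E = 0 and E = 255 are reserved for special values not discussed here.

```python
BIAS = 127  # bias for the 8-bit single-precision exponent field

def to_field(e):
    """Return the biased field value E that stores the true exponent e."""
    E = e + BIAS
    if not 1 <= E <= 254:
        raise ValueError("exponent out of range for a normalized single")
    return E

for e in (-3, 0, 6):
    E = to_field(e)
    print(f"e = {e:3d}  ->  E = {E:3d} = {E:08b}")
```

Because of the bias, the field is always a non-negative integer: a true exponent of 0 is stored as 127, and negative exponents are stored as values below 127.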

F (or M) is the fractional or mantissa field. It is in a strange form:

- There are 23 bits for F.
- A normalized FP number always has a leading 1.
- There is no need to store that one; just assume it.
- This MSB is called the HIDDEN BIT.
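The hidden bit can be observed directly by looking at the stored fraction bits of a few simple values. This is a small sketch using the standard-library `struct` module; the helper name `fraction_field` is hypothetical.

```python
import struct

def fraction_field(x):
    """Return the 23-bit F field of x encoded as an IEEE 754 single."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits & 0x7FFFFF   # keep bits 22..0 only

# 1.5 is 1.1 in binary: only the ".1" is stored, the leading 1 is implied.
print(format(fraction_field(1.5), "023b"))
# 1.0 is 1.0 in binary: every stored fraction bit is 0.
print(format(fraction_field(1.0), "023b"))
```

For 1.5 the field is a single 1 followed by 22 zeros, and for 1.0 it is all zeros, even though both values have a leading 1 before the radix point.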

How to convert 64.2 into IEEE SP

1. Get a binary representation for 64.2.

The binary representation of the part to the left of the radix point is:

