EE357Unit3_FP

# EE357Unit3_FP - Floating Point Used to represent very small...

EE 357 Unit 3: IEEE 754 Floating Point Representation, Floating Point Arithmetic

© Mark Redekopp, All rights reserved

## Floating Point

- Used to represent very small numbers (fractions) and very large numbers
  - Avogadro's Number: $+6.022 \times 10^{23}$
  - Planck's Constant: $+6.626 \times 10^{-27}$ (in erg·s)
  - Note: 32-bit or 64-bit integers can't represent this range
- Floating point representation is used in high-level languages (HLLs) like C by declaring variables as `float` or `double`

## Fixed Point

- Unsigned and 2's complement fall under a category of representations called "Fixed Point"
- The radix point is assumed to be in a fixed location for all numbers
  - Integers: `10011101.` (binary point to the right of the LSB). For 32 bits, the unsigned range is 0 to $2^{32} - 1$ (~4 billion)
  - Fractions: `.10011101` (binary point to the left of the MSB). The range is $[0, 1)$
- Main point: by fixing the radix point, we limit the range of numbers that can be represented
  - Floating point allows the radix point to be in a different location for each value

## Floating Point Representation

- Similar to scientific notation used with decimal numbers
  - $\pm D.DDD \times 10^{\pm exp}$
- Floating point representation uses the following form
  - $\pm b.bbbb \times 2^{\pm exp}$
  - 3 fields: sign, exponent, fraction (also called mantissa or significand)

Field layout: S | Exp. | Fraction, where S is the overall sign of the number.
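The fixed-point limitation described above can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides: the variable names are illustrative, and the value used for Avogadro's number is an approximation chosen only to show the order of magnitude.

```python
# The same 8-bit pattern read with the binary point fixed in two places.
bits = "10011101"

as_integer = int(bits, 2)                   # point to the right of the LSB
as_fraction = int(bits, 2) / 2**len(bits)   # point to the left of the MSB

max_u32 = 2**32 - 1      # largest 32-bit unsigned integer (~4 billion)
max_u64 = 2**64 - 1      # largest 64-bit unsigned integer
avogadro = 6.022e23      # approximate value, stored as a double

print(as_integer)            # 157
print(as_fraction)           # 0.61328125
print(avogadro > max_u64)    # True: outside even the 64-bit integer range
```

The fraction interpretation always lands in $[0, 1)$ and the integer interpretation tops out at $2^{n} - 1$, which is why neither fixed location covers both very small and very large values.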

## Normalized FP Numbers

- Decimal example
  - $+0.754 \times 10^{15}$ is not correct scientific notation
  - Must have exactly one significant digit before the decimal point: $+7.54 \times 10^{14}$
- In binary, the only possible nonzero leading digit is '1'
- Thus the normalized FP format is $\pm 1.bbbbbb \times 2^{\pm exp}$
- FP numbers will always be normalized before being stored in memory or a register
  - The leading `1.` is not actually stored but assumed, since we will always store normalized numbers
  - If the hardware calculates a result of $0.001101 \times 2^{5}$, it must normalize it to $1.101000 \times 2^{2}$ before storing

## IEEE Floating Point Formats

- Single Precision (32-bit format)
  - 1 sign bit (0 = positive, 1 = negative)
  - 8 exponent bits (Excess-127 representation)
  - 23 fraction (significand or mantissa) bits
  - Equivalent decimal range: about 7 significant digits, magnitudes around $10^{\pm 38}$
- Double Precision (64-bit format)
  - 1 sign bit (0 = positive, 1 = negative)
  - 11 exponent bits (Excess-1023 representation)
  - 52 fraction (significand or mantissa) bits
  - Equivalent decimal range: about 16 significant digits, magnitudes around $10^{\pm 308}$

| Format | S (sign) | Exp. | Fraction |
|---|---|---|---|
| Single precision | 1 bit | 8 bits | 23 bits |
| Double precision | 1 bit | 11 bits | 52 bits |

## Exponent Representation

- The exponent includes its own sign (+/-)
- Rather than using the 2's complement system, single precision uses Excess-127 while double precision uses Excess-1023
  - This representation allows FP numbers to be easily compared
- Let $E'$ = stored exponent code and $E$ = true exponent value
- For single precision: $E' = E + 127$
  - $2^{1} \Rightarrow E = 1$, $E' = 128_{10} = 10000000_2$
- For double precision: $E' = E + 1023$
  - 2...
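The single-precision layout and the Excess-127 rule above can be illustrated by pulling the three fields out of a 32-bit pattern. This is a minimal sketch using Python's standard `struct` module. The helper names `decode_single` and `true_exponent` are hypothetical, and the sketch assumes normalized numbers only: it ignores the reserved exponent codes 0 and 255, which the slides shown here do not cover.

```python
import struct

def decode_single(x):
    """Round x to IEEE 754 single precision and split it into its fields."""
    # Pack as a 32-bit float, then reread the same 4 bytes as an unsigned int.
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    sign = word >> 31                   # 1 bit: 0 = positive, 1 = negative
    stored_exp = (word >> 23) & 0xFF    # 8 bits: E', the Excess-127 code
    fraction = word & 0x7FFFFF          # 23 bits: the leading 1. is not stored
    return sign, stored_exp, fraction

def true_exponent(stored_exp):
    """E = E' - 127 (valid for normalized single-precision numbers)."""
    return stored_exp - 127

# 2.0 = 1.0 * 2^1, so E = 1 and E' = 128, matching the slide's example.
print(decode_single(2.0))               # (0, 128, 0)

# 6.5 = 0.001101 * 2^5 = 1.101 * 2^2, the normalization example above.
sign, e_code, frac = decode_single(6.5)
print(sign, e_code, format(frac, "023b"))
# 0 129 10100000000000000000000
print(true_exponent(e_code))            # 2
```

Only the bits after the binary point (`101` followed by zeros) appear in the fraction field, which shows the implied leading `1.` in practice.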

## This note was uploaded on 04/03/2011 for the course EE 357 taught by Professor Mayeda during the Spring '08 term at USC.
