EE103 Lecture Notes, Winter 2009, Prof S. Jacobsen
Section 2
SECTION 2: AN INTRODUCTION TO FLOATING POINT ARITHMETIC AND
RATES OF CONVERGENCE
Consider the number 12.
Of course, we can think of the number as
$$12 = 1\cdot 10^{1} + 2\cdot 10^{0} \triangleq (12)_{10}$$
Or, consider the number 12.625.
We can think of this number as
$$12.625 = 1\cdot 10^{1} + 2\cdot 10^{0} + 6\cdot 10^{-1} + 2\cdot 10^{-2} + 5\cdot 10^{-3} \triangleq (12.625)_{10}\cdot 10^{0} = (.12625)_{10}\cdot 10^{2}$$
On the other hand, we can also think of the number 12 as
$$12 = 1\cdot 2^{3} + 1\cdot 2^{2} + 0\cdot 2^{1} + 0\cdot 2^{0} = (1100)_{2}\cdot 2^{0}$$
Similarly, we can think of 12.625 as
$$12.625 = 1\cdot 2^{3} + 1\cdot 2^{2} + 0\cdot 2^{1} + 0\cdot 2^{0} + 1\cdot 2^{-1} + 0\cdot 2^{-2} + 1\cdot 2^{-3} = (1100.101)_{2}\cdot 2^{0} = (.1100101)_{2}\cdot 2^{4}$$
Or, we can think of 12.625 as
$$12.625 = 1\cdot 8^{1} + 4\cdot 8^{0} + 5\cdot 8^{-1} = (14.5)_{8}\cdot 8^{0} = (.145)_{8}\cdot 8^{2}$$
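The expansions above can be checked mechanically. The following is a minimal sketch, not part of the original notes: the function name `expand_in_base` and its parameters are illustrative assumptions. It finds the integer digits by repeated division by the base and the fractional digits by repeated multiplication by the base, using exact rational arithmetic so that no rounding enters the picture.

```python
from fractions import Fraction


def expand_in_base(x, base, frac_digits):
    """Return (integer_digits, fraction_digits) of x >= 0 in the given base.

    Hypothetical helper for illustration; digits are most significant first.
    """
    x = Fraction(x)
    int_part = x.numerator // x.denominator
    frac_part = x - int_part

    # Integer part: divide by the base repeatedly and collect the remainders.
    int_digits = []
    while int_part > 0:
        int_part, remainder = divmod(int_part, base)
        int_digits.append(remainder)
    int_digits = int_digits[::-1] or [0]

    # Fractional part: multiply by the base repeatedly and peel off the
    # integer part of each product.
    out = []
    for _ in range(frac_digits):
        frac_part *= base
        digit = int(frac_part)  # frac_part is non-negative, so this is a floor
        out.append(digit)
        frac_part -= digit
    return int_digits, out


x = Fraction("12.625")
print(expand_in_base(x, 10, 3))  # digits of (12.625)_10
print(expand_in_base(x, 2, 3))   # digits of (1100.101)_2
print(expand_in_base(x, 8, 1))   # digits of (14.5)_8
```

The number is passed as a string so that `Fraction` stores exactly 12.625; for a value such as 0.1, which has no finite binary expansion, the fractional digits would simply be cut off after `frac_digits` places.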
Floating Point Arithmetic
Definition: an $n$-digit floating point number, in base $\beta$, has the form$^{1}$

$$\pm(.d_1 d_2 d_3 \cdots d_n)_{\beta}\cdot \beta^{e}$$

where each digit $d_i \in \{0, 1, 2, \ldots, \beta - 1\}$ and $e$ is an integer.
Unless the number in question is zero, we always write the number so that $d_1 \neq 0$ (i.e., we say the number is "normalized").
The term $(.d_1 d_2 d_3 \cdots d_n)_{\beta}$ is called the mantissa; $e$, the exponent, is an integer.
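As an illustration of the definition, here is a minimal sketch (not from the original notes) that writes a number in normalized $n$-digit form. The function name `normalize` is a hypothetical helper, and two choices in it are assumptions made only for this example: digits beyond the $n$-th are chopped rather than rounded, and zero is returned with all digits zero and exponent 0.

```python
from fractions import Fraction


def normalize(x, base, n):
    """Return (sign, digits, e) with x ~ sign * (.d1 d2 ... dn)_base * base**e.

    Hypothetical helper: extra digits are chopped (an assumption), and the
    exponent range is not limited here.
    """
    x = Fraction(x)
    if x == 0:
        # Zero cannot be normalized; this return convention is an assumption.
        return 1, [0] * n, 0

    sign = 1 if x > 0 else -1
    m = abs(x)
    e = 0

    # Shift the radix point until 1/base <= m < 1, which forces d1 != 0.
    while m >= 1:
        m /= base
        e += 1
    while m < Fraction(1, base):
        m *= base
        e -= 1

    # Read off the first n mantissa digits.
    digits = []
    for _ in range(n):
        m *= base
        d = int(m)
        digits.append(d)
        m -= d
    return sign, digits, e


print(normalize("12.625", 10, 5))  # mantissa digits and exponent in base 10
print(normalize("12.625", 2, 7))   # mantissa digits and exponent in base 2
print(normalize("12.625", 8, 3))   # mantissa digits and exponent in base 8
```

For 12.625 these calls reproduce the normalized forms $(.12625)_{10}\cdot 10^{2}$, $(.1100101)_{2}\cdot 2^{4}$, and $(.145)_{8}\cdot 8^{2}$ written out earlier.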
The number $n$ is finite and is often called the "precision". The size of $n$ depends upon the word length of the computer in question, and so it varies considerably from machine to machine.
The exponent, $e$, is an integer and is limited to a range, denoted by $e_{\min} \le e \le e_{\max}$.
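To see what $\beta$, $n$, $e_{\min}$ and $e_{\max}$ look like on an actual machine, one can query the parameters of Python's built-in `float` type. This is a sketch added for illustration, not part of the original notes; the wrapper name `float_format` is hypothetical, while `sys.float_info` and its fields are part of the standard library. The values reported depend on the platform; on most current machines `float` is an IEEE 754 double.

```python
import sys


def float_format():
    """Return (beta, n, e_min, e_max) for this machine's Python float.

    Hypothetical wrapper around the standard sys.float_info record.
    """
    info = sys.float_info
    # min_exp and max_exp are defined relative to radix**(e - 1), i.e. with
    # the mantissa in the form (.d1 d2 ...), matching the definition above.
    return info.radix, info.mant_dig, info.min_exp, info.max_exp


beta, n, e_min, e_max = float_format()
print("base beta   :", beta)
print("precision n :", n)
print("e_min       :", e_min)
print("e_max       :", e_max)

# The smallest positive normalized number is (.100...0)_beta * beta**e_min.
print(float(beta) ** (e_min - 1) == sys.float_info.min)
```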
$^{1}$ Storage of floating point numbers on a real computer is, of course, slightly different.