Then we will also have the following:
1. $\langle r_{k+1}, r_\ell \rangle = r_\ell^T r_{k+1} = 0$ for all $\ell \le k$.
To see this, notice that
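\begin{align}
r_\ell^T H d_k &= (d_\ell - \beta_\ell d_{\ell-1})^T H d_k \tag{5} \\
&= \begin{cases} d_k^T H d_k, & \ell = k, \\ 0, & \ell < k, \end{cases} \tag{6}
\end{align}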
where the second step follows directly from the fact that
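$\langle d_k, d_\ell \rangle_H = 0$ for $\ell < k$. As a result,
$$ r_\ell^T r_{k+1} = r_\ell^T r_k - \frac{r_k^T r_k}{d_k^T H d_k}\, r_\ell^T H d_k = 0 \quad \text{for all } \ell \le k. \tag{7} $$
For $\ell = k$ the two terms cancel by (6); for $\ell < k$ the second term vanishes by (6), and the first vanishes because $r_k$ is orthogonal to the earlier residuals.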
2. $\langle d_{k+1}, d_\ell \rangle_H = d_\ell^T H d_{k+1} = 0$ for all $\ell \le k$.
This follows from the expansion
$$ d_\ell^T H d_{k+1} = d_\ell^T H r_{k+1} + \beta_{k+1}\, d_\ell^T H d_k. $$
Notice that
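$$ r_i^T r_{k+1} = r_i^T r_k - \alpha_k\, r_i^T H d_k \quad \Rightarrow \quad r_i^T H d_k = \begin{cases} \dfrac{1}{\alpha_k}\, r_k^T r_k, & i = k, \\[4pt] -\dfrac{1}{\alpha_k}\, r_{k+1}^T r_{k+1}, & i = k+1, \\[4pt] 0, & i < k. \end{cases} \tag{8} $$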
Georgia Tech ECE 6250 Fall 2019; Notes by J. Romberg and M. Davenport. Last updated 2:06, November 18, 2019
Then for $\ell = k$,
\begin{align*}
d_k^T H d_{k+1} &= -\frac{1}{\alpha_k}\, r_{k+1}^T r_{k+1} + \beta_{k+1}\, d_k^T H d_k \\
&= -\frac{r_{k+1}^T r_{k+1}}{r_k^T r_k}\, d_k^T H d_k + \frac{r_{k+1}^T r_{k+1}}{r_k^T r_k}\, d_k^T H d_k = 0.
\end{align*}
For $\ell < k$,
$$ d_\ell^T H d_{k+1} = d_\ell^T H r_{k+1} + \beta_{k+1}\, d_\ell^T H d_k. $$
For the first term, $d_\ell^T H r_{k+1} = 0$, since $H d_\ell = \alpha_\ell^{-1} (r_\ell - r_{\ell+1})$ and we have (7); for the second term, $\beta_{k+1}\, d_\ell^T H d_k = 0$, since $d_0, d_1, \ldots, d_k$ are already $H$-orthogonal.
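The two facts above are easy to check numerically. The following is a minimal sketch, not part of the original notes: it assumes the standard CG recursions ($\alpha_k = r_k^T r_k / d_k^T H d_k$, $r_{k+1} = r_k - \alpha_k H d_k$, $\beta_{k+1} = r_{k+1}^T r_{k+1} / r_k^T r_k$, $d_{k+1} = r_{k+1} + \beta_{k+1} d_k$), starts from $x_0 = 0$, and uses a small random symmetric positive definite test matrix. The helper names (`dot`, `matvec`, `random_spd`, `cg_history`) are hypothetical, and plain Python lists are used only so the sketch has no dependencies.

```python
import random

def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def matvec(H, v):
    """Dense matrix-vector product; H is a list of rows."""
    return [dot(row, v) for row in H]

def random_spd(n, rng):
    """Return A^T A + I for a random A, which is symmetric positive definite."""
    A = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    return [[sum(A[k][i] * A[k][j] for k in range(n)) + (1.0 if i == j else 0.0)
             for j in range(n)] for i in range(n)]

def cg_history(H, b, steps):
    """Run CG from x_0 = 0; return the lists [r_0..r_steps] and [d_0..d_steps]."""
    r = list(b)            # r_0 = b - H x_0 with x_0 = 0
    d = list(r)            # d_0 = r_0
    residuals, directions = [r], [d]
    for _ in range(steps):
        Hd = matvec(H, d)
        alpha = dot(r, r) / dot(d, Hd)
        r_next = [ri - alpha * hi for ri, hi in zip(r, Hd)]
        beta = dot(r_next, r_next) / dot(r, r)
        d = [ri + beta * di for ri, di in zip(r_next, d)]
        r = r_next
        residuals.append(r)
        directions.append(d)
    return residuals, directions

rng = random.Random(0)     # fixed seed so the run is repeatable
N = 6
H = random_spd(N, rng)
b = [rng.uniform(-1.0, 1.0) for _ in range(N)]
residuals, directions = cg_history(H, b, N - 1)

# Fact 1: largest normalized inner product between distinct residuals.
max_rr = max(abs(dot(residuals[i], residuals[j]))
             / (dot(residuals[i], residuals[i]) * dot(residuals[j], residuals[j])) ** 0.5
             for i in range(N) for j in range(i))

# Fact 2: largest normalized H-inner product between distinct directions.
Hdirs = [matvec(H, d) for d in directions]
max_dHd = max(abs(dot(directions[i], Hdirs[j]))
              / (dot(directions[i], Hdirs[i]) * dot(directions[j], Hdirs[j])) ** 0.5
              for i in range(N) for j in range(i))

print("max normalized |r_i^T r_j|   :", max_rr)
print("max normalized |d_i^T H d_j| :", max_dHd)
```

Both printed quantities should sit at the level of floating-point roundoff. On larger or badly conditioned problems the orthogonality degrades gradually in finite precision, so the small, well-conditioned test matrix is a deliberate choice here.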
We have established that the direction $d_k$ that CG moves along on iteration $k$ is $H$-orthogonal to all previous directions. Now let's look at the step sizes, where we want to establish that
$$ \alpha_k = -\frac{c_k}{\|d_k\|_H} = -\frac{d_k^T H e_0}{\|d_k\|_H^2}. $$
Start by noting (6) above, and recall that
$$ r_k = b - H x_k = H(\hat{x} - x_k) = -H e_k. $$
At the first step, we have $d_0 = r_0$, and so
$$ \alpha_0 = \frac{r_0^T r_0}{d_0^T H d_0} = \frac{d_0^T r_0}{d_0^T H d_0} = \frac{d_0^T H (\hat{x} - x_0)}{d_0^T H d_0} = -\frac{d_0^T H e_0}{d_0^T H d_0}. $$
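At subsequent steps, since
$$ d_k = r_k + \sum_{i=0}^{k-1} \gamma_i r_i \quad \text{for some } \gamma_i \in \mathbb{R}, $$
by Fact 1, we have $r_k^T r_k = d_k^T r_k$, and so
$$ \alpha_k = \frac{d_k^T r_k}{d_k^T H d_k} = -\frac{d_k^T H \left( e_0 + \sum_{\ell=0}^{k-1} \alpha_\ell d_\ell \right)}{d_k^T H d_k} = -\frac{d_k^T H e_0}{d_k^T H d_k}, $$
where the last equality uses the $H$-orthogonality of $d_k$ to the earlier directions.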
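As a sanity check on this step-size identity, here is a small sketch (an illustration added here, not taken from the notes). It assumes the standard CG recursions and the convention $e_k = x_k - \hat{x}$, picks a solution `x_hat` first and sets $b = H\hat{x}$ so that the error is available, and then compares the step size CG actually computes with $-d_k^T H e_0 / d_k^T H d_k$ at every iteration. All variable and helper names are hypothetical.

```python
import random

def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def matvec(H, v):
    """Dense matrix-vector product; H is a list of rows."""
    return [dot(row, v) for row in H]

rng = random.Random(1)     # fixed seed so the run is repeatable
N = 5
A = [[rng.uniform(-1.0, 1.0) for _ in range(N)] for _ in range(N)]
# H = A^T A + I is symmetric positive definite.
H = [[sum(A[k][i] * A[k][j] for k in range(N)) + (1.0 if i == j else 0.0)
      for j in range(N)] for i in range(N)]

x_hat = [rng.uniform(-1.0, 1.0) for _ in range(N)]   # chosen solution
b = matvec(H, x_hat)                                  # so that H x_hat = b
x = [0.0] * N                                         # x_0
e0 = [xi - xh for xi, xh in zip(x, x_hat)]            # e_0 = x_0 - x_hat
He0 = matvec(H, e0)

r = [bi - hi for bi, hi in zip(b, matvec(H, x))]      # r_0 = b - H x_0
d = list(r)                                           # d_0 = r_0
max_rel_gap = 0.0
for k in range(N):
    Hd = matvec(H, d)
    alpha_cg = dot(r, r) / dot(d, Hd)                 # step size used by CG
    alpha_e0 = -dot(d, He0) / dot(d, Hd)              # -d_k^T H e_0 / ||d_k||_H^2
    max_rel_gap = max(max_rel_gap, abs(alpha_cg - alpha_e0) / abs(alpha_cg))
    x = [xi + alpha_cg * di for xi, di in zip(x, d)]
    r_next = [ri - alpha_cg * hi for ri, hi in zip(r, Hd)]
    beta = dot(r_next, r_next) / dot(r, r)
    d = [ri + beta * di for ri, di in zip(r_next, d)]
    r = r_next

print("largest relative gap between the two step-size formulas:", max_rel_gap)
```

The gap should be at roundoff level, which is what the derivation predicts: CG never sees $e_0$, yet its step sizes are exactly the expansion coefficients of $e_0$ in the $H$-orthogonal directions.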
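So finally, this means that for the method of conjugate gradients (with $e_k = x_k - \hat{x}$ as above),
$$ e_k = -\sum_{\ell=k}^{N-1} \left( \frac{d_\ell^T r_\ell}{d_\ell^T H d_\ell} \right) d_\ell, \qquad \|e_k\|_H^2 = \sum_{\ell=k}^{N-1} \frac{\left( d_\ell^T r_\ell \right)^2}{d_\ell^T H d_\ell}. $$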
As $k$ increases, the number of (positive) terms in the sum above gets smaller and smaller, until finally $e_N = 0$. Thus CG is guaranteed to converge exactly in $N$ steps.
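The error formula and the $N$-step termination can both be observed on a small example. The sketch below is an added illustration under the same assumptions as before (standard CG recursions, $e_k = x_k - \hat{x}$, a random symmetric positive definite test matrix, hypothetical helper names). It measures $\|e_k\|_H^2$ directly at every iteration and compares it with the tail sum $\sum_{\ell \ge k} (d_\ell^T r_\ell)^2 / d_\ell^T H d_\ell$.

```python
import random

def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def matvec(H, v):
    """Dense matrix-vector product; H is a list of rows."""
    return [dot(row, v) for row in H]

rng = random.Random(2)     # fixed seed so the run is repeatable
N = 5
A = [[rng.uniform(-1.0, 1.0) for _ in range(N)] for _ in range(N)]
# H = A^T A + I is symmetric positive definite.
H = [[sum(A[k][i] * A[k][j] for k in range(N)) + (1.0 if i == j else 0.0)
      for j in range(N)] for i in range(N)]
x_hat = [rng.uniform(-1.0, 1.0) for _ in range(N)]   # chosen solution
b = matvec(H, x_hat)                                  # so that H x_hat = b

def h_norm_sq(x):
    """||x - x_hat||_H^2, the squared H-norm of the error at x."""
    e = [xi - xh for xi, xh in zip(x, x_hat)]
    return dot(e, matvec(H, e))

x = [0.0] * N
r = list(b)                # r_0 = b - H x_0 with x_0 = 0
d = list(r)                # d_0 = r_0
errors_sq = []             # ||e_k||_H^2 measured directly, k = 0..N
terms = []                 # (d_l^T r_l)^2 / (d_l^T H d_l), l = 0..N-1
for k in range(N):
    errors_sq.append(h_norm_sq(x))
    Hd = matvec(H, d)
    terms.append(dot(d, r) ** 2 / dot(d, Hd))
    alpha = dot(r, r) / dot(d, Hd)
    x = [xi + alpha * di for xi, di in zip(x, d)]
    r_next = [ri - alpha * hi for ri, hi in zip(r, Hd)]
    beta = dot(r_next, r_next) / dot(r, r)
    d = [ri + beta * di for ri, di in zip(r_next, d)]
    r = r_next
errors_sq.append(h_norm_sq(x))                        # error after N steps

# Tail sums of the formula; the sum is empty (zero) for k = N.
tail_sums = [sum(terms[k:]) for k in range(N + 1)]
max_gap = max(abs(es - ts) for es, ts in zip(errors_sq, tail_sums)) / errors_sq[0]

for k in range(N + 1):
    print(k, errors_sq[k], tail_sums[k])
print("largest gap relative to ||e_0||_H^2:", max_gap)
```

Each iteration removes exactly one positive term from the sum, so the printed errors shrink monotonically and the last one is zero up to roundoff.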
Since each iteration of CG involves a matrix-vector multiply, each of which is $O(N^2)$, and we converge in $O(N)$ iterations, CG solves $Hx = b$ in $O(N^3)$ computations in general, the same as other solvers.
But there are two important things to realize:
1. If $H$ is specially structured so that it takes fewer than $O(N^2)$ computations to apply, then CG takes advantage of this. The real cost is $N$ applications of $H$.
2. It is often the case that $\|e_k\|_H^2$ is acceptably small for relatively modest values of $k$. This is particularly true if $H$ is well conditioned. Each iteration (application of $H$) gets us closer, in a measurable way, to the solution.
CG can get an approximate (but still potentially very good) solution using much less computation than solving the system directly.
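Both points can be illustrated with a matrix-free version of CG. The sketch below is an assumption-laden example rather than anything from the notes: the operator is a hypothetical tridiagonal matrix (4 on the diagonal, $-1$ next to it), which can be applied in $O(N)$ operations and whose eigenvalues lie between 2 and 6, so it is well conditioned. Since the true error is not available in practice, the loop stops on the relative residual norm instead of $\|e_k\|_H$; the names `apply_H`, `cg_matrix_free`, and `rel_tol` are made up for this example.

```python
import math

def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def apply_H(v):
    """Apply the tridiagonal matrix (4 on the diagonal, -1 beside it) in O(N)."""
    n = len(v)
    return [4.0 * v[i]
            - (v[i - 1] if i > 0 else 0.0)
            - (v[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]

def cg_matrix_free(apply_op, b, rel_tol=1e-8, max_iter=None):
    """CG from x_0 = 0 that touches the operator only through apply_op(v).

    Stops once ||r_k|| <= rel_tol * ||b||, or after max_iter iterations
    (default: the dimension N, the exact-arithmetic worst case).
    """
    n = len(b)
    max_iter = n if max_iter is None else max_iter
    x = [0.0] * n
    r = list(b)            # r_0 = b because x_0 = 0
    d = list(r)            # d_0 = r_0
    rr = dot(r, r)
    stop = (rel_tol ** 2) * rr
    iterations = 0
    while iterations < max_iter and rr > stop:
        Hd = apply_op(d)                       # the only use of H per iteration
        alpha = rr / dot(d, Hd)
        x = [xi + alpha * di for xi, di in zip(x, d)]
        r = [ri - alpha * hi for ri, hi in zip(r, Hd)]
        rr_next = dot(r, r)
        d = [ri + (rr_next / rr) * di for ri, di in zip(r, d)]
        rr = rr_next
        iterations += 1
    return x, iterations

N = 200
b = [1.0] * N
x, iterations = cg_matrix_free(apply_H, b)

# Recompute the residual from scratch to confirm the answer.
true_residual = [bi - hi for bi, hi in zip(b, apply_H(x))]
rel_residual = math.sqrt(dot(true_residual, true_residual) / dot(b, b))
print("iterations used:", iterations, "out of N =", N)
print("relative residual:", rel_residual)
```

On this example the loop should stop after far fewer than $N$ iterations, and each iteration costs $O(N)$ rather than $O(N^2)$, so the total work is well below that of a dense direct solve.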