Convex Optimization II - Lecture 13
Instructor (Stephen Boyd): Great, I guess this means we've started. So today we'll continue with the conjugate gradient material. Let me just review where we were last time. It was actually yesterday, but logically we can pretend it's been five days or whatever it would be.
So we're looking at solving symmetric positive definite systems of equations. This comes up in Newton's method, in interior point methods, in least squares, all these sorts of things. And last time we talked about the CG method. The basic idea is that it's a method that solves Ax = b, where A is positive definite, but it does so in a different way.
The methods you've seen before are factor-solve methods. In those methods, what you need is the matrix itself: you pass a pointer to an array of numbers, roughly, and then you work on that.
What's extremely interesting about the CG method is that the way A is described is completely different. You do not give a set of numbers. In fact, in most of the interesting applications of CG you will never form or store the matrix A, ever, because in most cases it's huge; it's some vast thing. That's kind of the point. Instead, what you need is simply a method for calculating A times a vector.
So what you're really gonna pass into a CG method is a pointer to a function that evaluates this matrix-vector product. How it evaluates it is its business; it's none of your business. That's sort of the idea.
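A minimal sketch of this matrix-free interface, assuming a specific example matrix for concreteness: here A is the 1-D Laplacian tridiag(-1, 2, -1), which is symmetric positive definite. The function name and test values are illustrative; the point is that only y = Ax is ever computed, and A is never formed or stored except in the small consistency check.

```python
import numpy as np

def laplacian_matvec(x):
    """Compute y = A @ x for A = tridiag(-1, 2, -1), without forming A."""
    y = 2.0 * x.copy()
    y[:-1] -= x[1:]   # superdiagonal contribution
    y[1:] -= x[:-1]   # subdiagonal contribution
    return y

# A CG routine would accept the function itself, not a matrix of numbers.
n = 5
x = np.arange(1.0, n + 1)          # x = [1, 2, 3, 4, 5]
y = laplacian_matvec(x)

# Consistency check against the explicitly formed matrix (small test only;
# in a real application A would be far too large to build).
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
assert np.allclose(y, A @ x)
```

The caller of CG sees only the function handle, so the same solver code works whether the product is computed from a sparse structure, an FFT, a PDE stencil, or anything else.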
Okay, well, last time we started looking at this a little bit. We looked at two different measures of the error. One measure is this number tau, the relative suboptimality: it's the suboptimality you've achieved, f(x) minus f*, where f is that quadratic function you're minimizing, divided by the original suboptimality, f(x0) minus f*.
That's probably what you're interested in. But another measure, which in many applications is actually easier to get a handle on, is the residual. And the residual is nothing but b - Ax. It's basically how far you are from solving Ax = b.
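A small sketch of the two error measures, assuming the quadratic f(x) = (1/2) x'Ax - b'x and starting point x0 = 0; the matrix, right-hand side, and iterate here are made-up illustrative values:

```python
import numpy as np

# Illustrative positive definite system (not from the lecture).
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

f = lambda x: 0.5 * x @ A @ x - b @ x

# Exact minimizer, computable only because this example is tiny.
x_star = np.linalg.solve(A, b)
f_star = f(x_star)

x = np.array([0.2, 0.5])           # some intermediate iterate
x0 = np.zeros(2)                   # starting point

# tau: achieved suboptimality divided by the original suboptimality.
tau = (f(x) - f_star) / (f(x0) - f_star)

# Residual norm: how far x is from solving Ax = b.
res = np.linalg.norm(b - A @ x)
```

In practice the residual is the easy one to monitor, since it needs only one matrix-vector product, whereas tau requires knowing f*, which you don't have while the iteration is running.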
Now the CG method, and actually many others, work with the idea of a Krylov
subspace. And this is just to rapidly review what we did last time.
The Krylov subspace is defined this way: you take the span of b, Ab, up to A^(k-1) b. And that essentially
means it's all vectors that can be written as a polynomial in A times b, where that
polynomial has degree k minus one. That's a subspace of dimension, oh, it could be k