Jan de Leeuw (UCLA Statistics)
Computational Statistics with R
Part III: Optimization in R
In this part of the course we look at algorithms to minimize a real-valued function on a subset of n-dimensional space. Thus the problem we study is

    min_{x ∈ Ω} f(x),  where Ω ⊆ R^n.

This also easily covers maximization problems (just use -f), but it does not cover vector-valued functions or functions defined on function spaces (the calculus of variations).
1. General Optimization
   1.1 Iterative Algorithms
   1.2 Global Convergence (Zangwill Theory)
   1.3 Local Convergence (Ostrowski Theory)
   1.4 Necessary Conditions
1.1 Iterative Algorithms

For our purposes an algorithm is a map F : Ω → Ω. A point x ∈ Ω is a fixed point of F if F(x) = x. We use F to generate a sequence of points by

    x^(k+1) = F(x^(k)).

The key problems we study are (a) when does this sequence converge to a fixed point of F, and (b) if it converges, how fast does it converge?
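The iteration above can be sketched in a few lines of R. This is a minimal sketch, not code from the notes; the function name `fixed_point` and its arguments are my own:

```r
# Generic fixed-point iteration: apply F repeatedly until successive
# iterates are closer than tol, or maxit iterations have been used.
fixed_point <- function(F, x0, tol = 1e-10, maxit = 1000) {
  x <- x0
  for (k in 1:maxit) {
    x_new <- F(x)
    if (abs(x_new - x) < tol) {
      return(list(x = x_new, iterations = k))
    }
    x <- x_new
  }
  list(x = x, iterations = maxit)
}

# Classic example: F(x) = cos(x) has a unique fixed point near 0.7390851.
result <- fixed_point(cos, x0 = 1)
result$x   # approximately 0.7390851, the solution of cos(x) = x
```

The stopping rule |x^(k+1) - x^(k)| < tol is the usual practical surrogate for convergence; it does not by itself guarantee that a fixed point has been reached.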
Clearly this way of formulating the problem is only useful if we can write minimization problems as fixed point problems. The link between the two is usually the necessary conditions for an optimum, which allow us to rewrite the problem as solving a system of equations (and/or inequalities). This system can then usually be rewritten in the form x = F(x) for some F. Many examples will follow.
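As an illustration of this rewriting (my own example, not from the notes): for f(x) = (x - 2)^2 the necessary condition is f'(x) = 2(x - 2) = 0, which can be rewritten as x = x - α f'(x). The resulting map has the minimizer as its fixed point:

```r
# Necessary condition f'(x) = 0 rewritten as the fixed-point equation
# x = x - alpha * f'(x). Any fixed point of Fmap is a stationary point of f.
f_prime <- function(x) 2 * (x - 2)
alpha   <- 0.1
Fmap    <- function(x) x - alpha * f_prime(x)

x <- 10                        # arbitrary starting point
for (k in 1:100) x <- Fmap(x)  # iterate x^(k+1) = F(x^(k))
x                              # close to 2, where f'(x) = 0
```

For this choice of α the map is a contraction, so the iterates approach the minimizer x = 2; too large a step size α would make the iteration diverge.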
1.2 Global Convergence (Zangwill Theory)

We first need some regularity conditions (which can be relaxed, if needed):

    Both F and f are continuous on Ω, and Ω is compact (closed and bounded).

And we need a condition linking the algorithmic map F with the function f we are minimizing:

    if F(x) ≠ x then f(F(x)) < f(x).
Zangwill Theorem

Under these conditions:
- The sequence x^(k) has at least one accumulation point.
- Each accumulation point x∞ is a fixed point of F.
- All accumulation points x∞ have the same function value f∞.
- f(x^(k)) → f∞.
- ‖x^(k+1) - x^(k)‖ → 0.
- If F has only a finite number of fixed points, then x^(k) → x∞.
The Zangwill Theorem can be used to prove global convergence of the iterative algorithm. Here global convergence means convergence from any initial point; it does not mean convergence to the global minimum. The proof (which we omit in this version of the notes) is a typical epsilon-delta proof. Observe that the function values f(x^(k)) always converge, but for the iterates x^(k) themselves to converge we need to have isolated fixed points.
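Two of the theorem's conclusions are easy to check numerically for a concrete descent map. The following sketch (my own, using the same toy function f(x) = (x - 2)^2 and step size 0.1 as an assumed example) verifies that the function values decrease monotonically and that the step lengths shrink toward zero:

```r
# Numerical check of two Zangwill conclusions for F(x) = x - alpha * f'(x):
# (1) f(x^(k)) is strictly decreasing (descent condition in action),
# (2) |x^(k+1) - x^(k)| -> 0.
f    <- function(x) (x - 2)^2
Fmap <- function(x) x - 0.1 * 2 * (x - 2)

xs <- numeric(50)
xs[1] <- 10
for (k in 2:50) xs[k] <- Fmap(xs[k - 1])

fvals <- f(xs)
steps <- abs(diff(xs))
all(diff(fvals) < 0)   # TRUE: function values strictly decrease
steps[49] < steps[1]   # TRUE: step lengths have shrunk
```

Of course a finite run cannot prove a limit statement; it only illustrates the behavior the theorem guarantees.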
1.3 Local Convergence (Ostrowski Theory)

A globally convergent iterative algorithm can still be pretty useless if convergence is too slow. So we need a way to measure the speed of convergence (and to estimate this speed without actually running the algorithm). Suppose we have a sequence x^(k) converging to some point x∞. We measure the local convergence speed, which usually depends on the limit point.
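One standard measure is the linear convergence rate, the limit of the error ratios ‖x^(k+1) - x∞‖ / ‖x^(k) - x∞‖; for a smooth scalar map this limit equals |F′(x∞)|. A sketch of estimating it empirically for the cos iteration (my own example; the value of x∞ below is the fixed point of cos computed to high accuracy):

```r
# Estimate the linear convergence rate of x^(k+1) = cos(x^(k)) from the
# ratio of successive errors, and compare with |F'(x_inf)| = |sin(x_inf)|.
x_inf <- 0.739085133215161   # fixed point of cos(x) = x

x <- 1
errs <- numeric(30)
for (k in 1:30) {
  x <- cos(x)
  errs[k] <- abs(x - x_inf)
}
ratios <- errs[-1] / errs[-30]
tail(ratios, 1)   # settles near 0.6736
abs(sin(x_inf))   # theoretical linear rate, about 0.6736
```

Because the rate is well below 1 the iteration converges linearly, but slowly: each step removes only about a third of the remaining error.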