Statistics, UCLA
Jan de Leeuw
Computational Statistics with R
Part III: Optimization in R

In this part of the course we look at algorithms to minimize a real-valued function on a subset of $n$-dimensional space. Thus the problem we study is

$$\min_{x \in \mathcal{X}} f(x),$$

where $\mathcal{X} \subseteq \mathbb{R}^n$. This also easily covers maximization problems (just use $-f$), but it does not cover vector-valued functions or functions defined on function spaces (calculus of variations).

1. General Optimization
1.1 Iterative Algorithms
1.2 Global Convergence (Zangwill Theory)
1.3 Local Convergence (Ostrowski Theory)
1.4 Necessary Conditions

1.1 Iterative Algorithms

For our purposes an algorithm is a map $F : \mathcal{X} \to \mathcal{X}$. A point $x \in \mathcal{X}$ is a fixed point of $F$ if $F(x) = x$. We use $F$ to generate a sequence of points by

$$x^{(k+1)} = F(x^{(k)}).$$

The key problems we study are (a) when does this sequence converge to a fixed point of $F$, and (b) if it converges, how fast does it converge?

Clearly this way of formulating the problem is only useful if we can write minimization problems as fixed point problems. The link between the two is usually provided by the necessary conditions for an optimum, which allow us to rewrite the problem as solving a system of equations (and/or inequalities). This system can then usually be rewritten in the form $x = F(x)$ for some $F$. Many examples will follow.

1.2 Global Convergence (Zangwill Theory)

We first need some regularity conditions (which can be relaxed, if needed):

- Both $F$ and $f$ are continuous on $\mathcal{X}$.
- $\mathcal{X}$ is compact (closed and bounded).

And we need a condition linking the algorithmic map $F$ with the function $f$ we are minimizing:

- If $F(x) \neq x$ then $f(F(x)) < f(x)$.

Zangwill Theorem

- The sequence $x^{(k)}$ has at least one accumulation point.
- Each accumulation point $x_\infty$ is a fixed point of $F$.
- All accumulation points $x_\infty$ have the same function value $f_\infty$.
- $f(x^{(k)}) \to f_\infty$.
- $\|x^{(k+1)} - x^{(k)}\| \to 0$.
- If $F$ has only a finite number of fixed points, then $x^{(k)} \to x_\infty$.

The Zangwill Theorem can be used to prove global convergence of the iterative algorithm. Here global convergence means convergence from any initial point; it does not mean convergence to the global minimum. The proof (which we omit in this version of the notes) is a typical epsilon-delta proof. Observe that the function values $f(x^{(k)})$ always converge, but for the iterates $x^{(k)}$ themselves to converge we need to have isolated fixed points.

1.3 Local Convergence (Ostrowski Theory)

A globally convergent iterative algorithm can still be pretty useless if convergence is too slow. So we need a way to measure the speed of convergence (and to estimate this speed without actually running the algorithm).
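To make the setup of sections 1.1 and 1.2 concrete, here is a minimal R sketch. It is an illustration under stated assumptions, not an example taken from the notes. The quadratic objective, the matrix `A`, the vector `b`, the step size `alpha`, and the helper names `Fmap` and `iterate` are all hypothetical choices. The map `Fmap` takes a plain gradient step, so a point is a fixed point of `Fmap` exactly when the gradient vanishes there, and for this step size each step with a nonzero gradient lowers `f`.

```r
# Hypothetical example: minimize f(x) = 0.5 * x'Ax - b'x for a small
# positive definite A. None of these numbers come from the notes.
A <- matrix(c(2, 0.5, 0.5, 1), 2, 2)
b <- c(1, 1)
f <- function(x) 0.5 * sum(x * as.vector(A %*% x)) - sum(b * x)

# Assumed step size; small enough here that a gradient step decreases f.
alpha <- 0.4

# Algorithmic map (named Fmap because F is shorthand for FALSE in R).
# Fmap(x) = x holds exactly when A x = b, i.e. when the gradient is zero.
Fmap <- function(x) x - alpha * as.vector(A %*% x - b)

# Generate x^(k+1) = Fmap(x^(k)) until successive iterates are within eps,
# recording function values and step lengths along the way.
iterate <- function(Fmap, f, x, eps = 1e-10, itmax = 1000) {
  fvals <- f(x)
  steps <- numeric(0)
  for (k in seq_len(itmax)) {
    xnew <- Fmap(x)
    steps <- c(steps, sqrt(sum((xnew - x)^2)))
    fvals <- c(fvals, f(xnew))
    x <- xnew
    if (steps[k] < eps) break
  }
  list(x = x, fvals = fvals, steps = steps)
}

res <- iterate(Fmap, f, c(5, -3))

# Ratios of successive step lengths: a rough empirical look at speed.
ratios <- res$steps[-1] / head(res$steps, -1)
```

In this run `res$fvals` is non-increasing up to rounding error and `res$steps` shrinks toward zero, in line with the statements of the theorem. The entries of `ratios` give a rough empirical impression of how fast the iterates approach the fixed point, which is the kind of quantity section 1.3 sets out to measure.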
 Fall '10
 JandeLeeuw
