AMSC/CMSC 660 Scientific Computing I
Fall 2006
Unit 3: Optimization
Dianne P. O'Leary
© 2002, 2004, 2006

Optimization: Fundamentals

Our goal is to develop algorithms to solve the problem

Problem P: Given a function f : S → R, find

    min_{x ∈ S} f(x)

with solution x_opt.

The point x_opt is called the minimizer, and the value f(x_opt) is the minimum.

For unconstrained optimization, the set S is usually taken to be R^n, but sometimes we make use of upper or lower bounds on the variables, restricting our search to a box {x : ℓ ≤ x ≤ u} for some given vectors ℓ, u ∈ R^n.

The plan

1. Basics of unconstrained optimization
2. Alternatives to Newton's method
3. Fundamentals of constrained optimization

Part 1: Basics of unconstrained optimization

Plan for Part 1:

• How do we recognize a solution?
• Some geometry.
• Our basic algorithm for finding a solution.
• The model method: Newton.
• How close to Newton do we need to be?
• Making methods safe:
  – Descent directions and line searches.
  – Trust regions.

How do we recognize a solution?

What does it mean to be a solution?

The point x_opt is a local solution to Problem P if there is a δ > 0 so that if x ∈ S and ‖x − x_opt‖ < δ, then f(x_opt) ≤ f(x). In other words, x_opt is at least as good as any point in its neighborhood.

The point x_opt is a global solution to Problem P if for any x ∈ S, f(x_opt) ≤ f(x).

Note: It would be nice if every local solution were guaranteed to be global. This is true if f is convex. We'll look at this case more carefully in the "Geometry" section of these notes.

Some notation

We'll assume throughout this unit that f is smooth enough that it has as many continuous derivatives as we need. For this section, that means two continuous derivatives plus one more, possibly discontinuous.

The gradient of f at x is defined to be the vector

    g(x) = ∇f(x) = [∂f/∂x_1, ..., ∂f/∂x_n]^T.

The Hessian of f at x is the derivative of the gradient:

    H(x) = ∇²f(x),  with  h_ij = ∂²f / (∂x_i ∂x_j).

Note that the Hessian is symmetric, unless f fails to be smooth enough.

How do we recognize a solution?

Recall the Taylor series from calculus: suppose we have a vector p ∈ R^n with ‖p‖ = 1 and a small scalar h. Then

    f(x_opt + h p) = f(x_opt) + h p^T g(x_opt) + (1/2) h² p^T H(x_opt) p + O(h³).

First Order Necessary Condition for Optimality

    f(x_opt + h p) = f(x_opt) + h p^T g(x_opt) + (1/2) h² p^T H(x_opt) p + O(h³).

Now suppose that g(x_opt) is nonzero. Then we can always find a descent or downhill direction p so that p^T g(x_opt) < 0. (Take, for example, p = −g(x_opt) / ‖g(x_opt)‖.) Therefore, for small enough h, the term (1/2) h² p^T H(x_opt) p is small enough that f(x_opt + h p) < f(x_opt). ...
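To make the gradient and Hessian definitions concrete, here is a minimal Python sketch (not part of the original notes) that approximates g(x) = ∇f(x) and H(x) = ∇²f(x) by central finite differences and checks the symmetry of the Hessian. The sample function f, the test point, and the step sizes are illustrative assumptions.

```python
import numpy as np

def f(x):
    # Sample smooth function (an assumption for illustration):
    # f(x) = x1^4 + x1*x2 + (1 + x2)^2
    return x[0]**4 + x[0]*x[1] + (1.0 + x[1])**2

def gradient(f, x, eps=1e-6):
    # Central-difference approximation to g(x) = grad f(x).
    n = len(x)
    g = np.zeros(n)
    for i in range(n):
        e = np.zeros(n); e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return g

def hessian(f, x, eps=1e-4):
    # Central-difference approximation to H(x), entry by entry:
    # h_ij = d^2 f / (dx_i dx_j).
    n = len(x)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = eps
            ej = np.zeros(n); ej[j] = eps
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * eps**2)
    return H

x = np.array([1.0, 2.0])
H = hessian(f, x)
print("gradient:", gradient(f, x))              # exact value is (6, 7)
print("Hessian symmetric?", np.allclose(H, H.T))  # True, since f is smooth
```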
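A short follow-up sketch, reusing the f and gradient helper above, illustrates the argument behind the first-order necessary condition: when g(x) ≠ 0, the unit step p = −g(x)/‖g(x)‖ satisfies p^T g(x) = −‖g(x)‖ < 0, so f(x + h p) < f(x) for small enough h, and such a point cannot be a local minimizer.

```python
g = gradient(f, x)
p = -g / np.linalg.norm(g)         # steepest-descent direction, with norm(p) = 1
for h in [1e-1, 1e-2, 1e-3]:
    print(h, f(x + h * p) < f(x))  # expect True for each small h
```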