CALCULUS OF VARIATIONS MA 4311 LECTURE NOTES
I. B. Russak
Department of Mathematics
Naval Postgraduate School
Code MA/Ru, Monterey, California 93943
July 9, 2002
© 1996 Professor I. B. Russak

Contents
1 Functions of n Variables
  1.1 Unconstrained Minimum
  1.2 Constrained Minimization
2 Examples, Notation
  2.1 Notation & Conventions
  2.2 Shortest Distances
3 First Results
  3.1 Two Important Auxiliary Formulas
  3.2 Two Important Auxiliary Formulas in the General Case
4 Variable End-Point Problems
  4.1 The General Problem
  4.2 Appendix
5 Higher Dimensional Problems and Another Proof of the Second Euler Equation
  5.1 Variational Problems with Constraints
    5.1.1 Isoparametric Problems
    5.1.2 Point Constraints
6 Integrals Involving More Than One Independent Variable
7 Examples of Numerical Techniques
  7.1 Indirect Methods
    7.1.1 Fixed End Points
    7.1.2 Variable End Points
  7.2 Direct Methods
8 The Rayleigh-Ritz Method
  8.1 Euler's Method of Finite Differences
9 Hamilton's Principle
10 Degrees of Freedom - Generalized Coordinates
11 Integrals Involving Higher Derivatives
12 Piecewise Smooth Arcs and Additional Results
13 Field Theory, Jacobi's Necessary Condition and Sufficiency

List of Figures
1  Neighborhood S of X0
2  Neighborhood S of X0 and a particular direction H
3  Two dimensional neighborhood of X0 showing tangent at that point
4  The constraint
5  The surface of revolution for the soap example
6  Brachistochrone problem
7  An arc connecting X1 and X2
8  Admissible function η vanishing at end points (bottom) and various admissible functions (top)
9  Families of arcs y0 + εη
10 Line segment of variable length with endpoints on the curves C, D
11 Curves described by endpoints of the family y(x, b)
12 Cycloid
13 A particle falling from point 1 to point 2
14 Cycloid
15 Curves C, D described by the endpoints of segment y34
16 Shortest arc from a fixed point 1 to a curve N. G is the evolute
17 Path of quickest descent, y12, from point 1 to the curve N
18 Intersection of a plane with a sphere
19 Domain R with outward normal making an angle with the x axis
20 Solution of example given by (14)
21 The exact solution (solid line) is compared with y0 (dash dot), y1 (dot) and y2 (dash)
22 Piecewise linear function
23 The exact solution (solid line) is compared with y1 (dot), y2 (dash dot), y3 (dash) and y4 (dot)
24 Paths made by the vectors R and R + ΔR
25 Unit vectors e_r, e_θ, and e_φ
26 A simple pendulum
27 A compound pendulum
28 Two nearby points 3, 4 on the minimizing arc
29 Line segment of variable length with endpoints on the curves C, D
30 Shortest arc from a fixed point 1 to a curve N. G is the evolute
31 Line segment of variable length with endpoints on the curves C, D
32 Conjugate point at the right end of an extremal arc
33 Line segment of variable length with endpoints on the curves C, D
34 The path of quickest descent from point 1 to a curve N

Credits

Much of the material in these notes was taken from the following texts:
1. Bliss - Calculus of Variations, Carus monograph - Open Court Publishing Co. - 1924
2. Gelfand & Fomin - Calculus of Variations - Prentice Hall - 1963
3. Forray - Variational Calculus - McGraw Hill - 1968
4. Weinstock - Calculus of Variations - Dover - 1974
5. J. D. Logan - Applied Mathematics, Second Edition - John Wiley - 1997

The figures are plotted by Lt. Thomas A. Hamrick, USN and Lt. Gerald N. Miranda, USN using Matlab. They also revamped the numerical examples chapter to include Matlab software and problems for the reader.

CHAPTER 1

1 Functions of n Variables

The first topic is that of finding maxima or minima of (i.e. optimizing) functions of n variables.
Thus suppose that we have a function f(x_1, x_2, …, x_n) = f(X) (where X denotes the n-tuple (x_1, x_2, …, x_n)) defined in some subset of n-dimensional space R^n, and that we wish to optimize f, i.e. to find a point X_0 such that

    f(X_0) ≤ f(X)   or   f(X_0) ≥ f(X)    (1)

The first inequality states a problem in minimizing f, while the latter states a problem in maximizing f. Mathematically, there is little difference between the two problems, for maximizing f is equivalent to minimizing the function G = −f. Because of this, we shall tend to discuss only minimization problems, it being understood that corresponding results carry over to the other type of problem.

We shall generally (unless otherwise stated) take f to have sufficient continuous differentiability to justify our operations. The notation to discuss differentiability will be that f is of class C^i, which means that f has continuous derivatives up through the ith order.

1.1 Unconstrained Minimum

As a first specific optimization problem, suppose that we have a function f defined on some open set in R^n. Then f is said to have an unconstrained relative minimum at X_0 if

    f(X_0) ≤ f(X)    (2)

for all points X in some neighborhood S of X_0. X_0 is called a relative minimizing point.

We make some comments: Firstly, the word relative used above means that X_0 is a minimizing point for f in comparison to nearby points, rather than also in comparison to distant points. Our results will generally be of this "relative" nature. Secondly, the word unconstrained means essentially that in doing the above discussed comparison we can proceed in any direction from the minimizing point. Thus in Figure 1, we may proceed in any direction from X_0 to any point in some neighborhood S to make this comparison.

In order for (2) to be true, we must have that
    ∑_{i=1}^n f_{x_i} h_i = 0   ⟹   f_{x_i} = 0,  i = 1, …, n    (3a)

and

    ∑_{i,j=1}^n f_{x_i x_j} h_i h_j ≥ 0    (3b)

[Figure 1: Neighborhood S of X_0]

for all vectors H = (h_1, h_2, …, h_n), where f_{x_i} and f_{x_i x_j} are respectively the first and second order partials at X_0:

    f_{x_i} ≡ ∂f/∂x_i ,   f_{x_i x_j} ≡ ∂²f/(∂x_i ∂x_j)

The implication in (3a) follows since the first part of (3a) holds for all vectors H. Condition (3a) says that the first derivative in the direction specified by the vector H must be zero, and (3b) says that the second derivative in that direction must be non-negative, these statements being true for all vectors H. In order to prove these statements, consider a particular direction H and the points X(ε) = X_0 + εH for small numbers ε (so that X(ε) is in S). The picture is given in Figure 2.
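Before turning to the proof, conditions (3a) and (3b) are easy to check numerically for a concrete function. The sketch below (in Python; an illustration added here, since the notes' own numerical work is in Matlab) builds the one-variable slice g(ε) = f(X_0 + εH) used in the proof and verifies both conditions by finite differences, for a sample f with minimum at the origin:

```python
# Numerical illustration of the necessary conditions (3a) and (3b).
# f is a sample function (chosen for this illustration only) with an
# unconstrained relative minimum at X0 = (0, 0).

def f(x1, x2):
    return x1**2 + 3*x2**2

def g(eps, X0, H):
    """The one-variable slice g(eps) = f(X0 + eps*H) used in the proof."""
    return f(X0[0] + eps*H[0], X0[1] + eps*H[1])

def dg(X0, H, h=1e-5):
    """Central-difference estimate of dg(0)/d(eps)."""
    return (g(h, X0, H) - g(-h, X0, H)) / (2*h)

def d2g(X0, H, h=1e-4):
    """Central-difference estimate of d^2 g(0)/d(eps)^2."""
    return (g(h, X0, H) - 2*g(0, X0, H) + g(-h, X0, H)) / h**2

X0 = (0.0, 0.0)
for H in [(1, 0), (0, 1), (1, 1), (-2, 3)]:
    assert abs(dg(X0, H)) < 1e-8   # first order: derivative along H vanishes
    assert d2g(X0, H) >= 0         # second order: non-negative along H
```

Every direction H gives a vanishing first directional derivative and a non-negative second one at the minimizer, exactly the content of (3a) and (3b).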
[Figure 2: Neighborhood S of X_0 and a particular direction H]

Define the function

    g(ε) = f(X_0 + εH)    (4)

where ε is small enough so that X_0 + εH is in S. Since X_0 is a relative minimizing point, then

    g(ε) − g(0) = f(X_0 + εH) − f(X_0) ≥ 0,   ε ≥ 0    (5a)

Since −H is also a direction in which we may find points X to compare with, then we may also define g for negative ε and extend (5a) to read

    g(ε) − g(0) = f(X_0 + εH) − f(X_0) ≥ 0,   ε ≤ 0    (5b)

Thus ε = 0 is a relative minimizing point for g, and we know (from results for a function of one variable) that

    dg(0)/dε = 0   and   d²g(0)/dε² ≥ 0    (6)

Now f is a function of the point X = (x_1, …, x_n), where the components of X(ε) are specified by

    x_i(ε) = x_{0,i} + εh_i ,   i = 1, …, n    (7)

so that differentiating by the chain rule yields

    0 = dg(0)/dε = ∑_{i=1}^n f_{x_i} dx_i/dε = ∑_{i=1}^n f_{x_i} h_i   (which ⟹ f_{x_i} = 0, i = 1, …, n)    (8a)

and

    0 ≤ d²g(0)/dε² = ∑_{i,j=1}^n f_{x_i x_j} (dx_i/dε)(dx_j/dε) = ∑_{i,j=1}^n f_{x_i x_j} h_i h_j    (8b)

in which (8b) has used (8a). In (8) all derivatives of f are at X_0 and the derivatives of x are at ε = 0.

This proves (3a) and (3b), which are known as the first and second order necessary conditions for a relative minimum to exist at X_0. The term necessary means that they are required in order that X_0 be a relative minimizing point. The terms first and second order refer to (3a) being a condition on the first derivative and (3b) being a condition on the second derivative of f.

In this course we will be primarily concerned with necessary conditions for minimization; however, for completeness we state the following: As a sufficient condition for X_0 to be a relative minimizing point one has that if
    ∑_{i=1}^n f_{x_i} h_i = 0   and   ∑_{i,j=1}^n f_{x_i x_j} h_i h_j > 0    (9)

for all vectors H = (h_1, …, h_n) ≠ 0, with all derivatives computed at X_0, then X_0 is an unconstrained relative minimizing point for f.

Theorem 1. If f″(x) exists in a neighborhood of x_0 and is continuous at x_0, then

    f(x_0 + h) − f(x_0) = f′(x_0)h + (1/2) f″(x_0)h² + ε(h),   |h| < δ    (10)

where lim_{h→0} ε(h)/h² = 0.

Proof. By Taylor's formula

    f(x_0 + h) − f(x_0) = f′(x_0)h + (1/2) f″(x_0 + Θh)h²    (11)

with 0 < Θ < 1, so that

    f(x_0 + h) − f(x_0) = f′(x_0)h + (1/2) f″(x_0)h² + (1/2)[f″(x_0 + Θh) − f″(x_0)]h²

The term in brackets tends to 0 as h → 0 since f″ is continuous. Hence

    ε(h) = (1/2)[f″(x_0 + Θh) − f″(x_0)]h²

satisfies lim_{h→0} ε(h)/h² = 0. This proves (10).

Now suppose f ∈ C²[a, b] and f has a relative minimum at x = x_0. Then clearly

    f(x_0 + h) − f(x_0) ≥ 0    (12)

and

    f′(x_0) = 0    (13)

Using (10) and (13) we have

    f(x_0 + h) − f(x_0) = (1/2) f″(x_0)h² + ε(h) ≥ 0    (14)

with

    lim_{h→0} ε(h)/h² = 0    (15)

Now pick h_0 so that |h_0| < δ; then

    f(x_0 + h_0) − f(x_0) = (1/2) f″(x_0)h_0² + ε(h_0) ≥ 0    (16)

Since

    (1/2) f″(x_0)h_0² + ε(h_0) = (1/2) h_0² [ f″(x_0) + 2ε(h_0)/h_0² ]

and since

    lim_{h_0→0} ε(h_0)/h_0² = 0,
we have by necessity f″(x_0) ≥ 0.

1.2 Constrained Minimization

As an introduction to constrained optimization problems, consider the situation of seeking a minimizing point for the function f(X) among points which satisfy a condition

    φ(X) = 0    (17)

Such a problem is called a constrained optimization problem, and the function φ is called a constraint. If X_0 is a solution to this problem, then we say that X_0 is a relative minimizing point for f subject to the constraint φ = 0.

In this case, because of the constraint φ = 0, all directions are no longer available to get comparison points. Our comparison points must satisfy (17). Thus if X(ε) is a curve of comparison points in a neighborhood S of X_0, and if X(ε) passes through X_0 (say at ε = 0), then since X(ε) must satisfy (17) we have

    φ(X(ε)) − φ(X(0)) = 0    (18)

so that also

    dφ(X(0))/dε = lim_{ε→0} [φ(X(ε)) − φ(X(0))]/ε = 0 ,   i.e.   ∑_{i=1}^n φ_{x_i} dx_i(0)/dε = 0    (19)

In two dimensions (i.e. for n = 2) the picture is shown in Figure 3.

[Figure 3: Two dimensional neighborhood of X_0 showing the tangent at that point. The tangent at X_0 has components (dx_1(0)/dε, dx_2(0)/dε); the points X(ε) satisfy φ = 0.]

Thus these tangent vectors, i.e. vectors H which satisfy (19), become (with dx_i(0)/dε replaced by h_i)

    ∑_{i=1}^n φ_{x_i} h_i = 0    (20)

and are the only possible directions in which we find comparison points. Because of this, the condition here which corresponds to the first order condition (3a) in the unconstrained problem is
    ∑_{i=1}^n f_{x_i} h_i = 0    (21)

for all vectors H satisfying (19), instead of for all vectors H. This condition is not in usable form, i.e. it does not lead to the implications in (3a), which is really the condition used in solving unconstrained problems. In order to get a usable condition for the constrained problem, we depart from the geometric approach (although one could pursue it to get a condition).

As an example of a constrained optimization problem, let us consider the problem of finding the minimum distance from the origin to the surface x² − z² = 1. This can be stated as the problem

    minimize f = x² + y² + z²   subject to   φ = x² − z² − 1 = 0

and is the problem of finding the point(s) on the surface x² − z² = 1 closest to the origin.
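Before solving this analytically, a brute-force numerical check suggests what answer to expect. The sketch below (Python; an added illustration, and the 0.01 grid over [−2, 2] is an arbitrary choice) eliminates x via the constraint, since on the surface x² = 1 + z², so that f reduces to 1 + y² + 2z² over free y and z:

```python
import math

# Grid search for the point on the surface x^2 - z^2 = 1 nearest the origin.
# On the constraint, x^2 = 1 + z^2, so f = x^2 + y^2 + z^2 = 1 + y^2 + 2 z^2.

best = None
for iy in range(-200, 201):
    for iz in range(-200, 201):
        y, z = iy * 0.01, iz * 0.01
        fval = 1 + y**2 + 2*z**2          # x^2 = 1 + z^2 already substituted
        if best is None or fval < best[0]:
            best = (fval, y, z)

fval, y, z = best
x = math.sqrt(1 + z**2)                    # the mirror point -x does equally well
print(fval, x, y, z)
# minimum found at y = z = 0, i.e. at (+-1, 0, 0), with f = 1
```

The search lands on y = z = 0 with f = 1, pointing to the candidate points (±1, 0, 0) that the analysis below will confirm.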
[Figure 4: The constraint surface x² − z² = 1]

A common technique to try is substitution, i.e. using φ to solve for one variable in terms of the other(s).

Solving φ for z gives z² = x² − 1, and substituting into f gives f = 2x² + y² − 1. Solving this as the unconstrained problem

    min f = 2x² + y² − 1

gives the conditions 0 = f_x = 4x and 0 = f_y = 2y, which imply x = y = 0 at the minimizing point. But at this point z² = −1, which means that there is no real solution point. But this is nonsense, as the physical picture shows.

A surer way to solve constrained optimization problems comes from the following: For the problem

    minimize f subject to φ = 0,

if X_0 is a relative minimum, then there is a constant λ such that, with the function F defined by

    F = f + λφ    (22)

then
    ∑_{i=1}^n F_{x_i} h_i = 0   for all vectors H    (23)

This constitutes the first order condition for this problem, and it is in usable form since it is true for all vectors H and so implies the equations

    F_{x_i} = 0 ,   i = 1, …, n    (24)

This is called the method of Lagrange Multipliers and, with the n equations (24) together with the constraint equation, provides n + 1 equations for the n + 1 unknowns x_1, …, x_n, λ.

Solving the previous problem by this method, we form the function

    F = x² + y² + z² + λ(x² − z² − 1)    (25)

The system (24) together with the constraint gives the equations

    0 = F_x = 2x + 2λx = 2x(1 + λ)    (26a)
    0 = F_y = 2y    (26b)
    0 = F_z = 2z − 2λz = 2z(1 − λ)    (26c)
    0 = x² − z² − 1    (26d)

Now (26b) ⟹ y = 0, and (26a) ⟹ x = 0 or λ = −1. For the case x = 0 and y = 0, we have from (26d) that z² = −1, which gives no real solution. Trying the other possibility, y = 0 and λ = −1, then (26c) gives z = 0, and then (26d) gives x² = 1, or x = ±1. Thus the only possible points are (±1, 0, 0).

The method covers the case of more than one constraint, say k constraints:

    φ_i = 0 ,   i = 1, …, k < n    (27)

In this situation there are k constants λ_i (one for each constraint) and the function

    F = f + ∑_{i=1}^k λ_i φ_i    (28)

satisfying (24). Thus here there are k + n unknowns λ_1, …, λ_k, x_1, …, x_n and k + n equations to determine them, namely the n equations (24) together with the k constraints (27).

Problems

1. Use the method of Lagrange Multipliers to solve the problem

    minimize f = x² + y² + z²   subject to   φ = xy + 1 − z = 0

2. Show that

    max_x  x / cosh x  =  x_0 / cosh x_0 ,

where x_0 is the positive root of cosh x − x sinh x = 0. Sketch to show x_0.

3. Of all rectangular parallelepipeds which have sides parallel to the coordinate planes, and which are inscribed in the ellipsoid

    x²/a² + y²/b² + z²/c² = 1,

determine the dimensions of that one which has the largest volume.

4. Of all parabolas which pass through the points (0,0) and (1,1), determine that one which, when rotated about the x-axis, generates a solid of revolution with least possible volume between x = 0 and x = 1. [Notice that the equation may be taken in the form y = x + cx(1 − x), where c is to be determined.]

5. a. If x = (x_1, x_2, …, x_n) is a real vector, and A is a real symmetric matrix of order n, show that the requirement that

    F ≡ xᵀAx − λxᵀx

be stationary, for a prescribed A, takes the form

    Ax = λx.

Deduce that the requirement that the quadratic form α ≡ xᵀAx be stationary, subject to the constraint β ≡ xᵀx = constant, leads to the requirement Ax = λx, where λ is a constant to be determined. [Notice that the same is true of the requirement that α is stationary subject to the constraint that β = constant, with a suitable definition of λ.]

b. Show that, if we write

    λ = xᵀAx / xᵀx ≡ α/β,

the requirement that λ be stationary leads again to the matrix equation Ax = λx. [Notice that the requirement dλ = 0 can be written as

    (β dα − α dβ)/β² = 0   or   (dα − λ dβ)/β = 0.]

Deduce that stationary values of the ratio xᵀAx / xᵀx are characteristic numbers of the symmetric matrix A.

CHAPTER 2

2 Examples, Notation

In the last chapter we were concerned with problems of optimization for functions of a finite number of variables.
Thus we had to select values of n variables x_1, …, x_n in order to solve for a minimum of the function f(x_1, …, x_n). Now we can also consider problems with an infinite number of variables, such as selecting the value of y at each point x in some interval [a, b] of the x axis in order to minimize (or maximize) the integral

    ∫_{x_1}^{x_2} F(x, y, y′) dx .

Again as in the finite dimensional case, maximizing ∫_{x_1}^{x_2} F dx is the same as minimizing −∫_{x_1}^{x_2} F dx, so that we shall concentrate on minimization problems, it being understood that these include maximization problems. Also as in the finite dimensional case we can speak of relative minima. An arc y_0 is said to provide a relative minimum for the above integral if it provides a minimum of the integral over those arcs which (satisfy all conditions of the problem and) are in a neighborhood of y_0. A neighborhood of y_0 means a neighborhood of the points (x, y_0(x), y_0′(x)), x_1 ≤ x ≤ x_2, so that an arc y is in this neighborhood if
    max_{x_1 ≤ x ≤ x_2} |y(x) − y_0(x)| < ε   and   max_{x_1 ≤ x ≤ x_2} |y′(x) − y_0′(x)| < ε

for some ε > 0. (We shall later speak of a different type of relative minimum and a different type of neighborhood of y_0.)

Thus a relative minimum is in contrast to a global minimum, where the integral is minimized over all arcs (which satisfy the conditions of the problem). Our results will generally be of this relative nature; of course, any global minimizing arc is also a relative minimizing arc, so that the necessary conditions which we prove for the relative case will also hold for the global case.

The simplest of all the problems of the calculus of variations is doubtless that of determining the shortest arc joining two given points. The coordinates of these points will be denoted by (x_1, y_1) and (x_2, y_2), and we may designate the points themselves, when convenient, simply by the numerals 1 and 2. If the equation of an arc is taken in the form

    y : y(x)   (x_1 ≤ x ≤ x_2)    (1)

then the conditions that it shall pass through the two given points are

    y(x_1) = y_1 ,   y(x_2) = y_2    (2)

and we know from the calculus that the length of the arc is given by the integral

    I = ∫_{x_1}^{x_2} √(1 + y′²) dx ,

where in the evaluation of the integral, y′ is to be replaced by the derivative y′(x) of the function y(x) defining the arc. There is an infinite number of curves y = y(x) joining the points 1 and 2. The problem of finding the shortest one is equivalent analytically to that of finding, in the class of functions y(x) satisfying the conditions (2), one which makes the integral I a minimum.
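To make the integral I concrete, it can be evaluated numerically for particular admissible arcs. The following sketch (Python; an added illustration, not from the notes, which use Matlab) compares the straight line y = x with the parabola y = x², both joining (0,0) and (1,1), using a midpoint-rule quadrature:

```python
import math

def arc_length(dy, x1=0.0, x2=1.0, n=10000):
    """Approximate I = integral of sqrt(1 + y'(x)^2) dx by the midpoint rule.
    dy is the derivative y'(x) of the arc."""
    h = (x2 - x1) / n
    total = 0.0
    for k in range(n):
        x = x1 + (k + 0.5) * h
        total += math.sqrt(1 + dy(x)**2) * h
    return total

# Two admissible arcs joining (0,0) and (1,1):
line = arc_length(lambda x: 1.0)        # y = x,   with y' = 1
parabola = arc_length(lambda x: 2*x)    # y = x^2, with y' = 2x

print(line, parabola)
assert line < parabola                  # the straight line is shorter
```

The line gives I = √2 ≈ 1.414, while the parabola gives I ≈ 1.479; among all smooth arcs joining the two points the straight line will turn out to minimize I, as the theory developed below shows.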
[Figure 5: The surface of revolution for the soap example]