CALCULUS OF VARIATIONS MA 4311 LECTURE NOTES
I. B. Russak
Department of Mathematics
Naval Postgraduate School
Code MA/Ru
Monterey, California 93943
July 9, 2002
© 1996 Professor I. B. Russak

Contents
1 Functions of n Variables
    1.1 Unconstrained Minimum
    1.2 Constrained Minimization
2 Examples, Notation
    2.1 Notation & Conventions
    2.2 Shortest Distances
3 First Results
    3.1 Two Important Auxiliary Formulas
    3.2 Two Important Auxiliary Formulas in the General Case
4 Variable End-Point Problems
    4.1 The General Problem
    4.2 Appendix
5 Higher Dimensional Problems and Another Proof of the Second Euler Equation
    5.1 Variational Problems with Constraints
        5.1.1 Isoparametric Problems
        5.1.2 Point Constraints
6 Integrals Involving More Than One Independent Variable
7 Examples of Numerical Techniques
    7.1 Indirect Methods
        7.1.1 Fixed End Points
        7.1.2 Variable End Points
    7.2 Direct Methods
8 The Rayleigh-Ritz Method
    8.1 Euler's Method of Finite Differences
9 Hamilton's Principle
10 Degrees of Freedom - Generalized Coordinates
11 Integrals Involving Higher Derivatives
12 Piecewise Smooth Arcs and Additional Results
13 Field Theory Jacobi's Necessary Condition and Sufficiency

List of Figures
1 Neighborhood S of X0
2 Neighborhood S of X0 and a particular direction H
3 Two dimensional neighborhood of X0 showing tangent at that point
4 The constraint $\phi$
5 The surface of revolution for the soap example
6 Brachistochrone problem
7 An arc connecting X1 and X2
8 Admissible function $\eta$ vanishing at end points (bottom) and various admissible functions (top)
9 Families of arcs $y_0 + \epsilon\eta$
10 Line segment of variable length with endpoints on the curves C, D
11 Curves described by endpoints of the family y(x, b)
12 Cycloid
13 A particle falling from point 1 to point 2
14 Cycloid
15 Curves C, D described by the endpoints of segment y34
16 Shortest arc from a fixed point 1 to a curve N. G is the evolute
17 Path of quickest descent, y12, from point 1 to the curve N
18 Intersection of a plane with a sphere
19 Domain R with outward normal making an angle with x axis
20 Solution of example given by (14)
21 The exact solution (solid line) is compared with 0 (dash dot), y1 (dot) and y2 (dash)
22 Piecewise linear function
23 The exact solution (solid line) is compared with y1 (dot), y2 (dash dot), y3 (dash) and y4 (dot)
24 Paths made by the vectors R and R + R
25 Unit vectors er , e , and e
26 A simple pendulum
27 A compound pendulum
28 Two nearby points 3,4 on the minimizing arc
29 Line segment of variable length with endpoints on the curves C, D
30 Shortest arc from a fixed point 1 to a curve N. G is the evolute
31 Line segment of variable length with endpoints on the curves C, D
32 Conjugate point at the right end of an extremal arc
33 Line segment of variable length with endpoints on the curves C, D
34 The path of quickest descent from point 1 to a curve N

Credits

Much of the material in these notes was taken from the following texts:
1. Bliss, Calculus of Variations, Carus Monograph, Open Court Publishing Co., 1924
2. Gelfand & Fomin, Calculus of Variations, Prentice Hall, 1963
3. Forray, Variational Calculus, McGraw Hill, 1968
4. Weinstock, Calculus of Variations, Dover, 1974
5. J. D. Logan, Applied Mathematics, Second Edition, John Wiley, 1997

The figures are plotted by Lt. Thomas A. Hamrick, USN and Lt. Gerald N. Miranda, USN using Matlab. They also revamped the numerical examples chapter to include Matlab software and problems for the reader.

CHAPTER 1

1 Functions of n Variables

The first topic is that of finding maxima or minima of (i.e. optimizing) functions of n variables.
Thus suppose that we have a function $f(x_1, x_2, \cdots, x_n) = f(X)$ (where $X$ denotes the n-tuple $(x_1, x_2, \cdots, x_n)$) defined in some subset of n-dimensional space $R^n$ and that we wish to optimize $f$, i.e. to find a point $X_0$ such that

$$ f(X_0) \le f(X) \quad \text{or} \quad f(X_0) \ge f(X) \tag{1} $$

The first inequality states a problem in minimizing $f$ while the latter states a problem in maximizing $f$. Mathematically, there is little difference between the two problems, for maximizing $f$ is equivalent to minimizing the function $G = -f$. Because of this, we shall tend to discuss only minimization problems, it being understood that corresponding results carry over to the other type of problem.

We shall generally (unless otherwise stated) take $f$ to have sufficient continuous differentiability to justify our operations. The notation to discuss differentiability will be that $f$ is of class $C^i$, which means that $f$ has continuous derivatives up through the $i$th order.

1.1 Unconstrained Minimum

As a first specific optimization problem suppose that we have a function $f$ defined on some open set in $R^n$. Then $f$ is said to have an unconstrained relative minimum at $X_0$ if

$$ f(X_0) \le f(X) \tag{2} $$

for all points $X$ in some neighborhood $S$ of $X_0$. $X_0$ is called a relative minimizing point.

We make some comments. Firstly, the word relative used above means that $X_0$ is a minimizing point for $f$ in comparison to nearby points, rather than also in comparison to distant points. Our results will generally be of this "relative" nature. Secondly, the word unconstrained means essentially that in doing the above discussed comparison we can proceed in any direction from the minimizing point. Thus in Figure 1, we may proceed in any direction from $X_0$ to any point in some neighborhood $S$ to make this comparison.

In order for (2) to be true, we must have that
$$ \sum_{i=1}^{n} f_{x_i} h_i = 0 \;\Rightarrow\; f_{x_i} = 0, \quad i = 1, \cdots, n \tag{3a} $$

and

$$ \sum_{i,j=1}^{n} f_{x_i x_j} h_i h_j \ge 0 \tag{3b} $$

for all vectors $H = (h_1, h_2, \cdots, h_n)$, where $f_{x_i}$ and $f_{x_i x_j}$ are respectively the first and second order partials at $X_0$:

$$ f_{x_i} \equiv \frac{\partial f}{\partial x_i}, \qquad f_{x_i x_j} \equiv \frac{\partial^2 f}{\partial x_i \partial x_j} $$

Figure 1: Neighborhood S of X0

The implication in (3a) follows since the first part of (3a) holds for all vectors $H$. Condition (3a) says that the first derivative in the direction specified by the vector $H$ must be zero and (3b) says that the second derivative in that direction must be non-negative, these statements being true for all vectors $H$.

In order to prove these statements, consider a particular direction $H$ and the points $X(\epsilon) = X_0 + \epsilon H$ for small numbers $\epsilon$ (so that $X(\epsilon)$ is in $S$). The picture is given in Figure 2.
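The directional test behind (3a) and (3b) can be sketched numerically. The following is a minimal illustration, not part of the original notes: the test function `f`, the sampled directions, and the finite-difference step are all assumptions chosen for the example. It restricts `f` to the line $X_0 + \epsilon H$ and estimates the first and second derivatives in $\epsilon$ at $\epsilon = 0$ by central differences.

```python
import math

def f(x):
    # Hypothetical test function (an assumption for illustration);
    # it has a relative minimum at the origin.
    return x[0] ** 2 + x[0] * x[1] + 2.0 * x[1] ** 2

def g(func, x0, h, eps):
    # Restriction of func to the line X0 + eps * H.
    return func([a + eps * b for a, b in zip(x0, h)])

def directional_derivatives(func, x0, h, step=1e-4):
    # Central-difference estimates of the first and second
    # derivatives of g at eps = 0.
    g_plus = g(func, x0, h, step)
    g_minus = g(func, x0, h, -step)
    g_zero = g(func, x0, h, 0.0)
    first = (g_plus - g_minus) / (2.0 * step)
    second = (g_plus - 2.0 * g_zero + g_minus) / step ** 2
    return first, second

def check_necessary_conditions(func, x0, n_dirs=16, tol=1e-6):
    # Sample unit directions H in the plane and test that the first
    # directional derivative vanishes and the second is non-negative.
    for k in range(n_dirs):
        t = 2.0 * math.pi * k / n_dirs
        h = [math.cos(t), math.sin(t)]
        first, second = directional_derivatives(func, x0, h)
        if abs(first) > tol or second < -tol:
            return False
    return True
```

Under these assumptions, `check_necessary_conditions(f, [0.0, 0.0])` passes, while a point such as `[1.0, 0.0]` fails the first-order test. Sampling finitely many directions can only refute the conditions, never prove them for all $H$.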
Figure 2: Neighborhood S of X0 and a particular direction H

Define the function

$$ g(\epsilon) = f(X_0 + \epsilon H), \quad 0 \le \epsilon \le \delta \tag{4} $$

where $\delta$ is small enough so that $X_0 + \epsilon H$ is in $S$. Since $X_0$ is a relative minimizing point,

$$ g(\epsilon) - g(0) = f(X_0 + \epsilon H) - f(X_0) \ge 0, \quad 0 \le \epsilon \le \delta \tag{5a} $$

Since $-H$ is also a direction in which we may find points $X$ to compare with, we may also define $g$ for negative $\epsilon$ and extend (5a) to read

$$ g(\epsilon) - g(0) = f(X_0 + \epsilon H) - f(X_0) \ge 0, \quad -\delta \le \epsilon \le \delta \tag{5b} $$

Thus $\epsilon = 0$ is a relative minimizing point for $g$ and we know (from results for a function of one variable) that

$$ \frac{dg(0)}{d\epsilon} = 0 \quad \text{and} \quad \frac{d^2 g(0)}{d\epsilon^2} \ge 0 \tag{6} $$

Now $f$ is a function of the point $X = (x_1, \cdots, x_n)$ where the components of $X(\epsilon)$ are specified by

$$ x_i(\epsilon) = x_{0,i} + \epsilon h_i, \quad -\delta \le \epsilon \le \delta, \quad i = 1, \cdots, n \tag{7} $$

so that differentiating by the chain rule yields

$$ 0 = \frac{dg(0)}{d\epsilon} = \sum_{i=1}^{n} f_{x_i} \frac{dx_i}{d\epsilon} = \sum_{i=1}^{n} f_{x_i} h_i \quad (\text{which} \Rightarrow f_{x_i} = 0, \; i = 1, \cdots, n) \tag{8a} $$

and

$$ 0 \le \frac{d^2 g(0)}{d\epsilon^2} = \sum_{i,j=1}^{n} f_{x_i x_j} \frac{dx_i}{d\epsilon} \frac{dx_j}{d\epsilon} = \sum_{i,j=1}^{n} f_{x_i x_j} h_i h_j \tag{8b} $$

in which (8b) has used (8a). In (8) all derivatives of $f$ are at $X_0$ and the derivatives of $x$ are at $\epsilon = 0$.

This proves (3a) and (3b), which are known as the first and second order necessary conditions for a relative minimum to exist at $X_0$. The term necessary means that they are required in order that $X_0$ be a relative minimizing point. The terms first and second order refer to (3a) being a condition on the first derivative and (3b) being a condition on the second derivative of $f$.

In this course we will be primarily concerned with necessary conditions for minimization; however, for completeness we state the following. As a sufficient condition for $X_0$ to be a relative minimizing point one has that if
$$ \sum_{i=1}^{n} f_{x_i} h_i = 0 \quad \text{and} \quad \sum_{i,j=1}^{n} f_{x_i x_j} h_i h_j > 0 \tag{9} $$

for all nonzero vectors $H = (h_1, \cdots, h_n)$, with all derivatives computed at $X_0$, then $X_0$ is an unconstrained relative minimizing point for $f$.

Theorem 1 If $f''(x)$ exists in a neighborhood of $x_0$ and is continuous at $x_0$, then

$$ f(x_0 + h) - f(x_0) = f'(x_0) h + \frac{1}{2} f''(x_0) h^2 + \epsilon(h), \quad |h| < \delta \tag{10} $$

where $\lim_{h \to 0} \epsilon(h)/h^2 = 0$.

Proof By Taylor's formula, for some $0 < \theta < 1$,

$$ f(x_0 + h) - f(x_0) = f'(x_0) h + \frac{1}{2} f''(x_0 + \theta h) h^2 \tag{11} $$

$$ f(x_0 + h) - f(x_0) = f'(x_0) h + \frac{1}{2} f''(x_0) h^2 + \frac{1}{2} \left[ f''(x_0 + \theta h) - f''(x_0) \right] h^2 $$

The term in brackets tends to 0 as $h \to 0$ since $f''$ is continuous. Hence

$$ \frac{\epsilon(h)}{h^2} = \frac{1}{2} \left[ f''(x_0 + \theta h) - f''(x_0) \right] \to 0 \quad \text{as } h \to 0. \tag{12} $$

This proves (10).

Now suppose $f \in C^2[a, b]$ and $f$ has a relative minimum at $x = x_0$. Then clearly

$$ f(x_0 + h) - f(x_0) \ge 0 \tag{13} $$

and

$$ f'(x_0) = 0. \tag{14} $$

Using (10), (13) and (14) we have

$$ f(x_0 + h) - f(x_0) = \frac{1}{2} f''(x_0) h^2 + \epsilon(h) \ge 0 \tag{15} $$

with $\lim_{h \to 0} \epsilon(h)/h^2 = 0$. Now pick $h_0$ so that $|h_0| < \delta$; then

$$ f(x_0 + \lambda h_0) - f(x_0) = \frac{1}{2} f''(x_0) \lambda^2 h_0^2 + \epsilon(\lambda h_0) \ge 0, \quad |\lambda| \le 1 \tag{16} $$

Since

$$ \frac{1}{2} f''(x_0) \lambda^2 h_0^2 + \epsilon(\lambda h_0) = \frac{1}{2} \lambda^2 h_0^2 \left( f''(x_0) + 2 \frac{\epsilon(\lambda h_0)}{\lambda^2 h_0^2} \right) $$

and since

$$ \lim_{\lambda \to 0} \frac{\epsilon(\lambda h_0)}{\lambda^2 h_0^2} = 0 $$

we have by necessity $f''(x_0) \ge 0$.

1.2 Constrained Minimization

As an introduction to constrained optimization problems consider the situation of seeking a minimizing point for the function $f(X)$ among points which satisfy a condition

$$ \phi(X) = 0 \tag{17} $$

Such a problem is called a constrained optimization problem and the function $\phi$ is called a constraint. If $X_0$ is a solution to this problem, then we say that $X_0$ is a relative minimizing point for $f$ subject to the constraint $\phi = 0$.

In this case, because of the constraint $\phi = 0$, all directions are no longer available to get comparison points. Our comparison points must satisfy (17). Thus if $X(\epsilon)$ is a curve of comparison points in a neighborhood $S$ of $X_0$ and if $X(\epsilon)$ passes through $X_0$ (say at $\epsilon = 0$), then since $X(\epsilon)$ must satisfy (17) we have

$$ \phi(X(\epsilon)) - \phi(X(0)) = 0 \tag{18} $$

so that also

$$ \frac{d}{d\epsilon} \phi(0) = \lim_{\epsilon \to 0} \frac{\phi(X(\epsilon)) - \phi(X(0))}{\epsilon} = \sum_{i=1}^{n} \phi_{x_i} \frac{dx_i(0)}{d\epsilon} = 0 \tag{19} $$

In two dimensions (i.e. for $n = 2$) the picture is

Figure 3: Two dimensional neighborhood of X0 showing tangent at that point. The tangent at $X_0$ has components $dx_1(0)/d\epsilon$, $dx_2(0)/d\epsilon$; the curve consists of points $X(\epsilon)$ for which $\phi = 0$.

Thus these tangent vectors, i.e. vectors $H$ which satisfy (19), become (with $dx_i(0)/d\epsilon$ replaced by $h_i$)

$$ \sum_{i=1}^{n} \phi_{x_i} h_i = 0 \tag{20} $$

and are the only possible directions in which we find comparison points. Because of this, the condition here which corresponds to the first order condition (3a) in the unconstrained problem is

$$ \sum_{i=1}^{n} f_{x_i} h_i = 0 \tag{21} $$

for all vectors $H$ satisfying (20), instead of for all vectors $H$. This condition is not in usable form, i.e. it does not lead to the implications in (3a), which is really the condition used in solving unconstrained problems. In order to get a usable condition for the constrained problem, we depart from the geometric approach (although one could pursue it to get a condition).

As an example of a constrained optimization problem let us consider the problem of finding the minimum distance from the origin to the surface $x^2 - z^2 = 1$. This can be stated as the problem of

minimize $f = x^2 + y^2 + z^2$
subject to $\phi = x^2 - z^2 - 1 = 0$

and is the problem of finding the point(s) on the hyperbola $x^2 - z^2 = 1$ closest to the origin.
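As a quick sanity check on this example, the constraint surface can be parametrized and searched directly. This is a hedged sketch, not the method of the notes: the parametrization $x = \pm\cosh t$, $z = \sinh t$ (with $y$ free), the grid bounds, and the function names are all assumptions introduced here. On the surface, $f = \cosh^2 t + y^2 + \sinh^2 t = \cosh 2t + y^2 \ge 1$, so the grid search should find a smallest value of 1.

```python
import math

def squared_distance(t, y, sign=1.0):
    # Point on x^2 - z^2 = 1 via x = sign * cosh(t), z = sinh(t);
    # y is unconstrained. Returns f = x^2 + y^2 + z^2.
    x = sign * math.cosh(t)
    z = math.sinh(t)
    return x * x + y * y + z * z

def grid_minimum(n=201, t_max=2.0, y_max=2.0):
    # Brute-force search over a (t, y) grid on both branches of the
    # surface. Grid size and bounds are illustrative assumptions.
    best = None
    for sign in (1.0, -1.0):
        for i in range(n):
            t = -t_max + 2.0 * t_max * i / (n - 1)
            for j in range(n):
                y = -y_max + 2.0 * y_max * j / (n - 1)
                d2 = squared_distance(t, y, sign)
                if best is None or d2 < best[0]:
                    best = (d2, sign * math.cosh(t), y, math.sinh(t))
    return best  # (f value, x, y, z) at the best grid point
```

Every point generated this way satisfies the constraint by construction, so the search never leaves the surface. A grid search only locates candidates; it does not replace the necessary conditions discussed in the text.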
Figure 4: The constraint $\phi$ (axes: x-axis, y-axis, z-axis)

A common technique to try is substitution, i.e. using $\phi$ to solve for one variable in terms of the other(s). Solving for $z$ gives $z^2 = x^2 - 1$ and then

$$ f = 2x^2 + y^2 - 1 $$

and then solving this as the unconstrained problem

$$ \min f = 2x^2 + y^2 - 1 $$

gives the conditions

$$ 0 = f_x = 4x \quad \text{and} \quad 0 = f_y = 2y $$

which implies $x = y = 0$ at the minimizing point. But at this point $z^2 = -1$, which means that there is no real solution point. But this is nonsense, as the physical picture shows. The substitution fails because it discards the restriction $x^2 \ge 1$ implied by $z^2 = x^2 - 1 \ge 0$.

A surer way to solve constrained optimization problems comes from the following: for the problem of

minimize $f$
subject to $\phi = 0$

if $X_0$ is a relative minimum, then there is a constant $\lambda$ such that with the function $F$ defined by

$$ F = f + \lambda \phi \tag{22} $$

then

$$ \sum_{i=1}^{n} F_{x_i} h_i = 0 \quad \text{for all vectors } H \tag{23} $$

This constitutes the first order condition for this problem and it is in usable form since it is true for all vectors $H$ and so implies the equations

$$ F_{x_i} = 0, \quad i = 1, \cdots, n \tag{24} $$

This is called the method of Lagrange Multipliers and, with the $n$ equations (24) together with the constraint equation, provides $n + 1$ equations for the $n + 1$ unknowns $x_1, \cdots, x_n, \lambda$.

Solving the previous problem by this method, we form the function

$$ F = x^2 + y^2 + z^2 + \lambda (x^2 - z^2 - 1) \tag{25} $$

The system (24) together with the constraint give the equations

$$ 0 = F_x = 2x + 2\lambda x = 2x(1 + \lambda) \tag{26a} $$
$$ 0 = F_y = 2y \tag{26b} $$
$$ 0 = F_z = 2z - 2\lambda z = 2z(1 - \lambda) \tag{26c} $$
$$ \phi = x^2 - z^2 - 1 = 0 \tag{26d} $$

Now (26b) $\Rightarrow y = 0$ and (26a) $\Rightarrow x = 0$ or $\lambda = -1$. For the case $x = 0$ and $y = 0$ we have from (26d) that $z^2 = -1$, which gives no real solution. Trying the other possibility, $y = 0$ and $\lambda = -1$, then (26c) gives $z = 0$ and then (26d) gives $x^2 = 1$ or $x = \pm 1$. Thus the only possible points are $(\pm 1, 0, 0)$.

The method covers the case of more than one constraint, say $k$ constraints

$$ \phi_i = 0, \quad i = 1, \cdots, k < n \tag{27} $$

and in this situation there are $k$ constants $\lambda_i$ (one for each constraint) and the function

$$ F = f + \sum_{i=1}^{k} \lambda_i \phi_i \tag{28} $$

satisfying (24). Thus here there are $k + n$ unknowns $\lambda_1, \cdots, \lambda_k, x_1, \cdots, x_n$ and $k + n$ equations to determine them, namely the $n$ equations (24) together with the $k$ constraints (27).

Problems

1. Use the method of Lagrange Multipliers to solve the problem

minimize $f = x^2 + y^2 + z^2$
subject to $\phi = xy + 1 - z = 0$

2. Show that

$$ \max_{\lambda} \left| \frac{\lambda}{\cosh \lambda} \right| = \frac{\lambda_0}{\cosh \lambda_0} $$

where $\lambda_0$ is the positive root of

$$ \cosh \lambda - \lambda \sinh \lambda = 0. $$

Sketch to show $\lambda_0$.

3. Of all rectangular parallelepipeds which have sides parallel to the coordinate planes, and which are inscribed in the ellipsoid

$$ \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 $$

determine the dimensions of that one which has the largest volume.

4. Of all parabolas which pass through the points (0,0) and (1,1), determine that one which, when rotated about the x-axis, generates a solid of revolution with least possible volume between $x = 0$ and $x = 1$. [Notice that the equation may be taken in the form $y = x + cx(1 - x)$, where $c$ is to be determined.]

5. a. If $x = (x_1, x_2, \cdots, x_n)$ is a real vector, and $A$ is a real symmetric matrix of order $n$, show that the requirement that

$$ F \equiv x^T A x - \lambda x^T x $$

be stationary, for a prescribed $A$, takes the form

$$ Ax = \lambda x. $$

Deduce that the requirement that the quadratic form $x^T A x$ be stationary, subject to the constraint $x^T x = \text{constant}$, leads to the requirement

$$ Ax = \lambda x, $$

where $\lambda$ is a constant to be determined. [Notice that the same is true of the requirement that $x^T x$ is stationary, subject to the constraint that $x^T A x = \text{constant}$, with a suitable definition of $\lambda$.]

b. Show that, if we write

$$ \lambda = \frac{x^T A x}{x^T x}, $$

the requirement that $\lambda$ be stationary leads again to the matrix equation

$$ Ax = \lambda x. $$

[Notice that the requirement $d\lambda = 0$ can be written as

$$ \frac{(x^T x)\, d(x^T A x) - (x^T A x)\, d(x^T x)}{(x^T x)^2} = 0 \quad \text{or} \quad \frac{d(x^T A x) - \lambda\, d(x^T x)}{x^T x} = 0.] $$

Deduce that stationary values of the ratio

$$ \frac{x^T A x}{x^T x} $$

are characteristic numbers of the symmetric matrix $A$.

CHAPTER 2

2 Examples, Notation

In the last chapter we were concerned with problems of optimization for functions of a finite number of variables.
Thus we had to select values of $n$ variables $x_1, \cdots, x_n$ in order to solve for a minimum of the function $f(x_1, \cdots, x_n)$. Now we can also consider problems of an infinite number of variables, such as selecting the value of $y$ at each point $x$ in some interval $[a, b]$ of the x-axis in order to minimize (or maximize) the integral

$$ \int_{x_1}^{x_2} F(x, y, y')\, dx . $$

Again as in the finite dimensional case, maximizing $\int_{x_1}^{x_2} F\, dx$ is the same as minimizing $\int_{x_1}^{x_2} -F\, dx$, so that we shall concentrate on minimization problems, it being understood that these include maximization problems.

Also as in the finite dimensional case we can speak of relative minima. An arc $y_0$ is said to provide a relative minimum for the above integral if it provides a minimum of the integral over those arcs which (satisfy all conditions of the problem and) are in a neighborhood of $y_0$. A neighborhood of $y_0$ means a neighborhood of the points $(x, y_0(x), y_0'(x))$, $x_1 \le x \le x_2$, so that an arc $y$ is in this neighborhood if

$$ \max_{x_1 \le x \le x_2} |y(x) - y_0(x)| < \delta $$

and

$$ \max_{x_1 \le x \le x_2} |y'(x) - y_0'(x)| < \delta $$

for some $\delta > 0$. (We shall later speak of a different type of relative minimum and a different type of neighborhood of $y_0$.)

Thus a relative minimum is in contrast to a global minimum, where the integral is minimized over all arcs (which satisfy the conditions of the problem). Our results will generally be of this relative nature. Of course any global minimizing arc is also a relative minimizing arc, so that the necessary conditions which we prove for the relative case will also hold for the global case.

The simplest of all the problems of the calculus of variations is doubtless that of determining the shortest arc joining two given points. The coordinates of these points will be denoted by $(x_1, y_1)$ and $(x_2, y_2)$ and we may designate the points themselves when convenient simply by the numerals 1 and 2. If the equation of an arc is taken in the form

$$ y : \; y(x) \quad (x_1 \le x \le x_2) \tag{1} $$

then the conditions that it shall pass through the two given points are

$$ y(x_1) = y_1, \quad y(x_2) = y_2 \tag{2} $$

and we know from the calculus that the length of the arc is given by the integral

$$ I = \int_{x_1}^{x_2} \sqrt{1 + y'^2}\, dx , $$

where in the evaluation of the integral, $y'$ is to be replaced by the derivative $y'(x)$ of the function $y(x)$ defining the arc. There is an infinite number of curves $y = y(x)$ joining the points 1 and 2. The problem of finding the shortest one is equivalent analytically to that of finding in the class of functions $y(x)$ satisfying the conditions (2) one which makes the integral $I$ a minimum.
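To make the length integral concrete, the following sketch evaluates $I$ numerically for a few curves joining the same two points. The endpoints $(0,0)$ and $(1,1)$, the three sample curves, and the midpoint-rule quadrature are assumptions chosen for illustration; they are not taken from the notes.

```python
import math

def arc_length(dy, x1, x2, n=2000):
    # Composite midpoint rule for I = integral of sqrt(1 + y'(x)^2) dx,
    # where dy(x) returns the derivative y'(x) of the arc.
    h = (x2 - x1) / n
    return math.fsum(
        math.sqrt(1.0 + dy(x1 + (k + 0.5) * h) ** 2) * h for k in range(n)
    )

# Three hypothetical arcs joining the points 1 = (0, 0) and 2 = (1, 1).
def line_slope(x):
    return 1.0  # y = x

def parabola_slope(x):
    return 2.0 * x  # y = x^2

def bumped_slope(x):
    # y = x + 0.3 * sin(pi * x), which also passes through both points
    return 1.0 + 0.3 * math.pi * math.cos(math.pi * x)

straight = arc_length(line_slope, 0.0, 1.0)
parabola = arc_length(parabola_slope, 0.0, 1.0)
bumped = arc_length(bumped_slope, 0.0, 1.0)
```

In this sketch `straight` comes out smaller than both `parabola` and `bumped`, consistent with the expectation that the straight segment is the shortest arc. Comparing a handful of curves is only suggestive; the argument in the notes is what establishes the result.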
Figure 5: The surface of revolution for the soap example

There is a second problem of the calculus of variations, of a geometrical-mechanical type, which the principles of the calculus readily enable us to express also in analytic form. When a wire circle is dipped in a soap solution and withdrawn, a circular disk of soap film bounded by the circle is formed. If a second smaller circle is made to touch this disk and then moved away, the two circles will be joined by a surface of film which is a surface of revolution (in the particular case when the circles are parallel and have their centers on the same axis perpendicular to their planes). The form of this surface is shown in Figure 5. It is provable by the principles of mechanics, as one may surmise intuitively from the elastic properties of a soap film, that the surface of revolution so formed must be one of minimum area, and the problem of determining the shape of the film is equivalent therefore to that of determining such a minimum surface of revolution passing through two circles whose relative positions are supposed to be given as indicated in the figure.

In order to phrase this problem analytically let the common axis of the two circles be taken as the x-axis, and let the points where the circles intersect an xy-plane through that axis be 1 and 2. If the meridian curve of the surface in the xy-plane has an equation $y = y(x)$ then the calculus formula for the area of the surface is $2\pi$ times the value of the integral

$$ I = \int_{x_1}^{x_2} y \sqrt{1 + y'^2}\, dx . $$

The problem of determining the form of the soap film surface between the two circles is analytically that of finding in the class of arcs $y = y(x)$ whose ends are at the points 1 and 2 one which minimizes the last-written integral $I$.

As a third example of problems of the calculus of variations consider the problem of the brachistochrone (shortest time), i.e. of determining a path down which a particle will fall from one given point to another in the shortest time. Let the y-axis for convenience be taken vertically downward, as in Figure 6, the two fixed points being 1 and 2.
Figure 6: Brachistochrone problem

The initial velocity $v_1$ at the point 1 is supposed to be given. Later we shall see that for an arc defined by an equation of the form $y = y(x)$ the time of descent from 1 to 2 is $\frac{1}{\sqrt{2g}}$ times the value of the integral

$$ I = \int_{x_1}^{x_2} \sqrt{\frac{1 + y'^2}{y - \alpha}}\, dx , $$

where $g$ is the gravitational constant and $\alpha$ has the constant value $\alpha = y_1 - \frac{v_1^2}{2g}$. The problem of the brachistochrone is then to find, among the arcs $y : y(x)$ which pass through two points 1 and 2, one which minimizes the integral $I$.

As a last example, consider the boundary value problem

$$ -u''(x) = r(x), \quad 0 < x < 1 $$

subject to

$$ u(0) = 0, \quad u(1) = 1. $$

The Rayleigh-Ritz method for this differential equation uses the solution of the following minimization problem: find $u$ that minimizes the integral

$$ I(u) = \int_0^1 \left[ \frac{1}{2} (u')^2 - r(x) u \right] dx $$

where $u \in V = \{ v \in C^2[0, 1], \; v(0) = 0, \; v(1) = 0 \}$. The function $r(x)$ can be viewed as force per unit mass.

2.1 Notation & Conventions

The above problems are included in the general problem of minimizing an integral of the form

$$ I = \int_{x_1}^{x_2} F(x, y, y')\, dx \tag{3} $$

within the class of arcs which are continuously differentiable and also satisfy the endpoint conditions

$$ y(x_1) = y_1, \quad y(x_2) = y_2 \tag{4} $$

where $y_1, y_2$ are constants. In the first three problems above $F$ was respectively

$$ F = \sqrt{1 + y'^2}, \qquad F = y \sqrt{1 + y'^2}, \qquad F = \sqrt{\frac{1 + y'^2}{y - \alpha}} $$

and $y_1, y_2$ were the y coordinates associated with the points 1 and 2.

It should be noted that in (3) the symbols $x, y, y'$ denote free variables and are not directly related to arcs. For example, we can differentiate with respect to these variables to get, in the case of the third example (the brachistochrone),

$$ F_x = 0, \qquad F_y = -\frac{1}{2} (y - \alpha)^{-3/2} (1 + y'^2)^{1/2}, \qquad F_{y'} = y' (y - \alpha)^{-1/2} (1 + y'^2)^{-1/2} \tag{5a} $$

It is when these functions are to be evaluated along an arc that we substitute $y(x)$ for $y$ and $y'(x)$ for $y'$.

The above considered only the two dimensional case. In the $n + 1$ ($n > 1$) dimensional case our arcs are represented by

$$ y : \; y_i(x), \quad x_1 \le x \le x_2, \quad i = 1, \cdots, n \tag{5b} $$

(the distinction between $y_i(x)$ and $y_1, y_2$ of (4) should be clear from the context) and the integral (3) is

$$ I = \int_{x_1}^{x_2} F(x, y_1, \cdots, y_n, y_1', \cdots, y_n')\, dx \tag{6} $$

so that the integrands are functions of $2n + 1$ variables, and similar conventions to those for the two dimensional case hold for the $n + 1$ dimensional case. Thus for example we will be interested in minimizing an integral of the form (6) among the class of continuously differentiable arcs (5b) which satisfy the endpoint conditions

$$ y_i(x_1) = y_{i,1}, \quad y_i(x_2) = y_{i,2}, \quad i = 1, \cdots, n \tag{7} $$

where $y_{i,1}, y_{i,2}$ are constants. For now, continuously differentiable arcs for which (6) is well-defined are called admissible arcs. Our problem in general will be to minimize the integral (6) over some subclass of admissible arcs. In the type of problems where the endpoints of the arcs are certain fixed values (as in the problems thus far considered) the term fixed end point problem applies. In problems where the end points can vary, the term variable end point applies.

2.2 Shortest Distances

The shortest arc joining two points. Problems of determining shortest distances furnish a useful introduction to the theory of the calculus of variations because the properties characterizing their solutions are familiar ones which illustrate very well many of the general principles common to all of the problems suggested above. If we can for the moment eradicate from our minds all that we know about straight lines and shortest distances we shall have the pleasure of rediscovering well-known theorems by methods which will be helpful in solving more complicated problems.

Let us begin with the simplest case of all, the problem of determining the shortest arc joining two given points. The integral to be minimized, which we have already seen, may be written in the form

$$ I = \int_{x_1}^{x_2} F(y')\, dx \tag{8} $$

if we use the notation $F(y') = (1 + y'^2)^{1/2}$, and the arcs $y : y(x)$ $(x_1 \le x \le x_2)$ whose lengths are to be compared with each other will always be understood to be continuous with a tangent turning continuously, as indicated in Figure 7.

Figure 7: An arc connecting X1 and X2

Analytically this means that on the interval $x_1 \le x \le x_2$ the function $y(x)$ is continuous, and has a continuous derivative. As stated before, we agree to call such functions admissible functions and the arcs which they define, admissible arcs. Our problem is then to find among all admissible arcs joining two given points 1 and 2 one which makes the integral $I$ a minimum.

A first necessary condition. Let it be granted that a particular admissible arc

$$ y_0 : \; y_0(x) \quad (x_1 \le x \le x_2) $$

furnishes the solution of our problem, and let us then seek to find the properties which distinguish it from the other admissible arcs joining points 1 and 2. If we select arbitrarily an admissible function $\eta(x)$ satisfying the conditions $\eta(x_1) = \eta(x_2) = 0$, the form

$$ y_0(x) + \epsilon \eta(x) \quad (x_1 \le x \le x_2), \tag{9} $$

involving the arbitrary constant $\epsilon$, represents a one-parameter family of arcs (see Figure 8) which includes the arc $y_0$ for the special value $\epsilon = 0$, and all of the arcs of the family pass through the endpoints 1 and 2 of $y_0$ (since $\eta = 0$ at the endpoints).

Figure 8: Admissible function $\eta$ vanishing at end points (bottom) and various admissible functions (top)

The value of the integral $I$ taken along an arc of the family depends upon the value of $\epsilon$ and may be represented by the symbol

$$ I(\epsilon) = \int_{x_1}^{x_2} F(y_0' + \epsilon \eta')\, dx . \tag{10} $$

Along the initial arc $y_0$ the integral has the value $I(0)$, and if this is to be a minimum when compared with the values of the integral along all other admissible arcs joining 1 with 2 it must, in particular, be a minimum when compared with the values $I(\epsilon)$ along the arcs of the family (9). Hence according to the criterion for a minimum of a function given previously we must have $I'(0) = 0$.

It should perhaps be emphasized here that the method of the calculus of variations, as it has been developed in the past, consists essentially of three parts: first, the deduction of necessary conditions which characterize a minimizing arc; second, the proof that these conditions, or others obtained from them by slight modifications, are sufficient to insure the minimum sought; and third, the search for an arc which satisfies the sufficient conditions. For the deduction of necessary conditions the value of the integral $I$ along the minimizing arc can be compared with its values along any special admissible arcs which may be convenient for the purposes of the proof in question, for example along those of the family (9) described above, but the sufficiency proofs must be made with respect to all admissible arcs joining the points 1 and 2. The third part of the problem, the determination of an arc satisfying the sufficient conditions, is frequently the most difficult of all, and is the part for which fewest methods of a general character are known. For shortest-distance problems fortunately this determination is usually easy.

By differentiating the expression (10) with respect to $\epsilon$ and then setting $\epsilon = 0$ the value of $I'(0)$ is seen to be

$$ I'(0) = \int_{x_1}^{x_2} F_{y'} \eta'\, dx , \tag{11} $$

where for convenience we use the notation $F_{y'}$ for the derivative of the integrand $F(y')$ with respect to $y'$. It will always be understood that the argument in $F$ and its derivatives is the function $y_0'(x)$ belonging to the arc $y_0$ unless some other is expressly indicated.

We now generalize somewhat on what we have just done for the shortest distance problem. Recall that in the finite dimensional optimization problem, a point $X_0$ which is a relative (unconstrained) minimizing point for the function $f$ has the property that
n n fxi hi = 0 and
i=1 i,j=1 fxi xj hi hj 0 (12) for all vectors H = (h1 , , hn ) (where all derivatives of f are at X0 ). These were called the first and second order necessary conditions. We now try to establish analogous conditions for the two dimensional fixed endpoint problem x2 minimize I = F (x, y, y )dx (13)
among arcs which are continuously differentiable,

$$ y:\ y(x), \qquad x_1 \le x \le x_2, \tag{14} $$

and which satisfy the endpoint conditions

$$ y(x_1) = y_1, \qquad y(x_2) = y_2, \tag{15} $$

with $y_1$, $y_2$ constants.

In the process of establishing the above analogy, we first establish the concepts of the first and second derivatives of an integral (13) about a general admissible arc. These concepts are analogous to the first and second derivatives of a function $f(X)$ about a general point $X$. Let $y_0 : y_0(x)$, $x_1 \le x \le x_2$, be any continuously differentiable arc and let $\eta(x)$ be another such arc (nothing is required of the endpoint values of $y_0(x)$ or $\eta(x)$). Form the family of arcs

$$ y_0(x) + \epsilon\,\eta(x), \qquad x_1 \le x \le x_2. \tag{16} $$

[Figure 9: Families of arcs $y_0 + \epsilon\eta$]

Then for sufficiently small values of $\epsilon$, say $-\delta < \epsilon < \delta$ with $\delta$ small, these arcs will all be in a neighborhood of $y_0$ and will be admissible arcs for the integral (13). Form the function

$$ I(\epsilon) = \int_{x_1}^{x_2} F\big(x,\ y_0(x) + \epsilon\eta(x),\ y_0'(x) + \epsilon\eta'(x)\big)\,dx, \qquad -\delta < \epsilon < \delta. \tag{17} $$

The derivative $I'(\epsilon)$ of this function is

$$ I'(\epsilon) = \int_{x_1}^{x_2} \Big[ F_y\big(x, y_0(x) + \epsilon\eta(x), y_0'(x) + \epsilon\eta'(x)\big)\,\eta(x) + F_{y'}\big(x, y_0(x) + \epsilon\eta(x), y_0'(x) + \epsilon\eta'(x)\big)\,\eta'(x) \Big]\,dx. \tag{18} $$

Setting $\epsilon = 0$ we obtain the first derivative of the integral $I$ along $y_0$:

$$ I'(0) = \int_{x_1}^{x_2} \Big[ F_y\big(x, y_0(x), y_0'(x)\big)\,\eta(x) + F_{y'}\big(x, y_0(x), y_0'(x)\big)\,\eta'(x) \Big]\,dx. \tag{19} $$

Remark: The first derivative of an integral $I$ about an admissible arc $y_0$ is given by (19). Thus the first derivative of an integral $I$ about an admissible arc $y_0$ is obtained by evaluating $I$ across a family of arcs containing $y_0$ (see Figure 9) and differentiating that function at $y_0$. Note how analogous this is to the first derivative of a function $f$ at a point $X_0$ in the finite dimensional case. There one evaluates $f$ across a family of points containing the point $X_0$ and differentiates the function.

We will often write (19) as

$$ I'(0) = \int_{x_1}^{x_2} \big[ F_y\,\eta + F_{y'}\,\eta' \big]\,dx \tag{20} $$

where it is understood that the arguments are along the arc $y_0$.

Returning now to the function $I(\epsilon)$ we see that the second derivative of $I(\epsilon)$ is

$$ I''(\epsilon) = \int_{x_1}^{x_2} \Big[ F_{yy}(x, y_0 + \epsilon\eta, y_0' + \epsilon\eta')\,\eta^2(x) + 2F_{yy'}(x, y_0 + \epsilon\eta, y_0' + \epsilon\eta')\,\eta(x)\eta'(x) + F_{y'y'}(x, y_0 + \epsilon\eta, y_0' + \epsilon\eta')\,\eta'^2(x) \Big]\,dx. \tag{21} $$

Setting $\epsilon = 0$ we obtain the second derivative of $I$ along $y_0$. The second derivative of $I$ about $y_0$ corresponds to the second derivative of $f$ about a point $X_0$ in finite dimensional problems:

$$ I''(0) = \int_{x_1}^{x_2} \Big[ F_{yy}(x, y_0(x), y_0'(x))\,\eta^2(x) + 2F_{yy'}(x, y_0(x), y_0'(x))\,\eta(x)\eta'(x) + F_{y'y'}(x, y_0(x), y_0'(x))\,\eta'^2(x) \Big]\,dx \tag{22} $$

or more concisely

$$ I''(0) = \int_{x_1}^{x_2} \big[ F_{yy}\,\eta^2 + 2F_{yy'}\,\eta\eta' + F_{y'y'}\,\eta'^2 \big]\,dx \tag{23} $$

where it is understood that all arguments are along the arc $y_0$.

As an illustration, consider the integral

$$ I = \int_{x_1}^{x_2} y\,(1 + y'^2)^{1/2}\,dx. \tag{24} $$

In this case we have

$$ F = y\,(1 + y'^2)^{1/2}, \qquad F_y = (1 + y'^2)^{1/2}, \qquad F_{y'} = y\,y'\,(1 + y'^2)^{-1/2} \tag{25} $$

so that the first derivative is

$$ I'(0) = \int_{x_1}^{x_2} \big[ (1 + y'^2)^{1/2}\,\eta + y\,y'\,(1 + y'^2)^{-1/2}\,\eta' \big]\,dx. \tag{26} $$

Similarly

$$ F_{yy} = 0, \qquad F_{yy'} = y'\,(1 + y'^2)^{-1/2}, \qquad F_{y'y'} = y\,(1 + y'^2)^{-3/2} \tag{27} $$

and the second derivative is

$$ I''(0) = \int_{x_1}^{x_2} \big[ 2y'\,(1 + y'^2)^{-1/2}\,\eta\eta' + y\,(1 + y'^2)^{-3/2}\,\eta'^2 \big]\,dx. \tag{28} $$

The functions $\eta(x)$ appearing in the first and second derivatives of $I$ along the arc $y_0$ correspond to the directions $H$ in which the family of points $X(\epsilon)$ was formed in chapter 1.

Suppose now that an admissible arc $y_0$ gives a relative minimum to $I$ in the class of admissible arcs satisfying $y(x_1) = y_1$, $y(x_2) = y_2$ where $y_1$, $y_2$, $x_1$, $x_2$ are constants defined in the problem. Denote this class of arcs by $B$. Then there is a neighborhood $R_0$ of the points $(x, y_0(x), y_0'(x))$ on the arc $y_0$ such that

$$ I_{y_0} \le I_y \tag{29} $$

(where $I_{y_0}$, $I_y$ mean $I$ evaluated along $y_0$ and $I$ evaluated along $y$ respectively) for all arcs in $B$ whose points lie in $R_0$. Next, select an arbitrary admissible arc $\eta(x)$ having $\eta(x_1) = 0$ and $\eta(x_2) = 0$. For all real numbers $\epsilon$ the arc $y_0(x) + \epsilon\eta(x)$ satisfies

$$ y_0(x_1) + \epsilon\eta(x_1) = y_1, \qquad y_0(x_2) + \epsilon\eta(x_2) = y_2 \tag{30} $$

since the arc $y_0$ satisfies the endpoint conditions (15) and $\eta(x_1) = 0$, $\eta(x_2) = 0$. Moreover, if $\epsilon$ is restricted to a sufficiently small interval $-\delta < \epsilon < \delta$, with $\delta$ small, then the arc $y_0(x) + \epsilon\eta(x)$ will be an admissible arc whose points lie in $R_0$. Hence

$$ I_{y_0 + \epsilon\eta} \ge I_{y_0}, \qquad -\delta < \epsilon < \delta. \tag{31} $$

The function $I(\epsilon) = I_{y_0 + \epsilon\eta}$ therefore has a relative minimum at $\epsilon = 0$. Therefore from what we know about functions of one variable (i.e. $I(\epsilon)$), we must have that

$$ I'(0) = 0, \qquad I''(0) \ge 0 \tag{32} $$

where $I'(0)$ and $I''(0)$ are respectively the first and second derivatives of $I$ along $y_0$. Since $\eta(x)$ was an arbitrary arc satisfying $\eta(x_1) = 0$, $\eta(x_2) = 0$, we have:

Theorem 2 If an admissible arc $y_0$ gives a relative minimum to $I$ in the class of admissible arcs with the same endpoints as $y_0$ then

$$ I'(0) = 0, \qquad I''(0) \ge 0 \tag{33} $$

(where $I'(0)$, $I''(0)$ are the first and second derivatives of $I$ along $y_0$) for all admissible arcs $\eta(x)$ with $\eta(x_1) = 0$ and $\eta(x_2) = 0$.

The above was done with all arcs $y(x)$ having just one component, i.e. the $n$ dimensional case with $n = 1$. Those results extend to $n$ dimensional ($n > 1$) arcs $y : y_i(x)$, $x_1 \le x \le x_2$, $i = 1, \dots, n$.
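The conditions of Theorem 2 can be checked numerically on a simple case. The sketch below is only an illustration under assumed data, not part of the notes: it takes the arc-length integrand $F = (1 + y'^2)^{1/2}$, the straight line $y_0(x) = x$ on $[0, 1]$, and the hypothetical variation $\eta(x) = \sin \pi x$, which vanishes at both endpoints. It evaluates $I(\epsilon)$ by the midpoint rule and estimates $I'(0)$ and $I''(0)$ by central differences.

```python
import math

def arc_length(eps, n=2000):
    """I(eps) for F = sqrt(1 + y'^2) along y0 + eps*eta on [0, 1], by the midpoint rule.

    Assumed for illustration: y0(x) = x and eta(x) = sin(pi x), so eta(0) = eta(1) = 0.
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        slope = 1.0 + eps * math.pi * math.cos(math.pi * x)  # y0'(x) + eps * eta'(x)
        total += math.sqrt(1.0 + slope * slope) * h
    return total

def derivatives_at_zero(I, d=1e-3):
    """Central-difference estimates of I'(0) and I''(0)."""
    first = (I(d) - I(-d)) / (2.0 * d)
    second = (I(d) - 2.0 * I(0.0) + I(-d)) / (d * d)
    return first, second

first, second = derivatives_at_zero(arc_length)
# Value of (23) for this case: F_yy = F_yy' = 0 and F_y'y' = (1 + 1)^(-3/2) along y0.
exact_second = math.pi ** 2 / (4.0 * math.sqrt(2.0))
print(first, second, exact_second)
```

For this choice the first derivative estimate should be zero up to rounding and the second derivative estimate should be positive and close to the value given by (23), as Theorem 2 requires of a minimizing arc.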
For such $n$ dimensional arcs, using our notational conventions, the formulas for the first and second derivatives of $I$ take the form

$$ I'(0) = \int_{x_1}^{x_2} \sum_{i=1}^{n} \big[ F_{y_i}\,\eta_i + F_{y_i'}\,\eta_i' \big]\,dx \tag{34a} $$

$$ I''(0) = \int_{x_1}^{x_2} \sum_{i,j=1}^{n} \big[ F_{y_i y_j}\,\eta_i\eta_j + 2F_{y_i y_j'}\,\eta_i\eta_j' + F_{y_i' y_j'}\,\eta_i'\eta_j' \big]\,dx \tag{34b} $$

where $\eta' = \dfrac{d\eta}{dx}$.

Problems

1. For the integral

$$ I = \int_{x_1}^{x_2} f(x, y, y')\,dx $$

with

$$ f = y^{1/2}\left(1 + y'^2\right) $$

write the first and second variations $I'(0)$ and $I''(0)$.

2. Consider the functional

$$ J(y) = \int_0^1 (1 + x)(y')^2\,dx $$

where $y$ is twice continuously differentiable and $y(0) = 0$ and $y(1) = 1$. Of all functions of the form

$$ y(x) = x + c_1 x(1 - x) + c_2 x^2(1 - x), $$

where $c_1$ and $c_2$ are constants, find the one that minimizes $J$.

CHAPTER 3

3 First Results
Fundamental Lemma. Let $M(x)$ be a piecewise continuous function on the interval $x_1 \le x \le x_2$. If the integral

$$ \int_{x_1}^{x_2} M(x)\,\eta'(x)\,dx $$

vanishes for every function $\eta(x)$ with $\eta'(x)$ having at least the same order of continuity as does $M(x)$ and also satisfying $\eta(x_1) = \eta(x_2) = 0$, then $M(x)$ is necessarily a constant.

(Footnote: Thus if $M(x)$ is continuous (piecewise continuous), then $\eta'(x)$ should be continuous (at least piecewise continuous).)

To see that this is so we note first that the vanishing of the integral of the lemma implies also the equation

$$ \int_{x_1}^{x_2} [M(x) - C]\,\eta'(x)\,dx = 0 \tag{1} $$

for every constant $C$, since all the functions $\eta(x)$ to be considered have $\eta(x_1) = \eta(x_2) = 0$. The particular function $\eta(x)$ defined by the equation

$$ \eta(x) = \int_{x_1}^{x} M(x)\,dx - C(x - x_1) \tag{2} $$

evidently has the value zero at $x = x_1$, and it will vanish again at $x = x_2$ if, as we shall suppose, $C$ is the constant value satisfying the condition

$$ 0 = \int_{x_1}^{x_2} M(x)\,dx - C(x_2 - x_1). $$

The function $\eta(x)$ defined by (2) with this value of $C$ inserted is now one of those which must satisfy (1). Its derivative is $\eta'(x) = M(x) - C$ except at points where $M(x)$ is discontinuous, since the derivative of an integral with respect to its upper limit is the value of the integrand at that limit whenever the integrand is continuous at the limit. For the special function $\eta(x)$, therefore, (1) takes the form

$$ \int_{x_1}^{x_2} [M(x) - C]^2\,dx = 0 $$

and our lemma is an immediate consequence since this equation can be true only if $M(x) \equiv C$.

With this result we return to the shortest distance problem introduced earlier. In the equation (9) of the last chapter, $y = y_0(x) + \epsilon\eta(x)$, of the family of curves passing through the points 1 and 2, the function $\eta(x)$ was entirely arbitrary except for the restrictions that it should be admissible and satisfy the relations $\eta(x_1) = \eta(x_2) = 0$, and we have seen that the expression (11) of that chapter for $I'(0)$ must vanish for every such family. The lemma just proven is therefore applicable and it tells us that along the minimizing arc $y_0$ an equation

$$ F_{y'} = \frac{y'}{\sqrt{1 + y'^2}} = C $$

must hold, where $C$ is a constant. If we solve this equation for $y'$ we see that $y'$ is also a constant along $y_0$ and that the only possible minimizing arc is therefore a single straight-line joining the point 1 with the point 2.

The property just deduced for the shortest arc has so far only been proven to be necessary for a minimum. We have not yet demonstrated conclusively that the straight-line segment $y_0$ joining 1 and 2 is actually shorter than every other admissible arc joining these points. This will be done later.

3.1 Two Important Auxiliary Formulas:

At this point we shall develop two special cases of more general formulas which are frequently applied in succeeding pages. Let $y_{34}$ be a straight-line segment of variable length which moves so that its endpoints describe simultaneously the two curves $C$ and $D$ shown in Figure 10, and let the equations of these curves in parametric form be

$$ (C):\ x = x_1(t),\ y = y_1(t), \qquad (D):\ x = x_2(t),\ y = y_2(t). $$

[Figure 10: Line segment of variable length with endpoints on the curves C, D]

For example, the point 3 in Figure 10 is described by an $(x, y)$ pair at time $t_1$ as $x_3 = x_1(t_1)$, $y_3 = y_1(t_1)$.
The other points are similarly given, $(x_4, y_4) = (x_2(t_1), y_2(t_1))$, $(x_5, y_5) = (x_1(t_2), y_1(t_2))$, and $(x_6, y_6) = (x_2(t_2), y_2(t_2))$.

The length

$$ \ell = \sqrt{(x_4 - x_3)^2 + (y_4 - y_3)^2} $$

of the segment $y_{34}$ has the differential

$$ d\ell = \frac{(x_4 - x_3)(dx_4 - dx_3) + (y_4 - y_3)(dy_4 - dy_3)}{\sqrt{(x_4 - x_3)^2 + (y_4 - y_3)^2}}. $$

Note that since $y_{34}$ is a straight line, $(y_4 - y_3)/(x_4 - x_3)$ is the constant slope of the line. This slope is denoted by $p$. This result may be expressed in the convenient formula of the following theorem:

Theorem 3 If a straight-line segment $y_{34}$ moves so that its endpoints 3 and 4 describe simultaneously two curves $C$ and $D$, as shown in Figure 10, then the length $\ell$ of $y_{34}$ has the differential

$$ d\ell(y_{34}) = \left. \frac{dx + p\,dy}{\sqrt{1 + p^2}} \right|_3^4 \tag{3} $$

where the vertical bar indicates that the value of the preceding expression at the point 3 is to be subtracted from its value at the point 4. In this formula the differentials $dx$, $dy$ at the points 3 and 4 are those belonging to $C$ and $D$, while $p$ is the constant slope of the segment $y_{34}$.

We shall need frequently to integrate the right hand side of (3) along curves such as $C$ and $D$. This is evidently justifiable along $C$, for example, since the slope $p = (y_4 - y_3)/(x_4 - x_3)$ is a function of $t$ and since the differentials $dx$, $dy$ can be calculated in terms of $t$ and $dt$ from the equations of $C$, so that the expression takes the form of a function of $t$. The integral $I^*$ defined by the formula

$$ I^* = \int \frac{dx + p\,dy}{\sqrt{1 + p^2}} $$

will also be well defined along an arbitrary curve $C$ when $p$ is a function of $x$ and $y$ (and no longer a constant), provided that we agree to calculate the value of $I^*$ by substituting for $x$, $y$, $dx$, $dy$ the expressions for these variables in terms of $t$ and $dt$ obtained from the parametric equations of $C$. It is important to note that $I^*$ is parametrically defined, i.e. we integrate with respect to $t$. Before we state the next theorem, let's go back to Figure 10 to get the geometric interpretation of the integrand in $I^*$.
The integrand of $I^*$ has a geometric interpretation at the points of $C$ along which it is evaluated. At the point $(x, y)$ on $C$, we can define two tangent vectors, one along the curve $C$ (see Figure 10) and one along the line $y$. The tangent vector along $C$ is given by

$$ v_1 = \frac{x'}{\sqrt{x'^2 + y'^2}}\,\mathbf{i} + \frac{y'}{\sqrt{x'^2 + y'^2}}\,\mathbf{j} $$

and the tangent vector along $y$ is

$$ v_2 = \frac{1}{\sqrt{1 + p^2}}\,\mathbf{i} + \frac{p}{\sqrt{1 + p^2}}\,\mathbf{j}. $$

The angle $\theta$ between these two vectors $v_1$ and $v_2$ is given by the dot product (since the vectors are of unit length), $\cos\theta = v_1 \cdot v_2$, or

$$ \cos\theta = \frac{x' + p\,y'}{\sqrt{(1 + p^2)(x'^2 + y'^2)}}. \tag{4} $$

The element of arc length, $ds$, along $C$ can be written as

$$ ds = \sqrt{x'^2 + y'^2}\;dt. $$

From (4) it follows that the integral $I^*$ can also be expressed in the convenient form

$$ I^* = \int \frac{dx + p\,dy}{\sqrt{1 + p^2}} = \int \cos\theta\,ds. \tag{5} $$

Let $t_3$ and $t_5$ be two parameter values which define points 3 and 5 on $C$, and which at the same time define two corresponding points 4 and 6 on $D$, as in Figure 10. If we integrate the formula (3) with respect to $t$ from $t_3$ to $t_5$ and use the notation $I^*$ just introduced, we find as a further result:

Theorem 4 The difference of the lengths $\ell(y_{34})$ and $\ell(y_{56})$ of the moving segment in two positions $y_{56}$ and $y_{34}$ is given by the formula

$$ \ell(y_{56}) - \ell(y_{34}) = I^*(D_{46}) - I^*(C_{35}). \tag{6} $$

This and the formula (3) are the two important ones which we have been seeking. It is evident that they will still hold in even simpler form when one of the curves $C$ or $D$ degenerates into a point, since along such a degenerate curve the differentials $dx$ and $dy$ are zero.

We now do a similar investigation of a necessary condition for the general problem defined in (13) and (15) of the last chapter: Minimize an integral

$$ I = \int_{x_1}^{x_2} F(x, y, y')\,dx \tag{7} $$

on the class $B$ of admissible arcs joining two fixed points $(x_1, y_1)$ and $(x_2, y_2)$ in the $xy$ plane (i.e. in 2 dimensional space). Suppose we are given an arc $y_0$ that gives a relative minimum to $I$ on the class $B$. Then by the previous chapter, the first derivative $I'(0)$ of $I$ about $y_0$ has the property that

$$ I'(0) = \int_{x_1}^{x_2} \big[ F_y\,\eta + F_{y'}\,\eta' \big]\,dx = 0 \tag{8} $$

for all admissible arcs $\eta$ with $\eta(x_1) = 0$ and $\eta(x_2) = 0$, where the arguments in the derivatives of $F$ are along $y_0$. If we make use of the formula

$$ \eta\,F_y(x) = \frac{d}{dx}\left( \eta \int_{x_1}^{x} F_y\,ds \right) - \eta' \int_{x_1}^{x} F_y\,ds \tag{9} $$

and the fact that $\eta(x_1) = \eta(x_2) = 0$ (so that the integrated term vanishes at both endpoints), then (8) becomes

$$ I'(0) = \int_{x_1}^{x_2} \left[ F_{y'} - \int_{x_1}^{x} F_y\,ds \right] \eta'\,dx. \tag{10} $$

Then by use of the fundamental lemma we find that

$$ F_{y'}(x) = \int_{x_1}^{x} F_y\,ds + C, \qquad x_1 \le x \le x_2 \tag{11} $$

holds at every point along $y_0$. Since we are only thus far considering arcs on which $y'(x)$ is continuous, we may differentiate (11) to obtain

$$ \frac{d}{dx} F_{y'}(x) = F_y(x), \qquad x_1 \le x \le x_2 \tag{12} $$

along $y_0$ (i.e. the arguments in $F_{y'}$ and $F_y$ are those of the arc $y_0$). This is the famous Euler equation.

There is a second less well-known Euler equation, namely:

$$ \frac{d}{dx}\big( F - y' F_{y'} \big) = F_x \tag{13} $$

which is true along $y_0$. For now, we prove this result only in the case that $y_0$ is of class $C^2$ (i.e. has continuous second derivative $y_0''$). It is however true when $y_0$ is of class $C^1$ (i.e. has continuous tangent) except at most at a finite number of points. Beginning with the left hand side of (13),

$$ \frac{d}{dx}\big[ F - y' F_{y'} \big] = F_x + F_y\,y' + \underbrace{F_{y'}\,y'' - y'' F_{y'}}_{=0} - y'\,\frac{d}{dx} F_{y'}. \tag{14} $$

Thus, factoring $y'$ from the last terms, we have

$$ \frac{d}{dx}\big[ F - y' F_{y'} \big] = F_x + y' \underbrace{\left( F_y - \frac{d}{dx} F_{y'} \right)}_{=0 \text{ by (12)}}. \tag{15} $$

Thus we end up with the right hand side of (13). This proves:

Theorem 5 The Euler equations (12) and (13) are satisfied by an admissible arc $y_0$ which provides a relative minimum to $I$ in the class of admissible arcs joining its endpoints.

Definition: An admissible arc $y_0$ of class $C^2$ that satisfies the Euler equations on all of $[x_1, x_2]$ is called an extremal.

We note that the proof of (13) relied on the fact that (12) was true. Thus on arcs of class $C^2$, (13) is not an independent result from (12). However (13) is valid on much more general arcs and on many of these constitutes an independent result from (12).
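Both Euler equations can be checked numerically on a given arc. The following is a minimal sketch under stated assumptions rather than a method taken from the notes: the helper `euler_residuals` is a hypothetical name, all derivatives are approximated by central differences with an assumed step `h`, and the two test integrands ($F = y'^2 - y^2$ with $y = \sin x$, and $F = (1 + x)y'^2$ with $y = \ln(1 + x)/\ln 2$) are chosen only because their extremals are known in closed form.

```python
import math

def euler_residuals(F, y, x, h=1e-4):
    """Residuals of (12) and (13) at x along the arc y, using central differences throughout.

    F is called as F(x, y, p) with p standing for y'.
    """
    def dy(t):
        return (y(t + h) - y(t - h)) / (2 * h)

    def F_y(t):
        return (F(t, y(t) + h, dy(t)) - F(t, y(t) - h, dy(t))) / (2 * h)

    def F_p(t):  # partial derivative of F with respect to y'
        return (F(t, y(t), dy(t) + h) - F(t, y(t), dy(t) - h)) / (2 * h)

    def F_x(t):
        return (F(t + h, y(t), dy(t)) - F(t - h, y(t), dy(t))) / (2 * h)

    def G(t):  # the combination F - y' F_{y'} appearing in (13)
        return F(t, y(t), dy(t)) - dy(t) * F_p(t)

    r12 = (F_p(x + h) - F_p(x - h)) / (2 * h) - F_y(x)  # d/dx F_{y'} - F_y
    r13 = (G(x + h) - G(x - h)) / (2 * h) - F_x(x)      # d/dx (F - y' F_{y'}) - F_x
    return r12, r13

# F = y'^2 - y^2 along y = sin x (an extremal, since y'' + y = 0)
res_a = euler_residuals(lambda x, y, p: p * p - y * y, math.sin, 0.7)
# F = (1 + x) y'^2 along y = ln(1 + x) / ln 2 (an extremal, since (1 + x) y' is constant)
res_b = euler_residuals(lambda x, y, p: (1 + x) * p * p,
                        lambda x: math.log(1 + x) / math.log(2), 0.5)
# F = y'^2 - y^2 along y = x^2, which is not an extremal
res_c = euler_residuals(lambda x, y, p: p * p - y * y, lambda x: x * x, 0.7)
print(res_a, res_b, res_c)
```

On the two extremals both residuals should be close to zero, while on the parabola they are not; by (15) the two residuals on a $C^2$ arc differ only by the factor $y'$.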
We call (12)-(13) the complete set of Euler equations. Euler's equations are in general second order differential equations (when the 2nd derivative $y_0''$ exists on the minimizing arc). There are however some special cases where these equations can be reduced to first order equations or algebraic equations. For example:

Case 1 Suppose that the integrand $F$ does not depend on $y$, i.e. the integral to be minimized is

$$ \int_{x_1}^{x_2} F(x, y')\,dx \tag{16} $$

where $F$ does not contain $y$ explicitly. In this case the first Euler equation (12) becomes along an extremal

$$ \frac{d}{dx} F_{y'} = 0 \tag{17} $$

or

$$ F_{y'} = C \tag{18} $$

where $C$ is a constant. This is a first order differential equation which does not contain $y$. This was the case in the shortest distance problem done before.

Case 2 If the integrand does not depend on the independent variable $x$, i.e. if we have to minimize

$$ \int_{x_1}^{x_2} F(y, y')\,dx \tag{19} $$

then the second Euler equation (13) becomes

$$ \frac{d}{dx}\big( F - y' F_{y'} \big) = 0 \tag{20} $$

or

$$ F - y' F_{y'} = C \tag{21} $$

(where $C$ is a constant), a first order equation.

Case 3 If $F$ does not depend on $y'$, then the first Euler equation becomes

$$ 0 = F_y(x, y) \tag{22} $$

which is not a differential equation, but rather an algebraic equation.

We next develop for our general problem the general version of the two auxiliary formulas (3) and (6) which were developed for the shortest distance problem.

3.2 Two Important Auxiliary Formulas in the General Case

For the purpose of developing our new equations let us consider a one-parameter family of extremal arcs

$$ y:\ y(x, b) \qquad (x_3 \le x \le x_4) \tag{23} $$

satisfying the Euler differential equation

$$ \frac{\partial F_{y'}}{\partial x} = F_y. \tag{24} $$

The partial derivative symbol is now used because there are always the two variables $x$ and $b$ in our equations. If $x_3$, $x_4$ and $b$ are all regarded as variables the value of the integral $I$ along an arc of the family is a function of the form

$$ I(x_3, x_4, b) = \int_{x_3}^{x_4} F\big(x, y(x, b), y'(x, b)\big)\,dx. $$

With the help of Euler's equation (24), we see that along an extremal

$$ \frac{\partial F}{\partial b} = F_y \frac{\partial y}{\partial b} + F_{y'} \frac{\partial y'}{\partial b} = \frac{\partial F_{y'}}{\partial x} \frac{\partial y}{\partial b} + F_{y'} \frac{\partial y'}{\partial b} = \frac{\partial}{\partial x}\left( F_{y'} \frac{\partial y}{\partial b} \right), $$

where (24) was used in the second step, and the three partial derivatives of the function $I(x_3, x_4, b)$ have therefore the values

$$ \frac{\partial I}{\partial x_3} = -F\,\Big|^{3}, \qquad \frac{\partial I}{\partial x_4} = F\,\Big|^{4}, \qquad \frac{\partial I}{\partial b} = \int_{x_3}^{x_4} \frac{\partial}{\partial x}\left( F_{y'} \frac{\partial y}{\partial b} \right) dx = F_{y'} \frac{\partial y}{\partial b}\,\Big|_3^4, $$

in which the arguments of $F$ and its derivatives are understood to be the values $y$, $y'$ belonging to the family (23).

Suppose now that the variables $x_3$, $x_4$, $b$ are functions $x_3(t)$, $x_4(t)$, $b(t)$ of a variable $t$ so that the endpoints 3 and 4 of the extremals of the family (23) describe simultaneously two curves $C$ and $D$ in Figure 11 whose equations are

$$ x = x_1(t), \quad y = y(x_1(t), b(t)) = y_1(t), \qquad x = x_2(t), \quad y = y(x_2(t), b(t)) = y_2(t). \tag{25} $$

[Figure 11: Curves described by endpoints of the family y(x, b)]

The differentials $dx_3$, $dy_3$ and $dx_4$, $dy_4$ along these curves are found by attaching suitable subscripts 3 and 4 to $dx$ and $dy$ in the equations

$$ dx = x'(t)\,dt, \qquad dy = y_x\,dx + y_b\,db. \tag{26} $$

From the formulas for the derivatives of $I$ we now find the differential

$$ dI = \frac{\partial I}{\partial x_3}\,dx_3 + \frac{\partial I}{\partial x_4}\,dx_4 + \frac{\partial I}{\partial b}\,db = \big( F\,dx + F_{y'}\,y_b\,db \big)\Big|_3^4 = \big( F\,dx + F_{y'}(dy - p\,dx) \big)\Big|_3^4 $$

where $p = y_x$ is the slope of the extremal and the vertical bar indicates the difference between the values at the points 4 and 3. With the help of the second of (26) this gives the following important result:

Theorem 6 The value of the integral $I$ taken along a one-parameter family of extremal arcs $y_{34}(x, b)$ whose endpoints describe the two curves $C$ and $D$ shown in Figure 11 has the differential

$$ dI = \big[ F(x, y, p)\,dx + (dy - p\,dx)\,F_{y'}(x, y, p) \big]\Big|_3^4, \tag{27} $$

where at the points 3 and 4 the differentials $dx$, $dy$ are those belonging to $C$ and $D$, while $y$ and $p$ are the ordinate and slope of $y_{34}(x, b)$.

We may denote by $I^*$ the integral

$$ I^* = \int \big\{ F(x, y, p)\,dx + (dy - p\,dx)\,F_{y'}(x, y, p) \big\}. $$

If we integrate the formula (27) between the two values of $t$ defining the points 3 and 5 in Figure 11 we find the following useful relation between values of this integral and the original integral $I$.

COROLLARY: For two arcs $y_{34}(x, b)$ and $y_{56}(x, b)$ of the family of extremals shown in Figure 11 the difference of the values of the integral $I$ is given by the formula

$$ I(y_{56}(x, b)) - I(y_{34}(x, b)) = I^*(D_{46}) - I^*(C_{35}). \tag{28} $$

Let us now use the results just obtained in order to attack the Brachistochrone problem introduced in chapter 2. That problem is to find the path joining points 1 and 2 such that a particle starting at point 1 with velocity $v_1$ and acted upon only by gravity will reach point 2 in minimum time.

It is natural at first sight to suppose that a straight line is the path down which a particle will fall in the shortest time from a given point 1 to a second given point 2, because a straight line is the shortest distance between the two points, but a little contemplation soon convinces one that this is not the case. John Bernoulli explicitly warned his readers against such a supposition when he formally proposed the brachistochrone problem in 1696. The surmise, suggested by Galileo's remarks on the brachistochrone problem, that the curve of quickest descent is an arc of a circle, is a more reasonable one, since there seems intuitively some justification for thinking that steepness and high velocity at the beginning of a fall will conduce to shortness in the time of descent over the whole path. It turns out, however, that this characteristic can also be overdone; the precise degree of steepness required at the start can in fact only be determined by a suitable mathematical investigation.
The first step which will be undertaken in the discussion of the problem in the following pages is the proof that a brachistochrone curve joining two given points must be a cycloid. A cycloid is the arched locus of a point on the rim of a wheel which rolls on a horizontal line, as shown in Figure 12. It turns out that the brachistochrone must consist of a portion of one of the arches turned upside down, and the line on the underside of which the circle rolls must be located at just the proper height above the given initial point of fall.

[Figure 12: Cycloid]

When these facts have been established we are then faced with the problem of determining whether or not such a cycloid exists joining two arbitrarily given points. Fortunately we will be able to prove that two points can always be joined by one and only one cycloid of the type desired.

The analytic formulation of the problem. In order to discuss intelligently the problem of the brachistochrone we should first obtain the integral which represents the time required by a particle to fall under the action of gravity down an arbitrarily chosen curve joining two fixed points 1 and 2. Assume that the initial velocity $v_1$ at the point 1 is given, and that the particle is to fall without friction on the curve and without resistance in the surrounding medium. If the effects of friction or a resisting medium are to be taken into account the brachistochrone problem becomes a much more complicated one.
[Figure 13: A particle falling from point 1 to point 2]

Let $m$ be the mass of the moving particle $P$ in Figure 13 and $s$ the distance through which it has fallen from the point 1 along the curve of descent $C$ in the time $t$. In order to make our analysis more convenient we may take the positive $y$-axis vertically downward, as shown in the figure. The vertical force of gravity acting upon $P$ is the product of the mass $m$ by the gravitational acceleration $g$, and the only force acting upon $P$ in the direction of the tangent line to the curve is the projection $mg \sin\tau$ of this vertical gravitational force upon that line. But the force along the tangent may also be computed as the product $m\,\dfrac{d^2 s}{dt^2}$ of the mass of the particle by its acceleration along the curve. Equating these two values we find the equation

$$ \frac{d^2 s}{dt^2} = g \sin\tau = g\,\frac{dy}{ds} $$

in which a common factor $m$ has been cancelled and use has been made of the formula $\sin\tau = \dfrac{dy}{ds}$.

To integrate this equation we multiply each side by $2\,\dfrac{ds}{dt}$. The antiderivatives of the two sides are then found, and since they can differ only by a constant we have

$$ \left( \frac{ds}{dt} \right)^2 = 2gy + c. \tag{29} $$

The value of the constant $c$ can be determined if we remember that the values of $y$ and $v = \dfrac{ds}{dt}$ at the initial point 1 of the fall are $y_1$ and $v_1$, respectively, so that for $t = 0$ the last equation gives

$$ v_1^2 = 2gy_1 + c. $$

With the help of the value of $c$ from this equation, and the notation

$$ \alpha = y_1 - \frac{v_1^2}{2g}, \tag{30} $$

equation (29) becomes

$$ \left( \frac{ds}{dt} \right)^2 = 2gy + v_1^2 - 2gy_1 = 2gy + 2g\left( \frac{v_1^2}{2g} - y_1 \right) = 2g(y - \alpha). \tag{31} $$

An integration now gives the following result: The time $T$ required by a particle starting with the initial velocity $v_1$ to fall from a point 1 to a point 2 along a curve is given by the integrals

$$ T = \frac{1}{\sqrt{2g}} \int_0^{\ell} \frac{ds}{\sqrt{y - \alpha}} = \frac{1}{\sqrt{2g}} \int_{x_1}^{x_2} \sqrt{\frac{1 + y'^2}{y - \alpha}}\;dx \tag{32} $$

where $\ell$ is the length of the curve and $\alpha = y_1 - \dfrac{v_1^2}{2g}$.

An arc which minimizes one of the integrals (32) expressing $T$ will also minimize that integral when the factor $\dfrac{1}{\sqrt{2g}}$ is omitted, and vice versa. Let us therefore use the notations

$$ I = \int_{x_1}^{x_2} F(y, y')\,dx, \qquad F(y, y') = \sqrt{\frac{1 + y'^2}{y - \alpha}} \tag{33} $$

for our integral which we seek to minimize and its integrand. Since the value of the function $F(y, y')$ is infinite when $y = \alpha$ and imaginary when $y < \alpha$ we must confine our curves to the portion of the plane which lies below the line $y = \alpha$ in Figure 13. This is not really a restriction of the problem since the equation $v^2 = \left( \dfrac{ds}{dt} \right)^2 = 2g(y - \alpha)$ deduced above shows that a particle started on a curve with the velocity $v_1$ at the point 1 will always come to rest if it reaches the altitude $y = \alpha$ on the curve, and it can never rise above that altitude. For the present we shall restrict our curves to lie in the half-plane $y > \alpha$.

In our study of the shortest distance problems the arcs to be considered were taken in the form $y : y(x)$ $(x_1 \le x \le x_2)$ with $y(x)$ and $y'(x)$ continuous on the interval $x_1 \le x \le x_2$. An admissible arc for the brachistochrone problem will always be understood to have these properties besides the additional one that it lies entirely in the half-plane $y > \alpha$. The integrand $F(y, y')$ and its partial derivatives are:

$$ F = \sqrt{\frac{1 + y'^2}{y - \alpha}}, \qquad F_y = -\frac{1}{2} \sqrt{\frac{1 + y'^2}{(y - \alpha)^3}}, \qquad F_{y'} = \frac{y'}{\sqrt{(y - \alpha)(1 + y'^2)}}. \tag{34} $$

Since our integrand in (33) is independent of $x$ we may use the case 2 special result (21) of the Euler equations. When the values of $F$ and its derivative $F_{y'}$ for the brachistochrone problem are substituted from (34) this equation becomes

$$ F - y' F_{y'} = \frac{1}{\sqrt{(y - \alpha)(1 + y'^2)}} = \frac{1}{\sqrt{2b}}, \tag{35} $$

the value of the constant being chosen for convenience in the form $\dfrac{1}{\sqrt{2b}}$.

The curves which satisfy the differential equation (35) may be found by introducing a new variable $u$ defined by the equation

$$ y' = -\tan\frac{u}{2} = -\frac{\sin u}{1 + \cos u}. \tag{36} $$

From the differential equation (35) it follows then, with the help of some trigonometry, that along a minimizing arc $y_0$ we must have

$$ y - \alpha = \frac{2b}{1 + y'^2} = 2b \cos^2\frac{u}{2} = b(1 + \cos u). $$

Thus

$$ \frac{dy}{du} = -b \sin u. $$

Now

$$ \frac{dx}{du} = \frac{dx}{dy}\,\frac{dy}{du} = -\frac{1 + \cos u}{\sin u}\,(-b \sin u) = b(1 + \cos u). $$

Integrating, we get

$$ x = a + b(u + \sin u) $$

where $a$ is the new constant of integration. It will soon be shown that curves which satisfy the first and third of these equations are the cycloids described in the following theorem:

Theorem 7 A curve down which a particle, started with the initial velocity $v_1$ at the point 1, will fall in the shortest time to a second point 2 is necessarily an arc having equations of the form

$$ x - a = b(u + \sin u), \qquad y - \alpha = b(1 + \cos u). \tag{37} $$

These represent the locus of a point fixed on the circumference of a circle of radius $b$ as the circle rolls on the lower side of the line $y = \alpha = y_1 - \dfrac{v_1^2}{2g}$. Such a curve is called a cycloid.

Cycloids. The fact that (37) represent a cycloid of the kind described in the theorem is proved as follows: Let a circle of radius $b$ begin to roll on the line $y = \alpha$ at the point whose coordinates are $(a, \alpha)$, as shown in Figure 14. After a turn through an angle of $u$ radians the point of tangency is at a distance $bu$ from $(a, \alpha)$ and the point which was the lowest in the circle has rotated to the point $(x, y)$. The values of $x$ and $y$ may now be calculated in terms of $u$ from the figure, and they are found to be those given by (37).
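The cycloid equations (37) and the first integral (35) lend themselves to a quick numerical check. The sketch below is an illustration under assumed values, not a computation from the notes: it takes $a = 0$, $\alpha = 0$, $b = 1$, a particle starting at rest ($v_1 = 0$) from the cusp $u = -\pi$, and $g = 9.81$; the function names `cycloid` and `descent_time` are hypothetical. It verifies that $(y - \alpha)(1 + y'^2) = 2b$ along the cycloid and compares the descent time (32), written in parametric form, for the cycloid and for the straight line joining the same endpoints.

```python
import math

G_ACCEL = 9.81  # gravitational acceleration in m/s^2 (an assumed value)

def cycloid(u, a=0.0, b=1.0, alpha=0.0):
    """Point (x, y) and slope y' on the cycloid (37); y is measured downward as in Figure 13."""
    x = a + b * (u + math.sin(u))
    y = alpha + b * (1.0 + math.cos(u))
    slope = -math.sin(u) / (1.0 + math.cos(u))  # (36): y' = -tan(u/2)
    return x, y, slope

def descent_time(dx, dy, depth, t0, t1, n=20000):
    """Midpoint-rule value of T in (32), written in terms of a curve parameter t.

    dx, dy are the derivatives of x and y with respect to t; depth(t) is y - alpha.
    """
    h = (t1 - t0) / n
    total = 0.0
    for k in range(n):
        t = t0 + (k + 0.5) * h
        total += math.hypot(dx(t), dy(t)) / math.sqrt(depth(t)) * h
    return total / math.sqrt(2.0 * G_ACCEL)

b = 1.0
# (35): (y - alpha)(1 + y'^2) should equal 2b at every point of the cycloid (alpha = 0 here).
invariants = []
for u in (-2.0, -1.0, 0.0, 1.0):
    x, y, p = cycloid(u, b=b)
    invariants.append(y * (1.0 + p * p))

# Cycloid from the cusp u = -pi (particle at rest) to the lowest point u = 0.
T_cycloid = descent_time(lambda u: b * (1.0 + math.cos(u)),
                         lambda u: -b * math.sin(u),
                         lambda u: b * (1.0 + math.cos(u)),
                         -math.pi, 0.0)
# Straight line between the same endpoints, with x = -pi*b + pi*b*s^2 and y = 2*b*s^2;
# the s^2 parametrization removes the 1/sqrt singularity at the starting point.
T_line = descent_time(lambda s: 2.0 * math.pi * b * s,
                      lambda s: 4.0 * b * s,
                      lambda s: 2.0 * b * s * s,
                      0.0, 1.0)
print(invariants, T_cycloid, T_line)
```

Under these assumptions the integrand along the cycloid reduces to the constant $\sqrt{2b}$, so the computed time should agree with $\pi\sqrt{b/g}$, and it should be smaller than the time along the straight line.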
[Figure 14: Cycloid]

The fact that the curve of quickest descent must be a cycloid is the famous result discovered by James and John Bernoulli in 1697 and announced at approximately the same time by a number of other mathematicians.

We next continue using the general theory results to develop two auxiliary formulas for the Brachistochrone problem which are the analogues of (3), (6) for the shortest distance problem.

Two Important Auxiliary Formulas

If a segment $y_{34}$ of a cycloid varies so that its endpoints describe two curves $C$ and $D$, as shown in Figure 15, then it is possible to find a formula for the differential of the value of the integral $I$ taken along the moving segment, and a formula expressing the difference of the values of $I$ at two positions of the segment.

The equations

$$ x = a(t) + b(t)(u + \sin u), \qquad y = \alpha + b(t)(1 + \cos u) \qquad (u_3(t) \le u \le u_4(t)) \tag{38} $$

define a one-parameter family of cycloid segments $y_{34}$ when $a$, $b$, $u_3$, $u_4$ are functions of a parameter $t$ as indicated in the equations. If $t$ varies, the endpoints 3 and 4 of this segment describe the two curves $C$ and $D$ whose equations in parametric form with $t$ as independent variable are found by substituting $u_3(t)$ and $u_4(t)$, respectively, in (38). These curves and two of the cycloid segments joining them are shown in Figure 15.

[Figure 15: Curves C, D described by the endpoints of segment $y_{34}$]

Now applying (27) of the general theory to this problem and regrouping its terms, the integral in (33) has the differential

$$ d\ell = (F - pF_{y'})\,dx + F_{y'}\,dy \tag{39} $$

where (recalling (27)) the differentials $dx$, $dy$ in (39) are those of $C$ and $D$ while $p$ is the slope of $y_{34}$. Then by (35) and the last part of (34) substituted into (39) the following important result is obtained.
Theorem 8 If a cycloid segment $y_{34}$ varies so that its endpoints 3 and 4 describe simultaneously two curves $C$ and $D$, as shown in Figure 15, then the value of the integral $I$ taken along $y_{34}$ has the differential

$$ d\ell = \left. \frac{dx + p\,dy}{\sqrt{y - \alpha}\,\sqrt{1 + p^2}} \right|_3^4. \tag{40} $$

At the points 3 and 4 the differentials $dx$, $dy$ in this expression are those belonging to $C$ and $D$, while $p$ is the slope of the segment $y_{34}$.

If the symbol $I^*$ is now used to denote the integral

$$ I^* = \int \frac{dx + p\,dy}{\sqrt{y - \alpha}\,\sqrt{1 + p^2}} \tag{41} $$

then by an integration of the formula (39) with respect to $t$ from $t_3$ to $t_5$ we find the further result that

Theorem 9 The difference between the values of $\ell$ at two different positions $y_{34}$ and $y_{56}$ of the variable cycloid segment, shown in Figure 15, is given by the formula

$$ \ell(y_{56}) - \ell(y_{34}) = I^*(D_{46}) - I^*(C_{35}). \tag{42} $$

The formulas (40) and (42) are the analogues for cycloids of the formulas (3) and (6) for the shortest distance problems. We shall see that they have many applications in the theory of brachistochrone curves.

Problems

1. Find the extremals of

$$ I = \int_{x_1}^{x_2} F(x, y, y')\,dx $$

for each case
a. $F = (y')^2 - k^2 y^2$ ($k$ constant)
b. $F = (y')^2 + 2y$
c. $F = (y')^2 + 4xy'$
d. $F = (y')^2 + yy' + y^2$
e. $F = x(y')^2 - yy' + y$
f. $F = a(x)(y')^2 - b(x)y^2$
g. $F = (y')^2 + k^2 \cos y$

2. Solve the problem: minimize

$$ I = \int_a^b \big[ (y')^2 - y^2 \big]\,dx $$

with $y(a) = y_a$, $y(b) = y_b$. What happens if $b - a = n\pi$?

3. Show that if $F = y^2 + 2xyy'$, then $I$ has the same value for all curves joining the endpoints.

4. A geodesic on a given surface is a curve, lying on that surface, along which distance between two points is as small as possible. On a plane, a geodesic is a straight line. Determine equations of geodesics on the following surfaces:
a. Right circular cylinder. [Take $ds^2 = a^2\,d\theta^2 + dz^2$ and minimize $\displaystyle\int \sqrt{a^2 + \left( \frac{dz}{d\theta} \right)^2}\;d\theta$ or $\displaystyle\int \sqrt{a^2 \left( \frac{d\theta}{dz} \right)^2 + 1}\;dz$.]
b. Right circular cone. [Use spherical coordinates with $ds^2 = dr^2 + r^2 \sin^2\alpha\,d\theta^2$.]
c. Sphere. [Use spherical coordinates with $ds^2 = a^2 \sin^2\phi\,d\theta^2 + a^2\,d\phi^2$.]
d. Surface of revolution. [Write $x = r\cos\theta$, $y = r\sin\theta$, $z = f(r)$. Express the desired relation between $r$ and $\theta$ in terms of an integral.]

5. Determine the stationary function associated with the integral

$$ I = \int_0^1 (y')^2 f(x)\,dx $$

when $y(0) = 0$ and $y(1) = 1$, where

$$ f(x) = \begin{cases} -1, & 0 \le x < \tfrac{1}{4} \\ \phantom{-}1, & \tfrac{1}{4} < x \le 1 \end{cases} $$

6. Find the extremals
a. $J(y) = \displaystyle\int_0^1 y'\,dx$, $\quad y(0) = 0$, $y(1) = 1$.
b. $J(y) = \displaystyle\int_0^1 yy'\,dx$, $\quad y(0) = 0$, $y(1) = 1$.
c. $J(y) = \displaystyle\int_0^1 xyy'\,dx$, $\quad y(0) = 0$, $y(1) = 1$.

7. Find extremals for
a. $J(y) = \displaystyle\int_0^1 \frac{y'^2}{x^3}\,dx$,
b. $J(y) = \displaystyle\int_0^1 \big( y^2 + (y')^2 + 2ye^x \big)\,dx$.

8. Obtain the necessary condition for a function $y$ to be a local minimum of the functional

$$ J(y) = \iint_R K(s, t)\,y(s)\,y(t)\,ds\,dt + \int_a^b y^2\,dt - 2\int_a^b y(t) f(t)\,dt $$

where $K(s, t)$ is a given continuous function of $s$ and $t$ on the square $R$, for which $a \le s, t \le b$, $K(s, t)$ is symmetric and $f(t)$ is continuous.
Hint: the answer is a Fredholm integral equation.

9. Find the extremal for

$$ J(y) = \int_0^1 (1 + x)(y')^2\,dx, \qquad y(0) = 0, \quad y(1) = 1. $$

What is the extremal if the boundary condition at $x = 1$ is changed to $y'(1) = 0$?

10. Find the extremals

$$ J(y) = \int_a^b \big( x^2 (y')^2 + y^2 \big)\,dx. $$

CHAPTER 4

4 Variable End-Point Problems

We next consider problems in which one or both end-points are not fixed. For illustration we again consider the shortest arc problem. However now we investigate the shortest arc from a fixed point to a curve.

If a fixed point 1 and a fixed curve $N$ are given instead of two fixed points then the shortest arc joining them must again be a straight-line segment, but this property alone is not sufficient to insure a minimum length. There are two further conditions on the shortest line from a point to a curve for which we shall find very interesting analogues in connection with the problems considered in later chapters.

Let the equations of the curve $N$ in Figure 16 be written in terms of a parameter $\tau$ in the form

$$ x = x(\tau), \qquad y = y(\tau). $$

Let $y_{12}$ be the solution to the problem of finding the shortest arc from point 1 to curve $N$.
[Figure 16: Shortest arc from a fixed point 1 to a curve N. G is the evolute]

Let $\tau_2$ be the parameter value defining the intersection point 2 of $N$. Clearly the arc $y_{12}$ is a straight-line segment. The length of the straight-line segment joining the point 1 with an arbitrary point $(x(\tau), y(\tau))$ of $N$ is a function $I(\tau)$ which must have a minimum at the value $\tau_2$ defining the particular line $y_{12}$. The formula (3) of chapter 3 is applicable to the one-parameter family of straight lines joining 1 with $N$ when in that formula we replace $C$ by the point 1 and $D$ by $N$. Since along $C$ (now degenerated to a point) the differentials $dx$, $dy$ are then zero it follows that the differential of the function $I(\tau)$ along the arc $y_{12}$ is

$$ dI = \left. \frac{dx + p\,dy}{\sqrt{1 + p^2}} \right|^{2} $$

where the bar indicates that the value of the preceding expression is to be taken at the point 2. Since for a minimum the differential $dI$ must vanish it follows that at the point 2 the differentials $dx$, $dy$ of $N$ and the slope $p$ of $y_{12}$ satisfy the condition $dx + p\,dy = 0$, and hence that these two curves must intersect at right angles (see (5) of chapter 3).

Even a straight-line segment through 1 and intersecting $N$ at right angles may not be a shortest arc joining 1 with $N$, as may be seen with the help of the familiar string property of the evolute of $N$. The segments of the straight lines perpendicular to $N$ cut off by $N$ and its evolute $G$ in Figure 16 form a family to which the formula (6) of chapter 3 is applicable. If in that formula we replace the curve $C$ by $G$ and $D$ by $N$ then (note that the points 2, 3, 5, 6 are vertices of a quadrilateral similar to Figure 11)

$$ \ell(y_{56}) - \ell(y_{32}) = I^*(N_{26}) - I^*(G_{35}). $$

But by using (5) of chapter 3 the integrals on the right hand side of this formula are seen to have the values

$$ I^*(N_{26}) = \int_{s_1}^{s_2} \cos\theta\,ds = 0, \qquad I^*(G_{35}) = I(G_{35}) $$

since $\cos\theta = 0$ along $N$ (the straight lines of the family meet $N$ at right angles), and $\cos\theta = 1$ along the envelope $G$ (to which these lines are tangent). Hence from the next to last equation we have the formula

$$ \ell(y_{32}) = I(G_{35}) + \ell(y_{56}). $$

This is the string property of the evolute, for it implies that the lengths of the arcs $y_{32}(x)$ and $G_{35} + y_{56}$ are the same, and hence that the free end 6 of the string fastened at 3 and allowed to wrap itself around the evolute $G$ will describe the curve $N$.

It is evident now that the segment $y_{12}$ cannot be a shortest line from 1 to $N$ if it has on it a contact point 3 with the evolute $G$ of $N$. For the composite arc $y_{13} + G_{35} + y_{56}$ would in that case have the same length as $y_{12}$, and the arc $y_{13} + L_{35} + y_{56}$, formed with the straight line segment $L_{35}$, would be shorter than $y_{12}$. It follows then that:

If an arc $y_{12}$ intersecting the curve $N$ at the point 2 is to be the shortest joining 1 with $N$ it must be a straight line perpendicular to $N$ at the point 2 and having on it no contact point with the evolute $G$ of $N$.

Our main purpose in this section was to obtain the straight line condition and also the perpendicularity condition at $N$ for the minimizing arc as we have done above. This last result concerning the evolute $G$ is a hint of something that we shall see more of later on.
Note: The evolute of a curve is the locus of the centers of curvature of the given curve. The family of straight lines normal to a given curve is tangent to the evolute of this curve, and the change in length of the radius of curvature is equal to the change in length of arc of the evolute as the point on the curve moves continuously in one direction along the curve.

4.1 The General Problem

We now consider the general problem: Minimize the integral

$$I = \int_{x_1}^{x_2} F(x, y, y')\,dx \tag{1}$$

on the class of arcs joining the fixed point 1, with coordinates $(x_1, y_1)$, with the curve $N$. Note that now point 2 with coordinates $(x_2, y_2)$ is not fixed, since it is as yet an undetermined point on $N$.

Necessary conditions when one endpoint is variable. A minimizing arc $y_{12}$ for this problem, meeting the curve $N$ at the point 2, must evidently be a minimizing arc for the problem with endpoints fixed at 1 and 2, and hence must satisfy at least the necessary conditions (12), (13) of chapter 3.

For the problem with one variable end point there is a new necessary condition for a minimum, involving the directions of the curves $y_{12}$ and $N$ at their intersection point 2, which is called the transversality condition. This condition may be proved with the help of the formula (27) of the last chapter. Let the points of $N$ be joined to the point 1 of $y_{12}$ by a one-parameter family of arcs containing $y_{12}$ as one member of the family. If the curve $C$ of the formula just cited is replaced by the fixed point 1, and the curve $D$ by $N$, then this formula shows that the value of $I$ taken along the arcs of the one-parameter family has at the particular arc $y_{12}$ the differential

$$dI = \Big[F(x, y, y')\,dx + (dy - y'\,dx)\,F_{y'}(x, y, y')\Big]^{2},$$

where at the point 2 the differentials $dx$, $dy$ are those of $N$ and the element $(x, y, y')$ belongs to $y_{12}$. If the values of $I$ along the arcs of the family are to have $I(y_{12})$ as a minimum, then the differential $dI$ must vanish along $y_{12}$ and we have the following result:

THE TRANSVERSALITY CONDITION. If for an admissible arc $y_{12}$ joining a fixed point 1 to a fixed curve $N$ the value $I(y_{12})$ is a minimum with respect to the values of $I$ on neighboring admissible arcs joining 1 with $N$, then at the intersection point 2 of $y_{12}$ and $N$ the direction $dx : dy$ of $N$ and the element $(x, y, y')$ of $y_{12}$ must satisfy the relation

$$F(x, y, y')\,dx + (dy - y'\,dx)\,F_{y'}(x, y, y') = 0. \tag{2}$$

If this condition is satisfied the arc $N$ is said to cut $y_{12}$ transversally at the point 2. When the arc $N$ is the vertical line $x = x_1$ or $x = x_2$, this condition is called a natural boundary condition. For many problems the transversality condition implies that $y_{12}$ and $N$ must meet at right angles. Indeed (2), when applied to the shortest distance problem, gives the condition of perpendicularity obtained there. However (2) does not in general imply perpendicularity, as one may verify in many special cases.

By a slight modification of the above reasoning we may treat the problem of minimizing the integral (1) on the class of arcs joining two given curves $C$ and $D$ as in Figure 11. Let $y_{12}$ be a minimizing arc meeting curves $C$ and $D$ at points 1 and 2 respectively. Then $y_{12}$ must also be a minimizing arc for the problem with fixed endpoints 1 and 2 and hence must satisfy the necessary conditions (12) and (13) of the last chapter. Furthermore, $y_{12}$ is also a minimizing arc for the problem of joining point 1 with the curve $D$, so that the transversality condition just deduced for the problem with one endpoint varying must hold at point 2. By a similar argument, with point 2 fixed for arcs joining point 2 with $C$, we see that the transversality condition must also hold at point 1.
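The transversality condition (2) can be evaluated numerically for a given integrand. The following is a minimal sketch, assuming a central-difference approximation of $F_{y'}$; the helper names (`dF_dyp`, `transversality_residual`) and the second integrand $F = (y')^2$ are illustrative choices made here, not taken from the notes. It shows that for the arc-length integrand the condition holds when $N$ is perpendicular to the arc, while for $F = (y')^2$ a perpendicular direction does not satisfy (2) but a different direction does.

```python
import math

def dF_dyp(F, x, y, yp, h=1e-6):
    """Central-difference approximation of the partial derivative F_{y'}."""
    return (F(x, y, yp + h) - F(x, y, yp - h)) / (2.0 * h)

def transversality_residual(F, x, y, yp, dx, dy):
    """Left side of (2): F dx + (dy - y' dx) F_{y'}, with (dx, dy) the direction of N."""
    return F(x, y, yp) * dx + (dy - yp * dx) * dF_dyp(F, x, y, yp)

def arc_length(x, y, yp):
    """Integrand of the shortest distance problem."""
    return math.sqrt(1.0 + yp * yp)

def squared_slope(x, y, yp):
    """Illustrative integrand F = (y')^2 (an assumption for this example)."""
    return yp * yp

# The arc has slope y' = 2 at point 2; the direction (2, -1) is perpendicular to it.
r_perp = transversality_residual(arc_length, 0.0, 0.0, 2.0, 2.0, -1.0)

# For F = (y')^2 the perpendicular direction does not satisfy (2) ...
r_sq_perp = transversality_residual(squared_slope, 0.0, 0.0, 2.0, 2.0, -1.0)

# ... but the direction (1, 1), i.e. dy/dx = y'/2, does.
r_sq_trans = transversality_residual(squared_slope, 0.0, 0.0, 2.0, 1.0, 1.0)

print(r_perp, r_sq_perp, r_sq_trans)
```

For $F = (y')^2$ the condition (2) reduces to $y'(2\,dy - y'\,dx) = 0$, which is why the direction with $dy/dx = y'/2$ is the transversal one in this sketch.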
Thus we have:

THE TRANSVERSALITY CONDITION (when both endpoints vary). If for an admissible arc $y_{12}$ joining two fixed curves $C$ and $D$, the value $I(y_{12})$ is a minimum with respect to the values of $I$ on neighboring admissible arcs joining $C$ and $D$, then at the intersection points 1 and 2 of $y_{12}$ with $C$ and $D$ respectively, the directions $dx : dy$ of $C$ and $D$ and the element $(x, y, y')$ of $y_{12}$ at points 1 and 2 must satisfy the separate relations

$$\Big[F(x, y, y')\,dx + (dy - y'\,dx)\,F_{y'}(x, y, y')\Big]^{i} = 0, \qquad i = 1, 2. \tag{3}$$

We now apply the results just developed for the general theory to the brachistochrone problem.

The path of quickest descent from a point to a curve. First necessary conditions. At the conclusion of his now famous solution of the brachistochrone problem, published in 1697, James Bernoulli proposed to other mathematicians, but to his brother in particular, a number of further questions. One of them was the problem of determining the arc down which a particle, starting with a given initial velocity, will fall in the shortest time from a fixed point to a fixed vertical straight line. This is a special case of the more general problem of determining the brachistochrone arc joining a fixed point 1 to an arbitrarily chosen fixed curve $N$.

Let the point 1, the curve $N$, and the path $y_{12}$ of quickest descent be those shown in Figure 17 (where $\alpha$ has the significance described in the previous chapter), and let the given initial velocity at the point 1 again be $v_1$. Since by the general theory just developed Euler's equations (12) and (13) of the previous chapter apply, then by what has been shown in chapter 3 the minimizing arc $y_{12}$ must be a cycloid joining point 1 to some as yet undetermined point 2 on the curve $N$. This constitutes a first necessary condition for this problem.

Applying (2) to the present problem and using (33) of chapter 3 gives at point 2

$$\frac{\sqrt{1 + y'^2}}{\sqrt{y - \alpha}}\,dx + (dy - y'\,dx)\,\frac{y'}{\sqrt{y - \alpha}\,\sqrt{1 + y'^2}} = 0 \tag{4}$$

where $y'$, $y$ are values on the minimizing arc $y_{12}$ at point 2 and $dy$, $dx$ are values of the curve $N$ at point 2. After multiplying through by $\sqrt{y - \alpha}\,\sqrt{1 + y'^2}$ and simplifying, one obtains the condition

$$dx + y'\,dy = 0 \tag{5}$$

which is the transversality condition for this problem. This condition means that $y_{12}$ must be perpendicular to the curve $N$ at point 2.

Figure 17: Path of quickest descent, $y_{12}$, from point 1 to the curve $N$ (the figure also shows the line $y = \alpha$).
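Bernoulli's special case, descent to a vertical line, gives a quick numerical check of the perpendicularity condition (5). The sketch below makes several simplifying assumptions that are not in the notes: the particle starts from rest ($v_1 = 0$), point 1 is placed at the origin with $y$ measured downward, the cycloids are parametrized as $x = a(\theta - \sin\theta)$, $y = a(1 - \cos\theta)$, and the vertical line is $x = b$. Under these assumptions the descent time along a cycloid up to parameter $\theta$ is $\theta\sqrt{a/g}$. The function names are hypothetical.

```python
import math

def descent_time(theta, b, g=9.81):
    """Descent time from rest along the cycloid x = a(t - sin t), y = a(1 - cos t)
    from t = 0 to t = theta, with the radius a chosen so that x(theta) = b."""
    a = b / (theta - math.sin(theta))
    return theta * math.sqrt(a / g)

def end_slope(theta):
    """Slope dy/dx of the cycloid at parameter theta."""
    return math.sin(theta) / (1.0 - math.cos(theta))

def best_end_parameter(b, n=20001):
    """Grid search over the end parameter theta in (0, 2*pi) for the least time."""
    lo, hi = 0.01, 2.0 * math.pi - 0.01
    best_theta, best_t = lo, descent_time(lo, b)
    for k in range(1, n):
        theta = lo + (hi - lo) * k / (n - 1)
        t = descent_time(theta, b)
        if t < best_t:
            best_theta, best_t = theta, t
    return best_theta, best_t

b = 1.0  # horizontal distance from point 1 to the vertical line (illustrative value)
best_theta, best_t = best_end_parameter(b)
print("best end parameter:", best_theta)
print("slope of the cycloid where it meets the line:", end_slope(best_theta))
print("descent time:", best_t)
```

The search returns an end parameter close to $\pi$, where the cycloid has a horizontal tangent, so the fastest cycloid meets the vertical line at a right angle, as (5) requires.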
So the transversality condition here, as in the shortest distance problem, is one of perpendicularity, but as already noted, this is not true for all problems.

Then for the brachistochrone problem from a point to a curve $N$, we have the result:

For a particle starting at point 1 with initial velocity $v_1$, the path of quickest descent from 1 to a curve $N$ is necessarily an arc $y_{12}$ of a cycloid, generated by a point fixed on the circumference of a circle rolling on the lower side of the line $y = y_1 - v_1^2/2g$. The path $y_{12}$ must furthermore be cut at right angles by the curve $N$ at their intersection point 2.

Example: Minimize the integral

$$I = \int_0^{\pi/4} \left[ y^2 - (y')^2 \right] dx$$

with the left end point fixed, $y(0) = 1$, and the right end point along the curve $x = \pi/4$.

Since $F = y^2 - (y')^2$, the Euler equation becomes $y'' + y = 0$. The solution is

$$y(x) = A\cos x + B\sin x .$$

Using the condition at $x = 0$, we get

$$y = \cos x + B\sin x .$$

Now for the transversality condition

$$\Big[ F + (\phi' - y')\,F_{y'} \Big]_{x = \pi/4} = 0$$

where $\phi$ is the curve on the right end. Since the curve is a vertical line, the slope $\phi'$ is infinite, thus we have to rewrite the condition after dividing by $\phi'$. This will become (noting that $1/\phi' = 0$)

$$F_{y'}\Big|_{x = \pi/4} = 0 .$$

In our case $F_{y'} = -2y'$, so that

$$y'\!\left(\frac{\pi}{4}\right) = 0 .$$

This implies $B = 1$, and thus the solution is $y = \cos x + \sin x$.

4.2 Appendix

Let's derive the transversality condition obtained before by a different method. Thus consider the problem of minimizing the integral $I$ labeled (1) among arcs

$$y:\ y(x), \qquad x_1 \le x \le x_2$$

(where $x_1$, $x_2$ can vary with the arc) satisfying

$$y(x_i) = Y_i(x_i), \qquad i = 1, 2. \tag{2}$$
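$$I = \int_{x_1}^{x_2} F(x, y, y')\,dx \tag{1}$$

This is a variable endpoint problem with $Y_1(x)$ as the left boundary curve and $Y_2(x)$ as the right boundary curve.

Assume

$$y_0:\ y_0(x), \qquad x_{01} \le x \le x_{02}$$

is a solution to this problem. Let $\eta(x)$ be an arc and create the family of arcs

$$y(\epsilon):\ y_0(x) + \epsilon\,\eta(x), \qquad x_1(\epsilon) \le x \le x_2(\epsilon), \quad -\delta < \epsilon < \delta \tag{3}$$

for some $\delta > 0$, where $\eta(x)$, $x_1(\epsilon)$, $x_2(\epsilon)$ are as yet arbitrary functions. In order that each arc in this family satisfies (2) we must have

$$y_0(x_i(\epsilon)) + \epsilon\,\eta(x_i(\epsilon)) = Y_i(x_i(\epsilon)), \qquad i = 1, 2. \tag{4}$$

Differentiating (4) with respect to $\epsilon$ at $\epsilon = 0$ gives (recall that the $\eta'(x_i)$ term has a factor of $\epsilon$)

$$y_0'(x_i(0))\,\frac{dx_i(0)}{d\epsilon} + \eta(x_i(0)) = \frac{dY_i}{d\epsilon}(x_i(0)). \tag{5}$$

Equation (5) gives a relation between $\eta$ and $\dfrac{dx_i(0)}{d\epsilon}$ at the endpoints of the solution arc and $\dfrac{dY_i(0)}{d\epsilon}$ of the boundary curves. Namely (with $x_{0i} = x_i(0)$, the endpoints of $y_0$),

$$\eta(x_{0i}) = \frac{dY_i(0)}{d\epsilon} - y_0'(x_{0i})\,\frac{dx_i(0)}{d\epsilon}, \qquad i = 1, 2. \tag{6}$$

These are the only $\eta(x)$ arcs that can be used in this problem, since they are the ones which create families of arcs satisfying (2). We call these admissible $\eta(x)$. For such an admissible $\eta(x)$, evaluate $I$ on the resultant family to get

$$I(\epsilon) = \int_{x_1(\epsilon)}^{x_2(\epsilon)} F(x,\ y_0 + \epsilon\eta,\ y_0' + \epsilon\eta')\,dx. \tag{7}$$

Differentiating with respect to $\epsilon$ at $\epsilon = 0$ gives

$$I'(0) = \int_{x_1(0)}^{x_2(0)} \left[ F_y\,\eta + F_{y'}\,\eta' \right] dx + \left[ F(x_{0i})\,\frac{dx_i(0)}{d\epsilon} \right]_1^2 = 0 \tag{8}$$

where $F(x_{0i})$ means $F(x_i(0),\ y_0(x_i(0)),\ y_0'(x_i(0)))$, i.e. $F$ evaluated on the arc $y_0$ at the $i$th endpoint, and all terms in $F$ or its derivatives are on $y_0$. The bracket in the last term means to difference the value at the left end point from its value at the right end point, and we have set $I'(0) = 0$ (why?). By doing the usual integration by parts we get

$$0 = I'(0) = \int_{x_{01}}^{x_{02}} \left[ F_{y'} - \int_{x_{01}}^{x} F_y\,ds \right] \eta'\,dx + \int_{x_{01}}^{x_{02}} \frac{d}{dx}\left[ \eta \int_{x_{01}}^{x} F_y\,ds \right] dx + \left[ F(x_{0i})\,\frac{dx_i(0)}{d\epsilon} \right]_1^2 . \tag{9}$$

Evaluating the second integral on the right side gives

$$0 = I'(0) = \int_{x_{01}}^{x_{02}} \left[ F_{y'} - \int_{x_{01}}^{x} F_y\,ds \right] \eta'\,dx + \left[ \eta(x_{0i}) \int_{x_{01}}^{x_{0i}} F_y\,ds + F(x_{0i})\,\frac{dx_i(0)}{d\epsilon} \right]_1^2 \tag{10}$$

and noting that $\int_{x_{01}}^{x_{01}} F_y\,ds = 0$ gives

$$0 = I'(0) = \int_{x_{01}}^{x_{02}} \left[ F_{y'} - \int_{x_{01}}^{x} F_y\,ds \right] \eta'\,dx + \eta(x_{02}) \int_{x_{01}}^{x_{02}} F_y\,ds + \left[ F(x_{0i})\,\frac{dx_i(0)}{d\epsilon} \right]_1^2 . \tag{11}$$

Now a particular class of admissible $\eta(x)$ are those for which

$$\eta(x_{02}) = 0, \qquad \frac{dx_i(0)}{d\epsilon} = 0, \quad i = 1, 2. \tag{12}$$

For such $\eta(x)$, all terms after the first integral on the right side in (11) are zero, so that for such $\eta(x)$ we have

$$0 = \int_{x_{01}}^{x_{02}} \left[ F_{y'} - \int_{x_{01}}^{x} F_y\,ds \right] \eta'\,dx . \tag{13}$$

Then by the fundamental lemma we have that

$$F_{y'}(x) - \int_{x_{01}}^{x} F_y\,ds = c \tag{14}$$

holds along the solution arc $y_0$. This is the same as the Euler equation for the fixed endpoint problem. Furthermore, by (14),

$$c = F_{y'}(x_{01}). \tag{15}$$

Now let $\eta(x)$ be any arc satisfying (6), i.e. we are returning to the full class of admissible $\eta(x)$. Then by (14) and (15) we get that the first integral on the right side in (11) is

$$\int_{x_{01}}^{x_{02}} \left[ F_{y'} - \int_{x_{01}}^{x} F_y\,ds \right] \eta'\,dx = \int_{x_{01}}^{x_{02}} c\,\eta'\,dx = c\,\big(\eta(x_{02}) - \eta(x_{01})\big) = F_{y'}(x_{01})\,\big[\eta(x_{02}) - \eta(x_{01})\big]. \tag{16}$$

Then by (16), (15) and (14) the equation (11) becomes

$$0 = F_{y'}(x_{01})\,\big[\eta(x_{02}) - \eta(x_{01})\big] + \eta(x_{02}) \int_{x_{01}}^{x_{02}} F_y\,ds + \left[ F(x_{0i})\,\frac{dx_i(0)}{d\epsilon} \right]_1^2 \tag{17}$$

where, by (14), $\int_{x_{01}}^{x_{02}} F_y\,ds = F_{y'}(x_{02}) - c$, which by (15) equals $F_{y'}(x_{02}) - F_{y'}(x_{01})$. Simplifying gives

$$\left[ F_{y'}(x_{0i})\,\eta(x_{0i}) + F(x_{0i})\,\frac{dx_i(0)}{d\epsilon} \right]_1^2 = 0. \tag{18}$$

Then by (6), for all admissible $\eta(x)$, (18) becomes

$$\left[ \big[ F(x_{0i}) - y_0'(x_{0i})\,F_{y'}(x_{0i}) \big]\,\frac{dx_i(0)}{d\epsilon} + F_{y'}(x_{0i})\,\frac{dY_i(0)}{d\epsilon} \right]_1^2 = 0. \tag{19}$$

When (19) is multiplied by $d\epsilon$, this is the transversality condition obtained previously.

Next, for future work, we'll need an alternate form of the fundamental lemma which we've been using.

Alternate fundamental lemma. If $\alpha(x)$ is continuous on $[x_1, x_2]$ and if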
$$\int_{x_1}^{x_2} \alpha(x)\,\eta(x)\,dx = 0$$

for every arc $\eta(x)$ of class $C^1$ satisfying $\eta(x_1) = \eta(x_2) = 0$, then $\alpha(x) \equiv 0$ for all $x$ on $[x_1, x_2]$.

Problems

1. Solve the problem: minimize $I = \int_0^{x_1} \left[ y^2 - (y')^2 \right] dx$ with the left end point fixed and $y(x_1)$ along the curve $x_1 = \pi/4$.

2. Find the extremals for $I = \int_0^1 \left[ \tfrac{1}{2}(y')^2 + y y' + y' + y \right] dx$ where the end values of $y$ are free.

3. Solve the Euler-Lagrange equation for $I = \int_a^b y\,\sqrt{1 + (y')^2}\,dx$ where $y(a) = A$, $y(b) = B$.
b. Investigate the special case when $a = -b$, $A = B$ and show that, depending upon the relative size of $b$, $B$, there may be none, one or two candidate curves that satisfy the requisite endpoint conditions.

4. Solve the Euler-Lagrange equation associated with $I = \int_a^b \left[ y^2 - y y' + (y')^2 \right] dx$.

5. What is the relevant Euler-Lagrange equation associated with $I = \int_0^1 \left[ y^2 + 2xy + (y')^2 \right] dx$?

6. Investigate all possibilities with regard to transversality for the problem $\min \int_a^b \sqrt{1 - (y')^2}\,dx$.

7. Determine the stationary functions associated with the integral $I = \int_0^1 \left[ (y')^2 - 2\alpha y y' - 2\beta y' \right] dx$, where $\alpha$ and $\beta$ are constants, in each of the following situations:
a. The end conditions $y(0) = 0$ and $y(1) = 1$ are preassigned.
b. Only the end condition $y(0) = 0$ is preassigned.
c. Only the end condition $y(1) = 1$ is preassigned.
d. No end conditions are preassigned.

8. Determine the natural boundary conditions associated with the determination of extremals in each of the cases considered in Problem 1 of Chapter 3.

9. Find the curves for which the functional $I = \int_0^{x_1} \dfrac{\sqrt{1 + y'^2}}{y}\,dx$ with $y(0) = 0$ can have extrema, if
a. The point $(x_1, y_1)$ can vary along the line $y = x - 5$.
b. The point $(x_1, y_1)$ can vary along the circle $(x - 9)^2 + y^2 = 9$.

10. If $F$ depends upon $x_2$, show that the transversality condition must be replaced by
$$\left[ F + (\phi' - y')\,\frac{\partial F}{\partial y'} \right]_{x = x_2} + \int_{x_1}^{x_2} \frac{\partial F}{\partial x_2}\,dx = 0 .$$

11. Find an extremal for $J(y) = \int_1^e \left[ \tfrac{1}{2} x^2 (y')^2 - \tfrac{1}{8} y^2 \right] dx$, $y(1) = 1$, $y(e)$ is unspecified.

12. Find an extremal for $J(y) = \int_0^1 (y')^2\,dx + y(1)^2$, $y(0) = 1$, $y(1)$ is unspecified.

CHAPTER 5

5 Higher Dimensional Problems and Another Proof of the Second Euler Equation

Up to now our problems have been two-dimensional, i.e. our arcs have been described by two variables, namely $x$, $y$. A natural generalization is to consider problems involving arcs in three-space with coordinates $x$, $y$, $z$ or in even higher dimensional space, say $(N+1)$-dimensional space with coordinates $x, y_1, \dots, y_N$. The problem to be considered then involves an integral of the form

$$I = \int_{x_1}^{x_2} F(x, y_1, \dots, y_N, y_1', \dots, y_N')\,dx \tag{1}$$
and a class of admissible arcs $\bar y$, where the superscript bar designates a vector arc, with components

$$\bar y:\ y_i(x), \qquad x_1 \le x \le x_2, \quad i = 1, \dots, N \tag{2}$$

on which the integral (1) has a well defined value. As a fixed endpoint version of this problem one would want to minimize (1) on a subclass of arcs (2) that join two fixed points $\bar y_1$ and $\bar y_2$, i.e. arcs that satisfy

$$y_i(x_1) = y_{i1}, \qquad y_i(x_2) = y_{i2} \tag{3}$$

where $\bar y_1$ has coordinates $(y_{11}, \dots, y_{N1})$ and $\bar y_2$ has coordinates $(y_{12}, \dots, y_{N2})$.

Analogous to the proof used in obtaining the first Euler equation in chapter 3 for the two-dimensional problem, we could obtain the following condition along a minimizing arc

$$F_{y_i'}(x) = \int_{x_1}^{x} F_{y_i}\,dx + c_i, \qquad i = 1, \dots, N \tag{4}$$

where the $c_i$ are constants. And then by differentiation

$$\frac{d}{dx} F_{y_i'}(x) = F_{y_i}(x), \qquad x_1 \le x \le x_2 . \tag{5}$$

Using this result we can now prove the second Euler equation for the two-dimensional problem with the same generality as for the first Euler equation. We previously stated this result in chapter 3 but only really proved it for arcs of class $C^2$ (i.e. having a continuous second derivative $y''$). Thus let us now consider the two-dimensional problem of chapter 3, defined by (7) and the remarks following it, without assuming even the existence of $y''$ on our arcs.

We now write our arcs in parametric form. In particular our minimizing arc $y_0$ (from chapter 3) is now written as

$$x = t, \qquad y = y_0(t), \qquad x_1 \le t \le x_2 \tag{6}$$

where $t$ is the parameter, $x(t) = t$ is the first component and $y(t) = y_0(t)$ is the second component. Being a minimizing arc, this arc must minimize the integral (7) of chapter 3 on the class of parametric arcs

$$x = \phi(t), \qquad y = \psi(t), \qquad t_1 \le t \le t_2 \tag{7}$$

(where we have set $t_1 = x_1$ and $t_2 = x_2$), which join the fixed points $(x_1, y_1)$ and $(x_2, y_2)$ of that problem and have $\phi'(t) > 0$ on $[t_1, t_2]$. This is true since each non-parametric arc of the originally stated problem can be written (as in (6)) as a parametric vector arc of the class just described and vice versa.

In terms of these parametric arcs, the integral (7) of chapter 3 takes the form

$$I = \int_{t_1}^{t_2} \underbrace{F\!\left(\phi, \psi, \frac{\psi'}{\phi'}\right) \phi'}_{F \text{ in (1)}}\,dt \tag{8}$$

where the primes now mean derivatives with respect to $t$. (The argument $\psi'/\phi'$ replaces $y'$ in the original integral. This follows since by calculus $\dfrac{dy}{dx} = \dfrac{dy}{dt} \Big/ \dfrac{dx}{dt}$.) This is an integral like (1) (i.e. in three-dimensional $t$, $x$, $y$ space). By (4) applied to the variable $\phi$ (use $i = 2$ in (4)), there results

$$F - \frac{\psi'}{\phi'}\,F_{y'} = \int_{t_1}^{t} F_x\,\phi'\,dt + c . \tag{9}$$

When we write $y'$ for $\psi'/\phi'$ and use (6) we get along $y_0$

$$F - y' F_{y'} = \int_{x_1}^{x} F_x\,dx + c \tag{10}$$

and by differentiation

$$\frac{d}{dx}\left[ F - y' F_{y'} \right] = F_x \tag{11}$$

which is the result listed in chapter 3.

5.1 Variational Problems with Constraints

5.1.1 Isoperimetric Problems

In the general problem considered thus far, the class of admissible arcs was specified (apart from certain continuity conditions) by conditions imposed on the endpoints. However many applications of the calculus of variations lead to problems in which not only boundary conditions, but also conditions of quite a different type, known as subsidiary conditions (or also side conditions or constraints), are imposed on the admissible arcs. As an example of this, we consider the isoperimetric problem. This problem is one of finding an arc $y$ passing through the points $(-x_1, 0)$ and $(x_1, 0)$ of given length $L$ which, together with the interval $[-x_1, x_1]$ on the $x$-axis, encloses the largest area. In general form, this problem can be stated as finding the arc $y$ for which the integral

$$I[y] = \int_{x_1}^{x_2} F(x, y, y')\,dx \tag{12}$$

is minimized (or maximized), where the admissible arcs satisfy the endpoint conditions

$$y(x_i) = y_i, \qquad i = 1, 2 \tag{13}$$

and also are such that another integral

$$K[y] = \int_{x_1}^{x_2} G(x, y, y')\,dx \tag{14}$$

has a fixed value $L$.

In the specific application noted above (assuming that $y$ is in the positive half-plane), the integral $I$ giving the area is

$$I = \int_{-x_1}^{x_1} y(x)\,dx \tag{15}$$

while the integral $K$ is the length integral

$$K = \int_{-x_1}^{x_1} \sqrt{1 + y'^2}\,dx \tag{16}$$

and is required to have fixed length.

Returning now to the general version of this problem stated in (12)-(14), we will follow the reasoning employed originally in solving the shortest distance problem. Thus assume that $y_0$ is a solution to this problem. Let $\eta_1(x)$ and $\eta_2(x)$ be two functions satisfying

$$\eta_1(x_i) = 0, \qquad \eta_2(x_i) = 0, \qquad i = 1, 2. \tag{17}$$

Create the two-parameter family of arcs

$$y(\epsilon_1, \epsilon_2):\ y_0(x) + \epsilon_1 \eta_1(x) + \epsilon_2 \eta_2(x). \tag{18}$$

By (17) this family satisfies the endpoint conditions of our problem. Consider the integrals (12) and (14) evaluated on this family. For example $I(y(\epsilon_1, \epsilon_2)) =$
$$\int_{x_1}^{x_2} F\big(x,\ y_0(x) + \epsilon_1 \eta_1(x) + \epsilon_2 \eta_2(x),\ y_0'(x) + \epsilon_1 \eta_1'(x) + \epsilon_2 \eta_2'(x)\big)\,dx \tag{19}$$

and similarly for $K(y(\epsilon_1, \epsilon_2))$. On this family of arcs our problem can be stated as

$$\text{minimize } I(y(\epsilon_1, \epsilon_2)) \quad \text{subject to} \quad K(y(\epsilon_1, \epsilon_2)) = L . \tag{20}$$

Now noting that on this family these integrals can be considered as functions of two variables $(\epsilon_1, \epsilon_2)$ (instead of arcs), then, when considering this family, our problem can be stated as

$$\min I(\epsilon_1, \epsilon_2) \quad \text{subject to} \quad K(\epsilon_1, \epsilon_2) = L \tag{21}$$

where in somewhat loose notation we have written $I(\epsilon_1, \epsilon_2)$ for $I(y(\epsilon_1, \epsilon_2))$ and similarly for $K$. This is a finite (actually, two) dimensional problem of the type described in chapter 1. (Note that up to now our families of arcs have been one-parameter families.) By the results there, and also noting that our minimizing arc $y_0 = y(0, 0)$ solves this problem, we must have that

$$0 = \frac{\partial I}{\partial \epsilon_i}(0, 0) + \lambda\,\frac{\partial K}{\partial \epsilon_i}(0, 0), \qquad i = 1, 2 \tag{22}$$

where $\lambda$ is a Lagrange multiplier. Writing the integrals $I$ and $K$ out in terms of the family (18) and differentiating separately with respect to $\epsilon_1$, $\epsilon_2$ under the integral sign gives the two equations

$$0 = \int_{x_1}^{x_2} \left[ F_y\,\eta_i + F_{y'}\,\eta_i' \right] dx + \lambda \int_{x_1}^{x_2} \left[ G_y\,\eta_i + G_{y'}\,\eta_i' \right] dx, \qquad i = 1, 2 \tag{23}$$

where the partial derivatives of $F$ and $G$ are at $(\epsilon_1, \epsilon_2) = (0, 0)$, i.e. along the arc $y_0$. Writing this as one integral gives

$$0 = \int_{x_1}^{x_2} \left[ (F + \lambda G)_y\,\eta_i + (F + \lambda G)_{y'}\,\eta_i' \right] dx = \int_{x_1}^{x_2} \left[ \bar F_y\,\eta_i + \bar F_{y'}\,\eta_i' \right] dx, \qquad i = 1, 2 \tag{24}$$

where $\bar F \equiv F + \lambda G$ and where this is true for all functions $\eta_i(x)$ satisfying (17). Making use of the integration by parts formula (9) of chapter 3, but with $\bar F$, we get as there

$$0 = \int_{x_1}^{x_2} \left[ \bar F_{y'} - \int_{x_1}^{x} \bar F_y\,ds \right] \eta_i'\,dx . \tag{25}$$

Then by the fundamental lemma we obtain

$$\bar F_{y'}(x) = \int_{x_1}^{x} \bar F_y\,ds + c, \qquad x_1 \le x \le x_2 \tag{26}$$

(where $c$ is a constant), which holds at every point along the arc $y_0$, and then also by differentiating

$$\frac{d}{dx} \bar F_{y'}(x) = \bar F_y(x), \qquad x_1 \le x \le x_2 \tag{27}$$

along $y_0$. In terms of the functions $F$ and $G$, this is

$$\frac{d}{dx} (F + \lambda G)_{y'} = (F + \lambda G)_y, \qquad x_1 \le x \le x_2 . \tag{28}$$

This is the first Euler equation for the isoperimetric problem. In a manner similar to that used in the beginning of this chapter, it can be shown that the second Euler equation

$$(F + \lambda G) - y'\,(F + \lambda G)_{y'} = \int_{x_1}^{x} (F + \lambda G)_x\,dx + c, \qquad x_1 \le x \le x_2 \tag{29}$$

(with $c$ again a constant, as in (10)) or in differentiated form

$$\frac{d}{dx}\left[ (F + \lambda G) - y'\,(F + \lambda G)_{y'} \right] = (F + \lambda G)_x, \qquad x_1 \le x \le x_2 \tag{30}$$

also holds along $y_0$. These results are summarized in

Theorem 10. For the problem stated in (12)-(14) let $y_0$ be a solution. Then there exists a constant $\lambda$ such that (26), (28), (29), and (30) are true along $y_0$.

Note that if our problem did not have a fixed right end point but instead the arc was required to intersect some curve $N$, then $\eta(x)$ would not have to satisfy (17) for $i = 2$, and a line of reasoning similar to that used in chapter 4 would give

$$(F + \lambda G)\,dx + (dy - y'\,dx)\,(F + \lambda G)_{y'} = 0 \tag{31}$$

as the transversality condition at the intersection with $N$, where the direction $dy : dx$ comes from $N$, and the arguments of $F$, $G$ are from $y_0$. For this problem a corresponding condition at the left end point would hold if the left end point were not prescribed.

Let's go through an application: Consider the problem of determining the curve of length $L$ with endpoints $(0, 0)$ and $(1, 0)$ which encloses the largest area between it and the $x$-axis. Thus we need to

$$\text{maximize } I = \int_0^1 y\,dx \tag{32}$$
subject to fixed end points

$$y(0) = y(1) = 0 \tag{33}$$

and fixed length constraint

$$K = \int_0^1 \sqrt{1 + y'^2}\,dx = L . \tag{34}$$

Setting

$$\bar F = y + \lambda \sqrt{1 + y'^2} \tag{35}$$

the first Euler equation (27) is

$$\lambda\,\frac{d}{dx}\left( \frac{y'}{\sqrt{1 + y'^2}} \right) - 1 = 0 . \tag{36}$$

Direct integration gives

$$\frac{\lambda\,y'}{\sqrt{1 + y'^2}} = x - c_1 . \tag{37}$$

Now make the substitution

$$\tan\theta = y' \tag{38}$$

then (37) gives (recall that $1 + \tan^2\theta = \sec^2\theta$)

$$\sin\theta = \frac{x - c_1}{\lambda} . \tag{39}$$

Now since $\tan\theta = \dfrac{\sin\theta}{\cos\theta} = \dfrac{\sin\theta}{\sqrt{1 - \sin^2\theta}}$, then (38), (39) give

$$y' = \frac{x - c_1}{\lambda \sqrt{1 - \dfrac{(x - c_1)^2}{\lambda^2}}} = \frac{x - c_1}{\sqrt{\lambda^2 - (x - c_1)^2}} \tag{40}$$

or, when using $y' = \dfrac{dy}{dx}$,

$$dy = \frac{x - c_1}{\sqrt{\lambda^2 - (x - c_1)^2}}\,dx . \tag{41}$$

Integration gives

$$y = -\sqrt{\lambda^2 - (x - c_1)^2} + c_2 \tag{42}$$

or then

$$(y - c_2)^2 + (x - c_1)^2 = \lambda^2 . \tag{43}$$

This is part of a circle with center $(c_1, c_2)$ and radius $\lambda$. The three constants $c_1$, $c_2$, $\lambda$ are determined to satisfy the two endpoint conditions and the fixed length constraint; this completes the problem solution (see problem 4 at the end of this chapter).

5.1.2 Point Constraints

We next discuss problems in which minimization is to be done subject to a different type of constraint than the one just discussed. Thus consider the problem

$$\text{minimize } I = \int_{x_1}^{x_2} F(x, y, z, y', z')\,dx \tag{44}$$

subject to fixed endpoints

$$y(x_i) = y_i, \qquad z(x_i) = z_i, \qquad i = 1, 2 \tag{45}$$

and also subject to a constraint

$$\phi(x, y, z, y', z') = 0 . \tag{46}$$

Assume (as for previous problems) that $\bar y_0$ is a solution to this problem. The notation $\bar y_0$ denotes a vector arc with components $y_0(x)$, $z_0(x)$. All arcs considered in this problem have two components. Also assume that $\phi_{y'}$, $\phi_{z'}$ do not equal zero simultaneously at any point on $\bar y_0$.

Next, let $\eta(x)$, $\xi(x)$ be functions which satisfy

$$\eta(x_i) = 0, \qquad \xi(x_i) = 0, \qquad i = 1, 2. \tag{47}$$

As in previous chapters, create the one-parameter family of arcs (but note that now our arcs are vector arcs)

$$\bar y(\epsilon):\ y_0(x) + \epsilon\,\eta(x),\quad z_0(x) + \epsilon\,\xi(x), \qquad x_1 \le x \le x_2 . \tag{48}$$

We assume also that for some $\delta > 0$, and for $|\epsilon| < \delta$, the functions $\eta(x)$, $\xi(x)$ satisfy

$$\phi\big(x,\ y_0(x) + \epsilon\eta(x),\ z_0(x) + \epsilon\xi(x),\ y_0'(x) + \epsilon\eta'(x),\ z_0'(x) + \epsilon\xi'(x)\big) = 0, \qquad x_1 \le x \le x_2 . \tag{49}$$

Again, similar to previous chapters, evaluate the integral in (44) on our family and define

$$I(\epsilon) = \int_{x_1}^{x_2} F\big(x,\ y_0(x) + \epsilon\eta(x),\ z_0(x) + \epsilon\xi(x),\ y_0'(x) + \epsilon\eta'(x),\ z_0'(x) + \epsilon\xi'(x)\big)\,dx . \tag{50}$$

Differentiating this with respect to $\epsilon$ at $\epsilon = 0$ gives

$$0 = I'(0) = \int_{x_1}^{x_2} \left[ F_y\,\eta + F_z\,\xi + F_{y'}\,\eta' + F_{z'}\,\xi' \right] dx \tag{51}$$

where the partials of $F$ are taken at points along $\bar y_0$. Next, differentiate (49) with respect to $\epsilon$ at $\epsilon = 0$ to get

$$\phi_y\,\eta + \phi_z\,\xi + \phi_{y'}\,\eta' + \phi_{z'}\,\xi' = 0, \qquad x_1 \le x \le x_2 \tag{52}$$

where the partials of $\phi$ are at points along $\bar y_0$. Equation (52) reveals that the $\eta$, $\xi$ functions are not independent of each other but are related. Multiplying (52) by an as yet unspecified function $\lambda(x)$ and adding the result to the integrand of (51) yields
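$$\int_{x_1}^{x_2} \left[ (F_y + \lambda\phi_y)\,\eta + (F_{y'} + \lambda\phi_{y'})\,\eta' + (F_z + \lambda\phi_z)\,\xi + (F_{z'} + \lambda\phi_{z'})\,\xi' \right] dx = 0 . \tag{53a}$$

Setting $\hat F = F + \lambda\phi$ gives (53a) in the form

$$\int_{x_1}^{x_2} \left[ \hat F_y\,\eta + \hat F_{y'}\,\eta' + \hat F_z\,\xi + \hat F_{z'}\,\xi' \right] dx = 0 . \tag{53b}$$

Using the now familiar integration by parts formula on the 1st and 3rd terms in the integrand of (53b) gives

$$\hat F_y\,\eta = \frac{d}{dx}\left( \eta \int_{x_1}^{x} \hat F_y\,dx \right) - \eta' \int_{x_1}^{x} \hat F_y\,dx \tag{54}$$

and similarly for $\hat F_z\,\xi$. Using these and (47) yields

$$0 = I'(0) = \int_{x_1}^{x_2} \left( \left[ \hat F_{y'} - \int_{x_1}^{x} \hat F_y\,ds \right] \eta' + \left[ \hat F_{z'} - \int_{x_1}^{x} \hat F_z\,ds \right] \xi' \right) dx . \tag{55}$$

However we cannot take the step that we essentially did in developing the Euler equation in the unconstrained case at the start of this chapter and say that the $\eta$, $\xi$ functions are independent, since as noted above (see (52)) they are not. Now, assuming that $\phi_{y'} \ne 0$ (consistent with our assumption that either $\phi_{y'}$ or $\phi_{z'} \ne 0$), we can choose $\lambda$ such that the coefficient of $\eta'$ is constant, i.e. choose $\lambda$ such that

$$\frac{d}{dx}(F_{y'} + \lambda\phi_{y'}) - (F_y + \lambda\phi_y) = 0, \quad \text{or then} \quad \lambda' = \left( F_y + \lambda\phi_y - \frac{d}{dx} F_{y'} - \lambda\,\frac{d}{dx}\phi_{y'} \right) \Big/ \phi_{y'},$$

and integrate this result. Next choose $\xi$ arbitrarily (consistent with (47)) and $\eta$ consistent with (49) and (47). By (47) and the fundamental lemma, the coefficient of $\xi'$ must also be constant. This results in

$$\hat F_{y'}(x) - \int_{x_1}^{x} \hat F_y\,ds = c_1 \tag{56a}$$

$$\hat F_{z'}(x) - \int_{x_1}^{x} \hat F_z\,ds = c_2 \tag{56b}$$

where $c_1$, $c_2$ are constants. In differentiated form this is

$$\hat F_y - \frac{d}{dx} \hat F_{y'} = 0 \tag{56c}$$

$$\hat F_z - \frac{d}{dx} \hat F_{z'} = 0 . \tag{56d}$$

Substituting for $\hat F$, then (56c), (56d) become

$$(F_y + \lambda\phi_y) - \frac{d}{dx}(F_{y'} + \lambda\phi_{y'}) = 0 \tag{57a}$$

$$(F_z + \lambda\phi_z) - \frac{d}{dx}(F_{z'} + \lambda\phi_{z'}) = 0 . \tag{57b}$$

This result is actually contained in a larger result as follows. If the constraint (46) does not depend on $y'$, $z'$, i.e. if the constraint is

$$\phi(x, y, z) = 0 \tag{58}$$

and if $\phi_y$ and $\phi_z$ are not simultaneously zero at any point of $\bar y_0$, then the analogous equations for (57a) and (57b) are

$$F_y + \lambda\phi_y - \frac{d}{dx} F_{y'} = 0 \tag{59a}$$

$$F_z + \lambda\phi_z - \frac{d}{dx} F_{z'} = 0 . \tag{59b}$$

These results are summarized in the following:

Theorem: Given the problem

$$\min I = \int_{x_1}^{x_2} F(x, y, z, y', z')\,dx \tag{60}$$

subject to fixed end points and the constraint

$$\phi(x, y, z, y', z') = 0 \tag{61}$$

then if $\phi_{y'}$, $\phi_{z'}$ (or, in case $\phi$ does not depend on $y'$, $z'$, then if $\phi_y$, $\phi_z$) do not simultaneously equal zero at any point of a solution $\bar y_0$, then there is a function $\lambda(x)$ such that with $\hat F \equiv F + \lambda\phi$, (56a) and (56b), or in differentiated form (56c) and (56d), are satisfied along $\bar y_0$.

The three equations (56a,b) or (56c,d) and (61) are used to determine the three functions $y(x)$, $z(x)$, $\lambda(x)$ for the solution.

In more general cases, if our integrand has $k$ dependent variables

$$I = \int_{x_1}^{x_2} F(x, y_1, y_2, \dots, y_k, y_1', \dots, y_k')\,dx \tag{62}$$

and we have $N$ ($N < k$) constraints

$$\phi_i(x, y_1, \dots, y_k, y_1', \dots, y_k') = 0, \qquad i = 1, \dots, N \tag{63}$$

such that the matrix $\dfrac{\partial \phi_i}{\partial y_j'}$ (or, in case the $\phi_i$ are independent of $y_1', \dots, y_k'$, then assume $\dfrac{\partial \phi_i}{\partial y_j}$), $i = 1, \dots, N$, $j = 1, \dots, k$, has maximal rank along a solution curve $\bar y_0$, then with

$$\hat F = F + \sum_{i=1}^{N} \lambda_i(x)\,\phi_i \tag{64}$$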
we have

$$\frac{d}{dx} \hat F_{y_j'} - \hat F_{y_j} = 0, \qquad j = 1, \dots, k \tag{65}$$

holding on $\bar y_0$, where the $\lambda_i(x)$ are $N$ multiplier functions.

As an application, consider the problem of finding the curve of minimum length between two points $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$ on a surface

$$\phi(x, y, z) = 0 . \tag{66}$$

Doing this in parametric form, our curves will be written as $x = x(t)$, $y = y(t)$, $z = z(t)$, with arc length

$$ds = \sqrt{\dot x^2 + \dot y^2 + \dot z^2}\,dt \tag{67}$$

where "$\dot{\ }$" denotes differentiation with respect to $t$. Then our problem is

$$\text{minimize } I = \int_{t_1}^{t_2} \sqrt{\dot x^2 + \dot y^2 + \dot z^2}\,dt \tag{68}$$

with fixed end points

$$x(t_i) = x_i, \qquad y(t_i) = y_i, \qquad z(t_i) = z_i, \qquad i = 1, 2 \tag{69}$$

subject to (66). For this problem, with

$$F = \sqrt{\dot x^2 + \dot y^2 + \dot z^2} \tag{70}$$

the Euler-Lagrange equations (65) are

$$\lambda\phi_x - \frac{d}{dt}\left( \frac{\dot x}{F} \right) = 0, \qquad \lambda\phi_y - \frac{d}{dt}\left( \frac{\dot y}{F} \right) = 0, \qquad \lambda\phi_z - \frac{d}{dt}\left( \frac{\dot z}{F} \right) = 0 . \tag{71}$$

Now noting that

$$F = \frac{ds}{dt} \tag{72}$$

where $s$ is arc length, then e.g.

$$\frac{d}{dt}\left( \frac{\dot x}{F} \right) = \frac{d}{dt}\left( \frac{dx}{dt} \Big/ \frac{ds}{dt} \right) = \frac{d}{dt}\left( \frac{dx}{dt} \cdot \frac{dt}{ds} \right) = \frac{d}{dt}\left( \frac{dx}{ds} \right) \tag{73}$$

and if we multiply this by $\dfrac{dt}{ds}$ we get

$$\frac{d}{dt}\left( \frac{\dot x}{F} \right) \frac{dt}{ds} = \frac{d}{dt}\left( \frac{dx}{ds} \right) \frac{dt}{ds} = \frac{d^2 x}{ds^2} \tag{74a}$$

and similarly

$$\frac{d}{dt}\left( \frac{\dot y}{F} \right) \frac{dt}{ds} = \frac{d^2 y}{ds^2} \tag{74b}$$

$$\frac{d}{dt}\left( \frac{\dot z}{F} \right) \frac{dt}{ds} = \frac{d^2 z}{ds^2} . \tag{74c}$$

Thus, multiplying each of the equations of (71) by $\dfrac{dt}{ds}$ gives, as shown above,

$$\frac{d^2 x}{ds^2} = \lambda\phi_x\,\frac{dt}{ds}, \qquad \frac{d^2 y}{ds^2} = \lambda\phi_y\,\frac{dt}{ds}, \qquad \frac{d^2 z}{ds^2} = \lambda\phi_z\,\frac{dt}{ds} \tag{75}$$

or then

$$\frac{d^2 x}{ds^2} : \frac{d^2 y}{ds^2} : \frac{d^2 z}{ds^2} = \phi_x : \phi_y : \phi_z \tag{76}$$

which has the geometric interpretation that the principal normal to the curve is parallel to the gradient to the surface (i.e. it is perpendicular to the surface).

If we do this in particular for geodesics on a sphere, so that (66) is

$$\phi(x, y, z) = x^2 + y^2 + z^2 - R^2 = 0 \tag{77}$$

where $R$ is the radius of the sphere, then (71) becomes (after solving for $\lambda$)

$$\frac{F \ddot x - \dot x \dot F}{2xF^2} = \frac{F \ddot y - \dot y \dot F}{2yF^2} = \frac{F \ddot z - \dot z \dot F}{2zF^2} . \tag{78}$$

Multiplying by $\dfrac{2F^2}{F}$ gives

$$\frac{\ddot x - \dot x\,\dfrac{\dot F}{F}}{x} = \frac{\ddot y - \dot y\,\dfrac{\dot F}{F}}{y} = \frac{\ddot z - \dot z\,\dfrac{\dot F}{F}}{z} \tag{79}$$

which after cross multiplying gives

$$y \ddot x - y \dot x\,\frac{\dot F}{F} = x \ddot y - x \dot y\,\frac{\dot F}{F} \quad \text{and} \quad y \ddot z - y \dot z\,\frac{\dot F}{F} = z \ddot y - z \dot y\,\frac{\dot F}{F} \tag{80}$$

and then

$$y \ddot x - x \ddot y = \frac{\dot F}{F}\,(y \dot x - x \dot y) \quad \text{and} \quad y \ddot z - z \ddot y = \frac{\dot F}{F}\,(y \dot z - z \dot y) \tag{81}$$

or

$$\frac{y \ddot x - x \ddot y}{y \dot x - x \dot y} = \frac{y \ddot z - z \ddot y}{y \dot z - z \dot y} = \frac{\dot F}{F} . \tag{82}$$

The first equality can be restated as

$$\frac{\dfrac{d}{dt}(y \dot x - x \dot y)}{y \dot x - x \dot y} = \frac{\dfrac{d}{dt}(y \dot z - z \dot y)}{y \dot z - z \dot y} \tag{83}$$

and integration using $\int \dfrac{du}{u} = \ln u + c$ gives

$$y \dot x - x \dot y = A\,(y \dot z - z \dot y) \tag{84}$$

where $A$ is a constant of integration. This gives

$$y\,(\dot x - A \dot z) = \dot y\,(x - A z) \tag{85}$$

or then

$$\frac{\dot x - A \dot z}{x - A z} = \frac{\dot y}{y} \tag{86}$$

so that another integration gives

$$x - A z = B y \tag{87}$$

where $B$ is a constant. This is the equation of a plane through the center of the sphere and containing the two end points of the problem. The intersection of this plane with the sphere is a great circle. This completes the problem solution.

Figure 18: Intersection of a plane with a sphere

Note that to cover all possible pairs of points we really have to do this problem in parametric form, since for example if we tried to express solutions in terms of $x$ as $x$, $y(x)$, $z(x)$, then any two points given in the $yz$ plane would not have a great circle path expressible in $x$.

Problems

1. A particle moves on the surface $\phi(x, y, z) = 0$ from the point $(x_1, y_1, z_1)$ to the point $(x_2, y_2, z_2)$ in the time $T$. Show that if it moves in such a way that the integral of its kinetic energy over that time is a minimum, its coordinates must also satisfy the equations
$$\frac{\ddot x}{\phi_x} = \frac{\ddot y}{\phi_y} = \frac{\ddot z}{\phi_z} .$$

2. Specialize problem 1 to the case when the particle moves on the unit sphere, from $(0, 0, 1)$ to $(0, 0, -1)$, in time $T$.

3. Determine the equation of the shortest arc in the first quadrant which passes through the points $(0, 0)$ and $(1, 0)$ and encloses a prescribed area $A$ with the $x$-axis, where $A \le \pi/8$.

4. Finish the example of Section 5.1.1 (the curve of length $L$ joining $(0, 0)$ and $(1, 0)$ that encloses the largest area). What if $L = \pi/2$?

5. Solve the following variational problem by finding extremals satisfying the conditions
$$J(y_1, y_2) = \int_0^{\pi/4} \left[ 4y_1^2 + y_2^2 + y_1' y_2' \right] dx,$$
$$y_1(0) = 1, \quad y_1\!\left(\frac{\pi}{4}\right) = 0, \quad y_2(0) = 0, \quad y_2\!\left(\frac{\pi}{4}\right) = 1 .$$

6. Solve the isoperimetric problem
$$J(y) = \int_0^1 \left[ (y')^2 + x^2 \right] dx, \qquad y(0) = y(1) = 0,$$
and
$$\int_0^1 y^2\,dx = 2 .$$

7. Derive a necessary condition for the isoperimetric problem: Minimize
$$I(y_1, y_2) = \int_a^b L(x, y_1, y_2, y_1', y_2')\,dx$$
subject to
$$\int_a^b G(x, y_1, y_2, y_1', y_2')\,dx = C$$
and
$$y_1(a) = A_1, \quad y_2(a) = A_2, \quad y_1(b) = B_1, \quad y_2(b) = B_2,$$
where $C$, $A_1$, $A_2$, $B_1$, and $B_2$ are constants.

8. Use the results of the previous problem to maximize
$$I(x, y) = \int_{t_0}^{t_1} (x \dot y - y \dot x)\,dt$$
subject to
$$\int_{t_0}^{t_1} \sqrt{\dot x^2 + \dot y^2}\,dt = 1 .$$
Show that $I$ represents twice the area enclosed by a curve with parametric equations $x = x(t)$, $y = y(t)$, and that the constraint fixes the length of the curve.

9. Find extremals of the isoperimetric problem
$$I(y) = \int_0^{\pi} (y')^2\,dx$$
subject to
$$y(0) = y(\pi) = 0, \qquad \int_0^{\pi} y^2\,dx = 1 .$$

CHAPTER 6

6 Integrals Involving More Than One Independent Variable

Up to now our integrals have been single integrals, i.e. integrals involving only one independent variable, which we have usually called $x$. There are problems in the calculus of variations where the integral involves more than one independent variable. For example, given some contour $C$ in $xyz$ space, find the surface $z = z(x, y)$ contained within $C$ that has minimum surface area. In this case we'd minimize the surface area integral $S =$
R 2 2 1 + zx + zy dy dx (1) where R is the region in the xy plane enclosed by the projection of C in the xy plane. In this problem there are two independent variables, x, y and one dependent variable, z. In order to see what conditions for a minimum hold when the integrand involves more than one independent variable, i.e. the Euler Lagrange equations in this more general case, let I be defined by I= F (x, y, z, zx , zy )dydx (2a)
where $x$, $y$ are the independent variables and $z$ is a continuously differentiable function of $x$, $y$ and is to be determined, subject to

$$z = g(s) \tag{2b}$$

on the boundary of $R$, where $s$ is arc length, $R$ is some closed region in the $xy$ plane, and $F$ has continuous first and second partial derivatives with respect to its arguments.

Doing the analogous steps that we did in the single integral problems, assume that $z_0 : z_0(x, y)$ is a solution to this problem and that $\eta(x, y)$ is a surface which is continuous with continuous first partials defined over $R$ and satisfies

$$\eta(x, y) = 0 \quad \text{on boundary of } R. \tag{3}$$

Create the family of surfaces

$$z(\epsilon) = z_0(x, y) + \epsilon\,\eta(x, y) \tag{4}$$

and evaluate $I$ on this family to obtain

$$I(\epsilon) = \iint_R F[x, y, z_0(x, y) + \epsilon\eta(x, y),\; z_{0x}(x, y) + \epsilon\eta_x(x, y),\; z_{0y}(x, y) + \epsilon\eta_y(x, y)]\,dx\,dy \tag{5}$$

Differentiating $I(\epsilon)$ with respect to $\epsilon$ at $\epsilon = 0$ and setting it to zero (why?) gives

$$0 = I'(0) = \iint_R [F_z \eta + F_{z_x} \eta_x + F_{z_y} \eta_y]\,dy\,dx \tag{6}$$

At this point, let's recall (from an earlier chapter) the line of reasoning followed for the single integral case. The expression corresponding to (6) was

$$0 = I'(0) = \int_{x_1}^{x_2} [F_y \eta + F_{y'} \eta']\,dx$$

We then rewrote this integrand (by using integration by parts) to involve only $\eta'$ terms instead of $\eta'$ and $\eta$ and used the fundamental lemma to get the Euler-Lagrange equation. As an alternate to this procedure we could have used a variant of the integration by parts formula used above and then written the integrand above in terms of $\eta$, with no $\eta'$ terms. Our next step would have been to use a modified form of the fundamental lemma introduced in chapter 4, involving $\eta$ but not $\eta'$ terms.

As a generalization to two variables of that modified form of the fundamental lemma we have

Lemma 1. If $\alpha(x, y)$ is continuous over a region $R$ in the $xy$ plane and if

$$\iint_R \alpha(x, y)\,\eta(x, y)\,dy\,dx = 0$$
for every continuous function $\eta(x, y)$ defined over $R$ and satisfying $\eta = 0$ on the boundary of $R$, then $\alpha(x, y) \equiv 0$ for all $(x, y)$ in $R$.

We will not prove this lemma since it is not pertinent to the discussion.

Returning now to our double integral and equation (6), the second term in the integrand there can be written

$$F_{z_x} \eta_x = \frac{\partial}{\partial x}[F_{z_x} \eta] - \eta\,\frac{\partial F_{z_x}}{\partial x} \tag{7}$$

This is analogous to the integration by parts formula used in the single integral problems. Now recalling Green's theorem

$$\iint_R (Q_x + P_y)\,dy\,dx = \oint_{\text{boundary of } R} (Q \cos\nu + P \sin\nu)\,ds \tag{8}$$

where $P$, $Q$ are functions of $x$, $y$; $\nu$ is the angle between the outward normal of the boundary curve of $R$ and the positive $x$-axis (see figure 19); $ds$ is the differential of arc length and the boundary integral is taken in a direction to keep $R$ on the left (positive).

Integrating (7) over $R$ and using (8) with $Q$ as $F_{z_x}\eta$ and $P \equiv 0$ gives:

$$\iint_R F_{z_x} \eta_x\,dy\,dx = \oint_{\text{boundary of } R} F_{z_x}\,\eta \cos\nu\,ds - \iint_R \eta\,\frac{\partial}{\partial x}(F_{z_x})\,dy\,dx \tag{9}$$

By performing a similar line of reasoning on the third term in the integrand of (6) (now with $Q \equiv 0$ and $P$ as $F_{z_y}\eta$ in (8)), (6) becomes

$$0 = I'(0) = \oint_{\text{boundary of } R} \eta\,[F_{z_x} \cos\nu + F_{z_y} \sin\nu]\,ds + \iint_R \eta\left[F_z - \frac{\partial}{\partial x}F_{z_x} - \frac{\partial}{\partial y}F_{z_y}\right]dy\,dx \tag{10}$$

Figure 19: Domain R with outward normal making an angle $\nu$ with x axis

Thus in the expression for the derivative of $I$ with respect to $\epsilon$ (at $\epsilon = 0$), we have written all terms involving $\eta$ and eliminated $\eta_x$ and $\eta_y$. This is entirely analogous to the single integral case outlined above.

Since (10) is true for all $\eta(x, y)$ which satisfy (3), the first integral on the right side of (10) is zero for such $\eta$, and then by lemma 1 the coefficient of $\eta$ in the second integral of (10) must be zero over $R$. That is

$$\frac{\partial}{\partial x}F_{z_x} + \frac{\partial}{\partial y}F_{z_y} - F_z = 0 \tag{11}$$

which constitutes the Euler-Lagrange equation for this problem.

As an application of the above results, consider the minimal surface problem started before. Thus minimize

$$S = \iint_R \sqrt{1 + z_x^2 + z_y^2}\;dy\,dx \tag{12}$$
where the surface is assumed representable in the form $z = z(x, y)$ with $z(x, y)$ specified on $C$, the given contour, and $R$ is the region in the $xy$ plane enclosed by the projection of $C$. Then (11) gives

$$\frac{\partial}{\partial x}\left[\frac{z_x}{\sqrt{1 + z_x^2 + z_y^2}}\right] + \frac{\partial}{\partial y}\left[\frac{z_y}{\sqrt{1 + z_x^2 + z_y^2}}\right] = 0 \tag{13}$$

which by algebra can be reduced to

$$(1 + z_y^2)\,z_{xx} - 2 z_x z_y z_{xy} + (1 + z_x^2)\,z_{yy} = 0 \tag{14}$$

Next, by setting

$$p = z_x \quad q = z_y \quad r = z_{xx} \quad u = z_{xy} \quad t = z_{yy} \tag{15}$$

then (14) becomes

$$(1 + q^2)\,r - 2pqu + (1 + p^2)\,t = 0 \tag{16}$$

Now from differential geometry the mean curvature, $M$, of the surface is

$$M \equiv \frac{Eg - 2Ff + Ge}{2(EG - F^2)} \tag{17}$$

where $E$, $F$, $G$ and $e$, $f$, $g$ are the coefficients of the first and second fundamental forms of the surface. For surfaces given by $z = z(x, y)$ one can show that

$$E = 1 + p^2 \quad F = pq \quad G = 1 + q^2 \tag{18a}$$

and

$$e = \frac{r}{\sqrt{1 + p^2 + q^2}} \quad f = \frac{u}{\sqrt{1 + p^2 + q^2}} \quad g = \frac{t}{\sqrt{1 + p^2 + q^2}} \tag{18b}$$

so that

$$M = \frac{(1 + p^2)\,t - 2upq + (1 + q^2)\,r}{2(1 + p^2 + q^2)^{3/2}} \tag{19}$$

So the numerator is the same as the left side of Euler's equation (16). Thus (16) says that the mean curvature of the minimal surface must be zero.

Problems

1. Find all minimal surfaces whose equations have the form $z = \phi(x) + \psi(y)$.

2. Derive the Euler equation and obtain the natural boundary conditions of the problem

$$\delta \iint_R \left[\alpha(x, y)\,u_x^2 + \beta(x, y)\,u_y^2 - \gamma(x, y)\,u^2\right]dx\,dy = 0.$$

In particular, show that if $\beta(x, y) = \alpha(x, y)$ the natural boundary condition takes the form

$$\alpha\,\frac{\partial u}{\partial n}\,\delta u = 0$$

where $\frac{\partial u}{\partial n}$ is the normal derivative of $u$.

3. Determine the natural boundary condition for the multiple integral problem

$$I(u) = \iint_R L(x, y, u, u_x, u_y)\,dx\,dy, \quad u \in C^2(R), \quad u \text{ unspecified on the boundary of } R$$

4. Find the Euler equations corresponding to the following functionals

a. $I(u) = \iint_R (x^2 u_x^2 + y^2 u_y^2)\,dx\,dy$

b. $I(u) = \iint_R (u_t^2 - c^2 u_x^2)\,dx\,dt$, $c$ is constant
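As a brief worked check of the minimal surface equation (14), and not an example taken from these notes, one can test a surface that is well known to be minimal: the helicoid $z = \arctan(y/x)$, considered here only for $x > 0$. The abbreviation $\rho^2 = x^2 + y^2$ is introduced just for this sketch.

```latex
\begin{aligned}
z_x &= -\frac{y}{\rho^2}, \qquad z_y = \frac{x}{\rho^2}, \qquad
z_{xx} = \frac{2xy}{\rho^4}, \qquad z_{yy} = -\frac{2xy}{\rho^4}, \qquad
z_{xy} = \frac{y^2 - x^2}{\rho^4}, \\[4pt]
(1 + z_y^2)\,z_{xx} &- 2 z_x z_y z_{xy} + (1 + z_x^2)\,z_{yy}
 = (z_{xx} + z_{yy}) + z_y^2 z_{xx} + z_x^2 z_{yy} - 2 z_x z_y z_{xy} \\
&= 0 + \frac{2x^3 y}{\rho^8} - \frac{2x y^3}{\rho^8} + \frac{2xy\,(y^2 - x^2)}{\rho^8} = 0 .
\end{aligned}
```

So the helicoid satisfies (14), and by (19) its mean curvature vanishes, as expected for a minimal surface.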
CHAPTER 7

7 Examples of Numerical Techniques

Now that we've seen some of the results of the Calculus of Variations, we can study the solution of some problems by numerical techniques.

All of the numerical techniques used in variational problems are iterative in nature, that is, they do not solve the problem in one step but rather proceed from an initial estimate (usually input by the user) and generate a sequence of succeeding estimates which converges to the answer.

The iterative procedures used are based upon a search from the present estimate to obtain a next estimate which has certain characteristics. The types of search procedures fall into two main classes called "Indirect Methods" and "Direct Methods." We will also look at a computer program for the variable end point case using indirect methods.

7.1 Indirect Methods

Indirect methods are those which seek a next estimate satisfying certain of the necessary conditions for a minimizing arc, established previously. Thus these methods for example seek arcs that satisfy the Euler equations. An example of an indirect method is Newton's method for variational problems. We will now discuss this method and provide a sample computer program written in Matlab for students to try on their computer. First we discuss the fixed end points case.

7.1.1 Fixed End Points

Consider the fixed endpoint problem of minimizing the integral

$$I = \int_{x_1}^{x_2} f(x, y, y')\,dx \tag{1}$$

among arcs satisfying

$$y(x_1) = Y_1, \qquad y(x_2) = Y_2. \tag{2}$$

The indirect method seeks to find an arc $y_0$ which satisfies the Euler equations and also satisfies the endpoint conditions (2). Writing the Euler equation

$$\frac{d}{dx} f_{y'} = f_y \tag{3}$$

and then differentiating, gives

$$f_{y'x} + f_{y'y}\,y' + f_{y'y'}\,y'' = f_y \tag{4}$$

(Note that we assumed that our solution will have a second derivative $y''$ at each point.)

In this procedure, the selection of $y(x)$ and $y'(x)$ for $x_1 < x \le x_2$ is dictated by (4) as soon as

$$y(x_1) \quad \text{and} \quad y'(x_1) \tag{5}$$

are selected. Thus each time we alter the initial conditions (5), we will get a different solution of (4). Since by the first part of (2) the value of $y(x_1)$ is fixed, the only variable left to satisfy the second part of (2) is $y'(x_1)$. Calling the initial estimate of the minimizing arc $y_1$, with left endpoint slope $y_1'(x_1)$, and denoting the value of the left endpoint slope for any other arc by $y'(x_1, c) = y_1'(x_1) + c$, the solutions to (4) are a family of arcs

$$y(c):\quad y(x, c) \qquad x_1 \le x \le x_2 \tag{6}$$

so that

$$y'(x_1, c) = y_1'(x_1) + c \quad \text{and} \quad y'(x_1, 0) = y_1'(x_1) \tag{7}$$

Differentiating the family (6) with respect to $c$ at $c = 0$ we obtain (since $\left.\frac{\partial y'(x_1, c)}{\partial c}\right|_{c=0} = 1$)

$$\eta(x) \equiv \left.\frac{\partial y(x, c)}{\partial c}\right|_{c=0} = \left.\frac{\partial y(x, c)}{\partial y'(x_1, c)}\right|_{c=0} \qquad x_1 \le x \le x_2 \tag{8}$$

where we have assigned the name $\eta(x)$ to $\left.\frac{\partial y(x, c)}{\partial c}\right|_{c=0}$.

In particular at $x = x_2$ we get $\left.\frac{\partial y(x_2, c)}{\partial y'(x_1, c)}\right|_{c=0}$ $(= \eta(x_2))$ as the change in the value of $y(x_2, 0)$ to a solution to (4) with each unit change in value of its left endpoint slope $y_1'(x_1)$ $(= y'(x_1, 0))$. Thus knowing $\eta(x_2)$, we can form the differential correction to $y_1'(x_1)$ as

$$\Delta y_1'(x_1) = \frac{Y_2 - y_1(x_2)}{\eta(x_2)} \tag{9}$$

and use this to iterate on $y_1'(x_1)$ to satisfy the second part of (2).

In order to obtain $\eta(x)$ we note that for any arc

$$y(c):\quad y(x, c), \quad y'(x, c) \qquad x_1 \le x \le x_2 \tag{10}$$

in our family (6), by (4) we must have

$$f_{y'x}(x, y(x, c), y'(x, c)) + f_{y'y}(x, y(x, c), y'(x, c))\,y'(x, c) + f_{y'y'}(x, y(x, c), y'(x, c))\,y''(x, c) = f_y(x, y(x, c), y'(x, c)) \tag{11}$$

Differentiating (11) with respect to $c$ at $c = 0$ and assuming that in our family, $y(x, c)$ is continuously differentiable up through third order in $x$, $c$ so that order of differentiation is immaterial and

$$\eta'(x) = \left.\frac{\partial^2 y(x, c)}{\partial x\,\partial c}\right|_{c=0} = \left.\frac{\partial^2 y(x, c)}{\partial c\,\partial x}\right|_{c=0} = \left.\frac{\partial y'(x, c)}{\partial c}\right|_{c=0}$$

and

$$\eta''(x) = \left.\frac{\partial^3 y(x, c)}{\partial x\,\partial x\,\partial c}\right|_{c=0} = \left.\frac{\partial^3 y(x, c)}{\partial c\,\partial x\,\partial x}\right|_{c=0}$$

results in

$$f_{y'xy}\,\eta + f_{y'xy'}\,\eta' + (f_{y'yy}\,\eta + f_{y'yy'}\,\eta')\,y_1' + f_{y'y}\,\eta' + (f_{y'y'y}\,\eta + f_{y'y'y'}\,\eta')\,y_1'' + f_{y'y'}\,\eta'' - f_{yy}\,\eta - f_{yy'}\,\eta' = 0 \tag{12}$$

where in (12) all arguments of the derivatives of $f$ are $x$, $y(x)$, $y'(x)$, i.e. along the arc $y_1$.

Equation (12) represents a second order linear differential equation for $\eta$. The initial conditions for its solution are obtained by differentiating (10) with respect to $c$ at $c = 0$. Thus

$$\left.\frac{\partial y(x_1, c)}{\partial c}\right|_{c=0} = \eta(x_1), \qquad 1 = \left.\frac{\partial y'(x_1, c)}{\partial y'(x_1, c)}\right|_{c=0} = \left.\frac{\partial y'(x_1, c)}{\partial c}\right|_{c=0} = \eta'(x_1) \tag{13}$$

where in the second equation in (13) we have recalled the definition of $c$. Then by the second equation of (13) we get that $\eta'(x_1) = 1$. Furthermore, by the first part of (2) we see that for any $c$, $y(x_1, c) = Y_1 = y_1(x_1)$ so that $\eta(x_1) = 0$. Thus we solve for $\eta(x)$ on $x_1 \le x \le x_2$ by solving the second order differential equation (12) with initial conditions

$$\eta(x_1) = 0 \qquad \eta'(x_1) = 1.$$

For example, suppose we wish to find the minimum of

$$I = \int_0^1 \left[(y')^2 + y^2\right]dx \tag{14}$$

$$y(0) = 0 \qquad y(1) = 1.$$

The function odeinput.m supplies the user with the boundary conditions, a guess for the initial slope, and the tolerance for convergence. All the derivatives of $f$ required in (4) are supplied in rhs2f.m.

    function [fy1y1,fy1y,fy,fy1x,t0,tf,y1,y2,rhs2,sg,tol] = odeinput
    % Defines the problem for solving the ode:
    %   (f_{y'y'} )y" + (f_{y'y})y' = f_y - f_{y'x}
    %
    % t0   start time
    % tf   end time
    % y1   left hand side boundary value
    % y2   right hand side boundary value
    % sg   initial guess for the slope
    % tol  tolerance e.g. 1e-4
    t0 = 0;
    tf = 1;
    y1 = 0;
    y2 = 1;
    sg = 1;
    tol = 1e-4;

    %rhs2f.m
    %
    function [rhs2]=rhs2f(t,x)
    %
    %input
    % t is the time
    % x is the solution vector (y,y')
    %
    % fy1y1 - fy'y' (2nd partial wrt y' y')
    % fy1y  - fy'y  (2nd partial wrt y' y)
    % fy    - fy    (1st partial wrt y)
    % fy1x  - fy'x  (2nd partial wrt y' x)
    %
    fy1y1 = 2;
    fy1y = 0;
    fy = 2*x(1);
    fy1x = 0;
    % Euler equation (4) solved for y": y" = rhs2(1)*y' + rhs2(2)
    rhs2=[-fy1y/fy1y1,(fy-fy1x)/fy1y1];

The main program is ode1.m, which uses a modified version of ode23 from Matlab. This modified version is called ode23m.m. Since we have to solve a second order ordinary differential equation, we have to transform it to a system of first order equations to be able to use ode23. To solve the $\eta$ equation, ode23 is used without any modifications. We also need the right hand sides of the 2 equations to be solved (one for $y$ and one for $\eta$). These are called odef.m and feta.m, respectively.
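The Newton iteration (9) for example (14) can also be condensed into a few lines. The following is only a sketch, not the notes' own program: it assumes a Matlab or Octave version that provides `ode45`, `odeset` and anonymous functions, and the names `rhs`, `opts` and `iter` are illustrative. For the integrand of (14), the Euler equation (4) reduces to $y'' = y$ and equation (12) reduces to $\eta'' = \eta$, so both are integrated together as one first order system.

```matlab
% Newton shooting sketch for (14): minimize int_0^1 (y'^2 + y^2) dx, y(0)=0, y(1)=1.
% State vector z = [y; y'; eta; eta'] with y'' = y (Euler) and eta'' = eta (eq. 12).
rhs  = @(t, z) [z(2); z(1); z(4); z(3)];
opts = odeset('RelTol', 1e-10, 'AbsTol', 1e-12);   % tight tolerances (illustrative)
y1 = 0;  y2 = 1;      % boundary values y(0), y(1)
sg  = 1;              % initial guess for the slope y'(0)
tol = 1e-8;           % stop when the slope correction is below tol
for iter = 1:20
    % initial conditions: y(0) = y1, y'(0) = sg, eta(0) = 0, eta'(0) = 1
    [~, z] = ode45(rhs, [0 1], [y1; sg; 0; 1], opts);
    correct = (y2 - z(end, 1)) / z(end, 3);   % differential correction (9)
    sg = sg + correct;
    if abs(correct) < tol, break, end
end
fprintf('converged slope y''(0) = %.8f after %d iterations\n', sg, iter);
```

Because the Euler equation of (14) is linear, the first correction already lands on the solution $y = \sinh x / \sinh 1$, whose initial slope is $1/\sinh 1$; the second pass only confirms convergence.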
All these programs (except the original ode23.m) are given here.

    % ode1.m
    % This program requires an edited version of ode23 called ode23m.m
    % Also required is odef.m, feta.m & odeinput.m
    % All changes to a problem should ONLY be entered in odeinput.m
    [fy1y1,fy1y,fy,fy1x,t0,tf,y1,y2,rhs2,sg,tol] = odeinput;
    correct = 100;
    while abs(correct) > tol
       %solve the initial value with the slope guessed
       x0=[y1,sg]';
       [t,x]=ode23m('odef',t0,tf,x0,y2,'rhs2f',tol,0);
       n1=size(x,1);
       yy(1:n1)=x(1:n1,1);
       plot(t,yy)
       % check the value at tf
       % change the value of the slope to match the solution
       eta0=[0,1]';
       [tt,eta]=ode23('feta',t0,tf,eta0);
       [nn1,nn2]=size(eta);
       correct=(y2-yy(n1))/eta(nn1);
       sg=sg+correct;
    end

    % msode23m.m
    %
    % This code is a modified version of MATLAB's ODE23 to find a numerically integ
    % solution to the input system of ODEs.
    %
    % This code is currently defined for the variable right hand endpoint defined b
    % following boundary conditions:
    %    y(0) = 1, y(x1) = Y1 = x2 - 1
    %
    % Lines which require modification by the user when solving different problems
    % (different boundary function) are identified by (user defined) at the right m
    %
    %
    function [tout, yout] = msode23m(ypfun, t0, tfinal, y0, rhs2f, tol, trace)
    %ODE23 Solve differential equations, low order method.
    % ODE23 integrates a system of ordinary differential equations using
    % 2nd and 3rd order Runge-Kutta formulas.
    % [T,Y] = ODE23('yprime', T0, Tfinal, Y0, Y2, rhs2) integrates the system
    % of ordinary differential equations described by the M-file YPRIME.M,
    % over the interval T0 to Tfinal, with initial conditions Y0.
    % [T, Y] = ODE23(F, T0, Tfinal, Y0, y2, rhs2, TOL, 1) uses tolerance TOL
    % and displays status while the integration proceeds.
    %
    % INPUT:
    % F      - String containing name of user-supplied problem description.
    %          Call: yprime = fun(t,y) where F = 'fun'.
    %          t      - Time (scalar).
    %          y      - Solution column-vector.
    %          yprime - Returned derivative column-vector;
    %                   yprime(i) = dy(i)/dt.
    % t0     - Initial value of t.
    % tfinal - Final value of t.
    % y0     - Initial value column-vector.
    % tol    - The desired accuracy. (Default: tol = 1.e-3).
    % trace  - If nonzero, each step is printed. (Default: trace = 0).
    %
    % OUTPUT:
    % T - Returned integration time points (column-vector).
    % Y - Returned solution, one solution column-vector per tout-value.
    %
    % The result can be displayed by: plot(tout, yout).
    %
    % See also ODE45, ODEDEMO.
    % C.B. Moler, 3-25-87, 8-26-91, 9-08-92.
    % Copyright (c) 1984-93 by The MathWorks, Inc.
    %
    % Initialization
    pow = 1/3;
    if nargin < 7, tol = 1.e-3; end
    if nargin < 8, trace = 0; end
    t = t0;
    hmax = (tfinal - t)/256;   %(user defined)
    % the denominator of this expression may require adjustment to
    % refine the number of subintervals over which to numerically
    % integrate - consider adjustment if infinite loops are encountered
    % within this routine and keep the value as a power of 2
    h = hmax/8;
    y = y0(:);
    chunk = 128;
    tout = zeros(chunk,1);
    yout = zeros(chunk,length(y));
    k = 1;
    tout(k) = t;
    yout(k,:) = y.';
    if trace
       clc, t, h, y
    end
    % The main loop
    while (t < tfinal) & (t + h > t)
       if t + h > tfinal, h = tfinal - t; end
       % Compute the slopes
       rhs2=feval(rhs2f,t,y); rhs2=rhs2(:);
       s1 = feval(ypfun, t, y,rhs2); s1 = s1(:);
       rhs2=feval(rhs2f,t+h,y+h*s1); rhs2=rhs2(:);
       s2 = feval(ypfun, t+h, y+h*s1,rhs2); s2 = s2(:);
       rhs2=feval(rhs2f,t+h/2,y+h*(s1+s2)/4); rhs2=rhs2(:);
       s3 = feval(ypfun, t+h/2, y+h*(s1+s2)/4,rhs2); s3 = s3(:);
       % Estimate the error and the acceptable error
       delta = norm(h*(s1 - 2*s3 + s2)/3,'inf');
       tau = tol*max(norm(y,'inf'),1.0);
       % Update the solution only if the error is acceptable
       if delta <= tau
          t = t + h;
          y = y + h*(s1 + 4*s3 + s2)/6;
          k = k+1;
          if k > length(tout)
             tout = [tout; zeros(chunk,1)];
             yout = [yout; zeros(chunk,length(y))];
          end
          tout(k) = t;
          yout(k,:) = y.';
       end
       if trace
          home, t, h, y
       end
       % Update the step size
       if delta ~= 0.0
          h = min(hmax, 0.9*h*(tau/delta)^pow);
       end
       varendpt = t - 1;   %(user defined)
       tolbnd = 1e-2;      %(user defined)
       % varendpt is the equation of the variable endpoint as defined by
       % the right hand side boundary curve where t is the independent variable
       % tolbnd is the desired tolerance for meeting the variable right
       % endpoint condition and may require some experimentation
       % The test below checks whether the endpoint of the solution curve comes
       % within a user specified tolerance of the right hand side boundary curve
       if abs(y(1) - varendpt) < tolbnd
          disp('hit boundary in msode23m');
          break;
       end
    end
    if (t < tfinal)
       disp('Singularity likely.')
       t
    end
    tout = tout(1:k);
    yout = yout(1:k,:);

    % feta.m
    function xdot=feta(t,x)
    % right hand side of equation (12) for eta; for example (14),
    % f = y'^2 + y^2, equation (12) reduces to eta" = eta
    xdot=[x(2),x(1)]';

    % odef.m
    function xdot=odef(t,x,rhs2)
    xdot=[x(2),rhs2(1)*x(2)+rhs2(2)]';

The solution obtained via Matlab is plotted in figure 20.

Figure 20: Solution of example given by (14)

7.1.2 Variable End Points

We have previously obtained necessary conditions that a solution arc to the variable end point problem had to satisfy. We consider now a computer program for a particular variable end point problem. We will use Newton's method to solve the problem: Thus consider the problem of minimizing the integral

$$I = \int_{x_1}^{x_2} f(x, y, y')\,dx = \int_{x_1}^{x_2} \left[(y')^2 + y^2\right]dx \tag{15}$$

among arcs satisfying

$$y(x_1) = 1, \qquad y(x_2) = Y_2 = x_2 - 1, \qquad x_1 = 0 \tag{16}$$

(where we use $Y_2(x)$ for the right hand boundary, which is a straight line).

Our procedure now will be much the same as for the fixed end point problem done by Newton's method in that we'll try to find a solution to the Euler equation. Also as before, all of our estimate arcs $y$ of solutions to this problem will have

$$y(x_1) = 1 \qquad x_1 = 0 \tag{17}$$

so that these items are fixed. However we note that in general this will not be the case, and in other problems we may be allowed to vary these quantities in our iterative procedure but will then be required to satisfy a transversality condition involving them.

Returning now to the problem at hand, we start with an initial estimate $y^1$, satisfying the left end point condition

$$y_1(x_1) = 1 \qquad x_1 = 0 \tag{18}$$

and the Euler equations

$$\frac{d}{dx} f_{y'} = f_y \quad \text{or} \quad f_{y'x} + f_{y'y}\,y' + f_{y'y'}\,y'' = f_y. \tag{19}$$

As for the fixed endpoint case, only $y'(x_1)$ is free to iterate with, so that setting

$$y'(x_1, c) = y_1'(x_1) + c \quad \text{with} \quad y(x_1, c) = y_1(x_1) = 1 \tag{20a}$$

and integrating the Euler equation we get the family

$$y(c):\quad y(x, c) \qquad x_1 \le x \le x_2(c), \quad -\delta \le c \le \delta \tag{20b}$$

(where only the right end value of $x$ varies with $c$ since the left end value is fixed) which satisfies the Euler equation and

$$y(x_1, c) = 1 \qquad x_1 = 0 \tag{21}$$

Thus we have on this family

$$f_{y'x}(x, y(x, c), y'(x, c)) + f_{y'y}(x, y(x, c), y'(x, c))\,y'(x, c) + f_{y'y'}(x, y(x, c), y'(x, c))\,y''(x, c) = f_y(x, y(x, c), y'(x, c)) \tag{22}$$

Proceeding as we did in the fixed endpoint case we differentiate (22) with respect to $c$ at $c = 0$. Thus

$$f_{y'xy}\,\eta + f_{y'xy'}\,\eta' + (f_{y'yy}\,\eta + f_{y'yy'}\,\eta')\,y_1' + f_{y'y}\,\eta' + (f_{y'y'y}\,\eta + f_{y'y'y'}\,\eta')\,y_1'' + f_{y'y'}\,\eta'' = f_{yy}\,\eta + f_{yy'}\,\eta' \tag{23}$$

which is the same equation for $\eta$ that we got in the fixed endpoint case.

The initial conditions for $\eta$, $\eta'$ are obtained from (20a) by differentiation (at $c = 0$).
In particular, differentiating the second part of (20a) yields

$$\eta(x_1) = \frac{\partial y(x_1, c)}{\partial c} = 0 \tag{24}$$

and differentiating the first part of (20a) gives

$$\eta'(x_1) = \frac{\partial y'(x_1, c)}{\partial c} = 1 \tag{25}$$

We have two conditions that our estimates have to satisfy at the right hand end, namely (with subscript $F$ denoting final values, e.g. $y_F(c) \equiv y(x_2(c), c)$)

$$y_F = Y_2 = x_2 - 1 \tag{26a}$$

and the transversality condition (3) of chapter 4, which applied to this problem yields

$$2y_F' - (y_F')^2 + y_F^2 = 0 \tag{26b}$$

Since $x_2$ is unrestricted we choose to stop integration for each estimate when (26a) is satisfied and there to evaluate the expression (26b), which we call TERM:

$$\mathrm{TERM} = 2y_F' - (y_F')^2 + y_F^2 \tag{27}$$

Then if TERM differs from 0 we compute as before how much to change $c$ by in order to reduce this value:

$$\Delta c = -\frac{\mathrm{TERM}}{\dfrac{d(\mathrm{TERM})}{dc}} \tag{28}$$

Next, differentiating (27) yields

$$\frac{d(\mathrm{TERM})}{dc} = 2\frac{dy_F'}{dc} - 2y_F'\frac{dy_F'}{dc} + 2y_F\frac{dy_F}{dc} = 2\left[(1 - y_F')\frac{dy_F'}{dc} + y_F\frac{dy_F}{dc}\right] \tag{29a}$$

where all arguments are along the arc $y^1$. Now concentrating on $y_F$, which is a function of $c$,

$$y_F(c) \equiv y(x_2(c), c) \tag{29b}$$

and differentiating with respect to $c$ at $c = 0$ (i.e. at $y^1$) yields

$$\frac{dy_F}{dc} = \frac{\partial y(x_2, c)}{\partial c} + y_F'\frac{dx_2}{dc} = \eta_F + y_F'\frac{dx_2}{dc} \tag{30a}$$

Doing analogous operations for $y_F'(c)$ yields, after differentiation with respect to $c$ at $c = 0$,

$$\frac{dy_F'}{dc} = \eta_F' + y_F''\frac{dx_2}{dc} \tag{30b}$$

Also, differentiating the middle constraint in (16), i.e. the equation $y_F(c) = Y_2 = x_2(c) - 1$, yields

$$\frac{dy_F}{dc} = \frac{dx_2}{dc} \tag{30c}$$

so that putting together (30a) and (30c) gives

$$\frac{dx_2}{dc} = \frac{dy_F}{dc} = \eta_F + y_F'\frac{dx_2}{dc} \tag{30d}$$

or then

$$\frac{dx_2}{dc}(1 - y_F') = \eta_F \tag{30e}$$

(compare to equation (6) of the appendix 4.2 but with $\epsilon = c$ and with $x_{0i} = x_2$ and $Y_i = x_2 - 1$ so that $\frac{dY_i}{dc} = \frac{dx_2}{dc}$) or then

$$\frac{dx_2}{dc} = \frac{\eta_F}{1 - y_F'} \tag{30f}$$

and then by (29a), (30b), (30f), (30a) we get

$$\frac{d(\mathrm{TERM})}{dc} = 2\left[(1 - y_F')\left(\eta_F' + y_F''\frac{\eta_F}{1 - y_F'}\right) + y_F\left(\eta_F + y_F'\frac{\eta_F}{1 - y_F'}\right)\right]$$

From the Euler equation we get $y_F'' = y_F$ so that after collecting terms

$$\frac{d(\mathrm{TERM})}{dc} = 2\left[(1 - y_F')\left(\eta_F' + \frac{y_F\,\eta_F}{1 - y_F'}\right) + y_F\,\eta_F + \frac{y_F\,y_F'\,\eta_F}{1 - y_F'}\right] = 2\left[\eta_F' - y_F'\,\eta_F' + y_F\,\eta_F + \frac{y_F\,\eta_F}{1 - y_F'}\right] \tag{31}$$

We have thus obtained all of the quantities necessary to compute the correction to $c$. The program for the present problem is then:

a) start with an initial estimate $y^1$ with

$$y_1(x_1) = 1, \qquad y_1'(x_1) = y'(x_1), \qquad x_1 = 0 \tag{32}$$

b) integrate the Euler equation for $y$ and the equation for $\eta = \frac{\partial y}{\partial c}$, stopping the integration when the end point condition $y(x_2) = Y_2$ is met

c) determine the error in the transversality condition and the correction in $y'(x_1)$ needed to correct it, $-\dfrac{\mathrm{TERM}}{d(\mathrm{TERM})/dc} = \Delta c$, where $\dfrac{d(\mathrm{TERM})}{dc}$ is computed using $\eta_F$

d) re-enter (b) with initial conditions $y(x_1) = 1$, $x_1 = 0$, $y'(x_1) = y_1'(x_1) + c$ and continue through the steps (b) and (c)

e) stop when the error is smaller than some arbitrary number $\epsilon$.

7.2 Direct Methods

Direct methods are those which seek a next estimate by working directly to reduce the functional value of the problem. Thus these methods search in directions which most quickly tend to reduce $I$. This is done by representing $I$ to various terms in its Taylor series and reducing the functions represented.

The direct method we will work with is the gradient method (also called method of steepest descent).
This method is based on representing the integral to be minimized as a linear functional of the arcs $y$ over which it is evaluated. The gradient method has an analogue for finite dimensional optimization problems and we will first describe this method in the finite dimensional case.

Thus suppose that we wish to minimize a function of the two dimensional vector $y = (y_1, y_2)$

$$f(y) \qquad (= (y_1)^2 + (y_2)^2 \text{ as an example}) \tag{33}$$

subject to the constraint

$$\ell(y) = 0 \qquad (y_1 + y_2 - 1 = 0 \text{ for our example}). \tag{34}$$

The gradient method says that starting with an initial estimate $y^1 = (y_{1,1}, y_{1,2})$, we first linearize $f$ as a function of the change vector $\eta = (\eta_1, \eta_2)$. Expanding $f$ to first order at the point $y^1$ gives

$$f(y^1 + \epsilon\eta) \approx f(y^1) + \epsilon f_{y_1}\eta_1 + \epsilon f_{y_2}\eta_2 \qquad (= y_{1,1}^2 + y_{1,2}^2 + 2\epsilon\,(y_{1,1}\eta_1 + y_{1,2}\eta_2)) \tag{35}$$

if $|\epsilon\eta|$ is small. Since $f(y^1)$ is constant, this allows us to consider $f$ as a function $F$ of only the change vector $\eta = (\eta_1, \eta_2)$

$$F(\eta) \equiv f(y^1) + \epsilon f_{y_1}\eta_1 + \epsilon f_{y_2}\eta_2 \tag{36}$$

where we don't list $\epsilon$ as an argument of $F$ since it will be determined independently of $\eta$ and we wish to concentrate on the determination of $\eta$ first.

We can similarly linearize the constraint $\ell$ and approximate it by the function $L$ which depends on $\eta$

$$L(\eta) \tag{37}$$

Now we wish to choose $\eta$ in order that $F(\eta)$ is as small as possible and also $L(\eta) = 0$ for a given step size length ($|\epsilon\eta| = ST$).

Recall from calculus that the maximum negative (positive) change in a function occurs if the change vector is opposite (in the same direction) to the gradient of the function. Now, the gradient of the function (36) considered as a function of $\eta$ is:

$$\nabla F = (F_{\eta_1}, F_{\eta_2}) = \epsilon\,(f_{y_1}, f_{y_2}) \qquad (= 2\epsilon\,(y_{1,1}, y_{1,2}) \text{ for our example}) \tag{38}$$

so that the fastest way to reduce $F$ requires that $\eta$ be oriented in the direction

$$\eta = (\eta_1, \eta_2) = -(f_{y_1}, f_{y_2}) \qquad (= -(2y_{1,1}, 2y_{1,2}) \text{ for our example}) \tag{39}$$

(note that this choice is independent of $\epsilon$). However since we have a constrained problem, our change should be restricted so that our new point $y^1 + \epsilon\eta$ satisfies

$$\ell(y^1 + \epsilon\eta) = 0 \tag{40a}$$

or by our approximation

$$L(\eta) = 0 \tag{40b}$$

according to the way we defined $L$. Thus we modify $\eta$ from (39) so that it satisfies (40b). These conditions establish the direction of $\eta$ and then the value of $\epsilon$ is established by the requirement that $|\epsilon\eta| = ST$, i.e. the change is equal to the step size.

The gradient procedure then computes the function $f(y^1 + \epsilon\eta)$, which should be smaller than $f(y^1)$, and repeats the above procedure at the point $y^2 = y^1 + \epsilon\eta$.

In the infinite dimensional case, the idea is the same, except that we are now dealing with a function of an infinite number of variables, namely arcs

$$y:\quad y(x) \qquad x_1 \le x \le x_2$$

and our change vector will have direction defined by the arc

$$\eta:\quad \eta(x) \qquad x_1 \le x \le x_2$$

Thus consider the case of minimizing the integral

$$I = \int_{x_1}^{x_2} f(x, y, y')\,dx \tag{41}$$

subject to the fixed endpoint conditions (the constraint on the problem)

$$y(x_1) = a \qquad y(x_2) = b \tag{42}$$

Following the procedure used in the finite dimensional case, we start with an initial arc $y^1$ and first linearize the integral $I$ by computing the first variation of $I$:

$$I' = \int_{x_1}^{x_2} [f_y\,\eta + f_{y'}\,\eta']\,dx \tag{43}$$

(Alternatively we can think of this as the derivative at $\epsilon = 0$ of $I$ evaluated on the family $y(\epsilon)$ created with the arc $\eta(x)$, but we don't set it $= 0$ (why?).)

Integrating (by parts) the first term in the integrand gives

$$\int_{x_1}^{x_2} f_y\,\eta(x)\,dx = \left[\eta(x)\int_{x_1}^{x} f_y\,ds\right]_{x_1}^{x_2} - \int_{x_1}^{x_2}\left[\eta'(x)\int_{x_1}^{x} f_y\,ds\right]dx \tag{44}$$

Since the variations $\eta(x)$ must (why?) satisfy

$$\eta(x_1) = \eta(x_2) = 0 \tag{45}$$

the first term on the right in (44) vanishes and plugging back into (43) gives

$$I' = \int_{x_1}^{x_2}\left[f_{y'} - \int_{x_1}^{x} f_y\,ds\right]\eta'(x)\,dx. \tag{46}$$

Corresponding to (36) we then approximate $I(y^1 + \epsilon\eta)$ by

$$\hat{I} = I(y_1) + \epsilon I' \tag{47}$$

where the second term is represented by (46).

Analogous to the finite dimensional case, we desire to select $\eta$, or equivalently $\eta(x_1)$ and $\eta'(x)$, $x_1 \le x \le x_2$, so that subject to a step size constraint, we have that $\hat{I}$ (and also approximately $I$) has minimum value at $y_1 + \epsilon\eta$. The stepsize constraint in this case looks like

$$\max_{x_1 \le x \le x_2} \epsilon\,|\eta'(x)| \tag{48}$$

(which represents the maximum change from $y_1'(x)$ along our arcs) and where $\epsilon$ will be selected according to the stepsize we wish. It can be shown formally that the best selection of $\eta'(x)$ at each $x$ is

$$\eta'(x) = -\left[f_{y'} - \int_{x_1}^{x} f_y\,ds\right] \qquad x_1 \le x \le x_2 \tag{49}$$

This heuristically can be considered the direction opposite to the gradient of $\hat{I}$ with respect to $\eta'(x)$ for each $x$.

However, as in the finite dimensional case, we must modify this change in order to satisfy the constraint (45). Defining the integral of $\eta'(x)$ of (49) from $x_1$ to $x$ as

$$M(x) = -\int_{x_1}^{x}\left[f_{y'} - \int_{x_1}^{\xi} f_y\,ds\right]d\xi \tag{50}$$

and defining the average of this as

$$M_{avg} = \frac{M(x_2)}{x_2 - x_1} = -\frac{1}{x_2 - x_1}\int_{x_1}^{x_2}\left[f_{y'} - \int_{x_1}^{x} f_y\,ds\right]dx \tag{51}$$

(note that $M(x_1) = 0$) then with $\eta'(x)$ defined as

$$\eta'(x) = -\left[f_{y'} - \int_{x_1}^{x} f_y\,ds\right] - M_{avg} = -\left[f_{y'} - \int_{x_1}^{x} f_y\,ds - \frac{1}{x_2 - x_1}\int_{x_1}^{x_2}\left[f_{y'} - \int_{x_1}^{x} f_y\,ds\right]dx\right] \tag{52}$$

we get

$$\eta(x_2) = \int_{x_1}^{x_2} \eta'(x)\,dx + \eta(x_1) = \eta(x_1) \tag{53}$$

which together with $\eta(x_1) = 0$ (which we can easily choose) yields $\eta$ which satisfies our constraint (45). Integrating (52) from $x_1$ to $x$ gives

$$\eta(x) = M(x) - (x - x_1)\,M_{avg}.$$

While this is not the only way to create $\eta$ satisfying (45), it can be formally shown that subject to (45), this $\eta(x)$ is the best selection to reduce $I$.

We now give a Matlab program that uses the direct method to minimize the integral $I$. This program requires the user to supply the functions $f$, $f_y$, $f_{y'}$. These functions are supplied in the finput.m file that follows.

    % This program solves problems of the form
    %
    %    Minimize I = integral from x1 to x2 of f(x,y,y') dx
    %
    % using the direct method. The user must supply the functions
    % F(x,y,y'), Fy(x,y,y') and Fy'(x,y,y') in a file called finput.m
    % See finput.m
    % By Jerry Miranda, 12/10/96
    % WARNING: Early termination may occur if N is TOO large or if epsilon is
    % TOO small. The count parameter is set at 50 and can be adjusted
    % below. Count is required to prevent runaway in the while loop or
    % excessive computation until this version is modified.
    clear
    C = 50;   % set the count parameter
    % Here we solve the problem min [ int(0->1) {y'^2 + y^2} dx]
    % s.t. y(0)=0, y(1)=1
    % setup boundary conditions                               (User define)
    x1 = 0; y1 = 0;   % y(x1) = y1
    x2 = 1; y2 = 1;   % y(x2) = y2
    % choose an epsilon and the number of points to iterate   (User define)
    epsilon = .01; N = 25;
    if x2-x1 == 0, error('x2 and x1 are the same'), end
    deltax = (x2-x1)/N;
    x = [x1:deltax:x2]';   % x is a col vector
    % make an initial guess for the solution arc:             (User define)
    % this is a function satisfying the boundary conditions
    ybar = (y2-y1)/(x2-x1)*(x-x1)+y1;
    % this is the derivative of a linear function ybar
    % if ybar is NOT linear,
    % we should use finite difference to approximate yprime
    yprime = ones(size(x))*(y2-y1)/(x2-x1);
    % calculate M(x2) and Mavg
    sum1=0;
    MM(1)=0;
    for i = 2:N+1
       sum2=0;
       for jj=1:i-1
          sum2= deltax*finput(x(jj),ybar(jj),yprime(jj),2)+sum2;
       end
       sum1 = deltax*(finput(x(i),ybar(i),yprime(i),3)-sum2)+sum1;
       MM(i)= - sum1;
    end
    Mx2 = - sum1;
    Mavg = Mx2/(x2-x1);
    % Calculate eta(x) for each x(i)
    for i = 1:N+1
       eta(i,1) = MM(i) - Mavg*(x(i)-x1);
    end
    % Calculate eta'(x) for each x(i), following (52)
    for i = 1:N+1
       sum2=0;
       for jj=1:i-1
          sum2= deltax*finput(x(jj),ybar(jj),yprime(jj),2)+sum2;
       end
       etaprm(i,1)= - finput(x(i),ybar(i),yprime(i),3)+sum2-Mavg;
    end
    % The main loop
    % We now compute Ihat = I(ybar1) + epsilon*I' and check to minimize Ihat
    % First Ihat
    sum1=0;
    for i = 1:N+1
       F = finput(x(i),ybar(i),yprime(i),1);
       sum1 = deltax*F+sum1;
    end
    Ihatnew = sum1;
    Ihatold = Ihatnew+1;
    count = 0;   %set counter to prevent runaway
    while (Ihatnew <= Ihatold) & (count <= C)
       count = count + 1;
       % Integrate to get I', following (46)
       sum1=0;
       for i = 1:N+1
          sum2 =0;
          for j = 1:i-1
             Fy = finput(x(j),ybar(j),yprime(j),2);
             sum2 = deltax*Fy+sum2;
          end
          Fyp = finput(x(i),ybar(i),yprime(i),3);
          sum1 = deltax*(Fyp-sum2)*etaprm(i)+sum1;
       end
       Iprm = sum1;
       % Integrate to get I
       sum1=0;
       for i = 1:N+1
          F = finput(x(i),ybar(i),yprime(i),1);
          sum1 = deltax*F+sum1;
       end
       I = sum1;
       Ihatnew = I + epsilon*Iprm;
       if Ihatnew < Ihatold
          % what delta is used
          ybar = ybar + epsilon*eta;
          Ihatold = Ihatnew;
          for ij=2:N+1
             yprime(ij)=(ybar(ij)-ybar(ij-1))/(x(ij)-x(ij-1));
          end
       end
    end
    % we now have our solution arc ybar
    plot(x,ybar), grid, xlabel('x'), ylabel('y')
    title('Solution y(x) using the direct method')

    function value = finput(x,y,yp,num)
    % function VALUE = FINPUT(x,y,yprime,num) returns the value of the
    % functions F(x,y,y'), Fy(x,y,y'), Fy'(x,y,y') at a given
    % x,y,yprime
    % for a given num,
    % num defines which function you want to evaluate.
    % 1 for F, 2 for Fy, 3 for Fy'.
    if nargin < 4, error('Four arguments are required'), end
    if (num < 1) | (num > 3)
       error('num must be between 1 and 3')
    end
    if num == 1, value = yp^2 + y^2; end   % F
    if num == 2, value = 2*y; end          % Fy
    if num == 3, value = 2*yp; end         % Fy'

Problems

1. Find the minimal arc $y(x)$ that solves, minimize

$$I = \int_0^{x_1} \left[y^2 - (y')^2\right]dx$$

a. Using the indirect (fixed end point) method when $x_1 = 1$.

b. Using the indirect (variable end point) method with $y(0) = 1$ and $y(x_1) = Y_1 = x_2 - \frac{\pi}{4}$.

2. Find the minimal arc $y(x)$ that solves, minimize

$$I = \int_0^1 \left[\frac{1}{2}(y')^2 + yy' + y' + y\right]dx$$

where $y(0) = 1$ and $y(1) = 2$.

3. Solve the problem, minimize

$$I = \int_0^{x_1} \left[y^2 - yy' + (y')^2\right]dx$$

a. Using the indirect (fixed end point) method when $x_1 = 1$.

b. Using the indirect (variable end point) method with $y(0) = 1$ and $y(x_1) = Y_1 = x_2 - 1$.

4. Solve for the minimal arc $y(x)$:

$$I = \int_0^1 \left[(y')^2 + 2xy + 2y\right]dx$$

where $y(0) = 0$ and $y(1) = 1$.

CHAPTER 8

8 The Rayleigh-Ritz Method

We now discuss another numerical method. In this technique, we approximate the variational problem and end up with a finite dimensional problem. So let us start with the problem of seeking a function $y = y_0(x)$ that extremizes an integral $I(y)$. Assume that we are able to approximate $y(x)$ by a linear combination of certain linearly independent functions of the type:

$$y(x) \approx \phi_0(x) + c_1\phi_1(x) + c_2\phi_2(x) + \cdots + c_N\phi_N(x) \tag{1}$$

where we will need to determine the constant coefficients $c_1, \ldots, c_N$. The selection of which approximating functions $\phi(x)$ to use is arbitrary except for the following considerations:

a) If the problem has boundary conditions such as fixed end points, then $\phi_0(x)$ is chosen to satisfy the problem's boundary conditions, and all other $\phi_i$ vanish at the boundary. (This should remind the reader of the method of eigenfunction expansion for inhomogeneous partial differential equations. The functions $\phi_i$ are not necessarily eigenfunctions though.)

b) In those problems where one knows something about the form of the solution, the functions $\phi_i(x)$ can be chosen so that the expression (1) will have that form.

By using (1) we essentially replace the variational problem of finding an arc $y(x)$ that extremizes $I$ by the problem of finding a set of constants $c_1, \ldots, c_N$ that extremizes $I(c_1, c_2, \ldots, c_N)$. We solve this problem as a finite dimensional one, i.e. by solving

$$\frac{\partial I}{\partial c_i} = 0, \qquad i = 1, \ldots, N \tag{2}$$

The procedure is to first determine an initial estimate of $c_1$ by the approximation $y \approx \phi_0 + c_1\phi_1$. Next, the approximation $y \approx \phi_0 + c_1\phi_1 + c_2\phi_2$ is used (with $c_1$ being redetermined). The process continues with $y \approx \phi_0 + c_1\phi_1 + c_2\phi_2 + c_3\phi_3$ as the third approximation and so on.
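To make condition (2) concrete, here is a short worked form, given only as a sketch under the assumption that the functional is a single integral $I(y) = \int_a^b f(x, y, y')\,dx$ (the limits $a$, $b$ and the shorthand $y_N$ are introduced just for this illustration). Substituting the approximation (1) and differentiating under the integral sign gives one equation per coefficient:

```latex
y_N(x) = \phi_0(x) + \sum_{j=1}^{N} c_j\,\phi_j(x), \qquad
\frac{\partial I}{\partial c_i}
 = \int_a^b \left[ f_y(x, y_N, y_N')\,\phi_i(x) + f_{y'}(x, y_N, y_N')\,\phi_i'(x) \right] dx = 0,
 \qquad i = 1, \ldots, N .
```

When $f$ is quadratic in $y$ and $y'$, these are $N$ linear algebraic equations for $c_1, \ldots, c_N$; otherwise they are in general nonlinear and must themselves be solved iteratively.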
At each stage the following two items are true:

a) At the $i$th stage, the coefficients $c_1, \ldots, c_{i-1}$ that have been previously determined are redetermined.

b) The approximation at the $i$th stage

$$y \approx \phi_0 + c_1\phi_1 + \cdots + c_i\phi_i \tag{3}$$

will be better or at least no worse than the approximation at the $(i-1)$st stage

$$y \approx \phi_0 + c_1\phi_1 + \cdots + c_{i-1}\phi_{i-1} \tag{4}$$

Convergence of the procedure means that

$$\lim_{N\to\infty}\left[\phi_0 + \sum_{i=1}^{N} c_i\phi_i\right] = y_0(x) \tag{5}$$

where $y_0(x)$ is the extremizing function. In many cases one uses a complete set of functions, e.g. polynomials or sines and cosines. A set of functions $\phi_i(x)$ $(i = 1, 2, \ldots)$ is called complete over $[a, b]$ if for each Riemann integrable function $f(x)$ and each $\epsilon > 0$ there is a number $N$ (depending on $\epsilon$) and constants $c_1, \ldots, c_N$ such that

$$\max_{[a,b]}\left[f - \sum_{i=1}^{N} c_i\phi_i\right]^2 < \epsilon \tag{6}$$

(Footnote: A function $f$ is Riemann integrable over $[a, b]$ if all points of discontinuity of $f$ can be enclosed in a set of subintervals, the sum of whose lengths can be made as small as desired.)

The above outlined procedure can be extended in a number of ways. For example, more than one independent variable may be involved. So for the problem of

$$\min I = \iint_R F(x, y, w, w_x, w_y)\,dy\,dx \tag{7}$$

subject to

$$w = h(s) \quad \text{on the boundary } \Gamma \text{ of } R \tag{8}$$

where $h(s)$ is some prescribed function and $s$ is the arc length along $\Gamma$, we write, analogous to (1),

$$w(x, y) = \phi_0(x, y) + c_1\phi_1(x, y) + \cdots + c_N\phi_N(x, y) \tag{9}$$

where $\phi_0$ satisfies (8) and the $\phi_i(x, y)$, $i = 1, 2, 3, \ldots$, are zero on $\Gamma$. We could also extend the procedure to functions involving higher derivatives, more independent variables, etc.

Example 1: Apply the procedure to the problem of

$$\min I = \int_0^1 \left[(y')^2 - y^2 - 2xy\right]dx \tag{10}$$

with the boundary conditions

$$y(0) = 1, \qquad y(1) = 2 \tag{11}$$

Solution: Since the boundary conditions are NOT homogeneous, we have to take $\phi_0$ to satisfy the boundary conditions, i.e. $\phi_0 = 1 + x$. We choose $\phi_1(x) = x(1 - x)$ since it should satisfy zero boundary conditions. Setting

$$y_1(x) = 1 + x + c_1 x(1 - x) \tag{12}$$

substituting (12) into (10) and performing the integration gives

$$I = -3 + \frac{3}{10}c_1^2 - \frac{2}{3}c_1 \tag{13}$$

Solving $\dfrac{\partial I}{\partial c_1} = 0$ implies that $c_1 = \dfrac{10}{9}$, or

$$y_1(x) = 1 + x + \frac{10}{9}x(1 - x) \tag{14}$$

as the first approximate solution. We note that $\dfrac{d^2 I}{dc_1^2}$ is positive at $c_1 = \dfrac{10}{9}$; thus we have minimized $I$ on the class of functions defined by (12). Continuing, we try

$$y_n(x) = 1 + x + x(1 - x)\left[c_1 + c_2 x + c_3 x^2 + \cdots + c_n x^{n-1}\right] \tag{15}$$

where $n = 2, 3, 4, \ldots$. The boundary conditions are satisfied by $y_n$ for all values of the $c_i$, that is, for $n = 1, 2, 3, \ldots$

$$y_n(0) = 1, \qquad y_n(1) = 2. \tag{16}$$

For $n = 2$, when

$$y_2(x) = 1 + x + x(1 - x)\left[c_1 + c_2 x\right] \tag{17}$$

is used and the integration carried out, we get two equations for the two parameters $c_1$ and $c_2$ when solving $\dfrac{\partial I}{\partial c_1} = \dfrac{\partial I}{\partial c_2} = 0$. This gives

$$c_1 = 0.9404, \qquad c_2 = 0.3415 \tag{18}$$

so that

$$y_2(x) = 1 + x + x(1 - x)\left[0.9404 + 0.3415x\right] \tag{19}$$

We compare the two approximations $y_1(x)$ and $y_2(x)$ with the exact solution

$$y = \cos x + \frac{3 - \cos 1}{\sin 1}\sin x - x \tag{20}$$

In the next figure we plot the exact solution and $y_1(x)$ and $y_2(x)$.
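The coefficients quoted in Example 1 can be checked with a short script. What follows is only a sketch, not part of the original notes: it assumes the polynomial trial functions (15), uses exact rational arithmetic from the Python standard library, and all helper names (pmul, pder, pint01, solve, rayleigh_ritz, y_approx, y_exact) are hypothetical. Since I is quadratic in the c_i, the conditions (2) reduce to a linear system, which is what the script assembles and solves.

```python
from fractions import Fraction
import math

# Polynomials are coefficient lists [a0, a1, ...] meaning a0 + a1*x + ...

def pmul(a, b):
    """Product of two polynomials."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def pder(a):
    """Derivative of a polynomial."""
    return [k * a[k] for k in range(1, len(a))] or [Fraction(0)]

def pint01(a):
    """Exact integral of a polynomial over [0, 1]."""
    return sum(c / (k + 1) for k, c in enumerate(a))

def solve(A, b):
    """Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for k in range(n):
        piv = next(r for r in range(k, n) if M[r][k] != 0)
        M[k], M[piv] = M[piv], M[k]
        for r in range(n):
            if r != k:
                f = M[r][k] / M[k][k]
                M[r] = [u - f * v for u, v in zip(M[r], M[k])]
    return [M[i][n] / M[i][i] for i in range(n)]

def rayleigh_ritz(n):
    """Coefficients c_1..c_n of the trial function (15) from dI/dc_i = 0."""
    one = Fraction(1)
    # phi_j = x^j - x^(j+1) = x(1 - x) x^(j-1), zero at x = 0 and x = 1
    phis = [[Fraction(0)] * j + [one, -one] for j in range(1, n + 1)]
    # Right-hand side uses phi0 + x = 1 + 2x; the phi0' term integrates to zero
    rhs_poly = [one, 2 * one]
    A = [[pint01(pmul(pder(p), pder(q))) - pint01(pmul(p, q)) for q in phis]
         for p in phis]
    b = [pint01(pmul(p, rhs_poly)) for p in phis]
    return solve(A, b)

def y_approx(c, x):
    """Evaluate 1 + x + x(1 - x)(c_1 + c_2 x + ...) at a float x."""
    return 1 + x + x * (1 - x) * sum(float(cj) * x**j for j, cj in enumerate(c))

def y_exact(x):
    """Exact solution (20)."""
    return math.cos(x) + (3 - math.cos(1)) / math.sin(1) * math.sin(x) - x

c_one, c_two = rayleigh_ritz(1), rayleigh_ritz(2)
print("n=1:", [str(c) for c in c_one])
print("n=2:", [round(float(c), 4) for c in c_two])
for x in (0.25, 0.5, 0.75):
    print(x, y_approx(c_one, x), y_approx(c_two, x), y_exact(x))
```

The last loop prints y1, y2 and the exact solution side by side at a few sample points, which gives a numerical counterpart to the plotted comparison.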
It can be seen that $y_1$ is already reasonably close.

(Footnote: The exact solution is obtained in this problem by noting that the variational problem (10) and (11) has the same solution as the boundary-value problem $y'' + y = -x$, $y(0) = 1$, $y(1) = 2$.)

[Figure 21: The exact solution (solid line) is compared with $\phi_0$ (dash dot), $y_1$ (dot) and $y_2$ (dash)]

8.1 Euler's Method of Finite Differences

Euler solved many variational problems by the method of finite differences. Suppose we want to extremize the integral

$$I(y) = \int_{x_0}^{x_{n+1}} F(x, y, y')\,dx \tag{21}$$

where $x_0$ and $x_{n+1}$ are given and the function $y$ is subject to the boundary conditions $y(x_0) = y_0$ and $y(x_{n+1}) = y_{n+1}$. Dividing the interval $[x_0, x_{n+1}]$ into $n + 1$ equal parts, the width of each piece is

$$\Delta x = \frac{x_{n+1} - x_0}{n + 1}$$

(See Figure 22.)

[Figure 22: Piecewise linear function]

Next, let $y_1, y_2, \ldots, y_n$ be the values of $y$ corresponding to $x_1 = x_0 + \Delta x$, $x_2 = x_0 + 2\Delta x$, $\ldots$, $x_n = x_0 + n\Delta x$ respectively. The associated values $y_1, y_2, \ldots, y_n$ are unknowns because the function which solves the problem is unknown as yet. The integral (21) (by definition) is the limit of a summation, and thus we may approximate the integral by a function of $n$ variables $\phi(y_1, y_2, \ldots, y_n)$:

$$\phi(y_1, y_2, \ldots, y_n) = \sum_{i=0}^{n} F\left(x_i, y_i, \frac{y_{i+1} - y_i}{\Delta x}\right)\Delta x \tag{22}$$

In this way the derivative is replaced by a difference quotient and the integral by a finite sum. The quantities $y_1, y_2, \ldots, y_n$ are determined so that $\phi$ solves

$$\frac{\partial\phi}{\partial y_i} = 0, \qquad i = 1, 2, \ldots, n \tag{23}$$

The terms of (22) which involve $y_i$ are

$$F\left(x_i, y_i, \frac{y_{i+1} - y_i}{\Delta x}\right)\Delta x \quad\text{and}\quad F\left(x_{i-1}, y_{i-1}, \frac{y_i - y_{i-1}}{\Delta x}\right)\Delta x \tag{24}$$

so that

$$\frac{\partial\phi}{\partial y_i} = F_{y_i}\left(x_i, y_i, \frac{y_{i+1} - y_i}{\Delta x}\right)\Delta x - F_{y'_i}\left(x_i, y_i, \frac{y_{i+1} - y_i}{\Delta x}\right) + F_{y'_{i-1}}\left(x_{i-1}, y_{i-1}, \frac{y_i - y_{i-1}}{\Delta x}\right) = 0 \tag{25}$$

where in (25) $y'_i = \dfrac{y_{i+1} - y_i}{\Delta x}$. With $\Delta y_i = y_{i+1} - y_i$, (25) is

$$F_{y_i}\left(x_i, y_i, \frac{\Delta y_i}{\Delta x}\right)\Delta x - \frac{\left[F_{y'_i}(x_i, y_i, y'_i) - F_{y'_{i-1}}(x_{i-1}, y_{i-1}, y'_{i-1})\right]}{\Delta x}\Delta x = 0. \tag{26}$$

This yields the following system of $n$ equations in $n$ unknowns:

$$F_{y_i}\left(x_i, y_i, \frac{\Delta y_i}{\Delta x}\right) - \frac{\Delta F_{y'_{i-1}}}{\Delta x} = 0, \qquad i = 1, 2, \ldots, n \tag{27}$$

where $\Delta F_{y'_{i-1}}$ denotes the bracketed difference in (26). Equation (27) is the finite difference version of the Euler equation. As $n \to \infty$, $\Delta x \to 0$ and (27) becomes the Euler equation.

Example: Find a polygonal line which approximates the extremizing curve for
$$\int_0^2 \left[(y')^2 + 6x^2 y\right]dx, \qquad y(0) = 0,\quad y(2) = 4 \tag{28}$$

Solution: With $n = 1$, $\Delta x = 1$, $x_0 = 0$, $x_1 = 1$, $x_2 = 2$, $y_0 = 0$, $y_2 = 4$, and $y_1 = y(x_1) = y(1)$ is unknown. We form (in accordance with (22))

$$\phi(y_1) = \sum_{i=0}^{1}\left[6x_i^2 y_i + \left(\frac{y_{i+1} - y_i}{\Delta x}\right)^2\right]\Delta x = 6x_0^2 y_0 + 6x_1^2 y_1 + \left(\frac{y_1 - y_0}{\Delta x}\right)^2 + \left(\frac{y_2 - y_1}{\Delta x}\right)^2$$

$$= 0 + 6y_1 + y_1^2 + (4 - y_1)^2 = 2y_1^2 - 2y_1 + 16 \tag{29}$$

Now

$$\frac{d\phi}{dy_1} = 4y_1 - 2 = 0 \;\Rightarrow\; y_1 = 1/2. \tag{30}$$

With $n = 2$, $\Delta x = \frac{2}{3}$, $x_0 = 0$, $x_1 = \frac{2}{3}$, $x_2 = \frac{4}{3}$, $x_3 = 2$, $y_0 = 0$, and $y_3 = 4$. The variables are $y_1 = y(\frac{2}{3})$ and $y_2 = y(\frac{4}{3})$. And then

$$\phi(y_1, y_2) = \frac{2}{3}\left[\frac{9}{2}y_1^2 + \frac{8}{3}y_1 + \frac{9}{2}y_2^2 - \frac{9}{2}y_1 y_2 - \frac{22}{3}y_2 + 36\right] \tag{31}$$

So then setting $\partial\phi/\partial y_1 = 0$ and $\partial\phi/\partial y_2 = 0$ gives

$$9y_1 - \frac{9}{2}y_2 + \frac{8}{3} = 0, \qquad -\frac{9}{2}y_1 + 9y_2 - \frac{22}{3} = 0$$

giving

$$y_1 = \frac{4}{27} \quad\text{and}\quad y_2 = \frac{24}{27}. \tag{32}$$

With $n = 3$ and $\Delta x = \frac{1}{2}$, we have

$$\phi(y_1, y_2, y_3) = \frac{1}{2}\left[8\left(y_1^2 + y_2^2 + y_3^2 - y_1 y_2 - y_2 y_3\right) + \frac{3}{2}y_1 + 6y_2 - \frac{37}{2}y_3 + 64\right]$$

and the partial derivatives with respect to $y_i$ give

$$16y_1 - 8y_2 + \frac{3}{2} = 0$$
$$16y_2 - 8(y_1 + y_3) + 6 = 0$$
$$16y_3 - 8y_2 - \frac{37}{2} = 0.$$

Solving gives

$$y_1 = \frac{1}{16}, \qquad y_2 = \frac{5}{16}, \qquad y_3 = \frac{21}{16}.$$

With $n = 4$ and $\Delta x = 0.4$, we get

$$y_1 = y(0.4) = 0.032, \quad y_2 = y(0.8) = 0.1408, \quad y_3 = y(1.2) = 0.5568, \quad y_4 = y(1.6) = 1.664$$

The Euler equation for this problem is

$$3x^2 - y'' = 0$$

which when solved with the boundary conditions gives $y = x^4/4$. If we compare the approximate values for $n = 1, 2, 3, 4$ with the exact result, the results are consistently more accurate for the larger values of the independent variable (i.e., closer to $x = 2$). But the relative errors are large for $x$ close to zero. These results are summarized in the table below and the figure.

| x   | y for n = 1 | y for n = 2 | y for n = 3 | y for n = 4 | y exact |
|-----|-------------|-------------|-------------|-------------|---------|
| 0.0 | 0.0         | 0.0000      | 0.0000      | 0           | 0.0000  |
| 0.4 |             |             |             | 0.0320      | 0.0064  |
| 0.5 |             |             | 0.0625      |             | 0.0156  |
| 2/3 |             | 0.1481      |             |             | 0.0494  |
| 0.8 |             |             |             | 0.1408      | 0.1024  |
| 1.0 | 0.5         |             | 0.3125      |             | 0.2500  |
| 1.2 |             |             |             | 0.5568      | 0.5184  |
| 4/3 |             | 0.8889      |             |             | 0.7901  |
| 1.5 |             |             | 1.3125      |             | 1.2656  |
| 1.6 |             |             |             | 1.6640      | 1.6384  |
| 2.0 | 4.0         | 4.0000      | 4.0000      | 4.0000      | 4.0000  |

[Figure 23: The exact solution (solid line) is compared with $y_1$ (dot), $y_2$ (dash dot), $y_3$ (dash) and $y_4$ (dot)]

Problems

1. Write a MAPLE program for the Rayleigh-Ritz approximation to minimize the integral

$$I = \int_0^1 \left[(y')^2 - y^2 - 2xy\right]dx, \qquad y(0) = 1,\quad y(1) = 2.$$

Plot the graph of $y_0$, $y_1$, $y_2$ and the exact solution.

2. Solve the same problem using finite differences.

CHAPTER 9

9 Hamilton's Principle

Variational principles enter into many physical real world problems, and in certain systems they can be shown to yield equations which are equivalent to Newton's equations of motion. Such a case is Hamilton's principle; the development is as follows. First let's assume Newton's equations of motion hold for a particle of mass $m$ with vector position $R$ acted on by force $F$. Thus

$$m\ddot{R} - F = 0 \tag{1}$$

(where the dot denotes time differentiation) is the differential equation which defines the motion of the particle. Consider the resulting path in time $R(t)$, $t_1 \le t \le t_2$, and let

$$\delta R(t), \qquad t_1 \le t \le t_2 \tag{2}$$

be a curve satisfying

$$\delta R(t_1) = 0, \qquad \delta R(t_2) = 0 \tag{3}$$

(this is $\eta$ in our previous chapters) and consider (see Figure 24) the varied path $R(t) + \delta R(t)$. In the $\delta$ notation, $\delta$ is often called the variation. Thus $\delta R$ is the variation in $R$. The variation is likened to the differential. So e.g. for a function $g(x, y)$, $\delta g = g_x\,\delta x + g_y\,\delta y$ is the variation in $g$ due to variations $\delta x$ in $x$ and $\delta y$ in $y$.
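The sense in which the variation behaves like a differential can be seen numerically. The snippet below is an illustrative sketch only: the sample function g and the chosen variations are hypothetical and not taken from the notes. It compares the actual change in g with the first-order expression g_x δx + g_y δy as the variations shrink.

```python
def g(x, y):
    """Hypothetical sample function g(x, y) = x^2 y + y^3."""
    return x**2 * y + y**3

def variation_g(x, y, dx, dy):
    """First-order variation g_x*dx + g_y*dy for the sample g above."""
    gx = 2 * x * y            # partial derivative of g with respect to x
    gy = x**2 + 3 * y**2      # partial derivative of g with respect to y
    return gx * dx + gy * dy

def gap(eps, x=1.0, y=2.0):
    """Difference between the actual change in g and its first-order variation."""
    dx, dy = 0.3 * eps, -0.5 * eps
    actual = g(x + dx, y + dy) - g(x, y)
    return abs(actual - variation_g(x, y, dx, dy))

for eps in (1e-1, 1e-2, 1e-3):
    print(eps, gap(eps))
```

The printed gap shrinks roughly like the square of eps, so the variation captures exactly the part of the change that is linear in δx and δy.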
[Figure 24: Paths made by the vectors $R$ and $R + \delta R$, showing the true path $R(t)$ and the varied path $R(t) + \delta R(t)$ between $t = t_1$ and $t = t_2$]

Now take the dot product between (1) and $\delta R$ and integrate from $t_1$ to $t_2$
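$$\int_{t_1}^{t_2}\left(m\ddot{R}\cdot\delta R - F\cdot\delta R\right)dt = 0 \tag{4}$$

If the first term is integrated by parts using

$$\frac{d}{dt}\left(\dot{R}\cdot\delta R\right) = \ddot{R}\cdot\delta R + \dot{R}\cdot\delta\dot{R} \tag{5}$$

where we've used $\dfrac{d}{dt}\delta R = \delta\dot{R}$, then by (3) this gives

$$\int_{t_1}^{t_2} m\ddot{R}\cdot\delta R\,dt = \left[m\dot{R}\cdot\delta R\right]_{t_1}^{t_2} - \int_{t_1}^{t_2}\left(m\dot{R}\cdot\delta\dot{R}\right)dt = -\int_{t_1}^{t_2}\left(m\dot{R}\cdot\delta\dot{R}\right)dt \tag{6}$$

since the bracketed term is zero at both ends.

(Footnote: In the notation of previous chapters, $\delta R$ would be called $\eta$ (and would have three components, one each for $x$, $y$, $z$); however the $\delta$ notation is widely used in presenting Hamilton's principle and has some advantage over the other.)

Now consider the variation (change) in the term $\dot{R}^2$ due to $\delta R$:

$$\delta\dot{R}^2 = \delta(\dot{R}\cdot\dot{R}) = 2\dot{R}\cdot\delta\dot{R} \tag{7}$$

so that (6) becomes

$$\int_{t_1}^{t_2} m\ddot{R}\cdot\delta R\,dt = -\int_{t_1}^{t_2}\delta\left(\frac{1}{2}m\dot{R}^2\right)dt = -\int_{t_1}^{t_2}\delta T\,dt \tag{8}$$

where $T = \frac{1}{2}m\dot{R}^2$ is the kinetic energy of the particle. Thus using (8) in (4) gives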
$$\int_{t_1}^{t_2}\left(\delta T + F\cdot\delta R\right)dt = 0 \tag{9}$$

This is the most general form of Hamilton's Principle for a single particle under a general force field. It says that the path of motion is such that along it, the integral of the variation $\delta T$ of the kinetic energy $T$ plus $F\cdot\delta R$ must be zero for variations in the path satisfying $\delta R(t_1) = 0$, $\delta R(t_2) = 0$. Conversely, from Hamilton's Principle we may deduce Newton's law as follows: From (9), the definition of $T$ and (7) comes

$$\int_{t_1}^{t_2}\left(m\dot{R}\cdot\delta\dot{R} + F\cdot\delta R\right)dt = 0 \tag{10}$$

Now by (5) and integration by parts, using (3), we get

$$\int_{t_1}^{t_2}\left(-m\ddot{R} + F\right)\cdot\delta R\,dt = 0 \tag{11}$$

And since this holds for all $\delta R(t)$ satisfying (3) we get (by a modified form of the fundamental lemma presented in chapter 4)

$$m\ddot{R} - F = 0 \tag{12}$$

which is Newton's law of motion.

If the force field is conservative, then there is a function of position, say $\phi(x, y, z)$ for motion in 3-space, such that

$$\phi_x = F_1, \quad \phi_y = F_2, \quad \phi_z = F_3 \qquad\text{or then}\qquad F = \nabla\phi \tag{13}$$

where $F_1, F_2, F_3$ are the components of the force $F$ along the $x, y, z$ axes respectively and $\nabla\phi$ is the gradient of $\phi$. Then

$$\delta\phi = \phi_x\,\delta x + \phi_y\,\delta y + \phi_z\,\delta z = F_1\,\delta x + F_2\,\delta y + F_3\,\delta z = F\cdot\delta R \tag{14}$$

where $\delta x, \delta y, \delta z$ are the components of $\delta R$. The function $\phi$ is called the force potential. The function $V$ defined by

$$V \equiv -\phi \tag{15a}$$

satisfies (by (14))

$$F\cdot\delta R = -\delta V \tag{15b}$$

This function is called the potential energy. For example in gravitational motion in a spherically symmetric field centered at the origin, the force on a particle is

$$F = -\frac{\mu}{|R|^2}\frac{R}{|R|} = -\frac{\mu R}{|R|^3} \tag{16}$$

(where $\mu$ is the gravitational constant). In this case

$$V = -\frac{\mu}{|R|} \tag{17}$$

and one can check that (13) and (15) hold. For conservative fields, by (15b), (9) becomes
$$\int_{t_1}^{t_2}\delta(T - V)\,dt = 0 \tag{18}$$

and this is Hamilton's principle for a conservative force field. Thus, Hamilton's principle for a conservative system states that the motion is such that the integral of the difference between kinetic and potential energies has zero variation. The difference $T - V$ is often called the Lagrangian $L$

$$L \equiv T - V \tag{19}$$

and in these terms, Hamilton's principle for a conservative system says that the motion is such that

$$\int_{t_1}^{t_2}\delta L\,dt = \delta\int_{t_1}^{t_2} L\,dt = 0 \tag{20}$$

(where $\delta\int_{t_1}^{t_2} L\,dt$ means the variation in the integral). Then in the usual way we can show that this means that the motion is such that

$$L_x - \frac{d}{dt}L_{\dot{x}} = 0, \qquad L_y - \frac{d}{dt}L_{\dot{y}} = 0, \qquad L_z - \frac{d}{dt}L_{\dot{z}} = 0 \tag{21}$$

i.e. the Euler equations hold for $L$.

For one dimensional $x(t)$, Euler's equation is

$$L_x - \frac{d}{dt}L_{\dot{x}} = 0.$$

Let's define a canonical momentum $p$ by $p = L_{\dot{x}}$; then if $L_{\dot{x}\dot{x}} \neq 0$, we can solve for $\dot{x}$ in terms of $t$, $x$, $p$:

$$\dot{x} = \psi(t, x, p).$$

Define the Hamiltonian $H$ by

$$H(t, x, p) = -L(t, x, \psi(t, x, p)) + p\,\psi(t, x, p).$$

In many systems $H$ is the total energy. Note that

$$\frac{\partial H}{\partial p} = \psi = \dot{x}, \qquad \frac{\partial H}{\partial x} = -\dot{p}.$$

These are known as Hamilton's equations.

Let's continue with our example of motion of a particle in a spherically symmetric gravitational force field. This is the situation for a satellite travelling about a spherical earth. We only assume that the force is along a line directed through the center of the Earth, where we put our origin. We do not assume any special form for $F$, such as the fact (which we know) that it depends only on the distance to the center of the Earth. Let $t_0$ be some instant and let $P$ be the plane containing the position and velocity vectors at $t_0$. For ease of presentation let $P$ be the horizontal $x, y$ plane pictured below (we can always orient our coordinates so that this is true). Form spherical coordinates $r$, $\theta$, $\lambda$ with $\theta$ measured in $P$ and $\lambda$ measured perpendicular to $P$ ($\lambda = 0$ for the current case). Then in these spherical coordinates

$$\delta R = e_r\,\delta r + e_\lambda\,r\,\delta\lambda + e_\theta\,r\cos\lambda\,\delta\theta \tag{22}$$

where $r = |R|$ is the distance from the origin to the particle and $e_r$, $e_\lambda$, $e_\theta$ are the unit direction vectors (see Figure 25) along which $R$ changes due to respective changes in the spherical coordinates. Then by (22)

$$F\cdot\delta R = F\cdot e_r\,\delta r + F\cdot e_\lambda\,r\,\delta\lambda + F\cdot e_\theta\,r\cos\lambda\,\delta\theta = F\cdot e_r\,\delta r \tag{23}$$

where the last equality follows since $F$ is along $e_r$ according to our assumption. Now using (15), (23) and the second part of (13) results in

$$F\cdot e_r\,\delta r = F\cdot\delta R = -\delta V = -\left[\frac{\partial V}{\partial r}\delta r + \frac{\partial V}{\partial\lambda}\delta\lambda + \frac{\partial V}{\partial\theta}\delta\theta\right] \tag{24}$$

and since $\delta r$, $\delta\lambda$, $\delta\theta$ are independent, this gives

$$\frac{\partial V}{\partial\lambda} = \frac{\partial V}{\partial\theta} = 0 \tag{25}$$

[Figure 25: Unit vectors $e_r$, $e_\theta$, and $e_\lambda$]

i.e.
$V$ is only a function of $r$ (actually we know $V = -\dfrac{\mu}{r}$ where $\mu$ is a constant),

$$V = V(r) \tag{26}$$

Now for our particle in the $xy$ plane in spherical coordinates, the velocity of our particle at $t_0$ has components along $e_r$, $e_\lambda$, $e_\theta$ respectively of

$$\dot{r}, \quad 0, \quad r\dot{\theta} \tag{27}$$

the second value being due to the velocity vector being in $P$ and $e_\lambda$ being perpendicular to $P$. Then the kinetic energy $T$ is

$$T = \frac{1}{2}m\left(\dot{r}^2 + r^2\dot{\theta}^2\right) \tag{28}$$

so that

$$L = T - V = \frac{1}{2}m\left(\dot{r}^2 + r^2\dot{\theta}^2\right) - V(r) \tag{29}$$

and the Euler equations (21) written in spherical coordinates are as follows. For $r$:

$$L_r - \frac{d}{dt}L_{\dot{r}} = 0 \;\Rightarrow\; -\frac{dV}{dr} + mr\dot{\theta}^2 - \frac{d}{dt}\left[m\dot{r}\right] = 0 \tag{30a}$$

or

$$m\ddot{r} = -\frac{dV}{dr} + mr\dot{\theta}^2$$

For $\lambda$ we see that since $\lambda$ does not enter in the problem at any time, the motion stays in the same plane. For $\theta$:

$$L_\theta - \frac{d}{dt}L_{\dot{\theta}} = 0 \;\Rightarrow\; \frac{d}{dt}\left(mr^2\dot{\theta}\right) = 0 \tag{30b}$$

Equation (30a) says that (since $-\dfrac{dV}{dr}$ is the force in the $r$ direction and that is the total force here) the acceleration in that direction is the sum of that due to the force and the centrifugal acceleration. Equation (30b) gives a first integral of the motion saying that

$$mr^2\dot{\theta} = \text{constant} \tag{31}$$

which says that the angular momentum is constant. This is actually a first integral of the motion resulting in a first order differential equation instead of a second order differential equation as in (30a).

Problems

1. If $\ell$ is not preassigned, show that the stationary functions corresponding to the problem

$$\delta\int_0^{\ell} y'^2\,dx = 0$$

subject to $y(0) = 2$, $y(\ell) = \sin\ell$ are of the form $y = 2 + 2x\cos\ell$, where $\ell$ satisfies the transcendental equation

$$2 + 2\ell\cos\ell - \sin\ell = 0.$$

Also verify that the smallest positive value of $\ell$ is between $\dfrac{\pi}{2}$ and $\dfrac{3\pi}{4}$.

2. If $\ell$ is not preassigned, show that the stationary functions corresponding to the problem

$$\delta\int_0^{\ell}\left[y'^2 + 4(y - \ell)\right]dx = 0$$

subject to $y(0) = 2$, $y(\ell) = \ell^2$ are of the form $y = x^2 - 2\dfrac{x}{\ell} + 2$, where $\ell$ is one of the two real roots of the quartic equation $2\ell^4 - \ell^3 - 1 = 0$.

3. A particle of mass $m$ is falling vertically, under the action of gravity. Suppose $y$ is distance measured downward and no resistive forces are present.

a. Show that the Lagrangian function is

$$L = T - V = m\left(\frac{1}{2}\dot{y}^2 + gy\right) + \text{constant}$$

and verify that the Euler equation of the problem

$$\delta\int_{t_1}^{t_2} L\,dt = 0$$

is the proper equation of motion of the particle.

b. Use the momentum $p = m\dot{y}$ to write the Hamiltonian of the system.

c. Show that

$$\frac{\partial H}{\partial p} = \psi = \dot{y}, \qquad \frac{\partial H}{\partial y} = -\dot{p}$$

4. A particle of mass $m$ is moving vertically, under the action of gravity and a resistive force numerically equal to $k$ times the displacement $y$ from an equilibrium position. Show that the equation of Hamilton's principle is of the form

$$\delta\int_{t_1}^{t_2}\left(\frac{1}{2}m\dot{y}^2 + mgy - \frac{1}{2}ky^2\right)dt = 0,$$

and obtain the Euler equation.

5. A particle of mass $m$ is moving vertically, under the action of gravity and a resistive force numerically equal to $c$ times its velocity $\dot{y}$. Show that the equation of Hamilton's principle is of the form

$$\delta\int_{t_1}^{t_2}\left(\frac{1}{2}m\dot{y}^2 + mgy\right)dt - \int_{t_1}^{t_2} c\,\dot{y}\,\delta y\,dt = 0.$$

6. Three masses are connected in series to a fixed support, by linear springs. Assuming that only the spring forces are present, show that the Lagrangian function of the system is

$$L = \frac{1}{2}\left[m_1\dot{x}_1^2 + m_2\dot{x}_2^2 + m_3\dot{x}_3^2 - k_1 x_1^2 - k_2(x_2 - x_1)^2 - k_3(x_3 - x_2)^2\right] + \text{constant},$$

where the $x_i$ represent displacements from equilibrium and the $k_i$ are the spring constants.

CHAPTER 10

10 Degrees of Freedom - Generalized Coordinates

If we have a system of particles whose configuration we're trying to describe, then usually, owing to the presence of constraints on the system, it is not required to give the actual coordinates of every particle. Suppose, for instance, that a rigid rod is moving in a plane; then it's sufficient to specify the $(x, y)$ coordinates of the mass center and the angle that the rod makes with the $x$-axis. From these, the position of all points of the rod may be found.
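As a small illustration of the rod example, the sketch below recovers the position of any point of the rod from the three numbers that describe its configuration. The function name rod_point, the parameter s (signed distance from the mass center along the rod) and the sample values are hypothetical and introduced only for this illustration.

```python
import math

def rod_point(xc, yc, theta, s):
    """Position of the rod point at signed distance s from the mass center.

    (xc, yc) is the mass center and theta the angle between the rod
    and the x-axis, in radians.
    """
    return (xc + s * math.cos(theta), yc + s * math.sin(theta))

# Hypothetical rod of length 2 with mass center (1, 1), inclined at 30 degrees
theta = math.radians(30)
left_end = rod_point(1.0, 1.0, theta, -1.0)
right_end = rod_point(1.0, 1.0, theta, 1.0)
print(left_end, right_end)
print(math.dist(left_end, right_end))  # rigidity: the end-to-end distance stays 2
```

Whatever values the three coordinates take, the distance between any two points of the rod is unchanged, which is why no further coordinates are needed.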
In order to describe the configuration of a system, we choose the smallest possible number of variables. For example, the configuration of a flywheel is specified by a single variable, namely the angle through which the wheel has rotated from its initial position. The independent variables needed to completely specify the configuration of a system are called generalized coordinates. The generalized coordinates are such that they can be varied arbitrarily and independently without violating the constraints. The number of generalized coordinates is called the number of degrees of freedom of a system. In the case of the flywheel the number of degrees of freedom is one, while the rigid bar in the plane has three generalized coordinates. A deformable body does not possess a finite number of generalized coordinates.

Consider a system of $N$ particles with masses $m_1, \ldots, m_N$ and position vectors $R_1, \ldots, R_N$, and with $F_i$ the resultant force on the $i$th particle. Then for each particle we have

$$m_i\ddot{R}_i - F_i = 0, \qquad i = 1, \ldots, N \tag{1}$$

For this system Hamilton's principle gives

$$\int_{t_1}^{t_2}\left[\delta T + \sum_{i=1}^{N} F_i\cdot\delta R_i\right]dt = 0 \tag{2}$$

where $T$ is the kinetic energy of the system of $N$ particles,

$$T = \sum_{i=1}^{N}\frac{1}{2}m_i\dot{R}_i^2. \tag{3a}$$

As before, if there is a potential energy $V(R_1, \ldots, R_N)$ then (2) becomes (see (15b) of Chapter 9)

$$\int_{t_1}^{t_2}\delta(T - V)\,dt = 0 \tag{3b}$$

Now each position vector consists of a triple of numbers so that the system configuration is determined by $3N$ numbers. Generally, the system is subject to constraints, which implies that not all of the $3N$ coordinates are independent. Suppose that there are $k$ constraints of the type

$$\phi_i(R_1, \ldots, R_N, t) = 0, \qquad i = 1, \ldots, k \tag{4}$$

which must be satisfied by the coordinates. Then there are only $3N - k = p$ independent coordinates, so that we can select a set of $p$ independent "generalized" coordinates $q_1, \ldots, q_p$ which define the configuration of the system. Therefore, the position vectors $R_i$ can be written

$$R_i = R_i(q_1, \ldots, q_p, t), \qquad i = 1, \ldots, N \tag{5}$$

and similarly for the velocities

$$\dot{R}_i = \dot{R}_i(q_1, \ldots, q_p, \dot{q}_1, \ldots, \dot{q}_p, t) \tag{6}$$

so that the kinetic energy is a function of $q_1, \ldots, q_p$, $\dot{q}_1, \ldots, \dot{q}_p$, $t$

$$T = T(q_1, \ldots, q_p, \dot{q}_1, \ldots, \dot{q}_p, t) \tag{7}$$

and also if there is a potential energy $V = V(R_1, \ldots, R_N)$ then

$$V = V(q_1, \ldots, q_p, t) \tag{8}$$

so that

$$L = T - V = L(q_1, \ldots, q_p, \dot{q}_1, \ldots, \dot{q}_p, t) \tag{9}$$

Then when using (3b), the independent variations are the $\delta q_i$ and not the $\delta R_i$, and the resultant Euler equations are

$$\frac{d}{dt}\frac{\partial}{\partial\dot{q}_i}(T - V) - \frac{\partial}{\partial q_i}(T - V) = 0 \tag{10}$$

Before we do examples let's review some material on the potential energy $V$ of a conservative system. We know from a previous chapter that with $F$ as the force, $F\cdot\delta R$ is the infinitesimal amount of work done by moving through the displacement $\delta R$, and also that

$$F\cdot\delta R = -\delta V \tag{11}$$

i.e. this infinitesimal work is equal to the negative of the infinitesimal change in the potential energy. For noninfinitesimal changes we integrate (thinking of $\delta R$ as $dR$ and similarly for $\delta V$) and
$$\int_{R_1}^{R} F\cdot\delta R = -\left[V(R) - V(R_1)\right] \tag{12}$$

and get the change in the potential energy between $R = R_1$ and $R$. For example, if a particle moves along the $y$-axis ($y$ positive down) in a constant gravity field from $y = y_1$ to $y$ (here $R = y$ is the variable defining the system configuration), then the change in potential energy is given by

$$\int_{R_1}^{R} F\cdot\delta R = \int_{y_1}^{y} mg\,\delta y = mgy - mgy_1 = -\left[V(y) - V(y_1)\right] = -V(y) + c \tag{13}$$

(thinking of $V(y_1)$ as a fixed reference value), giving the potential energy

$$V(y) = -mgy \tag{14}$$

(often the constant is included in $V(y)$).

Of course, if the components of $R$ are not all independent, and instead the $q$ variables are the independent ones, we could express everything in terms of those variables and have

$$\int_{q^1}^{q} F(q)\cdot\delta q = -\left[V(q) - V(q^1)\right] = -V(q) + c \tag{15}$$

(where $q$ is the vector whose components are the independent variables $q_i$).

Example: A simple pendulum consisting of a point mass is suspended by an inextensible string of length $\ell$. The configuration of this system is completely specified by the single angle $\theta$ between the deflected position and some reference position, say the equilibrium position where it's hanging vertically (Figure 26).
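[Figure 26: A simple pendulum]

Using (14) we see that the potential energy is $V = -mgy$. Here $y$ is determined by $\theta$ as

$$y = \ell\cos\theta \tag{16}$$

so that

$$V = -mg\ell\cos\theta \tag{17}$$

Since the velocity is $\ell\dot{\theta}$, the kinetic energy $T$ is

$$T = \frac{1}{2}m(\ell\dot{\theta})^2 \tag{18}$$

so that with $q = \theta$, (10) becomes here

$$\frac{d}{dt}\frac{\partial}{\partial\dot{\theta}}\left(\frac{1}{2}m(\ell\dot{\theta})^2 + mg\ell\cos\theta\right) - \frac{\partial}{\partial\theta}\left(\frac{1}{2}m(\ell\dot{\theta})^2 + mg\ell\cos\theta\right) = 0 \tag{19}$$

or then (after dividing by $\ell$)

$$m\ell\ddot{\theta} + mg\sin\theta = 0 \tag{20}$$

the equation of motion for the pendulum.

Example: Consider a compound pendulum as pictured in Figure 27.

[Figure 27: A compound pendulum]

In this problem we can't go directly to circular coordinates, since there is a different set of these for motions about the two pivot points and the motion of $m_2$ is the sum of these motions, so that we must add the two vectorially. We use rectangular coordinates with $(x_1, y_1)$ and $(x_2, y_2)$ as the coordinates of the two masses $m_1$ and $m_2$ respectively. Then in terms of the independent (generalized) coordinates $\theta_1$, $\theta_2$ we have (choosing $y$ negative down)

$$x_1 = \ell_1\sin\theta_1, \qquad y_1 = -\ell_1\cos\theta_1$$
$$x_2 = \ell_2\sin\theta_2 + \ell_1\sin\theta_1, \qquad y_2 = -\ell_2\cos\theta_2 - \ell_1\cos\theta_1 \tag{21}$$

The potential and kinetic energies, obtained analogously to what was done before, are

$$T_1 = \frac{1}{2}m_1\left(\dot{x}_1^2 + \dot{y}_1^2\right), \qquad T_2 = \frac{1}{2}m_2\left(\dot{x}_2^2 + \dot{y}_2^2\right) \tag{22a}$$

$$V_1 = m_1 g y_1, \qquad V_2 = m_2 g y_2 \tag{22b}$$

Plugging (21) into (22) and writing the Lagrangian gives

$$L = T - V = \frac{1}{2}(m_1 + m_2)\ell_1^2\dot{\theta}_1^2 + m_2\ell_1\ell_2\dot{\theta}_1\dot{\theta}_2\cos(\theta_1 - \theta_2) + \frac{1}{2}m_2\ell_2^2\dot{\theta}_2^2 + g(m_1 + m_2)\ell_1\cos\theta_1 + g m_2\ell_2\cos\theta_2 \tag{23}$$

Then

$$0 = \frac{d}{dt}\left[L_{\dot{\theta}_1}\right] - L_{\theta_1} = \frac{d}{dt}\left[(m_1 + m_2)\ell_1^2\dot{\theta}_1 + m_2\ell_1\ell_2\dot{\theta}_2\cos(\theta_1 - \theta_2)\right] + m_2\ell_1\ell_2\dot{\theta}_1\dot{\theta}_2\sin(\theta_1 - \theta_2) + g(m_1 + m_2)\ell_1\sin\theta_1 \tag{24}$$

and

$$0 = \frac{d}{dt}\left[L_{\dot{\theta}_2}\right] - L_{\theta_2} = \frac{d}{dt}\left[m_2\ell_1\ell_2\dot{\theta}_1\cos(\theta_1 - \theta_2) + m_2\ell_2^2\dot{\theta}_2\right] + g m_2\ell_2\sin\theta_2 - m_2\ell_1\ell_2\dot{\theta}_1\dot{\theta}_2\sin(\theta_1 - \theta_2) \tag{25}$$

Example: Consider the harmonic oscillator whose Lagrangian is given by

$$L(t, y, \dot{y}) = \frac{1}{2}m\dot{y}^2 - \frac{1}{2}ky^2.$$

The canonical momentum is $p = L_{\dot{y}} = m\dot{y}$. Solving for $\dot{y}$ gives $\dot{y} = \dfrac{p}{m}$, i.e. $\psi(t, y, p) = \dfrac{p}{m}$. Therefore the Hamiltonian is

$$H = -L + \dot{y}L_{\dot{y}} = \frac{p^2}{2m} + \frac{1}{2}ky^2$$

which is the sum of the kinetic and potential energy of the system. Differentiating,

$$\frac{\partial H}{\partial y} = ky, \qquad \frac{\partial H}{\partial p} = \frac{p}{m}$$

so Hamilton's equations are

$$\dot{y} = \frac{p}{m}, \qquad \dot{p} = -ky.$$

To solve these equations in the $yp$ plane (the so called phase plane) we divide them to get

$$\frac{\dot{p}}{\dot{y}} = \frac{dp}{dy} = \frac{-ky}{p/m}$$

or

$$p\,dp + km\,y\,dy = 0$$

After integration, we have

$$p^2 + kmy^2 = c, \qquad c \text{ constant,}$$

which is a family of ellipses in the $py$ plane. These represent trajectories that the system evolves along in position-momentum space. Fixing an initial value at time $t_0$ selects the particular trajectory that the system takes.

Problems

1. Consider the functional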
$$I(y) = \int_a^b\left[r(t)\dot{y}^2 + q(t)y^2\right]dt.$$

Find the Hamiltonian and write the canonical equations for the problem.

2. Give Hamilton's equations for

$$I(y) = \int_a^b\sqrt{(t^2 + y^2)(1 + \dot{y}^2)}\,dt.$$

Solve these equations and plot the solution curves in the $yp$ plane.

3. A particle of unit mass moves along the $y$ axis under the influence of a potential

$$f(y) = -\omega^2 y + ay^2$$

where $\omega$ and $a$ are positive constants.

a. What is the potential energy $V(y)$? Determine the Lagrangian and write down the equations of motion.

b. Find the Hamiltonian $H(y, p)$ and show it coincides with the total energy. Write down Hamilton's equations. Is energy conserved? Is momentum conserved?

c. If the total energy $E$ is $\dfrac{\omega^2}{10}$, and $y(0) = 0$, what is the initial velocity?

d. Sketch the possible phase trajectories in phase space when the total energy in the system is given by $E = \dfrac{\omega^6}{12a^2}$.

Hint: Note that $p = \pm\sqrt{2}\sqrt{E - V(y)}$. What is the value of $E$ above which an oscillatory solution is not possible?

4. A particle of mass $m$ moves in one dimension under the influence of the force $F(y, t) = ky^{-2}e^{t}$, where $y(t)$ is the position at time $t$, and $k$ is a constant. Formulate Hamilton's principle for this system, and derive the equations of motion. Determine the Hamiltonian and compare it with the total energy.

5. A Lagrangian has the form

$$L(x, y, y') = \frac{a^2}{12}(y')^4 + a(y')^2 G(y) - G(y)^2,$$

where $G$ is a given differentiable function. Find Euler's equation and a first integral.

6. If the Lagrangian $L$ does not depend explicitly on time $t$, prove that $H = $ constant, and if $L$ doesn't depend explicitly on a generalized coordinate $y$, prove that $p = $ constant.

7. Consider the differential equations

$$r^2\dot{\theta} = C, \qquad \ddot{r} - r\dot{\theta}^2 + \frac{k}{m}r^{-2} = 0$$

governing the motion of a mass in an inverse square central force field.

a. Show by the chain rule that

$$\dot{r} = Cr^{-2}\frac{dr}{d\theta}, \qquad \ddot{r} = C^2 r^{-4}\frac{d^2 r}{d\theta^2} - 2C^2 r^{-5}\left(\frac{dr}{d\theta}\right)^2$$

and therefore the differential equations may be written

$$\frac{d^2 r}{d\theta^2} - 2r^{-1}\left(\frac{dr}{d\theta}\right)^2 - r + \frac{k}{C^2 m}r^2 = 0$$

b. Let $r = u^{-1}$ and show that

$$\frac{d^2 u}{d\theta^2} + u = \frac{k}{C^2 m}.$$

c. Solve the differential equation in part b to obtain

$$u = r^{-1} = \frac{k}{C^2 m}\left(1 + \epsilon\cos(\theta - \theta_0)\right)$$

where $\epsilon$ and $\theta_0$ are constants of integration.

d. Show that elliptical orbits are obtained when $\epsilon < 1$.

CHAPTER 11

11 Integrals Involving Higher Derivatives
Consider the problem

$$\min I = \int_{x_1}^{x_2} F(x, y, y', y'')\,dx \tag{1}$$

among arcs

$$y:\; y(x), \qquad x_1 \le x \le x_2$$

with continuous derivatives up to and including the second derivative, that satisfy the boundary conditions

$$y(x_1) = A_0, \quad y'(x_1) = A_1, \quad y(x_2) = B_0, \quad y'(x_2) = B_1. \tag{2}$$

(Notice now that we also have conditions on $y'$ at the endpoints.) Let $y_0(x)$ be a solution to this problem. Let $\eta(x)$ be an arc on the interval $[x_1, x_2]$ with continuous derivatives up through the second order and satisfying

$$\eta(x_1) = 0, \quad \eta(x_2) = 0, \quad \eta'(x_1) = 0, \quad \eta'(x_2) = 0 \tag{3}$$

Create the family of arcs

$$y(\epsilon):\; y_0(x) + \epsilon\eta(x), \qquad x_1 \le x \le x_2, \quad -\delta < \epsilon < \delta \tag{4}$$

for some $\delta > 0$. Evaluate $I$ on this family to get

$$I(\epsilon) = \int_{x_1}^{x_2} F\left(x, y_0 + \epsilon\eta, y_0' + \epsilon\eta', y_0'' + \epsilon\eta''\right)dx \tag{5}$$

Differentiating at $\epsilon = 0$ and setting the derivative equal to zero gives

$$0 = I'(0) =$$
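$$\int_{x_1}^{x_2}\left[F_y\eta + F_{y'}\eta' + F_{y''}\eta''\right]dx \tag{6}$$

As done previously we can write the second term in the integrand of (6) as

$$F_{y'}\eta' = \frac{d}{dx}\left[F_{y'}\eta\right] - \eta\frac{d}{dx}\left[F_{y'}\right] \tag{7}$$

and the third term in the integrand can similarly be written

$$F_{y''}\eta'' = \frac{d}{dx}\left[F_{y''}\eta'\right] - \eta'\frac{d}{dx}\left[F_{y''}\right] \tag{8a}$$

and then also

$$\eta'\frac{d}{dx}\left[F_{y''}\right] = \frac{d}{dx}\left[\eta\frac{d}{dx}F_{y''}\right] - \eta\frac{d^2}{dx^2}F_{y''} \tag{8b}$$

Then using (7) and (8) in (6) gives

$$0 = I'(0) =$$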
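$$\int_{x_1}^{x_2}\left[F_y - \frac{d}{dx}F_{y'} + \frac{d^2}{dx^2}F_{y''}\right]\eta\,dx + \int_{x_1}^{x_2}\frac{d}{dx}\left[F_{y'}\eta + F_{y''}\eta' - \eta\frac{d}{dx}F_{y''}\right]dx \tag{9}$$

Evaluating the last integral of (9) gives

$$0 = I'(0) =$$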
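$$\int_{x_1}^{x_2}\left[F_y - \frac{d}{dx}F_{y'} + \frac{d^2}{dx^2}F_{y''}\right]\eta\,dx + \left[\left(F_{y'} - \frac{d}{dx}F_{y''}\right)\eta + F_{y''}\eta'\right]_{x_1}^{x_2} \tag{10}$$

By (3), the last (boundary) term is zero, leaving

$$0 = I'(0) =$$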
$$\int_{x_1}^{x_2}\left[F_y - \frac{d}{dx}F_{y'} + \frac{d^2}{dx^2}F_{y''}\right]\eta\,dx \tag{11}$$

which must be true for the full class of arcs $\eta$ described above. Then by a slight extension of the modified form of the fundamental lemma we get that the bracketed factor in the integrand of (11) must be zero, giving

$$F_y - \frac{d}{dx}F_{y'} + \frac{d^2}{dx^2}F_{y''} = 0 \tag{12}$$

as the Euler equation for this problem. Note that the integrated form of the Euler equation (12), obtained by integrating (12) twice, is

$$\int_{x_1}^{x}\left[F_{y'} - \int_{x_1}^{s} F_y\,d\sigma\right]ds + c_1 x + c_2 = F_{y''} \tag{13}$$

where $c_1$, $c_2$ are constants.

As a generalization of this, it can be shown by a directly analogous argument that if the integrand involves the first $N$ derivatives of $y$, so that the corresponding problem would be

$$\min I =$$
$$\int_{x_1}^{x_2} F\left(x, y, y', y'', \ldots, y^{(N)}\right)dx \tag{14}$$

among arcs with continuous $N$th derivatives on $[x_1, x_2]$ and satisfying

$$y(x_1) = A_0, \quad y'(x_1) = A_1, \quad \ldots, \quad y^{(N-1)}(x_1) = A_{N-1}$$
$$y(x_2) = B_0, \quad y'(x_2) = B_1, \quad \ldots, \quad y^{(N-1)}(x_2) = B_{N-1} \tag{15}$$

then the Euler equation is

$$F_y - \frac{d}{dx}F_{y'} + \frac{d^2}{dx^2}F_{y''} - \cdots + (-1)^N\frac{d^N}{dx^N}F_{y^{(N)}} = 0 \tag{16}$$

a differential equation of order $2N$. This result is summarized in

Theorem 1 Consider the problem defined by (14), (15) and the accompanying remarks. Then a solution arc must satisfy (16).

(Footnote to the extension of the fundamental lemma used above: to take into consideration that $\eta''$ exists and is continuous.)

As an application consider the problem
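$$\min I = \int_0^{\pi/4}\left[16y^2 - (y'')^2 + \phi(x)\right]dx \tag{17}$$

(where $\phi$ is an arbitrary continuous function of $x$), among arcs possessing continuous derivatives through second order and satisfying

$$y(0) = 0, \quad y(\pi/4) = 1, \quad y'(0) = 1, \quad y'(\pi/4) = 0 \tag{18}$$

Applying the Euler equation gives

$$0 = F_y - \frac{d}{dx}F_{y'} + \frac{d^2}{dx^2}F_{y''} = 32y - 2y^{(4)} \tag{19}$$

or

$$y^{(4)} - 16y = 0 \tag{20}$$

The roots of the characteristic equation

$$D^4 - 16 = 0 \tag{21}$$

are

$$D = \pm 2,\ \pm 2i \tag{22}$$

so that the solution is

$$y = c_1 e^{2x} + c_2 e^{-2x} + c_3\cos 2x + c_4\sin 2x \tag{23}$$

and then

$$y' = 2c_1 e^{2x} - 2c_2 e^{-2x} - 2c_3\sin 2x + 2c_4\cos 2x \tag{24}$$

Applying the conditions (18) gives

$$0 = y(0) = c_1 + c_2 + c_3 \tag{25a}$$
$$1 = y(\pi/4) = c_1 e^{\pi/2} + c_2 e^{-\pi/2} + c_4 \tag{25b}$$
$$1 = y'(0) = 2c_1 - 2c_2 + 2c_4 \tag{25c}$$
$$0 = y'(\pi/4) = 2c_1 e^{\pi/2} - 2c_2 e^{-\pi/2} - 2c_3 \tag{25d}$$

Solving this system will yield the solution of the problem.

We discuss now the Newton method to solve this problem. Analogous to our procedure for Newton method applications to problems involving terms in $x$, $y$, $y'$, we start with an initial estimate $y_1$ satisfying the left hand conditions of (2) and the Euler equation

$$F_y - \frac{d}{dx}F_{y'} + \frac{d^2}{dx^2}F_{y''} = 0 \tag{26}$$

which after differentiating out (with respect to $x$) gives a fourth order equation in $y$. Noting that

$$\frac{d}{dx}F_{y'} = F_{y'x} + F_{y'y}y' + F_{y'y'}y'' + F_{y'y''}y''' \tag{27a}$$

and

$$\frac{d}{dx}F_{y''} = F_{y''x} + F_{y''y}y' + F_{y''y'}y'' + F_{y''y''}y''' \tag{27b}$$

then the Euler equation for this problem is

$$\begin{aligned}
&F_y - F_{y'x} - F_{y'y}y' - F_{y'y'}y'' - F_{y'y''}y''' \\
&\quad + F_{y''xx} + F_{y''xy}y' + F_{y''xy'}y'' + F_{y''xy''}y''' \\
&\quad + \left[F_{y''yx} + F_{y''yy}y' + F_{y''yy'}y'' + F_{y''yy''}y'''\right]y' + F_{y''y}y'' \\
&\quad + \left[F_{y''y'x} + F_{y''y'y}y' + F_{y''y'y'}y'' + F_{y''y'y''}y'''\right]y'' + F_{y''y'}y''' \\
&\quad + \left[F_{y''y''x} + F_{y''y''y}y' + F_{y''y''y'}y'' + F_{y''y''y''}y'''\right]y''' + F_{y''y''}y^{(4)} = 0
\end{aligned} \tag{28}$$

where all but the first five terms come from differentiating (27b) with respect to $x$, and where these have been grouped so that the second line of (28) comes from differentiating the first term on the right in (27b) and each succeeding line comes from differentiating another term on the right of (27b). Calling this equation E4 (fourth order Euler equation), we define a two parameter family of curves $y(c_1, c_2)$ which satisfies E4 and the left hand conditions of (2) and also has initial values of $y''(x_0)$ and $y'''(x_0)$ as

$$y''(x_0) = y_1''(x_0) + c_1, \qquad y'''(x_0) = y_1'''(x_0) + c_2 \tag{29}$$

Notice that we have two conditions to satisfy (namely the two right hand conditions of (2)) and we have two parameters to do it with. As before, we set

$$\eta_i(x) \equiv \frac{\partial y(x)}{\partial c_i}, \qquad x_0 \le x \le x_1 \tag{30}$$

which now means

$$\eta_1(x) = \frac{\partial y(x)}{\partial c_1} = \frac{\partial y(x)}{\partial y''(x_0)} \quad\text{and}\quad \eta_2(x) = \frac{\partial y(x)}{\partial c_2} = \frac{\partial y(x)}{\partial y'''(x_0)} \tag{31}$$

We obtain the differential equation that $\eta_i(x)$ has to satisfy by differentiating (28) (evaluated on the curve $y(c)$) with respect to $c_i$. By examining (28) we see that the differential equation for $\eta_i$ will be fourth order (since we only have terms up through fourth order in $y$ in (28), and differentiating term by term and using

$$\eta_i = \frac{\partial y}{\partial c_i}, \quad \eta_i' = \frac{\partial y'}{\partial c_i}, \quad \eta_i'' = \frac{\partial y''}{\partial c_i}, \quad \eta_i''' = \frac{\partial y'''}{\partial c_i}, \quad \eta_i^{(4)} = \frac{\partial y^{(4)}}{\partial c_i} \tag{32}$$

we get a fourth order equation for $\eta_i$, $i = 1, 2$). The remainder of the program consists in determining initial conditions for $\eta_i$, $\eta_i'$, $\eta_i''$, $\eta_i'''$, $i = 1, 2$, and setting up an iteration scheme to achieve the right-hand end point conditions of (2).

Problems

1. Derive the Euler equation of the problem

$$\delta\int_{x_1}^{x_2} F(x, y, y', y'')\,dx = 0$$

in the form

$$\frac{d^2}{dx^2}\left(\frac{\partial F}{\partial y''}\right) - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) + \frac{\partial F}{\partial y} = 0,$$

and show that the associated natural boundary conditions are

$$\left[\left(\frac{d}{dx}\frac{\partial F}{\partial y''} - \frac{\partial F}{\partial y'}\right)\delta y\right]_{x_1}^{x_2} = 0 \quad\text{and}\quad \left[\frac{\partial F}{\partial y''}\,\delta y'\right]_{x_1}^{x_2} = 0.$$

2. Derive the Euler equation of the problem

$$\delta\int_{x_1}^{x_2}\int_{y_1}^{y_2} F(x, y, u, u_x, u_y, u_{xx}, u_{xy}, u_{yy})\,dx\,dy = 0,$$

where $x_1$, $x_2$, $y_1$, and $y_2$ are constants, in the form

$$\frac{\partial^2}{\partial x^2}\left(\frac{\partial F}{\partial u_{xx}}\right) + \frac{\partial^2}{\partial x\,\partial y}\left(\frac{\partial F}{\partial u_{xy}}\right) + \frac{\partial^2}{\partial y^2}\left(\frac{\partial F}{\partial u_{yy}}\right) - \frac{\partial}{\partial x}\left(\frac{\partial F}{\partial u_x}\right) - \frac{\partial}{\partial y}\left(\frac{\partial F}{\partial u_y}\right) + \frac{\partial F}{\partial u} = 0,$$

and show that the associated natural boundary conditions are then

$$\left[\left(\frac{\partial}{\partial x}\frac{\partial F}{\partial u_{xx}} + \frac{\partial}{\partial y}\frac{\partial F}{\partial u_{xy}} - \frac{\partial F}{\partial u_x}\right)\delta u\right]_{x_1}^{x_2} = 0, \qquad \left[\frac{\partial F}{\partial u_{xx}}\,\delta u_x\right]_{x_1}^{x_2} = 0,$$

and

$$\left[\left(\frac{\partial}{\partial y}\frac{\partial F}{\partial u_{yy}} + \frac{\partial}{\partial x}\frac{\partial F}{\partial u_{xy}} - \frac{\partial F}{\partial u_y}\right)\delta u\right]_{y_1}^{y_2} = 0, \qquad \left[\frac{\partial F}{\partial u_{yy}}\,\delta u_y\right]_{y_1}^{y_2} = 0.$$

3. Specialize the results of problem 2 in the case of the problem

$$\delta\int_{x_1}^{x_2}\int_{y_1}^{y_2}\left[\frac{1}{2}u_{xx}^2 + \frac{1}{2}u_{yy}^2 + \alpha u_{xx}u_{yy} + (1 - \alpha)u_{xy}^2\right]dx\,dy = 0,$$

where $\alpha$ is a constant.

Hint: Show that the Euler equation is $\nabla^4 u = 0$, regardless of the value of $\alpha$, but the natural boundary conditions depend on $\alpha$.

4. Specialize the results of problem 1 in the case

$$F = a(x)(y'')^2 - b(x)(y')^2 + c(x)y^2.$$

5. Find the extremals

a. $I(y) = \displaystyle\int_0^1\left(yy' + (y'')^2\right)dx$, $\quad y(0) = 0$, $y'(0) = 1$, $y(1) = 2$, $y'(1) = 4$

b. $I(y) = \displaystyle\int_0^{\infty}\left(y^2 + (y')^2 + (y'' + y')^2\right)dx$, $\quad y(0) = 1$, $y'(0) = 2$, $y(\infty) = 0$, $y'(\infty) = 0$.

6. Find the extremals for the functional

$$I(y) = \int_a^b\left(y^2 + 2\dot{y}^2 + \ddot{y}^2\right)dt.$$

7. Solve the following variational problem by finding extremals satisfying the given conditions

$$I(y) = \int_0^1\left(1 + (y'')^2\right)dx, \qquad y(0) = 0,\ y'(0) = 1,\ y(1) = 1,\ y'(1) = 1.$$

CHAPTER 12

12 Piecewise Smooth Arcs and Additional Results

Thus far we've only considered problems defined over a class of smooth arcs and hence have only permitted smooth solutions, i.e. solutions with a continuous derivative $y'(x)$. However one can find examples of variational problems which have no solution in the class of smooth arcs, but which do have solutions if we extend the class of admissible arcs to include piecewise smooth arcs. For example, consider the problem

minimize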
1 1 y 2 (1  y )2 dx y(1) = 0 y(1) = 1 . (1) The greatest lower bound of this integral is clearly (non negative integrand) zero, but it does not achieve this value for any smooth arc. The minimum is achieved for the arc y(x) = 0 1 x 0 x 0<x1 (2) which is piecewise smooth and thus has a discontinuity in y (i.e. a "corner") at the point x = 0. In order to include such problems into our theory and to discuss results to follow we consider again the term admissible arcs y: y(x) x1 x x2 (3) We will need to refer to the general integral (defined many times before) which we seek to minimize x2 F (x, y, y )dx (4) I=
x1 The definition of the particular class of admissible arcs may be made in many ways, each of which gives rise to a distinct problem of the calculus of variations. For a special problem the properties defining the class will in general be in part necessitated by the geometrical or mechanical character of the problem itself, and in part free to be chosen with a large degree of arbitrariness. An example of a property of the former type is the restriction for the brachistochrone problem that the curves considered shall all lie below the line y = , since on arcs above that line the integral expressing the time of descent has no meaning. On the other hand we frequently find it convenient to make the arbitrary restriction that our curves shall all lie in a small neighborhood of a particular one whose minimizing properties we are investigating, always remembering that on each of the arcs of our class the integral I must have a welldefined value.
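The corner example (1), (2) can be checked numerically. The following is only a minimal sketch: the function names and the midpoint rule are illustrative choices, not part of the notes. It evaluates the integral of (1) on the piecewise smooth arc (2) and, for comparison, on the straight line joining the same end points, which is one particular smooth admissible arc.

```python
def integrand(y, dy):
    """Integrand y^2 (1 - y')^2 of problem (1)."""
    return y * y * (1.0 - dy) ** 2


def midpoint_integral(y, dy, a=-1.0, b=1.0, n=2000):
    """Midpoint-rule approximation of the integral of (1) over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        x = a + (k + 0.5) * h
        total += integrand(y(x), dy(x)) * h
    return total


# Piecewise smooth arc (2): y = 0 on [-1, 0], y = x on (0, 1].
def corner_y(x):
    return 0.0 if x <= 0.0 else x


def corner_dy(x):
    return 0.0 if x <= 0.0 else 1.0


# Smooth comparison arc (an illustrative choice): the straight line
# through (-1, 0) and (1, 1).
def line_y(x):
    return 0.5 * (x + 1.0)


def line_dy(x):
    return 0.5


I_corner = midpoint_integral(corner_y, corner_dy)
I_line = midpoint_integral(line_y, line_dy)
print("corner arc:", I_corner)
print("straight line:", I_line)
```

On the arc (2) the integrand vanishes identically, since $y = 0$ on the left piece and $y' = 1$ on the right piece. On the straight line the exact value of the integral is $1/6$, so this smooth arc does strictly worse.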
An arc $y$: $y(x)$, $x_1 \le x \le x_2$, is piecewise smooth if there are at most a finite number of points $x = x_i$, $i = 1, \dots, L$, in the interval $[x_1, x_2]$ where $y'(x)$ is discontinuous. The points $x_i$ at which $y'(x)$ is discontinuous are called corners.

In order to make a definition of a class of admissible arcs, which will be generally applicable, let us first assume that there is a region $R$ of sets of values $(x, y, y')$ in which the integrand $F(x, y, y')$ is continuous and has continuous derivatives of as many orders as may be needed in our theory. The sets of values $(x, y, y')$ interior to the region $R$ may be designated as admissible sets. An arc (3) will now be called an admissible arc if it is continuous and has a continuously turning tangent except possibly at a finite number of corners, and if the sets of values $(x, y(x), y'(x))$ on it are all admissible according to the definition just given. For an admissible arc the interval $[x_1, x_2]$ can always be subdivided into a number of subintervals on each of which $y(x)$ is continuous and has a continuous derivative. At a value $x$ where the curve has a corner the derivative $y'(x)$ has two values which we may denote by $y'(x - 0)$ and $y'(x + 0)$, corresponding to the backward and forward slopes of the curve, respectively.

With the above considerations in mind, the problem with which we are concerned is to minimize the integral (4) on the class of admissible arcs (3) joining two fixed points. The Euler equations, which we've seen in differentiated and integrated forms, e.g. the first Euler equation
$$\frac{d}{dx}F_{y'} - F_y = 0, \qquad F_{y'} - \int^{x} F_y\,ds = c \tag{5}$$
(where $c$ is a constant), were proven by explicitly considering only smooth arcs (i.e. arcs without corners). Recall our proof of the integrated form of the first Euler equation (the second equation of (5)) which we originally did for the shortest distance problem.
There we used the fundamental lemma involving the integral
$$\int_{x_1}^{x_2} M(x)\,\eta'(x)\,dx \tag{6}$$
where in that lemma $M(x)$ was allowed to be piecewise continuous and $\eta'(x)$ was required to have at least the continuity properties of $M(x)$. The term $M(x)$ turned out to be
$$M(x) = F_{y'}(x) - \int^{x} F_y\,ds. \tag{7}$$
When we allowed only smooth arcs, $F_{y'}(x)$ (i.e. $F_{y'}(x, y(x), y'(x))$) and $F_y(x)$ (i.e. $F_y(x, y(x), y'(x))$) were continuous (since $y(x)$ and $y'(x)$ were so) and the piecewise continuity provision of the fundamental lemma was not used. This is the procedure that was followed in chapter 3, and it proved that when considering only smooth arcs, the first Euler equation held in integrated and in differentiated form on a minimizing arc.

However, now in allowing piecewise smooth arcs, $F_{y'}(x)$ and $F_y(x)$ may be discontinuous at the corners of the arc, and then by (7) this will also be true for $M(x)$. Since the fundamental lemma allowed for this, the proof of the integrated form is still valid when permitting piecewise smooth arcs. The differentiated form also still holds in between the corners of the arc but may not hold at the corners themselves. A similar statement is true for the other Euler equation.

A function $M(x)$ is said to be piecewise continuous on an interval $[x_1, x_2]$ if there are at most a finite number of points $x_i$, $i = 1, \dots, L$, in $[x_1, x_2]$ where $M(x)$ is discontinuous.

With this in mind the theorem concerning the Euler equations for the general problem stated above is:

For the problem
$$\text{minimize } I = \int_{x_1}^{x_2} F(x, y, y')\,dx \tag{8}$$
on the class of admissible arcs (3) joining two fixed points (where the term admissible is consistent with our above discussions), let the arc
$$y_0:\ y_0(x), \qquad x_1 \le x \le x_2 \tag{9}$$
be a solution. Then the Euler equations hold in integrated form
$$F_{y'}(x) - \int^{x} F_y\,ds = c_1, \qquad F(x) - y'(x)F_{y'}(x) - \int^{x} F_x\,ds = c_2 \tag{10a}$$
(where $c_1$ and $c_2$ are constants) along $y_0$, while the differentiated forms
$$\frac{d}{dx}F_{y'} - F_y = 0, \qquad \frac{d}{dx}\left(F - y'F_{y'}\right) = F_x \tag{10b}$$
hold between the corners of $y_0$.

Remark: These same modifications to the Euler equations hold for the other types of problems considered. All other results such as the transversality condition remain unchanged when allowing piecewise smooth arcs.

Because we allow piecewise smooth arcs, there are two additional necessary conditions to be established, and one of these will imply a third additional necessary condition. Finally we will present one other necessary condition that has nothing to do with corners. For our discussions to follow, assume that the arc $y_0$ of (9) is a solution to our problem.

The necessary conditions of Weierstrass and Legendre. In order to prove Weierstrass' necessary condition, let us select arbitrarily a point 3 on our minimizing arc $y_0$, and a second point 4 of this arc so near to 3 that there is no corner of $y_0$ between them. Through the point 3 we may pass an arbitrary curve $C$ with an equation $y = Y(x)$, and the fixed point 4 can be joined to a movable point 5 on $C$ by a one-parameter family of arcs $y_{54}$ containing the arc $y_{34}$ as a member when the point 5 is in the position 3.

[Figure 28: Two nearby points 3, 4 on the minimizing arc $y_0$, with the curve $C$ through 3 and the movable point 5 on $C$]

We shall soon see that such a family can easily be constructed. If the integral $I(y_0)$ is to be a minimum then it is clear that as the point 5 moves along $C$ from the point 3 the integral
$$I(C_{35} + y_{54}) = \int_{x_3}^{x_5} F(x, Y, Y')\,dx + I(y_{54}) \tag{11}$$
must not decrease from the initial value $I(y_{34})$ which it has when 5 is at the point 3. Then at the point 3 the differential of this integral with respect to $x_5$ must not be negative.

The differential of the term $I(y_{54})$ in the expression (11), at the position $y_{34}$, is given by the expression derived in chapter 4 which we now repeat here
$$dI(y_{54}) = \Big[F(x, y, y')\,dx + (dy - y'\,dx)F_{y'}(x, y, y')\Big]_3^4$$
where the point 4 is fixed so that $dx_4 = dy_4 = 0$. For that formula holds along every arc of the family in question which satisfies the Euler equations, and we know that our minimizing arc must satisfy these equations. Since the differential of the first integral in the expression (11) with respect to its upper limit is the value of its integrand at that limit, it follows that when 5 is at 3 we have for the differential of $I(C_{35} + y_{54})$ the value at the point 3 of the quantity
$$F(x, Y, Y')\,dx - F(x, y, y')\,dx - (dy - y'\,dx)F_{y'}(x, y, y').$$
The differentials in this expression belong to the arc $C$ and satisfy the equation $dy = Y'\,dx$, and at the point 3 the ordinates of $C$ and $y$ are equal, so that the differential of (11) is also expressible in the form
$$\Big[F(x, y, Y') - F(x, y, y') - (Y' - y')F_{y'}(x, y, y')\Big]\,dx\,\Big|^{3}. \tag{12}$$
Since this differential must be positive or zero for an arbitrarily selected point 3 and arc $C$ through it, i.e., for every element $(x, y, y')$ on $y_0$ and every admissible element $(x, y, Y')$, we have justified the necessary condition of Weierstrass.

Theorem. The Necessary Condition of Weierstrass. At every element $(x, y, y')$ of a minimizing arc $y_0$, the condition
$$F(x, y, Y') - F(x, y, y') - (Y' - y')F_{y'}(x, y, y') \ge 0 \tag{13}$$
is satisfied for every admissible point $(x, y, Y')$ different from $(x, y, y')$.

The expression on the left side of (13) is usually called the Weierstrass E-function
$$E(x, y, y', Y') \equiv F(x, y, Y') - F(x, y, y') - (Y' - y')F_{y'}(x, y, y'). \tag{14}$$
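The E-function (14) is easy to tabulate for a concrete integrand. The sketch below is only an illustration under stated assumptions: the helper `weierstrass_E` and the two sample integrands are choices made here, not taken from the notes. The first integrand is the arc-length integrand $F = \sqrt{1 + y'^2}$ of the shortest distance problem, for which condition (13) holds at every element. The second, $F = (y'^2 - 1)^2$, is a non-convex integrand chosen to show how (13) can fail.

```python
import math


def weierstrass_E(F, F_yp, x, y, yp, Yp):
    """E(x, y, y', Y') = F(x,y,Y') - F(x,y,y') - (Y' - y') F_{y'}(x,y,y')."""
    return F(x, y, Yp) - F(x, y, yp) - (Yp - yp) * F_yp(x, y, yp)


# Arc-length integrand and its exact partial derivative in y'.
def arc_F(x, y, yp):
    return math.sqrt(1.0 + yp * yp)


def arc_F_yp(x, y, yp):
    return yp / math.sqrt(1.0 + yp * yp)


# Illustrative non-convex integrand (an assumption made for this example).
def well_F(x, y, yp):
    return (yp * yp - 1.0) ** 2


def well_F_yp(x, y, yp):
    return 4.0 * yp * (yp * yp - 1.0)


slopes = [k / 4.0 for k in range(-20, 21)]   # slopes from -5 to 5

# Smallest value of E over the grid for the arc-length integrand.
min_E_arc = min(
    weierstrass_E(arc_F, arc_F_yp, 0.0, 0.0, yp, Yp)
    for yp in slopes
    for Yp in slopes
)

# E at the element y' = 0 compared with Y' = 1 for the non-convex integrand.
E_well = weierstrass_E(well_F, well_F_yp, 0.0, 0.0, 0.0, 1.0)

print("min E, arc length:", min_E_arc)
print("E, non-convex example:", E_well)
```

For the arc-length integrand the minimum over the grid is zero up to rounding and is attained where $Y' = y'$. For the second integrand, $E = -1$ at $y' = 0$, $Y' = 1$, so an arc with slope zero cannot satisfy the necessary condition of Weierstrass for that integrand.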
Thus in terms of this quantity, the necessary condition of Weierstrass may be stated as
$$E(x, y, y', Y') \ge 0 \tag{15}$$
where $(x, y, y')$ and $(x, y, Y')$ are as noted above.

With the help of Taylor's formula, the Weierstrass E-function may be expressed in the form
$$E(x, y, y', Y') = \tfrac{1}{2}(Y' - y')^2 F_{y'y'}\big(x, y, y' + \theta(Y' - y')\big) \tag{16}$$
where $0 < \theta < 1$. If we let $Y'$ approach $y'$ we find from this formula the necessary condition of Legendre, as an immediate corollary of the condition of Weierstrass.

Theorem. The Necessary Condition of Legendre. At every element $(x, y, y')$ of a minimizing arc $y_0$, the condition
$$F_{y'y'}(x, y, y') \ge 0 \tag{17}$$
must be satisfied.

In order now to demonstrate the construction of a family of arcs $y_{54}$ of the type used in the foregoing proof of Weierstrass' condition, consider the equation
$$y = y(x) + \frac{Y(a) - y(a)}{x_4 - a}\,(x_4 - x) = y(x, a). \tag{18}$$
For $x = x_4$ these arcs all pass through the point 4, and for $x = a$ they intersect the curve $C$. For $a = x_3$ the family contains the arc $y_{34}$, since at the intersection point 3 of $y_{34}$ and $C$ we have $Y(x_3) - y(x_3) = 0$ and the equation of the family reduces to the equation $y = y(x)$ of the arc $y_{34}$.

For an element $(x, y, y'(x - 0))$ at a corner of a minimizing arc the proof just given for Weierstrass' necessary condition does not apply, since there is always a corner between this element and a point 4 following it on $y_0$. But one can readily modify the proof so that it makes use of a point 4 preceding the corner and attains the result stated in the condition for the element in question.

There are two other necessary conditions that result from satisfaction of the Euler equations. One condition involves corners. Consider the equation
$$F_{y'} = \int_{x_1}^{x} F_y\,dx + c. \tag{19}$$
The right-hand side of this equation is a continuous function of $x$ at every point of the arc $y_0$ and the left-hand side must therefore also be continuous, so that we have

Corollary 1. The Weierstrass-Erdmann Corner Condition. At a corner $(x, y)$ of a minimizing arc $y_0$ the condition
$$F_{y'}(x, y, y'(x - 0)) = F_{y'}(x, y, y'(x + 0)) \tag{20}$$
must hold.

This condition at a point $(x, y)$ frequently requires $y'(x - 0)$ and $y'(x + 0)$ to be identical, so that at such a point a minimizing arc can have no corners. It will always require this identity if the sets $(x, y, y')$ with $y'$ between $y'(x - 0)$ and $y'(x + 0)$ are all admissible and the derivative $F_{y'y'}$ is everywhere different from zero, since then the first derivative $F_{y'}$ varies monotonically with $y'$ and cannot take the same value twice. The criterion of the corollary has an interesting application in a second proof of Jacobi's condition which will be given later.

We have so far made no assumption concerning the existence of a second derivative $y''(x)$ along our minimizing arc. If an arc has a continuous second derivative then Euler's equation along it can be expressed in the form
$$F_{y'x} + F_{y'y}\,y' + F_{y'y'}\,y'' - F_y = 0. \tag{21}$$
The following corollary contains a criterion which for many problems enables us to prove that a minimizing arc must have a continuous second derivative and hence satisfy the last equation.

Corollary 2. Hilbert's Differentiability Condition. Near a point on a minimizing arc $y_0$ where $F_{y'y'}$ is different from zero, the arc always has a continuous second derivative $y''(x)$.

To prove this let $(x, y, y')$ be a set of values on $y_0$ at which $F_{y'y'}$ is different from zero, and suppose further that $(x + \Delta x, y + \Delta y, y' + \Delta y')$ is also on $y_0$ and with no corner between it and the former set. If we denote the values of $F_{y'}$ corresponding to these two sets by $F_{y'}$ and $F_{y'} + \Delta F_{y'}$ then with the help of Taylor's formula we find
$$\frac{\Delta F_{y'}}{\Delta x} = \frac{1}{\Delta x}\Big\{F_{y'}(x + \Delta x, y + \Delta y, y' + \Delta y') - F_{y'}(x, y, y')\Big\}$$
$$= F_{y'x}(x + \theta\Delta x, y + \theta\Delta y, y' + \theta\Delta y') + F_{y'y}(x + \theta\Delta x, y + \theta\Delta y, y' + \theta\Delta y')\,\frac{\Delta y}{\Delta x} + F_{y'y'}(x + \theta\Delta x, y + \theta\Delta y, y' + \theta\Delta y')\,\frac{\Delta y'}{\Delta x} \tag{22}$$
where $0 < \theta < 1$. In this expression the left-hand side $\Delta F_{y'}/\Delta x$ has the definite limit $F_y$ as $\Delta x$ approaches zero, because of the definition of the derivative and the differentiated form of the first Euler equation which holds on intervals that have no corners. Also, the first two terms on the right-hand side of (22) have well-defined limits. It follows that the last term must have a unique limiting value, and since $F_{y'y'} \ne 0$ this can be true only if $y'' = \lim \Delta y'/\Delta x$ exists. The derivative $F_{y'y'}$ remains different from zero near the element $(x, y, y')$ on the subarc of $y_0$ on which this element lies. Consequently Euler's equation in the form given in (21) can be solved for $y''$, and it follows that $y''$ must be continuous near every element $(x, y, y')$ of the kind described in the corollary.

CHAPTER 13

13 Field Theory Jacobi's Necessary Condition and Sufficiency

In this chapter, we discuss the notion of a field and a sufficiency proof for the shortest distance from a point to a curve. Recall figure 10 of chapter 3 in which a straight line segment of variable length moved so that its ends described two curves $C$ and $D$. These curves written in parametric form are
$$C:\ x = x_3(t),\ y = y_3(t); \qquad D:\ x = x_4(t),\ y = y_4(t). \tag{1}$$

[Figure 29: Line segment of variable length with endpoints on the curves C, D]

Points 5, 6 are two other points on these curves. We have seen in chapter 3 that necessary conditions on the shortest arc problem may be deduced by comparing it with other admissible arcs of special types.
It is also true that, for sufficiency in that problem, a particular arc can be proved to be actually the shortest only by comparing it with all of the admissible arcs satisfying the conditions of the problem. The sufficiency proof of this chapter is valid not only for the arcs which we have named admissible but also for arcs with equations in the parametric form
$$x = x(t), \quad y = y(t) \qquad (t_3 \le t \le t_5). \tag{2}$$
We suppose always that the functions $x(t)$ and $y(t)$ are continuous, and that the interval $[t_3, t_5]$ can be subdivided into one or more parts on each of which $x(t)$ and $y(t)$ have continuous derivatives such that $x'^2 + y'^2 \ne 0$. The curve represented is then continuous and has a continuously turning tangent except possibly at a finite number of corners. A much larger variety of curves can be represented by such parametric equations than by an equation of the form $y = y(x)$, because the parametric representation lays no restriction upon the slope of the curve or the number of points of the curve which may lie upon a single ordinate. On the other hand for an admissible arc of the form $y = y(x)$ the slope must always be finite and the number of points on each ordinate must at most be one.

The mathematician who first made satisfactory sufficiency proofs in the calculus of variations was Weierstrass, and the ingenious device which he used in his proofs is called a field. We describe first a generic field for shortest distance problems in general and, after giving some other examples of fields, we introduce the particular field which will be used for the shortest distance problem from a point to a curve.

For the shortest distance problems, a field $\mathcal{F}$ is a region of the $xy$-plane with which there is associated a one-parameter family of straight-line segments all of which intersect a fixed curve $D$, and which have the further property that through each point $(x, y)$ of $\mathcal{F}$ there passes one and only one of the segments. The curve $D$ may be either inside the field, or outside as illustrated in Figure 29, and as a special case it may degenerate into a single fixed point.

The whole plane is a field when covered by a system of parallel lines, the curve $D$ being in this case any straight line or curve which intersects all of the parallels. The plane with the exception of a single point 0 is a field when covered by the rays through 0, and 0 is a degenerate curve $D$. The tangents to a circle do not cover a field, since through each point outside of the circle there pass two tangents, and through a point inside the circle there is none. If, however, we cut off half of each tangent at its contact point with the circle, leaving only a one-parameter family of half-rays all pointing in the same direction around the circle, then the exterior of the circle is a field simply covered by the family of half-rays.

At every point $(x, y)$ of a field $\mathcal{F}$ the straight line of the field has a slope $p(x, y)$, the function so defined being called the slope-function of the field. The integral $I^*$ introduced in chapter 4
$$I^* = \int \frac{dx + p\,dy}{\sqrt{1 + p^2}} \tag{3}$$
with this slope-function used for $p$ and with $dx$, $dy$ coming from the arc $C$ of figure 29, has a definite value along every arc $C_{35}$ in the field having equations of the form (2), as we have seen before. We can prove with the help of the formulas of chapter 3 that the integral $I^*$ associated in this way with a field has the two following useful properties:

The values of $I^*$ are the same along all curves $C_{35}$ in the field having the same endpoints 3 and 5. Furthermore along each segment of one of the straight lines of the field the value of $I^*$ is equal to the length of the segment.

To prove the first of these statements we may consider the curve $C_{35}$ shown in the field $\mathcal{F}$ of Figure 29. Through every point $(x, y)$ of this curve there passes, by hypothesis, a straight line of the field $\mathcal{F}$ intersecting $D$, and (4) of chapter 3, applied to the one-parameter family of straight-line segments so determined by the points of $C_{35}$, gives
$$I^*(C_{35}) = I^*(D_{46}) - I(y_{56}) + I(y_{34}). \tag{4}$$
The values of the terms on the right are completely determined when the points 3 and 5 in the field are given, and are entirely independent of the form of the curve $C_{35}$ joining these two points. This shows that the value $I^*(C_{35})$ is the same for all arcs $C_{35}$ in the field joining the same two endpoints, as stated in the theorem.

The second property of the theorem follows from the fact that along a straight-line segment of the field the differentials $dx$ and $dy$ satisfy the equation $dy = p\,dx$, and the integrand of $I^*$ reduces to $\sqrt{1 + p^2}\,dx$, which is the integrand of the length integral.

We now have the mechanism necessary for the sufficiency proof for the problem of shortest distance from a fixed point 1 to a fixed curve $N$ (introduced in chapter 4). We recall figure 16 (chapter 4), which is repeated below, in which the curve $N$, its evolute $G$ and two positions of normals to $N$, one of them containing the point 1, are shown.
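Before returning to that problem, the two properties of the integral (3) can be checked numerically on one of the simple fields mentioned above. The following is a minimal sketch under stated assumptions: the field is taken to be the half-plane $x > 0$ covered by the rays through the origin, so that $p(x, y) = y/x$, and the paths, function names and step count are illustrative choices only. For this field the integrand of (3) is the differential of the distance from the origin, so the value of the integral should be the difference of the distances of the endpoints from the origin.

```python
import math


def slope(x, y):
    """Slope-function p(x, y) of the field of rays through the origin (x > 0)."""
    return y / x


def hilbert_integral(path, n=4000):
    """Approximate the integral (3) along t -> path(t), 0 <= t <= 1.

    Each small chord contributes (dx + p dy) / sqrt(1 + p^2), with the
    slope-function p evaluated at the midpoint of the chord.
    """
    total = 0.0
    x0, y0 = path(0.0)
    for k in range(1, n + 1):
        x1, y1 = path(k / n)
        p = slope(0.5 * (x0 + x1), 0.5 * (y0 + y1))
        total += ((x1 - x0) + p * (y1 - y0)) / math.sqrt(1.0 + p * p)
        x0, y0 = x1, y1
    return total


# Two different curves joining the same endpoints (1, 0) and (2, 2).
def straight(t):
    return (1.0 + t, 2.0 * t)


def parabola(t):
    return (1.0 + t, 2.0 * t * t)


# A segment of one of the rays of the field, from (1, 1) to (3, 3).
def ray(t):
    return (1.0 + 2.0 * t, 1.0 + 2.0 * t)


I_straight = hilbert_integral(straight)
I_parabola = hilbert_integral(parabola)
I_ray = hilbert_integral(ray)

print("straight path:", I_straight)
print("parabolic path:", I_parabola)
print("along a ray:", I_ray, "length:", 2.0 * math.sqrt(2.0))
```

Both curves from $(1, 0)$ to $(2, 2)$ give approximately $\sqrt{8} - 1$, illustrating independence of the path, and along the ray the value agrees with the length $2\sqrt{2}$ of the segment.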
[Figure 30: Shortest arc from a fixed point 1 to a curve N. G is the evolute]

We infer by inspection from the figure that, when the endpoint 1 lies between 3 and 2, there is adjoining $y_{12}$ a region $\mathcal{F}$ of the plane which is simply covered by the normals to $N$ which are near to $y_{12}$. An analytic proof of this statement for a more general case will be given if time permits. For the present we shall rely on our inference of it from the figure.

The region $\mathcal{F}$ so covered by the normals to $N$ forms a field such as was described above. The integral $I^*$ formed with the slope-function $p(x, y)$ of the field in its integrand is independent of the path and has the same value as $I$ along the straight-line segment $y_{12}$ of the field. It also has the value zero on every arc of $N$, since the straight lines of the field are all perpendicular to $N$ and its integrand therefore vanishes identically along that curve. Hence for an arbitrarily selected arc $C_{14}$ in $\mathcal{F}$ joining 1 with $N$, as shown in Figure 30, we have
$$I(y_{12}) = I^*(y_{12}) = I^*(C_{14} + N_{42}) = I^*(C_{14}), \tag{5a}$$
and the difference between the lengths of $C_{14}$ and $y_{12}$ is
$$I(C_{14}) - I(y_{12}) = I(C_{14}) - I^*(C_{14}) = \int_{s_1}^{s_2} (1 - \cos\theta)\,ds \ge 0 \tag{5b}$$
with the last equality following from (6) of chapter 3. The difference between the values of $I$ along $C_{14}$ and $y_{12}$ is therefore
$$I(C_{14}) - I(y_{12}) = \int_{s_1}^{s_2} (1 - \cos\theta)\,ds \ge 0. \tag{6}$$
The equality sign can hold only if $C_{14}$ coincides with $y_{12}$. For when the integral in the last equation is zero we must have $\cos\theta = 1$ at every point of $C_{14}$, from which it follows that $C_{14}$ is tangent at every point to a straight line of the field and satisfies the equation $dy = p\,dx$. Such a differential equation can have but one solution through the initial point 1 and that solution is $y_{12}$. We have proved therefore that the length $I(C_{14})$ of $C_{14}$ is always greater than that of $y_{12}$ unless $C_{14}$ is coincident with $y_{12}$.

For a straight-line segment $y_{12}$ perpendicular to the curve $N$ at the point 2 and not touching the evolute $G$ of $N$ there exists a neighborhood $\mathcal{F}$ in which $y_{12}$ is shorter than every other arc joining 1 with $N$.

We now prove a sufficiency theorem for the general problem of chapters 12 and 3 which we repeat here for completeness. We wish to minimize the integral
$$I = \int_{x_1}^{x_2} F(x, y, y')\,dx \tag{7}$$
on a class of admissible arcs
$$y:\ y(x), \qquad x_1 \le x \le x_2 \tag{8}$$
joining two given points and lying in some region $R$ of admissible points.

We will often refer to a class of extremals. Recalling the definition from previous chapters, an extremal $y$ is an arc which is a solution to the Euler equations on $[x_1, x_2]$ and which has continuous first and second derivatives ($y'(x)$ and $y''(x)$).

We also define a field for this general problem. A region $\mathcal{F}$ of the plane is called a field if it has associated with it a one-parameter family of extremals all intersecting a curve $D$ and furthermore such that through each point $(x, y)$ of $\mathcal{F}$ there passes one and but one extremal of the family. Figure 31 is a picture suggesting such a field.

The function $p(x, y)$ defining the slope of the extremal of the field at a point $(x, y)$ is called the slope-function of the field. With this slope-function substituted, the integrand of the integral $I^*$ of chapter 3
$$I^* = \int \big[F(x, y, p)\,dx + (dy - p\,dx)F_{y'}(x, y, p)\big] \tag{9}$$
depends only upon $x$, $y$, $dx$, $dy$, and the integral itself will have a well-defined value on every arc $C_{35}$ in $\mathcal{F}$ having equations
$$x = x(t), \quad y = y(t) \qquad (t_3 \le t \le t_5) \tag{10}$$
of the type described in (2).

[Figure 31: Line segment of variable length with endpoints on the curves C, D]

Furthermore the endpoints of $C_{35}$ determine two extremal arcs $y_{34}$ and $y_{56}$ of the field, and a corresponding arc $D_{46}$, which are related to it like those in equation (28) of chapter 3, which we repeat here
$$I(y_{56}) - I(y_{34}) = I^*(D_{46}) - I^*(C_{35}). \tag{11}$$
It is clear then that the value $I^*(C_{35})$ depends only upon the points 3 and 5, and not at all upon the form of the arc $C_{35}$ joining them, since the other three terms in equation (11) have this property.

The importance of the integral $I^*$ in the calculus of variations was first emphasized by Hilbert and it is usually called Hilbert's invariant integral. Its two most useful properties are described in the following corollary:

Corollary: For a field $\mathcal{F}$ simply covered by a one-parameter family of extremals all of which intersect a fixed curve $D$, the Hilbert integral $I^*$ formed with the slope-function $p(x, y)$ of the field has the same value on all arcs $C_{35}$ in $\mathcal{F}$ with the same endpoints 3 and 5. Furthermore on an extremal arc of the field, $I^*$ has the same value as $I$.

The last statement follows, since along an extremal of the field we have $dy = p\,dx$ and the integrand of $I^*$ reduces to $F(x, y, p)\,dx$.

The formula (27) of chapter 3 which we also repeat
$$dI = \Big[F(x, y, p)\,dx + (dy - p\,dx)F_{y'}(x, y, p)\Big]_3^4 \tag{12}$$
and (11) of this chapter are the two important auxiliary formulas developed in chapter 3. They remain valid in simpler forms if one of the curves $C_{35}$ or $D_{46}$ degenerates into a point, since then the differentials $dx$, $dy$ along that curve are zero.

We shall see that through a fixed point 1 there passes in general a one-parameter family of extremals. If such a family has an envelope $G$ as shown in figure 32, then the contact point 3 of an extremal arc $y_{12}$ of the family with the envelope is called conjugate to point 1 on $y_{12}$.

[Figure 32: Conjugate point at the right end of an extremal arc]

We next prove two results which are required for the sufficiency theorem.

The envelope theorem and Jacobi's condition. The formula (11) enables us to prove the envelope theorem, which is a generalization of the string property of the evolute noted in the shortest distance problem of chapter 4. Let $y_{14}$ and $y_{13}$ be two extremals of a one-parameter family through the point 1, touching an envelope $G$ of the family at their endpoints 4 and 3, as shown in Figure 32. When we replace the arc $C_{35}$ of the formula (11) above by the fixed point 1, and the arc $D_{46}$ by $G_{43}$, we find the equation
$$I(y_{13}) - I(y_{14}) = I^*(G_{43}). \tag{13}$$
Furthermore the differentials $dx$, $dy$ at a point of the envelope $G$ satisfy the equation $dy = p\,dx$ with the slope $p$ of the extremal tangent to $G$ at that point, and it follows that the value of the (Hilbert) integral (9) along $G_{43}$ is the same as that of $I$. Hence we have:

The Envelope Theorem. Let $y_{14}$ and $y_{13}$ be two members of a one-parameter family of extremals through the point 1, touching an envelope $G$ of the family at their endpoints 4 and 3, as shown in Figure 32. Then the values of the integral $I$ along the arcs $y_{14}$, $y_{13}$, $G_{43}$ satisfy the relation
$$I(y_{14}) + I(G_{43}) = I(y_{13}) \tag{14}$$
for every position of the point 4 preceding 3 on $G$.

We next prove a condition which was hinted at in chapter 4. This is Jacobi's condition.

Theorem (Jacobi).
On a minimizing arc $y_{12}$ which is an extremal with $F_{y'y'} \ne 0$ everywhere on $y_{12}$, there can be no point 3 conjugate to 1 between 1 and 2.

We notice that according to the envelope theorem, the value of $I$ along the composite arc $y_{14} + G_{43} + y_{32}$ in Figure 32 is always the same as its value along $y_{12}$. But $G_{43}$ is not an extremal and can be replaced therefore by an arc $C_{43}$ giving $I$ a smaller value. In every neighborhood of $y_{12}$ there is consequently an arc $y_{14} + C_{43} + y_{32}$ giving $I$ a smaller value than $y_{12}$, and $I(y_{12})$ cannot be a minimum.

To insure that $G_{43}$ is not an extremal arc we make use of a well-known property of (Euler's) second order differential equation expanded out:
$$\frac{d}{dx}F_{y'} - F_y = F_{y'x} + F_{y'y}\,y' + F_{y'y'}\,y'' - F_y = 0 \tag{15}$$
which is satisfied by all extremals. That property states that when such an equation can be solved for the derivative $y''$ there is one and only one solution of it through an arbitrarily selected initial point and direction $(x_3, y_3, y_3')$. But we know that equation (15) is solvable for $y''$ near the arc $y_{12}$, since the hypothesis of Jacobi's condition requires $F_{y'y'}$ to be different from zero along that arc. Hence if $G_{43}$ were an extremal it would necessarily coincide with $y_{13}$, in which case all of the extremal arcs of the family through the point 1 would by the same property be tangent to and coincide with $y_{13}$. There would then be no one-parameter family such as the theorem supposes.

The fundamental sufficiency theorem. The conditions for a minimum which have so far been deduced for our problem have been only necessary conditions, but we shall see in the following that they can be made over with moderate changes into conditions which are also sufficient to insure an extreme value for our integral.
Since the comparison of necessary with sufficient conditions is one of the more delicate parts of the theory of the calculus of variations, it's a good idea before undertaking it to consider a sufficiency theorem which in special cases frequently gives information so complete that after using it one does not need to use the general theory any further.

Using the general field described above, we as usual designate the function $p(x, y)$ defining the slope of the extremal of the field at a point $(x, y)$ as the slope-function of the field. With $E$ as the Weierstrass E-function of chapter 12
$$E(x, y, y', Y') = F(x, y, Y') - F(x, y, y') - (Y' - y')F_{y'}(x, y, y') \tag{16}$$
we have the following theorem, which is fundamental for all of the sufficiency proofs:

The Fundamental Sufficiency Theorem. Let $y_{12}$ be an extremal arc of a field $\mathcal{F}$ such that at each point $(x, y)$ of $\mathcal{F}$ the inequality
$$E(x, y, p(x, y), y') \ge 0 \tag{17}$$
holds for every admissible set $(x, y, y')$ different from $(x, y, p)$. Then $I(y_{12})$ is a minimum in $\mathcal{F}$, or, more explicitly, the inequality $I(y_{12}) \le I(C_{12})$ is satisfied for every admissible arc $C_{12}$ in $\mathcal{F}$ joining the points 1 and 2. If the equality sign is excluded in the hypothesis (17) then $I(y_{12}) < I(C_{12})$ unless $C_{12}$ coincides with $y_{12}$, and the minimum is a so-called proper one.

In order to accomplish the analysis involved in the proof of the above sufficiency theorem we now list the properties of the family of extremal arcs covering the field $\mathcal{F}$. It is supposed that the family has an equation of the form
$$y = y(x, a) \qquad (a_1 \le a \le a_2;\ x_1(a) \le x \le x_2(a)) \tag{18}$$
in which the functions $y(x, a)$, $y'(x, a)$ and their partial derivatives up to and including those of the second order, as well as the functions $x_1(a)$ and $x_2(a)$ defining the endpoints of the extremal arcs, are continuous. It is understood that the point of the curve $D$ on each extremal is defined by a function $x = \xi(a)$ which with its first derivative is continuous on the interval $[a_1, a_2]$, and furthermore that the derivative $y_a$ is everywhere different from zero on the extremal arcs. To each point $(x, y)$ in $\mathcal{F}$ there corresponds a value $a(x, y)$ which defines the unique extremal of the field through that point, and as a result of the hypothesis that $y_a$ is different from zero we can prove that $a(x, y)$ and its first partial derivatives are continuous in $\mathcal{F}$. The same is then true of the slope-function $p(x, y) = y'(x, a(x, y))$ of the field. These properties form the analytical basis of the theory of the field, and we assume them always.

The Hilbert integral (9) formed with the slope-function $p(x, y)$ in place of $p$ has now a definite value on every admissible arc $C_{12}$ in the field. Furthermore, as shown above, its values are the same on all such arcs $C_{12}$ which have the same endpoints, and if the points 1 and 2 are the endpoints of an extremal arc $y_{12}$ of the field, this value is that of the original integral $I$. Hence we find for the pair of arcs $C_{12}$ and $y_{12}$ shown in figure 33,
$$I(C_{12}) - I(y_{12}) = I(C_{12}) - I^*(y_{12}) = I(C_{12}) - I^*(C_{12}), \tag{19}$$

[Figure 33: Line segment of variable length with endpoints on the curves C, D]

and when we substitute for $I$ and $I^*$ their values as integrals, it follows that
$$I(C_{12}) - I(y_{12}) = \int_{x_1}^{x_2} E(x, y, p(x, y), y')\,dx. \tag{20}$$
In the integral on the right, $y$ and its derivative $y'$ are functions of $x$ obtained from the equation $y = y(x)$ of the admissible arc $C_{12}$.

The sufficiency theorem is an immediate consequence of this formula. For the hypothesis (17) that the E-function is greater than or equal to zero in the field implies at once that $I(y_{12}) \le I(C_{12})$. If the E-function vanishes in the field only when $y' = p$ then the equality $I(y_{12}) = I(C_{12})$ can hold only if the equation $y' = p(x, y)$ is satisfied at every point of $C_{12}$. But in that case $C_{12}$ must coincide with $y_{12}$, since the differential equation $y' = p(x, y)$ has one and but one solution through the initial point 1, and that one is $y_{12}$.

The sufficiency proof of the shortest distance problem was an application of a special case of the formula (20) and this theorem. For that special problem the second derivative $F_{y'y'}$ is positive for all admissible sets $(x, y, y')$, and the formula (16) of chapter 12 which we repeat here
$$E(x, y, p, y') = \tfrac{1}{2}(y' - p)^2 F_{y'y'}\big(x, y, p + \theta(y' - p)\big) \qquad (0 < \theta < 1) \tag{21}$$
shows that the E-function is positive whenever $y' \ne p$, as presupposed in the last sentence of the sufficiency theorem.

In order to efficiently discuss further sufficiency results it is convenient now to collect together all of the necessary conditions which have been obtained thus far for our general problem.

I. For every minimizing arc $y_{12}$ there exists a constant $c$ such that the equation
$$F_{y'}(x, y(x), y'(x)) = \int_{x_1}^{x} F_y(x, y(x), y'(x))\,dx + c \tag{22}$$
holds identically on $y_{12}$. An immediate consequence of this equation is that on each arc of $y_{12}$ having a continuously turning tangent Euler's differential equation
$$\frac{d}{dx}F_{y'} - F_y = 0 \tag{23}$$
must also be satisfied.

II. (Weierstrass). At every element $(x, y, y')$ of a minimizing arc $y_{12}$ the condition
$$E(x, y, y', Y') \ge 0 \tag{24}$$
must be satisfied for every admissible set $(x, y, Y')$ different from $(x, y, y')$.

III. (Legendre). At every element $(x, y, y')$ of a minimizing arc $y_{12}$ the condition
$$F_{y'y'}(x, y, y') \ge 0 \tag{25}$$
must be satisfied.

IV. (Jacobi). On a minimizing arc $y_{12}$ which is an extremal with $F_{y'y'} \ne 0$ everywhere on it, there can be no point 3 conjugate to 1 between 1 and 2.

The fundamental sufficiency theorem (f.s.t.) proven above refers to a set of admissible points (of which the admissible arcs are made up) which according to our discussion in chapter 12 will be contained in some region $R$. The results to follow will each be closely associated with a specific $R$. Also the selection of $R$ will depend in part on the field $\mathcal{F}$ (also referred to in the f.s.t.) that we are able to construct.

Next, using a notation introduced by Bolza, let us designate by II', III' the conditions II, III with the equality sign excluded, and by IV' the condition IV when strengthened to exclude the possibility of a conjugate point at the endpoint 2 as well as between 1 and 2 on $y_{12}$.

If time permits it will be proven later that for an extremal arc $y_{12}$ which satisfies the conditions I, III', IV' there is always some neighborhood $\mathcal{F}$ which is a field simply covered by a one-parameter family of extremals having $y_{12}$ as a member of the family.

The value $I(y_{12})$ is said to be a weak relative minimum if there is a neighborhood $R'$ of the values $(x, y, y')$ on $y_{12}$ such that the inequality $I(y_{12}) \le I(C_{12})$ is true, not necessarily for all admissible arcs $C_{12}$, but at least for all those whose elements $(x, y, y')$ lie in $R'$.
With the help of the sufficiency theorem stated above and the field described in the last paragraph we shall be able to prove that an arc $y_{12}$ which satisfies the conditions I, III', IV' will make the value $I(y_{12})$ at least a weak relative minimum. This result will be established by replacing the original region $R$ by $R'$ and choosing $R'$ so small that every admissible arc with respect to it is necessarily in the field $\mathcal{F}$, and furthermore so small that the condition (17) of the theorem holds in $\mathcal{F}$ in its stronger form with respect to all of the sets $(x, y, y')$ in $R'$.

Following Bolza again let us denote by II$_b$ the condition II strengthened to hold not only for elements $(x, y, y')$ on $y_{12}$ but also for all such elements in a neighborhood of those on $y_{12}$. It will be proved that for an arc which satisfies the conditions I, II$_b$, III', IV' the field $\mathcal{F}$ about $y_{12}$, existent as a result of the conditions I, III', IV', can be so constructed that the stronger condition (17) holds in it with respect to the sets $(x, y, y')$ in the region $R$ itself. The value $I(y_{12})$ will therefore again be a minimum in $\mathcal{F}$, and it is called a strong relative minimum because it is effective with respect to all admissible comparison curves $C$ whose elements $(x, y, y')$ have their points $(x, y)$ in a small neighborhood $\mathcal{F}$ of those on $y_{12}$. No restrictions are in this case imposed upon the slopes $y'$ except those due to the definition of the original region $R$.

Sufficient Condition for Relative Minima

For our immediate purposes we state now, and will prove if time permits, a result referred to above.

Lemma: Every extremal arc $y_{12}$ having $F_{y'y'} \ne 0$ along it and containing no point conjugate to 1 is interior to a field $\mathcal{F}$ of which it itself is an extremal arc.

We now discuss the important sets of sufficient conditions which insure for an arc $y_{12}$ the property of furnishing a relative minimum.
We have seen in chapter 12 that there is a considerable degree of arbitrariness in the choice of the region $R$ in which the minimum problem may be studied. Relative minima are really minima in certain types of subregions of the region $R$ originally selected, and their existence is assured by the conditions described in the following two theorems.

Sufficient conditions for a weak relative minimum. Let $y_{12}$ be an arc having the properties: 1) it is an extremal, 2) $F_{y'y'} > 0$ at every set of values $(x, y, y')$ on it, 3) it contains no point 3 conjugate to 1. Then $I(y_{12})$ is a weak relative minimum, or, in other words, the inequality $I(y_{12}) < I(C_{12})$ holds for every admissible arc $C_{12}$ distinct from $y_{12}$, joining 1 with 2, and having its elements $(x, y, y')$ all in a sufficiently small neighborhood $R'$ of those on $y_{12}$.

To prove this we note in the first place that the conditions 1, 2, 3 of the theorem imply the conditions I, III', IV'. Furthermore the same three properties insure the existence of a field $\mathcal{F}$ having the arc $y_{12}$ as one of its extremals, as indicated in the lemma just stated above. Let us now choose a neighborhood $R'$ of the values $(x, y, y')$ on $y_{12}$ so small that all elements $(x, y, y')$ in $R'$ have their points $(x, y)$ in $\mathcal{F}$, and so small that for the slope-function $p = p(x, y)$ of $\mathcal{F}$, the elements $x, y, p + \theta(y' - p)$ having $0 \le \theta \le 1$ are all admissible and make $F_{y'y'} \ne 0$ (hence positive, by property 2 and continuity). Then the function

$$E(x, y, p(x, y), y') = \tfrac{1}{2}(y' - p)^2\,F_{y'y'}\big(x, y, p + \theta(y' - p)\big) \tag{26}$$

is positive for all elements $(x, y, y')$ in $R'$ with $y' \ne p$, and the fundamental sufficiency theorem proven earlier in this chapter, with $R$ replaced by $R'$ in the definition of admissible sets, justifies the theorem stated above for a weak relative minimum.

Sufficient Conditions for a Strong Relative Minimum.
Let $y_{12}$ be an arc having the properties of the preceding theorem and the further property 4) at every element $(x, y, y')$ in a neighborhood $R'$ of those on $y_{12}$ the condition $E(x, y, y', Y') > 0$ is satisfied for every admissible set $(x, y, Y')$ with $Y' \ne y'$. Then $y_{12}$ satisfies the conditions I, II$_b$, III', IV' and $I(y_{12})$ is a strong relative minimum. In other words, the inequality $I(y_{12}) < I(C_{12})$ holds for every admissible arc $C_{12}$ distinct from $y_{12}$, joining 1 with 2, and having its points $(x, y)$ all in a sufficiently small neighborhood $\mathcal{F}$ of those on $y_{12}$.

The properties 1), 2), 3) insure again in this case the existence of a field $\mathcal{F}$ having $y_{12}$ as one of its extremal arcs, and we may denote the slope-function of the field as usual by $p(x, y)$. If we take the field so small that all of the elements $(x, y, p(x, y))$ belonging to it are in the neighborhood $R'$ of the property 4), then according to that property the inequality $E(x, y, p(x, y), y') > 0$ holds for every admissible element $(x, y, y')$ in $\mathcal{F}$ distinct from $(x, y, p(x, y))$, and the fundamental sufficiency theorem gives the desired conclusion of the theorem.

We now use the results just developed for the general theory by applying them to the brachistochrone problem of finding the curve of quickest descent for a particle to slide from a given point 1 with coordinates $(x_1, y_1)$ to a given curve $N$ with a given initial velocity $v_1$. This is the same problem we saw in chapter 4 where first necessary conditions were obtained. Let the point 1, the curve $N$ and the path $y_{12}$ of quickest descent be those shown in figure 34. The constant $\alpha$ has the same meaning as in chapter 4, namely $\alpha = y_1 - v_1^2/2g$, where $y_1$ is the value of $y$ at point 1 and $g$ is the gravitational acceleration. We recall the integral $I$ to be minimized from chapter 4

I =
$$\int_{x_1}^{x_2} \sqrt{\frac{1 + y'^2}{y - \alpha}}\;dx. \tag{27}$$

By chapters 3 and 4 we already know that a minimizing arc $y_{12}$ for this problem must consist of a cycloid lying below the line $y = \alpha$. We also know by Jacobi's condition that $y_{12}$ cannot contain a conjugate point between its endpoints. We now prove that, with the assumption of a slight strengthening of Jacobi's condition, this cycloid provides a strong minimizing arc for the problem at hand (i.e. it satisfies the conditions of the strong sufficiency theorem).

[Figure 34: The path of quickest descent from point 1 to a curve N]

With $F$ as the integrand of (27) we first compute

$$F_{y'} = \frac{y'}{\sqrt{y - \alpha}\,\sqrt{1 + y'^2}} \tag{28}$$

Next, by the Weierstrass-Erdmann Corner Condition (chapter 12) one sees that the expression on the right-hand side of (28) is continuous on $y_{12}$. We now show that this implies that $y'$ must also be continuous on $y_{12}$. With the substitution $y' = \tan\theta$ ($-\pi/2 < \theta < \pi/2$), we have $\dfrac{y'}{\sqrt{1 + y'^2}} = \sin\theta$, and the continuity of (28) implies that $\sin\theta$, and hence also $\theta$ and $\tan\theta = y'$, must be continuous along $y_{12}$. Thus $y_{12}$ contains no corners. Next, note that

$$F_{y'y'} = \frac{1}{\sqrt{y - \alpha}\,(1 + y'^2)^{3/2}} > 0$$

for all admissible (see chapter 3) points $(x, y, y')$ with $y > \alpha$, and so certainly also on $y_{12}$; then Hilbert's Differentiability Condition (chapter 12) shows that $y''$ is continuous on $y_{12}$.

Now let $R'$ be any neighborhood of $y_{12}$ that is contained within the admissible set of points. Let $(x, y, y')$ and $(x, y, Y')$ be any points in $R'$ (with the same $x, y$). Then by (26) and the positivity of $F_{y'y'}$ for all admissible points, we have condition 4) of the strong sufficiency theorem. Finally, if we assume that $y_{12}$ does not contain a conjugate point at its right endpoint, then all of the conditions of the strong sufficiency theorem are met and $y_{12}$ provides a strong relative minimum for our problem as stated in that theorem.
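The positivity claims used in this argument can be spot-checked numerically. The sketch below compares the closed form of $F_{y'y'}$ for the brachistochrone integrand with a central finite difference and evaluates the E-function on a small grid. The value of $\alpha$, the grid, and all function names are hypothetical choices made for illustration, and the E-function is taken in its usual form $E = F(x, y, Y') - F(x, y, y') - (Y' - y')F_{y'}(x, y, y')$.

```python
import math

ALPHA = 0.0  # hypothetical value of the constant alpha = y1 - v1^2/(2g)

def F(y, p):
    # brachistochrone integrand sqrt((1 + y'^2) / (y - alpha)), admissible for y > alpha
    return math.sqrt((1.0 + p * p) / (y - ALPHA))

def F_p(y, p):
    # formula (28): partial derivative of F with respect to y'
    return p / (math.sqrt(y - ALPHA) * math.sqrt(1.0 + p * p))

def F_pp(y, p):
    # closed form of the second partial derivative with respect to y'
    return 1.0 / (math.sqrt(y - ALPHA) * (1.0 + p * p) ** 1.5)

def F_pp_fd(y, p, h=1e-3):
    # central finite-difference approximation, used only as a cross-check
    return (F(y, p + h) - 2.0 * F(y, p) + F(y, p - h)) / (h * h)

def E(y, p, q):
    # Weierstrass E-function (assumed standard definition)
    return F(y, q) - F(y, p) - (q - p) * F_p(y, p)

heights = [0.5, 1.0, 2.0, 5.0]                   # sample values of y, all above ALPHA
slopes = [k / 2.0 for k in range(-6, 7)]          # sample slopes from -3 to 3

max_fd_error = max(abs(F_pp(y, p) - F_pp_fd(y, p)) for y in heights for p in slopes)
min_F_pp = min(F_pp(y, p) for y in heights for p in slopes)
min_E = min(E(y, p, q) for y in heights for p in slopes for q in slopes if p != q)

print(max_fd_error < 1e-5, min_F_pp > 0.0, min_E > 0.0)
```

Agreement between the closed form and the finite difference supports the sign in $(1 + y'^2)^{3/2}$, and the positive E-values on the grid are consistent with condition 4) of the strong sufficiency theorem.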
CALCULUS OF VARIATIONS MA 4311 SOLUTION MANUAL
Department of Mathematics Naval Postgraduate School Code MA/Nd Monterey, California 93943 June 11, 2001
B. Neta
© 1996 Professor B. Neta

Contents
1 Functions of n Variables
2 Examples, Notation
3 First Results
4 Variable End-Point Problems
5 Higher Dimensional Problems and Another Proof of the Second Euler Equation
6 Integrals Involving More Than One Independent Variable
7 Examples of Numerical Techniques
8 The Rayleigh-Ritz Method
9 Hamilton's Principle
10 Degrees of Freedom - Generalized Coordinates
11 Integrals Involving Higher Derivatives

Credits
Thanks to Lt. William K. Cooke, USN, Lt. Thomas A. Hamrick, USN, Major Michael R. Huber, USA, Lt. Gerald N. Miranda, USN, Lt. Coley R. Myers, USN, Major Tim A. Pastva, USMC, and Capt Michael L. Shenk, USA, who worked out the solutions to some of the problems.

CHAPTER 1

1 Functions of n Variables
Problems
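1. Use the method of Lagrange Multipliers to solve the problem: minimize $f = x^2 + y^2 + z^2$ subject to $\varphi = xy + 1 - z = 0$.

2. Show that
$$\max_{\lambda}\left|\frac{\lambda}{\cosh\lambda}\right| = \frac{\lambda_0}{\cosh\lambda_0}$$
where $\lambda_0$ is the positive root of $\cosh\lambda - \lambda\sinh\lambda = 0$. Sketch to show $\lambda_0$.

3. Of all rectangular parallelepipeds which have sides parallel to the coordinate planes, and which are inscribed in the ellipsoid
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1,$$
determine the dimensions of that one which has the largest volume.

4. Of all parabolas which pass through the points (0,0) and (1,1), determine that one which, when rotated about the x-axis, generates a solid of revolution with least possible volume between $x = 0$ and $x = 1$. [Notice that the equation may be taken in the form $y = x + cx(1 - x)$, where $c$ is to be determined.]

5. a. If $x = (x_1, x_2, \ldots, x_n)$ is a real vector, and $A$ is a real symmetric matrix of order $n$, show that the requirement that $F \equiv x^TAx - \lambda x^Tx$ be stationary, for a prescribed $A$, takes the form $Ax = \lambda x$. Deduce that the requirement that the quadratic form $\alpha \equiv x^TAx$ be stationary, subject to the constraint $\beta \equiv x^Tx = \text{constant}$, leads to the requirement $Ax = \lambda x$, where $\lambda$ is a constant to be determined. [Notice that the same is true of the requirement that $\beta$ is stationary, subject to the constraint that $\alpha = \text{constant}$, with a suitable definition of $\lambda$.]

b. Show that, if we write
$$\lambda = \frac{x^TAx}{x^Tx} \equiv \frac{\alpha}{\beta},$$
the requirement that $\lambda$ be stationary leads again to the matrix equation $Ax = \lambda x$. [Notice that the requirement $d\lambda = 0$ can be written as $\dfrac{\beta\,d\alpha - \alpha\,d\beta}{\beta^2} = 0$ or $\dfrac{d\alpha - \lambda\,d\beta}{\beta} = 0$.] Deduce that stationary values of the ratio $\dfrac{x^TAx}{x^Tx}$ are characteristic numbers of the symmetric matrix $A$.

1. $f = x^2 + y^2 + z^2$, $\varphi = xy + 1 - z = 0$, $F = f + \lambda\varphi = x^2 + y^2 + z^2 + \lambda(xy + 1 - z)$.

$$\frac{\partial F}{\partial x} = 2x + \lambda y = 0 \tag{1}$$
$$\frac{\partial F}{\partial y} = 2y + \lambda x = 0 \tag{2}$$
$$\frac{\partial F}{\partial z} = 2z - \lambda = 0 \tag{3}$$
$$\frac{\partial F}{\partial \lambda} = xy + 1 - z = 0 \tag{4}$$

(3) $\Rightarrow$ $\lambda = 2z$ (5); (4) $\Rightarrow$ $z = xy + 1$ (6); (5) and (6) $\Rightarrow$ $\lambda = 2(xy + 1)$ (7).

Substitute (7) in (1) and (2):
$$2x + 2(xy + 1)y = 0 \tag{8}$$
$$2y + 2(xy + 1)x = 0 \tag{9}$$
that is, $x + xy^2 + y = 0$ and $y + x^2y + x = 0$. Subtracting,
$$xy(y - x) = 0 \tag{10}$$
so $x = 0$ or $y = 0$ or $x = y$.

$x = 0$ $\Rightarrow$ $\lambda = 2$ by (7), $z = 1$ by (5), $y = 0$ by (1).
$y = 0$ $\Rightarrow$ $\lambda = 2$ by (7), $z = 1$ by (5), $x = 0$ by (1).
$x = y \ne 0$ $\Rightarrow$ $\lambda = -2$ by (1), $z = -1$ by (5), $xy = -2$ by (6), so $x^2 = -2$. Not possible.

So the only possibility is $x = y = 0$, $z = 1$, $\lambda = 2$ $\Rightarrow$ $f = 1$.

2. Find $\max\left|\dfrac{\lambda}{\cosh\lambda}\right|$. Differentiate:
$$\frac{d}{d\lambda}\left(\frac{\lambda}{\cosh\lambda}\right) = \frac{\cosh\lambda - \lambda\sinh\lambda}{\cosh^2\lambda} = 0.$$
Since $\cosh\lambda \ne 0$, $\cosh\lambda - \lambda\sinh\lambda = 0$. The positive root is $\lambda_0$. Thus the function at $\lambda_0$ becomes $\dfrac{\lambda_0}{\cosh\lambda_0}$. No need for absolute value since $\lambda_0 > 0$.

[Figure 1]

3. $\max\, xyz$ subject to $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} + \dfrac{z^2}{c^2} = 1$. Write $F = xyz + \lambda\left(\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} + \dfrac{z^2}{c^2} - 1\right)$, then
$$0 = F_x = yz + \frac{2\lambda x}{a^2} \tag{1}$$
$$0 = F_y = xz + \frac{2\lambda y}{b^2} \tag{2}$$
$$0 = F_z = xy + \frac{2\lambda z}{c^2} \tag{3}$$
$$0 = F_\lambda = \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} - 1 \tag{4}$$
If any of $x$, $y$ or $z$ are zero then the volume is zero and not the max. Therefore $x \ne 0$, $y \ne 0$, $z \ne 0$ (and then $\lambda \ne 0$ by (1)), so
$$0 = -xF_x + yF_y = -\frac{2\lambda x^2}{a^2} + \frac{2\lambda y^2}{b^2} \;\Rightarrow\; \frac{y^2}{b^2} = \frac{x^2}{a^2} \tag{5}$$
Also
$$0 = -zF_z + yF_y = -\frac{2\lambda z^2}{c^2} + \frac{2\lambda y^2}{b^2} \;\Rightarrow\; \frac{y^2}{b^2} = \frac{z^2}{c^2} \tag{6}$$
Then by (4) $\dfrac{3y^2}{b^2} = 1$ $\Rightarrow$ $y^2 = \dfrac{b^2}{3}$; taking only the (+) square root (length), $y = \dfrac{b}{\sqrt3}$, and $x = \dfrac{a}{\sqrt3}$, $z = \dfrac{c}{\sqrt3}$ by (5), (6) respectively.

The largest volume parallelepiped inside the ellipsoid has its first-octant vertex at $\left(\dfrac{a}{\sqrt3}, \dfrac{b}{\sqrt3}, \dfrac{c}{\sqrt3}\right)$, that is, side lengths $\dfrac{2a}{\sqrt3}$, $\dfrac{2b}{\sqrt3}$, $\dfrac{2c}{\sqrt3}$.

4. $\varphi = -y + x + cx(1 - x)$. Up to the constant factor $\pi$, the volume is
$$V = \int_0^1 y^2\,dx, \qquad \min V = \int_0^1 [x + cx(1 - x)]^2\,dx.$$
$$\frac{dV(c)}{dc} = 2\int_0^1 [x + cx(1 - x)]\,x(1 - x)\,dx = 0$$
$$2\int_0^1 x^2(1 - x)\,dx + 2c\int_0^1 x^2(1 - x)^2\,dx = 0$$
$$2\left[\tfrac13x^3 - \tfrac14x^4\right]_0^1 + 2c\left[\tfrac13x^3 - \tfrac24x^4 + \tfrac15x^5\right]_0^1 = 0$$
$$2\cdot\tfrac{1}{12} + 2c\cdot\tfrac{1}{30} = 0 \;\Rightarrow\; c = -\tfrac{15}{6} = -\tfrac52$$
$$y = x - \tfrac52x(1 - x)$$
$$V(c) = \int_0^1\left[x^2 + 2cx^2(1 - x) + c^2x^2(1 - x)^2\right]dx = \tfrac13 + 2c\cdot\tfrac{1}{12} + c^2\cdot\tfrac{1}{30}, \qquad V\!\left(c = -\tfrac52\right) = \tfrac{3}{24}.$$

5. a. $F = x^TAx - \lambda x^Tx = \sum_{i,j}A_{ij}x_ix_j - \lambda\sum_i x_i^2$, so
$$\frac{\partial F}{\partial x_k} = \sum_j A_{kj}x_j + \sum_i A_{ik}x_i - 2\lambda x_k = 0, \qquad k = 1, 2, \ldots, n$$
$\Rightarrow$ $Ax + A^Tx - 2\lambda x = 0$. Since $A$ is symmetric, $Ax = \lambda x$.

For the constrained problem, making $x^TAx$ stationary subject to $x^Tx = c$ means making $x^TAx - \lambda(x^Tx - c)$ stationary. This differs from $F$ only by the constant $\lambda c$, so differentiating with respect to $x_k$, $k = 1, \ldots, n$, again gives $Ax = \lambda x$.

b. $\lambda = \dfrac{x^TAx}{x^Tx} = \dfrac{\alpha}{\beta}$. To make $\lambda$ stationary we require
$$d\lambda = \frac{\beta\,d\alpha - \alpha\,d\beta}{\beta^2} = 0.$$
Divide numerator and denominator by $\beta$:
$$\frac{d\alpha - \lambda\,d\beta}{\beta} = 0 \quad\text{or}\quad d\alpha - \lambda\,d\beta = 0.$$
Since $d\alpha = 2(Ax)^Tdx$ and $d\beta = 2x^Tdx$, this gives $(Ax - \lambda x)^Tdx = 0$ for all $dx$, i.e. $Ax = \lambda x$, so the stationary values of the ratio are characteristic numbers of $A$.

CHAPTER 2

2 Examples, Notation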
Problems
1. For the integral
$$I = \int_{x_1}^{x_2} f(x, y, y')\,dx$$
with $f = y^{1/2}(1 + y'^2)$, write the first and second variations $I'(0)$ and $I''(0)$.

2. Consider the functional
$$J(y) = \int_0^1 (1 + x)(y')^2\,dx$$
where $y$ is twice continuously differentiable and $y(0) = 0$ and $y(1) = 1$. Of all functions of the form $y(x) = x + c_1x(1 - x) + c_2x^2(1 - x)$, where $c_1$ and $c_2$ are constants, find the one that minimizes $J$.

1. $f = y^{1/2}(1 + y'^2)$, with $\eta$ denoting the variation:

$f_y = \tfrac12y^{-1/2}(1 + y'^2)$, $\quad f_{y'} = 2y'y^{1/2}$
$$I'(0) = \int_{x_1}^{x_2}\left[\tfrac12y^{-1/2}(1 + y'^2)\,\eta + 2y'y^{1/2}\,\eta'\right]dx$$
$f_{yy} = -\tfrac14y^{-3/2}(1 + y'^2)$, $\quad f_{yy'} = y^{-1/2}y'$, $\quad f_{y'y'} = 2y^{1/2}$
$$I''(0) = \int_{x_1}^{x_2}\left[-\tfrac14y^{-3/2}(1 + y'^2)\,\eta^2 + 2y^{-1/2}y'\,\eta\eta' + 2y^{1/2}\,\eta'^2\right]dx$$

2. We are given that, after expanding, $y(x) = (c_1 + 1)x + (c_2 - c_1)x^2 - c_2x^3$. Then we also have that $y'(x) = (c_1 + 1) + 2(c_2 - c_1)x - 3c_2x^2$ and that
$$(y'(x))^2 = (c_1 + 1)^2 + 4x(c_1 + 1)(c_2 - c_1) - 6x^2c_2(c_1 + 1) + 4x^2(c_2 - c_1)^2 - 12x^3c_2(c_2 - c_1) + 9c_2^2x^4.$$
Therefore, we now can integrate $J(y)$ and get a solution in terms of $c_1$ and $c_2$:
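$$\int_0^1 (1 + x)(y')^2\,dx = \tfrac32(c_1 + 1)^2 + \tfrac{10}{3}(c_1 + 1)(c_2 - c_1) - \tfrac{14}{4}c_2(c_1 + 1) + \tfrac73(c_2 - c_1)^2 - \tfrac{27}{5}c_2(c_2 - c_1) + \tfrac{99}{30}c_2^2$$

To get the minimum, we want to solve $J_{c_1} = 0$ and $J_{c_2} = 0$. After taking these partial derivatives and simplifying we get
$$J_{c_2} = \tfrac{17}{30}c_1 + \tfrac{7}{15}c_2 - \tfrac16 = 0$$
and
$$J_{c_1} = c_1 + \tfrac{17}{30}c_2 - \tfrac13 = 0.$$
Putting this in matrix form, we want to solve
$$\begin{pmatrix}\tfrac{17}{30} & \tfrac{7}{15}\\[2pt] 1 & \tfrac{17}{30}\end{pmatrix}\begin{pmatrix}c_1\\ c_2\end{pmatrix} = \begin{pmatrix}\tfrac16\\[2pt] \tfrac13\end{pmatrix}.$$
Using Cramer's rule, we have that
$$c_1 = \frac{\begin{vmatrix}\tfrac16 & \tfrac{7}{15}\\[2pt] \tfrac13 & \tfrac{17}{30}\end{vmatrix}}{\begin{vmatrix}\tfrac{17}{30} & \tfrac{7}{15}\\[2pt] 1 & \tfrac{17}{30}\end{vmatrix}} = \frac{55}{131} \approx .42$$
and
$$c_2 = \frac{\begin{vmatrix}\tfrac{17}{30} & \tfrac16\\[2pt] 1 & \tfrac13\end{vmatrix}}{\begin{vmatrix}\tfrac{17}{30} & \tfrac{7}{15}\\[2pt] 1 & \tfrac{17}{30}\end{vmatrix}} = \frac{-20}{131} \approx -.15$$
Therefore, we have that the $y(x)$ which minimizes $J(y)$ is
$$y(x) = \tfrac{186}{131}x - \tfrac{75}{131}x^2 + \tfrac{20}{131}x^3 \approx 1.42x - .57x^2 + .15x^3.$$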
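The two-parameter minimization above can be reproduced with exact rational arithmetic. The sketch below sets up the normal equations for $c_1, c_2$ directly from the trial-function derivatives and solves them by Cramer's rule; the helper names (`poly_mul`, `integrate_01`, `inner`) are illustrative and not part of the manual.

```python
from fractions import Fraction as Fr

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fr(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def integrate_01(p):
    """Exact integral over [0, 1] of a polynomial given by its coefficients."""
    return sum(Fr(c) / (k + 1) for k, c in enumerate(p))

WEIGHT = [1, 1]       # the factor 1 + x in the integrand
D_BASE = [1]          # derivative of x
D_PHI1 = [1, -2]      # derivative of x(1 - x)
D_PHI2 = [0, 2, -3]   # derivative of x^2(1 - x)

def inner(u, v):
    """Weighted inner product: integral over [0, 1] of (1 + x) u(x) v(x)."""
    return integrate_01(poly_mul(WEIGHT, poly_mul(u, v)))

# normal equations  a11 c1 + a12 c2 = r1,  a12 c1 + a22 c2 = r2
a11, a12, a22 = inner(D_PHI1, D_PHI1), inner(D_PHI1, D_PHI2), inner(D_PHI2, D_PHI2)
r1, r2 = -inner(D_PHI1, D_BASE), -inner(D_PHI2, D_BASE)

det = a11 * a22 - a12 * a12
c1 = (r1 * a22 - a12 * r2) / det
c2 = (a11 * r2 - a12 * r1) / det

# value of J at the minimizing coefficients
J_min = (inner(D_BASE, D_BASE) - 2 * (r1 * c1 + r2 * c2)
         + a11 * c1 * c1 + 2 * a12 * c1 * c2 + a22 * c2 * c2)

print(c1, c2, J_min)  # 55/131 -20/131 189/131
```

The resulting value $J = 189/131 \approx 1.44275$ lies slightly above $1/\ln 2 \approx 1.44270$, the value of $J$ on the exact extremal $\ln(1 + x)/\ln 2$, as expected for a minimum taken over a restricted family.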
Using a technique found in Chapter 3, it can be shown that the extremal of $J(y)$ is
$$y(x) = \frac{1}{\ln 2}\ln(1 + x)$$
which, after expanding about $x = 0$, is represented as
$$y(x) = \frac{1}{\ln 2}x - \frac{1}{2\ln 2}x^2 + \frac{1}{3\ln 2}x^3 + R(x) \approx 1.44x - .72x^2 + .48x^3 + R(x).$$
So we can see that the form for $y(x)$ given in the problem is similar to the series representation gotten using a different method.

CHAPTER 3

3 First Results
Problems
1. Find the extremals of
$$I = \int_{x_1}^{x_2} F(x, y, y')\,dx$$
for each case
a. $F = (y')^2 - k^2y^2$ ($k$ constant)
b. $F = (y')^2 + 2y$
c. $F = (y')^2 + 4xy'$
d. $F = (y')^2 + yy' + y^2$
e. $F = x(y')^2 - yy' + y$
f. $F = a(x)(y')^2 - b(x)y^2$
g. $F = (y')^2 + k^2\cos y$

2. Solve the problem: minimize $I = \int_a^b\left[(y')^2 - y^2\right]dx$ with $y(a) = y_a$, $y(b) = y_b$. What happens if $b - a = n\pi$?

3. Show that if $F = y^2 + 2xyy'$, then $I$ has the same value for all curves joining the endpoints.

4. A geodesic on a given surface is a curve, lying on that surface, along which distance between two points is as small as possible. On a plane, a geodesic is a straight line. Determine equations of geodesics on the following surfaces:
a. Right circular cylinder. [Take $ds^2 = a^2d\theta^2 + dz^2$ and minimize $\int\sqrt{a^2 + \left(\frac{dz}{d\theta}\right)^2}\,d\theta$ or $\int\sqrt{a^2\left(\frac{d\theta}{dz}\right)^2 + 1}\,dz$.]
b. Right circular cone. [Use spherical coordinates with $ds^2 = dr^2 + r^2\sin^2\alpha\,d\theta^2$.]
c. Sphere. [Use spherical coordinates with $ds^2 = a^2\sin^2\varphi\,d\theta^2 + a^2d\varphi^2$.]
d. Surface of revolution. [Write $x = r\cos\theta$, $y = r\sin\theta$, $z = f(r)$. Express the desired relation between $r$ and $\theta$ in terms of an integral.]

5. Determine the stationary function associated with the integral
$$I = \int_0^1 (y')^2f(x)\,dx$$
when $y(0) = 0$ and $y(1) = 1$, where
$$f(x) = \begin{cases}-1, & 0 \le x < \tfrac14\\[2pt] \phantom{-}1, & \tfrac14 < x \le 1\end{cases}$$

6. Find the extremals
a. $J(y) = \int_0^1 y'\,dx$, $\quad y(0) = 0$, $y(1) = 1$.
b. $J(y) = \int_0^1 yy'\,dx$, $\quad y(0) = 0$, $y(1) = 1$.
c. $J(y) = \int_0^1 xyy'\,dx$, $\quad y(0) = 0$, $y(1) = 1$.

7. Find extremals for
a. $J(y) = \int_0^1 \dfrac{y'^2}{x^3}\,dx$
b. $J(y) = \int_0^1 \left(y^2 + (y')^2 + 2ye^x\right)dx$.

8. Obtain the necessary condition for a function $y$ to be a local minimum of the functional
$$J(y) = \iint_R K(s, t)y(s)y(t)\,ds\,dt + \int_a^b y^2\,dt - 2\int_a^b y(t)f(t)\,dt$$
where $K(s, t)$ is a given continuous function of $s$ and $t$ on the square $R$, for which $a \le s, t \le b$, $K(s, t)$ is symmetric and $f(t)$ is continuous. Hint: the answer is a Fredholm integral equation.

9. Find the extremal for
$$J(y) = \int_0^1 (1 + x)(y')^2\,dx, \qquad y(0) = 0,\ y(1) = 1.$$
What is the extremal if the boundary condition at $x = 1$ is changed to $y'(1) = 0$?

10. Find the extremals
$$J(y) = \int_a^b \left(x^2(y')^2 + y^2\right)dx.$$
1. $\int_{x_1}^{x_2} F(x, y, y')\,dx$. Find the extremals.

a. $F(y, y') = (y')^2 - k^2y^2$, $k$ = constant. By Euler's equation $F - y'F_{y'} = c$, so
$$(y')^2 - (ky)^2 - y'(2y') = -(y')^2 - (ky)^2 = c$$
$\Rightarrow$ $(y')^2 = -(ky)^2 - c$ $\Rightarrow$ $y' = \pm\left(-(ky)^2 - c\right)^{1/2}$, which requires $c < 0$ for a real solution.

Writing this as $\dfrac{dy}{[(ky)^2 + c]^{1/2}} = \pm i\,dx$ and using $\int\dfrac{du}{\sqrt{u^2 + a^2}} = \ln\left|u + \sqrt{u^2 + a^2}\right|$ leads to the complex form $ky + \sqrt{(ky)^2 + c} = Ce^{\pm ikx}$. Let's try another way, using
$$\int\frac{du}{\sqrt{a^2 - u^2}} = \sin^{-1}\frac{u}{a}:$$
$$\int\frac{dy}{[-c - (ky)^2]^{1/2}} = \pm\int dx \;\Rightarrow\; \frac1k\sin^{-1}\frac{ky}{\sqrt{-c}} = \pm(x + c_2) \;\Rightarrow\; \frac{ky}{\sqrt{-c}} = \pm\sin k(x + c_2)$$
$$y = \pm\frac{\sqrt{-c}}{k}\sin k(x + c_2), \qquad c < 0.$$

b. $F(y, y') = (y')^2 + 2y$
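$F - y'F_{y'} = (y')^2 + 2y - y'(2y') = -(y')^2 + 2y = c$ $\Rightarrow$ $(y')^2 = 2y - c$ $\Rightarrow$ $\dfrac{dy}{\sqrt{2y - c}} = dx$ $\Rightarrow$ $(2y - c)^{1/2} = x + c_2$ $\Rightarrow$ $2y - c = (x + c_2)^2$,
$$y = \tfrac12\left[(x + c_2)^2 + c\right].$$

c. $F(x, y') = 4xy' + (y')^2$; use $F_{y'} = c$ $\Rightarrow$ $4x + 2y' = c$ $\Rightarrow$ $y' = \tfrac12(c - 4x)$ $\Rightarrow$
$$y = \tfrac12x(c - 2x) + c_2.$$

d. $F = y'^2 + yy' + y^2$; $F - y'F_{y'} = c$, see (21). $F_{y'} = 2y' + y$ $\Rightarrow$
$$F - y'F_{y'} = y'^2 + yy' + y^2 - y'(2y' + y) = -y'^2 + y^2$$
$\Rightarrow$ $-y'^2 + y^2 = c$, $\ y'^2 = y^2 - c$, $\ y' = \sqrt{y^2 - c}$,
$$\int\frac{dy}{\sqrt{y^2 - c}} = \int dx \;\Rightarrow\; \operatorname{arccosh}\frac{y}{\sqrt c} + c_2 = x$$
(this can also be written with $\ln\left|y + \sqrt{y^2 - c}\right|$), so $\operatorname{arccosh}\dfrac{y}{\sqrt c} = x - c_2$, $\ \cosh(x - c_2) = \dfrac{y}{\sqrt c}$,
$$y = \sqrt c\,\cosh(x - c_2).$$

e. $F = xy'^2 - yy' + y$; $\dfrac{d}{dx}F_{y'} = F_y$, see (12). $F_y = -y' + 1$, $\ F_{y'} = 2xy' - y$:
$$\frac{d}{dx}(2xy' - y) = -y' + 1 \;\Rightarrow\; 2y' + 2xy'' - y' = -y' + 1 \;\Rightarrow\; 2xy'' + 2y' = 1 \;\Rightarrow\; (2xy')' = 1.$$
Integrating, $2xy' = x + c_1$, so $y' = \tfrac12 + \dfrac{c_1}{2x}$ and
$$y = \frac{x}{2} + \frac{c_1}{2}\ln|x| + c_2.$$

f. $F = a(x)y'^2 - b(x)y^2$; $F_y = -2b(x)y$, $\ F_{y'} = 2a(x)y'$,
$$\frac{d}{dx}F_{y'} = F_y \;\Rightarrow\; (2a(x)y')' = -2b(x)y \;\Rightarrow\; a(x)y'' + a'(x)y' + b(x)y = 0.$$
Linear equation with nonconstant coefficients. Can be solved!

g. $F = y'^2 + k^2\cos y$; $F - y'F_{y'} = c_1$, $\ F_{y'} = 2y'$:
$$y'^2 + k^2\cos y - 2y'^2 = c_1 \;\Rightarrow\; -y'^2 + k^2\cos y = c_1 \;\Rightarrow\; y'^2 = k^2\cos y - c_1$$
$$\frac{dy}{dx} = \sqrt{k^2\cos y - c_1} \;\Rightarrow\; \frac{dy}{\sqrt{k^2\cos y - c_1}} = dx \;\Rightarrow\; x + c_2 = \int\frac{dy}{\sqrt{k^2\cos y - c_1}}.$$

2. $F = y'^2 - y^2$. From problem 1a with $k = 1$ the extremals are $y = \sqrt{-c}\,\sin(x + c_2)$, $c < 0$, or equivalently $y = A\cos x + B\sin x$. The end conditions give
$$A\cos a + B\sin a = y_a, \qquad A\cos b + B\sin b = y_b,$$
a linear system whose determinant is $\sin(b - a)$. If $b - a \ne n\pi$, the constants $A$ and $B$, and hence the extremal, are uniquely determined. If $b = a + n\pi$, then $\sin b = (-1)^n\sin a$ and $\cos b = (-1)^n\cos a$, so every extremal satisfies $y(b) = (-1)^ny(a)$: a solution exists only if $y_b = (-1)^ny_a$, and then it is not unique; otherwise there is no solution.

3. If $F = y^2 + 2xyy'$, show $I$ has the same values for all curves joining the endpoints.

Using Euler's equation (12) in Chapter 3, we need only show that $\dfrac{d}{dx}F_{y'} = F_y$ holds identically.

$F_{y'} = 2xy$, $\quad F_y = 2y + 2xy'$,
$$\frac{d}{dx}F_{y'} = \frac{d}{dx}(2xy) = 2xy' + 2y, \text{ which is } F_y.$$
Note that $F = \dfrac{d}{dx}(xy^2)$ $\Rightarrow$ $\displaystyle\int_{x_1}^{x_2}F\,dx = xy^2\Big|_{x_1}^{x_2}$, independent of curve.

4. a. Right circular cylinder: minimize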
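$$\int\sqrt{a^2 + \left(\frac{dz}{d\theta}\right)^2}\;d\theta$$
$F(\theta, z, z') = \sqrt{a^2 + z'^2}$; $\ F - z'F_{z'} = c_1$, $\ F_{z'} = \tfrac12(a^2 + z'^2)^{-1/2}\,2z'$:
$$\sqrt{a^2 + z'^2} - \frac{z'^2}{\sqrt{a^2 + z'^2}} = c_1 \;\Rightarrow\; a^2 + z'^2 - z'^2 = c_1\sqrt{a^2 + z'^2} \;\Rightarrow\; \sqrt{a^2 + z'^2} = \frac{a^2}{c_1}$$
$$a^2 + z'^2 = \left(\frac{a^2}{c_1}\right)^2 \;\Rightarrow\; z'^2 = \left(\frac{a^2}{c_1}\right)^2 - a^2 \;\Rightarrow\; z' = \sqrt{\left(\frac{a^2}{c_1}\right)^2 - a^2}$$
$$z = \sqrt{\left(\frac{a^2}{c_1}\right)^2 - a^2}\;\theta + c_2,$$
a 2-parameter family of helical lines.

4. b. Right circular cone: minimize
$$\int\sqrt{\left(\frac{dr}{d\theta}\right)^2 + r^2\sin^2\alpha}\;d\theta$$
$F(\theta, r, r') = \sqrt{r'^2 + r^2\sin^2\alpha}$. No dependence on $\theta$, thus we can use (21): $F_{r'} = \tfrac12(r'^2 + r^2\sin^2\alpha)^{-1/2}\,2r'$, $\ F - r'F_{r'} = c_1$:
$$\sqrt{r'^2 + r^2\sin^2\alpha} - \frac{r'^2}{\sqrt{r'^2 + r^2\sin^2\alpha}} = c_1 \;\Rightarrow\; r^2\sin^2\alpha = c_1\sqrt{r'^2 + r^2\sin^2\alpha}$$
$$r'^2 + r^2\sin^2\alpha = \left(\frac{r^2\sin^2\alpha}{c_1}\right)^2 \;\Rightarrow\; r'^2 = r^2\sin^2\alpha\left[\frac{r^2\sin^2\alpha}{c_1^2} - 1\right]$$
$$r' = \frac{r\sin\alpha}{c_1}\sqrt{r^2\sin^2\alpha - c_1^2} \;\Rightarrow\; \frac{c_1\,dr}{r\sin\alpha\sqrt{r^2\sin^2\alpha - c_1^2}} = d\theta.$$
Let $\rho = r\sin\alpha$, $d\rho = \sin\alpha\,dr$:
$$\frac{c_1}{\sin\alpha}\int\frac{d\rho}{\rho\sqrt{\rho^2 - c_1^2}} = \int d\theta \;\Rightarrow\; \frac{1}{\sin\alpha}\sec^{-1}\frac{r\sin\alpha}{c_1} + c_2 = \theta$$
$$r\sin\alpha = c_1\sec\left[(\theta - c_2)\sin\alpha\right].$$

4. c. Sphere: minimize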
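$$\int\sqrt{a^2\left(\frac{d\varphi}{d\theta}\right)^2 + a^2\sin^2\varphi}\;d\theta$$
$F(\varphi, \varphi') = \sqrt{a^2\sin^2\varphi + a^2\varphi'^2}$; $\ F - \varphi'F_{\varphi'} = c_1$:
$$\sqrt{a^2\sin^2\varphi + a^2\varphi'^2} - \frac{a^2\varphi'^2}{\sqrt{a^2\sin^2\varphi + a^2\varphi'^2}} = c_1 \;\Rightarrow\; a^2\sin^2\varphi + a^2\varphi'^2 - a^2\varphi'^2 = c_1\sqrt{a^2\sin^2\varphi + a^2\varphi'^2}$$
$$\varphi'^2 = \sin^2\varphi\left[\left(\frac{a\sin\varphi}{c_1}\right)^2 - 1\right] \;\Rightarrow\; \varphi' = \sin\varphi\sqrt{\left(\frac{a\sin\varphi}{c_1}\right)^2 - 1}$$
$$\int\frac{d\varphi}{\sin\varphi\sqrt{\left(\frac{a}{c_1}\right)^2\sin^2\varphi - 1}} = \theta + c_2.$$

4. d. The surface is given as $\mathbf{r}(\rho, \theta)$ in parametric form: $x = \rho\cos\theta$, $y = \rho\sin\theta$, $z = f(\rho)$. The length of a curve $\rho(t), \theta(t)$ is
$$L = \int_{t_0}^{t_1}\sqrt{\mathbf{r}_\rho\cdot\mathbf{r}_\rho\,\rho'^2 + 2\,\mathbf{r}_\rho\cdot\mathbf{r}_\theta\,\rho'\theta' + \mathbf{r}_\theta\cdot\mathbf{r}_\theta\,\theta'^2}\;dt$$
$\mathbf{r}_\rho = \cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j} + f'(\rho)\,\mathbf{k}$, $\quad \mathbf{r}_\theta = \rho\left(-\sin\theta\,\mathbf{i} + \cos\theta\,\mathbf{j}\right)$,
$\mathbf{r}_\rho\cdot\mathbf{r}_\rho = \cos^2\theta + \sin^2\theta + f'^2(\rho) = 1 + [f'(\rho)]^2$, $\quad \mathbf{r}_\rho\cdot\mathbf{r}_\theta = 0$, $\quad \mathbf{r}_\theta\cdot\mathbf{r}_\theta = \rho^2$
$$\Rightarrow\; L = \int_{t_0}^{t_1}\sqrt{\left(1 + [f'(\rho)]^2\right)\rho'^2 + \rho^2\theta'^2}\;dt \quad\text{or}\quad L = \int\sqrt{\left(1 + [f'(\rho)]^2\right)\left(\frac{d\rho}{d\theta}\right)^2 + \rho^2}\;d\theta.$$
So $F$ is a function of $\rho$ and $\rho' = \dfrac{d\rho}{d\theta}$; $\ F - \rho'F_{\rho'} = c_1$ with
$$F_{\rho'} = \tfrac12\left\{\left(1 + [f'(\rho)]^2\right)\rho'^2 + \rho^2\right\}^{-1/2}\,2\left(1 + [f'(\rho)]^2\right)\rho'$$
$$\sqrt{\left(1 + [f'(\rho)]^2\right)\rho'^2 + \rho^2} - \frac{\left(1 + [f'(\rho)]^2\right)\rho'^2}{\sqrt{\left(1 + [f'(\rho)]^2\right)\rho'^2 + \rho^2}} = c_1 \;\Rightarrow\; \rho^2 = c_1\sqrt{\left(1 + [f'(\rho)]^2\right)\rho'^2 + \rho^2}$$
$$\left(\frac{\rho^2}{c_1}\right)^2 = \left(1 + [f'(\rho)]^2\right)\rho'^2 + \rho^2 \;\Rightarrow\; \rho' = \sqrt{\frac{\left(\rho^2/c_1\right)^2 - \rho^2}{1 + [f'(\rho)]^2}}.$$

5. $F = f(x)\,y'^2$. Using (12), $\dfrac{d}{dx}F_{y'} = F_y$, with $F_y = 0$, $\ F_{y'} = 2f(x)y'$:
$$\frac{d}{dx}\left(2f(x)y'\right) = 0 \;\Rightarrow\; f(x)y' = c \;\Rightarrow\; y' = \frac{c}{f(x)} \;\Rightarrow\; y = \int\frac{c}{f(x)}\,dx + k.$$
Using $y(0) = 0$: $\ y(x) = \displaystyle\int_0^x\frac{c}{f(\xi)}\,d\xi$. Using $y(1) = 1$: $\ \displaystyle\int_0^1\frac{c}{f(\xi)}\,d\xi = 1$. Substituting for $f$:
$$-\int_0^{1/4}c\,d\xi + \int_{1/4}^1 c\,d\xi = 1 \;\Rightarrow\; -\frac{c}{4} + c\left(1 - \frac14\right) = 1 \;\Rightarrow\; \frac12c = 1 \;\Rightarrow\; c = 2$$
$$y(x) = \int_0^x\frac{2}{f(\xi)}\,d\xi,$$
that is, $y = -2x$ for $0 \le x \le \tfrac14$ and $y = 2x - 1$ for $\tfrac14 \le x \le 1$.

6. a. $J(y) = \int_0^1 y'\,dx$, $\ y(0) = 0$, $y(1) = 1$. Euler's equation in this case is
$$\frac{d}{dx}1 = 0$$
which is satisfied for all $y$. Clearly $y$ should also satisfy the boundary conditions; $y = x$ is one example. Looking at this problem from another point of view, notice that $J(y)$ can be computed directly and we have (after using the boundary conditions)
$$J(y) = 1.$$
Since this value is constant, the functional is minimized by any $y$ that satisfies the boundary conditions.

b. $J(y) = \int_0^1 yy'\,dx$, $\ y(0) = 0$, $y(1) = 1$. Euler's equation in this case is
$$\frac{d}{dx}y = y'$$
which is the identity $y' = y'$, satisfied for all $y$. Clearly $y$ should also satisfy the boundary conditions; $y = x$ is one example. Looking at this problem from another point of view, notice that $J(y)$ can be computed directly and we have (after using the boundary conditions)
$$J(y) = \tfrac12.$$
Since this value is constant, the functional is minimized by any $y$ that satisfies the boundary conditions.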
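The path independence found in 6a and 6b is easy to confirm numerically. The sketch below integrates $y'$ and $yy'$ with a composite Simpson rule for several admissible functions with $y(0) = 0$, $y(1) = 1$; the trial functions and the helper `simpson` are illustrative choices, not taken from the manual.

```python
import math

def simpson(g, a, b, n=1000):
    # composite Simpson rule with an even number n of subintervals
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

# admissible trial functions with y(0) = 0 and y(1) = 1, stored as (y, y')
trials = {
    "x": (lambda x: x, lambda x: 1.0),
    "x^2": (lambda x: x * x, lambda x: 2.0 * x),
    "sin(pi x / 2)": (lambda x: math.sin(math.pi * x / 2.0),
                      lambda x: math.pi / 2.0 * math.cos(math.pi * x / 2.0)),
    "(e^x - 1)/(e - 1)": (lambda x: (math.exp(x) - 1.0) / (math.e - 1.0),
                          lambda x: math.exp(x) / (math.e - 1.0)),
}

# 6a: integral of y' ; 6b: integral of y * y'
J_a = {name: simpson(dy, 0.0, 1.0) for name, (y, dy) in trials.items()}
J_b = {name: simpson(lambda x, y=y, dy=dy: y(x) * dy(x), 0.0, 1.0)
       for name, (y, dy) in trials.items()}

for name in trials:
    print(name, round(J_a[name], 9), round(J_b[name], 9))
```

Every trial function gives the same pair of values, $1$ and $\tfrac12$ up to quadrature error, because both integrands are exact derivatives ($y$ and $\tfrac12y^2$ respectively).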
c. $J(y) = \int_0^1 xyy'\,dx$, $\ y(0) = 0$, $y(1) = 1$. Euler's equation in this case is
$$\frac{d}{dx}(xy) = xy'$$
which is $y + xy' = xy'$, or $y = 0$. Clearly this $y$ can NOT satisfy the boundary conditions.

7. a. $J(y) = \int_0^1\dfrac{y'^2}{x^3}\,dx$, $\ F = \dfrac{(y')^2}{x^3}$. Euler equation: $\dfrac{d}{dx}F_{y'} = F_y$ with $F_{y'} = \dfrac{2y'}{x^3}$, $\ F_y = 0$. Integrate Euler's equation:
$$F_{y'} = c \;\Longrightarrow\; \frac{2y'}{x^3} = c \;\Longrightarrow\; 2y' = cx^3 \;\Longrightarrow\; y' = \frac{cx^3}{2} \;\Longrightarrow\; y = \frac{cx^4}{8} + b.$$

b. $J(y) = \int_0^1\left(y^2 + (y')^2 + 2ye^x\right)dx$, $\ F = y^2 + (y')^2 + 2ye^x$, $\ F_x = 2ye^x$, $\ F_y = 2y + 2e^x$, $\ F_{y'} = 2y'$. Euler equation:
$$\frac{d}{dx}F_{y'} = F_y \;\Longrightarrow\; \frac{d}{dx}2y' = 2y + 2e^x \;\Longrightarrow\; y'' - y = e^x.$$
Solve the homogeneous equation $y'' - y = 0$ $\Longrightarrow$ $y_h = c_1e^{x} + c_2e^{-x}$. Since $e^x$ already solves the homogeneous equation, look for a particular solution of the form $y_p = Cxe^x$: then $y_p'' - y_p = 2Ce^x$, so $C = \tfrac12$ and $y_p = \tfrac12xe^x$. Therefore the general solution of the nonhomogeneous equation is
$$y = c_1e^{x} + c_2e^{-x} + \tfrac12xe^x.$$

8. Obtain the necessary condition for a function $y$ to be a local minimum of the functional
$$J(y) = \int_a^b\!\!\int_a^b K(s, t)y(s)y(t)\,ds\,dt + \int_a^b y(t)^2\,dt - 2\int_a^b y(t)f(t)\,dt.$$
Find the first variation of $J$:
$$J(y + \varepsilon\eta) = \int_a^b\!\!\int_a^b K(s, t)[y(s) + \varepsilon\eta(s)][y(t) + \varepsilon\eta(t)]\,ds\,dt + \int_a^b[y(t) + \varepsilon\eta(t)]^2\,dt - 2\int_a^b[y(t) + \varepsilon\eta(t)]f(t)\,dt$$
Then,
$$\frac{d}{d\varepsilon}J(y + \varepsilon\eta) = \int_a^b\!\!\int_a^b\left\{K(s, t)[y(s) + \varepsilon\eta(s)]\eta(t) + K(s, t)[y(t) + \varepsilon\eta(t)]\eta(s)\right\}ds\,dt + 2\int_a^b[y(t) + \varepsilon\eta(t)]\eta(t)\,dt - 2\int_a^b\eta(t)f(t)\,dt$$
Now letting $\varepsilon = 0$ we have
$$\left.\frac{d}{d\varepsilon}J(y + \varepsilon\eta)\right|_{\varepsilon = 0} = \int_a^b\left\{\int_a^b K(s, t)y(s)\,ds\right\}\eta(t)\,dt + \int_a^b\left\{\int_a^b K(s, t)y(t)\,dt\right\}\eta(s)\,ds + 2\int_a^b y(t)\eta(t)\,dt - 2\int_a^b f(t)\eta(t)\,dt$$
Since the limits of $s$ and $t$ are constants, we can interchange $s$ for $t$, and vice versa, in the second of the four terms above:
$$= \int_a^b\left\{\int_a^b K(s, t)y(s)\,ds\right\}\eta(t)\,dt + \int_a^b\left\{\int_a^b K(t, s)y(s)\,ds\right\}\eta(t)\,dt + 2\int_a^b y(t)\eta(t)\,dt - 2\int_a^b f(t)\eta(t)\,dt$$
Combining the first two terms and factoring out $\eta(t)\,dt$ yields
$$= \int_a^b\left\{\int_a^b[K(s, t) + K(t, s)]y(s)\,ds + 2y(t) - 2f(t)\right\}\eta(t)\,dt$$
Setting this equal to 0 implies
$$\frac12\int_a^b[K(s, t) + K(t, s)]\,y(s)\,ds + y(t) = f(t),$$
which is a Fredholm equation.

9. Given $F = (1 + x)(y')^2$. It is easy to find that $F_{y'} = 2y'(1 + x)$, $\ F_y = 0$. Therefore
$$\frac{d}{dx}F_{y'} = 0 \;\Longrightarrow\; \frac{d}{dx}\left[y'(1 + x)\right] = 0.$$
Integrating both sides we obtain
$$y'(1 + x) = c_1 \;\Longrightarrow\; y' = \frac{c_1}{1 + x}.$$
Integrating again leads to $y = c_1\ln(1 + x) + c_2$. Now applying the boundary conditions,
$y(0) = 0 \Longrightarrow c_1\ln(1 + 0) + c_2 = 0 \Longrightarrow c_2 = 0$
$y(1) = 1 \Longrightarrow c_1\ln(1 + 1) = 1 \Longrightarrow c_1 = \dfrac{1}{\ln 2}$
Therefore the final solution is
$$y = \frac{\ln(1 + x)}{\ln 2}.$$
It is easy to show that in that case the functional $J(y)$ is $\dfrac{1}{\ln 2}$.

If our boundary condition at $x = 1$ was $y'(1) = 0$, then $y = c_1\ln(1 + x)$ and $y' = \dfrac{c_1}{1 + x}$. Then $y'(1) = \dfrac{c_1}{1 + 1} = 0 \Longrightarrow c_1 = 0$ and we get the trivial solution.
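The value $J = 1/\ln 2$ and the minimizing property of the extremal in problem 9 can be checked numerically. The sketch below evaluates $J$ on $y = \ln(1 + x)/\ln 2 + \varepsilon\,x(1 - x)$; the perturbation $x(1 - x)$, which vanishes at both ends, and the helper names are illustrative choices, not part of the manual.

```python
import math

def simpson(g, a, b, n=1000):
    # composite Simpson rule with an even number n of subintervals
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

C1 = 1.0 / math.log(2.0)  # the constant c1 = 1 / ln 2 found above

def J(eps):
    # J evaluated on y = ln(1 + x)/ln 2 + eps * x(1 - x)
    def integrand(x):
        dy = C1 / (1.0 + x) + eps * (1.0 - 2.0 * x)
        return (1.0 + x) * dy * dy
    return simpson(integrand, 0.0, 1.0)

J0 = J(0.0)
perturbed = {eps: J(eps) for eps in (-0.2, -0.1, 0.1, 0.2)}

print(J0, C1)
for eps, value in perturbed.items():
    print(eps, value - J0)
```

For this perturbation the increase is $\varepsilon^2\int_0^1(1 + x)(1 - 2x)^2\,dx = \varepsilon^2/2$, since the first-order term vanishes on an extremal, so each printed difference should be close to $\varepsilon^2/2$.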
10. Find the extremal: $J(y) = \int_a^b\left(x^2y'^2 + y^2\right)dx$.

$F = x^2(y')^2 + y^2$, $\quad F_{y'} = 2x^2y'$, $\quad F_y = 2y$, $\quad \dfrac{d}{dx}F_{y'} = 2x^2y'' + 4xy'$.

Euler's equation: $\dfrac{d}{dx}F_{y'} - F_y = 0$:
$$2x^2y'' + 4xy' - 2y = 0 \;\Rightarrow\; x^2y'' + 2xy' - y = 0.$$
Thus $(a, b)$ must not contain the origin. This is an Euler equation. Let $y = x^r$, $\ y' = rx^{r-1}$, $\ y'' = r(r - 1)x^{r-2}$. Substituting,
$$(r^2 - r)x^r + 2rx^r - x^r = 0 \;\Rightarrow\; (r^2 - r + 2r - 1)x^r = 0 \;\Rightarrow\; r^2 + r - 1 = 0$$
$$r = \frac{-1 \pm \sqrt{1 + 4}}{2} = \frac{-1 \pm \sqrt5}{2}$$
so $y = x^{(-1 - \sqrt5)/2}$ and $y = x^{(-1 + \sqrt5)/2}$ are solutions, and
$$y(x) = c_1x^{-1.618} + c_2x^{0.618} \quad\text{for } a \le x \le b.$$

CHAPTER 4

4 Variable End-Point Problems
Problems
1. Solve the problem: minimize
$$I = \int_0^{x_1}\left[y^2 - (y')^2\right]dx$$
with left end point fixed and $y(x_1)$ along the curve $x_1 = \dfrac{\pi}{4}$.

2. Find the extremals for
$$I = \int_0^1\left[\tfrac12(y')^2 + yy' + y' + y\right]dx$$
where end values of $y$ are free.

3. a. Solve the Euler-Lagrange equation for
$$I = \int_a^b y\sqrt{1 + (y')^2}\;dx$$
where $y(a) = A$, $y(b) = B$.
b. Investigate the special case when $a = -b$, $A = B$ and show that depending upon the relative size of $b$, $B$ there may be none, one or two candidate curves that satisfy the requisite endpoint conditions.

4. Solve the Euler-Lagrange equation associated with
$$I = \int_a^b\left[y^2 - yy' + (y')^2\right]dx.$$

5. What is the relevant Euler-Lagrange equation associated with
$$I = \int_0^1\left[y^2 + 2xy + (y')^2\right]dx?$$

6. Investigate all possibilities with regard to transversality for the problem
$$\min\int_a^b\sqrt{1 - (y')^2}\;dx.$$

7. Determine the stationary functions associated with the integral
$$I = \int_0^1\left[(y')^2 - 2\alpha yy' - 2\beta y'\right]dx$$
where $\alpha$ and $\beta$ are constants, in each of the following situations:
a. The end conditions $y(0) = 0$ and $y(1) = 1$ are preassigned.
b. Only the end condition $y(0) = 0$ is preassigned.
c. Only the end condition $y(1) = 1$ is preassigned.
d. No end conditions are preassigned.

8. Determine the natural boundary conditions associated with the determination of extremals in each of the cases considered in Problem 1 of Chapter 3.

9. Find the curves for which the functional
$$I = \int_0^{x_1}\frac{\sqrt{1 + y'^2}}{y}\,dx$$
with $y(0) = 0$ can have extrema, if
a. The point $(x_1, y_1)$ can vary along the line $y = x - 5$.
b. The point $(x_1, y_1)$ can vary along the circle $(x - 9)^2 + y^2 = 9$.

10. If $F$ depends upon $x_2$, show that the transversality condition must be replaced by
$$\left[F + (\varphi' - y')\frac{\partial F}{\partial y'}\right]_{x = x_2} + \int_{x_1}^{x_2}\frac{\partial F}{\partial x_2}\,dx = 0.$$

11. Find an extremal for
$$J(y) = \int_1^e\left[\tfrac12x^2(y')^2 - \tfrac18y^2\right]dx, \qquad y(1) = 1,\ y(e) \text{ is unspecified.}$$

12. Find an extremal for
$$J(y) = \int_0^1(y')^2\,dx + y(1)^2, \qquad y(0) = 1,\ y(1) \text{ is unspecified.}$$

1. $F = y^2 - (y')^2$
$$F_y - \frac{d}{dx}F_{y'} = 0 \;\Rightarrow\; 2y - \frac{d}{dx}(-2y') = 0 \;\Rightarrow\; y'' + y = 0$$
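$$y = A\cos x + B\sin x$$
Using $y(0) = 0$: $\ y = B\sin x$.

Now for the transversality condition
$$\left[F + (\varphi' - y')F_{y'}\right]_{x = \pi/4} = 0,$$
where $\varphi'$ is the slope of the curve. Since the curve is $x = \dfrac{\pi}{4}$ (vertical line, slope is infinite) we should rewrite the condition as
$$\left[\frac{1}{\varphi'}F + \left(1 - \frac{y'}{\varphi'}\right)F_{y'}\right]_{x = \pi/4} = 0,$$
in which the terms containing $1/\varphi'$ vanish, leaving
$$F_{y'}\Big|_{x = \pi/4} = 0 \;\Rightarrow\; -2y'\Big|_{x = \pi/4} = 0 \;\Rightarrow\; -2B\cos\frac{\pi}{4} = 0 \;\Rightarrow\; B = 0 \;\Rightarrow\; y \equiv 0.$$

2. $F = \tfrac12y'^2 + yy' + y' + y$; $\ F_y - \dfrac{d}{dx}F_{y'} = 0$ with $F_y = y' + 1$, $\ F_{y'} = y' + y + 1$:
$$F_y - \frac{d}{dx}F_{y'} = y' + 1 - (y'' + y') = 0 \;\Rightarrow\; y'' - 1 = 0$$
$y_H = Ax + B$, $\ y_P = \tfrac12x^2$, so
$$y = Ax + B + \tfrac12x^2.$$
Free ends at $x = 0$, $x = 1$:
$$\left[F + (\varphi' - y')F_{y'}\right]_{x = 0} = 0, \qquad \left[F + (\varphi' - y')F_{y'}\right]_{x = 1} = 0.$$
The free ends are on the vertical lines $x = 0$, $x = 1$, so these reduce to
$$F_{y'}\Big|_{x = 0} = 0 \;\Rightarrow\; y'(0) + y(0) + 1 = 0 \;\Rightarrow\; A + B + 1 = 0$$
$$F_{y'}\Big|_{x = 1} = 0 \;\Rightarrow\; y'(1) + y(1) + 1 = 0 \;\Rightarrow\; 2A + B + \tfrac52 = 0$$
Subtracting, $A + \tfrac32 = 0$ $\Rightarrow$ $A = -\tfrac32$, $\ B = -A - 1 = \tfrac12$, and
$$y = -\tfrac32x + \tfrac12 + \tfrac12x^2.$$

3. $F = y\sqrt{1 + y'^2}$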
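$F_y = \sqrt{1 + y'^2}$, $\quad F_{y'} = y\cdot\tfrac12(1 + y'^2)^{-1/2}\,2y' = \dfrac{yy'}{\sqrt{1 + y'^2}}$, $\quad F - y'F_{y'} = c_1$:
$$y\sqrt{1 + y'^2} - \frac{yy'^2}{\sqrt{1 + y'^2}} = c_1 \;\Rightarrow\; y(1 + y'^2) - yy'^2 = c_1\sqrt{1 + y'^2} \;\Rightarrow\; \frac{y^2}{c_1^2} = 1 + y'^2$$
$$y' = \sqrt{\frac{y^2}{c_1^2} - 1} \;\Rightarrow\; \int\frac{c_1\,dy}{\sqrt{y^2 - c_1^2}} = \int dx \;\Rightarrow\; c_1\operatorname{arccosh}\frac{y}{c_1} + c_2 = x$$
(or $c_1\ln\left(y + \sqrt{y^2 - c_1^2}\right) + c_2 = x$ with a redefined $c_2$), so $\operatorname{arccosh}\dfrac{y}{c_1} = \dfrac{x - c_2}{c_1}$ and
$$y = c_1\cosh\frac{x - c_2}{c_1}.$$
The end conditions give
$$y(a) = A \;\Rightarrow\; c_1\cosh\frac{a - c_2}{c_1} = A \tag{1}$$
$$y(b) = B \;\Rightarrow\; c_1\cosh\frac{b - c_2}{c_1} = B \tag{2}$$
Formally $a - c_2 = \pm c_1\operatorname{arccosh}\dfrac{A}{c_1}$ and $b - c_2 = \pm c_1\operatorname{arccosh}\dfrac{B}{c_1}$, where each sign depends on which side of the vertex $x = c_2$ the endpoint lies. Eliminating $c_2$ gives one equation for $c_1$, and then $c_2$ follows from (1).

b. If $a = -b$ and $A = B$, then (1) and (2) give $\cosh\dfrac{-b - c_2}{c_1} = \cosh\dfrac{b - c_2}{c_1}$. Since $\cosh$ is even and one-to-one on $[0, \infty)$, this forces $|b + c_2| = |b - c_2|$, i.e. $c_2 = 0$ (for $b \ne 0$). Then $c_1\cosh\dfrac{b}{c_1} = B$, or with $u = b/c_1 > 0$,
$$\frac{\cosh u}{u} = \frac{B}{b}.$$
The function $\cosh u/u$ has a single minimum on $u > 0$, at the root $u_0 \approx 1.1997$ of $u\tanh u = 1$, where its value is about $1.5089$. Hence there is no candidate curve if $B/b < 1.5089$, exactly one if $B/b \approx 1.5089$, and two if $B/b > 1.5089$.

4. $F = y^2 - yy' + (y')^2$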
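$F - y'F_{y'} = c_1$:
$$y^2 - yy' + y'^2 - y'(-y + 2y') = c_1 \;\Rightarrow\; y^2 - (y')^2 = c_1 \;\Rightarrow\; (y')^2 = y^2 - c_1 \;\Rightarrow\; y' = \sqrt{y^2 - c_1}$$
$$\frac{dy}{\sqrt{y^2 - c_1}} = dx \;\Rightarrow\; \operatorname{arccosh}\frac{y}{\sqrt{c_1}} + c_2 = x.$$

5. $F = y^2 + 2xy + (y')^2$
$$F_y - \frac{d}{dx}F_{y'} = 0 \;\Rightarrow\; 2y + 2x - \frac{d}{dx}(2y') = 0 \;\Rightarrow\; -2y'' + 2y = -2x \;\Rightarrow\; y'' - y = x$$
$y'' - y = 0$ $\Rightarrow$ $y_h = Ae^x + Be^{-x}$; $\ y_p = Cx + D$ $\Rightarrow$ $-Cx - D = x$, so $C = -1$, $D = 0$, $y_p = -x$:
$$y = Ae^x + Be^{-x} - x.$$

6. $F = \sqrt{1 - (y')^2}$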
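$$F_{y'} = \frac{-2y'}{2\sqrt{1 - (y')^2}} = \frac{-y'}{\sqrt{1 - (y')^2}}$$
Since $F_y = 0$, the Euler equation gives $y' = $ constant:
$$y = Ax + B, \qquad y' = A .$$
The transversality conditions are
$$\Big[F + (\varphi' - y')F_{y'}\Big]_{x=a} = 0, \qquad \Big[F + (\psi' - y')F_{y'}\Big]_{x=b} = 0 .$$
At $x = a$,
$$\sqrt{1 - (y')^2} + (\varphi' - y')\frac{-y'}{\sqrt{1 - (y')^2}} = 0$$
$$1 - (y')^2 + (y')^2 - \varphi'y' = 0 \;\Rightarrow\; \varphi' = \frac{1}{y'} = \frac{1}{A} \quad \text{at } x = a,$$
and in the same way $\psi' = \frac{1}{y'} = \frac{1}{A}$ at $x = b$. Therefore if both end points are free then the slopes are the same:
$$\varphi'(a) = \psi'(b) = \frac{1}{A} .$$

7. $F = (y')^2 - 2\alpha yy' - 2\beta y'$

a. $y(0) = 0$, $y(1) = 1$
$$F_y - \frac{d}{dx}F_{y'} = 0$$
$$-2\alpha y' - \frac{d}{dx}(2y' - 2\alpha y - 2\beta) = 0$$
$$-2\alpha y' - 2y'' + 2\alpha y' = 0 \;\Rightarrow\; y'' = 0$$
$$y = Ax + B$$
$$y(0) = 0 \;\Rightarrow\; B = 0, \qquad y(1) = 1 \;\Rightarrow\; A = 1, \qquad y = x$$

b. If only $y(0) = 0$, then $y = Ax$. The transversality condition at $x = 1$ implies
$$F_{y'}\Big|_{x=1} = 0: \qquad 2y'(1) - 2\alpha y(1) - 2\beta = 0 .$$
Substituting for $y$:
$$2A - 2\alpha A - 2\beta = 0 .$$
Thus the solution is
$$A = \frac{\beta}{1 - \alpha}, \qquad y = \frac{\beta}{1 - \alpha}\,x .$$

c. $y(1) = 1$ only
$$y = Ax + B, \qquad y(1) = 1 \;\Rightarrow\; A + B = 1 \;\Rightarrow\; B = 1 - A, \qquad y = Ax + 1 - A$$
$$F_{y'}\Big|_{x=0} = 0: \qquad 2y'(0) - 2\alpha y(0) - 2\beta = 0$$
$$A - \alpha(1 - A) - \beta = 0$$
$$A(1 + \alpha) = \alpha + \beta \;\Rightarrow\; A = \frac{\alpha + \beta}{\alpha + 1}$$

d. No end conditions
$$y = Ax + B, \qquad y' = A$$
$$2y'(0) - 2\alpha y(0) - 2\beta = 0 \;\Rightarrow\; 2A - 2\alpha B - 2\beta = 0$$
$$2y'(1) - 2\alpha y(1) - 2\beta = 0 \;\Rightarrow\; 2A - 2\alpha(A + B) - 2\beta = 0$$
Subtracting the two equations gives $A = 0$, and then
$$2\alpha B = 2A - 2\beta = -2\beta \;\Rightarrow\; B = -\frac{\beta}{\alpha} .$$

8. Natural boundary conditions are
$$F_{y'}\Big|_{x=a,\,b} = 0 \qquad \text{(1a of chapter 3)}$$

a. For $F = (y')^2 - k^2y^2$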
$$F_{y'} = 2y' \;\Rightarrow\; y'(a) = 0, \quad y'(b) = 0$$

b. For $F = (y')^2 + 2y$: exactly the same.

c. For $F = (y')^2 + 4xy'$
$$F_{y'} = 2y' + 4x \;\Rightarrow\; y'(a) + 2a = 0, \quad y'(b) + 2b = 0$$

d. $F = (y')^2 + yy' + y^2$
$$F_{y'} = 2y' + y \;\Rightarrow\; 2y'(a) + y(a) = 0, \quad 2y'(b) + y(b) = 0$$

e. $F = x(y')^2 - yy' + y$
$$F_{y'} = 2xy' - y \;\Rightarrow\; 2ay'(a) - y(a) = 0, \quad 2by'(b) - y(b) = 0$$

f. $F = a(x)(y')^2 - b(x)y^2$
$$F_{y'} = 2a(x)y'$$
Writing $\alpha$ and $\beta$ for the two end points,
$$2a(\alpha)\,y'(\alpha) = 0, \qquad 2a(\beta)\,y'(\beta) = 0 .$$
Divide by $2a(\alpha)$ or $2a(\beta)$ to get the same as part a.

g. $F = (y')^2 + k^2\cos y$
$$F_{y'} = 2y'$$
Same as part a.

9. $F = \dfrac{\sqrt{1 + (y')^2}}{y}$
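$$F_y - \frac{d}{dx}F_{y'} = 0$$
$$F_y = -\frac{\sqrt{1 + (y')^2}}{y^2}, \qquad F_{y'} = \frac{y'}{y\sqrt{1 + (y')^2}}$$
$$\frac{d}{dx}F_{y'} = \frac{y''\,y\sqrt{1 + (y')^2} - y'\Big(y'\sqrt{1 + (y')^2} + y\,\dfrac{y'y''}{\sqrt{1 + (y')^2}}\Big)}{y^2\big(1 + (y')^2\big)}$$
Substituting in $F_y - \frac{d}{dx}F_{y'} = 0$ and multiplying by $y^2\big(1 + (y')^2\big)^{3/2}$:
$$-\big(1 + (y')^2\big)^2 - yy''\big(1 + (y')^2\big) + (y')^2\big(1 + (y')^2\big) + y(y')^2y'' = 0$$
$$-1 - 2(y')^2 - (y')^4 - yy'' - y(y')^2y'' + (y')^2 + (y')^4 + y(y')^2y'' = 0$$
$$yy'' + (y')^2 = -1$$
$$(yy')' = -1$$
$$yy' = -x + c_1$$
$$y\,dy = (-x + c_1)\,dx$$
$$\tfrac{1}{2}y^2 = -\tfrac{1}{2}x^2 + c_1x + c_2$$
$$y^2 = -x^2 + 2c_1x + 2c_2$$
$$y(0) = 0 \;\Rightarrow\; c_2 = 0$$

a. Transversality condition, with $\varphi' = 1$:
$$\Big[F + (1 - y')F_{y'}\Big]_{x=x_1} = 0$$
$$\Big[\frac{\sqrt{1 + (y')^2}}{y} + \frac{(1 - y')\,y'}{y\sqrt{1 + (y')^2}}\Big]_{x=x_1} = 0$$
$$\Big[1 + (y')^2 + y' - (y')^2\Big]_{x=x_1} = 0 \;\Rightarrow\; 1 + y'(x_1) = 0 \;\Rightarrow\; y'(x_1) = -1$$
Since $y^2 = -x^2 + 2c_1x$, we have $2yy' = -2x + 2c_1$, and at $x = x_1$
$$2\,y(x_1)\,y'(x_1) = -2x_1 + 2c_1 .$$
On the line $y(x_1) = x_1 - 5$, and $y'(x_1) = -1$, so
$$-2(x_1 - 5) = -2x_1 + 2c_1 \;\Rightarrow\; c_1 = 5$$
$$y^2 = -x^2 + 10x \qquad \text{or} \qquad y = \pm\sqrt{10x - x^2}$$

b. On $(x - 9)^2 + y^2 = 9$ the slope $\varphi'$ is computed from
$$2(x - 9) + 2y\varphi' = 0 \;\Rightarrow\; y\varphi' = -(x - 9) .$$
At $x = x_1$
$$\varphi'(x_1) = -\frac{x_1 - 9}{y(x_1)} .$$
Remember that at $x_1$ the value $y(x_1)$ from the solution,
$$y(x_1)^2 = -x_1^2 + 2c_1x_1,$$
is the same as from the circle,
$$y(x_1)^2 + (x_1 - 9)^2 = 9 .$$
$$\Rightarrow\; -x_1^2 + 2c_1x_1 = 9 - (x_1 - 9)^2$$
$$\Rightarrow\; -x_1^2 + 2c_1x_1 = 9 - x_1^2 + 18x_1 - 81$$
$$c_1x_1 = 9x_1 - 36 \qquad (*)$$
Substituting in the transversality condition
$$\Big[F + (\varphi' - y')F_{y'}\Big]_{x=x_1} = 0,$$
which reduces to $1 + \varphi'(x_1)\,y'(x_1) = 0$, and using $y'(x_1) = \dfrac{-x_1 + c_1}{y(x_1)}$, we have
$$1 - \frac{x_1 - 9}{y(x_1)}\cdot\frac{-x_1 + c_1}{y(x_1)} = 0$$
$$1 + \frac{(x_1 - 9)(x_1 - c_1)}{y^2(x_1)} = 0$$
$$y^2(x_1) + (x_1 - 9)(x_1 - c_1) = 0$$
With $y^2(x_1) = 9 - (x_1 - 9)^2$:
$$9 - (x_1 - 9)^2 + (x_1 - 9)(x_1 - c_1) = 0$$
$$9 - (x_1 - 9)\big[x_1 - 9 - x_1 + c_1\big] = 0$$
$$9 - (x_1 - 9)(c_1 - 9) = 0$$
Solve this with (*), $c_1x_1 = 9x_1 - 36$, that is $c_1 = 9 - \dfrac{36}{x_1}$, to get:
$$9 - (x_1 - 9)\Big(-\frac{36}{x_1}\Big) = 0$$
$$9 + 36 = \frac{9\cdot 36}{x_1} \;\Rightarrow\; x_1 = \frac{9\cdot 36}{45} = \frac{36}{5}$$
$$\Rightarrow\; c_1 = 9 - 5 = 4 \;\Rightarrow\; y^2 = -x^2 + 8x$$

10. In this case equation (8) will have another term resulting from the dependence of $F$ on $x_2(\epsilon)$, that is
$$\int_{x_1(0)}^{x_2(0)} \frac{\partial F}{\partial x_2}\,dx$$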
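The values reported at the end of solution 9(b), $x_1 = 36/5$ and $c_1 = 4$, can be checked with exact rational arithmetic. The following is a minimal Python sketch, not part of the original notes; the variable names are illustrative. It confirms that the extremal meets the circle at $x_1$ and that the reduced transversality condition $9 - (x_1 - 9)(c_1 - 9) = 0$ holds.

```python
from fractions import Fraction

# Values reported in solution 9(b): end-point abscissa x1 and constant c1.
x1 = Fraction(36, 5)
c1 = Fraction(4)

# Extremal y^2 = -x^2 + 2*c1*x evaluated at x1.
y_sq_extremal = -x1**2 + 2 * c1 * x1
# Circle (x - 9)^2 + y^2 = 9 solved for y^2 at x1.
y_sq_circle = 9 - (x1 - 9) ** 2

# Condition (*): c1*x1 = 9*x1 - 36 (the extremal meets the circle).
meets_circle = (c1 * x1 == 9 * x1 - 36)
# Reduced transversality condition: 9 - (x1 - 9)*(c1 - 9) should vanish.
transversality_residual = 9 - (x1 - 9) * (c1 - 9)

print(y_sq_extremal, y_sq_circle, meets_circle, transversality_residual)
```

Because `Fraction` is exact, the comparisons involve no rounding tolerance.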
11. In this problem, one boundary is variable and the line along which this variable point moves is given by $y(e) = y_2$, which implies that $\varphi$ is the line $x = e$. First we satisfy Euler's first equation. Since
$$F = \tfrac{1}{2}x^2(y')^2 - \tfrac{1}{8}y^2,$$
we have
$$F_y - \frac{d}{dx}F_{y'} = 0$$
and so
$$0 = -\tfrac{1}{4}y - \frac{d}{dx}(x^2y') = -\tfrac{1}{4}y - (2xy' + x^2y'') .$$
Therefore
$$x^2y'' + 2xy' + \tfrac{1}{4}y = 0 .$$
This is a Cauchy-Euler equation with assumed solution of the form $y = x^r$. Plugging this in and simplifying results in the following equation for $r$:
$$r^2 + (2 - 1)r + \tfrac{1}{4} = 0,$$
which has two identical real roots, $r_1 = r_2 = -\tfrac{1}{2}$, and therefore the solution to the differential equation is
$$y(x) = c_1x^{-1/2} + c_2x^{-1/2}\ln x .$$
The initial condition $y(1) = 1$ implies that $c_1 = 1$. The solution is then
$$y = x^{-1/2} + c_2x^{-1/2}\ln x .$$
To get the other constant, we have to consider the transversality condition. Therefore we need to solve
$$\Big[F + (\varphi' - y')F_{y'}\Big]_{x=e} = 0,$$
which means we solve the following (note that $\varphi$ is a vertical line):
$$\Big[\frac{1}{\varphi'}F + \Big(1 - \frac{y'}{\varphi'}\Big)F_{y'}\Big]_{x=e} = F_{y'}\Big|_{x=e} = x^2y'\Big|_{x=e} = 0,$$
which implies that $y'(e) = 0$ is our natural boundary condition. Now
$$y' = -\tfrac{1}{2}x^{-3/2} - \tfrac{1}{2}c_2x^{-3/2}\ln x + c_2x^{-3/2} .$$
With this natural boundary condition we get that $c_2 = 1$, and therefore the solution is
$$y(x) = x^{-1/2}(1 + \ln x) .$$
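The result of solution 11 can be checked numerically. The following is a small Python sketch, not part of the original notes; the function names and step sizes are illustrative choices. It approximates $y'$ and $y''$ by central differences and evaluates the residual of $x^2y'' + 2xy' + \tfrac{1}{4}y = 0$, together with the two conditions $y(1) = 1$ and $y'(e) = 0$.

```python
import math

def y(x):
    # Candidate extremal from solution 11: y = x^(-1/2) * (1 + ln x).
    return x ** -0.5 * (1.0 + math.log(x))

def dy(x, h=1e-5):
    # Central-difference approximation of y'(x).
    return (y(x + h) - y(x - h)) / (2 * h)

def d2y(x, h=1e-4):
    # Central-difference approximation of y''(x).
    return (y(x + h) - 2 * y(x) + y(x - h)) / h**2

def euler_residual(x):
    # Left side of the Cauchy-Euler equation x^2 y'' + 2 x y' + y/4 = 0.
    return x**2 * d2y(x) + 2 * x * dy(x) + 0.25 * y(x)

residuals = [abs(euler_residual(x)) for x in (1.2, 1.7, 2.3)]
left_end_value = y(1.0)      # fixed end condition, should equal 1
right_end_slope = dy(math.e) # natural boundary condition, should be about 0

print(residuals, left_end_value, right_end_slope)
```

The residuals are not exactly zero because finite differences carry truncation and rounding error; they only need to be small compared with the individual terms.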
12. Find an extremal for
$$J(y) = \int_0^1 (y')^2\,dx + y(1)^2,$$
where $y(0) = 1$ and $y(1)$ is unspecified.

$F = (y')^2 + y(1)^2$, $F_y = 0$, $F_{y'} = 2y'$.

Notice that since $y(1)$ is unspecified, the right hand value is on the vertical line $x = 1$. By the Fundamental Lemma, an extremal solution, $y$, must satisfy the Euler equation
$$F_y - \frac{d}{dx}F_{y'} = 0:$$
$$0 - \frac{d}{dx}2y' = 0, \qquad -2y'' = 0, \qquad y'' = 0 .$$
Solving this ordinary differential equation via standard integration results in the following: $y = Ax + B$. Given the fixed left endpoint equation, $y(0) = 1$, this extremal solution can be further refined to the following: $y = Ax + 1$.

Additionally, $y$ must satisfy a natural boundary condition at $y(1)$. In this case, where $y(1)$ is part of the functional to minimize, we substitute the solution $y = Ax + 1$ into the functional to get:
$$I(A) = \int_0^1 A^2\,dx + (A + 1)^2 = A^2 + (A + 1)^2 .$$
Differentiating $I$ with respect to $A$ and setting the derivative to zero (a necessary condition for a minimum), we have
$$2A + 2(A + 1) = 0 .$$
Therefore
$$A = -\tfrac{1}{2}$$
and the solution is
$$y = -\tfrac{1}{2}x + 1 .$$

CHAPTER 5

5 Higher Dimensional Problems and Another Proof of the Second Euler Equation
Problems
1. A particle moves on the surface $\varphi(x, y, z) = 0$ from the point $(x_1, y_1, z_1)$ to the point $(x_2, y_2, z_2)$ in the time $T$. Show that if it moves in such a way that the integral of its kinetic energy over that time is a minimum, its coordinates must also satisfy the equations
$$\frac{\ddot{x}}{\varphi_x} = \frac{\ddot{y}}{\varphi_y} = \frac{\ddot{z}}{\varphi_z} .$$

2. Specialize problem 1 in the case when the particle moves on the unit sphere, from $(0, 0, 1)$ to $(0, 0, -1)$, in time $T$.

3. Determine the equation of the shortest arc in the first quadrant, which passes through the points $(0, 0)$ and $(1, 0)$ and encloses a prescribed area $A$ with the x-axis, where $A \le \frac{\pi}{8}$.

4. Finish the example on page 51. What if $L = \frac{\pi}{2}$?

5. Solve the following variational problem by finding extremals satisfying the conditions
$$J(y_1, y_2) = \int_0^{\pi/4} \big(4y_1^2 + y_2^2 + y_1'y_2'\big)\,dx$$
$$y_1(0) = 1, \quad y_1\big(\tfrac{\pi}{4}\big) = 0, \quad y_2(0) = 0, \quad y_2\big(\tfrac{\pi}{4}\big) = 1 .$$

6. Solve the isoperimetric problem
$$J(y) = \int_0^1 \big((y')^2 + x^2\big)\,dx, \qquad y(0) = y(1) = 0$$
and
$$\int_0^1 y^2\,dx = 2 .$$

7. Derive a necessary condition for the isoperimetric problem: Minimize
$$I(y_1, y_2) = \int_a^b L(x, y_1, y_2, y_1', y_2')\,dx$$
subject to
$$\int_a^b G(x, y_1, y_2, y_1', y_2')\,dx = C$$
and
$$y_1(a) = A_1, \quad y_2(a) = A_2, \quad y_1(b) = B_1, \quad y_2(b) = B_2,$$
where $C$, $A_1$, $A_2$, $B_1$ and $B_2$ are constants.

8. Use the results of the previous problem to maximize
$$I(x, y) = \int_{t_0}^{t_1} (x\dot{y} - y\dot{x})\,dt$$
subject to
$$\int_{t_0}^{t_1} \sqrt{\dot{x}^2 + \dot{y}^2}\,dt = 1 .$$
Show that $I$ represents the area enclosed by a curve with parametric equations $x = x(t)$, $y = y(t)$ and the constraint fixes the length of the curve.

9. Find extremals of the isoperimetric problem
$$I(y) = \int_0^{\pi} (y')^2\,dx$$
subject to
$$y(0) = y(\pi) = 0, \qquad \int_0^{\pi} y^2\,dx = 1 .$$

1. Kinetic energy $E$ is given by
$$E = \int_0^T \tfrac{1}{2}\big(\dot{x}^2 + \dot{y}^2 + \dot{z}^2\big)\,dt .$$
The problem is to minimize $E$ subject to
$$\varphi(x, y, z) = 0 .$$
Let
$$F(x, y, z) = \tfrac{1}{2}\big(\dot{x}^2 + \dot{y}^2 + \dot{z}^2\big) + \lambda\varphi(x, y, z) .$$
Using (65),
$$F_{y_j} - \frac{d}{dt}F_{y_j'} = 0, \qquad j = 1, 2, 3,$$
we get
$$\lambda\varphi_x - \frac{d}{dt}\dot{x} = 0 \;\Rightarrow\; \lambda = \frac{\ddot{x}}{\varphi_x}$$
$$\lambda\varphi_y - \frac{d}{dt}\dot{y} = 0 \;\Rightarrow\; \lambda = \frac{\ddot{y}}{\varphi_y}$$
$$\lambda\varphi_z - \frac{d}{dt}\dot{z} = 0 \;\Rightarrow\; \lambda = \frac{\ddot{z}}{\varphi_z}$$
$$\Rightarrow\; \frac{\ddot{x}}{\varphi_x} = \frac{\ddot{y}}{\varphi_y} = \frac{\ddot{z}}{\varphi_z}$$
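As a concrete instance of the relation $\ddot{x}/\varphi_x = \ddot{y}/\varphi_y = \ddot{z}/\varphi_z$, the following Python sketch (not part of the original notes) evaluates the ratios for a half great circle on the unit sphere $\varphi = x^2 + y^2 + z^2 - 1$. The path, the value of $T$ and the function names are assumptions chosen for illustration only.

```python
import math

T = 2.0          # total travel time (arbitrary illustrative value)
w = math.pi / T  # angular rate of the assumed half great circle

def position(t):
    # Assumed path on the unit sphere from (0,0,1) at t=0 to (0,0,-1) at t=T.
    return (math.sin(w * t), 0.0, math.cos(w * t))

def acceleration(t):
    # Second time derivative of the path above: each component is -w^2 times itself.
    x, y, z = position(t)
    return (-w * w * x, -w * w * y, -w * w * z)

def grad_phi(t):
    # Gradient of phi = x^2 + y^2 + z^2 - 1 along the path.
    x, y, z = position(t)
    return (2 * x, 2 * y, 2 * z)

def ratios(t):
    # acceleration / gradient, only for components with a nonzero gradient.
    return [a / g for a, g in zip(acceleration(t), grad_phi(t)) if abs(g) > 1e-12]

sample_ratios = [r for t in (0.3, 0.9, 1.4) for r in ratios(t)]
print(sample_ratios, -w * w / 2)
```

All the ratios agree with the single value $-\omega^2/2$, which plays the role of the multiplier $\lambda$ along this path.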
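2. If $\varphi \equiv x^2 + y^2 + z^2 - 1 = 0$ then
$$\frac{\ddot{x}}{2x} = \frac{\ddot{y}}{2y} = \frac{\ddot{z}}{2z} = -\lambda$$
$$\ddot{x} + 2\lambda x = 0, \qquad \ddot{y} + 2\lambda y = 0, \qquad \ddot{z} + 2\lambda z = 0 .$$
Solving,
$$x = A\cos\sqrt{2\lambda}\,t + B\sin\sqrt{2\lambda}\,t$$
$$y = C\cos\sqrt{2\lambda}\,t + D\sin\sqrt{2\lambda}\,t$$
$$z = E\cos\sqrt{2\lambda}\,t + G\sin\sqrt{2\lambda}\,t$$
Use the boundary condition at $t = 0$:
$$x(0) = y(0) = 0, \quad z(0) = 1 \;\Rightarrow\; A = 0, \quad C = 0, \quad E = 1 .$$
Therefore the solution becomes
$$x = B\sin\sqrt{2\lambda}\,t, \qquad y = D\sin\sqrt{2\lambda}\,t, \qquad z = \cos\sqrt{2\lambda}\,t + G\sin\sqrt{2\lambda}\,t .$$
The boundary condition at $t = T$ is
$$x(T) = 0, \quad y(T) = 0, \quad z(T) = -1 .$$
$$x(T) = 0 \;\Rightarrow\; B\sin\sqrt{2\lambda}\,T = 0 \;\Rightarrow\; \sqrt{2\lambda}\,T = n\pi \;\Rightarrow\; \sqrt{2\lambda} = \frac{n\pi}{T},$$
with the same conclusion for $y$, so
$$x = B\sin\frac{n\pi}{T}t, \qquad y = D\sin\frac{n\pi}{T}t, \qquad z = \cos\frac{n\pi}{T}t + G\sin\frac{n\pi}{T}t .$$
Now use $z(T) = -1$:
$$-1 = \cos n\pi + G\sin n\pi = \cos n\pi \;\Rightarrow\; n \text{ is odd.}$$
Now substitute in the kinetic energy integral, with $n$ odd:
$$E = \int_0^T \tfrac{1}{2}\big(\dot{x}^2 + \dot{y}^2 + \dot{z}^2\big)\,dt$$
$$= \frac{1}{2}\int_0^T \Big\{\Big(\frac{n\pi}{T}\Big)^2B^2\cos^2\frac{n\pi}{T}t + \Big(\frac{n\pi}{T}\Big)^2D^2\cos^2\frac{n\pi}{T}t + \Big(\frac{n\pi}{T}\Big)^2\Big(-\sin\frac{n\pi}{T}t + G\cos\frac{n\pi}{T}t\Big)^2\Big\}\,dt$$
$$= \frac{1}{2}\Big(\frac{n\pi}{T}\Big)^2\int_0^T \Big\{\big(B^2 + D^2 + G^2\big)\cos^2\frac{n\pi}{T}t + \sin^2\frac{n\pi}{T}t - 2G\sin\frac{n\pi}{T}t\cos\frac{n\pi}{T}t\Big\}\,dt$$
Using
$$\int_0^T \sin\frac{n\pi}{T}t\,\cos\frac{n\pi}{T}t\,dt = 0, \qquad \int_0^T \sin^2\frac{n\pi}{T}t\,dt = \frac{T}{2}, \qquad \int_0^T \cos^2\frac{n\pi}{T}t\,dt = \frac{T}{2},$$
we get
$$E = \frac{1}{2}\Big(\frac{n\pi}{T}\Big)^2\big(B^2 + D^2 + G^2 + 1\big)\frac{T}{2} .$$
Clearly $E$ increases with $n$, thus the minimum is for $n = 1$. Therefore the solution is
$$x = B\sin\frac{\pi}{T}t, \qquad y = D\sin\frac{\pi}{T}t, \qquad z = \cos\frac{\pi}{T}t + G\sin\frac{\pi}{T}t .$$

3. Minimize
$$L = \int_0^1 \sqrt{1 + (y')^2}\,dx$$
subject to
$$A = \int_0^1 y\,dx = \frac{\pi}{8} .$$
$$F = \sqrt{1 + (y')^2} + \lambda y$$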
0 F ; y Fy
0 0 = c1
0 0 y + 1 + y2 ; y
q q
0 1 + y 2 = ;1 1 1 + y 2 = (c ; y)2 1 (y y 1 + y 2 + 1 + y 2 ; y 2 = c1 1 + y 2
q
0 0 0 p1 y+ y
0 0 2 = c1 ;c)
1
0 q 0 y = i
0 s q (c1 ;
Z q ; 1 dy
y
)2 ; (c ;1
1 y)2 + 1 (c1 ; y) dy = i (c1 ; y)2 + 1 +1 = i dx
Z dx Use substitution du = 2(c1 ; y) (; ) dy (c1 ; y) dy = ; du 2 1 ; 2 udu2 = i dx 1= 1=2 ; 21 u=2 = i x + c2 1 substitute for u
Z Z ) ) u = (c1 ; y)2 ; 1 ; q (c1 ; y)2 ; 1 = ( i x + c2)
60 square both sides (c1 y)2 ; 1 = 2( i x + c2)2 ; c1 + y 2 ; 12 = (;x2 2ix c2 + c22)
2 ; y ; c1 y ; c1 ; 1 = 2 ;x
 2 2i c2 x ; c2 2
(x+D)2
{z } 2 + (x + D)2 = 12 We need the curve to go thru (0, 0) and (1, 0) x=y=0 x=1 y=0 D2 D2 ) ; c1 2 + D = 12
2 2 9 > > > > > = > > > > > ) ;c 1 + (1 + D)2 = 12 ; ; (1 + D) = 0 ; 1 ; 2D ; D = 0 2D = ; 1 D = ; 1=2 )
2 2 y ; c1 2 Let + x;1 2 2 = 12 k= ;c 1 then the equation is 2 x ; 1 + (y + k)2 = k2 + 1 2 4 To nd k1 we use the area A A=
use: Z 1 0 y dx = Z 1 0 2 4 1 + k2 + 4 s ; x;
61 1 2 2 ;k 3 5 dx Z where pa ; u du = u pa ; u 2
2 2 2 2 2 + a arc sin u 2 a 1 a2 = k 2 + 4 u=x;1 2 A= x; 2
s 1 2 s k2 + 1 4 1 ; x; 2 2 + k2 + 2
q 1 4 arc sin x ; 1=21 k2 + 4
q ; kx 1 0 1 = 4 k2 + 1 4 ;1+ 4 ; k2 + 2 1 4 arc sin
1 4 1=2 k2 +
q 1 4 ;k
9 = ; 8 < : ; 1 k2 + 1 4 4 s 1 + k2 + 4 2 arc sin 1=2 k2 + 1 4 1 A = 1 k ; k + k2 + 1=4 arc sin p 2 2 4k + 1 2 1 A + 1 k = (4k 4+ 1) arc sin p 2 2 4k + 1 1 4A + 2k = (4k2 + 1) arc sin p 2 = (4k2 + 1) arc cot 2k 4k + 1 So: 4A + 2k = (4k2 + 1) arc cot 2k and x;1 2 2 + (y + k)2 = k2 + 1 4 62 4. c2 + c2 = 2 1 subtract 2 2 c2 + (1 ; c1)2 = c2 ; (1 ; c1)2 = 0 1 c2 ; 1 + 2c1 1 c1 = 1=2 at (0 0) at (1 0) ;c 2 1 =0 Now use (34): Since y = tan
0 L=
since Z 1 0 sec dx sin = x ; c1 dx = cos d x=0 x=1 ) ) sin sin
Z 1 = 2 ; c = ; 21 = 1;c = 1
1 1 2 ) L= 2 1 sec cos d = ( 2 ; 1 Suppose we sketch the two sides as a function of 21 0 is the value such that L = arc sin 1 2 0 20 0 is a function of L ! c2 + 1 = 2 (L) 2 0 4 c2 = 2 (L) ; 1 2 0 4 63 L = arc sin 1 2 2 ) = 2 arc sin 21 1.5 y=1.2/(2) 1 0.5 0 1/(20) 0.5 1 1.5 1.5 1 0.5 0 0.5 1 1.5 Figure 2: L = =2 ) = 1=2 1 c2 = 4 ; 1 = 0 2 4 1 The curve is then y2 + (x ; 2 )2 = 1 4
Figure 3:

5. Solve the following variational problem by finding the extremals satisfying the conditions:
$$J(y_1, y_2) = \int_0^{\pi/4} \big(4y_1^2 + y_2^2 + y_1'y_2'\big)\,dx$$
$$y_1(0) = 1, \quad y_1(\pi/4) = 0, \quad y_2(0) = 0, \quad y_2(\pi/4) = 1 .$$
Vary each variable independently by choosing $\eta_1$ and $\eta_2$ in $C^2[0, \pi/4]$ satisfying
$$\eta_1(0) = \eta_2(0) = \eta_1(\pi/4) = \eta_2(\pi/4) = 0 .$$
Form a one parameter admissible pair of functions:
$$y_1 + \varepsilon\eta_1 \quad \text{and} \quad y_2 + \varepsilon\eta_2,$$
yielding two Euler equations of the form
$$F_{y_1} - \frac{d}{dx}F_{y_1'} = 0 \quad \text{and} \quad F_{y_2} - \frac{d}{dx}F_{y_2'} = 0 .$$
For our problem:
$$F = 4y_1^2 + y_2^2 + y_1'y_2' .$$
Taking the partials of $F$ yields:
$$F_{y_1} = 8y_1, \qquad F_{y_2} = 2y_2, \qquad F_{y_1'} = y_2', \qquad F_{y_2'} = y_1' .$$
Substituting the partials with respect to $y_1$ into the Euler equation:
$$8y_1 - \frac{d}{dx}y_2' = 0 \;\Rightarrow\; y_2'' = 8y_1 .$$
Substituting the partials with respect to $y_2$ into the Euler equation:
$$2y_2 - \frac{d}{dx}y_1' = 0 \;\Rightarrow\; y_1'' = 2y_2 .$$
Solving for $y_2$ and substituting into the first second-order equation:
$$y_2 = \tfrac{1}{2}y_1'' \;\Longrightarrow\; y_1'''' = 16y_1 .$$
Since this is a 4th order, homogeneous, constant coefficient differential equation, we can assume a solution of the form
$$y_1 = e^{rx} .$$
Now substituting into $y_1'''' = 16y_1$ gives:
$$r^4e^{rx} = 16e^{rx}, \qquad r^4 = 16, \qquad r^2 = \pm 4, \qquad r = \pm 2,\ \pm 2i .$$
This yields a homogeneous solution of:
$$y_1 = C_1e^{2x} + C_2e^{-2x} + C_3e^{2ix} + C_4e^{-2ix} = C_1e^{2x} + C_2e^{-2x} + C_3\cos 2x + C_4\sin 2x .$$
Now using the result from above:
$$y_2 = \tfrac{1}{2}y_1'' = \tfrac{1}{2}\frac{d}{dx}\big(2C_1e^{2x} - 2C_2e^{-2x} - 2C_3\sin 2x + 2C_4\cos 2x\big)$$
$$= \tfrac{1}{2}\big(4C_1e^{2x} + 4C_2e^{-2x} - 4C_3\cos 2x - 4C_4\sin 2x\big)$$
$$= 2C_1e^{2x} + 2C_2e^{-2x} - 2C_3\cos 2x - 2C_4\sin 2x .$$
Applying the boundary conditions, we now have 4 equations with 4 unknowns:
$$y_1(0) = 1 \;\Longrightarrow\; C_1 + C_2 + C_3 = 1$$
$$y_1(\tfrac{\pi}{4}) = 0 \;\Longrightarrow\; C_1e^{\pi/2} + C_2e^{-\pi/2} + C_4 = 0$$
$$y_2(0) = 0 \;\Longrightarrow\; C_1 + C_2 - C_3 = 0$$
$$y_2(\tfrac{\pi}{4}) = 1 \;\Longrightarrow\; C_1e^{\pi/2} + C_2e^{-\pi/2} - C_4 = \tfrac{1}{2}$$
In matrix form,
$$\begin{bmatrix} 1 & 1 & 1 & 0 \\ e^{\pi/2} & e^{-\pi/2} & 0 & 1 \\ 1 & 1 & -1 & 0 \\ e^{\pi/2} & e^{-\pi/2} & 0 & -1 \end{bmatrix} \begin{bmatrix} C_1 \\ C_2 \\ C_3 \\ C_4 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 0 \\ \tfrac{1}{2} \end{bmatrix} .$$
Performing Gaussian elimination on the augmented matrix (subtract row 1 from row 3 and row 2 from row 4):
$$\left[\begin{array}{cccc|c} 1 & 1 & 1 & 0 & 1 \\ e^{\pi/2} & e^{-\pi/2} & 0 & 1 & 0 \\ 0 & 0 & -2 & 0 & -1 \\ 0 & 0 & 0 & -2 & \tfrac{1}{2} \end{array}\right]$$
The augmented matrix yields:
$$C_1 + C_2 + C_3 = 1$$
$$-2C_3 = -1 \;\Longrightarrow\; C_3 = \tfrac{1}{2}$$
$$-2C_4 = \tfrac{1}{2} \;\Longrightarrow\; C_4 = -\tfrac{1}{4}$$
$$C_1e^{\pi/2} + C_2e^{-\pi/2} + C_4 = 0$$
Substituting $C_3$ and $C_4$ into the first and fourth equations gives:
$$C_1 + C_2 = \tfrac{1}{2} \;\Longrightarrow\; C_1 = \tfrac{1}{2} - C_2$$
and
$$C_1e^{\pi/2} + C_2e^{-\pi/2} = \tfrac{1}{4} \;\Longrightarrow\; C_2 = \frac{\tfrac{1}{4} - \tfrac{1}{2}e^{\pi/2}}{e^{-\pi/2} - e^{\pi/2}} = .4683 \;\Longrightarrow\; C_1 = .0317 .$$
Finally:
$$y_1 = .0317e^{2x} + .4683e^{-2x} + \tfrac{1}{2}\cos 2x - \tfrac{1}{4}\sin 2x$$
$$y_2 = .0634e^{2x} + .9366e^{-2x} - \cos 2x + \tfrac{1}{2}\sin 2x$$

6. The problem is solved using the Lagrangian technique.
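$$\int_0^1 \big((y')^2 + x^2\big)\,dx + \lambda\int_0^1 (y^2 - 2)\,dx$$
$$L = F + \lambda G = (y')^2 + x^2 + \lambda(y^2 - 2),$$
where $F = (y')^2 + x^2$ and $G = y^2 - 2$.
$$L_{y'} = 2y' \quad \text{and} \quad L_y = 2\lambda y .$$
Now we use Euler's Equation to obtain
$$\frac{d}{dx}(2y') = 2\lambda y, \qquad y'' = \lambda y .$$
Nontrivial solutions that vanish at both end points require $\lambda < 0$, so write $\lambda = -\mu$ with $\mu > 0$. Solving for $y$,
$$y = A\cos(\sqrt{\mu}\,x) + B\sin(\sqrt{\mu}\,x) .$$
Applying the boundary conditions,
$$y(0) = A\cos(\sqrt{\mu}\cdot 0) + B\sin(\sqrt{\mu}\cdot 0) = 0 \;\Longrightarrow\; A = 0$$
$$y(1) = B\sin(\sqrt{\mu}) = 0 .$$
If $B = 0$ then we get the trivial solution. Therefore, we want $\sin(\sqrt{\mu}) = 0$. This implies that $\sqrt{\mu} = n\pi$, $n = 1, 2, 3, \ldots$

Now we solve for $B$ using our constraint, with $y = B\sin(n\pi x)$:
$$\int_0^1 y^2\,dx = \int_0^1 B^2\sin^2(n\pi x)\,dx = 2$$
$$B^2\Big[\frac{x}{2} - \frac{\sin(2n\pi x)}{4n\pi}\Big]_0^1 = 2 \;\Longrightarrow\; B^2\Big[\big(\tfrac{1}{2} - 0\big) - 0\Big] = 2$$
$$B^2 = 4 \quad \text{or} \quad B = \pm 2 .$$
Therefore, our final solution is $y = \pm 2\sin(n\pi x)$, $n = 1, 2, 3, \ldots$

7. Derive a necessary condition for the isoperimetric problem: Minimize
$$I(y_1, y_2) = \int_a^b L(x, y_1, y_2, y_1', y_2')\,dx$$
subject to
$$\int_a^b G(x, y_1, y_2, y_1', y_2')\,dx = C$$
and
$$y_1(a) = A_1, \quad y_2(a) = A_2, \quad y_1(b) = B_1, \quad y_2(b) = B_2,$$
where $A_1$, $A_2$, $B_1$, $B_2$ and $C$ are constants.

Assume $L$ and $G$ are twice continuously differentiable functions. The condition $\int_a^b G(x, y_1, y_2, y_1', y_2')\,dx = C$ is called an isoperimetric constraint. Let
$$W = \int_a^b G(x, y_1, y_2, y_1', y_2')\,dx .$$
We must embed an assumed local minimum $y(x)$ in a family of admissible functions with respect to which we carry out the extremization. Introduce a two-parameter family
$$z_i = y_i(x) + \varepsilon_i\eta_i(x), \qquad i = 1, 2 \qquad (11)$$
where $\eta_1, \eta_2 \in C^2(a, b)$ and
$$\eta_i(a) = \eta_i(b) = 0, \qquad i = 1, 2,$$
and $\varepsilon_1$, $\varepsilon_2$ are real parameters ranging over intervals containing the origin. Assume $W$ does not have an extremum at $y_i$; then for any choice of $\eta_1$ and $\eta_2$ there will be values of $\varepsilon_1$ and $\varepsilon_2$ in the neighborhood of $(0, 0)$ for which $W(z) = C$.

Evaluating $I$ and $W$ at $z$ gives
$$J(\varepsilon_1, \varepsilon_2) = \int_a^b L(x, z_1, z_2, z_1', z_2')\,dx \quad \text{and} \quad V(\varepsilon_1, \varepsilon_2) = \int_a^b G(x, z_1, z_2, z_1', z_2')\,dx .$$
Since $y$ is a local minimum subject to $V$, the point $(\varepsilon_1, \varepsilon_2) = (0, 0)$ must be a local minimum for $J(\varepsilon_1, \varepsilon_2)$ subject to the constraint $V(\varepsilon_1, \varepsilon_2) = C$. This is just a differential calculus problem and so the Lagrange multiplier rule may be applied. There must exist a constant $\lambda$ such that
$$\frac{\partial J^*}{\partial \varepsilon_1} = \frac{\partial J^*}{\partial \varepsilon_2} = 0 \quad \text{at } (\varepsilon_1, \varepsilon_2) = (0, 0), \qquad (12)$$
where $J^*$ is defined by
$$J^* = J + \lambda V = \int_a^b L^*(x, z_1, z_2, z_1', z_2')\,dx$$
with
$$L^* = L + \lambda G .$$
We now calculate the derivatives in (12), afterward setting $\varepsilon_1 = \varepsilon_2 = 0$. Accordingly,
$$\frac{\partial J^*}{\partial \varepsilon_i}(0, 0) = \int_a^b \Big[L^*_{y_i}(x, y_1, y_2, y_1', y_2')\,\eta_i + L^*_{y_i'}(x, y_1, y_2, y_1', y_2')\,\eta_i'\Big]\,dx, \qquad i = 1, 2 .$$
Integrating the second term by parts (as in the notes) and applying the conditions of (11) gives
$$\frac{\partial J^*}{\partial \varepsilon_i}(0, 0) = \int_a^b \Big[L^*_{y_i}(x, y_1, y_2, y_1', y_2') - \frac{d}{dx}L^*_{y_i'}(x, y_1, y_2, y_1', y_2')\Big]\eta_i\,dx, \qquad i = 1, 2 .$$
Therefore from (12), and because of the arbitrary character of $\eta_1$ or $\eta_2$, the Fundamental Lemma implies
$$L^*_{y_i}(x, y_1, y_2, y_1', y_2') - \frac{d}{dx}L^*_{y_i'}(x, y_1, y_2, y_1', y_2') = 0, \qquad i = 1, 2,$$
which is a necessary condition for an extremum.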
8. Let the two dimensional position vector $\vec{R}$ be $\vec{R} = x\vec{i} + y\vec{j}$; then the velocity vector is $\vec{v} = \dot{x}\vec{i} + \dot{y}\vec{j}$. From vector calculus it is known that the triple $\vec{a}\cdot\vec{b}\times\vec{c}$ gives the volume of the parallelepiped whose edges are these three vectors. If one of the vectors is of length unity then the volume is the same as the area of the parallelogram whose edges are the other 2 vectors. Now let's take $\vec{a} = \vec{k}$, $\vec{b} = \vec{R}$ and $\vec{c} = \vec{v}$. Computing the triple, we have $x\dot{y} - y\dot{x}$, which is the integrand in $I$. The second integral gives the length of the curve from $t_0$ to $t_1$ (see the definition of arc length in any Calculus book).

To use the previous problem, let
$$L(t, x, y, \dot{x}, \dot{y}) = x\dot{y} - y\dot{x}, \qquad G(t, x, y, \dot{x}, \dot{y}) = \sqrt{\dot{x}^2 + \dot{y}^2};$$
then
$$L_x = \dot{y}, \qquad L_y = -\dot{x}, \qquad G_x = 0, \qquad G_y = 0,$$
$$L_{\dot{x}} = -y, \qquad L_{\dot{y}} = x, \qquad G_{\dot{x}} = \frac{\dot{x}}{\sqrt{\dot{x}^2 + \dot{y}^2}}, \qquad G_{\dot{y}} = \frac{\dot{y}}{\sqrt{\dot{x}^2 + \dot{y}^2}} .$$
Substituting in the Euler equations, we end up with the two equations:
$$\dot{y}\Big[2 - \lambda\frac{\ddot{x}\dot{y} - \dot{x}\ddot{y}}{(\dot{x}^2 + \dot{y}^2)^{3/2}}\Big] = 0$$
$$\dot{x}\Big[-2 + \lambda\frac{\ddot{x}\dot{y} - \dot{x}\ddot{y}}{(\dot{x}^2 + \dot{y}^2)^{3/2}}\Big] = 0$$

Case 1: $\dot{y} = 0$. Substituting this in the second equation yields $\dot{x} = 0$. Thus the solution is $x = c_1$, $y = c_2$.

Case 2: $\dot{x} = 0$; then the first one yields $\dot{y} = 0$ and we have the same solution.

Case 3: $\dot{x} \ne 0$ and $\dot{y} \ne 0$. In this case the term in the braces is zero, or
$$2(\dot{x}^2 + \dot{y}^2)^{3/2} = \lambda(\ddot{x}\dot{y} - \dot{x}\ddot{y}) .$$
The right hand side can be written as $\lambda\dot{y}^2\dfrac{d}{dt}\Big(\dfrac{\dot{x}}{\dot{y}}\Big)$. Now let $u = \dfrac{\dot{x}}{\dot{y}}$; we get
$$\frac{du}{(1 + u^2)^{3/2}} = \frac{2}{\lambda}\,dy .$$
For this we use the trigonometric substitution $u = \tan\theta$. This gives the following:
$$\frac{\dot{x}/\dot{y}}{\sqrt{1 + (\dot{x}/\dot{y})^2}} = \frac{2}{\lambda}y + c .$$
Simplifying we get
$$dx = \frac{\frac{2}{\lambda}y + c}{\sqrt{1 - \big(\frac{2}{\lambda}y + c\big)^2}}\,dy .$$
Substitute $v = 1 - \big(\frac{2}{\lambda}y + c\big)^2$ and we get
$$(x + k)^2 + \Big(y + \frac{c\lambda}{2}\Big)^2 = \Big(\frac{\lambda}{2}\Big)^2,$$
which is the equation of a circle.

9. Let $F^* = F + \lambda G = (y')^2 + \lambda y^2$. Then Euler's first equation gives
$$2\lambda y - \frac{d}{dx}(2y') = 0 \;\Rightarrow\; 2\lambda y - 2y'' = 0 \;\Rightarrow\; y'' - \lambda y = 0 \;\Rightarrow\; r^2 - \lambda = 0 \;\Rightarrow\; r = \pm\sqrt{\lambda},$$
where we are substituting the assumed solution form $y = e^{rx}$ into the differential equation to get an equation for $r$. Note that $\lambda = 0$ and $\lambda > 0$ both lead to trivial solutions for $y(x)$, and there would be no way to satisfy the condition that $\int_0^{\pi} y^2\,dx = 1$. Therefore, assume that $\lambda < 0$. We then have that the solution has the form
$$y(x) = c_1\cos(\sqrt{-\lambda}\,x) + c_2\sin(\sqrt{-\lambda}\,x) .$$
The boundary conditions result in $c_1 = 0$ and $c_2\sin(\sqrt{-\lambda}\,\pi) = 0$. Since $c_2 = 0$ would give us the trivial solution again, it must be that $\sqrt{-\lambda}\,\pi = n\pi$ where $n = 1, 2, \ldots$. This implies that $-\lambda = n^2$, or equivalently $\lambda = -n^2$, $n = 1, 2, \ldots$.

We now use this solution and the requirement $\int_0^{\pi} y^2\,dx = 1$ to solve for the constant $c_2$. Therefore, we have
$$\int_0^{\pi} c_2^2\sin^2(nx)\,dx = \frac{c_2^2}{n}\int_0^{n\pi} \sin^2 u\,du = \frac{c_2^2}{n}\Big[\frac{u}{2} - \frac{\sin(2u)}{4}\Big]_0^{n\pi} = \frac{c_2^2}{n}\Big[\frac{n\pi}{2} - \frac{\sin(2n\pi)}{4}\Big] = \frac{c_2^2\,\pi}{2} = 1$$
for $n = 1, 2, \ldots$. After solving for the constant we have that
$$y(x) = \sqrt{\frac{2}{\pi}}\sin(nx), \qquad n = 1, 2, \ldots$$
If we now plug this solution into $\int_0^{\pi} (y')^2\,dx$ we get that $I(y) = n^2$, which implies we should choose $n = 1$ to minimize $I(y)$. Therefore, our final solution is
$$y(x) = \sqrt{\frac{2}{\pi}}\sin(x) .$$

CHAPTER 6

6 Integrals Involving More Than One Independent Variable
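Problems

1. Find all minimal surfaces whose equations have the form
$$z = \varphi(x) + \psi(y) .$$

2. Derive the Euler equation and obtain the natural boundary conditions of the problem
$$\delta\iint_R \Big[\alpha(x, y)u_x^2 + \beta(x, y)u_y^2 - \gamma(x, y)u^2\Big]\,dx\,dy = 0 .$$
In particular, show that if $\beta(x, y) = \alpha(x, y)$ the natural boundary condition takes the form
$$\alpha\frac{\partial u}{\partial n}\,\delta u = 0,$$
where $\dfrac{\partial u}{\partial n}$ is the normal derivative of $u$.

3. Determine the natural boundary condition for the multiple integral problem
$$I(u) = \iint_R L(x, y, u, u_x, u_y)\,dx\,dy, \qquad u \in C^2(R), \quad u \text{ unspecified on the boundary of } R .$$

4. Find the Euler equations corresponding to the following functionals

a. $I(u) = \displaystyle\iint_R \big(x^2u_x^2 + y^2u_y^2\big)\,dx\,dy$

b. $I(u) = \displaystyle\iint_R \big(u_t^2 - c^2u_x^2\big)\,dx\,dt$, where $c$ is constant.

1. $z = \varphi(x) + \psi(y)$
$$S = \iint_R \sqrt{1 + z_x^2 + z_y^2}\,dx\,dy = \iint_R \sqrt{1 + \varphi'^2(x) + \psi'^2(y)}\,dx\,dy$$
The Euler equation is
$$\frac{\partial}{\partial x}\Big(\frac{\varphi'(x)}{\sqrt{1 + \varphi'^2 + \psi'^2}}\Big) + \frac{\partial}{\partial y}\Big(\frac{\psi'(y)}{\sqrt{1 + \varphi'^2 + \psi'^2}}\Big) = 0 .$$
Differentiate and multiply by $\sqrt{1 + \varphi'^2 + \psi'^2}$:
$$\varphi''\Big[1 - \frac{\varphi'^2}{1 + \varphi'^2 + \psi'^2}\Big] + \psi''\Big[1 - \frac{\psi'^2}{1 + \varphi'^2 + \psi'^2}\Big] = 0 .$$
Expand and collect terms:
$$\varphi''(x)\big[1 + \psi'^2(y)\big] + \psi''(y)\big[1 + \varphi'^2(x)\big] = 0 .$$
Separate the variables:
$$\frac{\varphi''(x)}{1 + \varphi'^2(x)} = -\frac{\psi''(y)}{1 + \psi'^2(y)} .$$
One possibility is
$$\varphi''(x) = \psi''(y) = 0 \;\Rightarrow\; \varphi(x) = Ax + \alpha, \quad \psi(y) = By + \beta \;\Rightarrow\; z = Ax + By + C,$$
which is a plane.

The other possibility is that each side is a constant (the left hand side is a function of only $x$ and the right hand side depends only on $y$):
$$\frac{\varphi''(x)}{1 + \varphi'^2(x)} = \lambda = -\frac{\psi''(y)}{1 + \psi'^2(y)} .$$
Integrating the first of these with respect to $x$,
$$\arctan\varphi' = \lambda x + c_1, \qquad \varphi' = \tan(\lambda x + c_1) .$$
Integrate again:
$$\varphi(x) = \int \tan(\lambda x + c_1)\,dx = -\frac{1}{\lambda}\ln\cos(\lambda x + c_1) + c_2$$
$$e^{\lambda(c_2 - \varphi(x))} = \cos(\lambda x + c_1) \qquad (1)$$
Similarly for $\psi(y)$ (the sign is different!):
$$\psi(y) = \frac{1}{\lambda}\ln\cos(\lambda y - D_1) + D_2$$
$$e^{\lambda(\psi(y) - D_2)} = \cos(\lambda y - D_1) \qquad (2)$$
Divide equation (2) by equation (1):
$$e^{\lambda(-c_2 - D_2 + \psi(y) + \varphi(x))} = \frac{\cos(\lambda y - D_1)}{\cos(\lambda x + c_1)} .$$
Using $z = \varphi(x) + \psi(y)$ we have
$$e^{-\lambda(c_2 + D_2)}\,e^{\lambda z} = \frac{\cos(\lambda y - D_1)}{\cos(\lambda x + c_1)} .$$
If we let $(x_0, y_0, z_0)$ be on the surface, we find
$$e^{\lambda(z - z_0)} = \frac{\cos(\lambda y - D_1)}{\cos(\lambda x + c_1)}\cdot\frac{\cos(\lambda x_0 + c_1)}{\cos(\lambda y_0 - D_1)} .$$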
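The non-planar surface found above can be checked numerically. The Python sketch below is not part of the original notes. It takes the special case with all integration constants set to zero, $z = \frac{1}{\lambda}\ln\frac{\cos\lambda y}{\cos\lambda x}$, and evaluates the residual of the minimal surface equation $(1 + z_y^2)z_{xx} - 2z_xz_yz_{xy} + (1 + z_x^2)z_{yy} = 0$ using central differences. The value of the constant (`LAM`), the sample points and the step size are arbitrary assumptions.

```python
import math

LAM = 0.8  # separation constant; an assumed illustrative value

def z(x, y):
    # Surface from solution 1 with c1 = c2 = D1 = D2 = 0.
    return math.log(math.cos(LAM * y) / math.cos(LAM * x)) / LAM

def minimal_surface_residual(x, y, h=1e-4):
    # Central-difference approximations of the first and second partials.
    zx = (z(x + h, y) - z(x - h, y)) / (2 * h)
    zy = (z(x, y + h) - z(x, y - h)) / (2 * h)
    zxx = (z(x + h, y) - 2 * z(x, y) + z(x - h, y)) / h**2
    zyy = (z(x, y + h) - 2 * z(x, y) + z(x, y - h)) / h**2
    zxy = (z(x + h, y + h) - z(x + h, y - h)
           - z(x - h, y + h) + z(x - h, y - h)) / (4 * h * h)
    # Left side of the minimal surface equation.
    return (1 + zy**2) * zxx - 2 * zx * zy * zxy + (1 + zx**2) * zyy

# Sample points chosen so that cos(LAM*x) and cos(LAM*y) stay positive.
points = [(0.3, -0.5), (1.0, 0.7), (-0.4, 0.2)]
surface_residuals = [abs(minimal_surface_residual(x, y)) for x, y in points]
print(surface_residuals)
```

The residuals are small but not exactly zero, since the second differences carry rounding error of roughly $10^{-7}$ at this step size.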
2. $F = \alpha(x, y)u_x^2 + \beta(x, y)u_y^2 - \gamma(x, y)u^2$

$$-F_u + \frac{\partial}{\partial x}F_{u_x} + \frac{\partial}{\partial y}F_{u_y} = 0 \qquad \text{(see equation 11)}$$
$$F_{u_x} = 2\alpha(x, y)u_x, \qquad F_{u_y} = 2\beta(x, y)u_y, \qquad F_u = -2\gamma(x, y)u$$
$$\Rightarrow\; \frac{\partial}{\partial x}\big(\alpha(x, y)u_x\big) + \frac{\partial}{\partial y}\big(\beta(x, y)u_y\big) + \gamma(x, y)u = 0$$
The natural boundary conditions come from the boundary integral
$$F_{u_x}\cos\nu + F_{u_y}\sin\nu = 0$$
$$\big(\alpha(x, y)u_x\cos\nu + \beta(x, y)u_y\sin\nu\big)\,\delta u = 0 .$$
If $\alpha(x, y) = \beta(x, y)$ then
$$\alpha(x, y)\big(u_x\cos\nu + u_y\sin\nu\big)\,\delta u = 0,$$
and since $u_x\cos\nu + u_y\sin\nu = \nabla u\cdot\vec{n} = \dfrac{\partial u}{\partial n}$,
$$\Rightarrow\; \alpha\frac{\partial u}{\partial n}\,\delta u = 0 .$$

3. Determine the natural boundary condition for the multiple integral problem
$$I(u) = \iint_R L(x, y, u, u_x, u_y)\,dx\,dy, \qquad u \in C^2(R), \quad u \text{ unspecified on the boundary of } R .$$
Let $u(x, y)$ be a minimizing function (among the admissible functions) for $I(u)$. Consider the one-parameter family of functions $u(\varepsilon) = u(x, y) + \varepsilon\eta(x, y)$, where $\eta \in C^2$ over $R$ and $\eta(x, y) = 0$ on the boundary of $R$. Then if
$$I(\varepsilon) = \iint_R L(x, y, u + \varepsilon\eta, u_x + \varepsilon\eta_x, u_y + \varepsilon\eta_y)\,dx\,dy,$$
a necessary condition for a minimum is $I'(0) = 0$. Now,
$$I'(0) = \iint_R \big(\eta L_u + \eta_xL_{u_x} + \eta_yL_{u_y}\big)\,dx\,dy,$$
where the arguments in the partial derivatives of $L$ are the elements $(x, y, u, u_x, u_y)$ of the minimizing function $u$. Thus,
$$I'(0) = \iint_R \eta\Big(L_u - \frac{\partial}{\partial x}L_{u_x} - \frac{\partial}{\partial y}L_{u_y}\Big)\,dx\,dy + \iint_R \Big(\frac{\partial}{\partial x}(\eta L_{u_x}) + \frac{\partial}{\partial y}(\eta L_{u_y})\Big)\,dx\,dy .$$
The second integral in this equation is equal to (by Green's Theorem)
$$\oint_{\partial R} \eta\big(\ell L_{u_x} + mL_{u_y}\big)\,ds,$$
where $\ell$ and $m$ are the direction cosines of the outward normal to $\partial R$ and $ds$ is the arc length of $\partial R$. But, since $\eta(x, y) = 0$ on $\partial R$, this integral vanishes. Thus, the condition $I'(0) = 0$, which holds for all admissible $\eta(x, y)$, reduces to
$$\iint_R \eta\Big(L_u - \frac{\partial}{\partial x}L_{u_x} - \frac{\partial}{\partial y}L_{u_y}\Big)\,dx\,dy = 0 .$$
Therefore, $L_u - \dfrac{\partial}{\partial x}L_{u_x} - \dfrac{\partial}{\partial y}L_{u_y} = 0$ at all points of $R$. This is the Euler-Lagrange equation (11) for the two dimensional problem.

Now consider the problem
$$I(u) = \int_c^d\!\!\int_a^b L(x, y, u, u_x, u_y)\,dx\,dy = \iint_R L(x, y, u, u_x, u_y)\,dx\,dy,$$
where all or a portion of $\partial R$ is unspecified. This condition is analogous to the single integral variable endpoint problem discussed previously. Recall the line integral presented above:
$$\oint_{\partial R} \eta\big(\ell L_{u_x} + mL_{u_y}\big)\,ds,$$
where $\ell$ and $m$ are the direction cosines of the outward normal to $\partial R$ and $ds$ is the arc length of $\partial R$. Recall that in the case where $u$ is given on $\partial R$ (analogous to a fixed endpoint) this integral vanishes since $\eta(x, y) = 0$ on $\partial R$. However, in the case where $u$ is unspecified on all or a portion of $\partial R$, $\eta(x, y) \ne 0$ there. Therefore, the natural boundary condition which must hold on $\partial R$ is
$$\ell L_{u_x} + mL_{u_y} = 0,$$
where $\ell$ and $m$ are the direction cosines of the outward normal to $\partial R$.

4. Euler's equation:
$$\frac{\partial}{\partial x}F_{u_x} + \frac{\partial}{\partial y}F_{u_y} - F_u = 0$$

a. $F = x^2u_x^2 + y^2u_y^2$

Differentiating and substituting in Euler's equation, we have
$$2xu_x + x^2u_{xx} + 2yu_y + y^2u_{yy} = 0 .$$

b. $F = u_t^2 - c^2u_x^2$

Differentiating and substituting in Euler's equation, we have
$$u_{tt} - c^2u_{xx} = 0,$$
which is the wave equation.

CHAPTER 7

7 Examples of Numerical Techniques
Problems
1. Find the minimal arc y(x) that solves, minimize I = y2 ; (y )2 dx 0 a. Using the indirect ( xed end point) method when x1 = 1: b. Using the indirect (variable end point) method with y(0)=1 and y(x1) = Y1 = x2 ; 4 :
Z x1 h 0 i 2. Find the minimal arc y(x) that solves, minimize I = where y(0) = 1 and y(1) = 2:
Z Z 1 0 1 (y )2 + yy + y + y dx 2
0 0 0 3. Solve the problem, minimze I = y2 ; yy + (y )2 dx 0 a. Using the indirect ( xed end point) method when x1 = 1: b. Using the indirect (variable end point) method with y(0)=1 and y(x1) = Y1 = x2 ; 1:
x1
h
0 0 i 4. Solve for the minimal arc y(x) : I= where y(0) = 0 and y(1) = 1: Z 1 h 0 y2 + 2xy + 2y dx
1. a. Here is the Matlab function defining all the derivatives required

    % odef.m
    function xdot=odef(t,x)
    % fy1y1 - fy'y' (2nd partial wrt y' y')
    % fy1y  - fy'y  (2nd partial wrt y' y)
    % fy    - fy    (1st partial wrt y)
    % fy1x  - fy'x  (2nd partial wrt y' x)
    fy1y1 = -2;
    fy1y = 0;
    fy = 2*x(1);
    fy1x = 0;
    rhs2=[-fy1y/fy1y1,(fy-fy1x)/fy1y1];
    xdot=[x(2),rhs2(1)*x(2)+rhs2(2)]';

The graph of the solution is given in the following figure

Figure 4:

2. First we give the modified finput.m

    function value = finput(x,y,yp,num)
    % function VALUE = FINPUT(x,y,yprime,num) returns the value of the
    % functions F(x,y,y'), Fy(x,y,y'), Fy'(x,y,y') for a given num.
    % num defines which function you want to evaluate:
    % 1 for F, 2 for Fy, 3 for Fy'.
    if nargin < 4, error('Four arguments are required'), end
    if (num < 1) | (num > 3)
      error('num must be between 1 and 3')
    end
    if num == 1, value = .5*yp^2+yp*y+yp+y; end   % F
    if num == 2, value = yp+1; end                % Fy
    if num == 3, value = yp+y+1; end              % Fy'

The boundary conditions are given in the main program dmethod.m (see lecture notes). The graph of the solution (using direct method) follows

Figure 5: Solution y(x) using the direct method

3. a. Here is the Matlab function defining all the derivatives required

    % odef.m
    function xdot=odef(t,x)
    % fy1y1 - fy'y' (2nd partial wrt y' y')
    % fy1y  - fy'y  (2nd partial wrt y' y)
    % fy    - fy    (1st partial wrt y)
    % fy1x  - fy'x  (2nd partial wrt y' x)
    fy1y1 = 2;
    fy1y = -1;
    fy = 2*x(1)-x(2);
    fy1x = 0;
    rhs2=[-fy1y/fy1y1,(fy-fy1x)/fy1y1];
    xdot=[x(2),rhs2(1)*x(2)+rhs2(2)]';

The graph of the solution is given in the following figure

Figure 6:

4. First we give the modified finput.m

    function value = finput(x,y,yp,num)
    % function VALUE = FINPUT(x,y,yprime,num) returns the value of the
    % functions F(x,y,y'), Fy(x,y,y'), Fy'(x,y,y') for a given num.
    % num defines which function you want to evaluate:
    % 1 for F, 2 for Fy, 3 for Fy'.
    if nargin < 4, error('Four arguments are required'), end
    if (num < 1) | (num > 3)
      error('num must be between 1 and 3')
    end
    if num == 1, value = y^2+2*x*y+2*yp; end      % F
    if num == 2, value = 2*y+2*x; end             % Fy
    if num == 3, value = 2; end                   % Fy'

The boundary conditions are given in the main program dmethod.m (see lecture notes). The graph of the solution (using direct method) follows

Figure 7: Solution y(x) using the direct method

CHAPTER 8

8 The Rayleigh-Ritz Method
Problems
1. Write a MAPLE program for the Rayleigh-Ritz approximation to minimize the integral
$$I = \int_0^1 \Big[(y')^2 - y^2 - 2xy\Big]\,dx, \qquad y(0) = 1, \quad y(1) = 2 .$$
Plot the graph of $y_0$, $y_1$, $y_2$ and the exact solution.

2. Solve the same problem using finite differences.

1.
    with(plots):
    phi0:= 1+x:
    y0 :=phi0:
    p0:=plot(y0,x=0..1,color=yellow,style=point):

    phi0:= 1+x: phi1:= a1*x*(1-x):
    y1 :=phi0 + phi1:
    dy1 :=diff(y1,x):
    f := (dy1^2 - y1^2 - 2*x*y1):
    w := int(f,x=0..1):
    dw := diff(w,a1):
    a1:= fsolve(dw=0,a1):
    p1:=plot(y1,x=0..1,color=green,style=point):

    phi0:= 1+x: phi1:= b1*x*(1-x): phi2 := b2*x*x*(1-x):
    y2 :=phi0 + phi1 + phi2:
    dy2 :=diff(y2,x):
    f := (dy2^2 - y2^2 - 2*x*y2):
    w := int(f,x=0..1):
    dw1 := diff(w,b1):
    c_1:=solve(dw1=0,b1):
    dw2 := diff(w,b2):
    c_2:=solve(dw2=0,b1):
    b3:= c_1-c_2:
    b2:=solve(b3=0,b2):
    b1:=c_1:
    p2:=plot(y2,x=0..1,color=cyan,style=point):

    phi0:= 1+x: phi1:= c1*x*(1-x): phi2 := c2*x*x*(1-x): phi3 := c3*x*x*x*(1-x):
    y3 :=phi0 + phi1 + phi2 + phi3:
    dy3 :=diff(y3,x):
    f := (dy3^2 - y3^2 - 2*x*y3):
    w := int(f,x=0..1):
    dw1 := diff(w,c1):
    c_1:=solve(dw1=0,c1):
    dw2 := diff(w,c2):
    c_2:=solve(dw2=0,c1):
    dw3 := diff(w,c3):
    c_3:=solve(dw3=0,c1):
    a1:= c_1 - c_2:
    a_1:=solve(a1=0,c2):
    a2:= c_3 - c_2:
    a_2:=solve(a2=0,c2):
    b1:= a_1 - a_2:
    c3:=solve(b1=0,c3):
    c2:=a_1:
    c1:=c_1:
    p3:=plot(y3,x=0..1,color=blue,style=point):

    y:= cos(x) +((3-cos(1))/sin(1))*sin(x) - x:
    p:=plot(y,x=0..1,color=red,style=line):
    display({p,p0,p1,p2,p3})

Note: Delete p2 or p3 (or both) if you want to make the True versus Approximations more noticeable.
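The one-term step of the Rayleigh-Ritz computation can also be reproduced with exact arithmetic. The following is a Python sketch rather than the Maple of the notes, and the helper names (`mul`, `diff`, `integral01`) are illustrative. With $y_1 = \varphi_0 + a\varphi_1$, $\varphi_0 = 1 + x$ and $\varphi_1 = x(1 - x)$, the integral $I(a)$ is a quadratic in $a$, so setting $dI/da = 0$ gives $a$ in closed form. The result is then compared with the exact solution used in the Maple listing.

```python
from fractions import Fraction
import math

def mul(p, q):
    # Product of two polynomials given as coefficient lists (lowest degree first).
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def diff(p):
    # Derivative of a polynomial coefficient list.
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]

def integral01(p):
    # Integral of the polynomial over [0, 1].
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(p))

phi0 = [Fraction(1), Fraction(1)]                # 1 + x
phi1 = [Fraction(0), Fraction(1), Fraction(-1)]  # x - x^2 = x(1 - x)
x_poly = [Fraction(0), Fraction(1)]              # x

# I(a) = const + linear*a + quadratic*a^2 for F = y'^2 - y^2 - 2xy.
quadratic = integral01(mul(diff(phi1), diff(phi1))) - integral01(mul(phi1, phi1))
linear = 2 * (integral01(mul(diff(phi0), diff(phi1)))
              - integral01(mul(phi0, phi1))
              - integral01(mul(x_poly, phi1)))
a1 = -linear / (2 * quadratic)  # stationary point of the quadratic

def y1(x):
    # One-term Rayleigh-Ritz approximation.
    return 1 + x + float(a1) * x * (1 - x)

def y_exact(x):
    # Exact solution of y'' + y = -x with y(0) = 1, y(1) = 2.
    return math.cos(x) + (3 - math.cos(1)) / math.sin(1) * math.sin(x) - x

max_error = max(abs(y1(k / 20) - y_exact(k / 20)) for k in range(21))
print(a1, max_error)
```

Working with `Fraction` keeps the coefficient exact; only the comparison with the exact solution uses floating point.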
2 1.8 1.6 1.4 1.2 1 0 0.2 0.4 x 0.6 0.8 1 Figure 8: 87 2. F =dy^2y^22*y*x with(plots): f = (((y i+1]y i])/delx)^2  y i]^2  2*x i]*y i]) phi1 :=sum((((y1 i+1]y1 i])/delx1)^2  y1 i]^2  2*x1 i]*y1 i])*delx1,'i'=0..1 dy1 0] := diff(phi1,y1 0]): dy1 1] := diff(phi1,y1 1]): dy1 2] := diff(phi1,y1 2]): x1 0]:=0: x1 1]:=.5: x1 2]:=1: delx1 := 1/2: y1 0] := 1: y1 2]:=2: y1 1]:=solve(dy1 1]=0,y1 1]): p1:=array(1..6, x1 0],y1 0],x1 1],y1 1],x1 2],y1 2]]): p1:=plot(p1): phi2 :=sum((((y2 i+1]y2 i])/delx2)^2  y2 i]^2  2*x2 i]*y2 i])*delx2,'i'=0..2 dy2 0] := diff(phi2,y2 0]): dy2 1] := diff(phi2,y2 1]): dy2 2] := diff(phi2,y2 2]): dy2 3] := diff(phi2,y2 3]): x2 0]:=0: x2 1]:=1/3: x2 2]:=2/3: x2 3]:=1: delx2 := 1/3: y2 0] := 1: y2 3]:=2: d2 2]:=solve(dy2 2]=0,y2 2]): d2 1]:=solve(dy2 1]=0,y2 2]): d2 3] :=d2 2]d2 1]: y2 1]:= solve(d2 3]=0,y2 1]): y2 2]:=d2 2]: p2:=array(1..8, x2 0],y2 0],x2 1],y2 1],x2 2],y2 2],x2 3],y2 3]]): p2:=plot(p2): phi3 :=sum((((y3 i+1]y3 i])/delx3)^2  y3 i]^2  2*x3 i]*y3 i])*delx3,'i'=0..3 dy3 0] := diff(phi3,y3 0]): dy3 1] := diff(phi3,y3 1]): dy3 2] := diff(phi3,y3 2]):dy3 3] := diff(phi3,y3 3]): 88 dy3 4] := diff(phi3,y3 4]): x3 0]:=0: x3 1]:=1/4: x3 2]:=1/2: x3 3]:=3/4: x3 4]:=1: delx3 := 1/4: y3 0] := 1: y3 4]:=2: d3 1]:=solve(dy3 1]=0,y3 2]): d3 2]:=solve(dy3 2]=0,y3 2]): d3 3]:=solve(dy3 3]=0,y3 2]): d3 1] :=d3 2]d3 1]:d3 3] :=d3 2]d3 3]: d3 1]:=solve(d3 1]=0,y3 3]): d3 3]:=solve(d3 3]=0,y3 3]): d3 1]:= d3 1]d3 3]: y3 1]:= solve(d3 1]=0,y3 1]): y3 3]:=d3 3]: y3 2]:=d3 2]: p3:=array(1..10, x3 0],y3 0],x3 1],y3 1],x3 2],y3 2],x3 3],y3 3],x3 4],y3 4]]): p3:=plot(p3): phi4 :=sum((((y4 i+1]y4 i])/delx4)^2  y4 i]^2  2*x4 i]*y4 i])*delx4,'i'=0..4 dy4 0] := diff(phi4,y4 0]): dy4 1] := diff(phi4,y4 1]): dy4 2] := diff(phi4,y4 2]): dy4 3] := diff(phi4,y4 3]): dy4 4] := diff(phi4,y4 4]): dy4 5] := diff(phi4,y4 5]): x4 0]:=0: x4 1]:=1/5: x4 2]:=2/5: x4 3]:=3/5: x4 4]:=4/5: x4 5]:=1: delx4 := 1/5: y4 0] := 1: y4 5]:=2: d4 1]:=solve(dy4 1]=0,y4 2]): d4 
2]:=solve(dy4 2]=0,y4 3]): d4 3]:=solve(dy4 3]=0,y4 4]): d4 4]:=solve(dy4 4]=0,y4 4]): d4 3]:= d4 3]d4 4]: d4 3]:=solve(d4 3]=0,y4 3]): 89 d4 2]:=d4 2]d4 3]: d4 2]:=solve(d4 2]=0,y4 2]): d4 1]:=d4 1]d4 2]: y4 1]:=solve(d4 1]=0,y4 1]): y4 2]:=d4 2]: y4 3]:=d4 3]: y4 4]:=d4 4]: p4:=array(1..12, x4 0],y4 0],x4 1],y4 1],x4 2],y4 2],x4 3],y4 3],x4 4],y4 4], x4 5],y4 5]]): p4:=plot(p4): y:= cos(x) +((3cos(1))/sin(1))*sin(x)  x: p:=plot(y,x=0..1,color=red,style=line): display({p,p1,p2,p3,p4}) 2 1.8 1.6 1.4 1.2 1 0 0.2 0.4 0.6 0.8 1 Figure 9: 90 CHAPTER 9 9 Hamilton's Principle
Problems
1. If $\ell$ is not preassigned, show that the stationary functions corresponding to the problem
$$\delta \int_0^{\ell} y'^2 \, dx = 0$$
subject to
$$y(0) = 2, \qquad y(\ell) = \sin \ell$$
are of the form $y = 2 + 2x\cos\ell$, where $\ell$ satisfies the transcendental equation
$$2 + 2\ell\cos\ell - \sin\ell = 0.$$
Also verify that the smallest positive value of $\ell$ is between $\pi/2$ and $3\pi/4$.

2. If $\ell$ is not preassigned, show that the stationary functions corresponding to the problem
$$\delta \int_0^{\ell} \left[ y'^2 + 4(y - \ell) \right] dx = 0$$
subject to
$$y(0) = 2, \qquad y(\ell) = \ell^2$$
are of the form $y = x^2 - \frac{2}{\ell}x + 2$, where $\ell$ is one of the two real roots of the quartic equation
$$2\ell^4 - \ell^3 - 1 = 0.$$
3. A particle of mass $m$ is falling vertically under the action of gravity, where $y$ is distance measured downward and no resistive forces are present.
a. Show that the Lagrangian function is
$$L = T - V = m\left(\tfrac{1}{2}\dot y^2 + gy\right) + \text{constant}$$
and verify that the Euler equation of the problem
$$\delta \int_{t_1}^{t_2} L \, dt = 0$$
is the proper equation of motion of the particle.
b. Use the momentum $p = m\dot y$ to write the Hamiltonian of the system.
c. Show that
$$\frac{\partial H}{\partial p} = \phi = \dot y, \qquad \frac{\partial H}{\partial y} = -\dot p.$$
4. A particle of mass $m$ is moving vertically, under the action of gravity and a resistive force numerically equal to $k$ times the displacement $y$ from an equilibrium position. Show that the equation of Hamilton's principle is of the form
$$\delta \int_{t_1}^{t_2} \left( \tfrac{1}{2} m\dot y^2 + mgy - \tfrac{1}{2} k y^2 \right) dt = 0$$
and obtain the Euler equation.

5. A particle of mass $m$ is moving vertically, under the action of gravity and a resistive force numerically equal to $c$ times its velocity $\dot y$. Show that the equation of Hamilton's principle is of the form
$$\delta \int_{t_1}^{t_2} \left( \tfrac{1}{2} m\dot y^2 + mgy \right) dt - \int_{t_1}^{t_2} c\,\dot y\,\delta y \, dt = 0.$$

6. Three masses are connected in series to a fixed support by linear springs. Assuming that only the spring forces are present, show that the Lagrangian function of the system is
$$L = \tfrac{1}{2}\left[ m_1\dot x_1^2 + m_2\dot x_2^2 + m_3\dot x_3^2 - k_1 x_1^2 - k_2 (x_2 - x_1)^2 - k_3 (x_3 - x_2)^2 \right] + \text{constant}$$
where the $x_i$ represent displacements from equilibrium and the $k_i$ are the spring constants.
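1. If $\ell$ is not preassigned, show that the stationary functions corresponding to the problem
$$\delta \int_0^{\ell} (y')^2 \, dx = 0$$
subject to $y(0) = 2$ and $y(\ell) = \sin\ell$ are equal to
$$y = 2 + 2x\cos\ell.$$

Using the Euler equation $L_y - \frac{d}{dx} L_{y'} = 0$ with
$$L = (y')^2, \qquad L_y = 0, \qquad L_{y'} = 2y',$$
we get the second-order ODE
$$-2y'' = 0, \qquad \text{i.e.} \qquad y'' = 0.$$
Integrating twice, we have
$$y = Ax + B.$$
Using the end conditions to solve for $A$ and $B$,
$$y(0) = 2 = A(0) + B \implies B = 2,$$
$$y(\ell) = \sin\ell = A\ell + 2 \implies A = \frac{\sin\ell - 2}{\ell}.$$
Substituting $A$ and $B$ into $y = Ax + B$ gives
$$y = \left( \frac{\sin\ell - 2}{\ell} \right) x + 2.$$
Now, because we have a variable right-hand end point, we must satisfy the following transversality condition:
$$\left. F + (\phi' - y') F_{y'} \right|_{x=\ell} = 0,$$
where $\phi(\ell) = \sin\ell$ is the prescribed end value and
$$F = (y')^2, \qquad F_{y'} = 2y' = \frac{2\sin\ell - 4}{\ell}, \qquad \phi' = \cos\ell, \qquad y' = \frac{\sin\ell - 2}{\ell}.$$
Therefore,
$$[y'(\ell)]^2 + \left( \cos\ell - \frac{\sin\ell - 2}{\ell} \right) 2y'(\ell) = 0$$
$$[y'(\ell)]^2 + \left( \cos\ell - \frac{\sin\ell - 2}{\ell} \right) \frac{2(\sin\ell - 2)}{\ell} = 0$$
$$\left( \frac{\sin\ell - 2}{\ell} \right)^2 + \left( \cos\ell - \frac{\sin\ell - 2}{\ell} \right) \frac{2(\sin\ell - 2)}{\ell} = 0$$
$$\frac{\sin\ell - 2}{\ell} + 2\left( \cos\ell - \frac{\sin\ell - 2}{\ell} \right) = 0$$
$$\sin\ell - 2 + 2\ell\cos\ell - 2\sin\ell + 4 = 0$$
$$2 + 2\ell\cos\ell - \sin\ell = 0$$
which is our transversality condition. (Dividing by $(\sin\ell - 2)/\ell$ in the fourth line is legitimate because $\sin\ell \le 1 < 2$.) Since $\ell$ satisfies the transcendental equation above,
$$\frac{\sin\ell - 2}{\ell} = 2\cos\ell.$$
Substituting this back into the equation for $y$ yields
$$y = 2 + 2x\cos\ell,$$
which is what we wanted to show.

To verify that the smallest positive value of $\ell$ is between $\pi/2$ and $3\pi/4$, we first solve the transcendental equation for $\ell$:
$$2 + 2\ell\cos\ell - \sin\ell = 0$$
$$2\ell = \frac{\sin\ell}{\cos\ell} - \frac{2}{\cos\ell}$$
$$\ell = \tfrac{1}{2}\tan\ell - \sec\ell.$$
Then plot the curves
$$y = \ell, \qquad y = \tfrac{1}{2}\tan\ell - \sec\ell$$
between $0$ and $\pi$ to see where they intersect.

Figure 10: Plot of $y = \ell$ and $y = \tfrac{1}{2}\tan\ell - \sec\ell$

Since the curves appear to intersect near $\pi/2$, let's examine the behavior of $y = \tfrac{1}{2}\tan\ell - \sec\ell$ at $\pi/2$ analytically:
$$\lim_{\ell \to \pi/2} \left( \tfrac{1}{2}\tan\ell - \sec\ell \right) = \lim_{\ell \to \pi/2} \left( \frac{1}{2}\frac{\sin\ell}{\cos\ell} - \frac{1}{\cos\ell} \right) = \lim_{\ell \to \pi/2} \frac{\sin\ell - 2}{2\cos\ell},$$
where the numerator tends to $-1$ and the denominator tends to $0$, so the expression is unbounded: it tends to $-\infty$ from the left of $\pi/2$ and to $+\infty$ from the right.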
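This agrees with the plot: the curve blows up at $\ell = \pi/2$ and crosses $y = \ell$ to the right of it. More precisely, $g(\ell) = 2 + 2\ell\cos\ell - \sin\ell$ satisfies $g(\ell) \ge 1$ on $(0, \pi/2]$ (there $\cos\ell \ge 0$ and $\sin\ell \le 1$), while $g(3\pi/4) = 2 - \tfrac{\sqrt{2}}{2}\left( \tfrac{3\pi}{2} + 1 \right) < 0$. Therefore the smallest positive value of $\ell$ lies between $\pi/2$ and $3\pi/4$.

2.
$$\delta \int_0^{\ell} \left[ (y')^2 + 4(y - \ell) \right] dx = 0$$
subject to
$$y(0) = 2, \qquad y(\ell) = \ell^2.$$
Since $L = (y')^2 + 4(y - \ell)$ we have $L_y = 4$ and $L_{y'} = 2y'$.
Thus Euler's equation $L_y - \frac{d}{dx} L_{y'} = 0$ becomes $\frac{d}{dx} 2y' = 4$.
Integrating leads to $y' = 2x + \frac{c_1}{2}$.
Integrating again, $y = x^2 + \frac{c_1}{2} x + c_2$.
Now use the left end condition: $y(0) = 2 = 0 + 0 + c_2$.
At $x = \ell$ we have: $y(\ell) = \ell^2 = \ell^2 + \frac{c_1}{2}\ell + 2 \implies c_1 = -\frac{4}{\ell}$.
Thus the solution is: $y = x^2 - \frac{2}{\ell}x + 2$.
Let's differentiate $y$ for the transversality condition: $y' = 2x - \frac{2}{\ell}$.
Now we apply the transversality condition
$$\left. L + (\phi' - y') L_{y'} \right|_{x=\ell} = 0,$$
where $\phi = \ell^2$ and $\phi' = 2\ell$.
Now substituting for $\phi'$, $L$, $L_{y'}$, $y$ and $y'$ and evaluating at $x = \ell$, we obtain
$$\left( 2\ell - \frac{2}{\ell} \right)^2 + 4\left( \ell^2 - \frac{2}{\ell}\ell + 2 - \ell \right) + \left( 2\ell - \left( 2\ell - \frac{2}{\ell} \right) \right) 2\left( 2\ell - \frac{2}{\ell} \right) = 0$$
$$4\ell^2 - 8 + \frac{4}{\ell^2} + 4(\ell^2 - \ell) + \frac{4}{\ell}\left( 2\ell - \frac{2}{\ell} \right) = 0$$
$$4\ell^2 - 8 + \frac{4}{\ell^2} + 4\ell^2 - 4\ell + 8 - \frac{8}{\ell^2} = 0$$
$$8\ell^2 - 4\ell - \frac{4}{\ell^2} = 0$$
$$2\ell^4 - \ell^3 - 1 = 0.$$
Therefore the final solution is
$$y = x^2 - \frac{2}{\ell}x + 2,$$
where $\ell$ is one of the two real roots of $2\ell^4 - \ell^3 - 1 = 0$.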
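The two end-point conditions obtained in solutions 1 and 2 can be checked numerically. The following is a minimal sketch, not part of the original notes: the helper names `bisect`, `transversality_1` and `quartic_2` are hypothetical, and the brackets passed to the bisection are assumptions chosen by inspecting sign changes of each function.

```python
import math

def bisect(f, lo, hi, tol=1e-12):
    """Return a root of f in [lo, hi]; f(lo) and f(hi) must not share a sign."""
    flo, fhi = f(lo), f(hi)
    if flo * fhi > 0:
        raise ValueError("no sign change on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if flo * fmid <= 0:
            hi = mid              # root lies in the left half
        else:
            lo, flo = mid, fmid   # root lies in the right half
    return 0.5 * (lo + hi)

def transversality_1(l):
    # Condition from solution 1: 2 + 2*l*cos(l) - sin(l) = 0
    return 2.0 + 2.0 * l * math.cos(l) - math.sin(l)

def quartic_2(l):
    # Condition from solution 2: 2*l^4 - l^3 - 1 = 0
    return 2.0 * l**4 - l**3 - 1.0

# Smallest positive root of the transcendental equation, bracketed by pi/2 and 3*pi/4.
root_1 = bisect(transversality_1, math.pi / 2, 3 * math.pi / 4)

# The two real roots of the quartic; the brackets are assumed from sign changes.
roots_2 = (bisect(quartic_2, -1.0, -0.5), bisect(quartic_2, 0.5, 1.5))

print(root_1, roots_2)
```

The quartic factors as $(\ell - 1)(2\ell^3 + \ell^2 + \ell + 1)$, so one of its real roots is exactly $\ell = 1$; the other is the single real root of the cubic factor, which is negative.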
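3. First, using Newton's Second Law of Motion, a particle of mass $m$ with position $y$ is acted on by the force of gravity. Summing the forces gives
$$m\ddot y - F = 0.$$
Taking the downward direction of $y$ to be positive, $F = mg$. Thus
$$m\ddot y - mg = 0.$$
From Eqn (9) and the definition $T = \tfrac{1}{2} m\dot y^2$ we obtain
$$\int_{t_1}^{t_2} (\delta T + F\,\delta y) \, dt = 0.$$
From Eqn (10),
$$\int_{t_1}^{t_2} (m\dot y\,\delta\dot y + F\,\delta y) \, dt = 0.$$
Defining the potential energy by
$$F\,\delta y = -\delta V = mg\,\delta y$$
gives
$$\delta \int_{t_1}^{t_2} (T - V) \, dt = 0 \qquad \text{or} \qquad \delta \int_{t_1}^{t_2} \left( \tfrac{1}{2} m\dot y^2 + mgy \right) dt = 0.$$
If we define the Lagrangian $L$ as $L \equiv T - V$, we obtain the result
$$L = m\left( \tfrac{1}{2}\dot y^2 + gy \right) + \text{constant}.$$
Note: The constant is arbitrary; it reflects the choice of zero level for the potential energy.

To show the Euler equation holds, recall
$$L = m\left( \tfrac{1}{2}\dot y^2 + gy \right) + \text{constant}.$$
Thus,
$$L_y = mg, \qquad L_{\dot y} = m\dot y, \qquad \frac{d}{dt} L_{\dot y} = m\ddot y,$$
$$L_y - \frac{d}{dt} L_{\dot y} = mg - m\ddot y = m(g - \ddot y).$$
Since the particle falls under gravity (no initial velocity), $\ddot y = g$ and
$$L_y - \frac{d}{dt} L_{\dot y} = 0.$$
The Euler equation holds.

b. Let $p = m\dot y$. The Hamiltonian of the system is
$$H(t, x, p) = -L(t, x, \phi(t, x, p)) + p\,\phi(t, x, p) = -\left[ m\left( \tfrac{1}{2}\dot y^2 + gy \right) + \text{constant} \right] + m\dot y\,\phi(t, x, p).$$
Here $\phi(t, x, p)$ denotes $\dot y$ expressed through the momentum, $\phi = p/m$, so that $H = \frac{p^2}{2m} - mgy - \text{constant}$.

c.
$$\frac{\partial H}{\partial p} = \phi = \dot y \quad \text{(by definition)},$$
$$\frac{\partial H}{\partial y} = -mg = -m\ddot y = -\dot p.$$

4. Newton's second law:
$$m\ddot R - F = 0.$$
Note that $F = mg - kR$, so we have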
$$\int_{t_1}^{t_2} \left( m\ddot R\,\delta R - mg\,\delta R + kR\,\delta R \right) dt = 0.$$
This can also be written as
$$\delta \int_{t_1}^{t_2} \left( \tfrac{1}{2} m\dot R^2 + mgR - \tfrac{1}{2} kR^2 \right) dt = 0.$$
To obtain Euler's equation, we let
$$L = \tfrac{1}{2} m\dot R^2 + mgR - \tfrac{1}{2} kR^2.$$
Therefore
$$L_R = mg - kR, \qquad L_{\dot R} = m\dot R,$$
$$L_R - \frac{d}{dt} L_{\dot R} = mg - kR - m\ddot R = 0.$$

5. The first two terms are as before (coming from $ma$ and gravity). The second integral gives the resistive force contribution, which is proportional to $\dot y$ with a constant of proportionality $c$. Note that this term is negative because the resistive force opposes the motion.

6. Here we notice that the first spring moves a distance of $x_1$ relative to rest. The second spring in the series moves a distance $x_2$ relative to its original position, but $x_1$ of that was the contribution of the first spring; therefore its net extension is $x_2 - x_1$. Similarly, the third spring is extended by $x_3 - x_2$ units.

CHAPTER 10
10 Degrees of Freedom - Generalized Coordinates
Problems
1. Consider the functional
$$I(y) = \int_a^b \left[ r(t)\dot y^2 + q(t) y^2 \right] dt.$$
Find the Hamiltonian and write the canonical equations for the problem.

2. Give Hamilton's equations for
$$I(y) = \int_a^b \sqrt{(t^2 + y^2)(1 + \dot y^2)} \, dt.$$
Solve these equations and plot the solution curves in the $yp$ plane.

3. A particle of unit mass moves along the $y$ axis under the influence of a potential
$$f(y) = -\omega^2 y + a y^2$$
where $\omega$ and $a$ are positive constants.
a. What is the potential energy $V(y)$? Determine the Lagrangian and write down the equations of motion.
b. Find the Hamiltonian $H(y, p)$ and show it coincides with the total energy. Write down Hamilton's equations. Is energy conserved? Is momentum conserved?
c. If the total energy $E$ is $\frac{\omega^2}{10}$, and $y(0) = 0$, what is the initial velocity?
d. Sketch the possible phase trajectories in phase space when the total energy in the system is given by $E = \frac{\omega^6}{12a^2}$.
Hint: Note that $p = \sqrt{2}\,\sqrt{E - V(y)}$.
What is the value of $E$ above which oscillatory solution is not possible?
4. A particle of mass $m$ moves in one dimension under the influence of the force $F(y, t) = k y^{-2} e^{t}$, where $y(t)$ is the position at time $t$, and $k$ is a constant. Formulate Hamilton's principle for this system, and derive the equations of motion. Determine the Hamiltonian and compare it with the total energy.

5. A Lagrangian has the form
$$L(x, y, y') = \frac{a^2}{12}(y')^4 + a (y')^2 G(y) - G(y)^2$$
where $G$ is a given differentiable function. Find Euler's equation and a first integral.

6. If the Lagrangian $L$ does not depend explicitly on time $t$, prove that $H = \text{constant}$, and if $L$ doesn't depend explicitly on a generalized coordinate $y$, prove that $p = \text{constant}$.

7. Consider the differential equations
$$r^2\dot\theta = C, \qquad \ddot r - r\dot\theta^2 + \frac{k}{m} r^{-2} = 0$$
governing the motion of a mass in an inverse-square central force field.
a. Show by the chain rule that
$$\dot r = C r^{-2} \frac{dr}{d\theta}, \qquad \ddot r = C^2 r^{-4} \frac{d^2 r}{d\theta^2} - 2C^2 r^{-5} \left( \frac{dr}{d\theta} \right)^2$$
and therefore the differential equations may be written
$$\frac{d^2 r}{d\theta^2} - 2r^{-1} \left( \frac{dr}{d\theta} \right)^2 - r + \frac{k}{C^2 m} r^2 = 0.$$
b. Let $r = u^{-1}$ and show that
$$\frac{d^2 u}{d\theta^2} + u = \frac{k}{C^2 m}.$$
c. Solve the differential equation in part b to obtain
$$u = r^{-1} = \frac{k}{C^2 m} \left( 1 + \epsilon\cos(\theta - \theta_0) \right)$$
where $\epsilon$ and $\theta_0$ are constants of integration.
d. Show that elliptical orbits are obtained when $\epsilon < 1$.

CHAPTER 11
11 Integrals Involving Higher Derivatives
Problems
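1. Derive the Euler equation of the problem
$$\delta \int_{x_1}^{x_2} F(x, y, y', y'') \, dx = 0$$
in the form
$$\frac{d^2}{dx^2}\left( \frac{\partial F}{\partial y''} \right) - \frac{d}{dx}\left( \frac{\partial F}{\partial y'} \right) + \frac{\partial F}{\partial y} = 0$$
and show that the associated natural boundary conditions are
$$\left[ \left( \frac{d}{dx}\frac{\partial F}{\partial y''} - \frac{\partial F}{\partial y'} \right) \delta y \right]_{x_1}^{x_2} = 0 \qquad \text{and} \qquad \left[ \frac{\partial F}{\partial y''}\,\delta y' \right]_{x_1}^{x_2} = 0.$$

2. Derive the Euler equation of the problem
$$\delta \int_{x_1}^{x_2} \int_{y_1}^{y_2} F(x, y, u, u_x, u_y, u_{xx}, u_{xy}, u_{yy}) \, dx\,dy = 0$$
where $x_1$, $x_2$, $y_1$ and $y_2$ are constants, in the form
$$\frac{\partial^2}{\partial x^2}\left( \frac{\partial F}{\partial u_{xx}} \right) + \frac{\partial^2}{\partial x\,\partial y}\left( \frac{\partial F}{\partial u_{xy}} \right) + \frac{\partial^2}{\partial y^2}\left( \frac{\partial F}{\partial u_{yy}} \right) - \frac{\partial}{\partial x}\left( \frac{\partial F}{\partial u_x} \right) - \frac{\partial}{\partial y}\left( \frac{\partial F}{\partial u_y} \right) + \frac{\partial F}{\partial u} = 0$$
and show that the associated natural boundary conditions are then
$$\left[ \left( \frac{\partial}{\partial x}\frac{\partial F}{\partial u_{xx}} + \frac{\partial}{\partial y}\frac{\partial F}{\partial u_{xy}} - \frac{\partial F}{\partial u_x} \right) \delta u \right]_{x_1}^{x_2} = 0, \qquad \left[ \frac{\partial F}{\partial u_{xx}}\,\delta u_x \right]_{x_1}^{x_2} = 0$$
and
$$\left[ \left( \frac{\partial}{\partial y}\frac{\partial F}{\partial u_{yy}} + \frac{\partial}{\partial x}\frac{\partial F}{\partial u_{xy}} - \frac{\partial F}{\partial u_y} \right) \delta u \right]_{y_1}^{y_2} = 0, \qquad \left[ \frac{\partial F}{\partial u_{yy}}\,\delta u_y \right]_{y_1}^{y_2} = 0.$$

3. Specialize the results of problem 2 in the case of the problem
$$\delta \int_{x_1}^{x_2} \int_{y_1}^{y_2} \left[ \tfrac{1}{2} u_{xx}^2 + \tfrac{1}{2} u_{yy}^2 + \alpha\,u_{xx} u_{yy} + (1 - \alpha) u_{xy}^2 \right] dx\,dy = 0$$
where $\alpha$ is a constant.
Hint: Show that the Euler equation is $\nabla^4 u = 0$ regardless of the value of $\alpha$, but the natural boundary conditions depend on $\alpha$.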
4. Specialize the results of problem 1 in the case
$$F = a(x)(y'')^2 - b(x)(y')^2 + c(x) y^2.$$

5. Find the extremals
a. $I(y) = \int_0^1 \left( y y' + (y'')^2 \right) dx$,
$y(0) = 0$, $y'(0) = 1$, $y(1) = 2$, $y'(1) = 4$.
b. $I(y) = \int_0^1 \left( y^2 + (y')^2 + (y'' + y')^2 \right) dx$,
$y(0) = 1$, $y'(0) = 2$, $y(1) = 0$, $y'(1) = 0$.

6. Find the extremals for the functional
$$I(y) = \int_a^b \left( y^2 + 2\dot y^2 + \ddot y^2 \right) dt.$$

7. Solve the following variational problem by finding extremals satisfying the given conditions
$$I(y) = \int_0^1 \left( 1 + (y'')^2 \right) dx,$$
$y(0) = 0$, $y'(0) = 1$, $y(1) = 1$, $y'(1) = 1$.
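As an illustration of how the Euler equation of problem 1 is used, consider problems 5a and 7, reading their integrands as $y y' + (y'')^2$ and $1 + (y'')^2$. In both cases the Euler equation reduces to $2y'''' = 0$ (in 5a the terms from $y y'$ cancel), so the extremal is a cubic fixed by the four end conditions. The sketch below is an assumption-laden check rather than a worked solution from the notes; the function name `cubic_from_end_conditions` is hypothetical.

```python
def cubic_from_end_conditions(y0, dy0, y1, dy1):
    """Coefficients (a, b, c, d) of y = a*x**3 + b*x**2 + c*x + d on [0, 1]
    satisfying y(0) = y0, y'(0) = dy0, y(1) = y1, y'(1) = dy1."""
    d = y0                   # y(0)  = d
    c = dy0                  # y'(0) = c
    r1 = y1 - c - d          # y(1)  = a + b + c + d  ->  a + b   = r1
    r2 = dy1 - c             # y'(1) = 3a + 2b + c    ->  3a + 2b = r2
    a = r2 - 2.0 * r1
    b = 3.0 * r1 - r2
    return a, b, c, d

def evaluate(coeffs, x):
    """Value and first derivative of the cubic at x."""
    a, b, c, d = coeffs
    return a * x**3 + b * x**2 + c * x + d, 3 * a * x**2 + 2 * b * x + c

# Problem 5a end conditions: y(0) = 0, y'(0) = 1, y(1) = 2, y'(1) = 4
coeffs_5a = cubic_from_end_conditions(0, 1, 2, 4)

# Problem 7 end conditions: y(0) = 0, y'(0) = 1, y(1) = 1, y'(1) = 1
coeffs_7 = cubic_from_end_conditions(0, 1, 1, 1)

print(coeffs_5a, coeffs_7)
```

Any cubic satisfies $y'''' = 0$ automatically, so only the end conditions need checking; `evaluate` can be used to confirm them at $x = 0$ and $x = 1$.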