MATH 450 - NOTES
April 10, 2009
Contents
1 METRIC SPACES
1.1 Definition and Examples
1.2 Open Balls and Sets
1.3 Closed Sets, Interior and Limit Points
1.4 Assignment 1
1.5 Sequences in Metric Spaces
1.6 Assignment 2
1.7 Continuous Functions
1.8 Properties of Continuous Functions
1.9 Some Odds and Ends
1.10 Assignment 3
2 DIFFERENTIAL CALCULUS ON R^n
2.1 The Topology of R^n
2.2 The Derivative of a function of several variables
2.3 Some Properties of Derivatives
2.4 Assignment 4
2.5 Directional and Partial Derivatives
2.6 Assignment 5
2.7 Higher order derivatives
2.8 Taylor's Formula
2.9 Assignment 6
3 THE INVERSE AND IMPLICIT FUNCTION THEOREMS
3.1 The Inverse Function Theorem
3.2 Assignment 7
3.3 The Implicit Function Theorem
3.4 Assignment 8
4 THE BANACH SPACE C(X)
4.1 Consequences and Examples
4.2 A nowhere differentiable continuous function
4.3 Assignment 9
4.4 Dense subsets of C(X)
4.5 Compactness in C(X)
4.6 Assignment 10
5 POWER SERIES AND FOURIER SERIES
5.1 Number series with non-negative terms
5.2 Power Series
5.3 Assignment 11
5.4 Fourier Series and Hilbert Spaces
5.5 Assignment 12
1
METRIC SPACES
1.1
Definition and Examples
Definition 1 Let X be a set. A function d : X × X → R is called a metric or distance function if for all x, y, z ∈ X
1. d(x, y) ≥ 0, and d(x, y) = 0 if and only if x = y.
2. d(x, y) = d(y, x).
3. d(x, z) ≤ d(x, y) + d(y, z).
The pair (X, d) is called a metric space. If it is clear from the context which metric we refer to, we will not explicitly mention the metric and just call X a metric space. For the remaining part of the course we will assume that we have a metric space, even if it is not explicitly mentioned.
Some Examples:
1. On R we can define d(x, y) = |x − y|. This is a metric and makes R into a metric space.
2. Other metrics on R:
\[ d_1(x, y) = \frac{|x - y|}{1 + |x - y|}, \]
\[ d_2(x, y) = \begin{cases} 0 & x = y \\ 1 & x \neq y \end{cases} \tag{1} \]
The second metric can be used to make any set into a metric space.
3. Metrics on R^n: let x, y ∈ R^n, x = (x_1, ..., x_n), y = (y_1, ..., y_n).
\[ d_2(x, y) = \left( \sum_{i=1}^{n} |x_i - y_i|^2 \right)^{1/2} \tag{2} \]
\[ d_1(x, y) = \sum_{i=1}^{n} |x_i - y_i| \tag{3} \]
\[ d_\infty(x, y) = \max \{ |x_i - y_i| : i = 1, \dots, n \} \tag{4} \]
4. Let C((a, b)) be the set of all real valued continuous functions on (a, b). Then
\[ d_2(f, g) = \left( \int_a^b |f - g|^2 \, dx \right)^{1/2} \tag{5} \]
\[ d_1(f, g) = \int_a^b |f - g| \, dx \tag{6} \]
\[ d_\infty(f, g) = \sup \{ |f(x) - g(x)| : x \in (a, b) \} \tag{7} \]
are all metrics on this space.
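The metrics (1) to (4) are easy to experiment with numerically. The following is a minimal sketch, not part of the original notes: the function names and the random sample are illustrative assumptions. It checks the three axioms of Definition 1 on a finite sample of points in R^3, which can suggest but of course not prove that the functions are metrics.

```python
import itertools
import math
import random

def d1(x, y):
    """Taxicab metric, equation (3)."""
    return sum(abs(a - b) for a, b in zip(x, y))

def d2(x, y):
    """Euclidean metric, equation (2)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def d_inf(x, y):
    """Maximum metric, equation (4)."""
    return max(abs(a - b) for a, b in zip(x, y))

def discrete(x, y):
    """Discrete metric, equation (1); works on any set."""
    return 0 if x == y else 1

def satisfies_axioms(d, points, tol=1e-12):
    """Check the three metric axioms on every triple drawn from `points`."""
    for x, y, z in itertools.product(points, repeat=3):
        # Axiom 1: non-negativity, and d(x, y) = 0 exactly when x = y.
        if d(x, y) < 0 or (d(x, y) == 0) != (x == y):
            return False
        # Axiom 2: symmetry.
        if abs(d(x, y) - d(y, x)) > tol:
            return False
        # Axiom 3: triangle inequality (with a small floating point tolerance).
        if d(x, z) > d(x, y) + d(y, z) + tol:
            return False
    return True

random.seed(0)  # hypothetical sample, fixed for reproducibility
sample = [tuple(random.uniform(-1, 1) for _ in range(3)) for _ in range(8)]
results = {d.__name__: satisfies_axioms(d, sample) for d in (d1, d2, d_inf, discrete)}
print(results)
```

A failed check would exhibit a concrete counterexample; a passed check only says that no counterexample was found in the sample.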
1.2
Open Balls and Sets:
Definition 2 Let (X, d) be a metric space, x_0 ∈ X and r > 0. The set B_r(x_0) = {x ∈ X : d(x, x_0) < r} is called the open ball of radius r around x_0.
Example: In R^2 with the metric d_2, the open ball of radius 1 around the origin is just the interior of the circle of radius 1 centered at the origin. With the metric d_1 the ball is the interior of the diamond with vertices at (1, 0), (0, 1), (−1, 0), (0, −1). With the metric d_∞ the ball is the square (−1, 1) × (−1, 1). It is important to note that balls can come in many shapes or forms. Often we consider balls with their centers removed. We will use the following notation: Ḃ_r(x) = B_r(x) \ {x}.
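The different shapes of the unit ball in R^2 can be seen by testing a few points. This is an illustrative sketch only: the helper `in_open_ball` and the chosen test points are assumptions made for this example.

```python
import math

# The three metrics on R^2 from the examples, written as small functions.
metrics = {
    "d1": lambda x, y: sum(abs(a - b) for a, b in zip(x, y)),
    "d2": lambda x, y: math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y))),
    "d_inf": lambda x, y: max(abs(a - b) for a, b in zip(x, y)),
}

def in_open_ball(d, x, center, r):
    """True if x lies in the open ball B_r(center) for the metric d."""
    return d(x, center) < r

origin = (0.0, 0.0)
for point in [(0.4, 0.4), (0.6, 0.6), (0.8, 0.8)]:
    membership = {name: in_open_ball(d, point, origin, 1.0) for name, d in metrics.items()}
    print(point, membership)
```

The point (0.6, 0.6) lies outside the diamond but inside the disc and the square, while (0.8, 0.8) lies only inside the square.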
Definition 3 Let (X, d) be a metric space. A set O ⊂ X is open if for any x ∈ O there is an r > 0 such that B_r(x) ⊂ O.
The first thing to note is that we used the term open in two different ways, once in connection with open balls, the next time in connection with open sets. We need to show that open balls are open sets. To do this consider the open ball B_r(x_0). Let x ∈ B_r(x_0). Let
\[ \delta = \frac{r - d(x, x_0)}{2}. \]
We will show that B_δ(x) ⊂ B_r(x_0). To do this let y ∈ B_δ(x). Then
\[ d(y, x_0) \le d(y, x) + d(x, x_0) < \delta + d(x, x_0) = \frac{r}{2} - \frac{d(x, x_0)}{2} + d(x, x_0) < r, \]
so y ∈ B_r(x_0).
Proposition 1 Let (X, d) be a metric space. Then
1. X and ∅ are open.
2. If {O_α : α ∈ A} is an arbitrary collection of open sets, then ⋃_{α∈A} O_α is also open.
3. If O_1 and O_2 are open, then so is O_1 ∩ O_2.
Proof: To show the first assertion, clearly B_1(x) ⊂ X for all x ∈ X. For any x ∈ ∅ (there are none) B_1(x) ⊂ ∅. For the second assertion let x ∈ ⋃_{α∈A} O_α; then there is an α_0 such that x ∈ O_{α_0}. Since this set is open there is an r > 0 such that B_r(x) ⊂ O_{α_0} ⊂ ⋃_{α∈A} O_α. Thus the union is open. For the intersection observe that if x ∈ O_1 ∩ O_2 then there are r_1, r_2 > 0 such that B_{r_1}(x) ⊂ O_1 and B_{r_2}(x) ⊂ O_2. Let r = min{r_1, r_2}; then B_r(x) ⊂ B_{r_1}(x) ∩ B_{r_2}(x) ⊂ O_1 ∩ O_2.
Note that the intersection of arbitrarily many open sets may not be open. If we consider the collection of open balls around a point p with radii 1/n, then B_{1/n}(p) is open for every n ∈ N, but
\[ \bigcap_{n=1}^{\infty} B_{1/n}(p) = \{p\}, \]
which is not open in general (for example in R with the standard metric).
Proposition 2 Let (X, d) be a metric space. Then X has the Hausdorff property, i.e. for any p, q ∈ X with p ≠ q there exist open sets U_p and U_q such that p ∈ U_p, q ∈ U_q and U_p ∩ U_q = ∅.
Proof: Since p ≠ q we have r = d(p, q) > 0. Then B_{r/3}(p) ∩ B_{r/3}(q) = ∅: a point z in both balls would give d(p, q) ≤ d(p, z) + d(z, q) < 2r/3 < r, which is impossible.
1.3
Closed Sets, Interior and Limit Points:
Definition 4 A subset F of a metric space is closed if its complement is open.
Proposition 3 Let (X, d) be a metric space. Then X and the empty set are closed. The union of finitely many closed sets is closed, and the intersection of arbitrarily many closed sets is closed.
Proof: Left as homework.
Definition 5 Let A be a subset of a metric space X. p ∈ X is an interior point of A if there exists an open ball B_r(p) such that B_r(p) ⊂ A. p ∈ X is a limit point of A if for any r > 0 we have Ḃ_r(p) ∩ A ≠ ∅, where Ḃ_r(p) = B_r(p) \ {p} is the ball with its center removed.
It is clear from this that every element of an open set is an interior point.
Proposition 4 Let F be a closed set and p be a limit point of F. Then p ∈ F.
Proof: Assume that p ∉ F. Then p ∈ F^C, which is open. Therefore there is an r > 0 such that B_r(p) ⊂ F^C, i.e. B_r(p) ∩ F = ∅. Thus p cannot be a limit point of F.
1.4
Assignment 1
1. Show that the functions in equations (3), (4), (6), (7) are metrics.
2. Prove Proposition 3.
3. Define l^2 to be the set of all real sequences (a_n) such that Σ_{k=1}^∞ a_k^2 < ∞. For two sequences (a_n), (b_n) ∈ l^2 define
\[ d((a_n), (b_n)) = \left( \sum_{k=1}^{\infty} (a_k - b_k)^2 \right)^{1/2}. \]
Show that this is a metric on l^2.
1.5
Sequences in Metric Spaces
Definition 6 Let {p_n} be a sequence of points in a metric space X. {p_n} converges to p if for every ε > 0 there is an N such that p_n ∈ B_ε(p) for all n ≥ N. {p_n} is a Cauchy sequence if for every ε > 0 there is an N such that d(p_n, p_m) < ε for all n, m ≥ N. X is complete if every Cauchy sequence converges.
Proposition 5 A sequence in a metric space has at most one limit.
Proof: Assume that p_n converges to two limits p and q. For any ε > 0 there is an N_1 such that d(p_n, p) < ε/2 for all n ≥ N_1 and an N_2 such that d(p_n, q) < ε/2 for all n ≥ N_2. For any n ≥ max{N_1, N_2} we have d(p, q) ≤ d(p, p_n) + d(p_n, q) < ε. Since this is true for any ε > 0, it follows that d(p, q) = 0 and so p = q.
Definition 7 A set A is bounded if sup{d(p, q) : p, q ∈ A} is finite. A set C is compact if every open cover of C has a finite subcover.
Proposition 6 Let C be a compact subset of a metric space X. Then C is closed and bounded.
Proof: We first show closed. To do this we show that C^c is open. Let p ∈ C^c. For every q ∈ C there exist balls B_{ε_q}(q) and B_{ε_q}(p) such that B_{ε_q}(p) ∩ B_{ε_q}(q) = ∅ (Proposition 2). The collection of balls B_{ε_q}(q) forms an open cover of C. Since C is compact there are points q_1, ..., q_n such that C ⊂ B_{ε_{q_1}}(q_1) ∪ · · · ∪ B_{ε_{q_n}}(q_n). Let ε = min{ε_{q_1}, ..., ε_{q_n}}; then B_ε(p) ∩ B_{ε_{q_j}}(q_j) = ∅ for every j and therefore B_ε(p) ∩ C = ∅. Hence B_ε(p) ⊂ C^c, and p is an interior point of C^c. Since p was arbitrary it follows that C^c is open and C is closed. For the boundedness, consider the collection of open balls {B_1(q) : q ∈ C}. This is an open cover, and has therefore a finite subcover B_1(q_1), ..., B_1(q_n). Let M = max{d(q_i, q_j) : i, j = 1, ..., n}. For any pair of points p, q ∈ C there exist q_i, q_j such that p ∈ B_1(q_i) and q ∈ B_1(q_j). Thus d(p, q) ≤ d(p, q_i) + d(q_i, q_j) + d(q_j, q) < 1 + M + 1 = M + 2. Thus the set is bounded.
It is important to note that the converse of this proposition is not true in general. However, it is true in R^n as we will see later. In metric spaces compactness can also be characterized by limit points and sequences. We have the following Theorem.
Theorem 1 Let X be a metric space and C ⊂ X. The following are equivalent:
1. C is compact.
2. Every infinite subset of C has a limit point (C is limit point compact).
3. Every sequence in C has a convergent subsequence (C is sequentially compact).
Proof: We first show that (1) implies (2). To do this we prove the contrapositive, i.e. that a subset of C with no limit point is finite. Let A ⊂ C be a set with no limit point. Since A contains all its limit points (there are none) it is closed. For every p ∈ A we may find an open ball B(p) such that B(p) ∩ A = {p}. The collection of these balls together with A^c forms an open cover of C. So there exists a finite subcover. Since A^c does not contain any points in A, each of the balls must be included in the finite subcover. This implies there were only finitely many balls to start with, and A is finite.
Next we prove that (2) implies (3). Given a sequence {p_n} in C, consider the set A = {p_n : n ∈ N}. If this set is finite there is a point p such that p_n = p for infinitely many values of n. This is a constant and therefore convergent subsequence. If A is infinite, then A has a limit point p. Now for each k ∈ N choose p_{n_k} ∈ B_{1/k}(p) with n_k > n_{k−1}. This subsequence converges to p.
The last step is to prove that (3) implies (1). This is the hard part. We first show the following
Lemma 1 If C is sequentially compact and O an open cover of C, then there exists a number δ > 0 such that each subset of C with diameter less than δ is contained in an open set O ∈ O. δ is called the Lebesgue number.
Proof of the Lemma: (by contradiction) Given the open cover O assume that there is no such number δ. In particular for every integer n there is a set of diameter less than 1/n, which is not contained in any open set in the cover. Denote these sets by C_n. Choose any point p_n ∈ C_n. Since C is sequentially compact there exists a subsequence {p_{n_j}} that converges to some point p. p ∈ O for some open set O in the cover. There exists an ε > 0 such that B_ε(p) ⊂ O. For sufficiently large n_j with 1/n_j < ε/2 we have C_{n_j} ⊂ B_{ε/2}(p_{n_j}) and p_{n_j} ∈ B_{ε/2}(p). But then C_{n_j} ⊂ B_ε(p) ⊂ O, which is a contradiction.
Next we prove:
Lemma 2 If C is sequentially compact and ε > 0, there exist points p_1, ..., p_n such that C ⊂ B_ε(p_1) ∪ · · · ∪ B_ε(p_n).
Proof of the Lemma: Pick p_1 ∈ C, p_2 ∈ C \ B_ε(p_1), ..., p_k ∈ C \ (B_ε(p_1) ∪ · · · ∪ B_ε(p_{k−1})). This sequence must end after finitely many steps, since otherwise we would have an infinite sequence with d(p_{n+1}, p_j) ≥ ε for all j = 1, ..., n, which has no convergent subsequence.
We can now finally prove that (3) implies (1): Given any open cover O let δ be the number from the first Lemma (this is called the Lebesgue number of the cover). Let ε = δ/3. The second lemma implies that there exist p_1, ..., p_n such that C ⊂ B_ε(p_1) ∪ · · · ∪ B_ε(p_n). Now observe that the diameter of these balls is at most 2δ/3 < δ. I.e. for each p_j there is an O_j ∈ O such that B_ε(p_j) ∩ C ⊂ O_j. But then C ⊂ O_1 ∪ · · · ∪ O_n. Hence, C is compact.
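The construction in the proof of Lemma 2 is a greedy procedure, and it can be tried out on a finite stand-in for C. The sketch below is an illustration under stated assumptions: the random point cloud in the unit square, the value of ε, and the function name `greedy_epsilon_net` are all hypothetical and not taken from the notes.

```python
import math
import random

def greedy_epsilon_net(points, eps, d):
    """Pick centers one at a time, each lying outside all balls chosen so far.

    This mirrors the choice p_k in C minus (B_eps(p_1) u ... u B_eps(p_{k-1}))
    from the proof of Lemma 2, applied to a finite list of points.
    """
    centers = []
    for p in points:
        if all(d(p, c) >= eps for c in centers):
            centers.append(p)
    return centers

random.seed(1)  # hypothetical sample standing in for the set C
cloud = [(random.random(), random.random()) for _ in range(500)]
eps = 0.25
net = greedy_epsilon_net(cloud, eps, math.dist)  # math.dist is the Euclidean metric d_2

# Every point of the cloud lies in some ball B_eps(center).
covered = all(any(math.dist(p, c) < eps for c in net) for p in cloud)
print(len(net), covered)
```

By construction the chosen centers are pairwise at least ε apart, which is exactly the property that rules out a convergent subsequence if the procedure never stopped.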
1.6
Assignment 2
1. Prove that any finite set is compact.
2. Let A be a closed and bounded subset of R^2. Prove that the set A_1 = {x ∈ R : (x, s) ∈ A} is also closed and bounded.
3. Is the set A in the previous problem compact?
4. Prove that a closed subset of a compact set is compact.
5. Consider the unit circle in R^2. Any point is uniquely determined by the angle θ. In radians consider the sequence θ_0 = 0, θ_{n+1} = θ_n. These all lie on the circle. Show that the set of these points is dense in the circle.
1.7
Continuous Functions:
Definition 8 Let (X, d) and (Y, d′) be metric spaces and A ⊂ X. A function f : A → Y is continuous at x ∈ A if for every ε > 0 there exists a δ > 0 such that d′(f(y), f(x)) < ε for all y ∈ B_δ(x) ∩ A. A function f is continuous on A, if it is continuous at every x ∈ A.
Observe that this can be easily rephrased in the following ways:
1. A function f : A → Y is continuous at x ∈ A if for every ε > 0 there exists a δ > 0 such that f(y) ∈ B_ε(f(x)) for all y ∈ B_δ(x) ∩ A.
2. A function f : A → Y is continuous at x ∈ A if for every ε > 0 there exists a δ > 0 such that B_δ(x) ∩ A ⊂ f^{−1}(B_ε(f(x))).
One of the most important characterisations of continuous functions is the following:
Proposition 7 Let (X, d) and (Y, d′) be metric spaces. A function f : X → Y is continuous on X if and only if f^{−1}(U) is open for every open set U ⊂ Y.
Proof: Let f be continuous on X and U be an open subset of Y. Let x ∈ f^{−1}(U). We need to show that x is an interior point of this set. Now f(x) ∈ U, which is open, thus there exists an ε > 0 such that B_ε(f(x)) ⊂ U. Since f is continuous there exists a δ > 0 such that
B_δ(x) ⊂ f^{−1}(B_ε(f(x))) ⊂ f^{−1}(U),
and x is an interior point of the latter set. Since this holds for every x ∈ f^{−1}(U) this set is open.
Conversely, assume that f^{−1}(U) is open for every open subset U of Y. Let x ∈ X and ε > 0. The set B_ε(f(x)) is open and thus so is f^{−1}(B_ε(f(x))). Hence, x ∈ f^{−1}(B_ε(f(x))) is an interior point and there exists a δ > 0 such that
B_δ(x) ⊂ f^{−1}(B_ε(f(x))),
and f is continuous at x. Since x was arbitrary, this holds for every x ∈ X.
Remark: The above proposition also holds for functions f : A → Y for A ⊂ X, if we say f^{−1}(U) is open relative to A for every open subset U of Y. Recall that a set O is open relative to A if O = A ∩ V for some open subset V of X.
1.8
Properties of Continuous Functions:
Continuous functions have many important properties. In general one can say that continuous functions preserve many topological qualities; the two most important ones for now are compactness and connectedness.
Proposition 8 Let f : X → Y be continuous and A ⊂ X connected. Then f(A) is connected.
Proof: We will prove the contrapositive. Assume that f(A) is not connected; then there exist open sets U and V, each containing a point of f(A), such that f(A) ⊂ U ∪ V and U ∩ V = ∅. Now f^{−1}(U) and f^{−1}(V) are open, each contains a point of A, and
A ⊂ f^{−1}(f(A)) ⊂ f^{−1}(U ∪ V) = f^{−1}(U) ∪ f^{−1}(V), and f^{−1}(U) ∩ f^{−1}(V) = f^{−1}(U ∩ V) = ∅.
Thus A is not connected.
Proposition 9 Let f : X → Y be continuous and A ⊂ X compact. Then f(A) is compact.
Proof: Let {O_α : α ∈ Λ} be an open covering of f(A). Since
\[ A \subset f^{-1}(f(A)) \subset f^{-1}\Bigl( \bigcup_{\alpha \in \Lambda} O_\alpha \Bigr) = \bigcup_{\alpha \in \Lambda} f^{-1}(O_\alpha), \]
the collection {f^{−1}(O_α) : α ∈ Λ} is an open cover of A. Hence there exists a finite subcover f^{−1}(O_1), ..., f^{−1}(O_n). But, since A ⊂ f^{−1}(O_1) ∪ ... ∪ f^{−1}(O_n), we have f(A) ⊂ O_1 ∪ ... ∪ O_n, and f(A) is compact.
Another important property of continuous functions is that they preserve convergence of sequences.
Proposition 10 A function f : X → Y is continuous at x ∈ X if and only if for every sequence {x_n} which converges to x we have lim_{n→∞} f(x_n) = f(x).
Proof: Assume that f is continuous at x and that x_n → x. Let ε > 0; then there exists a δ > 0 such that f(y) ∈ B_ε(f(x)) for any y ∈ B_δ(x). For this δ there exists an N such that x_n ∈ B_δ(x) for any n ≥ N. Combining these two statements we get that for ε > 0 there exists an N such that f(x_n) ∈ B_ε(f(x)) for all n ≥ N. Hence f(x_n) → f(x).
Next assume that f is not continuous at x. I.e. there is an ε_0 > 0 such that for any δ > 0 there is an x_δ ∈ B_δ(x) such that f(x_δ) ∉ B_{ε_0}(f(x)). Since we can choose δ freely, we find x_δ for δ = 1/n for any n ∈ N and label it x_n. Since x_n ∈ B_{1/n}(x) for any n ∈ N, x_n → x. However, f(x_n) ∉ B_{ε_0}(f(x)) for all n ∈ N and so f(x_n) cannot converge to f(x). We have thus shown the contrapositive of the statement: if f(x_n) converges to f(x) for any sequence x_n → x then f is continuous at x.
We finish the discussion of continuous functions by looking at uniform continuity.
Definition 9 f : X → Y is uniformly continuous on A ⊂ X if for every ε > 0 there exists a δ > 0 such that d′(f(x), f(y)) < ε for all x, y ∈ A with d(x, y) < δ.
The important result here is
Proposition 11 Let C ⊂ X be compact and f : C → Y be continuous on C. Then f is uniformly continuous on C.
Proof: Let ε > 0. For every x ∈ C there is a δ_x > 0 such that f(y) ∈ B_{ε/2}(f(x)) for all y ∈ B_{δ_x}(x). The collection {B_{δ_x}(x)} forms an open cover of C. Since C is compact, this open cover has a Lebesgue number δ > 0. Now let d(x, y) < δ; then the set U = {x, y} has diameter smaller than δ and hence there exists a set in the open covering which contains U. I.e. there is an x_0 ∈ C such that x, y ∈ U ⊂ B_{δ_{x_0}}(x_0), thus d′(f(x), f(y)) ≤ d′(f(x), f(x_0)) + d′(f(x_0), f(y)) < ε/2 + ε/2 = ε. I.e. f is uniformly continuous.
1.9
Some Odds and Ends:
Definition 10 Let X be a space with two metrics d and d′. The two metrics are equivalent if there exist positive constants C_1 and C_2 such that d(x, y) ≤ C_1 d′(x, y) and d′(x, y) ≤ C_2 d(x, y) for all x, y ∈ X.
Proposition 12 Let X be a space with two metrics d and d′ which are equivalent. Then
1. U ⊂ X is open in the metric d if and only if U is open in d′.
2. U ⊂ X is closed in the metric d if and only if U is closed in d′.
3. U ⊂ X is bounded in the metric d if and only if U is bounded in d′.
4. U ⊂ X is compact in the metric d if and only if U is compact in d′.
Proof: See assignment 3.
This shows that we can often choose the most convenient metric to prove some result. Since continuity only depends on open sets (thanks to Proposition 7) a change from one metric to an equivalent one will not change the continuity of a function. The discussion about equivalent metrics leads immediately to the following
Definition 11 Let (X, d) and (Y, d′) be two metric spaces. A function φ : X → Y is called a homeomorphism if it is a continuous bijection with a continuous inverse function. If there is a homeomorphism between X and Y, the two spaces are called homeomorphic. It is very easy to confuse this term with the term homomorphism, which is a purely algebraic concept.
Example: The map (x, y) → (−y, x) is an isometry on R^2 with the standard Euclidean metric. Any isometry is automatically a homeomorphism.
The fact that the three metrics d_1, d_2, and d_∞ are equivalent on R^n will be very useful later on, as it allows us to use the metric of our choice to prove certain results.
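The equivalence of d_1, d_2 and d_∞ on R^n rests on the chain of inequalities d_∞ ≤ d_2 ≤ d_1 ≤ √n d_2 ≤ n d_∞, whose individual links are the subject of Assignment 3. The following sketch checks the chain on random pairs of points; the dimension, the sample size and the helper names are assumptions chosen for illustration, and a numerical check is no substitute for the proofs asked for in the assignment.

```python
import math
import random

def d1(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def d2(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def d_inf(x, y):
    return max(abs(a - b) for a, b in zip(x, y))

def chain_holds(p, q, tol=1e-12):
    """Check d_inf <= d2 <= d1 <= sqrt(n) * d2 <= n * d_inf for one pair of points."""
    n = len(p)
    a, b, c = d_inf(p, q), d2(p, q), d1(p, q)
    return (a <= b + tol
            and b <= c + tol
            and c <= math.sqrt(n) * b + tol
            and math.sqrt(n) * b <= n * a + tol)

random.seed(2)  # hypothetical sample of pairs in R^5
n = 5
pairs = [
    (tuple(random.uniform(-10, 10) for _ in range(n)),
     tuple(random.uniform(-10, 10) for _ in range(n)))
    for _ in range(1000)
]
all_hold = all(chain_holds(p, q) for p, q in pairs)
print(all_hold)
```

In the language of Definition 10, the chain says for instance that d_2 ≤ 1 · d_1 and d_1 ≤ √n · d_2, so C_1 = 1 and C_2 = √n work for the pair (d_2, d_1).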
1.10
Assignment 3
1. Prove the last Proposition.
2. Consider the metrics d_1 and d_2 on R^n from the first day. Prove that d_1(p, q) ≥ d_2(p, q), and d_2(p, q) ≥ (1/√n) d_1(p, q).
3. Consider the metrics d_∞ and d_2 on R^n from the first day. Prove that d_2(p, q) ≥ d_∞(p, q), and d_∞(p, q) ≥ (1/√n) d_2(p, q).
4. Consider R^n with your favorite metric and R with the standard metric. For a point (x_1, ..., x_n) ∈ R^n and k = 1, ..., n define π_k : R^n → R by π_k(x_1, ..., x_n) = x_k. π_k is called the k-th projection. Prove that π_k is continuous.
5. Consider R^2 and the map (x, y) → (−y, x). Show that this is an isometry.
2
DIFFERENTIAL CALCULUS ON R^n
In this section we explore derivatives of functions of several variables. We will generally assume that R^n is endowed with the Euclidean metric d_2, but use the other metrics if it is more convenient to do so. For x ∈ R^n we denote by |x| = d_2(x, 0) the norm of the vector x. The components of the vector x are usually denoted by (x_1, ..., x_n) or its transpose.
2.1
The Topology of R^n
Before exploring multivariable differential calculus we will discuss briefly the topology of R^n. As seen in the homework and the examples R^n is a metric space, and we may use any of the three metrics used in the examples. However, R^n has some special properties which will be discussed shortly. To begin we recall the Heine-Borel Theorem on the real line. It states that a subset of R is compact if and only if it is closed and bounded. We will use this to prove the Heine-Borel Theorem for R^n:
Proposition 13 (Heine-Borel Theorem) Let C ⊂ R^n; then C is compact if and only if it is closed and bounded.
Proof: Every compact subset of a metric space is closed and bounded, so we only have to prove the other direction. Let C be a closed and bounded subset of R^n. For j = 1, ..., n define a_j = inf{x_j : (x_1, ..., x_j, ..., x_n) ∈ C} and b_j = sup{x_j : (x_1, ..., x_j, ..., x_n) ∈ C}. Then C ⊂ [a_1, b_1] × · · · × [a_n, b_n] and it suffices to show the compactness of the latter set. To do this we show the sequential compactness of this set. In order to make the notation easier we assume that n = 2 and then proceed by induction. Let {(α_k, β_k)} be a sequence in [a_1, b_1] × [a_2, b_2]; then {α_k} is a sequence in [a_1, b_1] and has a converging subsequence {α_{k_l}} which converges to a limit α ∈ [a_1, b_1], since [a_1, b_1] is compact. Next consider the subsequence {(α_{k_l}, β_{k_l})} of the original sequence. Then {β_{k_l}} is a sequence in [a_2, b_2] with a converging subsequence {β_{k_{l_j}}} which converges to β ∈ [a_2, b_2]. Hence the sequence {(α_{k_{l_j}}, β_{k_{l_j}})} converges to (α, β) ∈ [a_1, b_1] × [a_2, b_2].
Now having shown this result for n = 2 it is easy to see that the same proof works to show that A × B is compact for any two compact sets A, B. This allows us to proceed by induction since
[a_1, b_1] × · · · × [a_n, b_n] = ([a_1, b_1] × · · · × [a_{n−1}, b_{n−1}]) × [a_n, b_n],
i.e. the product of two sets which are compact by the induction hypothesis. Finally, since C is a closed subset of the compact set [a_1, b_1] × · · · × [a_n, b_n] it is itself compact.
The section about metric spaces covers everything we need to know about continuous functions. However, we must be careful when trying to evaluate a limit as the following example shows.
Example: Let
\[ f(x, y) = \begin{cases} 1 & |y| > x^2 \\ 1 & |y| = 0 \\ 0 & 0 < |y| < x^2 \end{cases} \]
Then if we approach the origin along any straight line y = ax we get lim_{x→0} f(x, ax) = 1. The same is true when we approach the origin along the y-axis. But if we consider the curve y = x^2/2, we get lim_{x→0} f(x, x^2/2) = 0, so the limit at the origin cannot possibly exist. Unlike in one dimension where we can compute one sided limits and compare them, in several dimensions it is not even sufficient to compute directional limits (as the limits along all straight lines would be). The only certain way to state that a certain number is the limit is to revert back to the definition using ε and δ. However, there is some help, as the sum and product rules for limits still work for real valued functions, as does the composition.
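The behaviour of this example can be observed by evaluating f along the different paths. The sketch below is illustrative only. It makes one assumption that goes beyond the text: on the parabola |y| = x^2 with x ≠ 0, where the definition above does not assign a value, the code returns 1. None of the sampled paths touches that set, so the assumption does not affect the output.

```python
def f(x, y):
    """The example function; the value on |y| == x**2, x != 0 is an assumption."""
    if y == 0 or abs(y) >= x * x:
        return 1
    return 0

# Sample points approaching the origin.
xs = [10.0 ** (-k) for k in range(1, 8)]

# Along straight lines y = a * x (including the x-axis, a = 0).
along_lines = {a: [f(x, a * x) for x in xs] for a in (0.0, 0.5, -2.0, 10.0)}
# Along the y-axis.
along_y_axis = [f(0.0, y) for y in xs]
# Along the parabola y = x^2 / 2.
along_parabola = [f(x, x * x / 2) for x in xs]

print(along_lines)
print(along_y_axis)
print(along_parabola)
```

Every straight line through the origin eventually leaves the region 0 < |y| < x^2, which is why the values along lines are 1 near the origin while the values along the parabola are 0.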
2.2
The Derivative of a function of several variables:
Recall the definition of the derivative of a function of one variable, i.e. if A ⊂ R, f : A → R and a an interior point of A then
\[ f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h}, \]
if this limit exists. Unfortunately, this definition cannot be readily adapted to R^n for n > 1, since it uses a quotient, and vector spaces do not allow for division. So we rewrite this definition as follows: Let A, f, and a be as above; f is differentiable at a if there exists a real number f′(a) and a function ξ : (−ε, ε) → R such that lim_{t→0} ξ(t)/t = 0 and
|f(a + h) − f(a) − f′(a) · h| ≤ |ξ(|h|)|,
for all |h| < ε. To see that this is equivalent assume first that f is differentiable at a in the traditional sense. Then define ξ(t) = max{|f(a + t) − f(a) − f′(a) · t|, |f(a − t) − f(a) + f′(a) · t|}, so that both signs of h are covered. This function has clearly the right properties and thus f satisfies this new condition. Conversely, if f satisfies the new condition at a, then
\[ 0 \le \left| \frac{f(a + h) - f(a)}{h} - f'(a) \right| \le \frac{|\xi(|h|)|}{|h|}, \]
and thus
\[ \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} = f'(a), \]
by the squeeze theorem. We are now ready to define the derivative of a function in several variables.
Definition 12 Let A ⊂ R^n be open, f : A → R^m a function and a ∈ A. Then f is differentiable at a if there is a linear map T(a) : R^n → R^m, and a function ξ : (−ε, ε) → R such that lim_{t→0} ξ(t)/t = 0 and
|f(a + h) − f(a) − T(a) · h| ≤ |ξ(|h|)|,
for all |h| < ε. The operator T(a) is called the derivative of f at a, and is denoted by Df(a). If we choose bases for R^n and R^m, Df(a) is of course represented by an m × n matrix. In general we will assume that we work in the canonical bases, but it is important to realize that the derivative is a linear operator on vector spaces. In the case that m = 1, Df(a) is of course represented by a vector.
Examples:
1. Let f : R^n → R be a constant map, i.e. f(x) = c for all x ∈ R^n. Then f is differentiable and Df(x) = 0 for all x ∈ R^n. To see this let ξ(t) = 0. Then |f(x + h) − f(x) − 0 · h| = 0 ≤ ξ(|h|) for all h.
2. Let f : R^n → R be a linear map, i.e. f(x) = a · x for some a ∈ R^n and all x ∈ R^n. Then f is differentiable and Df(x) = a for all x ∈ R^n. To see this let ξ(t) = 0. Then |f(x + h) − f(x) − a · h| = 0.
3. Let f : R^2 → R be given by f(x, y) = x^2 + y^2. f is differentiable with Df(x, y) = 2(x, y). Let the increment be (h, k); then
(x + h)^2 + (y + k)^2 − x^2 − y^2 − 2(x, y) · (h, k) = x^2 + y^2 + 2xh + 2yk + h^2 + k^2 − x^2 − y^2 − 2xh − 2yk = h^2 + k^2.
So we may use ξ(t) = t^2 and show that the function is differentiable. The same argument works for n > 2. The examples presented above were all cases of real valued functions. As we will see it is sufficient to understand this case.
Proposition 14 Let f : R^n → R^m and let f = (f_1, ..., f_m) in the standard basis of R^m. Then f is differentiable at x if and only if f_j is differentiable at x for j = 1, ..., m.
Proof: Consider the j-th component (f(x + h) − f(x) − Df(x) · h)_j = f_j(x + h) − f_j(x) − (Df(x))_j · h. Using the d_∞ metric on R^m it is clear that
|f(x + h) − f(x) − Df(x) · h|_∞ ≥ |f_j(x + h) − f_j(x) − (Df(x))_j · h| for every j,
and
\[ |f(x + h) - f(x) - Df(x) \cdot h|_\infty \le \max_{j = 1, \dots, m} |f_j(x + h) - f_j(x) - (Df(x))_j \cdot h|. \]
So we can use the same function ξ for showing the differentiability of the components and the function as a whole. Moreover, Df_j(x) = (Df(x))_j.
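Returning to the example f(x, y) = x^2 + y^2, the claim that the remainder f(p + h) − f(p) − Df(p) · h equals |h|^2, so that ξ(t) = t^2 works, can be checked with exact rational arithmetic. This is a sketch under assumptions: the base point, the increments and the helper name `remainder` are chosen for illustration only.

```python
from fractions import Fraction

def f(p):
    return p[0] ** 2 + p[1] ** 2

def Df(p):
    """Derivative of f at p, represented as the vector 2 * (x, y)."""
    return (2 * p[0], 2 * p[1])

def remainder(p, h):
    """f(p + h) - f(p) - Df(p) . h, the quantity bounded by xi(|h|)."""
    shifted = (p[0] + h[0], p[1] + h[1])
    linear_part = Df(p)[0] * h[0] + Df(p)[1] * h[1]
    return f(shifted) - f(p) - linear_part

p = (Fraction(3), Fraction(-2))  # hypothetical base point
checks = []
for k in range(1, 6):
    h = (Fraction(1, 10 ** k), Fraction(-2, 10 ** k))
    norm_squared = h[0] ** 2 + h[1] ** 2
    # Exact comparison: the remainder should be |h|^2, i.e. xi(|h|) with xi(t) = t^2.
    checks.append(remainder(p, h) == norm_squared)
print(checks)
```

Using `Fraction` avoids floating point rounding, so the equality can be tested exactly rather than up to a tolerance.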
2.3
Some Properties of Derivatives
Without proof we can state the linearity properties of the derivative:
Proposition 15 Let f, g be differentiable at some point x ∈ R^n. Then for any a, b ∈ R, af + bg is differentiable at x and D(af + bg)(x) = a Df(x) + b Dg(x).
The proof of this is left as an exercise. More important is the following result:
Proposition 16 Let f : R^n → R^m be differentiable at x ∈ R^n. Then f is continuous at x.
Proof: Let h = y − x; then
|f(y) − f(x)| ≤ |f(x + h) − f(x) − Df(x) · h| + |Df(x) · h| ≤ |ξ(|h|)| + |Df(x) · h|.
The result follows now immediately from the definition of differentiability and the fact that for any linear map T on R^n there is a positive constant C such that |T · h| ≤ C|h|.
Proposition 17 Let f, g be real valued and differentiable at some x ∈ R^n; then fg is differentiable at x and D(fg)(x) = g(x) Df(x) + f(x) Dg(x). Moreover, if g(x) ≠ 0, then f/g is differentiable at x and
\[ D(f/g) = \frac{1}{g^2} \left( g \, Df - f \, Dg \right). \]
Proof: For the first assertion observe that
|f(x + h)g(x + h) − f(x)g(x) − (g(x)Df(x) + f(x)Dg(x)) · h|
= |g(x + h)(f(x + h) − f(x)) + f(x)(g(x + h) − g(x)) − (g(x)Df(x) · h + f(x)Dg(x) · h)|
≤ |g(x + h)(f(x + h) − f(x) − Df(x) · h)| + |f(x)(g(x + h) − g(x) − Dg(x) · h)| + |(g(x + h) − g(x)) Df(x) · h|.
After division by |h| the first two terms converge to 0 as |h| → 0 because of the differentiability of f and g; the last term does so because |Df(x) · h| ≤ C|h| and g is continuous at x. The second part of the Theorem is left as an exercise.
Proposition 18 (Chain Rule) Let g : R^n → R^m be differentiable at x with derivative Dg(x), and f : R^m → R be differentiable at y = g(x) with derivative Df(y). Then f ◦ g is differentiable at x, and D(f ◦ g)(x) = Df(y) · Dg(x).
Proof: Let k = g(x + h) − g(x). Then g(x + h) = y + k. Writing ξ_1 and ξ_2 for the remainder terms of f at y and of g at x, we have
f(g(x + h)) − f(g(x)) = f(y + k) − f(y)
= Df(y) · k + ξ_1(|k|)
= Df(y) · (g(x + h) − g(x)) + ξ_1(|k|)
= Df(y) · (Dg(x) · h + ξ_2(|h|)) + ξ_1(|k|).
Now define ξ(|h|) = Df(y) · ξ_2(|h|) + ξ_1(|k|), which satisfies the right properties since |k| → 0 as |h| → 0. The result follows immediately.
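The chain rule can be sanity-checked numerically by comparing the product Df(y) · Dg(x) with a finite difference approximation of the derivative of f ◦ g. The maps g and f below, the base point and the step size are hypothetical choices made for this sketch; they do not come from the notes.

```python
def g(x):
    """A sample map g : R^2 -> R^2."""
    return (x[0] * x[1], x[0] + x[1] ** 2)

def f(y):
    """A sample map f : R^2 -> R."""
    return y[0] * y[1]

def Dg(x):
    """Jacobian matrix of g at x; row i is the gradient of the i-th component."""
    return ((x[1], x[0]), (1.0, 2.0 * x[1]))

def Df(y):
    """Derivative of f at y, represented as a row vector."""
    return (y[1], y[0])

def chain_rule_gradient(x):
    """Compute Df(g(x)) . Dg(x) as a row vector times a matrix."""
    row = Df(g(x))
    jac = Dg(x)
    return tuple(sum(row[i] * jac[i][j] for i in range(2)) for j in range(2))

def numeric_gradient(F, x, step=1e-6):
    """Central difference approximation of the gradient of F at x."""
    grad = []
    for j in range(len(x)):
        plus, minus = list(x), list(x)
        plus[j] += step
        minus[j] -= step
        grad.append((F(plus) - F(minus)) / (2 * step))
    return tuple(grad)

x = (1.5, -0.5)
exact = chain_rule_gradient(x)
approx = numeric_gradient(lambda p: f(g(p)), x)
print(exact, approx)
```

For this choice f(g(x)) = x_1^2 x_2 + x_1 x_2^3, so the result can also be compared with the gradient computed by hand, (2 x_1 x_2 + x_2^3, x_1^2 + 3 x_1 x_2^2).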
2.4
Assignment 4
1. Let f : R^n → R^m and g : R^n → R^m be both differentiable at x. Show that the scalar product f · g is also differentiable at x.
2. Prove the linearity of the derivative, Proposition 15.
3. Prove the quotient rule in Proposition 17.
4. Let f : R^4 → R be given by f(x_1, x_2, x_3, x_4) = x_1 x_4 − x_2 x_3. Prove that f is differentiable and that Df(x_1, x_2, x_3, x_4) = (x_4, −x_3, −x_2, x_1).
5. The space of 2 × 2 matrices over R is a four dimensional vector space in a natural way. Show that the determinant is a differentiable function from this space to the real numbers.
6. Show that the set of invertible 2 × 2 real matrices is an open subset of the space of 2 × 2 real matrices.
2.5
Directional and Partial Derivatives
Definition 13 Let u be a unit vector in R^n, and f : R^n → R be a function. The directional derivative in direction u of f at x is defined as
\[ \partial_u f(x) = \lim_{t \to 0} \frac{f(x + tu) - f(x)}{t}. \]
If u is a unit vector in the canonical basis of R^n, i.e. u = e_j for some 1 ≤ j ≤ n, then this is called the partial derivative with respect to x_j and denoted by ∂f/∂x_j (x) or f_{x_j}(x).
An easy consequence is:
Lemma 3 Suppose f : R^n → R is differentiable at x. Then for any unit vector u ∈ R^n the directional derivative in direction u exists and ∂_u f(x) = Df(x) · u.
Proof: Consider the function g : R → R^n given by g : t → x + tu. The directional derivative at x is the derivative of the function F = f ◦ g : R → R at 0. Since g is differentiable and Dg(0) = u the result follows immediately from the chain rule proven above.
In the interest of computing derivatives it would be good to have the converse of this last Lemma. I.e. we would like to say that f is differentiable if all partial derivatives exist. Unfortunately that is not the case as the next example shows.
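Example: Let $f$ be defined as follows:
$$f(x_1, x_2) = \begin{cases} 0 & |x_2| \le \tfrac{1}{3}|x_1| \\ 1 & \tfrac{1}{3}|x_1| < |x_2| < 3|x_1| \\ 0 & 3|x_1| \le |x_2| \end{cases}$$
Then $f$ is clearly not differentiable at $(0, 0)$ since it isn't even continuous. But
$$\frac{\partial f}{\partial x_1}(0, 0) = \frac{\partial f}{\partial x_2}(0, 0) = 0.$$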
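The example can be probed numerically. The sketch below encodes the piecewise function (as reconstructed above, with value 1 on the open wedge $\tfrac13|x_1| < |x_2| < 3|x_1|$ and 0 elsewhere) and evaluates the difference quotients along the axes and the values along the diagonal. The sample step sizes are arbitrary choices.

```python
def f(x1, x2):
    # 1 on the open wedge |x1|/3 < |x2| < 3|x1|, 0 elsewhere (including the axes)
    if abs(x1) / 3 < abs(x2) < 3 * abs(x1):
        return 1.0
    return 0.0

def partial_quotients(t):
    """Difference quotients at the origin along e_1 and e_2 for step t."""
    return ((f(t, 0.0) - f(0.0, 0.0)) / t,
            (f(0.0, t) - f(0.0, 0.0)) / t)

steps = (1e-1, 1e-3, 1e-6, -1e-3)
quotients = [partial_quotients(t) for t in steps]
diagonal_values = [f(t, t) for t in (1e-1, 1e-3, 1e-6)]

print(quotients)         # every quotient along the axes vanishes
print(diagonal_values)   # f stays equal to 1 along the diagonal
```

Both partial difference quotients are identically zero, while $f(t, t) = 1$ for every $t \neq 0$, so $f$ cannot be continuous at the origin.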
Before we can prove a partial converse we will introduce a mean value theorem. The mean value theorems of single variable calculus do not apply directly since there is no natural ordering on $\mathbb{R}^n$ that is compatible with the metric space topology. We start with the following

Definition 14 A set $C \subset \mathbb{R}^n$ is called convex if for every $x, y \in C$ and every $s \in [0, 1]$ we have $x(1 - s) + ys \in C$.

Loosely speaking this means that for any two points $x$ and $y$ in $C$, the line segment connecting these two points is also in $C$.

Proposition 19 Let $C$ be a convex subset of $\mathbb{R}^n$ and $f : C \to \mathbb{R}$ be differentiable in $C$ with continuous derivative $D_f(x)$. Let $x, y \in C$. Then there exists an $s \in [0, 1]$ such that
$$f(x) - f(y) = D_f(xs + y(1 - s)) \cdot (x - y).$$

Proof: Consider the function $F(t) = f(xt + y(1 - t))$ on $[0, 1]$ and apply the chain rule and the mean value theorem for one variable.
Theorem 2 Let $f$ be continuous on a ball $B_\epsilon(x) \subset \mathbb{R}^n$, with continuous partial derivatives on $B_\epsilon(x)$. Then $f$ is differentiable at $x$. Moreover, $D_f(x) = (f_{x_1}(x), \dots, f_{x_n}(x))$.

Proof: We will do the proof in two variables; the general case works exactly the same way, but is messier to write down. Let $h = (h_1, h_2)$ and $|h| < \epsilon$. Observe that the ball of radius $\epsilon$ is convex. By the mean value theorem (used twice) there exist $s, t \in (0, 1)$ such that
$$\begin{aligned}
f(x + h) - f(x) &= f(x_1 + h_1, x_2 + h_2) - f(x_1, x_2) \\
&= f(x_1 + h_1, x_2 + h_2) - f(x_1, x_2 + h_2) + f(x_1, x_2 + h_2) - f(x_1, x_2) \\
&= f_{x_1}(x_1 + s h_1, x_2 + h_2)\, h_1 + f_{x_2}(x_1, x_2 + t h_2)\, h_2.
\end{aligned}$$
Thus
$$\begin{aligned}
f(x + h) - f(x) - f_{x_1}(x_1, x_2) h_1 - f_{x_2}(x_1, x_2) h_2 &= (f_{x_1}(x_1 + s h_1, x_2 + h_2) - f_{x_1}(x_1, x_2))\, h_1 \\
&\quad + (f_{x_2}(x_1, x_2 + t h_2) - f_{x_2}(x_1, x_2))\, h_2 \\
&= \xi(h_1, h_2).
\end{aligned}$$
Next observe that the right hand side of this equation satisfies
$$|\xi(h_1, h_2)| \le \left( |f_{x_1}(x_1 + s h_1, x_2 + h_2) - f_{x_1}(x_1, x_2)| + |f_{x_2}(x_1, x_2 + t h_2) - f_{x_2}(x_1, x_2)| \right) |h|,$$
and thus
$$\lim_{|h| \to 0} \frac{|\xi(h_1, h_2)|}{|h|} = 0,$$
by the continuity of the partial derivatives.
This theorem now allows us to easily compute derivatives for differentiable functions, by computing the partial derivatives.

Proposition 20 Suppose that $f$ is differentiable on $B_\epsilon(x_0) \subset \mathbb{R}^n$. If $f$ has a local extremum at $x_0$ then $D_f(x_0) = 0$.

Proof: We assume that $f$ has a local maximum at $x_0$, and that $f(x_0) \ge f(x)$ for all $x \in B_\epsilon(x_0)$. Let $u$ be any unit vector in $\mathbb{R}^n$ and $0 < h < \epsilon$. Then
$$0 \ge f(x_0 + uh) - f(x_0) = D_f(x_0 + \lambda u h) \cdot uh,$$
for some $\lambda \in (0, 1)$. By changing $h$ to $-h$ we have
$$0 \ge f(x_0 - uh) - f(x_0) = D_f(x_0 + \lambda u(-h)) \cdot u(-h).$$
Letting $h \to 0$ we get $\partial_u f(x_0) \le 0$ from the first inequality, and $\partial_u f(x_0) \ge 0$ from the second one. Thus $D_f(x_0) \cdot u = 0$ for all unit vectors $u \in \mathbb{R}^n$, and the result follows.
2.6 Assignment 5:

1. Let $\varphi : \mathbb{R} \to [0, \infty)$ be a differentiable function. Define $F : \mathbb{R}^3 \to [0, \infty)$ by $F(x_1, x_2, x_3) = \varphi\left(\sqrt{x_1^2 + x_2^2 + x_3^2}\right)$. Show that
$$\varphi'\left(\sqrt{x_1^2 + x_2^2 + x_3^2}\right) = \left( \left(\frac{\partial F}{\partial x_1}\right)^2 + \left(\frac{\partial F}{\partial x_2}\right)^2 + \left(\frac{\partial F}{\partial x_3}\right)^2 \right)^{1/2}.$$
2. Let
$$f(x_1, x_2) = \begin{cases} \dfrac{x_1^2 x_2}{x_1^4 + x_2^2} & x_2 \ne 0 \\ 0 & x_2 = 0 \end{cases}$$
Prove that all directional derivatives of $f$ exist at $(0, 0)$ but that $f$ is neither differentiable nor continuous at this point.
3. Suppose $y = F(x_1, x_2)$ is differentiable at $(p_1, p_2)$ with $F_{x_2}(p_1, p_2) \ne 0$. Let $I$ be an open interval containing $p_1$. If $f : I \to \mathbb{R}$ is differentiable and $F(x_1, f(x_1)) = 0$ for all $x_1 \in I$ then
$$f'(p_1) = \frac{-F_{x_1}(p_1, p_2)}{F_{x_2}(p_1, p_2)}.$$
4. Show that
$$f(x_1, x_2) = \begin{cases} (x_1 x_2)^\alpha \log(x_1^2 + x_2^2) & (x_1, x_2) \ne (0, 0) \\ 0 & (x_1, x_2) = (0, 0) \end{cases}$$
is differentiable on $\mathbb{R}^2$ for all $\alpha > 1/2$.
5. Prove that
$$f(x_1, x_2) = \begin{cases} \dfrac{x_1^4 + x_2^4}{(x_1^2 + x_2^2)^\alpha} & (x_1, x_2) \ne (0, 0) \\ 0 & (x_1, x_2) = (0, 0) \end{cases}$$
is differentiable for all $\alpha < 3/2$.
6. Investigate the case when $\alpha = 1/2$ in problem 4 and $\alpha = 3/2$ in problem 5.
2.7 Higher order derivatives:

Let $O \subset \mathbb{R}^n$ be open and $f : O \to \mathbb{R}^m$ be a function that is differentiable for all $x \in O$; then $D_f$ is itself a function defined on $O$. The values of this function are $m \times n$ matrices, which can be thought of as elements of $\mathbb{R}^{mn}$. We can thus look at the differentiability of this function. We observe that every time we take a derivative the dimension of the range increases by a factor $n$, and things will get confusing quickly. If $f$ is a real valued function on $\mathbb{R}^3$ then its fourth derivative will have values in $\mathbb{R}^{81}$! Mindful of this we restrict our discussion to derivatives up to the second order, and real valued functions. For a real-valued function on $\mathbb{R}^n$ the second derivative is an $n \times n$ matrix. We will prove that under certain conditions this matrix is symmetric. Of course we also have higher order directional derivatives and partial derivatives. At this time we make the following definition, which really belongs in a linear algebra course.

Definition 15 Let $V$ be a vector space over $\mathbb{R}$. An $n$-linear form is a map $\varphi : V^n \to \mathbb{R}$ which is linear in each of its components. That is, for $j = 1, \dots, n$ and $a, b \in \mathbb{R}$,
$$\varphi(v_1, \dots, a v_j + b u_j, \dots, v_n) = a\,\varphi(v_1, \dots, v_j, \dots, v_n) + b\,\varphi(v_1, \dots, u_j, \dots, v_n).$$

Now it is easy to see that for $f : \mathbb{R}^n \to \mathbb{R}$ the first derivative is a 1-linear form, the second a bilinear form, and in general the $k$-th a $k$-linear form. Also it is clear that in any given basis, a 1-linear form is given by the inner product with a vector, so $\varphi(v) = a \cdot v$ for some fixed vector $a$ and all $v \in V$. Similarly a bilinear form is given by a matrix $A$ such that $\varphi(v, u) = v^T \cdot A u$. For higher order derivatives it is not so easy to see how the form is represented. But in order to make the notation easier we write $D_f^{(k)}(x)$ for the $k$-th derivative of $f$ at $x$ and keep in mind that this is a $k$-linear form. If $f$ is $k$ times differentiable then the $k$-th directional derivative at $x$ in direction $u$ is given by
$$D_f^{(k)}(x)(u, \dots, u) = D_f^{(k)}(x)(u),$$
i.e. all $k$ arguments of the $k$-linear form are evaluated at $u$. The crucial result in this section is

Theorem 3 (Schwarz's Theorem) Suppose that $f : \mathbb{R}^n \to \mathbb{R}$ is $C^1$ (i.e. continuously differentiable) and that $\dfrac{\partial^2 f}{\partial x_i \partial x_j}$ exists and is continuous on an open set $U \subset \mathbb{R}^n$. Then $\dfrac{\partial^2 f}{\partial x_j \partial x_i}(x)$ exists and
$$\frac{\partial^2 f}{\partial x_j \partial x_i}(x) = \frac{\partial^2 f}{\partial x_i \partial x_j}(x)$$
for all $x \in U$.
Proof: Suppose that $\dfrac{\partial^2 f}{\partial x_i \partial x_j}(x)$ exists at $x \in U$. Let $\epsilon > 0$ be such that $B_\epsilon(x) \subset U$ in the $d_\infty$ metric. For $|h_i|, |h_j| < \epsilon$ define
$$\begin{aligned}
\Delta(h_i, h_j) &= f(x_1, \dots, x_i + h_i, \dots, x_j + h_j, \dots, x_n) - f(x_1, \dots, x_i + h_i, \dots, x_j, \dots, x_n) \\
&\quad - f(x_1, \dots, x_i, \dots, x_j + h_j, \dots, x_n) + f(x_1, \dots, x_i, \dots, x_j, \dots, x_n).
\end{aligned}$$
Applying the Mean Value Theorem twice yields that there are $t, s \in (0, 1)$ such that
$$\begin{aligned}
\Delta(h_i, h_j) &= h_j \left( \frac{\partial f}{\partial x_j}(x_1, \dots, x_i + h_i, \dots, x_j + t h_j, \dots, x_n) - \frac{\partial f}{\partial x_j}(x_1, \dots, x_i, \dots, x_j + t h_j, \dots, x_n) \right) \\
&= h_i h_j \frac{\partial^2 f}{\partial x_i \partial x_j}(x_1, \dots, x_i + s h_i, \dots, x_j + t h_j, \dots, x_n).
\end{aligned}$$
Since this mixed partial is continuous we have that
$$\lim_{h_j \to 0} \lim_{h_i \to 0} \frac{\Delta(h_i, h_j)}{h_i h_j} = \frac{\partial^2 f}{\partial x_i \partial x_j}(x).$$
On the other hand we may use the Mean Value Theorem to get
$$\Delta(h_i, h_j) = h_i \left( \frac{\partial f}{\partial x_i}(x_1, \dots, x_i + u h_i, \dots, x_j + h_j, \dots, x_n) - \frac{\partial f}{\partial x_i}(x_1, \dots, x_i + u h_i, \dots, x_j, \dots, x_n) \right),$$
for some $u \in (0, 1)$, and
$$\lim_{h_i \to 0} \frac{\Delta(h_i, h_j)}{h_i} = \frac{\partial f}{\partial x_i}(x_1, \dots, x_i, \dots, x_j + h_j, \dots, x_n) - \frac{\partial f}{\partial x_i}(x_1, \dots, x_i, \dots, x_j, \dots, x_n),$$
by the continuity of $\dfrac{\partial f}{\partial x_i}$. Thus
$$\lim_{h_j \to 0} \lim_{h_i \to 0} \frac{\Delta(h_i, h_j)}{h_i h_j} = \frac{\partial^2 f}{\partial x_j \partial x_i}(x),$$
and the statement is proven.
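The second difference $\Delta(h_i, h_j)$ used in the proof is easy to evaluate numerically. The sketch below does so for a hypothetical smooth function of two variables (not taken from the notes) and compares $\Delta/(h_i h_j)$ with the mixed partial computed by hand for that function.

```python
import math

def f(x, y):
    # Hypothetical smooth test function, chosen only for illustration
    return math.exp(x * y) + x ** 3 * math.sin(y)

def second_difference(x, y, hi, hj):
    """Delta(h_i, h_j) from the proof, with x_i = x and x_j = y."""
    return f(x + hi, y + hj) - f(x + hi, y) - f(x, y + hj) + f(x, y)

def mixed_partial_exact(x, y):
    # Hand computation for this f:
    # f_y = x e^{xy} + x^3 cos y, so f_{xy} = (1 + xy) e^{xy} + 3 x^2 cos y
    return (1 + x * y) * math.exp(x * y) + 3 * x ** 2 * math.cos(y)

x0, y0 = 0.5, 1.2
hi, hj = 1e-3, 2e-3
approx = second_difference(x0, y0, hi, hj) / (hi * hj)
gap = abs(approx - mixed_partial_exact(x0, y0))
print(approx, mixed_partial_exact(x0, y0), gap)
```

Because $\Delta$ is symmetric in the roles of the two increments, the same quotient approximates both orders of differentiation, which is the idea behind the proof.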
2.8 Taylor's Formula:

Theorem 4 Let $f : \mathbb{R}^n \to \mathbb{R}$ be $k$-times continuously differentiable on an open convex set $S$ containing the point $a$. Then for every $x \in S$ there is a $t \in [0, 1]$ such that
$$f(x) = f(a) + \sum_{j=1}^{k-1} \frac{1}{j!} D_f^{(j)}(a)(x - a) + \frac{1}{k!} D_f^{(k)}((1 - t)a + tx)(x - a).$$

Proof: Let
$$u = \frac{x - a}{|x - a|},$$
and consider $F(s) = f(a + us)$. This function is $k$ times continuously differentiable on an open interval $(-r, R)$ which contains $[0, |x - a|]$. Moreover, $F(|x - a|) = f(x)$. The Taylor Formula for functions on the reals implies that
$$F(|x - a|) = F(0) + \sum_{j=1}^{k-1} \frac{1}{j!} F^{(j)}(0)\,|x - a|^j + \frac{1}{k!} F^{(k)}(\xi)\,|x - a|^k,$$
for some $\xi \in [0, |x - a|]$. From the form of the directional derivatives explained earlier we get
$$f(x) = f(a) + \sum_{j=1}^{k-1} \frac{1}{j!} D_f^{(j)}(a)(u)\,|x - a|^j + \frac{1}{k!} D_f^{(k)}(a + u\xi)(u)\,|x - a|^k.$$
The result now follows immediately from the multilinearity of the derivatives and the fact that $a + \xi u = (1 - t)a + tx$ for some $t \in [0, 1]$.
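To see the formula at work for $k = 3$, the following sketch builds the second-order Taylor polynomial of a hypothetical function $f(x_1, x_2) = e^{x_1} \cos x_2$ (an illustrative choice, with gradient and Hessian worked out by hand) and checks how the remainder shrinks along a fixed direction.

```python
import math

def f(x1, x2):
    # Hypothetical smooth example, not taken from the notes
    return math.exp(x1) * math.cos(x2)

def taylor2(a, x):
    """f(a) + D_f(a)(x-a) + (1/2) D_f^(2)(a)(x-a) for the f above."""
    a1, a2 = a
    d1, d2 = x[0] - a1, x[1] - a2
    e, c, s = math.exp(a1), math.cos(a2), math.sin(a2)
    grad = (e * c, -e * s)                    # (f_x1, f_x2) at a
    hess = ((e * c, -e * s), (-e * s, -e * c))  # second partials at a
    first = grad[0] * d1 + grad[1] * d2
    second = (hess[0][0] * d1 * d1 + 2 * hess[0][1] * d1 * d2
              + hess[1][1] * d2 * d2)
    return f(a1, a2) + first + 0.5 * second

a = (0.2, 0.5)
u = (0.6, 0.8)  # unit vector

def remainder(s):
    x = (a[0] + s * u[0], a[1] + s * u[1])
    return abs(f(*x) - taylor2(a, x))

err_coarse, err_fine = remainder(0.1), remainder(0.05)
print(err_coarse, err_fine, err_coarse / err_fine)
```

Halving $|x - a|$ should reduce the remainder by a factor of roughly 8, which is what a third-order remainder term predicts.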
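2.9 Assignment 6:

1. Consider the map $[0, \infty) \times [0, 2\pi) \to \mathbb{R}^2$, defined by $(x(r, \theta), y(r, \theta)) = (r \cos\theta, r \sin\theta)$. Show that this map is differentiable and compute its derivative. Moreover, prove that the map is onto. Is it also one-to-one?
2. Let $f(x, y)$ be a twice differentiable function on $\mathbb{R}^2$ such that $f_{xx} + f_{yy} = 0$ on $\mathbb{R}^2$. Define $\varphi(r, \theta) = f(r \cos\theta, r \sin\theta)$.
   (a) Compute $\dfrac{\partial \varphi}{\partial r}$ and $\dfrac{\partial \varphi}{\partial \theta}$.
   (b) Prove that
   $$\frac{1}{r^2} \frac{\partial^2 \varphi}{\partial \theta^2} + \frac{1}{r} \frac{\partial \varphi}{\partial r} + \frac{\partial^2 \varphi}{\partial r^2} = 0.$$
3. Consider $f(x_1, x_2) = \sin(x_1^2 + x_2^2)$. Compute the first three terms of the Taylor expansion of this function.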
3 THE INVERSE AND IMPLICIT FUNCTION THEOREMS

3.1 The Inverse Function Theorem

If $f : (a, b) \to \mathbb{R}$ is a differentiable function such that $f'(x) \ne 0$ for all $x \in (a, b)$, we know that $f$ is one-to-one and onto $f((a, b))$ with a differentiable inverse function and
$$(f^{-1})'(x) = \frac{1}{f'(f^{-1}(x))}.$$
We would like to generalize this to higher dimensions. First we need to have an idea how to generalize the statement $f'(x) \ne 0$. Let us start by looking at the following example:

Example: Let $f(x_1, x_2) = x_1^2 + x_2^2$; then $f'(x_1, x_2) = (2x_1, 2x_2) \ne (0, 0)$ except at the origin. But this function is clearly not a one-to-one function. If, however, we interpret the derivative as a linear map $\mathbb{R}^n \to \mathbb{R}^m$, then we can interpret the result for $\mathbb{R}$ also as follows: the map $f'(x) : \mathbb{R} \to \mathbb{R}$ defined by $f'(x) : h \mapsto f'(x) \cdot h$ is not singular.
Definition 16 Let $V$ and $W$ be vector spaces. A linear map $T : V \to W$ is non-singular if $\ker T = \{0\}$.

An immediate consequence is that if $T : V \to W$ is non-singular and onto then $V$ and $W$ must have the same dimension. Moreover, non-singular linear maps are one-to-one. This suggests that the actual property of the derivative which guarantees injectivity is the non-singularity of this linear map. A direct consequence is that the result from calculus can at most be generalized to functions where the domain and the range have the same dimension. We continue with another example.

Example: Let $f : (0, \infty) \times (-2\pi, 2\pi) \to \mathbb{R}^2$ be given by $(r, \theta) \mapsto (r \cos\theta, r \sin\theta)$. Then the derivative is given by the matrix
$$D_f(r, \theta) = \begin{pmatrix} \cos\theta & -r \sin\theta \\ \sin\theta & r \cos\theta \end{pmatrix}.$$
Observe that $\det D_f(r, \theta) = r \ne 0$, and therefore the derivative is not singular, but the function is certainly not one-to-one, since, for example, $f(1, -\pi) = (-1, 0) = f(1, \pi)$.

This last example shows that we cannot have a generalization of the one dimensional case to higher dimensions which gives us global injectivity (as in one dimension); the best we can hope for is a local result.

Theorem 5 (Inverse Function Theorem) Let $V \subset \mathbb{R}^n$ be open and $f : V \to \mathbb{R}^n$ be continuously differentiable on $V$. Let $x_0 \in V$ be such that $D_f(x_0)$ is non-singular. Then there exist open sets $V_0 \subset V$ and $W_0 \subset f(V)$ such that
1. $x_0 \in V_0$ and $f(x_0) \in W_0$.
2. $f$ is a bijection from $V_0$ onto $W_0$, and $f^{-1}$ is a bijection from $W_0$ onto $V_0$.
3. $f^{-1}$ is continuously differentiable on $W_0$, and
4. for each $y = f(x) \in W_0$ we have
$$D_{f^{-1}}(y) = [D_f(x)]^{-1},$$
where the expression on the right denotes the inverse operator.

Before proving the theorem let me remark that we really do not want much more than this. The Theorem guarantees that the equation $f(x) = y$ has a unique solution $x \in V_0$ for every $y$ in a neighborhood of $y_0 = f(x_0)$. In practice we may want to solve this equation using Newton's method (see below), and the theorem guarantees that errors will propagate in a predictable way.

Proof of the Theorem: The proof of this Theorem is rather lengthy and we divide it into several steps. In the first step we reduce the problem to one which has a simpler derivative.

STEP 1: Consider the function $\Phi(x) = (D_f(x_0))^{-1} \circ f(x)$. Then $\Phi$ is one-to-one if and only if $f$ is, and $D_\Phi(x_0) = I$, the identity operator on $\mathbb{R}^n$. So we assume without loss of generality that $D_f(x_0) = I$.
STEP 2: It follows that if $f(x_0 + h) = f(x_0)$ we have
$$\frac{|f(x_0 + h) - f(x_0) - I \cdot h|}{|h|} = \frac{|h|}{|h|} = 1.$$
But since $f$ is differentiable at $x_0$ with derivative $I$, we must have
$$\lim_{|h| \to 0} \frac{|f(x_0 + h) - f(x_0) - I \cdot h|}{|h|} = 0.$$
Thus there is a closed rectangle $R$ centered at $x_0$ such that
$$f(x) \ne f(x_0) \qquad (8)$$
for all $x \in R$ with $x \ne x_0$. Since the determinant is a continuous function and $\det D_f(x_0) = 1$ there exists an open set around $x_0$ such that
$$\det D_f(x) \ne 0 \qquad (9)$$
on that open set. Finally, since the partial derivatives of $f$ are continuous at $x_0$ there is an open set such that
$$\left| \frac{\partial f_i}{\partial x_j}(x) - \frac{\partial f_i}{\partial x_j}(x_0) \right| < \frac{1}{2n^2} \qquad (10)$$
for all $i, j = 1, \dots, n$. Without loss of generality (shrinking $R$ if necessary) we assume that (8)-(10) all hold on $R$.

STEP 3: (10) also implies that
$$\left| \frac{\partial f_i}{\partial x_j}(x) \right| < 1 + \frac{1}{2n^2}$$
on $R$. Now let $g(x) = f(x) - x$; then (10) implies that
$$\left| \frac{\partial g_i}{\partial x_j}(x) \right| < \frac{1}{2n^2},$$
and by the mean value theorem
$$|g(x_1) - g(x_2)| \le \frac{1}{2} |x_1 - x_2|. \qquad (11)$$
From the triangle inequality we get:
$$|x_1 - x_2| - |f(x_1) - f(x_2)| \le |f(x_1) - x_1 - (f(x_2) - x_2)| \le \frac{1}{2} |x_1 - x_2|.$$
It immediately follows that for $x_1, x_2 \in R$ we have
$$|x_1 - x_2| \le 2\,|f(x_1) - f(x_2)|. \qquad (12)$$
Observe that this last inequality shows that $f$ is one-to-one on $R$, and it will directly imply that the inverse function is Lipschitz continuous once we know where it is defined. We will now determine a set on which $f$ is onto. First we observe that the boundary of $R$ is compact, and therefore $f(\partial R)$ is compact as well. Moreover, by (8), $f(x_0) \notin f(\partial R)$. Thus there is a number $d > 0$ such that
$$|f(x_0) - f(x)| \ge d,$$
for all $x \in \partial R$. Define
$$W = \left\{ y : |y - f(x_0)| < \frac{d}{2} \right\}.$$
For any $y \in W$ and any $x \in \partial R$ we have
$$|y - f(x_0)| < |y - f(x)|. \qquad (13)$$

STEP 4: For a given $y \in W$ define
$$g(x) = |y - f(x)|^2.$$
This is a continuously differentiable function on $R$ and therefore has a minimum on the compact set $R$. By (13) we have $g(x_0) < g(x)$ for all $x \in \partial R$, and therefore this minimum must occur in the interior of $R$, and at this minimum we have $D_g(x) = 0$. A simple computation gives
$$D_g(x) = -2\,(y - f(x)) \cdot D_f(x), \qquad (14)$$
and since $D_f(x)$ is non-singular this expression can only vanish if $y - f(x) = 0$, i.e. there exists an $x \in R$ such that $y = f(x)$. (12) immediately implies the uniqueness of such a solution. Define $V = \operatorname{int} R \cap f^{-1}(W)$. Then for every $y \in W$ there is a unique $x \in V$ such that $y = f(x)$; in other words, the function $f$ has an inverse function on $W$. As mentioned above, (12) implies that $f^{-1}$ is continuous on $W$. It remains to be shown that this function is also differentiable.

STEP 5: To show that $f^{-1}$ is differentiable observe that for $x_1$ sufficiently close to $x$
$$f(x_1) - f(x) = D_f(x) \cdot (x_1 - x) + \xi(x_1 - x),$$
where
$$\lim_{h \to 0} \frac{|\xi(h)|}{|h|} = 0. \qquad (15)$$
If we apply $(D_f(x))^{-1}$ to this equation we get
$$(D_f(x))^{-1} \cdot (f(x_1) - f(x)) = x_1 - x + (D_f(x))^{-1} \xi(x_1 - x),$$
which, with $y = f(x)$ and $y_1 = f(x_1)$, we can rewrite as
$$f^{-1}(y_1) - f^{-1}(y) = (D_f(x))^{-1} \cdot (y_1 - y) - (D_f(x))^{-1} \xi(f^{-1}(y_1) - f^{-1}(y)),$$
and we are only left to prove:
$$\lim_{y_1 \to y} \frac{(D_f(x))^{-1} \xi(f^{-1}(y_1) - f^{-1}(y))}{|y_1 - y|} = 0.$$
It is clear that this follows if
$$\lim_{y_1 \to y} \frac{\xi(f^{-1}(y_1) - f^{-1}(y))}{|y_1 - y|} = 0,$$
since the derivative is a linear operator. Now observe that
$$\frac{|\xi(f^{-1}(y_1) - f^{-1}(y))|}{|y_1 - y|} = \frac{|\xi(f^{-1}(y_1) - f^{-1}(y))|}{|f^{-1}(y_1) - f^{-1}(y)|} \cdot \frac{|f^{-1}(y_1) - f^{-1}(y)|}{|y_1 - y|}. \qquad (16)$$
By the continuity of $f^{-1}$ and (15) the first fraction will converge to zero as $y_1 \to y$. Finally, (12) implies that the second fraction is bounded by 2, and thus the product converges to zero.
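The remark about Newton's method can be illustrated on the polar coordinate map from the example preceding the theorem. The sketch below is a minimal, hedged illustration: the function names, the starting guess, and the fixed number of iterations are assumptions made here, and the Jacobian is the standard one with rows $(\cos\theta, -r\sin\theta)$ and $(\sin\theta, r\cos\theta)$.

```python
import math

def f(r, theta):
    # Polar coordinate map (r, theta) -> (r cos theta, r sin theta)
    return (r * math.cos(theta), r * math.sin(theta))

def jacobian(r, theta):
    # D_f(r, theta); its determinant equals r
    return ((math.cos(theta), -r * math.sin(theta)),
            (math.sin(theta), r * math.cos(theta)))

def newton_inverse(y, guess, steps=20):
    """Solve f(r, theta) = y by Newton's method starting from guess."""
    r, theta = guess
    for _ in range(steps):
        fx = f(r, theta)
        res = (fx[0] - y[0], fx[1] - y[1])
        (a, b), (c, d) = jacobian(r, theta)
        det = a * d - b * c
        # Apply the inverse of the 2x2 Jacobian to the residual
        dr = (d * res[0] - b * res[1]) / det
        dtheta = (-c * res[0] + a * res[1]) / det
        r, theta = r - dr, theta - dtheta
    return r, theta

target = f(2.0, 0.5)
r_found, theta_found = newton_inverse(target, guess=(1.8, 0.4))
print(r_found, theta_found)

# Global injectivity fails: two different preimages of (-1, 0)
left, right = f(1.0, -math.pi), f(1.0, math.pi)
print(left, right)
```

Started near $(2, 0.5)$ the iteration recovers that local preimage, while the last two lines show numerically that $(1, -\pi)$ and $(1, \pi)$ are mapped to the same point up to rounding, so only a local inverse can exist.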
3.2 Assignment 7:

1. Let $f : \mathbb{R}^n \to \mathbb{R}^n$ be such that $|D_f(x) \cdot u| = |u|$ for all $x, u \in \mathbb{R}^n$. Prove that $f$ is one-to-one on all of $\mathbb{R}^n$ and onto. Moreover, show that $f(\mathbb{R}^n)$ is both open and closed and therefore $f(\mathbb{R}^n) = \mathbb{R}^n$. Conclude that $f$ has a global differentiable inverse which is also an isometry.
2. In this problem we generalize the global result from above. Let $S \subset \mathbb{R}^n$ be open and $f : S \to \mathbb{R}^n$ such that there exist $k_1, k_2 > 0$ such that $k_1 \le |D_f(x) \cdot u| \le k_2$ for all $x \in S$ and all unit vectors $u \in \mathbb{R}^n$. Prove that $f(S)$ is open and that $f$ is one-to-one and onto $f(S)$ with a differentiable inverse function.
3. Define $f : \mathbb{R}^2 \to \mathbb{R}^2$ by $f(x, y) = (e^x \cos y, e^x \sin y)$. Show that $D_f(\mathbf{x})$ is non-singular for all $\mathbf{x} = (x, y) \in \mathbb{R}^2$, but $f$ is not one-to-one.
3.3 The Implicit Function Theorem

We continue next to the Implicit Function Theorem. To start, we all recall problems from calculus of the following form: Let
$$y^3 - x^2 = 4.$$
Find the slope of the tangent line to this curve at the point $(-2, 2)$. We learned to solve them by differentiating this equation under the assumption that $y$ is a differentiable function of $x$ to get:
$$3y^2 y'(x) - 2x = 0.$$
Next we evaluated $x$ and $y$ at the given point to get $12 y'(-2) + 4 = 0$ and hence $y'(-2) = -1/3$. The Implicit Function Theorem will give us specific conditions under which we can do this, and generalize this to higher dimensions. We need to introduce some new notation to simplify this situation. Let $f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^m$ be differentiable at a point $(x, y) \in \mathbb{R}^n \times \mathbb{R}^m$. We denote by
$$\partial_x f(x, y) \quad \text{and} \quad \partial_y f(x, y)$$
the derivatives of $f$ with respect to the $x$ and $y$ variables. If we write $D_f(x, y)$ as a matrix of partial derivatives, these represent the submatrices consisting of the first $n$ and the last $m$ columns, respectively, and
$$D_f(x, y) = (\partial_x f(x, y), \partial_y f(x, y)).$$
Using this notation we may formulate the first version of the theorem.

Theorem 6 (Implicit Function Theorem) Let $f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^m$ be continuously differentiable at $(x_0, y_0) \in \mathbb{R}^n \times \mathbb{R}^m$ such that $f(x_0, y_0) = 0$ and $\partial_y f(x_0, y_0)$ is non-singular. Then there exist an open set $O \subset \mathbb{R}^n$ and an open set $W \subset \mathbb{R}^m$ such that $(x_0, y_0) \in O \times W$ and for each $x \in O$ there exists a unique $g(x) \in W$ such that $f(x, g(x)) = 0$. Moreover, the function $g : O \to W$ is differentiable and
$$D_g(x) = -(\partial_y f(x, y))^{-1} \cdot \partial_x f(x, y), \qquad y = g(x).$$

Proof: The proof of this theorem is essentially a smart application of the inverse function theorem. To do this we must first construct a function to which the inverse function theorem can be applied. Define $\mathbf{F} : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^n \times \mathbb{R}^m$ by
$$\mathbf{F}(x, y) = (x, f(x, y)).$$
Observe that this function is differentiable with the derivative given by
$$D_{\mathbf{F}}(x, y) = \begin{pmatrix} I & 0 \\ \partial_x f & \partial_y f \end{pmatrix}.$$
It is easy to see that $D_{\mathbf{F}}(x_0, y_0)$ is non-singular and we may therefore apply the inverse function theorem. Thus there exists an open set, which we may choose to be of the form $O \times W$, containing the point $(x_0, y_0)$ and an open set $U \subset \mathbb{R}^m$ with $0 \in U$, such that $\mathbf{F}$ has a differentiable inverse $\mathbf{F}^{-1} : O \times U \to O \times W$. For $(z, w) \in O \times U$ this inverse will be of the form
$$\mathbf{F}^{-1}(z, w) = (z, k(z, w)).$$
Finally define $g(z) = k(z, 0)$. Observe that for all $x \in O$
$$\mathbf{F}(x, g(x)) = \mathbf{F}(x, k(x, 0)) = \mathbf{F}(\mathbf{F}^{-1}(x, 0)) = (x, 0),$$
and thus $f(x, g(x)) = 0$. Since $\mathbf{F}^{-1}$ is differentiable, so is $g$, and by taking the derivative of this last equation using the chain rule we arrive at:
$$0 = \partial_x f + (\partial_y f) \cdot D_g,$$
which immediately implies the formula for the derivative.

It is clear that the position of the $x$ and $y$ variables does not matter. The theorem can therefore be significantly generalized. To do this let $f : \mathbb{R}^{n+m} \to \mathbb{R}^m$. If $\operatorname{rank} D_f(x_0, y_0) = m$ we can find $m$ functions $g_j : \mathbb{R}^n \to \mathbb{R}$ such that $m$ variables $x_{i_1}, \dots, x_{i_m}$ can be expressed as $x_{i_j} = g_j(y)$, where $y$ denotes the $n$ variables which are not in the set $\{x_{i_1}, \dots, x_{i_m}\}$.
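The derivative formula $D_g = -(\partial_y f)^{-1} \partial_x f$ can be checked on the introductory calculus problem. The sketch below assumes the curve is the level set $y^3 - x^2 = 4$, which is the one that actually passes through $(-2, 2)$; the function names are illustrative only. It compares the slope from the formula with a finite-difference slope of the explicit solution $y = (4 + x^2)^{1/3}$.

```python
def F(x, y):
    # Level-set function; assumes the curve y^3 - x^2 = 4 through (-2, 2)
    return y ** 3 - x ** 2 - 4

def slope_from_formula(x, y):
    """g'(x) = -F_x / F_y with F_x = -2x and F_y = 3 y^2."""
    F_x = -2 * x
    F_y = 3 * y ** 2
    return -F_x / F_y

def g(x):
    # Explicit solution of F(x, g(x)) = 0, available in this simple case
    return (4 + x * x) ** (1.0 / 3.0)

x0, y0 = -2.0, 2.0
h = 1e-5
slope_implicit = slope_from_formula(x0, y0)
slope_numeric = (g(x0 + h) - g(x0 - h)) / (2 * h)
print(F(x0, y0), slope_implicit, slope_numeric)
```

In most applications no explicit formula for $g$ is available, which is exactly when the implicit formula is useful; the explicit solution is used here only to have something to compare against.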
3.4 Assignment 8:

1. Apply the Implicit Function Theorem to prove the following Theorem:

Theorem 7 Let $f : \mathbb{R}^n \to \mathbb{R}$ and $g_i : \mathbb{R}^n \to \mathbb{R}$ be continuously differentiable functions for $i = 1, \dots, k$, and $U \subset \mathbb{R}^n$ be open. Assume that $f$ attains a local maximum or minimum at $x^*$ on the set
$$D = U \cap \{x \in \mathbb{R}^n : g_i(x) = 0,\ i = 1, \dots, k\}.$$
Let $g = (g_1, \dots, g_k)$ and assume that $\operatorname{rank} D_g(x^*) = k$. Then there exists a vector $\Lambda = (\lambda_1, \dots, \lambda_k) \in \mathbb{R}^k$ such that
$$D_f(x^*) + \sum_{i=1}^{k} \lambda_i D_{g_i}(x^*) = 0.$$

Hint: Apply the Implicit Function Theorem to the function $g$, and replace the variables in $f$ using the functions created by the theorem. The resulting function has a local extremum at an interior point of an open subset of $\mathbb{R}^{n-k}$. Supply all the details and give a formula for $\Lambda$.

2. Let $\Omega \subset \mathbb{R}^n$ be an open set and $f : \mathbb{R}^n \to \mathbb{R}$ be a differentiable function such that $f(x_1, \dots, x_n) = 0$ and $\dfrac{\partial f}{\partial x_j}(x_1, \dots, x_n) \ne 0$ for all $j = 1, \dots, n$ and all $(x_1, \dots, x_n) \in \Omega$. Prove that
$$\frac{\partial x_1}{\partial x_2} \frac{\partial x_2}{\partial x_3} \cdots \frac{\partial x_{n-1}}{\partial x_n} \frac{\partial x_n}{\partial x_1} = (-1)^n,$$
for all $(x_1, \dots, x_n) \in \Omega$.
4 THE BANACH SPACE C(X)

Let $X$ be a metric space; then $C(X)$ denotes the set of all continuous real valued functions on $X$. The set of all continuous functions with values in $\mathbb{R}^n$ is denoted by $(C(X))^n$. We will concentrate our efforts on the space of real valued functions, as the vector valued case can be treated by treating the components separately.

Proposition 21 $C(X)$ is a real vector space with a norm defined by
$$\|f\|_\infty = \sup_{x \in X} |f(x)|.$$

Proof: Clearly if $f, g \in C(X)$ and $a, b \in \mathbb{R}$ then $af + bg$ is a real valued continuous function on $X$ and therefore $af + bg \in C(X)$. To show that there is a norm, observe that $\|f\|_\infty \ge 0$, and if $\|f\|_\infty = 0$ we must have $|f(x)| = 0$ for all $x \in X$ and thus $f = 0$. Next, let $a \in \mathbb{R}$; then
$$\|af\|_\infty = \sup_{x \in X} |af(x)| = |a| \sup_{x \in X} |f(x)| = |a|\, \|f\|_\infty.$$
Finally, observe that for every $x \in X$
$$|f(x) + g(x)| \le |f(x)| + |g(x)| \le \|f\|_\infty + \|g\|_\infty.$$
Taking the least upper bound on the left hand side we get
$$\|f + g\|_\infty \le \|f\|_\infty + \|g\|_\infty.$$

Remarks:
1. This norm is called the uniform norm. Any norm generates a metric in the usual way. The topology generated by this metric is called the uniform topology on $C(X)$.
2. Continuity is not required for this last proposition. However, we restrict ourselves to continuous functions, as $C(X)$ will inherit some nice properties.
3. The set $C(X)$ is actually more than a vector space; it is a commutative ring, for those of us who like algebra.
4. In the case of vector valued functions the absolute value is replaced by the Euclidean norm on $\mathbb{R}^n$.
5. If $X$ is a finite set with $n$ elements, then $C(X)$ is isomorphic to $\mathbb{R}^n$, but for infinite sets $X$ the space is infinite dimensional. We can see this easily in the case when $X$ is an interval. The polynomials are elements of $C(X)$, but for any finite set of polynomials one can construct a new polynomial which is linearly independent of them. Let $p_1, \dots, p_n$ denote a set of $n$ linearly independent polynomials. Let $m$ be the largest degree of these $n$ polynomials. Then $x^{m+1}$ cannot be written as a linear combination of the original $n$ polynomials. Thus no finite set of polynomials can span the vector space of polynomials, and this is an infinite dimensional subspace of $C(X)$.
This Proposition established $C(X)$ as a metric space. We can and will now look at all the topological properties of this space. Most importantly we have:

Theorem 8 $C(X)$ is complete.

Proof: The proof of this theorem is surprisingly simple. Let $\{f_n\}$ be a Cauchy sequence in $C(X)$. This implies that for any $x \in X$ the sequence $\{f_n(x)\}$ is a Cauchy sequence of real numbers. Since the real numbers are complete this sequence converges to a unique limit. So for every $x \in X$ we can define
$$f(x) = \lim_{n \to \infty} f_n(x).$$
Next we need to show that this sequence converges uniformly. Let $\epsilon > 0$; then for each $x \in X$ there is a number $N_x$ such that
$$|f(x) - f_n(x)| < \frac{\epsilon}{2}$$
for all $n \ge N_x$. Furthermore, there exists $M$ such that
$$\|f_n - f_m\|_\infty < \frac{\epsilon}{2}$$
for all $n, m \ge M$. Now let $n \ge M$; then for each $x \in X$ there is an $m \ge \max\{M, N_x\}$ such that
$$|f(x) - f_n(x)| \le |f(x) - f_m(x)| + |f_m(x) - f_n(x)| < \frac{\epsilon}{2} + \frac{\epsilon}{2}.$$
Since this inequality holds for all $x \in X$ and the right hand side is independent of $x$ we may take the least upper bound on the left and get
$$\|f - f_n\|_\infty \le \epsilon$$
for all $n \ge M$. Finally, we will prove that $f$ is continuous. To do this let $\epsilon > 0$ and $x_0 \in X$; then there is an $M$ such that
$$\|f - f_n\|_\infty < \frac{\epsilon}{3}$$
for all $n \ge M$. Pick an $n \ge M$; then there exists a $\delta > 0$ such that
$$|f_n(x) - f_n(x_0)| < \frac{\epsilon}{3}$$
for all $x \in B_\delta(x_0)$. Finally, observe that
$$|f(x) - f(x_0)| \le |f(x) - f_n(x)| + |f_n(x) - f_n(x_0)| + |f(x_0) - f_n(x_0)| < \epsilon$$
for all $x \in B_\delta(x_0)$.

Remarks:
1. In the proof of the Theorem we only used the continuity of $f_n$ to show that the limit is continuous. We can thus extend the statement to the following: Let $\{f_n\}$ be a sequence of real valued functions on a metric space $X$ which is a Cauchy sequence in the uniform metric. Then the sequence converges uniformly to a function on $X$.
2. Complete, normed vector spaces are usually called Banach Spaces, in honor of the Polish mathematician Stefan Banach.

Examples:
1. Consider the series
$$\sum_{j=0}^{\infty} x^j$$
for $x \in (-p, p)$ where $0 < p < 1$. If we let $f_n(x) = \sum_{j=0}^{n} x^j$ then this is a sequence of continuous functions on $(-p, p)$. We know that this sequence converges for every $x \in (-p, p)$. We will show that it converges uniformly. To do this let $f(x) = \lim_{n \to \infty} f_n(x)$; then
$$|f(x) - f_n(x)| = \left| \sum_{j=n+1}^{\infty} x^j \right| \le \sum_{j=n+1}^{\infty} |x|^j < \sum_{j=n+1}^{\infty} p^j = \frac{p^{n+1}}{1 - p}.$$
The rightmost term is independent of $x$ and converges to $0$ as $n \to \infty$. So we take the least upper bound on the left to get uniform convergence.
2. Let $f_n(x) = x^n$; then $\{f_n\}$ is not uniformly convergent on $[0, 1]$, but it is uniformly convergent on $[0, p)$ for any $0 < p < 1$. To see this observe that $\lim_{n \to \infty} f_n(x) = 0$ for all $x \in [0, 1)$, but $\lim_{n \to \infty} f_n(1) = 1$. Thus the limit function is not continuous, as it would be if the convergence were uniform. For $x \in [0, p)$ observe that $|f(x) - f_n(x)| < p^n$.
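Both examples can be explored numerically. The following sketch estimates the uniform error of the geometric partial sums on a grid of interior points of $(-p, p)$ and compares it with the bound $p^{n+1}/(1-p)$; it also samples $x^n$ on $[0, 1)$ to show that the supremum there does not shrink. The grid sizes and the value $p = 0.9$ are arbitrary choices, and a grid maximum only approximates a supremum from below.

```python
def partial_sum(n, x):
    """f_n(x) = sum_{j=0}^{n} x^j."""
    return sum(x ** j for j in range(n + 1))

def geometric_limit(x):
    return 1.0 / (1.0 - x)

def sup_error_geometric(n, p, samples=2000):
    # Interior grid points of (-p, p); the grid maximum underestimates the sup
    xs = [-p + 2 * p * (k + 0.5) / samples for k in range(samples)]
    return max(abs(geometric_limit(x) - partial_sum(n, x)) for x in xs)

def sup_power(n, samples=2000):
    # Grid maximum of x^n over interior points of [0, 1)
    return max(((k + 0.5) / samples) ** n for k in range(samples))

p = 0.9
for n in (5, 10, 20, 40):
    bound = p ** (n + 1) / (1 - p)
    print(n, sup_error_geometric(n, p), bound)

print(sup_power(50))  # stays close to 1: no uniform convergence on [0, 1)
```

The sampled errors stay below the bound and tend to zero with $n$, whereas the sampled supremum of $x^n$ on $[0,1)$ remains near 1 for every fixed $n$.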
It seems like the first example can be extended to any power series. Moreover, if $\{f_n\}$ is a sequence of real valued functions on the metric space $X$ we can let $g_n(x) = f_{n+1}(x) - f_n(x)$ and $g_0(x) = f_1(x)$, which gives
$$f_n(x) = \sum_{j=0}^{n-1} g_j(x),$$
i.e. we can assume that any given sequence is actually a series. And for series there is an excellent tool to test whether it converges uniformly or not.

Theorem 9 (Weierstrass M-Test) Let $\{g_n\}$ be a sequence of functions on $X$. If there exists a sequence of non-negative real numbers $\{a_n\}$ such that the series
$$\sum_{j=0}^{\infty} a_j$$
converges and $\|g_j\|_\infty \le a_j$ for all $j \in \mathbb{N}$, then the series
$$\sum_{j=0}^{\infty} g_j(x)$$
converges uniformly on $X$.

Proof: The proof essentially follows example 1. Let $\epsilon > 0$; since
$$\sum_{j=0}^{\infty} a_j$$
converges there exists an $N$ such that
$$\sum_{j=n+1}^{m} a_j < \epsilon$$
for all $m > n \ge N$. Now
$$\left\| \sum_{j=n+1}^{m} g_j \right\|_\infty \le \sum_{j=n+1}^{m} \|g_j\|_\infty \le \sum_{j=n+1}^{m} a_j < \epsilon$$
for all $m > n \ge N$. Thus the sequence of partial sums is a Cauchy sequence in the uniform norm and converges.
4.1 Consequences and Examples

Proposition 22 Let $O \subset X$ and $x_0$ be a limit point of $O$ (not necessarily in $O$). Let $\{f_n\}$ be a sequence of continuous functions on $O$ such that
$$\lim_{x \to x_0} f_n(x) = a_n.$$
If the sequence of functions converges uniformly to a function $f$ on $O$ and the sequence $a_n$ converges to a number $a$, then
$$\lim_{x \to x_0} f(x) = \lim_{n \to \infty} a_n.$$

Proof: Observe that
$$|f(x) - a| \le \|f - f_n\|_\infty + |f_n(x) - a_n| + |a_n - a|.$$
For any $\epsilon > 0$ there exists $N$ such that
$$\|f - f_n\|_\infty < \frac{\epsilon}{3} \quad \text{and} \quad |a_n - a| < \frac{\epsilon}{3},$$
for any $n \ge N$. Fixing such an $n$, we can also find $\delta > 0$ such that
$$|f_n(x) - a_n| < \frac{\epsilon}{3}$$
for all $x \in B_\delta(x_0) \cap O$. Combining these yields the desired result.
Derivatives and Integrals of Sequences: If we have a given sequence of differentiable functions in $C([a, b])$, is the limit function necessarily differentiable? We will investigate this and related questions next. We start by looking at integration.

Example: Let
$$f_n(x) = \begin{cases} n(1 - nx) & 0 < x \le \tfrac{1}{n} \\ 0 & \tfrac{1}{n} < x \le 1 \end{cases}$$
Then this is a sequence of continuous Riemann integrable functions on $(0, 1)$. This sequence converges to $f(x) = 0$, and
$$\int_0^1 f_n \, dx = \frac{1}{2},$$
for all $n$. Thus
$$\lim_{n \to \infty} \int_0^1 f_n \, dx = \frac{1}{2} \ne 0 = \int_0^1 \lim_{n \to \infty} f_n \, dx.$$
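A quick numerical check of this example: the sketch below approximates $\int_0^1 f_n\,dx$ with a midpoint rule (an illustrative choice of quadrature; the helper name and the number of cells are assumptions) and evaluates $f_n$ at a fixed point to show the pointwise limit.

```python
def f_n(n, x):
    # n(1 - n x) on (0, 1/n], zero on (1/n, 1]
    return n * (1 - n * x) if 0 < x <= 1.0 / n else 0.0

def midpoint_integral(func, a, b, cells=20000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    width = (b - a) / cells
    return width * sum(func(a + (k + 0.5) * width) for k in range(cells))

integrals = {n: midpoint_integral(lambda x, n=n: f_n(n, x), 0.0, 1.0)
             for n in (2, 10, 100)}
pointwise = [f_n(n, 0.3) for n in (4, 10, 100)]

print(integrals)   # each value is close to 1/2
print(pointwise)   # f_n(0.3) = 0 as soon as 1/n < 0.3
```

The integrals stay near $1/2$ while the values at any fixed $x > 0$ are eventually zero, so the limit and the integral cannot be interchanged for this sequence.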
Proposition 23 Let $\{f_n\}$ be a sequence of Riemann integrable functions on the interval $[a, b]$ that converges uniformly to a function $f$. Then $f$ is Riemann integrable and
$$\lim_{n \to \infty} \int_a^b f_n \, dx = \int_a^b f \, dx.$$

Proof: Let $\epsilon > 0$; since the sequence converges uniformly there exists an $N > 0$ such that
$$\|f_n - f\|_\infty < \frac{\epsilon}{6(b - a)},$$
for all $n \ge N$. In particular this inequality holds for $n = N$. Moreover, since $f_N$ is integrable there exists a partition $P$ such that the upper and lower sums satisfy
$$\overline{S}(f_N; P) - \underline{S}(f_N; P) < \frac{\epsilon}{3}.$$
For the $i$th subinterval of a partition let
$$M_i(f) = \sup\{f(x) : x_{i-1} \le x \le x_i\} \quad \text{and} \quad m_i(f) = \inf\{f(x) : x_{i-1} \le x \le x_i\}.$$
It follows from the uniform convergence and a problem in the next assignment that
$$|M_i(f) - M_i(f_N)| < \frac{\epsilon}{3(b - a)} \quad \text{and} \quad |m_i(f) - m_i(f_N)| < \frac{\epsilon}{3(b - a)}.$$
Thus
$$\left| \overline{S}(f; P) - \overline{S}(f_N; P) \right| \le \sum_{i=1}^{n} |M_i(f) - M_i(f_N)|\, \Delta x_i < \frac{\epsilon}{3(b - a)} \sum_{i=1}^{n} \Delta x_i = \frac{\epsilon}{3},$$
and similarly
$$\left| \underline{S}(f; P) - \underline{S}(f_N; P) \right| < \frac{\epsilon}{3}.$$
Finally,
$$\overline{S}(f; P) - \underline{S}(f; P) \le \left| \overline{S}(f; P) - \overline{S}(f_N; P) \right| + \overline{S}(f_N; P) - \underline{S}(f_N; P) + \left| \underline{S}(f; P) - \underline{S}(f_N; P) \right| < \epsilon,$$
and thus $f$ is Riemann integrable. To show the second part of the proposition observe that for $\epsilon > 0$ there is an $N$ such that
$$f_n(x) - \frac{\epsilon}{b - a} < f(x) < f_n(x) + \frac{\epsilon}{b - a},$$
for all $x \in [a, b]$ and all $n \ge N$. Integration yields
$$\int_a^b f_n \, dx - \epsilon < \int_a^b f \, dx < \int_a^b f_n \, dx + \epsilon,$$
which yields the result.

The next example shows that uniform convergence is a sufficient condition, but it is not a necessary condition.

Example: The sequence of functions $\{f_n(x)\} = \{x^n\}$ converges on the interval $[0, 1]$, but since the limit function is not continuous this convergence is not uniform. The limit function $f$ is identically equal to zero on $[0, 1)$ and $f(1) = 1$. Thus $\int_0^1 f \, dx = 0$, and
$$\lim_{n \to \infty} \int_0^1 f_n \, dx = \lim_{n \to \infty} \frac{1}{n + 1} = 0.$$
The next proposition has a somewhat lengthy proof, but it is a very powerful statement. We will show that if $\{f_n\}$ is a sequence of differentiable functions whose derivatives converge uniformly, then $f_n$ converges uniformly to a differentiable function, provided it converges at at least one point in its domain.

Proposition 24 Let $\{f_n\}$ be a sequence of differentiable functions on $(a, b)$ such that the sequence $\{f_n'\}$ converges uniformly to a function $g$, and there is a point $x_0 \in (a, b)$ such that $\{f_n(x_0)\}$ converges. Then the sequence $\{f_n\}$ converges uniformly to a differentiable function $f$ and $f' = g$.

Proof: We start with proving that the sequence converges uniformly. Let $\epsilon > 0$; then there exists an $N > 0$ such that
$$|f_n(x_0) - f_m(x_0)| < \frac{\epsilon}{2}$$
for all $n, m \ge N$ and
$$\|f_n' - f_m'\|_\infty < \frac{\epsilon}{2(b - a)},$$
for all $n, m \ge N$. Now we have
$$\begin{aligned}
|f_n(x) - f_m(x)| &= |f_n(x) - f_n(x_0) + f_n(x_0) - f_m(x_0) + f_m(x_0) - f_m(x)| \\
&\le |f_n(x_0) - f_m(x_0)| + |(f_n(x) - f_m(x)) - (f_n(x_0) - f_m(x_0))| \\
&\le \frac{\epsilon}{2} + \|f_n' - f_m'\|_\infty\, |b - a| \\
&< \epsilon,
\end{aligned}$$
where we used the Mean Value Theorem for the second term on the right of the second line and for the function $f_n - f_m$. Since this holds for any $x \in (a, b)$ the sequence is a uniform Cauchy sequence and converges. Let $f(x) = \lim_{n \to \infty} f_n(x)$.

Next, fix $x \in (a, b)$ and define
$$h_n(t) = \frac{f_n(t) - f_n(x)}{t - x} \quad \text{and} \quad h(t) = \frac{f(t) - f(x)}{t - x}$$
for all $t \in (a, b) \setminus \{x\}$. To continue observe that by using the Mean Value Theorem again we have
$$\begin{aligned}
|h_n(t) - h_m(t)| &= \frac{1}{|t - x|} |(f_n(t) - f_m(t)) - (f_n(x) - f_m(x))| \\
&\le \frac{1}{|t - x|} \|f_n' - f_m'\|_\infty\, |t - x|,
\end{aligned}$$
which immediately makes $\{h_n\}$ into a Cauchy sequence which converges uniformly for $t \ne x$. But since $f_n(t)$ converges to $f(t)$ for all $t$, we have $h_n(t) \to h(t)$. By a previous proposition we have
$$\lim_{t \to x} \lim_{n \to \infty} h_n(t) = \lim_{n \to \infty} \lim_{t \to x} h_n(t),$$
and thus
$$f'(x) = \lim_{t \to x} h(t) = \lim_{n \to \infty} f_n'(x).$$
4.2
A nowhere diﬀerentiable continuous function
Our next step is to give an example of a function which is continuous on R but nowhere diﬀerentiable. Proposition 25 There exists a continuous f : R → R which is nowhere diﬀerentiable. Proof: We begin by deﬁning φ : R → R as follows. φ(x) = |x|, for |x| ≤ f rac12. and φ(x + n) = φ(x) for all between slope 1 and -1. Moreover, 0 ≤ φ(x) ≤
1 2
integers n. This deﬁnes φ on all of R φ is continuous and its graph consists of line segments which alternate for all x ∈ R. Next let φ(4j x) . 4j
fj (x) =
36
Then $f_j$ is continuous, $|f_j(x)| \le \frac{1}{2}\,4^{-j}$, and $f_j(x + 4^{j}) = f_j(x)$. Thus the sum
\[ \sum_{j=0}^{\infty} f_j(x) \]
converges uniformly to a continuous function $f(x)$. To show that this function is nowhere differentiable, fix $x \in \mathbb{R}$ and let $h_j = \pm 4^{-j-1}$, where the sign is chosen such that $4^j x$ and $4^j (x + h_j)$ lie in the same interval $\left[\frac{k}{2}, \frac{k+1}{2}\right]$, i.e. the graph of $\varphi$ at these two points is the same line segment. Observe that $f_k(x) = f_k(x + h_j)$ for all $k > j$. Furthermore,
\[ f_k(x) - f_k(x + h_j) = \pm h_j \]
for all $k \le j$. This is easy to see for $k = j$. For $k < j$, the points $4^k x$ and $4^k (x + h_j)$ also lie in the same interval $\left[\frac{m}{2}, \frac{m+1}{2}\right]$. Therefore, we have
\[ \frac{f(x + h_j) - f(x)}{h_j} = \sum_{k=0}^{j} \frac{f_k(x + h_j) - f_k(x)}{h_j} = \sum_{k=0}^{j} \gamma_k, \]
where $\gamma_k = \pm 1$. Thus if $j + 1$ is even we have either an even number of $+1$'s and an even number of $-1$'s, or an odd number of $+1$'s and an odd number of $-1$'s. In either case the sum is an even number. By the same reasoning the sum is an odd number if $j + 1$ is odd. A sequence of integers that alternates in parity cannot converge, so the limit
\[ \lim_{j\to\infty} \frac{f(x + h_j) - f(x)}{h_j} \]
cannot exist, and neither can the derivative.
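The parity argument can be checked in exact arithmetic. The sketch below is only an illustration under stated assumptions: the helper names `phi`, `f_partial`, `pick_h` and `quotient` are hypothetical, and `x` is assumed to be rational so that `fractions.Fraction` avoids rounding. It relies on the fact from the proof that the terms with $k > j$ cancel, so the difference quotient of $f$ equals that of the partial sum with $j + 1$ terms.

```python
from fractions import Fraction
import math

def phi(t):
    """|t| on [-1/2, 1/2], extended with period 1 (distance to the nearest integer)."""
    r = t - math.floor(t)              # fractional part, in [0, 1)
    return min(r, 1 - r)

def f_partial(x, terms):
    """Partial sum of f: sum over k < terms of phi(4^k x) / 4^k."""
    return sum(phi(4**k * x) / 4**k for k in range(terms))

def pick_h(x, j):
    """h_j = +-4^(-j-1), signed so that 4^j x and 4^j (x + h_j) share an interval [k/2, (k+1)/2]."""
    u = 2 * 4**j * x                   # position measured in half-unit intervals
    offset = u - math.floor(u)         # offset inside the current interval, in [0, 1)
    h = Fraction(1, 4**(j + 1))        # moving by h shifts u by exactly 1/2
    return h if offset <= Fraction(1, 2) else -h

def quotient(x, j):
    """Difference quotient (f(x + h_j) - f(x)) / h_j, computed exactly."""
    x = Fraction(x)
    h = pick_h(x, j)
    n = j + 1                          # terms with k > j cancel, so j + 1 terms suffice
    return (f_partial(x + h, n) - f_partial(x, n)) / h

if __name__ == "__main__":
    for j in range(6):
        print(j, quotient(Fraction(1, 3), j))
```

Each printed quotient should be an integer with the same parity as $j + 1$, which is the obstruction to convergence used in the proof.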
4.3 Assignment 9:
1. Find an example of a sequence of functions that is not continuous at any point in $[0, 1]$ but converges uniformly to a continuous function.

2. Define
\[ f_n(x) = \begin{cases} \dfrac{1}{n}, & \dfrac{1}{2^n} < x \le \dfrac{1}{2^{n-1}}, \\[1ex] 0, & \text{otherwise.} \end{cases} \]
Show that the series $\sum_{n=1}^{\infty} f_n$ converges uniformly, but the Weierstrass M-test fails.

3. Prove that if $\sum_{n=1}^{\infty} f_n$ converges uniformly on $E$, then $\{f_n\}$ converges to zero uniformly on $E$. Is the converse true?

4. Prove that
(a) If $f_n$ converges uniformly to $f$ and if $f_n$ is bounded, then $f$ is bounded as well.
(b) If $f_n$ converges uniformly to $f$ on $[a, b]$ and if $\|f_N - f\| < \epsilon$, then $|M(f_N) - M(f)| < 2\epsilon$, where $M(f) = \max\{f(x);\ x \in [a, b]\}$.
4.4 Dense subsets of C(X):
While continuous functions seem very reasonable, it is often necessary to approximate continuous functions with another group of functions which are easier to evaluate. In fact the only functions which we can evaluate exactly are the ones which only involve the four basic operations of arithmetic. All others require approximations. For notation we say that a set $A$ is a dense subset of a metric space $X$ if the closure of $A$ is equal to $X$. In other words, if $A$ is a dense subset, every $x \in X$ can be approximated by a sequence of elements of $A$. We start with a simple result.
Proposition 26 Let $L$ denote the set of piecewise linear functions on the interval $[a, b]$. Then $L$ is dense in $C([a, b])$.

Proof: Without loss of generality we assume $[a, b] = [0, 1]$. Let $f \in C([0, 1])$ be a given continuous function and $\epsilon > 0$. Then there exists a $\delta > 0$ such that $|f(x) - f(y)| < \epsilon/2$ for all $|x - y| < \delta$. Choose $N$ such that $1/N < \delta$, and let $x_0 = 0$ and $x_j = j/N$ for $j = 1, \dots, N$. Define
\[ l(x) = f(x_j)\,\frac{x - x_{j-1}}{x_j - x_{j-1}} + f(x_{j-1})\,\frac{x_j - x}{x_j - x_{j-1}} \]
for $x \in [x_{j-1}, x_j]$, $j = 1, \dots, N$. Then $l$ is continuous and piecewise linear, and for every $x \in [x_{j-1}, x_j]$ and every $j = 1, \dots, N$ we have:
\begin{align*}
|l(x) - f(x)| &= \left| f(x_j)\,\frac{x - x_{j-1}}{x_j - x_{j-1}} + f(x_{j-1})\,\frac{x_j - x}{x_j - x_{j-1}} - f(x)\,\frac{x - x_{j-1}}{x_j - x_{j-1}} - f(x)\,\frac{x_j - x}{x_j - x_{j-1}} \right| \\
&\le \frac{x - x_{j-1}}{x_j - x_{j-1}}\,|f(x_j) - f(x)| + \frac{x_j - x}{x_j - x_{j-1}}\,|f(x_{j-1}) - f(x)| < \epsilon.
\end{align*}
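The construction of $l$ in this proof can be tried numerically. The following is a minimal sketch, assuming $[a, b] = [0, 1]$ as in the proof; the function names `piecewise_linear` and `sup_error` and the test function $\sin(2\pi x)$ are illustrative choices, not part of the notes, and the supremum is only estimated on a finite sample grid.

```python
import math

def piecewise_linear(f, N):
    """Return l, the piecewise linear interpolant of f at the nodes x_j = j/N on [0, 1]."""
    nodes = [j / N for j in range(N + 1)]
    values = [f(x) for x in nodes]

    def l(x):
        # Index j with x in [x_{j-1}, x_j]; clamped so that x = 0 uses the first interval.
        j = min(max(math.ceil(x * N), 1), N)
        lam = (x - nodes[j - 1]) / (nodes[j] - nodes[j - 1])
        # Convex combination of the two neighbouring node values.
        return values[j] * lam + values[j - 1] * (1 - lam)

    return l

def sup_error(f, g, samples=10001):
    """Estimate sup |f - g| on [0, 1] by sampling an equally spaced grid."""
    pts = [i / (samples - 1) for i in range(samples)]
    return max(abs(f(x) - g(x)) for x in pts)

if __name__ == "__main__":
    f = lambda x: math.sin(2 * math.pi * x)
    for N in (10, 100):
        print(N, sup_error(f, piecewise_linear(f, N)))
```

Increasing `N` shrinks the mesh $1/N$ below the $\delta$ from uniform continuity, which is exactly how the proof forces the error below $\epsilon$.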
Hence every continuous function can be uniformly approximated by a piecewise linear function, and the piecewise linear functions are dense in $C([a, b])$.

Instead of proving such results for every possible subset we would like to have a general criterion for density. This will be given by the following two theorems, Stone's Theorem and the Stone-Weierstrass Theorem. To formulate these theorems we introduce the following notation. Let $f$ and $g$ be two continuous functions on a metric space $X$; then $(f \vee g)(x) = \max\{f(x), g(x)\}$ and $(f \wedge g)(x) = \min\{f(x), g(x)\}$. It is clear that $f \vee g$ and $f \wedge g$ are both continuous.

Theorem 10 Stone's Theorem Let $X$ be a compact metric space, and let $L \subset C(X)$ have the following properties:
1. if $f, g \in L$ then $af + bg \in L$, i.e. $L$ is a vector space.
2. if $f, g \in L$, then $f \vee g,\ f \wedge g \in L$.
3. for any $x, y \in X$ with $x \neq y$, there exists $f \in L$ with $f(x) \neq f(y)$, i.e. $L$ separates points.
4. every constant function belongs to $L$.
Then $L$ is dense in $C(X)$.

Proof: First, if $x \neq y$ and $a, b \in \mathbb{R}$ there is an $f \in L$ such that $f(x) = a$ and $f(y) = b$. This is obvious if $a = b$, since constants belong to $L$. If $a \neq b$, there is a $g \in L$ such that $g(x) = \alpha$ and $g(y) = \beta$ with $\alpha \neq \beta$, and then
\[ f = a + \frac{b - a}{\beta - \alpha}\,(g - \alpha) \]
belongs to $L$ by properties 1 and 4 and satisfies the desired property.

Now let $f \in C(X)$ and $\epsilon > 0$. We will construct a function $g \in L$ such that $f(x) - \epsilon < g(x) < f(x) + \epsilon$ for all $x \in X$. For every pair $x, y \in X$ there exists a function $g_{xy} \in L$ such that $g_{xy}(x) = f(x)$ and $g_{xy}(y) = f(y)$. By continuity of $g_{xy}$ and $f$ there exists a number $\delta_{xy} > 0$ such that $g_{xy}(t) < f(t) + \epsilon$ for all $t \in B_{\delta_{xy}}(y)$. For fixed $x$, the collection of open balls $B_{\delta_{xy}}(y)$ forms an open cover of $X$, and by compactness there exist points $y_1, \dots, y_N$ such that
\[ X = B_{\delta_{xy_1}}(y_1) \cup \dots \cup B_{\delta_{xy_N}}(y_N). \]
Now we define for every $x \in X$
\[ g_x = g_{xy_1} \wedge \dots \wedge g_{xy_N} \in L. \]
Then $g_x(t) < f(t) + \epsilon$ for all $t \in X$ and $g_x(x) = f(x)$. Again by continuity there exist numbers $\delta_x > 0$ such that $g_x(t) > f(t) - \epsilon$ for all $t \in B_{\delta_x}(x)$. Compactness again implies that there are finitely many points $x_1, \dots, x_M$ whose balls cover $X$. Define
\[ g = g_{x_1} \vee \dots \vee g_{x_M}. \]
Then $f(t) - \epsilon < g(t) < f(t) + \epsilon$ for all $t \in X$ and the proof is complete.

This is a powerful theorem which has the proposition on piecewise linear functions as a direct consequence, since piecewise linear functions on $\mathbb{R}^n$ satisfy all the hypotheses, as do piecewise polynomial functions. However, we often want to use functions which are not defined piecewise, like polynomials or trigonometric polynomials. These functions do not satisfy all hypotheses, since, for example, $p \vee q$ is not a polynomial if $p, q$ are polynomials.
Before continuing to an improved theorem we will explore the condition of maxima further. Observe that |a| = a ∨ (−a) and thus |a − b| = a ∨ b − a ∧ b and a + b = a ∨ b + a ∧ b.
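So if $L$ is a vector space and $|g| \in L$ for every $g \in L$, then condition (2) of the theorem holds, because $f \vee g = \frac{1}{2}(f + g + |f - g|)$ and $f \wedge g = \frac{1}{2}(f + g - |f - g|)$. We have the following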
Lemma 4 There exists a sequence of polynomials $\{p_n\}$ which uniformly converges to $|x|$ on $[-1, 1]$.

Proof: Let $p_0 = 1$ and recursively define
\[ p_{j+1}(x) = \frac{1}{2}\left(x^2 + 2p_j - p_j^2\right). \]
If $|x| \le p_j \le 1$ we have
\[ p_j - p_{j+1} = \frac{1}{2}\left(p_j^2 - |x|^2\right) \ge 0, \]
and
\[ p_{j+1} - |x| = \frac{1}{2}\left[(1 - |x|)^2 - (1 - p_j)^2\right] \ge 0. \]
Since $p_0$ satisfies $|x| \le p_0 \le 1$, it follows by induction that $|x| \le p_{j+1} \le p_j \le 1$, and so for every $x \in [-1, 1]$, $\{p_j(x)\}$ is a bounded decreasing sequence that converges. Let $p$ be the limit of this sequence; then $p$ satisfies
\[ p = \frac{1}{2}\left(|x|^2 + 2p - p^2\right), \]
and therefore $p^2 = |x|^2$. Since $p$ is nonnegative we have $p = |x|$. The uniform convergence follows by a theorem that says if a sequence of continuous functions converges monotonically to a continuous limit on a compact set, the convergence is uniform (see assignment 10).
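The recursion in this proof is easy to run. The sketch below evaluates $p_j(x)$ pointwise in floating point rather than expanding the polynomials symbolically; the names `p_values` and `sup_error` and the sampling grid are assumptions made for illustration only.

```python
def p_values(x, steps):
    """Return [p_0(x), ..., p_steps(x)] for p_0 = 1, p_{j+1} = (x^2 + 2 p_j - p_j^2) / 2."""
    p = 1.0
    out = [p]
    for _ in range(steps):
        p = 0.5 * (x * x + 2.0 * p - p * p)
        out.append(p)
    return out

def sup_error(j, grid=2001):
    """Estimate sup over [-1, 1] of p_j(x) - |x| on an equally spaced grid."""
    xs = [-1.0 + 2.0 * i / (grid - 1) for i in range(grid)]
    return max(p_values(x, j)[j] - abs(x) for x in xs)

if __name__ == "__main__":
    for j in (0, 5, 50, 500):
        print(j, sup_error(j))
```

The estimated error decreases with $j$ but only slowly (roughly like $2/j$ near $x = 0$), which is consistent with monotone convergence to the non-smooth limit $|x|$.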
Theorem 11 Stone-Weierstrass Theorem Let $X$ be a compact metric space, and let $S \subset C(X)$ have the following properties:
1. if $f, g \in S$ then $af + bg \in S$, i.e. $S$ is a vector space.
2. if $f, g \in S$, then $fg \in S$.
3. for any $x, y \in X$ with $x \neq y$, there exists $f \in S$ with $f(x) \neq f(y)$, i.e. $S$ separates points.
4. every constant function belongs to $S$.
Then $S$ is dense in $C(X)$.

Proof: It is clear that if $p$ is a polynomial then $p(f) \in S$ for every $f \in S$. Now for every compact interval $I \subset \mathbb{R}$ there is a sequence of polynomials $\{p_n\}$ which converges uniformly to $|x|$ on $I$. Given $f \in S$, there is a compact interval $I$ such that $f(X) \subset I$ and a sequence $\{p_n(f)\}$ in $S$ that converges to $|f|$. Thus $|f|$ is in the closure of $S$. The closure of $S$ now satisfies all the hypotheses of Stone's Theorem and is therefore dense in $C(X)$. But since the closure is closed it must equal $C(X)$, and hence $S$ is dense in $C(X)$.

An immediate consequence is this classical result.
Corollary 1 Weierstrass Approximation Theorem. Let $X$ be a compact subset of $\mathbb{R}^n$. Then every function $f \in C(X)$ can be uniformly approximated by a sequence of polynomials.

Of course there are many other consequences. Differentiable functions are dense, and for every $n \in \mathbb{N}$ the $n$-times differentiable functions are dense.
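For $X = [0, 1]$ the corollary can be illustrated with explicit polynomials. The sketch below uses Bernstein polynomials, which is a different, constructive route to the same statement and not the Stone-Weierstrass argument given in these notes; the names `bernstein` and `sup_error` and the test function $|x - \frac{1}{2}|$ are illustrative assumptions.

```python
import math

def bernstein(f, n):
    """Return the degree-n Bernstein polynomial of f on [0, 1] as a callable."""
    coeffs = [f(k / n) for k in range(n + 1)]

    def B(x):
        # sum over k of f(k/n) * C(n, k) * x^k * (1 - x)^(n - k)
        return sum(c * math.comb(n, k) * x**k * (1 - x)**(n - k)
                   for k, c in enumerate(coeffs))

    return B

def sup_error(f, g, samples=1001):
    """Estimate sup |f - g| on [0, 1] by sampling an equally spaced grid."""
    pts = [i / (samples - 1) for i in range(samples)]
    return max(abs(f(x) - g(x)) for x in pts)

if __name__ == "__main__":
    f = lambda x: abs(x - 0.5)
    for n in (10, 100):
        print(n, sup_error(f, bernstein(f, n)))
```

The test function is continuous but not differentiable at $\frac{1}{2}$, so the slow decay of the error as `n` grows reflects that polynomial approximation needs only continuity, not smoothness.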
4.5 Compactness in C(X):
Another important question is when a sequence of functions has a uniformly convergent subsequence. In $\mathbb{R}^n$ we have the Bolzano-Weierstrass Theorem, which gives a condition that can be easily checked. The situation in $C(X)$ is somewhat more complicated.

Definition 17 Let $X$ be a metric space and $F \subset C(X)$. The set $F$ is said to be equicontinuous if for every $\epsilon > 0$ and every $x \in X$ there exists a $\delta > 0$ such that $|f(x) - f(y)| < \epsilon$ for all $y \in B_\delta(x)$ and all $f \in F$.

This condition is a condition on a set of functions, not on a single function. We now prove:
Theorem 12 Classical Arzela-Ascoli Theorem. Let $X$ be a metric space which has a countable dense subset $Q$, and let $G \subset C(X)$ have the following properties:
1. There exists an $M > 0$ such that $|g| < M$ for all $g \in G$.
2. $G$ is equicontinuous.
Then every sequence of functions in $G$ has a (pointwise) convergent subsequence.

Proof: Let $Q = \{p_1, p_2, \dots\}$. Then $\{g_n(p_1)\}$ is a bounded sequence of real numbers, and therefore has a convergent subsequence $\{g_n^1(p_1)\}$ whose limit we call $g(p_1)$. We ignore all terms of $\{g_n\}$ which are not part of $\{g_n^1\}$. $\{g_n^1(p_2)\}$ is again a bounded sequence of real numbers with a convergent subsequence $\{g_n^2(p_2)\}$. Observe that
\[ g_n^2(p_2) \to g(p_2) \quad\text{and}\quad g_n^2(p_1) \to g(p_1). \]
In this manner we keep extracting subsequences $g_n^{j+1}(p_{j+1})$ of $g_n^{j}(p_{j+1})$ with the property that
\[ \lim_{n\to\infty} g_n^{j+1}(p_i) = g(p_i) \quad\text{for all } i = 1, \dots, j + 1. \]
We next construct a new sequence by diagonalization, i.e. we set
\[ h_j = g_j^{j}, \]
i.e. the $j$th term of this sequence is given by the $j$th term of the $j$th subsequence. In this way $\{h_n\}$ is a subsequence of all the sequences constructed above, and thus
\[ \lim_{n\to\infty} h_n(p_j) = g(p_j), \]
there is a q ∈ Q such that q ∈ Bδ (x). For this q, there exists a M > 0 such that |hn (q) − hm (q)| < for all ...