
# TrefethenLecture3 - Lecture 3 Norms The essential notions...



Part I. Fundamentals

## Lecture 3. Norms

The essential notions of size and distance in a vector space are captured by norms. These are the yardsticks with which we measure approximations and convergence throughout numerical linear algebra.

### Vector Norms

A norm is a function $\|\cdot\| : \mathbb{C}^m \to \mathbb{R}$ that assigns a real-valued length to each vector. In order to conform to a reasonable notion of length, a norm must satisfy the following three conditions. For all vectors $x$ and $y$ and for all scalars $\alpha \in \mathbb{C}$,

$$
\begin{aligned}
&(1)\quad \|x\| \ge 0, \text{ and } \|x\| = 0 \text{ only if } x = 0, \\
&(2)\quad \|x + y\| \le \|x\| + \|y\|, \\
&(3)\quad \|\alpha x\| = |\alpha|\,\|x\|.
\end{aligned}
\tag{3.1}
$$

In words, these conditions require that (1) the norm of a nonzero vector is positive, (2) the norm of a vector sum does not exceed the sum of the norms of its parts (the triangle inequality), and (3) scaling a vector scales its norm by the same amount.

In the last lecture, we used $\|\cdot\|$ to denote the Euclidean length function (the square root of the sum of the squares of the entries of a vector). However, the three conditions (3.1) allow for different notions of length, and at times it is useful to have this flexibility.

The most important class of vector norms, the $p$-norms, are defined below. The closed unit ball $\{x \in \mathbb{C}^m : \|x\| \le 1\}$ corresponding to each norm is illustrated alongside its definition for the case $m = 2$ (sketches not reproduced here: a diamond for the 1-norm, a disk for the 2-norm, a square for the $\infty$-norm).

$$
\begin{aligned}
\|x\|_1 &= \sum_{i=1}^{m} |x_i|, \\
\|x\|_2 &= \Bigl(\sum_{i=1}^{m} |x_i|^2\Bigr)^{1/2} = \sqrt{x^* x}, \\
\|x\|_\infty &= \max_{1 \le i \le m} |x_i|, \\
\|x\|_p &= \Bigl(\sum_{i=1}^{m} |x_i|^p\Bigr)^{1/p} \quad (1 \le p < \infty).
\end{aligned}
\tag{3.2}
$$

The 2-norm is the Euclidean length function; its unit ball is spherical. The 1-norm is used by airlines to define the maximal allowable size of a suitcase. The Sergel plaza in Stockholm, Sweden has the shape of the unit ball in the 4-norm; the Danish poet Piet Hein popularized this "superellipse" as a pleasing shape for objects such as conference tables.

Aside from the $p$-norms, the most useful norms are the weighted $p$-norms, where each of the coordinates of a vector space is given its own weight.
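Before weighted norms are taken up, the $p$-norm definitions (3.2) can be tried out numerically. The following is a minimal sketch in plain Python, not part of the original lecture; the function name `p_norm` and the sample vector are illustrative assumptions.

```python
import math

def p_norm(x, p):
    """Return the p-norm (3.2) of a sequence of real or complex numbers.

    Hypothetical helper for illustration; p may be any value with
    1 <= p <= infinity (use math.inf for the infinity-norm).
    """
    if p < 1:
        raise ValueError("p must satisfy 1 <= p <= inf")
    magnitudes = [abs(xi) for xi in x]
    if math.isinf(p):
        # Infinity-norm: the largest entry in absolute value.
        return max(magnitudes, default=0.0)
    # General case: (sum of |x_i|^p)^(1/p).
    return sum(m ** p for m in magnitudes) ** (1.0 / p)

x = [3, -4]                       # arbitrary sample vector
norm_1 = p_norm(x, 1)             # |3| + |-4|
norm_2 = p_norm(x, 2)             # sqrt(3^2 + 4^2)
norm_inf = p_norm(x, math.inf)    # max(|3|, |-4|)
print(norm_1, norm_2, norm_inf)
```

For this vector the three values are 7, 5, and 4, which illustrates that different norms can assign quite different lengths to the same vector.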
In general, given any norm $\|\cdot\|$, a weighted norm can be written as

$$
\|x\|_W = \|Wx\|. \tag{3.3}
$$

Here $W$ is the diagonal matrix in which the $i$th diagonal entry is the weight $w_i \ne 0$. For example, a weighted 2-norm $\|\cdot\|_W$ on $\mathbb{C}^m$ is specified as follows:

$$
\|x\|_W = \Bigl(\sum_{i=1}^{m} |w_i x_i|^2\Bigr)^{1/2}. \tag{3.4}
$$

One can also generalize the idea of weighted norms by allowing $W$ to be an arbitrary nonsingular matrix, not necessarily diagonal (Exercise 3.1).

The most important norms in this book are the unweighted 2-norm and its induced matrix norm.

### Matrix Norms Induced by Vector Norms

An $m \times n$ matrix can be viewed as a vector in an $mn$-dimensional space: each of the $mn$ entries of the matrix is an independent coordinate. Any $mn$-dimensional norm can therefore be used for measuring the "size" of such a matrix.

However, in dealing with a space of matrices, certain special norms are more useful than the vector norms (3.2)-(3.3) already discussed. These are the induced matrix norms, defined in terms of the behavior of a matrix as an operator between its normed domain and range spaces.

Given vector norms $\|\cdot\|_{(n)}$ and $\|\cdot\|_{(m)}$ on the domain and the range of $A \in \mathbb{C}^{m \times n}$, respectively, the induced matrix norm $\|A\|_{(m,n)}$ is the smallest number $C$ for which the following inequality holds for all $x \in \mathbb{C}^n$:

$$
\|Ax\|_{(m)} \le C\,\|x\|_{(n)}. \tag{3.5}
$$

In other words, $\|A\|_{(m,n)}$ is the supremum of the ratios $\|Ax\|_{(m)} / \|x\|_{(n)}$ over all nonzero vectors $x \in \mathbb{C}^n$: the maximum factor by which $A$ can "stretch" a vector $x$. We say that $\|\cdot\|_{(m,n)}$ is the matrix norm induced by $\|\cdot\|_{(m)}$ and $\|\cdot\|_{(n)}$.

Because of condition (3) of (3.1), the action of $A$ is determined by its action on unit vectors. Therefore, the matrix norm can be defined equivalently in terms of the images of the unit vectors under $A$:

$$
\|A\|_{(m,n)} = \sup_{\substack{x \in \mathbb{C}^n \\ x \ne 0}} \frac{\|Ax\|_{(m)}}{\|x\|_{(n)}} = \sup_{\substack{x \in \mathbb{C}^n \\ \|x\|_{(n)} = 1}} \|Ax\|_{(m)}. \tag{3.6}
$$

This form of the definition can be convenient for visualizing induced matrix norms, as in the sketches in (3.2) above.

### Examples

Example 3.1. The matrix

$$
A = \begin{bmatrix} 1 & 2 \\ 0 & 2 \end{bmatrix} \tag{3.7}
$$

maps $\mathbb{C}^2$ to $\mathbb{C}^2$.
It also maps $\mathbb{R}^2$ to $\mathbb{R}^2$, which is more convenient if we want to draw pictures and also (it can be shown) sufficient for determining matrix $p$-norms, since the coefficients of $A$ are real.

Figure 3.1 depicts the action of $A$ on the unit balls of $\mathbb{R}^2$ defined by the 1-, 2-, and $\infty$-norms. From this figure, one can see a graphical interpretation of these three norms of $A$. Regardless of the norm, $A$ maps $e_1 = (1, 0)^*$ to the first column of $A$, namely $e_1$ itself, and $e_2 = (0, 1)^*$ to the second column of $A$, namely $(2, 2)^*$. In the 1-norm, the unit vector $x$ that is amplified most by $A$ is $(0, 1)^*$ (or its negative), and the amplification factor is 4. In the $\infty$-norm, the unit vector $x$ that is amplified most by $A$ is $(1, 1)^*$ (or its negative), and the amplification factor is 3. In the 2-norm, the unit vector that is amplified most by $A$ is the vector indicated by the dashed line in the figure (or its negative), and the amplification factor is approximately 2.9208. (Note that it must be at least $\sqrt{8} \approx 2.8284$, since $(0, 1)^*$ maps to $(2, 2)^*$.) We shall consider how to calculate such 2-norm results in Lecture 5. $\square$

Figure 3.1 (image not reproduced; panels for the 1-norm, 2-norm, and $\infty$-norm). On the left, the unit balls of $\mathbb{R}^2$ with respect to $\|\cdot\|_1$, $\|\cdot\|_2$, and $\|\cdot\|_\infty$. On the right, their images under the matrix $A$ of (3.7). Dashed lines mark the vectors that are amplified most by $A$ in each norm.

Example 3.2. The $p$-Norm of a Diagonal Matrix. Let $D$ be the diagonal matrix

$$
D = \begin{bmatrix} d_1 & & \\ & \ddots & \\ & & d_m \end{bmatrix}.
$$

Then, as in the second row of Figure 3.1, the image of the 2-norm unit sphere under $D$ is an $m$-dimensional ellipse whose semiaxis lengths are given by the numbers $|d_i|$. The unit vectors amplified most by $D$ are those that are mapped to the longest semiaxis of the ellipse, of length $\max_i \{|d_i|\}$. Therefore, we have $\|D\|_2 = \max_{1 \le i \le m} \{|d_i|\}$. In the next lecture we shall see that every matrix maps the 2-norm unit sphere to an ellipse (properly called a hyperellipse if $m > 2$), though the axes may be oriented arbitrarily.
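As a numerical aside on Example 3.1, the three amplification factors can be approximated directly from definition (3.6) by sampling unit vectors and recording the largest stretch. The sketch below is an illustration only, not the calculation method of the text (which defers the 2-norm to Lecture 5). It assumes a real 2×2 matrix and real vectors, which the example states is sufficient when the entries of $A$ are real. The helper names `p_norm`, `matvec`, and `estimate_induced_norm` and the sample count are hypothetical choices.

```python
import math

A = [[1.0, 2.0],
     [0.0, 2.0]]  # the matrix of (3.7)

def p_norm(x, p):
    """Vector p-norm as in (3.2); p may be math.inf."""
    mags = [abs(xi) for xi in x]
    if math.isinf(p):
        return max(mags)
    return sum(m ** p for m in mags) ** (1.0 / p)

def matvec(M, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in M]

def estimate_induced_norm(M, p, samples=3600):
    """Approximate the induced p-norm of a real 2x2 matrix via (3.6).

    Samples directions around the circle, rescales each one to have
    unit p-norm, and keeps the largest p-norm of the image. Sampling
    can only underestimate the supremum.
    """
    best = 0.0
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        direction = [math.cos(theta), math.sin(theta)]
        scale = p_norm(direction, p)
        x = [d / scale for d in direction]  # unit vector in the p-norm
        best = max(best, p_norm(matvec(M, x), p))
    return best

est_1 = estimate_induced_norm(A, 1)
est_2 = estimate_induced_norm(A, 2)
est_inf = estimate_induced_norm(A, math.inf)
print(f"{est_1:.4f} {est_2:.4f} {est_inf:.4f}")
```

With this sampling density the estimates should agree with the values 4, 2.9208, and 3 quoted in Example 3.1 to about four decimal places. The approach does not scale beyond very small dimensions and is meant only to make the "maximum stretch" interpretation concrete.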
The result $\|D\|_2 = \max_{1 \le i \le m} |d_i|$ for the 2-norm generalizes to any $p$: if $D$ is diagonal, then $\|D\|_p = \max_{1 \le i \le m} |d_i|$. $\square$

Example 3.3. The 1-Norm of a Matrix. If $A$ is any $m \times n$ matrix, then $\|A\|_1$ is equal to the "maximum column sum" of $A$. We explain and derive this result as follows. Write $A$ in terms of its columns

$$
A = \Bigl[\, a_1 \,\Big|\, \cdots \,\Big|\, a_n \,\Bigr], \tag{3.8}
$$

where each $a_j$ is an $m$-vector. Consider the diamond-shaped 1-norm unit ball in $\mathbb{C}^n$, illustrated in (3.2). This is the set $\{x \in \mathbb{C}^n : \sum_{j=1}^{n} |x_j| \le 1\}$. Any vector $Ax$ in the image of this set satisfies

$$
\|Ax\|_1 = \Bigl\|\sum_{j=1}^{n} x_j a_j\Bigr\|_1 \le \sum_{j=1}^{n} |x_j|\,\|a_j\|_1 \le \max_{1 \le j \le n} \|a_j\|_1.
$$

Therefore the induced matrix 1-norm satisfies $\|A\|_1 \le \max_{1 \le j \le n} \|a_j\|_1$. By choosing $x = e_j$, where $j$ maximizes $\|a_j\|_1$, we attain this bound, and thus the matrix norm is

$$
\|A\|_1 = \max_{1 \le j \le n} \|a_j\|_1. \tag{3.9}
$$

$\square$

Example 3.4. The $\infty$-Norm of a Matrix. By much the same argument, it can be shown that the $\infty$-norm of an $m \times n$ matrix is equal to the "maximum row sum,"

$$
\|A\|_\infty = \max_{1 \le i \le m} \|a_i^*\|_1, \tag{3.10}
$$

where $a_i^*$ denotes the $i$th row of $A$. $\square$

### Cauchy-Schwarz and Hölder Inequalities

Computing matrix $p$-norms with $p \ne 1, \infty$ is more difficult, and to approach this problem, we note that inner products can be bounded using $p$-norms. Let $p$ and $q$ satisfy $1/p + 1/q = 1$, with $1 \le p, q \le \infty$. Then the Hölder inequality states that, for any vectors $x$ and $y$,

$$
|x^* y| \le \|x\|_p \|y\|_q. \tag{3.11}
$$

The Cauchy-Schwarz inequality is the special case $p = q = 2$:

$$
|x^* y| \le \|x\|_2 \|y\|_2. \tag{3.12}
$$

Derivations of these results can be found in linear algebra texts. Both bounds are tight in the sense that for certain choices of $x$ and $y$, the inequalities become equalities.

Example 3.5. The 2-Norm of a Row Vector. Consider a matrix $A$ containing a single row. This matrix can be written as $A = a^*$, where $a$ is a column vector. The Cauchy-Schwarz inequality allows us to obtain the induced matrix 2-norm. For any $x$, we have $\|Ax\|_2 = |a^* x| \le \|a\|_2 \|x\|_2$.
This bound is tight: observe that $\|Aa\|_2 = \|a\|_2^2$. Therefore, we have

$$
\|A\|_2 = \sup_{x \ne 0} \bigl\{ \|Ax\|_2 / \|x\|_2 \bigr\} = \|a\|_2.
$$

$\square$

Example 3.6. The 2-Norm of an Outer Product. More generally, consider the rank-one outer product $A = uv^*$, where $u$ is an $m$-vector and $v$ is an $n$-vector. For any $n$-vector $x$, we can bound $\|Ax\|_2$ as follows:

$$
\|Ax\|_2 = \|uv^* x\|_2 = \|u\|_2\,|v^* x| \le \|u\|_2 \|v\|_2 \|x\|_2. \tag{3.13}
$$

Therefore $\|A\|_2 \le \|u\|_2 \|v\|_2$. Again, this inequality is an equality: consider the case $x = v$. $\square$

### Bounding $\|AB\|$ in an Induced Matrix Norm

The induced matrix norm of a matrix product can also be bounded. Let $\|\cdot\|_{(\ell)}$, $\|\cdot\|_{(m)}$, and $\|\cdot\|_{(n)}$ be norms on $\mathbb{C}^\ell$, $\mathbb{C}^m$, and $\mathbb{C}^n$, respectively, and let $A$ be an $\ell \times m$ matrix and $B$ an $m \times n$ matrix. For any $x \in \mathbb{C}^n$ we have

$$
\|ABx\|_{(\ell)} \le \|A\|_{(\ell,m)} \|Bx\|_{(m)} \le \|A\|_{(\ell,m)} \|B\|_{(m,n)} \|x\|_{(n)}.
$$

Therefore the induced norm of $AB$ must satisfy

$$
\|AB\|_{(\ell,n)} \le \|A\|_{(\ell,m)} \|B\|_{(m,n)}. \tag{3.14}
$$

In general this inequality is not an equality. For example, the inequality $\|A^n\| \le \|A\|^n$ holds for any square matrix in any matrix norm induced by a vector norm, but $\|A^n\| = \|A\|^n$ does not hold in general for $n \ge 2$.

### General Matrix Norms

As noted above, matrix norms do not have to be induced by vector norms. In general, a matrix norm must merely satisfy the three vector norm conditions (3.1) applied in the $mn$-dimensional vector space of matrices:

$$
\begin{aligned}
&(1)\quad \|A\| \ge 0, \text{ and } \|A\| = 0 \text{ only if } A = 0, \\
&(2)\quad \|A + B\| \le \|A\| + \|B\|, \\
&(3)\quad \|\alpha A\| = |\alpha|\,\|A\|.
\end{aligned}
\tag{3.15}
$$

The most important matrix norm which is not induced by a vector norm is the Hilbert-Schmidt or Frobenius norm, defined by

$$
\|A\|_F = \Bigl(\sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}|^2\Bigr)^{1/2}. \tag{3.16}
$$

Observe that this is the same as the 2-norm of the matrix when viewed as an $mn$-dimensional vector. The formula for the Frobenius norm can also be written in terms of individual rows or columns. For example, if $a_j$ is the $j$th column of $A$, we have

$$
\|A\|_F = \Bigl(\sum_{j=1}^{n} \|a_j\|_2^2\Bigr)^{1/2}. \tag{3.17}
$$

This identity, as well as the analogous result based on rows instead of columns, can be expressed compactly by the equation

$$
\|A\|_F = \sqrt{\operatorname{tr}(A^* A)} = \sqrt{\operatorname{tr}(A A^*)}, \tag{3.18}
$$

where $\operatorname{tr}(B)$ denotes the trace of $B$, the sum of its diagonal entries.

Like an induced matrix norm, the Frobenius norm can be used to bound products of matrices. Let $C = AB$ with entries $c_{ij}$, and let $a_i^*$ denote the $i$th row of $A$ and $b_j$ the $j$th column of $B$. Then $c_{ij} = a_i^* b_j$, so by the Cauchy-Schwarz inequality we have $|c_{ij}| \le \|a_i\|_2 \|b_j\|_2$. Squaring both sides and summing over all $i, j$, we obtain

$$
\|AB\|_F^2 = \sum_{i} \sum_{j} |c_{ij}|^2 \le \sum_{i} \sum_{j} \bigl(\|a_i\|_2 \|b_j\|_2\bigr)^2 = \sum_{i} \bigl(\|a_i\|_2\bigr)^2 \sum_{j} \bigl(\|b_j\|_2\bigr)^2 = \|A\|_F^2 \|B\|_F^2.
$$

### Invariance under Unitary Multiplication

One of the many special properties of the matrix 2-norm is that, like the vector 2-norm, it is invariant under multiplication by unitary matrices. The same property holds for the Frobenius norm.

Theorem 3.1. For any $A \in \mathbb{C}^{m \times n}$ and unitary $Q \in \mathbb{C}^{m \times m}$, we have

$$
\|QA\|_2 = \|A\|_2, \qquad \|QA\|_F = \|A\|_F.
$$

Proof. Since $\|Qx\|_2 = \|x\|_2$ for every $x$, by (2.10), the invariance in the 2-norm follows from (3.6). For the Frobenius norm we note that by (3.17), it is enough to show that the $j$th column of $QA$ has the same 2-norm as the $j$th column of $A$, and this follows from (1.6) and (2.10). $\square$

### Exercises

3.1. Prove that if $W$ is an arbitrary nonsingular matrix, the function $\|\cdot\|_W$ defined by (3.3) is a vector norm.

3.2. Let $\|\cdot\|$ denote any norm on $\mathbb{C}^m$ and also the induced matrix norm on $\mathbb{C}^{m \times m}$. Show that $\rho(A) \le \|A\|$, where $\rho(A)$ is the spectral radius of $A$, i.e., the largest absolute value $|\lambda|$ of an eigenvalue $\lambda$ of $A$.

3.3. Vector and matrix $p$-norms are related by various inequalities, often involving the dimensions $m$ or $n$. For each of the following, verify the inequality and give an example of a nonzero vector or matrix (for general $m, n$) for which equality is achieved. In this problem $x$ is an $m$-vector and $A$ is an $m \times n$ matrix.
(a) $\|x\|_\infty \le \|x\|_2$,
(b) $\|x\|_2 \le \sqrt{m}\,\|x\|_\infty$,
(c) $\|A\|_\infty \le \sqrt{n}\,\|A\|_2$,
(d) $\|A\|_2 \le \sqrt{m}\,\|A\|_\infty$.

3.4. Let $A$ be an $m \times n$ matrix and let $B$ be a submatrix of $A$, that is, a $\mu \times \nu$ matrix ($\mu \le m$, $\nu \le n$) obtained by selecting certain rows and columns of $A$.

(a) Explain how $B$ can be obtained by multiplying $A$ by certain row and column "deletion matrices" as in step (7) of Exercise 1.1.
(b) Using this product, show that $\|B\|_p \le \|A\|_p$ for any $p$ with $1 \le p \le \infty$.

3.5. Example 3.6 shows that if $E$ is an outer product $E = uv^*$, then $\|E\|_2 = \|u\|_2 \|v\|_2$. Is the same true for the Frobenius norm, i.e., $\|E\|_F = \|u\|_F \|v\|_F$? Prove it or give a counterexample.

3.6. Let $\|\cdot\|$ denote any norm on $\mathbb{C}^m$. The corresponding dual norm $\|\cdot\|'$ is defined by the formula $\|x\|' = \sup_{\|y\| = 1} |y^* x|$.

(a) Prove that $\|\cdot\|'$ is a norm.
(b) Let $x, y \in \mathbb{C}^m$ with $\|x\| = \|y\| = 1$ be given. Show that there exists a rank-one matrix $B = yz^*$ such that $Bx = y$ and $\|B\| = 1$, where $\|B\|$ is the matrix norm of $B$ induced by the vector norm $\|\cdot\|$. You may use the following lemma, without proof: given $x \in \mathbb{C}^m$, there exists a nonzero $z \in \mathbb{C}^m$ such that $|z^* x| = \|z\|' \|x\|$. ...
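The Frobenius-norm statements of this lecture, namely the three equivalent formulas (3.16) through (3.18), the product bound $\|AB\|_F \le \|A\|_F \|B\|_F$, and the unitary invariance of Theorem 3.1, lend themselves to a quick numerical spot check. The sketch below is illustrative only and proves nothing. It assumes small real matrices stored as lists of rows, so the conjugate transpose reduces to the ordinary transpose. The helper names, the matrix `B`, and the rotation angle are arbitrary choices, not taken from the text.

```python
import math

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def frobenius(X):
    """Entrywise definition (3.16), for real matrices."""
    return math.sqrt(sum(x * x for row in X for x in row))

def frobenius_by_columns(X):
    """Column form (3.17): root of the sum of squared column 2-norms."""
    return math.sqrt(sum(sum(x * x for x in col) for col in transpose(X)))

def frobenius_by_trace(X):
    """Trace form (3.18): sqrt(tr(X^T X)) for real X."""
    G = matmul(transpose(X), X)
    return math.sqrt(sum(G[i][i] for i in range(len(G))))

A = [[1.0, 2.0], [0.0, 2.0]]      # the matrix of (3.7)
B = [[0.5, -1.0], [3.0, 4.0]]     # arbitrary example matrix
theta = 0.7                       # arbitrary rotation angle
Q = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]  # real orthogonal, hence unitary

norm_A = frobenius(A)
norm_AB = frobenius(matmul(A, B))
norm_QA = frobenius(matmul(Q, A))

print(norm_A, frobenius_by_columns(A), frobenius_by_trace(A))
print(norm_AB <= norm_A * frobenius(B))   # product bound
print(abs(norm_QA - norm_A) < 1e-12)      # Theorem 3.1, Frobenius part
```

For the matrix of (3.7) all three formulas give $\sqrt{1 + 4 + 0 + 4} = 3$. Extending the check to complex matrices would require conjugating entries in `transpose`, which this sketch deliberately leaves out.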