Part I. Fundamentals

Lecture 3. Norms

The essential notions of size and distance in a vector space are captured by norms. These are the yardsticks with which we measure approximations and convergence throughout numerical linear algebra.

Vector Norms

A norm is a function $\|\cdot\| : \mathbb{C}^m \to \mathbb{R}$ that assigns a real-valued length to each vector. In order to conform to a reasonable notion of length, a norm must satisfy the following three conditions. For all vectors $x$ and $y$ and for all scalars $\alpha \in \mathbb{C}$,

\[
\begin{aligned}
&(1)\quad \|x\| \ge 0, \text{ and } \|x\| = 0 \text{ only if } x = 0,\\
&(2)\quad \|x + y\| \le \|x\| + \|y\|,\\
&(3)\quad \|\alpha x\| = |\alpha|\,\|x\|.
\end{aligned}
\tag{3.1}
\]

In words, these conditions require that (1) the norm of a nonzero vector is positive, (2) the norm of a vector sum does not exceed the sum of the norms of its parts (the triangle inequality), and (3) scaling a vector scales its norm by the same amount.

In the last lecture, we used $\|\cdot\|$ to denote the Euclidean length function (the square root of the sum of the squares of the entries of a vector). However, the three conditions (3.1) allow for different notions of length, and at times it is useful to have this flexibility.
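The three norm conditions can be spot-checked numerically on sample vectors. The following is a minimal sketch, assuming NumPy is available; the helper names `euclidean_norm` and `check_norm_conditions` are hypothetical and chosen only for illustration. Passing the check on a few samples does not prove that a function is a norm; it only fails to find a counterexample.

```python
import numpy as np

def euclidean_norm(x):
    """Square root of the sum of squared magnitudes of the entries."""
    return float(np.sqrt(np.sum(np.abs(x) ** 2)))

def check_norm_conditions(norm, x, y, alpha, tol=1e-12):
    """Test the three norm conditions on one sample (x, y, alpha).

    Returns a tuple of booleans: (positivity, triangle, scaling).
    """
    # Condition (1): nonnegative, and the zero vector has zero length.
    positivity = norm(x) >= 0 and norm(np.zeros_like(x)) == 0
    # Condition (2): triangle inequality, with a small rounding allowance.
    triangle = norm(x + y) <= norm(x) + norm(y) + tol
    # Condition (3): scaling by alpha scales the norm by |alpha|.
    scaling = abs(norm(alpha * x) - abs(alpha) * norm(x)) <= tol
    return positivity, triangle, scaling

# Illustrative complex test data (not taken from the text).
x = np.array([3 + 4j, 0, 1j])
y = np.array([1, -2, 2 + 0j])
alpha = 2 - 1j

results = check_norm_conditions(euclidean_norm, x, y, alpha)
print(results)
```

Any other candidate length function can be passed in place of `euclidean_norm` to see whether it survives the same checks.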
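The most important class of vector norms, the p-norms, are defined below. The closed unit ball $\{x \in \mathbb{C}^m : \|x\| \le 1\}$ corresponding to each norm is illustrated alongside its definition for the case $m = 2$.

\[
\begin{aligned}
\|x\|_1 &= \sum_{i=1}^{m} |x_i|,\\
\|x\|_2 &= \Bigl(\sum_{i=1}^{m} |x_i|^2\Bigr)^{1/2} = \sqrt{x^* x},\\
\|x\|_\infty &= \max_{1 \le i \le m} |x_i|,\\
\|x\|_p &= \Bigl(\sum_{i=1}^{m} |x_i|^p\Bigr)^{1/p} \qquad (1 \le p < \infty).
\end{aligned}
\tag{3.2}
\]

[Figure: closed unit balls of the 1-, 2-, $\infty$-, and p-norms in the case $m = 2$.]

The 2-norm is the Euclidean length function; its unit ball is spherical. The 1-norm is used by airlines to define the maximal allowable size of a suitcase.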
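The p-norm definitions translate directly into a few lines of code. This is a minimal sketch, assuming NumPy; the function name `p_norm` and the sample vector are illustrative choices, not part of the text.

```python
import numpy as np

def p_norm(x, p):
    """Vector p-norm for 1 <= p <= infinity.

    p = np.inf gives the largest entry magnitude; any finite p >= 1
    gives (sum_i |x_i|^p)^(1/p).
    """
    magnitudes = np.abs(np.asarray(x))
    if np.isinf(p):
        return float(np.max(magnitudes))
    if p < 1:
        raise ValueError("p must satisfy 1 <= p <= infinity")
    return float(np.sum(magnitudes ** p) ** (1.0 / p))

# Illustrative vector in the plane (the case m = 2).
x = np.array([3.0, -4.0])
norms = {p: p_norm(x, p) for p in (1, 2, 4, np.inf)}

for p, value in norms.items():
    print(f"p = {p}: {value:.4f}")
```

For this vector the values decrease as p grows (7, 5, about 4.28, and 4), which is consistent with the unit balls growing from the diamond of the 1-norm to the square of the $\infty$-norm.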
The Sergel plaza in Stockholm, Sweden has the shape of the unit ball in the 4-norm; the Danish poet Piet Hein popularized this "superellipse" as a pleasing shape for objects such as conference tables.

Aside from the p-norms, the most useful norms are the weighted p-norms, where each of the coordinates of a vector space is given its own weight. In general, given any norm $\|\cdot\|$, a weighted norm can be written as

\[
\|x\|_W = \|Wx\|. \tag{3.3}
\]

Here $W$ is the diagonal matrix in which the $i$th diagonal entry is the weight $w_i \ne 0$. For example, a weighted 2-norm $\|\cdot\|_W$ on $\mathbb{C}^m$ is specified as follows:

\[
\|x\|_W = \Bigl(\sum_{i=1}^{m} |w_i x_i|^2\Bigr)^{1/2}. \tag{3.4}
\]

One can also generalize the idea of weighted norms by allowing $W$ to be an arbitrary nonsingular matrix, not necessarily diagonal (Exercise 3.1).

The most important norms in this book are the unweighted 2-norm and
its induced matrix norm.

Matrix Norms Induced by Vector Norms

An $m \times n$ matrix can be viewed as a vector in an $mn$-dimensional space: each of the $mn$ entries of the matrix is an independent coordinate. Any $mn$-dimensional norm can therefore be used for measuring the "size" of such a matrix.

However, in dealing with a space of matrices, certain special norms are more useful than the vector norms (3.2)-(3.3) already discussed. These are the induced matrix norms, defined in terms of the behavior of a matrix as an operator between its normed domain and range spaces.

Given vector norms $\|\cdot\|_{(n)}$ and $\|\cdot\|_{(m)}$ on the domain and the range of $A \in \mathbb{C}^{m \times n}$, respectively, the induced matrix norm $\|A\|_{(m,n)}$ is the smallest number $C$ for which the following inequality holds for all $x \in \mathbb{C}^n$:

\[
\|Ax\|_{(m)} \le C\,\|x\|_{(n)}. \tag{3.5}
\]

In other words, $\|A\|_{(m,n)}$ is the supremum of the ratios $\|Ax\|_{(m)} / \|x\|_{(n)}$ over all nonzero vectors $x \in \mathbb{C}^n$, that is, the maximum factor by which $A$ can "stretch" a vector $x$. We say that $\|\cdot\|_{(m,n)}$ is the matrix norm induced by $\|\cdot\|_{(m)}$ and $\|\cdot\|_{(n)}$.

Because of condition (3) of (3.1), the action of $A$ is determined by its action on unit vectors. Therefore, the matrix norm can be defined equivalently in terms of the images of the unit vectors under $A$:

\[
\|A\|_{(m,n)} = \sup_{\substack{x \in \mathbb{C}^n \\ x \ne 0}} \frac{\|Ax\|_{(m)}}{\|x\|_{(n)}} = \sup_{\substack{x \in \mathbb{C}^n \\ \|x\|_{(n)} = 1}} \|Ax\|_{(m)}. \tag{3.6}
\]

This form of the definition can be convenient for visualizing induced matrix
norms, as in the sketches in (3.2) above.

Examples

Example 3.1. The matrix

\[
A = \begin{bmatrix} 1 & 2 \\ 0 & 2 \end{bmatrix} \tag{3.7}
\]

maps $\mathbb{C}^2$ to $\mathbb{C}^2$. It also maps $\mathbb{R}^2$ to $\mathbb{R}^2$, which is more convenient if we want to draw pictures and also (it can be shown) sufficient for determining matrix p-norms, since the coefficients of $A$ are real.

Figure 3.1 depicts the action of $A$ on the unit balls of $\mathbb{R}^2$ defined by the 1-, 2-, and $\infty$-norms. From this figure, one can see a graphical interpretation of these three norms of $A$. Regardless of the norm, $A$ maps $e_1 = (1, 0)^*$ to the first column of $A$, namely $e_1$ itself, and $e_2 = (0, 1)^*$ to the second column of $A$, namely $(2, 2)^*$. In the 1-norm, the unit vector $x$ that is amplified most by $A$ is $(0, 1)^*$ (or its negative), and the amplification factor is 4. In the $\infty$-norm, the unit vector $x$ that is amplified most by $A$ is $(1, 1)^*$ (or its negative), and the amplification factor is 3. In the 2-norm, the unit vector that is amplified most by $A$ is the vector indicated by the dashed line in the figure (or its negative), and the amplification factor is approximately 2.9208. (Note that it must be at least $\sqrt{8} \approx 2.8284$, since $(0, 1)^*$ maps to $(2, 2)^*$.) We shall consider how to
calculate such 2-norm results in Lecture 5.

Figure 3.1 (panels: 1-norm, 2-norm, $\infty$-norm). On the left, the unit balls of $\mathbb{R}^2$ with respect to $\|\cdot\|_1$, $\|\cdot\|_2$, and $\|\cdot\|_\infty$. On the right, their images under the matrix $A$ of (3.7). Dashed lines mark the vectors that are amplified most by $A$ in each norm.

Example 3.2. The p-Norm of a Diagonal Matrix. Let $D$ be the diagonal matrix

\[
D = \begin{bmatrix} d_1 & & \\ & \ddots & \\ & & d_m \end{bmatrix}.
\]

Then, as in the second row of Figure 3.1, the image of the 2-norm unit sphere under $D$ is an $m$-dimensional ellipse whose semiaxis lengths are given by the numbers $|d_i|$. The unit vectors amplified most by $D$ are those that are mapped to the longest semiaxis of the ellipse, of length $\max_i \{|d_i|\}$. Therefore, we have $\|D\|_2 = \max_{1 \le i \le m} \{|d_i|\}$. In the next lecture we shall see that every matrix maps the 2-norm unit sphere to an ellipse (properly called a hyperellipse if $m > 2$), though the axes may be oriented arbitrarily.

This result for the 2-norm generalizes to any $p$: if $D$ is diagonal, then $\|D\|_p = \max_{1 \le i \le m} |d_i|$.

Example 3.3. The 1-Norm of a Matrix. If $A$ is any $m \times n$ matrix, then
$\|A\|_1$ is equal to the "maximum column sum" of $A$. We explain and derive this result as follows. Write $A$ in terms of its columns

\[
A = \begin{bmatrix} a_1 & \cdots & a_n \end{bmatrix}, \tag{3.8}
\]

where each $a_j$ is an $m$-vector. Consider the diamond-shaped 1-norm unit ball in $\mathbb{C}^n$, illustrated in (3.2). This is the set $\{x \in \mathbb{C}^n : \sum_{j=1}^{n} |x_j| \le 1\}$. Any vector $Ax$ in the image of this set satisfies

\[
\|Ax\|_1 = \Bigl\| \sum_{j=1}^{n} x_j a_j \Bigr\|_1 \le \sum_{j=1}^{n} |x_j|\,\|a_j\|_1 \le \max_{1 \le j \le n} \|a_j\|_1.
\]

Therefore the induced matrix 1-norm satisfies $\|A\|_1 \le \max_{1 \le j \le n} \|a_j\|_1$. By choosing $x = e_j$, where $j$ maximizes $\|a_j\|_1$, we attain this bound, and thus the matrix norm is

\[
\|A\|_1 = \max_{1 \le j \le n} \|a_j\|_1. \tag{3.9}
\]
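The maximum-column-sum result can be checked numerically. The sketch below assumes NumPy and uses the matrix of Example 3.1, whose columns are $(1, 0)^*$ and $(2, 2)^*$; the function name `induced_one_norm` and the random sampling are illustrative choices, not part of the text.

```python
import numpy as np

def induced_one_norm(A):
    """Maximum column sum of entry magnitudes (the induced 1-norm)."""
    return float(np.max(np.sum(np.abs(A), axis=0)))

# Matrix of Example 3.1: columns (1, 0) and (2, 2).
A = np.array([[1.0, 2.0],
              [0.0, 2.0]])

norm_A = induced_one_norm(A)

# The bound is attained at x = e_j, where column j has the largest 1-norm.
j = int(np.argmax(np.sum(np.abs(A), axis=0)))
e_j = np.zeros(A.shape[1])
e_j[j] = 1.0
attained = float(np.sum(np.abs(A @ e_j)))

# Random vectors scaled to 1-norm equal to 1 never exceed the bound.
rng = np.random.default_rng(0)
samples = rng.standard_normal((1000, A.shape[1]))
samples /= np.sum(np.abs(samples), axis=1, keepdims=True)
largest_sampled = float(np.max(np.sum(np.abs(samples @ A.T), axis=1)))

print(norm_A, attained, largest_sampled)
```

Here `norm_A` and `attained` both equal 4, the amplification factor quoted for the 1-norm in Example 3.1, while the sampled maximum stays at or below that value.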
Example 3.4. The $\infty$-Norm of a Matrix. By much the same argument, it can be shown that the $\infty$-norm of an $m \times n$ matrix is equal to the "maximum row sum,"

\[
\|A\|_\infty = \max_{1 \le i \le m} \|a_i^*\|_1, \tag{3.10}
\]

where $a_i^*$ denotes the $i$th row of $A$.

Cauchy-Schwarz and Hölder Inequalities

Computing matrix p-norms with $p \ne 1, \infty$ is more difficult, and to approach this problem, we note that inner products can be bounded using p-norms. Let $p$ and $q$ satisfy $1/p + 1/q = 1$, with $1 \le p, q \le \infty$. Then the Hölder inequality states that, for any vectors $x$ and $y$,

\[
|x^* y| \le \|x\|_p \|y\|_q. \tag{3.11}
\]

The Cauchy-Schwarz inequality is the special case $p = q = 2$:

\[
|x^* y| \le \|x\|_2 \|y\|_2. \tag{3.12}
\]

Derivations of these results can be found in linear algebra texts. Both bounds
are tight in the sense that for certain choices of $x$ and $y$, the inequalities become equalities.

Example 3.5. The 2-Norm of a Row Vector. Consider a matrix $A$ containing a single row. This matrix can be written as $A = a^*$, where $a$ is a column vector. The Cauchy-Schwarz inequality allows us to obtain the induced matrix 2-norm. For any $x$, we have $\|Ax\|_2 = |a^* x| \le \|a\|_2 \|x\|_2$. This bound is tight: observe that $\|Aa\|_2 = \|a\|_2^2$. Therefore, we have

\[
\|A\|_2 = \sup_{x \ne 0} \bigl\{ \|Ax\|_2 / \|x\|_2 \bigr\} = \|a\|_2.
\]

Example 3.6. The 2-Norm of an Outer Product. More generally, consider the rank-one outer product $A = uv^*$, where $u$ is an $m$-vector and $v$ is an $n$-vector. For any $n$-vector $x$, we can bound $\|Ax\|_2$ as follows:

\[
\|Ax\|_2 = \|uv^* x\|_2 = \|u\|_2\,|v^* x| \le \|u\|_2 \|v\|_2 \|x\|_2. \tag{3.13}
\]

Therefore $\|A\|_2 \le \|u\|_2 \|v\|_2$. Again, this inequality is an equality: consider the case $x = v$.

Bounding $\|AB\|$ in an Induced Matrix Norm

The induced matrix norm of a matrix product can also be bounded. Let $\|\cdot\|_{(\ell)}$, $\|\cdot\|_{(m)}$, and $\|\cdot\|_{(n)}$ be norms on $\mathbb{C}^\ell$, $\mathbb{C}^m$, and $\mathbb{C}^n$, respectively, and let $A$ be
an $\ell \times m$ matrix and $B$ an $m \times n$ matrix. For any $x \in \mathbb{C}^n$ we have

\[
\|ABx\|_{(\ell)} \le \|A\|_{(\ell,m)} \|Bx\|_{(m)} \le \|A\|_{(\ell,m)} \|B\|_{(m,n)} \|x\|_{(n)}.
\]

Therefore the induced norm of $AB$ must satisfy

\[
\|AB\|_{(\ell,n)} \le \|A\|_{(\ell,m)} \|B\|_{(m,n)}. \tag{3.14}
\]

In general this inequality is not an equality. For example, the inequality $\|A^n\| \le \|A\|^n$ holds for any square matrix in any matrix norm induced by a vector norm, but $\|A^n\| = \|A\|^n$ does not hold in general for $n \ge 2$.

General Matrix Norms

As noted above, matrix norms do not have to be induced by vector norms. In general, a matrix norm must merely satisfy the three vector norm conditions (3.1) applied in the $mn$-dimensional vector space of matrices:

\[
\begin{aligned}
&(1)\quad \|A\| \ge 0, \text{ and } \|A\| = 0 \text{ only if } A = 0,\\
&(2)\quad \|A + B\| \le \|A\| + \|B\|,\\
&(3)\quad \|\alpha A\| = |\alpha|\,\|A\|.
\end{aligned}
\tag{3.15}
\]

The most important matrix norm which is not induced by a vector norm
is the Hilbert-Schmidt or Frobenius norm, defined by

\[
\|A\|_F = \Bigl(\sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}|^2\Bigr)^{1/2}. \tag{3.16}
\]

Observe that this is the same as the 2-norm of the matrix when viewed as an $mn$-dimensional vector. The formula for the Frobenius norm can also be written in terms of individual rows or columns. For example, if $a_j$ is the $j$th column of $A$, we have

\[
\|A\|_F = \Bigl(\sum_{j=1}^{n} \|a_j\|_2^2\Bigr)^{1/2}. \tag{3.17}
\]

This identity, as well as the analogous result based on rows instead of columns, can be expressed compactly by the equation

\[
\|A\|_F = \sqrt{\operatorname{tr}(A^* A)} = \sqrt{\operatorname{tr}(A A^*)}, \tag{3.18}
\]

where $\operatorname{tr}(B)$ denotes the trace of $B$, the sum of its diagonal entries.

Like an induced matrix norm, the Frobenius norm can be used to bound
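products of matrices. Let $C = AB$ with entries $c_{ij}$, and let $a_i^*$ denote the $i$th row of $A$ and $b_j$ the $j$th column of $B$. Then $c_{ij} = a_i^* b_j$, so by the Cauchy-Schwarz inequality we have $|c_{ij}| \le \|a_i\|_2 \|b_j\|_2$. Squaring both sides and summing over all $i, j$, we obtain

\[
\|AB\|_F^2 = \sum_{i} \sum_{j} |c_{ij}|^2
\le \sum_{i} \sum_{j} \bigl(\|a_i\|_2 \|b_j\|_2\bigr)^2
= \Bigl(\sum_{i} \|a_i\|_2^2\Bigr) \Bigl(\sum_{j} \|b_j\|_2^2\Bigr)
= \|A\|_F^2 \|B\|_F^2.
\]

Invariance under Unitary Multiplication

One of the many special properties of the matrix 2-norm is that, like the vector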
H Invariance under Unitary Multiplication One ofthe many special properties ofthe matrix 2—norm is that, like the vector
2—norm, it is invariant under multiplication by unitary matrices. The same
property holds for the Frobenius norm. Theorem 3.1. F07" any A E (3an and unitary Q E (Dmxm, we have HQAH2 = HAHg, HQAHF = HAHF' Proof. Since ||ch3||2 : ||cc||2 for every x, by (2.10), the invariance in the 2—norm
follows from (3.6). For the Frobenius norm we note that by (3.17), it is enough
to show that the jth column of QA has the same 2—norm as the jth column
of A7 and this follows from (1.6) and (2.10). 24 PART I FUNDAMENTALS Exercises Prove that if W is an arbitrary nonsingular matrix, the function H - HW defined
by (3.3) is a vector norm.

3.2. Let $\|\cdot\|$ denote any norm on $\mathbb{C}^m$ and also the induced matrix norm on $\mathbb{C}^{m \times m}$. Show that $\rho(A) \le \|A\|$, where $\rho(A)$ is the spectral radius of $A$, i.e., the largest absolute value $|\lambda|$ of an eigenvalue $\lambda$ of $A$.

3.3. Vector and matrix p-norms are related by various inequalities, often involving the dimensions $m$ or $n$. For each of the following, verify the inequality and give an example of a nonzero vector or matrix (for general $m, n$) for which equality is achieved. In this problem $x$ is an $m$-vector and $A$ is an $m \times n$ matrix.

(a) $\|x\|_\infty \le \|x\|_2$,
(b) $\|x\|_2 \le \sqrt{m}\,\|x\|_\infty$,
(c) $\|A\|_\infty \le \sqrt{n}\,\|A\|_2$,
(d) $\|A\|_2 \le \sqrt{m}\,\|A\|_\infty$.

3.4. Let $A$ be an $m \times n$ matrix and let $B$ be a submatrix of $A$, that is, a $\mu \times \nu$ matrix ($\mu \le m$, $\nu \le n$) obtained by selecting certain rows and columns of $A$.

(a) Explain how $B$ can be obtained by multiplying $A$ by certain row and column "deletion matrices" as in step (7) of Exercise 1.1.
(b) Using this product, show that $\|B\|_p \le \|A\|_p$ for any $p$ with $1 \le p \le \infty$.

3.5. Example 3.6 shows that if $E$ is an outer product $E = uv^*$, then $\|E\|_2 = \|u\|_2 \|v\|_2$. Is the same true for the Frobenius norm, i.e., $\|E\|_F = \|u\|_F \|v\|_F$? Prove it or give a counterexample.

3.6. Let $\|\cdot\|$ denote any norm on $\mathbb{C}^m$. The corresponding dual norm $\|\cdot\|'$ is defined by the formula $\|x\|' = \sup_{\|y\| = 1} |y^* x|$.

(a) Prove that $\|\cdot\|'$ is a norm.
(b) Let $x, y \in \mathbb{C}^m$ with $\|x\| = \|y\| = 1$ be given. Show that there exists a rank-one matrix $B = yz^*$ such that $Bx = y$ and $\|B\| = 1$, where $\|B\|$ is the matrix norm of $B$ induced by the vector norm $\|\cdot\|$. You may use the following lemma, without proof: given $x \in \mathbb{C}^m$, there exists a nonzero $z \in \mathbb{C}^m$ such that $|z^* x| = \|z\|' \|x\|$. ...
- Spring '10
- Johnson