SVD — Singular value decomposition

Recall the URV decomposition: for A \in \mathbb{R}^{m \times n} (or \mathbb{C}^{m \times n}),

    A = U R V^* = U \begin{bmatrix} C_{r \times r} & 0 \\ 0 & 0 \end{bmatrix} V^*,
    \quad \text{with } \mathrm{rank}(A) = \mathrm{rank}(C_{r \times r}) = r.

This is an orthonormal decomposition:

    U \in \mathbb{C}^{m \times m}, \quad U^* U = I \ \text{(unitary)},
    \qquad
    V \in \mathbb{C}^{n \times n}, \quad V^* V = I \ \text{(unitary)}.

C is invertible, and it is also diagonalizable (via unitary transformations), giving

    A = U \Sigma V^* = U \begin{bmatrix} \Sigma_r & 0 \\ 0 & 0 \end{bmatrix} V^*.

Roy Smith, ECE 210a, 12: 1
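As a quick numerical sanity check (a sketch, not part of the original notes — the matrix and its dimensions are arbitrary choices), numpy's SVD reproduces exactly this structure:

```python
# Verify A = U Sigma V* numerically, with U and V unitary and
# Sigma an m x n matrix carrying the singular values on its diagonal.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))

U, s, Vh = np.linalg.svd(A)          # full SVD: U is 5x5, Vh is 3x3
Sigma = np.zeros((5, 3))
Sigma[:3, :3] = np.diag(s)           # embed the singular values in an m x n matrix

assert np.allclose(U @ U.T, np.eye(5))    # U unitary (orthogonal in the real case)
assert np.allclose(Vh @ Vh.T, np.eye(3))  # V unitary
assert np.allclose(U @ Sigma @ Vh, A)     # A = U Sigma V*
```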
We will derive the SVD by construction. Consider C to be real-valued; then

    \|C\|_2^2 = \max_{x^T x = 1} \|Cx\|_2^2 = \max_{x^T x = 1} x^T C^T C x.

Set this up as an optimization problem and use Lagrange multipliers to solve it. Define

    h(x, \gamma) = x^T C^T C x - \gamma (x^T x - 1), \quad \text{with } \gamma > 0,

and solve \max_{x, \gamma} h(x, \gamma). The necessary conditions are

    \frac{\partial h(x, \gamma)}{\partial x} = 0
    \quad \text{and} \quad
    \frac{\partial h(x, \gamma)}{\partial \gamma} = 0.
The condition \partial h / \partial \gamma = 0 gives

    x^T x = 1.

The condition \partial h / \partial x = 0 gives, component by component (recall that \partial x / \partial x_i = e_i),

    \frac{\partial h(x, \gamma)}{\partial x_i}
    = e_i^T C^T C x + x^T C^T C e_i - 2 \gamma \, e_i^T x = 0
    \;\Longrightarrow\;
    2 e_i^T C^T C x - 2 \gamma \, e_i^T x = 0, \quad i = 1, \dots, r.

Stacking these r conditions gives C^T C x - \gamma x = 0, or equivalently,

    (C^T C - \gamma I) x = 0.

This means that \gamma is an eigenvalue of C^T C, and x is the associated eigenvector. So the maximum is achieved for an eigenvalue of C^T C.

For these values of x and \gamma,

    \|Cx\|_2^2 = x^T C^T C x = \gamma \, x^T x = \gamma,
and so

    \max_{x^T x = 1} \|Cx\|_2 = \sqrt{\gamma},

where \gamma is the maximum eigenvalue of C^T C. Therefore \|C\|_2 = \sqrt{\gamma}. (This is the induced 2-norm.)

Define

    y = \frac{Cx}{\|Cx\|_2} = \frac{Cx}{\sqrt{\gamma}}, \quad \text{so } \|y\|_2 = 1.

We can use x and y to diagonalize C. To begin, scale x so that \|x\|_2 = 1.

We can diagonalize C via Householder transformations as follows. Define the elementary reflectors

    R_x = \begin{bmatrix} x & X \end{bmatrix} \in \mathbb{R}^{r \times r}, \quad R_x \ \text{unitary},
    \qquad
    R_y = \begin{bmatrix} y & Y \end{bmatrix} \in \mathbb{R}^{r \times r}, \quad R_y \ \text{unitary}.

Here X \in \mathbb{R}^{r \times (r-1)} with x \perp X (and analogously for Y). We can choose X (and Y) such that R_x^T R_x = I and R_y^T R_y = I.
Now

    R_y^T C R_x
    = \begin{bmatrix} y^T \\ Y^T \end{bmatrix} C \begin{bmatrix} x & X \end{bmatrix}
    = \begin{bmatrix} y^T C x & y^T C X \\ Y^T C x & Y^T C X \end{bmatrix}.

But

    y^T C x = y^T y \sqrt{\gamma} = \sqrt{\gamma},
    \qquad
    y^T C X = \frac{x^T C^T C X}{\sqrt{\gamma}} = \frac{\gamma \, x^T X}{\sqrt{\gamma}} = 0 \quad (x \perp X),
    \qquad
    Y^T C x = Y^T y \sqrt{\gamma} = 0 \quad (y \perp Y).
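This deflation step can be checked numerically. A sketch, assuming a random real C; the orthogonal completions X and Y are built here with a QR factorization rather than explicit Householder formulas (the resulting R_x = [x X] has the same structure):

```python
# Verify that R_y^T C R_x = [[sigma_1, 0], [0, C_2]] when x is the top
# eigenvector of C^T C and y = Cx / ||Cx||_2.
import numpy as np

rng = np.random.default_rng(1)
r = 4
C = rng.standard_normal((r, r))

# x: unit eigenvector of C^T C for the largest eigenvalue gamma.
evals, evecs = np.linalg.eigh(C.T @ C)   # eigenvalues in ascending order
gamma_max = evals[-1]
x = evecs[:, -1]

y = C @ x / np.sqrt(gamma_max)           # y = Cx / ||Cx||_2

def completion(u):
    """Return an orthogonal matrix whose first column is the unit vector u."""
    Q, _ = np.linalg.qr(np.column_stack([u, np.eye(len(u))[:, :-1]]))
    if Q[:, 0] @ u < 0:                  # qr may flip the sign; undo it
        Q[:, 0] *= -1
    return Q

R_x = completion(x)                      # R_x = [x  X]
R_y = completion(y)                      # R_y = [y  Y]

M = R_y.T @ C @ R_x
# First row and column vanish except M[0, 0] = sigma_1 = sqrt(gamma_max):
assert np.isclose(M[0, 0], np.sqrt(gamma_max))
assert np.allclose(M[0, 1:], 0)
assert np.allclose(M[1:, 0], 0)
```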
So

    R_y^T C R_x = \begin{bmatrix} \sqrt{\gamma} & 0 \\ 0 & Y^T C X \end{bmatrix}.

Define \sigma_1 = \sqrt{\gamma} to get

    R_y^T C R_x = \begin{bmatrix} \sigma_1 & 0 \\ 0 & C_2 \end{bmatrix},
    \quad C_2 = Y^T C X.

So now

    A = U R V^T
      = U \begin{bmatrix} C & 0 \\ 0 & 0 \end{bmatrix} V^T
      = U \begin{bmatrix} R_y & 0 \\ 0 & I \end{bmatrix}
          \begin{bmatrix} \sigma_1 & 0 & 0 \\ 0 & C_2 & 0 \\ 0 & 0 & 0 \end{bmatrix}
          \begin{bmatrix} R_x^T & 0 \\ 0 & I \end{bmatrix} V^T
      = U_1 \begin{bmatrix} \sigma_1 & 0 & 0 \\ 0 & C_2 & 0 \\ 0 & 0 & 0 \end{bmatrix} V_1^T,

with U_1 = U \begin{bmatrix} R_y & 0 \\ 0 & I \end{bmatrix} unitary, V_1 = V \begin{bmatrix} R_x & 0 \\ 0 & I \end{bmatrix} unitary, and C_2 \in \mathbb{R}^{(r-1) \times (r-1)}.
So we now have

    A = U_1 \begin{bmatrix} \sigma_1 & 0 & 0 \\ 0 & C_2 & 0 \\ 0 & 0 & 0 \end{bmatrix} V_1^T.

For the next step, consider C_2 and repeat the procedure to give

    A = U_2 \begin{bmatrix}
        \sigma_1 & 0 & 0 & 0 \\
        0 & \sigma_2 & 0 & 0 \\
        0 & 0 & C_3 & 0 \\
        0 & 0 & 0 & 0
      \end{bmatrix} V_2^T
      = \cdots = U \Sigma V^* \ \text{in general},

with \Sigma \in \mathbb{R}^{m \times n}; the diagonal elements of \Sigma are the singular values of A.

If \mathrm{rank}(A) = r, then \sigma_1, \dots, \sigma_r > 0 and \sigma_{r+1} = \cdots = 0.
The columns of U = \begin{bmatrix} u_1 & \cdots & u_m \end{bmatrix} and V = \begin{bmatrix} v_1 & \cdots & v_n \end{bmatrix} are the left and right singular vectors of A. The singular values can be calculated as

    \sigma_i = \left( \lambda_i (A^T A) \right)^{1/2}.

Partitioning at rank r,

    A = \begin{bmatrix} u_1 \cdots u_r & \big| & u_{r+1} \cdots u_m \end{bmatrix}
        \, \Sigma \,
        \begin{bmatrix} v_1 \cdots v_r & \big| & v_{r+1} \cdots v_n \end{bmatrix}^T,

where
    u_1, \dots, u_r       is a basis for \mathcal{R}(A),
    u_{r+1}, \dots, u_m   is a basis for \mathcal{N}(A^T),
    v_1, \dots, v_r       is a basis for \mathcal{R}(A^T),
    v_{r+1}, \dots, v_n   is a basis for \mathcal{N}(A).
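The four-subspace picture is easy to confirm numerically (a sketch; the rank-2 test matrix is an arbitrary construction):

```python
# Build a rank-2 matrix and check that the SVD partitions U and V into
# bases for R(A), N(A^T), R(A^T), and N(A).
import numpy as np

rng = np.random.default_rng(2)
m, n, r = 5, 4, 2
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # rank 2 by construction

U, s, Vh = np.linalg.svd(A)
assert np.sum(s > 1e-10) == r                       # exactly r nonzero singular values
# sigma_i = sqrt(lambda_i(A^T A)): compare against eigenvalues of A^T A.
assert np.allclose(s[:r]**2, np.linalg.eigvalsh(A.T @ A)[::-1][:r])

Ur, Un = U[:, :r], U[:, r:]     # candidate bases for R(A) and N(A^T)
Vr, Vn = Vh[:r].T, Vh[r:].T     # candidate bases for R(A^T) and N(A)

assert np.allclose(A.T @ Un, 0)          # columns of Un lie in N(A^T)
assert np.allclose(A @ Vn, 0)            # columns of Vn lie in N(A)
assert np.allclose(Ur @ (Ur.T @ A), A)   # every column of A lies in span(Ur) = R(A)
```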
We can now look at \|A\|_2 (the induced 2-norm) more closely:

    \|A\|_2 = \max_{\|x\|_2 = 1} \|Ax\|_2 = \max_{\|x\|_2 = 1} \|U \Sigma V^* x\|_2.

Say v = V^T x; then v^T v = x^T V V^T x = x^T x = 1 (V is unitary). This gives an equivalent problem,

    \|A\|_2 = \max_{v^T v = 1} \|U \Sigma v\|_2.

The 2-norm is invariant with respect to the unitary matrix U, so

    \|A\|_2^2 = \max_{v^T v = 1} \|\Sigma v\|_2^2
              = \max_{v^T v = 1} \sum_{i=1}^{r} \sigma_i^2 v_i^2.
Writing this out,

    \|A\|_2^2 = \max_{v^T v = 1} \|\Sigma v\|_2^2
              = \max_{v} \; \sigma_1^2 v_1^2 + \sigma_2^2 v_2^2 + \cdots + \sigma_r^2 v_r^2
                \quad \text{subject to } \sum_{i=1}^{r} v_i^2 = 1.

This is maximized by choosing v = \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix}^T. Then

    \|A\|_2 = \sigma_1 \quad \text{(the maximum singular value of } A \text{)}.

This is also obvious from our derivation of the singular value decomposition.
We can also write A as the dyadic expansion

    A = \sum_{i=1}^{r} \sigma_i u_i v_i^T.

Recall the earlier comparison between eigenvalues and the induced 2-norm: the singular values scale each of the semiaxes of the image of the unit ball. Note the orthogonality of the singular vectors with respect to one another.

Rank and approximation problems

The rank of a matrix can be loosely thought of as the "complexity" of the operator it represents. Low-rank matrices can be stored efficiently, and algorithms involving low-rank matrices are often simpler and faster than full-rank matrix algorithms. Is there a lower-rank matrix "close" to A?

Approximation problem: find B, with \mathrm{rank}(B) = k < r, such that we solve

    \min_{\mathrm{rank}(B) = k} \|A - B\|_2.
Lower-rank approximation problem

To solve \min_{\mathrm{rank}(B) = k} \|A - B\|_2, partition A as

    A = \begin{bmatrix} U_{1a} & U_{1b} & U_2 \end{bmatrix}
        \begin{bmatrix}
          \mathrm{diag}(\sigma_1, \dots, \sigma_k) & 0 & 0 \\
          0 & \mathrm{diag}(\sigma_{k+1}, \dots, \sigma_r) & 0 \\
          0 & 0 & 0
        \end{bmatrix}
        \begin{bmatrix} V_{1a} & V_{1b} & V_2 \end{bmatrix}^*.

The approximation problem is solved with

    B = U_{1a} \, \mathrm{diag}(\sigma_1, \dots, \sigma_k) \, V_{1a}^*.

Then

    A - B = U_{1b} \, \mathrm{diag}(\sigma_{k+1}, \dots, \sigma_r) \, V_{1b}^*
    \quad \text{(this is an SVD)},

which gives

    \min_{\mathrm{rank}(B) = k} \|A - B\|_2 = \sigma_{k+1}.
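A numerical check of this result (a sketch; the test matrix and the choice of k are arbitrary):

```python
# Rank-k truncation of the SVD achieves ||A - B||_2 = sigma_{k+1}.
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((8, 6))
k = 2

U, s, Vh = np.linalg.svd(A)
B = U[:, :k] @ np.diag(s[:k]) @ Vh[:k]       # B = U_{1a} diag(sigma_1..k) V_{1a}^*

assert np.linalg.matrix_rank(B) == k
# sigma_{k+1} is s[k] with 0-based indexing:
assert np.isclose(np.linalg.norm(A - B, 2), s[k])
```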
Example: contextual search

The SVD can be used for data mining or in search engines. Take a dictionary of terms T_1, \dots, T_m. Scan each document (or page), D_j, and determine the frequency of terms,

    d_j = (f_{1j}, f_{2j}, \dots, f_{mj}),

where f_{ij} is the number of occurrences of term i in document j. Define

    A = \begin{bmatrix} f_{11} & \cdots & f_{1n} \\ \vdots & & \vdots \\ f_{m1} & \cdots & f_{mn} \end{bmatrix}
        \in \mathbb{R}^{m \times n}.

A is typically large (n \gg m) and sparse. To search the database, define a query,

    q = \begin{bmatrix} 0 & \cdots & 1 & 0 & \cdots & 1 & \cdots \end{bmatrix}^T,

where the 1's correspond to the terms in our search.
We use the inner product to define the "closeness" of a match to a particular document:

    \cos \theta_j = \frac{q^T d_j}{\|q\|_2 \, \|d_j\|_2}
                  = \frac{q^T A e_j}{\|q\|_2 \, \|A e_j\|_2}.

If \cos \theta_j > t then the document is significant (t is a tolerance). The matrix A is "noisy" because of differences in the documents (length, author, style). If we normalize q and the columns of A, then

    q^T A = \begin{bmatrix} \cos \theta_1 & \cos \theta_2 & \cdots & \cos \theta_n \end{bmatrix}.

Model reduction: A is very large; can we use a low-rank approximation?
Calculate an SVD of A and truncate at k < \min\{m, n\}:

    A_k = U_1 \, \mathrm{diag}(\sigma_1, \dots, \sigma_k) \, V_1^T.

So

    \cos \theta_j \approx \frac{q^T A_k e_j}{\|q\|_2 \, \|A_k e_j\|_2}.

When can we effectively do this? Defining s_j = \mathrm{diag}(\sigma_1, \dots, \sigma_k) \, V_1^T e_j gives A_k e_j = U_1 s_j, and since U_1 has orthonormal columns, \|A_k e_j\|_2 = \|s_j\|_2, so

    \cos \theta_j \approx \frac{q^T U_1 s_j}{\|q\|_2 \, \|s_j\|_2}.

Repeated searches on the database are now much faster.
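Putting the whole scheme together on a toy example (a sketch — the term counts, query, and truncation level k are made up for illustration):

```python
# Contextual search: exact cosine scores q^T A e_j / (||q|| ||A e_j||)
# versus scores from a rank-k truncated SVD.
import numpy as np

# rows = terms T1..T5, columns = documents D1..D4 (made-up counts)
A = np.array([[2, 0, 1, 0],
              [0, 3, 0, 1],
              [1, 1, 0, 0],
              [0, 0, 2, 2],
              [1, 0, 0, 1]], dtype=float)
q = np.array([1.0, 0, 1, 0, 0])          # query on terms T1 and T3

# Exact scores: cos(theta_j) = q^T A e_j / (||q||_2 ||A e_j||_2)
exact = (q @ A) / (np.linalg.norm(q) * np.linalg.norm(A, axis=0))

# Truncated scores: A_k = U_1 diag(sigma_1..k) V_1^T, with the columns of
# S = diag(sigma_1..k) V_1^T playing the role of the vectors s_j.
k = 2
U, s, Vh = np.linalg.svd(A)
S = U[:, :k].T @ A                        # equals diag(s[:k]) @ Vh[:k]
approx = (q @ U[:, :k] @ S) / (np.linalg.norm(q) * np.linalg.norm(S, axis=0))

print(np.round(exact, 3))
print(np.round(approx, 3))
```

Only the k-dimensional vectors s_j and the small matrix U_1^T q are needed per query, which is what makes repeated searches cheap.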