Question

Please explain how you would approach this question. Thanks a lot.

1. Consider the following $5 \times 2$ data matrix. Think of the rows as observations and columns as feature values for each observation.

$$X_0 = \begin{pmatrix} 0 & 0 \\ 0 & 2 \\ 2 & 0 \\ 2 & 2 \\ x & y \end{pmatrix} \in \mathbb{R}^{5 \times 2}, \qquad (x, y) \in [0, 2]^2$$

Write $x_0^i$ for the $i$th column of $X_0$ and $\mu_i$ for its average, $i = 1, 2$. $x, y$ are scalars that can take any values between 0 and 2. In Q1, fix $(x, y) = (1, 1)$.

(a) (2 pt) Write out the centered version of $X_0$, calling it $X$. In other words, $x^1 = x_0^1 - \mu_1$ is the first column of $X$.

(b) (2 pt) Write out $X^t X$. Hint: Remember the $ij$-th entry of this matrix is the inner product of columns $x^i$ and $x^j$. This should make it easy to calculate. For example, the 11 entry of this matrix (top left corner) is $(x^1)^t (x^1)$.

(c) (2 pt) First, show the column vector $v_1 = (1, 0)^t$ is an eigenvector for $X^t X$ and write its corresponding eigenvalue, which you should call $\lambda_1$. Remember, $v$ is an eigenvector for a matrix $A$ with eigenvalue $\lambda$ if $Av = \lambda v$. Second, find a second eigenvector for $X^t X$ and its corresponding eigenvalue $\lambda_2$.

(d) (4 pt) Say the SVD of $X$ is $X = U D V^t$. Remember from class that $X^t X = V D^2 V^t$, where $D^2$ is the diagonal matrix in which each entry of $D$ is squared. First, use (c) to justify the fact that

$$V = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$

Second, find an explicit expression for $D$. You should be specific in your explanations.

(e) (5 pt) Compute the PC-$k$ variance for $k = 1, 2$ principal component directions, using your answers to (c), (d). Look up the formula in lecture slides if you have forgotten.

(f) (7 pt) Go to the website http://setosa.io/ev/principal-component-analysis/ and look at the 2-dimensional PCA example. Move the points in the left-hand display to match the coordinates of $X_0$ with $(x, y) = (1, 1)$. When done correctly, the left display should have a square with dots at the four corners and one at the middle. The L shape in the display should have its red line horizontal and its green line vertical.

Explanation of right-hand display: The x-axis coordinates show the dot products of the first principal component vector with each of the centered observation vectors (rows of $X$), and the y-axis shows the dot products with the second principal component. Note the L in the left-hand display is rotated in the right-hand display so that the red base of the L is horizontal, corresponding to the first PC direction, and the green left-hand side of the L is vertical, corresponding to the second PC direction. Therefore, looking at how spread out the points are along the x-axis compared to the y-axis shows the variation along the PC1 and PC2 directions respectively.

The right-hand display should be a square with corners at $(1, 1)$, $(1, -1)$ etc. and center zero. Use your answers to (c) through (d) to select all correct statements about the right-hand display on the 2d example obtained from your representation of $X_0$ on the plot.

The right-hand display...

( ) looks exactly like a plot of $X$.
( ) looks exactly like a plot of $X_0$.
( ) shows the vertical direction (pointing up) of $X_0$ has more variation than the horizontal direction.
( ) shows the horizontal (pointing right) direction of $X_0$ has more variation than the vertical direction.
( ) shows the horizontal and vertical directions of $X_0$ have the same variation.
( ) shows that the projections of the observation vectors in $X$ onto the PC1 and PC2 directions are projections onto the original axes.
( ) shows that all of the total variance of the data is explained by the first principal component.
( ) shows that half of the total variance of the data is explained by the first principal component.
( ) shows that about a quarter of the total variance of the data is explained by the second principal component.
( ) shows that the matrix of principal component directions times $X$ gives a rotation of $X$, including possibly the identity where nothing moves.
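
One way to approach parts (a) through (d) is to check each hand calculation numerically. The sketch below is only an illustration in plain Python, assuming $(x, y) = (1, 1)$. The helper names `center`, `gram`, and `matvec` are hypothetical and not part of the assignment. It centers the columns, builds $X^t X$ from column inner products as the hint in (b) suggests, tests the eigenvector condition $Av = \lambda v$ from (c), and takes square roots of the eigenvalues to get the diagonal of $D$ from the relation $X^t X = V D^2 V^t$ in (d).

```python
import math

# Data matrix X0 with (x, y) = (1, 1); rows are observations.
X0 = [[0, 0], [0, 2], [2, 0], [2, 2], [1, 1]]

def center(rows):
    """Subtract each column's mean from that column (part a)."""
    n = len(rows)
    means = [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]
    return [[r[j] - means[j] for j in range(len(r))] for r in rows]

def gram(rows):
    """Return X^t X: entry (i, j) is the inner product of columns i and j (part b)."""
    p = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]

def matvec(a, v):
    """Multiply matrix a by vector v."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]

X = center(X0)
XtX = gram(X)

# Part (c): v is an eigenvector if XtX v is a scalar multiple of v.
v1 = [1, 0]
Av1 = matvec(XtX, v1)
lambda1 = Av1[0] / v1[0]       # ratio along the nonzero coordinate of v1

v2 = [0, -1]                   # second column of the V stated in part (d)
Av2 = matvec(XtX, v2)
lambda2 = Av2[1] / v2[1]       # ratio along the nonzero coordinate of v2

# Part (d): X^t X = V D^2 V^t, so the diagonal of D holds sqrt(eigenvalues).
D_diag = [math.sqrt(lambda1), math.sqrt(lambda2)]

print("X      =", X)
print("X^t X  =", XtX)
print("lambdas =", lambda1, lambda2)
print("diag(D) =", D_diag)
```

The PC-$k$ variance in (e) is left out on purpose, since its normalization depends on the formula given in the lecture slides. The printed values are meant to be compared against the hand-written answers rather than to replace the justification the question asks for.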
