CS229 Lecture notes
Andrew Ng

Part XI
Principal components analysis

In our discussion of factor analysis, we gave a way to model data $x \in \mathbb{R}^n$ as "approximately" lying in some $k$-dimensional subspace, where $k \ll n$. Specifically, we imagined that each point $x^{(i)}$ was created by first generating some $z^{(i)}$ lying in the $k$-dimensional affine space $\{\Lambda z + \mu;\ z \in \mathbb{R}^k\}$, and then adding $\Psi$-covariance noise. Factor analysis is based on a probabilistic model, and parameter estimation used the iterative EM algorithm. In this set of notes, we will develop a method, Principal Components Analysis (PCA), that also tries to identify the subspace in which the data approximately lies. However, PCA will do so more directly, and will require only an eigenvector calculation (easily done with the eig function in Matlab), and does not need to resort to EM.

Suppose we are given a dataset $\{x^{(i)}; i = 1, \ldots, m\}$ of attributes of $m$ different types of automobiles, such as their maximum speed, turn radius, and so on. Let $x^{(i)} \in \mathbb{R}^n$ for each $i$ ($n \ll m$). But unknown to us, two different attributes, some $x_i$ and $x_j$, respectively give a car's maximum speed measured in miles per hour, and the maximum speed measured in kilometers per hour. These two attributes are therefore almost linearly dependent, up to only small differences introduced by rounding off to the nearest mph or kph. Thus, the data really lies approximately on an $(n-1)$-dimensional subspace. How can we automatically detect, and perhaps remove, this redundancy?

For a less contrived example, consider a dataset resulting from a survey of pilots for radio-controlled helicopters, where $x_1^{(i)}$ is a measure of the piloting skill of pilot $i$, and $x_2^{(i)}$ captures how much he/she enjoys flying. Because RC helicopters are very difficult to fly, only the most committed students, ones that truly enjoy flying, become good pilots. So, the two attributes $x_1$ and $x_2$ are strongly correlated. Indeed, we might posit that the data actually lies along some diagonal axis (the $u_1$ direction) capturing the intrinsic piloting "karma" of a person, with only a small amount of noise lying off this axis. (See figure.) How can we automatically compute this $u_1$ direction?
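The notes develop the algorithm in the sections that follow; as a preview of the eigenvector calculation mentioned above, here is a minimal NumPy sketch. The function name pca, the data matrix X (one example per row), and the normalization step are illustrative assumptions, not taken from this excerpt:

import numpy as np

def pca(X, k):
    """Top-k principal directions of X (shape (m, n), one example per row),
    computed by eigendecomposition of the empirical covariance matrix."""
    # Normalize each attribute to zero mean and unit variance so that
    # attributes on different scales (e.g. mph vs. kph) are comparable.
    # (Assumes no attribute is constant across the dataset.)
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    # Empirical covariance matrix: Sigma = (1/m) * X^T X.
    m = X.shape[0]
    Sigma = X.T @ X / m
    # eigh is for symmetric matrices and returns eigenvalues in
    # ascending order; flip the columns to take the largest k.
    eigvals, eigvecs = np.linalg.eigh(Sigma)
    return eigvecs[:, ::-1][:, :k]  # columns are u_1, ..., u_k

For the pilot-survey example, pca(X, 1) would recover the diagonal "karma" axis along which the two attributes co-vary, and in the automobile example the near-zero trailing eigenvalue of Sigma would flag the mph/kph redundancy.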