CS4487 - Machine Learning
Lecture 7 - Linear Dimensionality Reduction
Dr. Antoni B. Chan, Dept. of Computer Science, City University of Hong Kong

Outline
1. Linear Dimensionality Reduction for Vectors
   A. Principal Component Analysis (PCA)
   B. Random Projections
   C. Fisher's Linear Discriminant (FLD)
2. Linear Dimensionality Reduction for Text
   A. Latent Semantic Analysis (LSA)
   B. Non-negative Matrix Factorization (NMF)
   C. Latent Dirichlet Allocation (LDA)

Dimensionality Reduction
Goal: transform high-dimensional vectors into low-dimensional vectors.
- Dimensions in the low-dim data represent co-occurring features in the high-dim data.
- Dimensions in the low-dim data may have semantic meaning.
Example: document analysis
- high-dim: bag-of-words vectors of documents
- low-dim: each dimension represents similarity to a topic.
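As a concrete illustration of the document-analysis example, the short sketch below builds bag-of-words count vectors for a few toy documents and projects them onto two latent dimensions using scikit-learn's TruncatedSVD (the technique typically used for LSA, which appears later in the outline). The toy corpus and the choice of two components are assumptions made only for illustration, not part of the original slides.

```python
# A minimal sketch of the document-analysis example: bag-of-words vectors
# projected onto a small number of latent "topic-like" dimensions.
# The toy corpus and the use of TruncatedSVD (as in LSA) are illustrative assumptions.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import TruncatedSVD

docs = [
    "the cat sat on the mat",
    "dogs and cats are pets",
    "stocks fell as markets closed",
    "investors sold stocks and bonds",
]

# High-dimensional representation: bag-of-words counts (one dimension per word).
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)      # shape: (n_docs, n_words)

# Low-dimensional representation: 2 latent dimensions; each roughly measures
# how strongly a document relates to a latent "topic".
svd = TruncatedSVD(n_components=2, random_state=0)
Z = svd.fit_transform(X)                # shape: (n_docs, 2)

print(X.shape, "->", Z.shape)
print(Z)                                # low-dimensional document coordinates
```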
Example: image analysis
- approximate an image as a weighted combination of several basis images
- represent the image as the weights.
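A small numpy sketch of this idea (not from the slides; the tiny random "basis images" and weights are made up for illustration): an image is approximated as a weighted sum of basis images, and only the weights are kept as its low-dimensional representation.

```python
# A minimal sketch of the image example: approximate an image as a weighted
# combination of basis images and represent it by the weights alone.
# The tiny 4x4 "images" (flattened to length-16 vectors) are made up.
import numpy as np

rng = np.random.default_rng(0)

# Two basis images, each flattened to a 16-dimensional vector.
basis = rng.standard_normal((2, 16))            # shape: (p, d), p=2, d=16

# An image that lies close to the span of the basis images.
true_weights = np.array([0.7, -1.2])
image = true_weights @ basis + 0.01 * rng.standard_normal(16)

# Low-dimensional representation: least-squares weights for the basis images.
weights, *_ = np.linalg.lstsq(basis.T, image, rcond=None)   # shape: (p,)

# Reconstruction from the weights: a weighted combination of the basis images.
reconstruction = weights @ basis

print("weights:", weights)
print("reconstruction error:", np.linalg.norm(image - reconstruction))
```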
Reasons for Dimensionality Reduction
- Preprocessing: make the dataset easier to use.
- Reduce the computational cost of running machine learning algorithms.
- Remove noise.
- Make the results easier to understand (visualization).

Linear Dimensionality Reduction
- Project the original data onto a lower-dimensional hyperplane (e.g., a line or plane), i.e., move and rotate the coordinate axes of the data, and represent the data by its coordinates in the new component space.
- Equivalently, approximate each data point as a linear combination of basis vectors (components) in the original space:
  - original data point: $x \in \mathbb{R}^d$
  - approximation: $\hat{x} = \sum_{j=1}^{p} w_j v_j$, where $v_j \in \mathbb{R}^d$ is a basis vector and $w_j$ is the corresponding weight
  - the data point is then represented by its weights $w = [w_1, \ldots, w_p]$ (see the sketch below).
- There are several methods for linear dimensionality reduction. They differ in:
  - goal (reconstruction vs. classification)
  - unsupervised vs. supervised
  - constraints on the basis vectors and the weights
  - reconstruction error criteria

Principal Component Analysis (PCA)
- Unsupervised method.
- Goal: preserve the variance of the data as much as possible.
  - Choose basis vectors along the directions of maximum variance (longest extent) of the data.
  - The basis vectors are called principal components (PCs).
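To make the basis-vector representation $\hat{x} = \sum_{j=1}^{p} w_j v_j$ concrete, here is a small numpy sketch (not from the slides). It assumes an orthonormal basis, in which case each weight is simply a dot product $w_j = v_j^\top x$; the specific 3-dimensional data point and two basis vectors are made up for illustration.

```python
# A minimal sketch of the linear dimensionality reduction formulation:
# x_hat = sum_j w_j * v_j, with the data point represented by the weights w.
# Assumes an orthonormal basis; the data point and basis below are made up.
import numpy as np

x = np.array([2.0, 1.0, 0.5])        # original data point, x in R^d (d = 3)

# Two orthonormal basis vectors (p = 2), stacked as rows of V.
V = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])      # shape: (p, d)

# Weights: for an orthonormal basis, w_j = v_j . x
w = V @ x                            # shape: (p,)  -> low-dim representation

# Reconstruction: x_hat = sum_j w_j v_j
x_hat = w @ V                        # shape: (d,)

print("weights w:", w)                                       # [2.0, 1.0]
print("approximation x_hat:", x_hat)                         # [2.0, 1.0, 0.0]
print("reconstruction error:", np.linalg.norm(x - x_hat))    # 0.5
```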
Goal (equivalently): minimize the reconstruction error $\sum_i \| x_i - \hat{x}_i \|^2$ over all the data points $x_i$, where the reconstruction is $\hat{x}_i = \sum_{j=1}^{p} w_{ij} v_j$.
Constraint: the principal components are orthogonal (perpendicular) to each other.

PCA algorithm
1) Subtract the mean of the data.
2) The first PC is the direction that explains the most variance of the data.
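The preview cuts off before the remaining steps, but as a hedged illustration of steps 1-2 and the reconstruction-error view, the numpy sketch below centers a toy 2-D dataset, takes the principal components from the eigenvectors of the covariance matrix, and reconstructs the data from the top component. The synthetic data and the choice of keeping one component are assumptions for illustration.

```python
# A minimal PCA sketch following the steps on the slide:
# 1) subtract the mean, 2) take the direction of maximum variance as the first PC,
# then project and reconstruct. The toy 2-D data and keeping p=1 component are
# illustrative assumptions, not part of the original slides.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 points in R^2, stretched along one direction.
X = rng.standard_normal((200, 2)) @ np.array([[3.0, 0.0], [0.0, 0.5]])

# 1) Subtract the mean of the data.
mean = X.mean(axis=0)
Xc = X - mean

# 2) Principal components = eigenvectors of the covariance matrix,
#    ordered by eigenvalue (variance explained), largest first.
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]
V = eigvecs[:, order].T                     # rows are orthonormal PCs

# Keep the top p = 1 component; the weights are projections onto it.
p = 1
W = Xc @ V[:p].T                            # shape: (n, p)  low-dim representation

# Reconstruction from the weights (add the mean back).
X_hat = W @ V[:p] + mean

print("variance explained by PC1: %.2f%%"
      % (100 * eigvals[order][0] / eigvals.sum()))
print("mean squared reconstruction error:",
      np.mean(np.sum((X - X_hat) ** 2, axis=1)))
```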