lect10

Lecture outline
- Dimensionality reduction: SVD/PCA, CUR decompositions
- Nearest-neighbor search in low dimensions: kd-trees

Datasets in the form of matrices
We are given n objects and d features describing the objects (each object has d numeric values describing it). The dataset is an n-by-d matrix A, where A_ij shows the "importance" of feature j for object i. Every row of A represents an object.
Goals:
1. Understand the structure of the data, e.g., the underlying process generating the data.
2. Reduce the number of features representing the data.
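To make the setting concrete, here is a minimal sketch in Python/NumPy (not part of the lecture; the numbers are made up) of an n-by-d data matrix whose rows are objects and whose columns are features:

    import numpy as np

    # Hypothetical toy dataset: n = 4 objects described by d = 3 numeric features.
    # Row i is object i; entry A[i, j] is the "importance" of feature j for object i.
    A = np.array([
        [2.0, 0.0, 1.0],   # object 0
        [1.0, 3.0, 0.0],   # object 1
        [0.0, 1.0, 4.0],   # object 2
        [2.0, 2.0, 2.0],   # object 3
    ])

    n, d = A.shape     # n objects, d features
    print(n, d)        # -> 4 3
    print(A[1])        # the feature vector (row) describing object 1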
Market-basket matrices
Rows: n customers; columns: d products (e.g., milk, bread, wine, etc.). A_ij = quantity of the j-th product purchased by the i-th customer.
Goal: find a subset of the products that characterizes customer behavior.

Social-network matrices
Rows: n users; columns: d groups (e.g., BU group, opera, etc.). A_ij = participation of the i-th user in the j-th group.
Goal: find a subset of the groups that accurately clusters social-network users.
Document matrices
Rows: n documents; columns: d terms (e.g., theorem, proof, etc.). A_ij = frequency of the j-th term in the i-th document.
Goal: find a subset of the terms that accurately clusters the documents.
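As an illustration of such a term-document matrix (an addition of mine, not from the slides), scikit-learn's CountVectorizer builds exactly this kind of frequency matrix from a toy corpus:

    from sklearn.feature_extraction.text import CountVectorizer

    # Hypothetical corpus: n = 3 short documents.
    docs = [
        "theorem proof theorem",
        "proof of the main theorem",
        "experiments and results",
    ]

    vectorizer = CountVectorizer()
    A = vectorizer.fit_transform(docs).toarray()   # n-by-d matrix: A[i, j] = count of term j in document i

    print(vectorizer.get_feature_names_out())      # the d terms (columns of A)
    print(A)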

Recommendation systems
Rows: n customers; columns: d products. A_ij = frequency with which the j-th product is bought by the i-th customer.
Goal: find a subset of the products that accurately describes the behavior of the customers.
The Singular Value Decomposition (SVD)
[Figure: objects x and d shown as vectors in the plane spanned by feature 1 and feature 2.]
Data matrices have n rows (one for each object) and d columns (one for each feature). Rows are vectors in a Euclidean space. Two objects are "close" if the angle between their corresponding vectors is small.
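Closeness by angle corresponds to cosine similarity between the corresponding rows; a minimal sketch with made-up row vectors:

    import numpy as np

    def cosine_similarity(u, v):
        # Cosine of the angle between two row vectors; values near 1 mean a small angle.
        return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

    x = np.array([2.0, 1.0, 0.0])    # row of A for object x (made up)
    y = np.array([4.0, 2.1, 0.0])    # nearly parallel to x -> small angle, "close"
    z = np.array([0.0, 0.0, 3.0])    # orthogonal to x -> large angle, "far"

    print(cosine_similarity(x, y))   # close to 1
    print(cosine_similarity(x, z))   # 0.0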

SVD: Example
[Figure: scatter plot of 2-dimensional points with the 1st and 2nd right singular vectors overlaid.]
Input: 2-dimensional points.
Output:
- 1st (right) singular vector: direction of maximal variance.
- 2nd (right) singular vector: direction of maximal variance after removing the projection of the data along the first singular vector.
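A minimal NumPy sketch of this example (the 2-d points are made up; the data are mean-centered first so that the right singular vectors line up with directions of maximal variance, as in PCA):

    import numpy as np

    # Hypothetical 2-d points (n rows, 2 columns), lying roughly along a line.
    X = np.array([[4.0, 2.1], [4.5, 2.9], [5.0, 3.4], [5.5, 4.2], [6.0, 4.9]])

    Xc = X - X.mean(axis=0)                            # mean-center the data
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

    v1, v2 = Vt[0], Vt[1]     # 1st and 2nd right singular vectors
    print(v1)                 # direction of maximal variance
    print(v2)                 # orthogonal direction capturing the remaining variance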
Singular values
σ1 measures how much of the data variance is explained by the first singular vector.
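Continuing the sketch above, the fraction of variance explained by the first singular vector can be read off the singular values, since for mean-centered data the total variance is proportional to the sum of the squared singular values:

    import numpy as np

    # Same made-up points as in the previous sketch.
    X = np.array([[4.0, 2.1], [4.5, 2.9], [5.0, 3.4], [5.5, 4.2], [6.0, 4.9]])
    S = np.linalg.svd(X - X.mean(axis=0), compute_uv=False)

    explained = S**2 / np.sum(S**2)   # sigma_i^2 / sum_j sigma_j^2
    print(explained[0])               # share of the variance explained by the 1st singular vector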