lect10 - Lecture outline: Dimensionality reduction, SVD/PCA
Lecture outline
- Dimensionality reduction: SVD/PCA, CUR decompositions
- Nearest-neighbor search in low dimensions: kd-trees
Datasets in the form of matrices
We are given n objects and d features describing the objects (each object has d numeric values describing it).
Dataset: an n-by-d matrix A, where A_ij shows the "importance" of feature j for object i. Every row of A represents an object.
Goals:
1. Understand the structure of the data, e.g., the underlying process generating the data.
2. Reduce the number of features representing the data.
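As a minimal sketch (NumPy, with made-up numbers), such a dataset is simply a 2-d array whose rows are objects and whose columns are features:

```python
import numpy as np

# Hypothetical dataset: n = 4 objects, d = 3 features.
# A[i, j] = "importance" of feature j for object i.
A = np.array([
    [2.0, 0.0, 1.0],   # object 0
    [1.5, 0.5, 0.0],   # object 1
    [0.0, 3.0, 1.0],   # object 2
    [0.5, 2.5, 0.5],   # object 3
])

n, d = A.shape
print(n, "objects,", d, "features")
```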
Market-basket matrices
n customers, d products (e.g., milk, bread, wine, etc.).
A_ij = quantity of the j-th product purchased by the i-th customer.
Find a subset of the products that characterizes customer behavior.
Social-network matrices
n users, d groups (e.g., BU group, opera, etc.).
A_ij = participation of the i-th user in the j-th group.
Find a subset of the groups that accurately clusters social-network users.
Document matrices
n documents, d terms (e.g., theorem, proof, etc.).
A_ij = frequency of the j-th term in the i-th document.
Find a subset of the terms that accurately clusters the documents.
Recommendation systems
n customers, d products.
A_ij = frequency with which the j-th product is bought by the i-th customer.
Find a subset of the products that accurately describes the behavior of the customers.
The Singular Value Decomposition (SVD)
(Figure: two objects, x and d, drawn as vectors in the (feature 1, feature 2) plane, with the angle (d, x) between them.)
Data matrices have n rows (one for each object) and d columns (one for each feature).
Rows are vectors in a Euclidean space; two objects are "close" if the angle between their corresponding vectors is small.
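As a small illustrative sketch (NumPy, made-up vectors), the closeness of two row vectors can be measured through the cosine of the angle between them:

```python
import numpy as np

def angle_between(x, d):
    """Angle (in radians) between two object vectors x and d."""
    cos_theta = np.dot(x, d) / (np.linalg.norm(x) * np.linalg.norm(d))
    # Clip to guard against floating-point values slightly outside [-1, 1].
    return np.arccos(np.clip(cos_theta, -1.0, 1.0))

# Two hypothetical objects described by the same three features.
x = np.array([4.0, 2.0, 0.5])
d = np.array([3.5, 2.5, 0.0])
print(angle_between(x, d))   # small angle => similar objects
```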
SVD: Example
(Figure: scatter plot of the 2-dimensional points with the 1st and 2nd right singular vectors drawn through them.)
Input: 2-dimensional points.
Output:
1st (right) singular vector: direction of maximal variance.
2nd (right) singular vector: direction of maximal variance, after removing the projection of the data along the first singular vector.
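A minimal sketch (NumPy, with random points standing in for the data in the figure): center the data and read the right singular vectors off the rows of V^T returned by the SVD.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical 2-d points, stretched along one direction.
pts = rng.normal(size=(100, 2)) @ np.array([[3.0, 0.0], [0.0, 0.5]])

A = pts - pts.mean(axis=0)            # center the data at the origin
U, S, Vt = np.linalg.svd(A, full_matrices=False)

v1, v2 = Vt[0], Vt[1]                 # 1st and 2nd right singular vectors
print("1st singular vector:", v1)     # direction of maximal variance
print("2nd singular vector:", v2)     # orthogonal direction of remaining variance
```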
Singular values
σ_1: measures how much of the data variance is explained by the first singular vector.
σ_2: measures how much of the data variance is explained by the second singular vector.
(Figure: the same 2-d scatter plot, with σ_1 marked along the 1st right singular vector and the 2nd right singular vector drawn orthogonal to it.)
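As a short sketch (NumPy, same hypothetical points as above), the fraction of variance explained by each direction can be computed from the singular values of the centered data:

```python
import numpy as np

rng = np.random.default_rng(0)
pts = rng.normal(size=(100, 2)) @ np.array([[3.0, 0.0], [0.0, 0.5]])  # hypothetical points
A = pts - pts.mean(axis=0)

_, S, _ = np.linalg.svd(A, full_matrices=False)   # S = [sigma_1, sigma_2]
explained = S**2 / np.sum(S**2)                   # fraction of variance per singular vector
print("sigma_1 explains", explained[0], "of the variance")
print("sigma_2 explains", explained[1], "of the variance")
```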
SVD decomposition
A = U Σ V^T, where:
- U: n-by-r matrix of left singular vectors (one row per object), with orthonormal columns;
- Σ: r-by-r diagonal matrix of the singular values σ_1 ≥ σ_2 ≥ ... ≥ σ_r > 0;
- V: d-by-r matrix of right singular vectors (one row per feature), with orthonormal columns;
- r: the rank of A.
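A minimal sketch (NumPy, with a hypothetical matrix) of the decomposition and of truncating it to the top k singular values for dimensionality reduction:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(6, 4))                    # hypothetical 6-by-4 data matrix

U, S, Vt = np.linalg.svd(A, full_matrices=False)
# Exact reconstruction: A = U @ diag(S) @ Vt
assert np.allclose(A, U @ np.diag(S) @ Vt)

k = 2                                          # keep only the top-k singular values/vectors
A_k = U[:, :k] @ np.diag(S[:k]) @ Vt[:k, :]    # best rank-k approximation of A
reduced = A @ Vt[:k, :].T                      # objects described by k features instead of d
print(reduced.shape)                           # (6, 2)
```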