# CSE 6740 Lecture 9: How Can I Reduce/Relate the Data Points? (Association and Clustering)

Alexander Gray ([email protected]), Georgia Institute of Technology

## Today

1. Clustering
2. Associations

Central tasks for "data mining".
## Clustering Methods

"Show me the sub-groups in the data."

## Clustering

Why show sub-groups in the data?

Sometimes:
- Computational reasons (e.g., use the cluster centers instead of the full dataset)
- Statistical reasons (e.g., identify/remove outliers)

Mainly: visualization/understanding reasons.
## Procedural Methods

When we can speak of a true underlying function (as we do in most density estimation, classification, and regression methods), we can discuss error, error bounds, generalization (minimizing error on future data), what happens to the error as we get more data, and so on. In other words, we can leverage all the powerful tools of statistics we have discussed.

## Procedural Methods

I will call a method that has not been formally related to some function of the underlying density a *procedural method*. Such methods turn out to be common in clustering and density estimation. Though this makes it hard or impossible to say much about them analytically, they are nonetheless often useful in practice.
## Mixture of Gaussians

Treat clustering as a density estimation problem, where each Gaussian component is a cluster.

## Mixture of Gaussians

Again:
- Task: density estimation
- Model class: set of all possible mixtures of Gaussians with K components
- Loss: likelihood
- Optimizer: EM algorithm
- Generalization mechanism: cross-validation
- Evaluation algorithm: N-body
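The E-step/M-step structure of the EM optimizer named above can be sketched for the one-dimensional case. This is a minimal illustration, not the lecture's code: the function name `em_gmm_1d` and the quantile-based initialization are my own choices, and a practical implementation would handle multivariate covariances, degenerate components, and convergence checks.

```python
import numpy as np

def em_gmm_1d(x, K, n_iter=100):
    """Minimal EM for a one-dimensional mixture of K Gaussians.

    Returns mixing weights pi, component means mu, and variances var.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Deterministic initialization: means at evenly spaced sample quantiles.
    mu = np.quantile(x, (np.arange(K) + 0.5) / K)
    var = np.full(K, x.var())
    pi = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        # E-step: responsibility r[i, k] is proportional to
        # pi_k * N(x_i | mu_k, var_k), normalized over k.
        diff = x[:, None] - mu[None, :]
        dens = np.exp(-0.5 * diff**2 / var) / np.sqrt(2.0 * np.pi * var)
        r = pi * dens
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from the responsibilities.
        Nk = r.sum(axis=0)
        pi = Nk / n
        mu = (r * x[:, None]).sum(axis=0) / Nk
        var = (r * (x[:, None] - mu)**2).sum(axis=0) / Nk
    return pi, mu, var
```

Each iteration cannot decrease the likelihood, which is the loss named in the recipe above; the number of components K would be chosen by the stated generalization mechanism, cross-validation.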
## Sum-of-Squares Minimization

Here's a simpler method, which cannot be described as relating to some function of the underlying distribution of the data. We'll seek a partitioning of the points into $K$ disjoint subsets $C_k$, each containing $N_k$ points, such that the following sum-of-squares objective function is minimized:

$$\sum_{k=1}^{K} \sum_{i \in C_k} \| x_i - \mu_k \|^2 \qquad (1)$$

where $\mu_k = \frac{1}{N_k} \sum_{i \in C_k} x_i$ is the mean of the points in set $C_k$. $C(x_i) = C_k$ will denote that the class of $x_i$ is $C_k$.
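The objective in equation (1) can be computed directly from an assignment of points to clusters. A small NumPy sketch (the helper name `sum_of_squares` is hypothetical, not from the lecture):

```python
import numpy as np

def sum_of_squares(X, assign, K):
    """Objective (1): sum over clusters k of the squared distances
    from each assigned point to that cluster's mean mu_k."""
    X = np.asarray(X, dtype=float)
    total = 0.0
    for k in range(K):
        pts = X[assign == k]          # the points in C_k
        if len(pts) == 0:
            continue                  # empty cluster contributes nothing
        mu_k = pts.mean(axis=0)       # mu_k = (1/N_k) * sum of points in C_k
        total += ((pts - mu_k) ** 2).sum()
    return total
```

For example, two clusters of two points each, with every point one unit from its cluster mean, give an objective of 4.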

## K-means

The K-means method is as follows. First initialize the means $\mu_k$ somehow, for example by choosing $K$ different points randomly. Then:

1. Assign each point according to $C(x_i) = \arg\min_k \| x_i - \mu_k \|$.
2. Recompute each $\mu_k$ according to the new assignments.

Stop when no assignments change. This process can be shown to never increase the sum-of-squares. However, it does not necessarily obtain the global optimum. In practice, this is done, say, 10 times, and the result with the lowest sum-of-squares is used.
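The procedure above, including the best-of-several-restarts practice, can be sketched in NumPy. A minimal illustration, not a production implementation; the function name and defaults are my own:

```python
import numpy as np

def kmeans(X, K, n_restarts=10, max_iter=100, seed=0):
    """Lloyd's K-means with random restarts.

    Returns (sum_of_squares, assignments, means) for the restart
    achieving the lowest sum-of-squares objective.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    best = (np.inf, None, None)
    for _ in range(n_restarts):
        # Initialize the means at K distinct randomly chosen data points.
        mu = X[rng.choice(len(X), size=K, replace=False)].copy()
        assign = np.full(len(X), -1)
        for _ in range(max_iter):
            # Step 1: assign each point to its nearest mean.
            d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
            new_assign = d2.argmin(axis=1)
            if np.array_equal(new_assign, assign):
                break  # no assignment changed: converged
            assign = new_assign
            # Step 2: recompute each mean from its assigned points.
            for k in range(K):
                pts = X[assign == k]
                if len(pts) > 0:
                    mu[k] = pts.mean(axis=0)
        ss = d2[np.arange(len(X)), assign].sum()
        if ss < best[0]:
            best = (ss, assign, mu)
    return best
```

Each restart can converge to a different local optimum of the sum-of-squares, which is why the lecture suggests keeping the best of, say, 10 runs.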