lecture9 - CSE 6740 Lecture 9 How Can I Reduce/Relate the...

CSE 6740 Lecture 9
How Can I Reduce/Relate the Data Points? (Association and Clustering)
Alexander Gray
agray@cc.gatech.edu
Georgia Institute of Technology

Today
1. Clustering
2. Associations
Central tasks for data mining.

Clustering Methods
Show me the sub-groups in the data.

Clustering
Why show sub-groups in the data?
Sometimes:
- Computational reasons (e.g. use cluster centers instead of the dataset)
- Statistical reasons (e.g. identify/remove outliers)
Mainly:
- Visualization/understanding reasons

Procedural Methods
When we can speak of a true underlying function (as we do in most density estimation, classification, and regression methods), we can discuss error, error bounds, generalization (minimizing error on future data), what happens to the error as we get more data, etc. In other words, we can leverage all the powerful tools of statistics we have discussed.

Procedural Methods
I will call a method which has not been formally related to some function of the underlying density a procedural method. This turns out to be common in clustering and density estimation methods. Though this makes it hard or impossible to say much about these methods analytically, they are nonetheless often still useful in practice.

Mixture of Gaussians
Treat clustering as a density estimation problem, where each Gaussian is a cluster.

Mixture of Gaussians
Again:
Task: density estimation
Model class: set of all possible mixtures of Gaussians with K components
Loss: Likelihood
Optimizer: EM algorithm
Generalization mechanism: Cross-validation
Evaluation algorithm: N-body

Sum-of-Squares Minimization
Here's a simpler method, which cannot be described as relating to some function of the underlying distribution of the data.
We'll seek a partitioning of the points into $K$ disjoint subsets $C_k$, each containing $N_k$ points, such that the following sum-of-squares objective function is minimized:

$$\sum_{k=1}^{K} \sum_{i \in C_k} \| x_i - \mu_k \|^2 \tag{1}$$

where $\mu_k = \frac{1}{N_k} \sum_{i \in C_k} x_i$ is the mean of the points in set $C_k$. $C(x_i) = C_k$ will denote that the class of $x_i$ is $C_k$.

K-means
The K-means method is as follows: First initialize the means $\mu_k$ somehow, for example by choosing $K$ different points randomly. Then:
1. Assign each point according to $C(x_i) = \arg\min_k \| x_i - \mu_k \|$.
2. Recompute each $\mu_k$ according to the new assignments.
Stop when no assignments change....
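
The two K-means steps and the sum-of-squares objective (1) can be sketched in plain Python as follows. This is a minimal illustration, not the lecture's own code. The function names, the `seed` and `max_iters` parameters, the tuple representation of points, and the rule that an empty cluster keeps its old mean are all assumptions made for this example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance ||a - b||^2 between two equal-length tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def sum_of_squares(points, assignments, means):
    """Objective (1): total squared distance from each point to its cluster mean."""
    return sum(squared_distance(x, means[k]) for x, k in zip(points, assignments))


def kmeans(points, K, seed=0, max_iters=100):
    """Plain K-means on a list of tuples; returns (assignments, means).

    seed and max_iters are illustrative additions (reproducibility and a
    safety cap on the loop); they are not part of the method as stated.
    """
    rng = random.Random(seed)
    # Initialize the means by choosing K different points at random
    # (distinct positions in the list; assumes the points themselves differ).
    means = rng.sample(points, K)
    assignments = None
    for _ in range(max_iters):
        # Step 1: assign each point to the index of its nearest mean.
        # Minimizing the squared distance gives the same arg min as the distance.
        new_assignments = [
            min(range(K), key=lambda k: squared_distance(x, means[k]))
            for x in points
        ]
        # Stop when no assignments change.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Step 2: recompute each mean from the new assignments.
        for k in range(K):
            members = [x for x, a in zip(points, assignments) if a == k]
            if members:  # assumption: an empty cluster keeps its old mean
                means[k] = mean_point(members)
    return assignments, means


# Usage: two well-separated groups of three points each (made-up data).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
assignments, means = kmeans(points, K=2)
print(assignments, means, sum_of_squares(points, assignments, means))
```

Which group receives label 0 depends on the random initialization, so only the partition itself is meaningful, not the label values.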