K Mean clustering2011 - 2/22/2011 Clustering An...

2/22/2011

An Introduction to K-Means, EM, MoG,…

Clustering

What is Clustering?

• Organizing data into classes such that there is
  • high intra-class similarity
  • low inter-class similarity
• Finding the class labels and the number of classes directly from the data (in contrast to classification).
• More informally, finding natural groupings among objects.

Also called unsupervised learning; sometimes called classification* by statisticians, sorting by psychologists, and segmentation by people in marketing.

* Classification is used when you know the subclasses and need to give a newcomer a subclass label; clustering is used when you have no prior information on the subclasses (and you "divide" the data).
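The two criteria in the definition of clustering (high intra-class similarity, low inter-class similarity) can be checked numerically on a small example. The sketch below is only an illustration: the data values, the cluster names "a" and "b", and the helper names are all hypothetical, and absolute difference on 1-D points is an assumed stand-in for whatever dissimilarity suits real data.

```python
from itertools import combinations, product

# Hypothetical toy data: two groups of 1-D points (values assumed for illustration).
clusters = {
    "a": [1.0, 1.2, 0.8],
    "b": [5.0, 5.3, 4.9],
}


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def intra_class_distance(clusters):
    """Average distance between pairs of points in the same cluster."""
    return mean(
        abs(x - y)
        for points in clusters.values()
        for x, y in combinations(points, 2)
    )


def inter_class_distance(clusters):
    """Average distance between pairs of points in different clusters."""
    return mean(
        abs(x - y)
        for (_, p), (_, q) in combinations(clusters.items(), 2)
        for x, y in product(p, q)
    )


if __name__ == "__main__":
    print("intra-class:", intra_class_distance(clusters))
    print("inter-class:", inter_class_distance(clusters))
```

A grouping that fits the definition should give a small intra-class distance (high similarity within a class) and a large inter-class distance (low similarity between classes).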
What is a natural grouping among these objects?

[Figure: groupings labeled School Employees, Simpson's Family, Males, Females]

Clustering is subjective.
What is Similarity?

"The quality or state of being similar; likeness; resemblance; as, a similarity of features." (Webster's Dictionary)

Similarity is hard to define, but… "We know it when we see it."

The real meaning of similarity is a philosophical question. We will take a more pragmatic approach.

Defining Distance Measures

Definition: Let O1 and O2 be two objects from the universe of possible objects. The distance (dissimilarity) between O1 and O2 is a real number denoted by D(O1, O2).

[Figure: the objects "Peter" and "Piotr" with example distance values 0.23, 3, 342.7]
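To make the definition concrete, here is a minimal sketch of one possible distance measure D(O1, O2) for strings. It assumes the Levenshtein edit distance (the minimum number of single-character insertions, deletions, and substitutions); the source does not say which measures produced the values shown with "Peter" and "Piotr", so the choice of measure and the function name `edit_distance` are assumptions made for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between strings a and b (assumed measure)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string is i deletions
        for j, cb in enumerate(b, start=1):
            substitution_cost = 0 if ca == cb else 1
            curr.append(
                min(
                    prev[j] + 1,                      # delete ca from a
                    curr[j - 1] + 1,                  # insert cb into a
                    prev[j - 1] + substitution_cost,  # match or substitute
                )
            )
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("Peter", "Piotr"))  # prints 3
```

Under this measure the distance between "Peter" and "Piotr" is 3, a real number as the definition requires. A different measure applied to the same two objects would generally return a different number.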

This note was uploaded on 01/16/2012 for the course MAD 4103 taught by Professor Li during the Spring '11 term at University of Central Florida.
