dm5part2 - University of Florida CISE department Gator...


University of Florida CISE department
Gator Engineering

Clustering Part 2
Dr. Sanjay Ranka
Professor, Computer and Information Science and Engineering
University of Florida, Gainesville

Data Mining, Sanjay Ranka, Spring 2011

Partitional Clustering
[Figure panels: Original Points; A Partitional Clustering]

Hierarchical Clustering
[Figure panels: Traditional Hierarchical Clustering; Traditional Dendrogram]

Characteristics of Clustering Algorithms
• Type of clustering the algorithm produces:
  – Partitional versus hierarchical
  – Overlapping versus non-overlapping
  – Fuzzy versus non-fuzzy
  – Complete versus partial

Characteristics of Clustering Algorithms
• Type of clusters the algorithm seeks:
  – Well-separated, center-based, density-based or contiguity-based
  – Are the clusters found in the entire space or in a subspace?
  – Are the clusters relatively similar to one another, or are they of differing sizes, shapes and densities?

Characteristics of Clustering Algorithms
• Type of data the algorithm can handle:
  – Some clustering algorithms need a data matrix
    • The K-means algorithm assumes that it is meaningful to take the mean (average) of a set of data objects.
    • This makes sense for data that has continuous attributes and for document data, but not for record data that has categorical attributes.
  – Some clustering algorithms start from a proximity matrix
    • Typically assume symmetry
  – Does the data have noise and outliers?
  – Is the data high dimensional?
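The two input styles named on this slide (a data matrix that supports taking means, and a symmetric proximity matrix) can be illustrated with a small sketch. It is not taken from the slides: the helper names `centroid` and `proximity_matrix`, the sample points, and the choice of Euclidean distance are all assumptions made for illustration.

```python
import math

def centroid(points):
    """Component-wise mean of equal-length numeric tuples (hypothetical helper)."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def proximity_matrix(points):
    """Pairwise Euclidean distances (an assumed choice of proximity measure)."""
    return [[math.dist(p, q) for q in points] for p in points]

# Continuous attributes: the mean is well defined.
points = [(1.0, 2.0), (3.0, 4.0), (5.0, 0.0)]
center = centroid(points)

# A distance-based proximity matrix is symmetric: matrix[i][j] == matrix[j][i].
matrix = proximity_matrix(points)

# Categorical attributes: there is no meaningful average of labels,
# and in Python the attempt fails with a TypeError.
categorical = [("red",), ("blue",)]
try:
    centroid(categorical)
    mean_defined = True
except TypeError:
    mean_defined = False
```

In this sketch the mean works only because every attribute is numeric; categorical records would need a different notion of cluster representative or a proximity-based method.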
Characteristics of Clustering Algorithms
• How the algorithm operates:
  – Minimizing or maximizing a global objective function.
    • Enumerate all possible ways of dividing the points into clusters and evaluate the 'goodness' of each potential set of clusters by using the given objective function. (NP-hard)
    • Can have global or local objectives.
      – Hierarchical clustering algorithms typically have local objectives
      – Partitional algorithms typically have global objectives
  – A variation of the global objective function approach is to fit the data to a parameterized model....
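The exhaustive approach described on this slide can be sketched for a tiny data set. This is a minimal illustration, not the lecture's method: the objective is assumed to be the sum of squared errors (SSE) to each cluster mean, and the function names `sse` and `best_partition` and the sample points are hypothetical. The number of candidate labelings grows as k^n, which is why this only works for a handful of points.

```python
from itertools import product

def sse(points, labels, k):
    """Sum of squared distances from each point to its cluster mean."""
    total = 0.0
    for cluster in range(k):
        members = [p for p, label in zip(points, labels) if label == cluster]
        center = [sum(coords) / len(members) for coords in zip(*members)]
        total += sum(
            sum((a - b) ** 2 for a, b in zip(p, center)) for p in members
        )
    return total

def best_partition(points, k):
    """Try every labeling of the points into k non-empty clusters."""
    best_labels, best_score = None, float("inf")
    for labels in product(range(k), repeat=len(points)):
        if len(set(labels)) < k:
            continue  # skip labelings that leave a cluster empty
        score = sse(points, labels, k)
        if score < best_score:
            best_labels, best_score = labels, score
    return best_labels, best_score

# Two visually obvious groups of two points each.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels, score = best_partition(points, 2)
```

With four points and k = 2 the loop checks 16 labelings; practical algorithms such as K-means replace this search with a heuristic that only finds a local optimum of the same kind of objective.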

This note was uploaded on 11/13/2011 for the course CIS 4930 taught by Professor Staff during the Spring '08 term at University of Florida.

