dm5part1 - University of Florida CISE department Clustering...

University of Florida CISE department, Gator Engineering
Clustering, Part 1
Dr. Sanjay Ranka
Professor, Computer and Information Science and Engineering
University of Florida, Gainesville

Data Mining, Sanjay Ranka, Spring 2011

What is Cluster Analysis?
• Finding groups of objects such that the objects in a group will be similar (or related) to one another and different from (or unrelated to) the objects in other groups.
  – Based on information found in the data that describes the objects and their relationships.
  – Also known as unsupervised classification.
• Many applications
  – Understanding: group related documents for browsing or to find genes and proteins that have similar functionality.
  – Summarization: reduce the size of large data sets.
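The idea of "similar within a group, different across groups" can be made concrete with a small numeric check. The sketch below is only an illustration: the points, the hand-assigned grouping, and the choice of Euclidean distance as the dissimilarity measure are all assumptions, not part of the slides.

```python
from itertools import combinations
from math import dist

# Hypothetical toy data: 2-D points with a hand-assigned grouping.
points = {
    "a": (1.0, 1.0), "b": (1.0, 2.0), "c": (2.0, 1.0),
    "d": (8.0, 8.0), "e": (9.0, 8.0), "f": (8.0, 9.0),
}
groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}

def mean_pair_distance(points, groups, same_group):
    """Average Euclidean distance over pairs that are (or are not) in the same group."""
    selected = [
        dist(points[p], points[q])
        for p, q in combinations(points, 2)
        if (groups[p] == groups[q]) == same_group
    ]
    return sum(selected) / len(selected)

within = mean_pair_distance(points, groups, same_group=True)
between = mean_pair_distance(points, groups, same_group=False)
print(f"mean within-group distance:  {within:.2f}")
print(f"mean between-group distance: {between:.2f}")
```

A grouping that fits the definition should give a within-group average that is clearly smaller than the between-group average, as it does for this toy data.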
What is not Cluster Analysis?
• Supervised classification.
  – Have class label information.
• Simple segmentation.
  – Dividing students into different registration groups alphabetically, by last name.
• Results of a query.
  – Groupings are a result of an external specification.
• Graph partitioning.
  – Some mutual relevance and synergy, but areas are not identical.

Notion of a Cluster is Ambiguous
[Figure panels: Initial points; Two Clusters; Four Clusters; Six Clusters]
Types of Clusterings
• A clustering is a set of clusters.
• One important distinction is between hierarchical and partitional sets of clusters.
• Partitional clustering
  – A division of data objects into non-overlapping subsets (clusters) such that each data object is in exactly one subset.
• Hierarchical clustering
  – A set of nested clusters organized as a hierarchical tree.
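The partitional definition above (non-overlapping subsets, each object in exactly one subset) can be checked mechanically. The following is a minimal sketch; the function name `is_partition`, the point labels, and the decision to reject empty clusters are assumptions made for illustration.

```python
def is_partition(objects, clusters):
    """Return True if every object appears in exactly one cluster."""
    seen = set()
    for cluster in clusters:
        if not cluster:          # assumption: empty clusters are not allowed
            return False
        if seen & cluster:       # overlap: some object is in two clusters
            return False
        seen |= cluster
    return seen == set(objects)  # coverage: nothing missing, nothing unknown

objects = {"p1", "p2", "p3", "p4", "p5"}

valid = [{"p1", "p2"}, {"p3"}, {"p4", "p5"}]
overlapping = [{"p1", "p2"}, {"p2", "p3"}, {"p4", "p5"}]
incomplete = [{"p1", "p2"}, {"p4", "p5"}]

print(is_partition(objects, valid))        # True
print(is_partition(objects, overlapping))  # False: p2 is in two clusters
print(is_partition(objects, incomplete))   # False: p3 is in no cluster
```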

Partitional Clustering
[Figure panels: Original Points; A Partitional Clustering]
Hierarchical Clustering
[Figure panels: Traditional Hierarchical Clustering; Traditional Dendrogram; Non-traditional Hierarchical Clustering; Non-traditional Dendrogram]
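A hierarchical clustering, being a set of nested clusters organized as a tree, can be written down as a nested data structure. The sketch below uses nested tuples for a hypothetical four-point hierarchy; the tree shape and point names are invented for illustration and are not taken from the slide figures.

```python
# Hypothetical hierarchy over four points, written as nested tuples:
# strings are leaves (points); tuples are clusters formed from their children.
tree = ((("p1", "p2"), "p3"), "p4")

def leaves(node):
    """Return the set of points contained in a (sub)tree."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result

def clusters(node):
    """List every cluster in the hierarchy, from the root down."""
    if isinstance(node, str):
        return [{node}]
    found = [leaves(node)]
    for child in node:
        found.extend(clusters(child))
    return found

for cluster in clusters(tree):
    print(sorted(cluster))
```

Every pair of clusters listed this way is either nested (one contains the other) or disjoint, which is the property that lets the hierarchy be drawn as a tree.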

Other Distinctions Between Sets of Clusters
• Exclusive versus non-exclusive
  – In non-exclusive clusterings, points may belong to multiple clusters.
  – Can represent multiple classes or ‘border’ points.
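One way to picture a non-exclusive clustering is as a mapping from each point to the set of clusters it belongs to. The sketch below assumes a hypothetical clustering with two clusters, "A" and "B", that share one point; the names and data are illustrative only.

```python
# Hypothetical non-exclusive clustering: cluster name -> set of member points.
clusters = {
    "A": {"p1", "p2", "p3"},
    "B": {"p3", "p4", "p5"},
}

def memberships(clusters):
    """Map each point to the set of clusters it belongs to."""
    result = {}
    for name, members in clusters.items():
        for point in members:
            result.setdefault(point, set()).add(name)
    return result

def shared_points(clusters):
    """Points that belong to more than one cluster (for example, 'border' points)."""
    return {p for p, names in memberships(clusters).items() if len(names) > 1}

print(shared_points(clusters))  # {'p3'}
```

In an exclusive clustering, `shared_points` would return an empty set.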