Lecture12 - Data Mining: Principles and Algorithms Jianyong...

December 23, 2009
Data Mining: Principles and Algorithms
Jianyong Wang
Database Lab, Institute of Software
Department of Computer Science and Technology
Tsinghua University
jianyong@tsinghua.edu.cn

Chapter 5. Cluster Analysis
- What is Cluster Analysis?
- Types of Data in Cluster Analysis
- A Categorization of Major Clustering Methods
- Partitioning Methods
- Hierarchical Methods
- Density-Based Methods
- Grid-Based Methods
- Model-Based Methods
- Clustering High-Dimensional Data
- Constraint-Based Clustering
- Outlier Analysis
- Summary

What is Cluster Analysis?
- Cluster: a collection of data objects
  - Similar to one another within the same cluster
  - Dissimilar to the objects in other clusters
- Cluster analysis
  - Finding similarities between data according to the characteristics found in the data and grouping similar data objects into clusters
- Unsupervised learning: no predefined classes
- Typical applications
  - As a stand-alone tool to get insight into data distribution
  - As a preprocessing step for other algorithms
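To make "grouping similar data objects into clusters" with "no predefined classes" concrete, here is a minimal sketch using k-means, one of the partitioning methods named in the chapter outline. It is not taken from the slides: the function name `kmeans`, the sample points, and the choice of initial centers are all illustrative assumptions.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def kmeans(points, initial_centers, iterations=10):
    """Illustrative k-means sketch: returns (labels, centers).

    Assumption: the caller supplies the initial centers, so the
    result is deterministic for this example.
    """
    centers = list(initial_centers)
    k = len(centers)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(k), key=lambda c: euclidean(p, centers[c]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centers

# Hypothetical data: two visibly separate groups, with no class labels given.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
labels, centers = kmeans(points, initial_centers=[points[0], points[3]])
print(labels)
```

The cluster labels come only from the distances between the points themselves, which is what the slide means by unsupervised learning.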

Clustering: Rich Applications and Multidisciplinary Efforts
- WWW
  - Document clustering
  - Cluster Weblog data to discover groups of similar access patterns
- Spatial Data Analysis
  - Detect spatial clusters for other spatial mining tasks
  - Create thematic maps in GIS by clustering feature spaces
- Pattern Recognition
- Image Processing

Examples of Clustering Applications
- Marketing: Help marketers discover distinct groups in their customer bases, and then use this knowledge to develop targeted marketing programs
- Biology: Gene function annotation
- City planning: Identifying groups of houses according to their house type, value, and geographical location
- Earthquake studies: Observed earthquake epicenters should be clustered along continental faults
- Land use: Identification of areas of similar land use in an earth observation database
- Insurance: Identifying groups of motor insurance policy holders with a high average claim cost

Quality: What Is Good Clustering?
- A good clustering method will produce high quality clusters with
  - High intra-class similarity
  - Low inter-class similarity
- The quality of a clustering method is also measured by its ability to discover some or all of the hidden patterns
- The quality of a clustering result depends on both the similarity measure used by the method and its implementation
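One way to check the two quality criteria above is to compare average pairwise distances inside clusters with those across clusters. The sketch below assumes Euclidean distance as the dissimilarity measure, so high intra-class similarity shows up as a small within-cluster distance and low inter-class similarity as a large between-cluster distance. The helper name `mean_pair_distance` and the sample data are hypothetical, not from the slides.

```python
import math
from itertools import combinations

def mean_pair_distance(points, labels, same_cluster):
    """Average Euclidean distance over point pairs.

    same_cluster=True  -> pairs that share a label (intra-cluster)
    same_cluster=False -> pairs with different labels (inter-cluster)
    """
    labelled = list(zip(points, labels))
    dists = [
        math.dist(p, q)
        for (p, a), (q, b) in combinations(labelled, 2)
        if (a == b) == same_cluster
    ]
    return sum(dists) / len(dists)

# Hypothetical data: two tight groups that are far apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]

good = [0, 0, 1, 1]  # groups nearby points together
bad = [0, 1, 0, 1]   # mixes the two groups

good_intra = mean_pair_distance(points, good, same_cluster=True)
good_inter = mean_pair_distance(points, good, same_cluster=False)
bad_intra = mean_pair_distance(points, bad, same_cluster=True)
bad_inter = mean_pair_distance(points, bad, same_cluster=False)

print(good_intra, good_inter)
print(bad_intra, bad_inter)
```

For the sensible labelling the within-cluster distance is much smaller than the between-cluster distance; for the mixed labelling the relationship is reversed. A different distance function could rank the same clusterings differently, which is why the slide ties quality to the similarity measure.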
Measure the Quality of Clustering
- Dissimilarity/Similarity metric: Similarity is expressed in terms of a distance function, typically metric: d(i, j)
- The definitions of
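As an illustration of what "typically metric" means for d(i, j), the sketch below tests the standard metric axioms (non-negativity, zero distance only between identical objects, symmetry, triangle inequality) on a small sample. The axioms are the usual mathematical definition rather than text from the visible slides. The function names and sample points are assumptions, and a check on a finite sample can only refute the metric property, not prove it in general.

```python
import math
from itertools import product

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

def squared_euclidean(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def satisfies_metric_axioms(d, points, tol=1e-12):
    """Check the metric axioms for d on every pair and triple in points."""
    for i, j in product(points, repeat=2):
        if d(i, j) < 0:                      # non-negativity
            return False
        if (d(i, j) == 0) != (i == j):       # d(i, j) = 0 only when i = j
            return False
        if abs(d(i, j) - d(j, i)) > tol:     # symmetry
            return False
    for i, j, k in product(points, repeat=3):
        if d(i, k) > d(i, j) + d(j, k) + tol:  # triangle inequality
            return False
    return True

# Hypothetical sample; the three collinear points expose squared distance.
sample = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (1.0, 3.0)]

print(satisfies_metric_axioms(euclidean, sample))
print(satisfies_metric_axioms(manhattan, sample))
print(satisfies_metric_axioms(squared_euclidean, sample))
```

Squared Euclidean distance fails on the collinear points because d((0,0), (2,0)) = 4 exceeds d((0,0), (1,0)) + d((1,0), (2,0)) = 2, so it is a dissimilarity measure but not a metric.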


This note was uploaded on 06/02/2010 for the course COMPUTER DM2009F taught by Professor Wangwei during the Fall '09 term at Tsinghua University.
