CSE 572 Data Mining
Lecture Notes for Chapter 8
Basic Cluster Analysis
Introduction to Data Mining by Tan, Steinbach, Kumar
Huan Liu, Spring 2010

What is Cluster Analysis?
  - Finding groups of objects such that the objects in a group will be similar (or related) to one another and different from (or unrelated to) the objects in other groups
  - Intracluster distances are minimized
  - Intercluster distances are maximized

Applications of Cluster Analysis
  - Understanding: group related documents for browsing, group genes and proteins that have similar functionality, or group stocks with similar price fluctuations
  - Summarization: reduce the size of large data sets

  Cluster | Discovered Clusters | Industry Group
  1 | AppliedMatlDOWN, BayNetworkDown, 3COMDOWN, CabletronSysDOWN, CISCODOWN, HPDOWN, DSCCommDOWN, INTELDOWN, LSILogicDOWN, MicronTechDOWN, TexasInstDown, TellabsIncDown, NatlSemiconductDOWN, OraclDOWN, SGIDOWN, SunDOWN | Technology1DOWN
  2 | AppleCompDOWN, AutodeskDOWN, DECDOWN, ADVMicroDeviceDOWN, AndrewCorpDOWN, ComputerAssocDOWN, CircuitCityDOWN, CompaqDOWN, EMCCorpDOWN, GenInstDOWN, MotorolaDOWN, MicrosoftDOWN, ScientificAtlDOWN | Technology2DOWN
  3 | FannieMaeDOWN, FedHomeLoanDOWN, MBNACorpDOWN, MorganStanleyDOWN | FinancialDOWN
  4 | BakerHughesUP, DresserIndsUP, HalliburtonHLDUP, LouisianaLandUP, PhillipsPetroUP, UnocalUP, SchlumbergerUP | OilUP

  [Figure: Clustering precipitation in Australia]

What is not Cluster Analysis?
  - Supervised classification: have class label information
  - Simple segmentation: dividing students into different registration groups alphabetically, by last name
  - Results of a query: groupings are a result of an external specification

Notion of a Cluster can be Ambiguous
  - How many clusters?
  [Figure panels: Four Clusters, Two Clusters, Six Clusters]

Types of Clusterings
  - A clustering is a set of clusters
  - Important distinction between hierarchical and partitional sets of clusters
  - Partitional clustering: a division of data objects into nonoverlapping subsets (clusters) such that each data object is in exactly one subset
  - Hierarchical clustering: a set of nested clusters organized as a hierarchical tree

Partitional Clustering
  [Figure panels: Original Points, A Partitional Clustering]

Hierarchical Clustering
  [Figure panels over points p1, p2, p3, p4: Traditional Hierarchical Clustering, Nontraditional Hierarchical Clustering, Traditional Dendrogram, Nontraditional Dendrogram]

Other Distinctions Between Sets of Clusters
  - Exclusive versus nonexclusive...
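
The goal stated on the "What is Cluster Analysis?" slide (intracluster distances minimized, intercluster distances maximized) can be checked numerically for a given partitional clustering. The following is a minimal sketch, not taken from the slides: the toy points, the two groupings, the helper function names, and the use of mean pairwise Euclidean distance as the measure are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def mean_intracluster_distance(clusters):
    """Average distance over all pairs of points that share a cluster."""
    pairs = [dist(p, q) for c in clusters for p, q in combinations(c, 2)]
    return sum(pairs) / len(pairs)


def mean_intercluster_distance(clusters):
    """Average distance over all pairs of points in different clusters."""
    pairs = [dist(p, q)
             for a, b in combinations(clusters, 2)
             for p in a
             for q in b]
    return sum(pairs) / len(pairs)


# Hypothetical toy data: two partitional clusterings of the same six points.
# Each point appears in exactly one cluster (nonoverlapping subsets).
good = [[(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)],
        [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]]
bad = [[(0.0, 0.0), (10.0, 10.0), (1.0, 0.0)],
       [(0.0, 1.0), (10.0, 11.0), (11.0, 10.0)]]

good_intra = mean_intracluster_distance(good)
good_inter = mean_intercluster_distance(good)
bad_intra = mean_intracluster_distance(bad)
bad_inter = mean_intercluster_distance(bad)

print(f"good: intra={good_intra:.2f} inter={good_inter:.2f}")
print(f"bad:  intra={bad_intra:.2f} inter={bad_inter:.2f}")
```

Under these assumptions, the grouping that keeps nearby points together has the smaller intracluster average and the larger intercluster average, which is the sense in which it is the better clustering of the two.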
