chap8_basic_cluster_analysis

Data Mining
Cluster Analysis: Basic Concepts and Algorithms
Lecture Notes for Chapter 8
Introduction to Data Mining
by Tan, Steinbach, Kumar
Edited for STATS202, Stanford University, Winter 2010
© Tan, Steinbach, Kumar, Introduction to Data Mining, 4/18/2004

What is Cluster Analysis?
• Finding groups of objects such that the objects in a group will be similar (or related) to one another and different from (or unrelated to) the objects in other groups
  - Intra-cluster distances are minimized
  - Inter-cluster distances are maximized
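The two distance goals above can be made concrete with a small sketch. It assumes Euclidean distance and plain averages over pairs of points, which the slide does not specify. The function names and the toy points are invented for illustration only.

```python
from itertools import combinations
from math import dist

def mean_intra_distance(clusters):
    """Average distance between pairs of points in the same cluster."""
    pairs = [dist(a, b) for c in clusters for a, b in combinations(c, 2)]
    return sum(pairs) / len(pairs)

def mean_inter_distance(clusters):
    """Average distance between pairs of points in different clusters."""
    pairs = [
        dist(a, b)
        for c1, c2 in combinations(clusters, 2)
        for a in c1
        for b in c2
    ]
    return sum(pairs) / len(pairs)

# Hypothetical toy data: two tight, well-separated groups in 2-D.
good = [[(0, 0), (0, 1), (1, 0)], [(10, 10), (10, 11), (11, 10)]]
# The same six points, grouped so that each cluster mixes both regions.
bad = [[(0, 0), (10, 10), (1, 0)], [(0, 1), (10, 11), (11, 10)]]

for name, clustering in [("good", good), ("bad", bad)]:
    intra = mean_intra_distance(clustering)
    inter = mean_inter_distance(clustering)
    print(f"{name}: intra={intra:.2f} inter={inter:.2f}")
```

For the grouping labelled `good`, the intra-cluster average is much smaller than the inter-cluster average. For `bad`, the two averages are much closer together, which is one informal sign of a poor clustering.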
Applications of Cluster Analysis
• Understanding
  - Group related documents for browsing, group genes and proteins that have similar functionality, or group stocks with similar price fluctuations
• Summarization
  - Reduce the size of large data sets

Table: Discovered Clusters and Industry Group
  Cluster 1 (Industry Group: Technology1-DOWN)
    Applied-Matl-DOWN, Bay-Network-Down, 3-COM-DOWN, Cabletron-Sys-DOWN, CISCO-DOWN, HP-DOWN, DSC-Comm-DOWN, INTEL-DOWN, LSI-Logic-DOWN, Micron-Tech-DOWN, Texas-Inst-Down, Tellabs-Inc-Down, Natl-Semiconduct-DOWN, Oracl-DOWN, SGI-DOWN, Sun-DOWN
  Cluster 2 (Industry Group: Technology2-DOWN)
    Apple-Comp-DOWN, Autodesk-DOWN, DEC-DOWN, ADV-Micro-Device-DOWN, Andrew-Corp-DOWN, Computer-Assoc-DOWN, Circuit-City-DOWN, Compaq-DOWN, EMC-Corp-DOWN, Gen-Inst-DOWN, Motorola-DOWN, Microsoft-DOWN, Scientific-Atl-DOWN
  Cluster 3 (Industry Group: Financial-DOWN)
    Fannie-Mae-DOWN, Fed-Home-Loan-DOWN, MBNA-Corp-DOWN, Morgan-Stanley-DOWN
  Cluster 4 (Industry Group: Oil-UP)
    Baker-Hughes-UP, Dresser-Inds-UP, Halliburton-HLD-UP, Louisiana-Land-UP, Phillips-Petro-UP, Unocal-UP, Schlumberger-UP

[Figure: Clustering precipitation in Australia]

What is not Cluster Analysis?
• Supervised classification
  - Have class label information
• Simple segmentation
  - Dividing students into different registration groups alphabetically, by last name
• Results of a query
  - Groupings are a result of an external specification
• Graph partitioning
  - Some mutual relevance and synergy, but areas are not identical
Notion of a Cluster can be Ambiguous
[Figure: one set of points shown in four panels titled "How many clusters?", "Two Clusters", "Four Clusters", and "Six Clusters"]
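One way to see this ambiguity is that the number of groups depends on the scale at which the data are viewed. The sketch below is only an illustration: the 1-D points are invented, and the gap-threshold rule in `group_by_gap` is an assumed stand-in, not a method taken from the slides.

```python
def group_by_gap(points, max_gap):
    """Split sorted 1-D points wherever the gap to the next point exceeds max_gap."""
    ordered = sorted(points)
    groups = [[ordered[0]]]
    for p in ordered[1:]:
        if p - groups[-1][-1] > max_gap:
            groups.append([p])      # gap too large: start a new group
        else:
            groups[-1].append(p)    # close enough: extend the current group
    return groups

# Hypothetical data with structure at three different scales.
points = [0.0, 0.5, 2.0, 2.5, 6.0, 6.5, 20.0, 20.5, 22.0, 22.5, 26.0, 26.5]

for max_gap in (5, 2, 1):
    groups = group_by_gap(points, max_gap)
    print(f"max_gap={max_gap}: {len(groups)} groups")
```

With these made-up points the thresholds 5, 2, and 1 give two, four, and six groups respectively. None of the three answers is wrong; each reflects a different notion of what counts as "close".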

Types of Clusterings
• A clustering is a set of clusters
• Important distinction between hierarchical and partitional sets of clusters
• Partitional clustering
  - A division of data objects into non-overlapping subsets (clusters) such that each data object is in exactly one subset
• Hierarchical clustering
  - A set of nested clusters organized as a hierarchical tree
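The two definitions above can be checked mechanically. The sketch below treats a clustering as a list of clusters, each a list of object names. The helper names `is_partition` and `is_nested` and the example clusterings are assumptions made for illustration.

```python
def is_partition(objects, clusters):
    """True if every object belongs to exactly one cluster."""
    members = [x for c in clusters for x in c]
    no_overlap = len(members) == len(set(members))
    covers_all = set(members) == set(objects)
    return no_overlap and covers_all

def is_nested(clusters):
    """True if any two clusters are disjoint or one contains the other."""
    sets = [set(c) for c in clusters]
    return all(
        a <= b or b <= a or a.isdisjoint(b)
        for i, a in enumerate(sets)
        for b in sets[i + 1:]
    )

objects = ["p1", "p2", "p3", "p4"]

# Partitional: non-overlapping subsets that cover every object.
partitional = [["p1", "p2"], ["p3", "p4"]]

# Hierarchical: nested clusters, from single objects up to the whole set.
hierarchical = [
    ["p1"], ["p2"], ["p3"], ["p4"],
    ["p1", "p2"],
    ["p1", "p2", "p3"],
    ["p1", "p2", "p3", "p4"],
]

print(is_partition(objects, partitional))   # each object in exactly one subset
print(is_partition(objects, hierarchical))  # objects appear at several levels
print(is_nested(hierarchical))              # clusters form a tree
```

Cutting the hierarchy at one level, for example keeping only `["p1", "p2", "p3"]` and `["p4"]`, yields a partitional clustering again.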
Partitional Clustering
[Figure: two panels titled "Original Points" and "A Partitional Clustering"]

Hierarchical Clustering
[Figure: four panels over points p1, p2, p3, p4, titled "Traditional Hierarchical Clustering", "Traditional Dendrogram", "Non-traditional Hierarchical Clustering", and "Non-traditional Dendrogram"]
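A dendrogram records the order in which clusters are merged and the distance at which each merge happens. The sketch below builds that merge list for four points named p1 to p4. The coordinates are invented, and the single-link merge rule is an assumption chosen for brevity; the slide does not say which rule produced its figure.

```python
from itertools import combinations
from math import dist

def single_link_merges(points):
    """Return the merges, in order, as (cluster_a, cluster_b, distance)."""
    def link(a, b):
        # Single link: distance between the closest pair of members.
        return min(dist(points[x], points[y]) for x in a for y in b)

    # Start with every point in its own cluster.
    clusters = [frozenset([name]) for name in points]
    merges = []
    while len(clusters) > 1:
        # Merge the two clusters that are currently closest.
        a, b = min(combinations(clusters, 2), key=lambda pair: link(*pair))
        merges.append((sorted(a), sorted(b), link(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges

# Hypothetical coordinates for the four points.
points = {"p1": (0.0, 0.0), "p2": (1.0, 0.0), "p3": (3.0, 0.0), "p4": (7.0, 0.0)}

for left, right, height in single_link_merges(points):
    print(f"merge {left} + {right} at distance {height}")
```

Reading the printed merges from first to last corresponds to reading a dendrogram from the leaves up to the root, with the merge distance as the height of each join.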

This note was uploaded on 07/29/2011 for the course STAT 202 at Stanford.
