lect7

lect7 - ClusteringIV Outline...

Clustering IV

Outline
- Impossibility theorem for clustering
- Density-based clustering and subspace clustering
- Bi-clustering or co-clustering

General form of impossibility results
- Define a set of simple axioms (properties) that a computational task should satisfy
- Prove that there does not exist an algorithm that can simultaneously satisfy all the axioms → impossibility

Computational task: clustering
- A clustering function operates on a set X of n points, X = {1,2,…,n}
- Distance function d: X × X → R with d(i,j) ≥ 0, d(i,j) = d(j,i), and d(i,j) = 0 only if i = j
- Clustering function f: f(X,d) = Γ, where Γ is a partition of X
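
To make the setup concrete, here is a minimal sketch of how the inputs and outputs of f could be represented and checked. The names `is_valid_distance` and `is_partition` are illustrative and not from the lecture. Points are numbered 0..n-1 rather than 1..n, and the sketch additionally assumes d(i,i) = 0, which the slide does not state explicitly. The triangle inequality is deliberately not checked, since the slide does not require it.

```python
from itertools import combinations


def is_valid_distance(d, n):
    """Check the slide's conditions on an n x n matrix d (list of lists)."""
    # Assumption beyond the slide: every point is at distance 0 from itself.
    if any(d[i][i] != 0 for i in range(n)):
        return False
    for i, j in combinations(range(n), 2):
        # d(i,j) >= 0 and d(i,j) = 0 only if i = j, so distinct points
        # must be at strictly positive distance.
        if d[i][j] <= 0:
            return False
        # Symmetry: d(i,j) = d(j,i).
        if d[i][j] != d[j][i]:
            return False
    return True


def is_partition(gamma, n):
    """Check that gamma (a list of clusters) is a partition of {0,...,n-1}."""
    members = [i for cluster in gamma for i in cluster]
    no_empty_cluster = all(len(cluster) > 0 for cluster in gamma)
    return no_empty_cluster and sorted(members) == list(range(n))


d = [[0, 1, 5],
     [1, 0, 5],
     [5, 5, 0]]
print(is_valid_distance(d, 3))            # True
print(is_partition([[0, 1], [2]], 3))     # True
print(is_partition([[0, 1], [1, 2]], 3))  # False: point 1 appears twice
```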

Axiom 1: Scale invariance
- For a > 0, the distance function ad has values (ad)(i,j) = a·d(i,j)
- For any d and for any a > 0 we have f(X,d) = f(X,ad)
- The clustering function should not be sensitive to changes in the units of distance measurement; it should not have a built-in "length scale"
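
The scale-invariance condition can be probed numerically. The sketch below compares f(d) with f(ad) for a few factors a. Both clustering functions in it (`all_singletons` and `threshold_components`, with its fixed radius r = 2) are hypothetical toys chosen for illustration, not algorithms from the lecture. A finite check like this can only refute the axiom for a given f; passing it does not prove the axiom holds.

```python
def scale(d, a):
    """Return the distance matrix a*d."""
    return [[a * x for x in row] for row in d]


def canonical(partition):
    """Sort clusters and their members so partitions can be compared."""
    return sorted(sorted(block) for block in partition)


def all_singletons(d):
    """Toy clustering: every point is its own cluster."""
    return [[i] for i in range(len(d))]


def threshold_components(d, r=2.0):
    """Toy clustering with a built-in length scale r: link i and j
    whenever d[i][j] <= r and return the connected components."""
    n = len(d)
    label = list(range(n))
    for i in range(n):
        for j in range(i + 1, n):
            if d[i][j] <= r and label[i] != label[j]:
                old, new = label[j], label[i]
                label = [new if x == old else x for x in label]
    groups = {}
    for i, lab in enumerate(label):
        groups.setdefault(lab, []).append(i)
    return list(groups.values())


def scale_invariant_on(f, d, factors):
    """True if f(a*d) equals f(d) for every tested factor a."""
    base = canonical(f(d))
    return all(canonical(f(scale(d, a))) == base for a in factors)


d = [[0, 1, 5],
     [1, 0, 5],
     [5, 5, 0]]
print(scale_invariant_on(all_singletons, d, [0.5, 3]))        # True
print(scale_invariant_on(threshold_components, d, [0.5, 3]))  # False
```

The second function fails because multiplying all distances by 3 pushes the pair (0, 1) beyond the fixed radius, which is the kind of built-in length scale the axiom rules out.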

Axiom 2: Richness
- The range of f is equal to the set of partitions of X
- For any X and any partition Γ of X, there is a distance function d on X such that f(X,d) = Γ
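
For a small X, richness can be checked by brute force: enumerate every partition Γ and look for a distance function that makes f return it. The sketch below does this for n = 3 (points numbered 0, 1, 2). The helper names and the toy function `threshold_components` (fixed radius r = 5) are assumptions made for illustration and do not come from the lecture. The witness distance puts points of the same block at distance 1 and points of different blocks at distance 10.

```python
def set_partitions(items):
    """Yield every partition of the list items as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in set_partitions(rest):
        yield [[first]] + smaller              # first in a block of its own
        for k in range(len(smaller)):          # or first joins an existing block
            yield smaller[:k] + [[first] + smaller[k]] + smaller[k + 1:]


def canonical(partition):
    return sorted(sorted(block) for block in partition)


def distance_for_partition(gamma, n, within=1.0, between=10.0):
    """Distance matrix that is small inside blocks of gamma, large across."""
    block_of = {i: b for b, block in enumerate(gamma) for i in block}
    return [[0.0 if i == j
             else within if block_of[i] == block_of[j]
             else between
             for j in range(n)] for i in range(n)]


def threshold_components(d, r=5.0):
    """Toy clustering: connected components of the graph d[i][j] <= r."""
    n = len(d)
    label = list(range(n))
    for i in range(n):
        for j in range(i + 1, n):
            if d[i][j] <= r and label[i] != label[j]:
                old, new = label[j], label[i]
                label = [new if x == old else x for x in label]
    groups = {}
    for i, lab in enumerate(label):
        groups.setdefault(lab, []).append(i)
    return list(groups.values())


n = 3
partitions = [canonical(g) for g in set_partitions(list(range(n)))]
reached = [canonical(threshold_components(distance_for_partition(g, n)))
           for g in partitions]
print(len(partitions))        # 5 partitions of a 3-point set
print(reached == partitions)  # True: the toy f reaches every one of them
```

On this small instance the fixed-radius toy function reaches all partitions, so richness is not the axiom it violates.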

Axiom 3: Consistency
- Let Γ be a partition of X, and d, d' two distance functions on X
- d' is a Γ-transformation of d if
  - for all i,j ∈ X in the same cluster of Γ, we have d'(i,j) ≤ d(i,j)
  - for all i,j ∈ X in different clusters of Γ, we have d'(i,j) ≥ d(i,j)
- Consistency: if f(X,d) = Γ and d' is a Γ-transformation of d, then f(X,d') = Γ

Axiom 3: Consistency
- Intuition: shrinking distances between points inside a cluster and expanding distances between points in different clusters does not change the result
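
The Γ-transformation condition translates directly into a pairwise check. The following sketch is illustrative only: the function name `is_gamma_transformation` and the example matrices are assumptions, and points are numbered from 0.

```python
def is_gamma_transformation(d, d_new, gamma):
    """True if d_new shrinks (or keeps) every within-cluster distance of
    gamma and expands (or keeps) every between-cluster distance."""
    cluster_of = {i: c for c, block in enumerate(gamma) for i in block}
    n = len(d)
    for i in range(n):
        for j in range(i + 1, n):
            same = cluster_of[i] == cluster_of[j]
            if same and d_new[i][j] > d[i][j]:
                return False  # a within-cluster distance grew
            if not same and d_new[i][j] < d[i][j]:
                return False  # a between-cluster distance shrank
    return True


gamma = [[0, 1], [2]]
d = [[0, 2, 5],
     [2, 0, 6],
     [5, 6, 0]]
# Points 0 and 1 move closer; point 2 moves further from point 1.
d_ok = [[0, 1, 5],
        [1, 0, 8],
        [5, 8, 0]]
# Points 0 and 1 move apart although they share a cluster.
d_bad = [[0, 3, 5],
         [3, 0, 6],
         [5, 6, 0]]
print(is_gamma_transformation(d, d_ok, gamma))   # True
print(is_gamma_transformation(d, d_bad, gamma))  # False
```

Consistency then asks that any f with f(X,d) = Γ also returns Γ on every d' that passes this check.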

Examples
- Single-link agglomerative clustering
  - Repeatedly merge the clusters whose closest points are at minimum distance
  - Continue until a stopping criterion is met
    - k-cluster stopping criterion: continue until there are k clusters
    - distance-r stopping criterion: continue until all distances between clusters are larger than r
    - scale-a stopping criterion: let d* be the maximum pairwise distance; continue until all distances are larger than ad*
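
The procedure and its three stopping criteria can be sketched as follows. This is a deliberately naive implementation written for readability rather than speed, and the function and parameter names (`single_link`, `should_stop`, `k_cluster`, `distance_r`, `scale_a`) are illustrative choices rather than names from the lecture. The example data, five points on a line, is also an assumption.

```python
def single_link(d, should_stop):
    """Single-link agglomerative clustering on a distance matrix d.
    should_stop(num_clusters, next_merge_distance) decides when to halt."""
    clusters = [[i] for i in range(len(d))]
    while len(clusters) > 1:
        # Find the pair of clusters whose closest points are nearest.
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                dist = min(d[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        dist, a, b = best
        if should_stop(len(clusters), dist):
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return sorted(sorted(c) for c in clusters)


def k_cluster(k):
    """Stop once only k clusters remain."""
    return lambda num_clusters, dist: num_clusters <= k


def distance_r(r):
    """Stop once all between-cluster distances exceed r."""
    return lambda num_clusters, dist: dist > r


def scale_a(d, a):
    """Stop once all between-cluster distances exceed a * d*,
    where d* is the maximum pairwise distance."""
    d_star = max(max(row) for row in d)
    return lambda num_clusters, dist: dist > a * d_star


# Five points on a line at positions 0, 1, 2, 10, 11.
xs = [0, 1, 2, 10, 11]
d = [[abs(x - y) for y in xs] for x in xs]

print(single_link(d, k_cluster(2)))      # [[0, 1, 2], [3, 4]]
print(single_link(d, distance_r(1.5)))   # [[0, 1, 2], [3, 4]]
print(single_link(d, scale_a(d, 0.2)))   # [[0, 1, 2], [3, 4]]
```

On this data all three criteria happen to agree; they differ in which of the axioms they break.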

Examples (cont.)
- Single-link agglomerative clustering with the k-cluster stopping criterion does not satisfy the richness axiom
- Single-link agglomerative clustering with the distance-r stopping criterion does not satisfy the scale-invariance property
- Single-link agglomerative clustering with the scale-a stopping criterion does not satisfy the consistency property
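
Each of these three failures can be reproduced on a three-point example. The sketch below is a hedged illustration: the distance matrices, the parameter values (k = 2, r = 2, a = 0.5) and the function name `single_link` are all assumptions chosen to make the counterexamples small. It processes point pairs in increasing order of distance, which yields the same merges as single-link agglomeration.

```python
def single_link(d, k=None, r=None):
    """Single-link clustering by scanning pairs in increasing distance.
    Stops when k clusters remain (if k is given) or when the next pair
    is further apart than r (if r is given)."""
    n = len(d)
    label = list(range(n))
    num_clusters = n
    pairs = sorted((d[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    for dist, i, j in pairs:
        if k is not None and num_clusters <= k:
            break
        if r is not None and dist > r:
            break
        if label[i] != label[j]:
            old, new = label[j], label[i]
            label = [new if x == old else x for x in label]
            num_clusters -= 1
    groups = {}
    for i, lab in enumerate(label):
        groups.setdefault(lab, []).append(i)
    return sorted(sorted(g) for g in groups.values())


def scale_a_clustering(d, a):
    """Scale-a criterion: merge while distances are at most a * d*."""
    d_star = max(max(row) for row in d)
    return single_link(d, r=a * d_star)


# 1) k-cluster (k = 2) is not rich: with 3 points it always returns
#    exactly 2 clusters, so it never outputs 1 or 3 clusters.
d1 = [[0, 1, 10], [1, 0, 9], [10, 9, 0]]
print(len(single_link(d1, k=2)))            # 2

# 2) distance-r (r = 2) is not scale invariant: multiply d1 by 5.
d1_scaled = [[5 * x for x in row] for row in d1]
print(single_link(d1, r=2))                 # [[0, 1], [2]]
print(single_link(d1_scaled, r=2))          # [[0], [1], [2]]

# 3) scale-a (a = 0.5) is not consistent. On d2 the result is three
#    singletons. d3 only increases one between-cluster distance, so it
#    is a transformation of d2 with respect to that partition, yet the
#    larger d* raises the merge threshold and everything is merged.
d2 = [[0, 3, 4], [3, 0, 4], [4, 4, 0]]
d3 = [[0, 3, 10], [3, 0, 4], [10, 4, 0]]
print(scale_a_clustering(d2, 0.5))          # [[0], [1], [2]]
print(scale_a_clustering(d3, 0.5))          # [[0, 1, 2]]
```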
Centroid-based clustering and  consistency k

This document was uploaded on 10/05/2010.
