Global Optimization Algorithms: Theory and Application, Part 28

29.2 Clustering Algorithms

Algorithm 29.1: C_new ←− kMeansModify_k(C)

  Input: [implicit] k: the number of clusters wanted, k ≤ |A|
  Input: [implicit] dist: the distance measure between clusters to be used
  Input: [implicit] dist2: the distance measure between elements to be used
  Input: C: the list of clusters c to be modified
  Data: m: the index of the cluster C[m] with the lowest error
  Data: n: the index of the cluster C[n] nearest to C[m]
  Data: s: the index of the cluster C[s] with the highest error
  Output: C_new: the modified list of clusters

  begin
    m ←− m : error_c(C[m]) = min{ error_c(C[i]) ∀ i ∈ [0, k−1] }
    n ←− n : dist(C[m], C[n]) = min{ dist(C[m], C[i]) ∀ i ∈ [0, k−1] \ {m} }
    s ←− s : error_c(C[s]) = max{ error_c(C[i]) ∀ i ∈ [0, k−1] \ {m, n} }
    C[m] ←− C[m] ∪ C[n]
    a ←− a ∈ C[s] : dist2(a, centroid(C[s])) ≥ dist2(b, centroid(C[s])) ∀ b ∈ C[s]
    C[n] ←− {a}
    C[s] ←− C[s] \ {a}
    return C
  end

(A rough Python sketch of this modification step is given at the end of this section.)

29.2.3 n-th Nearest Neighbor Clustering

The n-th nearest neighbor clustering algorithm is defined in the context of this book only. It creates at most k clusters, where the first k − 1 clusters contain exactly one element each. The remaining elements are all put together into the last cluster. The elements of the single-element clusters are those which have the longest distance to their n-th nearest neighbor. This clustering algorithm is suitable for reducing a large set to a smaller one which still contains the most interesting elements (those in the single-element clusters). It has relatively low complexity and thus runs fast, but on the other hand has the drawback that dense aggregations of ≥ n elements will be put into the "rest elements" cluster. For n, normally a value of n = √k is used.

n-th nearest neighbor clustering uses the k-th nearest neighbor distance function dist^ρ_{nn,k} introduced in Definition 28.63 on page 506, with its parameter k set to n. Do not mix this parameter up with the parameter k of this clustering method: although they have the same name, they are not the same. I know, I know, this is not pretty.

Notice that Algorithm 29.3 should only be applied if all the elements a ∈ A are unique, i.e., if there exist no two equal elements in A, which is, per definition, true for all sets. In a real implementation, a preprocessing step should remove all duplicates from A before clustering is performed. Especially our home-made nearest neighbor clustering variant is unsuitable for processing lists that contain the same element multiple times. Since all equal elements have the same distance to their n-th neighbor, the result of the clustering would likely be very unsatisfying: one element may occur multiple times whereas a variety of different other elements is ignored. Therefore, the aforementioned preprocessing should be applied. (A sketch of this selection procedure, including the duplicate-removal step, is also given below.)
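To make the merge-and-split step of Algorithm 29.1 more concrete, here is a minimal Python sketch. It assumes that the error measure error_c, the cluster distance dist, the element distance dist2 and the centroid function are supplied by the surrounding k-means framework; the function names, signatures and the arithmetic-mean centroid used here are illustrative assumptions, not the book's exact definitions.

from typing import Callable, List, Tuple

Element = Tuple[float, ...]   # an element a of A, assumed to be a numeric vector
Cluster = List[Element]

def centroid(c: Cluster) -> Element:
    # Arithmetic mean of the cluster's elements (one common, assumed choice).
    dim = len(c[0])
    return tuple(sum(e[i] for e in c) / len(c) for i in range(dim))

def k_means_modify(C: List[Cluster],
                   error_c: Callable[[Cluster], float],
                   dist: Callable[[Cluster, Cluster], float],
                   dist2: Callable[[Element, Element], float]) -> List[Cluster]:
    # One modification step as in Algorithm 29.1: merge the lowest-error cluster
    # with its nearest neighbor, then re-seed the freed cluster slot with the
    # worst outlier of the highest-error cluster.  Assumes k = len(C) >= 3.
    k = len(C)
    # m: index of the cluster with the lowest error
    m = min(range(k), key=lambda i: error_c(C[i]))
    # n: index of the cluster nearest to C[m] (m itself excluded)
    n = min((i for i in range(k) if i != m), key=lambda i: dist(C[m], C[i]))
    # s: index of the cluster with the highest error (m and n excluded)
    s = max((i for i in range(k) if i not in (m, n)), key=lambda i: error_c(C[i]))
    # C[m] <- C[m] u C[n]: merge the nearest neighbor into the lowest-error cluster
    C[m] = C[m] + C[n]
    # a: the element of C[s] farthest from the centroid of C[s]
    cen = centroid(C[s])
    a = max(C[s], key=lambda e: dist2(e, cen))
    # C[n] <- {a} and C[s] <- C[s] \ {a}
    C[n] = [a]
    remaining = list(C[s])
    remaining.remove(a)
    C[s] = remaining
    return C

The sketch mutates the cluster list in place, mirroring the assignments in the pseudocode, and requires at least three clusters so that m, n and s can be chosen as distinct indices.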
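The selection idea behind n-th nearest neighbor clustering can likewise be sketched in a few lines of Python. This follows the prose description of Section 29.2.3 only, since Algorithm 29.3 itself is not reproduced in this excerpt; the Euclidean element distance, the duplicate-removal step, and all function names are assumptions made for illustration.

import math
from typing import List, Sequence, Tuple

Element = Tuple[float, ...]

def dist2(a: Element, b: Element) -> float:
    # Euclidean distance between two elements (an assumed choice of dist2).
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nth_nn_distance(a: Element, A: Sequence[Element], n: int) -> float:
    # Distance from a to its n-th nearest neighbor in A (a itself excluded);
    # assumes A contains more than n elements.
    others = sorted(dist2(a, b) for b in A if b is not a)
    return others[n - 1]

def nth_nn_clustering(A: Sequence[Element], k: int) -> List[List[Element]]:
    # At most k clusters: k - 1 singletons holding the elements with the largest
    # n-th nearest neighbor distance, plus one cluster with everything else.
    A = list(dict.fromkeys(A))          # preprocessing: remove duplicates, keep order
    n = max(1, round(math.sqrt(k)))     # n = sqrt(k), as suggested in the text
    ranked = sorted(A, key=lambda a: nth_nn_distance(a, A, n), reverse=True)
    singletons = [[a] for a in ranked[:k - 1]]
    rest = ranked[k - 1:]
    return singletons + ([rest] if rest else [])

Note that the first line of nth_nn_clustering implements the duplicate-removal preprocessing recommended above, and that every element is assumed to actually have an n-th nearest neighbor, i.e., |A| > n after deduplication.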