Kaufman & Rousseeuw (1990) give characteristic values of

square error for a cluster P is given by:

$$\mathrm{MSE}(P) = \sum_{d \in P} \mathrm{dist}(d, \mu_P)^2, \qquad (10)$$

and $\mu_P = \frac{1}{|P|} \sum_{d \in P} t_d$ is the centroid of the cluster P and dist is a distance measure.

One clustering measure that is independent of the number of clusters is the silhouette coefficient SC(P) (cf. Kaufman & Rousseeuw (1990)). The main idea of the coefficient is to determine where a document lies in the space relative to its own cluster and to the next most similar cluster. In a good clustering the document under consideration is close to its own cluster, whereas in a bad clustering it is closer to the next cluster. With the help of the silhouette coefficient one can judge the quality of a single cluster or of the entire clustering (details can be found in Kaufman & Rousseeuw (1990)).

Kaufman & Rousseeuw (1990) give characteristic values of the silhouette coefficient for evaluating cluster quality. A value for SC(P) between 0.7 and 1.0 signals excellent separation between the found clusters, i.e. the objects within a cluster...
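The two measures discussed above can be illustrated with a minimal sketch. Several things here are assumptions rather than part of the excerpt: documents are represented as plain lists of floats standing in for the term vectors t_d, Euclidean distance stands in for dist, the per-cluster error is taken to be the sum of squared distances to the centroid, and the silhouette width of a document is computed with the commonly used definition (b - a) / max(a, b), where a is the mean distance to the other documents in its own cluster and b is the smallest mean distance to any other cluster. The function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed dist)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(cluster):
    """Component-wise mean of the document vectors in one cluster."""
    n = len(cluster)
    return [sum(column) / n for column in zip(*cluster)]


def cluster_mse(cluster, dist=euclidean):
    """Sum of squared distances from each document to the cluster centroid."""
    mu = centroid(cluster)
    return sum(dist(d, mu) ** 2 for d in cluster)


def silhouette(clusters, dist=euclidean):
    """Average silhouette width over all documents.

    Assumes at least two non-empty clusters. Documents in singleton
    clusters get a width of 0.0, which is a common convention.
    """
    widths = []
    for i, own in enumerate(clusters):
        for k, d in enumerate(own):
            if len(own) == 1:
                widths.append(0.0)
                continue
            # a: mean distance to the other documents in the same cluster
            a = sum(dist(d, o) for m, o in enumerate(own) if m != k) / (len(own) - 1)
            # b: mean distance to the closest other cluster
            b = min(
                sum(dist(d, o) for o in other) / len(other)
                for j, other in enumerate(clusters)
                if j != i
            )
            denom = max(a, b)
            widths.append((b - a) / denom if denom > 0 else 0.0)
    return sum(widths) / len(widths)


# Two small, well-separated toy clusters (made-up data for illustration).
clusters = [
    [[0.0, 0.0], [0.0, 1.0]],
    [[5.0, 5.0], [5.0, 6.0]],
]
sc = silhouette(clusters)
print(cluster_mse(clusters[0]), sc)
```

For this toy data the silhouette value lies above 0.7, so it falls in the range that the text associates with excellent separation. Passing a different function as `dist` (for example a cosine-based distance) changes both measures without touching the rest of the sketch.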