in the literature for inverse purity is microaveraged precision.
The reader may note that, in the evaluation of clustering results, microaveraged
precision is identical to microaveraged recall (cf. e.g. Sebastiani (2002)).
The F-measure works similarly to inverse purity, but it penalizes overly large
clusters, as it includes the individual precision of these clusters in the
evaluation.
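
The effect of including precision can be illustrated with a small sketch. The definitions used here are assumptions, since they are not restated in this passage: Precision(P, L) = |P ∩ L| / |P|, Recall(P, L) = |P ∩ L| / |L|, inverse purity as the size-weighted best recall per category, and F-measure as the size-weighted best harmonic mean of precision and recall per category. The function names `inverse_purity` and `f_measure` are hypothetical helpers, not part of any library.

```python
from collections import Counter


def inverse_purity(clusters, labels):
    """Size-weighted best recall per category (assumed standard definition)."""
    n = len(labels)
    joint = Counter(zip(clusters, labels))   # |P ∩ L| for each (cluster, category)
    label_sizes = Counter(labels)            # |L| for each category
    cluster_ids = set(clusters)
    total = 0.0
    for lab, l_size in label_sizes.items():
        best_recall = max(joint[(c, lab)] / l_size for c in cluster_ids)
        total += l_size / n * best_recall
    return total


def f_measure(clusters, labels):
    """Size-weighted best F1 per category (assumed standard definition)."""
    n = len(labels)
    joint = Counter(zip(clusters, labels))
    label_sizes = Counter(labels)
    cluster_sizes = Counter(clusters)        # |P| for each cluster
    total = 0.0
    for lab, l_size in label_sizes.items():
        best = 0.0
        for c, c_size in cluster_sizes.items():
            overlap = joint[(c, lab)]
            if overlap == 0:
                continue                     # no shared documents, F1 is 0
            precision = overlap / c_size
            recall = overlap / l_size
            best = max(best, 2 * precision * recall / (precision + recall))
        total += l_size / n * best
    return total


# Toy data (made up for illustration): two categories of three documents each.
labels = ["a", "a", "a", "b", "b", "b"]
one_big_cluster = [0, 0, 0, 0, 0, 0]   # everything lumped together
perfect = [0, 0, 0, 1, 1, 1]           # clusters match the categories

for name, clustering in [("one big cluster", one_big_cluster), ("perfect", perfect)]:
    print(name, inverse_purity(clustering, labels), f_measure(clustering, labels))
```

Under these assumed definitions, the single all-encompassing cluster still reaches an inverse purity of 1.0, because every category is fully contained in it, while its F-measure drops to about 0.67 because the cluster's precision for each category is only 0.5. The perfect clustering scores 1.0 on both.
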
While (inverse) purity and F-measure only consider ‘best’ matches between
‘queries’ and manually defined categories, the entropy indicates how large the
information content uncertainty of a clustering result with respect to the given
classification is:

$$E(\mathbb{P}, \mathbb{L}) = \sum_{P \in \mathbb{P}} \mathrm{prob}(P) \cdot E(P), \tag{15}$$

where

$$E(P) = -\sum_{L \in \mathbb{L}} \mathrm{prob}(L \mid P) \, \log(\mathrm{prob}(L \mid P)), \tag{16}$$

with $\mathrm{prob}(L \mid P) = \mathrm{Precision}(P, L)$ and $\mathrm{prob}(P) = |P| / |D|$. The entropy has the
range $[0, \log(|\mathbb{L}|)]$, with 0 indicating optimality.

Band 20 – 2005 39
Hotho, Nürnberger, and Paaß

3.2.2 Partitional Clustering
Manning & Schütze (2001); Steinbach et al.