Cluster Analysis Notes (1)
The purpose of cluster analysis is to place observations into groups, or clusters,
suggested by the data. The groups are not defined a priori. Observations (objects) in a
given cluster tend to be similar to each other in some sense, and objects in different
clusters tend to be dissimilar.
“Any generalization about cluster analysis must be vague because
a vast number of clustering methods have been developed in several different fields, with
different definitions of clusters and similarity among objects. The variety of clustering
techniques is reflected by the variety of terms used for cluster analysis:
unsupervised pattern recognition, …”
Do clusters of observations naturally exist in the data?
    Shape? Overlap? Number?
If so, how do we identify the clusters?
    What method for creating clusters should be used?
    What measure of similarity (proximity, dissimilarity) should be used to identify
    clusters?
If so, how do we decide how many clusters there are?
What are the assumptions or requirements of the clustering methods?
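
The questions about method and dissimilarity measure can be made concrete with a small sketch. The Python below is an illustration only, not a method prescribed by these notes: the function names (`euclidean`, `manhattan`, `single_linkage`) and the three sample points are assumptions chosen for the example, and single linkage is used simply because it is short to write. It shows that the same data, grouped by the same method, can yield different clusters when the dissimilarity measure changes.

```python
import math
from itertools import combinations


def euclidean(p, q):
    """Straight-line distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    """City-block distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def single_linkage(points, dissimilarity, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain.

    The distance between two clusters is the smallest dissimilarity
    between any member of one and any member of the other.
    Returns clusters as sorted lists of point indices.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None  # (distance, index of cluster a, index of cluster b)
        for a, b in combinations(range(len(clusters)), 2):
            d = min(
                dissimilarity(points[i], points[j])
                for i in clusters[a]
                for j in clusters[b]
            )
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return sorted(sorted(c) for c in clusters)


# Hypothetical data: three observations measured on two variables.
points = [(0, 0), (3, 3), (-5, 0)]

by_euclid = single_linkage(points, euclidean, n_clusters=2)
by_manhattan = single_linkage(points, manhattan, n_clusters=2)

print("Euclidean:", by_euclid)
print("Manhattan:", by_manhattan)
```

Under Euclidean distance the first two points are closest (about 4.24 versus 5), so they are grouped together; under Manhattan distance the first and third are closest (5 versus 6), so the grouping changes. The number of clusters is passed in by hand here, which is exactly the open question raised in the list above.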