Introduction to Information Retrieval

Hierarchical Clustering
Dendrogram: Hierarchical Clustering

- Build a tree-based hierarchical taxonomy (dendrogram) from a set of documents.
- Clustering is obtained by cutting the dendrogram at a desired level: each connected component forms a cluster.
- One approach: recursive application of a partitional clustering algorithm.

[Figure: example taxonomy. animal: vertebrate (fish, reptile, amphib., mammal); invertebrate (worm, insect, crustacean)]

Hierarchical Agglomerative Clustering (HAC) (Sec. 17.1)

- Starts with each doc in a separate cluster,
- then repeatedly joins the closest pair of clusters, until there is only one cluster.
- The history of merging forms a binary tree or hierarchy.

Closest pair of clusters (Sec. 17.2)

- Many variants for defining the closest pair of clusters
- Single-link
  - Similarity of the most cosine-similar (single-link)
- C...
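
The HAC loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the slides' own code: the function names (`cosine`, `single_link_sim`, `hac`) and the `num_clusters` parameter are hypothetical, documents are assumed to be plain lists of term weights, and the search for the closest pair is the naive cubic-time version. It uses single-link similarity (the cosine of the most similar pair of documents across two clusters), and stopping early at `num_clusters` plays the role of cutting the dendrogram at the level where that many clusters remain.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def single_link_sim(c1, c2, docs):
    """Single-link: similarity of the most cosine-similar cross-cluster pair."""
    return max(cosine(docs[i], docs[j]) for i in c1 for j in c2)


def hac(docs, num_clusters=1):
    """Naive agglomerative clustering over document indices.

    Returns the remaining clusters and the merge history; the history,
    read in order, describes the binary merge tree.
    """
    # Start with each document in its own cluster.
    clusters = [[i] for i in range(len(docs))]
    merges = []
    while len(clusters) > num_clusters:
        # Find the closest (most similar) pair of clusters.
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                sim = single_link_sim(clusters[a], clusters[b], docs)
                if best is None or sim > best[0]:
                    best = (sim, a, b)
        sim, a, b = best
        merges.append((clusters[a], clusters[b], sim))
        # Replace the two clusters with their union.
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters, merges


if __name__ == "__main__":
    # Toy term-weight vectors, made up purely for illustration.
    docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.2, 0.8]]
    clusters, merges = hac(docs, num_clusters=2)
    print(clusters)
    for left, right, sim in merges:
        print(left, "+", right, "at similarity", round(sim, 3))
```

Running `hac(docs)` with the default `num_clusters=1` continues until one cluster remains, so the merge history has one entry fewer than the number of documents. Other definitions of "closest pair" would only change `single_link_sim`.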
