spectral clustering Luxburg07_tutorial_2007

spectral clustering Luxburg07_tutorial_2007 - A Tutorial on...

Info iconThis preview shows pages 1–2. Sign up to view the full content.

View Full Document Right Arrow Icon

Info iconThis preview has intentionally blurred sections. Sign up to view the full version.

View Full DocumentRight Arrow Icon
This is the end of the preview. Sign up to access the rest of the document.

Unformatted text preview: A Tutorial on Spectral Clustering Ulrike von Luxburg Max Planck Institute for Biological Cybernetics Spemannstr. 38, 72076 Tubingen, Germany ulrike.luxburg@tuebingen.mpg.de This article appears in Statistics and Computing, 17 (4), 2007. The original publication is available at www.springer.com . Abstract In recent years, spectral clustering has become one of the most popular modern clustering algorithms. It is simple to implement, can be solved efficiently by standard linear algebra software, and very often outperforms traditional clustering algorithms such as the k-means algorithm. On the first glance spectral clustering appears slightly mysterious, and it is not obvious to see why it works at all and what it really does. The goal of this tutorial is to give some intuition on those questions. We describe different graph Laplacians and their basic properties, present the most common spectral clustering algorithms, and derive those algorithms from scratch by several different approaches. Advantages and disadvantages of the different spectral clustering algorithms are discussed. Keywords: spectral clustering; graph Laplacian 1 Introduction Clustering is one of the most widely used techniques for exploratory data analysis, with applications ranging from statistics, computer science, biology to social sciences or psychology. In virtually every scientific field dealing with empirical data, people attempt to get a first impression on their data by trying to identify groups of similar behavior in their data. In this article we would like to introduce the reader to the family of spectral clustering algorithms. Compared to the traditional algorithms such as k-means or single linkage, spectral clustering has many fundamental advantages. Results ob- tained by spectral clustering often outperform the traditional approaches, spectral clustering is very simple to implement and can be solved efficiently by standard linear algebra methods. This tutorial is set up as a self-contained introduction to spectral clustering. We derive spectral clustering from scratch and present different points of view to why spectral clustering works. Apart from basic linear algebra, no particular mathematical background is required by the reader. However, we do not attempt to give a concise review of the whole literature on spectral clustering, which is impossible due to the overwhelming amount of literature on this subject. The first two sections are devoted to a step-by-step introduction to the mathematical objects used by spectral clustering: similarity graphs in Section 2, and graph Laplacians in Section 3. The spectral clustering algorithms themselves will be presented in Section 4. The next three sections are then devoted to explaining why those algorithms work. Each section corresponds to one explanation: Section 5 describes a graph partitioning approach, Section 6 a random walk perspective, and Section 7 a perturbation theory approach. In Section 8 we will study some practical issues related to spectral clustering, and discussapproach....
View Full Document

Page1 / 32

spectral clustering Luxburg07_tutorial_2007 - A Tutorial on...

This preview shows document pages 1 - 2. Sign up to view the full document.

View Full Document Right Arrow Icon
Ask a homework question - tutors are online