# Fitting a Graph to Vector Data


Samuel I. Daitch, Yale University, New Haven, CT, USA

Jonathan A. Kelner, Massachusetts Institute of Technology, Cambridge, MA, USA

Daniel A. Spielman, Yale University, New Haven, CT, USA

Appearing in Proceedings of the 26th International Conference on Machine Learning, Montreal, Canada, 2009. Copyright 2009 by the author(s)/owner(s).

Keywords: Learning on Graphs, Transductive Classification, Transductive Regression, Clustering

## Abstract

We introduce a measure of how well a combinatorial graph fits a collection of vectors. The optimal graphs under this measure may be computed by solving convex quadratic programs and have many interesting properties. For vectors in $d$-dimensional space, the graphs always have average degree at most $2(d+1)$, and for vectors in 2 dimensions they are always planar. We compute these graphs for many standard data sets and show that they can be used to obtain good solutions to classification, regression and clustering problems.

## 1. Introduction

Given a collection of vectors $x_1, \ldots, x_n \in \mathbb{R}^d$, we ask the question, "What is the right graph to fit to this set of vectors?"

In recent years, a number of researchers have gained insight by fitting graphs to their data and then using these graphs to solve clustering, classification, or regression problems on their data, e.g. (Ng et al., 2001; Zhu et al., 2003; Belkin & Niyogi, 2003; Joachims, 2003; Zhou & Schölkopf, 2004a; Coifman et al., 2005). They have employed simply defined graphs that are easy to compute, associating a vertex of the graph with each data vector, and then connecting vertices whose vectors are sufficiently close, sometimes with weights depending on the distance.
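As an illustration of the simply defined graphs described above, the following is a minimal sketch of one common construction: connect two vertices when their vectors lie within a distance threshold, optionally weighting each edge by a Gaussian of the distance. The function name `epsilon_graph`, the parameters `eps` and `sigma`, and the particular Gaussian weight are assumptions made for this example; the text does not prescribe any of them.

```python
import numpy as np

def epsilon_graph(X, eps, sigma=None):
    """Weighted adjacency matrix of a distance-threshold graph (illustrative).

    X     : (n, d) array with one data vector per row.
    eps   : hypothetical threshold; i and j are joined when ||x_i - x_j|| <= eps.
    sigma : if given, edge weights are exp(-dist^2 / (2 * sigma^2));
            otherwise every edge has weight 1.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances via broadcasting: dist[i, j] = ||x_i - x_j||.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    W = (dist <= eps).astype(float)
    if sigma is not None:
        W *= np.exp(-dist ** 2 / (2.0 * sigma ** 2))
    # No self-loops.
    np.fill_diagonal(W, 0.0)
    return W

# Three nearby points and one far-away point.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
W = epsilon_graph(X, eps=1.5)
```

With `eps=1.5` the first three points form a triangle and the fourth is isolated, which shows how strongly the resulting graph depends on the chosen threshold.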
Not surprisingly, different results are obtained by the use of different graphs (Maier et al., 2008), and researchers have studied how to combine different graphs in a way that tends to give heavier weight to the better graphs (Argyriou et al., 2006).

In this paper, we study what can be gained by choosing the graphs with more care. For a set of vectors $x_1, \ldots, x_n$, we construct a weighted, undirected graph on $n$ vertices, where $w_{i,j} = w_{j,i} \ge 0$ denotes the weight of edge $(i, j)$, and $d_i = \sum_j w_{i,j}$ denotes the weighted degree of vertex $i$. When there is no edge $(i, j)$, we have $w_{i,j} = 0$. We do not allow self-loops, so $w_{i,i} = 0$ for all $i$.

We measure how well the graph with weights $w$ fits the vectors by how small it makes the following function, which is a weighted sum of the squared distance from each vertex to the weighted average of its neighbors:

$$f(w) = \sum_i \Big\| d_i x_i - \sum_j w_{i,j} x_j \Big\|^2 .$$

If we let $X$ be the $n$-by-$d$ matrix with $i$th row $x_i$, and let $L$ be the graph Laplacian matrix, defined as

$$L_{i,j} = \begin{cases} -w_{i,j} & \text{if } i \neq j, \\ d_i & \text{if } i = j, \end{cases}$$

then $f$ may be rewritten as

$$f(w) = \| L X \|_F^2 ,$$

where $\| M \|_F$ is the Frobenius norm $\big( \sum_{i,j} M_{i,j}^2 \big)^{1/2}$.
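The equivalence between the two forms of $f$ can be checked numerically. The sketch below, assuming NumPy and a dense symmetric weight matrix `W` with zero diagonal, evaluates $f(w)$ once from the vertex-wise definition and once as the squared Frobenius norm of $LX$. The function names `laplacian`, `fit_direct`, and `fit_laplacian` are hypothetical and chosen only for this example.

```python
import numpy as np

def laplacian(W):
    """Graph Laplacian L = D - W, with D the diagonal matrix of weighted degrees."""
    W = np.asarray(W, dtype=float)
    return np.diag(W.sum(axis=1)) - W

def fit_direct(W, X):
    """f(w) = sum_i || d_i x_i - sum_j w_ij x_j ||^2, straight from the definition."""
    W = np.asarray(W, dtype=float)
    X = np.asarray(X, dtype=float)
    d = W.sum(axis=1)                   # weighted degrees d_i
    residual = d[:, None] * X - W @ X   # row i holds d_i x_i - sum_j w_ij x_j
    return float((residual ** 2).sum())

def fit_laplacian(W, X):
    """f(w) = ||L X||_F^2, the Laplacian form."""
    return float(np.linalg.norm(laplacian(W) @ np.asarray(X, dtype=float), "fro") ** 2)

# Path graph 0 - 1 - 2 with unit weights, points at 0, 1, 2 on a line.
W_path = np.array([[0.0, 1.0, 0.0],
                   [1.0, 0.0, 1.0],
                   [0.0, 1.0, 0.0]])
X_path = np.array([[0.0], [1.0], [2.0]])
f_path = fit_direct(W_path, X_path)

# Random symmetric weights with zero diagonal, random vectors in the plane.
rng = np.random.default_rng(0)
A = rng.random((6, 6))
W_rand = np.triu(A, k=1)
W_rand = W_rand + W_rand.T
X_rand = rng.standard_normal((6, 2))
```

In the path example the middle vertex sits exactly at the average of its two neighbors and contributes nothing, while each endpoint contributes a squared distance of 1, so `f_path` equals 2. For the random instance, `fit_direct(W_rand, X_rand)` and `fit_laplacian(W_rand, X_rand)` agree up to floating-point error.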

Figure 1. The hard graph for a random set of vectors in two dimensions.