Error Bounds for Correlation Clustering

Thorsten Joachims (tj@cs.cornell.edu)
Cornell University, Dept. of Computer Science, 4153 Upson Hall, Ithaca, NY 14853 USA

John Hopcroft (jeh@cs.cornell.edu)
Cornell University, Dept. of Computer Science, 5144 Upson Hall, Ithaca, NY 14853 USA

Appearing in Proceedings of the 22nd International Conference on Machine Learning, Bonn, Germany, 2005. Copyright 2005 by the author(s)/owner(s).

Abstract

This paper presents a learning theoretical analysis of correlation clustering (Bansal et al., 2002). In particular, we give bounds on the error with which correlation clustering recovers the correct partition in a planted partition model (Condon & Karp, 2001; McSherry, 2001). Using these bounds, we analyze how the accuracy of correlation clustering scales with the number of clusters and the sparsity of the graph. We also propose a statistical test that analyzes the significance of the clustering found by correlation clustering.

1. Introduction

While we have gained a detailed learning theoretical understanding of supervised learning over the last decades, our understanding of unsupervised clustering is still rather limited. For example, how much data is necessary so that a clustering algorithm outputs a reliable clustering? How does the amount of data depend on the distribution of the data? Is the particular clustering produced by some algorithm significant?

This paper addresses these questions for a particular graph-based clustering algorithm, namely correlation clustering (Bansal et al., 2002). Correlation clustering is a particularly attractive clustering method, since its solution can be approximated efficiently (see e.g. (Demaine & Immorlica, 2003; Swamy, 2004)) and it automatically selects the number of clusters. While Bansal et al. (2004) briefly discuss the behavior of their algorithm under noise in the data, no learning theoretic analysis exists yet. To conduct the analysis, we propose a simple probabilistic model over graphs that extends the planted partition model (Condon & Karp, 2001; McSherry, 2001). An advantage of this model is that its simplicity allows a concise analysis, while providing a starting point for exploring more complex models.

While a substantial amount of theoretical work on clustering algorithms exists, much of this work is concerned primarily with computational aspects (e.g. (Dasgupta, 1999; McSherry, 2001)). Probably the most general learning theoretic model of clustering to date is Empirical Risk Approximation (Buhmann, 1998; Buhmann & Held, 1999), which applies to clustering algorithms that optimize an objective function. Buhmann (1998) uses uniform convergence arguments to bound the difference between the objective value a clustering achieves on the training data and its objective over the data distribution. Ben-David follows this approach to derive finite sample bounds for k-median clustering (Ben-David, 2004). The k-means or vector quantization problem is probably the best studied clustering problem. Among other results, statistical consistency was proven by Pollard (1981), and lower (Bartlett et al., 1998) and upper bounds (Linder et al., 1994) on the quantization error are known.

Our work is substantially different since it considers non-metric clustering problems where the data comes in the form of graphs. Graph-based clustering problems are ubiquitous in WWW search and social network analysis (e.g. (Kleinberg, 1999)). Instead of limiting our analysis to investigating statistical consistency, like the work of von Luxburg et al. (2004) for spectral clustering, we use a more restrictive model in which we can prove finite sample bounds for correlation clustering.

Our analysis makes three contributions. First, we define a model in which we derive finite-sample error bounds for correlation clustering. Second, we study the asymptotic behavior of correlation clustering with respect to the density of the graph and the scaling of cluster sizes. And finally, we propose a statistical test for evaluating the significance of a clustering.

$$W = \begin{pmatrix} +1 & +1 & 0 & +1 & -1 & +1 \\ +1 & +1 & +1 & -1 & -1 & -1 \\ 0 & +1 & +1 & +1 & 0 & -1 \\ +1 & -1 & +1 & +1 & 0 & -1 \\ -1 & -1 & 0 & 0 & +1 & +1 \\ +1 & -1 & -1 & -1 & +1 & +1 \end{pmatrix}$$

Figure 1. Example of correlation clustering on a graph with 6 vertices. The graph and its weight matrix $W$ are depicted on the left. Solid edges indicate a weight of $+1$, dotted edges a weight of $-1$. The correlation clustering is depicted on the right.

2. Correlation Clustering

The correlation clustering of an $n$-vertex weighted graph with edge weights $W_{ij}$ is the partition of the vertices that minimizes the sum of positive weights that are cut minus the negative weights that are not cut. An example of an (undirected) graph with six vertices is given in Figure 1. In this example, the matrix of edge weights $W$ contains only three possible values, namely $-1$, $0$, and $+1$. The correlation clustering of $W$ is depicted on the right-hand side of Figure 1. The clustering contains one cluster containing vertices 1, 2, 3, and 4 and another cluster containing vertices 5 and 6. This clustering cuts 2 (directed) edges with weight $+1$, while it fails to cut 2 (directed) edges with weight $-1$. This gives this clustering a score of $1 \cdot 2 - (-1) \cdot 2 = 4$, which optimizes the objective function of correlation clustering.

More formally, the correlation clustering $\hat{S}$ of a graph with edge weights $W$ is given by the solution $\hat{Y}$ of the following integer program (Demaine & Immorlica, 2003). The number $k$ of clusters is not fixed by the user, but determined as part of the clustering process. The edge weights $W_{ij}$ enter the optimization problem as follows. $W^+$ is equal to the adjacency matrix $W$, except that all negative edge weights are replaced by 0. Similarly, $W^-$ is equal to $W$, except that all positive edge weights are replaced by 0.
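The score of 4 quoted for the clustering in Figure 1 can be checked by hand or with a few lines of code. The sketch below is only an illustration under stated assumptions: the matrix `W` is the Figure 1 weight matrix as read from the figure, with every bare `1` entry taken to be $-1$ (an assumption, although it is consistent with the symmetry of the graph and with the stated score), and the helper names `indicator_matrix` and `cost` are hypothetical, not part of the paper.

```python
# Weight matrix as read from Figure 1 (assumption: entries printed without
# a "+" sign are -1; vertices are zero-indexed here, 1-indexed in the paper).
W = [
    [+1, +1,  0, +1, -1, +1],
    [+1, +1, +1, -1, -1, -1],
    [ 0, +1, +1, +1,  0, -1],
    [+1, -1, +1, +1,  0, -1],
    [-1, -1,  0,  0, +1, +1],
    [+1, -1, -1, -1, +1, +1],
]


def indicator_matrix(clusters, n):
    """Build Y with Y[i][j] = 1 iff vertices i and j share a cluster."""
    label = {}
    for c, members in enumerate(clusters):
        for v in members:
            label[v] = c
    return [[1 if label[i] == label[j] else 0 for j in range(n)]
            for i in range(n)]


def cost(W, Y):
    """Positive weights that are cut minus negative weights that are not cut."""
    total = 0
    n = len(W)
    for i in range(n):
        for j in range(n):
            w_pos = max(W[i][j], 0)  # entry of W+ (negatives replaced by 0)
            w_neg = min(W[i][j], 0)  # entry of W- (positives replaced by 0)
            total += (1 - Y[i][j]) * w_pos - Y[i][j] * w_neg
    return total


clusters = [{0, 1, 2, 3}, {4, 5}]  # vertices 1-4 and 5-6 of Figure 1
Y = indicator_matrix(clusters, len(W))
print(cost(W, Y))  # 4
```

For comparison, putting all six vertices into a single cluster leaves all twelve negative entries uncut, so `cost(W, indicator_matrix([set(range(6))], 6))` evaluates to 12 under the same assumptions.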
The optimization is over the $n \times n$ matrix $Y$ with elements $Y_{ij} \in \{0, 1\}$. A value of 1 for $Y_{ij}$ indicates that objects $x_i$ and $x_j$ are in the same cluster. A value of 0 indicates that they are in different clusters.

$$\min_Y \; \sum_{i=1}^{n} \sum_{j=1}^{n} (1 - Y_{ij}) W^+_{ij} - Y_{ij} W^-_{ij} \tag{1}$$

subject to

$$\forall i: \; Y_{ii} = 1 \tag{2}$$
$$\forall i, j: \; Y_{ij} = Y_{ji} \tag{3}$$
$$\forall i, j, k: \; Y_{ij} + Y_{jk} \le Y_{ik} + 1 \tag{4}$$
$$\forall i, j: \; Y_{ij} \in \{0, 1\} \tag{5}$$

We call $Y$ a cluster indicator matrix. The constraints of the optimization problem directly encode the three conditions in the definition of an equivalence relation, namely reflexivity, symmetry, and transitivity. This means that any feasible $Y$, and therefore also the solution $\hat{Y}$, directly corresponds to an equivalence relation, and it is straightforward to derive a clustering from the solution $\hat{Y}$. We denote the clustering that corresponds to an indicator matrix $Y$ with $S(Y)$. Vice versa, we denote with $Y(S)$ the cluster indicator matrix induced by clustering $S$ on $X$. Finally, we define the cost of a matrix $Y$

$$cost_W(Y) = \sum_{i=1}^{n} \sum_{j=1}^{n} (1 - Y_{ij}) W^+_{ij} - Y_{ij} W^-_{ij} \tag{6}$$

as the value of the objective function for that clustering. For simplicity of notation, we assume that diagonal entries of $W$ are always non-negative, i.e. $W_{ii} \ge 0$.

Note that the formulation of the optimization problem can be simplified. In particular, the reflexivity constraints and the associated variables $Y_{ii}$ can be dropped. Similarly, one can eliminate the symmetry constraints by unifying their variables. While the optimization problem is known to be NP-hard (Bansal et al., 2002), there are effective approximation algorithms for this problem (e.g. (Bansal et al., 2002; Demaine & Immorlica, 2003; Swamy, 2004)).

3. Generalized Planted Partition Model

In this section we define a probabilistic data model similar to the one in (Condon & Karp, 2001; McSherry, 2001). For data generated according to this model, we will derive results that describe how accurately correlation clustering recovers the correct cluster structure.

In our model we assume that there is an arbitrary true partition $S^* = \{S_1, ..., S_k\}$ of the vertices $X$ (i.e. $S_1 \cup ... \cup S_k = X$ and $S_i \cap S_j = \emptyset$). Unlike in the model of Condon and Karp (2001), the number of clusters $k$ and the size of each cluster are arbitrary and unknown to the clustering algorithm. To this partition $S^*$ corresponds a probability distribution $P_{S^*}(W)$ over edge weights. We assume that $P_{S^*}(W)$ is the process that generates the data we want to cluster. The goal of the clustering algorithm is to recover the true partition $S^*$ underlying the data generating process $P_{S^*}(W)$ from a single realization of edge weights $W$. The class of distributions $P_{S^*}(W)$ we consider is defined as follows.

Figure 2. Illustration of the planted partition model and the inference process. Panels, from left to right: the true partition $S^*$ with its cluster indicator matrix $Y^*$ (8 vertices, two clusters of 4), the matrix of means $M$, a weight matrix $W$ drawn from $P_{S^*}(W)$, the cluster indicator matrix $\hat{Y}$ of the correlation clustering $\hat{S}$, and the difference $d(Y^*, \hat{Y})$.

Definition 1 (Generalized Planted Partition Model) In a graph with $n$ vertices, the edge weights are generated by a distribution

$$P_{S^*}(W \mid M, a, b) = \prod_{i=1}^{n} \prod_{j=1}^{n} P_{S^*}(W_{ij} \mid M_{ij}, a, b) \tag{7}$$

so that each element $W_{ij}$ of $W$ is a bounded independent random variable in the interval $[a, b]$ with mean $M_{ij}$. Each $P_{S^*}(W_{ij} \mid M_{ij}, a, b)$ is constrained by the true partitioning $S^*$ as follows. If $Y(S^*)_{ij} = 1$ (vertices $i$ and $j$ are in the same cluster), the mean $M_{ij}$ of $W_{ij}$ must fulfill the constraint $M_{ij} \ge \mu^+ > 0$. If $Y(S^*)_{ij} = 0$ (vertices $i$ and $j$ are in different clusters), the mean $M_{ij}$ of $W_{ij}$ must fulfill $M_{ij} \le \mu^- < 0$.

One can think of $P_{S^*}(W \mid M, a, b)$ (or $P_{S^*}(W)$ for short) as generating edge weights that are a noisy representation of the underlying true partition $S^*$. Figure 2 gives an example of a true partition $S^*$, how its structure is reflected in the matrix of means $M$, and how a particular matrix of edge weights $W$ is drawn from $P_{S^*}(W)$. The matrix of means $M$ controls the amount of noise [1], and we summarize $M$ using the two parameters $\mu^+$ and $\mu^-$. $\mu^+$ is a lower bound on the mean edge weight between any two vertices that are in the same cluster of $S^*$, while $\mu^-$ is an upper bound on the mean edge weight between any two vertices in different clusters:

if $Y(S^*)_{ij} = 1$ (same cluster): $E(W_{ij}) \ge \mu^+ > 0$
if $Y(S^*)_{ij} = 0$ (different cluster): $E(W_{ij}) \le \mu^- < 0$

In the example in Figure 2, $\mu^+$ is $0.2$ and $\mu^-$ is $-0.3$. Note that the model requires that the mean weight of between-cluster edges be less than zero, and that the mean weight of within-cluster edges be greater than zero. In addition, it requires that all weights are bounded [2], i.e. $\forall i, j: a \le W_{ij} \le b$.

[1] A natural and straightforward extension is to allow a small random subset of edges to violate their constraints on the mean.
[2] Alternatively, we could assume that all variances are bounded.

This class of distributions $P_{S^*}(W)$ can be used to model a variety of clustering applications. Here are three examples:

Pair-Wise Classification. This example is the application Bansal et al. (2002) use to motivate correlation clustering. Edge weights $W_{ij}$ are derived from classifications of pairs as e.g. in noun-phrase coreference resolution (Ng & Cardie, 2002). For each pair of objects, a classification rule makes an independent judgment of whether two vertices should be in the same cluster or not. The edge weight is derived from the confidence in the judgment.

Citation Network Analysis. Edge weights represent citations in a network of bibliographic references. Each citation edge receives a weight of 1, non-present citations receive a negative weight $w^-$. We will discuss the value of $w^-$ in Section 4.2. The matrix of means $M$ reflects the probabilities with which documents cite each other dependent on whether they are in the same cluster or not.

Co-Clustering. The co-clustering model (Dhillon, 2001), originally proposed for text, simultaneously clusters the rows and the columns of a term/document matrix. This leads to a bipartite graph model with terms and documents being two sets of vertices. An edge of weight $W_{ij} = 1$ is present in the graph if term $i$ occurs in document $j$; it is equal to some negative value $w^-$ if the word does not occur. Weights between terms and between documents are zero.

We will discuss more detailed results for query clustering in search engines and citation network analysis in Sections 4.2 and 5 to further illustrate the model.

4. Analysing the Error of Correlation Clustering

In this section, we analyze how well correlation clustering can reconstruct the true partition $S^*$ based on a weight matrix $W$ drawn from a probability distribution $P_{S^*}(W)$ that conforms with our generalized planted partition model. Figure 2 illustrates the statistical model in which we evaluate correlation clustering. In this model, correlation clustering is applied to the weight matrix $W$ generated from $P_{S^*}(W)$. The resulting cluster indicator matrix $\hat{Y}$ and the partition $\hat{S}$ it induces are then compared against the true clustering $S^*$. We measure the error of $\hat{S}$ with respect to $S^*$ using the following pair-wise loss function.

$$d(\hat{S}, S^*) = \| Y(\hat{S}) - Y(S^*) \|_F^2 \tag{8}$$

Here $\|\cdot\|_F$ denotes the Frobenius norm. Intuitively, $d(\cdot, \cdot)$ measures the distance between two clusterings as the number of elements that are different in the corresponding cluster indicator matrices. In the example in Figure 2, this difference is depicted as the shaded region in the right-most panel and has value $d(\hat{S}, S^*) = 10 + 8 = 18$.

In the following, we will first derive upper bounds on the error $d(\hat{S}, S^*)$ of correlation clustering with respect to the number of vertices $n$ and the values of $\mu^+$ and $\mu^-$. After deriving the general results, we will apply them to the example settings mentioned above. Finally, we will discuss the asymptotic behavior of correlation clustering in our model.

4.1. Error Bound for Finite Samples

In our Planted Partition Model there is a true partition $S^*$ of the given set of vertices $X$. Associated with the partition $S^*$ is a probability distribution $P_{S^*}(W)$ of edge weights so that the mean of each within-cluster edge is at least $\mu^+ > 0$, and so that the mean of each between-cluster edge is at most $\mu^- < 0$.

Our argument is structured as follows. First, given two partitions $S$ and $S^*$ with distance $d(S, S^*)$, we bound the probability that a weight matrix $W$ drawn from $P_{S^*}(W)$ has a lower cost for partition $S$ than for the true partition $S^*$, i.e. $cost_W(Y(S)) \le cost_W(Y(S^*))$. In a second step, we will bound the probability that a weight matrix $W$ drawn from $P_{S^*}(W)$ has a cost lower than the true partition for any partition $S$ with $d(S, S^*) \ge \delta$, i.e. the probability that $\exists S: d(S, S^*) \ge \delta \wedge cost_W(Y(S)) \le cost_W(Y(S^*))$. This bounds the probability of drawing a $W$ so that correlation clustering returns a partition $\hat{S}$ which has an error $d(\hat{S}, S^*)$ greater than $\delta$.

There are two components contributing to $d(S, S^*)$. Let $d^+(S, S^*) = \delta^+$ be the number of vertex pairs that are clustered together in $S^*$ but not in $S$. Similarly, let $d^-(S, S^*) = \delta^-$ be the number of vertex pairs that are clustered together in $S$ but not in $S^*$. This means that $d(S, S^*) = \delta^+ + \delta^- = \delta$. Based on the magnitude of these two types of errors, we can bound the probability of the model generating a $W$ for which the incorrect partition $S$ has a lower cost than the true partition $S^*$.

Lemma 1 Given two partitions $S$ and $S^*$ with $d^+(S, S^*) = \delta^+$ and $d^-(S, S^*) = \delta^-$, it holds for $W$ drawn from the generalized planted partition model $P_{S^*}(W)$ with $\mu^- < 0 < \mu^+$ and $a \le W_{ij} \le b$ that

$$P\big(cost_W(Y(S)) \le cost_W(Y(S^*)) \mid S, S^*, \mu^+, \mu^-\big) \le \exp\left( -\frac{2 (\delta^+ \mu^+ - \delta^- \mu^-)^2}{(\delta^+ + \delta^-)(b - a)^2} \right)$$

for any $\delta^+ + \delta^- \in [0, n(n-1)]$.

Proof Write $Y = Y(S)$ and $Y^* = Y(S^*)$. We can compute the difference of costs $cost_W(Y) - cost_W(Y^*)$ with respect to $W$ as

$$\sum_{i=1}^{n} \sum_{j=1}^{n} \left[ (1 - Y_{ij}) W^+_{ij} - Y_{ij} W^-_{ij} \right] - \sum_{i=1}^{n} \sum_{j=1}^{n} \left[ (1 - Y^*_{ij}) W^+_{ij} - Y^*_{ij} W^-_{ij} \right]$$
$$= \sum_{(i,j): Y_{ij} \ne Y^*_{ij}} \left[ (1 - Y_{ij}) W^+_{ij} - Y_{ij} W^-_{ij} \right] - \sum_{(i,j): Y_{ij} \ne Y^*_{ij}} \left[ (1 - Y^*_{ij}) W^+_{ij} - Y^*_{ij} W^-_{ij} \right]$$
$$= \sum_{(i,j): Y_{ij} \ne Y^*_{ij}} Y^*_{ij} W_{ij} - \sum_{(i,j): Y_{ij} \ne Y^*_{ij}} Y_{ij} W_{ij}$$

where the last step uses $W = W^+ + W^-$. More precisely, if the distance between two clusterings $S$ and $S^*$ is $d^+(S, S^*) = \delta^+$ and $d^-(S, S^*) = \delta^-$, then there are exactly $\delta^+ + \delta^-$ elements of $Y(S)$ and $Y(S^*)$ on which $cost_W(Y(S))$ and $cost_W(Y(S^*))$ differ. Denote the corresponding sets of edges as $D^+$ and $D^-$. This implies that if $cost_W(Y(S)) \le cost_W(Y(S^*))$, then the following sum must be non-positive.

$$\sum_{(i,j) \in D^+} W_{ij} - \sum_{(i,j) \in D^-} W_{ij} \le 0 \tag{9}$$

Since the edge weights in $W$ are drawn independently, we can use Hoeffding's inequality to bound the probability that this sum is non-positive. Hoeffding's inequality bounds the deviation of a sum of independent and bounded random variables $X_k \in [a_k, b_k]$ from its mean.

$$P\left( \sum X_k - E\left(\sum X_k\right) \le -c \right) \le \exp\left( -\frac{2 c^2}{\sum (b_k - a_k)^2} \right)$$

In our case, the variables $X_k$ are the weights $W_{ij}$ for $(i,j) \in D^+$ and the negated weights $-W_{ij}$ for $(i,j) \in D^-$, so that $\sum X_k = \sum_{(i,j) \in D^+} W_{ij} - \sum_{(i,j) \in D^-} W_{ij}$, $E(\sum X_k) = \sum_{(i,j) \in D^+} M_{ij} - \sum_{(i,j) \in D^-} M_{ij}$, $c = E(\sum X_k)$, and $\sum (b_k - a_k)^2 = (\delta^+ + \delta^-)(b - a)^2$. We can now apply Hoeffding's inequality to bound the probability $P(\sum X_k \le 0) = P(\sum X_k - E(\sum X_k) \le -E(\sum X_k))$.

$$P\left( \sum X_k \le 0 \right) \le \exp\left( -\frac{2 \left( \sum_{(i,j) \in D^+} M_{ij} - \sum_{(i,j) \in D^-} M_{ij} \right)^2}{(\delta^+ + \delta^-)(b - a)^2} \right)$$

Since the $M_{ij}$ are bounded by $\mu^+$ and $\mu^-$ in the planted partition model, it holds that $\left( \sum_{(i,j) \in D^+} M_{ij} - \sum_{(i,j) \in D^-} M_{ij} \right)^2 \ge (\delta^+ \mu^+ - \delta^- \mu^-)^2$, which completes the proof of the lemma.

The lemma bounds for a particular clustering $S$ the probability of drawing a $W$ for which $S$ has a misleadingly good cost. To bound the probability for all clusterings, we need an upper bound on the number of possible clusterings. The exact number of clusterings is known as the Bell number, but the following bound suffices for our purposes.

Lemma 2 The number $C^\#(n)$ of possible clusterings of $n$ points is at most $n$ factorial.

Proof By induction over $n$. For $n = 1$ there is exactly one clustering. Given $C^\#(n-1)$, for each clustering of $n - 1$ objects the $n$-th object can either start its own cluster, or join one of at most $n - 1$ existing clusters in the clustering. So, there are at most $n$ ways to extend each of the existing clusterings of $n - 1$ objects. This implies $C^\#(n) \le n \, C^\#(n-1)$.

We can now bound the probability that correlation clustering returns a partition with large error. We state the theorem in terms of the error rate $Err(\hat{S}, S^*)$, which is the fraction of misclassified edges

$$Err(\hat{S}, S^*) = \frac{d(\hat{S}, S^*)}{n(n-1)}$$

Theorem 1 Given the true partition $S^*$ of $n$ points, the probability that correlation clustering returns a partition $\hat{S}$ with $Err(\hat{S}, S^*) \ge \epsilon$ in the planted partition model with $\mu = \min\{\mu^+, -\mu^-\}$ and $a \le W_{ij} \le b$ is bounded by

$$P\big(Err(\hat{S}, S^*) \ge \epsilon\big) \le \exp\left( n \ln(n) - \frac{2 \epsilon \, n(n-1) \mu^2}{(b - a)^2} \right) \tag{10}$$

Proof We bound the probability that any partition $S$ with error $d(S, S^*) \ge \delta = \epsilon \, n(n-1)$ has a cost that is better than or equal to that of the true partition $S^*$. The bound follows directly from the union bound and Lemmas 1 and 2. Since $\mu^+ \ge \mu$ and $-\mu^- \ge \mu$, we have $\delta^+ \mu^+ - \delta^- \mu^- \ge (\delta^+ + \delta^-)\mu$, and since $n! \le n^n = e^{n \ln(n)}$, it follows for partitions with $\delta^+ + \delta^- \ge \delta$ that

$$P\big(\exists S: d(S, S^*) \ge \delta \wedge cost_W(Y(S)) \le cost_W(Y(S^*))\big) \le n! \, \exp\left( -\frac{2 (\delta^+ \mu + \delta^- \mu)^2}{(\delta^+ + \delta^-)(b - a)^2} \right) \le \exp\left( n \ln(n) - \frac{2 \delta \mu^2}{(b - a)^2} \right)$$

This proves that with high probability no partition $S$ with distance $d(S, S^*) \ge \delta$ has a cost that is better than the cost of the true partition $S^*$.
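As a numerical sanity check on Theorem 1, the following sketch evaluates the bound and inverts it for a given confidence level. It assumes that the bound reads $\exp(n \ln(n) - 2 \epsilon \, n(n-1) \mu^2 / (b-a)^2)$ and that moving $e$ objects out of a true cluster of size $k$ costs at least $e \cdot k - e(e+1)/2$ in pairwise loss, as stated in the worked example of this section. The parameter values ($n = 3000$, $\mu = 0.5$, $a = -1$, $b = 1$, clusters of size 1000, 95% confidence) are those of that example; the function names are hypothetical.

```python
import math


def theorem1_bound(eps, n, mu, a, b):
    """Upper bound on P(Err >= eps) from Theorem 1, capped at 1."""
    exponent = n * math.log(n) - 2.0 * eps * n * (n - 1) * mu**2 / (b - a)**2
    return 1.0 if exponent >= 0 else math.exp(exponent)


def error_rate_at_confidence(conf, n, mu, a, b):
    """Error rate eps at which the Theorem 1 bound equals 1 - conf."""
    numerator = (n * math.log(n) - math.log(1.0 - conf)) * (b - a)**2
    return numerator / (2.0 * n * (n - 1) * mu**2)


def max_moved_objects(d_max, k):
    """Largest e with e*k - e*(e+1)/2 <= d_max (loss formula from the text)."""
    e = 0
    while e + 1 <= k and (e + 1) * k - (e + 1) * (e + 2) / 2 <= d_max:
        e += 1
    return e


n, mu, a, b = 3000, 0.5, -1.0, 1.0
eps = error_rate_at_confidence(0.95, n, mu, a, b)
print(round(100 * eps, 2))                         # 2.14 (percent)
print(max_moved_objects(eps * n * (n - 1), 1000))  # 215
```

Under these assumptions the error rate comes out at roughly 2.14%, which is consistent with the "at most 2.2%" quoted in the example, and the largest number of objects that can be moved without exceeding the corresponding pairwise loss is 215. For error rates well below this value the exponent is positive and the bound is vacuous, which is why `theorem1_bound` caps the result at 1.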
Since correlation clustering returns the partition $\hat{S}$ with the lowest cost, it follows that $\hat{S}$ has a distance $d(\hat{S}, S^*)$ less than $\delta$ with high probability.

Note that the bound in Theorem 1 is with respect to the randomness in drawing the cost matrix $W$. The bound also holds for cases where the optimization problem in Eqs. (1)-(5) does not have a unique solution.

Related bounds were derived by Condon and Karp (2001), as well as McSherry (2001). However, Condon and Karp (2001) consider a more restricted setting where all clusters have equal size, and this size is known to the clustering algorithm a priori. Furthermore, both bounds are different from our work, since they do not quantify the error between $S^*$ and an imperfect $\hat{S}$.

The following example illustrates the bound. Assume a planted partition model with $\mu = \mu^+ = -\mu^- = 0.5$ and bounds $a = -1$ and $b = 1$. Let us assume we have $n = 3000$ objects $X = (x_1, ..., x_n)$ and a true partition $S^* = \{S_1, S_2, S_3\}$ with three clusters of size 1000 each. Applying the bound tells us that with 95% confidence, the error rate $Err(\hat{S}, S^*)$ of the partition $\hat{S}$ returned by correlation clustering is at most 2.2%. Furthermore, for true clusters of size $k$, moving $e$ objects out of the correct cluster leads to a pairwise loss of at least $e \cdot k - e(e+1)/2$. This minimum pairwise loss is achieved by splitting one of the clusters into two subclusters of size $e$ and $k - e$. Therefore, with 95% confidence at most 215 of the 3000 objects are not clustered correctly.

4.2. Application: Query Clustering

We will now illustrate how the planted partition model can be instantiated with particular parameters according to application settings. We use query clustering in search engines as an example. In query clustering, the goal is to group queries together that are semantically related (e.g. the queries "imdb" and "movie reviews"). To measure the relation between two queries, we make use of the fact that users reformulate queries during their search.
We consider a xed set of n queries as nodes (e.g. n popular queries). Using the query log over some time interval, the adjacency matrix W is constructed by assigning Wij = 1 if some user issued query xi directly followed by query xj , and Wij = w < 0 otherwise. We will discuss the choice of w below. This representation exploits that two consecutive queries by the same user are likely to be related. Before applying correlation clustering to W , we dene what we mean by a cluster of related queries. We dene that queries within the same cluster co-occur in the query log during the time interval with probabiltiy at least p+ , while between cluster co-occurrences have probability p < p+ . The independence assumption approximately holds in this setting (especially, if one considers only one query pair per user), so that one can apply our results for the planted partition model as follows. Corollary 1 Based on the true partition S of n nodes, the edges of a directed graph are independently drawn so that within-cluster edges have probability at least p+ , and between-cluster edges have probabiltity less than p < p+ . From this graph construct W by assigning Wij = 1 to each element corresponding to an p edge, and Wij = w = p++ +p otherwise. The prob+p 2 ability that the error rate Err(S, S ) of the correlation of W is greater than is bounded by clustering S P (Err(S, S ) ) en ln(n) 2 1 Wij = 1 if paper xi cites paper xj , and Wij = w < 0 otherwise. Clustering in citation networks is dierent from query clustering in at least two respects3 . First, while it is easy to control the sparsity of the graph by considering shorter or longer query logs in query clustering, the sparsity of the citation graph cannot be manipulated. Second, with a growing number of nodes, the number of clusters grows as well. We discuss both issues in the following. 5.1. How does the Bound Scale with Increasing Sparsity of the Graph? 
If the lower bound = min{+ , } on the dierence of means for between and within cluster edges is a constant independent of n, then in Theorem 1 the probability that the error is greater than any constant fraction goes to zero since the second term in the exponent of (10) is order n2 . However, being a constant independent of n leads to very dense data. In citation network analysis, for example, graphs are usually very sparse with only a constant number of nonzero entries per row. Such a level of sparsity implies that is of 1 size n and that the second term in the exponent is constant. In this case, the rst term dominates giving a meaningless bound of en ln(n) . Thus, if we wish to have small probability of more than a constant fracln(n) tion error, we need to grow faster than for n the second term in the exponent of e to dominate. 5.2. How does the Bound Scale with Cluster Size? For an increasing number of nodes n, assume that each true partition Sn contains a xed number k of clus ters S (n) = {S1 (n), . . . , Sk (n)} that each grow pro|S (n)| portionally with n. Let fi = in be the constant fraction of nodes in cluster Si (n) and, without loss of generality, let cluster k be the smallest cluster. With increasing n, does correlation clustering eventually recover each of the clusters? Suppose that we want to guarantee with high probability that all but a fraction k of f2 nodes are clustered correctly. If n nodes are misclassied by some partition S, the value of the pairwise loss is at least d(S, S (n)) 2n2 (fk ). Since d(S, S (n)) is quadratic in n, the bound from Theorem 1 shows that the probability of misclassifying a constant fraction of nodes goes to zero. If the clusters do not grow proportionally with n but 3 Furthermore, the independence assumption is likely to be less valid than in query clustering. n(n1)(p+ p )2 (11) We omit the proof for brevity, since it is a direct consequence of Theorem 1. Note that the particular choice of w maximizes = min{+ , }. 
It is straightforward to derive other (and potentially tighter) versions of the bound by replacing Hoedings inequality, but omit their discussion for brevity. 5. Asymptotic Behavior How does the bound scale if the number of nodes in the graph grows? Growing graphs are natural, for example, in citation network analysis. Clustering in citation networks is used to reveal groups of related publications. Similar to query clustering, one could use correlation clustering to nd clusters of papers that reference each other with high frequency. Let W be the adjacency matrix of the citation graph in which Error Bounds for Correlation Clustering slower, the pairwise loss d(S, S (n)) does not grow quadratically in n. This happens, for example, when clusters grow at dierent rates or when the number of clusters grows with n. To ensure convergence of the bound from Theorem 1, we need d(S, S (n)) to grow faster than n ln(n). This is ensured if the fraction of nodes in each cluster grows faster than ln(n) . n Lemma 3 Given a graph with n nodes and a particular clustering S of the graph for which we denote ||Y (S)|| as . Let w , , and p so that w < 0, 0 (n2 )p ( n)(1 p)w , and 0 p 1. If we randomly generate weights on the edges of the graph so that edges have weight 1 with probability p and weight w otherwise, the probability that clustering S has costW (Y (S)) is P (costW(Y(S)) |S, ) e 2 ((n2 )p(n)(1p)w )2 n(n1)(1w )2 6. Is a Clustering Signicant? In typical applications of correlation clustering we are given a set of data W and we apply correlation clustering to detect potential cluster structure. So far, this paper addressed the question of whether the cor relation clustering S reveals the true underlying struc ture S . We will now turn to the related question of whether the data reveals any signicant cluster structure. Answering this question is important, since it provides a practitioner with a measure of condence (or lack thereof) in S. 
In the following we use correlation clustering to derive a signicance test that lets us reject the null hypothesis that the data was produced by a random process without any underlying structure. As the null hypothesis, we use a planted partition model where all edge-weight distributions P (Wij ) have the same mean, i.e. Mij = Mkl . This null hypothesis captures that there is no structure in our data. For simplicity of presentation, we consider only the setting of citation network analysis, so that all Wij take only two values indicating whether a particular edge is present or not. Let p be the probability that any given edge is present. For correlation clustering, the resulting graph is transformed into a weighted complete graph with weight matrix W by weighting present edges with 1, and inserting an edge with weight w < 0 whenever there is no edge present. In this model we can pick a cost threshold and bound the probability that the distribution from the null hypothesis generates a set of data W for which the partition S returned by correlation clustering has less than the threshold . If we observe costW (Y (S)) that costW (Y (S)) is less than for our given set of data W , we can use this bound on the probability to reject the null hypothesis with the corresponding condence. For technical reasons that will be discussed in the proof of Lemma 3, must be less than (n2 )p ( n)(1 p)w . The following derivation of the signicance test proceeds by rst bounding the probability for a single S in Lemma 3, and then by extending the result to hold uniformly for all S in Theorem 2. . Proof Since we have n(n 1) random variables (i.e. the o diagonal entries of the cost matrix) that are bounded within [w , 1], we can apply Hoedings inequality and get P (costW (Y (S)) |S, ) e 2 (E(costW (Y (S)))2 n(n1)(1w )2 (12) for [0, (n2 )p(n)(1p)w ]. Note that has to be less than E(costW (Y (S)) for Hoedings inequality to apply, thus the restriction to the interval. 
It remains to determine the expected cost $E(\mathrm{cost}_W(Y(S)))$. For a partition matrix $Y(S)$ with $(\kappa - n)$ off-diagonal entries equal to 1 and $(n^2 - \kappa)$ entries equal to 0, the expectation is

$$E(\mathrm{cost}_W(Y(S))) = (n^2 - \kappa)\,p - (\kappa - n)(1 - p)\,w.$$

Substituting this into (12) yields the result.

Theorem 2 Let $w$, $\delta$, and $p$ be such that $w < 0$, $0 \le \delta \le n(n-1)\min\{p, -(1-p)w\}$, and $0 \le p \le 1$. For a complete graph with $n$ nodes where edges have weight 1 with probability $p$ and weight $w$ otherwise, the probability that the clustering $\hat{S}$ returned by correlation clustering has $\mathrm{cost}_W(Y(\hat{S})) \le \delta$ is bounded by

$$P(\mathrm{cost}_W(Y(\hat{S})) \le \delta) \le \exp\left( n \ln(n) - 2\,n(n-1)\,\frac{\left( \min\{p, -(1-p)w\} - \frac{\delta}{n(n-1)} \right)^2}{(1-w)^2} \right).$$

Proof We prove a uniform bound in the sense that

$$P(\mathrm{cost}_W(Y(\hat{S})) \le \delta) \le P(\exists S : \mathrm{cost}_W(Y(S)) \le \delta).$$

To apply the union bound, we need a bound on $P(\mathrm{cost}_W(Y(S)) \le \delta \mid S, \kappa)$ that holds independent of $\kappa$. Relaxing the bound from Lemma 3, it holds for every clustering $S$, independent of $\kappa = \|Y(S)\|$, that

$$P(\mathrm{cost}_W(Y(S)) \le \delta \mid S) \le \exp\left( -\frac{2 \left( \min_{0 \le \kappa - n \le n(n-1)} \{ (n^2 - \kappa)\,p - (\kappa - n)(1-p)\,w \} - \delta \right)^2}{n(n-1)(1-w)^2} \right) = \exp\left( -2\,n(n-1)\,\frac{\left( \min\{p, -(1-p)w\} - \frac{\delta}{n(n-1)} \right)^2}{(1-w)^2} \right).$$

The equality holds because the expected cost is linear in $\kappa$, so its minimum is attained at one of the endpoints: $n(n-1)p$ for $\kappa - n = 0$, or $-n(n-1)(1-p)w$ for $\kappa - n = n(n-1)$. Applying the union bound w.r.t. the upper bound on the number of clusterings from Lemma 2 yields the result.

Note that $p$ is a parameter that needs to be fixed independent of the data. However, for practical purposes one can consider estimating $p$ as the fraction of positive edges in $W$. Given $p$, a reasonable choice for $w$ is $w = -\frac{p}{1-p}$, since it maximizes the numerator in the exponent. For this choice of $w$ we can apply the bound from Theorem 2 in a hypothesis test as follows. We decide on a confidence level $\epsilon$ and solve

$$\epsilon \ge \exp\left( n \ln(n) - 2\,n(n-1)\,\frac{\left( p - \frac{\delta}{n(n-1)} \right)^2}{(1-w)^2} \right) \qquad (13)$$

for the significance threshold $\delta$ as follows.

$$\delta \le n(n-1)\,p - n \left( 1 + \frac{p}{1-p} \right) \sqrt{\frac{n \ln(n) - \ln(\epsilon)}{2}} \qquad (14)$$

If $\mathrm{cost}_W(Y(\hat{S}))$ is less than or equal to $\delta$, we can reject the null hypothesis with confidence $\epsilon$.

7. Conclusions and Future Work

We presented a simple probabilistic graph model in which we analyze correlation clustering. The model allows us to derive finite sample bounds on the error with which correlation clustering recovers the graph structure. The results give insight into the behavior of correlation clustering with respect to the number of nodes, the density of the edges, and the number of clusters. Furthermore, we derive a test that can be applied to validate the significance of a given clustering.

While the planted partition model is an interesting starting point for analyzing clustering algorithms, there is a need to generalize the model and remove its assumptions. Clearly, the biggest assumption in the model is that edge weights are independently distributed. It is an interesting question whether this assumption can be relaxed without making the bounds too loose to be of practical relevance.

This work was funded in part under NSF awards IIS-0412894, IIS-0312910, and the KD-D grant.

References

Bansal, N., Blum, A., & Chawla, S. (2002). Correlation clustering. IEEE Symposium on Foundations of Computer Science (FOCS).

Bansal, N., Blum, A., & Chawla, S. (2004). Correlation clustering. Machine Learning, 56.

Bartlett, P. L., Linder, T., & Lugosi, G. (1998). The minimax distortion redundancy in empirical quantizer design. IEEE Transactions on Information Theory, 44, 1802–1813.

Ben-David, S. (2004). A framework for statistical clustering with constant time approximation algorithms for k-median clustering. Conference on Learning Theory (COLT).

Buhmann, J. (1998). Empirical risk approximation: An induction principle for unsupervised learning (Technical Report IAI-TR-98-3). Universitaet Bonn.

Buhmann, J., & Held, M. (1999). Model selection in clustering by uniform convergence bounds. Neural Information Processing Systems (NIPS) (pp. 216–222).

Condon, A., & Karp, R. M. (2001). Algorithms for graph partitioning on the planted partition model. Random Structures & Algorithms, 18, 116–140.

Dasgupta, S. (1999). Learning mixtures of Gaussians. IEEE Symposium on Foundations of Computer Science (FOCS) (pp. 634–644).

Demaine, E., & Immorlica, N. (2003). Correlation clustering with partial information. International Workshop on Approximation Algorithms for Combinatorial Optimization (APPROX).

Dhillon, I. (2001). Co-clustering documents and words using bipartite spectral graph partitioning. ACM SIGKDD Conference.

Kleinberg, J. M. (1999). Authoritative sources in a hyperlinked environment. Journal of the ACM, 46, 604–632.

Linder, T., Lugosi, G., & Zeger, K. (1994). Rates of convergence in the source coding theorem, in empirical quantizer design, and in universal lossy source coding. IEEE Transactions on Information Theory, 40, 1728–1740.

McSherry, F. (2001). Spectral partitioning of random graphs. IEEE Symposium on Foundations of Computer Science (FOCS).

Ng, V., & Cardie, C. (2002). Improving machine learning approaches to coreference resolution. Annual Meeting of the Assoc. for Comp. Linguistics (ACL).

Pollard, D. (1981). Strong consistency of k-means clustering. The Annals of Statistics, 9, 135–140.

Swamy, C. (2004). Correlation clustering: Maximizing agreements via semidefinite programming. Symposium on Discrete Algorithms (SODA).

von Luxburg, U., Bousquet, O., & Belkin, M. (2004). On the convergence of spectral clustering on random samples: The normalized case. Conference on Learning Theory (COLT) (pp. 457–471).
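As an illustration of the significance test built on Theorem 2 and Equations (13) and (14), the following minimal Python sketch evaluates the logarithm of the Theorem 2 bound and the threshold from Equation (14). The function names `log_theorem2_bound` and `significance_threshold` are hypothetical, and the arguments `delta` (cost threshold) and `eps` (confidence level) are labels chosen here for the two quantities of the test; the sketch assumes the choice of $w$ equal to $-p/(1-p)$ described in the text and does not compute the clustering cost itself.

```python
import math


def log_theorem2_bound(n, p, w, delta):
    """Natural log of the Theorem 2 bound on P(cost <= delta).

    Assumes the null model of the text: each edge has weight 1 with
    probability p and weight w < 0 otherwise.
    """
    pairs = n * (n - 1)
    m = min(p, -(1.0 - p) * w)
    if not (w < 0 and 0.0 <= p <= 1.0 and 0.0 <= delta <= pairs * m):
        raise ValueError("parameters violate the conditions of Theorem 2")
    # n*ln(n) comes from the union bound over clusterings,
    # the second term from the per-clustering tail bound.
    return n * math.log(n) - 2.0 * (pairs * m - delta) ** 2 / (pairs * (1.0 - w) ** 2)


def significance_threshold(n, p, eps):
    """Threshold from Equation (14), using w = -p / (1 - p).

    A negative return value means that no cost is small enough to reject
    the null hypothesis at this n, p, and eps.
    """
    w = -p / (1.0 - p)  # so that 1 - w = 1 + p / (1 - p)
    root = math.sqrt((n * math.log(n) - math.log(eps)) / 2.0)
    return n * (n - 1) * p - n * (1.0 - w) * root


if __name__ == "__main__":
    n, p, eps = 1000, 0.1, 0.05
    delta = significance_threshold(n, p, eps)
    w = -p / (1.0 - p)
    print("threshold:", delta)
    print("log bound:", log_theorem2_bound(n, p, w, delta), "log eps:", math.log(eps))
```

Because Equation (14) uses the factor $n$ rather than $\sqrt{n(n-1)}$, the threshold is slightly conservative: the bound evaluated at the returned threshold is never larger than the chosen confidence level.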