IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 7, NO. 1, MARCH 1999

Multilevel Hypergraph Partitioning: Applications in VLSI Domain

George Karypis, Rajat Aggarwal, Vipin Kumar, Senior Member, IEEE, and Shashi Shekhar, Senior Member, IEEE

Abstract: In this paper, we present a new hypergraph-partitioning algorithm that is based on the multilevel paradigm. In the multilevel paradigm, a sequence of successively coarser hypergraphs is constructed. A bisection of the smallest hypergraph is computed and it is used to obtain a bisection of the original hypergraph by successively projecting and refining the bisection to the next level finer hypergraph. We have developed new hypergraph coarsening strategies within the multilevel framework. We evaluate their performance both in terms of the size of the hyperedge cut on the bisection, as well as on the run time for a number of very large scale integration circuits. Our experiments show that our multilevel hypergraph-partitioning algorithm produces high-quality partitioning in a relatively small amount of time. The quality of the partitionings produced by our scheme is on the average 6%–23% better than those produced by other state-of-the-art schemes. Furthermore, our partitioning algorithm is significantly faster, often requiring 4–10 times less time than that required by the other schemes. Our multilevel hypergraph-partitioning algorithm scales very well for large hypergraphs. Hypergraphs with over 100 000 vertices can be bisected in a few minutes on today's workstations. Also, on the large hypergraphs, our scheme outperforms other schemes (in hyperedge cut) quite consistently with larger margins (9%–30%).

Index Terms: Circuit partitioning, hypergraph partitioning, multilevel algorithms.

Manuscript received April 29, 1997; revised March 23, 1998. This work was supported under IBM Partnership Award NSF CCR-9423082, by the Army Research Office under Contract DA/DAAH04-95-1-0538, and by the Army High Performance Computing Research Center, the Department of the Army, Army Research Laboratory Cooperative Agreement DAAH04-95-20003/Contract DAAH04-95-C-0008. G. Karypis, V. Kumar, and S. Shekhar are with the Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN 55455-0159 USA. R. Aggarwal is with the Lattice Semiconductor Corporation, Milpitas, CA 95131 USA. Publisher Item Identifier S 1063-8210(99)00695-2. 1063-8210/99$10.00 © 1999 IEEE

I. INTRODUCTION

Hypergraph partitioning is an important problem with extensive application to many areas, including very large scale integration (VLSI) design [1], efficient storage of large databases on disks [2], and data mining [3]. The problem roughly is to partition the vertices of a hypergraph into equal parts, such that the number of hyperedges connecting vertices in different parts is minimized. A hypergraph is a generalization of a graph, where the set of edges is replaced by a set of hyperedges. A hyperedge extends the notion of an edge by allowing more than two vertices to be connected by a hyperedge. Formally, a hypergraph H = (V, E) is defined as a set of vertices V and a set of hyperedges E, where each hyperedge is a subset of the vertex set V [4], and the size of a hyperedge is the cardinality of this subset.

During the course of VLSI circuit design and synthesis, it is important to be able to divide the system specification into clusters so that the inter-cluster connections are minimized. This step has many applications including design packaging, HDL-based synthesis, design optimization, rapid prototyping, simulation, and testing. In particular, many rapid prototyping systems use partitioning to map a complex circuit onto hundreds of interconnected field-programmable gate arrays (FPGAs). Such partitioning instances are challenging because the timing, area, and input/output (I/O) resource utilization must satisfy hard device-specific constraints. For example, if the number of signal nets leaving any one of the clusters is greater than the number of signal pins available in the FPGA, then this cluster cannot be implemented using a single FPGA. In this case, the circuit needs to be further partitioned, and thus implemented using multiple FPGAs. Hypergraphs can be used to naturally represent a VLSI circuit. The vertices of the hypergraph can be used to represent the cells of the circuit, and the hyperedges can be used to represent the nets connecting these cells. A high-quality hypergraph-partitioning algorithm greatly affects the feasibility, quality, and cost of the resulting system.

A. Related Work

The problem of computing an optimal bisection of a hypergraph is at least NP-hard [5]. However, because of the importance of the problem in many application areas, many heuristic algorithms have been developed. The survey by Alpert and Kahng [1] provides a detailed description and comparison of such schemes. In a widely used class of iterative refinement partitioning algorithms, an initial bisection is computed (often obtained randomly) and then the partition is refined by repeatedly moving vertices between the two parts to reduce the hyperedge cut. These algorithms often use the Schweikert–Kernighan heuristic [6] (an extension of the Kernighan–Lin (KL) heuristic [7] for hypergraphs), or the faster Fiduccia–Mattheyses (FM) [8] refinement heuristic, to iteratively improve the quality of the partition. In all of these methods (sometimes also called KLFM schemes), a vertex is moved (or a vertex pair is swapped) if it produces the greatest reduction in the edge cuts, which is also called the gain for moving the vertex. The partition produced by these methods is often poor, especially for larger hypergraphs. Hence, these algorithms have been extended in a number of ways [9]–[12]. Krishnamurthy [9] tried to introduce intelligence in the tie-breaking process from among the many possible moves with the same high gain. He used a Look Ahead (LA_r) algorithm, which looks ahead up to r levels of gains before making a move. PROP [11], introduced by Dutt and Deng, used a probabilistic gain computation model for deciding which vertices need to move across the partition line. These schemes tend to enhance the performance of the basic KLFM family of refinement algorithms, at the expense of increased run time. Dutt and Deng [12] proposed two new methods, namely, CLIP and CDIP, for computing the gains of hyperedges that contain more than one node on either side of the partition boundary. CDIP in conjunction with LA_3 and CLIP in conjunction with PROP are two schemes that have shown the best results in their experiments.

Another class of hypergraph-partitioning algorithms [13]–[16] performs partitioning in two phases. In the first phase, the hypergraph is coarsened to form a small hypergraph, and then the FM algorithm is used to bisect the small hypergraph. In the second phase, these algorithms use the bisection of this contracted hypergraph to obtain a bisection of the original hypergraph. Since FM refinement is done only on the small coarse hypergraph, this step is usually fast, but the overall performance of such a scheme depends upon the quality of the coarsening method. In many schemes, the projected partition is further improved using the FM refinement scheme [15].

Recently, a new class of partitioning algorithms was developed [17]–[20] based upon the multilevel paradigm. In these algorithms, a sequence of successively smaller (coarser) graphs is constructed. A bisection of the smallest graph is computed. This bisection is now successively projected to the next-level finer graph and, at each level, an iterative refinement algorithm such as KLFM is used to further improve the bisection.
The various phases of multilevel bisection are illustrated in Fig. 1. Iterative refinement schemes such as KLFM become quite powerful in this multilevel context for the following reasons. First, the movement of a single node across a partition boundary in a coarse graph can lead to the movement of a large number of related nodes in the original graph. Second, the refined partitioning projected to the next level serves as an excellent initial partitioning for the KL or FM refinement algorithms. This paradigm was independently studied by Bui and Jones [17] in the context of computing fill-reducing matrix reordering, by Hendrickson and Leland [18] in the context of finite-element mesh partitioning, and by Hauck and Borriello (called Optimized KLFM) [20] and by Cong and Smith [19] for hypergraph partitioning.

Fig. 1. The various phases of the multilevel graph bisection. During the coarsening phase, the size of the graph is successively decreased; during the initial partitioning phase, a bisection of the smaller graph is computed, and during the uncoarsening and refinement phase, the bisection is successively refined as it is projected to the larger graphs. During the uncoarsening and refinement phase, the dashed lines indicate projected partitionings and dark solid lines indicate partitionings that were produced after refinement. G_0 is the given graph, which is the finest graph. G_{i+1} is the next level coarser graph of G_i, and vice versa, G_i is the next level finer graph of G_{i+1}. G_4 is the coarsest graph.

Karypis and Kumar extensively studied this paradigm in [21] and [22] for the partitioning of graphs. They presented new graph coarsening schemes for which even a good bisection of the coarsest graph is a pretty good bisection of the original graph. This makes the overall multilevel paradigm even more robust. Furthermore, it allows the use of simplified variants of KLFM refinement schemes during the uncoarsening phase, which significantly speeds up the refinement process without compromising overall quality. METIS [21], a multilevel graph partitioning algorithm based upon this work, routinely finds substantially better bisections and is often two orders of magnitude faster than the hitherto state-of-the-art spectral-based bisection techniques [23], [24] for graphs.

The improved coarsening schemes of METIS work only for graphs and are not directly applicable to hypergraphs. If the hypergraph is first converted into a graph (by replacing each hyperedge by a set of regular edges), then METIS [21] can be used to compute a partitioning of this graph. This technique was investigated by Alpert and Kahng [25] in their algorithm called GMetis. They converted hypergraphs to graphs by simply replacing each hyperedge with a clique, and then they dropped many edges from each clique randomly. They used METIS to compute a partitioning of each such random graph and then selected the best of these partitionings. Their results show that reasonably good partitionings can be obtained in a reasonable amount of time for a variety of benchmark problems. In particular, the performance of their resulting scheme is comparable to other state-of-the-art schemes such as PARABOLI [26], PROP [11], and the multilevel hypergraph partitioner from Hauck and Borriello [20].

The conversion of a hypergraph into a graph by replacing each hyperedge with a clique does not result in an equivalent representation, since high-quality partitionings of the resulting graph do not necessarily lead to high-quality partitionings of the hypergraph. The standard hyperedge-to-edge conversion [27] assigns a uniform weight of 1/(|e| − 1) to each edge in the clique, where |e| is the size of the hyperedge, i.e., the number of vertices in the hyperedge.
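To make the clique conversion concrete, the following minimal Python sketch expands a single hyperedge into a clique and measures the weight of the clique edges cut by two different bisections. It assumes the uniform weight 1/(|e| − 1) per clique edge; the function names `clique_expand` and `graph_cut` are illustrative and do not come from METIS or GMetis.

```python
from itertools import combinations


def clique_expand(hyperedge):
    """Replace one hyperedge by a clique with uniform edge weight 1/(|e| - 1)."""
    weight = 1.0 / (len(hyperedge) - 1)
    return {(u, v): weight for u, v in combinations(sorted(hyperedge), 2)}


def graph_cut(edges, part):
    """Total weight of clique edges whose endpoints lie in different parts."""
    return sum(w for (u, v), w in edges.items() if part[u] != part[v])


hyperedge = [0, 1, 2, 3]
edges = clique_expand(hyperedge)

# Both bisections cut the hyperedge exactly once in the hypergraph,
# but the clique model charges them differently.
one_vs_three = graph_cut(edges, {0: 0, 1: 1, 2: 1, 3: 1})
two_vs_two = graph_cut(edges, {0: 0, 1: 0, 2: 1, 3: 1})
print(one_vs_three, two_vs_two)
```

In this sketch a 1/3 split costs about 1 while a 2/2 split costs about 4/3, even though the hyperedge cut is 1 in both cases, so the graph objective and the hypergraph objective can disagree.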
However, the fundamental problem associated with replacing a hyperedge by its clique is that there exists no scheme to assign weight to the edges of the clique that can correctly capture the cost of cutting this hyperedge [28]. This hinders the partitioning refinement algorithm, since vertices are moved between partitions depending on how much this reduces the number of edges they cut in the converted graph, whereas the real objective is to minimize the number of hyperedges cut in the original hypergraph. Furthermore, the hyperedge-to-clique conversion destroys the natural sparsity of the hypergraph, significantly increasing the run time of the partitioning algorithm. Alpert and Kahng [25] solved this problem by dropping many edges of the clique randomly, but this makes the graph representation even less accurate. A better approach is to develop coarsening and refinement schemes that operate directly on the hypergraph. Note that the multilevel scheme by Hauck and Borriello [20] operates directly on hypergraphs and, thus, is able to perform accurate refinement during the uncoarsening phase. However, all coarsening schemes studied in [20] are edge-oriented; i.e., they only merge pairs of nodes to construct coarser graphs. Hence, despite a powerful refinement scheme (FM with the use of look-ahead) during the uncoarsening phase, their performance is only as good as that of GMetis [25].

B. Our Contributions

In this paper, we present a multilevel hypergraph-partitioning algorithm hMETIS that operates directly on the hypergraphs. A key contribution of our work is the development of new hypergraph coarsening schemes that allow the multilevel paradigm to provide high-quality partitions quite consistently. The use of these powerful coarsening schemes also allows the refinement process to be simplified considerably (even beyond plain FM refinement), making the multilevel scheme quite fast.
We investigate various algorithms for the coarsening and uncoarsening phases which operate on the hypergraphs without converting them into graphs. We have also developed new multiphase refinement schemes (v- and V-cycles) based on the multilevel paradigm. These schemes take an initial partition as input and try to improve it using the multilevel scheme. These multiphase schemes further reduce the run times, as well as improve the solution quality. We evaluate their performance both in terms of the size of the hyperedge cut on the bisection, as well as on run time on a number of VLSI circuits. Our experiments show that our multilevel hypergraph-partitioning algorithm produces high-quality partitioning in a relatively small amount of time. The quality of the partitionings produced by our scheme is on the average 6%–23% better than those produced by other state-of-the-art schemes [11], [12], [25], [26], [29]. The difference in quality over other schemes becomes even greater for larger hypergraphs. Furthermore, our partitioning algorithm is significantly faster, often requiring 4–10 times less time than that required by the other schemes. For many circuits in the well-known ACM/SIGDA benchmark set [30], our scheme is able to find better partitionings than those reported in the literature for any other hypergraph-partitioning algorithm.

The remainder of this paper is organized as follows. Section II describes the different algorithms used in the three phases of our multilevel hypergraph-partitioning algorithm. Section III describes a new partitioning refinement algorithm based on the multilevel paradigm. Section IV compares the results produced by our algorithm to those produced by earlier hypergraph-partitioning algorithms.

II. MULTILEVEL HYPERGRAPH BISECTION

We now present the framework of hMETIS, in which the coarsening and refinement schemes work directly with hyperedges without using the clique representation to transform them into edges.
We have developed new algorithms for both the phases, which, in conjunction, are capable of delivering very good quality solutions.

A. Coarsening Phase

During the coarsening phase, a sequence of successively smaller hypergraphs is constructed. As in the case of multilevel graph bisection, the purpose of coarsening is to create a small hypergraph, such that a good bisection of the small hypergraph is not significantly worse than the bisection directly obtained for the original hypergraph. In addition to that, hypergraph coarsening also helps in successively reducing the sizes of the hyperedges. That is, after several levels of coarsening, large hyperedges are contracted to hyperedges that connect just a few vertices. This is particularly helpful, since refinement heuristics based on the KLFM family of algorithms [6]–[8] are very effective in refining small hyperedges, but are quite ineffective in refining hyperedges with a large number of vertices belonging to different partitions.

Groups of vertices that are merged together to form single vertices in the next-level coarse hypergraph can be selected in different ways. One possibility is to select pairs of vertices with common hyperedges and to merge them together, as illustrated in Fig. 2(a). A second possibility is to merge together all the vertices that belong to a hyperedge, as illustrated in Fig. 2(b). Finally, a third possibility is to merge together a subset of the vertices belonging to a hyperedge, as illustrated in Fig. 2(c). These three different schemes for grouping vertices together for contraction are described below.

1) Edge Coarsening (EC): The heavy-edge matching scheme used in the multilevel-graph bisection algorithm can also be used to obtain successively coarser hypergraphs by merging the pairs of vertices connected by many hyperedges. In this EC scheme, a heavy-edge maximal¹ matching of the vertices of the hypergraph is computed as follows. The vertices are visited in a random order.
For each vertex v, all unmatched vertices that belong to hyperedges incident to v are considered, and the one that is connected via the edge with the largest weight is matched with v. The weight of an edge connecting two vertices v and u is computed as the sum of the edge weights of all the hyperedges that contain v and u. Each hyperedge e of size |e| is assigned an edge weight of 1/(|e| − 1), and as hyperedges collapse on each other during coarsening, their edge weights are added up accordingly. This EC scheme is similar in nature to the schemes that treat the hypergraph as a graph by replacing each hyperedge with its clique representation [27]. However, this hypergraph-to-graph conversion is done implicitly during matching without forming the actual graph.

¹ One can also compute a maximum weight matching [31]; however, that would have significantly increased the amount of time required by this phase.

Fig. 2. Various ways of matching the vertices in the hypergraph and the coarsening they induce. (a) In edge-coarsening, connected pairs of vertices are matched together. (b) In hyperedge-coarsening, all the vertices belonging to a hyperedge are matched together. (c) In MHEC, we match together all the vertices in a hyperedge, as well as all the groups of vertices belonging to a hyperedge.

2) Hyperedge Coarsening (HEC): Even though the EC scheme is able to produce successively coarser hypergraphs, it decreases the hyperedge weight of the coarser graph only for those pairs of matched vertices that are connected via a hyperedge of size two. As a result, the total hyperedge weight of successively coarser graphs does not decrease very fast. In order to ensure that for every group of vertices that are contracted together, there is a decrease in the hyperedge weight in the coarser graph, each such group of vertices must be connected by a hyperedge. This is the motivation behind the HEC scheme.
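The edge-coarsening (EC) matching described earlier in this subsection can be sketched as follows. This is a minimal illustration, not the hMETIS implementation: it assumes the implicit clique weight w(e)/(|e| − 1) per hyperedge, breaks ties by smallest vertex index, and the name `edge_coarsening_matching` and its parameters are hypothetical.

```python
import random
from collections import defaultdict


def edge_coarsening_matching(num_vertices, hyperedges, weights=None, seed=0):
    """Greedy heavy-edge maximal matching computed directly on a hypergraph.

    Returns a list `match` where match[v] is the partner of v,
    or v itself if v stayed unmatched.
    """
    if weights is None:
        weights = [1.0] * len(hyperedges)

    # incident[v] lists the indices of the hyperedges that contain v.
    incident = defaultdict(list)
    for idx, edge in enumerate(hyperedges):
        for v in edge:
            incident[v].append(idx)

    match = list(range(num_vertices))
    matched = [False] * num_vertices

    # Visit the vertices in a random order.
    order = list(range(num_vertices))
    random.Random(seed).shuffle(order)

    for v in order:
        if matched[v]:
            continue
        # Accumulate the implicit clique weight between v and every
        # unmatched neighbor, without building the clique graph.
        score = defaultdict(float)
        for idx in incident[v]:
            edge = hyperedges[idx]
            if len(edge) < 2:
                continue
            w = weights[idx] / (len(edge) - 1)
            for u in edge:
                if u != v and not matched[u]:
                    score[u] += w
        if score:
            # Heaviest connection wins; ties go to the smallest index.
            u = max(score, key=lambda x: (score[x], -x))
            match[v], match[u] = u, v
            matched[v] = matched[u] = True
    return match


example_edges = [[0, 1], [0, 1, 2], [2, 3]]
print(edge_coarsening_matching(4, example_edges))
```

In this small example vertices 0 and 1 share two hyperedges (implicit weight 1.5), so they are paired regardless of the visiting order, and vertices 2 and 3 are paired through the size-two hyperedge.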
In this scheme, an independent set of hyperedges is selected and the vertices that belong to individual hyperedges are contracted together. This is implemented as follows. The hyperedges are initially sorted in a nonincreasing hyperedge-weight order and the hyperedges of the same weight are sorted in a nondecreasing hyperedge-size order. Then, the hyperedges are visited in that order, and for each hyperedge that connects vertices that have not yet been matched, the vertices are matched together. Thus, this scheme gives preference to the hyperedges that have large weight and those that are of small size. After all of the hyperedges have been visited, the groups of vertices that have been matched are contracted together to form the next level coarser graph. The vertices that are not part of any contracted hyperedges are simply copied to the next level coarser graph.

3) Modified Hyperedge Coarsening (MHEC): The HEC algorithm is able to significantly reduce the amount of hyperedge weight that is left exposed in successively coarser graphs. However, during each coarsening phase, a majority of the hyperedges do not get contracted because vertices that belong to them have been contracted via other hyperedges. This leads to two problems. First, the size of many hyperedges does not decrease sufficiently, making FM-based refinement difficult. Second, the weight of the vertices (i.e., the number of vertices that have been collapsed together) in successively coarser graphs becomes significantly different, which distorts the shape of the contracted hypergraph. To correct this problem, we implemented an MHEC scheme as follows. After the hyperedges to be contracted have been selected using the HEC scheme, the list of hyperedges is traversed again, and for each hyperedge that has not yet been contracted, the vertices that do not belong to any other contracted hyperedge are contracted together.

B. Initial Partitioning Phase

During the initial partitioning phase, a bisection of the coarsest hypergraph is computed, such that it has a small cut and satisfies a user-specified balance constraint. The balance constraint puts an upper bound on the difference between the relative sizes of the two partitions. Since this hypergraph has a very small number of vertices (usually fewer than 200), the time to find a partitioning using any of the heuristic algorithms tends to be small. Note that it is not useful to find an optimal partition of this coarsest graph, as the initial partition will be substantially modified during the refinement phase.

We used the following two algorithms for computing the initial partitioning. The first algorithm simply creates a random bisection such that each part has roughly equal vertex weight. The second algorithm starts from a randomly selected vertex and grows a region around it in a breadth-first fashion [22] until half of the vertices are in this region. The vertices belonging to the grown region are then assigned to the first part, and the rest of the vertices are assigned to the second part. After a partitioning is constructed using either of these algorithms, the partitioning is refined using the FM refinement algorithm.

Since both algorithms are randomized, different runs give solutions of different quality. For this reason, we perform a small number of initial partitionings. At this point, we can select the best initial partitioning and project it to the original hypergraph, as described in Section II-C. However, the partitioning of the coarsest hypergraph that has the smallest cut may not necessarily be the one that will lead to the smallest cut in the original hypergraph.
It is possible that another partitioning of the coarsest hypergraph (with a higher cut) will lead to a better partitioning of the original hypergraph after the refinement is performed during the uncoarsening phase. For this reason, instead of selecting a single initial partitioning (i.e., the one with the smallest cut), we propagate all initial partitionings.

Note that propagation of N initial partitionings increases the time during the refinement phase by a factor of N. Thus, by increasing the value of N, we can potentially improve the quality of the final partitioning at the expense of higher run time. One way to dampen the increase in run time due to large values of N is to drop unpromising partitionings as the hypergraph is uncoarsened. For example, one possibility is to propagate only those partitionings whose cuts are within x% of the best partitioning at the current level. If the value of x is sufficiently large, then all partitionings will be maintained and propagated in the entire refinement phase. On the other hand, if the value of x is sufficiently small then, on average, only one partitioning will be maintained, as all other partitionings will be eliminated at the coarsest level. For moderate values of x, many partitionings may be available at the coarsest graph, but the number of such available partitionings will decrease as the graph is uncoarsened. This is useful for two reasons. First, it is more important to have many alternate partitionings at the coarser levels, as the size of the cut of a partitioning at a coarse level is a less accurate reflection of the size of the cut of the original finest level hypergraph. Second, refinement is more expensive at the fine levels, as these levels contain far more nodes than the coarse levels.
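The pruning rule above can be sketched in a few lines of Python. This is an illustration under stated assumptions: candidates are represented as (cut, partition) pairs, and `prune_partitionings` and `tolerance_percent` are hypothetical names standing in for the percentage threshold in the text.

```python
def prune_partitionings(candidates, tolerance_percent):
    """Keep the candidates whose cut is within tolerance_percent of the best.

    candidates: list of (cut, partition) pairs at the current level.
    """
    best = min(cut for cut, _ in candidates)
    # Integer comparison avoids floating-point rounding at the boundary.
    return [(cut, part) for cut, part in candidates
            if cut * 100 <= best * (100 + tolerance_percent)]


level_candidates = [(100, "p0"), (105, "p1"), (110, "p2"), (111, "p3"), (130, "p4")]
survivors = prune_partitionings(level_candidates, 10)
print([cut for cut, _ in survivors])
```

Calling the function once per uncoarsening level, before refinement, gives the behavior described in the text: a large tolerance keeps every candidate alive, while a tolerance of zero keeps only the candidates tied for the best cut.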
Hence, by choosing an appropriate value of x, we can benefit from the availability of many alternate partitionings at the coarser levels and avoid paying the high cost of refinement at the finer levels by keeping fewer candidates on average.

In our experiments, as reported in this paper, we find ten initial partitionings at the coarsest graph, and we drop all partitionings whose cut is 10% worse than the best cut at that level. This allows us to both filter out the really bad partitionings (and thus reduce the amount of time spent in refinement) and at the same time keep more than just one promising partitioning (so as to improve the overall partitioning quality). In our experiments, we have seen that by keeping ten partitionings, we can reduce the cut on the average by 3%–4%, whereas the partitioning time increases only by a factor of two. Computing and propagating more partitionings does not further reduce the cut significantly. In our experiments, keeping 20 partitionings further reduces the cut by less than 0.5% on the average. Increasing the value of parameter x (from 10% to a higher value such as 20%) did not significantly improve the quality of the partitionings, although it did increase the run time.

C. Uncoarsening and Refinement Phase

During the uncoarsening phase, a partitioning of the coarser hypergraph is successively projected to the next-level finer hypergraph, and a partitioning refinement algorithm is used to reduce the cut set (and thus to improve the quality of the partitioning) without violating the user-specified balance constraints. Since the next-level finer hypergraph has more degrees of freedom, such refinement algorithms tend to improve the solution quality.

We have implemented two different partitioning refinement algorithms. The first is the FM algorithm [8], which repeatedly moves vertices between partitions in order to improve the cut.
The second algorithm, called hyperedge refinement (HER), moves groups of vertices between partitions so that an entire hyperedge is removed from the cut. These algorithms are further described in the remainder of this section.

1) FM: The partitioning refinement algorithm by Fiduccia and Mattheyses [8] is iterative in nature. It starts with an initial partitioning of the hypergraph. In each iteration, it tries to find subsets of vertices in each partition, such that moving them to other partitions improves the quality of the partitioning (i.e., the number of hyperedges being cut decreases) without violating the balance constraint. If such subsets exist, then the movement is performed and this becomes the partitioning for the next iteration. The algorithm continues by repeating the entire process. If it cannot find such a subset, then the algorithm terminates, since the partitioning is at a local minimum and no further improvement can be made by this algorithm.

In particular, for each vertex v, the FM algorithm computes the gain, which is the reduction in the hyperedge cut achieved by moving v to the other partition. Initially all vertices are unlocked, i.e., they are free to move to the other partition. The algorithm iteratively selects an unlocked vertex v with the largest gain (subject to balance constraints) and moves it to the other partition. When a vertex v is moved, it is locked, and the gains of the vertices adjacent to v are updated. After each vertex movement, the algorithm also records the size of the cut achieved at this point. Note that the algorithm does not allow locked vertices to be moved, since this may result in thrashing (i.e., repeated movement of the same vertex). A single pass of the FM algorithm ends when there are no more unlocked vertices (i.e., all the vertices have been moved).
Then, the recorded cut sizes are checked, the point where the minimum cut was achieved is selected, and all vertices that were moved after that point are moved back to their original partition. Now, this becomes the initial partitioning for the next pass of the algorithm. With the use of appropriate data structures, each pass of the FM algorithm runs in time linear in the size of the hypergraph [8].

For refinement in the context of multilevel schemes, the initial partitioning obtained from the next level coarser graph is actually a very good partition. For this reason, we can make a number of optimizations to the original FM algorithm. The first optimization limits the maximum number of passes performed by the FM algorithm to only two. This is because the greatest reduction in the cut is obtained during the first or second pass and any subsequent passes only marginally improve the quality. Our experience has shown that this optimization significantly improves the run time of FM without affecting the overall quality of the produced partitionings.

Fig. 3. Effect of restricted coarsening. (a) Example hypergraph with a given partitioning with the required balance of 40/60. (b) Possible condensed version of (a). (c) Another condensed version of a hypergraph.

The second optimization aborts each pass of the FM algorithm before actually moving all the vertices. The motivation behind this is that only a small fraction of the vertices being moved actually lead to a reduction in the cut and, after some point, the cut tends to increase as we move more vertices. When FM is applied to a random initial partitioning, it is quite likely that after a long sequence of bad moves, the algorithm will climb out of a local minimum and reach a better cut. However, in the context of a multilevel scheme, a long sequence of cut-increasing moves rarely leads to a better local minimum.
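A single FM pass with rollback to the best recorded cut can be sketched as below. This is a simplified illustration, not the implementation from [8] or from hMETIS: it assumes unit vertex weights, recomputes gains from scratch instead of using the linear-time bucket structure, and expresses the balance constraint as a hypothetical `max_part_size` bound. The optional `max_bad_moves` parameter is likewise an assumption that shows one way a pass could be aborted early.

```python
def cut_size(hyperedges, part):
    """Number of hyperedges with vertices on both sides."""
    return sum(1 for e in hyperedges if len({part[v] for v in e}) > 1)


def gain(v, hyperedges, incident, part):
    """Reduction in the hyperedge cut if v moves to the other side."""
    g = 0
    for idx in incident[v]:
        e = hyperedges[idx]
        same = sum(1 for u in e if part[u] == part[v])
        other = len(e) - same
        if same == 1 and other > 0:
            g += 1          # v is alone on its side: the edge leaves the cut
        elif other == 0 and len(e) > 1:
            g -= 1          # edge is internal: moving v puts it in the cut
    return g


def fm_pass(hyperedges, part, max_part_size, max_bad_moves=None):
    """One FM pass; returns (refined partition, its cut)."""
    n = len(part)
    part = list(part)
    incident = [[] for _ in range(n)]
    for idx, e in enumerate(hyperedges):
        for v in e:
            incident[v].append(idx)

    sizes = [part.count(0), part.count(1)]
    locked = [False] * n
    cut = cut_size(hyperedges, part)
    best_cut, best_prefix = cut, 0
    moves, bad_moves = [], 0

    while True:
        # Unlocked vertices whose move keeps the target side within bounds.
        candidates = [v for v in range(n)
                      if not locked[v] and sizes[1 - part[v]] + 1 <= max_part_size]
        if not candidates:
            break
        v = max(candidates, key=lambda u: (gain(u, hyperedges, incident, part), -u))
        cut -= gain(v, hyperedges, incident, part)
        sizes[part[v]] -= 1
        part[v] = 1 - part[v]
        sizes[part[v]] += 1
        locked[v] = True
        moves.append(v)
        if cut < best_cut:
            best_cut, best_prefix, bad_moves = cut, len(moves), 0
        else:
            bad_moves += 1
            if max_bad_moves is not None and bad_moves >= max_bad_moves:
                break

    # Undo every move made after the point of minimum cut.
    for v in moves[best_prefix:]:
        part[v] = 1 - part[v]
    return part, best_cut


edges = [[0, 1, 2], [3, 4, 5], [2, 3]]
refined, refined_cut = fm_pass(edges, [0, 0, 1, 0, 1, 1], max_part_size=4)
print(refined, refined_cut)
```

On this example the starting partition cuts all three hyperedges; the first move (vertex 2) brings the cut down to 1, the remaining moves never beat that, and the rollback restores the partition recorded after the first move.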
For this reason, we stop each pass of the FM algorithm as soon as we have performed vertex moves that did not improve the cut. We choose to be equal to 1% of the number of vertices in the graph we are rening. This modication to FM, called early-exit FM (FM-EE), does not signicantly affect the quality of the nal partitioning, but it dramatically improves the run time (see Section IV). 2) HER: One of the drawbacks of FM (and other similar vertex-based renement schemes) is that it is often unable to rene hyperedges that have many nodes on both sides of the partitioning boundary. However, a renement scheme that moves all the vertices that belong to a hyperedge can potentially solve this problem. Our HER works as follows. It randomly visits all the hyperedges and, for each one that straddles the bisection, it determines if it can move a subset of the vertices incident on it, so that this hyperedge will become completely interior to a partition. In particular, consider a hyperedge , which straddles the partitioning boundary, and and be the vertices of that belong to partition 0 let and partition 1, respectively. Our algorithm computes the gain , which is the reduction in the cut achieved by moving the vertices in to partition 1, and the gain , which is the reduction in the cut achieved by moving the vertices in to partition 0. Now, depending on these gains and subject to or . balance constraints, it may move one of the two sets In particular, if is positive and , it moves , and if is positive and , it moves . III. MULTIPHASE REFINEMENT WITH RESTRICTED COARSENING Although the multilevel paradigm is quite robust, randomization is inherent in all three phases of the algorithm. In particular, the random choice of vertices to be matched in the coarsening phase can disallow certain hyperedge cuts, reducing renement in the uncoarsening phase. For example, consider the example hypergraph in Fig. 3(a) and its two possible condensed versions [Fig. 
3(b) and (c)] with the same partitioning. The version in Fig. 3(b) is obtained by selecting hyperedges to be compressed in the HEC phase and then selecting pairs of nodes to be compressed in the modified HEC (MHEC) phase. Similarly, the version shown in Fig. 3(c) is obtained by selecting a single hyperedge to be compressed in the HEC phase and then selecting pairs of nodes to be compressed in the MHEC phase. In the version of Fig. 3(b), a vertex can be moved from one partition to the other to reduce the hyperedge cut by 1, but in Fig. 3(c), no vertex can be moved to reduce the hyperedge cut. What this example shows is that, in a multilevel setting, a given initial partitioning of a hypergraph can potentially be refined in many different ways depending upon how the coarsening is performed. Hence, a partitioning produced by a multilevel partitioning algorithm can potentially be refined further if the two partitions are again coarsened in a manner different from the previous coarsening phase (which is easily done given the random nature of all of the coarsening schemes described here).

The power of iterative refinement at different coarsening levels can also be used to develop a partitioning refinement algorithm based on the multilevel paradigm. The idea behind this multiphase refinement algorithm is quite simple. It consists of two phases, namely a coarsening and an uncoarsening phase. The uncoarsening phase of the multiphase refinement algorithm is identical to the uncoarsening phase of the multilevel hypergraph-partitioning algorithm described in Section II-C. The coarsening phase, however, is somewhat different, as it preserves the initial partitioning that is input to the algorithm. We will refer to this as the restricted coarsening scheme. Given a hypergraph $H$ and a partitioning $P$, during the coarsening phase, a sequence of successively coarser hypergraphs and their partitionings is constructed. Let $(H_i, P_i)$ for $i = 1, 2, \ldots, m$ be the sequence of hypergraphs and partitionings.
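As a rough illustration of a partition-preserving ("restricted") coarsening step, the sketch below merges random pairs of vertices only when both lie on the same side of the bisection and lets each coarse vertex inherit that side. It is a simplified pair matching written for this note, not the HEC or MHEC scheme of the paper; the function names, the `seed` parameter, and the toy hypergraph are all hypothetical.

```python
import random


def cut_size(hyperedges, part):
    """Number of hyperedges with vertices on both sides of the bisection."""
    return sum(1 for e in hyperedges if len({part[v] for v in e}) == 2)


def restricted_coarsen(hyperedges, part, seed=0):
    """One level of coarsening that never merges across the bisection.

    Returns (coarse_hyperedges, coarse_part, vertex_map), where
    vertex_map[v] is the coarse vertex that fine vertex v collapsed into.
    """
    rng = random.Random(seed)
    n = len(part)
    order = list(range(n))
    rng.shuffle(order)

    # Two vertices are neighbors if they share at least one hyperedge.
    neighbors = [set() for _ in range(n)]
    for e in hyperedges:
        for v in e:
            neighbors[v].update(u for u in e if u != v)

    vertex_map = [-1] * n
    coarse_part = []
    for v in order:
        if vertex_map[v] != -1:
            continue
        # Restricted matching: only unmatched neighbors on the same side.
        candidates = sorted(
            u for u in neighbors[v]
            if vertex_map[u] == -1 and part[u] == part[v]
        )
        vertex_map[v] = len(coarse_part)
        if candidates:
            vertex_map[rng.choice(candidates)] = len(coarse_part)
        coarse_part.append(part[v])  # the coarse vertex inherits the side

    # Project hyperedges; one that shrinks to a single coarse vertex had all
    # of its vertices on one side, so dropping it cannot change the cut.
    coarse_hyperedges = []
    for e in hyperedges:
        ce = sorted({vertex_map[v] for v in e})
        if len(ce) > 1:
            coarse_hyperedges.append(ce)
    return coarse_hyperedges, coarse_part, vertex_map


# Toy instance (made up): the bisection cuts hyperedges [2, 3] and [0, 5].
toy_edges = [[0, 1, 2], [2, 3], [3, 4, 5], [0, 5]]
toy_part = [0, 0, 0, 1, 1, 1]
coarse_edges, coarse_part, vmap = restricted_coarsen(toy_edges, toy_part)
```

Because merges never cross the bisection, `cut_size(coarse_edges, coarse_part)` equals `cut_size(toy_edges, toy_part)` for any seed, which is the invariant that lets refinement restart from the coarse level without losing the current solution.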
Given a hypergraph $H_i$ and its partitioning $P_i$, restricted coarsening will collapse together only vertices that belong to the same partition. That is, if $A$ and $B$ are the two partitions, we only collapse together vertices that either all belong to partition $A$ or all belong to partition $B$. The partitioning $P_{i+1}$ of the next level coarser hypergraph $H_{i+1}$ is computed by simply inheriting the partition from $H_i$. For example, if a set of vertices $\{v_1, v_2, v_3\}$ from partition $A$ are collapsed together to form vertex $u$ of $H_{i+1}$, then vertex $u$ belongs to partition $A$ as well. By constructing $H_{i+1}$ and $P_{i+1}$ in this way, we ensure that the number of hyperedges cut by the partitioning in $H_{i+1}$ is identical to the number of hyperedges cut in $H_i$. The set of vertices to be collapsed together in $H_i$ by this restricted coarsening scheme can be selected by using any of the coarsening schemes described in Section II-A, namely, edge coarsening, hyperedge coarsening, or modified hyperedge coarsening.

Due to the randomization in the coarsening phase, successive runs of the multiphase refinement algorithm can lead to additional reductions in the hyperedge cut. Thus, the multiphase refinement algorithm can be performed iteratively. Note that during the refinement phase, we only propagate a single partitioning; thus, multiphase refinement is quite fast.

In the context of our multilevel hypergraph-partitioning algorithm, this new multiphase refinement can be used in a number of ways. In the remainder of this section, we describe three such approaches.

1) V-Cycle: In this scheme, we take the best solution obtained from the multilevel partitioning algorithm and we improve it using multiphase refinement repeatedly. We stop the multiphase refinement when the solution quality cannot be improved further. The number of multiphase refinement steps performed is problem dependent and, in general, it increases as the size of the hypergraph increases. This is due to the larger solution space of the large hypergraphs.
2) v-Cycle: Our experience with the multilevel partitioning algorithm has shown that refining multiple solutions is expensive, especially during the final uncoarsening levels when the size of the contracted hypergraphs is large. One way to reduce the high cost of refining multiple solutions during the final uncoarsening levels is to select the best partitioning at some point in the uncoarsening phase and further refine only this best partitioning using multiphase refinement. This is the idea behind the v-cycle refinement. In particular, let $H_{m/2}$ be the coarse hypergraph at the midpoint between $H_0$ (original hypergraph) and $H_m$ (coarsest hypergraph). Let $P_{m/2}$ be the best partitioning at $H_{m/2}$. We then use ($H_{m/2}$, $P_{m/2}$) as the input to multiphase refinement. Since $H_{m/2}$ is relatively small, as compared to $H_0$, multiphase refinement converges in a small number of iterations. By using v-cycles, we can significantly reduce the amount of time spent in the refinement phase, especially for large hypergraphs. However, the overall quality can potentially decrease because we may not have picked up the best overall partitioning at $H_{m/2}$.

3) vV-Cycle: We can combine both v-cycles and V-cycles in the algorithm to obtain high-quality partitioning in a small amount of time. In this scheme, we use v-cycles to partition the hypergraph followed by the V-cycles to further improve the partition quality. V-cycles used in this way are particularly effective in significantly improving the hyperedge cut.

IV. EXPERIMENTAL RESULTS

We experimentally evaluated the quality of the bisections produced by our multilevel hypergraph-partitioning algorithm on a large number of hypergraphs that are part of the widely used ACM/SIGDA circuit partitioning benchmark suite [30]. The characteristics of these hypergraphs are shown in Table I.

TABLE I. CHARACTERISTICS OF THE VARIOUS HYPERGRAPHS USED TO EVALUATE THE MULTILEVEL HYPERGRAPH PARTITIONING ALGORITHMS

We performed all of our experiments on an SGI Challenge that has MIPS R10000 processors running at 200 MHz, and all of the reported run times are in seconds. All of the reported partitioning results were obtained by forcing a 45–55 balance condition. As discussed in Sections II-A, II-B, and II-C, there are many alternatives for each of the three different phases of a multilevel algorithm. Due to space limitations, we are not able to provide a comprehensive comparison of the various parameters. However, this comparison can be found in the full version of this paper, which is available on the World Wide Web at: http://www.cs.umn.edu/karypis/publications. In the remainder of this section, we present comparisons of our scheme with other partitioning schemes available in the literature.

A. Comparison with Other Partitioning Algorithms

To compare the performance of the bisections produced by our multilevel hypergraph bisection and multiphase refinement algorithms, both in terms of bisection quality and run time, we created Table II. Table II shows the sizes of the hyperedge cuts produced by our algorithms (hMETIS) and those reported by various previously developed hypergraph bisection algorithms. In particular, Table II contains results for the following algorithms: PROP [11], the two schemes of [12], Optimized KLFM (scheme by Hauck and Borriello [20]), GMetis [25], PARABOLI [26], and GFM [32]. Note that for certain circuits, there are missing results for some of the algorithms. This is because no results were reported for these circuits. The column labeled Best shows the minimum cut obtained for each circuit by any of the earlier algorithms. Essentially, this column represents the quality that would have been obtained if all of the algorithms had been run and the best partition was selected.

TABLE II. PERFORMANCE OF OUR MULTILEVEL HYPERGRAPH BISECTION ALGORITHM (hMETIS) AGAINST VARIOUS PREVIOUSLY DEVELOPED ALGORITHMS

The last four columns of Table II show the partitionings produced by our multilevel hypergraph bisection and refinement algorithms. In particular, the column labeled hMETIS-EE20 corresponds to the best partitioning produced from 20 runs of our multilevel algorithm that uses FM-EE during refinement. Of these 20 runs, ten runs used HEC and ten runs used MHEC. The column labeled hMETIS-FM20 corresponds to the best partitioning produced from 20 runs when FM is used during refinement and coarsening is performed similarly to hMETIS-EE20. In both of these schemes, we used random initial partitionings during the initial partitioning phase. The column labeled hMETIS-EE10vV corresponds to the best partitioning produced from ten runs of our multilevel partitioning algorithm that uses the vV-cycle refinement scheme. These results were obtained using MHEC for coarsening and FM-EE for refinement. Finally, the column labeled hMETIS-FM20vV corresponds to the best partition produced from 20 runs that use the vV-cycle refinement scheme. Out of these runs, ten used HEC and ten used MHEC for coarsening, and the refinement was done using FM.

To make the comparison with previous algorithms easier, we computed the total number of hyperedges cut by each algorithm, as well as the percentage improvement in the cut achieved by our algorithms over previous algorithms. This cut improvement was computed as the average improvement on a circuit-by-circuit level. Looking at these results, we see that all four of our algorithms produce partitionings whose quality is better than that produced by any of the previous algorithms. In particular, hMETIS-EE20 is 4.1% and 5.3% better than the two schemes of [12], 6.2% better than PROP, 7.8% better than GFM, 9.9% better than Optimized KLFM, 10.0% better than GMetis, and 21.4% better than PARABOLI. If all of these algorithms are considered together, hMETIS-EE20 is still better by 0.3%.
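To make the "average improvement on a circuit-by-circuit level" concrete, the sketch below averages the per-circuit percentage reduction in cut over the circuits for which both algorithms report a result. Two points are assumptions of this note rather than statements from the paper: the improvement on a circuit is taken relative to the competing algorithm's cut, and circuits with a missing result are simply skipped. The function name and the numbers are made up for illustration and are not values from Table II.

```python
def average_improvement(ours, theirs):
    """Mean per-circuit percentage reduction in cut over shared circuits.

    `ours` and `theirs` map circuit names to hyperedge-cut sizes. The
    improvement on one circuit is 100 * (theirs - ours) / theirs, which is
    an assumed convention.
    """
    shared = [c for c in ours if c in theirs]
    if not shared:
        raise ValueError("no circuit has results for both algorithms")
    gains = [100.0 * (theirs[c] - ours[c]) / theirs[c] for c in shared]
    return sum(gains) / len(gains)


# Made-up cuts: circuit "c3" has no reported result for the other algorithm.
ours = {"c1": 90, "c2": 38, "c3": 200}
theirs = {"c1": 100, "c2": 40}
avg = average_improvement(ours, theirs)
total_based = 100.0 * (140 - 128) / 140  # improvement computed on total cut
```

Here the per-circuit improvements are 10% and 5%, so `avg` is 7.5, while the improvement computed from the summed cuts (`total_based`) is about 8.6. The two measures differ because the per-circuit average weights small and large circuits equally.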
Comparing hMETIS-EE20 with hMETIS-FM20, we see that hMETIS-FM20 is about 1.1% better than hMETIS-EE20, and about 1.4% better than all of the previous schemes combined. In particular, hMETIS-FM20 was able to improve the best-known bisections for eight out of the 23 test circuits. Looking at the quality of the partitionings produced by the two schemes that use the multilevel hypergraph refinement (vV-cycles), we see that these schemes are able to produce very good results. In particular, hMETIS-FM20vV is about 2.0% better than hMETIS-EE20 and 0.9% better than hMETIS-FM20. hMETIS-FM20vV seems to be the overall best scheme, producing partitionings whose quality is better than any of the previous schemes and 2.3% better than the Best.

Fig. 4. The relative performance of hMETIS-FM20vV compared to rest of the schemes on the large benchmarks (with 10 K or more nodes).

The last sub-table of Table II shows the total amount of time required by the various partitioning algorithms. These run times are in seconds on the respective architectures. Because of the difference in central processing unit (CPU) speed of the various machines, it is hard to make direct comparisons. However, we tested our code on a Sparc5 and found that it requires about four times more time than when it is running on the R10000. Taking into consideration a scaling factor of four, we see that both hMETIS-EE and hMETIS-FM require less time than either PROP, the two schemes of [12], PARABOLI, or GFM. In particular, hMETIS-EE is about four times faster than PROP, nine times faster than the two schemes of [12], and much faster than PARABOLI, GFM, and Optimized KLFM. Compared to GMetis, we see that hMETIS-EE requires roughly the same time, whereas hMETIS-FM is about twice as slow.
Note that GMetis runs METIS 100 times on each graph, but each of these runs is substantially faster than hMETIS, partly because METIS is a highly optimized code for graphs, and partly because coarsening and refinement on hypergraphs is more complex than the refinement schemes used in METIS for graphs. However, both hMETIS-EE and hMETIS-FM produce bisections that cut substantially fewer hyperedges than GMetis. Looking at the amount of time required by hMETIS-EE10vV and hMETIS-FM20vV, we see that, by using multiphase refinement, we were in general able to further reduce the amount of time required by our partitioning algorithms. In particular, hMETIS-EE requires only 409 s to partition all 23 circuits, whereas hMETIS-FM requires 1513 s.

V. CONCLUSIONS AND FUTURE WORK

As the experiments in Section IV show, the multilevel paradigm is very successful in producing high-quality hypergraph partitionings in a relatively small amount of time. The multilevel paradigm is successful for the following reasons. The coarsening phase is able to generate a sequence of hypergraphs that are good approximations of the original hypergraph. The initial partitioning algorithm is then able to find a good partitioning by essentially exploiting global information of the original hypergraph. Finally, the iterative refinement at each uncoarsening level is able to significantly improve the partitioning quality because it moves successively smaller subsets of vertices between the two partitions. Thus, in the multilevel paradigm, a good coarsening scheme results in a coarse graph that provides a global view that permits computation of a good initial partitioning, and the iterative refinement performed during the uncoarsening phase provides a local view to further improve the quality of the partitioning.

The multilevel hypergraph-partitioning algorithm presented here is quite fast and robust. Even a single run of the algorithm is able to find reasonably good bisections.
With a small number of runs (e.g., 20), our algorithm is able to find better bisections than those found by all previously known algorithms for many of the well-known benchmarks.

Our algorithm scales quite well for large hypergraphs. Due to the multilevel paradigm, the number of runs required to obtain high-quality bisections does not increase as the size of the hypergraph increases. High-quality bisections of hypergraphs with over 100 000 vertices are obtained in a few minutes on today's workstations. Also, since the coarsening phase runs in time proportional to the size of the hypergraph, the run time of the scheme increases linearly with hypergraph size. Furthermore, the scheme appears to be more powerful relative to the other schemes for larger hypergraphs (refer to Fig. 4). Restricting our comparisons to only the larger hypergraphs (with 10 K or more nodes) in the benchmark set, we find that hMETIS-FM20vV performs 29.5%, 15.8%, 11.3%, 20.4%, 14.6%, 16.3%, and 8.4% better than PROP, the two schemes of [12], PARABOLI, GFM, GMetis, and Optimized KLFM, respectively.

Note that the hypergraph-based multilevel scheme, as presented in this paper, significantly outperforms the graph-based multilevel scheme GMetis [25] that used METIS [21] to compute bisections of graph approximations of a hypergraph. The reasons for this performance difference are as follows. First, hypergraph-based coarsening causes a much greater reduction of the exposed hyperedge weight of the coarsest level hypergraph and, thus, provides much better initial partitions than those obtained with edge-based coarsening. Second, the refinement in the hypergraph-based multilevel scheme directly minimizes the size of the hyperedge cut rather than the edge cut of the inaccurate graph approximation of the hypergraph.
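The gap between the hyperedge cut and the edge cut of a graph approximation can be seen on a very small example. The sketch below assumes a unit-weight clique expansion (every hyperedge replaced by edges between all pairs of its vertices); this is one common graph model chosen here for illustration, and it is not claimed to be the approximation used by GMetis. The function names and the toy hypergraph are hypothetical.

```python
from itertools import combinations


def hyperedge_cut(hyperedges, part):
    """Number of hyperedges that span both sides of the bisection."""
    return sum(1 for e in hyperedges if len({part[v] for v in e}) == 2)


def clique_edge_cut(hyperedges, part):
    """Edge cut after expanding each hyperedge into a unit-weight clique."""
    return sum(
        1
        for e in hyperedges
        for u, v in combinations(e, 2)
        if part[u] != part[v]
    )


# Toy hypergraph (made up): one 4-vertex net and three 2-vertex nets.
nets = [[0, 1, 2, 3], [0, 4], [1, 5], [6, 7]]

part_a = [0, 0, 1, 1, 0, 0, 1, 1]  # splits the 4-vertex net two and two
part_b = [0, 0, 0, 0, 1, 1, 1, 1]  # cuts the nets [0, 4] and [1, 5]

summary = {
    "a": (hyperedge_cut(nets, part_a), clique_edge_cut(nets, part_a)),
    "b": (hyperedge_cut(nets, part_b), clique_edge_cut(nets, part_b)),
}
```

Both bisections are balanced four to four. Bisection `part_a` cuts a single hyperedge but four clique edges, while `part_b` cuts two hyperedges but only two clique edges, so an edge-cut objective on this graph model prefers the bisection with the worse hyperedge cut.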
The power of hMETIS over GMetis is much more visible on the largest benchmark golem3, on which even the best of 100 different runs produced a cut that is 50% worse than ten runs of hMETIS-FM. hMETIS also significantly outperforms Optimized KLFM [20] by Hauck and Borriello even though they used powerful refinement schemes (FM with the scheme of [9]). This is primarily due to the more powerful HEC schemes used in hMETIS.

It may be possible to improve the quality of the bisection produced by this algorithm in many ways. Further research may identify better coarsening schemes that are suitable for a wider class of hypergraphs. New powerful variants of the FM refinement schemes have been developed recently by Dutt et al. [11], [12]. It will be instructive to include such a refinement scheme during the uncoarsening phase to see if it makes the multilevel scheme more robust. However, it is unclear if the added cost of these more powerful refinement schemes will result in a cost-effective improvement in the size of the bisection because additional trials of the multilevel scheme could potentially improve the bisection.

ACKNOWLEDGMENT

Access to computing facilities was provided by AHPCRC and the Minnesota Supercomputer Institute. The algorithms described in this paper are part of the hMETIS hypergraph-partitioning package available via the World Wide Web at URL: http://www.cs.umn.edu/metis.

REFERENCES

[1] C. J. Alpert and A. B. Kahng, "Recent directions in netlist partitioning," Integr. VLSI J., vol. 19, no. 1–2, pp. 1–81, 1995.
[2] S. Shekhar and D. R. Liu, "Partitioning similarity graphs: A framework for declustering problems," Inf. Syst. J., vol. 21, no. 4, pp. 475–496, 1996.
[3] B. Mobasher, N. Jain, E. H. Han, and J. Srivastava, "Web mining: Pattern discovery from world wide web transactions," Dept. Comput. Sci., Univ. Minnesota, Minneapolis, MN, Tech. Rep. TR-96-050, 1996.
[4] C. Berge, Graphs and Hypergraphs. Amsterdam, The Netherlands: Elsevier, 1976.
[5] M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness. San Francisco, CA: Freeman, 1979.
[6] D. G. Schweikert and B. W. Kernighan, "A proper model for the partitioning of electrical circuits," in Proc. ACM/IEEE Design Automation Conf., 1972, pp. 57–62.
[7] B. W. Kernighan and S. Lin, "An efficient heuristic procedure for partitioning graphs," Bell Syst. Tech. J., vol. 49, no. 2, pp. 291–307, 1970.
[8] C. M. Fiduccia and R. M. Mattheyses, "A linear time heuristic for improving network partitions," in Proc. 19th IEEE Design Automation Conf., 1982, pp. 175–181.
[9] B. Krishnamurthy, "An improved min-cut algorithm for partitioning VLSI networks," IEEE Trans. Comput., vol. C-33, pp. 438–446, May 1984.
[10] Y. Saab, "A fast and robust network bisection algorithm," IEEE Trans. Comput., vol. 44, pp. 903–913, July 1995.
[11] S. Dutt and W. Deng, "A probability-based approach to VLSI circuit partitioning," in Proc. ACM/IEEE Design Automation Conf., 1996.
[12] S. Dutt and W. Deng, "VLSI circuit partitioning by cluster-removal using iterative improvement techniques," in Proc. Physical Design Workshop, 1996.
[13] T. Bui et al., "Improving the performance of the Kernighan–Lin and simulated annealing graph bisection algorithm," in Proc. ACM/IEEE Design Automation Conf., 1989, pp. 775–778.
[14] L. Hagen and A. Kahng, "A new approach to effective circuit clustering," in Proc. IEEE Int. Conf. Computer-Aided Design, 1992, pp. 422–427.
[15] H. Shin and C. Kim, "A simple yet effective technique for partitioning," IEEE Trans. VLSI Syst., vol. 1, pp. 380–386, Sept. 1993.
[16] C. J. Alpert, L. W. Hagen, and A. B. Kahng, "A general framework for vertex orderings, with applications to netlist clustering," IEEE Trans. VLSI Syst., vol. 4, pp. 240–246, June 1996.
[17] T. Bui and C. Jones, "A heuristic for reducing fill in sparse matrix factorization," in 6th SIAM Conf. Parallel Processing Sci. Computing, 1993, pp. 445–452.
[18] B. Hendrickson and R. Leland, "A multilevel algorithm for partitioning graphs," Sandia Nat. Labs., Tech. Rep. SAND93-1301, 1993.
[19] J. Cong and M. L. Smith, "A parallel bottom-up clustering algorithm with applications to circuit partitioning in VLSI design," in Proc. ACM/IEEE Design Automation Conf., 1993, pp. 755–760.
[20] S. Hauck and G. Borriello, "An evaluation of bipartitioning technique," in Proc. Chapel Hill Conf. Advanced Res. VLSI, 1995.
[21] G. Karypis and V. Kumar, "METIS 3.0: Unstructured graph partitioning and sparse matrix ordering system," Dept. Computer Sci., Univ. Minnesota, Tech. Rep. 97-061, 1997.
[22] G. Karypis and V. Kumar, "A fast and high quality multilevel scheme for partitioning irregular graphs," SIAM J. Sci. Comput., to be published.
[23] A. Pothen, H. D. Simon, and K.-P. Liou, "Partitioning sparse matrices with eigenvectors of graphs," SIAM J. Matrix Analysis Applicat., vol. 11, no. 3, pp. 430–452, 1990.
[24] S. T. Barnard and H. D. Simon, "A fast multilevel implementation of recursive spectral bisection for partitioning unstructured problems," in Proc. 6th SIAM Conf. Parallel Processing Sci. Computing, 1993, pp. 711–718.
[25] C. Alpert and A. Kahng, "A hybrid multilevel/genetic approach for circuit partitioning," in Proc. 5th ACM/SIGDA Physical Design Workshop, 1996, pp. 100–105.
[26] B. M. Riess, K. Doll, and F. M. Johannes, "Partitioning very large circuits using analytical placement techniques," in Proc. ACM/IEEE Design Automation Conf., 1994, pp. 646–651.
[27] T. Lengauer, Combinatorial Optimization: Networks and Matroids. New York: Holt, Rinehart and Winston, 1976.
[28] E. Ihler, D. Wagner, and F. Wagner, "Modeling hypergraphs by graphs with the same mincut properties," Info. Process. Lett., vol. 45, no. 4, pp. 171–175, Mar. 1993.
[29] J. Li, J. Lillis, and C. K. Cheng, "Linear decomposition algorithm for VLSI design applications," in Proc. IEEE Int. Conf. Computer-Aided Design, 1995, pp. 223–228.
[30] F. Brglez, "ACM/SIGDA design automation benchmarks: Catalyst or anathema?," IEEE Design & Test, vol. 10, no. 3, pp. 87–91, 1993.
[31] C. H. Papadimitriou and K. Steiglitz, Combinatorial Optimization. Englewood Cliffs, NJ: Prentice-Hall, 1982.
[32] J. Li, J. Lillis, and C. Cheng, "Linear decomposition algorithm for VLSI design applications," in Proc. IEEE Int. Conf. Computer-Aided Design, 1995, pp. 223–228.

George Karypis received the Ph.D. degree in computer science from the University of Minnesota, Minneapolis. He is currently an Assistant Professor in the Department of Computer Science and Engineering, University of Minnesota. His current research interests span the areas of parallel algorithm design, applications of parallel processing in scientific computing and optimization, sparse matrix computations, and data mining. He has co-authored several journal articles and conference papers on these topics and Introduction to Parallel Computing (Reading, MA: Addison-Wesley, 1994). His research has resulted in the development of software libraries for serial and parallel unstructured graph partitioning (METIS and ParMETIS), and for parallel Cholesky factorization (PSPASES).

Rajat Aggarwal received the B.Tech. degree in electrical engineering from the Indian Institute of Technology, New Delhi, India, in 1995, and the M.Sc. degree in computer science from the University of Minnesota, Minneapolis, MN, in 1997. He is currently with the Lattice Semiconductor Corporation, Milpitas, CA, where he is involved in the development of logic optimization, mapping, and placement algorithms for CPLDs and FPGAs.

Shashi Shekhar (S'86–M'89–SM'96) received the B.Tech. degree in computer science from the Indian Institute of Technology, Kanpur, India, in 1985, and the M.S. degree in business administration and the Ph.D. degree in computer science from the University of California at Berkeley, Berkeley, CA, in 1989.
He is currently an Associate Professor in the Department of Computer Science and Engineering, and an active member of the Army High Performance Computing Research Center, as well as the Center for Transportation Studies, University of Minnesota, Minneapolis, MN. His research interests include databases, geographic information systems (GISs), and intelligent transportation systems. He has published over 100 research papers in refereed journals, conferences, workshops, and edited books. He was program co-chair of the 1996 ACM International Workshop on Advances in GIS. Dr. Shekhar is a senior member of the IEEE Computer Society, and a member of the ACM and AAAI. He is an editorial board member of the IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, and of the IEEE Computer Society Computer Science and Engineering Practice Board.

Vipin Kumar (S'78–M'82–SM'91) received the Ph.D. degree in computer science from the University of Maryland at College Park. He is currently a Professor in the Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN. His current research interests include parallel computing, parallel algorithms for scientific computing problems, and data mining. His research has resulted in the development of highly efficient parallel algorithms and software for sparse matrix factorization (PSPASES), graph partitioning (METIS and ParMETIS), and dense hierarchical solvers. His research in performance analysis resulted in the development of the isoefficiency metric for analyzing the scalability of parallel algorithms. He has authored over 100 research papers and coauthored Introduction to Parallel Computing (Reading, MA: Addison-Wesley, 1994). He has presented over 50 invited talks at various conferences, workshops, and national labs, and has served as chair/co-chair for many conferences/workshops in the area of parallel computing and high-performance data mining.
He serves on the editorial boards of Parallel Computing and the Journal of Parallel and Distributed Computing. Dr. Kumar is a member of the Society for Industrial and Applied Mathematics (SIAM) and the Association for Computing Machinery (ACM). He serves on the editorial board of the IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS. He has also served on the editorial board of the IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING (1993–1997).
Toledo - ECE - 1776
Smart Cookies: The Restricted Access CookieAndrew Miklas, Shvetank Jain October 18, 2006The cross-site scripting attack is widely prevalent and a number of real world attacks have been reported. One way to curb scripting attack would be to require
East Los Angeles College - GEOG - 5565
Geog5565 Introduction to Java Programming Unit 6 NotesApplets and Images The aims of this unit are to: introduce Applets provide an overview of how to build web pages explain how to create and display imagesOn completion of this unit you shoul
Toledo - ECE - 1776
Progress Report: Secure Isolated Copy-on-Write File SystemFareha Shafique October 17, 2006IntroductionThe proposed project involved developing a copy-on-write user-level file system using FUSE. A copy-on-write (CoW) file system allows reads to e
East Los Angeles College - GEOG - 5565
Geog5565 Introduction to Java Programming Unit 4 NotesArrays and Packages The aims of this unit are to: introduce single and multi-dimensional arrays introduce packages and how to import them show you how to build a basic raster GIS show you ho
Toledo - ECE - 1776
Secure Execution Via Program ShepherdingAuthors: Vladimir Kiriansky, Derek Bruening, Saman Amarasinghe Laboratory for Computer Science Massachusetts Institute of Technology Cambridge, MA 02139 Presenters: Fareha Shafique, Renee Warriner, Bernice Cha
East Los Angeles College - GEOG - 5565
Geog5561M Introduction to Java Programming Unit 4 NotesArrays and Packages The aims of this unit are to: introduce single and multi-dimensional arrays introduce packages and how to import them show you how to build a basic raster GIS show you h
Toledo - ECE - 1776
Outline Malicious Code: Trojans Viruses Worms Intrusion Detection Techniques Network based Intrusion Detection (Snort, Bro) Host based intrusion DetectionLecture 7: Intrusion Detection and Virus ProtectionDavid Lie ECE177612Does This