Biclustering Algorithms: A Survey

Amos Tanay, Roded Sharan, Ron Shamir

May 2004

Abstract

Analysis of large-scale genomics data, notably gene expression, initially focused on clustering methods. More recently, biclustering techniques were proposed for revealing submatrices that show unique patterns. We review some of the algorithmic approaches to biclustering and discuss their properties.

1 Introduction

Gene expression profiling has been established over the last decade as a standard technique for obtaining a molecular fingerprint of tissues or cells in different biological conditions [18, 7]. Based on the availability of whole genome sequences, the technology of DNA chips (or microarrays) allows the mRNA levels of thousands of genes to be measured simultaneously. The set (or vector) of gene expression levels measured under one condition (or sample) is called the profile of that condition. Gene expression profiles are powerful sources of information and have revolutionized the way we study and understand function in biological systems.

Given a set of gene expression profiles, organized together as a gene expression matrix with rows corresponding to genes and columns corresponding to conditions, a common analysis goal is to group conditions and genes into subsets that convey biological significance. In its most common form, this task translates to the computational problem known as clustering. Formally, given a set of elements with a vector of attributes for each element, clustering aims to partition the elements into (possibly hierarchically ordered) disjoint sets, called clusters, so that within each set the attribute vectors are similar, while vectors from different clusters are dissimilar. For example, when analyzing a gene expression matrix we may cluster the genes (as elements) given the matrix rows (as attributes), or cluster the conditions (as elements) given the matrix columns (as attributes).
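The two ways of clustering an expression matrix described above can be sketched in a few lines. This is only an illustration under stated assumptions: k-means is used as one familiar clustering method (the text does not prescribe it), the matrix `expr` is a made-up toy example, and the helper name `kmeans` and its `init_idx` parameter are hypothetical.

```python
import numpy as np

def kmeans(vectors, init_idx, n_iter=20):
    """Plain Lloyd-style k-means; init_idx selects the starting centroids."""
    centers = vectors[list(init_idx)].astype(float)
    for _ in range(n_iter):
        # Assign every element to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(vectors[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid as the mean of its current members.
        for k in range(len(centers)):
            if np.any(labels == k):
                centers[k] = vectors[labels == k].mean(axis=0)
    return labels

# Hypothetical toy expression matrix: 4 genes (rows) x 3 conditions (columns).
expr = np.array([
    [5.0, 5.1, 0.2],
    [4.9, 5.0, 0.1],
    [0.3, 0.2, 4.8],
    [0.1, 0.3, 5.2],
])

# Genes as elements, matrix rows as attribute vectors.
gene_labels = kmeans(expr, init_idx=(0, 2))

# Conditions as elements, matrix columns as attribute vectors:
# the same routine applied to the transposed matrix.
condition_labels = kmeans(expr.T, init_idx=(0, 2))
```

The only difference between the two calls is the transpose, which reflects the point that the choice of elements and attributes is made before the analysis starts.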
For reviews on clustering see an earlier chapter in this book.

Analysis via clustering makes several a priori assumptions that may not be perfectly adequate in all circumstances. First, clustering can be applied to either genes or samples, implicitly directing the analysis to a particular ...

Author affiliations:
Amos Tanay and Ron Shamir: School of Computer Science, Tel-Aviv University, Tel-Aviv 69978, Israel. {amos,rshamir}@post.tau.ac.il.
Roded Sharan: International Computer Science Institute, 1947 Center St., Berkeley CA 94704, USA. [email protected]

Figure 1: Clustering and biclustering of a gene expression matrix (rows: genes; columns: conditions; panels: gene clusters, condition clusters, biclusters). Clusters correspond to disjoint strips in the matrix. A gene cluster must contain all columns, and a condition cluster must contain all rows. Biclusters correspond to arbitrary subsets of rows and columns, shown here as rectangles.
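The distinction drawn in the Figure 1 caption can be made concrete with array indexing. The sketch below is illustrative only: the matrix values and the chosen row and column indices are arbitrary assumptions, and no biclustering algorithm is run; it just shows what each kind of object looks like as a slice of the matrix.

```python
import numpy as np

# Hypothetical 5-gene x 4-condition expression matrix (values are arbitrary).
expr = np.arange(20, dtype=float).reshape(5, 4)

# A gene cluster is a set of rows taken across ALL columns (a horizontal strip).
gene_cluster_rows = [0, 1]
gene_strip = expr[gene_cluster_rows, :]

# A condition cluster is a set of columns taken across ALL rows (a vertical strip).
condition_cluster_cols = [2, 3]
condition_strip = expr[:, condition_cluster_cols]

# A bicluster pairs an arbitrary row subset with an arbitrary column subset;
# np.ix_ builds the index grid that selects exactly that submatrix.
bicluster_rows, bicluster_cols = [1, 3, 4], [0, 2]
bicluster = expr[np.ix_(bicluster_rows, bicluster_cols)]

print(gene_strip.shape, condition_strip.shape, bicluster.shape)
```

Note that the rows and columns of the bicluster need not be adjacent in the matrix; they appear as a rectangle only once the matrix is reordered so that the selected rows and columns sit next to each other.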
This note was uploaded on 02/10/2012 for the course CSE 5615 taught by Professor Mitra during the Fall '11 term at FIT.