A Tree-Decomposition Approach to Protein Structure Prediction

Jinbo Xu
j3xu@theory.csail.mit.edu
Department of Mathematics and CSAIL, MIT
Cambridge, MA 02139

Feng Jiao
fjiao@cs.uwaterloo.ca
School of Computer Science, University of Waterloo
Waterloo, Canada N2L 3G1

Bonnie Berger
bab@csail.mit.edu
Department of Mathematics and CSAIL, MIT
Cambridge, MA 02139

Abstract

This paper proposes a tree decomposition of protein structures, which can be used to efficiently solve two key subproblems of protein structure prediction: protein threading for backbone prediction and protein side-chain prediction. To develop a unified tree-decomposition based approach to these two subproblems, we model them as a geometric neighborhood graph labeling problem. Theoretically, a low-degree polynomial-time algorithm can decompose a geometric neighborhood graph $G = (V, E)$ into components of size $O(|V|^{2/3} \log |V|)$. The computational complexity of the tree-decomposition based graph labeling algorithms is $O(|V| \Delta^{tw+1})$, where $\Delta$ is the average number of possible labels for each vertex and $tw$ $(= O(|V|^{2/3} \log |V|))$ is the tree width of $G$. Empirically, $tw$ is very small, and the tree-decomposition method can solve these two problems very efficiently. This paper also compares the computational efficiency of the tree-decomposition approach with that of the linear programming approach to these two problems and identifies the condition under which the tree-decomposition approach is more efficient than the linear programming approach. Experimental results indicate that the tree-decomposition approach is more efficient most of the time.

1 Introduction

The structure of a protein plays an instrumental role in determining its functions. Protein structures are important for understanding life processes and for drug discovery.
However, existing experimental methods such as X-ray crystallography and NMR techniques cannot generate protein structures in a high-throughput way. In order to produce protein structures on a large scale, the NIH has launched a protein structure initiative. This initiative aims to experimentally determine a few thousand unique protein structures within 10 years, so that most new proteins will have a similar structure in the Protein Data Bank (PDB). The structures of these new proteins can therefore be predicted using template-based methods such as homology modeling and protein threading. Computational approaches to protein structure prediction are becoming useful and successful, as demonstrated in recent CASP competitions [1, 2, 3]. Indeed, protein structure prediction tools are routinely used by structural biologists and pharmaceutical companies to analyze the structural features and functional characteristics of a protein....
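
To make the abstract's complexity claim more concrete, here is a minimal sketch of exact graph labeling whose cost is exponential only in the width of the graph, not in the number of vertices. It is not the authors' implementation: the paper works on an explicit tree decomposition, whereas this sketch swaps in variable elimination along a vertex ordering, a closely related dynamic-programming technique. The function names, the toy 4-cycle graph, and the score tables are all hypothetical and chosen only for illustration.

```python
from itertools import product


def min_energy_by_elimination(labels, unary, pairwise, order):
    """Minimum total score of a labeling, found by eliminating vertices one by one.

    labels:   dict vertex -> list of candidate labels
    unary:    dict vertex -> {label: score}
    pairwise: dict (u, v) -> {(label_u, label_v): score}
    order:    elimination order; it must list every vertex exactly once
    """
    # Each factor is (scope, table): table maps a tuple of labels,
    # one per vertex in scope, to a score.
    factors = [((v,), {(l,): s for l, s in scores.items()})
               for v, scores in unary.items()]
    factors += [((u, v), dict(table)) for (u, v), table in pairwise.items()]

    for v in order:
        touching = [f for f in factors if v in f[0]]
        factors = [f for f in factors if v not in f[0]]
        # Neighbors of v that are still present form the new factor's scope.
        scope = sorted({u for s, _ in touching for u in s if u != v})
        table = {}
        for assignment in product(*(labels[u] for u in scope)):
            chosen = dict(zip(scope, assignment))
            best = float("inf")
            for l in labels[v]:
                chosen[v] = l
                total = sum(t[tuple(chosen[u] for u in s)] for s, t in touching)
                best = min(best, total)
            table[assignment] = best  # v has been minimized out
        factors.append((tuple(scope), table))

    # Every remaining factor now has an empty scope.
    return sum(t[()] for _, t in factors)


def min_energy_brute_force(labels, unary, pairwise):
    """Reference answer: enumerate every complete labeling."""
    vertices = sorted(labels)
    best = float("inf")
    for assignment in product(*(labels[v] for v in vertices)):
        chosen = dict(zip(vertices, assignment))
        total = sum(unary[v][chosen[v]] for v in vertices)
        total += sum(t[(chosen[u], chosen[v])] for (u, v), t in pairwise.items())
        best = min(best, total)
    return best


# Hypothetical toy instance: the 4-cycle 0-1-2-3-0 with two labels per vertex.
labels = {v: [0, 1] for v in range(4)}
unary = {v: {l: (v + l) % 3 for l in labels[v]} for v in labels}
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
pairwise = {e: {(a, b): (2 if a == b else 0) for a in (0, 1) for b in (0, 1)}
            for e in edges}

print(min_energy_by_elimination(labels, unary, pairwise, order=[0, 1, 2, 3]))
print(min_energy_brute_force(labels, unary, pairwise))
```

Both calls should print the same value. In the elimination version, each step enumerates label combinations only over the eliminated vertex and its current neighbors, so the work per step grows with the size of that local neighborhood rather than with the whole graph. This is the same effect that a small tree width has in the tree-decomposition setting.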