A Tree-Decomposition Approach to Protein Structure Prediction

Jinbo Xu [email protected]
Department of Mathematics and CSAIL, MIT
Cambridge, MA 02139

Feng Jiao [email protected]
School of Computer Science, University of Waterloo
Waterloo, Canada N2L 3G1

Bonnie Berger [email protected]
Department of Mathematics and CSAIL, MIT
Cambridge, MA 02139

Abstract

This paper proposes a tree decomposition of protein structures, which can be used to efficiently solve two key subproblems of protein structure prediction: protein threading for backbone prediction and protein side-chain prediction. To develop a unified tree-decomposition based approach to these two subproblems, we model them as a geometric neighborhood graph labeling problem. In theory, a low-degree polynomial-time algorithm can decompose a geometric neighborhood graph $G = (V, E)$ into components of size $O(|V|^{2/3} \log |V|)$. The computational complexity of the tree-decomposition based graph labeling algorithms is $O(|V| \Delta^{tw+1})$, where $\Delta$ is the average number of possible labels for each vertex and $tw$ ($= O(|V|^{2/3} \log |V|)$) is the tree width of $G$. Empirically, $tw$ is very small and the tree-decomposition method can solve these two problems very efficiently. This paper also compares the computational efficiency of the tree-decomposition approach with that of the linear programming approach to these two problems and identifies the condition under which the tree-decomposition approach is more efficient than the linear programming approach. Experimental results indicate that the tree-decomposition approach is more efficient most of the time.

1 Introduction

The structure of a protein plays an instrumental role in determining its functions. Protein structures are important for the understanding of life processes and for drug discovery.
However, existing experimental methods such as X-ray crystallography and NMR techniques cannot generate protein structures in a high-throughput manner. In order to produce protein structures on a large scale, the NIH has launched a protein structure initiative. This initiative aims to experimentally determine a few thousand unique protein structures within 10 years, so that most new proteins will have a similar structure in the Protein Data Bank (PDB). The structures of these new proteins can then be predicted using template-based methods such as homology modeling and protein threading. Computational approaches to protein structure prediction are becoming useful and successful, as demonstrated in recent CASP competitions [1, 2, 3]. Indeed, protein structure prediction tools are routinely used by structural biologists and pharmaceutical companies to analyze the structural features and functional characteristics of a protein....
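
The graph labeling formulation summarized in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it uses min-sum vertex elimination along a greedy min-degree ordering, a standard technique that implicitly builds a tree decomposition whose width equals the induced width of the ordering. The toy graph, the cost functions, and all function names (`min_degree_order`, `make_factor`, `eliminate`, `single`, `pair`) are assumptions made purely for illustration.

```python
from itertools import product

def min_degree_order(vertices, edges):
    """Greedy min-degree elimination order and its induced width
    (an upper bound on the tree width of the graph)."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    order, width = [], 0
    while adj:
        v = min(adj, key=lambda x: (len(adj[x]), x))
        nbrs = adj.pop(v)
        width = max(width, len(nbrs))
        for a in nbrs:
            adj[a].discard(v)
            adj[a] |= nbrs - {a}  # connect the remaining neighbors of v
        order.append(v)
    return order, width

def make_factor(scope, labels, fn):
    """Tabulate a cost function over every label combination of `scope`."""
    scope = tuple(scope)
    combos = product(*(labels[v] for v in scope))
    return scope, {c: fn(*c) for c in combos}

def eliminate(factors, labels, order):
    """Min-sum elimination. Returns (minimum energy, optimal labeling)."""
    factors = list(factors)
    trace = []
    for v in order:
        bucket = [f for f in factors if v in f[0]]
        factors = [f for f in factors if v not in f[0]]
        nbrs = tuple(sorted({u for s, _ in bucket for u in s if u != v}))
        table, argmin = {}, {}
        for combo in product(*(labels[u] for u in nbrs)):
            ctx = dict(zip(nbrs, combo))
            best_cost, best_label = None, None
            for lab in labels[v]:
                ctx[v] = lab
                cost = sum(t[tuple(ctx[u] for u in s)] for s, t in bucket)
                if best_cost is None or cost < best_cost:
                    best_cost, best_label = cost, lab
            table[combo], argmin[combo] = best_cost, best_label
        factors.append((nbrs, table))
        trace.append((v, nbrs, argmin))
    energy = sum(t[()] for _, t in factors)  # only empty-scope factors remain
    labeling = {}
    for v, nbrs, argmin in reversed(trace):  # backtrack in reverse order
        labeling[v] = argmin[tuple(labeling[u] for u in nbrs)]
    return energy, labeling

# Hypothetical toy instance: a 6-cycle with one chord, 3 labels per vertex.
vertices = [0, 1, 2, 3, 4, 5]
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (1, 4)]
labels = {v: [0, 1, 2] for v in vertices}

def single(v):   # made-up singleton cost of giving vertex v label a
    return lambda a: (7 * v + 3 * a) % 5

def pair(u, v):  # made-up pairwise cost on edge (u, v)
    return lambda a, b: (u + 2 * v + a * b + a) % 4

factors = [make_factor((v,), labels, single(v)) for v in vertices]
factors += [make_factor(e, labels, pair(*e)) for e in edges]

order, width = min_degree_order(vertices, edges)
energy, labeling = eliminate(factors, labels, order)
print("induced width:", width, "min energy:", energy, "labeling:", labeling)
```

Each elimination step enumerates at most roughly $\Delta^{w+1}$ label combinations, where $w$ is the induced width of the ordering. This mirrors the $O(|V| \Delta^{tw+1})$ bound quoted in the abstract for the case where the ordering achieves the tree width, and it contrasts with the $\Delta^{|V|}$ combinations that exhaustive search would visit.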
- linear programming approach, IEEE Computational Systems Bioinformatics Conference