BIOINFORMATICS ORIGINAL PAPER
Vol. 25 no. 1 2009, pages 14–21
doi:10.1093/bioinformatics/btn569

Sequence analysis

Discovery of phosphorylation motif mixtures in phosphoproteomics data

Anna Ritz 1,∗, Gregory Shakhnarovich 2, Arthur R. Salomon 3 and Benjamin J. Raphael 1,4,∗

1 Department of Computer Science, Brown University, 2 Toyota Technological Institute at Chicago, Chicago, IL, 3 Department of Chemistry and Molecular Biology, Cell Biology, and Biochemistry and 4 Center for Computational Molecular Biology, Brown University, Providence, RI, USA

Received on August 1, 2008; revised on October 24, 2008; accepted on October 28, 2008
Advance Access publication November 7, 2008
Associate Editor: Burkhard Rost

ABSTRACT

Motivation: Modification of proteins via phosphorylation is a primary mechanism for signal transduction in cells. Phosphorylation sites on proteins are determined in part through particular patterns, or motifs, present in the amino acid sequence.

Results: We describe an algorithm that simultaneously discovers multiple motifs in a set of peptides that were phosphorylated by several different kinases. Such sets of peptides are routinely produced in proteomics experiments. Our motif-finding algorithm uses the principle of minimum description length to determine a mixture of sequence motifs that distinguish a foreground set of phosphopeptides from a background set of unphosphorylated peptides. We show that our algorithm outperforms existing motif-finding algorithms on synthetic datasets consisting of mixtures of known phosphorylation sites. We also derive a motif specificity score that quantifies whether or not the phosphoproteins containing an instance of a motif have a significant number of known interactions.
Application of our motif-finding algorithm to recently published human and mouse proteomic studies recovers several known phosphorylation motifs and reveals a number of novel motifs that are enriched for interactions with a particular kinase or phosphatase. Our tools provide a new approach for uncovering the sequence specificities of uncharacterized kinases or phosphatases.

Availability: Software is available at http://cs.brown.edu/people/braphael/software.html.

Contact: firstname.lastname@example.org; email@example.com

Supplementary information: Supplementary data are available at Bioinformatics online.

1 INTRODUCTION

Modification of proteins via phosphorylation is a primary mechanism for signal transduction in cells. Members of signaling pathways include kinases that phosphorylate proteins at tyrosine, serine or threonine residues and phosphatases that dephosphorylate proteins. Both kinases and phosphatases recognize their substrates in part through patterns, or motifs, present near the phosphorylation site in the amino acid sequence of the substrate. A number of such ...

∗ To whom correspondence should be addressed.
This note was uploaded on 04/06/2010 for the course COMPUTER S COSC1520 taught by Professor Paul during the Spring '09 term at York University.