tyrosulfation - BMC Bioinformatics Research article BioMed...

Info iconThis preview shows pages 1–2. Sign up to view the full content.

View Full Document Right Arrow Icon
Bio Med Central Page 1 of 10 (page number not for citation purposes) BMC Bioinformatics Open Access Research article Predicting sulfotyrosine sites using the random forest algorithm with significantly improved prediction accuracy ZhengRongYang Address: School of Biosciences, University of Exeter, UK Exeter EX4 5DE, UK Email: Zheng Rong Yang - z.r.yang@ex.ac.uk Abstract Background: Tyrosine sulfation is one of the most important posttranslational modifications. Due to its relevance to various disease developments, tyrosine sulfation has become the target for drug design. In order to facilitate efficient drug design, accurate prediction of sulfotyrosine sites is desirable. A predictor published seven years ago has been very successful with claimed prediction accuracy of 98%. However, it has a particularly low sensitivity when predicting sulfotyrosine sites in some newly sequenced proteins. Results: A new approach has been developed for predicting sulfotyrosine sites using the random forest algorithm after a careful evaluation of seven machine learning algorithms. Peptides are formed by consecutive residues symmetrically flanking tyrosine sites. They are then encoded using an amino acid hydrophobicity scale. This new approach has increased the sensitivity by 22%, the specificity by 3%, and the total prediction accuracy by 10% compared with the previous predictor using the same blind data. Meanwhile, both negative and positive predictive powers have been increased by 9%. In addition, the random forest model has an excellent feature for ranking the residues flanking tyrosine sites, hence providing more information for further investigating the tyrosine sulfation mechanism. A web tool has been implemented at http://ecsb.ex.ac.uk/ sulfotyrosine for public use. Conclusion: The random forest algorithm is able to deliver a better model compared with the Hidden Markov Model, the support vector machine, artificial neural networks, and others for predicting sulfotyrosine sites. The success shows that the random forest algorithm together with an amino acid hydrophobicity scale encoding can be a good candidate for peptide classification. Background Tyrosine sulfation is a posttranslational modification (PTM), which introduces a sulfate group to a tyrosine res- idue in a protein [1-3]. During the modification process, sulfation is catalysed by tyrosylprotein sulfotransferase [4]. A targeted tyrosine for sulfation is normally required to be exposed on a protein surface [5]. Previous studies have indicated that Sulfation is an important anticipator for extracellular protein-protein interactions [6,7]. Studies have shown that sulfation is related to various diseases when a malfunction of a cellular activity occurs. For instance, sulfotyrosine can alter the affinity in some chem- okine receptors leading to a downstream signalling cas- cade which affects the cells involved in acute and chronic events of cellular immunity [8]. Disease-related altera- tions at the non-reducing termini of chondroitin and der-
Background image of page 1

Info iconThis preview has intentionally blurred sections. Sign up to view the full version.

View Full DocumentRight Arrow Icon
Image of page 2
This is the end of the preview. Sign up to access the rest of the document.

This note was uploaded on 04/06/2010 for the course COMPUTER S COSC1520 taught by Professor Paul during the Spring '09 term at York University.

Page1 / 10

tyrosulfation - BMC Bioinformatics Research article BioMed...

This preview shows document pages 1 - 2. Sign up to view the full document.

View Full Document Right Arrow Icon
Ask a homework question - tutors are online