Preprint

Alignment of RNA Base Pairing Probability Matrices

Ivo L. Hofacker, Stephan H.F. Bernhart and Peter F. Stadler

Institut für Theoretische Chemie und Molekulare Strukturbiologie, Universität Wien, Währingerstraße 17, Vienna, A-1090, Austria and Bioinformatik, Institut für Informatik, Universität Leipzig, Kreuzstrasse 7b, Leipzig, D-04103, Germany

ABSTRACT

Motivation: Many classes of functional RNA molecules are characterized by highly conserved secondary structures but little detectable sequence similarity. Reliable multiple alignments can therefore be constructed only when the shared structural features are taken into account. Since multiple alignments are used as input for many subsequent methods of data analysis, structure-based alignments are indispensable in RNA bioinformatics.

Results: We present here a method to compute pairwise and progressive multiple alignments from the direct comparison of base pairing probability matrices. Instead of attempting to solve the folding and the alignment problem simultaneously, as in the classical Sankoff algorithm, we use McCaskill's approach to compute base pairing probability matrices, which effectively incorporate the information on the energetics of each sequence. A novel, simplified variant of Sankoff's algorithm can then be employed to extract the maximum weight common secondary structure and an associated alignment.

Availability: The programs pmcomp and pmmulti described in this contribution are implemented in Perl and are available on request from the authors. A web server is available at

Contact: Ivo L. Hofacker, Tel: ++43 1 4277 52738, Fax: ++43 1 4277 52793, [email protected]

INTRODUCTION

Many functional classes of RNA molecules, including tRNA, rRNA, RNase P RNA, and SRP RNA, exhibit a highly conserved secondary structure but little sequence homology. Reliable alignments thus have to take structural information into account.
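To make the notion of a base pairing probability matrix concrete, the following is a minimal Python sketch. It is not McCaskill's algorithm and does not use the loop-based energy model: it assumes a toy model in which every allowed base pair contributes a fixed bonus, and it enumerates all non-crossing structures by brute force, which is feasible only for very short sequences. The names `structures`, `pair_probabilities`, `pair_bonus` and `MIN_LOOP` are hypothetical and chosen for illustration only.

```python
import math

# Toy assumptions (not the loop-based energy model used by RNAfold):
# Watson-Crick and GU pairs are allowed, and a hairpin loop must
# enclose at least MIN_LOOP unpaired bases.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3


def structures(seq, i, j):
    """Yield every non-crossing structure on seq[i..j] as a tuple of (i, k) pairs."""
    if j - i <= MIN_LOOP:
        # Interval too short (or empty) to hold a pair: only the open structure.
        yield ()
        return
    # Case 1: position i is unpaired.
    yield from structures(seq, i + 1, j)
    # Case 2: position i pairs with some k, splitting the interval in two.
    for k in range(i + MIN_LOOP + 1, j + 1):
        if (seq[i], seq[k]) in PAIRS:
            for inner in structures(seq, i + 1, k - 1):
                for outer in structures(seq, k + 1, j):
                    yield ((i, k),) + inner + outer


def pair_probabilities(seq, pair_bonus=1.0):
    """Return the symmetric matrix P with P[i][k] = Boltzmann probability of pair (i, k).

    Each structure with m pairs gets the weight exp(pair_bonus * m); pair_bonus
    is an arbitrary toy parameter in units of kT.
    """
    n = len(seq)
    weights = [[0.0] * n for _ in range(n)]
    z = 0.0  # partition function: sum of weights over all structures
    for s in structures(seq, 0, n - 1):
        w = math.exp(pair_bonus * len(s))
        z += w
        for i, k in s:
            weights[i][k] += w
            weights[k][i] += w
    return [[w / z for w in row] for row in weights]


if __name__ == "__main__":
    for row in pair_probabilities("GGGAAACCC"):
        print(" ".join(f"{p:.2f}" for p in row))
```

For the sequence `GAAAC` only two structures exist under these assumptions, the open chain and the single pair between the first and last base, so the pair has probability e/(1+e), about 0.73. A real application would replace the enumeration with a dynamic programming recursion and a realistic energy model.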
Sankoff's algorithm (Sankoff, 1985), which solves the structure prediction and the alignment problem simultaneously, is computationally very expensive: $O(n^6)$ in CPU time and $O(n^4)$ in memory for a pair of sequences of length $n$. A further complication is that it requires the implementation of the full loop-based RNA energy model (Mathews et al., 1999). Currently available software packages such as foldalign (Gorodkin et al., 1997) and dynalign (Mathews & Turner, 2002) therefore implement only restricted versions.

In this contribution we describe a different approach. Instead of attempting to solve the alignment and the structure prediction problem simultaneously, we start from base pairing probability matrices predicted by means of McCaskill's algorithm (McCaskill, 1990), implemented in the RNAfold program of the Vienna RNA Package (Hofacker et al., 1994; Hofacker, 2003). The problem then becomes the alignment of the base pairing probability matrices. This appears to be an even harder threading