17-cs481-rna_structure.pdf - RNA STRUCTURE RNA Basics 23...

RNA STRUCTURE
RNA Basics: The RNA bases are A, C, G, and U. Canonical base pairs are A-U and G-C; G-U forms a wobble pair. A base can pair with only one other base at a time. G-C pairs have 3 hydrogen bonds and are more stable.
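The pairing rules on this slide can be written down as a small lookup. This is only an illustrative sketch: the names `ALLOWED_PAIRS` and `can_pair` are hypothetical and do not come from the slides.

```python
# Pairs listed on the slide: A-U and G-C, plus the G-U wobble pair.
# ALLOWED_PAIRS and can_pair are illustrative names, not from the slides.
ALLOWED_PAIRS = {frozenset("AU"), frozenset("GC"), frozenset("GU")}


def can_pair(x: str, y: str) -> bool:
    """Return True if bases x and y may pair under the rules above."""
    return frozenset((x.upper(), y.upper())) in ALLOWED_PAIRS


if __name__ == "__main__":
    for a, b in [("A", "U"), ("G", "C"), ("G", "U"), ("A", "G")]:
        print(a, b, can_pair(a, b))
```

Using unordered sets means `can_pair("U", "A")` and `can_pair("A", "U")` give the same answer, which matches the symmetric nature of base pairing.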
RNA Basics, types of RNA: transfer RNA (tRNA), messenger RNA (mRNA), ribosomal RNA (rRNA), small interfering RNA (siRNA), micro RNA (miRNA), small nucleolar RNA (snoRNA), and others.
RNA folding: Prediction of the secondary structure of an RNA given its sequence. The general problem is NP-hard because of "difficult" substructures such as pseudoknots. Most existing algorithms require too much memory (≥ O(n^2)) and run time (≥ O(n^3)), and are thus limited to smaller RNA sequences.
RNA Structural Levels: primary (the sequence, e.g. AAUCG...CUUCUUCCA), secondary, and tertiary.
RNA families: Rfam is a general non-coding RNA database (most of its data is taken from specific databases). It includes many families of non-coding RNAs and functional motifs, as well as their alignments and secondary structures.
RNA Secondary Structure elements: hairpin loop, junction (multiloop), bulge loop, single-stranded region, interior loop, stem, pseudoknot.
Example: 5S rRNA. E. coli 5S, 120 bases; T. thermophilus 5S, 120 bases.
Example: E. coli 16S rRNA 1542 bases
Example: E. coli 23S rRNA 2904 bases
Example: HIV, 9173 bases (Watts et al., Nature, 2009).
Binary Tree Representation of RNA Secondary Structure: An RNA structure can be represented as a binary tree. A node represents a base pair if two bases are shown, or a loop position if a base and a "gap" (dash) are shown. Pseudoknots are still not represented. The tree does not permit varying sequences: mismatches, insertions, and deletions. (Images: Eddy et al.)
Circular Representation (images: David Mount).
Examples of known interactions of RNA secondary structural elements: pseudoknot, kissing hairpins, hairpin-bulge contact. These patterns are excluded from the prediction schemes because computing them is too expensive.
Predicting RNA secondary structure: base pair maximization; minimum free energy (most common), e.g. Fold, Mfold (Zuker & Stiegler), RNAfold (Hofacker); multiple sequence alignment (use the known structure of an RNA with a similar sequence); covariance; stochastic context-free grammars.
Sequence Alignment as a method to determine structure: Bases pair in order to form backbones and determine the secondary structure. Aligning bases based on their ability to pair with each other gives an algorithmic approach to determining the optimal structure.
Simplifying Assumptions: RNA folds into one minimum free-energy structure. There are no knots (base pairs never cross). The energy of a particular base pair in a double-stranded region is sequence independent: neighbors do not influence the energy. The problem was solved by dynamic programming (Zuker and Stiegler, 1981).
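The "no knots" assumption can be stated precisely: two base pairs (i, j) and (k, l) cross when i < k < j < l. A minimal sketch of that check follows, assuming a structure is given as a list of index pairs; the function name `is_knot_free` and this input format are illustrative choices, not something the slides define.

```python
def is_knot_free(pairs):
    """Return True if no two base pairs cross, i.e. there are no
    pairs (i, j) and (k, l) with i < k < j < l.

    `pairs` is an iterable of (i, j) index tuples (assumed input format).
    """
    # Normalize each pair so the smaller index comes first, then sort by it.
    norm = sorted((min(i, j), max(i, j)) for i, j in pairs)
    for a in range(len(norm)):
        i, j = norm[a]
        for b in range(a + 1, len(norm)):
            k, l = norm[b]
            # Because of the sort, k >= i; the pairs cross only if
            # k opens inside (i, j) and l closes outside it.
            if i < k < j < l:
                return False
    return True


if __name__ == "__main__":
    print(is_knot_free([(0, 7), (1, 6), (2, 5)]))  # nested stem
    print(is_knot_free([(0, 5), (3, 8)]))          # crossing pairs
```

The quadratic loop is fine for illustration; nested and side-by-side pairs pass, while a pseudoknot-like arrangement is rejected.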
Base Pair Maximization [figure: bases U C C A G G A C]. Zuker (1981) Nucleic Acids Research 9(1) 133-149
Base Pair Maximization, Dynamic Programming Algorithm: Maximizing base pairing. S(i,j) is the folding of the subsequence of the RNA strand from index i to index j that results in the highest number of base pairs.
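The definition of S(i,j) above can be turned into a small dynamic program. The sketch below uses the textbook Nussinov-style recurrence, which is an assumption here because the slide text shown does not spell out the recurrence: either base j is unpaired, or it pairs with some base k between i and j, splitting the problem into two independent subproblems. The function name `max_base_pairs` and the `min_loop` parameter (minimum number of unpaired bases enclosed by a pair) are illustrative, and the table stores only the pair count, not the folding itself.

```python
def max_base_pairs(seq, min_loop=0):
    """Maximum number of non-crossing base pairs in seq.

    Allowed pairs: A-U, G-C and the G-U wobble pair.
    min_loop is a hypothetical parameter, not taken from the slides.
    """
    allowed = {frozenset("AU"), frozenset("GC"), frozenset("GU")}
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0
    # S[i][j] = best pair count for the subsequence seq[i..j] (inclusive).
    S = [[0] * n for _ in range(n)]
    for span in range(1, n):              # fill shorter subsequences first
        for i in range(n - span):
            j = i + span
            best = S[i][j - 1]            # case 1: base j stays unpaired
            for k in range(i, j - min_loop):
                if frozenset((seq[k], seq[j])) in allowed:
                    # case 2: k pairs with j; solve left and inner parts
                    left = S[i][k - 1] if k > i else 0
                    inner = S[k + 1][j - 1] if k + 1 <= j - 1 else 0
                    best = max(best, left + 1 + inner)
            S[i][j] = best
    return S[0][n - 1]


if __name__ == "__main__":
    print(max_base_pairs("GGGAAACCC", min_loop=3))
```

The three nested loops give O(n^3) time and the table takes O(n^2) memory, which is in line with the costs quoted earlier for folding algorithms.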
