© S. Will | Structure Prediction | Structure Probabilities

Part 1: RNA Structure and RNA Structure Prediction

Definitions

Definition (RNA Structure). Let $S \in \{A, C, G, U\}^*$ be an RNA sequence of length $n = |S|$. An RNA structure of $S$ is a set of base pairs

$P \subseteq \{ (i,j) \mid 1 \le i < j \le n,\ S_i \text{ and } S_j \text{ complementary} \}$

such that the degree of $P$ is at most one, i.e. for all $(i,j), (i',j') \in P$: $(i = i' \iff j = j')$ and $i \ne j'$.

Definitions II

Definition (Crossing). Two base pairs $(i,j)$ and $(i',j')$ are crossing iff $i < i' < j < j'$ or $i' < i < j' < j$. An RNA structure $P$ (of an arbitrary RNA sequence $S$) is crossing iff $P$ contains (at least) two crossing base pairs. Otherwise, $P$ is called noncrossing (nc) or nested.

Remarks

Synonyms: $(i,j) \in P$ is a base pair, bond, or arc.

Usually, we assume a minimal allowed size of a base pair (aka loop length) $m$. This adds the constraint $j - i > m$ to the definition of RNA structure.

Crossing base pairs form pseudoknots; crossing structures contain pseudoknots. The terms pseudoknot-free and noncrossing are synonymous for RNA structures.

As defined, an RNA structure describes the secondary structure of an RNA. We won't define or deal with tertiary structure in this course.

Prediction of RNA (Secondary) Structure

Definition (Problem of RNA noncrossing Secondary Structure Prediction by Base Pair Maximization).
IN: an RNA sequence $S$.
OUT: a noncrossing RNA structure $P$ of $S$ that maximizes $|P|$ (= the number of base pairs in $P$).

Remarks: We defined two variants of the problem, one with the additional requirement that structures are noncrossing and one without. Without this restriction the problem is NP-hard; with the restriction there is an efficient algorithm for solving the problem.
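The definitions of RNA structure and crossing can be checked mechanically. The following is a minimal sketch, not taken from the slides: the names `is_structure`, `is_crossing`, and `COMPLEMENTARY` are hypothetical, and it assumes that "complementary" means Watson-Crick pairs plus the G-U wobble pair, which the slides do not spell out. Positions are 1-based, as in the definitions.

```python
from itertools import combinations

# Assumption (not stated in the slides): Watson-Crick pairs plus G-U wobble.
COMPLEMENTARY = {("A", "U"), ("U", "A"), ("C", "G"),
                 ("G", "C"), ("G", "U"), ("U", "G")}


def is_structure(seq, pairs, m=0):
    """Return True iff `pairs` (1-based (i, j) tuples) is an RNA structure of `seq`.

    `m` is the minimal loop length; each pair must satisfy j - i > m.
    """
    n = len(seq)
    used = set()  # positions that already take part in a base pair
    for i, j in set(pairs):
        if not (1 <= i < j <= n):
            return False
        if j - i <= m:
            return False
        if (seq[i - 1], seq[j - 1]) not in COMPLEMENTARY:
            return False
        if i in used or j in used:  # degree of P must be at most one
            return False
        used.update((i, j))
    return True


def is_crossing(pairs):
    """Return True iff at least two base pairs in `pairs` cross."""
    return any(i < k < j < l or k < i < l < j
               for (i, j), (k, l) in combinations(set(pairs), 2))
```

For example, on the sequence `GGGAAACCC` the pairs `{(1, 9), (2, 8), (3, 7)}` form a nested structure, while `{(1, 7), (3, 9)}` is a valid structure that is crossing.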
Maximizing base pairs helps to understand the more realistic case of minimizing energy. RNA structure prediction is often called RNA folding.

Nussinov Algorithm: Matrix definition

Let $S$ be an RNA sequence of length $n$. The Nussinov algorithm solves the problem of RNA noncrossing secondary structure prediction by base pair maximization with input $S$.

Definition (Nussinov Matrix). The Nussinov matrix $N = (N_{ij})_{1 \le i \le n,\ i-1 \le j \le n}$ of $S$ is defined by

$N_{ij} := \max \{ |P| \mid P \text{ is a noncrossing RNA } ij\text{-substructure of } S \}$,

where we use:

Definition (RNA Substructure). An RNA structure $P$ of $S$ is called an $ij$-substructure of $S$ iff $P \subseteq \{i, \dots, j\}^2$.

Nussinov Algorithm: Recursive computation of $N_{i,j}$

Init (for $1 \le i \le n$): $N_{ii} = 0$ and $N_{i,i-1} = 0$.

Recursion (for $1 \le i < j \le n$): $N_{ij} = \max \{\, N_{i,j-1},\ \max_{i \le k < \dots} \dots$
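The recursion above breaks off in this text, so the following sketch does not reproduce the slide. It assumes the standard textbook form of the Nussinov recursion, $N_{ij} = \max\{ N_{i,j-1},\ \max_{i \le k < j-m,\ S_k, S_j \text{ complementary}} N_{i,k-1} + N_{k+1,j-1} + 1 \}$, with a minimal loop length $m$ as in the earlier remarks. The function names, the `COMPLEMENTARY` set (Watson-Crick plus G-U wobble), and the traceback routine are illustrative assumptions.

```python
# Assumption (not stated in the slides): Watson-Crick pairs plus G-U wobble.
COMPLEMENTARY = {("A", "U"), ("U", "A"), ("C", "G"),
                 ("G", "C"), ("G", "U"), ("U", "G")}


def nussinov(seq, m=0):
    """Fill the matrix N (1-based) for base pair maximization.

    N[i][j] is the maximal number of base pairs of a noncrossing
    ij-substructure; N[i][i] = N[i][i-1] = 0 are the init cases.
    """
    n = len(seq)
    N = [[0] * (n + 1) for _ in range(n + 2)]
    for length in range(2, n + 1):            # subsequence length j - i + 1
        for i in range(1, n - length + 2):
            j = i + length - 1
            best = N[i][j - 1]                # case: j is unpaired
            for k in range(i, j - m):         # case: j pairs with k, j - k > m
                if (seq[k - 1], seq[j - 1]) in COMPLEMENTARY:
                    best = max(best, N[i][k - 1] + N[k + 1][j - 1] + 1)
            N[i][j] = best
    return N


def traceback(seq, N, m=0):
    """Recover one optimal noncrossing structure as a set of 1-based pairs."""
    pairs = set()
    stack = [(1, len(seq))]
    while stack:
        i, j = stack.pop()
        if i >= j:
            continue
        if N[i][j] == N[i][j - 1]:            # j can stay unpaired
            stack.append((i, j - 1))
            continue
        for k in range(i, j - m):
            if ((seq[k - 1], seq[j - 1]) in COMPLEMENTARY
                    and N[i][j] == N[i][k - 1] + N[k + 1][j - 1] + 1):
                pairs.add((k, j))
                stack.append((i, k - 1))
                stack.append((k + 1, j - 1))
                break
    return pairs
```

Calling `nussinov("GGGAAACCC")` and reading `N[1][9]` gives the maximal number of base pairs for the whole sequence; `traceback` then returns one structure that attains it. Filling the matrix takes cubic time and quadratic space in the sequence length.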
This note was uploaded on 04/06/2010 for the course COMPUTER S COMP5647 taught by Professor Dr.ping during the Spring '10 term at York University.