nucleicacidstructure

nucleicacidstructure - Principles of nucleic acid structure...


Principles of nucleic acid structure
Biophysical Chemistry 1, Fall 2010
Reading assignment: Chap. 3
Web assignment: http://w3dna.rutgers.edu

Nucleic acids: phosphates, sugars, bases
[Figure: chain of nucleotide units, residues i-1 and i, with the chain direction marked. Surviving label text: "... of nucleotide i is joined to the 5′-oxygen of nucleotide i+1".]

The sugar-phosphate backbone
Backbone torsion angles of residue i:
  α = O3′(i-1)–P–O5′–C5′
  β = P–O5′–C5′–C4′
  γ = O5′–C5′–C4′–C3′
  δ = C5′–C4′–C3′–O3′
  ε = C4′–C3′–O3′–P(i+1)
  ζ = C3′–O3′–P(i+1)–O5′(i+1)
  χ = glycosidic torsion between sugar and base (marked in the figure)

Sugar puckering
FIGURE 3.5 (running heads: A Textbook of Structural Biology; Basics of Nucleic Acid Structure) Various sugar ring puckering conformations. Those on the left are denoted S (for south); those on the right, N (for north). The C3′-endo conformation is seen at the top right, and the C2′-endo conformation at the top left. The notation of E and T conformations is also given. Superscript numbers preceding E or T refer to carbon atoms on the same side of the reference plane (horizontal line) as C5′. Subscripts following E or T denote atoms on the opposite side of ...

[Figure: pseudorotation wheel of sugar puckers. Phase angle P runs from 0° to 360° in 36° steps, with N at the top and S at the bottom; sectors carry E/T symbols and the labels C3′-endo, C2′-endo, C1′-endo, C4′-endo, C1′-exo, C2′-exo, C3′-exo, C4′-exo, and O4′-exo.]
... Conformational angles of P are divided into two categories, north (N) and so...

Purines and pyrimidines
[Figure: structures of adenine, guanine, uracil, thymine, and cytosine with ring atoms numbered.]
FIGURE 3.8 The most common bases found in nucleic acids: the top row is purin... row pyrimidines. The atom-numbering scheme of purines and pyrimidines is giv...

The glycosidic torsion parameter

Watson-Crick base pairing
FIGURE 3.13 The Watson–Crick base pairs. The sugar moieties are represented by R. Notice that the GC base pair on the left interacts via three hydrogen bonds, whereas the AT base pair on the right has only two. This makes the GC base pair and thus GC-rich DNA more stable than the AT base pair and AT-rich DNA.

... phosphodiester linkages between adjacent nucleotides. In the sugar-phosphate part the phosphate groups connect to the 3′ carbon of one deoxyribose moiety and the 5′ carbon of the next moiety, thereby linking successive deoxyriboses together. The two ends of a chain differ; the end where the 5′ carbon is not con- ...

A and B-form helices
The A ↔ B transition: the first known change of DNA double-helical state.
  Adding water favors B-DNA (10 bp/turn).
  Adding salt and/or alcohol favors A-DNA (11 bp/turn).
• A-DNA base pairs are inclined with respect to the helical axis and untwisted compared with B-DNA.
• The A-DNA minor groove is wider and shallower, and the major groove narrower and deeper, than in B-DNA.
• Base pairs are displaced from the A-DNA helical axis.
Textbook fragment: ... 10 bases per turn. One full turn measures 3.4 nm in the axial direction. ...

Base stacking
The overlap of successive base pairs depends on duplex form.
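The helix parameters quoted above (10 bp per turn and 3.4 nm per turn for B-DNA, 11 bp per turn for A-DNA) fix the average twist and rise per base-pair step. The following is a minimal sketch of that arithmetic; the function names are illustrative and are not taken from the slides.

```python
def twist_per_bp(bp_per_turn):
    """Average helical twist in degrees between successive base pairs."""
    return 360.0 / bp_per_turn


def rise_per_bp(pitch_nm, bp_per_turn):
    """Average axial rise in nm per base pair, given the length of one turn."""
    return pitch_nm / bp_per_turn


# Values quoted in the notes: B-DNA has 10 bp/turn and 3.4 nm/turn,
# A-DNA has 11 bp/turn.
b_twist = twist_per_bp(10)
a_twist = twist_per_bp(11)
b_rise = rise_per_bp(3.4, 10)

print(f"B-DNA twist: {b_twist:.1f} deg/bp, rise: {b_rise:.2f} nm/bp")
print(f"A-DNA twist: {a_twist:.1f} deg/bp")
```

The smaller twist per step for A-DNA is what the notes mean by "untwisted" relative to B-DNA.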
Concentrate on the base-pair structures
[Figure: base-pair and base-pair-step parameters, each drawn in a local x, y, z coordinate frame with strands I and II and the 5′ and 3′ ends marked. Parameters shown: Shear, Stretch, Stagger, Buckle, Propeller, Opening, Shift, Slide, Rise, Tilt, Roll, Twist.]

Getting DNA to bend
Combined B→A and B→C deformations tighten the bending of DNA.
Global bend: 360°/75 bp, left-handed superhelix.

The nucleosome core particle
FIGURE 3.21 Left: A dimer of histone proteins H3 (blue) and H4 (light blue). Right: Nucleosome structure. The octameric complex of histone proteins forms the center and the DNA is wound around. The color scheme of the histone subunits in the core particle is the same as in Fig. 3.20 (PDB: 1KX3).

DNA binding to the histone core proteins

Comparison to elastic rod models
Bending rigidity: $A = \dfrac{M\,(2\pi\nu_n)^2}{L\,P_n^4}$; persistence length $a = A/(k_B T)$
Twisting rigidity: $C = \dfrac{I\,(2L\nu_n)^2}{n^2}$
Stretching rigidity: $Y = \dfrac{M L\,(2\nu_n)^2}{n^2}$
Lord Rayleigh, The Theory of Sound, 1894

Bending rigidity for linear duplex DNA
d(GACT), 60 base pairs

  n   GB frequencies    Analytical frequencies
  1   0.114, 0.116      0.100
  2   0.294, 0.295      0.275
  3   0.522, 0.532      0.539

A = 2.44×10⁻¹⁹ erg·cm [2.26×10⁻¹⁹ erg·cm]
a = 594 Å [550 Å]

Stretching rigidity for linear duplex DNA
d(GACT), 60 base pairs

  n   GB frequencies    Analytical frequencies
  1   0.664             0.619
  2   1.289             1.237
  3   1.807             1.846

Y = 1502 pN [1000–1500 pN]

Salt dependence of bending and stretching are not the same
Exp: Baumann, Smith, Bloomfield, Bustamante, PNAS 94, 6185 (1997)
Theory: Bomble & Case, Biopolymers 89, 722 (2008)
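The relation a = A/(k_B T) quoted above converts a bending rigidity into a persistence length. As a rough check on the two rigidities given for d(GACT), here is a minimal sketch in CGS units. The temperature of 298.15 K is an assumption, since the slides do not state one, and the function name is illustrative.

```python
K_B_ERG_PER_K = 1.380649e-16  # Boltzmann constant in erg/K (CGS units)


def persistence_length_angstrom(bending_rigidity_erg_cm, temperature_k=298.15):
    """Persistence length a = A / (k_B T), returned in angstroms.

    temperature_k defaults to 298.15 K, which is an assumed value.
    """
    a_cm = bending_rigidity_erg_cm / (K_B_ERG_PER_K * temperature_k)
    return a_cm * 1.0e8  # 1 cm = 1e8 angstroms


a_from_A = persistence_length_angstrom(2.44e-19)   # rigidity quoted first
a_bracket = persistence_length_angstrom(2.26e-19)  # rigidity quoted in brackets

print(f"a = {a_from_A:.0f} A and {a_bracket:.0f} A")
```

Under that temperature assumption both results land within a few angstroms of the 594 Å and 550 Å quoted in the notes; the small difference depends on the temperature chosen.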
Sequence dependence of bending rigidity
[Figure: persistence length (Å) versus sequence (60 bp) for d(GACT), d(GC), d(AT), dG·dC, dA·dT, d(CTG), d(CGG), and sequences r1 to r30; the vertical axis runs from about 450 to 800 Å, and the dA·dT, dG·dC, and d(CGG) points are labeled.]

Now consider circular DNA
$\nu_n = f(\Omega, R, \Delta Tw, n, \rho)$
  R is the circle radius
  ΔTw is the excess twist
  Ω = C/A
  ρ is the mass density
Matsumoto, Tobias, Olson, JCTC 1, 117 (2005)

In-plane and out-of-plane modes for circular DNA

In-plane bending motions (modes n = 2 and n = 3 illustrated)

  Relaxed minicircle with 94 base pairs
  n   GB frequencies    Analytical frequencies
  2   0.165, 0.172      0.121
  3   0.394, 0.452      0.366
  4   0.672, 0.724      0.697

  Overtwisted minicircle with 94 base pairs
  n   GB frequencies    Analytical frequencies
  2   0.166, 0.178      0.116
  3   0.443, 0.471      0.377
  4   0.704, 0.778      0.734

Out-of-plane bending motions (modes n = 2 and n = 3 illustrated)

  Relaxed minicircle with 94 base pairs
  n   GB frequencies    Analytical frequencies
  2   0.217, 0.231      0.211
  3   0.525, 0.545      0.496
  4   0.793, 0.851      0.864

  Overtwisted minicircle with 94 base pairs
  n   GB frequencies    Analytical frequencies
  2   0.211, 0.248      0.235
  3   0.520, 0.542      0.551
  4   0.802, 0.901      0.958

Moving on to RNA
[Figure: backbone fragments with adjacent phosphates 5.9 Å apart for the C3′-endo pucker and 7.0 Å apart for the C2′-endo pucker.]
FIGURE 3.23 The two types of sugar pucker most commonly found in nucleic acids. The C3′-endo pucker is prevalent in RNA and A-form DNA, whereas the C2′-endo pucker is characteristic of B-form DNA. It is seen that the C3′-endo pucker produces a significantly shorter phosphate-phosphate distance in the backbone, resulting in a more compact helical conformation.

... 10 nucleotides per turn, RNA prefers the A-form with 11–12 nucleotides per turn. In DNA, the base pairs are centered over the helix axis. In an RNA double helix, the base pairs slide ∼5 Å away from the helix center. All these factors contribute to the tighter packing of the RNA double helix.
The surface of an RNA helix is also quite different from the DNA double helix. The major groove of RNA is very narrow and deep, accentuated by the fact that RNA does not have the thymine methyl group, which resides in the major groove. In contrast, the minor groove is wide and shallow. For this reason, the major and ...

RNA has more base-pairing possibilities
... tions, manifested in the degeneracy of the genetic code (Chap. 8). Figure 3.27 shows the GU wobble base pair, which is one of the most common alternative base pairing patterns, and the GU reverse wobble, where the uracil group is simply flipped around the axis of the amine hydrogen bond. The GU wobble base pairing results in the loss of a hydrogen bond from the guanine, but the vacant amino group often forms hydrogen bonds to other bases nearby, perhaps in concert with the neighboring imino group. The GU wobble base pairings can be viewed as a canonical Watson–Crick pattern, with a shift of the pyrimidine partner.

FIGURE 3.25 Left: Canonical Watson–Crick GC base pair (cis). Right: GC reverse Watson–Crick base pair (trans).
FIGURE 3.26 Left: AU Hoogsteen base pair. Center: AU reverse Hoogsteen base pair. Right: AU reverse Watson–Crick base pair. The blue dashed line shows the line of symmetry used to define the cis/trans conformation of the base pair. The AU Hoogsteen base pair is thus cis-H/WC, and the AU reverse Hoogsteen is trans-H/WC.
FIGURE 3.27 Left: GU wobble. Right: GU reverse wobble.

RNA uses chemically-modified bases
... If lysidine is absent, the tRNA will instead be recognized and mischarged by the CAT-recognizing methionyl tRNA ligase. So a single post-transcriptional modification is responsible for both the codon and amino acid specificity of this tRNA.

FIGURE 3.32 Examples of modified bases in RNA. Modifications are marked in red. R stands for ribose. Bases shown: 4-thiouracil (S4U), dihydrouracil (D), 3-methylcytosine (m3C), 5-methylcytosine (m5C), inosine (I), pseudouridine (Ψ), N6-methyladenine (m6A), lysidine (L), wyosine (Y).

RNA motifs: tetraloops
Tetraloops are a particularly common motif in RNA structures. An especially well-known case of this hairpin loop is the GNRA loop motif, which closes the ... the ribosomal RNAs after the normal Watson–Crick base pair.

FIGURE 3.36 Three-dimensional structures of various tetraloop folds. Left: GNRA loop from 5S rRNA (PDB: 1JJ2). The first G in the loop stabilizes the loop by hydrogen bonding to the fourth member. Middle: ANYA loop from MS2-RNA complex (PDB: 1DZS). Bases one and two form a stacking interaction, while bases three and four of the loop are looped out and poised to interact with other species. Right: UNCG tetraloop from 16S rRNA (PDB: 1BYJ). The first U and the last G in the tetraloop interact via hydrogen bonds, while bases one and two in the loop form a stacking interaction. The third base in the loop is available for interaction with other species. R stands for puRine; N stands for aNy; Y stands for pYrimidine.
Stems, bulges, loops, internal loops
[Figure: secondary structure of an RNA of about 120 nucleotides, numbered in steps of 10 from the 5′ end, with Helix 1 to Helix 5 and Loop A to Loop E labeled.]

Even mRNA has structure
... efficiency. These untranslated regions (UTRs) are present before the start codon (5′ UTRs) and after the stop codon (3′ UTRs). They contain areas of well-defined ...
[Figure labels: m7G cap, 5′ UTR, 3′ UTR, poly(A) tail.]
FIGURE 3.50 Schematic representation of eukaryotic mRNA showing the 5′ cap, the coding region (red), and the 5′ and 3′ UTRs.

More on RNA secondary structure
Secondary Structure: small subunit ribosomal RNA
[Figure: full secondary-structure diagram with domains I, II, and III and the 5′ and 3′ ends marked. Legend: Saccharomyces cerevisiae, mitochondrion (V00704); Fungi, Ascomycota; April 1994.]

Fortunately, a lot of interesting RNA is shorter than this, or can be thought of in terms of domains
• predicting secondary structure from sequence
• the inverse folding problem: finding
  sequences that are compatible with structure
• making the 2D ⇒ 3D transition
• force field simulations of nucleic acids

Nearest-neighbor energy function
Thermodynamics of stems and loops
Serra & Turner, Meth. Enzymol. 259, 242 (1995)
Lu, Turner & Mathews, NAR 34, 4912 (2006)

Hairpin: a six-nucleotide loop closed by a three-base-pair stem (C-G, G-C, G-C)
$\Delta G_{\mathrm{loop}} = \Delta G_{\mathrm{loop}}(6) + \Delta G(\cdots) = +2.9$ kcal/mol
$\Delta G_{\mathrm{stem}} = \Delta G\left(\begin{smallmatrix}GG\\CC\end{smallmatrix}\right) + \Delta G\left(\begin{smallmatrix}GC\\CG\end{smallmatrix}\right) = -6.3$ kcal/mol
$\Delta G_{\mathrm{hairpin}} = \Delta G_{\mathrm{loop}} + \Delta G_{\mathrm{stem}}$
Similar calculations for ΔH, ΔS, and T_melt

A slightly more complex structure:
... pairing in both strands, and a multibranch loop (helical junction) is a loop f...
... the XRNA program, available from the Santa Cruz RNA Center at http://rn...
[Figure: folded structure with a free energy increment written next to each motif.]
$\Delta G^\circ = -1.7 - 3.4 - 2.4 - 0.2 - 2.1 - 3.3 - 2.5 + 5.4 = -10.2$ kcal/mol
Fig. 2 Sample nearest neighbor calculation. The free energy increments of each motif are indicated and the total stability is the sum of each ...

From sequence to secondary structure:
1. A very nice folding program (Windows only) is RNAstructure: http://rna.urmc.rochester.edu/rnastructure.html
   • sequence editing, secondary structure prediction
   • dynalign, for multiple sequence alignment
   • partition function analysis, gives much improved confidence levels for the predictions
2. Prediction of secondary structure as a web service: http://www.bioinfo.rpi.edu/~zukerm/ (classical approach) or at http://sfold.wadsworth.org/ (statistical sampling of the Boltzmann distribution)
3. Reviews: D.H. Mathews, Theor. Chem. Acc. 116, 160 (2006); J. Mol. Biol. 359, 526 (2006)

RNAmotif: scanning genomes for secondary structure motifs
A simple example: looking for tetraloops

    descr
        h5( minlen=4, maxlen=6 )
            ss( seq="^uucg$" )
        h3

returns:

    AEURR16S   0.000  0  1035  12  tccc   ttcg  ggga
    AGIRRDX    0.000  0   180  14  ggagc  ttcg  gctcc
    AGIRRDX    0.000  0   181  12  gagc   ttcg  gctc
    AHNRRDX    0.000  0   181  12  gagc   ttcg  gctc
    ANNRRO     0.000  0    39  12  cccc   ttcg  gggg
    AOSRR18S   0.000  0   175  12  gccc   ttcg  gggc
    APBRR16SD  0.000  1  1321  12  tacc   ttcg  ggta
    APSRRE     0.000  0   227  12  cacc   ttcg  ggtg
    AQF16SRRN  0.000  1    39  12  agcg   ttcg  cgct
    AQF16SRRN  0.000  1    40  14  cagcg  ttcg  cgctg
    .....

Motif for an artificially generated DNA enzyme
Looking for an analogue of an artificial DNA enzyme

    parms
        wc += gu;
    descr
        h5( tag='h1', len=5, mispair=1 )
            ss( len=1, seq="r" )
            h5( tag='h2', minlen=5, maxlen=8, seq="^y", mispair=1 )
                ss( minlen=4, maxlen=200 )
            h3( tag='h2', seq="r$" )
            ss( tag='ggct', len=15, mismatch=1, seq="^ggctagcnacaacrr
        h3( tag='h1' )

Results:
Arabidopsis thaliana chr II sect 146/255:

    AC004747  1.000  1  15537  171  ttgtt  a  tccgc  ttt...(135)...tgggtgga  ggctaTccacaacaa  ggtgg

How can we generate descriptors?
• search with a strict secondary-structure (2°) descriptor
• choose sequences with low Turner energy (T.E.) and look for conserved nucleotides
• include the conserved nucleotides in the descriptor, and search with looser secondary-structure constraints
This general procedure can often be used to find conserved nucleotides in families of structures, and to search for specific types of RNA in genomic databases.

Example: the tRNA motif
[Figure: cloverleaf diagram with the acceptor stem, D-loop, anticodon stem, anticodon, variable loop, and TΨC-loop labeled; residues numbered 1 to 75; conserved and partly conserved positions marked; modified bases such as m2G, m5C, m7G, Ψ, D, Cm, Gm, and Y shown.]
FIGURE 3.52 (a) Secondary structure of tRNAPhe from yeast. (b) Schematic represe... three-dimensional folding of the tRNA molecule, using the same color scheme as ...

Looking for tRNAs
Generating an initial tRNA descriptor for E. coli
1. Descriptor with no sequence requirement. GU pairs are allowed, and no mispairs are allowed.
   Result: 26 tRNAs not found, 5 false positives, all with higher Turner energies than true tRNAs.
   [Figure: cloverleaf descriptor diagram with element lengths labeled 7, 5, 4-22, 8-11, 3-4, and 5; every position is N.]
2. Analyze sequence conservation for 50 hits with lowest Turner energies. Found 11 nucleotides that are 100% conserved.
3. Include conserved nucleotides in descriptor, but allow a mispair in each helix.
   Result: 2 tRNAs not found, no false positives.
   [Figure: the same descriptor diagram with conserved nucleotides filled in, including the CCA at the 3′ end.]

Generating an optimized tRNA descriptor for E. coli
Optimized tRNA descriptor for E. coli
[Figure: descriptor diagram distinguishing helix and single-strand elements, with lengths 7, 5, 4-22, 8-11, 3-4, 5, and 7 and the conserved nucleotides marked.]
(GU base pairs not allowed; one mispair per helix; no sequence mismatches)
Performance for K-12 and O157:H7: no false positives, one missing tRNA for K-12, one previously unidentified tRNA located

A more general bacterial tRNA descriptor
[Figure: descriptor diagram with the a.a. stem, D arm, a.c. arm, V loop, and T arm labeled; the length labels are not recoverable from the extracted text.]
• One mispair per helix is allowed
• GU pairs generally not allowed
• One sequence mismatch is allowed

Results for bacteria using this descriptor
[Table: per-organism counts; the headings and values are not recoverable from the extracted text.]
(We show below how to eliminate false positives.)
False positives can be further distinguished from true tRNAs using Turner energies.

Analysis using nearest-neighbor energies
Turner energies test
[Figure: Turner energies of true tRNAs and false positives for Bacillus subtilis, Aquifex aeolicus, Haemophilus influenzae, and Mycoplasma pneumoniae.]

Moving to eukaryotes
What about eukaryotes?

  organism        tRNAs   tRNAs with introns
  S. pombe        153     39
  S. cerevisiae   273     59
  arabidopsis     620     83
  drosophila      284     15
  C. elegans      584     34
  human           496     28

• S. cerevisiae: tRNAscan-SE found 275 tRNAs, 59 with introns
• we modified our descriptor to allow an 8-60 base insert after the sixth position of the anticodon loop
• rnamotif then found all but 3 of the tRNAs identified by tRNAscan-SE
• rnamotif also found 1301 false positives, but all had high Turner energies

“Solving” the inverse folding problem (yeast): Turner energy
tRNA folding energies for S. cerevisiae
[Figure: distribution of folding energies for true tRNAs and false positives; the axis values are not recoverable from the extracted text.] ...
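The Turner energies used here to separate true tRNAs from false positives are nearest-neighbor sums: each stacked pair or loop contributes an increment, and the total stability is their sum. A minimal sketch of that bookkeeping, using only the increments printed in the sample calculations earlier in these notes, is given below. The function name is illustrative, and no parameter table is looked up.

```python
def total_free_energy(increments):
    """Sum nearest-neighbor free energy increments (kcal/mol).

    The result is rounded to one decimal place, matching the precision
    of the tabulated increments.
    """
    return round(sum(increments), 1)


# Increments from the sample nearest-neighbor calculation (Fig. 2).
sample_increments = [-1.7, -3.4, -2.4, -0.2, -2.1, -3.3, -2.5, +5.4]
dg_sample = total_free_energy(sample_increments)

# Hairpin example: loop and stem contributions as quoted in the notes.
dg_loop = +2.9
dg_stem = -6.3
dg_hairpin = total_free_energy([dg_loop, dg_stem])

print(f"sample structure: {dg_sample} kcal/mol")
print(f"hairpin: {dg_hairpin} kcal/mol")
```

The first sum reproduces the −10.2 kcal/mol given for the sample structure. The hairpin total of −3.4 kcal/mol is not printed in the notes; it follows from the stated relation ΔG_hairpin = ΔG_loop + ΔG_stem.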

This note was uploaded on 01/15/2012 for the course CHE 543 taught by Professor Staff during the Fall '10 term at Syracuse.
