MIT6_047f08_lec08_slide08



[Figure: the Markov model (state diagram with start state q0)]
The input sequence:
AGCTAGCAGTATGTCATGGCATGTTCGGAGGTAGTACGTAGAGGTAGCTAGTATAGGTCGATAGTACGCGA
What is the best state labeling?
Courtesy of William Majoros. Used with permission. http://geneprediction.org/book/classroom.html

Finding The Most Likely Path
• A sensible choice is to choose π that maximizes P[π|x]
• This is equivalent to finding the path π* that maximizes the total joint probability P[x, π]:

$P(x, \pi) = \underbrace{a_{0\pi_1}}_{\text{start}} \prod_i \underbrace{e_{\pi_i}(x_i)}_{\text{emission}} \, \underbrace{a_{\pi_i \pi_{i+1}}}_{\text{transition}}$

How do we select π* efficiently?

A (Very) Simple HMM
[Figure: the Markov model, with start state q0 and states Donor T, Intron, Acceptor A, Donor G, Start Codon G, Start Codon T, Start Codon A, Intergenic, Exon, Acceptor G, Stop Codon G, Stop Codon T, Stop Codon A; below it, the input sequence, the most probable path, and the gene prediction (exon 1, exon 2, exon 3)]
The input sequence:
AGCTAGCAGTATGTCATGGCATGTTCGGAGGTAGTACGTAGAGGTAGCTAGTATAGGTCGATAGTACGCGA
Courtesy of William Majoros. Used with permission. http://geneprediction.org/book/classroom.html

A Real HMM Gene Predictor
Title page of journal article removed due to copyright restrictions. The article is the following: Krogh, Anders, I. Saira Mian, and David Haussler. "A Hidden Markov Model That Finds Genes in E.coli DNA." Nucleic Acids Research 22, no. 22 (1994): 4768-4778.

HMM Limitations
The HMM framework imposes constraints on state paths…
[Figure: the same Markov model state diagram as in "A (Very) Simple HMM", with start state q0]

Human Exon Lengths
[Figure: histogram of exon lengths; y-axis 0% to 50%]
Courtesy of Christopher Burge. Used with permission. Burge, MIT PhD Thesis

Fungal Intron Lengths
[Figure: histograms of intron lengths; panels N. crassa, C. neoformans, S....; x-axis: Nucleotides, in bins from 31-40 to 791-800; y-axis 0% to 50%]
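
One standard answer to the question "How do we select π* efficiently?" is the Viterbi dynamic program, which the slides visible in this preview do not show. The sketch below is a minimal Python version under assumed conventions. The two states N and C and every probability in START, TRANS, and EMIT are hypothetical illustration values, not taken from the lecture or from any real gene predictor. The joint probability follows the slide's start, emission, and transition terms, but omits a transition out of the last position, since no end state is defined here. Log probabilities are used to avoid underflow on long sequences.

```python
import math

# Hypothetical two-state toy model (illustration only, not from the lecture):
# N = AT-rich "noncoding" state, C = GC-rich "coding" state.
STATES = ["N", "C"]
START = {"N": 0.5, "C": 0.5}                      # a_{0k}
TRANS = {"N": {"N": 0.9, "C": 0.1},               # a_{jk}
         "C": {"N": 0.1, "C": 0.9}}
EMIT = {"N": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},   # e_k(x_i)
        "C": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1}}


def _log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def joint_log_prob(x, path, start, trans, emit):
    """log P(x, pi): start term, then emission and transition terms."""
    lp = _log(start[path[0]]) + _log(emit[path[0]][x[0]])
    for i in range(1, len(x)):
        lp += _log(trans[path[i - 1]][path[i]]) + _log(emit[path[i]][x[i]])
    return lp


def viterbi(x, states, start, trans, emit):
    """Return (best_path, log_prob) for the path maximizing P(x, pi)."""
    if not x:
        return [], 0.0
    # v[k]: best log joint probability of any path that ends in state k
    # after emitting the symbols seen so far.
    v = {k: _log(start[k]) + _log(emit[k][x[0]]) for k in states}
    backpointers = []
    for sym in x[1:]:
        prev, v, ptr = v, {}, {}
        for k in states:
            # Best predecessor j for state k at this position.
            best = max(states, key=lambda j: prev[j] + _log(trans[j][k]))
            v[k] = prev[best] + _log(trans[best][k]) + _log(emit[k][sym])
            ptr[k] = best
        backpointers.append(ptr)
    # Trace back from the best final state.
    last = max(states, key=lambda k: v[k])
    path = [last]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, v[last]


if __name__ == "__main__":
    best_path, best_lp = viterbi("AATTGGCC", STATES, START, TRANS, EMIT)
    print("".join(best_path), best_lp)
```

The running time is proportional to the sequence length times the square of the number of states, compared with the exponential number of candidate paths a brute-force search over all labelings would have to score.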