Lect9 Genome Sequencing and assembly

An Introduction to Bioinformatics Algorithms (Computational Molecular Biology)

CSE182-L9 Gene Finding (DNA signals) Genome Sequencing and assembly

An HMM for Gene structure
Gene Finding via HMMs
Gene finding can be interpreted as a dynamic programming (d.p.) approach that threads a genomic sequence through the states of a ‘gene’ HMM. The states are E_init, E_fin, E_mid, I, and I_G (intergenic). Note: not all links are shown here.
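As a rough illustration of threading a sequence through HMM states, the sketch below runs a Viterbi-style dynamic program. Viterbi is one common choice of d.p. and is an assumption here, as are the toy two-state model (E for a GC-rich exon-like state, I for an AT-rich state) and all of its probabilities, which are hypothetical and much simpler than the gene HMM on the slide.

```python
import math

def viterbi(seq, states, start, trans, emit):
    """Most probable state path for seq under an HMM (log-space d.p.)."""
    lg = math.log
    # score[s] = best log-probability of any path that ends in state s
    score = {s: lg(start[s]) + lg(emit[s][seq[0]]) for s in states}
    back = []  # one dict per later position: state -> best predecessor
    for x in seq[1:]:
        new_score, ptr = {}, {}
        for s in states:
            prev = max(states, key=lambda r: score[r] + lg(trans[r][s]))
            new_score[s] = score[prev] + lg(trans[prev][s]) + lg(emit[s][x])
            ptr[s] = prev
        score = new_score
        back.append(ptr)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

# Hypothetical toy parameters (not taken from the lecture).
states = ["E", "I"]
start = {"E": 0.5, "I": 0.5}
trans = {"E": {"E": 0.9, "I": 0.1}, "I": {"E": 0.1, "I": 0.9}}
emit = {
    "E": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    "I": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
}
path = viterbi("AAAAGGGGGGAAAA", states, start, trans, emit)
print("".join(path))
```

With these toy numbers the G-rich run in the middle is labeled E and the flanking A runs are labeled I, because six matching emissions outweigh the cost of two state switches.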

Generalized HMMs, and other refinements
A probabilistic model needs to be described for each of the states (e.g., exon, splice site). In a standard HMM, the time spent in a state follows a geometric distribution, the discrete analogue of the exponential, so the shortest durations are always the most likely. Many states of the gene structure HMM violate this assumption. The solution is to model these states using generalized HMMs.
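To make the duration claim concrete: if a standard HMM state has self-transition probability p, it stays for exactly d steps with probability p^(d-1) (1 - p), which decays by a constant factor p per extra step. The short sketch below checks this numerically; the function name and the value p = 0.99 are illustrative assumptions.

```python
def self_loop_duration_pmf(p_stay, d):
    """P(exactly d consecutive steps in a state with self-transition prob p_stay)."""
    return p_stay ** (d - 1) * (1 - p_stay)

p = 0.99  # hypothetical self-transition probability
pmf = [self_loop_duration_pmf(p, d) for d in range(1, 6)]
# Each extra step multiplies the probability by p, so d = 1 is always the mode.
ratios = [pmf[i + 1] / pmf[i] for i in range(len(pmf) - 1)]
mean_duration = 1 / (1 - p)  # mean of this geometric distribution
print(pmf[0], ratios, mean_duration)
```

A distribution of this shape cannot place its peak at a typical length well above 1, which is the motivation for an explicit duration model.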
Length distributions of Introns & Exons

Generalized HMM for gene finding
Each state also emits a ‘duration’ for which it will cycle in the same state. The duration is generated according to a random process that depends on the state.
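The generative process described above can be sketched as follows: on entering a state, draw a duration from that state's own distribution, emit that many symbols, then move to the next state. Everything in this toy model is hypothetical (the two states E and I, the uniform length ranges, the emission alphabets, and the function name `sample_ghmm`); it is only meant to show the order of the steps.

```python
import random

def sample_ghmm(n_segments, start_state, trans, duration, emit, rng):
    """Generate (state, segment) pairs from a toy generalized HMM."""
    state, out = start_state, []
    for _ in range(n_segments):
        d = duration[state](rng)  # state-specific duration
        segment = "".join(emit[state](rng) for _ in range(d))
        out.append((state, segment))
        nxt = list(trans[state])
        state = rng.choices(nxt, weights=[trans[state][s] for s in nxt])[0]
    return out

# Hypothetical toy model: no self-transitions, since length is handled by duration.
trans = {"E": {"I": 1.0}, "I": {"E": 1.0}}
duration = {
    "E": lambda rng: rng.randint(3, 6),
    "I": lambda rng: rng.randint(8, 12),
}
emit = {
    "E": lambda rng: rng.choice("GC"),
    "I": lambda rng: rng.choice("AT"),
}
segments = sample_ghmm(4, "E", trans, duration, emit, random.Random(0))
for state, seg in segments:
    print(state, seg)
```

Because the segment length comes from `duration` instead of repeated self-transitions, each state can have any length distribution, not only a geometric one.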
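Forward algorithm for gene finding
$$F_k(i) = \sum_{j<i} P_{q_k}(X_{j+1} \ldots X_i) \, f_{q_k}(i-j) \sum_{l \in Q} a_{lk} \, F_l(j)$$
Forward prob. F_k(i): probability that you emitted the first i symbols and ended up in state q_k.
Emission prob. P_{q_k}(X_{j+1} ... X_i): probability that you emitted X_{j+1} ... X_i in state q_k (given by the 5th-order Markov model).
Duration prob. f_{q_k}(i-j): probability that you stayed in state q_k for i-j steps.
Here a_{lk} is the probability of a transition from state q_l to state q_k, and Q is the set of states.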
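A minimal sketch of this recurrence is given below, under several simplifying assumptions that are not from the lecture: emissions within a segment are independent per symbol (instead of a 5th-order Markov model), durations are capped at `max_dur`, the first segment is weighted by a start distribution, and the toy parameters are chosen so the result can be checked by hand.

```python
def ghmm_forward(x, states, start, trans, dur, emit, max_dur):
    """F[k][i]: probability of emitting x[:i] with a segment of state k ending at i."""
    n = len(x)
    F = {k: [0.0] * (n + 1) for k in states}
    for i in range(1, n + 1):
        for k in states:
            total = 0.0
            for j in range(max(0, i - max_dur), i):
                d = i - j  # the last segment x[j:i] has length d
                seg = 1.0
                for c in x[j:i]:
                    seg *= emit[k][c]  # emission probability of the segment
                if j == 0:
                    entry = start[k]  # segment starts the sequence
                else:
                    entry = sum(trans[l].get(k, 0.0) * F[l][j] for l in states)
                total += seg * dur[k].get(d, 0.0) * entry
            F[k][i] = total
    return F

# Hypothetical toy model: E emits only 'a', I emits only 'b'.
states = ["E", "I"]
start = {"E": 1.0, "I": 0.0}
trans = {"E": {"I": 1.0}, "I": {"E": 1.0}}
dur = {"E": {1: 0.5, 2: 0.5}, "I": {1: 1.0}}
emit = {"E": {"a": 1.0, "b": 0.0}, "I": {"a": 0.0, "b": 1.0}}
F = ghmm_forward("ab", states, start, trans, dur, emit, max_dur=2)
likelihood = sum(F[k][2] for k in states)
print(likelihood)
```

In this toy case the only way to produce "ab" is an E segment of length 1 (probability 0.5) followed by an I segment of length 1, so the total probability is 0.5.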

De novo Gene prediction: Summary
Various signals distinguish coding regions from non-coding regions. HMMs are a reasonable model for gene structures, and provide a uniform method for combining various signals. Further improvement may come from improved signal detection.
DNA Signals
Coding versus non-coding
Splice signals
Translation start (ATG)
[Figure labels: transcription start, 5’ UTR, translation start, exon, intron, donor splice site, acceptor splice site, 3’ UTR]

DNA signal example: The donor site marks the junction where an exon ends and an intron begins.
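As a very rough illustration of locating candidate donor sites, the sketch below relies on the fact that most introns begin with the dinucleotide GT on the coding strand. The function name and the example sequence are hypothetical, and a bare GT match is far weaker than the statistical signal models a real gene finder would use.

```python
def candidate_donor_sites(seq):
    """0-based positions i where an intron could begin, i.e. seq[i:i+2] == 'GT'."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]

sites = candidate_donor_sites("ATGGTAAGTCC")  # made-up sequence
print(sites)
```

Each returned position is only a candidate: the exon would end just before it and the intron would start at it, and most GT occurrences in a genome are not true donor sites.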