Lect9 Genome Sequencing and assembly

# An Introduction to Bioinformatics Algorithms (Computational Molecular Biology)

This preview shows pages 1–11.

CSE182-L9 Gene Finding (DNA signals) Genome Sequencing and assembly

An HMM for Gene structure
Gene Finding via HMMs
Gene finding can be interpreted as a dynamic programming (d.p.) approach that threads the genomic sequence through the states of a 'gene' HMM.
States: E_init, E_fin, E_mid (initial, final, and internal exons), I (intron), I_G (intergenic).
[Figure: state diagram connecting E_init, E_mid, E_fin, I, and I_G. Note: not all links are shown.]
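
The dynamic programming idea can be sketched with the Viterbi algorithm on a standard HMM. This is a minimal sketch under stated assumptions, not the lecture's model: it collapses the exon states into a single state `E`, uses `G` for intergenic and `I` for intron, and every probability below is an invented toy value (exons are assumed GC-rich purely for illustration).

```python
import math

def _log(p):
    # log(0) is treated as -infinity so forbidden moves never win.
    return math.log(p) if p > 0 else float("-inf")

def viterbi(seq, states, start, trans, emit):
    """Most probable state path for a non-empty sequence (log space)."""
    V = [{s: _log(start[s]) + _log(emit[s][seq[0]]) for s in states}]
    back = []
    for c in seq[1:]:
        row, ptr = {}, {}
        for s in states:
            best_prev = max(states, key=lambda r: V[-1][r] + _log(trans[r][s]))
            row[s] = V[-1][best_prev] + _log(trans[best_prev][s]) + _log(emit[s][c])
            ptr[s] = best_prev
        V.append(row)
        back.append(ptr)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: V[-1][s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

# Toy parameters (assumptions): G = intergenic, E = exon, I = intron.
STATES = ["G", "E", "I"]
START = {"G": 1.0, "E": 0.0, "I": 0.0}
TRANS = {
    "G": {"G": 0.9, "E": 0.1, "I": 0.0},
    "E": {"G": 0.1, "E": 0.8, "I": 0.1},
    "I": {"G": 0.0, "E": 0.1, "I": 0.9},
}
EMIT = {
    "G": {"A": 0.30, "C": 0.20, "G": 0.20, "T": 0.30},
    "E": {"A": 0.15, "C": 0.35, "G": 0.35, "T": 0.15},
    "I": {"A": 0.35, "C": 0.15, "G": 0.15, "T": 0.35},
}

seq = "ATATAT" + "GCGCGCGCGCGC" + "ATATAT"
path = viterbi(seq, STATES, START, TRANS, EMIT)
```

Each position of `seq` receives one state label, so `path` is the "threading" of the sequence through the model. A real gene finder would use the full state set and trained parameters.
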

Generalized HMMs, and other refinements
A probabilistic model for each of the states (e.g., exon, splice site) needs to be described.
In standard HMMs, the time spent in a state follows a geometric distribution (the discrete analogue of the exponential), because the self-transition is taken independently at every step. This is violated by many states of the gene structure HMM.
The solution is to model these states using generalized HMMs.
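
To see why a standard HMM constrains state durations, note that staying in a state for exactly d steps means taking the self-transition d - 1 times and then leaving. The sketch below computes that distribution; the self-transition probability `p_stay = 0.9` is an illustrative assumption.

```python
def geometric_duration_pmf(p_stay: float, d: int) -> float:
    """P(stay exactly d steps) for a state with self-transition probability p_stay."""
    if d < 1:
        return 0.0
    return (p_stay ** (d - 1)) * (1.0 - p_stay)

p_stay = 0.9  # assumed value, for illustration only
pmf = [geometric_duration_pmf(p_stay, d) for d in range(1, 6)]

# The expected duration of a geometric distribution is 1 / (1 - p_stay).
mean_duration = 1.0 / (1.0 - p_stay)
```

The probabilities in `pmf` decrease monotonically, so the most likely duration is always a single step. A state whose observed lengths peak away from 1 cannot be fitted well by this shape, which is the motivation for explicit duration models.
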
Length distributions of Introns & Exons

Generalized HMM for gene finding
Each state also emits a 'duration' for which it will cycle in the same state. The duration is generated according to a random process that depends on the state.
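
The generative process can be sketched as follows: on entering a state, draw a duration from that state's own distribution, emit that many symbols, then move on. Everything here is a hypothetical toy setup: the two states, the duration choices, the base weights, and the strict alternation in `NEXT_STATE` are assumptions made for illustration.

```python
import random

# Hypothetical toy parameters (assumptions, not taken from the lecture).
DURATIONS = {"exon": [3, 6, 9], "intron": [4, 8, 12]}
BASE_WEIGHTS = {
    "exon": [0.2, 0.3, 0.3, 0.2],    # weights for A, C, G, T
    "intron": [0.3, 0.2, 0.2, 0.3],
}
NEXT_STATE = {"exon": "intron", "intron": "exon"}

def sample_ghmm(start_state, n_segments, rng):
    """Sample (state, duration, emitted bases) segments from the toy model."""
    state, segments = start_state, []
    for _ in range(n_segments):
        d = rng.choice(DURATIONS[state])  # state-specific duration
        bases = rng.choices("ACGT", weights=BASE_WEIGHTS[state], k=d)
        segments.append((state, d, "".join(bases)))
        state = NEXT_STATE[state]
    return segments

segments = sample_ghmm("exon", 4, random.Random(0))
```

Unlike a standard HMM, no self-transition is involved: the length of each segment comes entirely from the duration draw.
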
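Forward algorithm for gene finding

$$F_k(i) = \sum_{j<i} P_{q_k}(X_{j+1} \ldots X_i)\, f_{q_k}(i-j) \sum_{l \in Q} a_{lk}\, F_l(j)$$

Forward prob. $F_k(i)$: probability that you emitted the first $i$ symbols and ended up in state $q_k$.
Emission prob. $P_{q_k}(X_{j+1} \ldots X_i)$: probability that you emitted $X_{j+1} \ldots X_i$ in state $q_k$ (given by the 5th-order Markov model).
Duration prob. $f_{q_k}(i-j)$: probability that you stayed in state $q_k$ for $i-j$ steps.
Transition prob. $a_{lk}$: probability of moving from state $q_l$ to state $q_k$.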
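
A minimal sketch of the generalized forward recurrence follows. Its indexing convention is that a segment ending at position i and following a prefix of length j covers `x[j:i]` and has duration `i - j`. Several parts are assumptions rather than the lecture's model: emissions use an independent per-base model instead of the 5th-order Markov model, a start distribution `start` replaces the inflow term when j = 0, durations are capped at `max_dur`, and all numeric parameters are toy values.

```python
from math import prod

def ghmm_forward(x, states, start, trans, dur, emit, max_dur):
    """F[k][i]: probability of emitting x[:i] with a segment of state k ending at i."""
    n = len(x)
    F = {k: [0.0] * (n + 1) for k in states}
    for i in range(1, n + 1):
        for k in states:
            total = 0.0
            for j in range(max(0, i - max_dur), i):
                d = i - j  # duration of the final segment x[j:i]
                if j == 0:
                    inflow = start[k]  # first segment of the sequence
                else:
                    inflow = sum(trans[l][k] * F[l][j] for l in states)
                seg_prob = prod(emit[k][c] for c in x[j:i])
                total += seg_prob * dur[k].get(d, 0.0) * inflow
            F[k][i] = total
    return F

# Toy parameters (assumptions): two states, uniform emissions.
STATES = ["E", "I"]
START = {"E": 1.0, "I": 0.0}
TRANS = {"E": {"E": 0.0, "I": 1.0}, "I": {"E": 1.0, "I": 0.0}}
DUR = {"E": {1: 0.5, 2: 0.5}, "I": {1: 1.0}}
UNIFORM = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
EMIT = {"E": UNIFORM, "I": UNIFORM}

x = "ACG"
F = ghmm_forward(x, STATES, START, TRANS, DUR, EMIT, max_dur=2)
total = sum(F[k][len(x)] for k in STATES)
```

Summing `F[k][len(x)]` over states gives the probability of the sequence, assuming the last segment ends exactly at the final position. The triple loop costs on the order of n × max_dur × |Q|² operations, plus the cost of scoring each segment.
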

De novo Gene prediction: Summary
Various signals distinguish coding regions from non-coding regions.
HMMs are a reasonable model for gene structures, and provide a uniform method for combining various signals.
Further improvement may come from improved signal detection.
DNA Signals
Coding versus non-coding
Splice signals
Translation start
[Figure: gene structure showing the transcription start, 5' UTR, translation start (ATG), exons and introns with donor and acceptor splice sites, and 3' UTR.]

DNA signal example: The donor site marks the junction where an exon ends and an intron begins.
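
As a rough illustration, candidate donor sites can be listed by scanning for the dinucleotide GT, which is the canonical start of most introns. That consensus is an assumption brought in here, not something stated in this excerpt, and the example sequence is made up. Most GT occurrences are not real donor sites, so this only enumerates candidates for a proper signal model to score.

```python
def candidate_donor_sites(seq):
    """Return 0-based indices i where seq[i:i+2] == 'GT'.

    Index i would be the first base of the intron, so the exon
    would end at index i - 1.
    """
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]

# Made-up example sequence.
sites = candidate_donor_sites("ATGGCCAAGGTAAGTCCT")
# sites == [9, 13]
```
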