[Equation figure not preserved; its annotated terms:]
• Max over all prev states and state durations
• Prob of subsequence given state k
• Prob that state k has duration d
• Transition from j->k
• Max ending in state j at step i-d
Courtesy of Elsevier, Inc. Used with permission.

Genscan - Burge and Karlin (1997)
• Explicit State Duration GHMM
• 5th-order Markov models for coding and non-coding sequences
• Each CDS frame has its own model
• WAM models for start/stop codons and acceptor sites
• MDD model for donor sites
• Separate parameters for regions of different GC content
• Models +/- strands concurrently

Types of Exons
Three types of exons are defined, for convenience, plus a fourth for intronless genes:
• initial exons extend from a start codon to the first donor site;
• internal exons extend from one acceptor site to the next donor site;
• final exons extend from the last acceptor site to the stop codon;
• single exons (which occur only in intronless genes) extend from the start codon to the stop codon.
Courtesy of William Majoros. Used with permission.

Intron and Exon Phase
[Figure: codon positions (1, 2, 3) at exon/intron boundaries, with panels labeled Phase 0 Intron, Phase 1 Intron, Phase 2 Intron and Phase 0 Exon, Phase 1 Exon, Phase 2 Exon.]

Two State Types in Genscan
• D-type represented by diamonds
• C-type represented by circles
D states are always followed by C states and vice versa.
C states are always preceded by the same D state.

Two State Types in Genscan
• D-Type
– geometric length distribution
– Intergenic regions
– UTRs
– Introns
Sequence models are “factorable”:
$P_k(X_{a,c}) = P_k(X_{a,b}) \, P_k(X_{b+1,c})$
$f_k(d) = P(\text{state duration } d \mid \text{state } k) = p_k \, f_k(d-1)$
Increasing the duration by one changes the probability by a constant factor.

Two State Types in Genscan
• C-Type
– general length distributions and sequence generating models
– Exons (initial, internal, terminal)
– Promoters
– Poly-A Signals

Genscan...
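
The D-type rule f_k(d) = p_k f_k(d − 1) determines a geometric length distribution once it is normalized. The sketch below is an illustration only: it assumes durations start at d = 1 (so that f_k(1) = 1 − p_k), and both the value of `p_k` and the function name `geometric_duration` are hypothetical rather than taken from Genscan.

```python
def geometric_duration(p_k: float, d: int) -> float:
    """f_k(d) for a D-type state, assuming the shortest duration is d = 1."""
    if d < 1:
        return 0.0
    # Unrolling f_k(d) = p_k * f_k(d - 1) from f_k(1) = 1 - p_k.
    return (1.0 - p_k) * p_k ** (d - 1)


p_k = 0.999  # hypothetical per-base probability of staying in the state

# Each extra base multiplies the probability by the same factor p_k.
ratios = [
    geometric_duration(p_k, d) / geometric_duration(p_k, d - 1)
    for d in range(2, 6)
]

# Expected length of a geometric distribution that starts at d = 1.
mean_length = 1.0 / (1.0 - p_k)
```

Because the ratio is constant, such a state never needs its duration tracked explicitly; a per-base "stay" probability of p_k gives the same distribution. That shortcut does not apply to C-type states with general length distributions.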

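The annotated terms at the start of this section (max over previous states and durations, probability of the subsequence given state k, probability that state k has duration d, transition from j to k, and the best score ending in state j at step i − d) can be combined into an explicit-duration Viterbi recursion. The following is a minimal sketch of that idea in log space, not Genscan's implementation. The table name `V`, the function `ghmm_viterbi`, the `max_dur` cap, and the two-state toy model are all assumptions made for illustration.

```python
import math

NEG_INF = float("-inf")


def safe_log(p):
    """log(p), with log(0) mapped to -inf instead of raising."""
    return math.log(p) if p > 0.0 else NEG_INF


def ghmm_viterbi(seq, states, init, trans, log_duration, log_emit, max_dur):
    """Best segmentation of seq into (state, start, end) blocks.

    V[k][i] = max over j, d of
        V[j][i - d] + log trans[j][k] + log f_k(d) + log P_k(seq[i-d:i])
    """
    n = len(seq)
    V = {k: [NEG_INF] * (n + 1) for k in states}
    back = {k: [None] * (n + 1) for k in states}  # (previous state, d)

    for i in range(1, n + 1):
        for k in states:
            for d in range(1, min(max_dur, i) + 1):
                segment = log_duration(k, d) + log_emit(k, seq[i - d:i])
                if segment == NEG_INF:
                    continue
                if d == i:
                    # The segment starts the sequence: use the initial probability.
                    candidates = [(safe_log(init[k]) + segment, None)]
                else:
                    candidates = [
                        (V[j][i - d] + safe_log(trans[j].get(k, 0.0)) + segment, j)
                        for j in states
                    ]
                for score, j in candidates:
                    if score > V[k][i]:
                        V[k][i] = score
                        back[k][i] = (j, d)

    k = max(states, key=lambda s: V[s][n])
    best = V[k][n]
    if best == NEG_INF:
        return best, []  # no segmentation has non-zero probability

    # Trace back one whole segment at a time.
    path, i = [], n
    while i > 0:
        j, d = back[k][i]
        path.append((k, i - d, i))
        k, i = j, i - d
    return best, path[::-1]


# Toy model (hypothetical): state A emits only 'a', state B only 'b'.
states = ["A", "B"]
init = {"A": 0.5, "B": 0.5}
trans = {"A": {"B": 1.0}, "B": {"A": 1.0}}  # no self-transitions


def log_duration(k, d):
    return safe_log(0.5 ** d)  # geometric: f(d) = 0.5 * 0.5 ** (d - 1)


def log_emit(k, segment):
    letter = k.lower()
    return 0.0 if all(c == letter for c in segment) else NEG_INF


best, path = ghmm_viterbi("aaabba", states, init, trans,
                          log_duration, log_emit, max_dur=6)
```

Each cell considers every allowed duration d, so the cost grows with `max_dur`. That is the price of supporting general length distributions, and it is why a factorable sequence model, whose segment score can be built up incrementally, matters in practice.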