NATURE BIOTECHNOLOGY VOLUME 22 NUMBER 10 OCTOBER 2004
ly, the sequences
of exons, splice sites and introns must have different statistical properties. Let’s imagine
some simple differences: say that exons
have a uniform base composition on average
(25% each base), introns are A/T rich (say,
40% each for A/T, 10% each for C/G), and
the 5′SS consensus nucleotide is almost
always a G (say, 95% G and 5% A).
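As a rough illustration of how these compositional differences separate sequence classes, the toy percentages above can be written as probability tables and used to score a short sequence. This is only a sketch: the names EXON, INTRON, SPLICE_5, log_likelihood and log_odds_intron_vs_exon are hypothetical, and treating each base as independent is an assumption made here for simplicity.

```python
import math

# Toy base compositions taken from the text.
EXON = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}    # uniform
INTRON = {"A": 0.40, "C": 0.10, "G": 0.10, "T": 0.40}  # A/T rich
SPLICE_5 = {"A": 0.05, "C": 0.0, "G": 0.95, "T": 0.0}  # almost always G


def log_likelihood(seq, composition):
    """Log probability of seq, assuming each base is drawn independently.

    Only meaningful for compositions with no zero entries for the bases
    in seq (math.log(0) raises ValueError).
    """
    return sum(math.log(composition[base]) for base in seq)


def log_odds_intron_vs_exon(seq):
    """Positive if seq looks more intron-like, negative if more exon-like."""
    return log_likelihood(seq, INTRON) - log_likelihood(seq, EXON)
```

An A/T-rich stretch such as "ATTA" gets a positive score and a G/C-rich stretch such as "GCGC" gets a negative one, which is the kind of statistical difference the model relies on.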
Starting from this information, we can
draw an HMM (Fig. 1). The HMM invokes
three states, one for each of the three labels
we might assign to a nucleotide: E (exon),
5 (5′SS) and I (intron). Each state has its
own emission probabilities (shown above the
states), which model the base composition
of exons, introns and the consensus G at the
5′SS. Each state also has transition probabilities (arrows), the probabilities of moving
from this state to a new state. The transition probabilities describe t...
PRIMER, page 1315. © 2004 Nature Publishing Group http://www.nature.com/naturebiotechnology
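The three-state model described above can be sketched in a few lines of code. The emission values are the toy numbers from the text. The transition values in TRANS are not given in this excerpt and are illustrative assumptions only, as are the choices to start every path in state E and to omit an explicit end state. The names EMIT, TRANS and path_log_prob are hypothetical.

```python
import math

# Emission probabilities per state, from the toy numbers in the text.
EMIT = {
    "E": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    "5": {"A": 0.05, "G": 0.95},
    "I": {"A": 0.40, "C": 0.10, "G": 0.10, "T": 0.40},
}

# ASSUMPTION: illustrative transition probabilities, not from the excerpt.
TRANS = {
    "E": {"E": 0.9, "5": 0.1},
    "5": {"I": 1.0},
    "I": {"I": 1.0},
}


def _log(p):
    """Natural log that maps probability 0 to negative infinity."""
    return math.log(p) if p > 0 else -math.inf


def path_log_prob(seq, path):
    """Joint log probability of a sequence and one state label per base.

    ASSUMPTION: every path starts in state E with probability 1.
    """
    if len(seq) != len(path) or not seq:
        raise ValueError("seq and path must be non-empty and equal in length")
    if path[0] != "E":
        return -math.inf
    logp = _log(EMIT["E"].get(seq[0], 0.0))
    for prev, cur, base in zip(path, path[1:], seq[1:]):
        logp += _log(TRANS[prev].get(cur, 0.0))  # move to the next state
        logp += _log(EMIT[cur].get(base, 0.0))   # emit the base there
    return logp
```

For example, path_log_prob("ACGGTA", "EEE5II") scores the labeling that puts the 5′SS on the second G. A labeling that skips the 5′SS state, such as "EEIIII", gets negative infinity under these assumed transitions, because no E to I arrow exists.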