whatIsHMM_seanEddy_nbt04

# E-mail: eddy@genetics.wustl.edu


...ly, the sequences of exons, splice sites and introns must have different statistical properties. Let’s imagine some simple differences: say that exons have a uniform base composition on average (25% each base), introns are A/T rich (say, 40% each for A/T, 10% each for C/G), and the 5′SS consensus nucleotide is almost always a G (say, 95% G and 5% A).

Starting from this information, we can draw an HMM (Fig. 1). The HMM invokes three states, one for each of the three labels we might assign to a nucleotide: E (exon), 5 (5′SS) and I (intron). Each state has its own emission probabilities (shown above the states), which model the base composition of exons, introns and the consensus G at the 5′SS. Each state also has transition probabilities (arrows), the probabilities of moving from this state to a new state. The transition probabilities describe t...

Source: Primer, Nature Biotechnology, volume 22, number 10, October 2004, p. 1315. © 2004 Nature Publishing Group, http://www.nature.com/naturebiotechnology
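The emission probabilities quoted above can be written down as a small lookup table. The following is a minimal Python sketch, not code from the primer: the names `EMISSIONS` and `emission_log_prob` are hypothetical, and the 5′SS row assumes that C and T have probability 0, which follows from the stated 95% G and 5% A. The transition probabilities are not given in this excerpt, so the sketch leaves them out and scores only the emission part of a labeled sequence.

```python
import math

# Emission probabilities, using the illustrative numbers from the text.
# State labels: "E" = exon, "5" = 5' splice site, "I" = intron.
EMISSIONS = {
    "E": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},  # uniform composition
    "5": {"A": 0.05, "C": 0.00, "G": 0.95, "T": 0.00},  # almost always G
    "I": {"A": 0.40, "C": 0.10, "G": 0.10, "T": 0.40},  # A/T rich
}


def emission_log_prob(sequence, labels):
    """Return the summed log emission probability of `sequence` given one
    state label per nucleotide.

    Assumption: transition probabilities are ignored here, because the
    excerpt does not list them. A full HMM score would add their logs.
    """
    if len(sequence) != len(labels):
        raise ValueError("sequence and labels must have the same length")
    total = 0.0
    for base, state in zip(sequence, labels):
        p = EMISSIONS[state][base]
        if p == 0.0:
            # An impossible emission makes the whole labeling impossible.
            return float("-inf")
        total += math.log(p)
    return total


if __name__ == "__main__":
    # Hypothetical toy input: two exon bases, a splice-site G, two intron bases.
    print(emission_log_prob("CAGTA", "EE5II"))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied along a long sequence.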

## This note was uploaded on 04/06/2010 for the course COMPUTER S COSC1520 taught by Professor Paul during the Spring '09 term at York University.
