# Hidden Markov Models (HMMs): Lecture for CS397-CXZ Algorithms in Bioinformatics


## Slide 1: Hidden Markov Models (HMMs)

Lecture for CS397-CXZ Algorithms in Bioinformatics, Feb. 20, 2004.
ChengXiang Zhai, Department of Computer Science, University of Illinois, Urbana-Champaign.

## Slide 2: Motivation: the CpG island problem

Methylation in the human genome: "CG" -> "TG" happens in most places, except in the "start regions" of genes. CpG islands = 100-1,000 bases before a gene starts.

Questions:

- Q1: Given a short stretch of genomic sequence, how would we decide if it comes from a CpG island or not?
- Q2: Given a long sequence, how would we find the CpG islands in it?
## Slide 3: Answer to Q1: Bayes Classifier

Hypothesis space: $H = \{H_{CpG}, H_{Other}\}$. Evidence: $X = \text{"ATCGTTC"}$.

By Bayes' rule, the posterior combines the likelihood of the evidence (the generative model) with the prior probability:

$$P(H_{CpG}|X) = \frac{P(X|H_{CpG})\,P(H_{CpG})}{P(X)}, \qquad P(H_{Other}|X) = \frac{P(X|H_{Other})\,P(H_{Other})}{P(X)}$$

Since $P(X)$ is the same in both, classification reduces to comparing

$$\frac{P(H_{CpG}|X)}{P(H_{Other}|X)} = \frac{P(X|H_{CpG})\,P(H_{CpG})}{P(X|H_{Other})\,P(H_{Other})}$$

We need two generative models for sequences: $p(X|H_{CpG})$ and $p(X|H_{Other})$.
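As a concrete sketch of the two-hypothesis classifier above, the posterior can be computed directly once likelihoods and priors are given. The function name and the numeric likelihood values below are illustrative placeholders, not values from the lecture:

```python
def posterior(likelihood_cpg, likelihood_other, prior_cpg=0.5):
    """P(H_CpG | X) by Bayes' rule over the two-hypothesis space
    H = {H_CpG, H_Other}. Inputs here are illustrative placeholders."""
    prior_other = 1.0 - prior_cpg
    # P(X) expands by the law of total probability over both hypotheses
    evidence = likelihood_cpg * prior_cpg + likelihood_other * prior_other
    return likelihood_cpg * prior_cpg / evidence

# With equal priors, the posterior ratio reduces to the likelihood ratio
print(posterior(0.004, 0.001))  # approximately 0.8
```

With equal priors, whichever generative model assigns the higher likelihood to X wins, which is why the slide focuses on building the two likelihood models.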

## Slide 4: A Simple Model for Sequences: p(X)

By the probability (chain) rule:

$$p(X) = p(X_1 X_2 \ldots X_n) = \prod_{i=1}^{n} p(X_i \mid X_1 \ldots X_{i-1})$$

Unigram (assume independence):

$$p(X) = \prod_{i=1}^{n} p(X_i)$$

Bigram (capture some dependence):

$$p(X) = \prod_{i=1}^{n} p(X_i \mid X_{i-1})$$

Example unigram models:

| x | P(x\|H_CpG) | P(x\|H_Other) |
|---|---|---|
| A | 0.25 | 0.25 |
| T | 0.25 | 0.40 |
| C | 0.25 | 0.10 |
| G | 0.25 | 0.25 |

Compare: X = "ATTG" vs. X = "ATCG".
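The ATTG-vs-ATCG comparison under the unigram models can be sketched in a few lines; the probabilities come from the table on the slide, while the function name is mine:

```python
from math import prod

# Unigram emission probabilities as given on the slide
P_CPG   = {"A": 0.25, "T": 0.25, "C": 0.25, "G": 0.25}
P_OTHER = {"A": 0.25, "T": 0.40, "C": 0.10, "G": 0.25}

def unigram_prob(x, model):
    """p(X) under the independence (unigram) assumption:
    the product of per-nucleotide probabilities."""
    return prod(model[base] for base in x)

for x in ("ATTG", "ATCG"):
    p_c, p_o = unigram_prob(x, P_CPG), unigram_prob(x, P_OTHER)
    label = "CpG" if p_c > p_o else "Other"
    print(f"{x}: p(X|CpG)={p_c:.6f}  p(X|Other)={p_o:.6f}  -> {label}")
```

Because T is much more likely under H_Other (0.40 vs 0.25) while C is much less likely (0.10 vs 0.25), ATTG comes out as "Other" and ATCG as "CpG" under equal priors.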
## Slide 5: Answer to Q2: Hidden Markov Model

How can we identify a CpG island embedded in a long sequence, e.g.

    X = ATTGATGCAAAAGGGGGATCGGGCGATATAAAATTTG

where the middle stretch is a CpG island and the flanking regions are "Other"?

- Idea 1: Test each window of a fixed number of nucleotides.
- Idea 2: Classify the whole sequence by assigning a class label (O = Other, C = CpG) to every position and considering all possible labelings:
  - S1: OOOO...............O
  - S2: OOOO.........OCC
  - Si: OOOO...OCC..CO...O
  - SN: CCCC..............CC

Then choose the labeling $S^* = \arg\max_S P(S|X) = \arg\max_S P(S,X)$; here $S^*$ = OOOO...OCC..CO...O.
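Idea 1 can be sketched with the unigram models from the previous slide: score each window by its log-likelihood ratio and flag high-scoring windows as CpG-like. The window width of 8 and the scoring function are illustrative choices, not from the lecture:

```python
from math import log

# Unigram emission probabilities from the previous slide
P_CPG   = {"A": 0.25, "T": 0.25, "C": 0.25, "G": 0.25}
P_OTHER = {"A": 0.25, "T": 0.40, "C": 0.10, "G": 0.25}

def window_scan(x, width=8):
    """Idea 1: slide a fixed-width window over x and score each window
    with the unigram log-likelihood ratio (positive => CpG-like)."""
    return [sum(log(P_CPG[b]) - log(P_OTHER[b]) for b in x[i:i + width])
            for i in range(len(x) - width + 1)]

x = "ATTGATGCAAAAGGGGGATCGGGCGATATAAAATTTG"
scores = window_scan(x)
best = max(range(len(scores)), key=scores.__getitem__)
print(f"most CpG-like window starts at index {best}: {x[best:best+8]}")
```

The C-rich middle region scores highest, matching the slide's picture; the weakness of Idea 1 is that fixed windows cannot place the island boundaries precisely, which motivates labeling every position (Idea 2) and, in turn, the HMM.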

## Slide 6

HMM is just one way of modeling p(X,S)...
## Slide 7: A Simple HMM

Two states: B and I, where I generates CpG-island sequence ($P(x|H_{CpG}) = p(x|I)$) and B generates everything else ($P(x|H_{Other}) = p(x|B)$).

Parameters:

- Initial state probabilities: $P(B) = 0.5$, $P(I) = 0.5$
- State transition probabilities: $p(B \to B) = 0.8$, $p(B \to I) = 0.2$, $p(I \to B) = 0.5$, $p(I \to I) = 0.5$
- Output probabilities:

| x | p(x\|B) | p(x\|I) |
|---|---|---|
| a | 0.25 | 0.25 |
| t | 0.40 | 0.25 |
| c | 0.10 | 0.25 |
| g | 0.25 | 0.25 |
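With these parameters, the argmax over labelings proposed on slide 5 can be computed by brute force for a short sequence. This is a sketch to make p(X,S) concrete; a practical implementation would use the Viterbi algorithm rather than enumerating all 2^n state paths:

```python
from itertools import product

# Parameters from the slide
INIT  = {"B": 0.5, "I": 0.5}
TRANS = {("B", "B"): 0.8, ("B", "I"): 0.2, ("I", "B"): 0.5, ("I", "I"): 0.5}
EMIT  = {"B": {"a": 0.25, "t": 0.40, "c": 0.10, "g": 0.25},
         "I": {"a": 0.25, "t": 0.25, "c": 0.25, "g": 0.25}}

def joint(x, s):
    """p(X, S): initial prob times emission, then transition times
    emission at every subsequent position."""
    p = INIT[s[0]] * EMIT[s[0]][x[0]]
    for t in range(1, len(x)):
        p *= TRANS[s[t - 1], s[t]] * EMIT[s[t]][x[t]]
    return p

def best_path(x):
    """Brute-force argmax_S p(S, X); fine only for short x (2^n paths)."""
    return max(product("BI", repeat=len(x)), key=lambda s: joint(x, s))

print("".join(best_path("attg")))  # prints "BBBB"
```

For "attg" the all-background path wins: the two t's are better explained by B (0.40 vs 0.25) and the B-to-B transition (0.8) is the most probable move.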

## Slide 8: A General Definition of HMM

An HMM is specified by a tuple $\mathrm{HMM} = (S, V, B, A, \Pi)$:

- $S = \{s_1, \ldots, s_N\}$: the set of $N$ states
- $V = \{v_1, \ldots, v_M\}$: the set of $M$ output symbols
- Initial state probabilities $\Pi = \{\pi_1, \ldots, \pi_N\}$, where $\pi_i$ is the probability of starting at state $s_i$ and $\sum_{i=1}^{N} \pi_i = 1$
- State transition probabilities $A = \{a_{ij}\}$, $1 \le i, j \le N$, where $a_{ij}$ is the probability of moving from state $s_i$ to state $s_j$ and $\sum_{j=1}^{N} a_{ij} = 1$ for each $i$
- Output probabilities $B = \{b_i(v_k)\}$, where $b_i(v_k)$ is the probability of generating $v_k$ at state $s_i$
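The tuple definition maps directly onto a small container class. This is a minimal sketch with field names of my choosing, shown holding the two-state model from the earlier slide; `validate` checks the normalization constraints in the definition:

```python
from dataclasses import dataclass

@dataclass
class HMM:
    """The tuple (S, V, B, A, Pi) from the definition above;
    field names are illustrative, mirroring the slide's symbols."""
    states: list    # S = {s_1, ..., s_N}
    symbols: list   # V = {v_1, ..., v_M}
    emit: list      # B: emit[i][k] = b_i(v_k)
    trans: list     # A: trans[i][j] = a_ij
    init: list      # Pi: init[i] = pi_i

    def validate(self, tol=1e-9):
        assert abs(sum(self.init) - 1.0) < tol     # sum_i pi_i = 1
        for row in self.trans:                     # sum_j a_ij = 1 for each i
            assert abs(sum(row) - 1.0) < tol
        for row in self.emit:                      # each b_i is a distribution
            assert abs(sum(row) - 1.0) < tol

# The two-state CpG model from the earlier slide, in this notation
hmm = HMM(states=["B", "I"], symbols=["a", "t", "c", "g"],
          emit=[[0.25, 0.40, 0.10, 0.25], [0.25, 0.25, 0.25, 0.25]],
          trans=[[0.8, 0.2], [0.5, 0.5]],
          init=[0.5, 0.5])
hmm.validate()
print(f"valid HMM: N={len(hmm.states)} states, M={len(hmm.symbols)} symbols")
```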