HMM-Lec9-092104

# HMM-Lec9-092104 - Lecture 9 Hidden Markov Models BioE 480...

Lecture 9 Hidden Markov Models BioE 480 Sept 21, 2004

Delete states (circles) are silent, or null, states. They do not match any residues; they are there so that a path can jump over one or more columns. They model the case where just a few of the sequences have a “-” at a position. Example: (figure not shown)
Pseudo-counts

It is dangerous to estimate a probability distribution from just a few observed amino acids. If there are two sequences, both with Leu at a position, then P = 1 for Leu but P = 0 for all other residues at this position. Yet we know that Val often substitutes for Leu. The probability of a whole sequence easily becomes 0 if a single Leu is substituted by a Val; equivalently, the log-odds score is minus infinity.

How to avoid “over-fitting” (strong conclusions drawn from very little evidence)? Use pseudocounts: pretend to have more counts than those from the data.

A. Add 1 to all the counts. With 20 amino acids the total count becomes 2 + 20 = 22, so Leu: 3/22, each other a.a.: 1/22.
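The add-one rule can be checked with a few lines of Python. This is a minimal sketch, assuming a 20-letter amino acid alphabet and a column with no gap characters; the function name `add_one_estimate` is hypothetical. With two observed Leu and one pseudocount per amino acid, the denominator is 2 + 20 = 22.

```python
from collections import Counter
from fractions import Fraction

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

def add_one_estimate(column):
    """Estimate emission probabilities for one alignment column,
    adding one pseudocount per amino acid.

    Assumes `column` contains only standard amino acid letters (no gaps).
    """
    counts = Counter(column)
    # Observed counts plus one pseudocount for each of the 20 residues.
    total = len(column) + len(AMINO_ACIDS)
    return {aa: Fraction(counts[aa] + 1, total) for aa in AMINO_ACIDS}

# Two sequences, both with Leu at this position.
probs = add_one_estimate("LL")
print(probs["L"], probs["V"])  # Leu versus an unobserved residue such as Val
```

Exact fractions are used so that the estimates can be compared directly with hand arithmetic; no residue ends up with probability zero, so a single Leu-to-Val substitution no longer drives the sequence probability to 0.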

Adding 1 to all counts amounts to assuming a priori that all amino acids are equally likely. Another approach: use the background amino acid composition as pseudocounts.
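One common way to write background-based pseudocounts is p(a) = (n_a + A·q_a) / (N + A), where n_a is the observed count, N the number of sequences in the column, q_a the background frequency, and A the total pseudocount weight. The sketch below assumes that form; the weight A, the function names, and the toy sequences are illustrative assumptions, not values from the lecture. With a uniform background and A = 20 it reduces to the add-one rule.

```python
from collections import Counter
from fractions import Fraction

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

def composition(sequences):
    """Relative frequency of each amino acid across a set of sequences."""
    counts = Counter("".join(sequences))
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    return {aa: Fraction(counts[aa], total) for aa in AMINO_ACIDS}

def pseudocount_estimate(column, background, weight):
    """Column probabilities with `weight` pseudocounts split by `background`."""
    counts = Counter(column)
    n = len(column)
    return {aa: (counts[aa] + weight * background[aa]) / (n + weight)
            for aa in AMINO_ACIDS}

# Uniform background with weight 20 is the same as adding 1 to every count.
uniform = {aa: Fraction(1, 20) for aa in AMINO_ACIDS}
flat = pseudocount_estimate("LL", uniform, weight=20)

# Made-up sequences standing in for a database; not real data.
toy_db = ["ACDEFGHIKLMNPQRSTVWY", "LLLLVVAAGG"]
skewed = pseudocount_estimate("LL", composition(toy_db), weight=20)
print(flat["L"], skewed["V"], skewed["W"])
```

Under the skewed background, residues that are common overall (Val in the toy data) receive more pseudocount mass than rare ones (Trp). A residue whose background frequency is exactly zero would still get probability zero, so the background itself should cover every residue.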

Searching a database with an HMM

We know how to calculate the probability of a sequence in the alignment: multiply all the probabilities (or add the log-odds scores) in the model along the path followed by that sequence. For sequences not in the alignment, we do not know the path. Find a path through the model where the new sequence fits well; we can then score it as before. This requires “aligning” the sequence to the model, i.e. assigning a state to each residue in the sequence. A given sequence can have many alignments.
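The equivalence between multiplying probabilities and adding log-odds scores can be illustrated as follows. This is a small sketch with hypothetical numbers: the per-step model probabilities and the uniform background of 0.05 are assumptions chosen for easy arithmetic, and `log_odds_score` is an illustrative name.

```python
import math

def log_odds_score(model_probs, background_probs):
    """Sum of per-step log-odds, log(p_model / p_background), along one path."""
    return sum(math.log(p / q) for p, q in zip(model_probs, background_probs))

# Hypothetical probabilities for a three-residue path; not from the lecture.
model_probs = [0.5, 0.25, 0.8]
background_probs = [0.05, 0.05, 0.05]

score = log_odds_score(model_probs, background_probs)

# The same quantity computed as a product of odds ratios.
ratio = math.prod(p / q for p, q in zip(model_probs, background_probs))

print(score, math.log(ratio))  # the two values agree up to rounding
```

Working in log space avoids numerical underflow when many small probabilities are multiplied along a long path. Note that `math.log` raises an error for a probability of exactly 0, which is the programmatic counterpart of a log-odds score of minus infinity.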

E.g., a protein has amino acids A1, A2, A3, …; the HMM has match states M1, M2, M3, … and insertion states I1, I2, I3, …. One alignment: A1 matches M1, A2 and A3 match I1, A4 matches M2, and A5 matches M6 (after passing through three delete states). For each alignment, we can calculate the probability of the sequence,
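The alignment in this example corresponds to the state path M1, I1, I1, M2, D3, D4, D5, M6. The sketch below scores a sequence along such a path, assuming delete states are named with a leading "D" and emit nothing. The five-residue sequence, the uniform transition probability 0.5, and the uniform emission probability 0.1 are placeholders chosen for illustration only, and begin/end transitions are ignored.

```python
import math

def path_log_prob(sequence, path, transitions, emissions):
    """Log-probability of `sequence` along `path`, a list of state names.

    States whose name starts with "D" are treated as silent delete states;
    every other state emits exactly one residue.
    """
    emitting = [s for s in path if not s.startswith("D")]
    if len(emitting) != len(sequence):
        raise ValueError("path must have one emitting state per residue")
    logp = 0.0
    pos = 0       # index of the next residue to emit
    prev = None   # previous state on the path
    for state in path:
        if prev is not None:
            logp += math.log(transitions[(prev, state)])
        if not state.startswith("D"):
            logp += math.log(emissions[state][sequence[pos]])
            pos += 1
        prev = state
    return logp

# A1 -> M1, A2 and A3 -> I1, A4 -> M2, three deletes, A5 -> M6.
path = ["M1", "I1", "I1", "M2", "D3", "D4", "D5", "M6"]
sequence = "LVGAL"  # hypothetical residues standing in for A1..A5

# Placeholder parameters: every transition 0.5, every emission 0.1.
transitions = {(a, b): 0.5 for a, b in zip(path, path[1:])}
emissions = {s: {aa: 0.1 for aa in set(sequence)}
             for s in path if not s.startswith("D")}

logp = path_log_prob(sequence, path, transitions, emissions)
print(logp)
```

With these placeholder values the result is the sum of seven transition terms and five emission terms, since the three delete states contribute transitions but no emissions.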