Lect7: Protein sequence analysis using HMMs, gene finding

An Introduction to Bioinformatics Algorithms (Computational Molecular Biology)

CSE182-L7: Protein Sequence Analysis using HMMs, Gene Finding

Domain analysis via profiles
Given a database of profiles of known domains/families, we can query our sequence against each of them and choose the high-scoring ones to functionally characterize our sequence.
What if the sequence matches some other sequences weakly (using BLAST), but does not match any known profile?
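The query step above can be sketched in a few lines. This is only an illustration under stated assumptions: the two toy profiles, their scores, the `DEFAULT_SCORE` penalty and the function names are all made up for the example, and scoring is ungapped.

```python
# Hypothetical toy profiles: one dict of residue scores per column.
# The scores are invented for illustration, not taken from a real database.
PROFILES = {
    "toy_family_A": [{"F": 2.0, "Y": 1.0}, {"K": 2.0}, {"V": 1.5, "I": 1.0}],
    "toy_family_B": [{"G": 2.0}, {"Q": 2.0}, {"V": 1.0}],
}
DEFAULT_SCORE = -1.0  # assumed penalty for a residue not listed in a column


def best_window_score(seq, profile):
    """Best ungapped score of the profile over all windows of seq."""
    m = len(profile)
    if len(seq) < m:
        return float("-inf")
    return max(
        sum(col.get(res, DEFAULT_SCORE) for col, res in zip(profile, seq[i:i + m]))
        for i in range(len(seq) - m + 1)
    )


def matching_families(seq, profiles, threshold):
    """Families whose best window score reaches the threshold, best first."""
    scores = {name: best_window_score(seq, p) for name, p in profiles.items()}
    hits = [(name, s) for name, s in scores.items() if s >= threshold]
    return sorted(hits, key=lambda item: -item[1])
```

For example, `matching_families("FKVVGQVILD", PROFILES, 4.0)` reports both toy families, while a sequence that matches neither returns an empty list, which is the situation the closing question on the slide asks about.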
Psi-BLAST idea
Iterate:
1. Find homologs of the query using BLAST.
2. Discard very similar homologs.
3. Align, make a profile, search with the profile.
Why is this more sensitive?
[Figure: Seq Db]
-- In the next iteration, the red sequence will be thrown out.
-- It matches the query in non-essential residues.

Psi-BLAST speed
Two time-consuming steps:
1. Multiple alignment of homologs.
2. Searching with profiles.
Does the keyword search idea work?
Multiple alignment: use ungapped multiple alignments only.
Pigeonhole principle again: if a profile of length m must score >= T, then some sub-profile of length l must score >= lT/m (otherwise, splitting the profile into m/l disjoint windows, every window would score below lT/m and the total would fall below T).
Generate all l-mers that score at least lT/m.
Search using an automaton.
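A minimal sketch of the seeding idea above, assuming a toy ungapped profile stored as one dict of residue scores per column. The names `high_scoring_lmers` and `seed_positions` and the default score of -1 are hypothetical. A plain set lookup stands in for the automaton mentioned on the slide, and l is assumed to divide m so the disjoint windows cover the whole profile.

```python
from itertools import product

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def high_scoring_lmers(sub_profile, cutoff, default=-1.0):
    """Enumerate every l-mer whose score against sub_profile is >= cutoff."""
    hits = []
    for lmer in product(ALPHABET, repeat=len(sub_profile)):
        score = sum(col.get(r, default) for col, r in zip(sub_profile, lmer))
        if score >= cutoff:
            hits.append("".join(lmer))
    return hits


def seed_positions(db_seq, profile, T, l):
    """Pairs (database position, profile offset) where a high-scoring l-mer occurs."""
    m = len(profile)
    cutoff = l * T / m  # pigeonhole bound: some window must reach lT/m
    seeds = set()
    for start in range(0, m - l + 1, l):  # disjoint windows of the profile
        words = set(high_scoring_lmers(profile[start:start + l], cutoff))
        # Set lookup used here in place of a keyword automaton.
        for i in range(len(db_seq) - l + 1):
            if db_seq[i:i + l] in words:
                seeds.add((i, start))
    return sorted(seeds)
```

Enumerating all 20^l words is only practical for small l; the point of the sketch is that the word lists are generated once per profile and then reused to scan the database.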
Representation 3: HMMs
Building good profiles relies upon good alignments, which are difficult to obtain if there are gaps in the alignment.
Psi-BLAST, BLOCKS, etc. work with gapless alignments.
An HMM representation of profiles puts alignment construction and membership queries in a uniform framework.
It also allows for position-specific gap scoring.

QUIZ!
Question: your ‘friend’ likes to gamble. He tosses a coin: HEADS, he gives you a dollar; TAILS, you give him a dollar.
Usually he uses a fair coin, but ‘once in a while’ he uses a loaded coin.
Can you say what fraction of the time he uses the loaded coin?
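One way to approach the quiz is to model the coin as a two-state HMM (fair, loaded) and decode the most probable state path with the Viterbi algorithm. The sketch below is an illustration only: every probability in it is an assumed value rather than one given in the lecture, and the fraction of the Viterbi path spent in the loaded state is just one possible estimate.

```python
import math

STATES = ("fair", "loaded")
# All numbers below are illustrative assumptions, not values from the lecture.
START = {"fair": 0.9, "loaded": 0.1}
TRANS = {"fair": {"fair": 0.9, "loaded": 0.1},
         "loaded": {"fair": 0.1, "loaded": 0.9}}
EMIT = {"fair": {"H": 0.5, "T": 0.5},
        "loaded": {"H": 0.1, "T": 0.9}}  # loaded coin favours TAILS


def viterbi(tosses):
    """Most probable state path for a string of 'H'/'T' tosses (log space)."""
    if not tosses:
        return []
    score = {s: math.log(START[s]) + math.log(EMIT[s][tosses[0]]) for s in STATES}
    back = []  # back[t][s] = best predecessor of state s at toss t + 1
    for x in tosses[1:]:
        prev, score, ptr = score, {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            score[s] = prev[best] + math.log(TRANS[best][s]) + math.log(EMIT[s][x])
            ptr[s] = best
        back.append(ptr)
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):  # trace back from the last toss to the first
        state = ptr[state]
        path.append(state)
    return path[::-1]


def loaded_fraction(tosses):
    """Fraction of tosses assigned to the loaded coin on the Viterbi path."""
    path = viterbi(tosses)
    return path.count("loaded") / len(path)
```

With these assumed parameters, a run of heads followed by a long run of tails is decoded as a switch from the fair coin to the loaded one at the point where the tails begin.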
The generative model
Think of each column in the alignment as generating a distribution.
For each column, build a node that outputs a residue with the appropriate distribution.
Example from the figure: Pr[F] = 0.71, Pr[Y] = 0.14.
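As a rough sketch of how such per-column distributions could be computed, the snippet below counts residue frequencies in each column of an ungapped alignment. The function name `column_distributions` and the seven-row toy alignment are assumptions made for the example; the toy rows were chosen so that the first column reproduces the rounded values 0.71 and 0.14. No pseudocounts are applied.

```python
from collections import Counter


def column_distributions(alignment):
    """Per-column residue frequencies for an ungapped alignment of equal-length rows."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all rows must have the same length")
    dists = []
    for column in zip(*alignment):  # iterate over columns, not rows
        counts = Counter(column)
        total = len(column)
        dists.append({res: n / total for res, n in counts.items()})
    return dists


# Made-up toy alignment: 5 of 7 rows start with F, 1 with Y, 1 with W.
TOY_ALIGNMENT = ["FKV", "FKV", "FKI", "FRV", "FKV", "YKV", "WKI"]
TOY_DISTS = column_distributions(TOY_ALIGNMENT)
```

Here `TOY_DISTS[0]["F"]` is 5/7 and `TOY_DISTS[0]["Y"]` is 1/7. Each dict plays the role of one node in the generative model.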

A simple Profile HMM
Connect the nodes for each column into a chain. This chain generates random sequences.
What is the probability of generating FKVVGQVILD?
In this representation, Prob[new sequence S belongs to a family] = Prob[HMM generates sequence S].
What is the difference with profiles?
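For a gapless chain, the probability of generating a sequence is the product of the per-column emission probabilities. The sketch below computes it in log space. The three-column emission table is a made-up example (the actual distributions for FKVVGQVILD are in the slide figure, which is not reproduced here), and the `floor` value for unseen residues is an assumption.

```python
import math

# Hypothetical emission table for a 3-column chain (illustrative values only).
TOY_EMISSIONS = [
    {"F": 0.71, "Y": 0.14, "W": 0.15},
    {"K": 1.0},
    {"V": 0.5, "I": 0.5},
]


def chain_log_prob(seq, emissions, floor=1e-6):
    """Log-probability that a gapless chain of match nodes emits seq.

    Residues missing from a column get the assumed probability `floor`.
    A sequence of the wrong length cannot be generated by the chain.
    """
    if len(seq) != len(emissions):
        return float("-inf")
    return sum(math.log(col.get(res, floor)) for col, res in zip(emissions, seq))
```

For instance, `math.exp(chain_log_prob("FKV", TOY_EMISSIONS))` is 0.71 * 1.0 * 0.5 = 0.355. Any sequence whose length differs from the chain gets probability zero, which is the limitation that insertion and deletion states address.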
Profile HMMs can handle gaps
The match states are the same as on the previous page.
Insertion and deletion states help introduce gaps.
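To illustrate how insertion and deletion states let the model score sequences of different lengths, here is a small Viterbi sketch for a profile HMM with match (M), insert (I) and delete (D) states. It is a simplified illustration, not the lecture's model: the transition probabilities are assumed and position-independent, insert states emit uniformly over 20 residues, the begin state is treated as M_0, and the transition into the end state is taken to have probability 1.

```python
import math

NEG_INF = float("-inf")
# Assumed, position-independent transition probabilities (illustrative only).
LOG_T = {k: math.log(v) for k, v in {
    ("M", "M"): 0.8, ("M", "I"): 0.1, ("M", "D"): 0.1,
    ("I", "M"): 0.6, ("I", "I"): 0.3, ("I", "D"): 0.1,
    ("D", "M"): 0.6, ("D", "I"): 0.1, ("D", "D"): 0.3,
}.items()}
LOG_INSERT_EMIT = math.log(1 / 20)  # inserts emit uniformly over 20 residues


def viterbi_log_prob(seq, match_emissions, floor=1e-6):
    """Log-probability of the best state path emitting seq."""
    L, n = len(match_emissions), len(seq)
    # V[kind][j][i]: best log-prob of emitting seq[:i] and ending in state kind_j.
    V = {k: [[NEG_INF] * (n + 1) for _ in range(L + 1)] for k in "MID"}
    V["M"][0][0] = 0.0  # the begin state is treated as M_0

    def best(pj, pi, to):
        # Best way to reach a state of kind `to` from any state at cell (pj, pi).
        return max(V[k][pj][pi] + LOG_T[(k, to)] for k in "MID")

    for j in range(L + 1):
        for i in range(n + 1):
            if j > 0 and i > 0:  # match: consume one column and one residue
                e = math.log(match_emissions[j - 1].get(seq[i - 1], floor))
                V["M"][j][i] = best(j - 1, i - 1, "M") + e
            if i > 0:            # insert: consume one residue, stay at column j
                V["I"][j][i] = best(j, i - 1, "I") + LOG_INSERT_EMIT
            if j > 0:            # delete: skip column j without emitting
                V["D"][j][i] = best(j - 1, i, "D")
    return max(V[k][L][n] for k in "MID")
```

With match emissions `[{"F": 1.0}, {"K": 1.0}, {"V": 1.0}]`, the sequences FKV, FV (one deletion) and FKAV (one insertion) all receive a non-zero probability, with the exact match scoring highest.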