791_my_lecture2

7.91 – Lecture #2
More Pairwise Sequence Comparisons

ARDFSHGLLENKLLGCDSMRWE
.::.  .:::. .:::: :::.
GRDYKMALLEQWILGCD-MRWD

- and -

Multiple Sequence Alignment

ARDFSHGLLENKLLGCDSMRWE
.::.  .:::. .:::: :::.
GRDYKMALLEQWILGCD-MRWD
.::.  ::.:  .. :. .:::
SRDW--ALIEDCMV-CNFFRWD

Reading: This lecture: Mount pp. 8-9, 65-89, 96-115, 140-155, 161-170

Michael Yaffe

Outline
• Recursion and dynamic programming
• Applied dynamic programming: global alignments: Needleman-Wunsch
• Applied dynamic programming: local alignments – Smith-Waterman
• Substitution matrices: PAM, BLOSUM, Gonnet
• Gaps - linear and affine
• Alignment statistics
• What you need to know to optimize an alignment
Outline (cont)
• Multiple sequence alignments: MSA, Clustal
• Block analysis
• Position-Specific Scoring Matrices (PSSM)

Examples
O(n^k) is “polynomial time”; as long as k < 3 … tractable
Consider our un-gapped dot matrix
Global alignment:
[Figure: dot matrix with positions 1 to n across the top and rows 1 to m down the side]
… essentially an O(mn) problem
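The O(mn) cost of the un-gapped dot matrix can be seen in a minimal Python sketch. The function name `dot_matrix` and the two short example sequences are assumptions made for illustration, not part of the slide; the grid simply marks identical residues, with one comparison per cell.

```python
def dot_matrix(seq_a, seq_b):
    """Compare every residue of seq_a (rows, length m) with every residue
    of seq_b (columns, length n); True marks an identical pair."""
    return [[a == b for b in seq_b] for a in seq_a]


# Illustrative sequences (an assumption for this sketch).
seq_a = "VDSCY"   # m = 5
seq_b = "VESLCY"  # n = 6

matrix = dot_matrix(seq_a, seq_b)
comparisons = len(seq_a) * len(seq_b)               # one comparison per cell: m * n
dots = sum(cell for row in matrix for cell in row)  # cells holding identical residues

for residue, row in zip(seq_a, matrix):
    print(residue, "".join("*" if cell else "." for cell in row))
print("comparisons:", comparisons, "identical pairs:", dots)
```

Doubling both sequence lengths quadruples the number of cells, which is the sense in which the un-gapped comparison scales as O(mn).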
O.K. Examples
O(n) better than O(n log(n)), better than O(n^2), better than O(n^3)
Terrible Examples
O(k^n) = exponential time … horrible!!!!
NP problems: no known polynomial-time solutions
NP = non-deterministic polynomial problems.
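To make the ranking concrete, the following Python sketch tabulates rough operation counts for each growth class. The function name `operation_counts`, the choice k = 2 for the exponential class, and the input sizes are assumptions chosen for illustration.

```python
import math


def operation_counts(n, k=2):
    """Rough operation counts for input size n under each growth class.
    k is the base of the exponential class O(k^n)."""
    return {
        "O(n)": n,
        "O(n log n)": n * math.log2(n),
        "O(n^2)": n ** 2,
        "O(n^3)": n ** 3,
        "O(k^n)": k ** n,
    }


for n in (10, 100, 300):
    counts = operation_counts(n)
    print(n, {name: f"{value:.3g}" for name, value in counts.items()})
```

At n = 10 the cubic and exponential counts are still close (1000 versus 1024), but the exponential class pulls away rapidly as n grows.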

Recursion and Dynamic Programming

Aligning two protein sequences without gaps: roughly an O(mn) problem.
With gaps: becomes computationally astronomical, and cannot be done by direct comparison methods. (= 2^(2L) / (2πL); L = sequence length)

The alternative is to compare all possible pairs of characters (matches and mismatches), taking gaps into account as well, while keeping the number of comparisons manageable. The approach is called dynamic programming.
• Mathematically proven to produce the optimal alignment
• Needs a substitution or similarity matrix and some way to account for gaps

Example of how to score an alignment:
Write down two sequences:

sequence #1              V   D   S   -   C   Y
sequence #2              V   E   S   L   C   Y
score from sub. matrix   4   2   4 -11   9   7

Score = Σ (AA pair scores) – gap penalty = (4 + 2 + 4 + 9 + 7) – 11 = 15
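The scoring example can be checked with a short Python sketch, followed by a Needleman-Wunsch style recursion that searches for the best global score. This is a sketch under stated assumptions, not the lecture's implementation: the pair scores and the gap penalty of 11 come from the example above, the gap penalty is treated as linear, and `DEFAULT_MISMATCH = -1` is a hypothetical placeholder for pairs the example does not list. All function names are illustrative.

```python
GAP_PENALTY = 11  # gap penalty from the example, applied per gap position (linear)

# Pair scores quoted in the example above.
PAIR_SCORES = {("V", "V"): 4, ("D", "E"): 2, ("S", "S"): 4,
               ("C", "C"): 9, ("Y", "Y"): 7}
DEFAULT_MISMATCH = -1  # assumption: placeholder for pairs the example does not list


def pair_score(a, b):
    """Look up a symmetric pair score, falling back to the placeholder."""
    return PAIR_SCORES.get((a, b), PAIR_SCORES.get((b, a), DEFAULT_MISMATCH))


def score_alignment(row_a, row_b):
    """Sum pair scores over aligned columns; '-' marks a gap."""
    total = 0
    for a, b in zip(row_a, row_b):
        total += -GAP_PENALTY if "-" in (a, b) else pair_score(a, b)
    return total


def best_global_score(seq_a, seq_b):
    """Dynamic programming: best score over all global alignments."""
    m, n = len(seq_a), len(seq_b)
    # table[i][j] = best score aligning seq_a[:i] with seq_b[:j]
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        table[i][0] = -GAP_PENALTY * i
    for j in range(1, n + 1):
        table[0][j] = -GAP_PENALTY * j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            table[i][j] = max(
                table[i - 1][j - 1] + pair_score(seq_a[i - 1], seq_b[j - 1]),
                table[i - 1][j] - GAP_PENALTY,  # residue of seq_a against a gap
                table[i][j - 1] - GAP_PENALTY,  # residue of seq_b against a gap
            )
    return table[m][n]


print(score_alignment("VDS-CY", "VESLCY"))  # 4 + 2 + 4 - 11 + 9 + 7
print(best_global_score("VDSCY", "VESLCY"))
```

The table has (m + 1)(n + 1) cells and each cell looks at three neighbours, so the gapped problem is solved with a number of operations proportional to mn rather than by enumerating every alignment.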
BLOSUM 62 Scoring Matrix
[Table: BLOSUM 62 scoring matrix, lower triangle, rows and columns in the order C S T P A G N D E Q H R K M I L V F Y W]

Scoring system should:
• favor matching identical or related amino acids
• penalize for poor matches and for gaps
To get a good scoring system, need to know: how often a particular amino acid