791_my_lecture2

# 791_my_lecture2 - 7.91 Lecture#2 Michael Yaffe More...


7.91 Lecture #2

More Pairwise Sequence Comparisons

ARDFSHGLLENKLLGCDSMRWE
.::. .:::. .:::: :::.
GRDYKMALLEQWILGCD-MRWD

and

Multiple Sequence Alignment

ARDFSHGLLENKLLGCDSMRWE
.::. .:::. .:::: :::.
GRDYKMALLEQWILGCD-MRWD
.::. ::.: .. :. .:::
SRDW--ALIEDCMV-CNFFRWD

Reading for this lecture: Mount, pp. 8-9, 65-89, 96-115, 140-155, 161-170

Michael Yaffe

Outline
• Recursion and dynamic programming
• Applied dynamic programming: global alignments (Needleman-Wunsch)
• Applied dynamic programming: local alignments (Smith-Waterman)
• Substitution matrices: PAM, BLOSUM, Gonnet
• Gaps: linear and affine
• Alignment statistics
• What you need to know to optimize an alignment

Outline (cont.)
• Multiple sequence alignments: MSA, Clustal
• Block analysis
• Position-Specific Scoring Matrices (PSSM)

Examples

$O(n^k)$ is "polynomial time" for any constant $k$; it is tractable in practice as long as $k$ is small (as a rule of thumb, $k < 3$).

Consider our un-gapped dot matrix. Global alignment:

[Figure: dot matrix with positions 1 to n of one sequence across the top and positions 1 to m of the other down the side]

This is essentially an $O(mn)$ problem.

O.K. examples: $O(n)$ is better than $O(n \log n)$, which is better than $O(n^2)$, which is better than $O(n^3)$.

Terrible examples: $O(k^n)$ is exponential time. Horrible!

NP problems (NP = non-deterministic polynomial): the hardest of these, the NP-complete problems, have no known polynomial-time solutions.
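To make the ranking of growth rates concrete, here is a minimal Python sketch that tabulates rough step counts for a few input sizes. The function names `growth_table` and `dot_matrix_comparisons` are hypothetical helpers introduced only for this illustration, and the exponential column assumes base $k = 2$.

```python
from math import log2

def growth_table(sizes):
    """Return rough step counts for several orders of growth."""
    table = []
    for n in sizes:
        table.append({
            "n": n,
            "n log n": round(n * log2(n)),
            "n^2": n ** 2,
            "n^3": n ** 3,
            "2^n": 2 ** n,  # exponential growth; base k = 2 is an assumption
        })
    return table

def dot_matrix_comparisons(m, n):
    """An un-gapped dot matrix compares every residue pair once: m * n cells."""
    return m * n

if __name__ == "__main__":
    for row in growth_table([10, 100, 300]):
        print(row)
    print(dot_matrix_comparisons(300, 300))  # 90000
```

Already at n = 100 the exponential column dwarfs the cubic one, which is why the polynomial cases are treated as tractable and the exponential case is not.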

Recursion and Dynamic Programming

Aligning two protein sequences without gaps is roughly an $O(mn)$ problem. With gaps, it becomes computationally astronomical and cannot be done by direct comparison methods ($= 2^{2L} / (2\pi L)$; $L$ = sequence length).

The alternative is to compare all possible pairs of characters (matches and mismatches), taking gaps into account, while keeping the number of comparisons manageable. The approach is called dynamic programming.

• Mathematically proven to produce the optimal alignment for a given scoring scheme
• Needs a substitution (similarity) matrix and some way to account for gaps

Example of how to score an alignment. Write down the two sequences, with a gap in sequence #1 opposite the L of sequence #2:

| | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| sequence #1 | V | D | S | - | C | Y |
| sequence #2 | V | E | S | L | C | Y |
| score from substitution matrix | 4 | 2 | 4 | -11 | 9 | 7 |

Score = Σ (AA pair scores) - gap penalty = (4 + 2 + 4 + 9 + 7) - 11 = 15
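The scoring example can be checked with a few lines of Python. This is a minimal sketch: `PAIR_SCORES` holds only the five pair scores quoted in the example, the gap penalty of 11 is the value used there, and the function name `score_alignment` is a hypothetical helper, not part of any library.

```python
# Pair scores copied from the worked example (only the pairs that occur in it).
PAIR_SCORES = {
    ("V", "V"): 4,
    ("D", "E"): 2,
    ("S", "S"): 4,
    ("C", "C"): 9,
    ("Y", "Y"): 7,
}
GAP_PENALTY = 11  # penalty charged for each gap column in the example

def score_alignment(row1, row2, pair_scores, gap_penalty):
    """Add the pair score of every column; subtract the penalty for gap columns."""
    if len(row1) != len(row2):
        raise ValueError("aligned rows must have equal length")
    total = 0
    for a, b in zip(row1, row2):
        if a == "-" or b == "-":
            total -= gap_penalty
        elif (a, b) in pair_scores:
            total += pair_scores[(a, b)]
        else:
            # Substitution scores are symmetric, so look up the reversed pair.
            total += pair_scores[(b, a)]
    return total

example_score = score_alignment("VDS-CY", "VESLCY", PAIR_SCORES, GAP_PENALTY)
print(example_score)  # 15
```

The gap is written explicitly as `-` in the first row, so the function scores a given alignment; it does not search for the best one.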
BLOSUM 62 Scoring Matrix

[Table: BLOSUM 62 substitution scores, rows and columns ordered C S T P A G N D E Q H R K M I L V F Y W]
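With pair scores and a gap penalty in hand, the dynamic programming idea described above can be sketched as the Needleman-Wunsch recurrence named in the outline. The sketch below rests on stated assumptions: a linear gap penalty of 11, only the five pair scores from the worked example, and a placeholder score of -1 for every other pair (that placeholder is not a BLOSUM 62 value). The function name `needleman_wunsch_score` is hypothetical.

```python
# Pair scores from the worked example; every other pair falls back to DEFAULT_SCORE.
PAIR_SCORES = {
    ("V", "V"): 4,
    ("D", "E"): 2,
    ("S", "S"): 4,
    ("C", "C"): 9,
    ("Y", "Y"): 7,
}
DEFAULT_SCORE = -1  # placeholder assumption, NOT taken from BLOSUM 62
GAP_PENALTY = 11    # linear gap penalty, as in the worked example

def needleman_wunsch_score(seq1, seq2, pair_scores, default_score, gap_penalty):
    """Best global alignment score of seq1 vs seq2 with a linear gap penalty."""
    def sub(a, b):
        # Symmetric lookup with a fallback for pairs not listed.
        return pair_scores.get((a, b), pair_scores.get((b, a), default_score))

    rows, cols = len(seq1) + 1, len(seq2) + 1
    # best[i][j] = best score for aligning seq1[:i] with seq2[:j]
    best = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        best[i][0] = -gap_penalty * i  # seq1 prefix aligned to gaps only
    for j in range(1, cols):
        best[0][j] = -gap_penalty * j  # seq2 prefix aligned to gaps only
    for i in range(1, rows):
        for j in range(1, cols):
            best[i][j] = max(
                best[i - 1][j - 1] + sub(seq1[i - 1], seq2[j - 1]),  # pair residues
                best[i - 1][j] - gap_penalty,                        # gap in seq2
                best[i][j - 1] - gap_penalty,                        # gap in seq1
            )
    return best[-1][-1]

best_score = needleman_wunsch_score(
    "VDSCY", "VESLCY", PAIR_SCORES, DEFAULT_SCORE, GAP_PENALTY
)
print(best_score)  # 15
```

The table has (m + 1)(n + 1) cells and each is filled from three neighbors, so the gapped problem is solved in $O(mn)$ steps rather than by enumerating alignments. Under these assumptions the best score matches the hand-scored alignment with a single gap opposite L.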

Scoring system should:
• Favor matching identical or related amino acids
• Penalize poor matches and gaps

To get a good scoring system, we need to know:
• How often a particular amino acid …

## This note was uploaded on 11/11/2011 for the course BIO 20.410j taught by Professor Roger D. Kamm during the Spring '03 term at MIT.

