129_Lecture5_2014

1 - columns: Sequence 2

Assign a score S(i,j) to each entry in the table:
- Select a window size WS
- Compare the window around i with the window around j -> Score(i,j)

Display the table of scores S:
- Show a dot at position (i,j) if Score(i,j) > Threshold

The Scoring Scheme

Scores are usually stored in a "weight" matrix, also called a "substitution" matrix or "matching" matrix. Defining the "proper" matrix is still an active area of research:
1. Identity matrix
2. Chemical property matrix: amino acids or nucleotides are intuitively classified on the basis of their chemical properties
3. Substitution-based matrix: Dayhoff matrix, PAM matrices, BLOSUM matrices

Substitution Matrices

The Dayhoff matrix was created in 1978 from the few closely related (> 85% identity) sequences available at the time (1500 aligned amino acids).
The PAM family of matrices is a simple update of the original Dayhoff matrix.
The Gonnet matrices were created by exhaustive alignment of all database sequences in 1992.
The BLOSUM matrix is based on local similarities (blocks) of proteins rather than overall alignments.
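The dot-plot procedure described at the start of this section (score a window around i against a window around j, then show a dot where the score exceeds a threshold) can be sketched roughly as follows. This is a minimal illustration, not the lecture's own code: the function names `dot_plot` and `render`, the default identity scoring, the centered window, and the choice to skip window positions that fall off either sequence are all assumptions made for the example. Rows are taken to index the first sequence and columns the second.

```python
def dot_plot(seq1, seq2, window=3, threshold=2, score=None):
    """Return the set of (i, j) cells whose window score exceeds threshold.

    Assumptions (not from the slides): the window is centered on (i, j),
    positions outside either sequence are ignored, and the default scoring
    is an identity matrix (1 for a match, 0 otherwise).
    """
    if score is None:
        score = lambda a, b: 1 if a == b else 0  # identity "matrix"
    half = window // 2
    dots = set()
    for i in range(len(seq1)):
        for j in range(len(seq2)):
            total = 0
            # Compare the window around i with the window around j.
            for k in range(-half, half + 1):
                if 0 <= i + k < len(seq1) and 0 <= j + k < len(seq2):
                    total += score(seq1[i + k], seq2[j + k])
            if total > threshold:
                dots.add((i, j))
    return dots


def render(dots, n_rows, n_cols):
    """Draw the table as text: '*' where a dot is shown, '.' elsewhere."""
    return "\n".join(
        "".join("*" if (i, j) in dots else "." for j in range(n_cols))
        for i in range(n_rows)
    )


if __name__ == "__main__":
    a, b = "ACGT", "ACGT"
    print(render(dot_plot(a, b, window=3, threshold=2), len(a), len(b)))
```

Passing a different `score` callable (for example, a lookup into a substitution matrix) changes the scoring scheme without touching the windowing logic.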
1/28/14

Most common Scoring Matrices

BLOSUM matrices (Henikoff and Henikoff, 1992)
•  Start from "reliable" alignments of sequences with at least XX% identity
•  Compute mutation probabilities
•  Convert into scores -> BLOSUMXX matrix

PAM matrices (Dayhoff, 1974)
•  Point Accepted Mutation
•  Start with PAM score = 1: alignments of sequences with 1 mutation -> PAM1 matrix
•  Generate successive PAM matrices: PAMXX = (PAM1)^XX

Example of a Scoring matrix: Blosum62

   C  S  T  P  A  G  N  D  E  Q  H  R  K  M  I  L  V  F  Y  W
C  9 -1 -1 -3  0 -3 -3 -3 -4 -3 -3 -3 -3 -1 -1 -1 -1 -2 -2 -2
S -1  4  1 -1  1  0  1  0  0  0 -1 -1  0 -1 -2 -2 -2 -2 -2 -3
T -1  1  4  1 -1  1  0  1  0  0  0 -1  0 -1 -2 -2 -2 -2 -2 -3
P -3 -1  1  7 -1 -2 -1 -1 -1 -1 -2 -2 -1 -2 -3 -3 -2 -4 -3 -4
A  0  1 -1 -1  4  0 -1 -2 -1 -1 -2 -1 -1 -1 -1 -1 -2 -2 -2 -3
G -3  0  1 -2  0  6 -2 -1 -2 -2 -2 -2 -2 -3 -4 -4  0 -3 -3 -2
N -3  1  0 -2 -2  0  6  1  0  0 -1  0  0 -2 -3 -3 -3 -3 -2 -4
D -3  0  1 -1 -2 -1  1  6  2  0 -1 -2 -1 -3 -3 -4 -3 -3 -3 -4
E -4  0  0 -1 -1 -2  0  2  5  2  0  0  1 -2 -3 -3 -3 -3 -2 -3
Q -3  0  0 -1 -1 -2  0  0  2  5  0  1  1  0 -3 -2 -2 -3 -1 -2
H -3 -1  0 -2 -2 -2  1  1  0  0  8  0 -1 -2 -3 -3 -2 -1  2 -2
R -3 -1 -1 -2 -1 -2  0 -2  0  1  0  5  2 -1 -3 -2 -3 -3 -2 -3
K -3  0  0 -1 -1 -2  0 -1  1  1 -1  2  5 -1 -3 -2 -3 -3 -2 -3
M -1 -1 -1 -2 -1 -3 -2 -3 -2  0 -2 -1 -1  5  1  2 -2  0 -1 -1
I -1 -2 -2 -3 -1 -4 -3 -3 -3 -3 -3 -3 -3  1  4  2  1  0 -1 -3
L -1 -2 -2 -3 -1 -4 -3 -4 -3 -2 -3 -2 -2  2  2  4  3  0 -1 -2
V -1 -2 -2 -2  0 -3 -3 -3 -2 -2 -3 -3 -2  1  3  1  4 -1...
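
The PAM recipe listed above, where successive matrices are generated as powers of PAM1, can be illustrated with a small sketch. It assumes PAM1 is treated as a matrix of mutation probabilities whose rows sum to 1. The 3-letter matrix `TOY_PAM1` is made up purely for illustration and is not Dayhoff's data, and the helper names `mat_mul` and `mat_pow` are hypothetical. The sketch stops at the probability matrix and does not perform the conversion into scores.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def mat_pow(m, power):
    """Raise a square matrix to a non-negative integer power by repeated multiplication."""
    n = len(m)
    # Start from the identity matrix (power 0).
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(power):
        result = mat_mul(result, m)
    return result


# Made-up one-step mutation probabilities for a toy 3-letter alphabet.
# Each row sums to 1; these numbers are NOT real PAM1 values.
TOY_PAM1 = [
    [0.990, 0.005, 0.005],
    [0.005, 0.990, 0.005],
    [0.005, 0.005, 0.990],
]

if __name__ == "__main__":
    toy_pam250 = mat_pow(TOY_PAM1, 250)  # analogue of PAM250 = (PAM1)^250
    for row in toy_pam250:
        print(" ".join(f"{p:.4f}" for p in row))
```

In this toy case the diagonal entries shrink toward 1/3 as the power grows, which reflects the idea that higher-numbered PAM matrices model more distant sequences.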
