# Anchor a g c g t t a g g t c c t a g t

Slides dated 1/28/14.

## Original BLAST

An example with k = 4, T = 4, comparing the sequences GATAAGTAAGGTCCAGT and TTCAACTAAGGTCCTCA:

1. The matching word AGGT initiates an alignment.
2. The alignment is extended to the left and right, with no gaps, until the alignment score falls below 50%.
3. Output:

       AAGTAAGGTCC
       AACTAAGGTCC

## Gapped BLAST

An example with k = 4, T = 4, comparing the sequences ACGAAGTAAGGTCCAGT and AGCGTTAGGTCCTAGTC:

1. The matching word GGTC initiates an alignment.
2. The alignment is extended in a band around the anchor.
3. Output:

       GTAAGGTCCAGT
       GTTAGGTC-AGT

[Screenshot slides: BLAST Portal, BLAST: Input, BLAST Parameters, BLAST Results]

## Statistics of Protein Sequence Alignment

- Statistics of global alignment: unfortunately, not much is known. Statistics are based on Monte Carlo simulations (shuffle one sequence and recompute the alignment to get a distribution of scores).
- Statistics of local alignment: well understood for ungapped alignments. The same theory probably applies to gapped alignments.

What is a local alignment?

“Pair of equal length segments, one from each sequence, whose scores can not be improved by extension or trimming. These are called high-scoring pairs, or HSP”

http://www.people.virginia.edu/~wrp/cshl98/Altschul/Altschul-1.html

## The E-value for a sequence alignment

HSP scores follow an extreme value distribution, characterized by two parameters, K and λ.
The expected number of HSPs with score at least S is given by:

$$E = K\,m\,n\,e^{-\lambda S}$$

where m and n are the sequence lengths and E is the E-value.

[Figure: plot with horizontal axis S]

## The Bit Score of a sequence alignment

Raw scores have little meaning without knowledge of the scoring scheme used for the alignment, or equivalently of the parameters K and λ. Scores can be normalized according to:

$$S' = \frac{\lambda S - \ln K}{\ln 2}$$

S' is the bit score of the alignment. The E-value can be expressed as:...
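The two formulas above, E = Kmn exp(−λS) and S' = (λS − ln K) / ln 2, can be turned into a short numerical sketch. The function names and the values of K, λ, m, n and S below are illustrative assumptions, not taken from the slides. The last function rests on the identity E = mn · 2^(−S'), which follows from substituting the bit-score definition into the E-value formula; it is a derived consequence, not a quotation of the slide.

```python
import math


def evalue(raw_score, m, n, K, lam):
    """Expected number of HSPs with score >= raw_score: E = K*m*n*exp(-lam*S)."""
    return K * m * n * math.exp(-lam * raw_score)


def bit_score(raw_score, K, lam):
    """Normalized (bit) score: S' = (lam*S - ln K) / ln 2."""
    return (lam * raw_score - math.log(K)) / math.log(2)


def evalue_from_bits(bits, m, n):
    """E-value from a bit score, derived by substitution: E = m*n*2**(-S')."""
    return m * n * 2.0 ** (-bits)


if __name__ == "__main__":
    # Hypothetical parameters, chosen only for illustration.
    K, lam = 0.13, 0.32
    m, n = 250, 1_000_000   # sequence lengths
    S = 80                  # raw alignment score

    bits = bit_score(S, K, lam)
    print(f"bit score             = {bits:.2f}")
    print(f"E-value (raw score)   = {evalue(S, m, n, K, lam):.3e}")
    print(f"E-value (bit score)   = {evalue_from_bits(bits, m, n):.3e}")
```

Both routes give the same E-value, which is why bit scores can be compared across scoring schemes: K and λ are absorbed into S'.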

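The ungapped seed-and-extend procedure from the Original BLAST example can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual BLAST implementation: seeds are exact k-letter word matches (the threshold T is not modeled), scoring is +1 per match and −1 per mismatch, and "falls below 50%" is read as "the running score drops below half of the best score seen so far". All function and parameter names are hypothetical.

```python
def find_seeds(a, b, k):
    """Yield (i, j) pairs such that a[i:i+k] == b[j:j+k] (exact word matches)."""
    index = {}
    for j in range(len(b) - k + 1):
        index.setdefault(b[j:j + k], []).append(j)
    for i in range(len(a) - k + 1):
        for j in index.get(a[i:i + k], []):
            yield i, j


def _extend(a, b, i, j, step, start_score, match, mismatch, drop_fraction):
    """Walk along the diagonal from (i, j) in direction `step` (+1 or -1).

    Returns (best_score, kept), where `kept` is the number of positions
    included up to the point where the best score was reached.
    """
    score = best = start_score
    kept = walked = 0
    while 0 <= i < len(a) and 0 <= j < len(b):
        score += match if a[i] == b[j] else mismatch
        walked += 1
        if score > best:
            best, kept = score, walked
        elif score < drop_fraction * best:
            break  # assumed stopping rule: score fell below 50% of the best
        i += step
        j += step
    return best, kept


def ungapped_extend(a, b, i, j, k, match=1, mismatch=-1, drop_fraction=0.5):
    """Extend the seed a[i:i+k] / b[j:j+k] right, then left, without gaps."""
    best, right = _extend(a, b, i + k, j + k, +1, k * match,
                          match, mismatch, drop_fraction)
    best, left = _extend(a, b, i - 1, j - 1, -1, best,
                         match, mismatch, drop_fraction)
    return a[i - left:i + k + right], b[j - left:j + k + right], best


if __name__ == "__main__":
    a = "GATAAGTAAGGTCCAGT"
    b = "TTCAACTAAGGTCCTCA"
    i, j = a.find("AGGT"), b.find("AGGT")  # the seed word used in the example
    top, bottom, score = ungapped_extend(a, b, i, j, k=4)
    print(top)
    print(bottom)
    print("score:", score)
```

With these assumed scores, extending from the seed AGGT yields the pair AAGTAAGGTCC / AACTAAGGTCC, the same output as in the example. Trimming back to the position of the best score is what keeps trailing mismatches out of the reported segment.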