Lecture6_Alignment

# Lecture6_Alignment - An Introduction to Bioinformatics...

An Introduction to Bioinformatics Algorithms Sequence Alignment

From LCS to Alignment: Change up the Scoring

The Longest Common Subsequence (LCS) problem, the simplest form of sequence alignment, allows only insertions and deletions (no mismatches). In the LCS problem, we scored 1 for matches and 0 for indels. Consider penalizing indels and mismatches with negative scores.

Simplest scoring schema:
• $+1$ : match premium
• $-\mu$ : mismatch penalty
• $-\sigma$ : indel penalty

Simple Scoring

When mismatches are penalized by $\mu$, indels are penalized by $\sigma$, and matches are rewarded with $+1$, the resulting score is:

$$
\#\text{matches} - \mu \, (\#\text{mismatches}) - \sigma \, (\#\text{indels})
$$
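
The scoring rule above can be sketched in Python. This is a minimal illustration rather than code from the lecture: the function name `simple_score`, the parameter names `mu` (mismatch penalty) and `sigma` (indel penalty), and the use of "-" as the gap character are all assumptions made for the example.

```python
def simple_score(row_v: str, row_w: str, mu: float, sigma: float) -> float:
    """Score a gapped alignment as #matches - mu*#mismatches - sigma*#indels.

    row_v and row_w are the two rows of an alignment, already padded with
    "-" (assumed gap character) so that they have the same length.
    """
    if len(row_v) != len(row_w):
        raise ValueError("alignment rows must have equal length")

    matches = mismatches = indels = 0
    for a, b in zip(row_v, row_w):
        if a == "-" or b == "-":
            indels += 1          # a column containing a gap is an indel
        elif a == b:
            matches += 1
        else:
            mismatches += 1

    return matches - mu * mismatches - sigma * indels


# Hypothetical example alignment: 5 matches, 1 mismatch, 2 indels.
print(simple_score("AT-GTTAT", "ATCGT-AC", mu=1, sigma=2))
```

With `mu=1` and `sigma=2`, the example alignment scores 5 - 1·1 - 2·2 = 0. Setting `mu` very large and `sigma=0` recovers LCS-style scoring, where only matches count.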

The Global Alignment Problem

Find the best alignment between two strings under a given scoring schema.

• Input: Strings $v$ and $w$ and a scoring schema
• Output: Alignment of maximum score

Edge weights: ↑ and → (indels) $= -\sigma$; ↖ $= +1$ if match, $= -\mu$ if mismatch.

$$
s_{i,j} = \max
\begin{cases}
s_{i-1,j-1} + 1 & \text{if } v_i = w_j \\
s_{i-1,j-1} - \mu & \text{if } v_i \neq w_j \\
s_{i-1,j} - \sigma \\
s_{i,j-1} - \sigma
\end{cases}
$$

$\mu$ : mismatch penalty, $\sigma$ : indel penalty

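
The recurrence for the global alignment score can be filled in with a dynamic programming table. The sketch below computes only the optimal score (no traceback). The names `global_alignment_score`, `mu`, and `sigma` are assumptions for illustration, and the boundary values (each leading gap costs the indel penalty) are the usual convention for global alignment, which the slide does not state explicitly.

```python
def global_alignment_score(v: str, w: str, mu: float, sigma: float) -> float:
    """Optimal global alignment score with match +1, mismatch -mu, indel -sigma."""
    n, m = len(v), len(w)
    # s[i][j] = best score for aligning the prefixes v[:i] and w[:j]
    s = [[0] * (m + 1) for _ in range(n + 1)]

    # Assumed boundary conditions: a prefix aligned against the empty
    # string consists only of indels.
    for i in range(1, n + 1):
        s[i][0] = -sigma * i
    for j in range(1, m + 1):
        s[0][j] = -sigma * j

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Strings are 0-indexed in Python, so v_i is v[i - 1].
            diagonal = s[i - 1][j - 1] + (1 if v[i - 1] == w[j - 1] else -mu)
            s[i][j] = max(
                diagonal,                # match or mismatch
                s[i - 1][j] - sigma,     # v_i aligned against a gap
                s[i][j - 1] - sigma,     # w_j aligned against a gap
            )
    return s[n][m]


print(global_alignment_score("ACGT", "AGT", mu=1, sigma=1))
```

With `mu=0` and `sigma=0` the same table computes the length of a longest common subsequence, which matches the way the lecture derives alignment scoring from LCS.
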
Scoring Matrices

To generalize scoring, consider a (4+1) x (4+1) scoring matrix $\delta$. For an amino acid sequence alignment, the scoring matrix would be of size (20+1) x (20+1). The addition of 1 is to include the score for comparison with the gap character "-".

This simplifies the recurrence as follows:

$$
s_{i,j} = \max
\begin{cases}
s_{i-1,j-1} + \delta(v_i, w_j) \\
s_{i-1,j} + \delta(v_i, -) \\
s_{i,j-1} + \delta(-, w_j)
\end{cases}
$$
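
One way to see how a scoring matrix generalizes the match, mismatch, and indel scores is to store it as a lookup table keyed by pairs of characters, with "-" as the gap. The sketch below is an illustration under assumptions: `make_simple_delta` and `align_with_matrix` are hypothetical names, the dictionary representation is a design choice for brevity, and the gap-versus-gap entry is omitted because the recurrence never reads it.

```python
from itertools import product
from typing import Dict, Tuple

GAP = "-"
Delta = Dict[Tuple[str, str], float]


def make_simple_delta(alphabet: str, mu: float, sigma: float) -> Delta:
    """Build a scoring table equivalent to match +1, mismatch -mu, indel -sigma."""
    delta: Delta = {}
    for a, b in product(alphabet + GAP, repeat=2):
        if a == GAP and b == GAP:
            continue  # gap against gap never occurs in the recurrence
        if a == GAP or b == GAP:
            delta[a, b] = -sigma
        elif a == b:
            delta[a, b] = 1
        else:
            delta[a, b] = -mu
    return delta


def align_with_matrix(v: str, w: str, delta: Delta) -> float:
    """Optimal global alignment score under an arbitrary scoring table."""
    n, m = len(v), len(w)
    s = [[0] * (m + 1) for _ in range(n + 1)]
    # Assumed boundary conditions: prefixes aligned entirely against gaps.
    for i in range(1, n + 1):
        s[i][0] = s[i - 1][0] + delta[v[i - 1], GAP]
    for j in range(1, m + 1):
        s[0][j] = s[0][j - 1] + delta[GAP, w[j - 1]]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s[i][j] = max(
                s[i - 1][j - 1] + delta[v[i - 1], w[j - 1]],
                s[i - 1][j] + delta[v[i - 1], GAP],
                s[i][j - 1] + delta[GAP, w[j - 1]],
            )
    return s[n][m]


dna_delta = make_simple_delta("ACGT", mu=1, sigma=1)
print(align_with_matrix("ACGT", "AGT", dna_delta))
```

Because all three cases of the recurrence are now table lookups, swapping in a different table changes the scoring without touching the alignment loop.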

Measuring Similarity

Measuring the extent of similarity between two sequences:
• Based on percent sequence identity
• Based on conservation

Percent Sequence Identity

The extent to which two nucleotide or amino acid sequences are invariant.

Example: an alignment of ACCTGAGAG with ACGTGGCAG is 70% identical; the remaining columns are mismatches and indels.
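
Percent identity can be computed directly from the two rows of an alignment. The sketch below makes two assumptions that should be read as such: identity is taken as matching columns divided by total alignment columns (other denominators are also in use), and the gap positions in the example are one placement consistent with the 70% figure, not necessarily the placement on the original slide. The function name `percent_identity` is hypothetical.

```python
def percent_identity(row_v: str, row_w: str, gap: str = "-") -> float:
    """Percentage of alignment columns in which both rows carry the same residue."""
    if len(row_v) != len(row_w) or not row_v:
        raise ValueError("alignment rows must be non-empty and of equal length")
    # A column counts as identical only if the characters agree and are not gaps.
    matches = sum(1 for a, b in zip(row_v, row_w) if a == b and a != gap)
    return 100.0 * matches / len(row_v)


# Assumed gap placement for the two sequences in the example:
# 7 matches, 1 mismatch (C/G), and 2 indels over 10 columns.
print(percent_identity("ACCTGAG-AG", "ACGTG-GCAG"))  # 70.0
```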

Making a Scoring Matrix

Scoring matrices are created based on biological evidence.
• Fall '10

