# Lecture6_Alignment - An Introduction to Bioinformatics...

An Introduction to Bioinformatics Algorithms Sequence Alignment

## From LCS to Alignment: Change up the Scoring

The Longest Common Subsequence (LCS) problem, the simplest form of sequence alignment, allows only insertions and deletions (no mismatches). In the LCS problem, we scored 1 for matches and 0 for indels. Now consider penalizing indels and mismatches with negative scores. The simplest scoring schema:

• $+1$: match premium
• $-\mu$: mismatch penalty
• $-\sigma$: indel penalty

## Simple Scoring

When mismatches are penalized by $\mu$, indels are penalized by $\sigma$, and matches are rewarded with $+1$, the resulting score of an alignment is:

$$\text{score} = \#\text{matches} - \mu \cdot \#\text{mismatches} - \sigma \cdot \#\text{indels}$$
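
The score formula can be checked on a concrete alignment. The following is a minimal sketch, not code from the lecture: it assumes the alignment is given as two equal-length rows with "-" marking a gap, and the function name `simple_score` is hypothetical.

```python
def simple_score(top: str, bottom: str, mu: float, sigma: float) -> float:
    """Score an alignment: #matches - mu * #mismatches - sigma * #indels.

    Assumption: the alignment is passed as two equal-length rows,
    with '-' standing for a gap.
    """
    if len(top) != len(bottom):
        raise ValueError("alignment rows must have equal length")

    matches = mismatches = indels = 0
    for a, b in zip(top, bottom):
        if a == "-" and b == "-":
            raise ValueError("a column cannot contain two gaps")
        if a == "-" or b == "-":
            indels += 1          # symbol aligned against a gap
        elif a == b:
            matches += 1         # identical symbols
        else:
            mismatches += 1      # two different symbols
    return matches - mu * mismatches - sigma * indels


# Three matches, one mismatch (C/A), one indel: 3 - 1*1 - 2*1 = 0
print(simple_score("AT-GC", "ATTGA", mu=1, sigma=2))
```

Setting `mu` and `sigma` to 0 counts matches only, which is how the LCS problem scores an alignment.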

## The Global Alignment Problem

Find the best alignment between two strings under a given scoring schema.

• Input: strings $v$ and $w$ and a scoring schema
• Output: an alignment of maximum score

A diagonal step scores $+1$ on a match or $-\mu$ on a mismatch, and a horizontal or vertical step scores $-\sigma$:

$$
s_{i,j} = \max \begin{cases}
s_{i-1,j-1} + 1 & \text{if } v_i = w_j \\
s_{i-1,j-1} - \mu & \text{if } v_i \neq w_j \\
s_{i-1,j} - \sigma \\
s_{i,j-1} - \sigma
\end{cases}
$$

where $\mu$ is the mismatch penalty and $\sigma$ is the indel penalty.

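The recurrence can be filled in as a dynamic programming table. The sketch below is illustrative rather than the lecture's own code. The function name `global_alignment_score` is hypothetical, and the boundary values $s_{i,0} = -i\sigma$ and $s_{0,j} = -j\sigma$ are an assumption (the usual choice for global alignment) that the slide does not state.

```python
def global_alignment_score(v: str, w: str, mu: float, sigma: float) -> float:
    """Return the maximum global alignment score of v and w.

    Scoring: +1 per match, -mu per mismatch, -sigma per indel.
    """
    n, m = len(v), len(w)
    # s[i][j] = best score for aligning v[:i] with w[:j]
    s = [[0.0] * (m + 1) for _ in range(n + 1)]

    # Assumed boundary: a prefix aligned against nothing is all indels.
    for i in range(1, n + 1):
        s[i][0] = -sigma * i
    for j in range(1, m + 1):
        s[0][j] = -sigma * j

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Diagonal step: match (+1) or mismatch (-mu).
            diagonal = s[i - 1][j - 1] + (1 if v[i - 1] == w[j - 1] else -mu)
            s[i][j] = max(
                diagonal,
                s[i - 1][j] - sigma,   # v[i-1] aligned against a gap
                s[i][j - 1] - sigma,   # w[j-1] aligned against a gap
            )
    return s[n][m]


# Best alignment is ACGT / A-GT: three matches and one indel, 3 - 1 = 2
print(global_alignment_score("ACGT", "AGT", mu=1, sigma=1))
```

This version returns only the score. Recovering the alignment itself would require storing which of the three cases won at each cell and tracing back from $s_{n,m}$.
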
## Scoring Matrices

To generalize scoring, consider a $(4+1) \times (4+1)$ scoring matrix $\delta$. For amino acid sequence alignment, the scoring matrix would be of size $(20+1) \times (20+1)$. The extra row and column hold the scores for comparing a symbol against the gap character "-". This simplifies the recurrence as follows:

$$
s_{i,j} = \max \begin{cases}
s_{i-1,j-1} + \delta(v_i, w_j) \\
s_{i-1,j} + \delta(v_i, -) \\
s_{i,j-1} + \delta(-, w_j)
\end{cases}
$$
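
A sketch of the generalized recurrence, assuming $\delta$ is stored as a dictionary keyed by symbol pairs. The helper `make_delta`, the function `align_with_matrix`, and the boundary initialization are hypothetical choices made for illustration. The uniform match, mismatch, and indel values are placeholders, not a biologically derived matrix.

```python
from typing import Dict, Tuple

GAP = "-"
ScoringMatrix = Dict[Tuple[str, str], float]


def make_delta(alphabet: str, match: float, mismatch: float,
               indel: float) -> ScoringMatrix:
    """Build a (k+1) x (k+1) scoring matrix over alphabet plus the gap symbol."""
    delta: ScoringMatrix = {}
    symbols = alphabet + GAP
    for a in symbols:
        for b in symbols:
            if a == GAP and b == GAP:
                continue  # gap against gap never occurs in an alignment
            if a == GAP or b == GAP:
                delta[(a, b)] = indel
            elif a == b:
                delta[(a, b)] = match
            else:
                delta[(a, b)] = mismatch
    return delta


def align_with_matrix(v: str, w: str, delta: ScoringMatrix) -> float:
    """Maximum global alignment score of v and w under scoring matrix delta."""
    n, m = len(v), len(w)
    s = [[0.0] * (m + 1) for _ in range(n + 1)]
    # Assumed boundary: prefixes aligned entirely against gaps.
    for i in range(1, n + 1):
        s[i][0] = s[i - 1][0] + delta[(v[i - 1], GAP)]
    for j in range(1, m + 1):
        s[0][j] = s[0][j - 1] + delta[(GAP, w[j - 1])]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s[i][j] = max(
                s[i - 1][j - 1] + delta[(v[i - 1], w[j - 1])],
                s[i - 1][j] + delta[(v[i - 1], GAP)],
                s[i][j - 1] + delta[(GAP, w[j - 1])],
            )
    return s[n][m]


dna_delta = make_delta("ACGT", match=1, mismatch=-1, indel=-1)
print(len(dna_delta))  # 24 entries: the 5 x 5 table minus the gap/gap cell
print(align_with_matrix("ACGT", "AGT", dna_delta))
```

With uniform values, this reduces to the simple $+1$, $-\mu$, $-\sigma$ scheme. Replacing individual entries of `delta` lets different substitutions carry different scores without changing the recurrence.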

## Measuring Similarity

The extent of similarity between two sequences can be measured in two ways:

• Based on percent sequence identity
• Based on conservation

## Percent Sequence Identity

The extent to which two nucleotide or amino acid sequences are invariant.

Example (figure): the sequences ACCTGAGAG and ACGTGGCAG, aligned with mismatch and indel columns marked, are 70% identical.
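
Percent identity can be computed directly from an alignment. The sketch below rests on two assumptions. First, identity is taken as identical columns divided by total alignment columns; other conventions use a different denominator. Second, the gap placement in the example is hypothetical, chosen so that the two sequences from the slide give the stated 70%; the original figure's gap positions did not survive. The function name `percent_identity` is also an illustration.

```python
def percent_identity(top: str, bottom: str) -> float:
    """Percentage of alignment columns holding two identical symbols.

    Assumption: rows have equal length and '-' marks a gap; the
    denominator is the total number of alignment columns.
    """
    if len(top) != len(bottom) or not top:
        raise ValueError("alignment rows must be non-empty and of equal length")
    identical = sum(1 for a, b in zip(top, bottom) if a == b and a != "-")
    return 100.0 * identical / len(top)


# Hypothetical gap placement for ACCTGAGAG vs ACGTGGCAG:
# 10 columns, 7 identical, 1 mismatch (C/G), 2 indels.
print(percent_identity("ACCTGAG-AG", "ACGTG-GCAG"))  # 70.0
```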

## Making a Scoring Matrix

Scoring matrices are created based on biological evidence.
• Fall '10
• A
• Bioinformatics, DNA, Sequence alignment, global alignment, Bioinformatics Algorithms
