Lecture6_Alignment - An Introduction to Bioinformatics...

An Introduction to Bioinformatics Algorithms Sequence Alignment

From LCS to Alignment: Change up the Scoring
The Longest Common Subsequence (LCS) problem, the simplest form of sequence alignment, allows only insertions and deletions (no mismatches). In the LCS problem, we scored 1 for matches and 0 for indels. Consider penalizing indels and mismatches with negative scores. The simplest scoring schema:
  • +1 : match premium
  • −μ : mismatch penalty
  • −σ : indel penalty
Simple Scoring
When mismatches are penalized by μ, indels are penalized by σ, and matches are rewarded with +1, the resulting score is:
\[ \text{score} = \#\text{matches} - \mu \cdot \#\text{mismatches} - \sigma \cdot \#\text{indels} \]
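The simple score can be computed directly from the two rows of an alignment. The following is a minimal sketch, not the lecture's own code: the function name `simple_score`, the parameter names `mu` (mismatch penalty) and `sigma` (indel penalty), and the use of "-" as the gap character are all assumptions made for illustration.

```python
def simple_score(row_v, row_w, mu=1, sigma=1, gap="-"):
    """Score an alignment given as two equal-length rows.

    Each column contributes +1 for a match, -mu for a mismatch,
    and -sigma for an indel (a symbol aligned against the gap character).
    """
    if len(row_v) != len(row_w):
        raise ValueError("alignment rows must have the same length")

    matches = mismatches = indels = 0
    for a, b in zip(row_v, row_w):
        if a == gap and b == gap:
            raise ValueError("a column cannot contain two gaps")
        if a == gap or b == gap:
            indels += 1
        elif a == b:
            matches += 1
        else:
            mismatches += 1

    # score = #matches - mu * #mismatches - sigma * #indels
    return matches - mu * mismatches - sigma * indels


# Hypothetical example rows: 3 matches, 1 mismatch, 1 indel.
print(simple_score("AT-GC", "ATCGA", mu=1, sigma=2))
```

Setting `mu` very large and `sigma` to 0 brings the score back to the LCS setting, where mismatches are effectively forbidden and indels are free.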

The Global Alignment Problem
Find the best alignment between two strings under a given scoring schema.
Input: Strings v and w and a scoring schema
Output: Alignment of maximum score
An indel step scores −σ; a diagonal step scores +1 for a match and −μ for a mismatch, where μ is the mismatch penalty and σ is the indel penalty. The recurrence is:
\[
s_{i,j} = \max
\begin{cases}
s_{i-1,j-1} + 1 & \text{if } v_i = w_j \\
s_{i-1,j-1} - \mu & \text{if } v_i \neq w_j \\
s_{i-1,j} - \sigma \\
s_{i,j-1} - \sigma
\end{cases}
\]
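The recurrence for the global alignment score can be filled in as a dynamic programming table. The sketch below is illustrative only: the name `global_alignment_score` and the parameters `mu` (mismatch penalty) and `sigma` (indel penalty) are assumptions, and so is the boundary condition, which the slide does not state. Here the first row and column are initialized so that aligning a prefix of length k against the empty string costs k indels.

```python
def global_alignment_score(v, w, mu=1, sigma=1):
    """Return the maximum global alignment score of strings v and w.

    Scoring: +1 per match, -mu per mismatch, -sigma per indel.
    """
    n, m = len(v), len(w)
    # s[i][j] holds the best score for aligning v[:i] with w[:j].
    s = [[0] * (m + 1) for _ in range(n + 1)]

    # Assumed boundary: a prefix aligned against nothing is all indels.
    for i in range(1, n + 1):
        s[i][0] = -sigma * i
    for j in range(1, m + 1):
        s[0][j] = -sigma * j

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Diagonal step: match (+1) or mismatch (-mu).
            diagonal = s[i - 1][j - 1] + (1 if v[i - 1] == w[j - 1] else -mu)
            # Vertical and horizontal steps: indels (-sigma each).
            s[i][j] = max(diagonal, s[i - 1][j] - sigma, s[i][j - 1] - sigma)

    return s[n][m]


print(global_alignment_score("ACGT", "AGT", mu=1, sigma=1))
```

The table has (n+1) x (m+1) entries and each entry takes constant time, so the running time and memory are both proportional to the product of the two string lengths.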
Scoring Matrices
To generalize scoring, consider a (4+1) x (4+1) scoring matrix δ. For an amino acid sequence alignment, the scoring matrix would be of size (20+1) x (20+1). The addition of 1 accounts for scores against the gap character "-". This simplifies the recurrence as follows:
\[
s_{i,j} = \max
\begin{cases}
s_{i-1,j-1} + \delta(v_i, w_j) \\
s_{i-1,j} + \delta(v_i, -) \\
s_{i,j-1} + \delta(-, w_j)
\end{cases}
\]
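With a scoring matrix, the same dynamic program needs only table lookups. The following sketch assumes the matrix is stored as a Python dict keyed by symbol pairs; the helper names `make_delta` and `align`, the constant `GAP`, and the boundary initialization are hypothetical choices for illustration, not part of the lecture. The gap-versus-gap entry is left out because that column never occurs in an alignment. A traceback is included to recover one optimal alignment.

```python
GAP = "-"


def make_delta(alphabet, match=1, mismatch=-1, indel=-1):
    """Build a scoring table over the alphabet plus the gap character."""
    symbols = list(alphabet) + [GAP]
    delta = {}
    for a in symbols:
        for b in symbols:
            if a == GAP and b == GAP:
                continue  # gap against gap never occurs in an alignment
            if a == GAP or b == GAP:
                delta[(a, b)] = indel
            elif a == b:
                delta[(a, b)] = match
            else:
                delta[(a, b)] = mismatch
    return delta


def align(v, w, delta):
    """Return (score, aligned_v, aligned_w) for a best global alignment."""
    n, m = len(v), len(w)
    s = [[0] * (m + 1) for _ in range(n + 1)]
    # Assumed boundary: prefixes aligned against nothing are all gaps.
    for i in range(1, n + 1):
        s[i][0] = s[i - 1][0] + delta[(v[i - 1], GAP)]
    for j in range(1, m + 1):
        s[0][j] = s[0][j - 1] + delta[(GAP, w[j - 1])]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s[i][j] = max(
                s[i - 1][j - 1] + delta[(v[i - 1], w[j - 1])],
                s[i - 1][j] + delta[(v[i - 1], GAP)],
                s[i][j - 1] + delta[(GAP, w[j - 1])],
            )

    # Traceback: walk from (n, m) to (0, 0), re-deriving each choice.
    row_v, row_w = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and s[i][j] == s[i - 1][j - 1] + delta[(v[i - 1], w[j - 1])]:
            row_v.append(v[i - 1])
            row_w.append(w[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and s[i][j] == s[i - 1][j] + delta[(v[i - 1], GAP)]:
            row_v.append(v[i - 1])
            row_w.append(GAP)
            i -= 1
        else:
            row_v.append(GAP)
            row_w.append(w[j - 1])
            j -= 1
    return s[n][m], "".join(reversed(row_v)), "".join(reversed(row_w))


delta = make_delta("ACGT")
print(align("ACGT", "AGT", delta))
```

Swapping in a different table, for example one derived from biological data, changes the scoring without touching `align`.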

Measuring Similarity
Measuring the extent of similarity between two sequences:
  • Based on percent sequence identity
  • Based on conservation
Percent Sequence Identity
The extent to which two nucleotide or amino acid sequences are invariant.
    A C C T G A G - A G
    A C G T G - G C A G
70% identical: 7 of the 10 columns match; the other three are one mismatch (C/G) and two indels.
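Percent identity can be computed by counting identical columns in an alignment. The sketch below is one possible convention, not the lecture's definition: the function name `percent_identity` is hypothetical, and dividing by the total number of alignment columns (including indel columns) is an assumption, since other denominators are also in use. The example rows place the gaps as an assumption, chosen so that the two 9-letter sequences above line up to the stated 70%.

```python
def percent_identity(row_v, row_w, gap="-"):
    """Percent of alignment columns in which both rows carry the same symbol.

    Assumption: the denominator is the number of alignment columns,
    so indel columns count against identity.
    """
    if len(row_v) != len(row_w) or not row_v:
        raise ValueError("rows must be non-empty and of equal length")
    identical = sum(1 for a, b in zip(row_v, row_w) if a == b and a != gap)
    return 100.0 * identical / len(row_v)


# Gap placement here is an assumed reconstruction of the slide's example.
print(percent_identity("ACCTGAG-AG", "ACGTG-GCAG"))
```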

Making a Scoring Matrix
Scoring matrices are created based on biological evidence.
  • Fall '10
  • Bioinformatics, DNA, Sequence alignment, global alignment, Bioinformatics Algorithms
