Ch06_Alignment

# Ch06_Alignment - An Introduction to Bioinformatics...

www.bioalgorithms.info An Introduction to Bioinformatics Algorithms Sequence Alignment

Outline: Global Alignment, Scoring Matrices, Local Alignment, Alignment with Affine Gap Penalties.
Outline - CHANGES. Scoring Matrices: ADD an extra slide with an example of a 5x5 matrix. Local Alignment: ADD an extra slide showing a naïve approach to local alignment.

From LCS to Alignment: Change up the Scoring. The Longest Common Subsequence (LCS) problem, the simplest form of sequence alignment, allows only insertions and deletions (no mismatches). In the LCS problem, we scored 1 for matches and 0 for indels. Now consider penalizing indels and mismatches with negative scores. The simplest scoring schema is: +1 (match premium), −μ (mismatch penalty), −σ (indel penalty).
Simple Scoring: when mismatches are penalized by −μ, indels are penalized by −σ, and matches are rewarded with +1, the resulting score is $\#\text{matches} - \mu\,(\#\text{mismatches}) - \sigma\,(\#\text{indels})$.
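
The simple scoring formula can be sketched in Python as a column-by-column count over a finished alignment. This is a minimal illustration, not code from the slides: the function name `alignment_score`, the use of an ASCII hyphen as the gap character, and the example strings are all assumptions made here.

```python
def alignment_score(row_v, row_w, mu=1, sigma=1, gap="-"):
    """Score an alignment given as two equal-length rows.

    Matches earn +1, each mismatch costs mu, each indel costs sigma.
    The gap character "-" is an assumption of this sketch.
    """
    if len(row_v) != len(row_w):
        raise ValueError("alignment rows must have equal length")
    matches = mismatches = indels = 0
    for a, b in zip(row_v, row_w):
        if a == gap and b == gap:
            raise ValueError("a column cannot contain two gaps")
        if a == gap or b == gap:
            indels += 1          # one symbol aligned against a gap
        elif a == b:
            matches += 1
        else:
            mismatches += 1
    return matches - mu * mismatches - sigma * indels


# Hypothetical alignment: 2 matches, 1 indel, 1 mismatch.
print(alignment_score("AT-G", "ATCC", mu=1, sigma=2))  # 2 - 1 - 2 = -1
```
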

The Global Alignment Problem: find the best alignment between two strings under a given scoring schema. Input: strings v and w and a scoring schema. Output: an alignment of maximum score. In the alignment graph, vertical (↑) and horizontal (→) edges have weight −σ, and diagonal edges have weight +1 for a match and −μ for a mismatch. The corresponding recurrence is

$$
s_{i,j} = \max
\begin{cases}
s_{i-1,j-1} + 1 & \text{if } v_i = w_j \\
s_{i-1,j-1} - \mu & \text{if } v_i \neq w_j \\
s_{i-1,j} - \sigma \\
s_{i,j-1} - \sigma
\end{cases}
$$

where μ is the mismatch penalty and σ is the indel penalty.
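
The recurrence can be filled in row by row with dynamic programming. The sketch below returns only the maximum score, not the alignment itself. The boundary values s(i,0) = −iσ and s(0,j) = −jσ are not stated in the preview; they are the usual choice for global alignment and are an assumption here, as is the function name `global_alignment_score`.

```python
def global_alignment_score(v, w, mu=1, sigma=1):
    """Maximum global alignment score of v and w under +1 / -mu / -sigma."""
    n, m = len(v), len(w)
    s = [[0] * (m + 1) for _ in range(n + 1)]
    # Assumed boundary: aligning a prefix against nothing costs one indel per symbol.
    for i in range(1, n + 1):
        s[i][0] = -sigma * i
    for j in range(1, m + 1):
        s[0][j] = -sigma * j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Diagonal edge: +1 on a match, -mu on a mismatch.
            diagonal = s[i - 1][j - 1] + (1 if v[i - 1] == w[j - 1] else -mu)
            s[i][j] = max(
                diagonal,
                s[i - 1][j] - sigma,   # vertical edge: v[i-1] against a gap
                s[i][j - 1] - sigma,   # horizontal edge: gap against w[j-1]
            )
    return s[n][m]


print(global_alignment_score("ACGT", "AGT", mu=1, sigma=1))  # 3 matches, 1 indel -> 2
```

The table has (n+1)(m+1) entries and each takes constant time, so the running time is proportional to the product of the two string lengths.
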
Scoring Matrices: to generalize scoring, consider a (4+1) × (4+1) scoring matrix δ. For amino acid sequence alignment, the scoring matrix would be (20+1) × (20+1). The extra row and column hold the scores for comparison with the gap character “-”. This simplifies the recurrence to

$$
s_{i,j} = \max
\begin{cases}
s_{i-1,j-1} + \delta(v_i, w_j) \\
s_{i-1,j} + \delta(v_i, -) \\
s_{i,j-1} + \delta(-, w_j)
\end{cases}
$$
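
As a sketch of how a scoring matrix δ plugs into the recurrence, the Python below stores δ as a dictionary keyed by symbol pairs and adds a traceback to recover one optimal alignment. The helper names `make_delta` and `align_with_matrix`, the dictionary representation, the boundary initialization, and the omission of the gap-versus-gap entry are all assumptions of this illustration rather than details from the slides.

```python
def make_delta(alphabet="ACGT", match=1, mu=1, sigma=1, gap="-"):
    """Build a (4+1) x (4+1) scoring matrix as a dict keyed by (symbol, symbol)."""
    symbols = list(alphabet) + [gap]
    delta = {}
    for a in symbols:
        for b in symbols:
            if a == gap and b == gap:
                continue  # a gap is never aligned against a gap, so skip it
            if a == gap or b == gap:
                delta[(a, b)] = -sigma
            elif a == b:
                delta[(a, b)] = match
            else:
                delta[(a, b)] = -mu
    return delta


def align_with_matrix(v, w, delta, gap="-"):
    """Return (score, aligned_v, aligned_w) for one optimal global alignment."""
    n, m = len(v), len(w)
    s = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):  # assumed boundary: prefix of v against gaps
        s[i][0] = s[i - 1][0] + delta[(v[i - 1], gap)]
    for j in range(1, m + 1):  # assumed boundary: gaps against prefix of w
        s[0][j] = s[0][j - 1] + delta[(gap, w[j - 1])]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s[i][j] = max(
                s[i - 1][j - 1] + delta[(v[i - 1], w[j - 1])],
                s[i - 1][j] + delta[(v[i - 1], gap)],
                s[i][j - 1] + delta[(gap, w[j - 1])],
            )
    # Traceback: walk from (n, m) to (0, 0), re-deriving which case gave the max.
    row_v, row_w = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and s[i][j] == s[i - 1][j - 1] + delta[(v[i - 1], w[j - 1])]:
            row_v.append(v[i - 1])
            row_w.append(w[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and s[i][j] == s[i - 1][j] + delta[(v[i - 1], gap)]:
            row_v.append(v[i - 1])
            row_w.append(gap)
            i -= 1
        else:
            row_v.append(gap)
            row_w.append(w[j - 1])
            j -= 1
    return s[n][m], "".join(reversed(row_v)), "".join(reversed(row_w))


score, top, bottom = align_with_matrix("ACGT", "AGT", make_delta())
print(score)
print(top)
print(bottom)
```

Swapping in a different δ (for example a 21 × 21 amino acid matrix) changes only the dictionary, not the alignment routine, which is the simplification the recurrence is after.
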

Measuring Similarity: the extent of similarity between two sequences can be measured based on percent sequence identity or based on conservation.
Percent Sequence Identity: the extent to which two nucleotide or amino acid sequences are invariant. The example alignment below is 70% identical (7 of 10 columns match), with one mismatch and two indels:

    A C C T G A G – A G
    A C G T G – G C A G
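
The 70% figure can be reproduced with a short Python sketch. It assumes the denominator is the number of alignment columns (which matches the slide's example; other conventions exist), uses an ASCII hyphen in place of the slide's dash as the gap character, and the function name `percent_identity` is made up for illustration.

```python
def percent_identity(row_v, row_w, gap="-"):
    """Percentage of alignment columns whose two symbols are identical.

    Assumes the denominator is the total number of columns, gaps included.
    """
    if len(row_v) != len(row_w) or not row_v:
        raise ValueError("alignment rows must be non-empty and of equal length")
    identical = sum(1 for a, b in zip(row_v, row_w) if a == b and a != gap)
    return 100.0 * identical / len(row_v)


# The two rows of the example alignment, with "-" standing in for the gap.
print(percent_identity("ACCTGAG-AG", "ACGTG-GCAG"))  # 70.0
```
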

Making a Scoring Matrix: scoring matrices are created based on biological evidence.

## This note was uploaded on 02/10/2012 for the course CSE 5615 taught by Professor Mitra during the Fall '11 term at FIT.
