7 Optimal Alignment

1 Introduction to Bioinformatics/ Elements of Bioinformatics Pairwise sequence alignment – Optimal alignment

2 Converting a dotplot to alignments
sequence a = A T C C G C G
sequence b = A T G C G T C
[Figure: dotplot of sequence a (ATCCGCG) against sequence b (ATGCGTC), grid positions 0 to 7 on each axis]

3 Converting a dotplot to alignments
Dots can be joined to form diagonals, which suggest regions of similarity. Offsets between the diagonals suggest insertions or deletions.
[Figure: dotplot of sequence a (ATCCGCG) against sequence b (ATGCGTC)]

4 Converting a dotplot to alignments
sequence a = A T C C G C G - -
             | |     | | |
sequence b = A T - - G C G T C
Score: 1+1+0+0+1+1+1+0+0 = 5
Scoring scheme: Match: +1, Mismatch: 0, Gap: 0
[Figure: dotplot of sequence a (ATCCGCG) against sequence b (ATGCGTC)]

5 Converting a dotplot to alignments
sequence a = A T C C G C G - -
             | |     | | |
sequence b = A T - - G C G T C
Score: 1+1+0+0+1+1+1+0+0 = 5
Diagonal line = a residue in sequence a is aligned against a residue in sequence b.
[Figure: dotplot of sequence a (ATCCGCG) against sequence b (ATGCGTC)]

6 Converting a dotplot to alignments
sequence a = A T C C G C G - -
             | |     | | |
sequence b = A T - - G C G T C
Score: 1+1+0+0+1+1+1+0+0 = 5
Vertical line = a gap is added to sequence b; a residue in sequence a aligns to a gap in sequence b.
[Figure: dotplot of sequence a (ATCCGCG) against sequence b (ATGCGTC)]

7 Converting a dotplot to alignments
sequence a = A T C C G C G - -
             | |     | | |
sequence b = A T - - G C G T C
Score: 1+1+0+0+1+1+1+0+0 = 5
Horizontal line = a gap is added to sequence a; a residue in sequence b aligns to a gap in sequence a.
[Figure: dotplot of sequence a (ATCCGCG) against sequence b (ATGCGTC)]
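
The three line types on slides 5 to 7 can be read as moves along a path through the dotplot. The sketch below is a minimal illustration, assuming a one-letter move encoding that is not in the slides: D for a diagonal line, V for a vertical line (gap in sequence b) and H for a horizontal line (gap in sequence a). The function name is hypothetical.

```python
def path_to_alignment(seq_a, seq_b, moves):
    """Turn a path of D/V/H moves into two gapped alignment rows.

    Assumed encoding (not from the slides):
      D = diagonal:   residue of a against residue of b
      V = vertical:   residue of a against a gap in b
      H = horizontal: residue of b against a gap in a
    """
    row_a, row_b = [], []
    i = j = 0  # next unused position in seq_a and seq_b
    for move in moves:
        if move == "D":
            row_a.append(seq_a[i])
            row_b.append(seq_b[j])
            i += 1
            j += 1
        elif move == "V":
            row_a.append(seq_a[i])
            row_b.append("-")
            i += 1
        elif move == "H":
            row_a.append("-")
            row_b.append(seq_b[j])
            j += 1
        else:
            raise ValueError(f"unknown move: {move!r}")
    if i != len(seq_a) or j != len(seq_b):
        raise ValueError("path does not consume both sequences completely")
    return "".join(row_a), "".join(row_b)


# Path for the alignment on the slides: two diagonals, two vertical
# steps, three diagonals, two horizontal steps.
row_a, row_b = path_to_alignment("ATCCGCG", "ATGCGTC", "DDVVDDDHH")
print(row_a)
print(row_b)
```

Every complete path from the top-left corner to the bottom-right corner corresponds to exactly one alignment, which is why choosing an alignment amounts to choosing a path.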

8 More than one possible alignment
sequence a = A T C C G - C G
             | |   | |   |
sequence b = A T G C G T C -
Score: 1+1+0+1+1+0+1+0 = 5
Scoring scheme: Match: +1, Mismatch: 0, Gap: 0
[Figure: dotplot of sequence a (ATCCGCG) against sequence b (ATGCGTC)]
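
The column-by-column sums on slides 4 and 8 can be reproduced with a few lines of code. This is a minimal sketch, assuming the scoring scheme stated on the slides (match +1, mismatch 0, gap 0) as default parameters; the function name is hypothetical.

```python
def score_alignment(row_a, row_b, match=1, mismatch=0, gap=0):
    """Sum the column scores of a gapped alignment; '-' marks a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    total = 0
    for x, y in zip(row_a, row_b):
        if x == "-" or y == "-":
            total += gap        # residue against a gap
        elif x == y:
            total += match      # identical residues
        else:
            total += mismatch   # different residues
    return total


# The two alignments shown on slides 4 and 8.
print(score_alignment("ATCCGCG--", "AT--GCGTC"))
print(score_alignment("ATCCG-CG", "ATGCGTC-"))
```

Both calls give the same total of 5, so under this scheme the score alone does not single out one of the two alignments.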

9 48639 possible alignments!
How do we find the best alignment?
[Figure: dotplot of sequence a (ATCCGCG) against sequence b (ATGCGTC)]
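
The count on slide 9 can be checked with a short recurrence. The sketch below assumes that an alignment is counted as a path built from diagonal, vertical and horizontal steps, so the number of alignments of prefixes of length m and n is the sum of the counts for the three possible last steps. The function name is hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_alignments(m, n):
    """Number of alignments of a length-m sequence with a length-n sequence."""
    if m == 0 or n == 0:
        # Only one way: every remaining residue goes against a gap.
        return 1
    return (
        count_alignments(m - 1, n - 1)  # last column: residue against residue
        + count_alignments(m - 1, n)    # last column: residue of a against gap
        + count_alignments(m, n - 1)    # last column: gap against residue of b
    )


print(count_alignments(7, 7))  # two sequences of length 7
print(count_alignments(2, 2))  # two sequences of length 2
```

For two sequences of length 7 this gives 48639, and for two sequences of length 2 it gives 13, the count quoted for AT and TA. The rapid growth is the reason exhaustive enumeration is impractical for realistic sequence lengths.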

10 Optimal alignment
• An optimal alignment is the alignment with the best score (a maximum for similarity scoring, a minimum for distance scoring) among all alignments evaluated with the same scoring parameters.
• An optimal alignment is usually found by dynamic programming.
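
The slide does not name a specific algorithm, so the following is only a sketch of one common dynamic programming approach: a Needleman-Wunsch-style global alignment that maximizes the score. It assumes the match +1, mismatch 0, gap 0 scheme from the earlier slides as defaults; the function and variable names are hypothetical.

```python
def optimal_alignment(seq_a, seq_b, match=1, mismatch=0, gap=0):
    """Return (best score, aligned row a, aligned row b) for a global alignment."""
    m, n = len(seq_a), len(seq_b)
    # best[i][j] = best score for aligning seq_a[:i] with seq_b[:j]
    best = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        best[i][0] = best[i - 1][0] + gap
    for j in range(1, n + 1):
        best[0][j] = best[0][j - 1] + gap
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            best[i][j] = max(
                best[i - 1][j - 1] + sub,  # diagonal: residue against residue
                best[i - 1][j] + gap,      # residue of a against a gap
                best[i][j - 1] + gap,      # residue of b against a gap
            )
    # Trace back from the bottom-right cell to recover one optimal alignment.
    row_a, row_b = [], []
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            if best[i][j] == best[i - 1][j - 1] + sub:
                row_a.append(seq_a[i - 1])
                row_b.append(seq_b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and best[i][j] == best[i - 1][j] + gap:
            row_a.append(seq_a[i - 1])
            row_b.append("-")
            i -= 1
        else:
            row_a.append("-")
            row_b.append(seq_b[j - 1])
            j -= 1
    return best[m][n], "".join(reversed(row_a)), "".join(reversed(row_b))


score, row_a, row_b = optimal_alignment("ATCCGCG", "ATGCGTC")
print(score)
print(row_a)
print(row_b)
```

The table has (m+1) x (n+1) cells and each cell looks at three neighbours, so the work grows with the product of the sequence lengths rather than with the number of alignments. When several alignments share the best score, the traceback here returns only one of them; which one depends on the order in which the three moves are tried.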

11 13 possible alignments
sequence a = --AT
sequence b = TA--
Score = 0
Scoring scheme: Match: +1, Mismatch: 0, Gap: 0
[Figure: dotplot of sequence a (AT) against sequence b (TA), grid positions 0 to 2 on each axis]

12 13 possible alignments
sequence a = -AT
sequence b = TA-
Score = 1
Scoring scheme: Match: +1, Mismatch: 0, Gap: 0
[Figure: dotplot of sequence a (AT) against sequence b (TA)]

13 13 possible alignments
sequence a = -A-T
sequence b = T-A-
Score = 0
Scoring scheme: Match: +1, Mismatch: 0, Gap: 0
[Figure: dotplot of sequence a (AT) against sequence b (TA)]

14 13 possible alignments
sequence a = -AT
sequence b = T-A
Score = 0
Scoring scheme: Match: +1, Mismatch: 0, Gap: 0
[Figure: dotplot of sequence a (AT) against sequence b (TA)]

15 13 possible alignments
sequence a = -AT-
sequence b = T--A
Score = 0
Scoring scheme: Match: +1, Mismatch: 0, Gap: 0
[Figure: dotplot of sequence a (AT) against sequence b (TA)]

16 13 possible alignments
sequence a = A-T
sequence b = TA-
Score = 0
Scoring scheme: Match: +1, Mismatch: 0, Gap: 0
[Figure: dotplot of sequence a (AT) against sequence b (TA)]

17 13 possible alignments
sequence a = AT
sequence b = TA
Score = 0
Scoring scheme: Match: +1, Mismatch: 0, Gap: 0
[Figure: dotplot of sequence a (AT) against sequence b (TA)]

18 13 possible alignments
sequence a = AT-
sequence b = T-A
Score = 0
Scoring scheme: Match: +1, Mismatch: 0, Gap: 0
[Figure: dotplot of sequence a (AT) against sequence b (TA)]

19 13 possible alignments
sequence a = A--T
sequence b = -TA-
Score = 0
Scoring scheme: Match: +1, Mismatch: 0, Gap: 0
[Figure: dotplot of sequence a (AT) against sequence b (TA)]

20 13 possible alignments
sequence a = A-T
sequence b = -TA
Score = 0
Scoring scheme: Match: +1, Mismatch: 0, Gap: 0
[Figure: dotplot of sequence a (AT) against sequence b (TA)]
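
For sequences as short as AT and TA, the alignments can simply be enumerated and scored one by one. The sketch below is a brute-force illustration of that idea, not a practical method; it assumes the match +1, mismatch 0, gap 0 scheme from the slides, and both function names are hypothetical.

```python
def all_alignments(seq_a, seq_b):
    """Yield every alignment of seq_a and seq_b as a pair of gapped rows."""
    if not seq_a and not seq_b:
        yield "", ""
        return
    if seq_a and seq_b:
        # First column: residue of a against residue of b.
        for rest_a, rest_b in all_alignments(seq_a[1:], seq_b[1:]):
            yield seq_a[0] + rest_a, seq_b[0] + rest_b
    if seq_a:
        # First column: residue of a against a gap in b.
        for rest_a, rest_b in all_alignments(seq_a[1:], seq_b):
            yield seq_a[0] + rest_a, "-" + rest_b
    if seq_b:
        # First column: gap in a against residue of b.
        for rest_a, rest_b in all_alignments(seq_a, seq_b[1:]):
            yield "-" + rest_a, seq_b[0] + rest_b


def score_alignment(row_a, row_b, match=1, mismatch=0, gap=0):
    """Sum the column scores of a gapped alignment; '-' marks a gap."""
    total = 0
    for x, y in zip(row_a, row_b):
        if x == "-" or y == "-":
            total += gap
        elif x == y:
            total += match
        else:
            total += mismatch
    return total


alignments = list(all_alignments("AT", "TA"))
print(len(alignments))
for row_a, row_b in alignments:
    print(row_a, row_b, score_alignment(row_a, row_b))
```

The enumeration produces 13 alignments, and the highest score among them is 1, as on slide 12. The same exhaustive approach would have to visit 48639 alignments for the two length-7 sequences, which is why dynamic programming is used instead.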
