

# Lec02-seqalign - CMSC 423 Sequence Alignment Slides By Carl...


CMSC 423: Sequence Alignment
Slides by Carl Kingsford, Department of Computer Science, University of Maryland, College Park.
Based on Section 6.6 of Algorithm Design by Kleinberg & Tardos.

The Problem

Given: two strings

$a = a_1 a_2 a_3 a_4 \ldots a_m$
$b = b_1 b_2 b_3 b_4 \ldots b_n$

with $a_i, b_i \in L$ for some alphabet $L$ like $\{A, C, G, T\}$.

Compute how "similar" the two strings are.

What do we mean by similarity between two strings?
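One way to make "similar" concrete is to count the fewest single-character insertions, deletions, and substitutions that turn one string into the other. The sketch below is a minimal dynamic program for that count. The unit cost for every edit is an assumption made here for illustration, and the function name `edit_distance` is hypothetical; the slides may weight these edits differently.

```python
def edit_distance(a: str, b: str) -> int:
    """Fewest insertions, deletions, and substitutions turning a into b.

    Assumption: every edit costs 1 (not taken from the slides).
    """
    n = len(b)
    # prev[j] = cost of aligning the prefix of a seen so far with b[:j].
    # Before any character of a is read, b[:j] needs j insertions.
    prev = list(range(n + 1))
    for i in range(1, len(a) + 1):
        # Aligning a[:i] with the empty prefix of b needs i deletions.
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            substitution = 0 if a[i - 1] == b[j - 1] else 1
            curr[j] = min(
                prev[j - 1] + substitution,  # pair a[i-1] with b[j-1]
                prev[j] + 1,                 # a[i-1] has no partner in b
                curr[j - 1] + 1,             # b[j-1] has no partner in a
            )
        prev = curr
    return prev[n]


print(edit_distance("misspell", "mispell"))      # 1
print(edit_distance("prehistoric", "historic"))  # 3
```

Only two rows of the table are kept at a time, so memory grows with the length of `b` rather than with the product of both lengths.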
Alignment Examples

In each alignment, "|" marks a matching column, "X" marks a mismatch (mm), and "-" marks a gap.

    prin-ciple
    |||| |||XX
    prinncipal      (1 gap, 2 mm)

    misspell
    ||| ||||
    mis-pell        (1 gap)

    aa-bb-ccaabb
    |X || | | |
    ababbbc-a-b-    (5 gaps, 1 mm)

    prin-cip-le
    |||| ||| |
    prinncipal-     (3 gaps, 0 mm)

    prehistoric
       ||||||||
    ---historic     (3 gaps, 0 mm)

    al-go-rithm-
    || XX ||X |
    alKhwariz-mi    (4 gaps, 3 mm)
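The gap and mismatch counts quoted with these examples can be checked mechanically. The following is a minimal sketch, assuming both rows of an alignment are given as equal-length strings with "-" as the gap character; the function name `summarize_alignment` is hypothetical and not from the slides.

```python
def summarize_alignment(top: str, bottom: str, gap: str = "-"):
    """Return (marker_line, gap_count, mismatch_count) for an alignment.

    Both rows must already be padded with the gap character to equal length.
    """
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    markers = []
    gaps = 0
    mismatches = 0
    for x, y in zip(top, bottom):
        if x == gap or y == gap:
            gaps += 1            # a column with a gap in either row
            markers.append(" ")
        elif x == y:
            markers.append("|")  # matching column
        else:
            mismatches += 1      # two different characters
            markers.append("X")
    return "".join(markers), gaps, mismatches


print(summarize_alignment("prin-ciple", "prinncipal"))
# ('|||| |||XX', 1, 2)
```

Comparing the two different alignments of "principle" and "prinncipal" this way makes the trade-off visible: one uses 1 gap and 2 mismatches, the other 3 gaps and no mismatches.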

Motivation

Alignment is used extensively in molecular biology, where a and b are the DNA sequences of two genes (see NCBI BLAST). It is also used in spell checkers and in inexact search of web pages.
NCBI BLAST

NCBI BLAST Alignment

    >gb|AC115706.7| Mus musculus chromosome 8, clone RP23-382B3, complete sequence

    Query  1650   gtgtgtgtgggtgcacatttgtgtgtgtgtgcgcctgtgtgtgtgggtgcctgtgtgtgt  1709
                  ||||||||||  |    ||  | ||||||||| | ||||||||    ||| ||  |||||
    Sbjct  56838  GTGTGTGTGGAAGTGAGTTCATCTGTGTGTGCACATGTGTGTGCA--TGCATGCATGTGT  56895

    Query  1710   gtg-gggcacatttgtgtgtgtgtgtgtgcctgtgtgtgggtgcacatttgtgtgtgtgc  1768
                  ||  |||||      ||  ||| ||||||| |||||||| |||  |||  ||||| || |
    Sbjct  56896  GTCCGGGCA------TGCATGTCTGTGTGCATGTGTGTGTGTGTGCAT--GTGTGAGTAC  56947

    Query  1769   ctgtgtgtgtgtgcctgtgtgtgggggtgcacatttgtgtgtgtgtgtgcctgtgtgtgg  1828
                  |||||||||| ||| ||| |||| | |||  ||| |||||  ||||||   |||||  |
    Sbjct  56948  CTGTGTGTGTATGCTTGTATGTGTGTGTGTGCATGTGTGTAGGTGTGTATATGTGTAAGT  57007

    Query  1829   gggtgcacatttgtgtgtgtgtgtgcctgtgtgtgtgggtgcacatttgtgtgtgtgtgt  1888
                         ||| ||||||| ||||||  |||| | |||  ||||    |||||||||| ||
    Sbjct  57008  T------CATCTGTGTGTATGTGTG--TGTGAGAGTGCATGCA----TGTGTGTGTGAGT  57055

    Query  1889   gcctgtgtgt--gtgggtgcacatttgtgtgtgtgtgcctgtg--tgtgt--gggtgcac  1942
                   | | |||||  |||  |||  ||  || |   | |  |||||  |||||  | |||  |
    Sbjct  57056  TCATCTGTGTCAGTGTATGCTTATGGGTATAACT-TAACTGTGCATGTGTAAGTGTGTTC  57114

    Query  1943