lecture_11

# lecture_11 - Sequence similarity DNA From a computer...

## DNA

From a computer scientist's viewpoint, DNA is a sequence of characters chosen from the alphabet {A, C, G, T}. The human genome contains about 3 billion characters. An A4 sheet holds 5,000 to 10,000 characters, so printing the genome would take roughly half a million sheets of paper.

Given two DNA sequences, biologists want to know how similar they are. From the computational point of view, we first find the best way to align (pair up) the two sequences; then we can see how close they are.

## Alignment and similarity

- An alignment pairs up two strings character by character, possibly with spaces inserted.
- Example: ACCAATCC and AGCCATGC.

First alignment:

| Position | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|
| First string | A | C | C | A | A | T | C | C |
| Second string | A | G | C | C | A | T | G | C |

Second alignment:

| Position | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| First string | A | _ | C | C | A | A | T | C | C |
| Second string | A | G | C | C | A | _ | T | G | C |

- First alignment: 5 positions matched; 3 mismatched.
- Second alignment: 6 positions matched; 1 mismatched; 2 positions pair a character with a space.
- Which is the better alignment? What is the best alignment?

## Similarity function

- A similarity (scoring) function δ specifies how much each match, mismatch, or space contributes to the overall similarity.
- E.g., match: 2; mismatch: -1; character-space: -1.

| δ | _ | A | C | G | T |
|---|---|---|---|---|---|
| _ |  | -1 | -1 | -1 | -1 |
| A | -1 | 2 | -1 | -1 | -1 |
| C | -1 | -1 | 2 | -1 | -1 |
| G | -1 | -1 | -1 | 2 | -1 |
| T | -1 | -1 | -1 | -1 | 2 |

For example, δ(C, G) = -1.

- Given an alignment, define its quality as the sum of the similarity scores of its positions.
- First alignment: score = 10 - 3 = 7.
- Second alignment: score = 12 - 3 = 9.

A more complicated similarity function (the entries pairing a space with G or T are missing from this preview and are shown as …):

| δ | _ | A | C | G | T |
|---|---|---|---|---|---|
| _ |  | -0.5 | -0.5 | … | … |
| A | -0.5 | 2 | 0.5 | -1 | -1 |
| C | -0.5 | 0.5 | 4 | -1 | -1 |
| G | … | -1 | -1 | 3 | -1 |
| T | … | -1 | -1 | -1 | 2 |

## The alignment problem

- Similarity score: match: 2; mismatch: -1; character-space: -1.

The similarity score...
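The scores of the two example alignments can be checked mechanically. What follows is a minimal Python sketch, assuming the simple scheme from the text (match: 2; mismatch: -1; character-space: -1) and writing an inserted space as `_`. The names `delta`, `alignment_score`, and `GAP` are illustrative choices, not taken from the lecture. The sketch only scores an alignment that is already given; it does not search for the best one.

```python
# Simple scoring scheme from the text: match 2, mismatch -1, character-space -1.
MATCH, MISMATCH, SPACE = 2, -1, -1
GAP = "_"  # symbol used here for an inserted space (an assumption of this sketch)


def delta(a: str, b: str) -> int:
    """Similarity contribution of one aligned pair of symbols."""
    if a == GAP and b == GAP:
        # The scoring table leaves the space/space entry undefined.
        raise ValueError("a space cannot be aligned with a space")
    if a == GAP or b == GAP:
        return SPACE
    return MATCH if a == b else MISMATCH


def alignment_score(top: str, bottom: str) -> int:
    """Quality of an alignment: the sum of delta over all positions."""
    if len(top) != len(bottom):
        raise ValueError("aligned strings must have equal length")
    return sum(delta(a, b) for a, b in zip(top, bottom))


first = alignment_score("ACCAATCC", "AGCCATGC")     # 5 matches, 3 mismatches
second = alignment_score("A_CCAATCC", "AGCCA_TGC")  # 6 matches, 1 mismatch, 2 spaces
print(first, second)  # prints: 7 9
```

Under this scheme the second alignment scores higher, which agrees with the totals 10 - 3 = 7 and 12 - 3 = 9 given above. Swapping in a different similarity function only requires changing `delta`.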