129_Lecture5_2014



A GGC AATGC AGGC

Statistical Significance of alignment: Shuffling

Score: 355

Shuffling a sequence:
THISISTHECORRECTSEQUENCE
TSTCRTQNHIHOESUCISERCEEE

13 1/28/14

Gap penalty

Most common model: W_N = G_0 + N * G_1
W_N : gap penalty for a gap of size N
G_0 : cost of opening a gap
G_1 : cost of extending the gap by one
N : size of the gap

Global versus Local Alignment

Global alignment finds the arrangement that maximizes the total score. Best known algorithm: Needleman and Wunsch.

Local alignment identifies the highest scoring subsequences, sometimes at the expense of the overall score. Best known algorithm: Smith and Waterman.

The local alignment algorithm is just a variation of the global alignment algorithm!

Modifications for local alignment

1) The scoring matrix has negative values for mismatches
2) The minimum score for any (i,j) in the alignment matrix is 0
3) The best score is found anywhere in the filled alignment matrix

These 3 modifications cause the algorithm to search for matching sub-sequences which are not penalized by other regions (modif. 2), with minimal poor matches (modif. 1), which can occur anywhere (modif. 3).

Global versus Local Alignment

Match: +1; Mismatch: -2; Gap: -1

Global:

        A    C    C    N    S
   A    1   -3   -3   -3   -3
   C   -3    2    1   -2   -2
   C   -3    1    3   -1   -1
   T   -3   -2   -1    1    0
   G   -3   -2   -1    0   -1
   S   -3   -2   -1    0    1

   ACCTGS      ACCTGS
   ACC-NS      ACCN-S

Local:

        A    C    C    N    S
   A    1    0    0    0    0
   C    0    2    1    0    0
   C    0    1    3    0    0
   T    0    0    0    1    0
   G    0    0    0    0    0
   S    0    0    0    0    1

   ACC
   ACC

Sequence Analysis

1. Why do we compare sequences?
2. Sequence comparison: from qualitative to quantitative methods
3. Deterministic methods: Dynamic programming
4. Heuristics: BLAST
   1. Concept
   2. Ungapped BLAST
   3. Gapped BLAST
5. Multiple Sequence Alignment

BLAST (Basic Local Alignment Search Tool)

Main ideas:

1. Construct a list of all words in the query sequence
2. Scan the database for sequences that contain one or more of the query words
3. Initiate a local alignment for each word match between query and database

[Figure: database, query sequence]

Original BLAST

1. Define dictionary: all words of length k (typically k = 11)
2. Scan database sequences for matches with alignment score ≥ T (typically T = k)
3. Generate alignment: ungapped extensions until the score falls below the statistical threshold
4. Output all local alignments with scores above the statistical threshold

[Figure: … database sequence, query]

Original BLAST

An example: k = 4, T = 4

   G A T A A G T A A G G T C C A G T
   T T C A A C T A A G G T C C T C A

1) The matching word AGGT initiates an alignment
2) Extension of the alignment to the left and right with...
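
The seed-and-extend steps of the original BLAST can be sketched in Python as follows. This is a minimal illustration, not the actual BLAST implementation. Several things are assumptions that the slides do not state: the function names (find_seeds, extend_seed) are hypothetical, the scores (match +1, mismatch -2) are borrowed from the alignment example, "score ≥ T with T = k" is treated as an exact word match, and the statistical threshold is replaced by a simple drop-off cutoff (x_drop) on the running score. The slides also do not say which of the two example sequences is the query, so that assignment is arbitrary here.

```python
from collections import defaultdict


def find_seeds(query, subject, k):
    """Return (query_pos, subject_pos) for every exact k-word shared by both."""
    # Step 1: dictionary of all words of length k in the query.
    words = defaultdict(list)
    for i in range(len(query) - k + 1):
        words[query[i:i + k]].append(i)
    # Step 2: scan the database sequence for words present in the dictionary.
    seeds = []
    for j in range(len(subject) - k + 1):
        for i in words.get(subject[j:j + k], []):
            seeds.append((i, j))
    return seeds


def _extend(query, subject, qi, si, step, match, mismatch, x_drop):
    """Walk one diagonal from (qi, si); return (best gain, length at best gain)."""
    gain = best_gain = best_len = length = 0
    while 0 <= qi < len(query) and 0 <= si < len(subject):
        gain += match if query[qi] == subject[si] else mismatch
        length += 1
        if gain > best_gain:
            best_gain, best_len = gain, length
        elif best_gain - gain > x_drop:
            break  # assumed stop rule: score dropped too far below the best
        qi += step
        si += step
    return best_gain, best_len


def extend_seed(query, subject, qpos, spos, k, match=1, mismatch=-2, x_drop=2):
    """Step 3: ungapped extension of a seed to the left and to the right.

    Returns (score, query_start, query_end) with query_end exclusive.
    """
    right_gain, right_len = _extend(query, subject, qpos + k, spos + k, +1,
                                    match, mismatch, x_drop)
    left_gain, left_len = _extend(query, subject, qpos - 1, spos - 1, -1,
                                  match, mismatch, x_drop)
    score = k * match + left_gain + right_gain
    return score, qpos - left_len, qpos + k + right_len


# The two sequences from the slide example, with k = 4.
query = "GATAAGTAAGGTCCAGT"
database = "TTCAACTAAGGTCCTCA"
seeds = find_seeds(query, database, 4)
score, start, end = extend_seed(query, database, 8, 8, 4)
print(seeds)
print(score, query[start:end])
```

With these assumed parameters, the seed AGGT at position 8 in both sequences extends to the ungapped segment TAAGGTCC. A different mismatch score or cutoff would change where the extension stops.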
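
Returning to the gap penalty model and the global versus local comparison earlier in the lecture, the sketch below scores the example pair ACCTGS / ACCNS with Match: +1; Mismatch: -2; Gap: -1. It is a minimal illustration under stated assumptions: the names gap_penalty and alignment_score are hypothetical, the gap cost is linear (G_0 = 0, each gap position costs the same), and the first row and column are initialized in the standard Needleman and Wunsch way. Intermediate matrix cells are therefore not guaranteed to match the tables on the slide; only the final scores are compared.

```python
def gap_penalty(n, g_open, g_extend):
    """W_N = G_0 + N * G_1: penalty for a gap of size n."""
    return g_open + n * g_extend


def alignment_score(a, b, match=1, mismatch=-2, gap=-1, local=False):
    """Return the best alignment score of a and b by dynamic programming.

    local=False: global alignment score (Needleman and Wunsch style).
    local=True:  local alignment score (Smith and Waterman style).
    Assumes a linear gap cost, i.e. G_0 = 0 in the gap model.
    """
    rows, cols = len(a) + 1, len(b) + 1
    f = [[0] * cols for _ in range(rows)]
    if not local:
        # Global: leading gaps are penalized in the first row and column.
        for i in range(1, rows):
            f[i][0] = i * gap
        for j in range(1, cols):
            f[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            score = max(f[i - 1][j - 1] + s,   # align a[i-1] with b[j-1]
                        f[i - 1][j] + gap,     # gap in b
                        f[i][j - 1] + gap)     # gap in a
            if local:
                score = max(score, 0)          # modification 2: floor at 0
                best = max(best, score)        # modification 3: best anywhere
            f[i][j] = score
    # Global score sits in the last cell; local score is the matrix maximum.
    return best if local else f[-1][-1]


global_score = alignment_score("ACCTGS", "ACCNS")
local_score = alignment_score("ACCTGS", "ACCNS", local=True)
print(global_score, local_score)
```

The only differences between the two modes are the boundary initialization, the floor at 0, and where the final score is read, which is why the local algorithm is described as a variation of the global one.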