ECS124_Lecture5

Lecture 5: LCS, BLAST and DB Search
ECS 124: Theory and practice of bioinformatics
Instructor: Ilias Tagkopoulos
[email protected]
Office: Kemper 3063 and GBSF 5313
4/20/2010, UC Davis

Announcements
• Make-up class
• Midterm
• Labs
LAST TIME: Global vs. Local alignment

Global alignment
  Initialization:    F(0,j) = -j*d, F(i,0) = -i*d
  Iteration:         F(i,j) = max{ F(i-1,j) - d, F(i,j-1) - d, F(i-1,j-1) + Sim(S1_i, S2_j) }
  Starting position: top left
  Ending position:   bottom right

Local alignment
  Initialization:    F(0,j) = F(i,0) = 0
  Iteration:         F(i,j) = max{ 0, F(i-1,j) - d, F(i,j-1) - d, F(i-1,j-1) + Sim(S1_i, S2_j) }
  Starting position: anywhere
  Ending position:   anywhere
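The two recurrences differ only in initialization, the extra 0 in the max, and where the final score is read. A minimal Python sketch of both, assuming a linear gap penalty d and a simple match/mismatch Sim (the function name and the default scores are illustrative, not taken from the lecture):

```python
def alignment_score(s1, s2, d=2, match=1, mismatch=-1, local=False):
    """Score of the best global (default) or local alignment of s1 and s2.

    d is the linear gap penalty; match/mismatch stand in for Sim(S1_i, S2_j).
    These default values are assumptions chosen for illustration.
    """
    n, m = len(s1), len(s2)
    F = [[0] * (m + 1) for _ in range(n + 1)]

    if not local:
        # Global: leading gaps are penalized, F(i,0) = -i*d and F(0,j) = -j*d.
        for i in range(1, n + 1):
            F[i][0] = -i * d
        for j in range(1, m + 1):
            F[0][j] = -j * d
    # Local: first row and column stay 0.

    best = 0  # best cell seen so far (only used for local alignment)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sim = match if s1[i - 1] == s2[j - 1] else mismatch
            score = max(F[i - 1][j] - d,        # gap in s2
                        F[i][j - 1] - d,        # gap in s1
                        F[i - 1][j - 1] + sim)  # align S1_i with S2_j
            if local:
                score = max(0, score)           # a local alignment may restart anywhere
                best = max(best, score)
            F[i][j] = score

    # Global ends at the bottom-right cell; local ends at the best cell anywhere.
    return best if local else F[n][m]
```

For example, alignment_score("ACGT", "AGT") scores the full-length alignment with one gap, while alignment_score("ACGT", "AGT", local=True) scores only the best-matching region ("GT").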

Dot plots
• Comparing two sequences (Maizel & Lenk, PNAS 1981)
• Alpha chain vs. beta chain of human hemoglobin (window 31, matches +5, mismatches -4)
• (1) deletion, (2) insertion, (3) mutation
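A windowed dot plot can be sketched in a few lines of Python. The window size and match/mismatch scores mirror the parameters quoted above; the threshold parameter and the function name are assumptions added for illustration, since the slide does not state a cutoff:

```python
def dot_plot(s1, s2, window=31, match=5, mismatch=-4, threshold=0):
    """Return (i, j) positions where the windows starting at s1[i] and s2[j]
    score at least `threshold`. The threshold is a hypothetical parameter."""
    dots = []
    for i in range(len(s1) - window + 1):
        for j in range(len(s2) - window + 1):
            # Score the two windows position by position, without gaps.
            score = sum(match if a == b else mismatch
                        for a, b in zip(s1[i:i + window], s2[j:j + window]))
            if score >= threshold:
                dots.append((i, j))
    return dots
```

Under this sketch, similar regions show up as runs of dots along a diagonal, and a shift from one diagonal to another marks an insertion or deletion.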
Some definitions
Assume a string of length n. Then:
• Prefix of a string: the first k characters of the string, where k ≤ n
• Suffix of a string: the last k characters of the string, where k ≤ n
• Substring: a prefix of a suffix, or equivalently a suffix of a prefix, of the string (a contiguous run of characters)
• Subsequence: any selection of characters from the string, kept in their original order (not necessarily contiguous)
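These definitions translate directly into code. A small Python sketch (the function names are illustrative), where the substring test follows the "prefix of a suffix" wording literally:

```python
def prefixes(s):
    """All prefixes of s: the first k characters for k = 0..n."""
    return [s[:k] for k in range(len(s) + 1)]


def suffixes(s):
    """All suffixes of s: the last k characters for k = 0..n."""
    return [s[len(s) - k:] for k in range(len(s) + 1)]


def is_substring(t, s):
    """t is a substring of s if it is a prefix of some suffix of s."""
    return any(suffix.startswith(t) for suffix in suffixes(s))


def is_subsequence(t, s):
    """t is a subsequence of s if its characters appear in s in the same order,
    not necessarily next to each other."""
    remaining = iter(s)
    # `c in remaining` consumes the iterator up to the first match,
    # so later characters of t can only match later positions of s.
    return all(c in remaining for c in t)
```

Every substring is a subsequence, but not the other way round: "AG" is a subsequence of "ACGT" without being a substring of it.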

Free end-gaps
• Should gaps at the ends be penalized the same as any other gaps?
• Imagine shotgun sequencing
• Maybe aligning prefix/suffix makes more sense…
Free end-gaps
• Suffix/Prefix match
• Objective function: MAX(#matches - #mismatches - #spaces NOT at the
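A suffix/prefix match can be computed with the same dynamic programming table as global alignment by changing only the initialization and where the answer is read. The Python sketch below assumes the truncated objective above means that spaces at the ends of the alignment are not counted, and it scores +1 per match, -1 per mismatch and -1 per interior space. The function name and this reading of the objective are assumptions, not taken from the slide:

```python
def suffix_prefix_score(s1, s2, match=1, mismatch=-1, space=-1):
    """Best score of aligning some suffix of s1 with some prefix of s2.

    Assumed objective: #matches - #mismatches - #spaces, where the unaligned
    start of s1 and the unaligned end of s2 cost nothing.
    """
    n, m = len(s1), len(s2)
    F = [[0] * (m + 1) for _ in range(n + 1)]

    # Skipping a prefix of s1 is free, so the first column stays 0.
    # The alignment must begin at the start of s2, so the first row is penalized.
    for j in range(1, m + 1):
        F[0][j] = j * space

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sim = match if s1[i - 1] == s2[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j] + space,    # space in s2
                          F[i][j - 1] + space,    # space in s1
                          F[i - 1][j - 1] + sim)  # align the two characters

    # The alignment must use the end of s1; the rest of s2 is free,
    # so take the best value in the last row.
    return max(F[n])
```

For two shotgun reads such as "TTTACGT" and "ACGTGGG", the score reflects only the overlapping "ACGT", and two reads with nothing in common score 0 (the empty overlap).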

