LecturesPart06

# Find word pairs of fixed length w with a score of at least T

pairs that score above a threshold, i.e., find word pairs of fixed length w with a score of at least T.

**Key concept:** This seems similar to FASTA, but we are searching for words that score at least T rather than words that match exactly.

## BLAST method for proteins

1. Compile a list of words which give a score of at least T when paired with the query sequence.

Example using PAM-120 for the query sequence ACDE (w = 4, T = 17):

- ACDE paired with ACDE = +3 +9 +5 +5 = 22

Trying all possibilities:

- AAAA = +3 -3 0 0 = 0, no good
- AAAC = +3 -3 0 -7 = -7, no good
- ... too slow; try directed change instead.

## Generating word list

Start from ACDE paired with ACDE = +3 +9 +5 +5 = 22.

- Change the 1st position to all acceptable substitutions:
  - gCDE = +1 +9 +5 +5 = 20, ok (= pCDE, sCDE, tCDE)
  - nCDE = 0 +9 +5 +5 = 19, ok (= dCDE, eCDE, vCDE)
  - iCDE = -1 +9 +5 +5 = 18, ok (= qCDE)
  - kCDE = -2 +9 +5 +5 = 17, ok (= mCDE)
- Change the 2nd position: not possible. All alternatives score negative, and the other three positions add up to only 3 + 5 + 5 = 13, so the total would fall below T = 17.
- Change the 3rd position in combination with the first position:
  - gCnE = +1 +9 +2 +5 = 17, ok
- Continue, using recursion.

For "best" values of w and T there are typically about 50 words in the list for every residue in the query sequ...
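The directed-change recursion described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the BLAST implementation: the score table `QUOTED` holds only the PAM-120 values quoted in the ACDE example, every pair not quoted there gets a hypothetical placeholder penalty (`PLACEHOLDER`), and the names `pair_score` and `neighborhood_words` are made up for this sketch. The pruning rule is the one used for the 2nd position in the example: a substitution is abandoned as soon as the best possible score of the remaining positions cannot lift the total to T.

```python
from typing import Dict, List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Only the PAM-120 scores quoted in the ACDE example (assumed symmetric).
QUOTED: Dict[Tuple[str, str], int] = {
    ("A", "A"): 3, ("C", "C"): 9, ("D", "D"): 5, ("E", "E"): 5,
    ("A", "G"): 1, ("A", "P"): 1, ("A", "S"): 1, ("A", "T"): 1,
    ("A", "N"): 0, ("A", "D"): 0, ("A", "E"): 0, ("A", "V"): 0,
    ("A", "I"): -1, ("A", "Q"): -1,
    ("A", "K"): -2, ("A", "M"): -2,
    ("A", "C"): -3, ("C", "E"): -7, ("D", "N"): 2,
}

# Hypothetical stand-in for every pair NOT quoted in the example.
# A real run would load the full PAM-120 matrix instead.
PLACEHOLDER = -8


def pair_score(a: str, b: str) -> int:
    """Look up a pair in either order; fall back to the placeholder."""
    if (a, b) in QUOTED:
        return QUOTED[(a, b)]
    return QUOTED.get((b, a), PLACEHOLDER)


def neighborhood_words(query: str, threshold: int) -> List[Tuple[str, int]]:
    """Return all words of len(query) scoring at least `threshold` against query."""
    w = len(query)
    # Best achievable score at each position, then summed over each suffix.
    best = [max(pair_score(q, a) for a in AMINO_ACIDS) for q in query]
    best_rest = [0] * (w + 1)
    for i in range(w - 1, -1, -1):
        best_rest[i] = best[i] + best_rest[i + 1]

    found: List[Tuple[str, int]] = []

    def extend(i: int, prefix: str, running: int) -> None:
        if i == w:
            found.append((prefix, running))
            return
        for a in AMINO_ACIDS:
            s = running + pair_score(query[i], a)
            # Prune: even perfect remaining positions cannot reach the threshold.
            if s + best_rest[i + 1] >= threshold:
                extend(i + 1, prefix + a, s)

    extend(0, "", 0)
    return found


words = dict(neighborhood_words("ACDE", 17))
print(words["ACDE"], words["GCDE"], words["KCDE"], words["GCNE"])  # 22 20 17 17
```

With this partial table the sketch reproduces the words worked out by hand (gCDE, kCDE, gCnE and so on), but the list is incomplete compared with a full PAM-120 matrix because unquoted pairs are penalized by the placeholder.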
