FASTA_BLAST - An Introduction to Bioinformatics Biological...

An Introduction to Bioinformatics
Biological Database Searching: FASTA, BLAST

Database Searching

Dynamic programming algorithms give "correct" solutions but are very slow unless the sequences are quite short.
Current common protein sequence databases contain more than 100 M residues. For a "query" sequence of 1000 residues, we need to evaluate 10^11 matrix cells. Even at 10 M cells/second, that takes 10^4 seconds, or about 3 hours, for just one query.
Goal: examine only a small fraction of the possible alignments while still finding the high-scoring ones.
The vast literature on exact and approximate match algorithms can be used, but with scoring matrices, distant matches are hard to find.
We need heuristic algorithms. FASTA and BLAST are two such classes of algorithms. BLAST is more popular, but we still use the FASTA data format.

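The cost estimate for a full dynamic-programming scan can be checked with a few lines of arithmetic. This is only a sketch: the function name `dp_search_cost` is hypothetical, and the inputs are the slide's own round numbers (100 M database residues, a 1000-residue query, 10 M cells evaluated per second).

```python
def dp_search_cost(db_residues, query_len, cells_per_sec):
    """Return (cells, seconds) for aligning one query against a whole database.

    Assumes one matrix cell per (query residue, database residue) pair,
    which is the cost model used in the estimate above.
    """
    cells = db_residues * query_len
    seconds = cells / cells_per_sec
    return cells, seconds


# The slide's round numbers, not measurements.
cells, seconds = dp_search_cost(100_000_000, 1000, 10_000_000)
hours = seconds / 3600
print(f"{cells:.0e} cells, {seconds:.0f} s, about {hours:.1f} h")
```

Because the cost is the product of the two lengths, it grows linearly with database size, which is why a heuristic that skips most cells matters more as databases grow.
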
Database searching

Core: pairwise alignment algorithm
Speed (fast sequence comparison)
Relevance of the search results (statistical tests)
Recovering all information of interest
The results depend on the search parameters, such as the gap penalty and the scoring matrix. Sometimes searches with more than one matrix should be performed.

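To illustrate how the result depends on the search parameters, here is a minimal sketch of a pairwise local alignment score in the Smith-Waterman style. Several things are assumptions made for illustration and do not come from the slides: the function name `local_alignment_score`, the simple match/mismatch scores standing in for a real scoring matrix, the linear gap penalty, and the two toy sequences.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b (Smith-Waterman, linear gaps).

    match/mismatch are a stand-in for a real scoring matrix (assumption).
    Only two rows of the matrix are kept, since only the score is needed.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                # start a new local alignment here
                prev[j - 1] + s,  # align a[i-1] with b[j-1]
                prev[j] + gap,    # gap in b
                curr[j - 1] + gap,  # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


# Toy sequences (hypothetical): subject is query with one residue deleted.
query, subject = "ACGTTGCA", "ACGTGCA"
scores = {gap: local_alignment_score(query, subject, gap=gap)
          for gap in (-1, -4, -8)}
for gap, score in scores.items():
    print(f"gap penalty {gap}: best local score {score}")
```

With a mild gap penalty the best alignment spans the deletion; with a severe one, the best score comes from a shorter ungapped match instead. The same pair of sequences can therefore rank differently depending on the parameters chosen.
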
