8 Database Search

1 Introduction to Bioinformatics Elements of Bioinformatics Sequence Database Searches

2 References
• Mount, D.W. (2004) Bioinformatics: Sequence and Genome Analysis. 2nd ed. Cold Spring Harbor Lab. Press, N.Y. Chapter 6.
• Baxevanis, A.D., and Ouellette, B.F.F. (2005) Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins. 3rd ed. John Wiley and Sons, N.Y. Chapter 11.
• BLAST course: http://www.ncbi.nlm.nih.gov/BLAST/tutorial/Altschul-1.html
• BLAST tutorial: http://www.ncbi.nlm.nih.gov/Education/BLASTinfo/information3.html

3 Why search sequence databases?
• To find potentially related sequences in the database
Science (1983) 221: 275-277
Nature (1983) 304: 35-39

4 Sequence database searches
• Search a query sequence against a database for similar sequences
  – BLAST (http://blast.ncbi.nlm.nih.gov/Blast.cgi)
  – FASTA (http://fasta.bioch.virginia.edu/)
  – Smith-Waterman local alignment search (e.g. SSEARCH in the FASTA package)
• Search a protein query sequence against a pattern/profile database to identify protein family
  – InterProScan

5 Sequence database searches
• Search a set of similar protein sequences against a protein database to look for additional matches
  – Generate a multiple alignment of the sequences
  – Search the alignment against a protein database to identify further related sequences
  – e.g. PSI-BLAST (http://blast.ncbi.nlm.nih.gov/Blast.cgi/)

6 Database search tools
• Sequence database searches are essentially local pairwise sequence alignments.
• Two common search programs:
  – BLAST
  – FASTA
Both find exact matching words quickly to confine the search for interesting local alignments to a small fraction of the entire search space.

7 [Figure panels: Smith-Waterman local alignment; BLAST; FASTA]
BLAST and FASTA confine the search to a small fraction of the entire search space to increase speed.
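The idea of using exact word matches to narrow the search space can be illustrated with a small sketch. It is not how BLAST or FASTA are actually implemented; the function name `word_hits`, the toy sequences, and the word size of 4 are assumptions chosen for illustration. The sketch indexes the words of a query, scans a subject sequence for identical words, and reports the diagonals on which hits fall. A full Smith-Waterman comparison would instead fill every cell of the query-by-subject matrix.

```python
from collections import defaultdict


def word_hits(query, subject, w):
    """Return (query_pos, subject_pos) pairs where a word of length w matches exactly."""
    # Index every overlapping word of the query by its start position.
    table = defaultdict(list)
    for i in range(len(query) - w + 1):
        table[query[i:i + w]].append(i)

    # Scan the subject once and look each of its words up in the table.
    hits = []
    for j in range(len(subject) - w + 1):
        for i in table.get(subject[j:j + w], ()):
            hits.append((i, j))
    return hits


# Toy sequences (hypothetical, for illustration only).
query = "ACGTTGCA"
subject = "TTACGTTGGA"

hits = word_hits(query, subject, 4)
diagonals = {j - i for i, j in hits}   # hits on one diagonal belong to one ungapped region
full_matrix_cells = len(query) * len(subject)

print("word hits:", hits)
print("diagonals with hits:", sorted(diagonals))
print("cells in the full alignment matrix:", full_matrix_cells)
```

In this toy case all hits fall on a single diagonal, so only the neighborhood of that diagonal would need closer inspection, rather than the whole matrix.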

8 BLAST (Basic Local Alignment Search Tool)
• Seeding
• Extension
• Evaluation

9 BLAST: Seeding
1. Generate a list of overlapping words of size W (usually 3 for proteins, 11 for nucleotides) from the beginning to the end of the query sequence.

Nucleotide query: ...AGCACAAAAAACCCCAACCAA...
A lookup table of words from the query sequence:
AGCACAAAAAA
GCACAAAAAAC
CACAAAAAACC
ACAAAAAACCC
CAAAAAACCCC
AAAAAACCCCA
AAAAACCCCAA
AAAACCCCAAC
...

Protein query: ...VAEITQIKFDASWLH...
A lookup table of words from the query sequence:
VAE
AEI
EIT
ITQ
TQI
QIK
IKF
KFD
...
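The word lists above can be reproduced with a few lines of Python. This is a minimal sketch, assuming the visible fragments are treated as complete query sequences (on the slide they are excerpts of longer sequences); the names `query_words` and `lookup_table` are illustrative and not part of any BLAST code.

```python
def query_words(query, w):
    """Return every overlapping word of length w, in query order."""
    return [query[i:i + w] for i in range(len(query) - w + 1)]


def lookup_table(query, w):
    """Map each word to the list of query positions where it starts."""
    table = {}
    for position, word in enumerate(query_words(query, w)):
        table.setdefault(word, []).append(position)
    return table


protein_query = "VAEITQIKFDASWLH"           # fragment shown on the slide
nucleotide_query = "AGCACAAAAAACCCCAACCAA"   # fragment shown on the slide

protein_words = query_words(protein_query, 3)          # W = 3 for proteins
nucleotide_words = query_words(nucleotide_query, 11)   # W = 11 for nucleotides

print(protein_words[:8])
print(nucleotide_words[:8])
print(lookup_table(protein_query, 3)["QIK"])
```

A query of length L yields L - W + 1 words, so the 15-residue protein fragment gives 13 words and the 21-base nucleotide fragment gives 11.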

10
...VAEITQIKFDASWLH...
VAE AEI EIT ITQ TQI QIK IKF KFD ...
In protein-protein searches, each word generated from the query sequence is expanded to include a list of neighborhood words that score at least T when aligned to the query word using a chosen scoring matrix (e.g. BLOSUM62).
[Figure: Neighborhood words for QIK when T is set to 11.]
Lower values of T increase the number of neighborhood words and reduce the chance of missing a hit, but increase the search time.
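The neighborhood expansion can be sketched by brute force: score every possible three-letter word against the query word and keep those reaching the threshold T. The sketch below is an illustration, not BLAST's actual implementation. It assumes the three BLOSUM62 rows for Q, I and K shown in the code, which were transcribed by hand and should be checked against a reference copy of the matrix; the function names are hypothetical.

```python
from itertools import product

# Column order for the matrix rows below.
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

# BLOSUM62 rows for Q, I and K only (hand-transcribed; an assumption to verify).
BLOSUM62_ROWS = {
    "Q": [-1, 1, 0, 0, -3, 5, 2, -2, 0, -3, -2, 1, 0, -3, -1, 0, -1, -2, -1, -2],
    "I": [-1, -3, -3, -3, -1, -3, -3, -4, -3, 4, 2, -3, 1, 0, -3, -2, -1, -3, -1, 3],
    "K": [-1, 2, 0, -1, -3, 1, 1, -2, -1, -3, -2, 5, -1, -3, -1, 0, -1, -3, -2, -2],
}


def word_score(query_word, candidate):
    """Sum the substitution scores of the two words, position by position."""
    return sum(
        BLOSUM62_ROWS[q][AMINO_ACIDS.index(c)]
        for q, c in zip(query_word, candidate)
    )


def neighborhood(query_word, threshold):
    """Return (word, score) pairs scoring at least threshold, best first."""
    scored = []
    for letters in product(AMINO_ACIDS, repeat=len(query_word)):
        candidate = "".join(letters)
        score = word_score(query_word, candidate)
        if score >= threshold:
            scored.append((candidate, score))
    # Highest score first; ties broken alphabetically.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


for word, score in neighborhood("QIK", 11):
    print(word, score)
```

With these rows, the query word QIK scores 14 against itself, and calling `neighborhood("QIK", 10)` returns a longer list than `neighborhood("QIK", 11)`, which matches the trade-off described above: a lower T gives more neighborhood words to look up.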