LecturesPart06

# Word searching, hashing, and FASTA


... sequence and a database?

- Could scan the sequence a word at a time, but this is of order L (size of database)

## Word searching - hashing

- Solution: use a precomputed table that lists where in the database each possible word occurs
  - Generation of the table is of order L (size of database), but use of the table is of order N (size of query sequence)
- The computer science term for this approach is hashing

## Hashing

- Hashing table of size 10
- Hashing function H(x) = x mod 10
- Applet: http://www.engin.umd.umich.edu/CIS/course.des/cis
- Insertion & Search
- (Demonstration A10)

## FASTA

- Heavily used for searching databases until the advent of BLAST (see below)
- Inputs
  - k (word or k-tuple) size
  - similarity matrix
- Compares the query sequence pairwise with each sequence in the database

## FASTA method

- The initial step in the algorithm is to identify all exact matches of length k (k-tuples) or greater between the two sequences.

1. Find diagonals (paired pieces from each sequence without gaps) that have the highest density of common wor...
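The precomputed word table and the first FASTA step (finding exact k-tuple matches and grouping them by diagonal) can be illustrated with a small Python sketch. This is only an illustration under stated assumptions, not the FASTA implementation: the function names `build_word_table`, `find_word_matches`, and `count_by_diagonal` are hypothetical, the toy sequences are made up, and a Python dictionary (itself a hash table) stands in for the lookup table described in the slides.

```python
from collections import defaultdict


def build_word_table(database, k):
    """Map every k-letter word to the positions where it starts in `database`.

    One pass over the database, so the cost grows with its length L.
    """
    table = defaultdict(list)
    for pos in range(len(database) - k + 1):
        table[database[pos:pos + k]].append(pos)
    return table


def find_word_matches(query, table, k):
    """Return (query_pos, database_pos) pairs for every exact k-word match.

    One table lookup per query word, so the number of lookups grows with
    the query length N (plus the number of hits reported).
    """
    matches = []
    for qpos in range(len(query) - k + 1):
        for dpos in table.get(query[qpos:qpos + k], []):
            matches.append((qpos, dpos))
    return matches


def count_by_diagonal(matches):
    """Count matches per diagonal, where diagonal = database_pos - query_pos.

    Matches on the same diagonal line up without gaps, so a diagonal with
    many matches is a candidate region of similarity.
    """
    counts = defaultdict(int)
    for qpos, dpos in matches:
        counts[dpos - qpos] += 1
    return dict(counts)


# Toy example (hypothetical sequences, k = 3).
table = build_word_table("GATTACAGATTACA", 3)
matches = find_word_matches("ATTAC", table, 3)
print(count_by_diagonal(matches))  # {1: 3, 8: 3}
```

In this toy run the query matches the database in two places, so two diagonals (offsets 1 and 8) each collect three common words. Scoring those diagonals with a similarity matrix is a later step and is not shown here.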

## This note was uploaded on 01/13/2012 for the course BIO 101 taught by Professor Staff during the Fall '10 term at DePaul.
