HMM-Lecture4

HMM-Lecture4 - Sequence Alignment Cont'd Linear-space...

Sequence Alignment Cont’d

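Linear-space alignment
Iterate this procedure to the left and right!
[Figure: DP matrix split at M/2, with the optimal path crossing at k*; the labeled dimensions are k*, N-k*, M/2 and M/2.]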
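The splitting step behind this slide can be sketched as follows. This is a minimal illustration, not the lecture's own code: it assumes global alignment with a linear gap penalty, the scoring values (match = 1, mismatch = -1, gap = -1) are arbitrary, and the function names `last_row_scores` and `best_split` are hypothetical. It finds the crossing point k* using only two DP rows at a time; a full linear-space aligner would then recurse on the left and right halves.

```python
def last_row_scores(x, y, match=1, mismatch=-1, gap=-1):
    """Global alignment scores of all of x against every prefix of y.

    Keeps only two rows of the DP matrix, so memory is O(len(y)).
    """
    prev = [j * gap for j in range(len(y) + 1)]
    for i in range(1, len(x) + 1):
        curr = [i * gap] + [0] * len(y)
        for j in range(1, len(y) + 1):
            s = match if x[i - 1] == y[j - 1] else mismatch
            curr[j] = max(prev[j - 1] + s, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev


def best_split(x, y):
    """Return (k_star, score): where an optimal path crosses the middle of x."""
    mid = len(x) // 2
    n = len(y)
    # forward[k]: best score of x[:mid] against y[:k]
    forward = last_row_scores(x[:mid], y)
    # backward[m]: best score of x[mid:] against the last m letters of y
    backward = last_row_scores(x[mid:][::-1], y[::-1])
    totals = [forward[k] + backward[n - k] for k in range(n + 1)]
    k_star = max(range(n + 1), key=totals.__getitem__)
    return k_star, totals[k_star]


if __name__ == "__main__":
    x, y = "AGTACGCA", "TATGC"  # made-up example sequences
    print(best_split(x, y))
```

The score returned with k* equals the optimal full-matrix score, which is what makes it safe to solve the two halves independently.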

The Four-Russian Algorithm
Main structure of the algorithm:
1. Divide the N × N DP matrix into K × K log_2(N)-blocks that overlap by 1 column and 1 row
2. For i = 1…K
3.   For j = 1…K
4.     Compute D_{i,j} as a function of A_{i,j}, B_{i,j}, C_{i,j}, x[l_i…l'_i], y[r_j…r'_j]
Time: O(N^2 / log^2 N) times the cost of step 4
[Figure: DP matrix divided into blocks, with sides labeled t.]

Heuristic Local Aligners
BLAST, WU-BLAST, BlastZ, MegaBLAST, BLAT, PatternHunter, ……

State of biological databases
Sequenced genomes (size in bp):
  Human          3 × 10^9
  Mouse          2.7 × 10^9
  Rat            2.6 × 10^9
  Fugu fish      3.3 × 10^8
  Tetraodon      3 × 10^8
  Mosquito       2.8 × 10^8
  Drosophila     1.2 × 10^8
  Worm           1.0 × 10^8
  2 sea squirts  × 1.6 × 10^8
  Rice           1.0 × 10^9
  Arabidopsis    1.2 × 10^8
  Yeast          1.2 × 10^7
                 × 12 different strains
  Neurospora     4 × 10^7
  14 more fungi within next year
  ~250 bacteria/viruses
Next year: Dog, Chimpanzee, Chicken
Current rate of sequencing:
  4 big labs × 3 × 10^9 bp/year/lab
  10s of small labs

State of biological databases
Number of genes in these genomes:
  Vertebrate:       ~30,000
  Insects:          ~14,000
  Worm:             ~17,000
  Fungi:            ~6,000-10,000
  Small organisms:  100s-1,000s
Each known or predicted gene has an associated protein sequence
>1,000,000 known / predicted protein sequences

Some useful applications of alignments
Given a newly discovered gene,
  Does it occur in other species?
  How fast does it evolve?
Assume we try Smith-Waterman:
  Our new gene: 10^4
  The entire genomic database: 10^10 - 10^11

Some useful applications of alignments
Given a newly sequenced organism,
  Which subregions align with other organisms?
    Potential genes
    Other biological characteristics
Assume we try Smith-Waterman:
  Our newly sequenced mammal: 3 × 10^9
  The entire genomic database: 10^10 - 10^11
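To make the scale concrete: Smith-Waterman fills one DP cell per (query position, database position) pair, so a 10^4 gene against the database costs 10^14 to 10^15 cell updates, and a 3 × 10^9 mammal costs 3 × 10^19 to 3 × 10^20. The short calculation below reproduces these counts. The throughput constant `CELLS_PER_SECOND` is an assumption chosen only for illustration, not a figure from the lecture.

```python
def dp_cells(query_len, db_len):
    """Number of DP cells a full Smith-Waterman matrix would fill."""
    return query_len * db_len


# Assumed throughput, purely illustrative (not from the slides).
CELLS_PER_SECOND = 10**9

cases = {"new gene": 10**4, "new mammal": 3 * 10**9}
for name, query_len in cases.items():
    for db_len in (10**10, 10**11):
        cells = dp_cells(query_len, db_len)
        days = cells / CELLS_PER_SECOND / 86400
        print(f"{name} vs DB of {db_len:.0e}: {cells:.1e} cells, about {days:.1e} days")
```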

BLAST (Basic Local Alignment Search Tool)
Main idea:
1. Construct a dictionary of all the words in the query
2. Initiate a local alignment for each word match between query and DB
Running time: O(MN)
However, orders of magnitude faster than Smith-Waterman
[Figure: query and DB.]
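The two steps above can be sketched in a few lines. This is a simplified illustration that only handles exact word matches; the function names `build_word_index` and `find_seeds` and the database string are hypothetical, and real BLAST implementations are considerably more elaborate.

```python
from collections import defaultdict


def build_word_index(query, k):
    """Step 1: map every length-k word of the query to its start positions."""
    index = defaultdict(list)
    for i in range(len(query) - k + 1):
        index[query[i:i + k]].append(i)
    return index


def find_seeds(query, db, k):
    """Step 2: scan the DB once and report (query_pos, db_pos) for each word match."""
    index = build_word_index(query, k)
    seeds = []
    for j in range(len(db) - k + 1):
        for i in index.get(db[j:j + k], ()):
            seeds.append((i, j))
    return seeds


if __name__ == "__main__":
    query = "ACGAAGTAAGGTCCAGT"
    db = "TTGTTAGGTCCTTA"  # made-up database string for illustration
    print(find_seeds(query, db, k=4))
```

Each seed would then be handed to an extension step. The dictionary lookup is what lets the scan skip most of the MN matrix in practice, even though the worst case remains O(MN).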

BLAST Original Version
Dictionary: all words of length k (~11)
  Alignment is initiated between words with alignment score ≥ T (typically T = k)
Alignment: ungapped extensions until the score falls below a statistical threshold
Output: all local alignments with score > statistical threshold
[Figure: the query is scanned against the DB.]

BLAST Original Version
Example: k = 4, T = 4
Sequences (letters as shown on the slide): A C G A A G T A A G G T C C A G T C C C T T C C T G G A T T G C G A
The matching word GGTC initiates an alignment
Extension to the left and right with no gaps until alignment falls < 50%
Output: GTAAGGTCC
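A rough sketch of ungapped seed extension follows. Note the assumptions: the stopping rule here is a simple drop-off rule (stop once the running score falls more than `drop` below the best score seen), used as a stand-in for the slide's 50% criterion; the scoring values, the `drop` parameter, the function name `extend_ungapped` and the database string are all hypothetical and chosen for illustration.

```python
def extend_ungapped(query, db, i, j, k, match=1, mismatch=-1, drop=2):
    """Extend the seed query[i:i+k] == db[j:j+k] left and right without gaps.

    Returns (query_start, db_start, length, score) of the best extension.
    """
    def walk(qi, dj, step):
        # Walk along the diagonal in one direction, remembering the best prefix.
        score = best = 0
        length = best_len = 0
        while 0 <= qi < len(query) and 0 <= dj < len(db):
            score += match if query[qi] == db[dj] else mismatch
            length += 1
            if score > best:
                best, best_len = score, length
            elif best - score > drop:
                break  # score has dropped too far below the best; stop
            qi += step
            dj += step
        return best, best_len

    right_score, right_len = walk(i + k, j + k, +1)
    left_score, left_len = walk(i - 1, j - 1, -1)
    score = k * match + left_score + right_score
    return i - left_len, j - left_len, left_len + k + right_len, score


if __name__ == "__main__":
    query = "ACGAAGTAAGGTCCAGT"
    db = "TTGTTAGGTCCTTA"  # made-up database string for illustration
    q0, d0, n, score = extend_ungapped(query, db, i=9, j=6, k=4)
    print(query[q0:q0 + n], db[d0:d0 + n], score)
```

With this made-up database string, the GGTC seed extends to the query segment GTAAGGTCC, matching the output shown on the slide.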