lect4 Keyword matching

An Introduction to Bioinformatics Algorithms (Computational Molecular Biology)

Fa05 CSE 182 CSE182-L4: Keyword matching

Backward scoring
Define $S_b[i,j]$: the score of the best alignment of the suffixes s[i+1..n] and t[j+1..m].
Q: What is the score of the best alignment of the two strings s and t?
HW: Write the recurrences for $S_b$.

Forward/Backward computations
F[j] = score of the best alignment of s[1..n/2] and t[1..j], so $F[j] = S[n/2, j]$.
B[j] = score of the best alignment of s[n/2+1..n] and t[j+1..m], so $B[j] = S_b[n/2, j]$.
[Figure: alignment grid showing row n/2 and a column j between 1 and m.]

Forward/Backward computations (continued)
There exists a coordinate j such that $F[j] + B[j] = S[n,m]$: the column where an optimal alignment path crosses row n/2. For every j, $F[j] + B[j] \le S[n,m]$, so any j that maximizes the sum lies on an optimal path.
In O(nm) time and O(m) space, we can compute one of the coordinates on the optimum path.
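The forward/backward idea can be sketched in a few lines of Python. This is only an illustration: the slides do not fix a scoring scheme, so the sketch assumes match +1, mismatch -1, and a linear gap cost of -1, and the names `last_row` and `midpoint_column` are invented here. B is obtained by running the same forward computation on the reversed strings, which works because this scoring scheme is symmetric.

```python
def pair_score(a, b, match=1, mismatch=-1):
    """Score for aligning two characters (assumed scheme, not from the slides)."""
    return match if a == b else mismatch


def last_row(s, t, gap=-1):
    """Return S[len(s), 0..len(t)] of the global alignment table.

    Only two rows are kept in memory at a time, so the space used is O(len(t)).
    """
    prev = [j * gap for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        cur = [i * gap]
        for j in range(1, len(t) + 1):
            cur.append(max(prev[j - 1] + pair_score(s[i - 1], t[j - 1]),
                           prev[j] + gap,
                           cur[j - 1] + gap))
        prev = cur
    return prev


def midpoint_column(s, t):
    """Return (j*, F[j*] + B[j*]) for the middle row of the alignment grid."""
    half = len(s) // 2
    forward = last_row(s[:half], t)                       # F[j] = S[n/2, j]
    backward = last_row(s[half:][::-1], t[::-1])[::-1]    # B[j] = S_b[n/2, j]
    totals = [f + b for f, b in zip(forward, backward)]
    j_star = max(range(len(totals)), key=totals.__getitem__)
    return j_star, totals[j_star]
```

For any pair of strings, the second value returned by `midpoint_column(s, t)` should equal `last_row(s, t)[-1]`, which is the optimal global score S[n,m].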

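Linear Space Alignment
Align(1..n, 1..m)
    For all 0 <= j <= m, compute F[j] = S[n/2, j]
    For all 0 <= j <= m, compute B[j] = S_b[n/2, j]
    j* = argmax_j { F[j] + B[j] }
    X = Align(1..n/2, 1..j*)
    Y = Align(n/2+1..n, j*+1..m)
    Return X, j*, Y

Linear Space complexity
Time: T(nm) = c·nm + T(nm/2) = O(nm). The two recursive calls cover subgrids of area (n/2)·j* and (n/2)·(m - j*), which together make up half of the original grid, so the total work is at most c·nm·(1 + 1/2 + 1/4 + ...) <= 2c·nm.
Space: O(m)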
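A runnable version of the recursion might look like the sketch below. It is not the course's reference implementation: the scoring values (match +1, mismatch -1, gap -1) and all function names are assumptions made for illustration, and instead of returning X, j*, Y it returns the two gapped strings directly. The base cases (an empty string, or a single character of s) are handled explicitly so that the recursion always terminates.

```python
GAP = -1  # assumed linear gap cost


def sub(a, b):
    """Assumed substitution score: +1 for a match, -1 for a mismatch."""
    return 1 if a == b else -1


def last_row(s, t):
    """Last row of the global alignment table, computed with two rows of memory."""
    prev = [j * GAP for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        cur = [i * GAP]
        for j in range(1, len(t) + 1):
            cur.append(max(prev[j - 1] + sub(s[i - 1], t[j - 1]),
                           prev[j] + GAP,
                           cur[j - 1] + GAP))
        prev = cur
    return prev


def align(s, t):
    """Return (gapped_s, gapped_t) for an optimal global alignment of s and t."""
    if not s:
        return "-" * len(t), t
    if not t:
        return s, "-" * len(s)
    if len(s) == 1:
        # Pair the single character with its best partner in t, unless
        # leaving it unpaired (two extra gap columns) scores higher.
        k = max(range(len(t)), key=lambda j: sub(s, t[j]))
        if sub(s, t[k]) >= 2 * GAP:
            return "-" * k + s + "-" * (len(t) - k - 1), t
        return s + "-" * len(t), "-" + t
    half = len(s) // 2
    forward = last_row(s[:half], t)                      # F[j]
    backward = last_row(s[half:][::-1], t[::-1])[::-1]   # B[j]
    j_star = max(range(len(t) + 1), key=lambda j: forward[j] + backward[j])
    left = align(s[:half], t[:j_star])
    right = align(s[half:], t[j_star:])
    return left[0] + right[0], left[1] + right[1]


def alignment_score(a, b):
    """Score a gapped alignment column by column."""
    return sum(GAP if "-" in (x, y) else sub(x, y) for x, y in zip(a, b))
```

A quick sanity check is that `alignment_score(*align(s, t))` equals `last_row(s, t)[-1]`, and that removing the gap characters from the two outputs gives back s and t.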

Summary
We considered the basics of sequence alignment:
- Optimal score computation
- Reconstructing alignments
- Local alignments
- Affine gap costs
- Space-saving measures
Can we recreate Blast?

Blast and local alignment
Concatenate all of the database sequences to form one giant sequence, then compute a local alignment of the query against it.
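As a rough illustration of this brute-force approach, the sketch below concatenates a small database and runs a score-only local alignment against it. Several details are assumptions that do not come from the slides: the scoring values, the function name `best_local_hit`, the `#` separator (assumed never to occur in the sequences), and the choice to reset the table at each separator so that no alignment spans two database sequences.

```python
from bisect import bisect_right


def best_local_hit(query, database, match=1, mismatch=-1, gap=-1, sep="#"):
    """Return (best local score, index of the database sequence containing it).

    Returns (0, None) if no position scores above zero.
    """
    # Record where each sequence starts inside the concatenated string.
    starts, pos = [], 0
    for seq in database:
        starts.append(pos)
        pos += len(seq) + 1          # +1 for the separator
    giant = sep.join(database)

    width = len(query) + 1
    best, best_end = 0, -1
    prev = [0] * width
    for i, d in enumerate(giant):
        if d == sep:
            prev = [0] * width       # do not let alignments cross a boundary
            continue
        cur = [0] * width
        for j in range(1, width):
            diag = prev[j - 1] + (match if d == query[j - 1] else mismatch)
            cur[j] = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            if cur[j] > best:
                best, best_end = cur[j], i
        prev = cur

    if best == 0:
        return 0, None
    return best, bisect_right(starts, best_end) - 1
```

The work is still proportional to the query length times the total database length, which is the cost the rest of the lecture sets out to avoid.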

Large database search
[Figure: a query of length m compared against a database of length n.]
Database size n = 100M, query size m = 1000.
O(nm) = $10^{11}$ computations.

Why is Blast Fast?
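For a sense of scale behind that question, the arithmetic for filling the full dynamic programming table can be written out as below. The throughput of one billion cell updates per second is purely an assumed figure for illustration, not a measurement from the course.

```python
n = 100_000_000   # database length from the slide (100M)
m = 1_000         # query length from the slide

cells = n * m     # number of table cells the full dynamic program fills

# Assumed throughput; real numbers depend on hardware and implementation.
cells_per_second = 1e9
seconds_per_query = cells / cells_per_second

print(f"{cells:.0e} cells, about {seconds_per_query:.0f} s per query")
```

Under that assumption, every single query costs on the order of minutes, which is why a search tool needs to avoid examining most of the table.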

Silly Question!
True or False: No two people in New York City have the same number of hairs.

Observations
Much of the database is random from the query's perspective.
Consider a random DNA string of length n, with Pr[A] = Pr[C] = Pr[G] = Pr[T] = 0.25.
Assume for the moment that the query is all A's (length m).
What is the probability that an exact match to the query can be found?

Basic probability
Probability that there is a match starting at a fixed position i = $0.25^m$.
What is the probability that some position i has a match?
Dependencies confound probability estimates: matches at overlapping positions are not independent events.

Basic Probability: Expectation
Q: Toss a coin: each time it comes up heads, you get a dollar. How much money do you expect to get after n tosses?
Let $X_i$ be the amount earned in the i-th toss, and let p be the probability of heads.
$E(X_i) = 1 \cdot p + 0 \cdot (1 - p) = p$
Total money you expect to earn: $E\left(\sum_i X_i\right) = \sum_i E(X_i) = np$
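The result np can be checked numerically with a small simulation. The sketch below is only illustrative: the function names, the number of trials, and the fixed seed are arbitrary choices made here, not part of the lecture.

```python
import random


def expected_winnings(n, p):
    """Linearity of expectation: n tosses, each worth p dollars on average."""
    return n * p


def simulate_winnings(n, p, trials=2000, seed=0):
    """Average winnings over many simulated runs of n tosses."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # Each toss pays one dollar with probability p.
        total += sum(1 for _ in range(n) if rng.random() < p)
    return total / trials
```

For example, `simulate_winnings(100, 0.5)` should land close to `expected_winnings(100, 0.5)`, which is 50.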

Expected number of matches
The expected number of matches can still be computed.
Let $X_i = 1$ if there is a match starting at position i, and $X_i = 0$ otherwise.
Pr(match at position i) = $p_i = 0.25^m$
$E(X_i) = p_i = 0.25^m$
Expected number of matches = $E\left(\sum_i X_i\right) = \sum_i E(X_i) = n \left(\frac{1}{4}\right)^m$

Expected number of exact matches is small!
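To see how quickly this expectation shrinks, the formula can be evaluated for a few query lengths. The sketch below uses the database size n = 100M from the earlier slide; the function names are made up for illustration, and `count_matches` is included only as a way to count (possibly overlapping) exact occurrences in a concrete string.

```python
def expected_exact_matches(n, m, p=0.25):
    """Expected number of exact matches, n * p**m, following the slide's formula."""
    return n * p ** m


def count_matches(text, query):
    """Count exact occurrences of query in text, including overlapping ones."""
    return sum(1 for i in range(len(text) - len(query) + 1)
               if text.startswith(query, i))


if __name__ == "__main__":
    n = 100_000_000
    for m in (5, 10, 15, 20, 30):
        print(m, expected_exact_matches(n, m))
```

The formula treats every one of the n positions as a possible start. A string of length n has only n - m + 1 valid starting positions, so for m much smaller than n the difference is negligible.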

This note was uploaded on 02/14/2008 for the course CSE 182 taught by Professor Bafna during the Fall '06 term at UCSD.
