LecturesPart06



report the matches to that sublibrary when done).

BLAST Statistical significance

A key to the utility of BLAST is the ability to calculate expected probabilities of occurrence of Maximal Segment Pairs (MSPs) given w and T. This allows BLAST to rank matching sequences in order of “significance” and to cut off listings at a user-specified probability.

From the Karlin-Altschul formulation, the expected value (mean) of the HSP scores between a query and a set of random sequences is

$$u \cong \frac{\ln(Kmn)}{\lambda}$$

where ln denotes the natural logarithm (log base e).

BLAST uses a correction to this formulation that takes into account the effective sequence lengths of the query and the database sequences:

$$u = \frac{\ln(K m' n')}{\lambda}$$

The corrected lengths are given by

$$m' = m - \frac{\ln(Kmn)}{H}, \qquad n' = n - \frac{\ln(Kmn)}{H}, \qquad \text{with } H = \frac{\ln(Kmn)}{l}$$

where l is the average length of the alignment that can be achieved between random sequences of length m and n.

Given u, we can calculate the probability p of observing a score S between a query sequence and a given database sequence that is equal to or greater than x:

$$p(S \ge x) = 1 - \exp\left(-e^{-\lambda (x - u)}\right)$$

Lastly, we have to consider that we are searching many database sequences and can expect even a relatively rare score to occur with high chance given enough comparisons. For a database of D sequences, this is

$$E \approx 1 - e^{-p(S \ge x)\, D}$$

Summary of Database Search Methods

Authors (Program)           Description
Needleman & Wunsch          full alignment
Wilbur & Lipman             match k-tuple - form diag - NW
Lipman & Pearson (FASTP)    k-tuple - diag - rescore
Pearson & Lipman (FASTA)    FASTP - join diags - NW
Altschul et al (BLAST)      word match list - statistics...
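The significance formulas in the BLAST slides above can be followed numerically with a short Python sketch. It is only an illustration of the equations as written in these notes, not the code BLAST itself runs. The function names are hypothetical, and the values chosen for K, λ, H, m, n, x and D are assumptions picked for the example rather than parameters taken from the lecture.

```python
import math


def expected_score(K, lam, m, n):
    """u = ln(K*m*n) / lambda, the Karlin-Altschul mean HSP score."""
    return math.log(K * m * n) / lam


def effective_lengths(K, m, n, H):
    """Corrected lengths m' = m - ln(Kmn)/H and n' = n - ln(Kmn)/H."""
    correction = math.log(K * m * n) / H
    return m - correction, n - correction


def p_value(x, u, lam):
    """p(S >= x) = 1 - exp(-exp(-lambda * (x - u)))."""
    # -expm1(-y) equals 1 - exp(-y) but stays accurate when y is tiny.
    return -math.expm1(-math.exp(-lam * (x - u)))


def database_e(p, D):
    """E ~ 1 - exp(-p * D) for a database of D sequences."""
    return -math.expm1(-p * D)


if __name__ == "__main__":
    # All numbers below are illustrative assumptions, not lecture values.
    K, lam, H = 0.13, 0.32, 0.4
    m, n, D = 250, 1000, 50000

    m_eff, n_eff = effective_lengths(K, m, n, H)
    u = expected_score(K, lam, m_eff, n_eff)  # corrected u uses m', n'
    p = p_value(60.0, u, lam)

    print("effective lengths:", m_eff, n_eff)
    print("u:", u)
    print("p(S >= 60):", p)
    print("E over D sequences:", database_e(p, D))
```

Two properties are worth noticing when experimenting with the sketch: p(S ≥ x) falls off rapidly as x rises above u, and for small p·D the database-level value is close to p·D, which is why a score that is rare for one comparison can still be expected somewhere in a large database.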

This note was uploaded on 01/13/2012 for the course BIO 101 taught by Professor Staff during the Fall '10 term at DePaul.
