Lect10 LW statistics/Assembly

An Introduction to Bioinformatics Algorithms (Computational Molecular Biology)

CSE182-L10 LW statistics/Assembly

Whole Genome Shotgun
Break up the entire genome into pieces.
Sequence the ends, and assemble using a computer.
LW statistics and repeats argue against the success of such an approach.
Alternative: build a roadmap of the genome, with physical clones mapped for each region. Sequence each of the clones, and put them together.

Questions
Algorithmic: How do you put the genome back together from the pieces?
Statistical: How many pieces do you need to sequence, etc.?
The answer to the statistical questions had already been given in the context of mapping, by Lander and Waterman.

Lander Waterman Statistics
(Figure: fragments of length L placed on a genome of length G.)
The fragments fall randomly on the genome.
Overlapping fragments form islands of contiguous sequence.
Ideally, we want one island for each chromosome.
How many fragments should we sequence?

Lander Waterman Statistics
G = genome length
L = fragment length
N = number of fragments
T = required overlap
c = coverage = LN/G
a = N/G
q = T/L
σ = 1 - q
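The notation above can be collected into a small helper. This is only a sketch: the class name `LWParams` and the example numbers are assumptions made for illustration, not part of the lecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LWParams:
    """Lander-Waterman parameters (hypothetical helper, names follow the slide)."""
    G: int  # genome length
    L: int  # fragment length
    N: int  # number of fragments
    T: int  # required overlap

    @property
    def c(self) -> float:
        # coverage = LN/G
        return self.L * self.N / self.G

    @property
    def a(self) -> float:
        # a = N/G, the chance that a fragment starts at a given position
        return self.N / self.G

    @property
    def q(self) -> float:
        # q = T/L, the fraction of a fragment needed to detect an overlap
        return self.T / self.L

    @property
    def sigma(self) -> float:
        # sigma = 1 - q
        return 1.0 - self.q


if __name__ == "__main__":
    # Made-up example values, chosen only to give round numbers.
    p = LWParams(G=1_000_000, L=500, N=10_000, T=50)
    print(p.c, p.a, p.q, p.sigma)
```

With these made-up values, every position of the genome is covered five times on average (c = 5), and 90% of each fragment (sigma = 0.9) is free to extend an island.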

LW statistics: questions
As the coverage c increases, more and more areas of the genome are likely to be covered. Ideally, you want to see 1 island.
Q1: What is the expected number of islands?
Ans: $N e^{-c\sigma}$
As N grows, this number increases at first, peaks at $c = 1/\sigma$, and then gradually decreases as new fragments join existing islands together.

Analysis: Expected Number of Islands
Computing the expected number of islands.
Let $X_i = 1$ if an island ends at position i, and $X_i = 0$ otherwise.
Number of islands $= \sum_i X_i$
Expected number of islands $= E\left(\sum_i X_i\right) = \sum_i E(X_i)$
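To see the rise and fall of the answer to Q1, the closed form N exp(-c σ) can be evaluated for a range of fragment counts. The sketch below assumes the definitions c = LN/G and σ = 1 - T/L from the notation slide; the function names and the example genome are hypothetical.

```python
import math


def expected_islands(G, L, N, T):
    """Expected number of islands, N * exp(-c * sigma)."""
    c = L * N / G          # coverage
    sigma = 1.0 - T / L    # sigma = 1 - T/L
    return N * math.exp(-c * sigma)


def peak_fragment_count(G, L, T):
    """Fragment count at which the expectation peaks.

    Setting the derivative with respect to N to zero gives c * sigma = 1,
    which is the same as N = G / (L - T).
    """
    return G / (L - T)


if __name__ == "__main__":
    # Made-up example: 1 Mb genome, 500 bp fragments, 50 bp required overlap.
    G, L, T = 1_000_000, 500, 50
    for N in (500, 1000, 2000, 5000, 10_000, 20_000):
        print(N, round(expected_islands(G, L, N, T), 1))
    print("peak near N =", round(peak_fragment_count(G, L, T)))
```

Past the peak the count falls off exponentially, which is why fairly high coverage is needed before the expected number of islands gets close to 1.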

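Prob. of an island ending at i
$E(X_i)$ = Prob(island ends at position i) = Prob(a clone began at position i-L+1 AND no clone began in the next L-T positions)
(Figure: a clone ending at position i, labeled with L and T.)
$E(X_i) = a(1-a)^{L-T} \approx a e^{-c\sigma}$, since $(1-a)^{L-T} \approx e^{-a(L-T)}$ and $a(L-T) = \frac{N}{G} L \left(1 - \frac{T}{L}\right) = c\sigma$.
Expected number of islands $= \sum_i E(X_i) = G a e^{-c\sigma} = N e^{-c\sigma}$

Pr[Island contains exactly j clones]?
Consider an island that has already begun. With probability $e^{-c\sigma}$, the current clone is the last one and the island is not continued. Therefore an island with exactly j clones is continued j-1 times and then stops:
Pr[Island contains exactly j clones] $= (1 - e^{-c\sigma})^{j-1} e^{-c\sigma}$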
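Both claims can be checked with a small Monte Carlo experiment. The sketch below is not the lecture's method. It assumes clone start positions are uniform on the genome, ignores edge effects at the genome ends, and treats two consecutive clones as joined when their starts are at most L-T apart. All function names and parameter values are made up for illustration.

```python
import math
import random


def simulate_island_sizes(G, L, N, T, rng):
    """Return the number of clones in each island for one random placement."""
    starts = sorted(rng.randrange(G) for _ in range(N))
    sizes = [1]  # the leftmost clone opens the first island
    for prev, cur in zip(starts, starts[1:]):
        if cur - prev <= L - T:
            sizes[-1] += 1   # overlap of at least T: same island
        else:
            sizes.append(1)  # overlap too short: a new island begins
    return sizes


def compare_with_theory(G, L, N, T, trials=20, seed=0):
    """Average several simulations and put them next to the predictions."""
    rng = random.Random(seed)
    c = L * N / G
    sigma = 1.0 - T / L
    stop = math.exp(-c * sigma)  # chance that a clone is the last in its island
    n_islands = 0
    n_single = 0
    for _ in range(trials):
        sizes = simulate_island_sizes(G, L, N, T, rng)
        n_islands += len(sizes)
        n_single += sum(1 for s in sizes if s == 1)
    return {
        "simulated_islands": n_islands / trials,
        "predicted_islands": N * stop,
        "simulated_frac_single": n_single / n_islands,
        "predicted_frac_single": stop,
    }


if __name__ == "__main__":
    for key, value in compare_with_theory(1_000_000, 500, 5000, 50).items():
        print(key, round(value, 4))
```

The fraction of single-clone islands is compared with exp(-c σ) because an island has exactly one clone precisely when its first clone is not continued. The agreement is approximate, since the formula itself rests on the exponential approximation.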

This note was uploaded on 02/14/2008 for the course CSE 182 taught by Professor Bafna during the Fall '06 term at UCSD.
