Lect10 LW statisticsAssembly

# An Introduction to Bioinformatics Algorithms (Computational Molecular Biology)


CSE182-L10 LW statistics/Assembly

Whole Genome Shotgun

- Break up the entire genome into pieces.
- Sequence the ends of the pieces, and assemble them using a computer.
- LW statistics and repeats argue against the success of such an approach.
- Alternative: build a roadmap of the genome, with physical clones mapped to each region. Sequence each of the clones, and put them together.
Questions

- Algorithmic: How do you put the genome back together from the pieces?
- Statistical: How many pieces do you need to sequence, etc.?
- The statistical questions had already been answered, in the context of mapping, by Lander and Waterman.

Lander Waterman Statistics

[Figure: fragments of length L placed along a genome of length G]

- The fragments fall randomly on the genome.
- Overlapping fragments form islands of contiguous sequence.
- Ideally, we want one island for each chromosome.
- How many fragments should we sequence?
Lander Waterman Statistics

- $G$ = genome length
- $L$ = fragment length
- $N$ = number of fragments
- $T$ = required overlap
- $c$ = coverage = $LN/G$
- $\alpha = N/G$
- $\theta = T/L$
- $\sigma = 1 - \theta$
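To make the notation concrete, the sketch below computes the derived quantities (coverage, N/G, T/L, and 1 - T/L) from the four basic inputs. The class name `LWParams`, its attribute names, and the example numbers are illustrative assumptions, not part of the lecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LWParams:
    """Hypothetical container for the four basic Lander-Waterman inputs."""
    G: int  # genome length
    L: int  # fragment length
    N: int  # number of fragments
    T: int  # required overlap between adjacent fragments

    @property
    def coverage(self) -> float:
        # c = LN / G
        return self.L * self.N / self.G

    @property
    def start_rate(self) -> float:
        # N / G: chance that a fragment starts at a given position
        return self.N / self.G

    @property
    def overlap_fraction(self) -> float:
        # T / L
        return self.T / self.L

    @property
    def sigma(self) -> float:
        # 1 - T / L
        return 1.0 - self.overlap_fraction


# Made-up example values, chosen only for illustration.
params = LWParams(G=1_000_000, L=500, N=10_000, T=50)
print(params.coverage, params.start_rate, params.overlap_fraction, params.sigma)
```

With these made-up values the coverage is 5, meaning each base is expected to be covered by five fragments on average.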

LW statistics: questions

- As the coverage $c$ increases, more and more areas of the genome are likely to be covered. Ideally, you want to see 1 island.
- Q1: What is the expected number of islands?
- Ans: $N e^{-c\sigma}$
- As $N$ grows, this number increases at first (while $c\sigma < 1$) and then gradually decreases.
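The rise-then-fall behaviour is easy to see numerically. The following is a minimal sketch that evaluates N exp(-cσ) for a range of N; the function name `expected_islands` and the genome and fragment sizes are hypothetical choices made for this example.

```python
import math


def expected_islands(N: int, G: int, L: int, T: int) -> float:
    """Expected number of islands, N * exp(-c * sigma)."""
    c = L * N / G          # coverage
    sigma = 1.0 - T / L    # fraction of a fragment not needed for overlap
    return N * math.exp(-c * sigma)


G, L, T = 1_000_000, 500, 50   # hypothetical genome and fragment sizes
for N in (500, 1_000, 2_222, 5_000, 10_000, 20_000):
    c = L * N / G
    print(f"N={N:>6}  c={c:5.2f}  islands={expected_islands(N, G, L, T):8.2f}")
```

For these values the count peaks near N = G / (Lσ), roughly 2,222 fragments, and falls off quickly once the coverage passes a few-fold.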
Analysis: Expected Number Islands

Computing the expected number of islands:

- Let $X_i = 1$ if an island ends at position $i$, and $X_i = 0$ otherwise.
- Number of islands $= \sum_i X_i$
- Expected number of islands $= E\left(\sum_i X_i\right) = \sum_i E(X_i)$

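Prob. of an island ending at i

[Figure: a clone of length L ending at position i, with required overlap T]

$E(X_i)$ = Prob(island ends at position $i$) = Prob(a clone began at position $i-L+1$ AND no clone began in the next $L-T$ positions)

$$E(X_i) = \alpha (1-\alpha)^{L-T} \approx \alpha e^{-c\sigma}$$

Here $\alpha = N/G$ is the probability that a clone begins at a given position and $\sigma = 1 - T/L$, so $\alpha(L-T) = c\sigma$ and $(1-\alpha)^{L-T} \approx e^{-\alpha(L-T)}$ for small $\alpha$.

$$\text{Expected number of islands} = \sum_i E(X_i) = G\alpha e^{-c\sigma} = N e^{-c\sigma}$$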
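As a sanity check on the formula, the sketch below drops fragments uniformly at random on a line and counts islands directly. It is a rough Monte Carlo illustration, not the lecture's method: the function names, the parameter values, the fixed seed, and the choice to ignore chromosome-end effects are all assumptions made for this example.

```python
import math
import random


def count_islands(starts, L, T):
    """Count islands given fragment start positions (all fragments have length L)."""
    starts = sorted(starts)
    if not starts:
        return 0
    islands = 1
    for prev, cur in zip(starts, starts[1:]):
        # Consecutive fragments overlap by L - (cur - prev) bases;
        # an overlap smaller than T ends the current island.
        if cur - prev > L - T:
            islands += 1
    return islands


def simulate_mean_islands(G, L, N, T, trials=200, seed=0):
    """Average island count over several random placements of N fragments."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        starts = [rng.randrange(G - L + 1) for _ in range(N)]
        total += count_islands(starts, L, T)
    return total / trials


G, L, N, T = 100_000, 500, 600, 50   # hypothetical small example
predicted = N * math.exp(-(L * N / G) * (1 - T / L))
simulated = simulate_mean_islands(G, L, N, T)
print(f"predicted {predicted:.1f}, simulated {simulated:.1f}")
```

Because every fragment has the same length, the most recently started fragment always reaches furthest right, so comparing consecutive sorted start positions is enough to detect where an island ends.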
Pr[Island contains exactly j clones]?

Consider an island that has already begun. With probability $e^{-c\sigma}$, it will never be continued. Therefore
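One way to read where this argument leads, under the added assumption that each clone in an island independently fails to be continued with probability exp(-cσ): the number of clones in an island would then follow a geometric distribution, with j - 1 continuations followed by one stop. The sketch below encodes that reading; the function names and example values are hypothetical.

```python
import math


def island_size_pmf(j: int, c: float, sigma: float) -> float:
    """Pr[island has exactly j clones], assuming independent continuation."""
    p_stop = math.exp(-c * sigma)           # chance a clone is the last in its island
    return (1.0 - p_stop) ** (j - 1) * p_stop


def mean_clones_per_island(c: float, sigma: float) -> float:
    """Mean of the geometric distribution above, 1 / p_stop."""
    return math.exp(c * sigma)


c, sigma = 3.0, 0.9   # hypothetical coverage and non-overlap fraction
for j in (1, 2, 5, 10, 20):
    print(f"j={j:>2}  Pr={island_size_pmf(j, c, sigma):.4f}")
print(f"mean clones per island: {mean_clones_per_island(c, sigma):.2f}")
```

The mean exp(cσ) agrees with dividing N clones among the N exp(-cσ) expected islands.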


## This note was uploaded on 02/14/2008 for the course CSE 182 taught by Professor Bafna during the Fall '06 term at UCSD.
