lecture3 - Csci 3003: Introduction to Computing in Biology...

Overview of whole-genome sequencing technology
Prof. Chad Myers
Department of Computer Science and Engineering
University of Minnesota
cmyers@cs.umn.edu
Csci 3003: Introduction to Computing in Biology (Spring 2010)
The start of the genomic revolution: sequencing
Human genome sequence published, Feb 2001
The problem
Goal: Find the complete sequence of A's, C's, G's, and T's for the entire genome
Challenge: There is no machine that takes long DNA as input and gives the complete sequence as output
Can only sequence single fragments (~500 bases) at a time (slide from Serafim Batzoglou)
Sanger sequencing machine (capillary electrophoresis)
~67,000 bases / hr. Margulies, M. et al. Nature 437, 376-380 (2005).
Method for sequencing longer regions
Randomly break apart (shotgun)
Sequence short chunks
Put everything back together (“assemble” whole genome)
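The shotgun step can be sketched in a few lines of Python. This is only an illustrative simulation, not anything from the lecture: the function name `shotgun_fragments`, the toy genome string, and the read length of 8 are all assumptions chosen to keep the example small (real reads are ~500 bases).

```python
import random

def shotgun_fragments(genome, read_len, n_reads, seed=0):
    """Simulate shotgun sequencing: sample fixed-length reads at random positions.

    Hypothetical helper for illustration; real fragmentation yields
    variable-length pieces and reads contain errors.
    """
    if read_len > len(genome):
        raise ValueError("read_len must not exceed the genome length")
    rng = random.Random(seed)            # seeded so the run is repeatable
    last_start = len(genome) - read_len  # last valid start position
    reads = []
    for _ in range(n_reads):
        start = rng.randint(0, last_start)  # randint is inclusive on both ends
        reads.append(genome[start:start + read_len])
    return reads

# Toy genome (made up for this example).
genome = "ATGCGTACGTTAGCCGATAGGCTTACGGATCCA"
reads = shotgun_fragments(genome, read_len=8, n_reads=30)

# Every read is a short chunk of the original sequence, but the
# positions are unknown to the assembler.
print(reads[:5])
```

The reads come back in no particular order and without position labels, which is why an assembly step is needed afterwards.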
Method for sequencing longer regions
Basic idea:
  detect overlap between short chunks
  “tile” them together to infer underlying sequence
Requires:
  redundancy in coverage (7X)
  smart algorithms for comparing pairs/assembling
Potential problems?
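To make the overlap-and-tile idea concrete, here is a minimal sketch of a greedy assembler, together with the arithmetic behind the coverage requirement. It is not the algorithm used by any real assembler or by the lecture; the function names, the minimum overlap of 3, the toy reads, and the 3-billion-base genome size used in the coverage example are all assumptions made for illustration.

```python
import math

def reads_for_coverage(genome_len, read_len, coverage):
    """Number of reads needed so that total read bases = coverage * genome_len."""
    return math.ceil(coverage * genome_len / read_len)

def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if the longest match is shorter than min_len.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0

def drop_contained(reads):
    """Remove duplicate reads and reads fully contained in another read."""
    unique = list(dict.fromkeys(reads))
    return [r for r in unique if not any(r != s and r in s for s in unique)]

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = drop_contained(reads)
    while len(contigs) > 1:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            break  # no remaining overlaps: leave separate contigs
        merged = contigs[i] + contigs[j][k:]
        rest = [c for idx, c in enumerate(contigs) if idx not in (i, j)]
        contigs = drop_contained(rest + [merged])
    return contigs

# Assumed ~3 billion base genome, ~500-base reads, 7X coverage.
print(reads_for_coverage(3_000_000_000, 500, 7))  # 42000000

# Three toy reads, given out of order, tile back into one sequence.
print(greedy_assemble(["TAAGTCGA", "ATGGCCT", "GCCTAAG"]))  # ['ATGGCCTAAGTCGA']
```

Comparing every pair of reads is quadratic in the number of reads, so this only works for toy inputs; it is meant to show why "smart algorithms for comparing pairs" are needed at genome scale.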
Challenges with Fragment Assembly
Sequencing errors: ~1-2% of bases are wrong
Repeats: % of repeats in whole genome:
  Bacterial genomes: 5%
  Mammals: 50%
Figure: false overlap due to repeat (slide from Serafim Batzoglou)
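A false overlap due to a repeat can be reproduced with a small example. The sketch below is illustrative only: the repeat sequence, the toy genome, and the `overlap` helper are assumptions, not data from the lecture. Two reads come from different copies of the same repeat, so their ends match even though they are not adjacent in the genome.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if < min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0

# Hypothetical 7-base repeat that occurs twice in a made-up genome.
repeat = "TTAGGCA"
genome = "ACGT" + repeat + "CCCC" + repeat + "GATC"

read_a = "ACGT" + repeat   # ends inside the FIRST copy of the repeat
read_b = repeat + "GATC"   # starts at the SECOND copy of the repeat

k = overlap(read_a, read_b)
merged = read_a + read_b[k:]

print(k)                 # 7: the whole repeat looks like a genuine overlap
print(merged in genome)  # False: the merged sequence never occurs in the genome
print(len(genome) - len(merged))  # 11 bases between the repeat copies were collapsed
```

Both reads are correct substrings of the genome, yet merging them silently deletes the region between the two repeat copies. This is why repeat-rich genomes such as mammalian ones are much harder to assemble than bacterial ones.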
The politics of the Human Genome project
Public Project: Mapped long chunks of the genome first (to sort out

This note was uploaded on 10/21/2011 for the course CSCI 3003 taught by Professor Staff during the Spring '08 term at Minnesota.
