lecture-5-handouts

lecture-5-handouts - 9/6/2011 Practical Bioinformatics for...

Practical Bioinformatics for Life Scientists
Week 3, Lecture 5
István Albert
Bioinformatics Consulting Center, Penn State

Processing sequencing reads

Sequenced DNA fragments (DNA library)
    Unknown genome: de novo assembly (contigs), then gene discovery (annotation)
    Known genome (reference): read mapping (alignments), then ChIP-Seq, RNA-Seq, SNP calling etc.

Sequence alignments

Arranging two or more sequences so as to maximize the length of the regions they have in common
It is a very well developed field: the roots of bioinformatics lie in various alignment software
We will only cover:
    pair-wise alignments
    searching a database with a query
High-throughput sequencing poses special constraints: a very large number of very short reads, for which traditional methods were not feasible

Alignment concepts

Read compared with the reference fragment GCAAG:
    GCAAG     perfect match
    GTAAG     one mismatch
    GCAAAG    insertion vs. ref.
    GCAG      deletion vs. ref.

NOTE: mismatches or indels can be longer than 1 base! It gets complicated very quickly.
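The four cases under "Alignment concepts" (perfect match, mismatch, insertion vs. ref., deletion vs. ref.) can be illustrated with a small pair-wise alignment. The sketch below is not from the lecture: it assumes a plain global alignment (Needleman-Wunsch) with made-up scores of +1 for a match and -1 for a mismatch or gap, and the function names align and classify are hypothetical. Real short-read mappers use far more elaborate methods.

```python
def align(ref, read, match=1, mismatch=-1, gap=-1):
    """Globally align read against ref; return the two gapped strings.

    Scoring values are illustrative assumptions, not taken from the lecture.
    """
    n, m = len(ref), len(read)
    # score[i][j] = best score for aligning ref[:i] with read[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if ref[i - 1] == read[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align two bases
                              score[i - 1][j] + gap,      # gap in read
                              score[i][j - 1] + gap)      # gap in ref

    # Trace back from the bottom-right corner to recover one best alignment.
    a_ref, a_read = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and ref[i - 1] == read[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            a_ref.append(ref[i - 1])
            a_read.append(read[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            a_ref.append(ref[i - 1])   # base present in ref only
            a_read.append("-")
            i -= 1
        else:
            a_ref.append("-")          # base present in read only
            a_read.append(read[j - 1])
            j -= 1
    return "".join(reversed(a_ref)), "".join(reversed(a_read))


def classify(ref, read):
    """Count mismatches, insertions and deletions of read relative to ref."""
    a_ref, a_read = align(ref, read)
    counts = {"mismatches": 0, "insertions": 0, "deletions": 0}
    for r, q in zip(a_ref, a_read):
        if r == "-":
            counts["insertions"] += 1   # extra base in the read
        elif q == "-":
            counts["deletions"] += 1    # base missing from the read
        elif r != q:
            counts["mismatches"] += 1
    return counts


if __name__ == "__main__":
    for read in ("GCAAG", "GTAAG", "GCAAAG", "GCAG"):
        print(read, classify("GCAAG", read))
```

Running the script compares each of the four example reads with the reference GCAAG and prints the counted differences. With a single mismatch or a one-base indel the classification is unambiguous; with longer or combined differences several alignments can score equally well, which is one reason it gets complicated very quickly.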
