FragAssembly

Fragment Assembly (02/10/12)

Introduction
- Fragments are typically 200-700 bp long.
- The target string is about 30k-100k bp long.
- Problem: given a set of fragments, reconstruct the target.

Introduction
- Multiple alignment of the fragments, ignoring spaces at the ends.
- The alignment is called the layout.
- The output is called the consensus sequence.
- An optimization problem.

Complications: Base-call errors
- Substitution errors [p 107]
- Insertion errors (possibly from the host sequence) [p 108, fig 4.3]
- Deletion errors [fig 4.4]
- Majority voting solves them (or some form of optimization).

Complications: Chimeras
- Two non-contiguous fragments get joined as a single fragment [p 109, fig 4.5].
- Chimeras need to be weeded out as a preprocessing step.
- Similar to chimeras, contaminant fragments (possibly from the host) need to be filtered out as well.

Complications: Unknown orientation
- Fragments may come from either strand.
- If a fragment comes from the opposite strand, its reverse complement must be in the target string.
- Consequence: try both the forward and the reverse-complement form of each fragment (2^n trials in the worst case, for n fragments) [p 109, fig 4.6].

Complications: Repeats
- Regions (superstrings of some fragments) may repeat in a target.
- Consequent problem: under approximate alignment, where do the fragments really come from? [p 110, fig 4.7]
- Problem 2: where should the inter-repeat fragments go? [p 111, fig 4.8, fig 4.9]
- Inverted repeats: a repeat of the reverse complement [fig 4.10]

Complications: Insufficient coverage
- The chance of coverage increases with redundancy (a heuristic: cover 8 times the target length).
- The chance of covering a gap decreases when it remains uncovered even after multiple fragments are aligned: random sampling is not a good solution here.

Complications: Insufficient coverage
- What you get with insufficient coverage is multiple contigs, not one contig.
- t-contig is where we expect t-long...
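The reverse-complement, majority-voting, and coverage ideas in the slides above can be illustrated with a minimal Python sketch. It is not the method from the course or the textbook: the function names (`reverse_complement`, `consensus`, `coverage`), the layout representation (one string per fragment, padded with spaces at the ends and using `-` for internal gaps), and the toy fragments are all assumptions made for illustration.

```python
from collections import Counter

# Watson-Crick pairing used to build the reverse complement.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(fragment):
    """Return the fragment as it would read on the opposite strand."""
    return "".join(COMPLEMENT[base] for base in reversed(fragment))


def consensus(layout, gap="-"):
    """Majority vote per column of a layout (hypothetical representation).

    Each row is one aligned fragment. Spaces mark positions outside the
    fragment and are ignored; `gap` marks an internal alignment gap.
    """
    width = max(len(row) for row in layout)
    result = []
    for col in range(width):
        votes = Counter(
            row[col] for row in layout if col < len(row) and row[col] != " "
        )
        if not votes:
            continue  # no fragment covers this column
        winner, _ = votes.most_common(1)[0]
        if winner != gap:  # a winning gap means the column is dropped
            result.append(winner)
    return "".join(result)


def coverage(fragments, target_length):
    """Total fragment length divided by target length (redundancy)."""
    return sum(len(f) for f in fragments) / target_length


# Toy layout: the second fragment carries a substitution error (A for T)
# in the fourth column, which the other two fragments outvote.
layout = [
    "ACGTT   ",
    " CGATGC ",
    "  GTTGCA",
]

print(consensus(layout))                                  # ACGTTGCA
print(reverse_complement("ACGTT"))                        # AACGT
print(coverage([row.strip() for row in layout], 8))       # 2.125
```

The sketch assumes the layout is already known. Finding the layout, including which of the two orientations to use for each fragment, is the hard part of the problem and is not attempted here.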

This note was uploaded on 02/10/2012 for the course CSE 5615 taught by Professor Mitra during the Fall '11 term at FIT.
