... computers, running that many comparisons is impractical, so seeded algorithms are used.

Overlapping Reads
• Sort all k-mers in reads
• Find pairs of reads sharing a k-mer
• Extend to full alignment – throw away if not >95% similar

TACA TAGATTACACAGATTAC T GA
TAGT TAGATTACACAGATTAC TAGA

Finding Overlapping Reads
Create local multiple alignments from the overlapping reads.

TAGATTACACAGATTACTGA
TAGATTACACAGATTACTGA
TAG TTACACAGATTATTGA
TAGATTACACAGATTACTGA
TAGATTACACAGATTACTGA
TAGATTACACAGATTACTGA
TAG TTACACAGATTATTGA
TAGATTACACAGATTACTGA

Finding Overlapping Reads
Correct errors using multiple alignment.
• Find locations where there is a deviation in which 1% of the data diverge from the rest.
• Make those positions agree with the rest.

TAGATTACACAGATTACTGA
TAGATTACACAGATTACTGA
TAG TTACACAGATTATTGA
TAGATTACACAGATTACTGA
TAGATTACACAGATTACTGA

Build the Overlap Graph
• Overlap graph: the nodes represent actual reads, and edges represent overlaps between these reads.
• Thus, genome assembly becomes equivalent to finding a path through the graph that visits each node exactly once (i.e., a Hamiltonian path).

An overlap graph. Nodes are complete reads and edges connect reads that overlap. Note that in an actual graph, reads and overlaps would be much larger.

Layout
• Finding a Hamiltonian path through the overlap graph is not a trivial task.
• In order to decrease the size of the graph, the OLC assembly graph is simplified in the layout stage, where segments of the graph are compressed into contigs.
• Thus, we have to find a way to decrease the complexity of the graph.

Graph Reduction
• A contig would be a subgraph, or a group of nodes, with many connections among each other, as they all overlap with each other and refer to the same sequence (A and B).
• Once a subgraph has been identified, these nodes and edges are compressed into one node, or a contig, thereby simplifying the graph (C).

Separating Contigs
• Th...
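
The overlap and graph-building steps from the slides above can be illustrated with a minimal Python sketch. It is not the course's implementation: the function names (`kmer_index`, `candidate_pairs`, `suffix_prefix_overlap`, `overlap_graph`), the parameters `k` and `min_len`, and the three example reads are all assumptions made for illustration. As a simplification, it accepts only exact suffix-prefix matches, whereas the slides extend seeds to a full alignment and keep pairs that are more than 95% similar.

```python
from collections import defaultdict
from itertools import combinations


def kmer_index(reads, k):
    """Map each k-mer to the set of read indices that contain it."""
    index = defaultdict(set)
    for i, read in enumerate(reads):
        for pos in range(len(read) - k + 1):
            index[read[pos:pos + k]].add(i)
    return index


def candidate_pairs(reads, k):
    """Unordered pairs of reads sharing at least one k-mer (the seeds)."""
    pairs = set()
    for ids in kmer_index(reads, k).values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


def suffix_prefix_overlap(a, b, min_len):
    """Longest suffix of a that exactly equals a prefix of b.

    Returns 0 if no match of at least min_len exists (min_len must be >= 1).
    Exact matching is a simplification of the >95% similarity test.
    """
    for length in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-length:] == b[:length]:
            return length
    return 0


def overlap_graph(reads, k, min_len):
    """Directed edges {(i, j): overlap_length}, checked only for seeded pairs."""
    edges = {}
    for i, j in candidate_pairs(reads, k):
        for src, dst in ((i, j), (j, i)):
            length = suffix_prefix_overlap(reads[src], reads[dst], min_len)
            if length:
                edges[(src, dst)] = length
    return edges


# Hypothetical reads cut from the slide sequence TAGATTACACAGATTACTGA.
reads = ["TAGATTACACAG", "TACACAGATTAC", "AGATTACTGA"]
edges = overlap_graph(reads, k=4, min_len=5)
print(sorted(edges.items()))  # [((0, 1), 7), ((1, 2), 7)]
```

In this toy example all three pairs of reads share a 4-mer, so all three are candidates, but only two directed edges survive the overlap check. Following the edges 0 → 1 → 2 visits every read exactly once, which is the Hamiltonian path the slides describe.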
This note was uploaded on 02/10/2014 for the course CS 425 taught by Professor Asabenhur during the Fall '13 term at Colorado State.
