UC Davis Computer Science Technical Report CSE-2003-29

Optimal, Efficient Reconstruction of Phylogenetic Networks with Constrained Recombination

Dan Gusfield, Satish Eddhu, Charles Langley

November 24, 2003

To appear in Journal of Bioinformatics and Computational Biology.
Optimal, Efficient Reconstruction of Phylogenetic Networks with Constrained Recombination

Dan Gusfield and Satish Eddhu
Department of Computer Science
University of California, Davis
{gusfield,eddhu}@cs.ucdavis.edu

Charles Langley
Division of Evolution and Ecology
University of California, Davis
[email protected]

Keywords: Molecular Evolution, Phylogenetic Networks, Ancestral Recombination Graph, Recombination, SNP, bi-convex graph

Abstract

A phylogenetic network is a generalization of a phylogenetic tree, allowing structural properties that are not tree-like. With the growth of genomic data, much of which does not fit ideal tree models, there is greater need to understand the algorithmics and combinatorics of phylogenetic networks [17, 18]. However, to date, very little has been published on this, with the notable exception of the paper by Wang et al. [21]. Other related papers include [9, 10, 12, 20, 19].

Wang et al. [21] studied the problem of constructing a phylogenetic network for a set of binary sequences derived from the all-0 ancestral sequence, when each site in the sequence can mutate from 0 to 1 at most once in the network, and recombination between sequences is allowed. They gave a polynomial-time algorithm that was intended to determine whether the sequences could be derived on such a phylogenetic network where the recombination cycles are node disjoint. In this paper, we call such a phylogenetic network a "galled-tree". That work is seminal in focusing on galled-trees, and for its assertion that reconstruction of such networks can be done in polynomial time. Unfortunately, the algorithm in [21] is incomplete and does not constitute a necessary test for the existence of a galled-tree for the data.

In this paper, we completely solve the problem of determining whether a set of binary sequences can be derived on a galled-tree.
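To make the sequence model above concrete, here is a minimal Python sketch. It is not taken from the report, and the function names `recombine` and `has_conflict` are hypothetical. It shows a single-crossover recombination of two binary sequences, together with the standard pairwise test for the tree case: with the all-0 ancestral sequence and at most one 0-to-1 mutation per site, two sites cannot both fit on a tree without recombination when the sequences exhibit all three of the pairs 01, 10 and 11 at those sites.

```python
from typing import List


def recombine(prefix_parent: str, suffix_parent: str, cut: int) -> str:
    """Single-crossover recombinant (illustrative helper, not from the report).

    Sites before index `cut` are copied from prefix_parent,
    the remaining sites from suffix_parent.
    """
    if len(prefix_parent) != len(suffix_parent):
        raise ValueError("parents must have the same length")
    return prefix_parent[:cut] + suffix_parent[cut:]


def has_conflict(sequences: List[str], i: int, j: int) -> bool:
    """Return True if sites i and j show all three pairs 01, 10 and 11.

    Assumes the all-0 ancestral sequence, so the pair 00 is always allowed.
    """
    pairs = {(s[i], s[j]) for s in sequences}
    return {("0", "1"), ("1", "0"), ("1", "1")} <= pairs


parents = ["1100", "0011"]
child = recombine(parents[0], parents[1], 1)  # "1" from the first parent, "011" from the second
print(child)
print(has_conflict(parents, 0, 2))            # the parents alone
print(has_conflict(parents + [child], 0, 2))  # parents plus the recombinant
```

In this toy input the two parents alone show no conflict at sites 0 and 2, while adding the recombinant introduces one, which is the kind of data that a tree cannot explain and a network with recombination can.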
By more deeply analyzing the combinatorial constraints on cycle-disjoint phylogenetic networks, we obtain an efficient algorithm that is guaranteed to be both a necessary and sufficient test for the existence of a galled-tree for the data. If there is a galled-tree, the algorithm constructs one which is optimal, minimizing the number of recombinations over all phylogenetic networks for the data (using the all-0 ancestral sequence), even phylogenetic networks that are not restricted to be galled-trees, and even if their recombination events allow multiple-crossover recombinations. We also prove that when there is a galled-tree for the data, the galled-tree minimizing the number of recombinations is “essentially unique”, with only limited modifications permitted. We also note two additional results: first, any set of sequences that can be derived on
This note was uploaded on 10/01/2009 for the course CS BCB/Co taught by Professor Oliver Eulenstein during the Fall '06 term at Iowa State.