Mutation models I: basic nucleotide sequence mutation models
Peter Beerli
September 14, 2005
BSC5936-Fall 2005 Computational Evolutionary Biology

1 Basics

By the end of this lecture, I hope, you will have an idea of how to calculate the probability of going from nucleotide sequence A to sequence B while taking into account that we do not really know the number of mutations that occurred. In the lecture about parsimony we counted the number of changes on a tree and sought the tree that minimizes the number of changes. This approach is sometimes misleading because we know that some nucleotide locations mutate fast and might have mutated more than once. This can be seen in data sets where we find more than one specific nucleotide at a specific locus: parsimony counts the series of changes A → C → G → A as no change, when in fact there were 3 changes. Stochastic mutation models can take this into account.

1.1 Branch length and scale

Figure 1: Branch length and change of state

We discussed phylogenetic trees but have so far ignored the time one needs to wait between nodes on such trees; we did not worry whether a branch is long or short and were only interested in the topology. For likelihood and Bayesian methods we need an explicit model for these branch lengths. But what does it mean to see a branch length of 0.05? Depending on the scaling, this might mean rather different things: typically in phylogenetics it could mean that on average 5% of the sites have changed, whereas in population genetics it could mean the same, or that 0.05 generations scaled by the population size have passed.

1.2 A simple model

1.2.1 Discrete time

We look first at a very simple model with two states U and Y; if you wish, you can think of these as pUrines (the nucleotides adenine [A] or guanine [G]) and pYrimidines (cytosine [C] or thymine [T]). We have the two states and a single substitution rate $\mu$, which applies both to the substitution from U to Y and to the substitution from Y to U (Figure 2). This is a very simple model. We could think of introducing a different rate for going from Y to U, but will refrain from doing so for this outline. We will see later that there are models that set back-mutation to zero. The most common of these is the infinite sites mutation model. It will be discussed in the chapter Mutation models III.

Figure 2: Simple model with two states and one mutation rate

The model shown in Figure 2 assumes that time is discrete and that we evaluate the transition from one state at time i to the same or the other state at time i + 1, where the allele U is at risk to mutate to Y with rate $\mu$ or to stay in U with rate $1-\mu$. The same logic applies to Y, where Y changes to U with rate $\mu$ and stays in Y with rate $1-\mu$. We can express this as a transition matrix

\[
R = \begin{pmatrix} 1-\mu & \mu \\ \mu & 1-\mu \end{pmatrix}
\]
...
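To make the discrete-time model concrete, the following is a minimal sketch in Python of the two-state transition matrix and of what happens when it is applied over several time steps. The symbol `mu`, the value 0.05, and the function names `transition_matrix`, `matmul2`, and `n_step_matrix` are assumptions introduced for illustration; they are not taken from the lecture.

```python
def transition_matrix(mu):
    """One-step transition matrix for the states (U, Y) with one symmetric rate mu."""
    return [[1.0 - mu, mu],
            [mu, 1.0 - mu]]


def matmul2(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def n_step_matrix(mu, n):
    """Transition probabilities after n discrete time steps (R multiplied n times)."""
    result = [[1.0, 0.0], [0.0, 1.0]]  # identity matrix: zero steps, no change
    step = transition_matrix(mu)
    for _ in range(n):
        result = matmul2(result, step)
    return result


mu = 0.05  # hypothetical per-step mutation probability, chosen for illustration
for n in (1, 10, 100):
    r_n = n_step_matrix(mu, n)
    # r_n[0][1] is the probability of being in Y after n steps when starting in U
    print(n, "steps: P(U -> Y) =", round(r_n[0][1], 4),
          "| expected number of mutations =", n * mu)
```

Under these assumptions, the expected number of mutations grows linearly with the number of steps, while the probability of observing a different state at the end levels off toward 0.5. That gap is the multiple-change effect that counting differences, as in parsimony, does not capture.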