Maximum Likelihood Estimation for Allele Frequencies
Biostatistics 666, Lecture 7

Last Three Lectures: Introduction to Coalescent Models
• Computationally efficient framework
  • Alternative to forward simulations
• Predictions about sequence variation
  • Number of polymorphisms
  • Frequency of polymorphisms

Coalescent Models: Key Ideas
• Proceed backwards in time
• Genealogies shaped by
  • Population size
  • Population structure
  • Recombination rates
• Given a particular genealogy ...
  • Mutation rate predicts variation

Next Series of Lectures
• Estimating allele and haplotype frequencies from genotype data
  • Maximum likelihood approach
  • Application of an EM algorithm
• Challenges
  • Using information from related individuals
  • Allowing for non-codominant genotypes
  • Allowing for ambiguity in haplotype assignments

Objective: Parameter Estimation
• Learn about population characteristics
  • E.g. allele frequencies, population size
• Using a specific sample
  • E.g. a set of sequences, unrelated individuals, or even families

Maximum Likelihood
• A general framework for estimating model parameters
• Find the set of parameter values that maximize the probability of the observed data
• Applicable to many different problems

Example: Allele Frequencies
• Consider…
  • A sample of n chromosomes
  • X of these are of type “a”
  • Parameter of interest is allele frequency…

$$L(p \mid n, X) = \binom{n}{X} \, p^{X} (1 - p)^{n - X}$$

Evaluate for various parameters

p      1 - p    L
0.0    1.0      0.000
0.2    0.8      0.088
0.4    0.6      0.251
0.6    0.4      0.111
0.8    0.2      0.006
1.0    0.0      0.000

[Figure: Likelihood plot for n = 10 and X = 4; x-axis: allele frequency, y-axis: likelihood]

In this case
• The likelihood tells us the data are most probable if p = 0.4
• The likelihood curve allows us to evaluate alternatives…
  • Is p = 0.8 a possibility?
  • Is p = 0.2 a possibility?

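As a minimal sketch, the likelihood values tabulated above for n = 10 and X = 4 can be reproduced by evaluating the binomial likelihood directly. The function name `binomial_likelihood` and the grid of p values are illustrative choices, not part of the lecture.

```python
from math import comb


def binomial_likelihood(p: float, n: int, x: int) -> float:
    """Probability of observing x type-"a" chromosomes among n, given frequency p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Values used in the lecture example: n = 10 chromosomes, X = 4 of type "a".
n, x = 10, 4

# Evaluate the likelihood on the same coarse grid of allele frequencies.
for p in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0):
    likelihood = binomial_likelihood(p, n, x)
    print(f"p = {p:.1f}   1 - p = {1 - p:.1f}   L = {likelihood:.3f}")
```

On this grid the largest value falls at p = 0.4, and the values at p = 0.2 and p = 0.8 show how much less probable the data are under those alternatives.
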
Example: Estimating 4Nµ
• Consider S polymorphisms in sample of n sequences…

$$L(\theta \mid n, S) = P_n(S \mid \theta)$$

• Where P_n is calculated using the Q_n and P_2 functions defined previously

[Figure: Likelihood plot with n = 5, S = 10, MLE marked; x-axis: 4Nµ, y-axis: likelihood]

Maximum Likelihood Estimation
• Two basic steps…
  a) Write down likelihood function: $L(\theta \mid x) \propto f(x \mid \theta)$
  b) Find value of $\hat{\theta}$ that maximizes $L(\theta \mid x)$
• In principle, applicable to any problem where a likelihood function exists

MLEs
• Parameter values that maximize likelihood
  • θ where observations have maximum probability
• Finding MLEs is an optimization problem
• How do MLEs compare to other estimators?

Comparing Estimators
• How do MLEs rate in terms of …
  • Unbiasedness
  • Consistency
  • Efficiency
• For a review, see Garthwaite, Jolliffe, Jones (1995) Statistical Inference, Prentice Hall

Analytical Solutions
• Write out log-likelihood …

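The two basic steps can be sketched numerically for the earlier allele frequency example (n = 10, X = 4). This is an illustration under stated assumptions, not the lecture's own procedure: the helper names `log_likelihood` and `golden_section_max` are hypothetical, golden-section search is simply one convenient optimizer for a single-peaked function, and the log of the likelihood is maximized because it peaks at the same p.

```python
from math import comb, log


def log_likelihood(p: float, n: int, x: int) -> float:
    """Step a: binomial log-likelihood (requires 0 < p < 1)."""
    return log(comb(n, x)) + x * log(p) + (n - x) * log(1 - p)


def golden_section_max(f, lo: float, hi: float, tol: float = 1e-7) -> float:
    """Step b: locate the maximum of a single-peaked function on [lo, hi]."""
    inv_phi = (5**0.5 - 1) / 2  # reciprocal of the golden ratio
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)  # left interior point
        d = a + inv_phi * (b - a)  # right interior point
        if f(c) > f(d):
            b = d  # the peak lies in [a, d]
        else:
            a = c  # the peak lies in [c, b]
    return (a + b) / 2


n, x = 10, 4
# Stay strictly inside (0, 1) so that log(p) and log(1 - p) are defined.
p_hat = golden_section_max(lambda p: log_likelihood(p, n, x), 1e-9, 1 - 1e-9)
print(f"Estimated allele frequency: {p_hat:.4f}")
```

The numerical estimate should agree closely with the sample proportion X / n = 0.4, the value at which the likelihood plot for this example peaks.
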
This note was uploaded on 12/26/2011 for the course BIO 666 taught by Professor Staff during the Fall '06 term at University of Michigan.
