
Likelihood ClassAnnotations March10.pdf - EEOB 6330 –...

This preview shows pages 1-8 out of 43 pages.

EEOB 6330 – Spring 2020
Week 7 - March 3, 2020
Topic: Likelihood

History of maximum likelihood in phylogenetics
- Felsenstein (1981) is often cited as the first maximum likelihood method for inferring phylogenies
- But...
- Edwards and Cavalli-Sforza (1964) and Cavalli-Sforza and Edwards (1967) provide important contributions to viewing estimation of phylogeny as a problem in statistical inference
- Edwards (1970) - maximum likelihood inference using gene frequency data
- Neyman (1971) - maximum likelihood for sequence data under a simple substitution model for three taxa
- Felsenstein (1973) - maximum likelihood for discrete character data

Model-based phylogenetic inference
- Motivation: Consider the evolutionary process along a phylogeny, and develop a probabilistic model for the data observed at the tips of the tree
- Begin by considering DNA sequence data, but models can be used for trait data as well
- The branch lengths will now have meaning for us: they can be interpreted as proportional to the amount of evolutionary change
- Start by considering the mutation process on one branch: model the probability that one base changes to another over a time unit of length $t$. What are some important properties of this probability?

Annotations:
- Want: $P_{AC}(t)$ = probability that A changes to C over a branch of length $t$
- $P_{AC}(0) = 0$, $P_{AA}(0) = 1$
- Proportional to the length $t$: as $t$ increases, $P_{AC}(t)$ increases

Markov models of substitution
- The most common models are:
  - finite-state
  - continuous-time
  - Markov models
- We also commonly assume homogeneity across branches
- We define these models by specifying an instantaneous rate matrix $Q$

Annotations:
- Finite-state:
  - nucleotide sequences: 4 states = A, C, G, T
  - codons: 64 states or 61 states
  - amino acids: 20 states
  - morphological characters: presence/absence (0/1)
- Continuous-time: for nucleotides, $t$ is measured in expected number of substitutions per site
- Markov (first-order): mutation probabilities depend only on the current state, not the past history
- $Q$ describes change in the next instant of time
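
To make "the next instant of time" concrete, the following is the standard relation for continuous-time Markov chains. The entry notation $q_{ij}$ and the matrix exponential are textbook conventions assumed here; they are not shown in the previewed slides.

\[
P_{ij}(\Delta t) \approx q_{ij}\,\Delta t \quad (i \neq j), \qquad
P_{ii}(\Delta t) \approx 1 + q_{ii}\,\Delta t, \qquad
P(t) = e^{Qt}.
\]

Under this reading, $P(0) = I$, which agrees with the properties $P_{AC}(0) = 0$ and $P_{AA}(0) = 1$.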
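
Markov models of substitution
- For nucleotide sequence data, the most general rate matrix is given by (rows and columns ordered A, C, G, T):

\[
Q = \begin{pmatrix}
- & \mu a \pi_C & \mu b \pi_G & \mu c \pi_T \\
\mu g \pi_A & - & \mu d \pi_G & \mu e \pi_T \\
\mu h \pi_A & \mu j \pi_C & - & \mu f \pi_T \\
\mu i \pi_A & \mu k \pi_C & \mu l \pi_G & -
\end{pmatrix}
\]

where
- $\mu$ = mean rate of substitution
- $\pi_A, \pi_C, \pi_G, \pi_T$ = base frequency parameters
- $a, \ldots, l$ are rates of specific types of changes

Annotations:
- Diagonal elements are chosen so that each row sums to zero
- $\pi Q = 0$
- Stationary distribution: $\pi_A$ = probability of A (e.g.) at a site if $t$ is ...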
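
The construction of the general rate matrix can be sketched in a few lines of Python. This is a minimal illustration, not code from the course: the function name `general_rate_matrix`, the dictionary layout, and all numeric values are assumptions chosen only to show how the off-diagonal entries (mean rate times specific rate times target base frequency) and the diagonal (negative row sum) fit together.

```python
STATES = ["A", "C", "G", "T"]

def general_rate_matrix(mu, rates, pi):
    """Build a 4x4 rate matrix (hypothetical helper).

    mu    : mean rate of substitution
    rates : dict mapping (from_base, to_base) -> rate of that specific change
    pi    : dict mapping base -> base frequency parameter
    """
    n = len(STATES)
    Q = [[0.0] * n for _ in range(n)]
    for i, x in enumerate(STATES):
        for j, y in enumerate(STATES):
            if i != j:
                # Off-diagonal entry: mu * (rate of x -> y) * frequency of y
                Q[i][j] = mu * rates[(x, y)] * pi[y]
        # Diagonal entry makes the row sum to zero (it is still 0.0 here)
        Q[i][i] = -sum(Q[i])
    return Q

# Illustrative values only: twelve arbitrary rates and arbitrary frequencies.
rates = {
    ("A", "C"): 1.0, ("A", "G"): 2.0, ("A", "T"): 0.5,
    ("C", "A"): 0.8, ("C", "G"): 1.2, ("C", "T"): 2.5,
    ("G", "A"): 1.5, ("G", "C"): 0.7, ("G", "T"): 1.1,
    ("T", "A"): 0.4, ("T", "C"): 2.2, ("T", "G"): 0.9,
}
pi = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}

Q = general_rate_matrix(1.0, rates, pi)
row_sums = [sum(row) for row in Q]  # each should be numerically zero
```

With twelve free rates this matrix is not time-reversible in general, so the supplied frequencies need not be its stationary distribution.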
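
Markov models of substitution
- Usually assume that the substitution process is time-reversible
- Overall rate of change from base $i$ to $j$ is the same as from base $j$ to $i$
- Mathematically: $\pi_i P_{ij}(t) = \pi_j P_{ji}(t)$
- This forces some constraints on $Q$ (rows and columns ordered A, C, G, T):

\[
Q = \begin{pmatrix}
- & \mu a \pi_C & \mu b \pi_G & \mu c \pi_T \\
\mu a \pi_A & - & \mu d \pi_G & \mu e \pi_T \\
\mu b \pi_A & \mu d \pi_C & - & \mu f \pi_T \\
\mu c \pi_A & \mu e \pi_C & \mu f \pi_G & -
\end{pmatrix}
\]

- This is called the General Time Reversible (GTR) model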
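
The reversibility constraint can be checked numerically. The sketch below assumes a hypothetical helper `gtr_rate_matrix` that takes six symmetric rates keyed by unordered base pair ("AC" for $a$, "AG" for $b$, "AT" for $c$, "CG" for $d$, "CT" for $e$, "GT" for $f$); the numeric values are illustrative only. It tests the rate-level form of the condition, $\pi_i q_{ij} = \pi_j q_{ji}$, and the stationarity condition $\pi Q = 0$.

```python
STATES = ["A", "C", "G", "T"]

def gtr_rate_matrix(mu, exch, pi):
    """Build a GTR rate matrix (hypothetical helper).

    mu   : mean rate of substitution
    exch : dict mapping an unordered base pair such as "AC" -> symmetric rate
    pi   : list of base frequencies in the order A, C, G, T
    """
    n = len(STATES)
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                # Same symmetric rate for i -> j and j -> i
                key = "".join(sorted(STATES[i] + STATES[j]))
                Q[i][j] = mu * exch[key] * pi[j]
        Q[i][i] = -sum(Q[i])  # diagonal makes the row sum to zero
    return Q

# Illustrative values only.
pi = [0.1, 0.2, 0.3, 0.4]
exch = {"AC": 1.0, "AG": 4.0, "AT": 0.5, "CG": 0.7, "CT": 3.5, "GT": 1.0}
Q = gtr_rate_matrix(1.0, exch, pi)

# Detailed balance: pi_i * q_ij equals pi_j * q_ji for every pair of bases.
balanced = all(
    abs(pi[i] * Q[i][j] - pi[j] * Q[j][i]) < 1e-12
    for i in range(4) for j in range(4)
)

# Stationarity: each component of the row vector pi * Q should be zero.
pi_Q = [sum(pi[i] * Q[i][j] for i in range(4)) for j in range(4)]
```

Sharing one rate per unordered pair is what reduces the twelve rates of the general matrix to six.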
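
Jukes-Cantor Model (JC69)
- Simplest sub-model of GTR
- Assumes that
  - all nucleotides are equally frequent
  - all mutations happen at the same rate

\[
Q = \begin{pmatrix}
-\tfrac{3}{4}\mu & \tfrac{1}{4}\mu & \tfrac{1}{4}\mu & \tfrac{1}{4}\mu \\
\tfrac{1}{4}\mu & -\tfrac{3}{4}\mu & \tfrac{1}{4}\mu & \tfrac{1}{4}\mu \\
\tfrac{1}{4}\mu & \tfrac{1}{4}\mu & -\tfrac{3}{4}\mu & \tfrac{1}{4}\mu \\
\tfrac{1}{4}\mu & \tfrac{1}{4}\mu & \tfrac{1}{4}\mu & -\tfrac{3}{4}\mu
\end{pmatrix}
\]

Annotations:
- $\pi_A = \pi_C = \pi_G = \pi_T = \tfrac{1}{4}$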
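
For this rate matrix (off-diagonal entries $\tfrac{1}{4}\mu$, diagonal entries $-\tfrac{3}{4}\mu$), the transition probabilities have a well-known closed form: $P_{ii}(t) = \tfrac{1}{4} + \tfrac{3}{4}e^{-\mu t}$ and $P_{ij}(t) = \tfrac{1}{4} - \tfrac{1}{4}e^{-\mu t}$ for $i \neq j$. The sketch below is not taken from the previewed slides. It assumes the relation $P(t) = e^{Qt}$ and compares the closed form against a truncated Taylor series of the matrix exponential, which is used here only because it needs no external libraries. All function names are hypothetical.

```python
import math

def jc69_rate_matrix(mu):
    """JC69 rate matrix: -3/4 mu on the diagonal, 1/4 mu elsewhere."""
    return [[-0.75 * mu if i == j else 0.25 * mu for j in range(4)]
            for i in range(4)]

def jc69_transition_prob(same, t, mu=1.0):
    """Closed-form JC69 probability of ending in the same or a different base."""
    decay = math.exp(-mu * t)
    return 0.25 + 0.75 * decay if same else 0.25 - 0.25 * decay

def mat_mul(X, Y):
    """Plain-Python product of two square matrices."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm_taylor(Q, t, terms=40):
    """Approximate exp(Q t) by summing the first `terms` Taylor terms."""
    n = len(Q)
    Qt = [[q * t for q in row] for row in Q]
    P = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in P]  # current term (Qt)^k / k!, starting at k = 0
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, Qt)]
        P = [[P[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return P

t, mu = 0.5, 1.0
P_num = expm_taylor(jc69_rate_matrix(mu), t)

# Largest disagreement between the series and the closed form.
max_err = max(
    abs(P_num[i][j] - jc69_transition_prob(i == j, t, mu))
    for i in range(4) for j in range(4)
)
```

At $t = 0$ the closed form gives probability 1 of staying in the same base and 0 of changing, and as $t$ grows every entry approaches $\tfrac{1}{4}$, the JC69 base frequency. In practice a library matrix-exponential routine would replace the Taylor series.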
