MIT6_047f08_rec06


MIT OpenCourseWare
http://ocw.mit.edu

6.047 / 6.878 Computational Biology: Genomes, Networks, Evolution
Fall 2008

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

6.047/6.878 Fall 2008 - Recitation 6 - Maximum Likelihood
October 10, 2008

1 Model of evolution

Our goal in developing this maximum likelihood approach is to find the best (most likely) tree that explains our sequence data. To represent a tree, let the leaves of the tree be numbered $1, \ldots, n$ and the ancestral nodes be numbered $n+1, \ldots, 2n-1$. Let each branch of the tree be numbered by the most recent of the two nodes it touches (e.g. branch $i$ connects node $i$ and $\mathrm{parent}(i)$). For a tree, we have its topology $T$ and the branch times $t_1, \ldots, t_{2n-2}$, where $t_i$ is the time between nodes $i$ and $\mathrm{parent}(i)$.

Our sequence data can be represented as a matrix $x$ ($n$ rows, $m$ columns), such that $x_{i,j}$ is the $j$th character of the $i$th sequence. We will be given sequence data for the extant (modern) sequences $x_1, \ldots, x_n$, and will have to integrate over the ancestral sequences $x_{n+1}, \ldots, x_{2n-1}$. Each sequence has length $m$.

With these definitions, our goal in the maximum likelihood method is to solve the following equation:

$$\arg\max_{T,\, t} \; P(x_1, \ldots, x_n \mid T, t).$$

1.1 Defining the distributions: factoring by branches

Before we can tackle the equation above, we must first define the distributions of our variables. We will make several assumptions about the process of sequence evolution in order to make the math and the algorithm tractable. First, note that the distribution above is a marginal of the joint distribution over all sequences:

$$P(x_1, \ldots, x_n \mid T, t) = \sum_{x_{n+1}, \ldots, x_{2n-1}} P(x_1, \ldots, x_{2n-1} \mid T, t).$$

The first assumption we will make is that...
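To make the notation and the marginalization over ancestral sequences concrete, here is a minimal Python sketch for a tree with n = 3 leaves. It is only an illustration under stated assumptions: the joint distribution used here (a uniform root, independent sites, independent branches, and Jukes-Cantor substitution probabilities) is a stand-in chosen so the code runs, and is not necessarily the set of assumptions these notes go on to make. The names `jc69_prob`, `joint_prob`, and `marginal_likelihood` are hypothetical.

```python
from itertools import product
import math

ALPHABET = "ACGT"


def jc69_prob(a, b, t):
    """P(child char b | parent char a, branch time t) under Jukes-Cantor.

    Assumed model for illustration only; it is not taken from the notes.
    """
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if a == b else 0.25 - 0.25 * e


def joint_prob(seqs, parent, times, root):
    """Toy joint P(x_1, ..., x_{2n-1} | T, t).

    Assumes a uniform root distribution and that sites and branches
    evolve independently (assumptions of this sketch).
    """
    m = len(seqs[root])
    p = 0.25 ** m
    for i, par in parent.items():  # branch i connects node i and parent(i)
        for j in range(m):
            p *= jc69_prob(seqs[par][j], seqs[i][j], times[i])
    return p


def marginal_likelihood(leaves, parent, times, root):
    """Sum the joint over every assignment of the ancestral sequences."""
    n = len(leaves)
    m = len(leaves[1])
    ancestors = range(n + 1, 2 * n)  # nodes n+1, ..., 2n-1
    all_seqs = ["".join(s) for s in product(ALPHABET, repeat=m)]
    total = 0.0
    for assignment in product(all_seqs, repeat=n - 1):
        seqs = dict(leaves)
        seqs.update(zip(ancestors, assignment))
        total += joint_prob(seqs, parent, times, root)
    return total


# Topology T as a parent map: leaves 1..3, ancestral nodes 4 and 5 (5 is the root).
parent = {1: 4, 2: 4, 3: 5, 4: 5}
# Branch times t_1, ..., t_{2n-2}, indexed by the child node of each branch.
times = {1: 0.1, 2: 0.1, 3: 0.3, 4: 0.2}
# Extant sequences x_1, ..., x_n, each of length m = 2.
leaves = {1: "AC", 2: "AC", 3: "AT"}

likelihood = marginal_likelihood(leaves, parent, times, root=5)
print(likelihood)
```

The brute-force sum visits 4^(m(n-1)) ancestral assignments, so it is only usable on tiny examples; it serves to show what the marginal means, not how to compute it efficiently.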