Integrative Biology 200A
University of California, Berkeley
Principles of Phylogenetics
Lean on me: Finding Support for your Trees using
Bootstrap, Jackknife, and Bremer Support
So you ran a heuristic search and you got a tree. Now what? How do you tell how well (or poorly)
supported the tree you’ve come up with is? Well, of course the truth is that for most cases in
phylogenetics, it is impossible to know how closely your tree matches evolutionary history.
Nonetheless, there are various ways to get a sense of how robust your data is – that is, is
that final tree just a fluke? Or, given the data that you have, are very few other trees possible?
The goals for this lab are for you to use a test tree to perform each kind of support analysis,
understand how they work, and be able to use whichever ones you choose in your own final project.
Today, we’ll go ahead and use parsimony for all our searches, but you can also use the support
measures with distance (which will be very quick, and is sometimes used only in bootstraps for this
reason) or likelihood (which may take a long time).
Build a test tree
If you’d like to use your own data, please do. Otherwise, you can use the primate data that we used
last time for MrBayes. It’s in the MrBayes file in program files and also available on the syllabus page
of the class website.
Place the file you’d like to work with in a new folder on the desktop.
Execute the file in PAUP.
If you got the same results as me, the search found two trees. You can view a consensus tree using
If you want to save the consensus tree to a file, use
contree all /treefile=
Estimating support by bootstrapping
Ok, now let’s figure out how well supported these groupings are. One measure of support is called
the bootstrap.
Here’s how it works: It chooses columns (characters) randomly from the matrix until it has
chosen the same number of columns as were in the original matrix. Because it returns to the
original matrix each time it chooses a new column, some characters may be represented several
times in the bootstrap matrix, while others are omitted. This is known as resampling the data with
replacement. In practice, although it is possible to resample taxa, bootstrapping almost always
resamples characters. Bootstrapping calculates a support value for each node based on the fraction
of bootstrap samples whose trees contain that node.
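
If it helps to see the resampling written out, here is a minimal Python sketch of the two steps described above: drawing columns with replacement, and turning a set of replicate trees into percent support. This is not what PAUP does internally; the function names (bootstrap_matrix, support_values), the three-taxon matrix, and the four "replicate trees" are all made up for illustration, and each tree is represented simply as a set of clades rather than being found by a real search.

```python
import random


def bootstrap_matrix(matrix, rng):
    """Resample the columns (characters) of a matrix with replacement.

    matrix maps taxon name -> string of character states (all the same length).
    Returns the resampled matrix and the list of column indices that were drawn.
    """
    n_chars = len(next(iter(matrix.values())))
    # Draw n_chars column indices; the same column may be drawn more than once,
    # and some columns may not be drawn at all.
    cols = [rng.randrange(n_chars) for _ in range(n_chars)]
    resampled = {taxon: "".join(seq[c] for c in cols)
                 for taxon, seq in matrix.items()}
    return resampled, cols


def support_values(replicate_clades, clades_of_interest):
    """Percent of replicates whose tree contains each clade of interest."""
    n_reps = len(replicate_clades)
    return {clade: 100.0 * sum(clade in rep for rep in replicate_clades) / n_reps
            for clade in clades_of_interest}


# Hypothetical toy matrix: 3 taxa, 6 characters.
matrix = {"human": "ACGTAC", "chimp": "ACGTTC", "lemur": "ATGAAC"}
rng = random.Random(0)  # fixed seed so the draw is repeatable
resampled, cols = bootstrap_matrix(matrix, rng)

# Hypothetical result of searching 4 bootstrap matrices: each replicate tree
# is written as the set of clades it contains.
hc = frozenset({"human", "chimp"})
cl = frozenset({"chimp", "lemur"})
replicates = [{hc}, {hc}, {cl}, {hc}]
support = support_values(replicates, [hc, cl])
print(support[hc], support[cl])  # 75.0 25.0
```

In a real analysis each resampled matrix would go through its own tree search, and the clades in each resulting tree would be tallied the same way.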
The highest support value is 100, while values below 70 are