Molecular Ecology and Phylogenetics

Complete list of Terms and Definitions for Molecular Ecology and Phylogenetics

Terms Definitions
Monophyly a group made up of a common ancestor (node) and all of its descendants; a monophyletic group is a true clade, the best kind of grouping
Fst measures variation among local populations and can be used to estimate migration rate. With H = heterozygosity, Fst = (H(total) - H(s)) / H(total); at equilibrium Fst ≈ 1/(1 + 4Nm), where Nm is the number of migrants per generation
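The two Fst formulas can be tried out with a small sketch. The function names and the heterozygosity values below are made up for illustration and are not part of the card set.

```python
def fst_from_heterozygosity(h_total, h_sub):
    """Fst = (H(total) - H(s)) / H(total)."""
    if h_total <= 0:
        raise ValueError("h_total must be positive")
    return (h_total - h_sub) / h_total


def nm_from_fst(fst):
    """Rearrange Fst = 1 / (1 + 4Nm) to solve for Nm."""
    if fst <= 0 or fst > 1:
        raise ValueError("fst must be in (0, 1]")
    return (1.0 / fst - 1.0) / 4.0


# Hypothetical values: total heterozygosity 0.5, mean subpopulation value 0.4.
fst = fst_from_heterozygosity(0.5, 0.4)
nm = nm_from_fst(fst)
print(round(fst, 3), round(nm, 3))
```

Because Nm is obtained by inverting an equilibrium approximation, it is best read as a rough estimate.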
Pyrimidine a nitrogenous base that has a single-ring structure; one of the two general categories of nitrogenous bases found in DNA and RNA; thymine, cytosine, or uracil
why do some phylogeneticists prefer simplest model while others prefer the most complex model Depends on one's own philosophy. Some prefer the simplest model because every model rests on assumptions, none of which is exactly correct, and extra parameters add more of them; JC has the fewest parameters but makes the strongest assumptions (equal base frequencies and no difference between ti and tv). Others prefer a more complex model because it allows for differences in ti and tv and in the number of each base, so it contains all the information needed and can give a general consensus.
+ Match / - mismatch scoring matrix In BLASTN, once a word 'hit' is located, the matching sequence is compared to the query sequence base by base: each nucleotide match earns a reward and each mismatch a penalty; determines H and percent identity
Classical School view that a species is a coherent unit with low genetic variation; it has some problems but holds up for the majority of genes
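A minimal sketch of reward/penalty scoring for an ungapped stretch of alignment. The reward of +1 and penalty of -2 are assumed values chosen for illustration; actual BLASTN settings depend on the program options.

```python
def score_ungapped(query, subject, reward=1, penalty=-2):
    """Return (score, percent identity) for two equal-length, ungapped sequences."""
    if len(query) != len(subject) or not query:
        raise ValueError("sequences must be non-empty and the same length")
    matches = sum(q == s for q, s in zip(query, subject))
    mismatches = len(query) - matches
    score = matches * reward + mismatches * penalty
    identity = 100.0 * matches / len(query)
    return score, identity


# Toy sequences that differ at one position out of ten.
score, identity = score_ungapped("ACGTACGTAC", "ACGTTCGTAC")
print(score, identity)
```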
Genotype an organism's combination of alleles, the genetic makeup of an organism or group of organisms with reference to a single trait, set of traits, or an entire complex of traits.
how genetically similar or different populations and subpopulations are, how variation is partitioned, whether there is significant population genetic structure, and which environmental or geographic factors explain the most among-group variance ecological hypotheses that require an AMOVA to test
E-Value for BLAST the number of matches with a score this good or better that would be expected by chance when searching the database; the cutoff value can be set. Protein searches tend to give higher numbers. Want it to be almost 0, reported in exponent form (E-##)
Why is substitution saturation a problem in phylogenetics The reliability of results from molecular phylogenetics depends on whether some or all sequences in the data set have already lost phylogenetic information. Multiple substitutions at one site hide earlier changes: a site that stayed A looks the same as a site that went A to C and back to A. Saturation is when sequences change so rapidly that these hidden changes cannot be seen, so there is no way to know how much change really happened.
describe three statistics that are used to estimate reliability of a particular node in a phylogenetic tree For MP, ML, and NJ trees: the bootstrap value. Columns are sampled at random from the alignment, with replacement, and a tree is generated from the resampled alignment; this is repeated many times. If a high % of the replicate trees contain a node, the data strongly support it; if the # is low, the node may not exist. For Bayesian trees: the posterior probability (PP), the conditional probability assigned to a node after the relevant evidence is taken into account; a high PP means the node probably exists. Jackknifing: resampling without replacement, using 50% of the data. A newer method in PhyML is the aLRT (approximate likelihood ratio test): it is like ML but is calculated for a specific node in the tree, it is much faster than bootstrapping, and it is still an approximation.
Homoplasy analogous characters shared by the taxa being compared; shared characters that are not homologous and do not reflect descent from a common ancestor; not useful for grouping similar taxa because the similarities are misleading
Phylogram a tree whose branch lengths represent evolutionary distances
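The dependence of the E-value on score and database size can be sketched with the Karlin-Altschul form E = K * m * n * exp(-lambda * S). The values of K and lambda below are placeholders, not the constants BLAST uses for any particular scoring scheme.

```python
import math


def expected_hits(raw_score, query_len, db_len, k=0.5, lam=1.0):
    """E = K * m * n * exp(-lambda * S); k and lam here are placeholder values."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


# Hypothetical search: 100-base query against a 1,000,000-base database.
e_low_score = expected_hits(20, 100, 1_000_000)
e_high_score = expected_hits(40, 100, 1_000_000)
e_bigger_db = expected_hits(40, 100, 2_000_000)
print(e_low_score, e_high_score, e_bigger_db)
```

A higher score gives an E-value closer to 0, while a larger database raises the E-value for the same score.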
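The resampling step behind bootstrap and jackknife values can be sketched as follows. The three-taxon alignment and the function names are hypothetical, and tree building on each replicate is left out.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample columns WITH replacement; the replicate keeps the original length."""
    n_cols = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


def jackknife_alignment(alignment, rng, keep=0.5):
    """Resample columns WITHOUT replacement, keeping a fraction of the columns."""
    n_cols = len(next(iter(alignment.values())))
    cols = sorted(rng.sample(range(n_cols), int(n_cols * keep)))
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


# Toy alignment, invented for illustration.
alignment = {"taxonA": "ACGTACGTAC", "taxonB": "ACGTTCGTAC", "taxonC": "ACGAACGTTC"}
rng = random.Random(0)
replicate = bootstrap_alignment(alignment, rng)
half = jackknife_alignment(alignment, rng)
print(replicate)
print(half)
```

In a full analysis a tree would be built from each replicate, and the support for a node would be the percentage of replicate trees that contain it.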
Purine one of two types of nitrogenous bases found in nucleotides, characterized by a six-membered ring fused to a five-membered ring. Adenine (A) and guanine (G) are purines found in DNA or RNA
Paraphyly includes the common ancestor but only some of its descendants; the condition that a taxon or other group of organisms contains the most recent common ancestor of all members of the group but excludes some descendants of that ancestor; contrasts with monophyly and polyphyly.
describe an objective way to select an appropriate model of DNA sequence evolution Use a program such as FindModel: start with the simplest model (Jukes-Cantor) and keep getting more and more complex, calculate all the criteria for each model, then use that information to figure out which model is the most appropriate, without personal biases
BLAST searches GenBank to find matches to the query sequence. For DNA, it breaks the query into every overlapping 11-base-pair word (a 100-base query gives 90 words), searches all of these against the database, and when it finds a hit it extends the match; when the match stops getting any better it stops and gives the alignment.
If two alignment programs yield different results, what are two methods you could use to reconcile their differences Do many alignment combinations and stick them together; use AltAVisT to see where the alignments disagree and cut out the most extreme places; or calculate a lot of trees for the two different alignments and make a consensus of all the trees
Indel the insertion or deletion of one or more nucleotides due to errors in replication of genetic information
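One criterion commonly used for this kind of comparison is the Akaike information criterion, AIC = 2k - 2 lnL, where k is the number of free parameters. The sketch below uses invented log-likelihoods and counts only substitution-model parameters (branch lengths are assumed to be the same across models).

```python
def aic(log_likelihood, n_params):
    """AIC = 2k - 2 lnL; lower is better."""
    return 2 * n_params - 2 * log_likelihood


# (lnL, free substitution parameters); the lnL values are invented.
candidates = {
    "JC": (-2050.0, 0),
    "K80": (-2010.0, 1),
    "F81": (-2040.0, 3),
    "GTR": (-2005.0, 8),
}
scores = {name: aic(lnl, k) for name, (lnl, k) in candidates.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

With these made-up numbers GTR has the highest likelihood, but its extra parameters are penalized, so a simpler model is preferred.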
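The word-splitting step can be sketched in a few lines. The function name is hypothetical and the query is a made-up 100-base sequence; a sequence of length L gives L - w + 1 overlapping words of size w.

```python
def query_words(seq, w=11):
    """Return every overlapping word of length w from seq."""
    return [seq[i:i + w] for i in range(len(seq) - w + 1)]


query = "ACGT" * 25          # invented 100-base query
seeds = query_words(query)
print(len(seeds), seeds[0], seeds[-1])
```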
.meg MEGA
Balance School high genetic variation and has balanced thought...
AMOVA Analysis of molecular variance; compares the variation between groups with the variation within each group and the total amount of variation. It tests the genetic similarity of populations: how different populations and subpopulations are and whether there is gene flow. Can be done with microsatellites, AFLP, and genotypes
exhaustive; branch and bound search methods that both guarantee finding the globally optimal phylogeny
in what cases would you expect to see a higher chance of substitution saturation? In organisms and individual genes that change very quickly, and within genes at fast-changing positions such as introns and the wobble position of codons in coding sequence. In the extreme case, when sequences have experienced full substitution saturation, the similarity between the sequences depends entirely on the similarity in nucleotide frequencies, which often does not reflect phylogenetic relationships.
how does search criterion differ among parsimony(MP), maximum likelihood(ML) and bayesian(PP) methods Parsimony searches for the simplest tree, the one with the fewest changes (steps); every change counts as one step. ML looks at probability instead of steps: given a tree and a model, what is the probability of getting the data? Each tree is scored separately and the tree with the best score is picked. Bayesian asks the reverse: you have the data, so what is the most probable tree? It produces a lot of trees under your model, and the summary statistics are taken from the consensus of all the trees. The last two are statistical methods; parsimony is just counting changes.
most common models of DNA sequence evolution and how do the common models differ in their parameters and assumptions JC (Jukes-Cantor): equal base frequencies (EB) and only one type of substitution; F81: unequal base frequencies (UB) and equal substitution rates (ES); K80: equal base frequencies with separate ti and tv rates; TN93: unequal base frequencies with separate rates for pyrimidine (Y) transitions, purine (R) transitions, and tv; GTR: unequal base frequencies and 6 substitution types; F86: 3 different tv or substitutions, weighting them differently
Natural Selection process by which individuals or alleles that are better suited to their environment survive and reproduce most successfully under positive selective pressure; also called survival of the fittest. An allele with a negative impact on fitness is less likely to survive
Apomorphic derived (descendant) character
what does the I and G parameters of evolutionary models describe G = gamma: variation of mutation rate among sites along the sequence, with the gamma distribution describing how the different sites are spread among the different rates. I = invariant: the proportion of sites that never change
Symplesiomorphy shared ancestral character; found in all descendants of an ancestor, so it is not useful for grouping similar taxa
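The parsimony criterion (counting steps) can be illustrated with Fitch's algorithm for a single site on a rooted binary tree. This is one common way to count steps, not necessarily the one used by a given program; the trees and character states below are invented.

```python
def fitch(tree, states):
    """Return (possible ancestral states, minimum number of changes) for one site.

    tree is a nested tuple of taxon names, e.g. (("A", "B"), ("C", "D")).
    states maps each taxon name to its nucleotide at this site.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_steps = fitch(left, states)
    right_set, right_steps = fitch(right, states)
    shared = left_set & right_set
    if shared:
        return shared, left_steps + right_steps
    # No state in common: one change is required at this node.
    return left_set | right_set, left_steps + right_steps + 1


site = {"A": "G", "B": "G", "C": "T", "D": "T"}
_, steps = fitch((("A", "B"), ("C", "D")), site)
_, alt_steps = fitch((("A", "C"), ("B", "D")), site)
print(steps, alt_steps)
```

For this site the first tree needs fewer steps, so parsimony prefers it; ML and Bayesian methods would instead score each tree with probabilities under a substitution model.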
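How the model assumptions change a distance estimate can be seen by comparing the standard JC and K80 distance corrections. Here p is the observed proportion of differing sites, and P and Q are the observed proportions of transitions and transversions; the example values are invented.

```python
import math


def jc69_distance(p):
    """Jukes-Cantor: d = -(3/4) * ln(1 - 4p/3)."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def k80_distance(P, Q):
    """Kimura 2-parameter: d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q)."""
    return -0.5 * math.log(1.0 - 2.0 * P - Q) - 0.25 * math.log(1.0 - 2.0 * Q)


# Same total difference (p = 0.30) with two different ti/tv splits.
d_jc = jc69_distance(0.30)
d_k80_even = k80_distance(0.10, 0.20)   # transitions are one third of the changes
d_k80_ti = k80_distance(0.25, 0.05)     # mostly transitions
print(round(d_jc, 4), round(d_k80_even, 4), round(d_k80_ti, 4))
```

When transitions make up one third of the differences the two corrections agree; when transitions dominate, K80 gives a larger corrected distance than JC.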
.fasta almost all programs can use FASTA
homology/ similarity applying to DNA sequence data Homology: characters (or sequence positions) are derived by descent from the same ancestral character, the way homologous limbs are; they either are derived from the same thing or are not. Similarity: a calculated measure of how alike two sequences are; it can be calculated for any 2 sequences, even random ones
Transition a nucleotide substitution that exchanges a purine for a purine or a pyrimidine for a pyrimidine. (A <->G)(C<->T)
Plesiomorphic ancestral character
.nex MrBayes, PAUP, MacClade
polytomy a node in a phylogenetic tree where more than two lineages branch from the same point (a multifurcation). Soft polytomy: there is not enough data to resolve the branching order. Hard polytomy: several evolutionary events happened at essentially the same time, without the genes evolving in between, so the intermediate step is effectively skipped.
Phenotype what an organism looks like as a consequence of its genotype, describing the morphology, biochemistry, performance, and behavior
Synapomorphy shared derived character; not all descendants of the deeper ancestor have it, so it can be used to group similar taxa
Gap opening and gap extensions penalties used in alignment algorithms Penalties that MUSCLE and many other alignment algorithms charge for opening a gap and for extending it. The aim of alignment is to find positional homology, that is, what truly lines up. If the algorithm could put in as many gaps as it liked, the alignment would get longer and look perfect only because of the gaps. The penalties limit gaps so that the alignment reflects the positional homologies and what should truly be there
Orthologous homologues in different species that coalesce to a common ancestral gene without gene duplication or horizontal transmission; after the species split they evolved independently
Neutralist vs. Selectionist Debate debate over what explains most genetic variation and which variants get fixed. Neutralist: most variation comes from neutral processes, such as random mutations that have some chance of getting fixed. Selectionist: there is something (selection) acting on the variation of the genes.
Algorithmic vs Tree-searching to construct phylogenetic trees differences Algorithmic methods result in one tree. Tree-searching methods produce a lot of trees that are judged by a chosen criterion; you can pick one or build a consensus, so you are not biased toward a single algorithm.
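A small sketch of how separate opening and extension penalties discourage scattering gaps. It assumes one common convention (opening cost for the first gap position, extension cost for each additional position) and made-up penalty values; programs differ in the exact convention.

```python
def gap_runs(aligned):
    """Return the lengths of each run of consecutive '-' characters."""
    runs, current = [], 0
    for ch in aligned:
        if ch == "-":
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def total_gap_penalty(aligned, gap_open=10, gap_extend=1):
    """Affine penalty: gap_open for the first position, gap_extend for each further one."""
    return sum(gap_open + gap_extend * (n - 1) for n in gap_runs(aligned))


one_long = total_gap_penalty("ACG---TAC")      # a single gap of length 3
three_short = total_gap_penalty("A-CG-T-AC")   # three separate gaps of length 1
print(one_long, three_short)
```

Both rows contain three gap characters, but the scattered version costs far more, which pushes the aligner toward fewer, longer gaps.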
Genetic Drift the random fluctuation in allele frequency due to random sampling of gametes from generation to generation in a population
Homologous similarity due to shared ancestry
NCBI genbank search for DNA sequences; covers protein, nucleotide, accession numbers, taxonomy, Entrez, BLAST, PopSet
Paralogous homologous genes that diverged after a duplication event within a species, so they are found in more than one copy in the same genome, usually in the same organism
isolation-by-distance genetic difference increases with geographic distance; describes the tendency of individuals to find mates from nearby populations rather than distant populations. As a result of this tendency, populations that live near each other are genetically more similar than populations that live further apart. Isolation by distance results in clinal distribution of traits across a geographic region.
.phy PhyML and PHYLIP
Autapomorphy derived character unique to one taxon; useful for identifying a taxon but not useful for determining relationships
majority-rule consensus tree summarizes a set of trees by keeping only the nodes found in more than half of them (for example, a node present in 51% of the trees is kept) and weeding out the rest; often used on bootstrap trees, and the cutoff % can be set higher to be stricter.
Cladogram a tree that shows only ancestor-descendant relationships (branching order); branch lengths carry no meaning
what are the differences between the exhaustive, branch-and-bound, and heuristics searches Exhaustive: looks at every possible tree, so it gives the best tree but is slow. Branch and bound: builds a tree by adding one branch at a time and asks whether the score is improving or getting worse; once a partial tree is worse than the best tree found so far, adding branches will only keep making it worse, so that path is dropped. Weeding out the bad ones speeds up the search, but it is still slow because there is a lot of weeding out to do. Heuristic: rearranges trees using nearest neighbor interchange and tree bisection and reconnection; includes all algorithmic searches; it is not guaranteed to find the best tree, so searches are started at different places, as bootstrapping does for its different searches
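The counting behind a majority-rule consensus can be sketched by representing each tree as a set of clades (each clade a set of taxon names). The four trees below are invented, and assembling the kept clades back into a drawn tree is left out.

```python
from collections import Counter


def majority_rule_clades(trees, threshold=0.5):
    """Return {clade: frequency} for clades found in more than `threshold` of the trees."""
    counts = Counter(clade for tree in trees for clade in tree)
    n = len(trees)
    return {clade: c / n for clade, c in counts.items() if c / n > threshold}


trees = [
    {frozenset("AB"), frozenset("CD")},
    {frozenset("AB"), frozenset("CD")},
    {frozenset("AB"), frozenset("CE")},
    {frozenset("AC"), frozenset("DE")},
]
support = majority_rule_clades(trees)
print(support)
```

The clade A+B appears in 3 of 4 trees and is kept; C+D appears in exactly half, which is not a majority, so it is dropped. Raising `threshold` makes the consensus stricter.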
Transversions a nucleotide substitution that exchanges a purine for a pyrimidine or a pyrimidine for a purine (A<->C, A<->T,G<->C, G<->T)
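The transition and transversion definitions translate directly into a small classifier. The function names are hypothetical, and the sketch assumes DNA sequences of equal length, skipping any position that is not A, C, G, or T.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(a, b):
    """Classify a substitution between two DNA bases."""
    if a == b:
        return "identical"
    pair = {a, b}
    if pair.issubset(PURINES) or pair.issubset(PYRIMIDINES):
        return "transition"
    return "transversion"


def count_ti_tv(seq1, seq2):
    """Count transitions and transversions between two aligned sequences."""
    valid = PURINES | PYRIMIDINES
    ti = tv = 0
    for a, b in zip(seq1, seq2):
        if a not in valid or b not in valid:
            continue  # skip gaps and ambiguity codes
        kind = classify(a, b)
        ti += kind == "transition"
        tv += kind == "transversion"
    return ti, tv


print(classify("A", "G"), classify("A", "C"), count_ti_tv("ACGT", "GCTT"))
```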
Phenetic vs. Cladistic Phenetic: looks at organisms and groups them by how many features are shared among them. Cladistic: looks at evolutionary connections and groups things based on descent.
Why is the Q matrix of the GTR model symmetrical The 4x4 matrix has 12 off-diagonal rates, but only 6 are counted because each pair of bases has the same exchange rate in both directions (before weighting by base frequencies). This symmetry makes the model time reversible, so you don't have to assign polarity across the tree, that is, you don't assign ancestors and descendants before you start
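A sketch of the reversibility property, assuming the usual GTR construction in which the rate from base i to base j is a symmetric exchangeability times the frequency of j. The base frequencies and the 6 exchangeabilities below are invented.

```python
BASES = "ACGT"


def gtr_q(freqs, exch):
    """Build Q with q[i][j] = exch[{i, j}] * freqs[j]; each row sums to zero."""
    q = {}
    for i in BASES:
        row = {j: exch[frozenset((i, j))] * freqs[j] for j in BASES if j != i}
        row[i] = -sum(row.values())
        q[i] = row
    return q


freqs = {"A": 0.1, "C": 0.2, "G": 0.3, "T": 0.4}
exch = {  # one value per unordered pair of bases: 6 in total
    frozenset("AC"): 1.0, frozenset("AG"): 4.0, frozenset("AT"): 0.5,
    frozenset("CG"): 0.8, frozenset("CT"): 5.0, frozenset("GT"): 1.0,
}
q = gtr_q(freqs, exch)

# Time reversibility: freq(i) * q[i][j] equals freq(j) * q[j][i] for every pair.
reversible = all(
    abs(freqs[i] * q[i][j] - freqs[j] * q[j][i]) < 1e-12
    for i in BASES for j in BASES if i != j
)
print(reversible)
```

Because the flow between each pair of bases is equal in both directions, the likelihood does not depend on which end of a branch is treated as the ancestor.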