Math 3380
Homework #2
Due Oct 11
Problem 1 The sequence x = ABAAABAB
was generated using a hidden Markov model with two states, 0 and 1, a two-letter alphabet {A, B},
and the following transition and emission probabilities:
Transition probabilities p (rows: current state, columns: next state):
       0     1
  0   0.8   0.2
  1   0.1   0.9
Emission probabilities e(b):
       A
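A minimal sketch of how the probability of a sequence such as x could be computed under a two-state model with the forward algorithm. The transition matrix is the one stated in the problem. The emission table is cut off in this excerpt, so the values in `EMIT` and the uniform initial distribution `INIT` are placeholders chosen only for illustration, not the values from the problem.

```python
# Transition probabilities from the problem: TRANS[k][l] = p(k -> l).
TRANS = [[0.8, 0.2],
         [0.1, 0.9]]

# ASSUMPTION: placeholder emission probabilities e_k(b); the real table is not shown.
EMIT = [{"A": 0.5, "B": 0.5},
        {"A": 0.9, "B": 0.1}]

# ASSUMPTION: uniform initial state distribution (not given in the excerpt).
INIT = [0.5, 0.5]


def forward_probability(x, trans=TRANS, emit=EMIT, init=INIT):
    """Return P(x), summed over all hidden state paths (forward algorithm)."""
    n_states = len(trans)
    # f[k] = P(x_1..x_i, state_i = k), starting with i = 1.
    f = [init[k] * emit[k][x[0]] for k in range(n_states)]
    for symbol in x[1:]:
        f = [emit[l][symbol] * sum(f[k] * trans[k][l] for k in range(n_states))
             for l in range(n_states)]
    return sum(f)


print(forward_probability("ABAAABAB"))
```

With the actual emission table substituted for `EMIT`, the same function gives the likelihood of the observed sequence; replacing the sum over `k` with a maximum turns it into the Viterbi recursion for the most probable state path.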
Molecular Modeling
Idea
Represent molecules by mechanical models: atoms = particles, bonds = springs
Represent interatomic forces as nonlinear potentials
Use Newton's equations to describe the dynamics of the molecule as the motion of its atoms
Interpret dyn
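The particles-and-springs idea can be illustrated with a small sketch: two atoms on a line joined by a single harmonic bond, moved according to Newton's equations. The velocity Verlet integrator, the function name `simulate_bond`, and all numerical values (rest length, spring constant, mass, time step) are assumptions made for this example, not parameters from the notes.

```python
def simulate_bond(r0=1.0, k=100.0, mass=1.0, stretch=0.1, dt=0.001, steps=1000):
    """Integrate a 1D two-atom 'molecule' with one spring bond (velocity Verlet).

    All parameters are hypothetical, in arbitrary units.
    Returns the final positions and the total energy recorded at every step.
    """
    x = [0.0, r0 + stretch]   # start with the bond stretched
    v = [0.0, 0.0]            # atoms initially at rest

    def forces(pos):
        ext = (pos[1] - pos[0]) - r0      # bond extension
        f = k * ext                       # spring pulls atom 0 toward atom 1
        return [f, -f]

    def energy(pos, vel):
        ext = (pos[1] - pos[0]) - r0
        potential = 0.5 * k * ext ** 2
        kinetic = 0.5 * mass * (vel[0] ** 2 + vel[1] ** 2)
        return potential + kinetic

    f = forces(x)
    energies = [energy(x, v)]
    for _ in range(steps):
        v = [v[i] + 0.5 * dt * f[i] / mass for i in range(2)]   # half kick
        x = [x[i] + dt * v[i] for i in range(2)]                # drift
        f = forces(x)
        v = [v[i] + 0.5 * dt * f[i] / mass for i in range(2)]   # half kick
        energies.append(energy(x, v))
    return x, energies
```

Tracking the total energy is a simple sanity check: for a small enough time step it should stay close to its initial value throughout the run.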
Math 3380
Homework #1
Due Sep 20
1) What is the probability that an 800-base-long DNA segment in a random iid DNA sequence
contains no stop codons?
(Hint: There is an easy approximate answer and a harder recursive formula for the
probability.)
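One possible reading of the hint, sketched in code. The assumptions here are not stated in the problem: the four bases are taken to be equally likely, the stop codons are TAA, TAG, and TGA, and "without stop codons" is interpreted as "no three consecutive bases on this strand form a stop codon". The approximate function treats all codon windows as independent; the recursive one tracks the last two bases so that overlapping windows are handled exactly. The function names are hypothetical.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}
BASES = "ACGT"


def approx_no_stop(n_bases):
    """Approximation: treat each of the n-2 overlapping windows as independent."""
    return (61 / 64) ** max(n_bases - 2, 0)


def single_frame_no_stop(n_bases):
    """Alternative reading: only the codons of one fixed reading frame count."""
    return (61 / 64) ** (n_bases // 3)


def exact_no_stop(n_bases):
    """Recursion over the last two bases: no window of 3 bases is a stop codon."""
    if n_bases < 3:
        return 1.0
    # prob[(a, b)] = P(sequence so far has no stop codon and ends in bases a, b)
    prob = {pair: 1 / 16 for pair in product(BASES, repeat=2)}
    for _ in range(n_bases - 2):
        new = {pair: 0.0 for pair in prob}
        for (a, b), p in prob.items():
            for c in BASES:
                if a + b + c not in STOP_CODONS:
                    new[(b, c)] += p / 4
        prob = new
    return sum(prob.values())


print(approx_no_stop(800), exact_no_stop(800), single_frame_no_stop(800))
```

Unequal base frequencies would only change the weights `1 / 16` and `p / 4` in the recursion; the structure stays the same.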
2)
Significance of score
Found optimal alignment with score S
Is the alignment due to real sequence similarity or could it arise by chance?
Problem: Determine whether the score S is significantly higher than the maximal score obtained
by optimal alignment of
Pairwise alignment
Main goal:
to identify genes and regulatory elements in genomic sequences by comparing to
known genes in related organisms
Evolution:
introduces incremental changes in the genetic code; these changes lead to the formation
of new species
Substitu
Multiple protein alignment
Homologous residues among sets of sequences are aligned in columns
Homologous: more than 30% identical
Ideally, each column represents one 3D position in the structure.
Assumption: Sequences are evolutionarily related
Standard res
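The 30% identity criterion can be checked for any pair of rows taken from an alignment. The sketch below assumes one particular convention, which the notes do not specify: identity is counted over columns where neither sequence has a gap, and gaps are written as `-`. The function name is hypothetical.

```python
def percent_identity(a, b, gap="-"):
    """Percent of aligned, gap-free column pairs in which the residues match."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences have a residue.
    pairs = [(p, q) for p, q in zip(a, b) if p != gap and q != gap]
    if not pairs:
        return 0.0
    matches = sum(p == q for p, q in pairs)
    return 100.0 * matches / len(pairs)


print(percent_identity("ACDEFG-H", "ACDQFGKH"))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so a reported identity is only meaningful together with its denominator.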
Reconstruction of trees
Probabilistic methods
Score trees according to their likelihood P(data|tree) or posterior probability P(tree|data)
Data set of aligned sequences
Requires probabilistic model of evolution:
$P(x \mid y, t)$ = probability that sequence
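As an illustration of what such a model can look like, the sketch below uses the Jukes-Cantor model for DNA, in which every base changes to each other base at the same rate. This model is swapped in here as an example; the excerpt does not say which model the course uses. The rate `alpha`, the function names, and the assumption that sites evolve independently are all choices made for this sketch.

```python
import math


def jc_site_prob(a, b, t, alpha):
    """Jukes-Cantor probability that base b is replaced by base a after time t."""
    decay = math.exp(-4.0 * alpha * t)
    if a == b:
        return 0.25 + 0.75 * decay
    return 0.25 - 0.25 * decay


def jc_sequence_prob(x, y, t, alpha):
    """P(x | y, t) for two aligned, gap-free sequences, sites treated independently."""
    if len(x) != len(y):
        raise ValueError("sequences must be aligned to equal length")
    return math.prod(jc_site_prob(a, b, t, alpha) for a, b in zip(x, y))


print(jc_sequence_prob("ACGT", "ACGA", t=0.1, alpha=1.0))
```

At t = 0 the sequences must be identical, and as t grows every base becomes equally likely regardless of the starting base.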
Molecular Forces
Fundamentals
atoms are treated as point masses $m_i$ with positions given by coordinates
$\mathbf{x}_i = (x_{i1}, x_{i2}, x_{i3})$
Total energy of a molecule with N atoms is given by the function $E = E(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N)$
Geometry
$r_{ij} = |\mathbf{r}_{ij}| = |\mathbf{x}_j - \mathbf{x}_i|$
Dist
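A short sketch of the interatomic distance as the length of the vector from atom i to atom j, computed for every pair of atoms. The coordinates in the usage line are made-up values, and the function name is hypothetical.

```python
import math


def distance_matrix(coords):
    """Return the matrix of distances between all pairs of 3D atom positions."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


# Made-up positions for three atoms, in arbitrary units.
atoms = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 2.0)]
for row in distance_matrix(atoms):
    print(row)
```

The matrix is symmetric with zeros on the diagonal, so in practice only the entries with i < j need to be stored.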
Phylogenetic trees
Tree: connected graph with no cycles
Used to graphically represent relatedness (similarity, distance) between objects
Terminology: binary tree, root, branch, leaf, neighbors
A general tree is unrooted; a rooted tree has direction
unrooted tre
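The definition of a tree can be tested directly on an edge list. The sketch below relies on the standard fact that a graph on n nodes is a tree exactly when it is connected and has n - 1 edges; the function names and the example graph are hypothetical.

```python
from collections import defaultdict, deque


def is_tree(nodes, edges):
    """True if the undirected graph (nodes, edges) is connected and has no cycles."""
    nodes = list(nodes)
    if not nodes or len(edges) != len(nodes) - 1:
        return False
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Breadth-first search from an arbitrary node to test connectivity.
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen == set(nodes)


def leaves(nodes, edges):
    """Nodes with exactly one neighbor."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return [n for n in nodes if degree[n] == 1]


# Unrooted tree with four leaves (A, B, C, D) and two internal nodes (u, v).
NODES = ["A", "B", "C", "D", "u", "v"]
EDGES = [("A", "u"), ("B", "u"), ("u", "v"), ("C", "v"), ("D", "v")]
print(is_tree(NODES, EDGES), leaves(NODES, EDGES))
```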
Markov Models of sequences
Useful for locating repetitive structures in sequences: CpG islands, exons, introns, long
repetitive sequences, etc.
Model setup
Parameter estimation
n-th order Markov chain:
$P(x_i = a)$ depends on what bases are at $x_{i-j}$, $j =$
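Parameter estimation for such a chain can be sketched by counting: for each context of `order` preceding bases, count how often each base follows it and normalize. This is the plain maximum-likelihood estimate without pseudocounts, which is an assumption of this sketch; the function name and the example sequence are hypothetical.

```python
from collections import Counter, defaultdict


def estimate_transitions(seq, order=1):
    """Estimate P(next base | previous `order` bases) from observed counts."""
    counts = defaultdict(Counter)
    for i in range(order, len(seq)):
        context = seq[i - order:i]     # the `order` bases preceding position i
        counts[context][seq[i]] += 1
    return {
        context: {base: c / sum(counter.values()) for base, c in counter.items()}
        for context, counter in counts.items()
    }


print(estimate_transitions("ACGCGCGA", order=1))
```

Contexts that never occur in the training sequence get no entry, and transitions that never occur get probability zero, which is why pseudocounts are often added when the training data are short.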
Math 3380
Homework #3
Due Nov 10
Problem 1 A simple model of the water molecule is given by the potential energy
function:
$E(\theta, r_1, r_2) = K(\cos\theta - \cos\bar{\theta})^2 + S(r_1 - \bar{r}_1)^2 + S(r_2 - \bar{r}_2)^2$
where $r_1$, $r_2$ are the O-H bond lengths and $\theta$ is the H-O-H bond angle,
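A minimal sketch that evaluates an energy of this form, with one angle term in the cosine of the bond angle and one quadratic term per bond length. The names `theta_ref`, `r1_ref`, and `r2_ref` stand for the reference values subtracted in each term; their definitions are cut off in this excerpt, so treating them as equilibrium values is an assumption. No numerical constants are supplied, and the caller has to provide K, S, and the reference values.

```python
import math


def water_energy(theta, r1, r2, K, S, theta_ref, r1_ref, r2_ref):
    """Potential energy of the simple water model (angles in radians).

    theta_ref, r1_ref, r2_ref are hypothetical names for the reference
    angle and bond lengths that appear in the subtracted terms.
    """
    angle_term = K * (math.cos(theta) - math.cos(theta_ref)) ** 2
    bond_terms = S * (r1 - r1_ref) ** 2 + S * (r2 - r2_ref) ** 2
    return angle_term + bond_terms
```

Each term is a square, so the energy is never negative and vanishes when the angle and both bond lengths equal their reference values.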
Tertiary structure
cartoon rendering
stick rendering
space filling rendering
HU protein in complex with DNA (PDB ID: 1P78.pdb)
Structure visualization programs
RasMol, PyMol, Chime
Structure description file: PDB format
header contains information about t