Ke, Yue 11/3/2008
1. In programme development, types of requirements include
functional, performance, reliability, availability, error handling, interface, and constraints. In developing software for
Study Aid 2
For 40% of sequenced genes, functionality cannot be ascertained using only
comparisons to sequences of other known genes
Microarrays allow biologists to infer gene function even when sequ
Study Aid 1
How Gibbs Sampling Works
1) Randomly choose starting positions
s = (s1, ..., st) and form the set of l-mers associated
with these starting positions.
2) Randomly choose one of the t sequences.
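The two steps listed above can be sketched in Python. This is a minimal illustration covering only those two steps, not the full sampler; the function names and the `rng` parameter are hypothetical and do not come from the original notes.

```python
import random


def initial_lmers(sequences, l, rng=random):
    """Step 1: pick one random starting position per sequence
    and collect the l-mer that begins at each position."""
    # A start is valid only if a full l-mer fits inside the sequence.
    starts = [rng.randrange(len(seq) - l + 1) for seq in sequences]
    lmers = [seq[s:s + l] for seq, s in zip(sequences, starts)]
    return starts, lmers


def choose_sequence(t, rng=random):
    """Step 2: pick the index of one of the t sequences uniformly at random."""
    return rng.randrange(t)
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which helps when comparing outputs by hand.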
Lecture Notes 4
Randomized Algorithms
Randomized algorithms incorporate random, rather than deterministic, decisions
Commonly used in situations where no exact and/or fast algorithm is known
Main adva
Lecture Notes 3
Fitting Distance Matrix
Given n species, we can compute the n x n distance matrix Dij
Evolution of these genes is described by a tree that we don't know.
We need an algorithm to constru
Lecture Notes 2
Around the time the giant panda riddle was solved, a DNA-based reconstruction
of the human evolutionary tree led to the Out of Africa Hypothesis that claims
our most ancient ancestor l
Lecture Notes 1
Clique Graphs
A clique is a graph where every vertex is connected via an edge to every other
vertex
A clique graph is a graph where each connected component is a clique
The concept of
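The two definitions above translate directly into a check. The sketch below assumes an undirected graph stored as a dictionary that maps each vertex to the set of its neighbours; that representation and the function names are illustrative choices, not part of the notes.

```python
def connected_components(adj):
    """Return the connected components of an undirected graph as vertex sets."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            v = stack.pop()
            if v in component:
                continue
            component.add(v)
            stack.extend(adj[v] - component)
        seen |= component
        components.append(component)
    return components


def is_clique(adj, vertices):
    """True if every vertex in `vertices` is adjacent to every other one."""
    return all(vertices - {v} <= adj[v] for v in vertices)


def is_clique_graph(adj):
    """True if each connected component of the graph is a clique."""
    return all(is_clique(adj, c) for c in connected_components(adj))
```

For example, a triangle plus a separate single edge is a clique graph, while a path on three vertices is not, because its two end vertices share a component but no edge.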
HW 3
1. Motif Finding Problem: Given a list of t sequences each of length n, find the
best pattern of length l that appears in each of the t sequences.
2. A New Motif Finding Approach
3. Motif Finding
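As an illustration of the problem statement in item 1, the sketch below does an exhaustive search over starting positions. The preview does not define "best", so the consensus score used here (for each column, the count of the most frequent letter, summed over columns) is an assumption, as are the function names. The search tries (n - l + 1)^t combinations, so it is only practical for very small inputs.

```python
from itertools import product


def consensus_score(lmers):
    """Sum, over alignment columns, of the count of the most frequent letter."""
    return sum(max(col.count(c) for c in set(col)) for col in zip(*lmers))


def brute_force_motif(sequences, l):
    """Try every combination of starting positions and keep the best-scoring one."""
    best_score, best_starts = -1, None
    ranges = [range(len(seq) - l + 1) for seq in sequences]
    for starts in product(*ranges):
        lmers = [seq[s:s + l] for seq, s in zip(sequences, starts)]
        score = consensus_score(lmers)
        if score > best_score:
            best_score, best_starts = score, starts
    return best_starts, best_score
```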
HW 2
1. Select Analysis
2. Select seems risky compared to sort
3. To improve Select, we need to choose m
to give good splits
4. It can be proven that to achieve O(n) running time, we don't need a
perfe
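A minimal sketch of a Select routine built around a splitting element m follows. The preview does not say how m is chosen, so picking it uniformly at random is an assumption made for illustration; the function signature and the `rng` parameter are likewise hypothetical.

```python
import random


def select(items, k, rng=random):
    """Return the k-th smallest element of `items` (k is 1-indexed)."""
    if not 1 <= k <= len(items):
        raise ValueError("k must be between 1 and len(items)")
    m = rng.choice(items)  # assumed pivot rule: uniform random choice
    smaller = [x for x in items if x < m]
    equal = [x for x in items if x == m]
    if k <= len(smaller):
        return select(smaller, k, rng)
    if k <= len(smaller) + len(equal):
        return m
    larger = [x for x in items if x > m]
    return select(larger, k - len(smaller) - len(equal), rng)
```

Each recursive call works on a strictly smaller list, because m itself always lands in `equal`, so the recursion terminates for any choice of m.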
HW 1
1. Randomized algorithms incorporate random, rather than deterministic, decisions
2. Commonly used in situations where no exact and/or fast algorithm is known
3. Main advantage is that no input c
Study Aid 3
Hierarchical Clustering: Recomputing Distances
dmin(C, C*) = min d(x,y)
for all elements x in C and y in C*
Distance between two clusters is the smallest distance between any pair of
the
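The dmin formula above can be written as a one-line function. In this sketch the distance function `d` is passed in by the caller; the name `d_min` and the use of plain Python iterables for clusters are illustrative assumptions.

```python
def d_min(cluster_a, cluster_b, d):
    """Smallest distance d(x, y) over all x in cluster_a and y in cluster_b."""
    return min(d(x, y) for x in cluster_a for y in cluster_b)
```

For one-dimensional points with d(x, y) = |x - y|, the clusters [1, 2] and [5, 9] are at distance 3, set by the closest pair (2, 5).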