CS573: Homework 1
Due date: Thursday September 9, start of class
1 Elements of Data Mining (5 pts)
Read the following paper at a high level (don't worry about the low-level details):
M. Deodhar and J. Ghosh (2006). Consensus Clustering for Detection of Overlapping
Clusters in Microarray Data. Proceedings of the ICDM 2006 Workshop on Data Mining in
Bioinformatics (DMB 2006). (http://www.ideal.ece.utexas.edu/papers/deodhar06overlap.pdf)
Identify the following components of the work:
1. The task
2. The data representation
3. The knowledge representation
4. The learning technique (search method + scoring function)
5. The inference technique (if applicable) and evaluation method
2 Probability (3 pts)
Suppose that we have three colored boxes r (red), b (blue), and g (green). Box r contains
3 apples, 4 oranges, and 3 limes; box b contains 1 apple, 1 orange, and 0 limes; box g
contains 3 apples, 3 oranges, and 4 limes. If a box is chosen at random with probabilities
p(r) = 0.2, p(b) = 0.2, p(g) = 0.6, and a piece of fruit is removed from the box (with
equal probability of selecting any of the items in the box), then what is the probability of
selecting an apple? If we observe that the selected fruit is in fact an orange, what is the
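
The setup above can be checked numerically. The following is a minimal sketch, not an official solution: it applies the law of total probability to get p(fruit) and Bayes' rule to get p(box | fruit), using only the counts and priors stated in the problem. The names priors, counts, likelihood, marginal, and posterior are illustrative choices, not part of the assignment. Because the second question is cut off in this copy, the sketch reports the posterior for every box rather than assuming which one is asked about.

```python
from fractions import Fraction

# Box priors as stated in the problem: p(r) = 0.2, p(b) = 0.2, p(g) = 0.6.
priors = {"r": Fraction(2, 10), "b": Fraction(2, 10), "g": Fraction(6, 10)}

# Fruit counts per box as stated in the problem.
counts = {
    "r": {"apple": 3, "orange": 4, "lime": 3},
    "b": {"apple": 1, "orange": 1, "lime": 0},
    "g": {"apple": 3, "orange": 3, "lime": 4},
}


def likelihood(fruit, box):
    """p(fruit | box), with every item in the box equally likely."""
    return Fraction(counts[box][fruit], sum(counts[box].values()))


def marginal(fruit):
    """p(fruit) = sum over boxes of p(fruit | box) * p(box)."""
    return sum(likelihood(fruit, box) * priors[box] for box in priors)


def posterior(fruit):
    """p(box | fruit) for every box, via Bayes' rule."""
    evidence = marginal(fruit)
    return {box: likelihood(fruit, box) * priors[box] / evidence for box in priors}


p_apple = marginal("apple")
box_given_orange = posterior("orange")

print("p(apple) =", float(p_apple))
print("p(box | orange) =", {box: float(p) for box, p in box_given_orange.items()})
```

Exact fractions are used instead of floats so that the posterior over boxes sums to exactly 1, which makes the result easy to sanity-check by hand.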