Massachusetts Institute of Technology
6.867 Machine Learning, Fall 2006
Problem Set 1 Solutions
Section B
1. (a) The functions are on the course website under hw1/solutions. The major cause of variation
among the solutions was the choice of the order in which
Problem Set 2: Solutions
1. (a) (5 points) From the lecture notes (Eqn 14, Lecture 5), the optimal parameter values for linear
regression given the matrix of training examples X and
Problem Set 1 Solutions
Section A (background questions)
1. Let's begin with a little math. Let us denote by P_n the probability that n people with
randomly chosen birthdays (chosen un
Problem Set 3: Solutions
1. (a) (5 points) If A and B are random variables (RVs) with the same probability distribution, then
E[f(A)] = E[f(B)]. This may be clearer when we writ
Problem Set 4: Solutions
1. (a) (8 points) We have
$$L(D; \theta) = \prod_{r=1}^{n} \left[ \prod_{i=1}^{d} P(x_{ri} \mid y_r) \right] P(y_r) \qquad (1)$$
where the number of examples is n. We can parameterize P(y) with the parameter , as
Problem Set 5: Solutions
1. (a) For the LM, there is no need to iterate; the maximum likelihood estimates are easy to derive; they
are simply the normalized counts.
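As a rough illustration of "normalized counts", the sketch below estimates a categorical distribution over discrete symbols by dividing each count by the total. It assumes the LM is a distribution over discrete tokens, which this excerpt does not spell out; the function name normalized_counts and the toy data are hypothetical and not taken from the problem set.

```python
from collections import Counter


def normalized_counts(tokens):
    """Maximum likelihood estimate of a categorical distribution.

    Hypothetical helper: each symbol's probability is its count
    divided by the total number of observations.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one observation")
    return {tok: c / total for tok, c in counts.items()}


# Toy data (made up for illustration only).
probs = normalized_counts(["a", "b", "a", "c", "a", "b"])
```

No iteration is involved: a single pass over the data to collect counts is enough, which matches the remark that the estimates are easy to derive.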
The updates for T