midterm2004-solution - Solution to 10-701/15-781 Midterm...

Solution to 10-701/15-781 Midterm Exam Fall 2004
1 Introductory Probability and Statistics (12 points)

(a) (2 points) If $A$ and $B$ are disjoint events, and $\Pr(B) > 0$, what is the value of $\Pr(A \mid B)$?

Answer: 0. (Note that $A \cap B = \emptyset$.)

(b) (2 points) Suppose that the p.d.f. of a random variable $X$ is as follows:

$$f(x) = \begin{cases} \frac{4}{3}(1 - x^3), & \text{for } 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$$

Then $\Pr(X < 0) = ?$

Answer: 0. (Note that the density is nonzero only for $0 \le x \le 1$.)

(c) (4 points) Suppose that $X$ is a random variable for which $E(X) = \mu$ and $Var(X) = \sigma^2$, and let $c$ be an arbitrary constant. Which one of these statements is true?

A. $E[(X - c)^2] = (\mu - c)^2 + \sigma^2$
B. $E[(X - c)^2] = (\mu - c)^2$
C. $E[(X - c)^2] = (\mu - c)^2 - \sigma^2$
D. $E[(X - c)^2] = (\mu - c)^2 + 2\sigma^2$
E. $E[(X - c)^2] = \mu^2 + c^2 + 2\sigma^2$
F. $E[(X - c)^2] = \mu^2 + c^2 - 2\sigma^2$

Answer: A.

$$E[(X - c)^2] = E[X^2] - 2cE[X] + c^2 = Var(X) + [E(X)]^2 - 2c\mu + c^2 = (\mu - c)^2 + \sigma^2$$
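The identity in part (c) can be checked exactly on a small example. The sketch below is only an illustration: the discrete distribution, the constant `c`, and the helper `expectation` are assumptions made up for this check, not part of the exam. Exact fractions are used so that the two sides can be compared without rounding error.

```python
from fractions import Fraction


def expectation(values, probs, g=lambda x: x):
    """Exact E[g(X)] for a discrete random variable X."""
    return sum(p * g(x) for x, p in zip(values, probs))


# Hypothetical distribution and constant, chosen only for illustration.
values = [Fraction(0), Fraction(1), Fraction(3)]
probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
c = Fraction(2)

mu = expectation(values, probs)                              # E(X)
var = expectation(values, probs, lambda x: (x - mu) ** 2)    # Var(X)
lhs = expectation(values, probs, lambda x: (x - c) ** 2)     # E[(X - c)^2]
rhs = (mu - c) ** 2 + var                                    # statement A

print(lhs, rhs, lhs == rhs)
```

Changing `values`, `probs`, or `c` to any other valid choice should leave the two sides equal, since statement A holds for every distribution with finite variance.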
(d) (4 points) Suppose that $k$ events $B_1, B_2, \ldots, B_k$ form a partition of the sample space $S$. For $i = 1, \ldots, k$, let $\Pr(B_i)$ denote the prior probability of $B_i$. Let $A$ be another event with $\Pr(A) > 0$, and let $\Pr(B_i \mid A)$ denote the posterior probability of $B_i$ given that the event $A$ has occurred. Prove that if $\Pr(B_1 \mid A) < \Pr(B_1)$, then $\Pr(B_i \mid A) > \Pr(B_i)$ for at least one value of $i$ ($i = 2, \ldots, k$).

(Hint: one or more of these tricks might help: $P(B_i \mid A)P(A) = P(B_i \wedge A)$, $\sum_{i=1}^{k} P(B_i) = 1$, $\sum_{i=1}^{k} P(B_i \mid A) = 1$, $P(B_i \wedge A) + P(B_i \wedge \neg A) = P(B_i)$, $\sum_{i=1}^{k} P(B_i \wedge A) = P(A)$.)

Answer: We need to prove that if $\Pr(B_1 \mid A) < \Pr(B_1)$, then $\Pr(B_i \mid A) > \Pr(B_i)$ for at least one value of $i$ ($i = 2, \ldots, k$).

Proof: We know that $\sum_{i=1}^{k} \Pr(B_i) = 1$ and $\sum_{i=1}^{k} \Pr(B_i \mid A) = 1$. Suppose, for contradiction, that $\Pr(B_i \mid A) \le \Pr(B_i)$ for all $i$ ($i = 2, \ldots, k$). Then

$$\sum_{i=1}^{k} \Pr(B_i) = \Pr(B_1) + \sum_{i=2}^{k} \Pr(B_i) > \Pr(B_1 \mid A) + \sum_{i=2}^{k} \Pr(B_i) \ge \Pr(B_1 \mid A) + \sum_{i=2}^{k} \Pr(B_i \mid A) = \sum_{i=1}^{k} \Pr(B_i \mid A).$$

The first inequality is strict because $\Pr(B_1) > \Pr(B_1 \mid A)$, so the chain gives $1 > 1$, a contradiction. Hence $\Pr(B_i \mid A) > \Pr(B_i)$ for at least one $i \ge 2$.
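A concrete instance of the statement in part (d) may help. The sketch below uses made-up priors $\Pr(B_i)$ and likelihoods $\Pr(A \mid B_i)$ for a three-event partition; these numbers and the helper `posteriors` are assumptions for illustration only. It computes the posteriors with Bayes' rule and lists which events gained probability.

```python
from fractions import Fraction


def posteriors(priors, likelihoods):
    """Pr(B_i | A) from Pr(B_i) and Pr(A | B_i) via Bayes' rule."""
    joint = [p * l for p, l in zip(priors, likelihoods)]  # Pr(B_i and A)
    p_a = sum(joint)  # Pr(A), summing the joint over the partition
    return [j / p_a for j in joint]


# Hypothetical numbers, chosen so that the posterior of B_1 drops.
priors = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
likelihoods = [Fraction(1, 10), Fraction(1, 2), Fraction(2, 5)]

post = posteriors(priors, likelihoods)

# 0-based positions of the events B_2..B_k whose probability increased.
increased = [i for i in range(1, len(priors)) if post[i] > priors[i]]

print(post, increased)
```

Because both the priors and the posteriors sum to 1, any probability mass lost by $B_1$ has to reappear in at least one of the other events, which is what the list `increased` records.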
2 Linear Regression (12 points)

We have a dataset with $R$ records in which the $i$th record has one real-valued input attribute $x_i$ and one real-valued output attribute $y_i$.

(a) (6 points) First, we use a linear regression method to model this data. To test our linear regressor, we choose at random some data records to be a training set, and choose at random some of the remaining records to be a test set. Now let us increase the training set size gradually. As the training set size increases, what do you expect will happen with the mean training and mean testing errors? (No explanation required)

- Mean Training Error: A. Increase; B. Decrease
- Mean Testing Error: A. Increase; B. Decrease

Answer: The training error tends to increase. As more examples have to be fitted, it becomes harder to "hit", or even come close to, all of them. The test error tends to decrease. As we take more examples into account when training, we have more information and can come up with a model that better resembles the true behavior. More training examples lead to better generalization.

(b) (6 points) Now we change to use the following model to fit the data. The model has one unknown parameter $w$ to be learned from data.
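The trend described in the answer to part (a) can be reproduced with a small simulation. This is a minimal sketch under assumptions that are not in the exam: the data are generated as $y = 1 + 2x + \text{noise}$ with standard Gaussian noise, the regressor is an ordinary least-squares line with intercept, and errors are mean squared errors averaged over repeated random draws. All function names are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def mse(a, b, xs, ys):
    """Mean squared error of the line y = a + b*x on (xs, ys)."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def sample(rng, n):
    # Assumed data model (not from the exam): y = 1 + 2x + Gaussian noise.
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    ys = [1.0 + 2.0 * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


def mean_errors(n_train, trials=200, n_test=200, seed=0):
    """Average training and test MSE over repeated random datasets."""
    rng = random.Random(seed)
    train_total = test_total = 0.0
    for _ in range(trials):
        xs, ys = sample(rng, n_train)
        a, b = fit_line(xs, ys)
        train_total += mse(a, b, xs, ys)
        test_total += mse(a, b, *sample(rng, n_test))
    return train_total / trials, test_total / trials


# Map each training set size to (mean training error, mean test error).
curve = {n: mean_errors(n) for n in (2, 5, 20, 100)}
for n, (train_err, test_err) in curve.items():
    print(n, train_err, test_err)
```

With two training points the line passes through both, so the training error is essentially zero while the test error is large; as the training set grows, the mean training error rises toward the noise level and the mean test error falls toward it.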