Solution to 10701/15781 Midterm Exam, Fall 2004

1 Introductory Probability and Statistics (12 points)

(a) (2 points) If $A$ and $B$ are disjoint events, and $\Pr(B) > 0$, what is the value of $\Pr(A \mid B)$?

Answer: 0. (Note that $A \cap B = \emptyset$, so $\Pr(A \mid B) = \Pr(A \cap B) / \Pr(B) = 0$.)

(b) (2 points) Suppose that the p.d.f. of a random variable $X$ is as follows:

$$f(x) = \begin{cases} \frac{4}{3}(1 - x^3), & \text{for } 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$$

Then $\Pr(X < 0) = ?$

Answer: 0. (Note that $f(x)$ is nonzero only for $0 \le x \le 1$.)

(c) (4 points) Suppose that $X$ is a random variable for which $E(X) = \mu$ and $Var(X) = \sigma^2$, and let $c$ be an arbitrary constant. Which one of these statements is true?

A. $E[(X - c)^2] = (\mu - c)^2 + \sigma^2$
B. $E[(X - c)^2] = (\mu - c)^2$
C. $E[(X - c)^2] = (\mu - c)^2 - \sigma^2$
D. $E[(X - c)^2] = (\mu - c)^2 + 2\sigma^2$
E. $E[(X - c)^2] = \mu^2 + c^2 + 2\sigma^2$
F. $E[(X - c)^2] = \mu^2 + c^2 - 2\sigma^2$

Answer: A.

$$E[(X - c)^2] = E[X^2] - 2cE[X] + c^2 = Var(X) + [E(X)]^2 - 2c\mu + c^2 = (\mu - c)^2 + \sigma^2$$

(d) (4 points) Suppose that $k$ events $B_1, B_2, \ldots, B_k$ form a partition of the sample space $S$. For $i = 1, \ldots, k$, let $\Pr(B_i)$ denote the prior probability of $B_i$. Let $A$ be another event with $\Pr(A) > 0$, and let $\Pr(B_i \mid A)$ denote the posterior probability of $B_i$ given that the event $A$ has occurred. Prove that if $\Pr(B_1 \mid A) < \Pr(B_1)$, then $\Pr(B_i \mid A) > \Pr(B_i)$ for at least one value of $i$ ($i = 2, \ldots, k$).

(Hint: one or more of these tricks might help: $P(B_i \mid A) P(A) = P(B_i \cap A)$, $\sum_{i=1}^{k} P(B_i) = 1$, $\sum_{i=1}^{k} P(B_i \mid A) = 1$, $P(B_i \cap A) + P(B_i \cap A^c) = P(B_i)$, $\sum_{i=1}^{k} P(B_i \cap A) = P(A)$.)

Answer: We need to prove that if $\Pr(B_1 \mid A) < \Pr(B_1)$, then $\Pr(B_i \mid A) > \Pr(B_i)$ for at least one value of $i$ ($i = 2, \ldots, k$).

Proof: We know that $\sum_{i=1}^{k} \Pr(B_i) = 1$ and $\sum_{i=1}^{k} \Pr(B_i \mid A) = 1$. Suppose, for the sake of contradiction, that $\Pr(B_i \mid A) \le \Pr(B_i)$ for all $i$ ($i = 2, \ldots, k$). Then

$$1 = \sum_{i=1}^{k} \Pr(B_i) = \Pr(B_1) + \sum_{i=2}^{k} \Pr(B_i) > \Pr(B_1 \mid A) + \sum_{i=2}^{k} \Pr(B_i) \ge \Pr(B_1 \mid A) + \sum_{i=2}^{k} \Pr(B_i \mid A) = 1$$

So we get that $1 > 1$, which is a contradiction. Hence $\Pr(B_i \mid A) > \Pr(B_i)$ for at least one $i \in \{2, \ldots, k\}$.

2 Linear Regression (12 points)

We have a dataset with $R$ records in which the $i$-th record has one real-valued input attribute $x_i$ and one real-valued output attribute $y_i$.

(a) (6 points) First, we use a linear regression method to model this data. To test our linear regressor, we choose at random some data records to be a training set, and choose at random some of the remaining records to be a test set. Now let us increase the training set size gradually. As the training set size increases, what do you expect will happen with the mean training and mean testing errors? (No explanation required)

Mean Training Error: A. Increase; B. Decrease
Mean Testing Error: A. Increase; B. Decrease

Answer: The training error tends to increase. As more examples have to be fitted, it becomes harder to "hit", or even come close to, all of them....
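The training-error claim in Question 2(a) can be checked with a small simulation. The sketch below is only an illustration under assumed conditions, not part of the exam solution: the data source (a noisy line $y = 2x + 1$), the helper name `mean_errors`, and all parameter values are hypothetical. It fits a least-squares line to training sets of increasing size and averages the mean squared error over many random draws, on both the training records and a held-out test set.

```python
import numpy as np


def mean_errors(n_train, n_test=200, n_trials=500, noise=1.0, seed=0):
    """Average training and test MSE of a least-squares line over random draws."""
    rng = np.random.default_rng(seed)
    train_mse, test_mse = [], []
    for _ in range(n_trials):
        # Hypothetical data source (an assumption): y = 2x + 1 + Gaussian noise.
        x = rng.uniform(-1.0, 1.0, n_train + n_test)
        y = 2.0 * x + 1.0 + rng.normal(0.0, noise, x.size)
        x_tr, y_tr = x[:n_train], y[:n_train]
        x_te, y_te = x[n_train:], y[n_train:]
        # Fit slope and intercept by ordinary least squares on the training part.
        slope, intercept = np.polyfit(x_tr, y_tr, deg=1)
        train_mse.append(np.mean((slope * x_tr + intercept - y_tr) ** 2))
        test_mse.append(np.mean((slope * x_te + intercept - y_te) ** 2))
    return float(np.mean(train_mse)), float(np.mean(test_mse))


if __name__ == "__main__":
    for n in (3, 5, 10, 30, 100):
        tr, te = mean_errors(n)
        print(f"n_train={n:4d}  mean train MSE={tr:.3f}  mean test MSE={te:.3f}")
```

With only a few training points the fitted line can pass close to all of them, so the mean training error starts small and grows toward the noise level as points are added; the test column is printed alongside so the two trends can be compared directly.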
This note was uploaded on 01/26/2010 for the course MACHINE LE 10701, taught by Professor Eric P. Xing during the Fall '08 term at Carnegie Mellon.