# CS229 Practice Midterm


CS 229, Autumn 2007

Practice Midterm

Notes:

1. The midterm will have about 5-6 long questions, and about 8-10 short questions. Space will be provided on the actual midterm for you to write your answers.
2. The midterm is meant to be educational, and as such some questions could be quite challenging. Use your time wisely to answer as much as you can!

## 1. [13 points] Generalized Linear Models

Recall that generalized linear models assume that the response variable $y$ (conditioned on $x$) is distributed according to a member of the exponential family:

$$P(y; \eta) = b(y) \exp\big(\eta\, T(y) - a(\eta)\big),$$

where $\eta = \theta^T x$. For this problem, we will assume $\eta \in \mathbb{R}$.

(a) [10 points] Given a training set $\{(x^{(i)}, y^{(i)})\}_{i=1}^{m}$, the log-likelihood is given by

$$\ell(\theta) = \sum_{i=1}^{m} \log p(y^{(i)} \mid x^{(i)}; \theta).$$

Give a set of conditions on $b(y)$, $T(y)$, and $a(\eta)$ which ensure that the log-likelihood is a concave function of $\theta$ (and thus has a unique maximum). Your conditions must be reasonable, and should be as weak as possible. (E.g., the answer "any $b(y)$, $T(y)$, and $a(\eta)$ so that $\ell(\theta)$ is concave" is not reasonable. Similarly, overly narrow conditions, including ones that apply only to specific GLIMs, are also not reasonable.)

(b) [3 points] When the response variable is distributed according to a Normal distribution (with unit variance), we have

$$b(y) = \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}, \qquad T(y) = y, \qquad a(\eta) = \frac{\eta^2}{2}.$$

Verify that the condition(s) you gave in part (a) hold for this setting.

## 2. [15 points] Bayesian linear regression

Consider Bayesian linear regression using a Gaussian prior on the parameters $\theta \in \mathbb{R}^{n+1}$. Thus, in our prior, $\theta \sim \mathcal{N}(\vec{0}, \tau^2 I_{n+1})$, where $\tau^2 \in \mathbb{R}$, and $I_{n+1}$ is the $(n+1)$-by-$(n+1)$ identity matrix. Also let the conditional distribution of $y^{(i)}$ given $x^{(i)}$ and $\theta$ be $\mathcal{N}(\theta^T x^{(i)}, \sigma^2)$, as in our usual linear least-squares model.¹ Let a set of $m$ IID training examples be given (with $x^{(i)} \in \mathbb{R}^{n+1}$).

Recall that the MAP estimate of the parameters $\theta$ is given by:

$$\theta_{\text{MAP}} = \arg\max_{\theta} \left( \prod_{i=1}^{m} p(y^{(i)} \mid x^{(i)}, \theta) \right) p(\theta)$$

Find, in closed form, the MAP estimate of the parameters $\theta$. For this problem, you should treat $\tau^2$ and $\sigma^2$ as fixed, known constants. [Hint: Your solution should involve deriving something that looks a bit like the Normal equations.]

¹ Equivalently, $y^{(i)} = \theta^T x^{(i)} + \varepsilon^{(i)}$, where the $\varepsilon^{(i)}$'s are distributed IID $\mathcal{N}(0, \sigma^2)$.
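
One way to sanity-check a closed-form answer to Problem 2 is to maximize the log of the MAP objective numerically and compare the result against your formula. The following is a minimal pure-Python sketch under stated assumptions: the function names, the synthetic data, the step size, and the values chosen for σ² and τ² are all made up for illustration, and it uses plain gradient ascent rather than the closed form the problem asks for.

```python
import math
import random


def log_posterior(theta, xs, ys, sigma2, tau2):
    """Log of prod_i p(y_i | x_i, theta) * p(theta), up to an additive constant.

    Likelihood: y_i ~ N(theta^T x_i, sigma2). Prior: theta ~ N(0, tau2 * I).
    Terms that do not depend on theta are dropped.
    """
    log_lik = 0.0
    for x, y in zip(xs, ys):
        pred = sum(t * xj for t, xj in zip(theta, x))
        log_lik -= (y - pred) ** 2 / (2.0 * sigma2)
    log_prior = -sum(t * t for t in theta) / (2.0 * tau2)
    return log_lik + log_prior


def gradient(theta, xs, ys, sigma2, tau2):
    """Gradient of log_posterior with respect to theta."""
    grad = [-t / tau2 for t in theta]  # contribution of the Gaussian prior
    for x, y in zip(xs, ys):
        resid = y - sum(t * xj for t, xj in zip(theta, x))
        for j, xj in enumerate(x):
            grad[j] += resid * xj / sigma2  # contribution of the likelihood
    return grad


def map_by_gradient_ascent(xs, ys, sigma2, tau2, step=1e-3, iters=5000):
    """Hypothetical helper: fixed-step gradient ascent on the log-posterior."""
    theta = [0.0] * len(xs[0])
    for _ in range(iters):
        g = gradient(theta, xs, ys, sigma2, tau2)
        theta = [t + step * gj for t, gj in zip(theta, g)]
    return theta


# Synthetic data (assumed values, not from the exam): x = [1, u], n + 1 = 2.
random.seed(0)
sigma2, tau2 = 0.25, 10.0
true_theta = [1.0, 2.0]
xs = [[1.0, random.uniform(-1.0, 1.0)] for _ in range(50)]
ys = [
    sum(t * xj for t, xj in zip(true_theta, x)) + random.gauss(0.0, math.sqrt(sigma2))
    for x in xs
]

theta_hat = map_by_gradient_ascent(xs, ys, sigma2, tau2)
print("numerical MAP estimate:", theta_hat)
```

Evaluating your closed-form expression on the same `xs`, `ys`, `sigma2`, and `tau2` should give a vector that agrees with `theta_hat` to several decimal places. The fixed step size is an assumption that happens to suit this small dataset; larger or differently scaled inputs would likely need a smaller step.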

## 3. [18 points] Kernels

In this problem, you will prove that certain functions $K$ give valid kernels. Be careful to justify every step in your proofs. Specifically, if you use a result proved either in the lecture