midterm2011_SOLUTION

# midterm2011_SOLUTION - STANFORD UNIVERSITY CS 229 Autumn...


STANFORD UNIVERSITY

CS 229, Autumn 2011 Midterm Examination

Wednesday, November 9, 6:00pm-9:00pm

| Question | Points |
|---|---|
| 1 Generalized Linear Models | /15 |
| 2 Gaussian Naive Bayes | /15 |
| 3 Linear Invariance of Logistic Regression | /12 |
| 4 $\ell_2$-Regularized SVM | /18 |
| 5 Uniform Convergence | /16 |
| 6 Short Answers | /38 |
| Total | /114 |

Name of Student:

SUID:

The Stanford University Honor Code: I attest that I have not given or received aid in this examination, and that I have done my share and taken an active part in seeing to it that others as well as myself uphold the spirit and letter of the Honor Code.

Signed:

1. [15 points] Generalized Linear Models

In class, we showed that the Bernoulli and Gaussian distributions are exponential family distributions, which are of the form

$$p(y; \eta) = b(y) \exp\left(\eta^T T(y) - a(\eta)\right)$$

In this problem, we will consider a different exponential family distribution, specifically the Exponential distribution, which has a density given by

$$p(y; \lambda) = \lambda \exp(-\lambda y)$$

Here, $y \geq 0$ is a non-negative real number, and the distribution is parameterized by $\lambda \in \mathbb{R}$.

(a) [5 points] Write the Exponential distribution in the exponential family form given above. You will need to come up with expressions for $\eta$, $b(y)$, $T(y)$, and $a(\eta)$.

Answer:

$$p(y; \lambda) = \exp(\log \lambda - \lambda y)$$

$$b(y) = 1, \qquad \eta = -\lambda, \qquad T(y) = y, \qquad a(\eta) = -\log(-\eta)$$

Note: an equally valid solution has $T(y) = -y$, $\eta = \lambda$, and $a(\eta) = -\log(\eta)$. This will give sign flips in parts b and c, but results in an identical Hessian in part d.

(b) [2 points] Derive the canonical response function $g(\eta)$, which gives the Exponential distribution's mean as a function of the natural parameter $\eta$. You may use the fact that an Exponential distribution (with parameter $\lambda$) has mean $\frac{1}{\lambda}$.

Answer:

$$g(\eta) = \frac{1}{\lambda} = -\frac{1}{\eta}$$

(c) [2 points] Assuming that we have a training set $\{(x^{(1)}, y^{(1)}), \ldots, (x^{(m)}, y^{(m)})\}$ of $m$ independently and identically distributed (IID) examples, write down the log-likelihood $\ell(\theta)$ of the parameters.

Answer:

$$
\begin{aligned}
L(\theta) &= \prod_{i=1}^{m} p(y^{(i)} \mid x^{(i)}; \theta) \\
&= \prod_{i=1}^{m} \exp\left(\eta^T T(y^{(i)}) - a(\eta)\right) \\
&= \prod_{i=1}^{m} \exp\left((\theta^T x^{(i)})^T y^{(i)} + \log(-\theta^T x^{(i)})\right)
\end{aligned}
$$

$$\ell(\theta) = \sum_{i=1}^{m} \left[ \theta^T x^{(i)} y^{(i)} + \log(-\theta^T x^{(i)}) \right]$$

(d) [6 points] Find the Hessian $H$ of the log-likelihood $\ell(\theta)$, and show that it is negative semi-definite.

Answer: First, the gradient:

$$
\begin{aligned}
\frac{\partial \ell(\theta)}{\partial \theta_j} &= \sum_{i=1}^{m} \frac{\partial}{\partial \theta_j} \left( \theta^T x^{(i)} y^{(i)} + \log(-\theta^T x^{(i)}) \right) \\
&= \sum_{i=1}^{m} \left[ x_j^{(i)} y^{(i)} + \frac{\partial}{\partial \theta_j} \log(-\theta^T x^{(i)}) \right] \\
&= \sum_{i=1}^{m} \left[ x_j^{(i)} y^{(i)} + \frac{1}{-\theta^T x^{(i)}} \frac{\partial}{\partial \theta_j} (-\theta^T x^{(i)}) \right] \\
&= \sum_{i=1}^{m} \left[ x_j^{(i)} y^{(i)} + \frac{x_j^{(i)}}{\theta^T x^{(i)}} \right] \\
&= \sum_{i=1}^{m} \left( y^{(i)} + \frac{1}{\theta^T x^{(i)}} \right) x_j^{(i)}
\end{aligned}
$$

Now, the Hessian:

$$\frac{\partial \ell(\theta)}{\partial \theta_j} = \sum_{i=1}^{m} \left( y^{(i)} + \frac{1}{\theta^T x^{(i)}} \right) x_j^{(i)}$$

$$\frac{\partial^2 \ell(\theta)}{\partial \theta_j \partial \theta_k} = \sum_{i=1}^{m} x_j^{(i)} \frac{\partial}{\partial \theta_k} \frac{1}{\theta^T x^{(i)}} = \sum_{i=1}^{m} x_j^{(i)} \, \frac{-1}{(\theta^T x^{(i)} \ldots}$$
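The answers to parts (a) through (d) can be checked numerically. The following is a minimal sketch, not part of the original solution: the function names, the synthetic data, and the use of NumPy are all assumptions made for illustration. The closed-form Hessian in `hessian` is obtained by differentiating the gradient from part (d) once more with respect to $\theta_k$; it is not copied from the truncated text.

```python
import numpy as np

def exp_family_density(y, eta):
    """Density in exponential-family form with b(y)=1, T(y)=y, a(eta)=-log(-eta)."""
    return np.exp(eta * y + np.log(-eta))

def canonical_response(eta):
    """g(eta) = 1/lambda = -1/eta, the mean as a function of eta (part b)."""
    return -1.0 / eta

def log_likelihood(theta, X, y):
    """l(theta) = sum_i [ (theta^T x_i) y_i + log(-(theta^T x_i)) ] (part c)."""
    eta = X @ theta  # eta_i = theta^T x_i, must be negative for the log to exist
    return np.sum(eta * y + np.log(-eta))

def gradient(theta, X, y):
    """dl/dtheta_j = sum_i (y_i + 1 / (theta^T x_i)) x_ij (part d)."""
    eta = X @ theta
    return X.T @ (y + 1.0 / eta)

def hessian(theta, X):
    """Assumed completion: H_jk = -sum_i x_ij x_ik / (theta^T x_i)^2."""
    eta = X @ theta
    return -(X / eta[:, None] ** 2).T @ X

def numerical_gradient(f, theta, eps=1e-6):
    """Central finite differences, used only to cross-check `gradient`."""
    g = np.zeros_like(theta)
    for j in range(theta.size):
        step = np.zeros_like(theta)
        step[j] = eps
        g[j] = (f(theta + step) - f(theta - step)) / (2 * eps)
    return g

# Hypothetical synthetic data: positive features and negative weights,
# so that eta_i = theta^T x_i < 0 and lambda_i = -eta_i > 0 for every example.
rng = np.random.default_rng(0)
m, n = 50, 3
X = rng.uniform(0.5, 2.0, size=(m, n))
theta = -rng.uniform(0.5, 1.5, size=n)
lam = -(X @ theta)
y = rng.exponential(scale=1.0 / lam)

# Part (a): both forms of the density agree at an arbitrary point.
density_gap = abs(exp_family_density(0.7, -2.0) - 2.0 * np.exp(-2.0 * 0.7))

# Part (d): analytic gradient vs. finite differences, and Hessian eigenvalues.
grad_gap = np.max(np.abs(
    gradient(theta, X, y)
    - numerical_gradient(lambda t: log_likelihood(t, X, y), theta)
))
max_eig = np.max(np.linalg.eigvalsh(hessian(theta, X)))

print("density gap:", density_gap)
print("gradient gap:", grad_gap)
print("largest Hessian eigenvalue:", max_eig)
```

A largest eigenvalue that is not positive is consistent with negative semi-definiteness at this particular $\theta$. It is a sanity check on one sample, not a proof of the general claim.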

## This document was uploaded on 01/06/2012.

