CS229 Problem Set #2 Solutions

CS 229, Public Course
Problem Set #2 Solutions: Kernels, SVMs, and Theory

1. Kernel ridge regression

In contrast to ordinary least squares, which has the cost function

$$J(\theta) = \frac{1}{2} \sum_{i=1}^{m} \left( \theta^T x^{(i)} - y^{(i)} \right)^2,$$

we can also add a term that penalizes large weights in $\theta$. In ridge regression, our least squares cost is regularized by adding a term $\lambda \|\theta\|^2$, where $\lambda > 0$ is a fixed (known) constant (regularization will be discussed at greater length in an upcoming course lecture). The ridge regression cost function is then

$$J(\theta) = \frac{1}{2} \sum_{i=1}^{m} \left( \theta^T x^{(i)} - y^{(i)} \right)^2 + \frac{\lambda}{2} \|\theta\|^2.$$

(a) Use the vector notation described in class to find a closed-form expression for the value of $\theta$ which minimizes the ridge regression cost function.

Answer: Using the design matrix notation, we can rewrite $J(\theta)$ as

$$J(\theta) = \frac{1}{2} (X\theta - \vec{y})^T (X\theta - \vec{y}) + \frac{\lambda}{2} \theta^T \theta.$$

Then the gradient is

$$\nabla_\theta J(\theta) = X^T X \theta - X^T \vec{y} + \lambda \theta.$$

Setting the gradient to 0 gives us

$$0 = X^T X \theta - X^T \vec{y} + \lambda \theta \quad\Longrightarrow\quad \theta = (X^T X + \lambda I)^{-1} X^T \vec{y}.$$

(b) Suppose that we want to use kernels to implicitly represent our feature vectors in a high-dimensional (possibly infinite-dimensional) space. Using a feature mapping $\phi$, the ridge regression cost function becomes

$$J(\theta) = \frac{1}{2} \sum_{i=1}^{m} \left( \theta^T \phi(x^{(i)}) - y^{(i)} \right)^2 + \frac{\lambda}{2} \|\theta\|^2.$$

Making a prediction on a new input $x_{\text{new}}$ would now be done by computing $\theta^T \phi(x_{\text{new}})$. Show how we can use the "kernel trick" to obtain a closed form for the prediction on the new input without ever explicitly computing $\phi(x_{\text{new}})$. You may assume that the parameter vector $\theta$ can be expressed as a linear combination of the input feature vectors; i.e., $\theta = \sum_{i=1}^{m} \alpha_i \phi(x^{(i)})$ for some set of parameters $\alpha_i$.
[Hint: You may find the following identity useful:

$$(\lambda I + BA)^{-1} B = B (\lambda I + AB)^{-1}.$$

If you want, you can try to prove this as well, though this is not required for the problem.]

Answer: Let $\Phi$ be the design matrix associated with the feature vectors $\phi(x^{(i)})$. Then, from part (a) and the identity in the hint (taking $A = \Phi$ and $B = \Phi^T$),

$$\theta = (\Phi^T \Phi + \lambda I)^{-1} \Phi^T \vec{y} = \Phi^T (\Phi \Phi^T + \lambda I)^{-1} \vec{y} = \Phi^T (K + \lambda I)^{-1} \vec{y},$$

where $K$ is the kernel matrix for the training set (since $(\Phi \Phi^T)_{i,j} = \phi(x^{(i)})^T \phi(x^{(j)}) = K_{ij}$).

To predict a new value $y_{\text{new}}$, we can compute

$$y_{\text{new}} = \theta^T \phi(x_{\text{new}}) = \vec{y}^T (K + \lambda I)^{-1} \Phi \, \phi(x_{\text{new}}) = \sum_{i=1}^{m} \alpha_i K(x^{(i)}, x_{\text{new}}),$$

where $\alpha = (K + \lambda I)^{-1} \vec{y}$. All these terms can be efficiently computed using the kernel function.

To prove the identity from the hint, we left-multiply by $(\lambda I + BA)$ and right-multiply by $(\lambda I + AB)$ on both sides. That is,

$$(\lambda I + BA)^{-1} B = B (\lambda I + AB)^{-1}$$
$$B = (\lambda I + BA) B (\lambda I + AB)^{-1}$$
$$B (\lambda I + AB) = (\lambda I + BA) B$$
$$\lambda B + BAB = \lambda B + BAB.$$

This last line clearly holds. Since $\lambda I + BA$ and $\lambda I + AB$ are assumed invertible, each step can be reversed, which proves the identity.
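As a numerical sanity check of the derivation above, the following dependency-free Python sketch fits ridge regression two ways: in the primal, with an explicit feature map, and in the dual, using only kernel evaluations. The feature map `phi` (all pairwise products of the inputs) and its matching kernel $(x^T z)^2$ are illustrative choices that do not come from the problem set, as are the helper names, the synthetic data, and the value $\lambda = 0.5$. The hand-written solver is only there to keep the example self-contained; in practice a linear algebra library would take its place.

```python
import random

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]  # augmented matrix [A | b]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))  # pivot row
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        x[r] = (M[r][n] - dot(M[r][r + 1:n], x[r + 1:])) / M[r][r]
    return x

def phi(x):
    """Illustrative explicit feature map (an assumption): all products x_i * x_j."""
    return [a * b for a in x for b in x]

def kernel(x, z):
    """Matching kernel: phi(x)^T phi(z) = (x^T z)^2."""
    return dot(x, z) ** 2

def fit_primal(X, y, lam):
    """theta = (Phi^T Phi + lam I)^{-1} Phi^T y, using explicit features."""
    Phi = [phi(x) for x in X]
    d = len(Phi[0])
    cols = [[row[j] for row in Phi] for j in range(d)]  # columns of Phi
    A = [[dot(cols[i], cols[j]) + (lam if i == j else 0.0) for j in range(d)]
         for i in range(d)]
    return solve(A, [dot(cols[i], y) for i in range(d)])

def fit_dual(X, y, lam):
    """alpha = (K + lam I)^{-1} y, using only kernel evaluations."""
    m = len(X)
    K = [[kernel(X[i], X[j]) + (lam if i == j else 0.0) for j in range(m)]
         for i in range(m)]
    return solve(K, y)

def predict_primal(theta, x_new):
    """Prediction theta^T phi(x_new), which needs the explicit features."""
    return dot(theta, phi(x_new))

def predict_dual(alpha, X, x_new):
    """Prediction sum_i alpha_i K(x^(i), x_new), with no call to phi."""
    return sum(a * kernel(x, x_new) for a, x in zip(alpha, X))

# Synthetic data (assumed for illustration): 8 training points in R^3.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(8)]
y = [rng.gauss(0, 1) for _ in range(8)]
lam = 0.5

theta = fit_primal(X, y, lam)
alpha = fit_dual(X, y, lam)
x_new = [0.3, -1.2, 0.7]
gap = abs(predict_primal(theta, x_new) - predict_dual(alpha, X, x_new))
print("primal and dual predictions agree:", gap < 1e-8)
```

The two predictions should agree up to floating-point error, which is what the identity $(\lambda I + \Phi^T \Phi)^{-1} \Phi^T = \Phi^T (\lambda I + \Phi \Phi^T)^{-1}$ guarantees. Note that the dual path solves an $m \times m$ system regardless of the feature dimension, which is what makes the approach usable when $\phi$ is very high-dimensional.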