CS 229, Autumn 2011
Problem Set #2: Naive Bayes, SVMs, and Theory

Due in class (9:30am) on Wednesday, November 2.

Notes: (1) These questions require thought, but do not require long answers. Please be as concise as possible. (2) When sending questions to [email protected], please make sure to write the homework number and the question number in the subject line, such as Hwk 2 Q4, and send a separate email per question. (3) If you missed the first lecture or are unfamiliar with the class's collaboration or honor code policy, please read the policy on Handout #1 (available from the course website) before starting work. (4) For problems that require programming, please include in your submission a printout of your code (with comments) and any figures that you are asked to plot. (5) Please indicate the submission time and number of late days clearly in your submission.

SCPD students: Please email your solutions to [email protected], and write "Problem Set 2 Submission" in the subject of the email. If you are writing your solutions out by hand, please write clearly and in a reasonably large font using a dark pen to improve legibility.

1. [15 points] Constructing kernels

In class, we saw that by choosing a kernel K(x, z) = φ(x)^T φ(z), we can implicitly map data to a high-dimensional space, and have the SVM algorithm work in that space. One way to generate kernels is to explicitly define the mapping φ to a higher-dimensional space, and then work out the corresponding K. In this question, however, we are interested in the direct construction of kernels. That is, suppose we have a function K(x, z) that we think gives an appropriate similarity measure for our learning problem, and we are considering plugging K into the SVM as the kernel function. For K(x, z) to be a valid kernel, however, it must correspond to an inner product in some higher-dimensional space resulting from some feature mapping φ.
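As a concrete illustration (not part of the problem set), the quadratic kernel K(x, z) = (x^T z)^2 on R^n corresponds to the explicit feature map φ(x) whose entries are all pairwise products x_i x_j. A minimal NumPy sketch verifying that the implicit and explicit computations agree:

```python
import numpy as np

def phi(x):
    # Explicit feature map for K(x, z) = (x^T z)^2:
    # phi(x) has one entry x_i * x_j for every ordered pair (i, j),
    # so it lives in R^(n^2).
    return np.outer(x, x).ravel()

x = np.array([1.0, 2.0, 3.0])
z = np.array([0.5, -1.0, 2.0])

implicit = np.dot(x, z) ** 2        # kernel trick: O(n) work
explicit = np.dot(phi(x), phi(z))   # explicit mapping: O(n^2) features

assert np.isclose(implicit, explicit)
```

The point of the kernel trick is exactly this gap in cost: the inner product in the n^2-dimensional feature space is computed with O(n) arithmetic, without ever forming φ(x).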
Mercer's theorem tells us that K(x, z) is a (Mercer) kernel if and only if for any finite set {x^(1), ..., x^(m)}, the matrix K is symmetric and positive semidefinite, where the square matrix K ∈ R^(m×m) is given by K_ij = K(x^(i), x^(j)). Now here comes the question: Let K_1, K_2 be kernels over R^n × R^n, let a ∈ R^+ be a positive real number, let f : R^n → R be a real-valued function, let φ : R^n → R^d be a function mapping from R^n to R^d, let K_3 be a kernel over R^d × R^d, and let p(x) be a polynomial over x with positive coefficients. For each of the functions K below, state whether it is necessarily a kernel. If you think it is, prove it; if you think it isn't, give a counterexample...