COS424/SML 302, Classification methods, February 20, 2019 (slides 27-37 of 57)

[Figure: separating hyperplane with normal vector $w$, offset $w_0$, and margin of width $2/\|w\|$.]

Fitting an SVM to training data

We can fit an SVM to training data $D = \{(x_1, z_1), \dots, (x_n, z_n)\}$, $z_i \in \{-1, +1\}$, by solving the following optimization problem:

$$\hat{w}, \hat{w}_0 = \arg\min_{w, w_0} \|w\|^2$$

subject to

$$w^T x_i + w_0 \ge +1 \quad \text{for positive examples } (z_i = +1),$$
$$w^T x_i + w_0 \le -1 \quad \text{for negative examples } (z_i = -1).$$

SVMs for binary classification

Given a fitted SVM ($w$ and $w_0$), compute

$$\eta^* = f(x^*) = w^T x^* + w_0 .$$

$\eta^*$ is proportional to the signed distance from the new point to the hyperplane; the distance itself is $\eta^* / \|w\|$. Our prediction for the class label is then $\hat{z}^* = \operatorname{sign}(\eta^*)$.

[Figure: the $\eta$ axis marked at $-1$ and $1$, with the regions $\eta < -1$ and $\eta > 1$ labeled.]
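The prediction rule above can be sketched in a few lines of NumPy. This is a minimal illustration, not the course's code: the weights `w`, the offset `w0`, and the test point `x_star` are hypothetical values chosen so the arithmetic is easy to check, and mapping eta = 0 to the label +1 is a tie-breaking assumption.

```python
import numpy as np

def decision_value(w, w0, x):
    """Return eta = w^T x + w0 for a single point x."""
    return float(np.dot(w, x) + w0)

def predict(w, w0, x):
    """Return the predicted label in {-1, +1}.

    eta == 0 is mapped to +1 here; that tie-break is an assumption.
    """
    return 1 if decision_value(w, w0, x) >= 0 else -1

# Hypothetical fitted parameters (not taken from the slides); ||w|| = 5.
w = np.array([3.0, 4.0])
w0 = -5.0
x_star = np.array([3.0, 4.0])

eta = decision_value(w, w0, x_star)        # 9 + 16 - 5 = 20
distance = eta / np.linalg.norm(w)         # signed distance to the hyperplane
margin_width = 2.0 / np.linalg.norm(w)     # the 2/||w|| from the figure
label = predict(w, w0, x_star)
```

Dividing eta by the norm of `w` is what turns the raw decision value into a geometric distance; the sign alone is enough for classification.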

Hinge loss and SVMs

We can define the problem of finding the maximum margin classifier as an optimization problem with respect to the training data.

[Figure: plot annotated "slope = w", with ticks at $-1$, $0$, and $1$ and the region $\eta < -1$ labeled.]

SVMs for binary classification

Poor predictions result in a larger value of the hinge loss, $\max(0, 1 - z\eta)$.

• When $|\eta| \ge 1$ and the sign of the prediction matches the truth $z$, the hinge loss is zero.
• When the prediction falls between the hyperplane and the margin (i.e., $0 \le |\eta| < 1$) and the sign matches, the hinge loss is small: at most 1.
• When the sign of the prediction does not match the truth, the hinge loss is large: greater than 1 and growing linearly with $|\eta|$.

[Figure: $\eta$ axes marked at $-1$ and $1$.]
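The three hinge-loss cases can be checked numerically. The sketch below assumes NumPy; the labels and decision values are made-up examples, one or two per case.

```python
import numpy as np

def hinge_loss(z, eta):
    """Elementwise hinge loss max(0, 1 - z * eta) for labels z in {-1, +1}."""
    return np.maximum(0.0, 1.0 - z * eta)

# Hypothetical labels and decision values:
#   correct and outside the margin, correct but inside the margin,
#   wrong sign, correct and outside the margin (negative class).
z = np.array([1.0, 1.0, 1.0, -1.0])
eta = np.array([2.0, 0.5, -1.0, -3.0])

losses = hinge_loss(z, eta)
```

For these inputs the second point (inside the margin, correct sign) has a loss between 0 and 1, and the third (wrong sign) has a loss above 1.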

SVM implicitly performs feature selection

SVMs perform feature selection: any point that does not lie on the margin does not play a role in the optimization problem. The support vectors are the points that define the margin.

[Figure: separating hyperplane with $w$, $w_0$, and margin width $2/\|w\|$. Where are the support vectors?]

SVM: optimization problem

This is formally defined as the following optimization problem:

$$\min_{w, w_0} \; \frac{1}{2}\|w\|^2 + c \sum_{i=1}^{n} (1 - z_i \eta_i)_+ ,$$

where $\eta_i = w^T x_i + w_0$ and $(a)_+ = \max(0, a)$, so each term in the sum is the hinge loss of example $i$. This expression is not differentiable at $z_i \eta_i = 1$.

[Figure: margin diagram with $w$, $w_0$, and width $2/\|w\|$, alongside the $\eta$ axis marked at $-1$ and $1$.]
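Although the objective is not differentiable everywhere, it does have a subgradient, so one simple way to minimize it is subgradient descent. The sketch below is an illustration under stated assumptions, not the method used in the course: the toy data, the constant `c = 1`, the step size, and the iteration count are all made up, and at the kink (z_i * eta_i = 1) it picks the zero subgradient for that term.

```python
import numpy as np

def svm_objective(w, w0, X, z, c):
    """0.5 * ||w||^2 + c * sum_i max(0, 1 - z_i * eta_i), with eta = X w + w0."""
    eta = X @ w + w0
    return 0.5 * np.dot(w, w) + c * np.sum(np.maximum(0.0, 1.0 - z * eta))

def svm_subgradient(w, w0, X, z, c):
    """One valid subgradient of the objective with respect to (w, w0)."""
    eta = X @ w + w0
    active = (z * eta) < 1.0  # examples with nonzero hinge loss
    grad_w = w - c * (z[active, None] * X[active]).sum(axis=0)
    grad_w0 = -c * z[active].sum()
    return grad_w, grad_w0

# Hypothetical, linearly separable toy data.
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0]])
z = np.array([1.0, 1.0, -1.0, -1.0])
c, step = 1.0, 0.01

w, w0 = np.zeros(2), 0.0
initial_objective = svm_objective(w, w0, X, z, c)
for _ in range(500):
    grad_w, grad_w0 = svm_subgradient(w, w0, X, z, c)
    w = w - step * grad_w
    w0 = w0 - step * grad_w0
final_objective = svm_objective(w, w0, X, z, c)

train_predictions = np.where(X @ w + w0 >= 0, 1.0, -1.0)
```

With a fixed step size the iterates hover near the minimizer rather than converging exactly; a decaying step size is the usual remedy.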

SVM: optimization problem, with errors

Let's include a slack term $\xi_i$: replace the hard constraint $z_i \eta_i \ge 1$ with the soft margin constraint $z_i \eta_i \ge 1 - \xi_i$ to allow mistakes in classification. Then we have the following optimization problem [Vapnik & Cortes 1995]:

$$\min_{w, w_0} \; \frac{1}{2}\|w\|^2 + c \sum_{i=1}^{n} \xi_i \quad \text{s.t.} \quad \xi_i \ge 0, \quad z_i (x_i^T w + w_0) \ge 1 - \xi_i .$$

This is a quadratic program, and it takes $O(n^2)$ time to solve. Its solution takes the form

$$\hat{w} = \sum_{i=1}^{n} \alpha_i z_i x_i ,$$

where $\alpha$ is sparse: $\alpha_i$ is nonzero only for the support vectors that define the margin.

SVMs: prediction revisited

Recall that prediction for $x^*$ is performed using:

$$\hat{z}(x^*) = \operatorname{sign}\left(\hat{w}_0 + \hat{w}^T x^*\right) = \operatorname{sign}\left(\hat{w}_0 + \Big(\sum_{i=1}^{n} \alpha_i z_i x_i\Big)^T x^*\right) = \operatorname{sign}\left(\hat{w}_0 + \sum_{i=1}^{n} \alpha_i z_i \, x_i^T x^*\right).$$
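The equivalence between the weight-vector form and the sum-over-training-points form of the prediction can be verified directly. In the sketch below the data, the coefficients `alpha`, and the offset `w0` are hypothetical values picked by hand for illustration (they are not the output of a solver run here); only two of the four coefficients are nonzero, to mimic sparsity.

```python
import numpy as np

# Hypothetical training data and dual coefficients.
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0]])
z = np.array([1.0, 1.0, -1.0, -1.0])
alpha = np.array([0.0625, 0.0, 0.0625, 0.0])  # nonzero only for two points
w0 = 0.0

# Weight vector recovered from the coefficients: w = sum_i alpha_i z_i x_i.
w = (alpha * z) @ X

x_star = np.array([1.0, -3.0])

# Prediction using w directly.
eta_primal = w0 + w @ x_star
# Prediction using only inner products x_i^T x_star with the training points.
eta_dual = w0 + (alpha * z) @ (X @ x_star)

label = 1 if eta_primal >= 0 else -1
support_indices = np.flatnonzero(alpha > 0).tolist()
```

The second form touches the training points only through the inner products `X @ x_star`, and points with a zero coefficient contribute nothing to the sum.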

SVMs: using kernels

Let's look at the form of this classifier:

$$\hat{z}(x^*) = \operatorname{sign}\left(\hat{w}_0 + \sum_{i=1}^{n} \alpha_i z_i \, x_i^T x^*\right)$$