This preview shows page 1. Sign up to view the full content.
Unformatted text preview: the data is on the
if side the distance is . is known. M aximum M argin Clas s ifie rs in the Line arly s e parable cas e
Choose the line farthest from both classes or choose the line which has the maximum distance from the closest point. (i.e maximize the margin)
where Fiqure 28.3 depicts a hyperplane defined by the equation
hyperplane is a line.
Let us rewrite
1. is label and is distance . Since they are in , the by using the following properties: is orthogonal to the hyperplane Two points lying on the hyperplane. Figure 28.3 The linear algebra of a hyperplane wikicour senote.com/w/index.php?title= Stat841&pr intable= yes 59/74 10/09/2013 Hence, Stat841  Wiki Cour se Notes is orthogonal to 2. For any point , and is the vector normal to the hyperplane. on the hyperplane, For any point on the hyperplane, multiplying by gives negative value of the intercept of the hyperplane. 3. The signed distance for any point to the hyperplane is
Since the length of is not known, let it be unit vector. . by property 2 We had , and since we now know how to compute Suppose is not on the hyperplane for Figure 28.4 This is known as the canonical representation of the decision hyperplane.
For only the direction is important, so does not change its direction, the hyperplane will be the same. which is equivalent as
Reference:
Hastie,T.,Tibshirani,R., Friedman,J.,(2008).The Elements of Statistical Learning:129 130
Exte ns ionM ulticlas s SVM [28] (http://e n.wikipe dia.org/wiki/Support_ve ctor_machine #M ulticlas s _SVM )
SVM is only directly applicable for two class case. We want to generalize this algorithm to multi class tasks.
wikicour senote.com/w/index.php?title= Stat841&pr intable= yes 60/74 10/09/2013 Stat841  Wiki Cour se Notes Multiclass SVM aims to assign labels to instances by using support vector machines, where the labels are drawn from a finite set of several elements. The dominating
approach for doing so is to reduce the single multiclass problem into multiple binary problems. Each of the problems yields a binary classifier, which is assumed to produce
an output function that gives relatively large values for examples from the positive class and relatively small values for examples belong...
View
Full
Document
This document was uploaded on 03/07/2014.
 Winter '13

Click to edit the document details