ucker.pdf)) give us a closer look into the Lagrangian equation and the associated conditions.
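Suppose we are looking to minimize $f(x)$ such that $g_i(x) \geq 0$ for all $i$. If $f$ and $g_i$ are differentiable, then the necessary conditions for $\hat{x}$ to be a local minimum are:
1. At the optimal point, $\frac{\partial L}{\partial x}(\hat{x}) = 0$; i.e. $f'(\hat{x}) - \sum_i \alpha_i g_i'(\hat{x}) = 0$.
2. $\alpha_i \geq 0$. (Dual Feasibility)
3. $\alpha_i g_i(\hat{x}) = 0$. (Complementary Slackness)
4. $g_i(\hat{x}) \geq 0$. (Primal Feasibility)
If any of these conditions is violated at a candidate point, then that point is rejected as a solution.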
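To make the four conditions concrete, here is a minimal sketch on a toy problem that is not from the notes: minimize $f(x) = x^2$ subject to $g(x) = x - 1 \geq 0$. It assumes the Lagrangian is written as $L = f - \alpha g$, so stationarity reads $f'(x) - \alpha g'(x) = 0$. The function name `kkt_report` and the tolerance `tol` are hypothetical, chosen only for illustration.

```python
def kkt_report(x, alpha, tol=1e-9):
    """Check the four KKT conditions for: minimize x^2 subject to x - 1 >= 0.

    Assumes the Lagrangian L(x, alpha) = f(x) - alpha * g(x).
    """
    f_prime = 2.0 * x   # derivative of f(x) = x^2
    g = x - 1.0         # constraint value g(x)
    g_prime = 1.0       # derivative of g(x) = x - 1
    return {
        "stationarity": abs(f_prime - alpha * g_prime) <= tol,
        "dual_feasibility": alpha >= -tol,
        "complementary_slackness": abs(alpha * g) <= tol,
        "primal_feasibility": g >= -tol,
    }


# Candidate on the constraint boundary: x = 1 with multiplier alpha = 2.
report_opt = kkt_report(1.0, 2.0)
# Feasible candidate that is not a minimum: x = 2 with alpha = 0.
report_bad = kkt_report(2.0, 0.0)

print(report_opt)
print(report_bad)
```

At $x = 1$, $\alpha = 2$ all four checks pass. At $x = 2$, $\alpha = 0$ the point is feasible but stationarity fails, so it is not a minimum.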
These are all trivial except for condition 3. Let's examine it further in our support vector machine problem.

Support Vectors
Support vectors are the training points that determine the optimal separating hyperplane that we seek. They are also the most difficult points to classify and the most informative for the classification.
In our case, the function $g_i(x)$ is: $g_i(x) = y_i(\beta^T x_i + \beta_0) - 1$.
Substituting $g_i$ into KKT condition 3, we get $\alpha_i \left[ y_i(\beta^T x_i + \beta_0) - 1 \right] = 0$.
In order for this condition to be satisfied, either $\alpha_i = 0$ or $y_i(\beta^T x_i + \beta_0) = 1$.
Every point satisfies $y_i(\beta^T x_i + \beta_0) \geq 1$, i.e. it lies either on the margin or beyond it.

Case 1: a point away from the margin
If $y_i(\beta^T x_i + \beta_0) > 1$, then $\alpha_i = 0$. If point $x_i$ is not on the margin, then the corresponding $\alpha_i = 0$.

Case 2: a point on the margin
If $y_i(\beta^T x_i + \beta_0) = 1$, then $\alpha_i$ is allowed to be positive. If point $x_i$ is on the margin, then the corresponding $\alpha_i \geq 0$, and typically $\alpha_i > 0$.

Points on the margin, with corresponding $\alpha_i > 0$, are called support vectors.

Using support vectors
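As a concrete illustration of how support vectors are identified and used, the sketch below checks complementary slackness on a hypothetical four-point data set. The data, the hyperplane `beta`, `beta0` and the multipliers `alpha` are assumed values for this toy case, not taken from the notes. The last step relies on the standard hard-margin SVM relation $\beta = \sum_i \alpha_i y_i x_i$, which is not derived in this section.

```python
# Hypothetical 2-D training set with labels y_i in {+1, -1}.
X = [(1.0, 0.0), (3.0, 0.0), (-1.0, 0.0), (-4.0, 1.0)]
y = [1, 1, -1, -1]

# Assumed hard-margin solution for this toy set: the hyperplane x_1 = 0.
beta = (1.0, 0.0)
beta0 = 0.0
alpha = [0.5, 0.0, 0.5, 0.0]  # assumed Lagrange multipliers


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


# y_i (beta^T x_i + beta_0) for every training point.
margins = [yi * (dot(beta, xi) + beta0) for xi, yi in zip(X, y)]

# Constraint values g_i = y_i (beta^T x_i + beta_0) - 1.
g = [m - 1.0 for m in margins]

# KKT condition 3: alpha_i * g_i must vanish for every point.
slack_products = [a * gi for a, gi in zip(alpha, g)]

# Support vectors are the points with alpha_i > 0.
support_idx = [i for i, a in enumerate(alpha) if a > 0]

# Rebuild beta from the support vectors only: beta = sum_i alpha_i y_i x_i.
beta_from_sv = tuple(
    sum(alpha[i] * y[i] * X[i][d] for i in support_idx) for d in range(2)
)

print("margins:", margins)
print("support vector indices:", support_idx)
print("beta rebuilt from support vectors:", beta_from_sv)
```

The two points with margin exactly 1 carry the nonzero multipliers, and summing over them alone recovers `beta`. The other two points have $\alpha_i = 0$ and drop out.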
Support vectors are important because they allow the support vector machine algorithm to be insensitive to outliers that lie beyond the margin. If $\alpha_i = 0$, then the corresponding term of the cost function is also 0, and point $x_i$ won't contribute to the solution of the SVM problem; only points on the margin (the support vectors) contribute. Hence the model given by SVM is entirely defined by the set of support vectors, a subset of the entire training set. This is interesting because in the neural network (NN) methods covered previously (and this generalizes to classical statistical learning) the configuration of the network needed to be specified. In this case we have a data-driven or 'nonparametric' model in which the training set and the algorithm determine the support vectors, instead of fitting a set of parameters using cross-validation (CV) or other error minimization functions.
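The insensitivity to non-support points can be sketched as follows. The helper `satisfies_kkt`, the data, and the candidate solution are hypothetical, and the stationarity equations $\beta = \sum_i \alpha_i y_i x_i$ and $\sum_i \alpha_i y_i = 0$ are the standard hard-margin SVM ones, assumed here rather than derived in this section. Because the hard-margin problem is convex, a point that passes all KKT checks is optimal, so the same solution passing before and after a point is moved suggests that the point did not influence it.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def satisfies_kkt(X, y, alpha, beta, beta0, tol=1e-9):
    """Return True if (beta, beta0, alpha) passes the hard-margin SVM KKT checks."""
    n, dim = len(X), len(beta)
    # Constraint values g_i = y_i (beta^T x_i + beta_0) - 1.
    g = [y[i] * (dot(beta, X[i]) + beta0) - 1.0 for i in range(n)]
    # Stationarity: beta = sum_i alpha_i y_i x_i and sum_i alpha_i y_i = 0.
    rebuilt = [sum(alpha[i] * y[i] * X[i][d] for i in range(n)) for d in range(dim)]
    stationarity = (
        all(abs(rebuilt[d] - beta[d]) <= tol for d in range(dim))
        and abs(sum(alpha[i] * y[i] for i in range(n))) <= tol
    )
    dual_feasible = all(a >= -tol for a in alpha)
    comp_slack = all(abs(alpha[i] * g[i]) <= tol for i in range(n))
    primal_feasible = all(gi >= -tol for gi in g)
    return stationarity and dual_feasible and comp_slack and primal_feasible


# Hypothetical data and assumed solution (hyperplane x_1 = 0).
X = [(1.0, 0.0), (3.0, 0.0), (-1.0, 0.0), (-4.0, 1.0)]
y = [1, 1, -1, -1]
beta, beta0 = (1.0, 0.0), 0.0
alpha = [0.5, 0.0, 0.5, 0.0]

# Move the non-support point (index 1) far away, still beyond the margin.
X_moved = list(X)
X_moved[1] = (30.0, 5.0)

# Move the same point inside the margin: the old solution should now fail.
X_broken = list(X)
X_broken[1] = (0.5, 0.0)

original_ok = satisfies_kkt(X, y, alpha, beta, beta0)
moved_ok = satisfies_kkt(X_moved, y, alpha, beta, beta0)
broken_ok = satisfies_kkt(X_broken, y, alpha, beta, beta0)

print(original_ok, moved_ok, broken_ok)
```

Moving the point with $\alpha_i = 0$ far from the boundary leaves every check satisfied, whereas pushing it inside the margin violates primal feasibility, so the hyperplane would have to change.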
References: Wang, L, 2005. Support Vector Machines: Theory and Applications, Springer, 3
This document was uploaded on 03/07/2014.
 Winter '13
