ing to the negative class. Two common methods build such binary classifiers so that each classifier distinguishes either (i) one of the labels from the rest (one-versus-all) or (ii) one class from another, for every pair of classes (one-versus-one). Classification of new instances in the one-versus-all case is done by a winner-takes-all strategy, in which the classifier with the highest output function assigns the class (it is important that the output functions be calibrated to produce comparable scores). For the one-versus-one approach, classification is done by a max-wins voting strategy, in which every classifier assigns the instance to one of its two classes, the vote count for the assigned class is increased by one, and finally the class with the most votes determines the instance classification.

Optimizing The Support Vector Machine - November 16th, 2009
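We now derive the Support Vector Machine for the case where the two classes are separable in the given feature space. This margin can be written as $\min_i \{ y_i d_i \}$, the smallest distance of any point from the hyperplane, where $d_i$ is the distance and $y_i$ is used as the sign.

The margin, and with it the Margin Maximizing Problem for the Support Vector Machine, can be rewritten as

$$\min_i \left\{ \frac{y_i(\beta^T x_i + \beta_0)}{\|\beta\|} \right\}.$$

Note that the term $y_i(\beta^T x_i + \beta_0) = 0$ if $x_i$ is on the hyperplane, but $y_i(\beta^T x_i + \beta_0) > 0$ if $x_i$ is not on the hyperplane. This implies $\exists\, C > 0$ such that $y_i(\beta^T x_i + \beta_0) \geq C$ for every training point.

Divide through by $C$ to produce $y_i\left(\frac{\beta^T}{C} x_i + \frac{\beta_0}{C}\right) \geq 1$. $\beta$ and $\beta_0$ compose a hyperplane and can take different values, but we care only about the direction; dividing through by a constant does not change the direction of the hyperplane. Thus, by assuming scaled values for $\beta$ and $\beta_0$ we eliminate $C$, so that $y_i(\beta^T x_i + \beta_0) \geq 1$, implying that the lower bound on $y_i(\beta^T x_i + \beta_0)$ is $1$.

Now in order to maximize the margin, we simply need to find $\max \frac{1}{\|\beta\|}$, which is equivalent to $\min \|\beta\|$. In other words, our optimization problem is now to find $\min \|\beta\|$, under the constraint that $y_i(\beta^T x_i + \beta_0) \geq 1$ for all $i$.

Note that we're dealing with the norm of $\beta$. There are many different choices of possible norms, in general the p-norm (http://en.wikipedia.org/wiki/P-norm#p-norm), $\|\beta\|_p = \left(\sum_j |\beta_j|^p\right)^{1/p}$. The 1-norm of a vector is simply the sum of the absolute value of each element (also known as the taxicab or Manhattan distance), and is apparently more accurate, but also ha...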
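To make the rescaling argument concrete, here is a minimal numerical sketch in Python with NumPy. It writes the hyperplane as $\beta^T x + \beta_0 = 0$ and uses a small made-up data set and an arbitrary (not optimal) separating hyperplane; the data values and the helper names `functional_margins` and `geometric_margin` are our own assumptions for illustration, not part of the notes. It checks that dividing $\beta$ and $\beta_0$ by $C$ leaves the margin unchanged and makes it equal to $1/\|\beta\|$, and then compares the 1-norm and 2-norm of the scaled $\beta$.

```python
import numpy as np

# Toy, linearly separable data (hypothetical values chosen for illustration).
X = np.array([[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [-1.0, 0.0]])
y = np.array([1, 1, -1, -1])

# An arbitrary separating hyperplane beta^T x + beta_0 = 0 (not the optimal one).
beta = np.array([2.0, 2.0])
beta_0 = -4.0


def functional_margins(beta, beta_0, X, y):
    """Return y_i * (beta^T x_i + beta_0) for every training point."""
    return y * (X @ beta + beta_0)


def geometric_margin(beta, beta_0, X, y):
    """Return min_i y_i * (beta^T x_i + beta_0) / ||beta||_2."""
    return functional_margins(beta, beta_0, X, y).min() / np.linalg.norm(beta)


# C is the smallest functional margin; it is positive because the
# hyperplane separates the two classes.
C = functional_margins(beta, beta_0, X, y).min()
margin_before = geometric_margin(beta, beta_0, X, y)

# Divide through by C: same hyperplane, but now min_i y_i * f(x_i) = 1.
beta_s, beta_0_s = beta / C, beta_0 / C
scaled_min = functional_margins(beta_s, beta_0_s, X, y).min()
margin_after = geometric_margin(beta_s, beta_0_s, X, y)

# Two choices of norm for the scaled beta.
norm_1 = np.linalg.norm(beta_s, ord=1)  # sum of absolute values
norm_2 = np.linalg.norm(beta_s, ord=2)  # Euclidean length

print("C =", C)
print("margin before scaling =", margin_before)
print("margin after scaling  =", margin_after)
print("1 / ||beta_s||_2      =", 1.0 / norm_2)
print("1-norm, 2-norm        =", norm_1, norm_2)
```

The sketch only evaluates the margin of one fixed hyperplane; actually minimizing $\|\beta\|$ under the constraints requires a constrained optimizer, which is outside what this example shows.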
This document was uploaded on 03/07/2014.
- Winter '13