Stat841f09 - Wiki Course Notes

# For the one versus one approach classification is

This preview shows page 1. Sign up to view the full content.

This is the end of the preview. Sign up to access the rest of the document.

Unformatted text preview: ing to the negative class. Two common methods to build such binary classifiers are where each classifier distinguishes between (i) one of the labels to the rest (one- versus- all) or (ii) between every pair of classes (one- versus- one). Classification of new instances for one- versus- all case is done by a winner- takes- all strategy, in which the classifier with the highest output function assigns the class (it is important that the output functions be calibrated to produce comparable scores). For the one- versus- one approach, classification is done by a max- wins voting strategy, in which every classifier assigns the instance to one of the two classes, then the vote for the assigned class is increased by one vote, and finally the class with most votes determines the instance classification. Optimizing The Support Vector Machine - November 16th, 2009 We currently derive Support Vector Machine for the case where two classes are separable in the given feature space. This margin can be written as distance of each point from the hyperplane, where is the distance and is used as the sign. , or the Margin Maximizing Problem for the Support Vector Machine can be rewritten as Note that the term This implies if . is on the hyperplane, but such that if is not on the hyperplane. . Divide through by C to produce . compose a hyperplane that can have different values but we care about the direction, dividng through by a constant does not change the direction of the hyperplane. Thus, by assuming scaled values for we eliminate C, so that . Implying that the lower bound on Now in order to maximize the margin, we simply need to find In other words, our optimization problem is now to find maximum is . , under the constraint that . Note that we're dealing with the norm of . There are many different choices of possible norms, in general p- norm (http://en.wikipedia.org/wiki/P- norm#p- norm) . The 1norm of a vector is simply the sum of the absolute value of each element (also known as the taxicab or Manhattan distance), and is apparently more accurate, but also ha...
View Full Document

## This document was uploaded on 03/07/2014.

Ask a homework question - tutors are online