The SVM algorithm determines a hyperplane that lies between the positive and negative examples of the training set. The parameters b_j are adapted so that the distance ξ, called the margin, between the hyperplane and the closest positive and negative example documents is maximized, as shown in Fig. 3.1.5. This amounts to a constrained quadratic optimization problem that can be solved efficiently for a large number of input vectors.

Figure 2: Hyperplane with maximal distance (margin) to examples of positive and negative classes constructed by the support vector machine. [Labels in the figure: documents of class 1, hyperplane, margin.]

The documents at distance ξ from the hyperplane are called support vectors and determine the actual location of the hyperplane. Usually only a small fraction of the documents are support vectors. A new document with term vector t_d is assigned to class L1 if f(t_d) > 0 and to class L2 otherwise. In case that the document vectors of the two c...
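The classification rule and the notion of support vectors described above can be illustrated with a minimal sketch. It assumes a linear decision function of the form f(t) = b_0 + sum_j b_j t_j; the offset b_0, the function names, and the toy document vectors are all illustrative assumptions, not taken from the text. The hyperplane is hand-picked for the toy data and is not obtained by solving the quadratic optimization problem.

```python
import math

def decision_value(b0, b, t):
    """Assumed linear decision function f(t) = b0 + sum_j b[j] * t[j]."""
    return b0 + sum(bj * tj for bj, tj in zip(b, t))

def classify(b0, b, t):
    """Assign class L1 if f(t) > 0 and class L2 otherwise."""
    return "L1" if decision_value(b0, b, t) > 0 else "L2"

def distance_to_hyperplane(b0, b, t):
    """Euclidean distance of t from the hyperplane f(t) = 0."""
    norm = math.sqrt(sum(bj * bj for bj in b))
    return abs(decision_value(b0, b, t)) / norm

def support_vectors(b0, b, docs, tol=1e-9):
    """Return the smallest distance to the hyperplane and the documents attaining it."""
    dists = [distance_to_hyperplane(b0, b, t) for t in docs]
    margin = min(dists)
    closest = [t for t, d in zip(docs, dists) if d - margin <= tol]
    return margin, closest

# Toy two-term document vectors (made-up values for illustration).
POSITIVES = [(2.0, 2.0), (3.0, 3.0), (3.0, 2.0)]
NEGATIVES = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
DOCS = POSITIVES + NEGATIVES

# Hand-picked hyperplane t_1 + t_2 - 2.5 = 0 (assumed, not optimized here).
B0, B = -2.5, (1.0, 1.0)

MARGIN, CLOSEST = support_vectors(B0, B, DOCS)
print("margin:", round(MARGIN, 4))
print("closest documents:", CLOSEST)
print("new document (2.5, 1.0) ->", classify(B0, B, (2.5, 1.0)))
```

In this toy setting only three of the six documents lie at the minimal distance from the hyperplane, which mirrors the observation that usually only a small fraction of documents are support vectors.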
