The SVM algorithm determines a hyperplane which is located between the
positive and negative examples of the training set. The parameters b_j are adapted
in such a way that the distance ξ – called margin – between the hyperplane and
the closest positive and negative example documents is maximized, as shown in
Fig. 3.1.5. This amounts to a constrained quadratic optimization problem which
can be solved efficiently for a large number of input vectors.
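
To make the margin criterion concrete, the following minimal sketch computes the margin of a given hyperplane on a toy training set and compares two candidate hyperplanes. It assumes a linear decision function of the form f(t) = b0 + sum_j b_j * t_j, which this excerpt does not state explicitly. The function names and the toy data are hypothetical, and the sketch only evaluates margins; it does not solve the quadratic optimization problem itself.

```python
import math

def decision_value(b0, b, t):
    """Assumed linear decision function f(t) = b0 + sum_j b[j] * t[j]."""
    return b0 + sum(bj * tj for bj, tj in zip(b, t))

def margin(b0, b, docs, labels):
    """Smallest signed distance of any training document to the hyperplane.

    Labels are +1 or -1. A negative result means that at least one
    document lies on the wrong side of the hyperplane.
    """
    norm = math.sqrt(sum(bj * bj for bj in b))
    return min(y * decision_value(b0, b, t) / norm
               for t, y in zip(docs, labels))

# Toy two-term "documents" (hypothetical data, not taken from the text).
docs = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (-1.0, 0.0)]
labels = [+1, +1, -1, -1]

# Two candidate hyperplanes that both separate the toy data.
margin_a = margin(-2.0, (1.0, 1.0), docs, labels)  # hyperplane t1 + t2 = 2
margin_b = margin(-1.0, (1.0, 0.0), docs, labels)  # hyperplane t1 = 1

print(margin_a, margin_b)
```

Both candidates classify every toy document correctly, but the first one keeps a larger distance to the closest examples, so a maximum-margin criterion would prefer it.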
[Figure 2: Hyperplane with maximal distance (margin) to examples of positive and negative classes constructed by the support vector machine. Figure labels: documents of class 1, hyperplane, margin.]
The documents having distance ξ from the hyperplane are called support
vectors and determine the actual location of the hyperplane. Usually only a
small fraction of documents are support vectors. A new document with term
vector t_d is classified into L_1 if f(t_d) > 0 and into L_2 otherwise. In
case that the document vectors of the two c...