This preview shows page 1. Sign up to view the full content.
Unformatted text preview: k that the ﬁnal
decision depends only on relatively few terms. A decisive improvement may be
achieved by boosting decision trees (Schapire & Singer 1999), i.e. determining a
set of complementary decision trees constructed in such a way that the overall
error is reduced. Schapire & Singer (2000) use even simpler one step decision
trees containing only one rule and get impressive results for text classiﬁcation.
3.1.5 Support Vector Machines and Kernel Methods A Support Vector Machine (SVM) is a supervised classiﬁcation algorithm that
recently has been applied successfully to text classiﬁcation tasks (Joachims 1998;
Dumais et al. 1998; Leopold & Kindermann 2002). As usual a document d
is represented by a – possibly weighted – vector (td1 , . . . , tdN ) of the counts of
its words. A single SVM can only separate two classes — a positive class L1
(indicated by y = +1) and a negative class L2 (indicated by y = −1). In the
space of input vectors a hyperplane may be deﬁned by setting y = 0 in the
following linear equation.
y = f (td ) = b0 + 34 N ∑ bj tdj j =1 LDV-FORUM A Brief Survey o...
View Full Document
- Summer '11