…the easy and hard points. Thus AdaBoost focuses on the more informative or difficult points.
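As a concrete sketch of this reweighting (assuming 1-D data, threshold stumps as the weak learners, and the standard exponential-loss weight update; these specifics are not stated above):

```python
import numpy as np

def stump_predict(x, thresh, sign):
    """Decision stump: predict `sign` where x > thresh, else -sign."""
    return np.where(x > thresh, sign, -sign)

def adaboost(x, y, rounds=10):
    """AdaBoost with threshold stumps on 1-D data: misclassified
    ("hard") points get larger weights each round, so later stumps
    concentrate on them."""
    m = len(x)
    w = np.full(m, 1.0 / m)              # start with uniform weights
    ensemble = []                        # (alpha, thresh, sign) triples
    for _ in range(rounds):
        best = None                      # stump with least weighted error
        for thresh in x:
            for sign in (1, -1):
                err = w[stump_predict(x, thresh, sign) != y].sum()
                if best is None or err < best[0]:
                    best = (err, thresh, sign)
        err, thresh, sign = best
        err = min(max(err, 1e-10), 1 - 1e-10)    # guard the log
        alpha = 0.5 * np.log((1 - err) / err)    # classifier weight
        w *= np.exp(-alpha * y * stump_predict(x, thresh, sign))
        w /= w.sum()                     # renormalize to a distribution
        ensemble.append((alpha, thresh, sign))
    return ensemble

def predict(ensemble, x):
    """Weighted vote of the stumps."""
    return np.sign(sum(a * stump_predict(x, t, s) for a, t, s in ensemble))

x = np.array([0., 1., 2., 3., 4., 5.])
y = np.array([1, 1, -1, -1, 1, 1])       # no single stump can fit this
print(predict(adaboost(x, y), x))        # matches y after a few rounds
```

No single stump classifies this toy set correctly, but the reweighting drives the ensemble to zero training error within a few rounds: each round the most recently misclassified pair of points carries the most weight, so the next stump fixes it.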
Many boosting algorithms belong to a class called AnyBoost: a family of gradient descent algorithms for choosing linear combinations of elements of an inner product space in order to minimize some cost functional.
We are primarily interested in voted combinations of classifiers
$$H(x) = \sum_{j} \alpha_j h_j(x),$$
where the $h_j$ are weak base classifiers from some class $\mathcal{H}$ and the $\alpha_j$ are classifier weights. We want to find $H$ such that the cost functional
$$C(H) = \frac{1}{m} \sum_{i=1}^{m} c(y_i H(x_i))$$
is minimized for a suitable cost function $c$. The margin of an example $(x_i, y_i)$ is defined by $y_i H(x_i)$. The base hypotheses $h$ and their linear combinations $H$ can be considered to be elements of an inner product function space. We define the inner product as
$$\langle F, G \rangle = \frac{1}{m} \sum_{i=1}^{m} F(x_i) G(x_i),$$
but the AnyBoost algorithm is valid for any cost function and inner product. Suppose we have a function $H$ as a linear combination of base classifiers and wish to add a base classifier $h$ to $H$ so that the cost $C(H + \epsilon h)$ decreases for arbitrarily small $\epsilon$. The direction we seek is found by maximizing $-\langle \nabla C(H), h \rangle$, i.e. the base classifier closest to the negative functional gradient. AnyBoost algorithm:
1. Start with $H_0 := 0$.
2. For $t = 0, 1, \ldots, T$:
   1. Find $h_{t+1} \in \mathcal{H}$ that maximizes the inner product $-\langle \nabla C(H_t), h_{t+1} \rangle$.
   2. If $-\langle \nabla C(H_t), h_{t+1} \rangle \le 0$, return $H_t$.
   3. Choose a step size $\alpha_{t+1}$ and set $H_{t+1} := H_t + \alpha_{t+1} h_{t+1}$.
3. The final classifier is $H_{T+1}$.

Other voting methods, including AdaBoost, can be viewed as special cases of this algorithm.

Bagging
Bagging, or bootstrap aggregating (http://en.wikipedia.org/wiki/Bootstrap_aggregating), is another meta-technique used to reduce the variance of classifiers with high variability. It exploits the fact that a bootstrap mean is approximately equal to the posterior average. It is most effective for highly nonlinear classifiers such as decision trees: because these classifiers are highly unstable, they stand to benefit the most from bagging.
The idea is to train $B$ classifiers using $B$ bootstrap samples from the data set. The final classification is obtained using an average or 'plurality vote' of the $B$ classifiers. Many classifiers, such as trees, already have underlying functions that estimate the class probabilities at a point $x$. An alternative…
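A minimal sketch of the bootstrap-and-vote procedure (assuming threshold stumps on 1-D data as the base learner for brevity; as noted above, unstable learners like decision trees are the typical choice in practice):

```python
import numpy as np

def train_stump(x, y):
    """Fit the best threshold stump to (x, y)."""
    best = None
    for t in x:
        for s in (1, -1):
            err = np.mean(np.where(x > t, s, -s) != y)
            if best is None or err < best[0]:
                best = (err, t, s)
    return best[1], best[2]

def bagging(x, y, B=25, seed=0):
    """Train B classifiers on B bootstrap samples (drawn with
    replacement, same size as the original data set)."""
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(B):
        idx = rng.integers(0, len(x), size=len(x))   # bootstrap sample
        models.append(train_stump(x[idx], y[idx]))
    return models

def predict(models, x):
    """Plurality vote: the sign of the summed votes."""
    votes = sum(np.where(x > t, s, -s) for t, s in models)
    return np.sign(votes)

x = np.array([0., 1., 2., 3., 4., 5.])
y = np.array([-1, -1, -1, 1, 1, 1])
models = bagging(x, y)
print(predict(models, x))
```

Each bootstrap sample omits roughly a third of the points, so the individual stumps disagree near the class boundary; the vote averages that variability away.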
This document was uploaded on 03/07/2014 (Winter '13).