An Example: Text Classification
Each example is a text document.
The label is the type of document (e.g. articles I will find interesting).
Give an algorithm based on naive Bayes, which is very effective.
Two key decisions:
text document needs to be converted to attri
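
A minimal sketch of naive Bayes text classification, under assumptions the notes do not state: each document is represented by its lowercase, whitespace-split words, and word probabilities are estimated with add-one (Laplace) smoothing. The function names and the toy documents are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumed representation: lowercase words split on whitespace.
    return text.lower().split()


def train_naive_bayes(docs, labels):
    """Collect class counts, per-class word counts, and the vocabulary."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        words = tokenize(text)
        word_counts[label].update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab


def classify(text, model):
    """Return the label with the highest log posterior score."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # skip words never seen in training
            # Add-one smoothing keeps unseen (word, label) pairs nonzero.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


docs = [
    "machine learning theory",
    "learning algorithms and theory",
    "football match results",
    "football league match",
]
labels = ["interesting", "interesting", "boring", "boring"]
model = train_naive_bayes(docs, labels)
print(classify("learning theory", model))
```

Working in log space avoids underflow when many small word probabilities are multiplied.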
Boosting
(NOTE: Ep = Epsilon)
Let's begin by considering two questions
1. Suppose you are given a PAC algorithm A1 that works for any Ep, but only for
delta = 1/2.
That is, in poly time it outputs a hyp h with error[D](h) <= Ep with prob >= 1/2
2. Suppose y
Evaluating Hypotheses (chap 5)
With enough data this is easily handled using a large validation set. Focus here is on
doing this when data is limited. Two key difficulties:
1. Bias in estimate - observed accuracy over training exs often poor estimator due
Computational learning theory
Goal: Identify concept classes that are inherently difficult or easy to learn. For a
concept class C, we want to characterize the number of training examples necessary to
reach desired accuracy (with respect to the unknown d
Two-sided and one-sided bounds
Two-sided bound with N% confidence is
error[s](h) - Z[N]*delta <= error[D](h) <= error[s](h) + Z[N]*delta
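
A small sketch of computing the two-sided bound, assuming delta is the estimated standard deviation of the sample error, sqrt(e(1-e)/n) with e = error[s](h) over n test examples, and that Z[N] comes from the standard normal distribution. The function name and the numbers in the example are hypothetical.

```python
import math
from statistics import NormalDist


def two_sided_interval(sample_error, n, confidence_pct):
    """Approximate N% two-sided interval for error[D](h)."""
    # Z[N]: half-width, in standard deviations, enclosing N% of a normal.
    z = NormalDist().inv_cdf(0.5 + confidence_pct / 200.0)
    # Assumption: delta is the estimated std dev of error[s](h).
    delta = math.sqrt(sample_error * (1.0 - sample_error) / n)
    return sample_error - z * delta, sample_error + z * delta


# Hypothetical case: h misclassifies 12 of 40 test examples.
low, high = two_sided_interval(0.30, 40, 95)
print(round(low, 3), round(high, 3))
```

The normal approximation is usually considered reasonable only when n is not too small and the error is not too close to 0 or 1.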
Suppose you just want to say that error[D](h) <= x. Then you just need a one-sided
bound.
Let a=1-N/100
Then there is a
Gradient Descent ({<x->, t>}, n) (where <x->, t> are the training exs)
Initialize each wi to some small random value like -.05 to .05
Repeat until termination condition is met
  o Initialize each delta(wi) to zero
  o For each training ex <x->, t>
      Compute o
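
A minimal sketch of the procedure, with several assumptions since the notes break off here: the unit is linear (o = w . x), n is the learning rate (eta), the update is the standard delta rule delta(wi) += eta*(t - o)*xi, and the termination condition is a fixed number of passes. All names are illustrative.

```python
import random


def gradient_descent(training_examples, eta, epochs=1000, seed=0):
    """Batch gradient descent for an assumed linear unit o = w . x."""
    rng = random.Random(seed)
    n_inputs = len(training_examples[0][0])
    # Initialize each wi to a small random value in [-0.05, 0.05].
    w = [rng.uniform(-0.05, 0.05) for _ in range(n_inputs)]
    for _ in range(epochs):  # assumed termination: fixed number of passes
        delta_w = [0.0] * n_inputs  # initialize each delta(wi) to zero
        for x, t in training_examples:
            # Compute the output o of the linear unit.
            o = sum(wi * xi for wi, xi in zip(w, x))
            # Assumed delta rule: accumulate eta * (t - o) * xi.
            for i in range(n_inputs):
                delta_w[i] += eta * (t - o) * x[i]
        # Apply the accumulated update after seeing every example.
        w = [wi + dwi for wi, dwi in zip(w, delta_w)]
    return w


# x[0] = 1 is a constant bias input; targets follow t = 1 + 2 * x[1].
examples = [
    ((1.0, 0.0), 1.0),
    ((1.0, 1.0), 3.0),
    ((1.0, 2.0), 5.0),
    ((1.0, 3.0), 7.0),
]
weights = gradient_descent(examples, eta=0.05)
print([round(wi, 4) for wi in weights])
```

If eta is too large the updates overshoot and diverge, so in practice it is tuned or decayed.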
Artificial Neural Networks (ANN)
Robust (i.e. noise-tolerant) approach to approximating real-valued, discrete-valued, or
vector-valued target functions. There are lots of things that humans do well that
computers can't do well. ANN is best known algorithm fo
Concept Learning and Version Spaces
Concept Learning: inferring a Boolean-valued function from training examples of its
input and output (supervised learning)
-label is + or - (Boolean)
-things are described by their properties
ex. Regarding the property of
Decision Tree Learning
One of the most widely used inductive inference methods. Provides a method for
approximating discrete-valued target functions.
Nice feature of decision trees is that they can be interpreted by humans.
Sample DT:
can express as a disjunct
Variations of basic decision tree algorithm
Avoiding overfitting:
Pre-pruning: stop growing the dt before it begins overfitting (before it perfectly
classifies the training data)
stop growing when the information gain is less than some fixed constant (E)
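
A short sketch of the information-gain stopping test, assuming entropy-based gain over categorical attributes. The names `information_gain`, `should_split`, and `min_gain` (standing in for the fixed constant E), and the toy data, are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy in bits of a list of class labels."""
    total = len(labels)
    counts = Counter(labels).values()
    return -sum((c / total) * math.log2(c / total) for c in counts)


def information_gain(rows, labels, attribute):
    """Entropy reduction from splitting the examples on one attribute."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    remainder = sum(
        len(part) / len(labels) * entropy(part)
        for part in partitions.values()
    )
    return entropy(labels) - remainder


def should_split(rows, labels, attribute, min_gain):
    # Pre-pruning rule: stop growing when the gain is below the constant.
    return information_gain(rows, labels, attribute) >= min_gain


rows = [
    {"wind": "weak", "humidity": "high"},
    {"wind": "weak", "humidity": "normal"},
    {"wind": "strong", "humidity": "high"},
    {"wind": "strong", "humidity": "normal"},
]
labels = ["+", "+", "-", "-"]
print(should_split(rows, labels, "wind", min_gain=0.1))
print(should_split(rows, labels, "humidity", min_gain=0.1))
```

Here "wind" separates the classes perfectly while "humidity" carries no information, so only the first split passes the threshold.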
Example for reinforcement learning: Playing Checkers
Task: playing checkers (and winning)
Performance: % games won against opponent (human)
Experience: practice against self
If given the quality of each move, then you would have supervised, on-line learni
What is Machine Learning? Here is the definition given by Tom Mitchell.
Machine Learning: Any computer program that improves its performance P at some
task T through experience E.
Example: Learn to play checkers
T: play checkers & win
P: % games won in to