11s1: COMP9417 Machine Learning and Data Mining

Fundamentals of Concept Learning

March 1, 2011

Acknowledgement: Material derived from slides for the book
Machine Learning, Tom Mitchell, McGraw-Hill, 1997
http://www2.cs.cmu.edu/~tom/mlbook.html

Aims

This lecture aims to develop your understanding of representing and
searching hypothesis spaces for concept learning. Following it you should
be able to:

• define a representation for concepts
• define a hypothesis space in terms of a generality ordering on concepts
• describe an algorithm to search a hypothesis space
• express the framework of version spaces
• describe an algorithm to search a hypothesis space using the framework
  of version spaces
• explain the role of inductive bias in concept learning
COMP9417: March 1, 2011    Fundamentals of Concept Learning: Slide 1

Overview

Concept Learning: inferring a Boolean-valued function from training
examples of its input and output.

• Learning from examples
• General-to-specific ordering over hypotheses
• Version spaces and candidate elimination algorithm
• Picking new examples
• The need for inductive bias

Training Examples for EnjoySport

Sky     Temp   Humid    Wind     Water   Forecst   EnjoySpt
Sunny   Warm   Normal   Strong   Warm    Same      Yes
Sunny   Warm   High     Strong   Warm    Same      Yes
Rainy   Cold   High     Strong   Warm    Change    No
Sunny   Warm   High     Strong   Cool    Change    Yes

What is the general concept?
Note: simple approach assuming no noise, illustrates key concepts

Representing Hypotheses

Many possible representations . . .

Here, h is a conjunction of constraints on attributes.

Each constraint can be:

• a specific value (e.g., Water = Warm)
• don't care (e.g., "Water = ?")
• no value allowed (e.g., "Water = ∅")

For example,

Sky     AirTemp   Humid   Wind     Water   Forecst
Sunny   ?         ?       Strong   ?       Same

The Prototypical Concept Learning Task

• Given:
  – Instances X: Possible days, each described by the attributes

      Attribute   Values
      Sky         Sunny, Cloudy, Rainy
      AirTemp     Warm, Cold
      Humid       Normal, High
      Wind        Strong, Weak
      Water       Warm, Cool
      Forecast    Same, Change

  – Target function c: EnjoySport : X → {0, 1}
  – Hypotheses H: Conjunctions of literals, e.g.
      <?, Cold, High, ?, ?, ?>
  – Training examples D: Positive and negative examples of the target
    function: <x1, c(x1)>, . . . , <xm, c(xm)>
• Determine: A hypothesis h in H such that h(x) = c(x) for all x in D
  (usually called the target hypothesis).

The inductive learning hypothesis

Any hypothesis found to approximate the target function well over
a sufficiently large set of training examples will also approximate the
target function well over other unobserved examples.

Concept Learning as Search

Question: What can be learned?
Answer: (only) what is in the hypothesis space

How big is the hypothesis space for EnjoySport?

Instance space:

    Sky × AirTemp × . . . × Forecast = 3 × 2 × 2 × 2 × 2 × 2 = 96

Hypothesis space (syntactically distinct):

    Sky × AirTemp × . . . × Forecast = 5 × 4 × 4 × 4 × 4 × 4 = 5120

Hypothesis space (semantically distinct* only):

    1 + (4 × 3 × 3 × 3 × 3 × 3) = 973

* any hypothesis with an ∅ constraint covers no instances, hence all are
semantically equivalent.
The learning problem ≡ searching a hypothesis space. How ?
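The counts above can be checked mechanically; a small sketch (variable names are ours):

```python
# Attribute value counts for EnjoySport (Sky has 3 values, the rest 2 each).
value_counts = [3, 2, 2, 2, 2, 2]

# Instance space: one value per attribute.
instances = 1
for n in value_counts:
    instances *= n

# Syntactically distinct hypotheses: each attribute may also be "?" or "∅".
syntactic = 1
for n in value_counts:
    syntactic *= n + 2

# Semantically distinct: every hypothesis containing "∅" covers no instances,
# so all of them collapse into a single hypothesis; the rest use a value or "?".
semantic = 1
for n in value_counts:
    semantic *= n + 1
semantic += 1

print(instances, syntactic, semantic)  # 96 5120 973
```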
Instances, Hypotheses, and More-General-Than

Instances X:

x1 = <Sunny, Warm, High, Strong, Cool, Same>
x2 = <Sunny, Warm, High, Light, Warm, Same>

Hypotheses H:

h1 = <Sunny, ?, ?, Strong, ?, ?>
h2 = <Sunny, ?, ?, ?, ?, ?>
h3 = <Sunny, ?, ?, ?, Cool, ?>

(h2 covers every instance that h1 or h3 covers, so it is the more general
hypothesis; h1 and h3 are incomparable.)

A generality order on hypotheses

Definition: Let hj and hk be Boolean-valued functions defined over
instances X. Then hj is more general than or equal to hk (written
hj ≥g hk) if and only if

    (∀x ∈ X)[(hk(x) = 1) → (hj(x) = 1)]

Intuitively, hj is more general than or equal to hk if any instance
satisfying hk also satisfies hj.

hj is (strictly) more general than hk (written hj >g hk) if and only if

    (hj ≥g hk) ∧ ¬(hk ≥g hj)

hj is more specific than hk when hk is more general than hj.
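For conjunctive hypotheses the ≥g test can be computed constraint-by-constraint
instead of quantifying over all instances; a minimal sketch (function names are
ours; the all-rejecting ∅ hypothesis is not modelled):

```python
def more_general_or_equal(hj, hk):
    """True iff hj >=g hk: every instance satisfying hk also satisfies hj.

    Hypotheses are tuples of attribute constraints; "?" means "don't care".
    """
    return all(cj == "?" or cj == ck for cj, ck in zip(hj, hk))

def strictly_more_general(hj, hk):
    # hj >g hk  iff  (hj >=g hk) and not (hk >=g hj)
    return more_general_or_equal(hj, hk) and not more_general_or_equal(hk, hj)

h1 = ("Sunny", "?", "?", "Strong", "?", "?")
h2 = ("Sunny", "?", "?", "?", "?", "?")
h3 = ("Sunny", "?", "?", "?", "Cool", "?")

print(strictly_more_general(h2, h1))  # True: h2 is strictly more general
print(more_general_or_equal(h1, h3))  # False: h1 and h3 are incomparable
```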
The Find-S Algorithm

1. Initialize h to the most specific hypothesis in H
2. For each positive training instance x
   • For each attribute constraint ai in h
     If the constraint ai in h is satisfied by x
     Then do nothing
     Else replace ai in h by the next more general constraint that is
     satisfied by x
3. Output hypothesis h

Hypothesis Space Search by Find-S

Training examples:

x1 = <Sunny Warm Normal Strong Warm Same>, +
x2 = <Sunny Warm High Strong Warm Same>, +
x3 = <Rainy Cold High Strong Warm Change>, −
x4 = <Sunny Warm High Strong Cool Change>, +

Hypotheses, from specific to general:

h0 = <∅, ∅, ∅, ∅, ∅, ∅>
h1 = <Sunny Warm Normal Strong Warm Same>
h2 = <Sunny Warm ? Strong Warm Same>
h3 = <Sunny Warm ? Strong Warm Same>   (the negative example x3 is ignored)
h4 = <Sunny Warm ? Strong ? ?>
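The loop above can be sketched directly for attribute-tuple instances
(representation and function name are ours; note that Find-S simply skips
negative examples):

```python
def find_s(examples):
    """Find-S for conjunctive hypotheses over attribute tuples.

    examples: list of (instance, label) pairs; label is True for positives.
    "?" is the don't-care constraint.  (A sketch; minimal error handling.)
    """
    positives = [x for x, label in examples if label]
    # The first positive example forces h from <∅,...,∅> to that example.
    h = list(positives[0])
    for x in positives[1:]:
        for i, (ci, vi) in enumerate(zip(h, x)):
            if ci != vi:      # constraint not satisfied by x:
                h[i] = "?"    # replace by the next more general constraint
    return tuple(h)

examples = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), True),
]
print(find_s(examples))
# ('Sunny', 'Warm', '?', 'Strong', '?', '?')
```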
Find-S: does it work?

Assume: a hypothesis hc ∈ H describes target function c, and the training
data is error-free.

By definition, hc is consistent with all positive training examples and can
never cover a negative example.

For each h generated by Find-S, hc is more general than or equal to h.

So h can never cover a negative example.

Complaints about Find-S

• Can't tell whether it has learned the concept
  (the learned hypothesis may not be the only consistent hypothesis)
• Can't tell when training data is inconsistent
  (cannot handle noisy data)
• Picks a maximally specific h (why?)
  (might require a maximally general h)

Version Spaces

A hypothesis h is consistent with a set of training examples D of
target concept c if and only if h(x) = c(x) for each training example
<x, c(x)> in D.

    Consistent(h, D) ≡ (∀<x, c(x)> ∈ D) h(x) = c(x)

The version space, VS_H,D, with respect to hypothesis space H and
training examples D, is the subset of hypotheses from H consistent
with all training examples in D.

    VS_H,D ≡ {h ∈ H | Consistent(h, D)}

The List-Then-Eliminate Algorithm

1. VersionSpace ← a list containing every hypothesis in H
2. For each training example <x, c(x)>
   remove from VersionSpace any hypothesis h for which h(x) ≠ c(x)
3. Output the list of hypotheses in VersionSpace

Example Version Space

S: { <Sunny, Warm, ?, Strong, ?, ?> }

   <Sunny, ?, ?, Strong, ?, ?>   <Sunny, Warm, ?, ?, ?, ?>   <?, Warm, ?, Strong, ?, ?>

G: { <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?> }

Representing Version Spaces

The General boundary, G, of version space VS_H,D is the set of its
maximally general members.

The Specific boundary, S, of version space VS_H,D is the set of its
maximally specific members.

Every member of the version space lies between these boundaries:

    VS_H,D = {h ∈ H | (∃s ∈ S)(∃g ∈ G)(g ≥ h ≥ s)}

where x ≥ y means x is more general than or equal to y.
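List-Then-Eliminate is feasible here because the space of non-empty
conjunctive hypotheses is tiny (972 of them). A sketch that enumerates them
and keeps the consistent ones (helper names and data layout are ours):

```python
from itertools import product

def satisfies(h, x):
    return all(c == "?" or c == v for c, v in zip(h, x))

def consistent(h, examples):
    # Consistent(h, D): h(x) agrees with the label on every training example.
    return all(satisfies(h, x) == label for x, label in examples)

values = [["Sunny", "Cloudy", "Rainy"], ["Warm", "Cold"], ["Normal", "High"],
          ["Strong", "Weak"], ["Warm", "Cool"], ["Same", "Change"]]

examples = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), True),
]

# Every conjunctive hypothesis: per attribute, a specific value or "?".
hypotheses = product(*[vals + ["?"] for vals in values])
version_space = [h for h in hypotheses if consistent(h, examples)]
print(len(version_space))  # 6, matching the example version space above
```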
The Candidate Elimination Algorithm

G ← maximally general hypotheses in H
S ← maximally specific hypotheses in H

For each training example d, do

• If d is a positive example
  – Remove from G any hypothesis inconsistent with d
  – For each hypothesis s in S that is not consistent with d
    ∗ Remove s from S
    ∗ Add to S all minimal generalizations h of s such that
      1. h is consistent with d, and
      2. some member of G is more general than h
    ∗ Remove from S any hypothesis that is more general than another
      hypothesis in S

• If d is a negative example
  – Remove from S any hypothesis inconsistent with d
  – For each hypothesis g in G that is not consistent with d
    ∗ Remove g from G
    ∗ Add to G all minimal specializations h of g such that
      1. h is consistent with d, and
      2. some member of S is more specific than h
    ∗ Remove from G any hypothesis that is less general than another
      hypothesis in G
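A compact sketch of the algorithm above for conjunctive hypotheses
(representation and helper names are ours; "∅" plays the role of the initial
most-specific constraint, "?" of don't-care):

```python
def satisfies(h, x):
    return all(c == "?" or c == v for c, v in zip(h, x))

def more_general_or_equal(hj, hk):
    return all(cj == "?" or cj == ck for cj, ck in zip(hj, hk))

def min_generalizations(s, x):
    # Conjunctions have a unique minimal generalization covering x.
    h = []
    for c, v in zip(s, x):
        if c == "∅":
            h.append(v)
        elif c == v or c == "?":
            h.append(c)
        else:
            h.append("?")
    return [tuple(h)]

def min_specializations(g, x, values):
    # Specialize each "?" to every attribute value the example does not have.
    out = []
    for i, c in enumerate(g):
        if c == "?":
            for val in values[i]:
                if val != x[i]:
                    out.append(g[:i] + (val,) + g[i + 1:])
    return out

def candidate_elimination(examples, values):
    S = [tuple(["∅"] * len(values))]
    G = [tuple(["?"] * len(values))]
    for x, positive in examples:
        if positive:
            G = [g for g in G if satisfies(g, x)]
            for s in [s for s in S if not satisfies(s, x)]:
                S.remove(s)
                S += [h for h in min_generalizations(s, x)
                      if any(more_general_or_equal(g, h) for g in G)]
            S = [s for s in S
                 if not any(t != s and more_general_or_equal(s, t) for t in S)]
        else:
            S = [s for s in S if not satisfies(s, x)]
            for g in [g for g in G if satisfies(g, x)]:
                G.remove(g)
                G += [h for h in min_specializations(g, x, values)
                      if any(more_general_or_equal(h, s) for s in S)]
            G = [g for g in G
                 if not any(t != g and more_general_or_equal(t, g) for t in G)]
    return S, G

values = [["Sunny", "Cloudy", "Rainy"], ["Warm", "Cold"], ["Normal", "High"],
          ["Strong", "Weak"], ["Warm", "Cool"], ["Same", "Change"]]
examples = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), True),
]
S, G = candidate_elimination(examples, values)
print(S)  # [('Sunny', 'Warm', '?', 'Strong', '?', '?')]
print(G)  # [('Sunny', '?', '?', '?', '?', '?'), ('?', 'Warm', '?', '?', '?', '?')]
```

Running it on the four EnjoySport examples reproduces the S4 and G4 boundaries of the trace that follows.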
Example Trace

S0: {<∅, ∅, ∅, ∅, ∅, ∅>}
G0: {<?, ?, ?, ?, ?, ?>}

Training examples:

1. <Sunny, Warm, Normal, Strong, Warm, Same>, EnjoySport = Yes
2. <Sunny, Warm, High, Strong, Warm, Same>, EnjoySport = Yes

S1: {<Sunny, Warm, Normal, Strong, Warm, Same>}
S2: {<Sunny, Warm, ?, Strong, Warm, Same>}
G0, G1, G2: {<?, ?, ?, ?, ?, ?>}

Training example:

3. <Rainy, Cold, High, Strong, Warm, Change>, EnjoySport = No

S2, S3: {<Sunny, Warm, ?, Strong, Warm, Same>}
G3: {<Sunny, ?, ?, ?, ?, ?>  <?, Warm, ?, ?, ?, ?>  <?, ?, ?, ?, ?, Same>}

Training example:

4. <Sunny, Warm, High, Strong, Cool, Change>, EnjoySport = Yes

S4: {<Sunny, Warm, ?, Strong, ?, ?>}
G4: {<Sunny, ?, ?, ?, ?, ?>  <?, Warm, ?, ?, ?, ?>}

Which Training Example Is Best To Choose Next?

The final version space:

S: { <Sunny, Warm, ?, Strong, ?, ?> }

   <Sunny, ?, ?, Strong, ?, ?>   <Sunny, Warm, ?, ?, ?, ?>   <?, Warm, ?, Strong, ?, ?>

G: { <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?> }

Which Training Example To Choose Next?

<Sunny, Warm, Normal, Light, Warm, Same>

(This instance is satisfied by three of the six hypotheses, so either answer
halves the version space.)

How Should New Instances Be Classified?

<Sunny, Warm, Normal, Strong, Cool, Change>   (6+/0−)
<Rainy, Cold, Normal, Light, Warm, Same>      (0+/6−)
<Sunny, Warm, Normal, Light, Warm, Same>      (3+/3−)

What Justifies this Inductive Leap?

+ <Sunny, Warm, Normal, Strong, Cool, Change>
+ <Sunny, Warm, Normal, Light, Warm, Same>

S: <Sunny, Warm, Normal, ?, ?, ?>

Why believe we can classify this unseen instance?

<Sunny, Warm, Normal, Strong, Warm, Same>
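The vote counts above can be reproduced by letting every member of the
example version space classify each new instance; a sketch (helper names are
ours):

```python
def satisfies(h, x):
    return all(c == "?" or c == v for c, v in zip(h, x))

# The six members of the example version space, from S up to G.
version_space = [
    ("Sunny", "Warm", "?", "Strong", "?", "?"),
    ("Sunny", "?",    "?", "Strong", "?", "?"),
    ("Sunny", "Warm", "?", "?",      "?", "?"),
    ("?",     "Warm", "?", "Strong", "?", "?"),
    ("Sunny", "?",    "?", "?",      "?", "?"),
    ("?",     "Warm", "?", "?",      "?", "?"),
]

def vote(x):
    # (positive votes, negative votes) over the version space members.
    pos = sum(satisfies(h, x) for h in version_space)
    return pos, len(version_space) - pos

print(vote(("Sunny", "Warm", "Normal", "Strong", "Cool", "Change")))  # (6, 0)
print(vote(("Rainy", "Cold", "Normal", "Light", "Warm", "Same")))     # (0, 6)
print(vote(("Sunny", "Warm", "Normal", "Light", "Warm", "Same")))     # (3, 3)
```

A unanimous vote can be read as a confident classification; a split vote means the version space does not yet determine the label.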
An UNBiased Learner

Idea: Choose H that expresses every teachable concept (i.e. H is the
power set of X).

Consider H = disjunctions, conjunctions, negations over the previous H.
E.g.

    <Sunny, Warm, Normal, ?, ?, ?> ∨ ¬<?, ?, ?, ?, ?, Change>

What are S, G in this case?

S ←
G ←

Inductive Bias

Consider

• concept learning algorithm L
• instances X, target concept c
• training examples Dc = {<x, c(x)>}
• let L(xi, Dc) denote the classification assigned to the instance xi by L
  after training on data Dc.

Definition: The inductive bias of L is any minimal set of assertions B
such that for any target concept c and corresponding training examples Dc

    (∀xi ∈ X)[(B ∧ Dc ∧ xi) ⊢ L(xi, Dc)]

where A ⊢ B means A logically entails B.
Inductive Systems and Equivalent Deductive Systems

Inductive system:

    Inputs:  training examples; new instance
    Engine:  Candidate Elimination Algorithm, using Hypothesis Space H
    Output:  classification of new instance, or "don't know"

Equivalent deductive system:

    Inputs:  training examples; new instance;
             assertion "H contains the target concept"
             (the inductive bias made explicit)
    Engine:  Theorem Prover
    Output:  classification of new instance, or "don't know"

Three Learners with Different Biases

1. Rote learner: Store examples, Classify x iff it matches previously
   observed example.
2. Version space candidate elimination algorithm
3. Find-S

Summary Points
1. Concept learning as search through H
2. General-to-specific ordering over H
3. Version space candidate elimination algorithm
4. S and G boundaries characterize learner’s uncertainty
5. Learner can generate useful queries
6. Inductive leaps possible only if learner is biased
7. Inductive learners can be modelled by equivalent deductive systems
[Suggested reading: Mitchell, Chapter 2]