# Lect26 - Announcements Final 7-8:15 PM Wed 12/15 here Q/A...

Announcements
- Final: 7-8:15 PM, Wed. 12/15, here
- Q/A session: 11-noon, Mon. 12/13, 2405SC
- Projects (for 4 credits) due Tue. 12/7
  - Code
  - Sample I/O (if it doesn’t work, say so)
  - Paper discussing
    - What you did & why
    - What you learned
    - How you would do it differently given…

VC Dimension of a Concept Class
- Can be challenging to prove
- Can be non-intuitive
  - Signum(sin(αx)) on the real line
  - Convex polygons in the plane

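
The sine example can be checked numerically. The sketch below uses a standard textbook construction (not taken from the slides): place the points at x_i = 10^-i and choose one frequency α per labeling, so that the sign of sin(αx_i) reproduces any desired labeling even though the class has only one parameter. The function names and the choice of n = 5 points are illustrative assumptions.

```python
import itertools
import math

def shattering_alpha(labels):
    """Return alpha such that sgn(sin(alpha * x_i)) equals labels[i].

    labels is a sequence of +1/-1 values; point i sits at x_i = 10**-(i+1).
    """
    # Each negative label contributes 10**(i+1); positive labels contribute 0.
    return math.pi * (1 + sum((1 - y) // 2 * 10 ** (i + 1)
                              for i, y in enumerate(labels)))

def predict(alpha, x):
    """Label assigned to x by the hypothesis sgn(sin(alpha * x))."""
    return 1 if math.sin(alpha * x) > 0 else -1

n = 5  # illustrative; the construction works for any n, up to float precision
points = [10.0 ** -(i + 1) for i in range(n)]

# Try every one of the 2**n labelings and check that some alpha realizes it.
shattered = all(
    [predict(shattering_alpha(labels), x) for x in points] == list(labels)
    for labels in itertools.product([1, -1], repeat=n)
)
print(shattered)
```

Because the same recipe works for every n, no finite number bounds the size of a shattered set, which is why this one-parameter class is the usual example of a non-intuitive VC dimension.
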
Learnability
- Often the hypothesis space (or concept class) is syntactically parameterized: n-Conjuncts, k-DNF, k-CNF, m of n, MLP w/ k units, …
- The concept class is PAC learnable if there exists an algorithm whose running time grows no faster than polynomially in the natural complexity parameters: 1/ε, 1/δ, others
- Clearly, polynomially-bounded growth in the minimum number of training examples is a necessary condition.
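
To make the polynomial-growth condition concrete, here is a minimal sketch using the standard sample-size bound for a consistent learner over a finite class, m ≥ (ln|H| + ln(1/δ)) / ε, where ε is the error tolerance and δ the allowed failure probability. The bound and the count |H| = 3^n for conjunctions over n Boolean variables are textbook facts rather than something stated on the slide; the function names are hypothetical.

```python
import math

def pac_sample_bound(ln_h_size, epsilon, delta):
    """Examples sufficient for a consistent learner over a finite class H.

    Uses m >= (ln|H| + ln(1/delta)) / epsilon.
    """
    return math.ceil((ln_h_size + math.log(1.0 / delta)) / epsilon)

def conjunction_bound(n, epsilon, delta):
    """Bound for conjunctions over n Boolean variables."""
    # Each variable appears positively, negated, or not at all: |H| = 3**n.
    return pac_sample_bound(n * math.log(3), epsilon, delta)

# The bound grows linearly in n, 1/epsilon, and ln(1/delta).
for n in (10, 20, 40):
    print(n, conjunction_bound(n, epsilon=0.1, delta=0.05))
```
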

Suppose…
- All h ∈ H are very low accuracy, say < 0.1% correct
- VC(H) is 100
- Training set S contains 80 labeled examples
- What’s the probability that an arbitrary h gets the first training example right?
- What is the best some h ∈ H can possibly do on all 80 elements of S?
- Will this h work well in general?

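
The arithmetic behind these questions can be sketched as follows. Two assumptions go beyond the slide: the training examples are drawn independently, and S happens to lie inside a set that H shatters (VC(H) = 100 guarantees that some such set exists, not that every 80-point sample is one).

```python
p_single = 0.001   # upper bound on any one hypothesis's accuracy (from the slide)
m = 80             # training set size
vc_dim = 100

# Chance that one fixed h labels all m examples correctly, assuming independent draws.
p_all_correct = p_single ** m

# If S lies inside a shattered set (an assumption), H realizes every labeling of S,
# so some h in H fits the training labels perfectly.
labelings_realized = 2 ** m if m <= vc_dim else None

print(p_all_correct, labelings_realized)
```

Under these assumptions, a perfect fit on S says nothing about accuracy in general: the same h is still correct less than 0.1% of the time on new examples.
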
[Figure: log(labelings) vs. |S|. All labelings: exponential growth. Labelings possible by H: polynomial growth once |S| exceeds VC(H) (Sauer’s Lemma).]
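
The two growth regimes can be reproduced with Sauer's Lemma, which bounds the number of labelings of m points by a class of VC dimension d by the sum of C(m, i) for i = 0..d. The sketch below, with the hypothetical helper name `sauer_bound` and d = 100 borrowed from the earlier example, compares that bound with the 2^m labelings available in total.

```python
import math

def sauer_bound(m, d):
    """Upper bound on the number of labelings of m points by a class with VC dimension d."""
    return sum(math.comb(m, i) for i in range(d + 1))

d = 100
for m in (80, 100, 200, 1000):
    bound = sauer_bound(m, d)
    # log2 of the labelings H can produce vs. log2 of all 2**m labelings (which is m)
    print(m, math.log2(bound), m)
```

Up to m = d the two quantities coincide; beyond that the bound grows only polynomially in m (roughly like m^d), so its logarithm flattens while m keeps growing linearly.
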

Back to Perceptrons (linear threshold units, linear discriminators)
- If there is one separating perceptron, there are many
- Are some better?
- Is one best?
- Can we tell?
- Can we find it?

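
A minimal sketch of the "there are many" point: the classic perceptron update rule, run on the same separable data in two different presentation orders, stops at two different separating hyperplanes. The toy data set and function names are hypothetical, chosen only for illustration.

```python
def train_perceptron(X, y, order, max_epochs=1000):
    """Classic perceptron updates, visiting examples in the given order."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for i in order:
            activation = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            if y[i] * activation <= 0:  # misclassified (or on the boundary)
                w = [wj + y[i] * xj for wj, xj in zip(w, X[i])]
                b += y[i]
                mistakes += 1
        if mistakes == 0:  # a full pass with no errors: the data are separated
            break
    return w, b

def separates(w, b, X, y):
    """True if the hyperplane w.x + b = 0 classifies every example correctly."""
    return all(yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) > 0
               for xi, yi in zip(X, y))

# Hypothetical toy data, linearly separable.
X = [(2, 2), (3, 3), (2, 3), (-2, -2), (-3, -1), (-1, -3)]
y = [1, 1, 1, -1, -1, -1]

w_fwd, b_fwd = train_perceptron(X, y, order=range(6))
w_rev, b_rev = train_perceptron(X, y, order=range(5, -1, -1))
print(w_fwd, b_fwd, separates(w_fwd, b_fwd, X, y))
print(w_rev, b_rev, separates(w_rev, b_rev, X, y))
```

Both runs end with zero training errors, so the training data alone cannot say which hyperplane is better; that is the question the following slides take up.
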
What’s the Best Separating Hyperplane?
[Figure: scatter of positive (+) and negative (-) training examples.]


What’s the Best Separating Hyperplane?
- The larger the margin, the lower the capacity
- But we can have any margin we want by expanding the space…
- Need to normalize

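
The need to normalize can be seen in a small sketch. Stretching every coordinate by a factor c multiplies the geometric margin by c without changing which points are separated, so the raw margin alone cannot measure capacity. One common normalization, assumed here because the slide does not say which one it has in mind, divides the margin by the radius of a ball containing the data. The toy data, the hyperplane, and the function names are hypothetical.

```python
import math

def geometric_margin(w, b, X, y):
    """Smallest signed distance from any example to the hyperplane w.x + b = 0."""
    norm_w = math.sqrt(sum(wj * wj for wj in w))
    return min(yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) / norm_w
               for xi, yi in zip(X, y))

def radius(X):
    """Radius of the smallest origin-centred ball containing the data."""
    return max(math.sqrt(sum(xj * xj for xj in xi)) for xi in X)

# Hypothetical toy data and separating hyperplane.
X = [(2, 2), (3, 3), (2, 3), (-2, -2), (-3, -1), (-1, -3)]
y = [1, 1, 1, -1, -1, -1]
w, b = (1.0, 1.0), 0.0

c = 10.0  # stretch every coordinate by c
X_big = [tuple(c * xj for xj in xi) for xi in X]

margin = geometric_margin(w, b, X, y)
margin_big = geometric_margin(w, c * b, X_big, y)  # same hyperplane in the stretched space
print(margin, margin_big)                               # margin grows with c
print(margin / radius(X), margin_big / radius(X_big))   # ratio is unchanged
```
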

