MIT6_045JS11_lec21 - 6.080/6.089 GITCS Feb 5, 2008 Lecture...

6.080/6.089 GITCS Feb 5, 2008
Lecture 21
Lecturer: Scott Aaronson
Scribe: Scott Aaronson / Chris Granade

1 Recap and Discussion of Previous Lecture

Theorem 1 (Valiant) $m = O\left(\frac{1}{\epsilon} \log\left(|C|/\delta\right)\right)$ samples suffice for $(\epsilon, \delta)$-learning.

Theorem 2 (Blumer et al.) $m = O\left(\frac{1}{\epsilon} \mathrm{VCdim}(C) \log\frac{1}{\delta\epsilon}\right)$ samples suffice.

In both cases, the learning algorithm that achieves the bound is just “find any hypothesis h compatible with all the sample data, and output it.”

You asked great, probing questions last time, about what these theorems really mean. For example, “why can’t I just draw little circles around the ‘yes’ points, and expect that I can therefore predict the future?” It’s unfortunately a bit hidden in the formalism, but what these theorems are “really” saying is that to predict the future, it suffices to find a succinct description of the past: a description that takes many fewer bits to write down than the past data itself. Hence the dependence on $|C|$ or $\mathrm{VCdim}(C)$: the size or dimension of the concept class from which our hypothesis is drawn.

We also talked about the computational problem of finding a small hypothesis that agrees with the data. Certainly we can always solve this problem in polynomial time if P = NP. But what if P ≠ NP? Can we show that “learning is NP-hard”? Here we saw that we need to distinguish two cases:

Proper learning problems (where the hypothesis has to have a certain form): Sometimes we can show these are NP-hard. Example: Finding a DNF expression that agrees with the data.

Improper learning problems (where the hypothesis can be any Boolean circuit): It’s an open problem whether any of these are NP-hard. (Incidentally, why do we restrict the hypothesis to be a Boolean circuit? It’s equivalent to saying, we should be able to compute in polynomial time what a given hypothesis predicts.)

So, if we can’t show that improper (or “representation-independent”) learning is NP-complete, what other evidence might there be for its hardness? The teaser from last time: we could try to show that finding a hypothesis that explains past data is at least as hard as breaking some cryptographic code!...
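
To make the recap concrete, here is a minimal Python sketch of the "find any hypothesis compatible with all the sample data" learner from Theorem 1, run on a toy finite concept class. Everything specific in it is an assumption made for illustration and does not come from the lecture: the concept class (thresholds on the integers 0 to 999), the uniform sampling distribution, the function names, and the use of the natural log with constant 1 in the sample bound (the theorem itself only states the bound up to a constant factor).

```python
import math
import random


def valiant_sample_bound(num_hypotheses, epsilon, delta):
    """Sample count ceil((1/epsilon) * ln(|C| / delta)).

    Assumption: constant factor 1 and natural log, as in the usual
    union-bound argument; the theorem only gives the bound up to O(.).
    """
    return math.ceil(math.log(num_hypotheses / delta) / epsilon)


def predict(threshold, x):
    """Hypothetical threshold concept: x is a 'yes' point iff x >= threshold."""
    return x >= threshold


def consistent_learner(concept_class, samples):
    """Return any hypothesis that agrees with every labeled sample, or None."""
    for h in concept_class:
        if all(predict(h, x) == label for x, label in samples):
            return h
    return None


def true_error(h, target, domain_size):
    """Probability, under the uniform distribution on range(domain_size),
    that threshold h and threshold target label a point differently."""
    return abs(h - target) / domain_size


def run_demo(domain_size=1000, target=371, epsilon=0.05, delta=0.01, seed=0):
    rng = random.Random(seed)
    # One hypothesis per threshold 0..domain_size, so |C| = domain_size + 1.
    concept_class = list(range(domain_size + 1))
    m = valiant_sample_bound(len(concept_class), epsilon, delta)
    # Draw m points uniformly and label them with the hidden target concept.
    xs = [rng.randrange(domain_size) for _ in range(m)]
    samples = [(x, predict(target, x)) for x in xs]
    h = consistent_learner(concept_class, samples)
    return m, h, true_error(h, target, domain_size)


if __name__ == "__main__":
    m, h, err = run_demo()
    print(f"samples drawn: {m}, learned threshold: {h}, true error: {err:.3f}")
```

The learner returns the first consistent hypothesis it finds; per the theorem, any consistent one would do. Note that this exhaustive search takes time proportional to $|C|$, which is why the computational question of finding a consistent hypothesis efficiently is a separate issue from the sample bound.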

This note was uploaded on 12/26/2011 for the course ENGINEERIN 18.400J taught by Prof. Scott Aaronson during the Spring '11 term at MIT.
