6.080/6.089 GITCS    1 April 2008

Lecture 20

Lecturer: Scott Aaronson    Scribe: Geoffrey Thomas

Probably Approximately Correct Learning

In the last lecture, we covered Valiant's model of "Probably Approximately Correct" (PAC) learning. This involves:

S: A sample space (e.g., the set of all points)
D: A sample distribution (a probability distribution over points in the sample space)
c : S → {0, 1}: A concept, which accepts or rejects each point in the sample space
C: A concept class, or collection of concepts

For example, we can take our sample space to be the set of all points on the blackboard, our sample distribution to be uniform, and our concept class to have one concept corresponding to each line (where a point is accepted if it's above the line and rejected if it's below it). Given a set of points, as well as which points are accepted or rejected, our goal is to output a hypothesis that explains the data: e.g., draw a line that will correctly classify most of the future points.

A bit more formally, there's some "true concept" c ∈ C that we're trying to learn. Given sample points x_1, ..., x_m, which are drawn independently from D, together with their classifications c(x_1), ..., c(x_m), our goal is to find a hypothesis h ∈ C such that

    Pr_{x ~ D}[h(x) = c(x)] ≥ 1 − ε.

Furthermore, we want to succeed at this goal with probability at least 1 − δ over the choice of x_i's. In other words, with high probability we want to output a hypothesis that's approximately correct (hence "Probably Approximately Correct").

How many samples to learn a finite class?

The first question we can ask concerns sample complexity: how many samples do we need to have seen to learn a concept effectively? It's not hard to prove the following theorem: after we see

    m = O((1/ε) log(|C|/δ))

samples drawn from D, any hypothesis h ∈ C we can find that agrees with all of these samples...
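To make the definitions concrete, here is a minimal sketch (not from the notes) of PAC learning for a toy finite concept class: integer thresholds on [0, 100), playing the role of the "lines" above. The class size, the distribution, and the constant in the sample bound are all illustrative assumptions; the learner just outputs any hypothesis consistent with the labeled samples, which is exactly the strategy the sample-complexity theorem analyzes.

```python
import math
import random

random.seed(0)

# Hypothetical finite concept class: thresholds t in {0, ..., 99}.
# Concept c_t accepts x iff x >= t, so |C| = 100.
THRESHOLDS = list(range(100))

def concept(t, x):
    return x >= t

def sample_complexity(num_concepts, eps, delta):
    # m = O((1/eps) * log(|C|/delta)); we use constant 1 for illustration.
    return math.ceil((1 / eps) * math.log(num_concepts / delta))

def pac_learn(true_t, eps=0.1, delta=0.05):
    m = sample_complexity(len(THRESHOLDS), eps, delta)
    # Draw m samples i.i.d. from the uniform distribution D on [0, 100),
    # labeled by the unknown true concept.
    samples = [random.uniform(0, 100) for _ in range(m)]
    labels = [concept(true_t, x) for x in samples]
    # Output any hypothesis h in C consistent with all labeled samples.
    for t in THRESHOLDS:
        if all(concept(t, x) == y for x, y in zip(samples, labels)):
            return t, m
    return None, m

h, m = pac_learn(true_t=42)
# True error of h under uniform D is |h - 42| / 100.
error = abs(h - 42) / 100
print(m, h, error)
```

With eps = 0.1 and delta = 0.05, the bound gives m = 77 samples, and with probability at least 1 − δ any consistent hypothesis has true error at most ε.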
This note was uploaded on 12/26/2011 for the course ENGINEERIN 18.400J, taught by Prof. Scott Aaronson during the Spring '11 term at MIT.
