41 Pages

cs101.2-08-active-learning

Course: CS 101, Fall 2009
School: Caltech

Active Learning and Optimized Information Gathering
Lecture 8: Active Learning
CS 101.2, Andreas Krause

Announcements
  - Homework 1: Due today
  - Office hours: Come to office hours before your presentation!
    Andreas: Monday 3pm-4:30pm, 260 Jorgensen
    Ryan: Wednesday 4:00-6:00pm, 109 Moore

Outline
  - Background in learning theory
  - Sample complexity
  - Key challenges
  - Heuristics for active learning
  - Principled algorithms for active learning

Spam or Ham?
  [Figure: Spam (+) and Ham (o) examples plotted against features x1 and x2]
  - label = $\mathrm{sign}(w_0 + w_1 x_1 + w_2 x_2)$ (linear separator)
  - Labels are expensive (need to ask expert)
  - Which labels should we obtain to maximize classification accuracy?

Recap: Concept learning
  - Set $X$ of instances, with distribution $P_X$
  - True concept $c: X \to \{0,1\}$
  - Data set $D = \{(x_1,y_1),\dots,(x_n,y_n)\}$, $x_i \sim P_X$, $y_i = c(x_i)$
  - Hypothesis $h: X \to \{0,1\}$ from $H = \{h_1, \dots, h_n, \dots\}$
  - Assume $c \in H$ ($c$ is also called the target hypothesis)
  - $\mathrm{error}_{true}(h) = \mathbb{E}_X\,|c(x) - h(x)|$
  - $\mathrm{error}_{train}(h) = \frac{1}{n} \sum_i |c(x_i) - h(x_i)|$
  - If $n$ is large enough, $\mathrm{error}_{true}(h) \approx \mathrm{error}_{train}(h)$ for all $h$

Recap: PAC Bounds
  - How many samples $n$ do we need to get error $\le \varepsilon$ with probability $\ge 1-\delta$?
  - No noise: $n \ge \frac{1}{\varepsilon} \left( \log |H| + \log \frac{1}{\delta} \right)$
  - Noise: $n \ge \frac{1}{\varepsilon^2} \left( \log |H| + \log \frac{1}{\delta} \right)$
  - Requires that data is i.i.d.!
  - Today: Mainly no-noise case (more next week)

Statistical passive/active learning protocol
  - Data source $P_X$ (produces inputs $x_i$)
  - Active learner assembles data set $D_n = \{(x_1,y_1),\dots,(x_n,y_n)\}$ by selectively obtaining labels
  - Learner outputs hypothesis $h$
  - $\mathrm{error}_{true}(h) = \mathbb{E}_{x \sim P}[h(x) \ne c(x)]$
  - Data set NOT sampled i.i.d.!!

Example: Uncertainty sampling
  - Budget of $m$ labels
  - Draw $n$ unlabeled examples
  - Repeat until we've picked $m$ labels:
      Assign each unlabeled data point an uncertainty score
      Greedily pick the most uncertain example
  - One of the most commonly used classes of heuristics!

Uncertainty sampling for linear separators

Active learning bias
  - If we can pick at most $m = n/2$ labels, then with overwhelmingly high probability uncertainty sampling (US) picks points such that there remains a hypothesis with error > 0.1!
  - With standard passive learning, error $\to 0$ as $n \to \infty$

Wish list for active learning
  - Minimum requirement
      Consistency: Generalization error should go to 0 asymptotically
  - We'd like more than that:
      Fallback guarantee: Convergence rate of error of active learning at least as good as passive learning
  - What we're really after
      Rate improvement: Error of active learning decreases much faster than for passive learning

From passive to active
  Passive PAC learning
    1. Collect data set $D$ of $n \ge \frac{1}{\varepsilon} \left( \log |H| + \log \frac{1}{\delta} \right)$ data points and their labels i.i.d. from $P_X$
    2. Output consistent hypothesis $h$
    3. With probability at least $1-\delta$, $\mathrm{error}_{true}(h) \le \varepsilon$
  Key idea
    - Sample $n$ unlabeled data points $D_X = \{x_1,\dots,x_n\}$ i.i.d.
    - Actively query labels until all hypotheses consistent with these labels agree on the labels of all unlabeled data

Why might this work?

Formalization: Relevant hypotheses
  - Data set $D = \{(x_1,y_1),\dots,(x_n,y_n)\}$, hypothesis space $H$
  - Input data: $D_X = \{x_1,\dots,x_n\}$
  - Relevant hypotheses $H'(D_X) = H'$ = restriction of $H$ on $D_X$
  - Formally: $H' = \{h': D_X \to \{0,1\} \;:\; \exists h \in H \text{ s.t. } \forall x \in D_X: h'(x) = h(x)\}$

Example: Threshold functions

Version space
  - Input data $D_X = \{x_1,\dots,x_n\}$
  - Partially labeled: Have $L = \{(x_{i_1}, y_{i_1}),\dots,(x_{i_m}, y_{i_m})\}$
  - The (relevant) version space is the set of all relevant hypotheses consistent with the labels $L$
  - Formally: $V(D_X, L) = V = \{h' \in H'(D_X) \;:\; h'(x_{i_j}) = y_{i_j} \text{ for } 1 \le j \le m\}$
  - Why useful? Partial labels $L$ imply all remaining labels for $D_X$ $\iff$ $|V| = 1$

Example: Binary thresholds

Pool-based active learning with fallback
  1. Collect $n \ge \frac{1}{\varepsilon} \left( \log |H| + \log \frac{1}{\delta} \right)$ unlabeled data points $D_X$ from $P_X$
  2. Actively request labels $L$ until there remains a single hypothesis $h' \in H'$ that's consistent with these labels (i.e., $|V(D_X, L)| = 1$)
  3. Output any hypothesis $h \in H$ consistent with the obtained labels.
  - With probability $\ge 1-\delta$, $\mathrm{error}_{true}(h) \le \varepsilon$
  - Get PAC guarantees for active learning
  - Bounds on the number of labels for fixed error carry over from passive to active, which gives the fallback guarantee

Example: Threshold functions

Generalizing binary search [Dasgupta 04]
  - Want to shrink the version space (number of consistent hypotheses) as quickly as possible.
  - General (greedy) approach:
      For each unlabeled instance $x_i$ compute
        $v_{i,1} = |V(D_X, L \cup \{(x_i, 1)\})|$
        $v_{i,0} = |V(D_X, L \cup \{(x_i, 0)\})|$
        $v_i = \min\{v_{i,1}, v_{i,0}\}$
      Obtain label $y_i$ for $x_i$ where $i = \arg\max_j \{v_j\}$

Ideal case

Is it always possible to halve the version space?

Typical case much more benign

Query trees
  - A query tree is a rooted, labeled tree on the relevant hypotheses $H'$
  - Each node is labeled with an input $x \in D_X$
  - Each edge is labeled with a label in $\{0,1\}$
  - Each path from the root to a hypothesis $h' \in H'$ is a labeling $L$ such that $V(D_X, L) = \{h'\}$
  - Want query trees of minimum height

Example: Threshold functions

Example: linear separators (2D)

Number of labels needed to identify hypothesis
  - Depends on target hypothesis!
  - Binary thresholds (on $n$ inputs $D_X$)
      Optimal query tree needs $O(\log n)$ labels!
  - For linear separators in 2D (on $n$ inputs $D_X$)
      For some hypotheses, even the optimal tree needs $n$ labels
      On average, the optimal query tree needs $O(\log n)$ labels!
  - This motivates an average-case analysis of active learning

Average case query tree learning
  - Query tree $T$
  - $\mathrm{Cost}(T) = \frac{1}{|H'|} \sum_{h \in H'} \mathrm{depth}(h, T)$
  - Want $T^* = \arg\min_T \mathrm{Cost}(T)$
  - Superexponential number of query trees, so finding the optimal one is hard

Greedy construction of query trees [Dasgupta 04]
  Algorithm GreedyTree($D_X$, $L$)
    $V = V(D_X, L)$
    If $V = \{h\}$ return Leaf($h$)
    Else
      For each unlabeled instance $x_i$ compute
        $v_{i,1} = |V(D_X, L \cup \{(x_i, 1)\})|$ and $v_{i,0} = |V(D_X, L \cup \{(x_i, 0)\})|$
        $v_i = \min\{v_{i,1}, v_{i,0}\}$
      Let $i = \arg\max_j \{v_j\}$
      LeftSubTree = GreedyTree($D_X$, $L \cup \{(x_i, 1)\}$)
      RightSubTree = GreedyTree($D_X$, $L \cup \{(x_i, 0)\}$)
      return Node $x_i$ with children LeftSubTree (1) and RightSubTree (0)

Near-optimality of greedy tree [Dasgupta 04]
  - Theorem: Let $T^* = \arg\min_T \mathrm{Cost}(T)$. Then GreedyTree constructs a query tree $T$ such that $\mathrm{Cost}(T) = O(\log |H'|) \cdot \mathrm{Cost}(T^*)$

Limitations of this algorithm
  - Often computationally intractable
      Finding the "most-disagreeing" hypothesis is difficult
  - No-noise assumption
  - Will see how we can relax these assumptions in the talks next week.

Bayesian or not Bayesian?
  - Greedy querying needs, on average, at most a factor of $O(\log |H'|)$ more queries than the optimal query tree
  - Assumes prior distribution (uniform) on hypotheses
  - If our assumption is wrong, generalizat...
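
The GreedyTree construction and the average-case cost can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the lecture's own code: each relevant hypothesis is represented by its tuple of labels on the pool, the hypothesis class is the set of threshold functions on n sorted points, and the names `threshold_hypotheses`, `greedy_tree`, `depth` and `cost` are hypothetical. Instead of passing the label set L, the sketch passes the version space that L induces.

```python
from typing import Dict, List, Tuple, Union

Hypothesis = Tuple[int, ...]      # labels of one relevant hypothesis on the pool
Tree = Union[Hypothesis, Dict]    # leaf = hypothesis, inner node = dict


def threshold_hypotheses(n: int) -> List[Hypothesis]:
    """All n + 1 labelings of n sorted pool points by a threshold function."""
    return [tuple(0 if i < k else 1 for i in range(n)) for k in range(n + 1)]


def greedy_tree(version_space: List[Hypothesis]) -> Tree:
    """Query the point whose worse-case split keeps the most balanced halves."""
    if len(version_space) == 1:
        return version_space[0]                     # Leaf(h)
    best_i, best_v = -1, -1
    for i in range(len(version_space[0])):
        v1 = sum(h[i] for h in version_space)       # size of V if the label is 1
        v0 = len(version_space) - v1                # size of V if the label is 0
        if min(v0, v1) > best_v:                    # v_i = min{v_i1, v_i0}
            best_i, best_v = i, min(v0, v1)
    return {
        "query": best_i,
        1: greedy_tree([h for h in version_space if h[best_i] == 1]),
        0: greedy_tree([h for h in version_space if h[best_i] == 0]),
    }


def depth(h: Hypothesis, tree: Tree) -> int:
    """Number of label queries the tree needs to identify hypothesis h."""
    d = 0
    while isinstance(tree, dict):
        tree = tree[h[tree["query"]]]
        d += 1
    return d


def cost(hyps: List[Hypothesis], tree: Tree) -> float:
    """Average depth over all relevant hypotheses (uniform prior)."""
    return sum(depth(h, tree) for h in hyps) / len(hyps)
```

For a pool of 7 points there are 8 relevant threshold hypotheses, and `cost(hyps, greedy_tree(hyps))` is 3.0: the greedy rule reduces to binary search, matching the O(log n) label count stated for binary thresholds. Already-labeled points never get re-queried, because every hypothesis in the version space agrees on them and their split value is 0.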

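The pool-based scheme with fallback can also be traced numerically for threshold functions. The sketch below takes the no-noise PAC bound at face value (natural logarithm, no hidden constants, which is an assumption) to size the pool, then queries labels by binary search until a single relevant hypothesis remains. The function names, the finite class size of 1000 and the `oracle` callback standing in for the expert are all hypothetical.

```python
import math
from typing import Callable, List, Tuple


def pac_pool_size(h_size: int, eps: float, delta: float) -> int:
    """Smallest integer n with n >= (1/eps) * (log|H| + log(1/delta))."""
    return math.ceil((math.log(h_size) + math.log(1.0 / delta)) / eps)


def learn_threshold(pool: List[float],
                    oracle: Callable[[float], int]) -> Tuple[int, int]:
    """Binary search over the sorted pool for the first point labeled 1.

    Returns (index of the first positive point, number of labels queried).
    All points before that index are implied to be 0, all others 1, so the
    version space on the pool has shrunk to a single relevant hypothesis.
    """
    xs = sorted(pool)
    lo, hi = 0, len(xs)            # the first positive index lies in [lo, hi]
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1               # one expensive label request
        if oracle(xs[mid]) == 1:
            hi = mid
        else:
            lo = mid + 1
    return lo, queries
```

With `pac_pool_size(1000, 0.1, 0.05)` the pool has 100 points, and `learn_threshold` then needs at most 7 labels to fix the labels of all 100, while the passive learner would request all of them. The error guarantee is the same in both cases, since any hypothesis consistent with the queried labels is consistent with the whole pool.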