Machine Learning Intro
CPS 170
Ron Parr

Why Study Learning?
• Considered a hallmark of intelligence
• Viewed as a way to reduce the programming burden
• Many algorithms assume parameters that are difficult to determine exactly a priori

Examples
• SPAM classification
• Computational biology/medicine
  – Distinguish healthy/diseased tissue (e.g., skin/colon cancer)
  – Find structure in biological data (regulatory pathways)
• Financial events
  – Predict good/bad credit risks
  – Predict price changes
  – Response to marketing
• Drilling sites likely to have oil
• Document categorization
• Learn to play games
• Learn to control systems
  – Fly helicopters
  – Optimize OS components
• Public database of learning problems:
  – http://www.ics.uci.edu/~mlearn/MLSummary.html

Who Does Machine Learning?
• In AI
  – Core AI topic (AAAI, IJCAI)
  – Specialized communities (ICML, NIPS)
• Databases (data mining: KDD)
• Used in (CS):
  – Vision
  – Systems
  – Comp. Bio
• Statistics

Who Does Machine Learning (@Duke)
• CS:
  – Faculty: Pankaj Agarwal, Vince Conitzer, Alex Hartemink, Kamesh Munagala, Ron Parr, Carlo Tomasi, Jun Yang
• ISDS (everybody, but especially):
  – Scott Schmidler, Sayan Mukherjee
• IGSP:
  – Terry Furey, Uwe Ohler
• Engineering:
  – Larry Carin, Silvia Ferrari, Rebecca Willett

Who Hires in Machine Learning?
• Universities
• Microsoft Research
• Search: Google/Yahoo/Amazon
• Defense contractors
• Some financial institutions (quietly)
• Many startups
• ML is viewed as good background for many other tasks (robotics, vision, systems, engineering)

What Is Machine Learning?
• Learning element
  – The thing that learns
• Performance element
  – Objective measure of progress
• Learning is simply an increase in the ability of the learning element over time to achieve the task specified by the performance element

ML vs. Statistics?
• Machine learning is:
  – Younger
  – More empirical
  – More algorithmic
  – (arguably) More practical
  – (arguably) More decision theoretic
• Statistics is:
  – More mature
  – (arguably) More formal and rigorous

ML vs. Data Mining
• Machine learning is:
  – (arguably) More formal
  – (arguably) More task-driven/decision theoretic
• Data mining is:
  – More constrained by the size of the data set
  – More closely tied to database techniques

Types of Learning
• Inductive learning
  – Acquiring new information that previously was not available
  – Learning concepts
• Speedup learning
  – Learning to do something you already "know" faster or better

Feedback in Learning
• Supervised learning
  – Given examples of correct behavior
• Unsupervised learning
  – No external notion of what is correct
  – Is this well-defined?
• Reinforcement learning
  – Indirect indication of effectiveness

Learning Methodology
• The distinction between training and testing is crucial
• Correct performance on the training set is just memorization!
• The researcher should never look at the test data
• Raises some troubling issues for "benchmark" learning problems

Computational Learning Theory
• Formal study of what can be learned from data
• Closely related to ML, but also to CS theory
• Assumptions:
  – Training examples must be representative
  – The algorithm needn't always work, but should scale well
• Goals:
  – Algorithms that have a low error rate with high probability
  – Good characterization of how performance scales
  (See the sample-complexity sketch at the end of these notes.)

COLT
• Learning theory is elegant and mathematically rich. However,
  – It sometimes isn't constructive
  – It sometimes tells us how many data are needed, but not how to manipulate the data efficiently
• Through the late 1990s, learning theory drifted away from practical learning algorithms
• New advances and fresh thinking have led to a rapprochement, e.g.:
  – Support vector machines
  – Boosting

Example: Supervised Learning
• Classical framework
• Target concept, e.g., green
• The learner is presented with labeled instances
  – True: green cones, green cubes, green spheres
  – False: red cones, red cubes, red spheres, blue cones, blue cubes, blue spheres
• The learner must correctly identify the target concept from the training data

Performance Measure
• The training set won't have all possible objects
• The test set will contain novel objects
  – Blue cylinders, yellow tetrahedra
• To learn successfully, the learner must have good performance when confronted with novel objects
  – This is what we would expect from people
  – A blue Broccolisaurus is still blue

Why Learning Is Tricky
• Suppose we have seen:
  – Red tetrahedron (f), blue sphere (t), blue cone (t), green cube (f)
• Possible concepts:
  – Blue
  – (Blue sphere) or (blue cone)
  – Objects whose position in the presentation order is a prime number
  – Objects with a circular cross-section
• What if some data are mislabeled?

Learning and Representation
• Learning is very sensitive to representation
• Every learning algorithm can be viewed as a search through a space of concepts
• The space of concepts determines
  – Difficulty of the task
  – The appropriate algorithm
  – Restricting too aggressively can trivialize the problem
  – Failure to restrict (or regularize) can trivialize the problem
• Example space: conjunctions of colors and shapes
  – Eliminates primes and (possibly) cross-sections
  (See the conjunction-learning sketch at the end of these notes.)

Management of the Hypothesis Space
• Ockham's razor:
  – All things being equal, favor the simplest consistent hypothesis
  – Guiding principle of science, e.g., Einstein: 'In my opinion the theory here is the logically simplest relativistic field theory that is at all possible. But this does not mean that nature might not obey a more complex theory. More complex theories have frequently been proposed… In my view, such more complicated systems and their combinations should be considered only if there exist physical-empirical reasons to do so.'
• Ockham's razor is not provably correct, but
  – Computational learning theory shows us that the more choices we have, the more data we need to distinguish reliably among these choices
  – Well-known trade-off between bias and variance
    • How many points do you need to fit a degree-2 polynomial?
    • How many points do you need to fit a degree-100 polynomial?
    • (See the polynomial-fitting sketch at the end of these notes.)
• Ockham's razor is embodied in a wide range of methods

Learning Intro Final Thoughts
• Machine learning is one of the most successful areas of AI
  – Many practical applications
  – Many ways to succeed without solving the "whole problem"
  – Many fields view machine learning as a special sauce that will give them an advantage
• Machine learning conferences are almost as large as the general AI conferences

How to Succeed with Machine Learning
• Theoretical/algorithmic success
  – Maneuver through the space of hypotheses efficiently
  – Efficiency
    • Make good use of data
    • Make good use of time
• Practical success
  – Getting something to learn can be hard (my job!)
  – Know your problem!
    • Pick training data carefully
    • Craft the hypothesis space
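
Sketch: sample complexity and the size of the hypothesis space
One standard way to make the COLT goal of "a low error rate with high probability" concrete, and to quantify the claim that more hypothesis choices require more data, is the classical bound for a finite hypothesis class in the realizable PAC setting. This is background material added here for illustration, not a result stated on the slides.

```latex
% Background (not from the slides): realizable PAC bound for a finite
% hypothesis class H. With probability at least 1 - \delta, any hypothesis
% consistent with m i.i.d. training examples has true error at most
% \epsilon whenever
m \;\ge\; \frac{1}{\epsilon}\left(\ln\lvert H\rvert + \ln\frac{1}{\delta}\right)
```

The ln|H| term is the formal version of "the more choices we have, the more data we need": enlarging the hypothesis space directly raises the number of examples required for the same (epsilon, delta) guarantee.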
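
Sketch: learning a conjunction of colors and shapes
The "Learning and Representation" slide proposes conjunctions of colors and shapes as an example hypothesis space. Below is a minimal sketch of learning the most specific conjunction consistent with the positive examples (a Find-S-style learner). The toy dataset, attribute names, and function names are illustrative assumptions, not material from the course.

```python
# Minimal sketch: learn a conjunction over (color, shape) attributes from
# positive examples by keeping the most specific conjunction that covers
# all of them (Find-S style). Illustration only; the dataset and names
# are hypothetical, not from the lecture.

def learn_conjunction(examples):
    """examples: list of (attributes_dict, label) pairs."""
    hypothesis = None  # most specific: matches nothing yet
    for attrs, label in examples:
        if not label:
            continue  # this simple learner only generalizes from positives
        if hypothesis is None:
            hypothesis = dict(attrs)  # start from the first positive example
        else:
            # Drop any attribute whose value this positive example contradicts.
            hypothesis = {k: v for k, v in hypothesis.items()
                          if attrs.get(k) == v}
    return hypothesis

def predict(hypothesis, attrs):
    return all(attrs.get(k) == v for k, v in hypothesis.items())

if __name__ == "__main__":
    training = [
        ({"color": "green", "shape": "cone"},   True),
        ({"color": "green", "shape": "cube"},   True),
        ({"color": "red",   "shape": "cone"},   False),
        ({"color": "blue",  "shape": "sphere"}, False),
    ]
    h = learn_conjunction(training)
    print(h)  # {'color': 'green'} -- the target concept "green"
    # A green sphere was never seen in training but is still green.
    print(predict(h, {"color": "green", "shape": "sphere"}))  # True
```

Restricting the space to conjunctions is exactly the kind of representational choice the slide describes: it rules out concepts such as "position is a prime number" while keeping the target concept "green" expressible.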
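
Sketch: low-degree vs. high-degree polynomial fits
For the bias/variance questions under Ockham's razor (how many points does a degree-2 versus a degree-100 polynomial need?), the sketch below fits the same noisy quadratic data with a low-degree and a high-degree polynomial and compares training error to error on held-out points. The synthetic data, the train/test split, and the use of degree 15 as a stand-in for "high degree" are assumptions made for illustration.

```python
# Minimal sketch of the bias/variance trade-off behind Ockham's razor:
# a flexible high-degree fit can match the training points closely while
# generalizing worse to held-out points. Synthetic data; degrees chosen
# arbitrarily for illustration.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 30)
y = 1.0 + 2.0 * x - 3.0 * x**2 + 0.1 * rng.standard_normal(x.size)  # degree-2 signal + noise

# Hold out every third point as a test set; train on the rest.
test_mask = np.zeros(x.size, dtype=bool)
test_mask[::3] = True
x_train, y_train = x[~test_mask], y[~test_mask]
x_test, y_test = x[test_mask], y[test_mask]

for degree in (2, 15):
    coeffs = np.polyfit(x_train, y_train, degree)   # least-squares polynomial fit
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree:3d}: train MSE {train_err:.4f}, test MSE {test_err:.4f}")

# Typically the degree-15 fit has lower training error but higher test error:
# more free parameters need more data to pin down reliably.
```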