Machine Learning 10-701/15-781, Fall 2008
Support Vector Machines
Eric Xing
Lecture 7, September 29, 2008
Reading: Chap. 6 & 7, C.B. book, and listed papers

Outline
- Maximum margin classification
- Constrained optimization
- Lagrangian duality
- Kernel trick
- Non-separable cases

What is a good Decision Boundary?
- Consider a binary classification task with y = ±1 labels (not 0/1 as before).
- When the training examples are linearly separable, we can set the parameters of a linear classifier so that all the training examples are classified correctly.
- Many decision boundaries achieve this! (e.g., generative classifiers, logistic regression)
- Are all decision boundaries equally good?
(Figure: two linearly separable point clouds, Class 1 and Class 2, with several candidate boundaries between them.)

Not All Decision Boundaries Are Equal!
- Why might we end up with a poor boundary?
  - Irregular distribution
  - Imbalanced training sizes
  - Outliers

Classification and Margin
- Parameterizing the decision boundary: let w denote a vector orthogonal to the decision boundary and b a scalar "offset" term. Then we can write the decision boundary as the hyperplane
      w^T x + b = 0.
(Figure: the boundary w^T x + b = 0 between Class 1 and Class 2, with distances d- and d+ from the boundary to the closest point of each class.)
- Margin: for some c > 0,
      w^T x_i + b > +c   for all x_i in class 2 (y_i = +1)
      w^T x_i + b < -c   for all x_i in class 1 (y_i = -1)
  or more compactly: y_i (w^T x_i + b) > c.
- The margin between the two classes is m = d- + d+.

Maximum Margin Classification
- Taking x_i* and x_j* to be the closest points on either side of the boundary, the margin is
      m = (w^T / ||w||)(x_i* - x_j*) = 2c / ||w||.
- Here is our maximum margin classification problem:
      max_{w,b}  2c / ||w||
      s.t.  y_i (w^T x_i + b) ≥ c,  for all i.

Maximum Margin Classification, cont'd
- But note that the magnitude of c merely scales w and b, and does not change the classification boundary at all! (Why? Dividing both w and b by c leaves the sign of w^T x + b, and hence the boundary, unchanged.)
- So we instead work on this cleaner problem, fixing c = 1:
      max_{w,b}  1 / ||w||
      s.t.  y_i (w^T x_i + b) ≥ 1,  for all i.
- The solution to this leads to the famous Support Vector Machines, believed by many to be the best "off-the-shelf" supervised learning algorithm.

Support Vector Machine
- A convex quadratic programming problem... [preview truncated here]
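
To make the margin definitions concrete, here is a minimal numpy sketch of the decision rule sign(w^T x + b) and the margin m = 2c/||w|| from the slides. The dataset, w, and b are made-up illustrative values, not from the lecture:

```python
import numpy as np

# Toy linearly separable data, labels in {-1, +1}.
# (Illustrative values only; not from the lecture.)
X = np.array([[2.0, 2.0], [3.0, 3.0],   # class 2 (y = +1)
              [0.0, 0.0], [1.0, 0.0]])  # class 1 (y = -1)
y = np.array([1, 1, -1, -1])

# A candidate separating hyperplane w^T x + b = 0.
w = np.array([1.0, 1.0])
b = -3.0

# Decision rule: predict sign(w^T x + b).
scores = X @ w + b
preds = np.sign(scores)
assert np.all(preds == y), "hyperplane must separate the data"

# Functional margin c = min_i y_i (w^T x_i + b); the slides' two-sided
# margin between the classes is then m = 2c / ||w||.
c = np.min(y * scores)
m = 2 * c / np.linalg.norm(w)
print(f"functional margin c = {c:.3f}, margin m = 2c/||w|| = {m:.3f}")
```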
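The c = 1 problem above is equivalent to minimizing (1/2)||w||^2 subject to y_i(w^T x_i + b) ≥ 1, which is the convex quadratic program the last slide refers to. As a rough sketch (not how SVMs are solved in practice, where one would use a dedicated QP solver or the dual), the tiny primal can be handed to scipy's generic SLSQP solver; the toy data and the packing of (w, b) into one parameter vector are assumptions for illustration:

```python
import numpy as np
from scipy.optimize import minimize

# Same toy data as above (illustrative only).
X = np.array([[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [1.0, 0.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

# Maximizing 1/||w|| is equivalent to minimizing (1/2)||w||^2.
# Variables are packed as theta = [w_1, w_2, b].
def objective(theta):
    w = theta[:-1]
    return 0.5 * w @ w

# One inequality constraint per example: y_i (w^T x_i + b) - 1 >= 0.
constraints = [{"type": "ineq",
                "fun": lambda theta, i=i: y[i] * (X[i] @ theta[:-1] + theta[-1]) - 1.0}
               for i in range(len(y))]

res = minimize(objective, x0=np.zeros(X.shape[1] + 1),
               method="SLSQP", constraints=constraints)
w, b = res.x[:-1], res.x[-1]
print("w =", w.round(3), "b =", round(b, 3),
      "margin =", round(2 / np.linalg.norm(w), 3))
```

With c fixed to 1, the recovered margin is 2/||w||, so the examples that satisfy the constraint with equality are exactly the support vectors that give the method its name.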